Human Augmentation

I want to take humanity to a more natural world

Human Augmentation

My research area began in the field known as human-computer interaction, which studies interfaces between humans and computers. Later, around 2008, I shifted my focus to human augmentation. Then, around 2020, I turned to human-AI integration, which is what I’m working on these days.

Motivation for Exploration

The impetus for my pursuing research in this field was quite clear: the Osaka Expo of 1970. I was in second grade at elementary school and I touched a light pen. When I touched the screen with the light from the tip, a picture of Fujio Akatsuka’s cartoon character Nyarome appeared, which I thought was very interesting. I thought, “I’m definitely going to do this kind of work!” and basically still haven’t strayed from it at all.

That is to say, we have computers and other machines, and then we have humans. We absolutely need something that connects the two. So, for example, if I just tapped the screen, and a manga unfolded on it, I would feel as if I myself had become someone who could write a manga story and my abilities had been expanded. This is connected to human augmentation.

Cyborg 009 over Astro Boy

At the same time, I was part of the generation that grew up with the manga Astro Boy and Cyborg 009. Personally, I liked Cyborg 009 more than Astro Boy because it had a bad-boy aspect that I thought was cool. Astro Boy is a good kid, and the perfect AI robot, which is fine. But the 009s were originally humans who underwent cyborg surgery that equipped them with various gadgets. They had different speeds and senses of time, and even though their bodies were strengthened as if they had superpowers, their minds were very human. This aspect touched on themes that were being explored in science fiction novels at the time, and it intrigued me.

I Want to Mediate between Humans and Machines

The influence of these manga and science fiction stories on me, and my interest in the extremely cutting-edge interactive systems I encountered at the Expo, drew me to the study of computer programming. At first, I didn’t really understand programs, so I’d just watch computer courses on NHK and write programs in FORTRAN with a pencil on graph paper. I didn’t have my own microcomputer then, so I couldn’t even try using what I wrote, but I kept doing this until around junior high school. I remember that from this time on, I was very fascinated by this mediation between humans and machines.

I Want to Create Something That Makes That Past Seem Unnatural

A major research result in user interfaces was so-called multi-touch technology. Today, we use multiple fingers to manipulate tablets and smartphones, and the technology that I invented was one of the sources of this. Humans naturally use multiple fingers in the real world. In other words, the real world is a multi-touch one, which means that operating a single arrow with a mouse and making everything else fit with it is actually unnatural.

Even a child needs no explanation of how to use multi-touch. Such a property is one of the ideals of technology, that once it’s available, anyone can easily use it. Conversely, once we start using an interface that is incomplete, it becomes a bottleneck that prevents us from fully demonstrating our own inherent abilities.

This is why, more than creating something that is out of the ordinary and novel, I want to make something that, once it’s finished, will make what we were using in the past seem unnatural. Ultimately, I want to create something that will impact all of humanity. In fact, multi-touch technology is now used by billions of people. In order to achieve this, I observe the world from the perspective of “Why can’t we do this now?” and “What is unnatural?”

I Want to Take Humanity to a More Natural World

Now that multi-touch is widespread, we need no explanation to be able to pinch and spread out the screen with our fingers. Since it’s already like a part of our body, if a smartphone suddenly came out that didn’t let us do this, we’d get frustrated. We’d wonder why it doesn’t work. In other words, the world has already moved in that direction, toward the natural. This is my worldview—to take humanity to a world that is more natural in this way.

Toward Augmentation (Extending Abilities)

When multi-touch became widespread, I felt that I had crossed a great mountain in my human interface work. When I thought about what would happen in the future when trying to augment human capabilities, I felt I wanted to focus on the feeling I had when I moved the manga with the light pen at the Expo: that of my own abilities being unleashed. From around this time, the subject and themes of my research shifted to augmentation of abilities.

For this theme, I created a system called “JackIn,” which allows one to connect their senses with those of others and experience them personally, or to take on the other person’s senses through telepresence.

Some may think this is a bit far removed from the field of user interface research, but I remember that it developed quite naturally. I’m here, and someone else is over there. I can take on the other person’s senses and explain to them, or if there is an expert over there, I can relive that experience. I thought that this already was exactly what human augmentation was.

A State in Which I Understand, and at the Same Time, AI Understands

More recently, about five or six years ago, around 2020, I started to feel that the core technology is AI after all. Until then, augmentation meant mechanical things like exoskeletons or devices like cameras. But it’s not only that. For example, it’s a state in which AI sees what I’m seeing at the same time, and understands what I’m seeing at the same time. The AI might even understand what’s going on better than I do. I’ve come to think of this, too, as augmentation.

When we think of AI, I think it’s easy to imagine things like agents and other independent robots, a robot like Astro Boy, or a software robot. Ultimately, it’s an AI agent-like entity that operates automatically. But at the same time, I think there’s probably also a Cyborg 009-type. In other words, we’re moving in the direction of retaining our own minds while an AI that lives with us augments our information processing and recognition.

Human-AI Integration

This is what I call Human-AI Integration. One specific technology we are currently working on is called *silent speech*. This technology estimates what a person is trying to say based solely on their mouth movements, without producing any sound. Naturally, deep learning is used to recognize and interpret these subtle movements. If one can communicate with a computer silently—just by moving their mouth—then this can be seen as an extension of our ability to speak. As interactions with AI shift from text to conversation, it may eventually feel like we are communicating telepathically with AI.

For individuals with speech impairments or damaged vocal cords who have difficulty speaking or are unable to produce sound at all, being able to reconstruct their voice from mouth movements means that AI can effectively become their voice.

In this way, when AI augments or complements human abilities, it plays a critical role—so much so that we can say the AI becomes a part of ourselves. We are actively exploring this direction through our research.

Letting AI Swim

In the course of human interface research, the paradigm of direct manipulation has continued for a long time, probably for about 50 years. This is where you control a computer by moving your body in real time, for example, moving your fingers, moving a mouse, or making gestures. This is something that I think will loosen up a little in the future.

The ultimate interface might be a world where we achieve a kind of telepathy, and we only have to think, or hardly use our body at all, as in silent speech. Even if that happens, I believe it will happen as an extension of direct manipulation.

Humans can only perform one task at a time, so there are inherent limitations in making them the sole agents of manipulation. This idea is often linked to concepts like human-centered interfaces or sense of agency, but if we cling too tightly to a human-centered approach, we may never go beyond human limitations. Beyond that, I think we’ll see a world where we coexist with an entity that is somewhat independent, but also a part of us, as if we were letting AI swim, and I think this will be the next extremely large field of research. I think there may be a way for computers and humans to be connected in a slightly looser way, not with a completely autonomous, automatic robot, but neither in a state where we feel like we are always manipulating it. I think that this is still a new area, and it’s the kind of relationship I’d like to pursue.

What Makes Us Happy?

Going forward, broadly speaking, I want to get to a point where, when we compare the world before and after the invention of a certain technology, the world after feels more natural to us. Although research tends to focus on short-term change, my worldview for the research I am conducting is that when we get there, people will realize that the world before this was quite unnatural. So, ultimately, the questions come back to, “What do we want to do?” and “What makes us happy?”

For example, speaking aloud may be natural for one who can speak, but for one who can’t, it’s extremely frustrating. In other words, the use of one’s voice can be a major goal in itself. In consideration of this, I do my research while also taking notice of what each person’s goal is and how it can be achieved.

Everlasting Value

On another note, I’m at Sony CSL-Kyoto, and I’m also one of the people who planned this place. Technology changes quickly these days. New AI technologies are announced almost daily, and things change rapidly. These developments tend to render things from a year ago obsolete. However, the things humans want to do, for example, desires and wishes such as the desire to use one’s voice naturally, do not change that much.

The technology that will fulfill our wishes may be sophisticated, but I think that the aspects of people that don’t change are also important. Part of the reason we are here in Kyoto is that we can access those unchanging, lasting, eternal values through culture and traditional performing arts. Even as I research technology that changes at an incredibly fast pace, I want to try, as a lone human being, to consider the question of what will be the values that last for hundreds of years.

I think that if this pursuit of technology is biased to one side or another, we will become very unbalanced as human beings in today’s world. People who pursue new technology may do it only for the sake of the pursuit and lose sight of its purpose. What is the purpose of what we are doing? I believe that after all, at the root of it is value that doesn’t change.

A World Where the Unchanging and the Quickly Changing Support Each Other in Complementary Ways

I believe that those traditional and cultural aspects can coexist with very cutting-edge things. I hold this belief as part of the larger awareness of issues, or vision, that guides my research, and that’s why Sony CSL-Kyoto is also engaged in initiatives that link traditions like the tea ceremony and Nishijin-ori textiles with AI. This is because we believe that things that don’t change and things that change rapidly can support each other in complementary ways. We go about our activities at our research base in Kyoto with the belief that this approach sets our work apart from research that simply pursues new technology.

Humans Are Both the Scale Limit and the Most Precious Resource

It is natural that technology will continue to evolve, and probably will do so in all fields. This is of course also true in areas outside of computer science. However, even when that happens, I don’t think that the desire to just relax or eat delicious food, or other things that are close to our animal instincts, will change much.

When these things do change, it could be time to say goodbye to Homo sapiens. Things could also be different when we become a completely superhuman race with our thoughts uploaded into computers, and no longer have physical bodies. However, for now, we have this physical body, a real human body; this is both our scale limit and our most precious resource.

That Doesn’t Make Me Particularly Happy

If we merely talk about efficiency, we’ll find more and more areas where it would be better if humans did not exist. Because of this, there will probably be many situations where the term human-centered human interface no longer applies.

On the other hand, each person’s living self is unique to that person. The values that are ultimately arrived at are probably that given person’s time and sensibilities. In other words, without these, even if something happens efficiently, that person will feel, “It doesn’t concern me, and it doesn’t make me feel particularly happy.”

Although we each have limited resources, I think that what makes us happy and what makes us feel fulfilled are probably the ultimate motivation for any research or technology. These two are very deeply connected, and in terms of the purpose of what we do, I think that anything that doesn’t ultimately result in value and fulfillment for us humans probably has little meaning. “I am me, and I exist as myself. I have a fundamental desire to be comfortable because I think that I’m valuable.” If you think about it that way, I think that everlasting value is the ultimate goal.

Methods Change, But Purpose Doesn’t

To learn about the tea ceremony, you can go to a tea house, or go to cyberspace or learn it from AI. These means are constantly evolving, but I doubt that the root motivation itself—for example, wanting to learn the tea ceremony or to be good at piano—will change much.

From that perspective, we can say that methods change, but purpose doesn’t.

Our ideas about what is fulfilling could change with the times. However, I think that what we want is changing more slowly than the technology that determines what we can do.

Looking to the Future

Technology is constantly changing, so if legendary tea master Sen no Rikyu came to our time, he might be shocked at first, but he’d probably say, “Ah, but the important things haven’t changed after all.” I feel that the important things don’t really change too much over a few hundred years’ time. However, the methods for achieving them change at an incredible rate. If anything, it’s not a good thing to hold back the changes that can be made.

I think it’s possible for us to keep each of these going—to freely change what we should change, and to refrain from changing what we can’t change and what we shouldn’t change. I think it’s better for each of the two to coexist within us humans. As for the questions of what we shouldn’t change and what we should change, I want to challenge myself to repeat the process of connecting, checking, and proving the answers.

*This article has been reconstructed from an interview held at Sony CSL-Kyoto on March 12, 2025.
Jun Rekimoto (Speaker)
Kei Fukuda (Interviewer)
Jonathan Katz (Translation)