The First Tour of Giant AI’s Robot Lab

Visiting Giant AI

Visiting Giant AI is like getting a tour of a secret lab that shouldn’t exist, run by an eccentric genius, the kind we remember from “Back to the Future.”

Adrian Kaehler is that genius. 

He built the computer vision system for the first autonomous vehicle in the project that later became Waymo. He played a key role in the early development of Magic Leap, an augmented reality company that just won best of show at the industry’s biggest gathering, AWE (the Augmented World Expo). He also wrote what many say is the book on computer vision, one still used by many computer science departments. Today he leads Giant AI, a company building humanoid robots that can work on manufacturing lines, doing the same jobs humans used to do along with many new ones. Giant is backed by Bill Gates and Khosla Ventures.

He saw long ago the problems that traditional robots bring. The earlier companies’ robots were designed and built to be very precise, which means they remain expensive today. You see many of these in factories now: they are heavy, don’t work well with humans, have to be programmed months in advance, are hard to retrain, and don’t recover well when errors are made. Some are too dangerous to be around, like the ones in Tesla’s factory in Fremont, which keeps some robots in cages to keep humans away.

He also saw the solution: AI that builds a new kind of operating system. One that learns faster than anything most humans could imagine. It learns so fast that you only need to show it how to do something a few times and it’ll do that thing from then on. One that enables new, lower-cost components to be used, ones that are less precise.

When I watch the Universal Worker move, I can see how the tendons that make it work create a very different, animal, sort of motion. It is kind of springy. This would be a non-starter for a traditional robot, but the AI control, just like with a person, manages this and makes it all work out. Dr. Kaehler tells me that the use of this sort of tendon system is central to how the robot can be so light and dexterous, as well as why it can be so much less expensive than traditional robots.

It’s the new AI that enables this new lower-cost, safer approach.

So, getting into his lab first meant a lot to me. Why? I think it means a lot to you, too. 

It means we will have to rethink work. From scratch.

Is your happiness and income coming from you pushing a button on a machine? Really? I worked on HP’s manufacturing line when I was a young man of 17. One of my first jobs was working the wave soldering machine there, shoving circuit boards into the wave, which instantly soldered the whole board. I had helped my parents and brothers hand build hundreds of Apple IIs. My mom taught us to solder. If you got good at it, like my mom was, you could do maybe a board in 30 minutes. I saw from my kitchen how manufacturing lines can change labor. My mom worked for Hildy Licht, who got hired by Apple to take on the task since Apple couldn’t make enough in its own factory. Apple cofounder Steve Wozniak, AKA “Woz,” told me that those boards had fewer failures than the ones made in Apple’s own factory. It also makes me Apple’s first child laborer (I was 13 at the time).

Anyway, I never wanted to do such a job again, given how boring it was. I loved that wave machine because it saved many hours of labor. I dreamed of a day when a robot would stuff the boards too; I had to do that over and over and over again.

I wish I had a Universal Worker by Giant AI Corporation back then.

As he showed me around, he told me what makes these robots so special. The AI inside is next level. See, I’ve been following AI since the very beginning.

I was the first to see Siri.

That was the first consumer AI app. I also have the first video, on YouTube, of the first Google self-driving car, long before anyone else. That was the first AI on the road. I have been following AI innovators since the very beginning.

This robot is using the next generation of all that.

Don’t worry, though.

You do get that we are in an exponential world, right? One where we need more workers, not fewer. Even if Giant got a huge funding deal at, say, a billion-dollar valuation, it still couldn’t build enough robots to replace ANY human for MANY years. These are to fill in the gaps for when you can’t get enough workers to keep up with demand.

Anyway, back to the lab. Along each side I saw a row of robot prototypes for what Giant AI is calling “the Universal Worker.” Each was being tended to by staff, and as Adrian gave me the tour he explained what each was doing: running a new form of machine learning that uses neural radiance fields to see. The engineers are putting finishing touches on blog posts that will soon go deep into the technology. In the video, Kaehler goes into some depth about what it’s doing and how it works.

Each robot had a humanoid form. Even a smile on the face. And the head moved in a very unique way that I had never seen before. Strangely humanlike. Which, Adrian says in the video embedded here, is part of its ability to learn quickly and also earn the trust of the human working alongside it. It also lets it do a wider range of jobs than it otherwise could. It sees the machine, or task, it is standing in front of like we do: in 3D. And, in fact, there are many other similarities between what runs underneath robots, virtual beings, autonomous vehicles, and augmented reality headsets and glasses. Kaehler is the only person I know who has built three of those, and he says they are all strongly connected underneath in how they perceive the world and let others see that perceived and synthesized world.

As you get a look around his lab, you do see that they feel like early versions of the Tesla Autopilot system: a little rough and slow. Heck, even today, four years later, Autopilot does 6,000 more things, but it still seems a little rough and slow. The Universal Workers feel a bit the same to me. At least at first. Until I started watching and realized that this was real AI learning how to grasp and place things. It felt humanlike as it set a rod onto a machine yet another time in a row without dropping it.

I remember talking to the manager of the Seagate hard drive factory in Wuxi, China, about why he hired so many women. Nearly his entire final assembly line was women, highly trained too; I watched several run a scanning electron microscope. I never will forget what he told me: “Men drop drive off line, pick it up, put it back on line. Women don’t do that. They bring over to me and admit fault.”

This robot was learning quickly how to recover from its mistakes. Which is how it was designed, Adrian told me. It has grids of sensors in each finger, which enable a new kind of “feeling” I’d never seen a robot use before. Each of those sensors was being pushed and pulled by a cord going to a machine in the belly of its humanoid form, on the end of an arm built from cheap consumer-grade processes. The hand shakes, just slightly, especially if a big forklift goes by.

Giant’s AI is what makes it possible for the robot to be far less expensive. It “sees” the world in a new way, using something the AI engineers call “Neural Radiance Fields,” a new form of 3D scene that you can walk through. In Giant AI’s case it moves the hands through these radiance fields, which are unlike any 3D data structure we’ve seen before.
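To make that a little more concrete, here is a minimal sketch, in Python, of how a neural radiance field gets queried. This is not Giant’s code, and the learned network is replaced by a toy function so the example runs on its own, but the structure is the standard NeRF recipe: a function maps 3D points to density and color, and a ray is rendered by integrating those samples. The interesting twist for a robot, as described above, is that the same field can be sampled along a hand’s path, not just along camera rays.

```python
# A minimal sketch of querying a neural radiance field (NeRF). Not Giant's code:
# the learned MLP is replaced by a toy "fog sphere" so the example runs standalone.
import numpy as np

def radiance_field(points, view_dir):
    """Stand-in for the trained network: returns (density, rgb) for each 3D point."""
    dist = np.linalg.norm(points - np.array([0.0, 0.0, 2.0]), axis=-1)
    density = np.where(dist < 0.5, 4.0, 0.0)            # an opaque-ish sphere of "stuff"
    rgb = np.tile(np.array([0.8, 0.3, 0.2]), (len(points), 1))
    return density, rgb

def render_ray(origin, direction, near=0.1, far=4.0, n_samples=64):
    """Classic volume rendering: accumulate color weighted by transmittance * opacity."""
    t = np.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction
    density, rgb = radiance_field(points, direction)
    delta = np.diff(t, append=t[-1] + (t[1] - t[0]))     # spacing between samples
    alpha = 1.0 - np.exp(-density * delta)               # opacity of each segment
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = transmittance * alpha
    return (weights[:, None] * rgb).sum(axis=0)          # final color seen along the ray

print(render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0])))
```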

Its AI is constantly adapting and learning, which lets it figure out how to recover from a mistake very quickly. Adrian wrote the math formula on the board on a previous trip. It keeps pushing the hands toward the best possible outcome. So you can slap them and they’ll recover. Or, if an earthquake hits, the machine shakes, and the robot drops your motor before it goes into the box it was supposed to go in, it still should be able to complete the task, just like a human would, or try to save the part if it can, and report a problem.
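To see why that recovery falls out naturally, here is a toy closed-loop controller in Python. It is not the formula Adrian wrote on the board, just an illustration of the principle: because the next command is always computed from where the hand actually is right now, a slap or a shake simply becomes a new starting point rather than a failure.

```python
# A toy illustration of closed-loop recovery, not Giant's actual control law:
# each step recomputes the command from the hand's current (sensed) position.
import numpy as np

def control_step(hand_pos, goal, gain=0.2):
    """Move the hand a fraction of the remaining distance toward the goal."""
    return hand_pos + gain * (goal - hand_pos)

goal = np.array([0.5, 0.2, 0.3])       # made-up target: where the part should go
hand = np.array([0.0, 0.0, 0.0])

for step in range(60):
    if step == 20:                      # simulate a slap or an earthquake mid-task
        hand = hand + np.array([0.3, -0.2, 0.1])
    hand = control_step(hand, goal)

print("final error:", np.linalg.norm(hand - goal))   # tiny, despite the disturbance
```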

Anyway, at this point you are wondering, “Why did you hype up Tesla’s robot so much last week?” Those who are inside the Tesla factory tell me that their simulator gives them an unfair advantage and will let them build a humanoid robot that can walk around and do a variety of tasks much faster than people are expecting. You’ll see Tesla’s robot in September as part of its AI Day announcements. Yes, hardware is hard, even if you have the best simulators, but it is getting easier.

In a way this is a David vs. Goliath kind of situation. So Giant had to focus on a very specific, but large enough, problem: one of low-skilled workers and what they need help with.

Which is why Giant’s Universal Worker doesn’t have legs. Giant isn’t a trillion-dollar company. It can’t afford to put legs on a robot that doesn’t need them. A worker in a factory always stays in the same place and does the same job over and over and over.

It doesn’t spy on you the way that the Tesla robot will (Giant’s AI can only “look at” the work surface in front of it). It can’t walk around your factory floor mapping it out, or watch workers in other parts of your plant.

It also doesn’t have a complete mouth, or a voice response system, or the ability to really communicate well with other human beings the way the Tesla robot will need to do. Which makes the Giant robot far cheaper than the Tesla ones, and it is ready now, at a slower-than-human speed, or soon, at the same speed.

That said, Kaehler is keeping up to date on the latest computer vision research and he knows that Tesla’s will do many things Giant’s can’t, and that’s fine with him. He doesn’t have a car company to gather data about humans in the real world. It isn’t his goal to build a robot that can deliver pizza. Just do boring jobs that humans need an extra set of hands to help do.

Giant AI already has orders, Adrian says, but the company needs funding to get to the place where it can manufacture the robots itself.

I remember visiting “Mr. China,” Liam Casey. I visited him in his Shenzhen home and he gave me a once-in-six-thousand-lifetimes tour of Shenzhen that I treasure to this day. Then he took me on an even wilder one of his homeland of Ireland, where he took me to a research lab that Mark Zuckerberg ended up buying.

What did Casey teach me? He had the same problem. No one would invest in his business, even though he had customers. How did he get his orders done? I asked him. “I got them built.”

“But how? Did you have something to trade? A house, an expensive car, secret photos, what?”

“My US Passport.”

The factory owner demanded his passport in trade for building his order, a form of collateral I’d never heard of before. He then had Casey travel the country to all his factories to do a certification on each. That led Casey to see the power of databases, particularly ones for tracking supply chains. Which is why he is Mr. China today and why his company, PCH, makes many products that probably are in your home right now. He used that early research about China’s factories to become the supply chain leader that many technology companies use to build their products.

Giant needs the same today. A way to get the product finished and manufactured. Capital, and lots of it, to get to where these are working hard to make everyone’s lives better.

Tesla’s simulator has ingested a lot more than just where the car has gone. It knows EVERYTHING about its owners. So, when an engineer wants to recreate a street, it is amazingly real, and the people will even stop to say hello or let you check out their dog. Then you can make it rain. Or make it sunny. Over and over and over again.

Why is it so magical? BECAUSE OF the data the car and phone collects. A Tesla crosses the Golden Gate Bridge every 10 seconds. No one else is even in the same universe in data collection capabilities.

Tesla has similarly bleeding-edge AI to Giant’s, but Tesla has billions of times more data than Giant will ever get its hands on.

However, do you just need a machine to push a button or two every minute or so when it notices a job is done, or do you need Tesla’s AI and simulator, which will have to do a whole lot more? No, at least not now, because the costs will be far higher for the Tesla robot, which will need to walk and get in and out of autonomous vehicles.

That said, now that I’ve seen Giant’s AI and how sophisticated it is with practically no data compared to the Tesla system, I realized that the Tesla one must be far more advanced, and I started asking around.

The Tesla robot will need to get out of an autonomous vehicle and figure out how to get a pizza up to your apartment, or to your front door, something you only appreciate once you talk to as many people as I do.

Kaehler showed me how Giant’s AI would do that if it had access to Tesla’s data and resources, particularly its simulator, where rafts of people can “jump in,” walk around, and keep training it over and over, teaching it to get it right. The demos you see in the video, as impressive as they are, are quaint compared to what the resources of Tesla can generate.

Every day I’m more and more convinced I’m being conservative. Either way, getting the first look at Giant’s Universal Worker gives us a good look at the future of work, so I hope you appreciate being first to see this. I sure did.

When will augmented reality glasses be ready for consumers?

Unfortunately it has taken a lot longer to get augmented reality glasses to consumers than I expected. A lot longer. I thought we would be wearing them all day by now. Heck, when I got Google Glass years ago I thought I never would take those off.

Boy, was I wrong.

Many in Silicon Valley taunt me for my previous optimism, saying “not this year, not next year.”

That doesn’t mean they aren’t getting closer. For the past seven years I’ve been watching Lumus, a small company in Israel that makes the best lenses/displays I’ve seen so far. Every two years they come and visit me and show me their latest. Every time, the displays get brighter, lighter, more efficient, smaller, and more.

Here, in the video, is Lumus’ head of marketing showing me its latest displays, and you can see just how big an improvement the company has made. These are getting much closer to the size and quality that consumers will be happy wearing.

But since I have been so wrong before, I wanted to take a more sober look at these displays and ask myself “when will consumers buy these?”

That may just be the wrong question. Unless I was working at Meta or Apple or Snap.

Enterprise uses of these are coming right now. Just look at the revolution in robotics that is underway. AI pioneer Adrian Kaehler has been retweeting every amazing robot on Twitter (he is CEO of Giant AI, which makes manufacturing robots coming over the next year), and there are dozens that work on all sorts of production lines, not to mention do a variety of other jobs. These glasses would be perfect for controlling, and training, all these new robots, along with a variety of other things, from training to surgery. This is why Magic Leap has a new shot at life, one I didn’t see coming due to its cord and lack of consumer experiences.

Other augmented reality companies have pivoted away from consumers and toward enterprise uses of these glasses and devices (most notably Magic Leap and Microsoft’s HoloLens).

Why?

Well, for instance, look at some of the limitations of even these amazing new displays from Lumus. While they are many times brighter than, say, the Magic Leap or HoloLens displays, and have bigger fields of view, the image does not quite match my 4K TV, which cost me $8,000 last year.

So, consumers who want to watch TV, or particularly movies, in their glasses will find the image quality still not as nice as a bleeding-edge TV, tablet, or phone display (although inside they are damn close), even though augmented reality glasses offer many other advantages (like being able to watch on a plane or while walking around, something my big TV can’t do). But these are dramatically better than they were the last time I saw Lumus’ latest. White whites. Sharp text. Bright videos and images.

The field of view, too, is 50 degrees. OK, that does match my 83-inch TV when I am sitting on my couch (the image in the Lumus is actually slightly bigger than my TV), but that isn’t enough to “immerse” you the way VR does. Will that matter to consumers? I think it will, but 50 degrees is way better than what Snap is showing in its current Spectacles. In 2024’s devices, screens will be virtualized, too, so the hard field-of-view numbers won’t matter nearly as much. These are certainly better than my HoloLens’s field of view.
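As a rough sanity check on that 50-degree comparison, here is the back-of-the-envelope geometry, assuming a viewing distance of roughly two meters; a different couch distance changes the number.

```python
# Back-of-the-envelope: the horizontal angle an 83-inch 16:9 TV subtends from
# about two meters away (the viewing distance here is an assumption).
import math

diagonal_m = 83 * 0.0254                         # 83 inches in meters
width_m = diagonal_m * 16 / math.hypot(16, 9)    # width of a 16:9 panel
distance_m = 2.0                                 # assumed couch-to-screen distance
fov_deg = 2 * math.degrees(math.atan((width_m / 2) / distance_m))
print(round(fov_deg, 1))                         # ~49 degrees, right around the glasses' 50
```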

Also, bleeding-edge TVs, like my Sony OLED, have better color and luminance depth. What does that mean? TV and movies still look better on my TV. But that, also, is sort of a bad comparison. My TV can’t travel with me. These displays are pretty damn good for a variety of uses; I just wish I didn’t need to wait until 2024 to get them.

This is why many who are working on Apple’s first device tell me it is NOT doing see-through glasses like these for its first product. They just don’t match consumer expectations yet (although these Lumus lenses are a lot closer than any others I’ve seen so far).

Apple’s first device is what those of us in the industry call a “passthrough” device and is NOT a pair of glasses like what Lumus is showing here. In other words, you can’t see the real world through the front of the device unless it is on (Apple’s device will present a digital recreation of the room you are in; I hear its new version of augmented reality is pretty mind-blowing, too).

Until this next generation of devices arrives, these glasses will mostly be used for R&D or enterprise uses, like controlling robots or production lines, or doing things like surgery, where field of view, brightness, and so on aren’t as important as they will be for consumers. Lumus is selling its much-improved lenses to consumer-focused partners, but it doesn’t expect the really interesting glasses until 2024.

I’ve been working with a variety of enterprise users, and here there is a deep hunger for better glasses. At Trimble, a construction company, for instance, they are working on a variety of initiatives. They are using Boston Dynamics robots to map out construction sites in 3D and then using HoloLenses to do a variety of tasks. The problem? The HoloLens only has displays that are about 400 nits. That’s the technical term for “dim, poor-quality color, very little readability in bright sunlight.” Lumus’ displays are 5,000 nits. Yesterday I took them outside and saw that they are plenty bright enough for bright environments.

Also, the HoloLens is very heavy and big compared to the glasses that Lumus and many others are readying. The construction workers are not happy with the size of the HoloLens, or even the Magic Leap, which has a cord down to a computer that clips on your belt.

These enterprise users are hungry to buy a decent set of augmented reality glasses. Lumus should help its partners get to these markets long before Meta, Snap, or Apple figure out how to get consumers to want to buy glasses.

How will I evaluate whether the market is ready?

Let’s make a list.

1. Brightness. 2,500 nits is perfect for most enterprise uses (HoloLens is only 400, and all my clients complain about the lack of brightness and visual quality). Lumus says theirs can do 5,000, which gets close to consumer expectations. Big improvements over the past and over competitors I’ve seen.

2. Color. The Lumus lenses are much better than others I’ve seen. Pure whites and decent color (I could watch TV and movies in them). Enterprise is ready. Will consumers take to these in 2024? I think so. No color fringing like I see on my HoloLens. Much, much nicer.

3. Size. The projectors in the Lumus are much smaller than they were three years ago when I last saw Lumus’ work. Very awesome for doctors, construction workers, production-line workers, etc., but still a bit too big for “Ray-Ban”-style glasses. But I could see wearing these for hours.

4. Cost. They avoided this question, sort of, but the cost is now coming down to enable devices that are $2,000 or less. That is acceptable for many enterprise uses, but still too high for most consumers. That said, I’m wearing glasses that cost $1,500 before insurance, so we are heading to consumer pricing.

5. Battery life and heat generation. Here Lumus has made big strides. They claim devices running their latest projectors will be able to go for hours, even all day, depending on how often the displays are showing information. That is great for, say, a surgeon using a system like the one MediView makes; they only need the displays on for a few minutes during surgery. Same for many other enterprise uses. Most workers won’t be trying to watch live streaming video all day long, like consumers will be. Also, they don’t heat up like others on the market do. But for consumer uses? Not quite there yet. Consumers will want to watch, say, CNBC all day long, along with working on screens of information.

6. Field of view. Yes, it’s better than my expensive 83-inch TV, but not by much. Consumers will have higher expectations than just 50 degrees. Enterprise users? They don’t care much at all. The benefits of having screens on their eyes outweigh the lack of wrap-around screens.

7. Content. Consumers will want to do everything from editing spreadsheets to watching TV shows and movies and playing video games. None of which Lumus will ever do itself, so its partners will need to come up with all of that. Enterprise users are far more focused on very specific use cases, like controlling a robot or being able to see data on production machinery. That’s a hard job, for sure, but a far easier one than delivering the wider range of things consumers expect. Yes, the Metas, Apples, Googles, Snaps, Niantics, etc., are working on all that, but they aren’t nearly ready with enough to get consumers to say “wow.”

8. Resilience. Consumers will want to wear these devices out in the rain. Will drop them. Their kids will step on them. How do I know? All that has happened to my glasses, which I’m forced to wear simply to see. Enterprise users are more focused on safety and many jobs, like surgery, will not need nearly the same kind of resilience that consumers will need.

Now, can all these problems be fixed by, say, an Apple or a Meta or a Snap? Sure, but I bet on Apple being more aggressive, and that didn’t happen. So we need to watch how it does next year with the launch of a bigger, heavier device aimed at home users to see how consumers react to augmented reality devices on our faces.

Now, is there someone out there with glasses ready to go sooner? Maybe, but let’s say NVIDIA has a pair that does a lot: will it have all the advantages of Apple? No way. Not for a while.

This is why Mark Zuckerberg told investors that it will be “years” before augmented reality devices make big money with consumers. Even its VR effort, after being out for seven years with a ton of content and a low price of $300, is only selling about 1.5 million units a quarter (Apple sells that many phones in about two days).

Translation: as excited as I am about going to this week’s Augmented World Expo, we still have a couple of years to go, at minimum. I’m bummed writing that, but it’s better to be more realistic about the near future than optimistic.

As the blind honor an Apple accessibility pioneer, my son shows far more work is ahead

It didn’t shock me that MojoVision (a Silicon Valley startup making augmented reality contact lenses) brought a big percentage of its team and had a table right in the middle of Sight Tech’s event honoring Mike Shebanek for his work on Apple’s VoiceOver, the functionality that enables blind people and those with vision impairments to use iPhones. All around the audience were blind people.

MojoVision’s CEO, Drew Perkins, had cataracts and eye surgery, and has long sought to build a bionic eye. So it makes sense that MojoVision would align itself with the blind community. But all around were others working on augmented reality products: Meta, Apple, and more.

While Shebanek’s speech will be interesting to any Apple fan (he tells lots of stories about building an accessibility team at Apple, including several about Steve Jobs), I don’t want you to miss what happens about 57 minutes into my video: several of the blind people around the room were called on to tell what Apple’s VoiceOver has meant to them.

The stories are heartwarming, but the job isn’t done. Why do I say that? My 14-year-old son is a special needs kid; he is autistic and his speech is hard for many to understand. None of the AI voice systems understand him, and you should hear his frustration at not being able to communicate with computers the way his brother can by talking to Alexa or Siri. He has had Apple devices since he was two years old.

He can’t use systems like Apple’s Siri, Amazon’s Alexa, or even Google’s Assistant with his voice. They just don’t understand him.

As we move into augmented reality devices, which could greatly help him, and those like him, live their lives, these technology walls grow more daunting. Why? Five years from now we will be talking to AIs far more frequently than we do today.

At his public school his special needs classmates have similar problems. Some can’t hold their hands still enough to type on a keyboard. Many have a tough time with speech.

Will my son and his fellow students be included in the next paradigm shift, the shift to 3D computing and new user interfaces that you operate with your real voice and real hands? For some users, like my son, this will be a frustrating paradigm shift.

It was an honor hearing Mike Shebanek’s stories. He’s a real pioneer who has left a deep mark on many companies (he now works at Meta). He gives me hope that my son, and his fellow students, will be included in the computing platform of the future.

Thanks to the Vista Center for inviting me.

The Vista Center empowers individuals who are blind or visually impaired to embrace life to the fullest through evaluation, counseling, education, and training. Learn more: https://vistacenter.org

It has a conference coming in December 2022 for developers who are shaping new technologies to create a more accessible world for people who are blind. Details on that are here: https://sighttechglobal.com