Robust Robotics, 3D Manipulation, and the Rise of Spatial Computing – Live and Learn #35
Welcome to this edition of Live and Learn. This time it comes with reviews of the Apple Vision Pro, diffusion models that you can run on your phone, four-legged robots that can automatically avoid obstacles, and more advanced 3D content manipulation tools. As always, I hope you enjoy it.
✨ Quote ✨
We believe that if we make both intelligence and energy “too cheap to meter”, the ultimate result will be that all physical goods become as cheap as pencils. Pencils are actually quite technologically complex and difficult to manufacture, and yet nobody gets mad if you borrow a pencil and fail to return it. We should make the same true of all physical goods.
– Marc Andreessen - (source)
🖇️ Links 🖇️
Apple Vision Pro Review by Casey Neistat. The Apple Vision Pro came out in the last few weeks and people have started using it and reviewing all its features. I've watched a couple of reviews and this one, in true Casey Neistat fashion, was one of the best. The most memorable takeaway is this: augmented reality is going to become a thing, and it might very soon be normal for everyone to run around with some sort of augmented reality device strapped to their heads. The Apple Vision Pro is not that device yet, but it's a sign of where things are quickly heading. To decide whether you want to buy one yourself, you should also watch Marques Brownlee's review. It's on point and covers everything there is to say about the Apple Vision Pro.
Group Things in 3D space. This paper introduces a way to segment 3D scenes into objects at different levels of detail. You can scan your room, pick out all the objects within it, and extract 3D meshes that you can then just drop into your scenes in Unity or Unreal Engine... Tools like this will change the world of 3D manipulation forever. And this space, at the intersection of 3D and AI, is exploding with activity at the moment. I've run across a few other papers and projects that are creating awesome things there too. Two worth checking out are Replace Anything and Anything in Any Scene. All of this is also going to accelerate the rise of "spatial computing" and expand the potential use cases of devices like the Apple Vision Pro.
Diffuse to Choose by Amazon. In this paper, Amazon presents a technology that lets you virtually try on any sort of outfit you want. You can also use it to see how furniture will look in your room. This will be the future of shopping, and again, combined with devices like the Apple Vision Pro, it will enable a crazy future. One where you can just sit at home and virtually "try on" everything you might want to buy before ordering it from Amazon, which ships it to your doorstep the next day.
Mobile Diffusion by Google. Diffusion models for image generation get better and better with every paper released. Now Google has created an approach that can run on phones and generate high-quality images within seconds, locally. This is rad and I think it shows how much potential there is for optimization in the space of LLMs and the machine learning algorithms in use today. It turns out that our current models are far from optimized and could still get much better with improved architectures. Soon we will have ultra-smart devices that run the likes of ChatGPT and Runway video generation models locally. Again, this will interact with and amplify the idea of AR (or Spatial Computing, as Apple wants to call it). Faster AI inference for manipulating images also means that eventually you could create virtually generated worlds on the fly, overlaying or even interleaving them with your real world. Isn't that insane?
Techno Authoritarianism by The Atlantic. This article caught my attention because it got reposted with a laughing emoji by Marc Andreessen from a16z. I think it's a good read because technology has scary potential, and the power held by people who create big businesses can be, and is, misused. But every other form of power, including that of the government, can be misused in the same way. In the end, I think that power, in any form, should be used to do as much good as possible. And that's why I'm still excited about the future of technology. It has the potential to make our lives vastly better. And I don't know why, but I somehow trust people, even Mark Zuckerberg and Meta (to some extent), more than I trust the government, even if both have track records of doing bad things. I somehow believe that people who create technology truly want to create a better world, not just push shareholder value up. And that gives me confidence that there might be a world where technology prospers to solve human problems. Much like the quote from the manifesto at the beginning of this newsletter... But then again, not everybody might agree that a world where everybody has an augmented reality device on their heads (or a smartphone in their hands) is truly better. And that's exactly the point of the article.
Agile but Safe Robotics. Robotics also keeps making strides, and this paper is a good example of that. It's about how to make robots that are safe to be around, but also agile and fast. The robots they train can automatically avoid obstacles that appear in their way while running around. They still look somewhat janky, twitchy, and frankly weird in their movements. Like a crazy dog that had too much caffeine... but it's a start. Also, people are having more and more success using reinforcement learning approaches in the real world, and there's another paper worth checking out on that, too. Maybe we'll soon live in a world where we not only go "shopping" digitally, but where our packages are created, packed, and delivered by robust robots interacting with us in the real world. Then we'll truly live in the realm of science fiction. I wonder how society is going to adapt to those changes, because they are coming, and they are coming faster than most people think.
🌌 Traveling 🌌
This newsletter probably finds you a bit earlier than usual... That's because I'll be gone for 2-3 weeks, crossing the Atlantic without any internet to send it from. Right now, while writing this, I am in Cape Verde. It's nice here. Good food, cool people, and a giant grand piano that I can play every day in the Cultural Center. I hope you’re doing alright too, wherever you are.
🎶 Song 🎶
Beethoven's Piano Sonata No. 8 - Pathétique by Hiromi
That's all for this time. I hope you found this newsletter useful, beautiful, or even both!
Have ideas for improving it? As always, please let me know.