This workflow marks the transition from manual asset modeling to high-level scene orchestration, effectively turning complex spatial design into a streamlined AI pipeline. It signals a future where the primary skill in 3D development is the strategic integration of specialized generative models rather than individual technical craftsmanship.
Deep Dive
Image to 3D World Workflow
World Labs just open-sourced an agent pipeline that turns a single reference image into a fully interactive 3D environment. Claude identifies the objects, Tripo and Trellis generate the meshes, Nano Banana Pro or GPT Image 2.0 cleans the background, and the whole thing exports to Blender, Unreal, or Unity with working colliders. Full episode: https://youtu.be/pFYaEeOUB7U
You can now turn a single image into a fully interactive 3D environment, one where you can move the objects around, within a few minutes.
>> Wow.
>> This is something that really blew my mind when I saw the demo. The way it works is that you orchestrate a bunch of models together. The AI agents look at the image and identify the different objects. Then they create an image of every single object and send it to a 3D generation model, so a 3D model with a mesh is generated for each one. Because we now have images of the individual objects, we can remove them from the environment, so we are left with a static environment plus the 3D objects within the scene. Then the AI agents go and add colliders to the scene.
And now you can import this interactive 3D scene into Blender, Unreal Engine, Unity, or any other 3D tool you have. This is something that we've been waiting for for the longest time. The reason we are here now is that all of the different AI models, the 3D generation and the image generation, are now so good.
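The steps described above can be sketched as a small orchestration script. This is only an illustration of the flow as described in the conversation, not the code from the GitHub repo: every function name here is hypothetical, and each one is a stub that returns a placeholder string where a real pipeline would call a vision LLM, an image editor, or a 3D generation model.

```python
from dataclasses import dataclass, field


@dataclass
class SceneObject:
    name: str
    crop: str = ""      # placeholder for the per-object image
    mesh: str = ""      # placeholder for the generated mesh file
    collider: str = ""  # placeholder for the collider type


@dataclass
class Scene:
    background: str
    objects: list[SceneObject] = field(default_factory=list)


# All functions below are hypothetical stubs standing in for model calls.
def identify_objects(image: str) -> list[str]:
    # A vision-capable LLM would produce this list; hard-coded here.
    return ["table", "chair", "umbrella"]


def crop_object(image: str, name: str) -> str:
    # Would create an isolated image of one object.
    return f"{name}.png"


def generate_mesh(crop: str) -> str:
    # Would send the object image to a 3D generation model.
    return crop.replace(".png", ".glb")


def remove_objects(image: str, names: list[str]) -> str:
    # Would edit the original image so the listed objects are gone.
    return f"clean_{image}"


def build_scene(image: str) -> Scene:
    names = identify_objects(image)
    objects = []
    for name in names:
        crop = crop_object(image, name)
        mesh = generate_mesh(crop)
        objects.append(SceneObject(name, crop, mesh, collider="box"))
    background = remove_objects(image, names)
    return Scene(background, objects)


scene = build_scene("alley.jpg")
print(scene.background, [o.mesh for o in scene.objects])
```

The point of the sketch is the ordering: objects are listed first, each object gets its own image and mesh, and only then is the background cleaned, so the static environment and the movable objects end up as separate pieces of the scene.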
>> Even the agentic AI, where you can spawn multiple agents that do different tasks at the same time. Now we are at this stage, and if you look at the demo, the guy posts a photo of a street.
>> Go back and just pause on that frame so we can describe it. Back a little bit more. Yes, this is perfect.
>> Back back a little bit before they lift the chair.
>> Yeah.
>> That was good. No, no, that was good.
>> That was good.
>> Okay. Okay.
>> I want to see a comparison to the... Yes, this one. Okay.
>> Okay. So the image is in the bottom right corner of the screen, right? It's an image of an alley. You can see there is a table, there are chairs, there's one umbrella, and there are a few other things on the left side. The first thing the AI agents do is describe the scene. They name the objects. They create a list.
>> And it's done using Claude.
>> Yes. Yeah. But you can change them. The best part about it is there's a GitHub repo. You can go and use different generation models for this.
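One common way to make generation models swappable is a small registry keyed by backend name. The sketch below assumes that design purely for illustration; how the repo actually selects models is not stated in the conversation. `tripo_stub` and `trellis_stub` are hypothetical placeholders, not real client calls.

```python
from typing import Callable


# Hypothetical stand-ins for real 3D generation services.
def tripo_stub(image_path: str) -> str:
    return f"tripo:{image_path}"


def trellis_stub(image_path: str) -> str:
    return f"trellis:{image_path}"


# Registry mapping a backend name to the function that produces a mesh.
MESH_BACKENDS: dict[str, Callable[[str], str]] = {
    "tripo": tripo_stub,
    "trellis": trellis_stub,
}


def generate_mesh(image_path: str, backend: str = "tripo") -> str:
    try:
        generator = MESH_BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown mesh backend: {backend!r}") from None
    return generator(image_path)


print(generate_mesh("chair.png", backend="trellis"))
```

Adding another model then means registering one more function, without touching the rest of the pipeline.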
>> Claude has proven to be the smartest for these kinds of jobs. I'm noticing a lot of companies
>> by default use Claude for these things. Higgsfield being one of them. Even by default it's on Claude when you go into supercomputer. In their own videos they're using Claude for their skills.
>> For their skills, really?
>> And here in this video as well, I believe they were using and talking about Claude.
>> By the way, shout out to World Labs. I explained the entire thing without even mentioning that this was done by one of the team members at World Labs: the entire orchestration of putting the different parts together, identifying the objects, turning them into photos, removing the objects, and then turning them into 3D. By the way, the environment would be a 3D Gaussian splat, so it's very lightweight. You can go into it.
You can design different things. I can think of a lot of applications, because we tested a lot with 3D Gaussian splats. Imagine a scene from a movie: you can do it with that. Imagine taking this into VR: from a photo, straight into VR.
There are a lot of use cases for this and I think this is super, super exciting.
>> What do you think, Faraz?
>> I was fascinated by the way the process >> Yeah.
>> works.
As Farhad mentioned, so many different technologies had to improve simultaneously to make this possible. The original image has to be described perfectly, which means Claude and other LLMs had to have incredible vision capabilities and be able to translate what they see into text. Furthermore, once the scene is described, we need enough quality in our 3D generation tools, such as Tripo or Trellis, to create the 3D meshes for the objects the LLMs have discovered. And that has to be done relatively quickly as well, along with the material generation, the textures, the UV unwrapping, and so on and so forth.
>> There's so much going on. People look at it and it's done in minutes, but there's so much work in the back end. Exactly. And then the LLM has to recognize what the objects are against the background, against the environment, and try to isolate them so that it can create a new prompt to edit that image with the objects now removed, probably using Nano Banana Pro or GPT Image 2.0. Again, Nano Banana Pro and GPT Image 2.0 are relatively new. Before these models, being able to edit an image really quickly, consistently, and accurately was almost impossible.
But because the image models have also improved, you get that environment without the chair still slapped onto the texture. You know, the walls of the environment are now cleaned up.
>> Yeah. And you know, having the colliders: remember when we first started working with 3D Gaussian splats? Was it two years ago? I think it's been two years, right? The colliders had problems; you would go through the wall. Now this is perfect. You can see that
>> holes in the environment.
You identify every single corner of the room, and it's 360°. The photo is just one angle; now the guy goes and turns around and it generates everything. The more I look at this, we are living in a simulation. If in 20... what are we, '26? Yes. If in 2026 we have this technology, by 2036 it will be so advanced that I can put you in an environment, a fictional environment, and you won't even realize it's fictional. This is a simulation, the Matrix.
Wake up.
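On the collider point raised in this turn, the "going through the wall" problem is what a collider prevents. Here is a minimal sketch, assuming the simplest possible collider: an axis-aligned bounding box fitted around a mesh's vertices. The names `BoxCollider`, `fit_box`, and `try_move` are hypothetical, and real engines such as Unity or Unreal use their own, more capable collider types.

```python
from dataclasses import dataclass

Point = tuple[float, float, float]


@dataclass(frozen=True)
class BoxCollider:
    lo: Point  # minimum corner of the box
    hi: Point  # maximum corner of the box

    def contains(self, p: Point) -> bool:
        # True when p lies inside the box on all three axes.
        return all(a <= x <= b for a, x, b in zip(self.lo, p, self.hi))


def fit_box(vertices: list[Point]) -> BoxCollider:
    # Smallest axis-aligned box that encloses every vertex.
    xs, ys, zs = zip(*vertices)
    return BoxCollider((min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs)))


def try_move(pos: Point, step: Point, colliders: list[BoxCollider]) -> Point:
    # Reject the move if the target position lands inside any collider.
    target = tuple(p + s for p, s in zip(pos, step))
    if any(c.contains(target) for c in colliders):
        return pos
    return target


wall = fit_box([(0, 0, 2), (4, 0, 2), (4, 3, 2.2), (0, 3, 2.2)])
print(try_move((1, 1, 1.5), (0, 0, 0.6), [wall]))  # blocked by the wall
print(try_move((1, 1, 1.0), (0, 0, 0.5), [wall]))  # free space
```

This only tests the target position, so a very large step could still jump across a thin wall; engines handle that case with swept or continuous collision checks.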
>> If you have haptic sensors and so on and so forth to be able to sense the environment as well.
>> Why haptic sensors? You're going to have brain chips.
>> Oh my god. Back to the conversation of putting something in our brain.
>> But this is very exciting.
>> I'm actually thinking about virtual production.
>> Okay.
>> Yeah, virtual production.
>> Yeah, actually. Yeah, because you won't control. You won't think about it.
>> Yes.
>> I want to create an ad or a movie shot in an alley like this. There's a lot of objects. I want to be able to move the objects. The director is going to ask you.
>> Oh, that was the biggest problem with Gaussian splats initially. True.
>> So, they'd buy assets online, of course, or they'd have to manually create them.
And the biggest problem comes up during virtual production. This is some extra knowledge for you if you don't know: during shoots, there are always CG artists on standby to make changes and modify the scenes depending on the director's vision, because he might have planned something in the storyboard but want to change it at the last minute. Let's say they're shooting The Mandalorian, there's a massive rock, and for whatever reason that rock has to be moved. Yeah.
>> If the assets and the environment were created like this, they technically have control to move things around. Now
>> they can even add things, they can test. Oh, actually, testing.
>> Testing. Imagine I want to add a Blade Runner car right in that alley and I want to see
>> and how fast they can iterate on different scenes, it's insane. I think for virtual production this is going to be really interesting.
>> Actually, we should give it a try. It's a GitHub repo, right? They haven't brought it into World Labs yet. It's open source, I believe.
>> Yeah, so you can go on GitHub. Yes, it's there. We should definitely.
>> It's called Image Blaster.
>> It's created by Neielson. Yeah.
>> Shout out.
>> We really need to give him a shout out. This is amazing. This is amazing.
>> Let us know what you guys think about this. Are you going to try it out or not? Since it's open source, you can go ahead and try it out today and let us know what you guys think about