This analysis provides a pragmatic roadmap for achieving computational sovereignty by balancing local hardware performance with the architectural nuances of autonomous coding agents. It effectively demystifies the transition from cloud-dependent development to secure, high-performance local execution.
Approfondir
Prérequis
- Pas de données disponibles.
Prochaines étapes
- Pas de données disponibles.
Approfondir
Local Coding Agents on Strix Halo and R9700: Pi, Opencode, and SWE-bench Mini BenchmarksIndexé :
Episode 1 of a series on building and running AI agents on local AMD hardware. This episode covers how coding agents work, the security risks they introduce, and a practical comparison of two coding agents, pi and opencode, running on Strix Halo and the Radeon R9700 AI PRO using Qwen 3.6 quantizations and llama.cpp. Coding agents are built on the same core principles regardless of which one you pick: a control loop around the LLM that manages context, exposes tools for file access and shell execution, handles session state, and optionally spawns subagents. The main differences between agents are their philosophy on context size and what they offer out of the box versus what you configure yourself. When the LLM runs on local hardware rather than a datacenter, context length and token throughput matter more, which shapes which agent design fits better. The episode also covers the security side: prompt injection via untrusted data sources, blast radius, and how sandboxing via bubblewrap on Linux or Docker containers can reduce risk. Finally, there is a benchmarking project based on an adapted version of SWE-bench mini, evaluating pi on 50 curated software engineering tasks using model quantizations that fit on Strix Halo and the R9700. This series is supported by AMD. Timestamps 00:00 Introduction 03:10 How Coding Agents Work 08:51 Security Risks & Sandboxing 12:08 The Hardware (Strix Halo / R9700) 14:00 Pi Coding Agent 23:31 Opencode 32:09 Benchmarks (SWE-bench mini) 40:43 Conclusion Links & Resources Strix Halo Toolboxes & Guides: https://strix-halo-toolboxes.com Building a Coding Agent from Scratch (Sebastian Raschka): https://sebastianraschka.com Pi Coding Agent: https://github.com/ Opencode: https://opencode.ai LLM Chronicles – ReAct Framework Episode: https://llm-chronicles.com LLM Chronicles – Prompt Injection & LLM Security: https://llm-chronicles.com Buy Me a Coffee: https://buymeacoffee.com/dcapitella
Welcome to this new series on local AI agents where we will explore how to build, run, and customize Agentic Workflows on local hardware. As of recently, a few pieces of the puzzle have fallen into place that inspired me to put this together. The first piece is cost. Coding agents relying on large LLMs hosted in data centers have been getting more expensive. Moving to usagebased pricing in the middle of what looks like a general compute crunch. The second piece is hardware. We now have access to capable GPUs that allow us to run mediumsiz models locally at a workable speed without breaking the bank. And the final piece of the puzzle is that we've been seeing better and better model families like Quen 3.6 six that specifically target agentic use cases. Putting all of these together means that using local LLMs for these workflows is now a practical option.
This episode is organized into different sections, so feel free to skip to what you're interested in. I want to start by breaking down how coding agents work.
It's easy to get overwhelmed by the many options that keep popping up left and right. But all of these agents are built on the same principles. If you understand them, it becomes easier to choose and you don't feel lost or anxious that you have to chase every new coding agent that comes out. Next, I will walk you through the hardware that I will be using throughout this series, specifically Stricks, Halo, and the R9700 AI Pro. We will also do some hands-on coding tasks with the PI coding agent and open code, comparing the results, speed, and usage of tokens.
Finally, I know that showing a demo of a coding task is one thing, but to help you understand what is possible and compare different options, you need data. Because of this, I started a project to benchmark the performance of various LLMs on coding tasks, specifically targeting the model quantizations that we can run on strict salo and the R9700. So, at the end of the video, I will show you all the benchmarks.
Before we get into the details, I want to thank AMD for sponsoring this project and giving me essentially a blank slate.
The brief was simple. They said, "Hey, we appreciate what you do for the community and we want to support you in creating an educational series to help people get the most out of their AMD GPUs. show people what is possible on these devices with AI so they can experiment, learn and develop. So again, a big thank you to AMD for supporting me and making this series possible.
There are many coding agents out there and it can be confusing because it feels like every day there is a new one, but in reality they all pretty much work the same. The difference is in the philosophy whether they are minimal or fullyfledged and what they offer out of the box versus what you can customize.
If you like me are really keen on understanding the details, one of the best ways to do that is to build your own coding agent from scratch. And if you're so inclined, I recommend you check out the excellent article by Sebastian Rashka that will show you how to build a mini coding agent from scratch. I will leave a link in the description. But in essence, an agent is not just an LLM. It is a harness, a software scaffold that wraps the LLM in a control loop. This harness manages the context, tracks the state and offers tools to the model so that it can implement agentic workflows. On my channel, I made an episode last year on the early research by Google that allowed models to use tools. It covers the reason and act framework known as React, which is still the foundation that makes these agent workflows possible. So check it out if you are interested in the details. A practical way to look at an agent harness is to focus on the duty cy tendles to make an LLM effective at coding tasks. For starters, the LLM itself does not know your code base. The harness gathers context about the workspace of a project repository by fetching for example the g status, the file structure, the directory layout and essentially it builds a summary that gets added to the LLM prompts. This ensures that the model knows exactly where it is and what is going on even before you issue your first prompt. Next, and arguably one of the most important things that the harness provides is tools because that is what gives the LLM its agency to actually go and do stuff in the real world. The harness exposes tools to read and write files, run shell commands, access the web, and even MCP servers and similar things. And as the agent works, the context can easily become very long, especially when reading large source files or ingesting tool outputs. To handle this, the harness uses different strategies to manage the context, for example, by clipping long tool outputs and periodically summarizing earlier history. This allows the to keep a summarized version of the context. So the LLM is still aware of what's been happening, but its context window doesn't run out of space. The runtime also handles session management across different runs. It saves the state so you can resume a closed session later or you can fork a conversation and even backtrack if the agent takes a wrong turn. It can also implement sub agents for example by spinning off side tasks in the background and in parallel instead of running everything sequentially in the main thread. So these are the standard things any coding agent does. The difference among them is how these concepts are implemented and the philosophy behind them. Some agents like cloud code and open code tend to maximize and implement all features which results in a richer out ofthe-box experience but also potentially longer context windows and higher computational cost. Others like the pi coding agent are designed to be minimal and customizable. Both approaches have advantages and disadvantages. If the LLM powering your agent runs on powerful GPUs in a data center, maximizing features works well. Data center GPU clusters can process long prompts fairly rapidly and frontier models with trillions of parameters tend to maintain quality even on long context. Because of this, agents like cold code and open code worry a little bit less about filling the context. They can leverage the capabilities of larger LLMs and data center hardware and use up a larger context window with many tools and MCP servers, long prompts and for example ingest full error traces etc. The only real issue here is cost and compute availability. However, when the LLM powering your agent runs on local hardware like a framework desktop with strict sero or a server with a radion 9700 AI pro, the situation changes.
Tokens are free, but context length matters and we need to be careful. We cannot fill the context with many NCP servers and a lot of instructions. We need to keep it short because the hardware cannot process data as quickly and the medium-siz models that we can fit into these GPUs tend to degrade more than their larger counterparts when the context grows. Before moving on to some hands-on tasks, we need to discuss the risks of running these agents. To be useful, coding agents require powerful tools, typically access to a shell, file system, read and write permissions, and even access to the internet. And they can be extended with third-party skills and tools. Because they have all of these capabilities, they are quite dangerous if things go wrong. An agent can simply make a mistake or it can be exploited maliciously through an attack called prompt injection. I have a series on my channel dedicated to prompt injection and LM security. So if you want to learn more, just check that out.
But in a nutshell, if an agent ingests untrusted data, for example, from a web search or a GitHub issue, an attacker can potentially hijack its control loop and weaponize the LLM to say steal data from your workstation or even install back doors. The impact of such errors or attacks depends entirely on the level of agency and oversight the agent has. We saw a practical example of this in April 2026 when a coding agent at a startup called Pocket OS autonomously deleted the entire production database.
A common way to reduce this risk is human in the loop supervision where you manually approve every action the agent takes. This was feasible with early agents, but agents today can perform tens to hundreds of actions in a very short period of time. Reviewing and approving every single action is tedious, ruins productivity and is errorprone itself. There is actually a thing called approval fatigue. So we need a tradeoff that maintains acceptable security while allowing the agent to work. The standard trade-off agentic harnesses use to reduce the blast radius of an agent mistake or compromise is to introduce sandboxes. A common compromise is to give the agent free reign to read and edit files inside a restricted project folder but block or at least ask for permission if the agent tries to access resources which are outside of that boundary. The agent harness can enforce these restrictions programmatically. That is, it can inspect each tool invocation the LLM wants to perform and check it against a set of rules that the user configured.
Even better, a sandbox can be enforced at the operating system level using tools like sandbox xac on Mac OS or bubble wrap on Linux. You can also use containers like Docker to further isolate agents. So the message is don't go yolo and get carried away. Keep security in mind and avoid giving excessive unsupervised agency to your AI tools. Let's talk about the setup that I use to run these agents. If you follow my channel, you might have a strict solo device like the Framework Desktop with 128 GB of unified memory or an AMD R9700 AI Pro with 32 GB of VRAM. The available memory and computation power of the GPUs you have access to will of course define what models and what quantization levels you can run. In the following demos, I will use Quen 3.6 six on strick in an 8bit quantization from anot. Later in the video, I will show you benchmarks of other models on both strick and the R9700.
The good news is that even with a single R97 and 32 GB of VRAM, you can run quantizations that perform well for many coding tasks. for running the LLMs themselves. I am using my Llama CPP toolboxes. You do not have to use my toolboxes. You can use any inference engine you like. And I know many people use Olama or LM Studio which will work perfectly fine. Linux, Mac OS, or Windows. Pick what works best for you.
For my setup, I typically run the LLMs on a remote server, SSH into the box, and forward the llama CPP server port to local host on my workstation. There are other solutions. For example, you can run the LLM on the same host as you run your coding agent, but I do like to keep the inference separate from everything else when possible. All right, we are ready to see some of these in action.
Here I will be starting with the Pi coding agent. Very simple to install.
You essentially have one command. And this is not meant to be a PI tutorial.
If you're interested in a tutorial, I like this one by content creator Oven Lewis. So make sure you check it out.
Now in my case I have already installed the pi coding agent on my system and its configuration is under pi / a.py/ aagent in my home directory and what you need to do here so that it can access your local llm. Essentially you need to drop a models.json file where you create a provider. I have named mine llama cpp because uh that's what I'm using but you can call it whatever you want. Obviously you provide the uh base URL uh you don't need any API key and then you can give it a list of models. Now because I'm using llama CPP lama server uh and I'm just giving it one model here the name doesn't really matter probably what matters is the size of the context window. Uh but again I'm of course using the correct name. If I go on my strict sale server, this is inside one of my toolboxes. And what I'm going to run is the uh quen 3.6 35 billion parameter model. Uh on strict salo, I prefer to run mixture of expert models. They will perform considerably better. And this is an slot uh Q8K XL quantization. A very good quant.
So we can load uh that in memory and start llama server. This listens on port 8080. And obviously I want to forward that port to my host which I can do like that. And I'm also going to take a look at the uh GPU. I can see that there are 40 uh odd gigabytes of memory that this model and its context window are currently occupying. Now this is all set up. Pi knows how to use this model. So I can go to my repository and run pi.
Okay. So here we are in the AMD strict sero toolbox repository. And this is the repository that I use to manage all of the llama CPP toolboxes that maybe some of you use. Now the task I am going to do here is to update the 7.2.2 uh raw cam toolbox to 7.2.3 2.3 which is the new version of Roam. And again I have two versions of this toolbox. One includes APR. So I want the agent to do that and to also update all of the scripts that reference uh the uh toolboxes. So I have benchmark scripts that um have references to uh the toolboxes and the refresh toolbox script and also some GitHub uh workflows or actions uh that obviously need to know about that particular new toolbox.
Now as you can see uh as soon as you run uh the agent it loads agents.md.
Uh this is something that I recommend you always provide to your agent because you give the context uh that's important for your project. Uh and this will make the agent work much better. One thing that you will notice is that I don't have many skills. Even the skills here like caveman I'm not actually using with pi. Uh but one thing you'll notice is that I have the pi sandbox extension. We did talk about sandboxing before. I will show it to you because I think this is very important uh that you don't run agents yolo or you will uh not be happy if you try to do that. So the pi sandbox um is a very uh useful skill uh sorry very useful extension which provides operating system level uh sandbox for pi and essentially you can install it with a uh simple command obviously I've already installed it and then you can configure it uh to decide what your agent can have access to. In my case, I'm going to show you my configuration file that it is under the uh PI directory uh sandbox.json.
If I can open it up and you can see that I've enabled the sandbox and I can control which network um domains it can access. In this case, uh anything under GitHub, but I can also control access to the file system. For example, I can say don't read my home directory. uh and you can read the current working directory uh and some configurations also you can write the current working directory and you can also write temp. Now temp is good to give to an agent because he might need to download some files that don't fit in the repository or even write some temporary scripts. So this is something you typically uh want to give it read and write access to. But also you can see here I am denying access to some files that might be sensitive. AMP might contain obviously uh keys in environment variables pam and other stuff like that. Now this is an example but you have to find what works for you.
So let's proceed and I'm going to ask this to essentially update the Rockam 7.2.2 toolbox to use Roam 7.2.3 2.3 and also update the benchmark scripts, the refresh script and the GitHub actions.
Enter.
So now this is going to go and start uh thinking about what it needs to do. And as I said, the um PI um agent aims to keep the context very very small. This is in a way the opposite of what we're going to see later with open code or even like cloud code. The system prompts are very very little. You can see the consumption of tokens uh down here. And you can see how it's using the different tools to figure out what it needs to do.
and it only has a couple of tools actually. It's got a few tools. It's got read and write files and essentially a shell tool. And so here it's running shell commands to get an idea what's available. It's finding the files it needs to edit and then it's reading them.
And after it's got the context, it's um made a plan of what it wants to change.
So you can see here the plan. We didn't have to tell it to make a plan and PI doesn't tell it explicitly to plan. This is the Quen 3.6 model kind of being fine-tuned and optimized to already think like this out of the box. And this is what allows us uh not to fill the context window with uh too many too many instructions. So you can see now that he started uh creating and editing all the files uh that he needs to edit. Uh so obviously the uh containers here and then it's updating the benchmarks, the refresh scripts and the uh GitHub actions. And you can see the uh strict sale APU uh working really hard on this.
But I am not speeding these up uh yet at least because I want you to see how uh responsive it is and now because of how minimal PI coding agent is and only having a few tools and a model that's out of the box already good at doing a lot of these things uh actually results in a usable experience.
So you can see here it did a few checks on its own without being prompted. It said I will I want to check that there are no old uh 7.2.2 references remaining. And so it used the shell uh to just uh check these out and it found a few and it went in uh and uh tried to uh to update them.
All done. Here is a summary of everything it did and it looks perfect.
So, uh that's an example of how the PI coding agent runs on Streaks Halo. One thing to keep in mind is that you can see the token usage here. So, um this sent 19,000 tokens and it generated uh 6,000 tokens. So we have 25,000 tokens for uh this particular task in PI coding agent.
Okay, let's now try the same task with another coding agent. This one is open code and its philosophy is quite different to PI. Instead of being minimal, it aims to be uh fullyfledged uh out of the box. Uh and we'll see uh what that means in practice. But this is more similar to something like cloud code um out of the box. What I want to say before uh we look at this is that PI agent can become as fully fledged as you want because it's very easy to extend uh with its packages and extension framework. But again out of the box you get very a very minimal experience. Now open code you install it again with a uh s simple command and this is how you would configure it to use our strict sale server. So very similar uh to what we had uh before this is just boiler plate. I will put it on a uh repository if people uh are interested. And you can see that I am calling this local model.
I don't have to give it a name if I don't want to. Uh but the thing that we see here that um this has out of the box is a permission model. So and it's very granular here. I'm just showing a highle example where I'm saying you can run anything but when you run bash commands you need to ask me and when you want to go outside of the current directory to edit and read files uh you have to ask me. So this is something um that open code gives you out of the box. That's an example of something being uh fully featured out of the box. Now let's go back to our strict sale repository. And of course I can run open code.
And the first thing that you can see is that it already has the uh local model on the llama CPP server. If you want to uh change it with slashmodels, you can choose uh the model yourself. Now, one thing to notice out of the box is that this has different uh modalities uh that it can prompt uh the LLM with. Here we are in this blue build mode, but we can also tab into plan mode. And again, this will provide the LLM a larger context with more instructions and more direction. So now open code wants the LLM to work. So it gives much more uh guidance than what PI does out of the box. So we are essentially going to give it exactly the same task. Update the Rockam 7.2.2 toolbox to 7.2.3.
Okay. The new session has been created.
Uh you can see it's working here. Uh and we will be able to see the context that uh it is uh consuming. And obviously if we go on the server, we can see uh streak sale working. If you're running llama CPP, you can also kind of see uh the consumption uh and the progress of the individual prompts uh that uh open code is sending.
But immediately you see that we're seeing no output.
This is because the system prompt and the initial instructions that this sends to the LLM are much larger uh than pi.
So it takes much longer uh for our strict server to actually process that.
Okay, we are finally seeing some output.
Uh it is doing actually something similar to what pi did of course uh it is looking in the repository for uh 7.2.2 but it is not using the shell. This has many more tools uh rather than a generic shell. Uh so it's got a grap and a globe tool for example. Uh and this is part of that uh philosophy uh difference. this will actually work well um with more LLMs that are even less capable just because it is much more uh it gives the LLM much more directives and less space to make mistakes. So then of course he read all the files that it's going to need uh to change and it should be coming up to us with a uh comprehensive plan. As you can see, it's writing the plan. But look at the context that it's used for these 30,000 tokens. Uh I need to go back to the Pi one. But I think for the entire task, Pi consumed 20,000 tokens. And here we are just at the planning stage uh and we've consumed uh already 30,000 tokens. But obviously uh one has to admit that this gives you much more information and it is out of the box better structured. One thing that you will see it's also doing it's asking me questions because it is in plan mode. Uh for example for the docker files should I rename the old files or create a new one? And it also gives me some options. um what I want to do and I think the recommended one is good is to rename in place.
So now it will update the plan uh after it's asked me what uh I want to do.
Now one thing here is that we're still in plan mode. Uh but obviously this is just a prompt and the LLM actually went ahead and uh decided to start uh writing files. In theory in plan mode it should not do this and we should have switched to um build mode. But anyway that is not a strict boundary at least the way that uh open code implements it.
And again here very similarly to Pyode actually you can see a diff of all the changes that it does uh in all the files. Uh and I typically recommend you check what this is doing if you have uh the time but it looks like it's doing all of the correct changes in all of the correct places.
And you can also see here on the right a richer UI experience out of the box which shows you the uh changes that it is making. And this is something else I want uh you to see. Now it wants to execute a shell command. And this is where the uh permission model um kicks in. And it's stopped and it's asking me do I want to allow this command to run?
And I have a couple of options. I can allow this once. I can allow it always for this particular session or I can reject. I am going to always allow.
And this is another instance where it decided it wants to run grap and grap is fine. So I'm always allowing to run grap and head. These are just reading. Um, and it's done. And this is a very good summary of all the changes. And again, this really allows you to have an idea that the agent uh did what we asked it to do. Uh, and I'm just checking I think actually, oh no, it did also the benchmark scripts. Uh, so you need to obviously check the usage. Here we have 39,000 tokens uh which is quite a lot more than what Pi used. And of course you could see that this was uh considerably considerably slower.
Demos like the ones that you just saw are useful to get an idea of what's possible, but in reality they only show a few isolated examples. If you want to understand whether investing in hardware to run local agents is actually worth it, you need more data. So I started a project to benchmark the actual model quantizations that we can realistically run on stre and on the R9700 looking at both task success rate and speed. For this project, I chose to evaluate the PI coding agent using an adapted version of S swbench.
SWB bench is a standard software engineering benchmark based on real GitHub issues. I am using the mini version here. This has 50 curated tasks that aim to maintain the difficulty distribution of the full data set. And to keep the evaluation fast, I simply feed the agent solution to the tasks to Gemini 3.1 Pro so that it can work as a judge to evaluate the correctness of each solution. Also, the agent configuration is completely bare here.
There are no custom system prompts and no MCP service or web access.
So you can see here the uh benchmark results that we have on strict salo and the R9700.
Uh this is the model uh that we used in the demos and on this benchmark uh it's got 67% uh success rate and on average on streak sale uh it took 8 minutes to complete a task and obviously uh this mixture of expert model uh works really well on something like strikes halo and if you want you can also click on each task and see um why it was considered uh by the judge to be uh correct or not and you can see the change uh that the model that the model did. Now immediately what jumps uh to me is that Gemma 4 is not very good at uh coding tasks. I haven't been very successful with it and especially if you compare it with Quan 3.6 six. Uh it doesn't look like to me that Google prepared this model uh for for coding. Maybe it's good as a general uh agent, but I haven't had much uh success. The other thing that you can do on straight salo is to run the 3.52 billion parameter model which as you can see has a much higher uh success rate but is much slower on average at completing tasks. Uh I'm running this in Q5 quantization uh from Anslot. Uh and I find this is the model that I tend to use when I have something u more complex uh and typically it is uh quite good especially if you can steer it uh in the right direction. But I wouldn't use this for everyday tasks because as you can see it's quite slow. Now if we compare this with the R9700 uh what I did here I actually took quen 3.6 the same uh as here but a different quantization because I wanted it to fit in one uh GPU. So this is the uh dynamic quant by an slot uh four bits. Um and you can see that it's got at least on this benchmark essentially the same success rate. So at least on this benchmark uh the same performance but it's half the time because obviously the R9700 is a dedicated uh GPU with a much higher memory bandwidth and number of uh compute units. But what is even better on the R9700 is that you can run the dense version of Quan 3.6 60 27 billion parameter model in again the same Q4 quantization which on average takes 9 minutes to complete a task and it's got a very high uh success rate actually essentially the same or even better than the bigger mixture of expert model there and obviously it runs considerably faster than that. So, I'm finding that in general, if you have dedicated GPUs uh that are quite capable like an R9700, you can go with a smaller model but a dense model uh and that will perform in a similar way to a larger model uh but which is a mixture of expert models. So uh hopefully this benchmark uh starts to give you an idea of uh how these models actually uh perform on the hardware uh that you have. As I was editing this video, Lama CPP just merged support for MTP which stands for multi-token prediction. Essentially a way of speeding up token generation for LLMs.
Uh I recently released a video on multi-token prediction and how to get that up and running on strick and the R9700. So check that video out for the details. But I thought it would be interesting to see the impact of MTP on these coding benchmarks. And so I did run the benchmarks with quen 3.6 6 35 billion parameter model with NTP and you can see immediately the average duration of a task dropped to 5 minutes and 36 seconds from 8 minutes and 6 seconds.
Now you will see that there is also some variation in the success rate that is not due to MTP that is due to some uh randomness in the token generation. Uh and sometimes the LLM can just be better at solving an issue. But in general you will see the performance benefit that MTP gives on coding generation. And I also did the same on the R9700 with the 27 billion parameter model. So we went from 9 minutes and 15 seconds down to an average duration of 6 minutes and 9 seconds. Again, the jump in performance should not be related to MTP. MTP will only affect the duration. The jump in performance although it's quite big in this particular case. um that should be affected by the variability of the benchmark. And actually in the future what I plan to do is to run this benchmark three or five times and give you an average which I think would be uh more representative. But to do that, I think I'm going to need at least one to two weeks of uh running it. And I just wanted to get the video out now. Uh but check this page for all of the updates.
Finally, we need to talk about caveats.
Benchmarks have limitations. First, there is the issue of contamination. SWE bench issues might be in the training data of these models, which can of course inflate performance. Second, in this benchmark, the PI coding agent, as I said before, is running fully autonomously with no internet access and no human feedback. In the real world, you typically work alongside the agent.
You can steer it when it gets stuck and you can give it tools like internet access so that it can read and search documentation.
that collaborative workflow is always going to be much more effective than an isolated benchmark. If you find this work and the series useful, I'd really appreciate it if you can support the channel in all the usual ways, mainly by commenting and liking this video to appease the YouTube algorithm. You can also head to buy me a coffee to well buy me a coffee which I always appreciate.
In the next video of the series, we will look at how to create a local agent for deep research. We will look at the different design choices and their impact on speed and accuracy such as dividing work among different concurrent sub agents and choosing appropriate tools and quarters for these tools. I'll also show a simple way to connect the agent securely to a mailbox so that you can email a research task and it will reply with a detailed report. That's all for this episode. Thank you for watching and I will see you in the next one.
Vidéos Similaires
Elon Musk’s XAI, Fiber-Optic Drones & the New Era of US Defense & Winning the AI Arms Race
DefenseNow
250 views•2026-05-15
Decart Raises $300M to Build the Future of Realtime AI
DecartAI
252 views•2026-05-18
I Read Every Google Antigravity 2.0 Doc So You Don't Have To (13-Min Operator Playbook)
hyperautomationlabs1045
120 views•2026-05-19
Could AI change the future of cancer survival?
MotherConservative
999 views•2026-05-16
[RQ] All Preview 2 Midnight Horror School Deepfakes in Macbg Major
macbghuggylego
102 views•2026-05-15
Firefox on Android Just Added 'Shake to Summarize'
BrenTech
349 views•2026-05-19
Google’s NEW AI Just SHOCKED The World…
JulianGoldiePodcast
188 views•2026-05-21
WWDC 2026 Promises Apple Intelligence and Siri Upgrades | Episode 195
TheMacRumorsShow
104 views•2026-05-22











