Cognitive Warmup

Is this another DeepSeek moment for the AI industry to grapple with? Chinese tech giant Alibaba says its Alibaba Cloud platform has successfully tested a new compute pooling system called Aegaeon, which reduced the number of Nvidia H20 GPUs needed to serve dozens of models of up to 72 billion parameters from 1,192 to just 213. That’s according to a research paper presented at the 31st Symposium on Operating Systems Principles (SOSP) in Seoul, South Korea this week. “Aegaeon is the first work to reveal the excessive costs associated with serving concurrent LLM workloads on the market,” researchers from Peking University and Alibaba Cloud write in the paper. This could be another example of working smarter rather than simply spending on massive computing infrastructure – exactly what many AI companies have been talking up over the past few weeks. The researchers say Alibaba’s method lets a single GPU serve up to seven models for users calling on them, compared with a maximum of three under alternative systems. Investors will start realizing this soon enough.
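To get a feel for why pooling saves so many GPUs, here is a minimal back-of-the-envelope sketch in Python. It is emphatically not Aegaeon’s actual scheduler; the 48-model catalogue, the 10% chance of a model receiving traffic in any given minute, and the 1,000-minute window are all invented for illustration. The only number taken from the paper is the “up to seven models per GPU” figure.

import math
import random

random.seed(42)

NUM_MODELS = 48        # hypothetical catalogue of hosted models
P_ACTIVE = 0.10        # assumed chance a model gets traffic in a given minute
MODELS_PER_GPU = 7     # the paper's headline: up to seven models per GPU
MINUTES = 1_000        # length of the toy simulation

# Track the worst case: the largest number of models active at the same time.
peak_active = 0
for _ in range(MINUTES):
    active = sum(random.random() < P_ACTIVE for _ in range(NUM_MODELS))
    peak_active = max(peak_active, active)

# Dedicated serving keeps one warm GPU per model, busy or not.
dedicated_gpus = NUM_MODELS

# Pooled serving sizes the fleet for peak concurrent demand, subject to each
# GPU holding at most MODELS_PER_GPU resident models.
pooled_gpus = max(peak_active, math.ceil(NUM_MODELS / MODELS_PER_GPU))

print(f"dedicated GPUs: {dedicated_gpus}")
print(f"pooled GPUs (toy estimate): {pooled_gpus}")

The toy captures the core intuition: dedicated serving scales with the number of models, while pooled serving scales with peak concurrent demand, which for a long tail of rarely-called models is far smaller. The hard engineering the paper actually tackles – swapping model state and scheduling tokens without wrecking latency – is exactly what this sketch leaves out.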
Algorithm
This week, we’re talking about Microsoft trying to find a more stable AI foundation without having to rely so heavily on OpenAI, Nvidia’s ‘personal AI supercomputer’ that costs a pretty penny for your AI obsession, and OpenAI telling us again that it’s concerned about the well-being of humanity in general.
Microsoft is revamping AI alliances
Microsoft has introduced a new text-to-image model called MAI-Image-1, its first generative image system developed entirely in-house. This is a clear move towards reducing reliance on partners like OpenAI, and perhaps also a sign (among many others; the broadening scope of Microsoft’s work with Anthropic is another indicator) that the Redmond-based tech giant wants a stronger base to work from, and a greater stake in the overall creative stack. Initial testing has yielded impressive results on AI benchmarks, which is always a good sign, provided fundamentals like lighting, textures and realism aren’t completely botched from the start. You’d be surprised how often image generators still fall short there. Strategically, this is a big deal – there’s potential for tighter integration into Copilot, which Microsoft has been talking about a lot recently, and it makes the case for better control of intellectual property. As generative image tools proliferate, control and quality will separate the winners from the noise.
Nvidia, a personal AI supercomputer, and Apple
There was a lot of excitement last week when Nvidia started shipping its compact AI computer, the DGX Spark, to developers and researchers. Priced at $3,999 (approximately ₹3,82,000), it packs serious power with a GB10 Grace Blackwell superchip and 128GB of unified memory, delivering up to a petaflop of AI performance. That means large models, running into the hundreds of billions of parameters, can now be run or fine-tuned right on the desktop. Powerful AI compute hardware is coming to your desk. But I found myself wondering whether we didn’t already have this level of compute in the Apple Mac Studio with the M1 Max and the Mac mini with the M4 Pro (and whether Nvidia is really breaking new ground here). A handy chart by lmsys.org compares benchmarks, and shows that the M1 Max Mac Studio (it costs about $2,000) consistently returns higher output tokens per second than Nvidia’s effort. Even the Mac mini, in its really compact form factor (about $1,400), matches the DGX Spark. One wonders what the idea was…
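For anyone wondering how an “output tokens per second” figure is produced, here is a rough sketch of the measurement. It is not how lmsys.org built its chart; the endpoint, port and model name below are placeholders, and it assumes a local OpenAI-compatible server (llama.cpp, Ollama and similar tools expose one) running on the machine being tested.

import json
import time
import urllib.request

URL = "http://localhost:8000/v1/chat/completions"  # placeholder local endpoint
payload = {
    "model": "local-model",  # placeholder model name
    "messages": [{"role": "user", "content": "Explain GPU pooling in 200 words."}],
    "max_tokens": 256,
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Time a single request end to end.
start = time.perf_counter()
with urllib.request.urlopen(req) as resp:
    body = json.loads(resp.read())
elapsed = time.perf_counter() - start

# OpenAI-compatible servers report generated token counts under "usage".
completion_tokens = body["usage"]["completion_tokens"]
print(f"{completion_tokens} tokens in {elapsed:.2f}s "
      f"= {completion_tokens / elapsed:.1f} output tokens/sec")

Note that this times the whole request, prompt processing included; careful benchmarks separate prefill from decode and average over many runs, but the basic arithmetic, completion tokens divided by elapsed seconds, is the same.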
AI therapy by day, AI seduction by night?
OpenAI has established an Expert Council on Wellbeing and AI. It comprises eight experts who will be tasked with studying how constant interaction with AI systems affects human emotion, cognition and behavior. The focus is on defining what “healthy AI use” means in contexts ranging from education to medicine to everyday chatbots. The council will advise on design, ethics and potential behavioral side effects. Is this a sign that the AI conversation is maturing? Unlikely. But it is a reminder that OpenAI knows how to steer the conversation, keeping investors, consumers and perhaps even policymakers happy. Keep in mind, this is the same company that wants to make AI sexting mainstream. AGI can wait; is porn the revenue generator for the immediate future?
Thinking
“They just don’t work. They don’t have enough intelligence, they’re not multimodal enough, they can’t use computers and other things. They don’t have the capacity for continuous learning. You can’t just tell them something and they’ll remember it. They’re cognitively deficient and it’s not working. It’s going to take about a decade to work on all those issues.” – Andrej Karpathy, ex-OpenAI, on the Dwarkesh Podcast.
Andrej Karpathy has taken a pin to the AI bubble. He estimates that artificial general intelligence (AGI) is at least 10 years away, points to the considerable ‘slop’ in the code generated by today’s models, and argues that the better approach would be smaller models with better recall and context. If you’re in the mood to question Karpathy’s credentials just because he might not align with your world view on all things AI and on how AI agents will be superior to humans, here’s something worth considering – he’s a research scientist, a founding member of OpenAI, was senior director of AI at Tesla, and has since founded Eureka Labs, which focuses on AI education.
Context: Andrej Karpathy’s words strike at the core of the current AI narrative – and they come from someone who actually built the systems now being mythologized. His comments land just as “AI agents” have become Silicon Valley’s new obsession, marketed as the next big leap beyond chatbots and as superior to humans in the workplace. But his verdict is a sobering one: the technology is simply not ready. In his view, today’s large language models are still imitators of patterns, not thinkers. They can recall, summarize and respond fluently within a conversation, but they lack the cognitive infrastructure that makes intelligence continuous and cumulative. They don’t really remember; they refer only to the momentary context window. They don’t really understand multimodality; they correlate text and image tokens without a unified sense of meaning between them. They don’t learn from experience; they simply replay what they absorbed in training.
The current hype cycle around agentic AI – systems that can act on behalf of users, browse the web, execute commands and make autonomous decisions – does just enough to paper over those structural gaps. Every AI company’s demo looks polished, with big claims giving the illusion of progress, but the foundation remains far removed from general intelligence.
A reality check: In an industry accustomed to touting and flaunting quarterly successes, Karpathy’s decade-long timeline seems almost revolutionary. He isn’t saying progress will stop, just that the necessary leap from simulation to cognition will take much longer than investors, or the companies themselves, expect. True agents will require persistent memory, real-time learning, cross-modal reasoning and safe self-improvement, all of which demand breakthroughs in model architecture, data efficiency and energy cost. And perhaps in the size of the models too, with bigger not always being better.
For AI companies, every new model is pitched as more aligned, more efficient and more multimodal than the previous one, which was itself billed as the greatest yet. Are we just making the best, better? All of it remains locked into a narrow range of competence, because the reality is that these models cannot maintain context or develop behavior the way humans naturally do.
For enterprises, this perspective matters. It suggests that AI should, for now, be deployed as a piece of the puzzle – an amplifier rather than a replacement for anything that requires human creativity, insight and productivity. Automate the basic tasks, and leave it at that. “Autonomous agents” that overpromise risk frustration and distrust, especially when the systems inevitably fail on consistency or nuance. In that sense, Karpathy’s realism is not pessimism. It challenges the industry to focus less on hype and more on the fundamental science: memory systems, continuous learning, interpretability, and integration with real-world perception.
Neural Dispatch is your weekly guide to the rapidly evolving landscape of artificial intelligence. Each edition provides curated insights on the critical technologies, practical applications and strategic implications shaping our digital future.







