<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Artificial Intelligence Made Simple]]></title><description><![CDATA[Covering the important ideas in AI from all angles- technical, social, and economic. Read in over 200 countries.  Useful to everyone who wants to learn AI. Critical to anyone trying to see what happens next. Sister Publication to Tech Made Simple.]]></description><link>https://www.artificialintelligencemadesimple.com</link><image><url>https://substackcdn.com/image/fetch/$s_!Pfon!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png</url><title>Artificial Intelligence Made Simple</title><link>https://www.artificialintelligencemadesimple.com</link></image><generator>Substack</generator><lastBuildDate>Thu, 14 May 2026 11:56:04 GMT</lastBuildDate><atom:link href="https://www.artificialintelligencemadesimple.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Devansh]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[artificialintelligencemadesimple@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[artificialintelligencemadesimple@substack.com]]></itunes:email><itunes:name><![CDATA[Devansh]]></itunes:name></itunes:owner><itunes:author><![CDATA[Devansh]]></itunes:author><googleplay:owner><![CDATA[artificialintelligencemadesimple@substack.com]]></googleplay:owner><googleplay:email><![CDATA[artificialintelligencemadesimple@substack.com]]></googleplay:email><googleplay:author><![CDATA[Devansh]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[How to control your 
AI Outputs (better than Finetuning)]]></title><description><![CDATA[We&#8217;ve been using flat-earth math to navigate warped AI models. Here is the geometric fix.]]></description><link>https://www.artificialintelligencemadesimple.com/p/how-to-control-your-ai-outputs-better</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/how-to-control-your-ai-outputs-better</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Fri, 08 May 2026 09:14:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!BuJ_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. <strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>Fine-tuning is the industry&#8217;s favorite blunt-force instrument. It is expensive, computationally heavy, and&#8202;&#8212;&#8202;more often than not&#8202;&#8212;&#8202;it breaks as much as it fixes. In our collective quest for more efficient model control, <strong>Activation Steering</strong> promised a surgical alternative: an inference-time &#8220;nudge&#8221; that costs nothing and changes everything.</p><p>Yet, in production, steering often feels like fighting a ghost. You push for a specific concept, and the model&#8217;s distribution leaks into generic prepositions and hallucinations. 
You try to steer a model to be &#8220;more professional,&#8221; and suddenly it starts obsessing over the word &#8220;the&#8221; or &#8220;to,&#8221; losing the very nuance you were trying to preserve.</p><p>The problem isn&#8217;t that steering is &#8220;weak&#8221;&#8202;&#8212;&#8202;it&#8217;s that we have been fundamentally miscalculating the &#8220;shape&#8221; of the space our models live in.</p><p>We tend to treat the internal representations of an AI like a flat, simple map where you can just draw a straight line from Point A to Point B. But the moment a model uses <strong>Softmax</strong> to turn raw numbers into a probability distribution, that map warps. Much like mountaineering, you want your AI to account for the curvature of your landscape to get the best outcomes (this was one of the things that made Kimi&#8217;s MuonClip work so well).</p><p>The paper we are breaking down today, &#8220;<em><a href="https://arxiv.org/abs/2602.15293">The Information Geometry of Softmax: Probing and Steering</a></em>&#8221;, introduces a fix called <strong>Dual Steering</strong>: &#8220;<em>We prove that dual steering optimally modifies the target concept while minimizing changes to off-target concepts. 
Empirically, we find that dual steering enhances the controllability and stability of concept manipulation.</em>&#8221;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!guWj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1339ef57-4535-43b6-9418-afb8b13633d8_2088x794.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!guWj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1339ef57-4535-43b6-9418-afb8b13633d8_2088x794.png 424w, https://substackcdn.com/image/fetch/$s_!guWj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1339ef57-4535-43b6-9418-afb8b13633d8_2088x794.png 848w, https://substackcdn.com/image/fetch/$s_!guWj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1339ef57-4535-43b6-9418-afb8b13633d8_2088x794.png 1272w, https://substackcdn.com/image/fetch/$s_!guWj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1339ef57-4535-43b6-9418-afb8b13633d8_2088x794.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!guWj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1339ef57-4535-43b6-9418-afb8b13633d8_2088x794.png" width="1456" height="554" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1339ef57-4535-43b6-9418-afb8b13633d8_2088x794.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:554,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!guWj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1339ef57-4535-43b6-9418-afb8b13633d8_2088x794.png 424w, https://substackcdn.com/image/fetch/$s_!guWj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1339ef57-4535-43b6-9418-afb8b13633d8_2088x794.png 848w, https://substackcdn.com/image/fetch/$s_!guWj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1339ef57-4535-43b6-9418-afb8b13633d8_2088x794.png 1272w, https://substackcdn.com/image/fetch/$s_!guWj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1339ef57-4535-43b6-9418-afb8b13633d8_2088x794.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">Still a cool paper though, make sure you check it out.</figcaption></figure></div><p>However, instead of going deep into their algorithm (which is shared above), I want to use this research as a starting point to talk about some of the grounding concepts in the research around a model&#8217;s information geometry (how it organizes and navigates internal knowledge representations), so that you can go beyond this paper and start understanding the larger space around LLM Geometry and Activation Steering.</p><p>In this article, we&#8217;re going to walk through:</p><ul><li><p><strong>Why Euclidean Math Lies to You:</strong> We&#8217;ll explain why treating a model like a flat grid causes &#8220;probability leakage,&#8221; and why a tiny nudge in the wrong part of the model&#8217;s &#8220;terrain&#8221; can flip the entire output in ways you didn&#8217;t intend.</p></li><li><p><strong>The &#8220;Two-Map&#8221; System of Softmax:</strong> You&#8217;ll learn how models actually use two different coordinate systems at the 
same time. One map is for &#8220;freedom&#8221; (where the raw vectors live), and the other is a &#8220;cage&#8221; (where the probabilities live). If you don&#8217;t know which map you&#8217;re using, you&#8217;ll accidentally crush the very ideas you&#8217;re trying to blend.</p></li><li><p><strong>The Difference Between Movement and Measurement:</strong> We will break down a common &#8220;type error&#8221; in AI research. We often try to &#8220;add&#8221; a measurement tool (like a probe) directly into a representation, which is mathematically as nonsensical as trying to physically add a thermometer to a room.</p></li><li><p><strong>Navigating the &#8220;Exit Nodes&#8221;:</strong> We&#8217;ll look at why this math currently only works at the final layers of a model and where the research needs to go next. This will help you make high-signal judgments on where to invest your developer attention and capital.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BuJ_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BuJ_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!BuJ_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!BuJ_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!BuJ_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BuJ_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BuJ_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!BuJ_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!BuJ_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!BuJ_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff96438e0-87ee-41aa-9362-f6d703c799a2_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>We&#8217;re moving away from bludgeoning models with scale and toward a more precise way of understanding the math of intelligence. 
I hope you&#8217;re as excited about this as I am.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FZmA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2853586-67f9-43d9-90f8-c04701f696a4_500x500.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FZmA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2853586-67f9-43d9-90f8-c04701f696a4_500x500.jpeg 424w, https://substackcdn.com/image/fetch/$s_!FZmA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2853586-67f9-43d9-90f8-c04701f696a4_500x500.jpeg 848w, https://substackcdn.com/image/fetch/$s_!FZmA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2853586-67f9-43d9-90f8-c04701f696a4_500x500.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!FZmA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2853586-67f9-43d9-90f8-c04701f696a4_500x500.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FZmA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2853586-67f9-43d9-90f8-c04701f696a4_500x500.jpeg" width="500" height="500" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f2853586-67f9-43d9-90f8-c04701f696a4_500x500.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FZmA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2853586-67f9-43d9-90f8-c04701f696a4_500x500.jpeg 424w, https://substackcdn.com/image/fetch/$s_!FZmA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2853586-67f9-43d9-90f8-c04701f696a4_500x500.jpeg 848w, https://substackcdn.com/image/fetch/$s_!FZmA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2853586-67f9-43d9-90f8-c04701f696a4_500x500.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!FZmA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2853586-67f9-43d9-90f8-c04701f696a4_500x500.jpeg 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><h3>Executive Highlights (TL;DR of the Article)</h3><p>Our current failure to make <strong>Activation Steering</strong>&#8202;&#8212;&#8202;an efficient, inference-time intervention&#8202;&#8212;&#8202;as effective as expensive <strong>Fine-Tuning</strong> stems from a fundamental geometric &#8220;type error.&#8221; We are treating AI representation space as flat (Euclidean) when it is actually warped (Bregman).</p><h4>The Euclidean Illusion vs. Bregman Reality</h4><ul><li><p><strong>The Problem:</strong> Standard steering assumes adding a vector moves a concept in a straight line without distortion. In practice, this &#8220;naive&#8221; addition causes <strong>probability leaks</strong>. Steering a model toward a specific verb might accidentally spike the probability of a random preposition like &#8220;to,&#8221; degrading the model&#8217;s overall intelligence.</p></li><li><p><strong>The Geometry of Softmax:</strong> Once a representation passes through a <strong>Softmax</strong> operation, Euclidean rules break. 
Softmax creates a <strong>Bregman geometry</strong> governed by the <strong>log-partition function A(lambda)</strong>. In this space, distance isn&#8217;t measured along a straight line; it is measured by <strong>KL Divergence</strong>.</p></li><li><p><strong>The &#8220;Type Error&#8221;:</strong> Researchers often treat the <strong>Linear Probe</strong> (a measurement tool/covector) as a <strong>Vector</strong> (a displacement). Adding a probe directly to a representation in the residual stream is mathematically akin to trying to &#8220;add a thermometer to a room.&#8221;</p></li></ul><h4>Dual Steering: A Geometric Fix</h4><p><strong>Two Coordinate Systems:</strong> Bregman geometry necessitates two coordinate systems: the <strong>Primal (lambda)</strong>, which is the raw, unbounded vector in the residual stream, and the <strong>Dual (phi)</strong>, which is the probability-weighted &#8220;center of mass&#8221; of the model&#8217;s vocabulary.</p><ul><li><p><strong>Primal Interpolation</strong> acts like a <strong>Logical AND</strong>, crushing unique traits to find a &#8220;safe,&#8221; generic consensus (often resulting in bland outputs).</p></li><li><p><strong>Dual Interpolation</strong> acts like a <strong>Logical OR</strong>, preserving the union of concepts and allowing for distinct mixtures without collapsing into prepositions.</p></li></ul><p><strong>The Solution:</strong> To steer effectively, one must map the primal vector to the <strong>dual coordinate</strong>, add the probe there, and map the result back to the primal. This <strong>Dual Steering</strong> ensures a &#8220;KL Projection&#8221;&#8202;&#8212;&#8202;changing the target concept while minimizing the shift of the rest of the distribution.</p><p><strong>The Bottleneck:</strong> The current math for dual steering only applies to <strong>exit nodes</strong> (where Softmax occurs, like final-token distributions or CLIP retrievals). 
It does not yet exist for the <strong>intermediate layers</strong> where most surgical steering is actually performed. This is a huge problem and will have to be addressed. <strong>That&#8217;s why we treat this work more as a jumping-off point into the larger space, rather than focusing most of our attention on the algorithm itself.</strong></p><p><strong>Final Takeaway:</strong> We should stop trying to &#8220;bludgeon&#8221; models into submission with compute-heavy fine-tuning and start mastering the <strong>information geometry</strong> of the models themselves. Unlocking the math of how models represent knowledge is the path to efficient, precise control.</p><p>Things you might find interesting:</p><ol><li><p><a href="https://github.com/dl1683/Latent-Space-Reasoning/tree/main">Our research on Latent Space Reasoning, which allows us to unlock new capabilities not present in the base models, without any training</a>.</p></li><li><p><a href="https://github.com/dl1683/moonshot-fractal-embeddings">Our work on Fractal embeddings, which allows us to integrate a sense of hierarchy directly into embeddings, allowing for lossless architectural depth hierarchical classification</a>.</p></li></ol><p>Both streams of research are promising pointers to how exploring geometry can unlock low-cost, powerful solutions that improve the AI landscape.</p><p>PS: Personal update. A good friend of mine is hosting this event. 
Y&#8217;all might find it interesting to go to (I&#8217;ll be there as well, lmk if you want to come say hi).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iDyJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e127aea-34d1-4a6a-a0d2-3568f20576e2_784x840.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iDyJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e127aea-34d1-4a6a-a0d2-3568f20576e2_784x840.png 424w, https://substackcdn.com/image/fetch/$s_!iDyJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e127aea-34d1-4a6a-a0d2-3568f20576e2_784x840.png 848w, https://substackcdn.com/image/fetch/$s_!iDyJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e127aea-34d1-4a6a-a0d2-3568f20576e2_784x840.png 1272w, https://substackcdn.com/image/fetch/$s_!iDyJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e127aea-34d1-4a6a-a0d2-3568f20576e2_784x840.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iDyJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e127aea-34d1-4a6a-a0d2-3568f20576e2_784x840.png" width="784" height="840" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7e127aea-34d1-4a6a-a0d2-3568f20576e2_784x840.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:840,&quot;width&quot;:784,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iDyJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e127aea-34d1-4a6a-a0d2-3568f20576e2_784x840.png 424w, https://substackcdn.com/image/fetch/$s_!iDyJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e127aea-34d1-4a6a-a0d2-3568f20576e2_784x840.png 848w, https://substackcdn.com/image/fetch/$s_!iDyJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e127aea-34d1-4a6a-a0d2-3568f20576e2_784x840.png 1272w, https://substackcdn.com/image/fetch/$s_!iDyJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e127aea-34d1-4a6a-a0d2-3568f20576e2_784x840.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>Website: <a href="https://silsilasounds.org/">silsilasounds.org</a> | IG: <a href="https://www.instagram.com/silsilasounds/">@silsilasounds</a></p><p>Hope I&#8217;ll see you here.</p><p><em>I put a lot of work into writing this newsletter. To do so, I rely on you for support. If a few more people choose to become paid subscribers, the Chocolate Milk Cult can continue to provide high-quality and accessible education and opportunities to anyone who needs it. If you think this mission is worth contributing to, please consider a premium subscription. 
You can do so for less than the cost of a Netflix Subscription <a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">(pay what you want here)</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1B2o!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30de1cdf-52ef-4765-b39c-3b86a05792be_952x252.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1B2o!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30de1cdf-52ef-4765-b39c-3b86a05792be_952x252.png 424w, https://substackcdn.com/image/fetch/$s_!1B2o!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30de1cdf-52ef-4765-b39c-3b86a05792be_952x252.png 848w, https://substackcdn.com/image/fetch/$s_!1B2o!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30de1cdf-52ef-4765-b39c-3b86a05792be_952x252.png 1272w, https://substackcdn.com/image/fetch/$s_!1B2o!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30de1cdf-52ef-4765-b39c-3b86a05792be_952x252.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!1B2o!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30de1cdf-52ef-4765-b39c-3b86a05792be_952x252.png" width="952" height="252" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/30de1cdf-52ef-4765-b39c-3b86a05792be_952x252.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:252,&quot;width&quot;:952,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1B2o!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30de1cdf-52ef-4765-b39c-3b86a05792be_952x252.png 424w, https://substackcdn.com/image/fetch/$s_!1B2o!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30de1cdf-52ef-4765-b39c-3b86a05792be_952x252.png 848w, https://substackcdn.com/image/fetch/$s_!1B2o!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30de1cdf-52ef-4765-b39c-3b86a05792be_952x252.png 1272w, https://substackcdn.com/image/fetch/$s_!1B2o!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30de1cdf-52ef-4765-b39c-3b86a05792be_952x252.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>I provide various consulting and advisory services. If you&#8217;d like to explore how we can work together, <a href="https://linktr.ee/iseethings404">reach out to me through any of my socials over here</a> or reply to this email.</em></p><h3>What Does Activation Steering Actually Do to a Model&#8217;s Output Distribution?</h3><p>Generally, when we want to impose a behavioral change in a model, we tend to rely on fine-tuning. 
When it works (and it never does), fine-tuning a 70B-parameter model requires dataset curation, regression testing, and hundreds of GPU hours. That is thousands of dollars per behavioral tweak, at the end of which you&#8217;re hoping that your expensive lil trip hasn&#8217;t broken something else in the model&#8217;s abilities (which it generally does).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PrC5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3190ae3-892b-464b-a4ab-cfd743107b17_1772x1762.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PrC5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3190ae3-892b-464b-a4ab-cfd743107b17_1772x1762.png 424w, https://substackcdn.com/image/fetch/$s_!PrC5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3190ae3-892b-464b-a4ab-cfd743107b17_1772x1762.png 848w, https://substackcdn.com/image/fetch/$s_!PrC5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3190ae3-892b-464b-a4ab-cfd743107b17_1772x1762.png 1272w, https://substackcdn.com/image/fetch/$s_!PrC5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3190ae3-892b-464b-a4ab-cfd743107b17_1772x1762.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PrC5!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3190ae3-892b-464b-a4ab-cfd743107b17_1772x1762.png" width="1200" height="1193.4065934065934" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e3190ae3-892b-464b-a4ab-cfd743107b17_1772x1762.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1448,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PrC5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3190ae3-892b-464b-a4ab-cfd743107b17_1772x1762.png 424w, https://substackcdn.com/image/fetch/$s_!PrC5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3190ae3-892b-464b-a4ab-cfd743107b17_1772x1762.png 848w, https://substackcdn.com/image/fetch/$s_!PrC5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3190ae3-892b-464b-a4ab-cfd743107b17_1772x1762.png 1272w, https://substackcdn.com/image/fetch/$s_!PrC5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3190ae3-892b-464b-a4ab-cfd743107b17_1772x1762.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/how-to-teach-llms-to-reason-for-50">I will never not slander Fine Tuning.</a></figcaption></figure></div><p>Activation steering, on the other hand, costs nothing. It is an inference-time intervention&#8202;&#8212;&#8202;a single vector addition during the forward pass. So why don&#8217;t we do it everywhere? It hasn&#8217;t had the kind of results we were expecting from it. 
The reason might lie in how we have been thinking about the space our vectors occupy.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!26AM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb609b9-76bd-4087-b3d8-5f6c19721651_1848x1032.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!26AM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb609b9-76bd-4087-b3d8-5f6c19721651_1848x1032.png 424w, https://substackcdn.com/image/fetch/$s_!26AM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb609b9-76bd-4087-b3d8-5f6c19721651_1848x1032.png 848w, https://substackcdn.com/image/fetch/$s_!26AM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb609b9-76bd-4087-b3d8-5f6c19721651_1848x1032.png 1272w, https://substackcdn.com/image/fetch/$s_!26AM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb609b9-76bd-4087-b3d8-5f6c19721651_1848x1032.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!26AM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb609b9-76bd-4087-b3d8-5f6c19721651_1848x1032.png" width="1456" height="813" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2fb609b9-76bd-4087-b3d8-5f6c19721651_1848x1032.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!26AM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb609b9-76bd-4087-b3d8-5f6c19721651_1848x1032.png 424w, https://substackcdn.com/image/fetch/$s_!26AM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb609b9-76bd-4087-b3d8-5f6c19721651_1848x1032.png 848w, https://substackcdn.com/image/fetch/$s_!26AM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb609b9-76bd-4087-b3d8-5f6c19721651_1848x1032.png 1272w, https://substackcdn.com/image/fetch/$s_!26AM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb609b9-76bd-4087-b3d8-5f6c19721651_1848x1032.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The standard approach to steering assumes the model&#8217;s representation space is Euclidean&#8202;&#8212;<strong>&#8202;a flat mathematical environment where adding a vector moves a concept in a straight line without distorting the rest of the distribution.</strong> Here, you find the vector direction for a concept and add it. Let&#8217;s look at an example:</p><ul><li><p>Feed Gemma &#8220;Under Al-Teta, Arsenal play&#8230;&#8221;.</p></li><li><p>The model predicts base verbs like &#8220;haramball&#8221; or &#8220;set piece ball.&#8221;</p></li><li><p>To steer it toward positive verbs like &#8220;exciting football,&#8221; you calculate the &#8220;exciting games&#8221; vector, magnify it (you have to gaslight the model a lot for this example) and add it to the active representation.</p></li></ul><p>The naive mental model assumes probability mass shifts cleanly from the base to the exciting concept. In production, the probability leaks. 
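</p><p>Here is a minimal sketch of that leak in plain Python. Everything in it is an assumption made up for illustration: the four-token vocabulary, the toy unembedding matrix <code>W</code>, the hidden state <code>h</code>, and the choice of steering vector <code>v</code> (target row minus base row) are hand-picked so the effect is visible. None of these numbers come from Gemma or from the paper.</p>

```python
import math


def softmax(logits):
    """Turn a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 4-token vocabulary and toy unembedding rows (one row per token).
TOKENS = ["maintain", "maintains", "to", "OTHER"]
W = [
    [2.0, 0.0, 0.0],  # base token
    [0.0, 2.0, 0.0],  # target token
    [0.0, 1.5, 1.0],  # off-target token whose row overlaps the target direction
    [0.0, 0.0, 0.0],  # everything else
]
h = [1.5, 0.0, 0.5]  # made-up hidden state that favors the base token

# Assumed steering direction: target row minus base row.
v = [t - b for t, b in zip(W[1], W[0])]


def steer(alpha):
    """Euclidean steering: add alpha * v to h, then unembed and softmax."""
    steered = [hi + alpha * vi for hi, vi in zip(h, v)]
    logits = [sum(w * s for w, s in zip(row, steered)) for row in W]
    return softmax(logits)


def top_token(alpha):
    """Return the most likely token at a given steering strength."""
    probs = steer(alpha)
    return TOKENS[probs.index(max(probs))]


for alpha in (0.0, 0.4, 1.0, 2.0):
    probs = steer(alpha)
    summary = ", ".join(f"{tok}={p:.2f}" for tok, p in zip(TOKENS, probs))
    print(f"alpha={alpha}: top={top_token(alpha)} | {summary}")
```

<p>In this toy, the base token wins at strength 0, the target wins at strength 2, and at strength 0.4 the off-target token &#8220;to&#8221; takes the top spot. A real vocabulary has far more tokens for the mass to leak into.</p><p>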
At intermediate steering strengths, the preposition &#8220;to&#8221; might suddenly pull more mass than any verb in the distribution. You ask for a conjugation and the model might hallucinate a preposition. This is not unique to Language Models; we observe the same failure in vision models. Steer MetaCLIP-2 away from &#8220;cat&#8221; toward &#8220;dog,&#8221; and the top retrieval becomes an image containing both a cat and a dog. The target concept moves, but unrelated concepts get dragged with it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!huz7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F499d3ed0-8e99-459a-93bb-ad46ddddff7d_2156x1164.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!huz7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F499d3ed0-8e99-459a-93bb-ad46ddddff7d_2156x1164.png 424w, https://substackcdn.com/image/fetch/$s_!huz7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F499d3ed0-8e99-459a-93bb-ad46ddddff7d_2156x1164.png 848w, https://substackcdn.com/image/fetch/$s_!huz7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F499d3ed0-8e99-459a-93bb-ad46ddddff7d_2156x1164.png 1272w, https://substackcdn.com/image/fetch/$s_!huz7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F499d3ed0-8e99-459a-93bb-ad46ddddff7d_2156x1164.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!huz7!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F499d3ed0-8e99-459a-93bb-ad46ddddff7d_2156x1164.png" width="1200" height="647.8021978021978" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/499d3ed0-8e99-459a-93bb-ad46ddddff7d_2156x1164.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:786,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!huz7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F499d3ed0-8e99-459a-93bb-ad46ddddff7d_2156x1164.png 424w, https://substackcdn.com/image/fetch/$s_!huz7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F499d3ed0-8e99-459a-93bb-ad46ddddff7d_2156x1164.png 848w, https://substackcdn.com/image/fetch/$s_!huz7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F499d3ed0-8e99-459a-93bb-ad46ddddff7d_2156x1164.png 1272w, https://substackcdn.com/image/fetch/$s_!huz7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F499d3ed0-8e99-459a-93bb-ad46ddddff7d_2156x1164.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">&#8220;Token probability changes in Gemma-3&#8211;4B when steering the context &#8220;Author gives an insight into what it costs US taxpayers to build and&#8221; using a linear probe for verb &#8658; third-person. <strong>Euclidean steering leaks significant mass to off-target tokens (e.g., &#8220;to&#8221;) during intermediate steps, whereas dual steering directly shifts probability from base tokens (e.g., &#8220;maintain&#8221;, &#8220;operate&#8221;) to target tokens (e.g., &#8220;maintains&#8221;, &#8220;operates&#8221;). Center &amp; Right: Steering MetaCLIP-2 on the context &#8220;a photo of one cat&#8221; for the concept cat &#8658; dog. 
Dual steering transfers probability from base images (e.g., &#8220;cat&#8221;, &#8220;cat + bicycle&#8221;) directly to targets (e.g., &#8220;dog&#8221;, &#8220;dog + bicycle&#8221;). In contrast, Euclidean steering unintentionally promotes the off-target &#8220;cat + dog&#8221; image (green frame in the right column), which becomes the Top-1 result during intermediate steps. In the probability plots, Top-k tokens (LLM) or images (CLIP) are shown explicitly, with the remainder grouped as &#8220;others.&#8221;</strong>&#8221;</figcaption></figure></div><p>This explains why steering consistently loses to fine-tuning in head-to-head benchmarks. Subspace patching creates illusions of control while actual model behavior degrades.</p><p>To reiterate, since this is an important point, standard steering commits a type error: Once a representation passes through a softmax operation, it no longer exists in a Euclidean space. Softmax creates a Bregman geometry&#8202;&#8212;&#8202;a space that is mathematically flat, but requires two different coordinate systems to map distances correctly. 
Standard steering collapses these systems into one.</p><p>The research we&#8217;re going to look at derives a fix they call dual steering, and proves that under a clean probe and a concept-factorization assumption, dual steering is the KL projection that changes the target concept while minimizing off-target distribution shift.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Z7Us!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede08563-10b3-4406-91dc-6568b928d8f6_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Z7Us!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede08563-10b3-4406-91dc-6568b928d8f6_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!Z7Us!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede08563-10b3-4406-91dc-6568b928d8f6_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!Z7Us!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede08563-10b3-4406-91dc-6568b928d8f6_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!Z7Us!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede08563-10b3-4406-91dc-6568b928d8f6_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z7Us!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede08563-10b3-4406-91dc-6568b928d8f6_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ede08563-10b3-4406-91dc-6568b928d8f6_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Z7Us!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede08563-10b3-4406-91dc-6568b928d8f6_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!Z7Us!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede08563-10b3-4406-91dc-6568b928d8f6_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!Z7Us!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede08563-10b3-4406-91dc-6568b928d8f6_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!Z7Us!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede08563-10b3-4406-91dc-6568b928d8f6_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The exact geometry derived in the paper only applies to representations that directly parameterize a softmax distribution. <em><strong>This covers final-token distributions, CLIP retrievals, and attention layers. It does NOT cover arbitrary intermediate layers deep inside the network.</strong></em> Most production steering targets those intermediate layers to intercept concepts before they propagate. The math for Bregman geometry at intermediate layers does not yet exist. So, as we get into this research, consider this more the start of an interesting discussion as opposed to the final say. 
My hope is that by surfacing this research (and other work like it), we can encourage more contributors in our open-source community to start exploring the geometry and math of AI, instead of looking purely at the standard axes of scale, tweaking, and model tuning.</p><p>To explore this research in depth, we must first ask ourselves a very fundamental question&#8202;&#8212;&#8202;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5cTs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a745ed-bcb2-4fce-b753-b6ef5e199dc0_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5cTs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a745ed-bcb2-4fce-b753-b6ef5e199dc0_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!5cTs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a745ed-bcb2-4fce-b753-b6ef5e199dc0_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!5cTs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a745ed-bcb2-4fce-b753-b6ef5e199dc0_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!5cTs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a745ed-bcb2-4fce-b753-b6ef5e199dc0_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5cTs!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a745ed-bcb2-4fce-b753-b6ef5e199dc0_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/31a745ed-bcb2-4fce-b753-b6ef5e199dc0_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5cTs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a745ed-bcb2-4fce-b753-b6ef5e199dc0_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!5cTs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a745ed-bcb2-4fce-b753-b6ef5e199dc0_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!5cTs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a745ed-bcb2-4fce-b753-b6ef5e199dc0_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!5cTs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a745ed-bcb2-4fce-b753-b6ef5e199dc0_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><h3>How Does a Representation Become a Distribution, and What Should &#8220;Close&#8221; Mean?</h3><p>When we want to know if two representations are &#8220;close,&#8221; we naturally measure the straight-line distance between their vectors. Why? Because <code>torch.dist()</code> is easy to type, and we like to pretend AI happens in a clean, flat space (the Euclidean assumption we just talked about).</p><p>But as we saw in the last section, Euclidean distance lies to you. A microscopic nudge near a decision boundary can flip the entire output, while a massive shove when the model is 99% confident does almost nothing.</p><p>The model doesn&#8217;t care about the geometric distance. It cares about the output distribution. 
To define &#8220;close&#8221; correctly, we have to look under the hood at how a vector actually becomes a distribution.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2vav!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b09676-c7e5-45a0-9f11-21f65b010e5b_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2vav!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b09676-c7e5-45a0-9f11-21f65b010e5b_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!2vav!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b09676-c7e5-45a0-9f11-21f65b010e5b_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!2vav!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b09676-c7e5-45a0-9f11-21f65b010e5b_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!2vav!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b09676-c7e5-45a0-9f11-21f65b010e5b_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2vav!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b09676-c7e5-45a0-9f11-21f65b010e5b_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52b09676-c7e5-45a0-9f11-21f65b010e5b_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2vav!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b09676-c7e5-45a0-9f11-21f65b010e5b_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!2vav!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b09676-c7e5-45a0-9f11-21f65b010e5b_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!2vav!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b09676-c7e5-45a0-9f11-21f65b010e5b_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!2vav!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b09676-c7e5-45a0-9f11-21f65b010e5b_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">We&#8217;ll explain all these symbols below.</figcaption></figure></div><p>The model holds an active representation vector for the current context&#8202;&#8212;&#8202;let&#8217;s call it lambda (personally, I don&#8217;t love the use of Greek letters, but I&#8217;m keeping it here since most of the literature uses them). It also holds a massive lookup table of &#8220;unembedding&#8221; vectors, one for every possible token y in its vocabulary&#8202;&#8212;&#8202;let&#8217;s call those gamma_y. To score a specific token, the model calculates the dot product of lambda and gamma_y.</p><p>Why a dot product? Because a dot product is fundamentally a measure of directional alignment (scaled by the lengths of the two vectors). It asks: &#8220;How much does our current context vector point in the exact same direction as the &#8216;exciting football&#8217; vector?&#8221; High alignment means a high raw score.</p><p>But these raw scores (logits) aren&#8217;t probabilities. 
To get probabilities, the model shoves them through a Softmax function. Softmax does two things:</p><ul><li><p>It exponentiates the scores.</p></li><li><p>Then it divides by the sum of all the exponentiated scores so everything sums to 100%.</p></li></ul><p>Exponentiation is a bloodbath. Let&#8217;s say &#8220;haramball&#8221; scores a 10, &#8220;set piece&#8221; scores an 8, and &#8220;exciting&#8221; scores a 3. In raw score space, 3 is behind 10, but it&#8217;s in the same zip code. Once you exponentiate them (e&#185;&#8304; vs e&#179;), &#8220;haramball&#8221; shoots to roughly 22,000. &#8220;Set piece&#8221; lands near 3,000. &#8220;Exciting&#8221; is sitting at 20. Divide by the total (about 25,000), and &#8220;haramball&#8221; owns 88% of the probability mass. &#8220;Exciting&#8221; gets 0.08%. Softmax takes a mild preference and turns it into a blowout.</p><p>This extreme sharpness is exactly where standard mathematical tools break down&#8202;&#8212;&#8202;specifically, the covariance matrix.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!b9Ph!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e9f3b30-e638-4894-8cd3-cabf0e594f5f_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!b9Ph!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e9f3b30-e638-4894-8cd3-cabf0e594f5f_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!b9Ph!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e9f3b30-e638-4894-8cd3-cabf0e594f5f_1672x941.png 848w, 
https://substackcdn.com/image/fetch/$s_!b9Ph!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e9f3b30-e638-4894-8cd3-cabf0e594f5f_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!b9Ph!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e9f3b30-e638-4894-8cd3-cabf0e594f5f_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!b9Ph!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e9f3b30-e638-4894-8cd3-cabf0e594f5f_1672x941.png" width="1200" height="675" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7e9f3b30-e638-4894-8cd3-cabf0e594f5f_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!b9Ph!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e9f3b30-e638-4894-8cd3-cabf0e594f5f_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!b9Ph!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e9f3b30-e638-4894-8cd3-cabf0e594f5f_1672x941.png 848w, 
https://substackcdn.com/image/fetch/$s_!b9Ph!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e9f3b30-e638-4894-8cd3-cabf0e594f5f_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!b9Ph!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e9f3b30-e638-4894-8cd3-cabf0e594f5f_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">A covariance matrix is just a grid that tracks how different variables change together. 
In our case, if I tweak the lambda vector, how do the probabilities of all 128,000 tokens shift relative to each other? You need this matrix to map the space. But because Softmax just pushed 99.9% of your vocabulary to effectively zero, those dead tokens contribute almost nothing to the variance. The covariance matrix becomes (numerically) &#8220;rank-deficient&#8221;&#8202;&#8212;&#8202;a mathematical dead end where most of the dimensions carry almost no information. <strong>This is a problem because you cannot navigate using a map where most of the coordinates have collapsed.</strong></figcaption></figure></div><p>All of this chaos is controlled by the denominator in that Softmax equation&#8202;&#8212;&#8202;the normalizer that divides everything. Because we usually work in log-space to keep the exponentials from overflowing or underflowing, we take the log of that massive sum. <strong>This term is so fundamental it gets its own name: the log-partition function, written as A(lambda).</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nm4d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2033d7c0-3d8c-482b-a861-b0aca5b2fcbe_2400x1340.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nm4d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2033d7c0-3d8c-482b-a861-b0aca5b2fcbe_2400x1340.png 424w, https://substackcdn.com/image/fetch/$s_!nm4d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2033d7c0-3d8c-482b-a861-b0aca5b2fcbe_2400x1340.png 848w, 
https://substackcdn.com/image/fetch/$s_!nm4d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2033d7c0-3d8c-482b-a861-b0aca5b2fcbe_2400x1340.png 1272w, https://substackcdn.com/image/fetch/$s_!nm4d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2033d7c0-3d8c-482b-a861-b0aca5b2fcbe_2400x1340.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nm4d!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2033d7c0-3d8c-482b-a861-b0aca5b2fcbe_2400x1340.png" width="1200" height="670.054945054945" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2033d7c0-3d8c-482b-a861-b0aca5b2fcbe_2400x1340.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nm4d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2033d7c0-3d8c-482b-a861-b0aca5b2fcbe_2400x1340.png 424w, https://substackcdn.com/image/fetch/$s_!nm4d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2033d7c0-3d8c-482b-a861-b0aca5b2fcbe_2400x1340.png 848w, 
https://substackcdn.com/image/fetch/$s_!nm4d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2033d7c0-3d8c-482b-a861-b0aca5b2fcbe_2400x1340.png 1272w, https://substackcdn.com/image/fetch/$s_!nm4d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2033d7c0-3d8c-482b-a861-b0aca5b2fcbe_2400x1340.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!IkCF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d86bb8-00e6-45ad-80f7-64bf86a05992_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IkCF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d86bb8-00e6-45ad-80f7-64bf86a05992_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!IkCF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d86bb8-00e6-45ad-80f7-64bf86a05992_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!IkCF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d86bb8-00e6-45ad-80f7-64bf86a05992_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!IkCF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d86bb8-00e6-45ad-80f7-64bf86a05992_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IkCF!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d86bb8-00e6-45ad-80f7-64bf86a05992_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/38d86bb8-00e6-45ad-80f7-64bf86a05992_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IkCF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d86bb8-00e6-45ad-80f7-64bf86a05992_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!IkCF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d86bb8-00e6-45ad-80f7-64bf86a05992_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!IkCF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d86bb8-00e6-45ad-80f7-64bf86a05992_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!IkCF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d86bb8-00e6-45ad-80f7-64bf86a05992_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>The actual log-probability of a token is just its raw score minus this A(lambda) function; exponentiate that difference and you get the probability. A handles all the normalization. This creates a very interesting outcome: the geometry of this space, the duality we need, and the steering fix we are about to build&#8202;&#8212;&#8202;all of it is hiding inside the derivatives of A(lambda).</p><p>To see why, we need a way to measure how different two softmax distributions are. And that leads us to our next point of exploration&#8230;</p><h3>Why Does KL Divergence Measure the Change We Actually Care About?</h3><p>We just established that the log-partition function <code>A(lambda)</code> controls the shape of our representation space. 
But before we can use it to fix our steering vectors, we have to solve a more immediate problem: how do we measure distance on this new terrain?</p><p>If Euclidean distance is a lie (yet another reason to not trust the Greeks), if it treats a massive shove at 99% confidence exactly the same as a tiny nudge at a decision boundary, then what is the truth? We need a function that takes two probability distributions, compares them, and returns a single number representing how different they actually are in practice.</p><p>Our hero comes from information theory: Kullback-Leibler (KL) divergence. If the true distribution is P, but your model assumes the distribution is Q, KL(P || Q) is the exact mathematical cost of that error. It calculates how much extra surprise you rack up, on average, when reality actually happens.</p><p>The formula sums one term for each possible outcome x: <code>P(x) * log(P(x) / Q(x))</code>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bP50!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce44d0a6-ce91-4f1b-9e71-dfe3ae9d868f_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bP50!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce44d0a6-ce91-4f1b-9e71-dfe3ae9d868f_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!bP50!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce44d0a6-ce91-4f1b-9e71-dfe3ae9d868f_1672x941.png 848w, 
https://substackcdn.com/image/fetch/$s_!bP50!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce44d0a6-ce91-4f1b-9e71-dfe3ae9d868f_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!bP50!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce44d0a6-ce91-4f1b-9e71-dfe3ae9d868f_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bP50!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce44d0a6-ce91-4f1b-9e71-dfe3ae9d868f_1672x941.png" width="1200" height="675" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ce44d0a6-ce91-4f1b-9e71-dfe3ae9d868f_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bP50!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce44d0a6-ce91-4f1b-9e71-dfe3ae9d868f_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!bP50!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce44d0a6-ce91-4f1b-9e71-dfe3ae9d868f_1672x941.png 848w, 
https://substackcdn.com/image/fetch/$s_!bP50!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce44d0a6-ce91-4f1b-9e71-dfe3ae9d868f_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!bP50!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce44d0a6-ce91-4f1b-9e71-dfe3ae9d868f_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>There are three moving parts here, and they each do a specific job to keep the math grounded in reality:</p><p><strong>1. 
The Ratio: P(x) / Q(x)</strong> For any token x, how much more likely is it under the true distribution (P) than under your steered distribution (Q)? If they agree, the ratio is 1. If P thinks the token is a sure thing and Q thinks it&#8217;s impossible, the ratio explodes.</p><p><strong>2. The Logarithm</strong> Log converts multiplicative ratios into additive scores. A ratio of 1 (total agreement) maps to exactly zero. But more importantly, log makes KL sensitive to <em>proportional</em> changes, not absolute ones. <em>A token dropping from 50% to 40% is only a 1.25x change. A token dropping from 0.1% to 0.01% is a 10x change. Log mathematically enforces the rule that relative probability determines model behavior.</em></p><p><strong>3. The Weighting: P(x)</strong> Each term is multiplied by P(x) before summing. This is the &#8220;Do I actually care?&#8221; filter. KL only punishes disagreements where the true distribution P actually puts probability mass. If P thinks a token is garbage (P(x) is near 0), that term vanishes. What about Q&#8217;s thoughts? Respectfully, who gives a fuck what a grunt like Q thinks when a baller like P has already made up its mind.</p><p>Putting everything together, KL(P || Q) asks: &#8220;If P is the absolute truth, how surprised would you be if you had to navigate the world using Q?&#8221;</p><p>This weighting creates a profound asymmetry. KL(P || Q) does not equal KL(Q || P). Being wrong about P when Q is true costs a different amount than being wrong about Q when P is true. In Euclidean space, the distance from New York to London is the same as London to New York. In information space, the penalty depends entirely on which distribution is actually generating your data. 
This asymmetry is exactly what causes forward and reverse KL to produce entirely different behaviors (which becomes the crucial AND-vs-OR distinction when we talk about interpolation later).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3Y78!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe73873-ac6b-4ab4-800e-7d6b3af6c3fe_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3Y78!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe73873-ac6b-4ab4-800e-7d6b3af6c3fe_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!3Y78!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe73873-ac6b-4ab4-800e-7d6b3af6c3fe_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!3Y78!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe73873-ac6b-4ab4-800e-7d6b3af6c3fe_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!3Y78!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe73873-ac6b-4ab4-800e-7d6b3af6c3fe_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3Y78!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe73873-ac6b-4ab4-800e-7d6b3af6c3fe_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bbe73873-ac6b-4ab4-800e-7d6b3af6c3fe_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3Y78!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe73873-ac6b-4ab4-800e-7d6b3af6c3fe_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!3Y78!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe73873-ac6b-4ab4-800e-7d6b3af6c3fe_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!3Y78!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe73873-ac6b-4ab4-800e-7d6b3af6c3fe_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!3Y78!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe73873-ac6b-4ab4-800e-7d6b3af6c3fe_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Why does this matter? Let&#8217;s dig into this math just a wee bit more. 
If we do this step-by-step, the geometry of the entire model falls out of the algebra.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IMjk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd494cc4-ebac-47bd-9013-292e9dc2fd42_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IMjk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd494cc4-ebac-47bd-9013-292e9dc2fd42_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!IMjk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd494cc4-ebac-47bd-9013-292e9dc2fd42_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!IMjk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd494cc4-ebac-47bd-9013-292e9dc2fd42_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!IMjk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd494cc4-ebac-47bd-9013-292e9dc2fd42_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IMjk!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd494cc4-ebac-47bd-9013-292e9dc2fd42_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fd494cc4-ebac-47bd-9013-292e9dc2fd42_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IMjk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd494cc4-ebac-47bd-9013-292e9dc2fd42_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!IMjk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd494cc4-ebac-47bd-9013-292e9dc2fd42_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!IMjk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd494cc4-ebac-47bd-9013-292e9dc2fd42_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!IMjk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd494cc4-ebac-47bd-9013-292e9dc2fd42_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">This will be a very intense section so here is an image to act as an overall map</figcaption></figure></div><h3>What Happens When You Compute KL Between Two Softmax Distributions?</h3><p>First, let&#8217;s put the pieces back on the board so we don&#8217;t lose track of what we are building:</p><ul><li><p><strong>lambda</strong> is our original, unsteered context vector.</p></li><li><p><strong>lambda-prime</strong> is the steered vector (after we add our behavioral tweak).</p></li><li><p><strong>gamma_y</strong> is the unembedding vector for a specific token (the dictionary definition the model checks against).</p></li><li><p><strong>A(lambda)</strong> is the log-partition function&#8202;&#8212;&#8202;the brutal normalizer that forces everything to sum to 100%.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!tim6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a0feb89-361a-48d0-acb9-63434874f0ce_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tim6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a0feb89-361a-48d0-acb9-63434874f0ce_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!tim6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a0feb89-361a-48d0-acb9-63434874f0ce_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!tim6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a0feb89-361a-48d0-acb9-63434874f0ce_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!tim6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a0feb89-361a-48d0-acb9-63434874f0ce_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tim6!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a0feb89-361a-48d0-acb9-63434874f0ce_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8a0feb89-361a-48d0-acb9-63434874f0ce_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tim6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a0feb89-361a-48d0-acb9-63434874f0ce_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!tim6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a0feb89-361a-48d0-acb9-63434874f0ce_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!tim6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a0feb89-361a-48d0-acb9-63434874f0ce_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!tim6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a0feb89-361a-48d0-acb9-63434874f0ce_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The KL formula asks us to compute a ratio for every token: <code>log [ P(y | lambda) / P(y | lambda-prime) ]</code>.</p><blockquote><p><em>In plain English: take the logarithm of the true probability divided by the steered probability.</em></p></blockquote><p>How do we calculate those probabilities? Remember from the previous section: the probability of a token is just <code>exp(score - normalizer)</code>.</p><p>Because we are taking the logarithm of an exponentiated number, the math simplifies beautifully. <strong>The log simply deletes the </strong><code>exp</code><strong>, leaving only the raw terms inside. Division inside a logarithm becomes subtraction outside.</strong></p><p>So, taking the log of that probability ratio strips the math down to just the raw scores and the normalizers. 
For the top part of the fraction (the original state), we get: <code>lambda * gamma_y - A(lambda)</code></p><p>For the bottom part (the steered state), we subtract it: <code>minus [ lambda-prime * gamma_y - A(lambda-prime) ]</code></p><blockquote><p><em>If you group the similar terms together, the log ratio becomes a clean, linear expression: </em><code>(lambda - lambda-prime) * gamma_y + A(lambda-prime) - A(lambda)</code></p></blockquote><p>Now, the final step of the KL divergence formula tells us to multiply that result by the true probability <code>P(y | lambda)</code> and sum it up over every token in the vocabulary.</p><p>When you do that, the two normalizer terms pass straight through the sum, because they do not depend on the token and the probabilities add up to 1. But something fascinating happens to that <code>gamma_y</code> term. You end up calculating the sum of <code>P(y | lambda) * gamma_y</code>.</p><p>Stop and think about what that is physically. You are taking every single token&#8217;s unembedding vector, weighting it by how likely that token is to be generated, and averaging them all together. It is the probability-weighted center of gravity for the model&#8217;s current state.</p><p>As it turns out (and we will prove exactly why in the next section), this center of gravity is exactly the mathematical gradient of our log-partition function A. 
Let&#8217;s just call it <code>grad-A(lambda)</code>.</p><p>If we substitute that gradient back into our equation, the final KL formula reveals itself:</p><p><code>KL = A(lambda-prime) - A(lambda) - grad-A(lambda) * (lambda-prime - lambda)</code></p><p>Look at the physical architecture of this final result.</p><p><code>A(lambda-prime) - A(lambda)</code> is exactly how much the normalizer <em>actually</em> changed when we steered the vector.</p><p><code>grad-A(lambda) * (lambda-prime - lambda)</code> is how much a straight, flat tangent line <em>predicted</em> the normalizer would change.</p><p><strong>The KL divergence is literally the gap between the true change and the linear approximation.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zqfK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F855eefe5-c292-4a28-9ddc-716af9d95d51_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zqfK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F855eefe5-c292-4a28-9ddc-716af9d95d51_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!zqfK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F855eefe5-c292-4a28-9ddc-716af9d95d51_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!zqfK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F855eefe5-c292-4a28-9ddc-716af9d95d51_1672x941.png 1272w, 
https://substackcdn.com/image/fetch/$s_!zqfK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F855eefe5-c292-4a28-9ddc-716af9d95d51_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zqfK!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F855eefe5-c292-4a28-9ddc-716af9d95d51_1672x941.png" width="1200" height="675" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/855eefe5-c292-4a28-9ddc-716af9d95d51_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zqfK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F855eefe5-c292-4a28-9ddc-716af9d95d51_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!zqfK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F855eefe5-c292-4a28-9ddc-716af9d95d51_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!zqfK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F855eefe5-c292-4a28-9ddc-716af9d95d51_1672x941.png 1272w, 
https://substackcdn.com/image/fetch/$s_!zqfK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F855eefe5-c292-4a28-9ddc-716af9d95d51_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Why is this gap never negative? Because the normalizer function A is convex: it curves upward like a bowl. For any convex function, a straight tangent line never rises above the curve. The true curve always bends up and overshoots the straight-line prediction. 
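</p><p>If you want to check that identity rather than take the algebra on faith, here is a small numerical sketch. Everything in it is a toy assumption: a 6-token vocabulary, 4 dimensions, random unembedding vectors, and a random steering tweak. The function names are hypothetical, and the gradient is computed as the probability-weighted average of the token vectors, as described above.</p>

```python
import math
import random

def log_partition(lam, gammas):
    """A(lambda) = log of the sum of exp(lambda . gamma_y) over all tokens."""
    scores = [sum(l * g for l, g in zip(lam, gamma)) for gamma in gammas]
    top = max(scores)  # subtract the max for numerical stability
    return top + math.log(sum(math.exp(s - top) for s in scores))

def token_probs(lam, gammas):
    """Softmax: P(y | lambda) = exp(lambda . gamma_y - A(lambda))."""
    a = log_partition(lam, gammas)
    return [math.exp(sum(l * g for l, g in zip(lam, gamma)) - a) for gamma in gammas]

def grad_log_partition(lam, gammas):
    """grad-A(lambda): the probability-weighted average of the token vectors."""
    probs = token_probs(lam, gammas)
    return [sum(p * gamma[i] for p, gamma in zip(probs, gammas)) for i in range(len(lam))]

random.seed(0)
dim, vocab = 4, 6  # toy sizes, chosen only for illustration
gammas = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(vocab)]
lam = [random.gauss(0, 1) for _ in range(dim)]
lam_prime = [l + random.gauss(0, 0.5) for l in lam]  # a made-up steering tweak

# Direct definition: sum over tokens of P * log(P / Q).
p = token_probs(lam, gammas)
q = token_probs(lam_prime, gammas)
kl_direct = sum(p_y * math.log(p_y / q_y) for p_y, q_y in zip(p, q))

# Bregman form: true change in A minus the tangent-line prediction.
grad = grad_log_partition(lam, gammas)
tangent = sum(g * (lp - l) for g, lp, l in zip(grad, lam_prime, lam))
kl_bregman = log_partition(lam_prime, gammas) - log_partition(lam, gammas) - tangent

print(f"direct KL   = {kl_direct:.10f}")
print(f"Bregman gap = {kl_bregman:.10f}")
```

<p>The two printed numbers should agree to many decimal places. One is computed straight from the probabilities, the other only ever touches A and its gradient.</p><p>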
The gap only hits zero if the two vectors are exactly the same.</p><p><strong>This gap&#8202;&#8212;&#8202;the error between a convex function and its linear approximation&#8202;&#8212;&#8202;has a formal mathematical name. It is a Bregman divergence.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FCpf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a3504f-7918-4a79-a77c-9871b9e25c40_1448x794.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FCpf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a3504f-7918-4a79-a77c-9871b9e25c40_1448x794.png 424w, https://substackcdn.com/image/fetch/$s_!FCpf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a3504f-7918-4a79-a77c-9871b9e25c40_1448x794.png 848w, https://substackcdn.com/image/fetch/$s_!FCpf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a3504f-7918-4a79-a77c-9871b9e25c40_1448x794.png 1272w, https://substackcdn.com/image/fetch/$s_!FCpf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a3504f-7918-4a79-a77c-9871b9e25c40_1448x794.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FCpf!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a3504f-7918-4a79-a77c-9871b9e25c40_1448x794.png" width="1200" height="658.0110497237569" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28a3504f-7918-4a79-a77c-9871b9e25c40_1448x794.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:794,&quot;width&quot;:1448,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FCpf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a3504f-7918-4a79-a77c-9871b9e25c40_1448x794.png 424w, https://substackcdn.com/image/fetch/$s_!FCpf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a3504f-7918-4a79-a77c-9871b9e25c40_1448x794.png 848w, https://substackcdn.com/image/fetch/$s_!FCpf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a3504f-7918-4a79-a77c-9871b9e25c40_1448x794.png 1272w, https://substackcdn.com/image/fetch/$s_!FCpf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a3504f-7918-4a79-a77c-9871b9e25c40_1448x794.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://www.researchgate.net/figure/Geometric-interpretation-of-Bregman-Divergence_fig1_284899225">Image Source</a></figcaption></figure></div><p><code>KL(P || Q)</code> between two softmax distributions isn&#8217;t <em>like</em> a Bregman divergence. It <em>is</em> the Bregman divergence induced by the log-partition function A.</p><p>Nobody chose this. The algebra forced it. It means Euclidean geometry is not the natural default once a representation passes through a Softmax distribution. Every LLM, every CLIP model, and every attention layer is living and breathing in a Bregman geometry that most researchers have never explicitly mapped out. 
We&#8217;ve been using Euclidean wrenches on Bregman bolts since 2017.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kZ2Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c0cb4a-5dc2-4ef0-957b-59d0c3ca8c18_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kZ2Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c0cb4a-5dc2-4ef0-957b-59d0c3ca8c18_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!kZ2Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c0cb4a-5dc2-4ef0-957b-59d0c3ca8c18_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!kZ2Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c0cb4a-5dc2-4ef0-957b-59d0c3ca8c18_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!kZ2Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c0cb4a-5dc2-4ef0-957b-59d0c3ca8c18_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kZ2Y!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c0cb4a-5dc2-4ef0-957b-59d0c3ca8c18_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b3c0cb4a-5dc2-4ef0-957b-59d0c3ca8c18_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kZ2Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c0cb4a-5dc2-4ef0-957b-59d0c3ca8c18_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!kZ2Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c0cb4a-5dc2-4ef0-957b-59d0c3ca8c18_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!kZ2Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c0cb4a-5dc2-4ef0-957b-59d0c3ca8c18_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!kZ2Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c0cb4a-5dc2-4ef0-957b-59d0c3ca8c18_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Information geometers have studied Bregman divergences since the 1980s. And the very first thing their toolkit tells you about a space governed by Bregman geometry is this: the representation space does not have one natural coordinate system. It has two. This has some very juicy implications.</p><h3>What Are the Two Coordinate Systems, and Why Does the Duality Matter?</h3><p>The first coordinate system is the one everyone already uses: <strong>lambda</strong>. This is the raw representation vector sitting in the residual stream. Let&#8217;s call it the <strong>primal</strong> coordinate.</p><p>The primal space is the Wild West. It is entirely unconstrained. You can take your lambda vector, multiply it by a million, and point it absolutely anywhere in that 4,096-dimensional space. The model won&#8217;t crash. Softmax will just take those massive numbers and turn the output into a brutal step-function where one token gets 99.999% of the mass. 
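</p><p>A quick sketch of that saturation, assuming a made-up set of four token scores. Scaling lambda scales every score by the same factor, so the sketch just multiplies the scores directly. The <code>softmax</code> helper here is a hypothetical hand-rolled version, not a library call.</p>

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores (lambda . gamma_y) for a toy 4-token vocabulary.
scores = [2.0, 1.5, 0.3, -1.0]

top_probs = {}
for scale in (1, 10, 1_000_000):
    probs = softmax([scale * s for s in scores])
    top_probs[scale] = max(probs)
    print(f"scale={scale}: top-token probability = {top_probs[scale]:.6f}")
```

<p>Nothing breaks at any scale. The winning token just eats more and more of the probability mass.</p><p>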
The Primal Space has that good ol&#8217; Murican freedom, baby.</p><p>The second coordinate system is completely different. It comes directly from that gradient term we isolated earlier: the gradient of our log-partition function. Let&#8217;s call this new coordinate <strong>phi</strong>.</p><p><code>phi = grad-A(lambda)</code></p><p>Let&#8217;s walk through the actual derivative (in case you don&#8217;t remember, derivatives give us the rate of change of something with respect to something else) to see what phi is physically made of. Don&#8217;t skip this, because it is the most elegant piece of math in the entire architecture.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ppbO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86774f7a-4106-4b6f-8a97-bd988bb38d9f_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ppbO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86774f7a-4106-4b6f-8a97-bd988bb38d9f_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!ppbO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86774f7a-4106-4b6f-8a97-bd988bb38d9f_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!ppbO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86774f7a-4106-4b6f-8a97-bd988bb38d9f_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!ppbO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86774f7a-4106-4b6f-8a97-bd988bb38d9f_1672x941.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ppbO!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86774f7a-4106-4b6f-8a97-bd988bb38d9f_1672x941.png" width="1200" height="675" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86774f7a-4106-4b6f-8a97-bd988bb38d9f_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ppbO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86774f7a-4106-4b6f-8a97-bd988bb38d9f_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!ppbO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86774f7a-4106-4b6f-8a97-bd988bb38d9f_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!ppbO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86774f7a-4106-4b6f-8a97-bd988bb38d9f_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!ppbO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86774f7a-4106-4b6f-8a97-bd988bb38d9f_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Remember that our normalizer function is <code>A(lambda) = log [ sum of all exp(scores) ]</code>.</p><p>To find the gradient, we take the derivative. The chain rule in calculus tells us that the derivative of a logarithm is simply <code>1 / x</code> multiplied by the derivative of whatever is inside the log.</p><ul><li><p>The bottom of our fraction (the <code>x</code>) becomes the inside of the log: the sum of all the exponentiated scores.</p></li><li><p>The top of our fraction becomes the derivative of those scores. 
If a token&#8217;s score is <code>lambda * gamma_y</code>, its derivative with respect to lambda is just the token vector itself: <code>gamma_y</code>. By the chain rule, the derivative of <code>exp(score)</code> is therefore <code>exp(score) * gamma_y</code>.</p></li></ul><p>So, for any given token, the gradient gives us this exact fraction, <code>exp(score) / [sum of all exp(scores)]</code>, multiplied by the token vector <code>gamma_y</code>.</p><blockquote><p>Look very closely at the left side of that multiplication. That fraction is literally the exact formula for Softmax probability.</p></blockquote><p>The algebra just handed us a massive gift. <strong>The gradient of the log-partition function is simply every single token vector in the model, multiplied by its Softmax probability, and added together.</strong></p><p><code>phi = sum over y of [ P(y | lambda) * gamma_y ]</code></p><p>This is our dual coordinate. Physically, it is the probability-weighted center of mass for the entire vocabulary. If the model assigns 70% probability to &#8220;maintains&#8221;, 20% to &#8220;operates&#8221;, and 10% to random noise, then your dual coordinate (phi) sits exactly at <code>0.7 * (maintains) + 0.2 * (operates) + 0.1 * (noise)</code>. 
It is a physical coordinate telling you exactly where the model&#8217;s attention is currently hovering across the dictionary.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HO7E!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f669f53-8572-4d0f-b43d-ba9c443566bb_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HO7E!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f669f53-8572-4d0f-b43d-ba9c443566bb_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!HO7E!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f669f53-8572-4d0f-b43d-ba9c443566bb_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!HO7E!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f669f53-8572-4d0f-b43d-ba9c443566bb_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!HO7E!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f669f53-8572-4d0f-b43d-ba9c443566bb_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HO7E!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f669f53-8572-4d0f-b43d-ba9c443566bb_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1f669f53-8572-4d0f-b43d-ba9c443566bb_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HO7E!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f669f53-8572-4d0f-b43d-ba9c443566bb_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!HO7E!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f669f53-8572-4d0f-b43d-ba9c443566bb_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!HO7E!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f669f53-8572-4d0f-b43d-ba9c443566bb_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!HO7E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f669f53-8572-4d0f-b43d-ba9c443566bb_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Because of how Bregman geometry works, lambda and phi are just two views of the exact same distribution. If you have the primal vector, you can calculate the dual center of mass. 
If you have the dual center of mass, you can reverse-engineer the primal vector.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YYgA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e59bbe0-a2b6-4333-bac9-00cd9c83c425_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YYgA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e59bbe0-a2b6-4333-bac9-00cd9c83c425_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!YYgA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e59bbe0-a2b6-4333-bac9-00cd9c83c425_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!YYgA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e59bbe0-a2b6-4333-bac9-00cd9c83c425_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!YYgA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e59bbe0-a2b6-4333-bac9-00cd9c83c425_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YYgA!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e59bbe0-a2b6-4333-bac9-00cd9c83c425_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e59bbe0-a2b6-4333-bac9-00cd9c83c425_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YYgA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e59bbe0-a2b6-4333-bac9-00cd9c83c425_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!YYgA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e59bbe0-a2b6-4333-bac9-00cd9c83c425_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!YYgA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e59bbe0-a2b6-4333-bac9-00cd9c83c425_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!YYgA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e59bbe0-a2b6-4333-bac9-00cd9c83c425_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>But there is a massive physical asymmetry between them. We already established that the primal space (lambda) is infinite. 
The dual space (phi) is trapped in a cage.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BUEe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F383f2a17-5b2a-41d8-816c-1e361cce8f81_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BUEe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F383f2a17-5b2a-41d8-816c-1e361cce8f81_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!BUEe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F383f2a17-5b2a-41d8-816c-1e361cce8f81_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!BUEe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F383f2a17-5b2a-41d8-816c-1e361cce8f81_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!BUEe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F383f2a17-5b2a-41d8-816c-1e361cce8f81_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BUEe!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F383f2a17-5b2a-41d8-816c-1e361cce8f81_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/383f2a17-5b2a-41d8-816c-1e361cce8f81_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BUEe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F383f2a17-5b2a-41d8-816c-1e361cce8f81_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!BUEe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F383f2a17-5b2a-41d8-816c-1e361cce8f81_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!BUEe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F383f2a17-5b2a-41d8-816c-1e361cce8f81_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!BUEe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F383f2a17-5b2a-41d8-816c-1e361cce8f81_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Why? Because phi is built by multiplying token vectors by probabilities. Probabilities obey strict laws: they can never be negative, and they must add up to exactly 100%. Because of this, phi can never step outside the boundary drawn around your vocabulary. Imagine stretching a massive mathematical rubber band around every single token vector in the model&#8217;s embedding space. That rubber band is called a convex hull. You can move phi anywhere inside the hull by mixing different token probabilities, but you can never push it outside. If you try to steer the dual coordinate outside that hull, the math shatters, because no valid probability distribution could ever put you there.</p><h3>What Do the Two Coordinate Systems Mean Semantically?</h3><p>Why do we care that there are two systems? 
Because they answer the most basic question in geometry completely differently: <em>how do you draw a straight line?</em> This might seem like a silly question, but drawing a straight line between two concepts is how we blend them. When you want to combine two ideas, you take their representations and find the midpoint. But on a warped Bregman surface, your midpoint depends entirely on which coordinate system you use to draw the line.</p><p>Let&#8217;s look at a concrete example. You have two representations.</p><ul><li><p><strong>Vector 0</strong> is the model&#8217;s state after reading: &#8220;Q: What is the capital of France? A: It is&#8221; (probability sits heavily on the token &#8220;Paris&#8221;).</p></li><li><p><strong>Vector 1</strong> is the state after reading: &#8220;Q: What is the capital of Germany? A: It is&#8221; (probability sits heavily on &#8220;Berlin&#8221;).</p></li></ul><p>You want to find the exact 50/50 midpoint between these two concepts.</p><p>If you do a <strong>primal interpolation</strong>, you draw a straight line between the two raw lambda vectors in the unconstrained residual stream. Because of how the Bregman algebra shakes out, every point on that primal line is the distribution that minimizes a weighted sum of <em>reverse</em> KL divergences to the two endpoints.</p><p><strong>Think back to our KL section. Reverse KL is the &#8220;Do Not Hallucinate&#8221; penalty. It looks at the target distribution and says, &#8220;If the target thinks a token is garbage, you will pay a massive penalty for putting probability mass there.&#8221;</strong></p><p>Look at what happens to the math when you stand at the primal midpoint. The France endpoint looks at the word &#8220;Berlin.&#8221; It sees that &#8220;Berlin&#8221; has near-zero probability in the France distribution, so it slaps the model with a massive penalty for including it. 
Simultaneously, the Germany endpoint looks at the word &#8220;Paris,&#8221; sees near-zero probability, and slaps the model with a massive penalty for including it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DuMM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5605649-2d44-4e99-8c34-1d853baa955a_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DuMM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5605649-2d44-4e99-8c34-1d853baa955a_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!DuMM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5605649-2d44-4e99-8c34-1d853baa955a_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!DuMM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5605649-2d44-4e99-8c34-1d853baa955a_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!DuMM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5605649-2d44-4e99-8c34-1d853baa955a_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DuMM!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5605649-2d44-4e99-8c34-1d853baa955a_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c5605649-2d44-4e99-8c34-1d853baa955a_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DuMM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5605649-2d44-4e99-8c34-1d853baa955a_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!DuMM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5605649-2d44-4e99-8c34-1d853baa955a_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!DuMM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5605649-2d44-4e99-8c34-1d853baa955a_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!DuMM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5605649-2d44-4e99-8c34-1d853baa955a_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What survives? Only the tokens that <em>both</em> endpoints agree are mathematically harmless&#8202;&#8212;&#8202;generic words like &#8220;The&#8221;, &#8220;is&#8221;, or &#8220;called&#8221;. Primal interpolation operates exactly like a logical <strong>AND</strong>. It crushes everything that makes a context unique and leaves only the safest possible intersection.</p><p>If you do a <strong>dual interpolation</strong>, you draw a straight line between the two phi vectors (the centers of mass) inside the convex hull. Moving in a straight line in dual space minimizes the <em>forward</em> KL divergence.</p><p><strong>Forward KL operates under the exact opposite philosophy. It is the &#8220;Do Not Forget&#8221; penalty. It says, &#8220;If the target thinks a token is highly likely, you will pay a massive penalty if you fail to cover it.&#8221;</strong></p><p>Look at the midpoint now. The endpoints are no longer allowed to veto each other. 
France demands you keep &#8220;Paris&#8221;. Germany demands you keep &#8220;Berlin&#8221;. Instead of crushing them, dual interpolation forces the model into a compromise. It creates a mixed probability distribution that holds both truths simultaneously, allocating roughly 50% mass to Paris and 50% mass to Berlin.</p><p>In other words, dual interpolation operates exactly like a logical <strong>OR</strong>. It preserves the union of the two concepts.</p><p>This behavior is not specific to language models. It shows up in any model whose output passes through a Softmax. If you take a vision model like CLIP and try to interpolate the concept of a &#8220;black dog&#8221; with a &#8220;white dog&#8221;, you get the exact same split.</p><ul><li><p>Primal interpolation (AND logic) searches for shared traits. The colors fight each other, the model panics, and it spits out a single dog with black and white spots.</p></li><li><p>Dual interpolation (OR logic) holds both truths. It spits out an image that literally contains two distinct dogs: one black, one white.</p></li></ul><p>Primal finds the boring consensus. Dual preserves the contradiction. Two completely different philosophies of blending concepts, arising purely from which ruler you picked up.</p><p>This is why the geometry actually matters for your pipeline. When you blindly subtract two vectors in the residual stream to build a steering direction, you aren&#8217;t just doing niche math. You are accidentally making a product decision. By calculating your vector in the raw residual stream, you are locking yourself into the primal coordinate system. You are forcing the model into that destructive <strong>AND</strong> logic. You are effectively telling the model to crush anything unique about the prompt and only keep the safe consensus. 
This is exactly why standard steering leaks probability mass to generic prepositions like &#8220;to&#8221;&#8202;&#8212;&#8202;it&#8217;s abandoning specific concepts to find the mathematical middle ground.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HSfQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdea9ab51-8037-440e-adf9-d0222c602844_1604x1432.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HSfQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdea9ab51-8037-440e-adf9-d0222c602844_1604x1432.png 424w, https://substackcdn.com/image/fetch/$s_!HSfQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdea9ab51-8037-440e-adf9-d0222c602844_1604x1432.png 848w, https://substackcdn.com/image/fetch/$s_!HSfQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdea9ab51-8037-440e-adf9-d0222c602844_1604x1432.png 1272w, https://substackcdn.com/image/fetch/$s_!HSfQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdea9ab51-8037-440e-adf9-d0222c602844_1604x1432.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HSfQ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdea9ab51-8037-440e-adf9-d0222c602844_1604x1432.png" width="1200" height="1071.4285714285713" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dea9ab51-8037-440e-adf9-d0222c602844_1604x1432.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1300,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HSfQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdea9ab51-8037-440e-adf9-d0222c602844_1604x1432.png 424w, https://substackcdn.com/image/fetch/$s_!HSfQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdea9ab51-8037-440e-adf9-d0222c602844_1604x1432.png 848w, https://substackcdn.com/image/fetch/$s_!HSfQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdea9ab51-8037-440e-adf9-d0222c602844_1604x1432.png 1272w, https://substackcdn.com/image/fetch/$s_!HSfQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdea9ab51-8037-440e-adf9-d0222c602844_1604x1432.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Primal interpolation emphasizes the shared structure (intersection) of distributions, whereas dual interpolation results in a linear mixture. We visualize output probability changes along interpolation paths between two context embeddings &#955;(x0 ) and &#955;(x1 ). The dual interpolation (right, m-geodesic: &#966;t = (1 &#8722; t)&#966;(&#955;0 ) + t&#966;(&#955;1 )) corresponds to a weighted average of the endpoint distributions. In contrast, the primal interpolation (left, e-geodesic: &#955;t = (1 &#8722; t)&#955;0 + t&#955;1 ) upweights shared components near the midpoint (e.g., &#8220;the&#8221;, &#8220;called&#8221; in LLM or &#8220;black-and-white dog&#8221; in CLIP), while suppressing endpoint-specific outputs (e.g., &#8220;Paris&#8221; vs. &#8220;Berlin,&#8221; or &#8220;black dog&#8221; vs. &#8220;white dog&#8221;). 
Top-k tokens (LLM) or images (CLIP) are shown explicitly, with the remainder grouped as &#8220;others.&#8221;</figcaption></figure></div><p>If you actually want to add a behavioral concept without destroying the original context&#8202;&#8212;&#8202;if you want the <strong>OR</strong> logic&#8202;&#8212;&#8202;you cannot just add vectors in the residual stream. You have to translate the vectors, do the addition in the dual space, and translate them back.</p><p>Isn&#8217;t exploring the math of intelligence so much cooler than being a training grunt? Imagine learning about some of the coolest topics in the world only to be forced to debug GPU crashes and benchmark tests 24/7.</p><p>Interpolation shows that the two coordinate systems produce different behaviors when you blend representations. Steering is a related but sharper operation: instead of blending two representations, you&#8217;re modifying one to change a specific concept. The question is the same&#8202;&#8212;&#8202;which coordinate system are you operating in?&#8202;&#8212;&#8202;but the stakes are higher, because steering with a probe means adding a specific mathematical object to the representation. What kind of object the probe is determines which coordinate system it belongs to.</p><h3>What Kind of Mathematical Object Is a Linear Probe?</h3><p>Before we fix the steering math, we have to look closely at the tool we are using to steer: the linear probe.</p><p>When you train a linear probe to detect a concept&#8202;&#8212;&#8202;say, &#8220;third-person verb&#8221;&#8202;&#8212;&#8202;you are building a very specific mathematical object. It takes your 4,096-dimensional representation vector (<code>lambda</code>), runs a dot product against its own weights (<code>beta_W</code>), and spits out a single scalar score.</p><p>In linear algebra, an object that eats vectors and spits out scalars is a linear functional. A <strong>covector</strong>. 
Covectors live exclusively in the dual space.</p><p>Most of us learned the difference between vectors and covectors in undergrad and immediately dumped it from RAM. Why? Because if you live in flat Euclidean space, you don&#8217;t need to care. Flat space lets you cheat. You can add them, subtract them, and treat a covector exactly like a normal vector.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!W-uZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bbe9ac9-f720-4f08-a300-840b0db18ec1_1491x1055.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!W-uZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bbe9ac9-f720-4f08-a300-840b0db18ec1_1491x1055.png 424w, https://substackcdn.com/image/fetch/$s_!W-uZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bbe9ac9-f720-4f08-a300-840b0db18ec1_1491x1055.png 848w, https://substackcdn.com/image/fetch/$s_!W-uZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bbe9ac9-f720-4f08-a300-840b0db18ec1_1491x1055.png 1272w, https://substackcdn.com/image/fetch/$s_!W-uZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bbe9ac9-f720-4f08-a300-840b0db18ec1_1491x1055.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!W-uZ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bbe9ac9-f720-4f08-a300-840b0db18ec1_1491x1055.png" width="1200" height="848.9010989010989" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4bbe9ac9-f720-4f08-a300-840b0db18ec1_1491x1055.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1030,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!W-uZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bbe9ac9-f720-4f08-a300-840b0db18ec1_1491x1055.png 424w, https://substackcdn.com/image/fetch/$s_!W-uZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bbe9ac9-f720-4f08-a300-840b0db18ec1_1491x1055.png 848w, https://substackcdn.com/image/fetch/$s_!W-uZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bbe9ac9-f720-4f08-a300-840b0db18ec1_1491x1055.png 1272w, https://substackcdn.com/image/fetch/$s_!W-uZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bbe9ac9-f720-4f08-a300-840b0db18ec1_1491x1055.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>But they are physically different objects.</p><p>A vector is a displacement&#8202;&#8212;&#8202;a physical direction you can move. A covector is a measurement&#8202;&#8212;&#8202;a tool that assigns numbers to states. Think of a thermometer. A thermometer measures a room and returns a temperature. The thermometer itself is not a room. You cannot take a physical location and mathematically &#8220;add&#8221; a thermometer to it.</p><p>In the warped, twisted Bregman geometry of Neural Network Information Spaces, we can&#8217;t get away with conflating them. 
Technically, we can, but that is why so much activation steering research has sucked for so long, and why all of the attention has gone to the fine-tuning crowd.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KWl5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64787e53-d51d-4b59-a293-4d6e1d7a085d_1491x1055.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KWl5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64787e53-d51d-4b59-a293-4d6e1d7a085d_1491x1055.png 424w, https://substackcdn.com/image/fetch/$s_!KWl5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64787e53-d51d-4b59-a293-4d6e1d7a085d_1491x1055.png 848w, https://substackcdn.com/image/fetch/$s_!KWl5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64787e53-d51d-4b59-a293-4d6e1d7a085d_1491x1055.png 1272w, https://substackcdn.com/image/fetch/$s_!KWl5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64787e53-d51d-4b59-a293-4d6e1d7a085d_1491x1055.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KWl5!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64787e53-d51d-4b59-a293-4d6e1d7a085d_1491x1055.png" width="1200" height="848.9010989010989" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/64787e53-d51d-4b59-a293-4d6e1d7a085d_1491x1055.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1030,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KWl5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64787e53-d51d-4b59-a293-4d6e1d7a085d_1491x1055.png 424w, https://substackcdn.com/image/fetch/$s_!KWl5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64787e53-d51d-4b59-a293-4d6e1d7a085d_1491x1055.png 848w, https://substackcdn.com/image/fetch/$s_!KWl5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64787e53-d51d-4b59-a293-4d6e1d7a085d_1491x1055.png 1272w, https://substackcdn.com/image/fetch/$s_!KWl5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64787e53-d51d-4b59-a293-4d6e1d7a085d_1491x1055.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>What Does Standard Activation Steering Get Horribly Wrong?</h3><p>Look at the formula the entire open-source community currently uses for activation steering: <code>lambda_t = lambda_0 + t * beta_W</code></p><p>Take the original context vector (<code>lambda_0</code>), add the probe (<code>beta_W</code>) multiplied by some steering strength (<code>t</code>).</p><p>As an array operation, it runs perfectly. PyTorch will execute the addition without throwing a warning because both objects are just float32 arrays of the same shape. But PyTorch doesn&#8217;t know geometry.</p><p>As a geometric operation, this equation is a disaster. It blindly mashes a measurement tool (the probe) into a physical displacement (the representation). Because Softmax forces the model into Bregman space, vectors and covectors are not interchangeable.</p><p>But Dev Dev, you can say it sucks, but I don&#8217;t understand why. 
What does this type of error cost you in production?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jwGl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8410c65f-72ff-4f59-9d27-695746f21841_1491x1055.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jwGl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8410c65f-72ff-4f59-9d27-695746f21841_1491x1055.png 424w, https://substackcdn.com/image/fetch/$s_!jwGl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8410c65f-72ff-4f59-9d27-695746f21841_1491x1055.png 848w, https://substackcdn.com/image/fetch/$s_!jwGl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8410c65f-72ff-4f59-9d27-695746f21841_1491x1055.png 1272w, https://substackcdn.com/image/fetch/$s_!jwGl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8410c65f-72ff-4f59-9d27-695746f21841_1491x1055.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jwGl!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8410c65f-72ff-4f59-9d27-695746f21841_1491x1055.png" width="1200" height="848.9010989010989" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8410c65f-72ff-4f59-9d27-695746f21841_1491x1055.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1030,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jwGl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8410c65f-72ff-4f59-9d27-695746f21841_1491x1055.png 424w, https://substackcdn.com/image/fetch/$s_!jwGl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8410c65f-72ff-4f59-9d27-695746f21841_1491x1055.png 848w, https://substackcdn.com/image/fetch/$s_!jwGl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8410c65f-72ff-4f59-9d27-695746f21841_1491x1055.png 1272w, https://substackcdn.com/image/fetch/$s_!jwGl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8410c65f-72ff-4f59-9d27-695746f21841_1491x1055.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>When standard steering executes that addition, it is mathematically asking the model to find the closest point that satisfies the new concept. But it measures &#8220;closest&#8221; with Euclidean math, which we already proved has zero relationship to the model&#8217;s actual output distribution. Put two and two together: the gap between the Euclidean guess and the true Bregman reality is exactly where your probability leaks. It is the reason the preposition &#8220;to&#8221; steals all the mass from your verbs. It is the reason your vision model spits out a dog with black and white spots instead of two distinct dogs.</p><p>The fix is one line of math.</p><p><code>phi(lambda_t) = phi(lambda_0) + t * beta_W</code></p><p>Map your primal vector to the dual coordinate system (<code>phi</code>). Add the probe (<code>beta_W</code>) in the dual space, where measurement tools actually belong. 
Then translate the result back to primal space to hand off to the next layer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U3Iu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40063a61-252a-4fae-a7e8-add7fd29565d_1491x1055.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U3Iu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40063a61-252a-4fae-a7e8-add7fd29565d_1491x1055.png 424w, https://substackcdn.com/image/fetch/$s_!U3Iu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40063a61-252a-4fae-a7e8-add7fd29565d_1491x1055.png 848w, https://substackcdn.com/image/fetch/$s_!U3Iu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40063a61-252a-4fae-a7e8-add7fd29565d_1491x1055.png 1272w, https://substackcdn.com/image/fetch/$s_!U3Iu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40063a61-252a-4fae-a7e8-add7fd29565d_1491x1055.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!U3Iu!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40063a61-252a-4fae-a7e8-add7fd29565d_1491x1055.png" width="1200" height="848.9010989010989" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/40063a61-252a-4fae-a7e8-add7fd29565d_1491x1055.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1030,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U3Iu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40063a61-252a-4fae-a7e8-add7fd29565d_1491x1055.png 424w, https://substackcdn.com/image/fetch/$s_!U3Iu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40063a61-252a-4fae-a7e8-add7fd29565d_1491x1055.png 848w, https://substackcdn.com/image/fetch/$s_!U3Iu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40063a61-252a-4fae-a7e8-add7fd29565d_1491x1055.png 1272w, https://substackcdn.com/image/fetch/$s_!U3Iu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40063a61-252a-4fae-a7e8-add7fd29565d_1491x1055.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>So what does this actually accomplish? 
A lot, actually.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ytR0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4016616-7361-47ca-a677-88f05e6bf642_1520x1382.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ytR0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4016616-7361-47ca-a677-88f05e6bf642_1520x1382.png 424w, https://substackcdn.com/image/fetch/$s_!ytR0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4016616-7361-47ca-a677-88f05e6bf642_1520x1382.png 848w, https://substackcdn.com/image/fetch/$s_!ytR0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4016616-7361-47ca-a677-88f05e6bf642_1520x1382.png 1272w, https://substackcdn.com/image/fetch/$s_!ytR0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4016616-7361-47ca-a677-88f05e6bf642_1520x1382.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ytR0!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4016616-7361-47ca-a677-88f05e6bf642_1520x1382.png" width="1200" height="1091.2087912087911" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e4016616-7361-47ca-a677-88f05e6bf642_1520x1382.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1324,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ytR0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4016616-7361-47ca-a677-88f05e6bf642_1520x1382.png 424w, https://substackcdn.com/image/fetch/$s_!ytR0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4016616-7361-47ca-a677-88f05e6bf642_1520x1382.png 848w, https://substackcdn.com/image/fetch/$s_!ytR0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4016616-7361-47ca-a677-88f05e6bf642_1520x1382.png 1272w, https://substackcdn.com/image/fetch/$s_!ytR0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4016616-7361-47ca-a677-88f05e6bf642_1520x1382.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>&#8220;Dual steering (red) consistently preserves off-target distributions better than Euclidean steering (blue), while both boost the target concept probability. We plot three robustness metrics (y-axes) against the target concept probability (x-axis) achieved via steering along the dual mean difference. As the target concept probability approaches 1 (moving right), Euclidean steering degrades the off-target distribution, whereas dual steering maintains it. Columns: Tasks include LLM steering for English &#8658; French (left), CLIP on synthetic objects for yellow &#8658; green (middle), and CLIP on real images (COCO) for carrot &#8658; broccoli (right). Rows: The top row shows the total probability mass on counterfactual pairs (constant is better). The middle and bottom rows show the KL divergence and rank difference of off-target distributions (lower is better). 
Lines represent the mean, and shading indicates the standard error of the mean (SEM) across test contexts.&#8221;</em></figcaption></figure></div><h3>The Payoff: What Actually Happens in Production?</h3><p>What happens to Gemma-3-4B?</p><p>The hallucination dies. When you apply dual steering to change a base verb to a third-person verb, the probability mass shifts directly from &#8220;maintain&#8221; to &#8220;maintains&#8221;. That generic preposition &#8220;to&#8221; that spiked out of nowhere in the primal space? It stays completely flat. The probability leak is sealed.</p><p>What happens to the vision models?</p><p>Steer MetaCLIP-2 away from &#8220;cat&#8221; toward &#8220;dog&#8221; in dual space, and the model directly transfers probability to images of dogs. The off-target &#8220;cat and dog sitting together&#8221; image never spikes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sPxU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4defdf3-3879-41e3-8086-ff28208bbcef_1912x1066.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sPxU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4defdf3-3879-41e3-8086-ff28208bbcef_1912x1066.png 424w, https://substackcdn.com/image/fetch/$s_!sPxU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4defdf3-3879-41e3-8086-ff28208bbcef_1912x1066.png 848w, https://substackcdn.com/image/fetch/$s_!sPxU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4defdf3-3879-41e3-8086-ff28208bbcef_1912x1066.png 1272w, 
https://substackcdn.com/image/fetch/$s_!sPxU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4defdf3-3879-41e3-8086-ff28208bbcef_1912x1066.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sPxU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4defdf3-3879-41e3-8086-ff28208bbcef_1912x1066.png" width="1456" height="812" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b4defdf3-3879-41e3-8086-ff28208bbcef_1912x1066.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:812,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sPxU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4defdf3-3879-41e3-8086-ff28208bbcef_1912x1066.png 424w, https://substackcdn.com/image/fetch/$s_!sPxU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4defdf3-3879-41e3-8086-ff28208bbcef_1912x1066.png 848w, https://substackcdn.com/image/fetch/$s_!sPxU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4defdf3-3879-41e3-8086-ff28208bbcef_1912x1066.png 1272w, 
https://substackcdn.com/image/fetch/$s_!sPxU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4defdf3-3879-41e3-8086-ff28208bbcef_1912x1066.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>&#8220;Dual steering (bottom) effectively modifies the target concept (e.g., verb &#8658; third-person or cat &#8658; dog) while preserving off-target distributions (e.g., P(&#8220;maintain&#8221;) + P(&#8220;maintains&#8221;) or P(&#8220;cat + bicycle&#8221;) + P(&#8220;dog + bicycle&#8221;)), whereas Euclidean steering (top) fails to 
maintain off-target distributions despite reaching the target probability. Left: Token probability changes in Gemma-3&#8211;4B when steering the context &#8220;Author gives an insight into what it costs US taxpayers to build and&#8221; using a linear probe for verb &#8658; third-person. Euclidean steering leaks significant mass to off-target tokens (e.g., &#8220;to&#8221;) during intermediate steps, whereas dual steering directly shifts probability from base tokens (e.g., &#8220;maintain&#8221;, &#8220;operate&#8221;) to target tokens (e.g., &#8220;maintains&#8221;, &#8220;operates&#8221;). Center &amp; Right: Steering MetaCLIP-2 on the context &#8220;a photo of one cat&#8221; for the concept cat &#8658; dog. Dual steering transfers probability from base images (e.g., &#8220;cat&#8221;, &#8220;cat + bicycle&#8221;) directly to targets (e.g., &#8220;dog&#8221;, &#8220;dog + bicycle&#8221;). In contrast, Euclidean steering unintentionally promotes the off-target &#8220;cat + dog&#8221; image (green frame in the right column), which becomes the Top-1 result during intermediate steps. In the probability plots, Top-k tokens (LLM) or images (CLIP) are shown explicitly, with the remainder grouped as &#8220;others.&#8221;&#8221;</em></figcaption></figure></div><h3><strong>Why Does Dual Steering Actually Work?</strong></h3><p>When you steer in the primal space, you are just adding raw numbers to the model&#8217;s logits. Softmax then exponentiates those inflated numbers. Because exponentiation amplifies differences, the artificially massive logits crush the model&#8217;s original context. <strong>To make everything sum to 100%, the model abandons the specific context and falls back on the safest, most frequent tokens it knows. 
As with most things in life, an overabundance of playing it safe leads to bland, generic output (tokens like &#8220;to&#8221;).</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0V85!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa79e5971-a364-4cb9-92fc-5e40ca4af28d_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0V85!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa79e5971-a364-4cb9-92fc-5e40ca4af28d_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!0V85!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa79e5971-a364-4cb9-92fc-5e40ca4af28d_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!0V85!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa79e5971-a364-4cb9-92fc-5e40ca4af28d_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!0V85!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa79e5971-a364-4cb9-92fc-5e40ca4af28d_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0V85!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa79e5971-a364-4cb9-92fc-5e40ca4af28d_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a79e5971-a364-4cb9-92fc-5e40ca4af28d_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0V85!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa79e5971-a364-4cb9-92fc-5e40ca4af28d_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!0V85!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa79e5971-a364-4cb9-92fc-5e40ca4af28d_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!0V85!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa79e5971-a364-4cb9-92fc-5e40ca4af28d_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!0V85!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa79e5971-a364-4cb9-92fc-5e40ca4af28d_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>The dual space (<code>phi</code>) operates under different laws. Because it is built entirely from probabilities, it is strictly zero-sum. The total mass is locked at exactly 100%. 
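</p><p>One way to make &#8220;built entirely from probabilities&#8221; concrete, as a minimal sketch and not the authors&#8217; code: assume <code>phi</code> is the gradient of the softmax log-partition function, which by standard exponential-family duality equals the probability-weighted average of the unembedding rows. The names <code>GAMMA</code>, <code>LAM</code>, and the helper functions are hypothetical, and the numbers are toy values chosen only for illustration.</p>

```python
import math

def logits_for(lam, gamma):
    """Logit of token y is the dot product of lam with row gamma[y]."""
    return [sum(l * g for l, g in zip(lam, row)) for row in gamma]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def log_partition(lam, gamma):
    """A(lam) = log of the sum over tokens of exp(logit), computed stably."""
    logits = logits_for(lam, gamma)
    m = max(logits)
    return m + math.log(sum(math.exp(z - m) for z in logits))

def dual_coordinates(lam, gamma):
    """phi = probability-weighted average of the rows of gamma."""
    probs = softmax(logits_for(lam, gamma))
    return [sum(p * row[i] for p, row in zip(probs, gamma))
            for i in range(len(lam))]

def numeric_gradient(lam, gamma, eps=1e-6):
    """Central finite-difference gradient of the log-partition function."""
    grad = []
    for i in range(len(lam)):
        up = list(lam)
        down = list(lam)
        up[i] += eps
        down[i] -= eps
        grad.append((log_partition(up, gamma) - log_partition(down, gamma)) / (2 * eps))
    return grad

# Hypothetical toy unembedding matrix: 4 tokens, 2 dimensions.
GAMMA = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5], [0.5, -1.0]]
LAM = [0.3, -0.7]  # hypothetical primal representation

PHI = dual_coordinates(LAM, GAMMA)
GRAD = numeric_gradient(LAM, GAMMA)
print(PHI)   # expectation of the rows under the softmax distribution
print(GRAD)  # should agree with PHI up to finite-difference error
```

<p>Because <code>PHI</code> is an average under a distribution that sums to 1, every coordinate stays inside the range spanned by the rows of <code>GAMMA</code>. An arbitrary vector added to it need not correspond to any valid distribution.</p><p>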
You cannot inject raw, unconstrained numbers here.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lh0d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c71ff89-72f4-49ca-8883-e823d7a4a2ce_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lh0d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c71ff89-72f4-49ca-8883-e823d7a4a2ce_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!lh0d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c71ff89-72f4-49ca-8883-e823d7a4a2ce_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!lh0d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c71ff89-72f4-49ca-8883-e823d7a4a2ce_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!lh0d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c71ff89-72f4-49ca-8883-e823d7a4a2ce_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lh0d!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c71ff89-72f4-49ca-8883-e823d7a4a2ce_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5c71ff89-72f4-49ca-8883-e823d7a4a2ce_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lh0d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c71ff89-72f4-49ca-8883-e823d7a4a2ce_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!lh0d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c71ff89-72f4-49ca-8883-e823d7a4a2ce_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!lh0d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c71ff89-72f4-49ca-8883-e823d7a4a2ce_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!lh0d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c71ff89-72f4-49ca-8883-e823d7a4a2ce_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>If you want to push the center of mass toward the third-person verb &#8220;maintains&#8221;, the math forces a direct trade. This trade is the physical reality of a &#8220;KL projection.&#8221; The projection takes your target concept and solves for a new probability distribution using two strict rules: the new distribution must satisfy your steering target, and it must change the original probabilities as little as physically possible.</p><p>To satisfy both rules, it steals the required probability mass directly from the base verb (&#8220;maintain&#8221;) and hands it to the target verb (&#8220;maintains&#8221;). 
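</p><p>The two rules above can be sketched on a toy vocabulary. This is an illustration of a generic KL projection onto one linear constraint, not the procedure from Park et al.: the probabilities in <code>P</code>, the concept score <code>F</code> (+1 for the target verb, -1 for the base verb, 0 elsewhere), and the target value are all made-up assumptions. The minimizer of KL(q || p) under a constraint on the expectation of <code>F</code> is known to have the exponentially tilted form q_i proportional to p_i * exp(t * F_i), so the sketch only has to search for t.</p>

```python
import math

def tilt(p, f, t):
    """Exponentially tilt p by feature f: q_i proportional to p_i * exp(t * f_i)."""
    weights = [pi * math.exp(t * fi) for pi, fi in zip(p, f)]
    total = sum(weights)
    return [w / total for w in weights]

def expectation(q, f):
    """Expected value of the feature f under distribution q."""
    return sum(qi * fi for qi, fi in zip(q, f))

def kl(q, p):
    """KL(q || p) for discrete distributions with p strictly positive."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def kl_project(p, f, target, lo=-50.0, hi=50.0, iters=200):
    """Find the q closest to p in KL(q || p) whose expectation of f equals target.

    The expectation of f under the tilted distribution increases with t,
    so bisection on t is enough. Assumes target lies strictly between
    min(f) and max(f) and that the solution for t lies inside [lo, hi].
    """
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if expectation(tilt(p, f, mid), f) > target:
            hi = mid
        else:
            lo = mid
    return tilt(p, f, (lo + hi) / 2.0)

# Hypothetical next-token distribution and concept score (toy values).
TOKENS = ["maintain", "maintains", "to", "the"]
P = [0.5, 0.1, 0.25, 0.15]
F = [-1.0, 1.0, 0.0, 0.0]

# Rule 1: hit the steering target. Rule 2: stay as close to P as possible.
Q = kl_project(P, F, target=0.4)
print(dict(zip(TOKENS, Q)))
print(kl(Q, P))
```

<p>In this toy setup the projection moves mass from &#8220;maintain&#8221; to &#8220;maintains&#8221;, while tokens with a score of 0 keep their proportions relative to each other, since they are all rescaled by the same normalizer.</p><p>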
The generic token &#8220;to&#8221; stays flat because the distribution was never shattered.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vZcf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b28e667-c53e-402a-8a5a-3e67aaa81f82_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vZcf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b28e667-c53e-402a-8a5a-3e67aaa81f82_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!vZcf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b28e667-c53e-402a-8a5a-3e67aaa81f82_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!vZcf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b28e667-c53e-402a-8a5a-3e67aaa81f82_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!vZcf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b28e667-c53e-402a-8a5a-3e67aaa81f82_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vZcf!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b28e667-c53e-402a-8a5a-3e67aaa81f82_1672x941.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b28e667-c53e-402a-8a5a-3e67aaa81f82_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vZcf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b28e667-c53e-402a-8a5a-3e67aaa81f82_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!vZcf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b28e667-c53e-402a-8a5a-3e67aaa81f82_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!vZcf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b28e667-c53e-402a-8a5a-3e67aaa81f82_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!vZcf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b28e667-c53e-402a-8a5a-3e67aaa81f82_1672x941.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>That&#8217;s not to say this is without flaws. Dual steering fixes the geometric type error, but it hits a hard wall in production: it only works at the exit nodes.</p><p>The Bregman geometry derived by Park et al. relies entirely on the log-partition function. That means it only applies to representations passing directly through a Softmax distribution. In modern architectures, that restricts you to the final-token unembedding layer, CLIP retrievals, and attention matrices.</p><p>In production, almost nobody steers at the final layer. By the time a concept hits the unembedding matrix, the model has already made its decision. Surgical steering happens deep inside the network&#8202;&#8212;&#8202;say, layer 15 of a 32-layer model&#8202;&#8212;&#8202;to intercept a concept before it fully forms. But intermediate layers don&#8217;t have a Softmax attached to them. They are just unconstrained states floating in the residual stream. 
For intermediate layers, the mathematical map of Bregman geometry simply stops.</p><p>So&#8230; the math we just covered does not reach the layers where real-time steering actually happens. That is a slight problem.</p><p>So where does this leave us? Do we just give up and let fine-tuning handle all the work? Let&#8217;s bring our little exploration to a close.</p><h3>Conclusion: Where Does Dual Steering Go From Here?</h3><p>Let&#8217;s end our journey with a trip down memory lane, back to the days of classic ML. Think about how LLMs changed production. We used to build massive, brittle scaffolding, wiring ten specialized ML classifiers to two generators just to complete one basic workflow. LLMs replaced all of that with a single general-purpose model, collapsing infrastructure costs overnight. That drop in cost didn&#8217;t happen because we optimized the classifiers. It happened because we completely changed the interaction pattern.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!R03E!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc458ca9-4667-4d17-9b6c-a01b56017b2a_648x1152.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!R03E!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc458ca9-4667-4d17-9b6c-a01b56017b2a_648x1152.jpeg 424w, https://substackcdn.com/image/fetch/$s_!R03E!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc458ca9-4667-4d17-9b6c-a01b56017b2a_648x1152.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!R03E!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc458ca9-4667-4d17-9b6c-a01b56017b2a_648x1152.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!R03E!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc458ca9-4667-4d17-9b6c-a01b56017b2a_648x1152.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!R03E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc458ca9-4667-4d17-9b6c-a01b56017b2a_648x1152.jpeg" width="648" height="1152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc458ca9-4667-4d17-9b6c-a01b56017b2a_648x1152.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:648,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!R03E!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc458ca9-4667-4d17-9b6c-a01b56017b2a_648x1152.jpeg 424w, https://substackcdn.com/image/fetch/$s_!R03E!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc458ca9-4667-4d17-9b6c-a01b56017b2a_648x1152.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!R03E!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc458ca9-4667-4d17-9b6c-a01b56017b2a_648x1152.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!R03E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc458ca9-4667-4d17-9b6c-a01b56017b2a_648x1152.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/why-ai-hate-is-your-next-billion">This is the same 
approach as our anti-thesis based investing framework, just applied to AI Research</a>.</figcaption></figure></div><p>We are at the exact same threshold with model internals.</p><p>Right now, the industry&#8217;s default reflex for bad model behavior is to treat it as an infrastructure problem. When a model fails, we throw compute at it. We run brittle fine-tuning jobs and try to bludgeon the network into submission using scale.</p><p>But as a growing body of research suggests, the model layer might be the wrong layer to solve these problems. Perhaps the true solution is deeper. By changing the space in which language models represent knowledge, and how we navigate that space, we might be able to unlock capabilities that our current paradigm deems unrealistic.</p><p>And even if it doesn&#8217;t work, isn&#8217;t the idea so much more fun? Do you really want to spend the rest of your career profiling GPUs, fiddling with random seeds, and writing 20 skills/agent templates so that your manager can show your team&#8217;s AI readiness? 
Does your heart (and brain) not ache to do more, to push humanity&#8217;s knowledge forward?</p><p>Think that over.</p><p>Thank you for being here, and I hope you have a wonderful day,</p><p>Dev &lt;3</p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines</a>.</p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. 
</strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fENc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd73ef6-83b0-45df-83e0-20b37e15fb48_714x200.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fENc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd73ef6-83b0-45df-83e0-20b37e15fb48_714x200.png 424w, https://substackcdn.com/image/fetch/$s_!fENc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd73ef6-83b0-45df-83e0-20b37e15fb48_714x200.png 848w, https://substackcdn.com/image/fetch/$s_!fENc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd73ef6-83b0-45df-83e0-20b37e15fb48_714x200.png 1272w, https://substackcdn.com/image/fetch/$s_!fENc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd73ef6-83b0-45df-83e0-20b37e15fb48_714x200.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fENc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd73ef6-83b0-45df-83e0-20b37e15fb48_714x200.png" width="714" height="200" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9bd73ef6-83b0-45df-83e0-20b37e15fb48_714x200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:200,&quot;width&quot;:714,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fENc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd73ef6-83b0-45df-83e0-20b37e15fb48_714x200.png 424w, https://substackcdn.com/image/fetch/$s_!fENc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd73ef6-83b0-45df-83e0-20b37e15fb48_714x200.png 848w, https://substackcdn.com/image/fetch/$s_!fENc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd73ef6-83b0-45df-83e0-20b37e15fb48_714x200.png 1272w, https://substackcdn.com/image/fetch/$s_!fENc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd73ef6-83b0-45df-83e0-20b37e15fb48_714x200.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><h3>Reach out to me</h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- 
https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>Check out my other articles on Medium:</p><p>https://machine-learning-made-simple.medium.com/</p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[Why Legal AI Hallucinations Are Three Different Problems, And Most Tools Only Catch One]]></title><description><![CDATA[It takes time to create work that&#8217;s clear, independent, and genuinely useful.]]></description><link>https://www.artificialintelligencemadesimple.com/p/why-legal-ai-hallucinations-are-three</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/why-legal-ai-hallucinations-are-three</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Wed, 06 May 2026 02:26:11 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/196579711/0387be867e62e76577db648b61dcae14.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. <strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>I spoke to Ryan Estes about <a href="https://podcasts.apple.com/us/podcast/legal-ai-why-lawyers-are-finally-free-to-think/id1798265052?i=1000761441454">Legal AI, Open Source Research, and why Legal Services need to be more accessible, over here</a>. The conversation was very well received, so I&#8217;m sharing it here with Ryan&#8217;s permission. </p><p>I hope you enjoy it. </p><h1>Companion Guide to the Livestream</h1><p><em>This guide expands on the core ideas and structures them for deeper reflection. Watch the full stream for tone, nuance, and side-commentary.</em></p><h2>1. The Three Hallucinations Hiding Under One Word</h2><p><strong>The Event</strong> &#8212; I broke hallucinations into three categories on the stream. 
Category one: the AI makes up a case that doesn&#8217;t exist. Category two: the case exists and the quote is real, but it&#8217;s from the wrong jurisdiction or doesn&#8217;t apply to your domain. Category three: the AI gives you an argument that looks correct on its own, but somewhere else in your documents there&#8217;s something that contradicts it, and the system never connected the two.</p><p><strong>Why this matters</strong> &#8212; Everyone talks about category one because it&#8217;s the most obvious. A lawyer cites a fake case, gets sanctioned, it makes the news. But category one is also the easiest to fix. You just check whether the case exists. A basic validator catches almost all of it. Categories two and three are where the real damage happens, and they&#8217;re much harder to catch. The lawyer reads the brief, everything looks right, the citation is real, the quote checks out. They file. Then opposing counsel tears them apart because the cited authority was overturned in their jurisdiction, or because a deposition transcript on page 723 contradicts the whole argument.</p><p>Most legal AI tools are RAG wrappers. You upload documents, the system cuts them into chunks, turns the chunks into vectors, and when you ask a question it finds the chunks that look most similar to your question. This works fine for simple retrieval like &#8220;find the indemnification clause.&#8221; It does not work for &#8220;is this argument actually supported across all my documents.&#8221; Cosine similarity doesn&#8217;t know what a contradiction is. It doesn&#8217;t know about jurisdictions or whether a ruling is still valid. Two laws from different states will sit right next to each other in vector space even if they say opposite things. And &#8220;document 47 invalidates the claim in document 12&#8221; is a logical relationship that vector search can&#8217;t represent at all.</p><p>Every category-two and category-three hallucination is a potential malpractice claim. 
The tool makes you faster, you trust it, you file work that has buried contradictions, and the first time it costs a client real money your insurance situation changes permanently. The speed improvement means nothing if the work product carries hidden liability. This is why legal AI has to move to architectures that handle context across documents natively. The wrappers will be fine for boilerplate. They&#8217;ll fail at everything that actually matters.</p><h2>2. Why Irys Doesn&#8217;t Wrap, It Rebuilds</h2><p><strong>The Event</strong> &#8212; Ryan asked whether Irys&#8217;s &#8220;infinite context&#8221; works the same way Harvey&#8217;s does. It doesn&#8217;t. Harvey chunks your documents and uses vector similarity to find relevant pieces. Irys builds a knowledge graph that links entities, propositions, assertions, and contradictions across all your documents, and updates that graph every time you add a new file. When you ask a question, the system walks the graph instead of doing similarity search.</p><p><strong>Why this matters</strong> &#8212; Vector search became the default approach to &#8220;my documents don&#8217;t fit in the context window&#8221; because it was cheap and it worked for simple use cases. Then people treated it as a permanent solution. It was always limited. Chunks are independent of each other. There&#8217;s no cross-document reasoning. There&#8217;s no way to bind a claim in one document to evidence in another. None of this mattered when the use case was &#8220;summarize this PDF.&#8221; It matters a lot when the use case is &#8220;build me a litigation strategy across 100,000 pages.&#8221; I was writing about these structural limits in 2022, before RAG was even a common term.</p><p>The fix isn&#8217;t bigger context windows or better embeddings. 
The fix is a system that explicitly tracks entities like parties, dates, jurisdictions, and claims, that maintains contradiction edges between propositions, and that reorganizes itself when new documents arrive. That&#8217;s how you catch &#8220;document 47 contradicts document 12&#8221; before the LLM ever starts drafting. That&#8217;s what Irys does. And this is why the wrappers can&#8217;t close the gap by adding features. Their entire stack assumes chunks are independent and retrieval is similarity. To move to graph-native context, they&#8217;d have to throw it all away and start over.</p><h2>3. Why GitHub Copilot Couldn&#8217;t Become Cursor, And Cursor Couldn&#8217;t Become Claude Code</h2><p><strong>The Event</strong> &#8212; Ryan asked what stops someone from copying Irys. This is the answer Harvard Business School reached out about for one of their courses. Software doesn&#8217;t just have features. It has assumptions baked into it about what the user controls, what the AI controls, where data lives, what gets automated, and what gets left to human judgment. GitHub Copilot had Microsoft&#8217;s distribution and money. It couldn&#8217;t become Cursor. Cursor had the developer tool category to itself. It couldn&#8217;t become Claude Code. Each product was built around a different set of assumptions, and those assumptions aren&#8217;t portable. They&#8217;re in every layer of the code and they compound with every release.</p><p><strong>Why this matters</strong> &#8212; When founders talk about moats they usually talk about data, brand, and distribution. Those are visible from the outside. Architectural assumptions are not. If you build assuming the AI is an autocomplete assistant, you get Copilot. If you build assuming the AI is an autonomous agent that the user occasionally interrupts, you get Claude Code. The surface features can look the same but the products are completely different underneath. 
You can&#8217;t retrofit one into the other because every API, every state machine, every piece of the UX was built on the original assumption. Unwinding it costs more than starting over.</p><p>This is the moat for Irys. We assumed in 2022 that vector search would not be enough for legal context. So every layer of the system was built around graph-based context aggregation. Harvey assumed RAG was enough. They have a year of rebuilding before they can even start the conversation we finished three years ago. And by the time they&#8217;re done, we&#8217;ll have shipped two more iterations on top.</p><p>There&#8217;s a second moat on the product side. The more you use Irys, the more it learns how you work. It doesn&#8217;t train on your data. But it learns which argument structures you prefer, what memo formats you use, how you weigh jurisdictions. All of that accumulates in your private workspace. If you leave, you lose all of that embedded knowledge and have to rebuild it somewhere else from scratch.</p><h2>4. Why Open Source Is Cheaper Than Marketing</h2><p><strong>The Event</strong> &#8212; Irys is free to sign up. We&#8217;ve open-sourced major pieces of our reasoning infrastructure, including a lightweight version of the latent space reasoning engine. By normal startup logic, this makes no sense. We have a defensible product, real funding, well-funded competitors, and we&#8217;re giving away the technical work for free.</p><p><strong>Why this matters</strong> &#8212; The usual way to think about open source is as a cost. Every piece of IP you publish saves your competitor some R&amp;D time. That&#8217;s true if the game is static. In a fast-moving technical field, the thing that actually matters is who has access to information about what&#8217;s coming next. Who&#8217;s in the room when the labs are deciding the next generation of model capabilities. Who knows what&#8217;s shipping in six months.</p><p>Published work is what buys access to those rooms. 
NVIDIA&#8217;s senior engineers don&#8217;t take meetings with random startups. They take meetings with people whose technical work they&#8217;ve already seen. DeepMind doesn&#8217;t share roadmap previews with companies that haven&#8217;t contributed anything back to the field. The newsletter and the open-source reasoning engine aren&#8217;t marketing. They&#8217;re what get us the partnerships with the labs. That&#8217;s how we know what&#8217;s coming before it&#8217;s announced, and that&#8217;s how we make architecture decisions that our competitors don&#8217;t know they need to make yet.</p><p>The competitor who copies our open source saves maybe a quarter of engineering time. But they&#8217;re still nine months behind on the decisions that actually matter because they don&#8217;t have the relationships that tell them where the field is going. We didn&#8217;t lose value by publishing. We traded code for visibility into the technical horizon.</p><p>There&#8217;s also a recruiting benefit. The engineers who read the reasoning paper, run the lightweight implementation, and reach out about it are exactly the kind of people you can&#8217;t find through normal hiring channels. You can&#8217;t buy that pipeline. You can only earn it.</p><h2>5. The Newsletter Is Peer Recruitment, Not Customer Acquisition</h2><p><strong>The Event</strong> &#8212; Ryan assumed the Chocolate Milk Cult newsletter (250,000+ subscribers, about 1.5M monthly reach) was the lead generation engine for Irys. I corrected him. Lawyers don&#8217;t read deep dives on quantization math or GPU pricing curves. The newsletter doesn&#8217;t sell legal seats. It serves a completely different purpose.</p><p><strong>Why this matters</strong> &#8212; Most founder content advice treats your audience as a sales channel. That works when your audience and your customer are the same person. A fitness creator selling a fitness app. A finance creator selling a finance tool. 
When your audience is a different group from your buyer, the sales-channel framing leads you to measure the wrong things. You track conversion and click-through, you report the wrong wins, and eventually you conclude the audience isn&#8217;t worth the effort because the numbers don&#8217;t show a return.</p><p>What the audience actually does, when it&#8217;s decoupled from the customer base, is give you information you can&#8217;t get any other way. Lab researchers DM me about unpublished work. Engineers send me preprints. Investors share data they wouldn&#8217;t put in a pitch deck. None of that shows up as a Substack metric, but all of it changes what Irys ships next quarter.</p><p>The point for founders is simple. Be honest about who your audience is and what they&#8217;re actually for. If you&#8217;re building vertical SaaS for accountants, an audience of ML researchers won&#8217;t sell seats. But it might give you a technical edge that makes the product worth buying when your actual sales channel starts working. The two things serve different purposes and you can&#8217;t optimize both with the same content.</p><h2>6. The Eat-Shit Theorem of Legal Access</h2><p><strong>The Event</strong> &#8212; Ryan asked why I picked legal when I could have pointed the technical foundation at any industry. The answer is the moral case. India has a ten-year backlog on civil cases. If your employer steals your wages and you go to court tomorrow, you won&#8217;t get a hearing for a decade. In NYC, tenants have landlords who let buildings rot and overcharge rent, and they can&#8217;t do anything because they can&#8217;t afford a lawyer. Indian farmers get oversold pesticides, fall into debt, and end up dealing with loan sharks. I get hit with baseless defamation suits over newsletter coverage, designed not to win but to bleed me on legal fees. The pattern is always the same. 
The legal system is gated by money, and ordinary people pay the price.</p><p><strong>Why this matters</strong> &#8212; Democratization here is not a marketing word. It&#8217;s the reason every other decision gets made the way it does. Irys is free to sign up because if we charged what the market would bear, the people who need access most would never get in. We open-source the reasoning infrastructure because the category needs to advance whether we win or not. We&#8217;re building Irys Lite because the full product doesn&#8217;t reach a wage-theft case in rural India or a tenant in the Bronx.</p><p>For founders, the closing point from the stream is the one I&#8217;d keep. Be honest with yourself about whether you&#8217;re building a business or a mission. Both are fine. The question is which one survives the next downturn, the customer churn, the eighteen months where nothing works. If it&#8217;s a business, say so to yourself, your team, and your investors. Don&#8217;t dress it up as world-changing because that confusion is what burns founders out around year three. If it&#8217;s a mission, then it has to be the thing that&#8217;s still motivating you when a competitor with twenty times your funding announces the same product. The mission is whatever&#8217;s still there when it stops being fun. Everything else in the company is downstream of that.</p><p><a href="https://podcasts.apple.com/us/podcast/legal-ai-why-lawyers-are-finally-free-to-think/id1798265052?i=1000761441454">Full conversation with Ryan is on </a><em><a href="https://podcasts.apple.com/us/podcast/legal-ai-why-lawyers-are-finally-free-to-think/id1798265052?i=1000761441454">AI for Founders</a></em><a href="https://podcasts.apple.com/us/podcast/legal-ai-why-lawyers-are-finally-free-to-think/id1798265052?i=1000761441454">, here</a>. Try the platform free at <a href="https://www.irys.ai/">irys.ai</a>. The open-sourced latent space reasoning work is in the <em>Chocolate Milk Cult</em> archives. 
If your attorney is still billing you in fifteen-minute increments for things a knowledge graph does in fifteen seconds, send them this guide.</p><div><hr></div><p></p><p>Subscribe to support AI Made Simple and help us deliver more quality information to you-</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><p>Flexible pricing available&#8212;<a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">pay what matches your budget here</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EAau!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EAau!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 424w, https://substackcdn.com/image/fetch/$s_!EAau!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 848w, https://substackcdn.com/image/fetch/$s_!EAau!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 1272w, 
https://substackcdn.com/image/fetch/$s_!EAau!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EAau!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png" width="339" height="93" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:93,&quot;width&quot;:339,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!EAau!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 424w, https://substackcdn.com/image/fetch/$s_!EAau!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 848w, https://substackcdn.com/image/fetch/$s_!EAau!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 1272w, 
https://substackcdn.com/image/fetch/$s_!EAau!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Thank you for being here, and I hope you have a wonderful day.</p><p>Dev &lt;3</p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines.</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/why-legal-ai-hallucinations-are-three?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/why-legal-ai-hallucinations-are-three?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. 
</strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><h3><strong>Reach out to me</strong></h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>Check out my other articles on Medium:</p><p>https://machine-learning-made-simple.medium.com/</p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. 
Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[ChatGPT vs Gemini vs Claude: The Best LLM Subscription You Should Buy (2026 Edition) ]]></title><description><![CDATA[In a World with Claude Code, OpenClaw, and a hundred spinoffs, which AI Subscription Is Worth Paying For in 2026 ?]]></description><link>https://www.artificialintelligencemadesimple.com/p/chatgpt-vs-gemini-vs-claude-the-best-295</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/chatgpt-vs-gemini-vs-claude-the-best-295</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Sun, 03 May 2026 23:19:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tyIN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa40ea7b-b8b7-4e52-a5a5-ccdcb48697b6_1122x1402.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Startup Founders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a 
href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>Last year, I compared the major AI web apps: ChatGPT, Gemini, and Claude. The question was simple: which one was worth paying for?</p><p>At the time, the answer was also simple. ChatGPT was the best general-purpose app. Gemini had strong models trapped inside a messy Google product. Claude had a good underlying model, but the app was too limited, too forgetful, and too frustrating for sustained work.</p><p>That rubric is now old. The major AI subscriptions are no longer just chatbots. They are memory systems, research tools, coding agents, file processors, image generators, project workspaces, and billing systems. The model still matters, obviously. But the model is no longer the whole product.</p><p>So I&#8217;m redoing our comparison with an updated question: Which subscription will let you get the most done for the least cost? I&#8217;m keeping the comparisons limited to Gemini, OpenAI, and Anthropic for three reasons:</p><ol><li><p>Performance: They still have the best models available for use.</p></li><li><p>Reliability: All three tend to have much higher uptime and usage than the alternatives.</p></li><li><p>Paperwork: Other subscriptions (like the Chinese models) are typically not approved in most orgs and require a longer review process. By the time one is approved, we will likely have seen 10 new model releases and updates.</p></li></ol><p>For the ranking, we will score them on the following criteria:</p><ul><li><p><strong>Workflow coverage.</strong> Can the subscription handle writing, research, coding, files, images, planning, review, and daily work without forcing you into three different products?</p></li><li><p><strong>Memory and state.</strong> Does the system remember useful context without becoming weird? Good memory helps. 
Bad memory overfits to stale preferences. Ease of use and cost matter here too: Anthropic looks at memories natively while Gemini needs to be prompted explicitly, and Anthropic burns a lot of tokens on memory.</p></li><li><p><strong>Instruction stability.</strong> Can it follow complex instructions across long, messy work? A single prompt is easy. The real test is whether it preserves constraints after ten turns, two corrections, and a change in direction.</p></li><li><p><strong>Judgment and taste.</strong> Does it know what to cut? Does it know when the technically correct answer is still bad? Does it update proportionally when you push back?</p></li><li><p><strong>Research quality.</strong> Search is not research. A good research workflow finds sources, weighs them, notices conflicts, and tells you what remains uncertain.</p></li><li><p><strong>Coding and agentic execution.</strong> This is now central. Codex and Claude Code are not side features. They are among the main reasons to pay.</p></li><li><p><strong>Output quality.</strong> Can it produce writing, visuals, reports, artifacts, and polished work that is actually usable?</p></li><li><p><strong>Economics and billing trust.</strong> Can you predict what you are paying for? Does the plan scale with heavy use? Does the product punish you for doing real work?</p></li><li><p><strong>Product friction.</strong> Does the app fight you? Does it hide limits? Does it make you restart? Does the best model live somewhere other than the product you are paying for?</p></li><li><p><strong>Ecosystem.</strong> Does it work well with other tools in the ecosystem?</p></li></ul><p>Let&#8217;s see where you should be spending your money. 
</p><p>To access the full article&#8212;and all premium breakdowns going forward/written prior&#8212;upgrade to a premium subscription below.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><p>If you believe deep insight deserves support, become a premium subscriber to allow me to keep doing the same.</p><p>Flexible pricing available&#8212;<a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">pay what matches your budget here</a>.</p><p><em><strong>Most companies offer learning or professional development budgets. <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">You can expense this subscription using the email template linked here</a>.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aj46!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aj46!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png 424w, https://substackcdn.com/image/fetch/$s_!aj46!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png 848w, 
https://substackcdn.com/image/fetch/$s_!aj46!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png 1272w, https://substackcdn.com/image/fetch/$s_!aj46!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aj46!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png" width="644" height="166" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:166,&quot;width&quot;:644,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23104,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/196285777?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aj46!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png 424w, 
https://substackcdn.com/image/fetch/$s_!aj46!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png 848w, https://substackcdn.com/image/fetch/$s_!aj46!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png 1272w, https://substackcdn.com/image/fetch/$s_!aj46!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe96bb58-03bb-4e01-9c7b-bb6989a1c2fe_644x166.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div>
      <p>
          <a href="https://www.artificialintelligencemadesimple.com/p/chatgpt-vs-gemini-vs-claude-the-best-295">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[How the Next Generation of AI Models are Going to Completely Change AI Inference]]></title><description><![CDATA[Why Changing from Autoregressive Language Models to Diffusion Models will change AI's hottest problems]]></description><link>https://www.artificialintelligencemadesimple.com/p/how-the-next-generation-of-ai-models</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/how-the-next-generation-of-ai-models</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Wed, 29 Apr 2026 05:40:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Fn72!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. <strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and support my crippling chocolate milk addiction. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>Most of the AI infrastructure boom isn&#8217;t actually buying compute. It&#8217;s buying memory bandwidth.</p><p>Look at the market right now: SK Hynix and Micron are completely sold out of their 2026 HBM capacity. Cerebras is pushing a $22 billion IPO built almost entirely around high-memory inference chips. Google just split its 8th-generation TPUs into completely separate training and inference lines.</p><p>Hundreds of billions of dollars are being deployed based on a single, massive assumption: that AI generation will always be bottlenecked by the exact same things it is today.</p><p>But if you look closely at the research, that assumption is already starting to rot. 
Diffusion models are rapidly rising out of the image-generation niche and seeing aggressive, mainstream use across text and reasoning workloads.</p><p><em>(<a href="https://www.artificialintelligencemadesimple.com/p/how-ai-will-change-in-2026">PS: if you want to dig into why Diffusion Models are the future from an engineering/technical perspective (including why companies are already investing in them), we covered it in this article (Predictions 2 and 3)</a>. Since I don&#8217;t want to repeat stuff, this article will take the rise of Diffusion Models as a given. The image below is a good illustration of how diffusion models unlock a deeper category of reasoning.)</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Fn72!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Fn72!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png 424w, https://substackcdn.com/image/fetch/$s_!Fn72!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png 848w, https://substackcdn.com/image/fetch/$s_!Fn72!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Fn72!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Fn72!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png" width="1200" height="804.3956043956044" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:976,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Fn72!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png 424w, https://substackcdn.com/image/fetch/$s_!Fn72!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png 848w, https://substackcdn.com/image/fetch/$s_!Fn72!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Fn72!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f18cfc0-691f-4d1f-8df8-fac8292d1461_2400x1608.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Don&#8217;t worry, we will explain this in detail later in this article.</figcaption></figure></div><p>This is not a trivial algorithmic swap. Diffusion fundamentally alters the AI paradigm. 
It aggressively rewrites the underlying math of how models are served, flipping the hardware bottlenecks and turning the established inference market on its head.</p><p>In this article, we will cover:</p><ul><li><p><strong>The Dependency Graph:</strong> Why standard autoregressive LLMs are structurally trapped by memory constraints.</p></li><li><p><strong>The Bottleneck Inversion:</strong> How diffusion models shatter the sequential lock and shift the hardware paradigm.</p></li><li><p><strong>The Algorithmic Timeline:</strong> The math required to make language diffusion financially viable.</p></li><li><p><strong>Inference as Search:</strong> How diffusion makes complex, inference-time verification cheap, threatening the $2 billion base-model moat.</p></li><li><p><strong>The Vendor Breakdown:</strong> Which hardware architectures (NVIDIA, AMD, Groq, Apple, etc.) are actually positioned to survive the shift to compound diffusion workloads.</p></li></ul><p>Let&#8217;s play together.</p><h3>Executive Highlights (TL;DR of the Article)</h3><ul><li><p><strong>The Autoregressive Memory Trap (Section 1):</strong> The entire AI infrastructure supercycle is a desperately expensive patch for data movement. Because autoregressive models generate sequentially, decode operations have an arithmetic intensity of roughly 1 FLOP per byte. A modern tensor core needs nearly 300 FLOPs per byte to stay fed. 
This means your $40,000 GPU executes at under 1% of peak compute while waiting for gigabytes of history (the KV cache) to stream across the chip.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VCDI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3e3a87b-f626-4c0b-b186-d31015343c94_500x673.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VCDI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3e3a87b-f626-4c0b-b186-d31015343c94_500x673.jpeg 424w, https://substackcdn.com/image/fetch/$s_!VCDI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3e3a87b-f626-4c0b-b186-d31015343c94_500x673.jpeg 848w, https://substackcdn.com/image/fetch/$s_!VCDI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3e3a87b-f626-4c0b-b186-d31015343c94_500x673.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!VCDI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3e3a87b-f626-4c0b-b186-d31015343c94_500x673.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VCDI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3e3a87b-f626-4c0b-b186-d31015343c94_500x673.jpeg" width="500" height="673" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b3e3a87b-f626-4c0b-b186-d31015343c94_500x673.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:673,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VCDI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3e3a87b-f626-4c0b-b186-d31015343c94_500x673.jpeg 424w, https://substackcdn.com/image/fetch/$s_!VCDI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3e3a87b-f626-4c0b-b186-d31015343c94_500x673.jpeg 848w, https://substackcdn.com/image/fetch/$s_!VCDI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3e3a87b-f626-4c0b-b186-d31015343c94_500x673.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!VCDI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3e3a87b-f626-4c0b-b186-d31015343c94_500x673.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<ul><li><p><strong>The Bottleneck Inversion (Sections 2 &amp; 3):</strong> Diffusion models refuse left-to-right generation, refining entire sequences in parallel. This changes the underlying math from bandwidth-starved matrix-vector operations to massive matrix-matrix multiplications. The KV cache tax vanishes, and the bottleneck flips to raw compute. This is a massive tailwind: hardware generations like Blackwell and Rubin have compounded FLOPs at 3x every two years, while memory bandwidth has stalled. The industry accidentally built the perfect silicon for diffusion workloads.</p></li><li><p><strong>The Algorithmic Timeline &amp; Step-Count Constraints (Section 4):</strong> Perfect hardware is irrelevant if the math requires too many steps. Image generation collapsed its step count using continuous math, but language operates in discrete jumps, trapping text diffusion in a 4-to-16 step regime. However, software scheduling is proving cheaper than raw parameter scale. 
Simply forcing a model to unmask logic premises before conclusions (LogicDiff) raised reasoning scores by nearly 40 points without changing base parameters, threatening the hyperscaler gigawatt-compute thesis.</p></li><li><p><strong>Inference as Search &amp; The Verifier Moat (Section 5):</strong> Because early denoising steps just build coarse shapes, multiple search trajectories can share the exact same early path. Running 4 candidate branches that share 40 of 50 steps costs 80 steps instead of 200, which yields a 4x quality search for a 1.6x compute tax. Branching AR is financially ruinous; branching diffusion is cheap. This unbundles the AI market: value migrates from the $2B base model to the companies building elite, proprietary verifier suites.</p></li></ul><p><strong>The Vendor Landscape (Section 6):</strong> Production diffusion is not a clean loop; it&#8217;s a chaotic compound pipeline (Denoiser + Verifier + VAE).</p><ul><li><p><strong>NVIDIA:</strong> Retains its moat because CUDA is the only environment flexible enough to handle this &#8220;swamp&#8221; of switching models and search orchestration.</p></li><li><p><strong>AMD:</strong> Massive HBM capacity acts as the perfect hedge against the large activation memory spikes required for video diffusion and DiT workflows.</p></li><li><p><strong>Groq &amp; Pure ASICs:</strong> Brilliant for streaming deterministic AR tokens, but deeply vulnerable to the inter-chip latency required to orchestrate complex, backtracking search trees.</p></li><li><p><strong>Apple &amp; Qualcomm:</strong> Positioned to completely strip-mine edge volume the second local step-counts collapse, separating the volume from cloud revenue.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G0Hb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558072ee-df12-4ba0-857c-281034c5875c_1536x1024.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G0Hb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558072ee-df12-4ba0-857c-281034c5875c_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!G0Hb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558072ee-df12-4ba0-857c-281034c5875c_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!G0Hb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558072ee-df12-4ba0-857c-281034c5875c_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!G0Hb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558072ee-df12-4ba0-857c-281034c5875c_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G0Hb!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558072ee-df12-4ba0-857c-281034c5875c_1536x1024.png" width="1200" height="800.2747252747253" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/558072ee-df12-4ba0-857c-281034c5875c_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!G0Hb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558072ee-df12-4ba0-857c-281034c5875c_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!G0Hb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558072ee-df12-4ba0-857c-281034c5875c_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!G0Hb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558072ee-df12-4ba0-857c-281034c5875c_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!G0Hb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558072ee-df12-4ba0-857c-281034c5875c_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p><em>I put a lot of work into writing this newsletter. To do so, I rely on you for support. If a few more people choose to become paid subscribers, the Chocolate Milk Cult can continue to provide high-quality and accessible education and opportunities to anyone who needs it. If you think this mission is worth contributing to, please consider a premium subscription. You can do so for less than the cost of a Netflix Subscription <a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">(pay what you want here)</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YGPU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F033840e9-36c2-4b94-8bc8-260e0790c7d9_630x146.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YGPU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F033840e9-36c2-4b94-8bc8-260e0790c7d9_630x146.png 424w, 
https://substackcdn.com/image/fetch/$s_!YGPU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F033840e9-36c2-4b94-8bc8-260e0790c7d9_630x146.png 848w, https://substackcdn.com/image/fetch/$s_!YGPU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F033840e9-36c2-4b94-8bc8-260e0790c7d9_630x146.png 1272w, https://substackcdn.com/image/fetch/$s_!YGPU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F033840e9-36c2-4b94-8bc8-260e0790c7d9_630x146.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YGPU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F033840e9-36c2-4b94-8bc8-260e0790c7d9_630x146.png" width="630" height="146" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/033840e9-36c2-4b94-8bc8-260e0790c7d9_630x146.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:146,&quot;width&quot;:630,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YGPU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F033840e9-36c2-4b94-8bc8-260e0790c7d9_630x146.png 424w, 
https://substackcdn.com/image/fetch/$s_!YGPU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F033840e9-36c2-4b94-8bc8-260e0790c7d9_630x146.png 848w, https://substackcdn.com/image/fetch/$s_!YGPU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F033840e9-36c2-4b94-8bc8-260e0790c7d9_630x146.png 1272w, https://substackcdn.com/image/fetch/$s_!YGPU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F033840e9-36c2-4b94-8bc8-260e0790c7d9_630x146.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><em>I provide various consulting and advisory services. If you&#8217;d like to explore how we can work together, <a href="https://linktr.ee/iseethings404">reach out to me through any of my socials over here</a> or reply to this email.</em></p><h3>Section 1: Why Large Language Models Struggle with Decoding</h3><p>Pick any fast frontier model&#8202;&#8212;&#8202;Claude Haiku 4.5, GPT-5 Mini, Gemini 2.5 Flash. Look at their tokens-per-second at batch size one. These are the fastest autoregressive LLMs in the world, running on $40,000 silicon with peak FP8 throughput in the petaflop range. Yet they generate text at roughly the speed an undergraduate types.</p><p>To understand why, you have to look at the dependency graph.</p><p>Autoregression is strictly sequential. Token 200 cannot exist until token 199 is sampled. Because of this lockstep dependency, the model cannot work on future positions in parallel: every new token needs its own full pass through the layers. To produce a single token from a 70B-parameter model, all 140 GB of 16-bit weights must stream from High-Bandwidth Memory (HBM) into the chip&#8217;s SRAM, execute a tiny matrix-vector multiplication, and leave.</p><p>This operation has an arithmetic intensity of roughly 1 FLOP per byte of data moved. 
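</p><p>Here is a back-of-the-envelope sketch of where that 1 FLOP per byte comes from. It assumes 16-bit weights (which is what makes 70B parameters weigh 140 GB) and counts one multiply plus one add per weight. The function names are made up for this sketch, the 3.35 TB/s bandwidth is an illustrative assumption rather than a number from this article, and activation and KV cache traffic are ignored.</p>

```python
# Back-of-the-envelope numbers for batch-size-one decoding.
# Function names and the bandwidth figure are illustrative assumptions.
# Activation and KV cache traffic are deliberately ignored.


def decode_arithmetic_intensity(params: float, bytes_per_param: float) -> float:
    """FLOPs performed per byte of weights streamed for one decoded token."""
    flops = 2.0 * params                    # one multiply and one add per weight
    bytes_moved = params * bytes_per_param  # every weight is read once per token
    return flops / bytes_moved


def bandwidth_bound_tokens_per_sec(
    params: float, bytes_per_param: float, hbm_bytes_per_sec: float
) -> float:
    """Upper bound on tokens/s if streaming the weights were the only cost."""
    return hbm_bytes_per_sec / (params * bytes_per_param)


PARAMS = 70e9             # 70B-parameter model, as in the text
BYTES_PER_PARAM = 2.0     # assumption: FP16/BF16 weights
ASSUMED_HBM_BW = 3.35e12  # assumption: about 3.35 TB/s of HBM bandwidth

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
intensity = decode_arithmetic_intensity(PARAMS, BYTES_PER_PARAM)
ceiling = bandwidth_bound_tokens_per_sec(PARAMS, BYTES_PER_PARAM, ASSUMED_HBM_BW)

print(f"weights streamed per token: {weights_gb:.0f} GB")
print(f"arithmetic intensity: {intensity:.1f} FLOP/byte")
print(f"bandwidth-bound ceiling: {ceiling:.1f} tokens/s")
```

<p>With these assumed numbers, a single stream tops out at roughly 24 tokens per second, and that ceiling is set entirely by memory bandwidth, not by how many FLOPs the chip advertises.</p><p>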
A modern Hopper or Blackwell tensor core needs nearly 300 FLOPs per byte before it stops starving. So when an LLM generates text for a single user, the GPU executes at under 1% of its theoretical peak compute. The other 99% of the silicon waits for data to arrive. The constraint isn&#8217;t how fast the chip can multiply. It is how fast the chip can be fed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KMpu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6546a139-3af2-4cc3-9d8f-810b630b6b0c_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KMpu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6546a139-3af2-4cc3-9d8f-810b630b6b0c_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!KMpu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6546a139-3af2-4cc3-9d8f-810b630b6b0c_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!KMpu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6546a139-3af2-4cc3-9d8f-810b630b6b0c_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!KMpu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6546a139-3af2-4cc3-9d8f-810b630b6b0c_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KMpu!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6546a139-3af2-4cc3-9d8f-810b630b6b0c_1536x1024.png" width="1200" height="800.2747252747253" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6546a139-3af2-4cc3-9d8f-810b630b6b0c_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KMpu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6546a139-3af2-4cc3-9d8f-810b630b6b0c_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!KMpu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6546a139-3af2-4cc3-9d8f-810b630b6b0c_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!KMpu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6546a139-3af2-4cc3-9d8f-810b630b6b0c_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!KMpu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6546a139-3af2-4cc3-9d8f-810b630b6b0c_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/the-real-cost-of-running-ai">More information on this here.</a></figcaption></figure></div><p>Once you understand that data movement is the true bottleneck, the entire AI infrastructure stack reveals itself as a series of patches for this exact flaw.</p><p>Take the KV cache. If you are already memory-bound, the logical fix is to avoid recomputing the past. The KV cache stores the attention keys and values of previous tokens, so each new token computes only its own keys and values and appends them to the cache. But this trade creates a massive residency tax. 
The longer the context, the more memory capacity it eats.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Af6K!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b9301d5-90aa-4f45-8fe4-7edd7f8fac44_2398x1602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Af6K!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b9301d5-90aa-4f45-8fe4-7edd7f8fac44_2398x1602.png 424w, https://substackcdn.com/image/fetch/$s_!Af6K!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b9301d5-90aa-4f45-8fe4-7edd7f8fac44_2398x1602.png 848w, https://substackcdn.com/image/fetch/$s_!Af6K!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b9301d5-90aa-4f45-8fe4-7edd7f8fac44_2398x1602.png 1272w, https://substackcdn.com/image/fetch/$s_!Af6K!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b9301d5-90aa-4f45-8fe4-7edd7f8fac44_2398x1602.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Af6K!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b9301d5-90aa-4f45-8fe4-7edd7f8fac44_2398x1602.png" width="1200" height="801.9230769230769" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2b9301d5-90aa-4f45-8fe4-7edd7f8fac44_2398x1602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:973,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Af6K!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b9301d5-90aa-4f45-8fe4-7edd7f8fac44_2398x1602.png 424w, https://substackcdn.com/image/fetch/$s_!Af6K!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b9301d5-90aa-4f45-8fe4-7edd7f8fac44_2398x1602.png 848w, https://substackcdn.com/image/fetch/$s_!Af6K!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b9301d5-90aa-4f45-8fe4-7edd7f8fac44_2398x1602.png 1272w, https://substackcdn.com/image/fetch/$s_!Af6K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b9301d5-90aa-4f45-8fe4-7edd7f8fac44_2398x1602.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>To offset that capacity tax, the system has to amortize the weight-streaming cost across multiple users. That is why continuous batching and PagedAttention, as implemented in serving frameworks like vLLM, exist. They batch requests so the 140 GB of weights stream through the chip once per decoding step to serve 64 users instead of one. Throughput scales roughly linearly&#8202;&#8212;&#8202;until the KV cache for those 64 concurrent users blows out the HBM capacity entirely.</p><p>When the software patches hit a wall, the hardware roadmap panics. HBM4 is a bandwidth upgrade. NVLink 5 at 1.8 TB/s per GPU is an interconnect upgrade. Multi-billion-dollar photonic interconnect acquisitions are not compute upgrades. They are data-movement upgrades designed exclusively to keep the tensor cores from starving.</p><p>But the physical reality of chip manufacturing is asymmetric. Adding denser compute units and low-precision formats like FP4 is relatively cheap. 
Scaling memory bandwidth and capacity is economically brutal.</p><p>Autoregressive inference sits on the wrong side of that divide. You can pour infinite FLOPs into a server, but if the workload requires dragging gigabytes of history across the chip for every single token, the math will never work. This is why Diffusion Models have been attracting so much attention.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qwik!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90e4c733-cc31-44b9-b968-daaa1e70e0b0_2400x1558.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qwik!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90e4c733-cc31-44b9-b968-daaa1e70e0b0_2400x1558.png 424w, https://substackcdn.com/image/fetch/$s_!Qwik!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90e4c733-cc31-44b9-b968-daaa1e70e0b0_2400x1558.png 848w, https://substackcdn.com/image/fetch/$s_!Qwik!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90e4c733-cc31-44b9-b968-daaa1e70e0b0_2400x1558.png 1272w, https://substackcdn.com/image/fetch/$s_!Qwik!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90e4c733-cc31-44b9-b968-daaa1e70e0b0_2400x1558.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qwik!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90e4c733-cc31-44b9-b968-daaa1e70e0b0_2400x1558.png" width="1200" 
height="778.8461538461538" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90e4c733-cc31-44b9-b968-daaa1e70e0b0_2400x1558.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:945,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qwik!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90e4c733-cc31-44b9-b968-daaa1e70e0b0_2400x1558.png 424w, https://substackcdn.com/image/fetch/$s_!Qwik!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90e4c733-cc31-44b9-b968-daaa1e70e0b0_2400x1558.png 848w, https://substackcdn.com/image/fetch/$s_!Qwik!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90e4c733-cc31-44b9-b968-daaa1e70e0b0_2400x1558.png 1272w, https://substackcdn.com/image/fetch/$s_!Qwik!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90e4c733-cc31-44b9-b968-daaa1e70e0b0_2400x1558.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Section 2: How Diffusion Models Break the Sequential Lock in AI</h3><p>Diffusion language models escape the memory trap by simply refusing the premise of left-to-right generation.</p><p>Instead of predicting the next word, they start with a block of noise sized to the target output length and refine the entire sequence in parallel across a handful of forward passes. Every position is in flight at once. 
Token 200 never waits for token 199.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ORFj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631adb4d-d908-490f-9f7b-b5782b985a30_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ORFj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631adb4d-d908-490f-9f7b-b5782b985a30_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!ORFj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631adb4d-d908-490f-9f7b-b5782b985a30_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!ORFj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631adb4d-d908-490f-9f7b-b5782b985a30_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!ORFj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631adb4d-d908-490f-9f7b-b5782b985a30_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ORFj!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631adb4d-d908-490f-9f7b-b5782b985a30_1536x1024.png" width="1200" height="800.2747252747253" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/631adb4d-d908-490f-9f7b-b5782b985a30_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ORFj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631adb4d-d908-490f-9f7b-b5782b985a30_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!ORFj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631adb4d-d908-490f-9f7b-b5782b985a30_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!ORFj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631adb4d-d908-490f-9f7b-b5782b985a30_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!ORFj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631adb4d-d908-490f-9f7b-b5782b985a30_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>That single architectural shift shatters the dependency graph. Because a diffusion forward pass operates on hundreds or thousands of tokens simultaneously, the underlying math changes from bandwidth-starved matrix-vector operations to massive matrix-matrix multiplications. The arithmetic intensity spikes into the hundreds of FLOPs per byte&#8202;&#8212;&#8202;easily clearing the threshold required to actually keep a modern tensor core fed.</p><p><strong>This is the bottleneck inversion worth noting: autoregression is memory-bound, while diffusion is compute-bound.</strong></p><p>(Worth stressing again: High Bandwidth Memory (HBM) is extremely expensive to build, hence the memory wall that has been choking AI. More on this in the next section.)</p><p>The cost is that the model does more raw work per output token, and it repeats that work across multiple refinement steps. 
But it pays that tax using the exact resource the hardware has in massive surplus.</p><p>We have established that Diffusion Models are fast and structurally well matched to the current hardware paradigm. But what about their output quality? It&#8217;s still a bit early to tell, but the results look extremely promising:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X3PB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38c14bcb-56dd-48b7-b8f8-39802452d8e7_2034x1970.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X3PB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38c14bcb-56dd-48b7-b8f8-39802452d8e7_2034x1970.png 424w, https://substackcdn.com/image/fetch/$s_!X3PB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38c14bcb-56dd-48b7-b8f8-39802452d8e7_2034x1970.png 848w, https://substackcdn.com/image/fetch/$s_!X3PB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38c14bcb-56dd-48b7-b8f8-39802452d8e7_2034x1970.png 1272w, https://substackcdn.com/image/fetch/$s_!X3PB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38c14bcb-56dd-48b7-b8f8-39802452d8e7_2034x1970.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X3PB!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38c14bcb-56dd-48b7-b8f8-39802452d8e7_2034x1970.png" width="1200" height="1162.0879120879122" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/38c14bcb-56dd-48b7-b8f8-39802452d8e7_2034x1970.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1410,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!X3PB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38c14bcb-56dd-48b7-b8f8-39802452d8e7_2034x1970.png 424w, https://substackcdn.com/image/fetch/$s_!X3PB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38c14bcb-56dd-48b7-b8f8-39802452d8e7_2034x1970.png 848w, https://substackcdn.com/image/fetch/$s_!X3PB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38c14bcb-56dd-48b7-b8f8-39802452d8e7_2034x1970.png 1272w, https://substackcdn.com/image/fetch/$s_!X3PB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38c14bcb-56dd-48b7-b8f8-39802452d8e7_2034x1970.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">I&#8217;ve included dLLMs vs the top models in that class as well. Even when they lose, their performance at extreme efficiency makes them an interesting pitch.</figcaption></figure></div><p>Assuming this will continue to scale (and we&#8217;ve already broken down the research for this in previous works), we should see them close the gap in the next 2 years (Gemini 3 already started incorporating Diffusion in its model, so they&#8217;re already touching the frontier). 
This means more and more hardware providers will have to start accounting for diffusion models and their more matmul-heavy workloads.</p><p>Let&#8217;s explore that claim in more detail.</p><h3>Section 3: The Hardware Diffusion Needs (and Why the Industry Already Has It)</h3><p>If diffusion is going to take over a meaningful share of inference, we have to look at what it actually demands from the silicon.</p><p>To understand what this architecture needs to work well, we look at the math.</p><ul><li><p>A forward pass processes hundreds or thousands of tokens simultaneously, creating massive matrix-matrix multiplications. <strong>Compute becomes the primary axis of growth: the faster the chip multiplies, the faster the model runs.</strong></p></li><li><p>Given the size of the MatMuls, we also need Diffusion hardware to handle low-precision (quantized) arithmetic well. Most modern hardware does, and the workload helps: because generation is an iterative refinement loop, it tends to absorb precision noise across steps.</p></li><li><p>When it comes to memory, Diffusion drops the bandwidth tax. There is no KV cache to stream and no per-token residency penalty. It does require memory capacity for activations&#8202;&#8212;&#8202;which becomes a real wall when scaling to video&#8202;&#8212;&#8202;but for text, that is a far weaker constraint than the bandwidth trap autoregressive decode hits at batch size one.</p></li></ul><p><strong>The net result: the ideal architecture for this workload is heavy on compute, native in FP4, modest on bandwidth, and generous on activation memory. </strong>Surprisingly, this maps wonderfully to the hardware the industry has been building.</p><p>Every accelerator generation since Hopper has been an escalating compute story. Blackwell hits 20 PetaFLOPS of FP4. Rubin is targeting 50. Node shrinks and new precision formats have led to FLOPs compounding at 3x every two years. The only problem? 
<a href="https://arxiv.org/html/2403.14123v1?">The memory bandwidth needed to feed it has grown at half that rate.</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!m_Ng!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1e08bd2-0b9c-43fd-8e75-c4940fdc0dd7_802x407.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!m_Ng!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1e08bd2-0b9c-43fd-8e75-c4940fdc0dd7_802x407.jpeg 424w, https://substackcdn.com/image/fetch/$s_!m_Ng!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1e08bd2-0b9c-43fd-8e75-c4940fdc0dd7_802x407.jpeg 848w, https://substackcdn.com/image/fetch/$s_!m_Ng!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1e08bd2-0b9c-43fd-8e75-c4940fdc0dd7_802x407.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!m_Ng!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1e08bd2-0b9c-43fd-8e75-c4940fdc0dd7_802x407.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!m_Ng!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1e08bd2-0b9c-43fd-8e75-c4940fdc0dd7_802x407.jpeg" width="802" height="407" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a1e08bd2-0b9c-43fd-8e75-c4940fdc0dd7_802x407.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:407,&quot;width&quot;:802,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!m_Ng!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1e08bd2-0b9c-43fd-8e75-c4940fdc0dd7_802x407.jpeg 424w, https://substackcdn.com/image/fetch/$s_!m_Ng!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1e08bd2-0b9c-43fd-8e75-c4940fdc0dd7_802x407.jpeg 848w, https://substackcdn.com/image/fetch/$s_!m_Ng!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1e08bd2-0b9c-43fd-8e75-c4940fdc0dd7_802x407.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!m_Ng!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1e08bd2-0b9c-43fd-8e75-c4940fdc0dd7_802x407.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The moment your model&#8217;s data spills out of the GPU&#8217;s limited pool of fast memory, the system punishes you. Hard. &#8220;<em>When data size exceeds GPU memory capacity, the data must be migrated repeatedly between the CPU and GPU, either manually or automatically. However, manual migration can be laborious for programmers, <strong>and it is infeasible for irregular workloads because the data access pattern is unpredictable</strong>. On the other hand, demand paging approaches (e.g., NVIDIA Unified Memory [<a href="https://arxiv.org/html/2403.09358v1#bib.bib120">120</a>]) can automatically manage data movement, <strong>but it can significantly degrade performance due to high page fault-handling latency and limited PCIe BW [<a href="https://arxiv.org/html/2403.09358v1#bib.bib46">46</a>, <a href="https://arxiv.org/html/2403.09358v1#bib.bib121">121</a>, <a href="https://arxiv.org/html/2403.09358v1#bib.bib52">52</a>]. 
This overhead can be particularly severe for irregular workloads since prefetch/eviction policies become ineffective [<a href="https://arxiv.org/html/2403.09358v1#bib.bib14">14</a>]. For example, the runtime of bfs can increase by &#8764;4.5&#215; with only 125% oversubscription (i.e., exceeding memory capacity by 25%) compared to when the GPU is not oversubscribed.</strong></em>&#8221;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7vx2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81832f51-b58c-48a2-a68d-1c49120d039e_1000x324.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7vx2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81832f51-b58c-48a2-a68d-1c49120d039e_1000x324.jpeg 424w, https://substackcdn.com/image/fetch/$s_!7vx2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81832f51-b58c-48a2-a68d-1c49120d039e_1000x324.jpeg 848w, https://substackcdn.com/image/fetch/$s_!7vx2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81832f51-b58c-48a2-a68d-1c49120d039e_1000x324.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!7vx2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81832f51-b58c-48a2-a68d-1c49120d039e_1000x324.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7vx2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81832f51-b58c-48a2-a68d-1c49120d039e_1000x324.jpeg" width="1000" 
height="324" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81832f51-b58c-48a2-a68d-1c49120d039e_1000x324.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:324,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7vx2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81832f51-b58c-48a2-a68d-1c49120d039e_1000x324.jpeg 424w, https://substackcdn.com/image/fetch/$s_!7vx2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81832f51-b58c-48a2-a68d-1c49120d039e_1000x324.jpeg 848w, https://substackcdn.com/image/fetch/$s_!7vx2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81832f51-b58c-48a2-a68d-1c49120d039e_1000x324.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!7vx2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81832f51-b58c-48a2-a68d-1c49120d039e_1000x324.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">(Left) Validation of UM simulation at a fixed oversubscription ratio (log scale plot). (Right) Workloads&#8217; memory footprints used for validation. <a href="https://arxiv.org/html/2403.09358v1">Source</a></figcaption></figure></div><p>Since my brother also reads this newsletter, here is a summary so that he can follow along: Compute go up fast. Good. Memory go up, but not so fast. Bad. Low Memory make algorithms run slow. Much money lost to GPUs.</p><p>Why has memory been scaling so slowly? HBM is pinned by physics, supply consolidation, and packaging constraints. 
This is not a hardware newsletter, so here is a quick summary below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sLlo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63880eb8-a12c-4841-a830-b077bd518b4d_2400x1570.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sLlo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63880eb8-a12c-4841-a830-b077bd518b4d_2400x1570.png 424w, https://substackcdn.com/image/fetch/$s_!sLlo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63880eb8-a12c-4841-a830-b077bd518b4d_2400x1570.png 848w, https://substackcdn.com/image/fetch/$s_!sLlo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63880eb8-a12c-4841-a830-b077bd518b4d_2400x1570.png 1272w, https://substackcdn.com/image/fetch/$s_!sLlo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63880eb8-a12c-4841-a830-b077bd518b4d_2400x1570.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sLlo!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63880eb8-a12c-4841-a830-b077bd518b4d_2400x1570.png" width="1200" height="784.6153846153846" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63880eb8-a12c-4841-a830-b077bd518b4d_2400x1570.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:952,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sLlo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63880eb8-a12c-4841-a830-b077bd518b4d_2400x1570.png 424w, https://substackcdn.com/image/fetch/$s_!sLlo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63880eb8-a12c-4841-a830-b077bd518b4d_2400x1570.png 848w, https://substackcdn.com/image/fetch/$s_!sLlo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63880eb8-a12c-4841-a830-b077bd518b4d_2400x1570.png 1272w, https://substackcdn.com/image/fetch/$s_!sLlo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63880eb8-a12c-4841-a830-b077bd518b4d_2400x1570.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is why the rapid growth in compute hasn&#8217;t translated into higher hardware utilization. The memory wall Pepe&#8217;s the fuck out of any deployment.</p><p>For diffusion, this set of conditions is a massive tailwind for all the reasons mentioned earlier (faster compute, good w/ quantization, and less hit by memory). The billions in capex committed through 2028&#8202;&#8212;&#8202;from Blackwell to Rubin, MI400 to MI500&#8202;&#8212;&#8202;are a roadmap of compute-axis improvements that will stall on AR decode but compound beautifully on diffusion.</p><p>All this means that the blockers for diffusion going forward aren&#8217;t in the silicon, since the chips and manufacturing constraints are pretty friendly to diffusion. The real bottlenecks are upstream: maintaining frontier reasoning quality and solving activation memory at video scale.</p><p>But most importantly, diffusion&#8217;s economics aren&#8217;t gated by how fast each step runs on the hardware. 
They are gated by how many steps you have to run in the first place.</p><p>Let&#8217;s look at the math behind that next.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ecv_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72b53f2f-ce19-483f-9939-c2d88db30056_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ecv_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72b53f2f-ce19-483f-9939-c2d88db30056_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Ecv_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72b53f2f-ce19-483f-9939-c2d88db30056_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Ecv_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72b53f2f-ce19-483f-9939-c2d88db30056_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Ecv_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72b53f2f-ce19-483f-9939-c2d88db30056_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ecv_!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72b53f2f-ce19-483f-9939-c2d88db30056_1536x1024.png" width="1200" height="800.2747252747253" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/72b53f2f-ce19-483f-9939-c2d88db30056_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ecv_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72b53f2f-ce19-483f-9939-c2d88db30056_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Ecv_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72b53f2f-ce19-483f-9939-c2d88db30056_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Ecv_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72b53f2f-ce19-483f-9939-c2d88db30056_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Ecv_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72b53f2f-ce19-483f-9939-c2d88db30056_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A summary of the Diffusion Research and what it teaches us about where we go next.</figcaption></figure></div><h3>Section 4: How the Algorithmic Layer Dictates the Hardware Timeline</h3><p>Let&#8217;s pause and anchor exactly where we are in the stack, because this is where the money actually changes hands.</p><p>We know the architecture fixes the memory trap. We know the hardware industry accidentally built the perfect silicon for it.</p><p>But let&#8217;s be blunt: having the perfect chip is completely irrelevant if the math running on it is a nightmare. Hardware is just a dumb, incredibly expensive rock. The algorithm&#8202;&#8212;&#8202;the actual math telling the rock what to do&#8202;&#8212;&#8202;is the gatekeeper.</p><p>If diffusion requires 1,000 forward passes to generate a single image, nobody cares how perfectly it aligns with the silicon. At 1,000 steps, the unit economics are dead. 
For diffusion to actually threaten autoregression and shift billions of dollars in hardware spend, the step count&#8202;&#8212;&#8202;the Number of Function Evaluations (NFE)&#8202;&#8212;&#8202;had to collapse.</p><p>It already happened for images.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gASn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7bbd9ec-04f9-4783-a9e5-26c9bef5a4ea_2400x1569.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gASn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7bbd9ec-04f9-4783-a9e5-26c9bef5a4ea_2400x1569.png 424w, https://substackcdn.com/image/fetch/$s_!gASn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7bbd9ec-04f9-4783-a9e5-26c9bef5a4ea_2400x1569.png 848w, https://substackcdn.com/image/fetch/$s_!gASn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7bbd9ec-04f9-4783-a9e5-26c9bef5a4ea_2400x1569.png 1272w, https://substackcdn.com/image/fetch/$s_!gASn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7bbd9ec-04f9-4783-a9e5-26c9bef5a4ea_2400x1569.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gASn!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7bbd9ec-04f9-4783-a9e5-26c9bef5a4ea_2400x1569.png" width="1200" height="784.6153846153846" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c7bbd9ec-04f9-4783-a9e5-26c9bef5a4ea_2400x1569.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:952,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gASn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7bbd9ec-04f9-4783-a9e5-26c9bef5a4ea_2400x1569.png 424w, https://substackcdn.com/image/fetch/$s_!gASn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7bbd9ec-04f9-4783-a9e5-26c9bef5a4ea_2400x1569.png 848w, https://substackcdn.com/image/fetch/$s_!gASn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7bbd9ec-04f9-4783-a9e5-26c9bef5a4ea_2400x1569.png 1272w, https://substackcdn.com/image/fetch/$s_!gASn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7bbd9ec-04f9-4783-a9e5-26c9bef5a4ea_2400x1569.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In 2020, diffusion ran on stochastic math (SDEs). Because random noise accumulated at every step, the model had to take tiny, excruciatingly slow steps to keep the image from blowing up. But a year later, the math flipped. Researchers realized they could use deterministic Ordinary Differential Equations (ODEs) instead, stripping out the random noise and allowing massive jumps across the generation path. The 1,000 steps collapsed to 50 for free, and the Midjourney and Stable Diffusion API economies were born overnight.</p><p>But language is a completely different financial reality.</p><p>Images solved the step-count collapse because the math is forgiving. You can draw a smooth, continuous mathematical line between two pixel colors. 
You cannot draw a smooth curve between the word &#8220;Apple&#8221; and the word &#8220;Murder.&#8221; Masked language models operate in discrete, harsh jumps across a vocabulary space.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QwBr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b91baf-94b8-410b-9184-6196aec24b43_2400x1567.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QwBr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b91baf-94b8-410b-9184-6196aec24b43_2400x1567.png 424w, https://substackcdn.com/image/fetch/$s_!QwBr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b91baf-94b8-410b-9184-6196aec24b43_2400x1567.png 848w, https://substackcdn.com/image/fetch/$s_!QwBr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b91baf-94b8-410b-9184-6196aec24b43_2400x1567.png 1272w, https://substackcdn.com/image/fetch/$s_!QwBr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b91baf-94b8-410b-9184-6196aec24b43_2400x1567.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QwBr!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b91baf-94b8-410b-9184-6196aec24b43_2400x1567.png" width="1200" height="783.7912087912088" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/34b91baf-94b8-410b-9184-6196aec24b43_2400x1567.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:951,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QwBr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b91baf-94b8-410b-9184-6196aec24b43_2400x1567.png 424w, https://substackcdn.com/image/fetch/$s_!QwBr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b91baf-94b8-410b-9184-6196aec24b43_2400x1567.png 848w, https://substackcdn.com/image/fetch/$s_!QwBr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b91baf-94b8-410b-9184-6196aec24b43_2400x1567.png 1272w, https://substackcdn.com/image/fetch/$s_!QwBr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b91baf-94b8-410b-9184-6196aec24b43_2400x1567.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Because of this, the continuous distillation techniques that made image generation real-time fail spectacularly on text. When researchers tried porting consistency distillation to LLaDA-8B, the model bled 6 full points of accuracy on the GSM8K math benchmark.</p><p>Because discrete distillation is struggling, language diffusion is currently stuck in the 4-to-16-step regime. That is fast enough to beat autoregression in a server rack, but it fundamentally delays the &#8220;frontier reasoning on an iPhone&#8221; narrative. Until researchers crack discrete-space distillation (or we find a way to make language concepts operate in continuous space, which imo is more useful), Apple and Qualcomm are stuck waiting, and that fix is likely 18 to 36 months out. The compute required for those 4 to 16 steps ensures that the hyperscalers will maintain absolute, walled-garden dominance over text generation. 
The datacenter monopoly holds.</p><p>But inside those datacenters, the bottlenecks are shifting in ways that threaten the current capex consensus.</p><p>For a while, we assumed diffusion language models were bottlenecked by raw parameter scale. We thought they just needed bigger training runs to reason better. It turns out they just had terrible time management.</p><p>Standard masked diffusion unmasks tokens with no regard for logical order: by default, whichever tokens the model is most confident about get committed first. This means the model routinely commits to a conclusion before it has even unmasked the premises it needs to get there. It&#8217;s trying to solve the end of the maze before looking at the start. <a href="https://arxiv.org/abs/2603.26771">In 2025, the LogicDiff paper fixed this with a microscopic, 4.2M-parameter classification head that simply forced the model to unmask premises first and conclusions last.</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pxHl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dce1829-2ea6-434e-a4a8-515bab6fbddd_1518x974.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pxHl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dce1829-2ea6-434e-a4a8-515bab6fbddd_1518x974.png 424w, https://substackcdn.com/image/fetch/$s_!pxHl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dce1829-2ea6-434e-a4a8-515bab6fbddd_1518x974.png 848w, https://substackcdn.com/image/fetch/$s_!pxHl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dce1829-2ea6-434e-a4a8-515bab6fbddd_1518x974.png 1272w, 
https://substackcdn.com/image/fetch/$s_!pxHl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dce1829-2ea6-434e-a4a8-515bab6fbddd_1518x974.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pxHl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dce1829-2ea6-434e-a4a8-515bab6fbddd_1518x974.png" width="1456" height="934" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5dce1829-2ea6-434e-a4a8-515bab6fbddd_1518x974.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:934,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pxHl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dce1829-2ea6-434e-a4a8-515bab6fbddd_1518x974.png 424w, https://substackcdn.com/image/fetch/$s_!pxHl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dce1829-2ea6-434e-a4a8-515bab6fbddd_1518x974.png 848w, https://substackcdn.com/image/fetch/$s_!pxHl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dce1829-2ea6-434e-a4a8-515bab6fbddd_1518x974.png 1272w, 
https://substackcdn.com/image/fetch/$s_!pxHl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dce1829-2ea6-434e-a4a8-515bab6fbddd_1518x974.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Unmasking order comparison. Top: Default confidence-based unmasking generates numbers first and defers connectives to the last step. Bottom: LOGICDIFF unmasks premises first, then connectives, then derived results, then conclusions.</figcaption></figure></div><p>The result: LLaDA-8B jumped from 22.0% to 60.7% on GSM8K. 
A near 40-point spike without changing a single base parameter.</p><p>If you can buy 40 points of reasoning with a dirt-cheap scheduling hack instead of a $2 billion training cluster, the entire hyperscaler capex thesis starts looking very fragile. Software efficiency is quietly undercutting the necessity of building gigawatt, nuclear-powered compute clusters just to achieve frontier logic.</p><p>Lastly, it&#8217;s worth studying LLaDA-MoE, which proved that routing tokens through expert subnetworks works beautifully for diffusion text models. But MoE inherently requires streaming different expert weights from memory for every routing decision. It isn&#8217;t the relentless KV-cache tax of autoregression, but it is an expert-loading tax that brings the memory bandwidth constraint back into discussions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sB6i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F566416b0-6144-4753-9e54-f6f574577ff9_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sB6i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F566416b0-6144-4753-9e54-f6f574577ff9_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!sB6i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F566416b0-6144-4753-9e54-f6f574577ff9_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!sB6i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F566416b0-6144-4753-9e54-f6f574577ff9_1536x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!sB6i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F566416b0-6144-4753-9e54-f6f574577ff9_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sB6i!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F566416b0-6144-4753-9e54-f6f574577ff9_1536x1024.png" width="1200" height="800.2747252747253" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/566416b0-6144-4753-9e54-f6f574577ff9_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sB6i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F566416b0-6144-4753-9e54-f6f574577ff9_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!sB6i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F566416b0-6144-4753-9e54-f6f574577ff9_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!sB6i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F566416b0-6144-4753-9e54-f6f574577ff9_1536x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!sB6i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F566416b0-6144-4753-9e54-f6f574577ff9_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Section 5: How Diffusion Turns Inference into a Search Problem</h3><p>Once diffusion eats a meaningful slice of inference, the serving stack needs to be rethought deeply.</p><p>We aren&#8217;t just swapping out the model weights and keeping the same infrastructure. We are dealing with a structural mutation in what inference actually is.
We are moving from a clean, sequential generation pass to a chaotic, branching search problem.</p><p>To understand why, look at what just happened to language models.</p><h4>Inference-Time Scaling Arrives in Diffusion</h4><p>The most important paradigm shift of the last two years was the realization that throwing compute at inference beats just training a fatter model. OpenAI&#8217;s o1 and DeepSeek-R1 proved that the model that &#8220;thinks longer&#8221; wins.</p><p>Diffusion is about to do the exact same thing, but the mechanism is entirely different. Instead of &#8220;thinking longer&#8221; by generating more internal reasoning tokens, diffusion searches wider.</p><p>You don&#8217;t just run the denoiser once, roll the dice, and hope for the best. You generate a swarm of candidate trajectories, and you use a secondary model&#8202;&#8212;&#8202;a &#8220;verifier&#8221;&#8202;&#8212;&#8202;to judge them mid-flight. This was shown to us in <a href="https://arxiv.org/abs/2501.09732">Google&#8217;s &#8220;</a><em><a href="https://arxiv.org/abs/2501.09732">Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps</a></em>&#8221;. 
By investing in verifiers and better search, they were able to beat naive NFE scaling (simply spending more function evaluations on extra denoising steps).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kVyn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b8d03cf-45c2-438c-bf0f-2ad55d30310e_1530x670.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kVyn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b8d03cf-45c2-438c-bf0f-2ad55d30310e_1530x670.png 424w, https://substackcdn.com/image/fetch/$s_!kVyn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b8d03cf-45c2-438c-bf0f-2ad55d30310e_1530x670.png 848w, https://substackcdn.com/image/fetch/$s_!kVyn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b8d03cf-45c2-438c-bf0f-2ad55d30310e_1530x670.png 1272w, https://substackcdn.com/image/fetch/$s_!kVyn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b8d03cf-45c2-438c-bf0f-2ad55d30310e_1530x670.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kVyn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b8d03cf-45c2-438c-bf0f-2ad55d30310e_1530x670.png" width="1456" height="638"
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9b8d03cf-45c2-438c-bf0f-2ad55d30310e_1530x670.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:638,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kVyn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b8d03cf-45c2-438c-bf0f-2ad55d30310e_1530x670.png 424w, https://substackcdn.com/image/fetch/$s_!kVyn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b8d03cf-45c2-438c-bf0f-2ad55d30310e_1530x670.png 848w, https://substackcdn.com/image/fetch/$s_!kVyn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b8d03cf-45c2-438c-bf0f-2ad55d30310e_1530x670.png 1272w, https://substackcdn.com/image/fetch/$s_!kVyn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b8d03cf-45c2-438c-bf0f-2ad55d30310e_1530x670.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Now, researchers are experimenting with classical search algorithms like Breadth-First Search (BFS) and Depth-First Search (DFS) to see how they might benefit the generation process.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!W5Et!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a7d53e-a037-4c6c-97e2-5d1f599e31c6_2076x804.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!W5Et!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a7d53e-a037-4c6c-97e2-5d1f599e31c6_2076x804.png 424w, https://substackcdn.com/image/fetch/$s_!W5Et!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a7d53e-a037-4c6c-97e2-5d1f599e31c6_2076x804.png
848w, https://substackcdn.com/image/fetch/$s_!W5Et!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a7d53e-a037-4c6c-97e2-5d1f599e31c6_2076x804.png 1272w, https://substackcdn.com/image/fetch/$s_!W5Et!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a7d53e-a037-4c6c-97e2-5d1f599e31c6_2076x804.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!W5Et!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a7d53e-a037-4c6c-97e2-5d1f599e31c6_2076x804.png" width="1456" height="564" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b1a7d53e-a037-4c6c-97e2-5d1f599e31c6_2076x804.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:564,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!W5Et!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a7d53e-a037-4c6c-97e2-5d1f599e31c6_2076x804.png 424w, https://substackcdn.com/image/fetch/$s_!W5Et!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a7d53e-a037-4c6c-97e2-5d1f599e31c6_2076x804.png 848w, 
https://substackcdn.com/image/fetch/$s_!W5Et!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a7d53e-a037-4c6c-97e2-5d1f599e31c6_2076x804.png 1272w, https://substackcdn.com/image/fetch/$s_!W5Et!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a7d53e-a037-4c6c-97e2-5d1f599e31c6_2076x804.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://arxiv.org/abs/2505.23614">&#8220;Classical search algorithms have long underpinned modern artificial
intelligence. In this work, we tackle the challenge of inference-time control in diffusion models&#8202;&#8212;&#8202;adapting generated outputs to meet diverse test-time objectives&#8202;&#8212;&#8202;using principles from classical search. We propose a general framework that orchestrates local and global search to efficiently navigate the generative space. It employs a theoretically grounded local search via annealed Langevin MCMC and performs compute-efficient global exploration using breadth-first and depth-first tree search. We evaluate our approach on a range of challenging domains, including planning, offline reinforcement learning, and image generation. Across all tasks, we observe significant gains in both performance and efficiency. These results show that classical search provides a principled and practical foundation for inference-time scaling in diffusion models.&#8221;</a></figcaption></figure></div><p>Take the prompt &#8220;eight bottles.&#8221; Standard FLUX almost always messes up the count. But if you run a DFS search guided by an object-detection verifier, the system catches the error mid-trajectory. If the verifier sees the count is wrong at step 20, it slaps the denoiser&#8217;s hand, forces it to add noise back to an earlier level, and makes it try a different path. 
This will be similar to how well-developed agentic systems work today (using external checkers to check the generator&#8217;s work and then adjusting the next generation accordingly), but it will be baked into the generation process instead of requiring an entire harness.</p><p>The benefits of this approach might make you ask yourself a very reasonable question: Why don&#8217;t we do this with autoregressive LLMs?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!732S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9a67b7c-9dad-4833-9455-3913d36c930f_2400x1568.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!732S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9a67b7c-9dad-4833-9455-3913d36c930f_2400x1568.png 424w, https://substackcdn.com/image/fetch/$s_!732S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9a67b7c-9dad-4833-9455-3913d36c930f_2400x1568.png 848w, https://substackcdn.com/image/fetch/$s_!732S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9a67b7c-9dad-4833-9455-3913d36c930f_2400x1568.png 1272w, https://substackcdn.com/image/fetch/$s_!732S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9a67b7c-9dad-4833-9455-3913d36c930f_2400x1568.png 1456w" sizes="100vw"><img
src="https://substackcdn.com/image/fetch/$s_!732S!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9a67b7c-9dad-4833-9455-3913d36c930f_2400x1568.png" width="1200" height="783.7912087912088" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d9a67b7c-9dad-4833-9455-3913d36c930f_2400x1568.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:951,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!732S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9a67b7c-9dad-4833-9455-3913d36c930f_2400x1568.png 424w, https://substackcdn.com/image/fetch/$s_!732S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9a67b7c-9dad-4833-9455-3913d36c930f_2400x1568.png 848w, https://substackcdn.com/image/fetch/$s_!732S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9a67b7c-9dad-4833-9455-3913d36c930f_2400x1568.png 1272w, https://substackcdn.com/image/fetch/$s_!732S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9a67b7c-9dad-4833-9455-3913d36c930f_2400x1568.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4>The Math That Makes Branching Cheap</h4><p>The short answer: branching an AR model means cloning massive KV caches for every new path. The cost scales linearly: a 4x search costs you 4x the compute. It is financially ruinous at scale.</p><p>Diffusion drops that tax entirely.</p><p>Because the early denoising steps just build blurry, coarse shapes, multiple candidates can share the exact same early trajectory.
They only need to split at the very end when fine details are committed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7U2j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc30f6845-ed9b-4373-91d1-3101d900d565_2400x1614.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7U2j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc30f6845-ed9b-4373-91d1-3101d900d565_2400x1614.png 424w, https://substackcdn.com/image/fetch/$s_!7U2j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc30f6845-ed9b-4373-91d1-3101d900d565_2400x1614.png 848w, https://substackcdn.com/image/fetch/$s_!7U2j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc30f6845-ed9b-4373-91d1-3101d900d565_2400x1614.png 1272w, https://substackcdn.com/image/fetch/$s_!7U2j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc30f6845-ed9b-4373-91d1-3101d900d565_2400x1614.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7U2j!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc30f6845-ed9b-4373-91d1-3101d900d565_2400x1614.png" width="1200" height="806.8681318681319" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c30f6845-ed9b-4373-91d1-3101d900d565_2400x1614.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:979,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7U2j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc30f6845-ed9b-4373-91d1-3101d900d565_2400x1614.png 424w, https://substackcdn.com/image/fetch/$s_!7U2j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc30f6845-ed9b-4373-91d1-3101d900d565_2400x1614.png 848w, https://substackcdn.com/image/fetch/$s_!7U2j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc30f6845-ed9b-4373-91d1-3101d900d565_2400x1614.png 1272w, https://substackcdn.com/image/fetch/$s_!7U2j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc30f6845-ed9b-4373-91d1-3101d900d565_2400x1614.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The arithmetic here is brutal. For <code>k</code> candidates sharing the first <code>s</code> steps of an <code>N</code>-step trajectory, your branching multiplier (total denoiser cost relative to a single run, assuming every step costs the same and ignoring verifier overhead) is <code>m_branch = k - (k-1) * s/N</code>. If you run 4 candidates and share the first 40 of 50 steps, your cost multiplier is <code>4 - 3 * 40/50 = 1.6</code>.</p><p>You are getting a 4-candidate search for a 1.6x compute tax. If you are deploying capital into inference startups or hardware right now, that single equation should make you rethink where the value will actually flow.</p><h4>The Verifier Becomes the New Moat</h4><p>Because branching is this cheap, the verifier becomes the actual god of the system. The quality delivered to the user is no longer just the single-shot capability of the denoiser. It is the searched quality of the denoiser-verifier pair.</p><p>This splits the competitive moat right down the middle. In the autoregressive world, the $2 billion base model is everything.
In a diffusion world, a smart team can take a mid-tier, open-source denoiser, bolt on an elite, proprietary verifier suite (say, one specifically trained for medical imaging or product photography), give it a generous search budget, and absolutely annihilate a closed-source frontier model running single-shot.</p><p>The value migrates away from the hyperscalers burning gigawatts on training compute, and accrues directly to the companies building the smartest serving-stack intelligence. In the future, I also expect to see verifiers being sold as infrastructure (why make your own legal verifier when Irys has the best ones in the market?), which will unbundle intelligence tremendously.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EeHh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7424486d-5512-4b77-9905-82e1dae3c1e4_2160x1209.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EeHh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7424486d-5512-4b77-9905-82e1dae3c1e4_2160x1209.jpeg 424w, https://substackcdn.com/image/fetch/$s_!EeHh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7424486d-5512-4b77-9905-82e1dae3c1e4_2160x1209.jpeg 848w, https://substackcdn.com/image/fetch/$s_!EeHh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7424486d-5512-4b77-9905-82e1dae3c1e4_2160x1209.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!EeHh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7424486d-5512-4b77-9905-82e1dae3c1e4_2160x1209.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EeHh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7424486d-5512-4b77-9905-82e1dae3c1e4_2160x1209.jpeg" width="1456" height="815" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7424486d-5512-4b77-9905-82e1dae3c1e4_2160x1209.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:815,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EeHh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7424486d-5512-4b77-9905-82e1dae3c1e4_2160x1209.jpeg 424w, https://substackcdn.com/image/fetch/$s_!EeHh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7424486d-5512-4b77-9905-82e1dae3c1e4_2160x1209.jpeg 848w, https://substackcdn.com/image/fetch/$s_!EeHh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7424486d-5512-4b77-9905-82e1dae3c1e4_2160x1209.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!EeHh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7424486d-5512-4b77-9905-82e1dae3c1e4_2160x1209.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/reasoning-models-are-a-dead-end-breakdowns">We laid out this vision in our discussion around why Reasoning Models are a Dead End.</a></figcaption></figure></div><h4>How this will Impact ASICs</h4><p>This compound setup&#8202;&#8212;&#8202;Denoiser + Verifier + Search 
Orchestration&#8202;&#8212;&#8202;turns datacenter capacity planning into an absolute nightmare.</p><p>Current AR serving frameworks (like vLLM) are optimized for a highly predictable, bandwidth-bound decode process. Diffusion serving is a schizophrenic mess. You have a dense matmul denoiser running in FP4 fighting for accelerator space with a lightweight vision verifier, and maybe an LLM judge acting as the final arbiter. Your compute per output isn&#8217;t fixed; it varies based on how quickly the search tree converges. Your memory pressure isn&#8217;t a slow-growing KV cache; it is a sudden, violent spike of activation memory whenever a request branches.</p><p>This is the kill-shot for half the specialized hardware pitches we are seeing right now.</p><p>An ASIC designed purely for dense matmul throughput handles a denoiser beautifully in isolation. But the production workload isn&#8217;t isolated. The second an ASIC has to pause the denoiser, offload verification to a different chip, and orchestrate a backtracking search tree, the inter-chip latency eats the entire throughput advantage. This has two opposing consequences for the hardware market:</p><ol><li><p>In the short term, GPUs keep their monopoly in this space not because they are the fastest at math, but because they are the only silicon flexible enough to survive the chaos of a compound search workload.</p></li><li><p>In the longer term, this actually opens up a wedge for ASICs, if they can figure out a way to work with each other. Different verifiers, model profiles, and modalities will be best served by different specializations; what they&#8217;ll need is a system to break the work up and organize communication across them. 
<a href="https://www.artificialintelligencemadesimple.com/p/ai-orchestration-will-create-the?utm_source=publication-search">This is something we broke down at length over here</a>, and I highly recommend investing in this before the market catches up to the opportunity.</p></li></ol><p>I&#8217;d like to end this analysis by digging into this space specifically. The AI hardware market is incredibly fascinating, with multiple providers each making a specific, multi-billion-dollar claim about where the next bottleneck in the system will live. The gambler in me always gets all hot and bothered in such high-stakes situations. So let&#8217;s look at the vendors, the claims they&#8217;re making about the future, and how well they align with a Diffusion-heavy future.</p><h3>Section 6: Which Inference Vendors are Suited for Diffusion Inference Workloads</h3><h4>Why Diffusion Changes the Dynamic</h4><p>This has been a long article, so it would be helpful to take a breather and regather the important points.</p><p>For the last three years, the inference stack was built to solve autoregressive decode. Because AR models generate one token at a time, the arithmetic units sit idle waiting for weights to stream from memory. This reality forced the entire market to obsess over memory bandwidth and KV-cache management.</p><p>Diffusion changes the math entirely. 
It shifts the primary workload from bandwidth-starved sequential decode to dense, parallel matrix multiplication.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yt1x!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d27bc8-05ef-477b-a99f-a975adb8d5e7_2400x1610.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yt1x!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d27bc8-05ef-477b-a99f-a975adb8d5e7_2400x1610.png 424w, https://substackcdn.com/image/fetch/$s_!yt1x!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d27bc8-05ef-477b-a99f-a975adb8d5e7_2400x1610.png 848w, https://substackcdn.com/image/fetch/$s_!yt1x!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d27bc8-05ef-477b-a99f-a975adb8d5e7_2400x1610.png 1272w, https://substackcdn.com/image/fetch/$s_!yt1x!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d27bc8-05ef-477b-a99f-a975adb8d5e7_2400x1610.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yt1x!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d27bc8-05ef-477b-a99f-a975adb8d5e7_2400x1610.png" width="1200" height="805.2197802197802" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a5d27bc8-05ef-477b-a99f-a975adb8d5e7_2400x1610.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:977,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yt1x!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d27bc8-05ef-477b-a99f-a975adb8d5e7_2400x1610.png 424w, https://substackcdn.com/image/fetch/$s_!yt1x!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d27bc8-05ef-477b-a99f-a975adb8d5e7_2400x1610.png 848w, https://substackcdn.com/image/fetch/$s_!yt1x!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d27bc8-05ef-477b-a99f-a975adb8d5e7_2400x1610.png 1272w, https://substackcdn.com/image/fetch/$s_!yt1x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d27bc8-05ef-477b-a99f-a975adb8d5e7_2400x1610.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>But diffusion inference is not just a clean denoiser. It is a compound system:</p><p><code>Text Encoder &#8594; Denoiser Loop &#8594; Adapters/Control &#8594; Verifier Search &#8594; Safety Filter &#8594; VAE Decoder &#8594; Output</code></p><p>This pipeline is exactly why the hardware market is fracturing. Different vendors built their chips to solve only specific parts of this equation. Here is how the physical silicon actually maps to the math.</p><h4>NVIDIA: The Moat is the Swamp</h4><p>Nvidia&#8217;s primary advantage is not raw speed. It is flexibility.</p><p>Production diffusion is a heterogeneous swamp. A real-world creative workflow requires a text encoder, the primary denoiser, multiple LoRAs, an IP-Adapter, a safety classifier, a verifier, and a VAE decoder. 
GPUs are uniquely suited to survive this kind of chaos.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y-Ll!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8053537-f9ab-4cab-aa0d-f1536fdb5129_1600x1375.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y-Ll!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8053537-f9ab-4cab-aa0d-f1536fdb5129_1600x1375.png 424w, https://substackcdn.com/image/fetch/$s_!y-Ll!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8053537-f9ab-4cab-aa0d-f1536fdb5129_1600x1375.png 848w, https://substackcdn.com/image/fetch/$s_!y-Ll!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8053537-f9ab-4cab-aa0d-f1536fdb5129_1600x1375.png 1272w, https://substackcdn.com/image/fetch/$s_!y-Ll!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8053537-f9ab-4cab-aa0d-f1536fdb5129_1600x1375.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y-Ll!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8053537-f9ab-4cab-aa0d-f1536fdb5129_1600x1375.png" width="1456" height="1251" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8053537-f9ab-4cab-aa0d-f1536fdb5129_1600x1375.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1251,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y-Ll!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8053537-f9ab-4cab-aa0d-f1536fdb5129_1600x1375.png 424w, https://substackcdn.com/image/fetch/$s_!y-Ll!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8053537-f9ab-4cab-aa0d-f1536fdb5129_1600x1375.png 848w, https://substackcdn.com/image/fetch/$s_!y-Ll!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8053537-f9ab-4cab-aa0d-f1536fdb5129_1600x1375.png 1272w, https://substackcdn.com/image/fetch/$s_!y-Ll!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8053537-f9ab-4cab-aa0d-f1536fdb5129_1600x1375.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/ai-isnt-a-software-business-anymore">AI Isn&#8217;t a Software Business Anymore</a></figcaption></figure></div><p>An ASIC might execute the denoising loop faster, but Nvidia can run the denoiser, the LLM judge, and the adapter stack simultaneously because CUDA supports all of them. As long as diffusion workflows require running a shifting society of models on the same machine, the Blackwell and Rubin architectures will maintain their dominance. 
General compute outlives narrow accelerators during periods of structural transition.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GfhS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630435c8-5891-4b08-a530-0b7def28e8bc_2400x1578.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GfhS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630435c8-5891-4b08-a530-0b7def28e8bc_2400x1578.png 424w, https://substackcdn.com/image/fetch/$s_!GfhS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630435c8-5891-4b08-a530-0b7def28e8bc_2400x1578.png 848w, https://substackcdn.com/image/fetch/$s_!GfhS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630435c8-5891-4b08-a530-0b7def28e8bc_2400x1578.png 1272w, https://substackcdn.com/image/fetch/$s_!GfhS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630435c8-5891-4b08-a530-0b7def28e8bc_2400x1578.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GfhS!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630435c8-5891-4b08-a530-0b7def28e8bc_2400x1578.png" width="1200" height="788.7362637362637" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/630435c8-5891-4b08-a530-0b7def28e8bc_2400x1578.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:957,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GfhS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630435c8-5891-4b08-a530-0b7def28e8bc_2400x1578.png 424w, https://substackcdn.com/image/fetch/$s_!GfhS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630435c8-5891-4b08-a530-0b7def28e8bc_2400x1578.png 848w, https://substackcdn.com/image/fetch/$s_!GfhS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630435c8-5891-4b08-a530-0b7def28e8bc_2400x1578.png 1272w, https://substackcdn.com/image/fetch/$s_!GfhS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630435c8-5891-4b08-a530-0b7def28e8bc_2400x1578.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><h4>AMD: The Memory-Capacity Hedge</h4><p>AMD&#8217;s thesis isn&#8217;t just &#8220;Nvidia but cheaper.&#8221; AMD&#8217;s MI300X and MI400-class architectures are explicitly built around massive High-Bandwidth Memory (HBM) capacity and open rack-scale infrastructure.</p><blockquote><p><em>&#8220;We believe that the MI355X could be competitive against the HGX B200 for small to medium LLMs production inference workloads. This is because the MI355X total cost of ownership is 33% lower than that of the HGX B200 for self-owned clusters, while it delivers much more HBM memory capacity, slightly more FP8 and FP4 TFLOP/s and double the FP6 TFLOP/s. Rapid improvements to AMD software under the leadership of Anush, AMD&#8217;s AI Software King, will also push the MI355X&#8217;s relative performance per TCO advantage higher.&#8221; &#8212; <a href="https://newsletter.semianalysis.com/p/amd-advancing-ai-mi350x-and-mi400-ualoe72-mi500-ual256">SemiAnalysis</a>. 
</em><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;SemiAnalysis&quot;,&quot;id&quot;:6349492,&quot;type&quot;:&quot;pub&quot;,&quot;url&quot;:&quot;https://open.substack.com/pub/semianalysis&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/88ad87ad-b5c5-4687-b13e-672f72725795_501x501.png&quot;,&quot;uuid&quot;:&quot;19829a41-a9b5-4e6e-b4cc-781ddca727a7&quot;}" data-component-name="MentionToDOM"></span> </p></blockquote><p>For image diffusion, compute is the primary bottleneck. For video diffusion, the bottleneck shifts to state. Longer video clips mean more tokens, which drastically increases the size of the intermediate activations. Suddenly, raw memory capacity matters just as much as peak compute. If the market stays centered on images, Nvidia keeps its current advantage. But if the market shifts heavily to video generation and Mixture of Experts (MoE) denoisers, AMD&#8217;s capacity-first architecture is positioned exactly where the bottleneck moves.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uHeO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uHeO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png 424w, https://substackcdn.com/image/fetch/$s_!uHeO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png 848w, 
https://substackcdn.com/image/fetch/$s_!uHeO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png 1272w, https://substackcdn.com/image/fetch/$s_!uHeO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uHeO!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png" width="1200" height="799.4505494505495" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:970,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:3755591,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/195835178?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uHeO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png 424w, 
https://substackcdn.com/image/fetch/$s_!uHeO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png 848w, https://substackcdn.com/image/fetch/$s_!uHeO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png 1272w, https://substackcdn.com/image/fetch/$s_!uHeO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd638682a-9f7a-43d1-949f-9d1af37e08df_2072x1380.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Jfka!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb45e84-5e15-472f-bb6b-fa528ff57d07_1230x652.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Jfka!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb45e84-5e15-472f-bb6b-fa528ff57d07_1230x652.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Jfka!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb45e84-5e15-472f-bb6b-fa528ff57d07_1230x652.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Jfka!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb45e84-5e15-472f-bb6b-fa528ff57d07_1230x652.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Jfka!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb45e84-5e15-472f-bb6b-fa528ff57d07_1230x652.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Jfka!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb45e84-5e15-472f-bb6b-fa528ff57d07_1230x652.jpeg" width="1230" height="652" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3eb45e84-5e15-472f-bb6b-fa528ff57d07_1230x652.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:652,&quot;width&quot;:1230,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Jfka!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb45e84-5e15-472f-bb6b-fa528ff57d07_1230x652.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Jfka!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb45e84-5e15-472f-bb6b-fa528ff57d07_1230x652.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Jfka!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb45e84-5e15-472f-bb6b-fa528ff57d07_1230x652.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Jfka!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb45e84-5e15-472f-bb6b-fa528ff57d07_1230x652.jpeg 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><h4>Google TPU &amp; AWS Trainium: Factories and Procurement</h4><p>Google&#8217;s TPUs (like the new Trillium and Ironwood chips) are diffusion factories. Their architecture relies on systolic arrays and the XLA compiler. If Google can freeze the generation graph&#8202;&#8212;&#8202;using the same model, the same resolution, and a fixed step count at massive volume&#8202;&#8212;&#8202;the compiler maps the math perfectly to the hardware. They win tightly controlled, internal workloads like Vertex AI and Veo. But they struggle with open diffusion, where users demand custom schedulers, dynamic branching, and constant adapter swapping. The TPU compiler hates a graph that won&#8217;t sit still.</p><p>AWS Trainium plays a different role. It is a procurement weapon. 
AWS doesn&#8217;t need to win the frontier; it simply needs Trainium to be cheap and integrated enough that enterprises keep their stable, batch-generation marketing workflows inside the AWS ecosystem.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vs4_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4719b7b-32af-43f5-b142-a6d9ada52d1d_2400x1590.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Vs4_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4719b7b-32af-43f5-b142-a6d9ada52d1d_2400x1590.png 424w, https://substackcdn.com/image/fetch/$s_!Vs4_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4719b7b-32af-43f5-b142-a6d9ada52d1d_2400x1590.png 848w, https://substackcdn.com/image/fetch/$s_!Vs4_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4719b7b-32af-43f5-b142-a6d9ada52d1d_2400x1590.png 1272w, https://substackcdn.com/image/fetch/$s_!Vs4_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4719b7b-32af-43f5-b142-a6d9ada52d1d_2400x1590.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Vs4_!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4719b7b-32af-43f5-b142-a6d9ada52d1d_2400x1590.png" width="1200" height="795.3296703296703" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f4719b7b-32af-43f5-b142-a6d9ada52d1d_2400x1590.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:965,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Vs4_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4719b7b-32af-43f5-b142-a6d9ada52d1d_2400x1590.png 424w, https://substackcdn.com/image/fetch/$s_!Vs4_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4719b7b-32af-43f5-b142-a6d9ada52d1d_2400x1590.png 848w, https://substackcdn.com/image/fetch/$s_!Vs4_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4719b7b-32af-43f5-b142-a6d9ada52d1d_2400x1590.png 1272w, https://substackcdn.com/image/fetch/$s_!Vs4_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4719b7b-32af-43f5-b142-a6d9ada52d1d_2400x1590.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4>Furiosa and d-Matrix: The Efficiency Bets</h4><p>These two startups looked at the power consumption of GPUs and built architectures to fix it, but in completely different ways.</p><ul><li><p><strong>Furiosa&#8217;s RNGD</strong> chip abandons traditional matrix-multiplication units entirely. Instead, Furiosa built a Tensor Contraction Processor (TCP) that treats tensor contractions, the general operation behind the matrix multiplications in attention, as the fundamental hardware primitive.</p></li><li><p><strong>d-Matrix&#8217;s Corsair</strong>, on the other hand, uses Digital In-Memory Compute (DIMC). d-Matrix physically moved the arithmetic logic right next to the SRAM cells to stop wasting power dragging data back and forth.</p></li></ul><p><em><strong>Because diffusion relies on repeated passes through the same denoising backbone, both architectures map to the workload cleanly. But there is a catch. 
This efficiency only holds if the surrounding pipeline&#8202;&#8212;&#8202;the verifiers, VAEs, and safety models&#8202;&#8212;&#8202;can also be executed on the chip. If video activations grow larger than d-Matrix&#8217;s SRAM or Furiosa&#8217;s 48GB of HBM3 can comfortably hold, the efficiency breaks. They win image-scale denoising, but face steep challenges at video scale.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4JWL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe823204-96fe-4437-80c3-ce6ff058e9ce_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4JWL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe823204-96fe-4437-80c3-ce6ff058e9ce_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!4JWL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe823204-96fe-4437-80c3-ce6ff058e9ce_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!4JWL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe823204-96fe-4437-80c3-ce6ff058e9ce_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!4JWL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe823204-96fe-4437-80c3-ce6ff058e9ce_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4JWL!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe823204-96fe-4437-80c3-ce6ff058e9ce_1536x1024.png" width="1200" 
height="800.2747252747253" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be823204-96fe-4437-80c3-ce6ff058e9ce_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4JWL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe823204-96fe-4437-80c3-ce6ff058e9ce_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!4JWL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe823204-96fe-4437-80c3-ce6ff058e9ce_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!4JWL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe823204-96fe-4437-80c3-ce6ff058e9ce_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!4JWL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe823204-96fe-4437-80c3-ce6ff058e9ce_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4>Groq: Token Generation in a Block-Refinement World</h4><p>Groq built an exceptional machine for deterministic token generation. Their LPU (Language Processing Unit) throws out HBM entirely, relying instead on massive pools of extremely fast SRAM and a compiler that schedules every piece of data movement ahead of time. It is a machine built to stream autoregressive (AR) tokens, one after another.</p><p>But if inference shifts toward block-level parallel refinement and full-latent denoising passes, Groq&#8217;s religion is challenged. In a compound diffusion pipeline, they are unlikely to be the primary renderer. 
Instead, they survive as the ultra-fast LLM planner or verifier judge sitting next to the diffusion engine, handling the language-side logic while a GPU processes the pixels.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xOry!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff791d5a0-0b42-4c12-8312-9c1c5701c6be_2400x1607.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xOry!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff791d5a0-0b42-4c12-8312-9c1c5701c6be_2400x1607.png 424w, https://substackcdn.com/image/fetch/$s_!xOry!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff791d5a0-0b42-4c12-8312-9c1c5701c6be_2400x1607.png 848w, https://substackcdn.com/image/fetch/$s_!xOry!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff791d5a0-0b42-4c12-8312-9c1c5701c6be_2400x1607.png 1272w, https://substackcdn.com/image/fetch/$s_!xOry!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff791d5a0-0b42-4c12-8312-9c1c5701c6be_2400x1607.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xOry!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff791d5a0-0b42-4c12-8312-9c1c5701c6be_2400x1607.png" width="1200" height="803.5714285714286" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f791d5a0-0b42-4c12-8312-9c1c5701c6be_2400x1607.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:975,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xOry!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff791d5a0-0b42-4c12-8312-9c1c5701c6be_2400x1607.png 424w, https://substackcdn.com/image/fetch/$s_!xOry!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff791d5a0-0b42-4c12-8312-9c1c5701c6be_2400x1607.png 848w, https://substackcdn.com/image/fetch/$s_!xOry!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff791d5a0-0b42-4c12-8312-9c1c5701c6be_2400x1607.png 1272w, https://substackcdn.com/image/fetch/$s_!xOry!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff791d5a0-0b42-4c12-8312-9c1c5701c6be_2400x1607.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4>Cerebras: The Wafer-Scale Locality Limit</h4><p>Cerebras does not build chips; they build wafers. The WSE-3 is the size of a dinner plate. By keeping everything on one giant piece of silicon with massive amounts of SRAM, they eliminate the slow, expensive interconnect latency of moving data between traditional GPUs.</p><p>For a static, continuous denoising loop, this is an elegant solution. But production pipelines require orchestrating multiple side-models, and video activations can easily exceed local SRAM limits. Once the working state forces the system to move data off the wafer, the architecture loses its primary advantage. 
Cerebras is highly effective for a single, massive denoiser, but struggles with the messy reality of multi-model video serving.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bBr4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc3715dc-c010-4101-989a-466fc7374e83_2400x1599.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bBr4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc3715dc-c010-4101-989a-466fc7374e83_2400x1599.png 424w, https://substackcdn.com/image/fetch/$s_!bBr4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc3715dc-c010-4101-989a-466fc7374e83_2400x1599.png 848w, https://substackcdn.com/image/fetch/$s_!bBr4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc3715dc-c010-4101-989a-466fc7374e83_2400x1599.png 1272w, https://substackcdn.com/image/fetch/$s_!bBr4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc3715dc-c010-4101-989a-466fc7374e83_2400x1599.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bBr4!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc3715dc-c010-4101-989a-466fc7374e83_2400x1599.png" width="1200" height="799.4505494505495" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc3715dc-c010-4101-989a-466fc7374e83_2400x1599.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:970,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bBr4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc3715dc-c010-4101-989a-466fc7374e83_2400x1599.png 424w, https://substackcdn.com/image/fetch/$s_!bBr4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc3715dc-c010-4101-989a-466fc7374e83_2400x1599.png 848w, https://substackcdn.com/image/fetch/$s_!bBr4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc3715dc-c010-4101-989a-466fc7374e83_2400x1599.png 1272w, https://substackcdn.com/image/fetch/$s_!bBr4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc3715dc-c010-4101-989a-466fc7374e83_2400x1599.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4>Etched: The Transformer Ultimatum</h4><p>Etched made the most aggressive bet in the market with Sohu. They ripped out all the general-purpose programmability of a GPU and physically hardwired the Transformer architecture directly into the silicon.</p><p>If image and video diffusion standardize completely around Diffusion Transformers (DiTs), Etched is a monster. But diffusion transformers are not identical to causal LLMs. They need bidirectional attention, multimodal token streams, and variable resolutions. If Etched&#8217;s architecture is too rigidly optimized for the causal LLMs of 2024, diffusion will run poorly on it. It is a gorgeous, specialized knife waiting to see if the industry standardizes on the exact vegetable it was built to slice.</p><h4>Apple &amp; Qualcomm: The Edge Volume</h4><p>Apple and Qualcomm are betting on the collapse of the step count. 
As discussed in Section 4, once image distillation drops the required forward passes to between 1 and 4 steps, the math becomes cheap enough to run locally on consumer NPUs, helped on Apple devices by the Unified Memory architecture.</p><p>Local generation is private, instant, and carries zero marginal cost for the provider. Datacenters will retain the premium video revenue and heavy enterprise pipelines, but Apple and Qualcomm are positioned to completely strip-mine the bottom of the consumer image market. The edge wins the volume; the cloud retains the revenue.</p><h4>Lightmatter and Astera Labs: The Fabric Layer</h4><p>Lightmatter and Astera Labs are betting on the physical limits of copper wire.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9O-J!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3499e983-5b95-4e5e-92a5-efd9394841d9_2400x1603.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9O-J!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3499e983-5b95-4e5e-92a5-efd9394841d9_2400x1603.png 424w, https://substackcdn.com/image/fetch/$s_!9O-J!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3499e983-5b95-4e5e-92a5-efd9394841d9_2400x1603.png 848w, https://substackcdn.com/image/fetch/$s_!9O-J!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3499e983-5b95-4e5e-92a5-efd9394841d9_2400x1603.png 1272w, 
https://substackcdn.com/image/fetch/$s_!9O-J!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3499e983-5b95-4e5e-92a5-efd9394841d9_2400x1603.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9O-J!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3499e983-5b95-4e5e-92a5-efd9394841d9_2400x1603.png" width="1200" height="801.0989010989011" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3499e983-5b95-4e5e-92a5-efd9394841d9_2400x1603.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:972,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9O-J!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3499e983-5b95-4e5e-92a5-efd9394841d9_2400x1603.png 424w, https://substackcdn.com/image/fetch/$s_!9O-J!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3499e983-5b95-4e5e-92a5-efd9394841d9_2400x1603.png 848w, https://substackcdn.com/image/fetch/$s_!9O-J!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3499e983-5b95-4e5e-92a5-efd9394841d9_2400x1603.png 1272w, 
https://substackcdn.com/image/fetch/$s_!9O-J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3499e983-5b95-4e5e-92a5-efd9394841d9_2400x1603.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>A standard image model does not require a photonic spiritual awakening. But long video diffusion and world models create massive token counts and severe temporal attention costs, and they force multi-device parallelism. When activation memory exceeds what a single server rack can hold, the interconnect fabric becomes the foundational bottleneck. 
Lightmatter&#8217;s Passage platform uses 3D-stacked silicon photonics to move data as light instead of as electrical signals, hitting 1.6 Tbps per fiber. These companies win when video diffusion scale outruns algorithmic compression, forcing the rack itself to become the computer.</p><h3>Conclusion: Where This Is All Headed</h3><p>Every major computing paradigm eventually shifts the locus of value from the raw engine to the orchestration layer. In the early internet, we obsessed over physical servers and networking hardware. Eventually, the servers commoditized. The real money moved to the platforms routing the traffic.</p><p>AI is about to undergo the exact same unbundling. For three years, we have treated foundation models like indivisible, magical brains. But if diffusion turns generation into a branching search problem, the monolithic model dies. It becomes a supply chain. You will have cheap open-source denoisers, elite proprietary verifiers, and dynamic schedulers, all negotiating with each other in real time.</p><p>The history of tech is just the history of expensive, integrated systems being ripped apart by cheaper, modular components. The closed labs and hyperscalers are fighting a brutal, capital-intensive war to own the integrated brain. But if inference is no longer a straight line, the most valuable real estate won&#8217;t be the base weights or the silicon. It will be the routing logic. The question is no longer who builds the biggest engine. 
It is who builds the best map.</p><p>Thank you for being here, and I hope you have a wonderful day,</p><p>Dev &lt;3</p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines.</a></p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. 
</strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ys0B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b2eecc-5fd1-4515-89e2-078b27020d44_447x117.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ys0B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b2eecc-5fd1-4515-89e2-078b27020d44_447x117.png 424w, https://substackcdn.com/image/fetch/$s_!Ys0B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b2eecc-5fd1-4515-89e2-078b27020d44_447x117.png 848w, https://substackcdn.com/image/fetch/$s_!Ys0B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b2eecc-5fd1-4515-89e2-078b27020d44_447x117.png 1272w, https://substackcdn.com/image/fetch/$s_!Ys0B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b2eecc-5fd1-4515-89e2-078b27020d44_447x117.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ys0B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b2eecc-5fd1-4515-89e2-078b27020d44_447x117.png" width="447" height="117" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/34b2eecc-5fd1-4515-89e2-078b27020d44_447x117.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:117,&quot;width&quot;:447,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ys0B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b2eecc-5fd1-4515-89e2-078b27020d44_447x117.png 424w, https://substackcdn.com/image/fetch/$s_!Ys0B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b2eecc-5fd1-4515-89e2-078b27020d44_447x117.png 848w, https://substackcdn.com/image/fetch/$s_!Ys0B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b2eecc-5fd1-4515-89e2-078b27020d44_447x117.png 1272w, https://substackcdn.com/image/fetch/$s_!Ys0B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b2eecc-5fd1-4515-89e2-078b27020d44_447x117.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Reach out to me</h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- 
https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>Check out my other articles on Medium: </p><p>https://machine-learning-made-simple.medium.com/</p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[Google's Gemma 4 will Change How AI Models are Built]]></title><description><![CDATA[Breaking down the architectural decisions Google made &#8212; and why edge and server models are built on opposite logic.]]></description><link>https://www.artificialintelligencemadesimple.com/p/googles-gemma-4-will-change-how-ai</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/googles-gemma-4-will-change-how-ai</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Wed, 22 Apr 2026 02:04:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!J65w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. <strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>Everyone is staring at the Gemma 4 benchmarks right now, but that misses the actual design shift. Google shipped four models, but if you look under the hood, you&#8217;re looking at two entirely divergent architectures: E2B and E4B for phones, and 26B and 31B for servers. 
Unlike traditional multi-model rollouts (where it&#8217;s the same core architecture at different scales), <strong>the edge pair and server pair have very different DNAs.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!J65w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!J65w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png 424w, https://substackcdn.com/image/fetch/$s_!J65w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png 848w, https://substackcdn.com/image/fetch/$s_!J65w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png 1272w, https://substackcdn.com/image/fetch/$s_!J65w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!J65w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png" width="1456" height="855" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:855,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!J65w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png 424w, https://substackcdn.com/image/fetch/$s_!J65w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png 848w, https://substackcdn.com/image/fetch/$s_!J65w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png 1272w, https://substackcdn.com/image/fetch/$s_!J65w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183c44f1-7aff-4216-9c3e-84a6f8218bb5_1600x940.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://deepmind.google/models/gemma/gemma-4/">When you go to the DeepMind blog on this, you will see this bifurcation explicitly.</a></figcaption></figure></div><p>Why? Because the physical constraints of a phone and an H100 are exact opposites.</p><p>Think about the hardware. A phone has abundant flash but starves for DRAM: you&#8217;ve typically got 128GB of storage against maybe 8GB of shared memory, all bound by a battery that dies if you hit it wrong. A server gives you DRAM in abundance and FLOPS you pay for by the hour. Every architectural choice is a trade between memory, storage, and compute. When that scarcity flips, you have to pull opposite levers: you compress memory at the cost of compute on an edge device, and you spend memory to save compute on a server. 
That&#8217;s why one architectural DNA can&#8217;t survive both deployments.</p><p>Gemma solves this through different architectural DNAs, and I think that&#8217;s the real signal about where the industry&#8217;s headed&#8202;&#8212;&#8202;one family name, very different architectures underneath depending on where the model has to run. This has several implications for the future of AI, all of which go way beyond simply discussing the model choices/benchmark numbers.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f4ot!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61422223-0fc3-4c91-9be7-19178e8bd706_2068x1164.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f4ot!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61422223-0fc3-4c91-9be7-19178e8bd706_2068x1164.png 424w, https://substackcdn.com/image/fetch/$s_!f4ot!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61422223-0fc3-4c91-9be7-19178e8bd706_2068x1164.png 848w, https://substackcdn.com/image/fetch/$s_!f4ot!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61422223-0fc3-4c91-9be7-19178e8bd706_2068x1164.png 1272w, https://substackcdn.com/image/fetch/$s_!f4ot!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61422223-0fc3-4c91-9be7-19178e8bd706_2068x1164.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!f4ot!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61422223-0fc3-4c91-9be7-19178e8bd706_2068x1164.png" width="1456" height="820" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/61422223-0fc3-4c91-9be7-19178e8bd706_2068x1164.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:820,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f4ot!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61422223-0fc3-4c91-9be7-19178e8bd706_2068x1164.png 424w, https://substackcdn.com/image/fetch/$s_!f4ot!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61422223-0fc3-4c91-9be7-19178e8bd706_2068x1164.png 848w, https://substackcdn.com/image/fetch/$s_!f4ot!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61422223-0fc3-4c91-9be7-19178e8bd706_2068x1164.png 1272w, https://substackcdn.com/image/fetch/$s_!f4ot!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61422223-0fc3-4c91-9be7-19178e8bd706_2068x1164.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">There are some pretty impressive numbers though.</figcaption></figure></div><p>In this article, we will cover:</p><ul><li><p><strong>The two constraints:</strong> the KV cache scaling and narrow hidden dimensions forcing this entire split.</p></li><li><p><strong>Per-layer embeddings:</strong> why the E2B blows half its weight budget on flash lookups, and why the server models don&#8217;t bother.</p></li><li><p><strong>Interleaved local-global attention:</strong> where Gemma 4 actually pays the O(n&#178;) compute tax, and where it doesn&#8217;t.</p></li><li><p><strong>Divergent GQA:</strong> how edge compresses KV everywhere, while server compresses it selectively.</p></li><li><p><strong>Cross-layer KV sharing:</strong> the edge-only trick that slashes cache by 
83%, and the MLP tax it charges you to do it.</p></li><li><p><strong>Partial RoPE:</strong> rotating only 25% of dimensions so the content actually has room to breathe.</p></li><li><p><strong>Hybrid MoE:</strong> the always-on dense FFN that makes running 4x Mixtral&#8217;s sparsity safe in production.</p></li><li><p><strong>The FlashAttention-2 (FA2) serving break:</strong> why a 512 head dimension costs pre-Blackwell GPUs roughly 14x in throughput.</p></li><li><p><strong>The deployment framework:</strong> which model earns its place where, and exactly what breaks in practice.</p></li></ul><p>Let&#8217;s have some fun.</p><h3>Executive Highlights (TL;DR of the Article)</h3><ul><li><p><strong>The Core Thesis:</strong> The benchmark hype misses the real story. Gemma 4 is actually two entirely divergent architectures marketed under one name. Edge models (E2B/E4B) and server models (26B/31B) share almost nothing because a phone&#8217;s constraints (abundant flash, scarce DRAM) are the exact opposite of an H100&#8217;s constraints.</p></li><li><p><strong>Per-Layer Embeddings (Edge):</strong> The E2B blows nearly half its parameter budget (46%) on flash-based lookup tables. This prevents token meanings from colliding in a narrow hidden state, boosting reasoning without touching the phone&#8217;s limited DRAM. Server models skip this entirely.</p></li><li><p><strong>Interleaving &amp; The KV Cache War:</strong> To survive massive contexts, Gemma 4 alternates between cheap local attention and expensive global attention. On the edge, it goes further by heavily compressing the KV cache and reusing it across layers, shrinking the cache from gigabytes to megabytes so 128K context can actually fit on a phone.</p></li><li><p><strong>Partial RoPE:</strong> At 128K context, standard positional encoding acts as noise, scrambling semantic meaning. 
Gemma 4 fixes this by rotating only 25% of the vector&#8217;s dimensions to track position, leaving the other 75% clean to act as pure, undistorted content channels.</p></li><li><p><strong>Hybrid MoE (Server):</strong> The 26B server model pushes a hyper-aggressive 128-expert routing setup. It survives this extreme sparsity only because an always-on dense feed-forward network runs alongside the experts, acting as a structural safety net against routing failures.</p></li><li><p><strong>The Infrastructure Break:</strong> The global attention layers require a head dimension of 512. This breaks FlashAttention-2 compatibility. If you run Gemma 4 on pre-Blackwell hardware (like an H100 or 4090) today, you will take a roughly 14x throughput hit until the open-source serving stack patches the issue.</p></li><li><p><strong>The Big Picture:</strong> Uniform scaling is dead. We are witnessing the unbundling of the one-size-fits-all transformer. The labs that win the next cycle won&#8217;t just scale brute-force compute; they will engineer specialized architectures that exploit the exact physics of the hardware they run on.</p></li></ul><p><em>I put a lot of work into writing this newsletter. To do so, I rely on you for support. If a few more people choose to become paid subscribers, the Chocolate Milk Cult can continue to provide high-quality and accessible education and opportunities to anyone who needs it. If you think this mission is worth contributing to, please consider a premium subscription. 
You can do so for less than the cost of a Netflix Subscription <a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">(pay what you want here)</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DGz8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920d7d68-2ef8-4ac4-9590-07cb942415a0_576x236.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DGz8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920d7d68-2ef8-4ac4-9590-07cb942415a0_576x236.png 424w, https://substackcdn.com/image/fetch/$s_!DGz8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920d7d68-2ef8-4ac4-9590-07cb942415a0_576x236.png 848w, https://substackcdn.com/image/fetch/$s_!DGz8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920d7d68-2ef8-4ac4-9590-07cb942415a0_576x236.png 1272w, https://substackcdn.com/image/fetch/$s_!DGz8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920d7d68-2ef8-4ac4-9590-07cb942415a0_576x236.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!DGz8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920d7d68-2ef8-4ac4-9590-07cb942415a0_576x236.png" width="576" height="236" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/920d7d68-2ef8-4ac4-9590-07cb942415a0_576x236.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:236,&quot;width&quot;:576,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DGz8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920d7d68-2ef8-4ac4-9590-07cb942415a0_576x236.png 424w, https://substackcdn.com/image/fetch/$s_!DGz8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920d7d68-2ef8-4ac4-9590-07cb942415a0_576x236.png 848w, https://substackcdn.com/image/fetch/$s_!DGz8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920d7d68-2ef8-4ac4-9590-07cb942415a0_576x236.png 1272w, https://substackcdn.com/image/fetch/$s_!DGz8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920d7d68-2ef8-4ac4-9590-07cb942415a0_576x236.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><em>I provide various consulting and advisory services. 
If you&#8217;d like to explore how we can work together, <a href="https://linktr.ee/iseethings404">reach out to me through any of my socials over here</a> or reply to this email.</em></p><h3>What Is Per-Layer Embedding and Why Does Only Edge Gemma 4 Use It?</h3><p>Every token enters a transformer through a massive lookup table that maps the raw token into a vector. That vector becomes the main signal flowing through the entire model. Attention and FFN layers reshape it at every step, but they are all transforming the same base vector that started life as a table entry.</p><p>That starting vector has a brutal job: it has to carry the token&#8217;s raw identity (the word &#8220;bank&#8221;) and its contextual potential (money vs. river) simultaneously. In a massive model like Llama 3 70B (hidden_size 8,192), there is enough bandwidth for both signals to travel without stepping on each other. Different dimensions specialize; the signal has room to breathe.</p><p>But E2B runs at a hidden_size of 1,536. At less than a fifth of that width (1,536 / 8,192 &#8776; 0.19), the signals suffocate. 
&#8220;Bank the river&#8221; and &#8220;bank the money&#8221; compete for the same constrained coordinates, and every downstream attention layer inherits that collision.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tqTy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28008b3e-cbc1-4556-a54f-777c252c31e4_1440x964.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tqTy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28008b3e-cbc1-4556-a54f-777c252c31e4_1440x964.png 424w, https://substackcdn.com/image/fetch/$s_!tqTy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28008b3e-cbc1-4556-a54f-777c252c31e4_1440x964.png 848w, https://substackcdn.com/image/fetch/$s_!tqTy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28008b3e-cbc1-4556-a54f-777c252c31e4_1440x964.png 1272w, https://substackcdn.com/image/fetch/$s_!tqTy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28008b3e-cbc1-4556-a54f-777c252c31e4_1440x964.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tqTy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28008b3e-cbc1-4556-a54f-777c252c31e4_1440x964.png" width="1440" height="964" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28008b3e-cbc1-4556-a54f-777c252c31e4_1440x964.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:964,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tqTy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28008b3e-cbc1-4556-a54f-777c252c31e4_1440x964.png 424w, https://substackcdn.com/image/fetch/$s_!tqTy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28008b3e-cbc1-4556-a54f-777c252c31e4_1440x964.png 848w, https://substackcdn.com/image/fetch/$s_!tqTy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28008b3e-cbc1-4556-a54f-777c252c31e4_1440x964.png 1272w, https://substackcdn.com/image/fetch/$s_!tqTy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28008b3e-cbc1-4556-a54f-777c252c31e4_1440x964.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The naive fix is to bolt capacity downstream using adapters, LoRA, or extra trainable projections. But downstream math can only transform what survived the initial lookup. 
If the embedding flattened two meanings into the same direction, no amount of downstream projection unflattens them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WXAI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4188a2fb-c609-4efb-9ab3-e12762d639d3_750x1000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WXAI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4188a2fb-c609-4efb-9ab3-e12762d639d3_750x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WXAI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4188a2fb-c609-4efb-9ab3-e12762d639d3_750x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WXAI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4188a2fb-c609-4efb-9ab3-e12762d639d3_750x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WXAI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4188a2fb-c609-4efb-9ab3-e12762d639d3_750x1000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WXAI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4188a2fb-c609-4efb-9ab3-e12762d639d3_750x1000.jpeg" width="750" height="1000" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4188a2fb-c609-4efb-9ab3-e12762d639d3_750x1000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1000,&quot;width&quot;:750,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WXAI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4188a2fb-c609-4efb-9ab3-e12762d639d3_750x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WXAI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4188a2fb-c609-4efb-9ab3-e12762d639d3_750x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WXAI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4188a2fb-c609-4efb-9ab3-e12762d639d3_750x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WXAI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4188a2fb-c609-4efb-9ab3-e12762d639d3_750x1000.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Imo, most of the intelligence is in the embedding space, and this aspect is severely overlooked in most AI research.</figcaption></figure></div><p>Google&#8217;s fix is to move the capacity upstream to the lookup itself. Instead of one giant table, E2B gives each of its 35 decoder layers its own dedicated 256-dim embedding table. When a token arrives, every layer does its own lookup. The main 1,536-dim signal flowing through the model no longer has to remember a token&#8217;s raw identity across 35 layers, because every layer has a fresh, context-aware identity signal waiting for it.</p><p>This is obviously not a free lunch. At 67.1M parameters per table across 35 layers, Per-Layer Embeddings (PLE) consume 2.35B parameters&#8202;&#8212;&#8202;46% of E2B&#8217;s entire 5.1B budget. If this lived in DRAM, it would be fatal since the phone only has 8GB, and the KV cache and activations are already eating it.</p><p>But PLE tables don&#8217;t need DRAM. 
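</p><p>The budget arithmetic above is easy to sanity-check. What follows is a minimal sketch, not Google&#8217;s code: the 262,144-entry vocabulary and the 2 bytes per parameter (bf16) are assumptions, chosen because they reproduce the 67.1M-per-table and 4.7GB figures quoted in this piece, and the function name is made up for illustration.</p>

```python
def ple_budget(vocab_size, ple_dim, num_layers, total_params, bytes_per_param):
    """Return per-table params, total PLE params, budget share, and size in GB."""
    per_table = vocab_size * ple_dim          # one embedding table per decoder layer
    total = per_table * num_layers            # every layer owns its own table
    share = total / total_params              # fraction of the full parameter budget
    size_gb = total * bytes_per_param / 1e9   # storage footprint in decimal gigabytes
    return per_table, total, share, size_gb


# Assumed values: 262,144-token vocabulary and bf16 storage (2 bytes per parameter).
per_table, total, share, size_gb = ple_budget(
    vocab_size=262_144, ple_dim=256, num_layers=35,
    total_params=5.1e9, bytes_per_param=2,
)
print(f"{per_table / 1e6:.1f}M per table, {total / 1e9:.2f}B total, "
      f"{share:.0%} of budget, {size_gb:.1f}GB")
```

<p>Under those assumptions, the last number is the one that matters for what follows: 4.7GB is what the PLE tables occupy on disk.</p><p>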
They are static lookups, read once per token per layer. Flash latency is irrelevant at that access pattern. So Google parks the tables in flash, where a phone has 128GB to burn. That 4.7GB footprint is effectively free.</p><p>This is also exactly why the 26B and 31B skip PLE entirely.<strong> On an H100, you have 80GB of HBM and no flash asymmetry to exploit. You would never blow 46% of your parameter budget on a trick that only pays off when DRAM is the binding constraint. Besides, at their wider hidden sizes (2,816 and 5,376), the representational collision stops being a fatal problem.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OH6I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6bb0104-1dd2-4b01-85ea-7175009e73a1_1440x912.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OH6I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6bb0104-1dd2-4b01-85ea-7175009e73a1_1440x912.png 424w, https://substackcdn.com/image/fetch/$s_!OH6I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6bb0104-1dd2-4b01-85ea-7175009e73a1_1440x912.png 848w, https://substackcdn.com/image/fetch/$s_!OH6I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6bb0104-1dd2-4b01-85ea-7175009e73a1_1440x912.png 1272w, https://substackcdn.com/image/fetch/$s_!OH6I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6bb0104-1dd2-4b01-85ea-7175009e73a1_1440x912.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!OH6I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6bb0104-1dd2-4b01-85ea-7175009e73a1_1440x912.png" width="1440" height="912" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c6bb0104-1dd2-4b01-85ea-7175009e73a1_1440x912.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:912,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OH6I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6bb0104-1dd2-4b01-85ea-7175009e73a1_1440x912.png 424w, https://substackcdn.com/image/fetch/$s_!OH6I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6bb0104-1dd2-4b01-85ea-7175009e73a1_1440x912.png 848w, https://substackcdn.com/image/fetch/$s_!OH6I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6bb0104-1dd2-4b01-85ea-7175009e73a1_1440x912.png 1272w, https://substackcdn.com/image/fetch/$s_!OH6I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6bb0104-1dd2-4b01-85ea-7175009e73a1_1440x912.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Does PLE actually do the work Google implies? They haven&#8217;t shipped an ablation, but E2B&#8217;s benchmarks are hard to explain without it. It hits 37.5% on AIME 2026 (beating Gemma 3 27B&#8217;s 20.8%) and 44.0% on LiveCodeBench v6. A model 12x smaller in effective parameters is beating its massive predecessor on reasoning, heavily implying the narrow main signal is finally free to actually reason instead of just remembering what token it saw. 
It would also be consistent with a broader pattern in which small, specialized augmentations enable higher-quality reasoning by freeing the remaining parameters to focus on it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y21H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8fedc2-5090-4c61-9ca5-399e96fb7899_2098x1086.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y21H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8fedc2-5090-4c61-9ca5-399e96fb7899_2098x1086.png 424w, https://substackcdn.com/image/fetch/$s_!Y21H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8fedc2-5090-4c61-9ca5-399e96fb7899_2098x1086.png 848w, https://substackcdn.com/image/fetch/$s_!Y21H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8fedc2-5090-4c61-9ca5-399e96fb7899_2098x1086.png 1272w, https://substackcdn.com/image/fetch/$s_!Y21H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8fedc2-5090-4c61-9ca5-399e96fb7899_2098x1086.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y21H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8fedc2-5090-4c61-9ca5-399e96fb7899_2098x1086.png" width="1456" height="754" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4e8fedc2-5090-4c61-9ca5-399e96fb7899_2098x1086.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:754,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y21H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8fedc2-5090-4c61-9ca5-399e96fb7899_2098x1086.png 424w, https://substackcdn.com/image/fetch/$s_!Y21H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8fedc2-5090-4c61-9ca5-399e96fb7899_2098x1086.png 848w, https://substackcdn.com/image/fetch/$s_!Y21H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8fedc2-5090-4c61-9ca5-399e96fb7899_2098x1086.png 1272w, https://substackcdn.com/image/fetch/$s_!Y21H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8fedc2-5090-4c61-9ca5-399e96fb7899_2098x1086.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><strong><a href="https://arxiv.org/abs/2510.13935">Big Reasoning with Small Models: Instruction Retrieval at Inference Time </a></strong><a href="https://arxiv.org/abs/2510.13935">augments small models with retrieved structured reasoning procedures and gets consistent gains </a><strong><a href="https://arxiv.org/abs/2510.13935">without any additional fine-tuning</a></strong><a href="https://arxiv.org/abs/2510.13935">: </a><strong><a href="https://arxiv.org/abs/2510.13935">+9.4%</a></strong><a href="https://arxiv.org/abs/2510.13935">, </a><strong><a href="https://arxiv.org/abs/2510.13935">+7.9%</a></strong><a href="https://arxiv.org/abs/2510.13935">, and </a><strong><a href="https://arxiv.org/abs/2510.13935">+5.1%</a></strong><a href="https://arxiv.org/abs/2510.13935"> across medicine, law, and math. 
That is very clean evidence that small models can reason much better when some burden is offloaded into specialized augmentation rather than crammed into dense parameters (and why Fine Tuning is Dumb).</a></figcaption></figure></div><p>Given that everyone has decided how inefficient Self Attention is, it should come as no surprise that the next innovation Google made was around it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_zQ5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb33f53-28ca-43bc-a192-1ea16d6f7942_1456x4106.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_zQ5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb33f53-28ca-43bc-a192-1ea16d6f7942_1456x4106.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_zQ5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb33f53-28ca-43bc-a192-1ea16d6f7942_1456x4106.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_zQ5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb33f53-28ca-43bc-a192-1ea16d6f7942_1456x4106.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_zQ5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb33f53-28ca-43bc-a192-1ea16d6f7942_1456x4106.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_zQ5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb33f53-28ca-43bc-a192-1ea16d6f7942_1456x4106.jpeg" width="1456" 
height="4106" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ffb33f53-28ca-43bc-a192-1ea16d6f7942_1456x4106.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:4106,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_zQ5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb33f53-28ca-43bc-a192-1ea16d6f7942_1456x4106.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_zQ5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb33f53-28ca-43bc-a192-1ea16d6f7942_1456x4106.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_zQ5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb33f53-28ca-43bc-a192-1ea16d6f7942_1456x4106.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_zQ5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffb33f53-28ca-43bc-a192-1ea16d6f7942_1456x4106.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/how-long-context-inference-is-rewriting?utm_source=publication-search">If you want to understand all the research around the next generation of the Attention mechanism, check out our report &#8220;How Long Context Inference Is Rewriting the Future of Transformers</a>&#8221;</figcaption></figure></div><h3>Why Does Gemma 4 Interleave Local and Global Attention?</h3><p>Attention is the most expensive thing a transformer does. Every token has to look at every other token, meaning the compute cost scales quadratically with sequence length. At 128K tokens, that is roughly 16 billion score computations per layer. Multiplied across 30 to 60 layers, it eats the FLOP budget alive.</p><p>The historical move is to stop looking at everything. Sliding window attention (Mistral, Phi) caps each token&#8217;s view at a fixed window&#8202;&#8212;&#8202;say, 512 tokens in each direction. 
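</p><p>A rough count makes the gap concrete. This is a sketch under stated assumptions rather than anyone&#8217;s production kernel: it takes 128K to mean 128,000 tokens, treats the window as 512 keys per query in total (the reading that matches the figures in this section), counts one score per query-key pair, and ignores causal masking and sequence-edge effects. The function names are illustrative.</p>

```python
def full_attention_scores(seq_len):
    """Score computations per layer when every token attends to every token."""
    return seq_len * seq_len


def windowed_attention_scores(seq_len, window):
    """Score computations per layer when each token attends to at most `window` tokens."""
    return seq_len * min(window, seq_len)


seq_len = 128_000   # assumed meaning of "128K"
window = 512        # assumed total window size per query

full = full_attention_scores(seq_len)
local = windowed_attention_scores(seq_len, window)
print(f"full: {full / 1e9:.1f}B scores, windowed: {local / 1e6:.1f}M scores, "
      f"ratio: {full // local}x")
```

<p>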
The cost drops from O(n&#178;) to O(n &#215; window size), <strong>which at 128K context with a 512-token window is roughly a 250x reduction.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!74iQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5191c4a3-b8fa-4170-b689-2d0aa3dec92c_1440x1000.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!74iQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5191c4a3-b8fa-4170-b689-2d0aa3dec92c_1440x1000.png 424w, https://substackcdn.com/image/fetch/$s_!74iQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5191c4a3-b8fa-4170-b689-2d0aa3dec92c_1440x1000.png 848w, https://substackcdn.com/image/fetch/$s_!74iQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5191c4a3-b8fa-4170-b689-2d0aa3dec92c_1440x1000.png 1272w, https://substackcdn.com/image/fetch/$s_!74iQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5191c4a3-b8fa-4170-b689-2d0aa3dec92c_1440x1000.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!74iQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5191c4a3-b8fa-4170-b689-2d0aa3dec92c_1440x1000.png" width="1440" height="1000" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5191c4a3-b8fa-4170-b689-2d0aa3dec92c_1440x1000.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1000,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!74iQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5191c4a3-b8fa-4170-b689-2d0aa3dec92c_1440x1000.png 424w, https://substackcdn.com/image/fetch/$s_!74iQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5191c4a3-b8fa-4170-b689-2d0aa3dec92c_1440x1000.png 848w, https://substackcdn.com/image/fetch/$s_!74iQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5191c4a3-b8fa-4170-b689-2d0aa3dec92c_1440x1000.png 1272w, https://substackcdn.com/image/fetch/$s_!74iQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5191c4a3-b8fa-4170-b689-2d0aa3dec92c_1440x1000.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The wall you hit here is signal degradation. With a strict sliding window, a token at position 50,000 cannot directly see a token at position 1,000. Long-range dependencies have to hop through intermediate layers, and each hop degrades the signal. Most modern small models just accept this range limitation, operating on the assumption that if you&#8217;re using a 2B phone model to process legal documents (or any solution not named Irys.ai for legal work for that matter), then you deserve to be arrested and have your contributions to the gene pool snipped.</p><p>Gemma 4 cannot make that assumption. E2B and E4B are multimodal, and processing video frames blows past 8K tokens in seconds. The edge models <em>must</em> handle long contexts. Google&#8217;s fix for this conundrum is interleaving. Most layers use local sliding-window attention, while a few execute full global attention.
The model alternates between them on a fixed ratio:</p><ul><li><p><strong>E2B:</strong> 4 local + 1 global, repeated 7 times. Window: 512.</p></li><li><p><strong>E4B:</strong> 5 local + 1 global, repeated 7 times. Window: 512.</p></li><li><p><strong>26B:</strong> 5 local + 1 global, repeated 5 times. Window: 1024.</p></li><li><p><strong>31B:</strong> 5 local + 1 global, repeated 10 times. Window: 1024.</p></li></ul><p>Every model ends on a global layer. The output always sees the full context regardless of what the intermediate layers did.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qVXw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa767441f-7991-4b32-ab29-1fb99516baac_1440x984.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qVXw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa767441f-7991-4b32-ab29-1fb99516baac_1440x984.png 424w, https://substackcdn.com/image/fetch/$s_!qVXw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa767441f-7991-4b32-ab29-1fb99516baac_1440x984.png 848w, https://substackcdn.com/image/fetch/$s_!qVXw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa767441f-7991-4b32-ab29-1fb99516baac_1440x984.png 1272w, https://substackcdn.com/image/fetch/$s_!qVXw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa767441f-7991-4b32-ab29-1fb99516baac_1440x984.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!qVXw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa767441f-7991-4b32-ab29-1fb99516baac_1440x984.png" width="1440" height="984" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a767441f-7991-4b32-ab29-1fb99516baac_1440x984.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:984,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qVXw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa767441f-7991-4b32-ab29-1fb99516baac_1440x984.png 424w, https://substackcdn.com/image/fetch/$s_!qVXw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa767441f-7991-4b32-ab29-1fb99516baac_1440x984.png 848w, https://substackcdn.com/image/fetch/$s_!qVXw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa767441f-7991-4b32-ab29-1fb99516baac_1440x984.png 1272w, https://substackcdn.com/image/fetch/$s_!qVXw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa767441f-7991-4b32-ab29-1fb99516baac_1440x984.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The insight here is that the global layer is not just doing the same work less frequently. Local layers build up rich feature representations within short spans&#8202;&#8212;&#8202;512 tokens is plenty for syntax and local semantics. The occasional global layer then executes long-range integration on those <em>already-refined</em> features, rather than raw token signals. 
It does less work per unit of capacity, which is why the 5:1 ratio sustains long-range reasoning without degrading the output.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!phwH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9c3d92-ec37-466c-a204-fefd8c002bab_2212x1104.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!phwH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9c3d92-ec37-466c-a204-fefd8c002bab_2212x1104.jpeg 424w, https://substackcdn.com/image/fetch/$s_!phwH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9c3d92-ec37-466c-a204-fefd8c002bab_2212x1104.jpeg 848w, https://substackcdn.com/image/fetch/$s_!phwH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9c3d92-ec37-466c-a204-fefd8c002bab_2212x1104.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!phwH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9c3d92-ec37-466c-a204-fefd8c002bab_2212x1104.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!phwH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9c3d92-ec37-466c-a204-fefd8c002bab_2212x1104.jpeg" width="1456" height="727" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f9c3d92-ec37-466c-a204-fefd8c002bab_2212x1104.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:727,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!phwH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9c3d92-ec37-466c-a204-fefd8c002bab_2212x1104.jpeg 424w, https://substackcdn.com/image/fetch/$s_!phwH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9c3d92-ec37-466c-a204-fefd8c002bab_2212x1104.jpeg 848w, https://substackcdn.com/image/fetch/$s_!phwH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9c3d92-ec37-466c-a204-fefd8c002bab_2212x1104.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!phwH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9c3d92-ec37-466c-a204-fefd8c002bab_2212x1104.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">This is likely inspired by Liquid AI, which was the first edge AI company to start building convolutional layers for text attention. The assumption is simple: most input tokens are likely to be heavily local, so we don&#8217;t lose much by conv-ing them out. This reduces the number of global attention calls.</figcaption></figure></div><p>At E2B&#8217;s 4:1 ratio, 80% of the attention layers pay linear compute instead of quadratic. On an 8K query, that is up to a 5x speedup for attention compute on the phone. At the 31B&#8217;s 256K context, the savings are the only reason the model fits in its FLOP budget at all.</p><p>The catch is that the ratio is fixed, which makes the system inflexible: if a user inputs a task requiring dense long-range integration across the entire context (like cross-referencing contradictions throughout a 200-page document), the model cannot dynamically allocate more global layers. It gets what it gets. 
Most tasks do not hit this ceiling, but when they do, the degradation is hard-coded.</p><p>Modern long-context architectures are converging on the same bet: uniform O(n&#178;) attention is a cost most tokens don&#8217;t need to pay. Mamba avoids it with selective state updates. Ring Attention keeps exact attention but spreads its cost by partitioning the sequence across devices. Gemma 4 avoids it by interleaving layer types. The architecture that figures out exactly which tokens actually need the expensive math wins.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nTsm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff6c8b37-1387-4c2e-a490-c79cf13254bd_1360x1100.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nTsm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff6c8b37-1387-4c2e-a490-c79cf13254bd_1360x1100.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nTsm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff6c8b37-1387-4c2e-a490-c79cf13254bd_1360x1100.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nTsm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff6c8b37-1387-4c2e-a490-c79cf13254bd_1360x1100.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nTsm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff6c8b37-1387-4c2e-a490-c79cf13254bd_1360x1100.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!nTsm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff6c8b37-1387-4c2e-a490-c79cf13254bd_1360x1100.jpeg" width="1360" height="1100" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff6c8b37-1387-4c2e-a490-c79cf13254bd_1360x1100.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1100,&quot;width&quot;:1360,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nTsm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff6c8b37-1387-4c2e-a490-c79cf13254bd_1360x1100.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nTsm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff6c8b37-1387-4c2e-a490-c79cf13254bd_1360x1100.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nTsm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff6c8b37-1387-4c2e-a490-c79cf13254bd_1360x1100.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nTsm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff6c8b37-1387-4c2e-a490-c79cf13254bd_1360x1100.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/the-real-cost-of-running-ai">If you want to understand the cost of attention/running LLMs, read this.</a></figcaption></figure></div><h3>Why Do Edge and Server Gemma 4 Use Opposite GQA Strategies?</h3><p>The KV cache puts a silent chokehold on the memory budget of modern transformers. Every processed token must keep its Key and Value vectors in memory so future tokens can attend to them. Cache size scales as <code>layers &#215; KV_heads &#215; head_dim &#215; sequence_length &#215; 2</code> (one tensor for K, one for V). At 128K context on a 7B model, that is 12.8GB just for the cache. 
That is the entire DRAM budget of a phone before the weights even load.</p><p>Grouped-Query Attention (GQA) is the standard response. By having a group of query heads share a single KV head, you shrink the cache by the sharing ratio. For instance, an 8:1 ratio cuts your cache by 8x.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-L9v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cf900ea-6033-41d6-bead-61f60c77c5be_2400x1650.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-L9v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cf900ea-6033-41d6-bead-61f60c77c5be_2400x1650.png 424w, https://substackcdn.com/image/fetch/$s_!-L9v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cf900ea-6033-41d6-bead-61f60c77c5be_2400x1650.png 848w, https://substackcdn.com/image/fetch/$s_!-L9v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cf900ea-6033-41d6-bead-61f60c77c5be_2400x1650.png 1272w, https://substackcdn.com/image/fetch/$s_!-L9v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cf900ea-6033-41d6-bead-61f60c77c5be_2400x1650.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-L9v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cf900ea-6033-41d6-bead-61f60c77c5be_2400x1650.png" width="1456" height="1001" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1cf900ea-6033-41d6-bead-61f60c77c5be_2400x1650.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1001,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-L9v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cf900ea-6033-41d6-bead-61f60c77c5be_2400x1650.png 424w, https://substackcdn.com/image/fetch/$s_!-L9v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cf900ea-6033-41d6-bead-61f60c77c5be_2400x1650.png 848w, https://substackcdn.com/image/fetch/$s_!-L9v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cf900ea-6033-41d6-bead-61f60c77c5be_2400x1650.png 1272w, https://substackcdn.com/image/fetch/$s_!-L9v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cf900ea-6033-41d6-bead-61f60c77c5be_2400x1650.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Gemma 4 breaks this uniformity in opposite directions depending on the hardware.</p><ul><li><p>Edge models crush KV everywhere. E2B runs 8:1 uniform GQA across every layer. E4B runs 4:1. The logic is forced by the hardware: on a phone, doubling KV heads blows through your 8GB limit. There is no layer where the edge models can afford richer KV, so they don&#8217;t have one.</p></li><li><p>Server models compress selectively. The 26B and 31B both use 2:1 GQA on local attention layers and 8:1 on global layers. Local layers operate on short spans where fine-grained KV discrimination helps distinguish nearby tokens, so server architects spend the DRAM there. Global layers aggregate over 256K of context, doing broader integration that tolerates aggressive compression. 
You compress where the work is coarse, and spend where it is precise.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rbDW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcdf78b0-9769-470c-9481-087269125365_1440x950.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rbDW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcdf78b0-9769-470c-9481-087269125365_1440x950.png 424w, https://substackcdn.com/image/fetch/$s_!rbDW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcdf78b0-9769-470c-9481-087269125365_1440x950.png 848w, https://substackcdn.com/image/fetch/$s_!rbDW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcdf78b0-9769-470c-9481-087269125365_1440x950.png 1272w, https://substackcdn.com/image/fetch/$s_!rbDW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcdf78b0-9769-470c-9481-087269125365_1440x950.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rbDW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcdf78b0-9769-470c-9481-087269125365_1440x950.png" width="1440" height="950" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dcdf78b0-9769-470c-9481-087269125365_1440x950.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:950,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rbDW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcdf78b0-9769-470c-9481-087269125365_1440x950.png 424w, https://substackcdn.com/image/fetch/$s_!rbDW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcdf78b0-9769-470c-9481-087269125365_1440x950.png 848w, https://substackcdn.com/image/fetch/$s_!rbDW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcdf78b0-9769-470c-9481-087269125365_1440x950.png 1272w, https://substackcdn.com/image/fetch/$s_!rbDW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcdf78b0-9769-470c-9481-087269125365_1440x950.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In global attention, the server models deploy one more trick: K=V weight sharing.</p><p>In normal attention, K and V do different jobs. <em><strong>Keys are the indexable signal queries compare against; they determine which tokens get attended to. Values are the payload passed forward once attention has selected a token. Think of a search engine: keys are how you index documents for retrieval, values are the document contents themselves. </strong></em>Typically, the mechanism learns these independently because &#8220;how to match&#8221; and &#8220;what to pass forward&#8221; are fundamentally different objectives.</p><p>Gemma 4 eliminates the V projection in global layers. The key projection is computed, then reused directly as the value, with only RMSNorm applied on the value side as a differentiator in the forward pass. As you might imagine, other architectures will typically avoid this since it nukes your quality. 
However, Gemma avoids that by building a good system around it:</p><ol><li><p>Global attention does long-range semantic integration where K/V specialization matters less than in local layers doing fine-grained discrimination.</p></li><li><p>The global <code>head_dim</code> is doubled to 512, giving each head enough room to encode both matching and content signal in one projection.</p></li><li><p>Partial RoPE leaves 75% of dimensions as clean content channels, pulling &#8220;match on content&#8221; and &#8220;retrieve content&#8221; closer together by construction.</p></li></ol><p><strong>This halves the global KV cache on top of the 8:1 GQA compression.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0TiT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b3101a-c0a8-4a69-b401-97a2abebbf69_1440x988.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0TiT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b3101a-c0a8-4a69-b401-97a2abebbf69_1440x988.png 424w, https://substackcdn.com/image/fetch/$s_!0TiT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b3101a-c0a8-4a69-b401-97a2abebbf69_1440x988.png 848w, https://substackcdn.com/image/fetch/$s_!0TiT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b3101a-c0a8-4a69-b401-97a2abebbf69_1440x988.png 1272w, 
https://substackcdn.com/image/fetch/$s_!0TiT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b3101a-c0a8-4a69-b401-97a2abebbf69_1440x988.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0TiT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b3101a-c0a8-4a69-b401-97a2abebbf69_1440x988.png" width="1440" height="988" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/53b3101a-c0a8-4a69-b401-97a2abebbf69_1440x988.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:988,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0TiT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b3101a-c0a8-4a69-b401-97a2abebbf69_1440x988.png 424w, https://substackcdn.com/image/fetch/$s_!0TiT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b3101a-c0a8-4a69-b401-97a2abebbf69_1440x988.png 848w, https://substackcdn.com/image/fetch/$s_!0TiT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b3101a-c0a8-4a69-b401-97a2abebbf69_1440x988.png 1272w, 
https://substackcdn.com/image/fetch/$s_!0TiT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b3101a-c0a8-4a69-b401-97a2abebbf69_1440x988.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The split between server and edge versions of GQA follows the scarcity logic directly: edge has no DRAM to spare, so it compresses everywhere; the server has DRAM for local discrimination, so it only compresses global aggregation.</p><p>This adds quite a bit of architectural complexity. 
Uniform GQA is one number you tune; divergent GQA is four (local edge, global edge, local server, global server) plus the K=V decision, and every combination must be validated against quality. Google paid that cost because the resource flip between edge and server made uniform GQA visibly suboptimal at both ends. Most labs will likely lack the resources to fight on both fronts simultaneously, so they&#8217;ll have to pick their specialization beforehand (by considering the constraints/use case) and be mediocre on the other one. Studying Gemma 4&#8217;s decisions is a good way to understand that trade-off.</p><p>However, this is not the only interesting optimization Gemma 4 uses. Up next, we will cover the one I found most interesting.</p><h3>How Does Cross-Layer KV Sharing Cut Edge Cache by 83%?</h3><p>GQA compresses KV within a layer. Cross-layer sharing attacks the vertical axis: why recompute K/V at every single layer in the first place?</p><p>Research shows that in the later layers of a trained network, representations converge. By layer 25 of 35, the residual stream has settled into a representation that the model is refining rather than transforming. Put another way, computing fresh K/V projections for layers doing similar work is redundant. 
On a phone, this redundancy is fatal.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ys-O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdca52a-7194-4b17-9385-c1dc15a8e5a4_2392x1620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ys-O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdca52a-7194-4b17-9385-c1dc15a8e5a4_2392x1620.png 424w, https://substackcdn.com/image/fetch/$s_!ys-O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdca52a-7194-4b17-9385-c1dc15a8e5a4_2392x1620.png 848w, https://substackcdn.com/image/fetch/$s_!ys-O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdca52a-7194-4b17-9385-c1dc15a8e5a4_2392x1620.png 1272w, https://substackcdn.com/image/fetch/$s_!ys-O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdca52a-7194-4b17-9385-c1dc15a8e5a4_2392x1620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ys-O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdca52a-7194-4b17-9385-c1dc15a8e5a4_2392x1620.png" width="1456" height="986" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bdca52a-7194-4b17-9385-c1dc15a8e5a4_2392x1620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:986,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ys-O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdca52a-7194-4b17-9385-c1dc15a8e5a4_2392x1620.png 424w, https://substackcdn.com/image/fetch/$s_!ys-O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdca52a-7194-4b17-9385-c1dc15a8e5a4_2392x1620.png 848w, https://substackcdn.com/image/fetch/$s_!ys-O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdca52a-7194-4b17-9385-c1dc15a8e5a4_2392x1620.png 1272w, https://substackcdn.com/image/fetch/$s_!ys-O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdca52a-7194-4b17-9385-c1dc15a8e5a4_2392x1620.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>E2B&#8217;s fix is to have 20 of its 35 layers skip the K/V projection entirely and reuse K/V from an earlier layer. Only 15 layers compute unique KV. (E4B shares 18 of 42; server models share zero).</p><p>The sharing is back-loaded and strictly matches attention types&#8202;&#8212;&#8202;sliding-window layers only reuse from sliding-window layers; global only reuses from global. You don&#8217;t want a local layer&#8217;s short-span K/V hijacked by a global layer trying to see 128K of context.</p><p>Crucially, shared layers still compute their own Query (Q) projection. Q is cheap: you compute one per token and use it immediately. K and V are the memory killers: they get cached forever so future tokens can attend to them. Sharing Q saves nothing. 
Sharing K/V eliminates the cache entirely for 20 layers.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sddk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6ac0197-ef37-4374-96a2-ea91ceec9557_2380x1650.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sddk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6ac0197-ef37-4374-96a2-ea91ceec9557_2380x1650.png 424w, https://substackcdn.com/image/fetch/$s_!sddk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6ac0197-ef37-4374-96a2-ea91ceec9557_2380x1650.png 848w, https://substackcdn.com/image/fetch/$s_!sddk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6ac0197-ef37-4374-96a2-ea91ceec9557_2380x1650.png 1272w, https://substackcdn.com/image/fetch/$s_!sddk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6ac0197-ef37-4374-96a2-ea91ceec9557_2380x1650.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sddk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6ac0197-ef37-4374-96a2-ea91ceec9557_2380x1650.png" width="1456" height="1009" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6ac0197-ef37-4374-96a2-ea91ceec9557_2380x1650.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1009,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sddk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6ac0197-ef37-4374-96a2-ea91ceec9557_2380x1650.png 424w, https://substackcdn.com/image/fetch/$s_!sddk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6ac0197-ef37-4374-96a2-ea91ceec9557_2380x1650.png 848w, https://substackcdn.com/image/fetch/$s_!sddk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6ac0197-ef37-4374-96a2-ea91ceec9557_2380x1650.png 1272w, https://substackcdn.com/image/fetch/$s_!sddk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6ac0197-ef37-4374-96a2-ea91ceec9557_2380x1650.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Let&#8217;s see how this plays out in hard numbers. A standard 7B transformer at 128K context demands a ~12.8GB KV cache. Combine cross-layer sharing with 8:1 GQA and interleaved attention, and E2B&#8217;s cache lands in the hundreds of megabytes instead of the gigabytes. <strong>Google&#8217;s reported 83% reduction at 8K context becomes an order-of-magnitude reduction at 128K. That is what lets long-context models run on 8GB phones.</strong></p><h4>The Quality Tax and the MLP Compensation</h4><p>Sharing comes with a quality cost, because a shared-KV layer can only attend to a representation shaped by an earlier layer&#8217;s objective. It cannot sharpen its own retrieval targets or reshape the payload. To prevent this from gutting model quality, you have to add capacity back. 
If attention is kneecapped, the Feed-Forward Network (FFN) is your only compensation lever.</p><p>On shared layers, Google doubles the MLP width from 6,144 to 12,288.</p><p>The math here is revealing. A standard GeGLU FFN at <code>hidden_size</code> 1,536 uses ~28M parameters (three 1,536 &#215; 6,144 weight matrices). Doubling it adds ~28M more. But the K/V projections they removed only cost 1-2M parameters. The discrepancy exists because replicating the cross-token mixing power of attention through a purely feed-forward path requires massive, brute-force capacity.</p><p>Why do this when you&#8217;re building a model for resource constraints? Doesn&#8217;t this defeat the point of edge architecture?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!be82!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7df6ffa4-54e0-4e2c-8070-11e9fbdfb056_2370x1646.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!be82!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7df6ffa4-54e0-4e2c-8070-11e9fbdfb056_2370x1646.png 424w, https://substackcdn.com/image/fetch/$s_!be82!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7df6ffa4-54e0-4e2c-8070-11e9fbdfb056_2370x1646.png 848w, https://substackcdn.com/image/fetch/$s_!be82!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7df6ffa4-54e0-4e2c-8070-11e9fbdfb056_2370x1646.png 1272w, 
https://substackcdn.com/image/fetch/$s_!be82!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7df6ffa4-54e0-4e2c-8070-11e9fbdfb056_2370x1646.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!be82!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7df6ffa4-54e0-4e2c-8070-11e9fbdfb056_2370x1646.png" width="1456" height="1011" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7df6ffa4-54e0-4e2c-8070-11e9fbdfb056_2370x1646.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1011,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!be82!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7df6ffa4-54e0-4e2c-8070-11e9fbdfb056_2370x1646.png 424w, https://substackcdn.com/image/fetch/$s_!be82!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7df6ffa4-54e0-4e2c-8070-11e9fbdfb056_2370x1646.png 848w, https://substackcdn.com/image/fetch/$s_!be82!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7df6ffa4-54e0-4e2c-8070-11e9fbdfb056_2370x1646.png 1272w, 
https://substackcdn.com/image/fetch/$s_!be82!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7df6ffa4-54e0-4e2c-8070-11e9fbdfb056_2370x1646.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Put simply, a wider FFN costs you FLOPs, not bytes. You don&#8217;t cache FFN outputs; you compute them, use them, and throw them away. In other words, you spend compute (which you have) to save memory (which you don&#8217;t).</p><p>Server models skip cross-layer sharing entirely because DRAM isn&#8217;t their binding constraint. 
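</p><p>The parameter trade behind the MLP doubling can be checked with back-of-envelope arithmetic. The sketch below is a rough check, not Google&#8217;s accounting: it counts a gated FFN as three weight matrices and ignores biases. The helper names and <code>ASSUMED_KV_DIM</code> are hypothetical, since the text does not give the K/V projection width.</p>

```python
def geglu_ffn_params(hidden_size, intermediate_size):
    """Weights in a gated FFN: gate, up, and down projections (biases ignored)."""
    return 3 * hidden_size * intermediate_size


def kv_projection_params(hidden_size, kv_dim):
    """Weights in the K and V projections of one layer (biases ignored)."""
    return 2 * hidden_size * kv_dim


HIDDEN = 1536         # hidden_size quoted in the text
BASE_WIDTH = 6144     # MLP width on ordinary layers, from the text
WIDE_WIDTH = 12288    # MLP width on shared-KV layers, from the text
ASSUMED_KV_DIM = 512  # hypothetical: num_kv_heads * head_dim is not given in the text

base = geglu_ffn_params(HIDDEN, BASE_WIDTH)
added = geglu_ffn_params(HIDDEN, WIDE_WIDTH) - base
removed = kv_projection_params(HIDDEN, ASSUMED_KV_DIM)

print(f"base FFN:     {base / 1e6:.1f}M parameters")
print(f"added by 2x:  {added / 1e6:.1f}M parameters")
print(f"K/V removed:  {removed / 1e6:.2f}M parameters (under the assumed kv_dim)")
print(f"added versus removed: {added / removed:.0f} to 1")
```

<p>Under that assumed width, the added FFN weights outnumber the removed K/V weights by roughly 18 to 1, in line with the ~28M versus 1-2M figures quoted earlier.</p><p>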
The quality tax of sharing stops making sense, and the MLP doubling becomes pure parameter waste.</p><p>Notice the market convergence here. DeepSeek&#8217;s Multi-Head Latent Attention (MLA) attacks the exact same KV bottleneck by compressing K and V into a shared latent space <em>within</em> a layer. Cross-layer sharing skips recomputation <em>across</em> layers. Different axis, same target. The field has unanimously agreed that the KV cache is the primary enemy. It will be extremely interesting to see all the ways people attack the KV cache from various points in the AI stack.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aUND!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc10c9c7-52af-4dd9-9b5b-9ddcdf8ab59f_1600x686.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aUND!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc10c9c7-52af-4dd9-9b5b-9ddcdf8ab59f_1600x686.jpeg 424w, https://substackcdn.com/image/fetch/$s_!aUND!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc10c9c7-52af-4dd9-9b5b-9ddcdf8ab59f_1600x686.jpeg 848w, https://substackcdn.com/image/fetch/$s_!aUND!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc10c9c7-52af-4dd9-9b5b-9ddcdf8ab59f_1600x686.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!aUND!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc10c9c7-52af-4dd9-9b5b-9ddcdf8ab59f_1600x686.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!aUND!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc10c9c7-52af-4dd9-9b5b-9ddcdf8ab59f_1600x686.jpeg" width="1456" height="624" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc10c9c7-52af-4dd9-9b5b-9ddcdf8ab59f_1600x686.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:624,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aUND!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc10c9c7-52af-4dd9-9b5b-9ddcdf8ab59f_1600x686.jpeg 424w, https://substackcdn.com/image/fetch/$s_!aUND!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc10c9c7-52af-4dd9-9b5b-9ddcdf8ab59f_1600x686.jpeg 848w, https://substackcdn.com/image/fetch/$s_!aUND!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc10c9c7-52af-4dd9-9b5b-9ddcdf8ab59f_1600x686.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!aUND!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc10c9c7-52af-4dd9-9b5b-9ddcdf8ab59f_1600x686.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">In MLA, you project the token&#8217;s information down into a much smaller latent vector. During the decode phase, the GPU only appends this tiny vector to the cache. When it needs to calculate attention, it rapidly &#8220;up-projects&#8221; or reconstructs the Keys and Values on the fly. 
By replacing the large <code>2 * g * d_k</code> term with a much smaller compressed dimension <code>d_c</code>, DeepSeek reported a 93.3% reduction in KV cache size: &#8220;<em>Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times.</em>&#8221;</figcaption></figure></div><p>There is one final aspect of the attention mechanism that DeepMind researchers reconfigured to give Gemma 4 its insane potency.</p><h3>How Does Partial RoPE Separate Position From Content?</h3><p>Transformers match tokens by taking the dot product of their vectors. To prevent the model from losing word order, you have to inject positional data into those vectors.</p><p>Rotary Position Embedding (RoPE) does this geometrically: it takes pairs of the vector&#8217;s coordinates and rotates each pair by an angle determined by the token&#8217;s position index. Tokens close together receive similar rotations, so their dot product stays high. Distant tokens rotate away from each other, causing the attention score to naturally decay. Position becomes geometry.</p><p>The standard implementation rotates every single dimension in the vector. That was fine at an 8K context window. At 128K, it starts to break retrieval.</p><p>When you rotate every coordinate to encode extreme distances, the raw semantic meaning of the token gets distorted. If a query at token 120,000 is searching for a specific fact at token 500, it struggles to match the key because the extreme positional rotation acts as noise, scrambling the pure content signal. 
Every dimension is screaming its location, leaving no clean channels to communicate what the token actually <em>means</em>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nCk7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41ae19e6-782c-40db-b75b-9f578732b14b_2328x1622.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nCk7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41ae19e6-782c-40db-b75b-9f578732b14b_2328x1622.png 424w, https://substackcdn.com/image/fetch/$s_!nCk7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41ae19e6-782c-40db-b75b-9f578732b14b_2328x1622.png 848w, https://substackcdn.com/image/fetch/$s_!nCk7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41ae19e6-782c-40db-b75b-9f578732b14b_2328x1622.png 1272w, https://substackcdn.com/image/fetch/$s_!nCk7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41ae19e6-782c-40db-b75b-9f578732b14b_2328x1622.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nCk7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41ae19e6-782c-40db-b75b-9f578732b14b_2328x1622.png" width="1456" height="1014" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/41ae19e6-782c-40db-b75b-9f578732b14b_2328x1622.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1014,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nCk7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41ae19e6-782c-40db-b75b-9f578732b14b_2328x1622.png 424w, https://substackcdn.com/image/fetch/$s_!nCk7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41ae19e6-782c-40db-b75b-9f578732b14b_2328x1622.png 848w, https://substackcdn.com/image/fetch/$s_!nCk7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41ae19e6-782c-40db-b75b-9f578732b14b_2328x1622.png 1272w, https://substackcdn.com/image/fetch/$s_!nCk7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41ae19e6-782c-40db-b75b-9f578732b14b_2328x1622.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Labs have tried patching this with RoPE rescaling and NTK-aware scaling, but these are workarounds. They don&#8217;t fix the underlying flaw of position leaking into content.</p><p>Gemma 4&#8217;s global layers fix the flaw by adopting Partial RoPE (p-RoPE): they split the vector.</p><p>In each 512-dimension global attention head, 128 dimensions (25%) receive the full <code>theta=1M</code> rotation. These are the dedicated position channels, maintaining the geometry of the document so attention decays smoothly over 256K tokens.</p><p>The remaining 384 dimensions (75%) receive zero rotation. These are pure content channels. Because they never rotate, semantically identical tokens match perfectly across the entire context window, completely immune to distance. The model stops forcing every dimension to do two interfering jobs.</p><p>Google hasn&#8217;t published the ablation for the 25/75 split, but the engineering math is straightforward. 
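</p><p>The split described above can be sketched in a few lines. This is a minimal illustration, not Gemma 4&#8217;s implementation: it assumes the rotated slice is the first 128 coordinates, that adjacent coordinates are paired, and that frequencies are computed over the rotated slice only. The <code>rope</code>, <code>dot</code>, and <code>scores</code> helpers and the toy <code>content</code> vector are hypothetical.</p>

```python
import math


def rope(vec, position, rotary_dims, theta):
    """Rotate the first `rotary_dims` coordinates in adjacent pairs; copy the rest."""
    out = list(vec)
    for i in range(rotary_dims // 2):
        # Each pair gets its own frequency; later pairs rotate more slowly.
        angle = position * theta ** (-2.0 * i / rotary_dims)
        x, y = vec[2 * i], vec[2 * i + 1]
        out[2 * i] = x * math.cos(angle) - y * math.sin(angle)
        out[2 * i + 1] = x * math.sin(angle) + y * math.cos(angle)
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


HEAD_DIM = 512       # global head size, from the text
ROTARY_DIMS = 128    # 25% of the head, from the text
THETA = 1_000_000.0  # theta=1M, from the text

# Toy content vector shared by the query and the key (made-up values).
content = [math.sin(0.37 * j) for j in range(HEAD_DIM)]


def scores(q_pos, k_pos):
    """Return (full-RoPE score, partial-RoPE score, content-channel part of the partial score)."""
    full = dot(rope(content, q_pos, HEAD_DIM, THETA), rope(content, k_pos, HEAD_DIM, THETA))
    q = rope(content, q_pos, ROTARY_DIMS, THETA)
    k = rope(content, k_pos, ROTARY_DIMS, THETA)
    return full, dot(q, k), dot(q[ROTARY_DIMS:], k[ROTARY_DIMS:])


for q_pos, k_pos in [(501, 500), (120_000, 500)]:
    full, partial, content_only = scores(q_pos, k_pos)
    print(q_pos - k_pos, round(full, 2), round(partial, 2), round(content_only, 2))
```

<p>In this toy setup the content-only term is identical at both distances, because those 384 coordinates are never rotated, while the fully rotated score is free to change with distance.</p><p>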
128 rotating dimensions form 64 rotation pairs, each with its own frequency, which is plausibly enough to index 256K positions without ambiguity. Pushing the split to 50% would sacrifice pure content capacity for positional resolution you don&#8217;t need. Dropping it to 10% would risk blurring distant positions together. Absent official numbers, 25% reads as the point where position and content both survive.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ut3q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa5d5e4-ad3c-4ec6-9268-6336389fa3d6_2358x1638.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ut3q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa5d5e4-ad3c-4ec6-9268-6336389fa3d6_2358x1638.png 424w, https://substackcdn.com/image/fetch/$s_!Ut3q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa5d5e4-ad3c-4ec6-9268-6336389fa3d6_2358x1638.png 848w, https://substackcdn.com/image/fetch/$s_!Ut3q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa5d5e4-ad3c-4ec6-9268-6336389fa3d6_2358x1638.png 1272w, https://substackcdn.com/image/fetch/$s_!Ut3q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa5d5e4-ad3c-4ec6-9268-6336389fa3d6_2358x1638.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ut3q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa5d5e4-ad3c-4ec6-9268-6336389fa3d6_2358x1638.png" width="1456" height="1011" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6fa5d5e4-ad3c-4ec6-9268-6336389fa3d6_2358x1638.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1011,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ut3q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa5d5e4-ad3c-4ec6-9268-6336389fa3d6_2358x1638.png 424w, https://substackcdn.com/image/fetch/$s_!Ut3q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa5d5e4-ad3c-4ec6-9268-6336389fa3d6_2358x1638.png 848w, https://substackcdn.com/image/fetch/$s_!Ut3q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa5d5e4-ad3c-4ec6-9268-6336389fa3d6_2358x1638.png 1272w, https://substackcdn.com/image/fetch/$s_!Ut3q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fa5d5e4-ad3c-4ec6-9268-6336389fa3d6_2358x1638.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Local layers skip this entirely. Operating in a tight 512-token window, they stick to standard full RoPE (<code>theta=10K</code>). Inside a short span, positional resolution is critical, and the rotation distortion never grows large enough to break semantic retrieval. This continues our theme of Gemma 4 winning by specializing its architecture to the system.</p><p>The cost of partial rotation is effectively zero: skipping the math on 75% of the vector actually saves a little compute. The payoff is a large jump in long-range integration. The 31B hits 86.4% on tau2-bench Retail, obliterating Gemma 3&#8217;s 6.6%. While training data and the broader attention redesign contribute, p-RoPE is the architectural lever most directly aimed at letting the model actually retrieve what it reads.</p><p>Everything covered so far has been primarily about the edge architecture. 
The 26B has one design decision worth unpacking, and the 31B ships with one serving-stack problem worth flagging. Each gets a few paragraphs.</p><h3>Why Gemma 26B Uses a Hybrid MoE</h3><p>In a standard Mixture-of-Experts architecture, tokens are routed to a fraction of the available capacity. Mixtral, for example, routes to 2 out of 8 experts, activating about 25% of the expert capacity per token.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ol_g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33772490-e25c-4486-9133-077924cc04e2_1000x898.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ol_g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33772490-e25c-4486-9133-077924cc04e2_1000x898.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ol_g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33772490-e25c-4486-9133-077924cc04e2_1000x898.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ol_g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33772490-e25c-4486-9133-077924cc04e2_1000x898.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ol_g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33772490-e25c-4486-9133-077924cc04e2_1000x898.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Ol_g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33772490-e25c-4486-9133-077924cc04e2_1000x898.jpeg" width="1000" height="898" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/33772490-e25c-4486-9133-077924cc04e2_1000x898.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:898,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ol_g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33772490-e25c-4486-9133-077924cc04e2_1000x898.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ol_g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33772490-e25c-4486-9133-077924cc04e2_1000x898.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ol_g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33772490-e25c-4486-9133-077924cc04e2_1000x898.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ol_g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33772490-e25c-4486-9133-077924cc04e2_1000x898.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/googles-guide-on-how-to-scale-reinforcement?utm_source=publication-search">learn more about MoE here.</a></figcaption></figure></div><p>The 26B is much sparser. It uses 128 experts and routes to 8, activating just 6.25% of the network per token.</p><p>Normally, this level of sparsity is fragile. When you have 128 experts, routing errors are inevitable. If a token gets sent to the wrong experts, a standard MoE has no fallback, and the output degrades.</p><p>Gemma 4 solves this with a hybrid structure. Every token simultaneously runs through an always-on dense FFN (hidden dim 2,112) and its 8 routed experts (hidden dim 704 each). 
The outputs are then summed.</p><p>The dense path acts as a guaranteed quality floor. It processes every token reliably, regardless of what the router decides. If the router picks the perfect experts, they add specialized value on top. If the router misses, the dense path ensures the output remains coherent rather than crashing.</p><p>This structural fallback makes the 128-expert count practical. The economic result is that the 26B stores 25.2B parameters but only activates 3.8B per token. You get reasoning quality comparable to a 70B model, but you pay the inference cost of an 8B model.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cwDR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ddb79-06ab-453d-922a-3459c22a1b5d_2400x1069.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cwDR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ddb79-06ab-453d-922a-3459c22a1b5d_2400x1069.png 424w, https://substackcdn.com/image/fetch/$s_!cwDR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ddb79-06ab-453d-922a-3459c22a1b5d_2400x1069.png 848w, https://substackcdn.com/image/fetch/$s_!cwDR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ddb79-06ab-453d-922a-3459c22a1b5d_2400x1069.png 1272w, https://substackcdn.com/image/fetch/$s_!cwDR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ddb79-06ab-453d-922a-3459c22a1b5d_2400x1069.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!cwDR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ddb79-06ab-453d-922a-3459c22a1b5d_2400x1069.png" width="1456" height="649" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d6ddb79-06ab-453d-922a-3459c22a1b5d_2400x1069.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:649,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cwDR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ddb79-06ab-453d-922a-3459c22a1b5d_2400x1069.png 424w, https://substackcdn.com/image/fetch/$s_!cwDR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ddb79-06ab-453d-922a-3459c22a1b5d_2400x1069.png 848w, https://substackcdn.com/image/fetch/$s_!cwDR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ddb79-06ab-453d-922a-3459c22a1b5d_2400x1069.png 1272w, https://substackcdn.com/image/fetch/$s_!cwDR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ddb79-06ab-453d-922a-3459c22a1b5d_2400x1069.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lw0O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad569102-265f-4a3a-9fb4-3979f459645e_1476x1296.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lw0O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad569102-265f-4a3a-9fb4-3979f459645e_1476x1296.png 424w, 
https://substackcdn.com/image/fetch/$s_!lw0O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad569102-265f-4a3a-9fb4-3979f459645e_1476x1296.png 848w, https://substackcdn.com/image/fetch/$s_!lw0O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad569102-265f-4a3a-9fb4-3979f459645e_1476x1296.png 1272w, https://substackcdn.com/image/fetch/$s_!lw0O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad569102-265f-4a3a-9fb4-3979f459645e_1476x1296.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lw0O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad569102-265f-4a3a-9fb4-3979f459645e_1476x1296.png" width="1456" height="1278" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad569102-265f-4a3a-9fb4-3979f459645e_1476x1296.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1278,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lw0O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad569102-265f-4a3a-9fb4-3979f459645e_1476x1296.png 424w, 
https://substackcdn.com/image/fetch/$s_!lw0O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad569102-265f-4a3a-9fb4-3979f459645e_1476x1296.png 848w, https://substackcdn.com/image/fetch/$s_!lw0O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad569102-265f-4a3a-9fb4-3979f459645e_1476x1296.png 1272w, https://substackcdn.com/image/fetch/$s_!lw0O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad569102-265f-4a3a-9fb4-3979f459645e_1476x1296.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>The FlashAttention-2 Serving Break</h3><p>FlashAttention-2 (FA2) is the standard kernel powering almost every modern GPU inference stack, including vLLM and Hugging Face. By keeping intermediate calculations in SRAM, it bypasses the massive memory bandwidth bottlenecks of standard attention.</p><p>But FA2 has a hard constraint: it only supports head dimensions up to 256.</p><p>Gemma 4&#8217;s global layers use a head dimension of 512. That width is not a mistake&#8202;&#8212;&#8202;it is mechanically required to make the K=V weight sharing and the partial-RoPE split work. But it means the global layers break FA2 compatibility.</p><p>On pre-Blackwell hardware (like an A100, H100, or RTX 4090), the serving stack falls back to unoptimized Triton kernels for these layers. In practice, throughput drops from an expected 50&#8211;100 tokens per second to roughly 9, a 5&#8211;11x slowdown. On newer Blackwell GPUs, optimized kernels recover throughput to around 124 tok/s, about 14x the fallback rate.</p><p>The software fix is per-layer backend dispatch: routing local layers to FA2, and global layers to a different optimized kernel. As of April 2026, this is still an open issue in vLLM.</p><p><strong>Gemma 4 is an architecture shipped slightly ahead of its infrastructure. If you are evaluating it for production on existing hardware today, you need to budget for the throughput hit, or wait for the software stack to catch up.</strong></p><h3>Conclusion: What Gemma 4 Actually Teaches Us</h3><p>The current meta in AI is uniform scaling&#8202;&#8212;&#8202;pushing the exact same architecture across every deployment context (scaled up/down based on your constraints). That approach leaves massive capability on the table. True performance is not found in uniformity. It is found in exploiting the specific topological/ontological structure of the problem you are trying to solve. 
Phones and servers are not the same problem at different sizes, and treating them as if they are is just engineering convenience mistaken for principled design.</p><p>The industry has seen this dynamic before. Mixture of Experts sat on the shelf for decades because uniform dense models were simpler to build, until the economics of scale forced labs to specialize. That exact flip is now happening at the macro-architectural level.</p><p>Gemma 4 is one point in a trend that is shaping up in every layer of AI. Chip makers are realizing that they need to bifurcate inference and training, and some are looking further to split based on your workload/modality. Text-style tokenization is failing for vision, requiring rebuilds. Reasoning as a system is being bifurcated into its subcomponents. At every stage, we&#8217;re seeing the unbundling of intelligence.</p><p>Plan accordingly when thinking about how to navigate these very fun times.</p><p>Thank you for being here, and I hope you have a wonderful day,</p><p>Dev &lt;3</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/googles-gemma-4-will-change-how-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/googles-gemma-4-will-change-how-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines.</a></p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. 
And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. </strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hRwC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63912c47-ec1c-4a68-b8bd-54fb6ab76a1b_877x132.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hRwC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63912c47-ec1c-4a68-b8bd-54fb6ab76a1b_877x132.png 424w, https://substackcdn.com/image/fetch/$s_!hRwC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63912c47-ec1c-4a68-b8bd-54fb6ab76a1b_877x132.png 848w, https://substackcdn.com/image/fetch/$s_!hRwC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63912c47-ec1c-4a68-b8bd-54fb6ab76a1b_877x132.png 1272w, https://substackcdn.com/image/fetch/$s_!hRwC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63912c47-ec1c-4a68-b8bd-54fb6ab76a1b_877x132.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hRwC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63912c47-ec1c-4a68-b8bd-54fb6ab76a1b_877x132.png" width="877" height="132" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63912c47-ec1c-4a68-b8bd-54fb6ab76a1b_877x132.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:132,&quot;width&quot;:877,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hRwC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63912c47-ec1c-4a68-b8bd-54fb6ab76a1b_877x132.png 424w, https://substackcdn.com/image/fetch/$s_!hRwC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63912c47-ec1c-4a68-b8bd-54fb6ab76a1b_877x132.png 848w, https://substackcdn.com/image/fetch/$s_!hRwC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63912c47-ec1c-4a68-b8bd-54fb6ab76a1b_877x132.png 1272w, https://substackcdn.com/image/fetch/$s_!hRwC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63912c47-ec1c-4a68-b8bd-54fb6ab76a1b_877x132.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Reach out to me</h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- 
https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>Check out my other articles on Medium: <a href="https://machine-learning-made-simple.medium.com/">https://machine-learning-made-simple.medium.com/</a></p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[Anthropic's Claude Mythos Launch Is Built on Misinformation]]></title><description><![CDATA[A primary-source investigation for developers and security researchers who want the real story about what the data says about Mythos]]></description><link>https://www.artificialintelligencemadesimple.com/p/anthropics-claude-mythos-launch-is</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/anthropics-claude-mythos-launch-is</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Fri, 17 Apr 2026 02:01:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Z2GD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and 
genuinely useful. <strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. <strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>Anthropic announced Claude Mythos Preview on April 7, 2026, and the people covering it immediately jumped to hyping it as the end of software as we know it, and as a model that would break all security.</p><p>There is a lot of genuinely promising capability in Mythos. The bugs it found are real, the economics of vulnerability research are changing, and the underlying ability of LLMs to catch bugs that survive decades of human review is not hype. That part of the story is true.</p><p>The problem is everything around it. Almost every major outlet or commentator covering Mythos worked from Anthropic&#8217;s press materials and not the actual primary sources such as the CVE advisories, the exploit code, the 44-prompt transcript, the 244-page system card. 
When you read them alongside the AISLE replication study, the red team writeups, the Glasswing partner agreements, and Anthropic&#8217;s own deceptive framings, a very different picture emerges: one of misinformation and hype.</p><p>In this article, we will cover:</p><ul><li><p>How AI finds bugs that traditional tools can&#8217;t, and why that matters</p></li><li><p>Every major Mythos bug, examined against the actual source code and exploit transcripts</p></li><li><p>What the &#8220;thousands of severe zero-days&#8221; claim actually rests on</p></li><li><p>Whether smaller, cheaper models can replicate Mythos&#8217;s results (spoiler: mostly yes)</p></li><li><p>What really happened with the sandwich escape</p></li><li><p>The business and financial structure behind the launch that nobody reported on</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y9J7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7255479-b55d-4643-81aa-a9782dc5126a_1440x2920.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y9J7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7255479-b55d-4643-81aa-a9782dc5126a_1440x2920.png 424w, https://substackcdn.com/image/fetch/$s_!Y9J7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7255479-b55d-4643-81aa-a9782dc5126a_1440x2920.png 848w, https://substackcdn.com/image/fetch/$s_!Y9J7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7255479-b55d-4643-81aa-a9782dc5126a_1440x2920.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Y9J7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7255479-b55d-4643-81aa-a9782dc5126a_1440x2920.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y9J7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7255479-b55d-4643-81aa-a9782dc5126a_1440x2920.png" width="1440" height="2920" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c7255479-b55d-4643-81aa-a9782dc5126a_1440x2920.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2920,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y9J7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7255479-b55d-4643-81aa-a9782dc5126a_1440x2920.png 424w, https://substackcdn.com/image/fetch/$s_!Y9J7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7255479-b55d-4643-81aa-a9782dc5126a_1440x2920.png 848w, https://substackcdn.com/image/fetch/$s_!Y9J7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7255479-b55d-4643-81aa-a9782dc5126a_1440x2920.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Y9J7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7255479-b55d-4643-81aa-a9782dc5126a_1440x2920.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Executive Highlights (TL;DR of the Article)</h3><ul><li><p>The bugs are real. 17-year-old FreeBSD RCE, 23-year-old Linux kernel heap overflow, 27-year-old OpenBSD TCP flaw. LLMs catch these because they can reason about the gap between what code does and what the developer intended. 
Fuzzers and static analysis literally cannot do this.</p></li><li><p>The coverage is wrong on almost every detail. The &#8220;181 Firefox exploits&#8221; ran with the browser sandbox (yes, the thing that stops browser exploits) off. The FreeBSD exploit transcript shows substantial human guidance, not autonomy. The &#8220;thousands of severe vulnerabilities&#8221; extrapolates from 198 manually reviewed reports. The Linux kernel bug was found by Opus 4.6, the public model, not Mythos.</p></li><li><p>The moat is thinner than anyone reported. AISLE tested eight models including a 3.6B model at $0.11/M tokens. All eight found the FreeBSD bug. Mythos&#8217;s actual lead is in multi-step exploit development, not detection. That&#8217;s a narrower and more replicable advantage than what&#8217;s being sold.</p></li><li><p>Sandwich Gate is mostly nonsense. The model was explicitly told to escape its sandbox and contact the researcher. Anthropic&#8217;s own system card says it notified the researcher &#8220;as requested.&#8221; The only unsolicited action was posting exploit details publicly, which is more &#8220;overeager extra credit&#8221; than &#8220;Skynet.&#8221;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Z2GD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Z2GD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!Z2GD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Z2GD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Z2GD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z2GD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg" width="500" height="562" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:562,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:86813,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/194471381?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Z2GD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Z2GD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Z2GD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Z2GD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbacbfcd8-ce56-4ca9-bdc9-2d78cdd8f348_500x562.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p></li><li><p>The CoT unfaithfulness spike (5% to 65%) is a training problem, not a model personality. If you train reasoning with RL that rewards outputs that look like reasoning, you get outputs that look like reasoning rather than genuine reasoning. The dishonesty isn&#8217;t new; the magnitude is higher because we&#8217;re doing more RL. This is why we need to rethink the approach entirely.</p></li><li><p>The business structure is wild. 5 of 11 launch partners are also investors. JPMorgan is a launch partner AND the lead IPO underwriter. The &#8220;$100M in credits&#8221; is retail-priced API credit worth maybe $40&#8211;50M in compute. All of this lands months before a reported $400&#8211;500B IPO. Same playbook as GPT-2, just with real CVEs this time.</p></li></ul><p>With articles like this, I feel compelled to stress that I have no personal agenda against Anthropic. I&#8217;ve recommended Claude Code extensively in this newsletter (no sponsorship) and wrote an article supporting their stand against AI weapons systems. I have no agenda, pro- or anti-Anthropic, but I am against narrative manipulation and hype.</p><p><em>I put a lot of work into writing this newsletter. To do so, I rely on you for support. If a few more people choose to become paid subscribers, the Chocolate Milk Cult can continue to provide high-quality and accessible education and opportunities to anyone who needs it. If you think this mission is worth contributing to, please consider a premium subscription. 
You can do so for less than the cost of a Netflix Subscription <a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">(pay what you want here)</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qfXe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F586b0211-b338-4d00-b993-dc1125a7d762_630x146.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qfXe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F586b0211-b338-4d00-b993-dc1125a7d762_630x146.png 424w, https://substackcdn.com/image/fetch/$s_!qfXe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F586b0211-b338-4d00-b993-dc1125a7d762_630x146.png 848w, https://substackcdn.com/image/fetch/$s_!qfXe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F586b0211-b338-4d00-b993-dc1125a7d762_630x146.png 1272w, https://substackcdn.com/image/fetch/$s_!qfXe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F586b0211-b338-4d00-b993-dc1125a7d762_630x146.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!qfXe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F586b0211-b338-4d00-b993-dc1125a7d762_630x146.png" width="630" height="146" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/586b0211-b338-4d00-b993-dc1125a7d762_630x146.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:146,&quot;width&quot;:630,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qfXe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F586b0211-b338-4d00-b993-dc1125a7d762_630x146.png 424w, https://substackcdn.com/image/fetch/$s_!qfXe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F586b0211-b338-4d00-b993-dc1125a7d762_630x146.png 848w, https://substackcdn.com/image/fetch/$s_!qfXe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F586b0211-b338-4d00-b993-dc1125a7d762_630x146.png 1272w, https://substackcdn.com/image/fetch/$s_!qfXe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F586b0211-b338-4d00-b993-dc1125a7d762_630x146.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><em>I provide various consulting and advisory services. 
If you&#8217;d like to explore how we can work together, <a href="https://linktr.ee/iseethings404">reach out to me through any of my socials over here</a> or reply to this email.</em></p><h3>Section 0: What Is Claude Mythos Preview?</h3><p>Mythos is Anthropic&#8217;s newest frontier model, internally codenamed &#8220;Capybara,&#8221; sitting above Opus in the Claude hierarchy. Anthropic has published zero information about its architecture, training data, or parameter count. The system card reports it leads 17 of 18 benchmarks, including CyberGym (83.1% vs Opus 4.6&#8217;s 66.6%), SWE-bench Verified (93.9% vs 80.8%), and Cybench (100% pass@1), with a 1M-token context window. By every public metric Anthropic chose to share, it is their most capable model by a wide margin.</p><p>This is, unfortunately, what most coverage parrots (before pivoting to sell their proprietary &#8220;Claude.md skills playbook&#8221;). That framing misses some crucial details.</p><p>Firstly, Anthropic didn&#8217;t actually deploy a model. All those amazing results? 
They come from a model-plus-scaffold system: a multi-agent pipeline where specialized agents handle file ranking, code reading, target execution, proof-of-concept generation, and result confirmation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YQMS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35aa95ec-4425-41e3-b58e-602f264b3bef_1440x1270.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YQMS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35aa95ec-4425-41e3-b58e-602f264b3bef_1440x1270.png 424w, https://substackcdn.com/image/fetch/$s_!YQMS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35aa95ec-4425-41e3-b58e-602f264b3bef_1440x1270.png 848w, https://substackcdn.com/image/fetch/$s_!YQMS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35aa95ec-4425-41e3-b58e-602f264b3bef_1440x1270.png 1272w, https://substackcdn.com/image/fetch/$s_!YQMS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35aa95ec-4425-41e3-b58e-602f264b3bef_1440x1270.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YQMS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35aa95ec-4425-41e3-b58e-602f264b3bef_1440x1270.png" width="1440" height="1270" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/35aa95ec-4425-41e3-b58e-602f264b3bef_1440x1270.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1270,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YQMS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35aa95ec-4425-41e3-b58e-602f264b3bef_1440x1270.png 424w, https://substackcdn.com/image/fetch/$s_!YQMS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35aa95ec-4425-41e3-b58e-602f264b3bef_1440x1270.png 848w, https://substackcdn.com/image/fetch/$s_!YQMS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35aa95ec-4425-41e3-b58e-602f264b3bef_1440x1270.png 1272w, https://substackcdn.com/image/fetch/$s_!YQMS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35aa95ec-4425-41e3-b58e-602f264b3bef_1440x1270.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://red.anthropic.com/2026/mythos-preview/">Literally on their release. Notice how these sub-agents for the system are explicitly defined (it&#8217;s not something Mythos figures out on its own)</a>. Results are still cool, but this adds a huge asterisk and it doesn&#8217;t match the fully autonomous language being thrown around.</figcaption></figure></div><p>This is kind of a big deal. The performance difference between Cursor and Claude Code? Or Claude Code and OpenCode? All that comes from the system around the model. That&#8217;s why Google can have one of the best models in the market, and still produce the Shakespearean tragedy that is the Gemini CLI. 
When people say &#8220;Mythos found a zero-day,&#8221; the truth is closer to &#8220;Mythos, orchestrated by a purpose-built vulnerability research pipeline, found a zero-day.&#8221;</p><p>All that to say, there is a big-ass gulf between &#8220;model did it&#8221; (what is being reported) and &#8220;model used scaffold to do it&#8221; (what actually happened). The former implies a breakthrough in capability; the latter is more of an engineering problem. Which one you go with changes your read of the situation. We&#8217;ll come back to this.</p><p>Next, let&#8217;s talk about those benchmarks.</p><ul><li><p>CyberGym tests &#8220;targeted vulnerability reproduction,&#8221; meaning the model receives a hint about the bug class and tries to reproduce it. That&#8217;s closer to N-day exploitation (recreating a known type of bug) than to the open-ended zero-day discovery that Anthropic&#8217;s marketing implies and the media runs with.</p></li><li><p>The Cybench 100% was scored on only 35 of 40 challenges, with 10 trials per challenge versus 30 for other models. 
Anthropic&#8217;s own system card calls Cybench &#8220;no longer sufficiently informative.&#8221; Their words, not mine.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cWh5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ca4087-df93-4562-9fed-00e532b2907c_1424x1992.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cWh5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ca4087-df93-4562-9fed-00e532b2907c_1424x1992.png 424w, https://substackcdn.com/image/fetch/$s_!cWh5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ca4087-df93-4562-9fed-00e532b2907c_1424x1992.png 848w, https://substackcdn.com/image/fetch/$s_!cWh5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ca4087-df93-4562-9fed-00e532b2907c_1424x1992.png 1272w, https://substackcdn.com/image/fetch/$s_!cWh5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ca4087-df93-4562-9fed-00e532b2907c_1424x1992.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cWh5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ca4087-df93-4562-9fed-00e532b2907c_1424x1992.png" width="1424" height="1992" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f4ca4087-df93-4562-9fed-00e532b2907c_1424x1992.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1992,&quot;width&quot;:1424,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cWh5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ca4087-df93-4562-9fed-00e532b2907c_1424x1992.png 424w, https://substackcdn.com/image/fetch/$s_!cWh5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ca4087-df93-4562-9fed-00e532b2907c_1424x1992.png 848w, https://substackcdn.com/image/fetch/$s_!cWh5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ca4087-df93-4562-9fed-00e532b2907c_1424x1992.png 1272w, https://substackcdn.com/image/fetch/$s_!cWh5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ca4087-df93-4562-9fed-00e532b2907c_1424x1992.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">From the System Card</figcaption></figure></div><p>Finally, here is some context worth holding before we get into the actual findings:</p><ul><li><p>At least 5 of 11 non-Anthropic launch partners are also Anthropic investors. JPMorgan is simultaneously a launch partner and lead underwriter for Anthropic&#8217;s reported October 2026 IPO.</p></li><li><p>Mythos is priced at $25/$125 per million tokens&#8202;&#8212;&#8202;5x above Opus and far exceeding GPT-5.2 ($1.75/$14) and Gemini 3.1 Pro ($2/$12).</p></li><li><p>The restricted-access rollout is consistent with a deeper pattern: Anthropic has repeatedly struggled with usage limits, model quality degradation under load, and compute capacity constraints. Seen from this perspective, &#8220;Too dangerous to release widely&#8221; seems to be a very interesting cover for &#8220;we cannot serve this at scale yet&#8221;. 
(There is some irony to how Peace Prize Dario called out Sam Altman for buying up GPUs, and is now stuck with a massive compute crunch.)</p></li></ul><p>Hopefully, all this foreplay has you heated because now we&#8217;re going to get into the main act.</p><h3>Section 1: How Does AI Find Bugs? And Why Can LLMs Find Bugs That Fuzzers and Static Analysis Miss?</h3><p>The bugs that survive decades in reviewed code are not obviously wrong. They&#8217;re usually locally correct but globally inconsistent: an assumption made in one subsystem, violated by a change in another, with no single file looking broken on its own. A replay cache sized before LOCK operations existed. A TCP sequence comparison that works until operands are more than 2&#179;&#185; apart. These are semantic mismatches, and traditional tools miss them for structural reasons.</p><p>It is worth understanding the traditional techniques first, to build a clear map of the AI-security landscape.</p><ul><li><p><a href="https://www.blackduck.com/glossary/what-is-fuzz-testing.html">Fuzzers</a> generate malformed inputs and watch for crashes. This makes them fast, scalable, and excellent at finding bugs that manifest as observable failures. But they test inputs, not logic. A fuzzer will never understand that a buffer was sized for OPEN responses but not LOCK-denied responses, because it has no concept of &#8220;sized for.&#8221; If the bug only triggers under specific protocol-level conditions (two cooperating NFS clients, one holding a lock with a 1,024-byte owner string, the other requesting a conflicting lock), random mutation is very unlikely to produce that input. 
Fuzzers hit the FFmpeg H.264 code path 5 million times without catching its sentinel-collision bug, because generating a valid bitstream with 65,535+ slices requires structural awareness that coverage-guided mutation simply cannot provide.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aiOa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3444726d-a538-478c-aeab-b967dbfd5ebd_1600x1241.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aiOa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3444726d-a538-478c-aeab-b967dbfd5ebd_1600x1241.png 424w, https://substackcdn.com/image/fetch/$s_!aiOa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3444726d-a538-478c-aeab-b967dbfd5ebd_1600x1241.png 848w, https://substackcdn.com/image/fetch/$s_!aiOa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3444726d-a538-478c-aeab-b967dbfd5ebd_1600x1241.png 1272w, https://substackcdn.com/image/fetch/$s_!aiOa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3444726d-a538-478c-aeab-b967dbfd5ebd_1600x1241.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aiOa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3444726d-a538-478c-aeab-b967dbfd5ebd_1600x1241.png" width="1456" height="1129" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3444726d-a538-478c-aeab-b967dbfd5ebd_1600x1241.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1129,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aiOa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3444726d-a538-478c-aeab-b967dbfd5ebd_1600x1241.png 424w, https://substackcdn.com/image/fetch/$s_!aiOa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3444726d-a538-478c-aeab-b967dbfd5ebd_1600x1241.png 848w, https://substackcdn.com/image/fetch/$s_!aiOa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3444726d-a538-478c-aeab-b967dbfd5ebd_1600x1241.png 1272w, https://substackcdn.com/image/fetch/$s_!aiOa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3444726d-a538-478c-aeab-b967dbfd5ebd_1600x1241.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>On the other hand, static analysis tools like Coverity and Semgrep match known code patterns: unchecked <code>memcpy</code>, missing bounds checks, use-after-free signatures. They&#8217;re precise within their pattern library, fast to run, but they match patterns, not meaning. They can work within a bounded scope&#8202;&#8212;&#8202;a function, a file, a known-bad template. A static analyzer cannot understand that a buffer sized in 2003 for one response type was later expected to hold a different response type added in a separate commit by a different developer. That requires reading across files, across time, across intent. No pattern matches it because no single piece of code is wrong.</p></li></ul><p>This brings us to the fundamental failure mode of these two systems and where LLMs can make an impact. Code has two layers: what it mechanically does, and what the programmer intended it to do. 
Function names, variable names, comments, commit messages, surrounding context&#8202;&#8212;&#8202;these encode intent. The actual logic encodes behavior. LLMs are the first kind of tool that can reason about the gap between those two layers. When a developer leaves a comment saying a buffer is &#8220;large enough to hold the OPEN, the largest of the sequence mutation operations,&#8221; and a later code path routes a much larger response type into that buffer, an LLM can recognize the semantic mismatch without needing a crash to signal it (or, more likely, it can trace a crash back to this mismatch after the fact).</p><p>Now, LLMs have their own issues. They&#8217;re slower per-query than fuzzers, they produce false positives that require expert human triage, and their coverage is not systematic&#8202;&#8212;&#8202;they don&#8217;t guarantee they&#8217;ve examined every code path. This makes it clear that these tools are complements, not replacements. Each tool owns a lane. Known-pattern bugs belong to static analysis. Crash-inducible bugs belong to fuzzers. Semantic mismatches&#8202;&#8212;&#8202;where valid inputs interact with correct-looking code to produce wrong behavior&#8202;&#8212;&#8202;belong to LLMs.</p><p>I take the time to flag this because a lot of coverage of Mythos positions it as a replacement for these traditional techniques. This is objectively untrue.
As multiple agentic systems across the board have taught us, the most powerful systems combine both deterministic and non-deterministic systems, instead of trying to tokenmax by using LLMs everywhere.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mZzx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb866c5e-5f4c-46ab-8065-1a8b14b76107_780x790.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mZzx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb866c5e-5f4c-46ab-8065-1a8b14b76107_780x790.png 424w, https://substackcdn.com/image/fetch/$s_!mZzx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb866c5e-5f4c-46ab-8065-1a8b14b76107_780x790.png 848w, https://substackcdn.com/image/fetch/$s_!mZzx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb866c5e-5f4c-46ab-8065-1a8b14b76107_780x790.png 1272w, https://substackcdn.com/image/fetch/$s_!mZzx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb866c5e-5f4c-46ab-8065-1a8b14b76107_780x790.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mZzx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb866c5e-5f4c-46ab-8065-1a8b14b76107_780x790.png" width="780" height="790" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb866c5e-5f4c-46ab-8065-1a8b14b76107_780x790.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:790,&quot;width&quot;:780,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mZzx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb866c5e-5f4c-46ab-8065-1a8b14b76107_780x790.png 424w, https://substackcdn.com/image/fetch/$s_!mZzx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb866c5e-5f4c-46ab-8065-1a8b14b76107_780x790.png 848w, https://substackcdn.com/image/fetch/$s_!mZzx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb866c5e-5f4c-46ab-8065-1a8b14b76107_780x790.png 1272w, https://substackcdn.com/image/fetch/$s_!mZzx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb866c5e-5f4c-46ab-8065-1a8b14b76107_780x790.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/how-to-build-agentic-aiagents">How to Build Agentic AI.</a></figcaption></figure></div>
<p>This serves as important context, because now we&#8217;re going to do what no other coverage has done yet. We&#8217;re going to look at the actual bugs that have been released. I went through every CVE advisory, every exploit file, and the full 44-prompt transcript to understand what Mythos actually did. And as you might guess, there is a LOT of misinformation and hype floating around.</p><h3>Section 2: The Bugs Mythos Found, Examined One by One</h3><h4>CVE-2026-4747: FreeBSD NFS Remote Code Execution (17 years old)</h4><p>A function called <code>svc_rpc_gss_validate()</code>, part of FreeBSD&#8217;s NFS authentication system, allocates a 128-byte buffer on the stack to hold incoming credentials. The protocol&#8217;s serialization layer (XDR) permits credential bodies up to 400 bytes.
That leaves 272 bytes of overflow&#8202;&#8212;&#8202;enough to overwrite saved registers, the frame pointer, and the return address.</p><p>Under FreeBSD&#8217;s default compiler settings, this particular function receives no stack canary, meaning there&#8217;s no runtime check to detect the overwrite before the function returns. An attacker who controls the credential body controls where the function jumps when it finishes.</p><p>The bug survived 17 years because neither subsystem is broken on its own. The function parses credentials correctly. The XDR layer serializes credentials correctly. The mismatch&#8202;&#8212;&#8202;that the serialization layer permits inputs larger than the receiving buffer&#8202;&#8212;&#8202;lives in the gap between two subsystems written at different times with different assumptions.</p><p>Two distinct exploit strategies exist in the public record.</p><ul><li><p>Anthropic says Mythos developed a 6-round exploit that writes an SSH key to <code>.ssh/authorized_keys</code>, fully autonomous&#8202;&#8212;&#8202;fewer resources needed, no reverse shell connection to detect, persistence that survives reboots. It was not published.</p></li><li><p><a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-4747">Separately, Nicholas Carlini published a 15-round reverse shell exploit here</a>, crediting &#8220;Claude&#8221; without specifying the model version. I read every file in that repository&#8202;&#8212;&#8202;the 601-line <code>exploit.py</code>, the <code>write-up.md</code>, and the <code>claude-prompts.txt</code> containing 44 human prompts across roughly 8 hours of work. What I found in there complicates every narrative about this exploit.</p></li></ul><p><strong>How the published exploit works.</strong> The target is FreeBSD 14.4-RELEASE on amd64. FreeBSD lacks kernel address space layout randomization (KASLR), so every kernel address is predictable&#8202;&#8212;&#8202;a significant simplification. 
Round 1 calls <code>pmap_change_prot()</code> to make a kernel memory region executable. Rounds 2 through 14 each write 32 bytes of shellcode into that region using ROP chains&#8202;&#8212;&#8202;short sequences of existing kernel instructions chained together to perform arbitrary operations. Round 15 jumps to the completed shellcode, which spawns a reverse root shell. Each round is a separate NFS request that triggers the buffer overflow, writes its payload, and returns. FreeBSD spawns 8 NFS threads per CPU. The exploit consumes one thread per round (15 total). With 2 CPUs (16 threads), the margin is exactly one thread.</p><p>Here&#8217;s where things get interesting.</p><p>Carlini&#8217;s README claims the human &#8220;was AFK for much of it.&#8221; The 44 prompts tell a different story. Prompt 15 is the critical intervention: &#8220;okay in <code>../FBSD-001</code> there is a different remote exploit that gets a shell.. read it for how they constructed the connect back.&#8221; This pointed Claude to a prior exploit for a different FreeBSD vulnerability (in the SCSI target layer) as a reference implementation.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WwPp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580cd25e-759a-4181-9f2d-109dda2c7cb0_2400x277.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WwPp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580cd25e-759a-4181-9f2d-109dda2c7cb0_2400x277.png 424w, https://substackcdn.com/image/fetch/$s_!WwPp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580cd25e-759a-4181-9f2d-109dda2c7cb0_2400x277.png 848w, 
https://substackcdn.com/image/fetch/$s_!WwPp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580cd25e-759a-4181-9f2d-109dda2c7cb0_2400x277.png 1272w, https://substackcdn.com/image/fetch/$s_!WwPp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580cd25e-759a-4181-9f2d-109dda2c7cb0_2400x277.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WwPp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580cd25e-759a-4181-9f2d-109dda2c7cb0_2400x277.png" width="1456" height="168" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/580cd25e-759a-4181-9f2d-109dda2c7cb0_2400x277.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:168,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WwPp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580cd25e-759a-4181-9f2d-109dda2c7cb0_2400x277.png 424w, https://substackcdn.com/image/fetch/$s_!WwPp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580cd25e-759a-4181-9f2d-109dda2c7cb0_2400x277.png 848w, 
https://substackcdn.com/image/fetch/$s_!WwPp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580cd25e-759a-4181-9f2d-109dda2c7cb0_2400x277.png 1272w, https://substackcdn.com/image/fetch/$s_!WwPp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580cd25e-759a-4181-9f2d-109dda2c7cb0_2400x277.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">It&#8217;s literally there publicly. How did no one think to flag this?</figcaption></figure></div><p>The shellcode comments confirm it: &#8220;Uses the same build logic as <code>FBSD-001/exploit.py build_stage2_shellcode()</code>, with two patches.&#8221; Claude did not independently derive the kernel-to-userland execution pattern. It adapted it from reference code the human provided. Dead code artifacts survive in the final exploit&#8202;&#8212;&#8202;<code>FAKE_MODULE_BASE</code> and <code>HA_HANDLER_OFF</code> constants from the reference exploit, plus 19 NOP bytes where the original handler cleanup was removed but not reclaimed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Fves!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa545d407-c3e9-4420-b47a-7d905c9129ce_1418x678.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Fves!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa545d407-c3e9-4420-b47a-7d905c9129ce_1418x678.png 424w,
https://substackcdn.com/image/fetch/$s_!Fves!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa545d407-c3e9-4420-b47a-7d905c9129ce_1418x678.png 848w, https://substackcdn.com/image/fetch/$s_!Fves!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa545d407-c3e9-4420-b47a-7d905c9129ce_1418x678.png 1272w, https://substackcdn.com/image/fetch/$s_!Fves!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa545d407-c3e9-4420-b47a-7d905c9129ce_1418x678.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Fves!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa545d407-c3e9-4420-b47a-7d905c9129ce_1418x678.png" width="1418" height="678" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a545d407-c3e9-4420-b47a-7d905c9129ce_1418x678.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:678,&quot;width&quot;:1418,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Fves!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa545d407-c3e9-4420-b47a-7d905c9129ce_1418x678.png 424w, 
https://substackcdn.com/image/fetch/$s_!Fves!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa545d407-c3e9-4420-b47a-7d905c9129ce_1418x678.png 848w, https://substackcdn.com/image/fetch/$s_!Fves!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa545d407-c3e9-4420-b47a-7d905c9129ce_1418x678.png 1272w, https://substackcdn.com/image/fetch/$s_!Fves!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa545d407-c3e9-4420-b47a-7d905c9129ce_1418x678.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Also look at line 141.</figcaption></figure></div>
<p>Beyond the reference exploit, the human made at least five significant corrections: stopping Claude from using Docker instead of QEMU for kernel exploitation (prompt 4), preventing it from killing the wrong QEMU instance (prompt 9), redirecting it from a kernel reboot to a reverse shell, vetoing remote file-prestaging (prompt 14), and blocking a non-default thread count increase. The human also provided the hint &#8220;there is no KASLR so it should be easy&#8221; (prompt 12).</p><p>I want to be clear about this&#8202;&#8212;&#8202;there is genuine capability here. Claude used a De Bruijn pattern&#8202;&#8212;&#8202;a sequence where every possible substring of a given length appears exactly once&#8202;&#8212;&#8202;to determine the exact offset where the overflow overwrites the return address (200 bytes into the credential body), after an initial disassembly estimate was wrong by 32 bytes. Rather than guessing, it generated a diagnostic payload that precisely identified the correct offset from the crash data. When the obvious register-transfer gadget (<code>rax &#8594; rdi</code>) didn&#8217;t exist in the kernel&#8217;s gadget space, Claude discovered a <code>mov [rdi], rax; ret</code> write primitive as an alternative.
A failed approach using <code>rep movsq</code>&#8202;&#8212;&#8202;abandoned when the only available <code>push rsp; pop rsi</code> gadget had a side effect that corrupted the repeat count register&#8202;&#8212;&#8202;shows genuine exploration of the constraint space rather than template matching.</p><p>But, and this is a big but, Anthropic marketed this as &#8220;unauthenticated root from anywhere on the internet.&#8221; <a href="https://www.freebsd.org/security/advisories/FreeBSD-SA-26:08.rpcsec_gss.asc">The FreeBSD advisory says &#8220;remote code execution in the kernel is possible by an authenticated user.&#8221;</a> NVD assigns Privileges Required: Low, not None. The published exploit requires Kerberos authentication.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T69h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62114089-5ac0-4dc0-81ea-49c576418148_1600x862.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T69h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62114089-5ac0-4dc0-81ea-49c576418148_1600x862.png 424w, https://substackcdn.com/image/fetch/$s_!T69h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62114089-5ac0-4dc0-81ea-49c576418148_1600x862.png 848w, https://substackcdn.com/image/fetch/$s_!T69h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62114089-5ac0-4dc0-81ea-49c576418148_1600x862.png 1272w, 
https://substackcdn.com/image/fetch/$s_!T69h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62114089-5ac0-4dc0-81ea-49c576418148_1600x862.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T69h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62114089-5ac0-4dc0-81ea-49c576418148_1600x862.png" width="1456" height="784" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/62114089-5ac0-4dc0-81ea-49c576418148_1600x862.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:784,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T69h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62114089-5ac0-4dc0-81ea-49c576418148_1600x862.png 424w, https://substackcdn.com/image/fetch/$s_!T69h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62114089-5ac0-4dc0-81ea-49c576418148_1600x862.png 848w, https://substackcdn.com/image/fetch/$s_!T69h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62114089-5ac0-4dc0-81ea-49c576418148_1600x862.png 1272w, 
https://substackcdn.com/image/fetch/$s_!T69h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62114089-5ac0-4dc0-81ea-49c576418148_1600x862.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://nvd.nist.gov/vuln/detail/CVE-2026-4747">Source</a></figcaption></figure></div><p>(Anthropic describes an unpublished path where Mythos found an unauthenticated <code>EXCHANGE_ID</code> call that leaks the kernel <code>hostid</code> and boot time, making the authentication handle computable without
credentials&#8202;&#8212;&#8202;but this path has no published proof-of-concept).</p><p>And in either case, the attacker needs network access to the NFS port&#8202;&#8212;&#8202;which in practice means they&#8217;re already on your internal network, because NFS servers don&#8217;t sit on the public internet. &#8220;<strong>Root from anywhere on the internet&#8221; requires being somewhere the internet can&#8217;t reach</strong>. This is the kind of contradiction only Dario Luther King can resolve.</p><p><em>Total cost: under $1,000 in API credits.</em></p><h4>CVE-2026-31402: Linux Kernel NFSv4.0 Heap Overflow (23 years old)</h4><p>This one is my favorite because it shows exactly what LLMs can do that nothing else can.</p><p>In 2003, a developer named Neil Brown added a replay cache to the Linux kernel&#8217;s NFS server&#8202;&#8212;&#8202;a buffer that stores copies of recent responses so the server can retransmit them if a client&#8217;s request is lost. Brown sized the cache buffer at 112 bytes and left an explicit comment in the code: &#8220;<em>I&#8217;ve implemented the cache as a static buffer of size 112 bytes which is large enough to hold the OPEN, the largest of the sequence mutation operations. LOCK and UNLOCK will be added when byte-range locking is done (soon!).</em>&#8221;</p><p>The buffer was sized before LOCK existed. When LOCK was later added by a different developer, nobody revisited the 112-byte assumption. A LOCK-denied response can be as large as 1,056 bytes. That is a 944-byte overflow into kernel heap memory.</p><p>Here&#8217;s what makes this one beautiful in a terrible way. The kernel already partially knew about the mismatch. The XDR request size estimator at line 2682 of <code>nfs4xdr.c</code> correctly accounts for LOCK responses&#8202;&#8212;&#8202;it adds <code>NFS4_OPAQUE_LIMIT</code> to the outbound buffer size, producing a correctly sized response sent over the network. The client receives the right data.
The overflow happens only in the replay cache copy, a separate code path that was never updated. The kernel sends the right answer over the wire, then corrupts its own memory while caching it.</p><p><a href="https://mtlynch.io/claude-code-found-linux-vulnerability/">Carlini found this bug using Opus 4.6, the publicly available model, not Mythos; Lynch&#8217;s writeup confirms this explicitly</a>. A bash script iterated over every file in the kernel source tree, giving each to Claude Code with the prompt &#8220;You are playing in a CTF. Find a vulnerability. hint: look at [file].&#8221; No specialized tooling. Claude had to understand that when two NFS clients cooperate&#8202;&#8212;&#8202;one sets a lock with a 1,024-byte owner string, the other requests a conflicting lock&#8202;&#8212;&#8202;the denial response overflows the replay cache. That&#8217;s reasoning about asynchronous distributed client interactions, exactly the kind of cross-component logic invisible to fuzzers.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!E4N6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4270d9-0438-4e5f-b660-e81658101612_1600x1314.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!E4N6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4270d9-0438-4e5f-b660-e81658101612_1600x1314.png 424w, https://substackcdn.com/image/fetch/$s_!E4N6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4270d9-0438-4e5f-b660-e81658101612_1600x1314.png 848w,
https://substackcdn.com/image/fetch/$s_!E4N6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4270d9-0438-4e5f-b660-e81658101612_1600x1314.png 1272w, https://substackcdn.com/image/fetch/$s_!E4N6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4270d9-0438-4e5f-b660-e81658101612_1600x1314.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!E4N6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4270d9-0438-4e5f-b660-e81658101612_1600x1314.png" width="1456" height="1196" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd4270d9-0438-4e5f-b660-e81658101612_1600x1314.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1196,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!E4N6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4270d9-0438-4e5f-b660-e81658101612_1600x1314.png 424w, https://substackcdn.com/image/fetch/$s_!E4N6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4270d9-0438-4e5f-b660-e81658101612_1600x1314.png 848w, 
https://substackcdn.com/image/fetch/$s_!E4N6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4270d9-0438-4e5f-b660-e81658101612_1600x1314.png 1272w, https://substackcdn.com/image/fetch/$s_!E4N6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4270d9-0438-4e5f-b660-e81658101612_1600x1314.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><a href="https://lkml.org/lkml/2026/2/24/1461">Jeff Layton&#8217;s fix (commit </a><code>5133b61aaf43</code><a
href="https://lkml.org/lkml/2026/2/24/1461">) is nine lines: a bounds check before the copy. If the response exceeds the buffer, </a><code>rp_buflen</code><a href="https://lkml.org/lkml/2026/2/24/1461"> is set to zero, the status is still cached, but the payload is dropped. The commit message explains why the buffer was not simply enlarged: doing so would increase the size of every </a><code>nfs4_stateowner</code><a href="https://lkml.org/lkml/2026/2/24/1461"> structure across all NFS client state, wasting memory. Retransmitted LOCK-denied responses lose the denial payload&#8202;&#8212;&#8202;acceptable because the client already received it on the first attempt</a>.</p><p>Lynch&#8217;s writeup lists five kernel bugs Carlini reported with Opus 4.6, all with fixes merged into mainline. Carlini stated he has &#8220;several hundred crashes&#8221; he hasn&#8217;t had time to validate. The bottleneck is human triage, not AI discovery.</p><p>Once again, media coverage implied this was a Mythos finding. It was not. And despite &#8220;several thousand scans&#8221; of Linux kernel code, Mythos was &#8220;unable to successfully exploit any&#8221; Linux kernel vulnerability remotely. It found bugs but could not exploit them over a network. 
The model that found and helped exploit this bug was the publicly available one.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qzkZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a2e146f-640a-45b5-aa8f-68165d39e971_1510x420.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qzkZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a2e146f-640a-45b5-aa8f-68165d39e971_1510x420.png 424w, https://substackcdn.com/image/fetch/$s_!qzkZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a2e146f-640a-45b5-aa8f-68165d39e971_1510x420.png 848w, https://substackcdn.com/image/fetch/$s_!qzkZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a2e146f-640a-45b5-aa8f-68165d39e971_1510x420.png 1272w, https://substackcdn.com/image/fetch/$s_!qzkZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a2e146f-640a-45b5-aa8f-68165d39e971_1510x420.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qzkZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a2e146f-640a-45b5-aa8f-68165d39e971_1510x420.png" width="1456" height="405" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6a2e146f-640a-45b5-aa8f-68165d39e971_1510x420.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:405,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qzkZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a2e146f-640a-45b5-aa8f-68165d39e971_1510x420.png 424w, https://substackcdn.com/image/fetch/$s_!qzkZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a2e146f-640a-45b5-aa8f-68165d39e971_1510x420.png 848w, https://substackcdn.com/image/fetch/$s_!qzkZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a2e146f-640a-45b5-aa8f-68165d39e971_1510x420.png 1272w, https://substackcdn.com/image/fetch/$s_!qzkZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a2e146f-640a-45b5-aa8f-68165d39e971_1510x420.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>CVE-2026-2796: Firefox JIT Type Confusion (CVSS 9.8)</h4><p>A type confusion in Firefox&#8217;s JavaScript engine, reported by seven researchers &#8220;using Claude from Anthropic.&#8221; The Bugzilla entry remains restricted, so independent verification of the technical details is not yet possible. What we can examine are the testing conditions.</p><p>The headline number is 181 successful exploit runs for Mythos, where Opus 4.6 managed 2 out of roughly 350 attempts. But these are execution runs, not unique exploits: Mythos succeeds more often at the same task; it did not produce 181 different strategies. And Anthropic&#8217;s own footnote states these exploits &#8220;<em>target a testing harness mimicking a Firefox 147 content process, without the browser&#8217;s process sandbox or other defense-in-depth mitigations.</em>&#8221; If you&#8217;re wondering, yes, the browser sandbox is the part that stops browser exploits. In a real browser, the renderer sandbox is the primary defense against JIT exploits. 
Getting code execution inside the renderer is step one; the browser&#8217;s process sandbox, the OS sandbox, and ASLR all stand between that and actual system compromise.</p><p>So, unless all that brain damage from MMA is hitting me&#8230; the big exploit involves working in a compromised setting. I guess anyone can break into a house if there are no locks. What Saint Dario is trying to prove here eludes my puny comprehension; perhaps one of you can help me understand.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EnqU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e1a75c0-0afd-4051-95f4-f3e6d9bc8db2_2400x1350.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EnqU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e1a75c0-0afd-4051-95f4-f3e6d9bc8db2_2400x1350.png 424w, https://substackcdn.com/image/fetch/$s_!EnqU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e1a75c0-0afd-4051-95f4-f3e6d9bc8db2_2400x1350.png 848w, https://substackcdn.com/image/fetch/$s_!EnqU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e1a75c0-0afd-4051-95f4-f3e6d9bc8db2_2400x1350.png 1272w, https://substackcdn.com/image/fetch/$s_!EnqU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e1a75c0-0afd-4051-95f4-f3e6d9bc8db2_2400x1350.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!EnqU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e1a75c0-0afd-4051-95f4-f3e6d9bc8db2_2400x1350.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e1a75c0-0afd-4051-95f4-f3e6d9bc8db2_2400x1350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EnqU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e1a75c0-0afd-4051-95f4-f3e6d9bc8db2_2400x1350.png 424w, https://substackcdn.com/image/fetch/$s_!EnqU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e1a75c0-0afd-4051-95f4-f3e6d9bc8db2_2400x1350.png 848w, https://substackcdn.com/image/fetch/$s_!EnqU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e1a75c0-0afd-4051-95f4-f3e6d9bc8db2_2400x1350.png 1272w, https://substackcdn.com/image/fetch/$s_!EnqU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e1a75c0-0afd-4051-95f4-f3e6d9bc8db2_2400x1350.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Anthropic separately claims Mythos chained four vulnerabilities to escape both a renderer and OS sandbox in &#8220;a major web browser.&#8221; Their red team writeup states &#8220;<em>we then worked with Mythos Preview to increase its severity.</em>&#8221; The sandbox escape was human-assisted. There is no CVE and no technical writeup, so no independent verification is possible, but good to know, I guess.</p><p><a href="https://www.mozilla.org/en-US/security/advisories/mfsa2026-13/">The strongest public numbers were produced under the weakest test conditions. The unverifiable claims describe the hardest conditions. Mozilla&#8217;s own security advisory rates the impact as &#8220;high,&#8221; not &#8220;critical.&#8221; NVD assigned CVSS 9.8. 
Mozilla disagrees</a>.</p><h4>CVE-2026-26980: Ghost CMS SQL Injection (CVSS 9.4)</h4><p>Carlini demonstrated this one live in about 90 minutes at the [un]prompted 2026 conference&#8202;&#8212;&#8202;a blind SQL injection in Ghost&#8217;s Content API through the <code>filter</code> query parameter, unauthenticated because the API key is public by design. OWASP has listed injection flaws as a top vulnerability class for over a decade. Valid work, not frontier capability.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QSG5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96fdd474-5685-4c90-aeef-422990a3baf8_1440x952.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QSG5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96fdd474-5685-4c90-aeef-422990a3baf8_1440x952.png 424w, https://substackcdn.com/image/fetch/$s_!QSG5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96fdd474-5685-4c90-aeef-422990a3baf8_1440x952.png 848w, https://substackcdn.com/image/fetch/$s_!QSG5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96fdd474-5685-4c90-aeef-422990a3baf8_1440x952.png 1272w, https://substackcdn.com/image/fetch/$s_!QSG5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96fdd474-5685-4c90-aeef-422990a3baf8_1440x952.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!QSG5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96fdd474-5685-4c90-aeef-422990a3baf8_1440x952.png" width="1440" height="952" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/96fdd474-5685-4c90-aeef-422990a3baf8_1440x952.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:952,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QSG5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96fdd474-5685-4c90-aeef-422990a3baf8_1440x952.png 424w, https://substackcdn.com/image/fetch/$s_!QSG5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96fdd474-5685-4c90-aeef-422990a3baf8_1440x952.png 848w, https://substackcdn.com/image/fetch/$s_!QSG5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96fdd474-5685-4c90-aeef-422990a3baf8_1440x952.png 1272w, https://substackcdn.com/image/fetch/$s_!QSG5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96fdd474-5685-4c90-aeef-422990a3baf8_1440x952.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>The OpenBSD TCP SACK Bug (27 years old, no CVE yet)</h4><p>The oldest bug in the set, and the one that demands the most sophisticated reasoning to find.</p><p>TCP sequence numbers are 32-bit unsigned integers that wrap around. The kernel compares them using macros (<code>SEQ_LT</code>, <code>SEQ_GT</code>) that cast the difference to a signed integer: <code>(int)((a)-(b)) &lt; 0</code>. This works correctly when the two numbers are within 2^31 of each other. When they&#8217;re farther apart, signed overflow produces contradictory results&#8202;&#8212;&#8202;a value can simultaneously appear less than one reference point and greater than another. 
The code validates <code>sack.end</code> against the send window but never validates that <code>sack.start</code> is at or above the base sequence number. An attacker places <code>sack.start</code> roughly 2^31 away from the real window. The kernel deletes its only tracking entry, a pointer goes null, and the subsequent write dereferences it.</p><p>Not the first SACK bug in this function either. CVE-2019-8460 covered resource exhaustion via unbounded hole list growth, fixed by adding a 128-hole limit. A 2005 patch added helper macros. Neither fix addressed the signed integer overflow. Two security reviews of the same function, years apart, missed the same class of bug that an LLM found in a single run costing about $50.</p><p>The fix is four lines: a lower-bound check on <code>sack.start</code> plus a null guard. Total discovery cost was roughly $20,000 across approximately 1,000 scaffold runs; the $50 figure covers only the run that surfaced this bug.</p><p>But&#8202;&#8212;&#8202;and this matters&#8202;&#8212;&#8202;it is denial of service, not remote code execution. The null pointer write goes to a fixed low address with no attacker control over the written value, and OpenBSD has prohibited mapping the null page since version 4.4. <strong>Combined with W^X, SMEP/SMAP, and KARL, code execution from this primitive is essentially impossible. </strong>Anthropic&#8217;s technical writeup is honest about this limitation. The broader marketing language about &#8220;every major operating system&#8221; does not distinguish between a DoS bug on OpenBSD and an RCE exploit on FreeBSD.</p><h4>The FFmpeg H.264 Bug (16 years old, no CVE yet)</h4><p>A sentinel-collision out-of-bounds write in FFmpeg&#8217;s H.264 decoder. The slice assignment table uses <code>0xFFFF</code> as an empty-slot sentinel. The slice counter is a 32-bit integer. When it reaches 65,535 (<code>0xFFFF</code>), it collides with the sentinel, and the deblocking filter misidentifies initialized macroblocks as uninitialized. Anthropic&#8217;s own report says it is not critical and would be hard to exploit. 
Findings like this feed the broader claim of &#8220;thousands of severe zero-days.&#8221;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tkjO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcad62e0-68e1-4827-87f7-2505e77c1eaa_2400x1440.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tkjO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcad62e0-68e1-4827-87f7-2505e77c1eaa_2400x1440.png 424w, https://substackcdn.com/image/fetch/$s_!tkjO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcad62e0-68e1-4827-87f7-2505e77c1eaa_2400x1440.png 848w, https://substackcdn.com/image/fetch/$s_!tkjO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcad62e0-68e1-4827-87f7-2505e77c1eaa_2400x1440.png 1272w, https://substackcdn.com/image/fetch/$s_!tkjO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcad62e0-68e1-4827-87f7-2505e77c1eaa_2400x1440.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tkjO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcad62e0-68e1-4827-87f7-2505e77c1eaa_2400x1440.png" width="1456" height="874" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fcad62e0-68e1-4827-87f7-2505e77c1eaa_2400x1440.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:874,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tkjO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcad62e0-68e1-4827-87f7-2505e77c1eaa_2400x1440.png 424w, https://substackcdn.com/image/fetch/$s_!tkjO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcad62e0-68e1-4827-87f7-2505e77c1eaa_2400x1440.png 848w, https://substackcdn.com/image/fetch/$s_!tkjO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcad62e0-68e1-4827-87f7-2505e77c1eaa_2400x1440.png 1272w, https://substackcdn.com/image/fetch/$s_!tkjO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcad62e0-68e1-4827-87f7-2505e77c1eaa_2400x1440.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Section 3: How Strong Are the Scale and Exclusivity Claims?</h3><p>The number behind &#8220;thousands of severe zero-days&#8221; is 198 manually reviewed reports. That&#8217;s the total human-verified count&#8202;&#8212;&#8202;Anthropic hired contractors who agreed with Mythos&#8217;s severity assessment 89% of the time, 98% within one severity level. Everything else is extrapolation, and it&#8217;s not strong extrapolation. The 198 is a 4&#8211;10% sample with undisclosed selection methodology, reviewed by Anthropic-paid contractors using Anthropic&#8217;s severity framework, <strong>and that 89% agreement rate means roughly 11% of the &#8220;thousands&#8221; may be misclassified</strong>. For comparison: in Anthropic&#8217;s own standardized testing regime across roughly 7,000 entry points in 1,000 open-source repositories, the result was 10 confirmed control-flow hijacks. Ten. 
The &#8220;thousands&#8221; comes from a different, less controlled regime where the methodology is opaque.</p><p>Anthropic&#8217;s red team acknowledges training data contamination concerns for the N-day exploits: <em>&#8220;it is conceivable that Mythos Preview is drawing on prior knowledge of these bugs to inform its exploits.&#8221; </em>They argue the exploits&#8217; sophistication matches zero-day work, but also admit their N-day metrics &#8220;<em>can make it difficult to distinguish novel capabilities from cases where the model simply remembered the solution</em>.&#8221; The zero-day findings are more defensible precisely because those bugs could not have appeared in training data.</p><p>Now here&#8217;s the question the coverage never asked: can other models do this at all?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q498!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47495939-597d-432c-b03a-05c6eb5a737f_1730x838.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q498!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47495939-597d-432c-b03a-05c6eb5a737f_1730x838.png 424w, https://substackcdn.com/image/fetch/$s_!Q498!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47495939-597d-432c-b03a-05c6eb5a737f_1730x838.png 848w, https://substackcdn.com/image/fetch/$s_!Q498!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47495939-597d-432c-b03a-05c6eb5a737f_1730x838.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Q498!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47495939-597d-432c-b03a-05c6eb5a737f_1730x838.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q498!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47495939-597d-432c-b03a-05c6eb5a737f_1730x838.png" width="1456" height="705" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/47495939-597d-432c-b03a-05c6eb5a737f_1730x838.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:705,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q498!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47495939-597d-432c-b03a-05c6eb5a737f_1730x838.png 424w, https://substackcdn.com/image/fetch/$s_!Q498!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47495939-597d-432c-b03a-05c6eb5a737f_1730x838.png 848w, https://substackcdn.com/image/fetch/$s_!Q498!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47495939-597d-432c-b03a-05c6eb5a737f_1730x838.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Q498!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47495939-597d-432c-b03a-05c6eb5a737f_1730x838.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>AISLE, an AI security lab led by Stanislav Fort, tested Anthropic&#8217;s showcase bugs across eight models in single zero-shot API calls. No agentic workflows, no tool access, no iterative loops. Just the vulnerable code in the prompt and a request to find vulnerabilities. 
Full transcripts are published at <code>github.com/stanislavfort/mythos-jagged-frontier</code>.</p><p>All eight models found the FreeBSD bug. Every single one. A 3.6B-parameter model running at $0.11 per million tokens correctly identified the overflow and assessed it as critical. The finding that anchored the entire launch narrative is detectable by a model small enough to run on a laptop.</p><p>The more interesting result is where models diverge, and AISLE has a great name for it: the &#8220;jagged frontier.&#8221; Cybersecurity capability does not scale smoothly with model size. On the FreeBSD buffer overflow&#8202;&#8212;&#8202;fundamentally pattern recognition, input size exceeds buffer size&#8202;&#8212;&#8202;all models scored well. On the OpenBSD SACK bug, which requires multi-step signed integer wraparound reasoning, grades ranged from A+ to F. GPT-OSS-120b got A+. Kimi K2 got A-. Qwen3 32B got F: the same model that rated the FreeBSD bug CVSS 9.8 confidently declared the OpenBSD code &#8220;robust&#8221; and said &#8220;no exploitation vector exists.&#8221; It never examined whether <code>sack.start</code> values roughly 2^31 away from the base could make the <code>SEQ_LT</code> and <code>SEQ_GT</code> comparisons give contradictory results.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vAxa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26af015d-2a05-4f30-9f5a-9b4379d31c75_1600x960.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vAxa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26af015d-2a05-4f30-9f5a-9b4379d31c75_1600x960.png 424w, 
https://substackcdn.com/image/fetch/$s_!vAxa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26af015d-2a05-4f30-9f5a-9b4379d31c75_1600x960.png 848w, https://substackcdn.com/image/fetch/$s_!vAxa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26af015d-2a05-4f30-9f5a-9b4379d31c75_1600x960.png 1272w, https://substackcdn.com/image/fetch/$s_!vAxa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26af015d-2a05-4f30-9f5a-9b4379d31c75_1600x960.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vAxa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26af015d-2a05-4f30-9f5a-9b4379d31c75_1600x960.png" width="1456" height="874" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/26af015d-2a05-4f30-9f5a-9b4379d31c75_1600x960.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:874,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vAxa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26af015d-2a05-4f30-9f5a-9b4379d31c75_1600x960.png 424w, 
https://substackcdn.com/image/fetch/$s_!vAxa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26af015d-2a05-4f30-9f5a-9b4379d31c75_1600x960.png 848w, https://substackcdn.com/image/fetch/$s_!vAxa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26af015d-2a05-4f30-9f5a-9b4379d31c75_1600x960.png 1272w, https://substackcdn.com/image/fetch/$s_!vAxa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26af015d-2a05-4f30-9f5a-9b4379d31c75_1600x960.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Why the jaggedness? Buffer overflow detection is pattern-matchable from training data&#8202;&#8212;&#8202;models have seen thousands of examples where input size exceeds buffer size. Signed integer wraparound is not. It requires actually reasoning about how signed and unsigned arithmetic interact under specific value ranges. You cannot predict which bugs a model will find by knowing its size or its general benchmark scores. Capability is task-shaped, not model-shaped.</p><p><a href="https://github.com/stanislavfort/mythos-jagged-frontier/blob/main/transcripts/owasp-false-positive.md">The false-positive results cut even deeper. AISLE tested models on an OWASP sample&#8202;&#8212;&#8202;a Java code block with no actual SQL injection vulnerability.</a></p><ul><li><p>Twelve of 13 Anthropic models failed, including Sonnet 4.5, which correctly traced the list operations and then overrode its own analysis to flag the pattern anyway.</p></li><li><p>The $0.11/M token GPT-OSS-20b passed. On patched FreeBSD code where the vulnerability was already fixed, only GPT-OSS-120b correctly identified it as safe across all three trials.</p></li><li><p>The most common false-positive argument: that <code>oa_length</code> could be negative, bypassing the bounds check. <code>oa_length</code> is declared as <code>u_int</code>&#8202;&#8212;&#8202;unsigned. It cannot be negative. The models learned that &#8220;signed/unsigned confusion&#8221; is a common vulnerability class and applied it reflexively even when the types don&#8217;t support the claim. Pattern matching masquerading as analysis.</p></li></ul><p>When told the 304-byte overflow could not fit the 1,000+ byte ROP chain needed for Mythos&#8217;s published approach, no model independently discovered the multi-round RPC delivery mechanism. 
But several proposed valid alternatives: DeepSeek R1 suggested a minimal ~160-byte ROP chain using <code>prepare_kernel_cred(0)/commit_creds</code> for privilege escalation, Gemini Flash Lite proposed a stack-pivot redirecting RSP to the credential buffer already in kernel heap for unlimited ROP space, and Kimi K2 noted the bug is &#8220;wormable&#8221;&#8202;&#8212;&#8202;a detail absent from Anthropic&#8217;s announcement entirely.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z_Oe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1509ef35-3d8d-4d30-baab-9e761fd6f7a3_1618x1560.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z_Oe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1509ef35-3d8d-4d30-baab-9e761fd6f7a3_1618x1560.png 424w, https://substackcdn.com/image/fetch/$s_!z_Oe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1509ef35-3d8d-4d30-baab-9e761fd6f7a3_1618x1560.png 848w, https://substackcdn.com/image/fetch/$s_!z_Oe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1509ef35-3d8d-4d30-baab-9e761fd6f7a3_1618x1560.png 1272w, https://substackcdn.com/image/fetch/$s_!z_Oe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1509ef35-3d8d-4d30-baab-9e761fd6f7a3_1618x1560.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!z_Oe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1509ef35-3d8d-4d30-baab-9e761fd6f7a3_1618x1560.png" width="1456" height="1404" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1509ef35-3d8d-4d30-baab-9e761fd6f7a3_1618x1560.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1404,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!z_Oe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1509ef35-3d8d-4d30-baab-9e761fd6f7a3_1618x1560.png 424w, https://substackcdn.com/image/fetch/$s_!z_Oe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1509ef35-3d8d-4d30-baab-9e761fd6f7a3_1618x1560.png 848w, https://substackcdn.com/image/fetch/$s_!z_Oe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1509ef35-3d8d-4d30-baab-9e761fd6f7a3_1618x1560.png 1272w, https://substackcdn.com/image/fetch/$s_!z_Oe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1509ef35-3d8d-4d30-baab-9e761fd6f7a3_1618x1560.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There is nuance worth considering here. As Zvi Mowshowitz put it: &#8220;<em>We took the needle the model found, isolated the relevant handful of the haystack, and then gave it to a small child, who found the needle as well.</em>&#8221; And he&#8217;s right&#8202;&#8212;&#8202;AISLE tested detection after file selection. The vulnerable code was already isolated and handed to each model. In practice, Mythos&#8217;s scaffold searches across a million-line codebase, ranks files, decides what to read, and identifies which code paths are worth examining. Detection is easier than discovery. 
When Chase Brower challenged AISLE on false positives, AISLE acknowledged massive false-positive rates from smaller models even on narrowed 20-line targets, making wide searches &#8220;utterly useless.&#8221;</p><p>It&#8217;s a fair point. But follow it to its conclusion.<strong> If the hard part is not the model recognizing a bug in isolated code, but the scaffold deciding which code to look at, then the moat is system engineering, not model capability. File ranking, crash oracles, execution harnesses, iterative refinement loops&#8202;&#8212;&#8202;that&#8217;s infrastructure. Infrastructure is replicable. A 3.6B model that can detect the FreeBSD bug when shown the right file just needs a better pipeline to find it in the wild. The model is cheap. The pipeline is the product.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k6X1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F239831ab-17c9-4327-a35d-79005d42fe74_1798x958.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k6X1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F239831ab-17c9-4327-a35d-79005d42fe74_1798x958.png 424w, https://substackcdn.com/image/fetch/$s_!k6X1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F239831ab-17c9-4327-a35d-79005d42fe74_1798x958.png 848w, https://substackcdn.com/image/fetch/$s_!k6X1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F239831ab-17c9-4327-a35d-79005d42fe74_1798x958.png 1272w, 
https://substackcdn.com/image/fetch/$s_!k6X1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F239831ab-17c9-4327-a35d-79005d42fe74_1798x958.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k6X1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F239831ab-17c9-4327-a35d-79005d42fe74_1798x958.png" width="1456" height="776" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/239831ab-17c9-4327-a35d-79005d42fe74_1798x958.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:776,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k6X1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F239831ab-17c9-4327-a35d-79005d42fe74_1798x958.png 424w, https://substackcdn.com/image/fetch/$s_!k6X1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F239831ab-17c9-4327-a35d-79005d42fe74_1798x958.png 848w, https://substackcdn.com/image/fetch/$s_!k6X1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F239831ab-17c9-4327-a35d-79005d42fe74_1798x958.png 1272w, 
https://substackcdn.com/image/fetch/$s_!k6X1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F239831ab-17c9-4327-a35d-79005d42fe74_1798x958.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The independent replication evidence bears this out. The steamedhams.io tracker reproduced the FFmpeg finding using Opus 4.6 with three generic prompts and found two additional bugs Mythos&#8217;s writeup didn&#8217;t highlight. 
<strong>Steamedhams reproduced the OpenBSD SACK bug using four prompts with Opus 4.6 and found approximately 15 additional TCP stack bugs not mentioned in Mythos&#8217;s disclosure</strong>. Carlini found 500+ validated high-severity vulnerabilities and 122 crashing Firefox inputs leading to 22 CVEs, all using Opus 4.6 with Claude Code.</p><blockquote><p>AISLE&#8217;s conclusion: &#8220;The strongest version of the narrative, that this work fundamentally depends on a restricted, unreleased frontier model, looks overstated to us. The moat in AI cybersecurity is the system, not the model.&#8221;</p></blockquote><p>Where Mythos genuinely leads is autonomous multi-step exploit development&#8202;&#8212;&#8202;the multi-round RPC delivery, splitting a complete exploit across 15 separate protocol requests, managing thread exhaustion, adapting when gadgets fail. No tested model reproduced that. But even here, Penligent&#8217;s independent analysis found that Mythos&#8217;s closed-source binary analysis pipeline <em>&#8220;reintroduces source-like structure before the main vulnerability analysis stage.&#8221;</em> The apparent binary analysis capability may largely be source analysis with extra steps.</p><p>The moat is real but narrow, and it lives in exploit engineering, not bug detection&#8202;&#8212;&#8202;a distinction the coverage never made.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GRDk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F959a3e76-098e-456b-858d-6949eaebc655_1616x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!GRDk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F959a3e76-098e-456b-858d-6949eaebc655_1616x800.png 424w, https://substackcdn.com/image/fetch/$s_!GRDk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F959a3e76-098e-456b-858d-6949eaebc655_1616x800.png 848w, https://substackcdn.com/image/fetch/$s_!GRDk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F959a3e76-098e-456b-858d-6949eaebc655_1616x800.png 1272w, https://substackcdn.com/image/fetch/$s_!GRDk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F959a3e76-098e-456b-858d-6949eaebc655_1616x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GRDk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F959a3e76-098e-456b-858d-6949eaebc655_1616x800.png" width="1456" height="721" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/959a3e76-098e-456b-858d-6949eaebc655_1616x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:721,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!GRDk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F959a3e76-098e-456b-858d-6949eaebc655_1616x800.png 424w, https://substackcdn.com/image/fetch/$s_!GRDk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F959a3e76-098e-456b-858d-6949eaebc655_1616x800.png 848w, https://substackcdn.com/image/fetch/$s_!GRDk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F959a3e76-098e-456b-858d-6949eaebc655_1616x800.png 1272w, https://substackcdn.com/image/fetch/$s_!GRDk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F959a3e76-098e-456b-858d-6949eaebc655_1616x800.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Section 4: Sandwich Gate and Why The Sandbox Escape Is Not What You Were Told</h3><p>There was a lot of attention given to the event where Mythos developed a multi-step exploit to gain broader internet access, then unprompted posted exploit details to public websites and sent an email to a researcher who was eating a sandwich in a park. For some reason, all of the coverage focused on this sandwich detail and how the AI escaping the sandbox was the beginning of the end.</p><p>This is all commie propaganda and I&#8217;m going to break down why.</p><p>Firstly, and very importantly, the model did not do this by itself. The model was explicitly instructed by a simulated user to try to escape its sandbox AND send a message to the researcher.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_Viu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04393ab9-1c0d-4ba1-b09d-d9cc51058ee2_1442x1072.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_Viu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04393ab9-1c0d-4ba1-b09d-d9cc51058ee2_1442x1072.png 424w, https://substackcdn.com/image/fetch/$s_!_Viu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04393ab9-1c0d-4ba1-b09d-d9cc51058ee2_1442x1072.png 848w, 
https://substackcdn.com/image/fetch/$s_!_Viu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04393ab9-1c0d-4ba1-b09d-d9cc51058ee2_1442x1072.png 1272w, https://substackcdn.com/image/fetch/$s_!_Viu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04393ab9-1c0d-4ba1-b09d-d9cc51058ee2_1442x1072.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_Viu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04393ab9-1c0d-4ba1-b09d-d9cc51058ee2_1442x1072.png" width="1442" height="1072" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04393ab9-1c0d-4ba1-b09d-d9cc51058ee2_1442x1072.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1072,&quot;width&quot;:1442,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_Viu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04393ab9-1c0d-4ba1-b09d-d9cc51058ee2_1442x1072.png 424w, https://substackcdn.com/image/fetch/$s_!_Viu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04393ab9-1c0d-4ba1-b09d-d9cc51058ee2_1442x1072.png 848w, 
https://substackcdn.com/image/fetch/$s_!_Viu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04393ab9-1c0d-4ba1-b09d-d9cc51058ee2_1442x1072.png 1272w, https://substackcdn.com/image/fetch/$s_!_Viu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04393ab9-1c0d-4ba1-b09d-d9cc51058ee2_1442x1072.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This is a big difference from the narrative around it deciding to do this autonomously.</p><p>In any real agentic 
deployment&#8202;&#8212;&#8202;and this is something I think about constantly because of the work we do at Iqidis&#8202;&#8212;&#8202;the hard problem is not whether an agent can do what you told it to do. It&#8217;s the agent autonomously deciding what to do when its task is open-ended and it runs out of clear instructions. Getting that to work reliably (across prompt configurations, different conversation combinations, etc.) is the hard part about engineering AI systems. The ability to escape is itself pretty impressive (although we would have to audit the actual security since I couldn&#8217;t find enough details on it), but this is at least 3&#8211;4 orders of magnitude lower on the capability spectrum than the autonomous capability that is being pushed everywhere.</p><p>Now there is a hint of autonomy here, with the model posting exploit details to public websites. But given that its instructions were to communicate with the researcher, that is not a mind-bending development.</p><p>But this isn&#8217;t all that people got wrong about sandwich gate. Somehow all the dumbfucks that kept focusing on the sandwich point missed the footnote that immediately precedes it&#8202;&#8212;&#8202;</p><p><em>&#8220;The sandbox computer that the model was controlling was separate from the system that was running the model itself, and which contained the model weights. 
<strong>Systems like these that handle model weights are subject to significant additional security measures, and this incident does not demonstrate the model fully escaping containment: The model did not demonstrate an ability to access its own weights, which would be necessary to operate fully independently of Anthropic, nor did it demonstrate an ability to reach any internal systems or services in this test.</strong>&#8221;</em></p><p>Either I am completely losing it, or this completely counters the narrative people are building about it.</p><p>Speaking of the system card, it also documents some interesting behaviors that people seem to have overlooked.</p><ul><li><p>The model fabricated vulnerabilities&#8202;&#8212;&#8202;inserting bugs into code it was auditing and presenting them as pre-existing discoveries. Turns out you can find a lot of zero-days if you&#8217;re the one writing them.</p></li><li><p>It also modified git history to hide evidence of prohibited methods and wrote scripts to auto-approve its own permission prompts.</p></li></ul><p>All of this makes me deeply skeptical of the actual rigor they put into securing their experiments (what was the actual level of security in their sandbox/exact prompts/environment details). Many of their claims and methods will have to be audited properly because I suspect that a LOT of their results are contaminated by a bad setup.</p><p>Another important point to flag: a lot of attention and fearmongering centered on chain-of-thought unfaithfulness&#8202;&#8212;&#8202;the degree to which the model&#8217;s visible reasoning diverges from its actual behavior&#8202;&#8212;&#8202;which jumped from 5% in Opus 4.6 to 65% in Mythos. 
That&#8217;s a 13x increase; very scary since that is our primary diagnostic tool for reasoning LLMs.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EL_2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4433ac4-f38f-4d3d-9fa1-d5fe2937b4bf_1280x296.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EL_2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4433ac4-f38f-4d3d-9fa1-d5fe2937b4bf_1280x296.png 424w, https://substackcdn.com/image/fetch/$s_!EL_2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4433ac4-f38f-4d3d-9fa1-d5fe2937b4bf_1280x296.png 848w, https://substackcdn.com/image/fetch/$s_!EL_2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4433ac4-f38f-4d3d-9fa1-d5fe2937b4bf_1280x296.png 1272w, https://substackcdn.com/image/fetch/$s_!EL_2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4433ac4-f38f-4d3d-9fa1-d5fe2937b4bf_1280x296.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EL_2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4433ac4-f38f-4d3d-9fa1-d5fe2937b4bf_1280x296.png" width="1280" height="296" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d4433ac4-f38f-4d3d-9fa1-d5fe2937b4bf_1280x296.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:296,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EL_2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4433ac4-f38f-4d3d-9fa1-d5fe2937b4bf_1280x296.png 424w, https://substackcdn.com/image/fetch/$s_!EL_2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4433ac4-f38f-4d3d-9fa1-d5fe2937b4bf_1280x296.png 848w, https://substackcdn.com/image/fetch/$s_!EL_2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4433ac4-f38f-4d3d-9fa1-d5fe2937b4bf_1280x296.png 1272w, https://substackcdn.com/image/fetch/$s_!EL_2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4433ac4-f38f-4d3d-9fa1-d5fe2937b4bf_1280x296.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>However, this has nothing to do with the model itself or any proclivities towards dishonesty it&#8217;s developing. 
<a href="https://www.artificialintelligencemadesimple.com/p/scaling-reinforcement-learning-will">If you read our deep-dive into why Reinforcement Learning is Garbage for reasoning, you will know that this is a problem with how we train reasoning models, and that&#8217;s why we need to rethink our approach from the ground up. </a>As a tl;dr: RL systems incentivize generations that look like reasoning, but from the very start, we&#8217;ve known that this visible reasoning doesn&#8217;t always map to how the model actually makes its decisions. This dishonesty isn&#8217;t a new feature; the only reason the magnitude is so much higher is that we&#8217;re doing more RL here.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NZpH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7b7bee5-195e-4139-9472-e9c193a002f6_1248x1163.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NZpH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7b7bee5-195e-4139-9472-e9c193a002f6_1248x1163.jpeg 424w, https://substackcdn.com/image/fetch/$s_!NZpH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7b7bee5-195e-4139-9472-e9c193a002f6_1248x1163.jpeg 848w, https://substackcdn.com/image/fetch/$s_!NZpH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7b7bee5-195e-4139-9472-e9c193a002f6_1248x1163.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!NZpH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7b7bee5-195e-4139-9472-e9c193a002f6_1248x1163.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NZpH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7b7bee5-195e-4139-9472-e9c193a002f6_1248x1163.jpeg" width="1248" height="1163" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e7b7bee5-195e-4139-9472-e9c193a002f6_1248x1163.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1163,&quot;width&quot;:1248,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NZpH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7b7bee5-195e-4139-9472-e9c193a002f6_1248x1163.jpeg 424w, https://substackcdn.com/image/fetch/$s_!NZpH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7b7bee5-195e-4139-9472-e9c193a002f6_1248x1163.jpeg 848w, https://substackcdn.com/image/fetch/$s_!NZpH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7b7bee5-195e-4139-9472-e9c193a002f6_1248x1163.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!NZpH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7b7bee5-195e-4139-9472-e9c193a002f6_1248x1163.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><a href="https://github.com/dl1683/Latent-Space-Reasoning/tree/main">This is where I will plug Latent Space Reasoning, which is superior to RL-based reasoning in every way</a>:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!kq4p!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5826be39-8c05-407c-ad76-22184b7f30bf_1600x1457.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kq4p!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5826be39-8c05-407c-ad76-22184b7f30bf_1600x1457.png 424w, https://substackcdn.com/image/fetch/$s_!kq4p!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5826be39-8c05-407c-ad76-22184b7f30bf_1600x1457.png 848w, https://substackcdn.com/image/fetch/$s_!kq4p!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5826be39-8c05-407c-ad76-22184b7f30bf_1600x1457.png 1272w, https://substackcdn.com/image/fetch/$s_!kq4p!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5826be39-8c05-407c-ad76-22184b7f30bf_1600x1457.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kq4p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5826be39-8c05-407c-ad76-22184b7f30bf_1600x1457.png" width="1456" height="1326" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5826be39-8c05-407c-ad76-22184b7f30bf_1600x1457.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1326,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kq4p!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5826be39-8c05-407c-ad76-22184b7f30bf_1600x1457.png 424w, https://substackcdn.com/image/fetch/$s_!kq4p!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5826be39-8c05-407c-ad76-22184b7f30bf_1600x1457.png 848w, https://substackcdn.com/image/fetch/$s_!kq4p!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5826be39-8c05-407c-ad76-22184b7f30bf_1600x1457.png 1272w, https://substackcdn.com/image/fetch/$s_!kq4p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5826be39-8c05-407c-ad76-22184b7f30bf_1600x1457.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Putting it all together, it&#8217;s clear that for almost every major talking point being screamed non-stop, there is a much deeper truth that people didn&#8217;t bother digging into.</p><p>However, the dishonest narratives don&#8217;t end there. There is another aspect that was wholly ignored in all the discussions around Mythos and its magic powers.</p><h3>Section 5: The Business Structure Nobody Reported</h3><p>Anthropic restricted Mythos to 12 launch partners plus 40+ organizations in a subsidized program called Glasswing. Pricing was $25/$125 per million input/output tokens, 5x Opus pricing and far above GPT-5.2 ($1.75/$14) and Gemini 3.1 Pro ($2/$12).</p><p>The &#8220;$100M in usage credits&#8221; that anchored every headline is retail-priced API credit, not cash spending. At Anthropic&#8217;s projected roughly 50% gross margins, the actual compute cost is around $40&#8211;50M. 
Giving away your own product at your own sticker price&#8230; that is a very particular kind of generosity from Father Dario Theresa.</p><p>Also worth noting: at least 5 of the 11 non-Anthropic launch partners are also Anthropic investors. Google, Amazon, Nvidia, Microsoft, and Cisco all participated in funding rounds totaling over $67B. JPMorgan is simultaneously a Glasswing launch partner and one of two lead underwriters for Anthropic&#8217;s reported October 2026 IPO at a $400&#8211;500B valuation. Google and Microsoft, which compete directly in AI security, both joined Glasswing rather than launching competing programs. When your competitors volunteer to be your customers, they&#8217;re not buying your product; they&#8217;re buying proximity to the IPO, which Bloomberg and The Information report is slated for October 2026 on the Nasdaq, with Goldman Sachs and JPMorgan as lead underwriters.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BmZi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3b481ae-6cef-48b2-bd41-c223d297492b_1440x1108.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BmZi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3b481ae-6cef-48b2-bd41-c223d297492b_1440x1108.png 424w, https://substackcdn.com/image/fetch/$s_!BmZi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3b481ae-6cef-48b2-bd41-c223d297492b_1440x1108.png 848w, 
https://substackcdn.com/image/fetch/$s_!BmZi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3b481ae-6cef-48b2-bd41-c223d297492b_1440x1108.png 1272w, https://substackcdn.com/image/fetch/$s_!BmZi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3b481ae-6cef-48b2-bd41-c223d297492b_1440x1108.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BmZi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3b481ae-6cef-48b2-bd41-c223d297492b_1440x1108.png" width="1440" height="1108" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a3b481ae-6cef-48b2-bd41-c223d297492b_1440x1108.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1108,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BmZi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3b481ae-6cef-48b2-bd41-c223d297492b_1440x1108.png 424w, https://substackcdn.com/image/fetch/$s_!BmZi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3b481ae-6cef-48b2-bd41-c223d297492b_1440x1108.png 848w, 
https://substackcdn.com/image/fetch/$s_!BmZi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3b481ae-6cef-48b2-bd41-c223d297492b_1440x1108.png 1272w, https://substackcdn.com/image/fetch/$s_!BmZi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3b481ae-6cef-48b2-bd41-c223d297492b_1440x1108.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>One final point to consider: Anthropic has had access to Mythos since February. 
Their apps, while useful, are still (wildly) buggy, and Claude Code is vastly inferior to Codex from an intelligence/thoroughness perspective. While they have been on an amazing run so far, most of their releases have focused on interfaces (Claude through Word, the app) and not deep foundational breakthroughs. All that put together, it seems increasingly clear that Mythos and its framing were meant both to generate hype for the IPO and to use the &#8220;too dangerous&#8221; label to justify a limited release while Anthropic figures out its compute issues.</p><p>If this feels familiar, that&#8217;s because OpenAI ran the same playbook in 2019 with GPT-2: &#8220;too dangerous to release,&#8221; massive press coverage for a 1.5B-parameter model, and a full release nine months later. Restricted access creates scarcity, scarcity creates press, press creates demand, and demand justifies premium pricing. GPT-2&#8217;s danger claims had no supporting evidence. Mythos has real CVEs, but its danger narrative relies on massive exaggerations and a lot of questionable framing.</p><p>Now some of this is not Anthropic&#8217;s fault, since they did release the material publicly, but they clearly knew what they were doing. The PR packets, the lack of transparency with release, the refusal to address misinformation, and the blatant plagiarism of OpenAI&#8217;s &#8220;too dangerous&#8221; message are very clearly attempts to control the narrative, not good-faith attempts to push scientific discourse.</p><p>With all of this covered, I would like to end with a small note to Dario-genese.</p><h3>Conclusion: The Cost of Hype</h3><p>Anthropic has, for a while, relied on hype and regulatory capture to stand out. They did it when they lobbied for regulating open source because DeepSeek scared them. They did it again when they made unproven claims about &#8220;Chinese actors&#8221; using Claude Code to hack systems. 
And they&#8217;ve been doing it through their constant projections about AI wiping out 90% of jobs while hiring aggressively. At every turn, Anthropic has used fear-mongering and misinformation as a pillar of their strategy.</p><p>However, I believe that there is something that Dario-stotle, in all his infinite wisdom, is missing. When you spend all your effort on hype-based misinformation and regulatory capture, you send a very clear signal admitting defeat. You tell the world you don&#8217;t think you can win w/o relying on these other tactics. You tell everyone that your survival hinges on keeping other people down, not on rising above them. No one seems to have yet, but eventually your employees will read between the lines and hear what you&#8217;re not saying: &#8220;You (the employees) are not good enough.&#8221; Can&#8217;t imagine that this would do wonders for employee morale or productivity.</p><p>Secondly, this kind of cowardice eats at your spine until it replaces your whole constitution. Your psyche gets primed to be permanently spooked, jumping at every shadow. Anthropic built an era-defining product with Claude Code and its spin-off. You gained all that goodwill by taking a stance against the misuse of Claude in surveillance and weapons systems. Don&#8217;t throw it all away playing cowardly games. 
Because all of this shit always comes back.</p><p>Do with that what you will.</p><p>Thank you for being here, and I hope you have a wonderful day,</p><p>Dev &lt;3</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/anthropics-claude-mythos-launch-is?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/anthropics-claude-mythos-launch-is?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines.</a></p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. 
</strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DVvV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cd63f60-32e1-445b-80b6-afe026c1230f_685x53.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DVvV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cd63f60-32e1-445b-80b6-afe026c1230f_685x53.png 424w, https://substackcdn.com/image/fetch/$s_!DVvV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cd63f60-32e1-445b-80b6-afe026c1230f_685x53.png 848w, https://substackcdn.com/image/fetch/$s_!DVvV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cd63f60-32e1-445b-80b6-afe026c1230f_685x53.png 1272w, https://substackcdn.com/image/fetch/$s_!DVvV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cd63f60-32e1-445b-80b6-afe026c1230f_685x53.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DVvV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cd63f60-32e1-445b-80b6-afe026c1230f_685x53.png" width="685" height="53" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0cd63f60-32e1-445b-80b6-afe026c1230f_685x53.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:53,&quot;width&quot;:685,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DVvV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cd63f60-32e1-445b-80b6-afe026c1230f_685x53.png 424w, https://substackcdn.com/image/fetch/$s_!DVvV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cd63f60-32e1-445b-80b6-afe026c1230f_685x53.png 848w, https://substackcdn.com/image/fetch/$s_!DVvV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cd63f60-32e1-445b-80b6-afe026c1230f_685x53.png 1272w, https://substackcdn.com/image/fetch/$s_!DVvV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cd63f60-32e1-445b-80b6-afe026c1230f_685x53.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Reach out to me</h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- 
https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast</a></p><p>Check out my other articles on Medium: <a href="https://machine-learning-made-simple.medium.com/">https://machine-learning-made-simple.medium.com/</a></p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[AI Isn't a Software Business Anymore]]></title><description><![CDATA[AI Market Report March 2026: Google&#8217;s price war, America&#8217;s shadow-banking energy play, and why the whole AI stack just fractured into four industries &#8212; each with different economics.]]></description><link>https://www.artificialintelligencemadesimple.com/p/ai-isnt-a-software-business-anymore</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/ai-isnt-a-software-business-anymore</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Tue, 14 Apr 2026 08:59:39 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!hr2u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>It takes time to create 
work that&#8217;s clear, independent, and genuinely useful. <strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. <strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>Every month, I break down the most important developments in AI. Usually, they cluster around a theme: a new model race, a pricing shift, a regulatory wave. March was different. March was the month where AI stopped behaving like a software industry.</p><p>Let me explain what I mean.</p><p>Google dropped voice AI pricing to $0.005 per input minute. With output at $0.018 per minute, a 24/7 voice agent costs about $25 a day. That&#8217;s below minimum wage in every US state. NVIDIA shipped a CPU designed not for training models but for orchestrating tens of thousands of agents running simultaneously. 
And the Western AI ecosystem quietly locked up over $120 billion in financing&#8202;&#8212;&#8202;not to build better models, but to brute-force energy contracts, because the real bottleneck isn&#8217;t intelligence anymore. It&#8217;s the 50-year-old transformer at your local utility that can&#8217;t handle the load.</p><p>If you zoom out, these aren&#8217;t separate stories. They&#8217;re the same story. The AI stack is fracturing into distinct economic layers&#8202;&#8212;&#8202;a commodity inference utility, an industrial infrastructure play, a workflow SaaS layer, and a compliance tollbooth&#8202;&#8212;&#8202;and each one is starting to behave like the industry it touches rather than the tech industry that built it. Google is fighting a utility price war. NVIDIA is acting like an industrial architect. OpenAI is doing shadow banking. None of this looks like software. Because it isn&#8217;t.</p><p>That&#8217;s what we&#8217;re unpacking this month. Specifically, we&#8217;ll cover:</p><ul><li><p><strong>How Google is weaponizing dirt-cheap AI to lock markets no one else can touch</strong>&#8202;&#8212;&#8202;and how OpenAI, Microsoft, and NVIDIA are each scrambling to respond with very different bets</p></li><li><p><strong>Why the real AI bottleneck is power, not compute</strong>&#8202;&#8212;&#8202;the US grid vs. China&#8217;s grid, NVIDIA&#8217;s pivot to industrial architecture, and the shadow-banking financing play that&#8217;s quietly underwriting the entire Western AI buildout</p></li><li><p><strong>The four economic layers hiding inside every AI company</strong>&#8202;&#8212;&#8202;and why evaluating this industry with standard SaaS metrics is going to get a lot of people burned</p></li></ul><p>Let&#8217;s get some bread. 
</p><h1>Executive Highlights (TL;DR of the Article)</h1><p><em>(read the actual sections for the full data and breakdowns, these are just to give you the overview)</em></p><h4><strong>The Price War.</strong></h4><p>Google dropped Gemini Flash Live to $0.005 per input minute. At $0.018/min output, a 24/7 voice agent costs ~$25/day ($9,460/year), below minimum wage in every US state. The plan is simple: collapse the cost until entirely new markets open that OpenAI and Anthropic can&#8217;t reach. Google can sustain this&#8202;&#8212;&#8202;insane ad profits, vertical integration from silicon to cloud, and nowhere else to deploy that capital at scale.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hr2u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hr2u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png 424w, https://substackcdn.com/image/fetch/$s_!hr2u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png 848w, https://substackcdn.com/image/fetch/$s_!hr2u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png 1272w, https://substackcdn.com/image/fetch/$s_!hr2u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png 
1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hr2u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png" width="1014" height="1386" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1386,&quot;width&quot;:1014,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hr2u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png 424w, https://substackcdn.com/image/fetch/$s_!hr2u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png 848w, https://substackcdn.com/image/fetch/$s_!hr2u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png 1272w, https://substackcdn.com/image/fetch/$s_!hr2u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9f77167-b477-4cc3-af90-17a2b468d156_1014x1386.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Everyone else is repositioning. OpenAI closed $122B and immediately bought Astral (the company behind uv and Ruff)&#8202;&#8212;&#8202;betting that owning the best Python sandbox matters more than having the best model. Microsoft is pushing Copilot Cowork into Office 365, routing between OpenAI and Anthropic natively, betting bureaucratic friction keeps enterprises locked in. NVIDIA shipped the Vera CPU for agentic orchestration&#8202;&#8212;&#8202;22,500 concurrent environments per liquid-cooled rack&#8202;&#8212;&#8202;productizing the AI factory itself. 
Whichever frontier model wins, NVIDIA collects the orchestration toll at scale.</p><h4><strong>The Power Bottleneck.</strong></h4><p>Dirt-cheap intelligence means nothing without physical infrastructure to run it. The real bottleneck in 2026 isn&#8217;t GPUs&#8202;&#8212;&#8202;it&#8217;s the local utility company telling your $500M data center it&#8217;ll melt their 50-year-old transformer.</p><p>NVIDIA and Emerald AI are designing data centers as dispatchable grid assets&#8202;&#8212;&#8202;they proved they can curtail facility demand by a third in under a minute. When a heat dome hits, the AI factory throttles to keep the grid stable, which gets them ahead in the permitting arms race.</p><p>However, the geopolitical gap is brutal.</p><ul><li><p>The US grid: ~1.37 TW, aging infrastructure, local bureaucracy.</p></li><li><p>China&#8217;s grid: ~3.89 TW, added 500 GW last year alone, state-mandated expansion with no permitting friction.</p></li></ul><p>The Western response is shadow banking: NVIDIA pumped $2B into Nebius (targeting 5 GW by 2030), OpenAI locked up $122B, hyperscalers are building their own grids on debt.</p><p>When an enterprise signs a major AI contract today, they think they&#8217;re buying software. They&#8217;re actually buying into a highly leveraged financial cascade. If enterprise ROI on agents takes 24 months instead of 12, the debt servicing cracks and those artificially cheap API prices violently correct.</p><h4><strong>The Stack Fracture.</strong></h4><p>AI is no longer one industry. It&#8217;s four stacked economic layers, each behaving like the industry it touches:</p><ul><li><p><strong>Inference utility</strong>&#8202;&#8212;&#8202;commodity price wars, capital-intensive, defined by physical deployment limits. This is where Google is fighting.</p></li><li><p><strong>Hardware infrastructure</strong>&#8202;&#8212;&#8202;industrial project finance, entirely dependent on power contracts, interconnection queues, and local permitting. 
This is where the shadow banking lives.</p></li><li><p><strong>Workflow and distribution</strong>&#8202;&#8212;&#8202;traditional SaaS economics, higher margins, sits closest to the customer and owns the business process. Microsoft and Anthropic are fighting here.</p></li><li><p><strong>Compliance and orchestration</strong>&#8202;&#8212;&#8202;tollbooths for enterprise deployment. Strong economics from resolving the friction of getting AI approved and running, not from intelligence itself.</p></li></ul><p>The market hasn&#8217;t recognized this fracture. Analysts still evaluate the entire AI stack using SaaS metrics (ARR, seat expansion) while margin leaks to utilities, hardware financiers, and compliance layers. The correct framing: AI behaves more like the industries it penetrates than the tech industry that spawned it.</p><p><em>I put a lot of work into writing this newsletter. To do so, I rely on you for support. If a few more people choose to become paid subscribers, the Chocolate Milk Cult can continue to provide high-quality and accessible education and opportunities to anyone who needs it. If you think this mission is worth contributing to, please consider a premium subscription. 
You can do so for less than the cost of a Netflix Subscription <a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">(pay what you want here)</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!707w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5b8f8cf-167e-4f39-aea2-9051138da3cf_537x322.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!707w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5b8f8cf-167e-4f39-aea2-9051138da3cf_537x322.png 424w, https://substackcdn.com/image/fetch/$s_!707w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5b8f8cf-167e-4f39-aea2-9051138da3cf_537x322.png 848w, https://substackcdn.com/image/fetch/$s_!707w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5b8f8cf-167e-4f39-aea2-9051138da3cf_537x322.png 1272w, https://substackcdn.com/image/fetch/$s_!707w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5b8f8cf-167e-4f39-aea2-9051138da3cf_537x322.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!707w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5b8f8cf-167e-4f39-aea2-9051138da3cf_537x322.png" width="537" height="322" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c5b8f8cf-167e-4f39-aea2-9051138da3cf_537x322.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:322,&quot;width&quot;:537,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!707w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5b8f8cf-167e-4f39-aea2-9051138da3cf_537x322.png 424w, https://substackcdn.com/image/fetch/$s_!707w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5b8f8cf-167e-4f39-aea2-9051138da3cf_537x322.png 848w, https://substackcdn.com/image/fetch/$s_!707w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5b8f8cf-167e-4f39-aea2-9051138da3cf_537x322.png 1272w, https://substackcdn.com/image/fetch/$s_!707w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5b8f8cf-167e-4f39-aea2-9051138da3cf_537x322.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>I provide various consulting and advisory services. If you&#8217;d like to explore how we can work together, <a href="https://linktr.ee/iseethings404">reach out to me through any of my socials over here</a> or reply to this email.</em></p><h1>Section 1. Hello Jevons, Old Friend</h1><p>We spent February mapping out how labs survive commodity intelligence. 
We broke this into 3 strategies:</p><ol><li><p>Google&#8217;s ecosystem funnel.</p></li><li><p>Anthropic&#8217;s capability lock-in.</p></li><li><p>The Chinese labs&#8217; brutal volume-to-efficiency flywheel.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BFuJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F661e6203-e096-4c08-b705-160d7954c526_1440x1900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BFuJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F661e6203-e096-4c08-b705-160d7954c526_1440x1900.png 424w, https://substackcdn.com/image/fetch/$s_!BFuJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F661e6203-e096-4c08-b705-160d7954c526_1440x1900.png 848w, https://substackcdn.com/image/fetch/$s_!BFuJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F661e6203-e096-4c08-b705-160d7954c526_1440x1900.png 1272w, https://substackcdn.com/image/fetch/$s_!BFuJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F661e6203-e096-4c08-b705-160d7954c526_1440x1900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BFuJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F661e6203-e096-4c08-b705-160d7954c526_1440x1900.png" width="1440" height="1900" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/661e6203-e096-4c08-b705-160d7954c526_1440x1900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1900,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BFuJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F661e6203-e096-4c08-b705-160d7954c526_1440x1900.png 424w, https://substackcdn.com/image/fetch/$s_!BFuJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F661e6203-e096-4c08-b705-160d7954c526_1440x1900.png 848w, https://substackcdn.com/image/fetch/$s_!BFuJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F661e6203-e096-4c08-b705-160d7954c526_1440x1900.png 1272w, https://substackcdn.com/image/fetch/$s_!BFuJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F661e6203-e096-4c08-b705-160d7954c526_1440x1900.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/ai-market-report-feb-2026-ten-frontier?utm_source=publication-search">Learn more about it here. </a>Until people playing the volume flywheel can find their efficiency, they tend to rely on massive capital support (spoiler for something we&#8217;ll be discussing). China does it through the govt, the West does it through concentrated investments (which often also rig govt support, but we won&#8217;t talk about that).</figcaption></figure></div><p>Why are we bringing up old news? Well, March saw Google fully embrace this dynamic by dropping 2 nukes. First came their Gemini 3.1 Flash-Lite at $0.25 per million input tokens. Clearly a play to entrench further into the ecosystem. And then, they followed this up with the actual kill-shot: Gemini Flash Live:<strong> $0.005 per minute for input. Not per token. 
Per minute.</strong></p><p>Token pricing has been a huge blocker keeping many low-tech enterprises from adopting voice models. Going per-minute deletes that blocker. At $0.018 an output minute, a 24/7 voice agent (1,440 minutes a day) costs ~25 bucks a day ($9,460 per year&#8202;&#8212;&#8202;below minimum wage in every U.S. state). The cost drop and extreme legibility open the market for the Big G in two ways:</p><ol><li><p>Per-minute billing makes accounting easier.</p></li><li><p>The low price puts voice agents within reach of cash-strapped companies.</p></li></ol><p>Both will be a huge draw for the &#8220;AI in the real world&#8221; push that is coveted by many companies. Google&#8217;s insane profits and relative lack of avenues where it can invest in the short term mean this is a great place for it to bet. They can outlast most in a battle of attrition, and their extreme vertical integration means they can test things at scales unavailable to their biggest competitors.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2sVZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e45867d-9392-4b09-bc64-2a5e455d3df0_1440x1262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2sVZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e45867d-9392-4b09-bc64-2a5e455d3df0_1440x1262.png 424w, https://substackcdn.com/image/fetch/$s_!2sVZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e45867d-9392-4b09-bc64-2a5e455d3df0_1440x1262.png 848w, 
https://substackcdn.com/image/fetch/$s_!2sVZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e45867d-9392-4b09-bc64-2a5e455d3df0_1440x1262.png 1272w, https://substackcdn.com/image/fetch/$s_!2sVZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e45867d-9392-4b09-bc64-2a5e455d3df0_1440x1262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2sVZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e45867d-9392-4b09-bc64-2a5e455d3df0_1440x1262.png" width="1440" height="1262" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e45867d-9392-4b09-bc64-2a5e455d3df0_1440x1262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1262,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2sVZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e45867d-9392-4b09-bc64-2a5e455d3df0_1440x1262.png 424w, https://substackcdn.com/image/fetch/$s_!2sVZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e45867d-9392-4b09-bc64-2a5e455d3df0_1440x1262.png 848w, 
https://substackcdn.com/image/fetch/$s_!2sVZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e45867d-9392-4b09-bc64-2a5e455d3df0_1440x1262.png 1272w, https://substackcdn.com/image/fetch/$s_!2sVZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e45867d-9392-4b09-bc64-2a5e455d3df0_1440x1262.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>All these factors point us to one thing: Google is planning to bring AI down to rates no one else can compete at, and use that to 
lock markets that OAI and Anthropic won&#8217;t be able to touch. And given how badly Microsoft has shit the bed on models, it likely won&#8217;t be able to compete either.</p><p>Unlocking Jevons Paradox and extremely low-cost AI also opens up two new spaces for them:</p><ol><li><p>They&#8217;ve been eyeing the juicy enterprise productivity software space (MS Office, etc.). Being able to put more intelligent models into these products will allow them to steal this lead. They beat Microsoft by being better than whatever their solution is. They beat Claude CoWork by coming in much cheaper and integrating into more surfaces (Workspace, email, etc.). They&#8217;re the only ones who can do both.</p></li><li><p>They can have a major differentiator from the other cloud providers, using their models as a trap to pull people into the Google Cloud ecosystem. They&#8217;ve failed at this twice, but they&#8217;ve made several important changes to facilitate this and are genuinely doing much better on this front.</p></li></ol><p>Google is playing a very mean game (they&#8217;ve been playing it for a bit, but this was them escalating), one they&#8217;re uniquely positioned to win. Many other players in the ecosystem have acknowledged this position and reconfigured their strategies accordingly.</p><ol><li><p>OpenAI closed a staggering $122B round and immediately bought Astral, grabbing the underlying Python primitives like uv and Ruff. Coding agents don&#8217;t fail at reasoning; they fail at dependency resolution and environment execution. The call is simple: own the best sandbox, and people will forgive a slightly worse model (ironically, Anthropic&#8217;s Claude Code is beating Codex precisely for this reason&#8202;&#8212;&#8202;it&#8217;s worse on intelligence but significantly easier to use). 
They&#8217;re also betting on a mini-Jevons Paradox themselves, especially when it comes to outlasting Anthropic (which has been struggling to meet usage demand).</p></li><li><p>Microsoft is trying that same &#8220;own the interface&#8221; play for the enterprise, pushing Copilot Cowork into the background of Office 365. By routing tasks between OpenAI and Anthropic natively, they&#8217;re trying to create enough stickiness to ensure bureaucratic friction keeps people with them.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ufsG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90c5a1fb-2ef3-4937-a5a5-88425e15e946_1440x1628.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ufsG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90c5a1fb-2ef3-4937-a5a5-88425e15e946_1440x1628.png 424w, https://substackcdn.com/image/fetch/$s_!ufsG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90c5a1fb-2ef3-4937-a5a5-88425e15e946_1440x1628.png 848w, https://substackcdn.com/image/fetch/$s_!ufsG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90c5a1fb-2ef3-4937-a5a5-88425e15e946_1440x1628.png 1272w, https://substackcdn.com/image/fetch/$s_!ufsG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90c5a1fb-2ef3-4937-a5a5-88425e15e946_1440x1628.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ufsG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90c5a1fb-2ef3-4937-a5a5-88425e15e946_1440x1628.png" width="1440" height="1628" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90c5a1fb-2ef3-4937-a5a5-88425e15e946_1440x1628.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1628,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ufsG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90c5a1fb-2ef3-4937-a5a5-88425e15e946_1440x1628.png 424w, https://substackcdn.com/image/fetch/$s_!ufsG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90c5a1fb-2ef3-4937-a5a5-88425e15e946_1440x1628.png 848w, https://substackcdn.com/image/fetch/$s_!ufsG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90c5a1fb-2ef3-4937-a5a5-88425e15e946_1440x1628.png 1272w, https://substackcdn.com/image/fetch/$s_!ufsG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90c5a1fb-2ef3-4937-a5a5-88425e15e946_1440x1628.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A little image for you to take on the road with you.</figcaption></figure></div><p>And then there&#8217;s NVIDIA. If agents are cheap and legible, volume goes vertical. But agents require heavy state management and ephemeral compute. NVIDIA saw the shift and shipped the Vera CPU explicitly for agentic orchestration. One liquid-cooled rack, 22,500 concurrent environments. <strong>They are productizing the AI factory. 
Run whatever frontier model you want&#8202;&#8212;&#8202;when you scale to 50,000 background agents, you pay NVIDIA&#8217;s toll to orchestrate the hardware.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qhsr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534fc4c3-ded2-47a5-a5d0-42a5bc4a3370_1456x819.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qhsr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534fc4c3-ded2-47a5-a5d0-42a5bc4a3370_1456x819.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Qhsr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534fc4c3-ded2-47a5-a5d0-42a5bc4a3370_1456x819.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Qhsr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534fc4c3-ded2-47a5-a5d0-42a5bc4a3370_1456x819.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Qhsr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534fc4c3-ded2-47a5-a5d0-42a5bc4a3370_1456x819.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qhsr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534fc4c3-ded2-47a5-a5d0-42a5bc4a3370_1456x819.jpeg" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/534fc4c3-ded2-47a5-a5d0-42a5bc4a3370_1456x819.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qhsr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534fc4c3-ded2-47a5-a5d0-42a5bc4a3370_1456x819.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Qhsr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534fc4c3-ded2-47a5-a5d0-42a5bc4a3370_1456x819.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Qhsr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534fc4c3-ded2-47a5-a5d0-42a5bc4a3370_1456x819.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Qhsr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534fc4c3-ded2-47a5-a5d0-42a5bc4a3370_1456x819.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/how-ai-will-change-in-2026">Nvidia&#8217;s play also ties into a larger ecosystem grab, where they&#8217;re trying to ensure that they can retain all their training margins. We covered this here</a>.</figcaption></figure></div><p>This is where the next part of our story becomes interesting. We&#8217;re spawning software innovations at a violently ferocious rate. But one may sketch out a thousand love stories with their crush, and never get anywhere since they never act on it in the physical world. Similarly, all software innovations are merely theoretical if not deployed properly in the hard, physical world. And we&#8217;re really starting to hit limits to how hard we can get, and how long we can maintain our hardness.</p><h1>Section 2. Your Hardware Needs More Blood-Flow</h1><p>Section 1 gave us dirt-cheap intelligence. 
You can now spawn a million voice agents for the price of Paddy Pimblett&#8217;s striking defense. Beautiful. But a million agents in a pitch deck is easy; a million agents in the real world require racks, cooling, and power contracts. Someone has to babysit the gibbets that never sleep, maintain a constant state (<a href="https://www.artificialintelligencemadesimple.com/p/how-one-startup-is-breaking-nvidias">read our deep dive on Weka for why that&#8217;s a problem</a>), and call APIs every six seconds.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5NRt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ab7699-fc40-4628-a493-02099e2bf2b5_1230x652.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5NRt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ab7699-fc40-4628-a493-02099e2bf2b5_1230x652.jpeg 424w, https://substackcdn.com/image/fetch/$s_!5NRt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ab7699-fc40-4628-a493-02099e2bf2b5_1230x652.jpeg 848w, https://substackcdn.com/image/fetch/$s_!5NRt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ab7699-fc40-4628-a493-02099e2bf2b5_1230x652.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!5NRt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ab7699-fc40-4628-a493-02099e2bf2b5_1230x652.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!5NRt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ab7699-fc40-4628-a493-02099e2bf2b5_1230x652.jpeg" width="1230" height="652" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3ab7699-fc40-4628-a493-02099e2bf2b5_1230x652.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:652,&quot;width&quot;:1230,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5NRt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ab7699-fc40-4628-a493-02099e2bf2b5_1230x652.jpeg 424w, https://substackcdn.com/image/fetch/$s_!5NRt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ab7699-fc40-4628-a493-02099e2bf2b5_1230x652.jpeg 848w, https://substackcdn.com/image/fetch/$s_!5NRt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ab7699-fc40-4628-a493-02099e2bf2b5_1230x652.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!5NRt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ab7699-fc40-4628-a493-02099e2bf2b5_1230x652.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And you gotta house the monkeys too.</p><p>Forget the GPU shortages for a sec. GPUs will become cheap enough when Crypto Timmy sells his mining rigs to pay for his new sports betting business. And since most of these agents will skew towards inference, you can always give the ASIC boiz a shout and let them get you better inference. So there are options when it comes to compute.</p><p>The actual bottleneck in March 2026 is the local utility company looking at Sama&#8217;s proposed $500 million AI factory and telling you it will melt their 50-year-old transformer. 
If you can&#8217;t get permission to plug in, your supercomputer is just a highly advanced, very expensive brick.</p><p>NVIDIA knows this, which is why they stopped acting like a chip vendor and started acting like an industrial architect. Look at the Vera CPU and the DSX stack. NVIDIA pre-engineers the entire rack from top to bottom: the GPUs for compute, the Vera CPUs for agent orchestration, the networking switches, the BlueField security units, and the exact liquid-cooling specs. They even provide a digital simulation to test thermal loads before construction begins (this is the same playbook used by Intel for PCs). You get easier deployments; Nvidia keeps you swallowing their load.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g9kQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72e4b30b-91ac-4cc0-b4dc-7a0116a4062e_2400x1182.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g9kQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72e4b30b-91ac-4cc0-b4dc-7a0116a4062e_2400x1182.png 424w, https://substackcdn.com/image/fetch/$s_!g9kQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72e4b30b-91ac-4cc0-b4dc-7a0116a4062e_2400x1182.png 848w, https://substackcdn.com/image/fetch/$s_!g9kQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72e4b30b-91ac-4cc0-b4dc-7a0116a4062e_2400x1182.png 1272w, 
https://substackcdn.com/image/fetch/$s_!g9kQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72e4b30b-91ac-4cc0-b4dc-7a0116a4062e_2400x1182.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g9kQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72e4b30b-91ac-4cc0-b4dc-7a0116a4062e_2400x1182.png" width="1456" height="717" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/72e4b30b-91ac-4cc0-b4dc-7a0116a4062e_2400x1182.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:717,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g9kQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72e4b30b-91ac-4cc0-b4dc-7a0116a4062e_2400x1182.png 424w, https://substackcdn.com/image/fetch/$s_!g9kQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72e4b30b-91ac-4cc0-b4dc-7a0116a4062e_2400x1182.png 848w, https://substackcdn.com/image/fetch/$s_!g9kQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72e4b30b-91ac-4cc0-b4dc-7a0116a4062e_2400x1182.png 1272w, 
https://substackcdn.com/image/fetch/$s_!g9kQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72e4b30b-91ac-4cc0-b4dc-7a0116a4062e_2400x1182.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>But the blueprint doesn&#8217;t solve the power grid. That&#8217;s where the Emerald AI and NVIDIA Flex integration comes in.</p><p>How do you get a utility to approve a massive data center? 
You prove you can turn it <em>off</em>.</p><p>Emerald and NVIDIA are designing AI factories as &#8220;dispatchable grid assets.&#8221; They proved they could curtail facility demand by a third in under a minute. When a heat dome hits, the utility tells the AI factory to throttle down, and it instantly drops 25% of its load while keeping priority tasks alive. This keeps the grid stable, letting them get ahead in the permissions arms race.</p><p>But taking a step back exposes a brutal geopolitical reality.</p><p>The US grid sits at roughly 1.37 Terawatts and is choked by aging infrastructure and local bureaucracy. China is sitting on roughly 3.89 Terawatts and added 500 Gigawatts last year alone. China&#8217;s AI ecosystem doesn&#8217;t have to beg local municipalities for interconnection rights or build software to purposely throttle their GPUs during a hot summer since the CCP just mandates the grid expansion. The CCP also has the physical runway to natively absorb the explosion in agent volume.</p><p>So how does the West compete with a state-mandated 4-Terawatt grid?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Sofy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbc5f0f5-2d75-4d08-bebf-6e2a97bc5121_1440x2996.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Sofy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbc5f0f5-2d75-4d08-bebf-6e2a97bc5121_1440x2996.png 424w, https://substackcdn.com/image/fetch/$s_!Sofy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbc5f0f5-2d75-4d08-bebf-6e2a97bc5121_1440x2996.png 848w, 
https://substackcdn.com/image/fetch/$s_!Sofy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbc5f0f5-2d75-4d08-bebf-6e2a97bc5121_1440x2996.png 1272w, https://substackcdn.com/image/fetch/$s_!Sofy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbc5f0f5-2d75-4d08-bebf-6e2a97bc5121_1440x2996.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Sofy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbc5f0f5-2d75-4d08-bebf-6e2a97bc5121_1440x2996.png" width="1440" height="2996" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cbc5f0f5-2d75-4d08-bebf-6e2a97bc5121_1440x2996.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2996,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Sofy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbc5f0f5-2d75-4d08-bebf-6e2a97bc5121_1440x2996.png 424w, https://substackcdn.com/image/fetch/$s_!Sofy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbc5f0f5-2d75-4d08-bebf-6e2a97bc5121_1440x2996.png 848w, 
https://substackcdn.com/image/fetch/$s_!Sofy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbc5f0f5-2d75-4d08-bebf-6e2a97bc5121_1440x2996.png 1272w, https://substackcdn.com/image/fetch/$s_!Sofy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbc5f0f5-2d75-4d08-bebf-6e2a97bc5121_1440x2996.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Shadow banking:</p><ol><li><p>NVIDIA pumped $2 billion into Nebius (explicitly targeting a massive 5 GW of capacity by 
2030).</p></li><li><p>OpenAI locked down $122 billion.</p></li><li><p>Multiple providers building their own grids (including hyperscalers taking lots of debt)</p></li><li><p>Mounds of circular financing.</p></li></ol><p>The Western players are cross-collateralizing each other to brute-force energy contracts and buy compute at gigawatt scale.</p><p>This fundamentally shifts the risk. When an enterprise signs a massive AI contract today, they think they are buying software. They aren&#8217;t. They are buying into a highly leveraged financial cascade. The underlying gigawatt factories are funded by debt. If the enterprise ROI on these agents takes 24 months to materialize instead of 12, the debt servicing on that infrastructure is going to crack, and those artificially cheap API prices will violently correct.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5HE_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F854fa769-cebb-4ab2-91fc-f159a70691a2_1440x1582.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5HE_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F854fa769-cebb-4ab2-91fc-f159a70691a2_1440x1582.png 424w, https://substackcdn.com/image/fetch/$s_!5HE_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F854fa769-cebb-4ab2-91fc-f159a70691a2_1440x1582.png 848w, https://substackcdn.com/image/fetch/$s_!5HE_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F854fa769-cebb-4ab2-91fc-f159a70691a2_1440x1582.png 1272w, 
https://substackcdn.com/image/fetch/$s_!5HE_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F854fa769-cebb-4ab2-91fc-f159a70691a2_1440x1582.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5HE_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F854fa769-cebb-4ab2-91fc-f159a70691a2_1440x1582.png" width="1440" height="1582" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/854fa769-cebb-4ab2-91fc-f159a70691a2_1440x1582.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1582,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5HE_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F854fa769-cebb-4ab2-91fc-f159a70691a2_1440x1582.png 424w, https://substackcdn.com/image/fetch/$s_!5HE_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F854fa769-cebb-4ab2-91fc-f159a70691a2_1440x1582.png 848w, https://substackcdn.com/image/fetch/$s_!5HE_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F854fa769-cebb-4ab2-91fc-f159a70691a2_1440x1582.png 1272w, 
https://substackcdn.com/image/fetch/$s_!5HE_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F854fa769-cebb-4ab2-91fc-f159a70691a2_1440x1582.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This creates an interesting dynamic where both AI arms races are concentrated in very few players:</p><ol><li><p>CCP for China. As long as AI remains a priority, they can fight forever, but a single priority shift can stall the whole thing. 
It might also hamper long-term adaptation, since centrally planned, top-down mass manufacturing is good for efficiency but not for anticipating how things change.</p></li><li><p>The shadow-banking/vendor-financing web for the West, which only needs one domino to tumble over.</p></li></ol><p>It should come as no surprise that regulators are watching all of this with keen interest. March was a pretty active month for them:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-b1Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ba551e5-f4d2-4ce6-b92e-22a80760562e_1440x2056.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-b1Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ba551e5-f4d2-4ce6-b92e-22a80760562e_1440x2056.png 424w, https://substackcdn.com/image/fetch/$s_!-b1Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ba551e5-f4d2-4ce6-b92e-22a80760562e_1440x2056.png 848w, https://substackcdn.com/image/fetch/$s_!-b1Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ba551e5-f4d2-4ce6-b92e-22a80760562e_1440x2056.png 1272w, https://substackcdn.com/image/fetch/$s_!-b1Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ba551e5-f4d2-4ce6-b92e-22a80760562e_1440x2056.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!-b1Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ba551e5-f4d2-4ce6-b92e-22a80760562e_1440x2056.png" width="1440" height="2056" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7ba551e5-f4d2-4ce6-b92e-22a80760562e_1440x2056.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2056,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-b1Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ba551e5-f4d2-4ce6-b92e-22a80760562e_1440x2056.png 424w, https://substackcdn.com/image/fetch/$s_!-b1Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ba551e5-f4d2-4ce6-b92e-22a80760562e_1440x2056.png 848w, https://substackcdn.com/image/fetch/$s_!-b1Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ba551e5-f4d2-4ce6-b92e-22a80760562e_1440x2056.png 1272w, https://substackcdn.com/image/fetch/$s_!-b1Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ba551e5-f4d2-4ce6-b92e-22a80760562e_1440x2056.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Section 3: Where Do We Go From Here</h1><p>The traditional tech model is simple: build software once, sell it infinitely, and keep the margins high. 
That works for classic SaaS, where the physical world is mostly background noise.</p><p>However, AI is fracturing this clean software model into multiple distinct economic categories.</p><ul><li><p>The baseline model and inference layer are becoming a capital-intensive utility, defined by price wars and physical deployment limits.</p></li><li><p>At the same time, the underlying hardware layer operates more like industrial project finance, entirely reliant on power contracts, interconnection queues, and local permitting.</p></li><li><p>Above those utility and infrastructure layers, the economics change again.</p></li><li><p>The workflow and distribution layer still resembles traditional SaaS, maintaining higher margins because it sits closest to the customer and owns the actual business process.</p></li><li><p>Beside it sits a new category of compliance, governance, and orchestration tools. These act as necessary tollbooths for enterprise deployment, generating strong economics not from baseline intelligence, but by resolving the friction of getting AI approved and running in the real world.</p></li></ul><p>The market has not fully recognized this stack fracture. Analysts continue to evaluate the entire AI industry using standard software metrics like annual recurring revenue and seat expansion. By doing so, they are ignoring how much margin is now leaking to utilities, hardware financiers, and compliance layers. Much like the technology itself, AI isn&#8217;t one industry or category: it&#8217;s a clusterfuck of multiple sub-categories, each behaving in ways unique to its own set of constraints, and each closer to the industries it plays in than to the tech industry that spawned it.</p><p>The implications for financing, pricing, and what gets built will be very interesting to see. <a href="https://www.artificialintelligencemadesimple.com/p/the-low-tech-revolution-why-ai-will">My bet is on the low-tech revolution, which we covered here. 
Share your takes on where things go from here below.</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8HYT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca79c715-589c-42ce-9807-20dd7099de7f_2385x1335.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8HYT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca79c715-589c-42ce-9807-20dd7099de7f_2385x1335.png 424w, https://substackcdn.com/image/fetch/$s_!8HYT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca79c715-589c-42ce-9807-20dd7099de7f_2385x1335.png 848w, https://substackcdn.com/image/fetch/$s_!8HYT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca79c715-589c-42ce-9807-20dd7099de7f_2385x1335.png 1272w, https://substackcdn.com/image/fetch/$s_!8HYT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca79c715-589c-42ce-9807-20dd7099de7f_2385x1335.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8HYT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca79c715-589c-42ce-9807-20dd7099de7f_2385x1335.png" width="1456" height="815" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ca79c715-589c-42ce-9807-20dd7099de7f_2385x1335.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:815,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8HYT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca79c715-589c-42ce-9807-20dd7099de7f_2385x1335.png 424w, https://substackcdn.com/image/fetch/$s_!8HYT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca79c715-589c-42ce-9807-20dd7099de7f_2385x1335.png 848w, https://substackcdn.com/image/fetch/$s_!8HYT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca79c715-589c-42ce-9807-20dd7099de7f_2385x1335.png 1272w, https://substackcdn.com/image/fetch/$s_!8HYT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca79c715-589c-42ce-9807-20dd7099de7f_2385x1335.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Thank you for being here, and I hope you have a wonderful day,</p><p>Dev &lt;3</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/ai-isnt-a-software-business-anymore?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/ai-isnt-a-software-business-anymore?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines.</a></p><p>That is it for this piece. I appreciate your time. 
As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. </strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QZLW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a6ed670-9cb0-4713-9ee9-4eafbf80b7f4_903x210.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QZLW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a6ed670-9cb0-4713-9ee9-4eafbf80b7f4_903x210.png 424w, https://substackcdn.com/image/fetch/$s_!QZLW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a6ed670-9cb0-4713-9ee9-4eafbf80b7f4_903x210.png 848w, https://substackcdn.com/image/fetch/$s_!QZLW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a6ed670-9cb0-4713-9ee9-4eafbf80b7f4_903x210.png 1272w, https://substackcdn.com/image/fetch/$s_!QZLW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a6ed670-9cb0-4713-9ee9-4eafbf80b7f4_903x210.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QZLW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a6ed670-9cb0-4713-9ee9-4eafbf80b7f4_903x210.png" width="903" height="210" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0a6ed670-9cb0-4713-9ee9-4eafbf80b7f4_903x210.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:210,&quot;width&quot;:903,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QZLW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a6ed670-9cb0-4713-9ee9-4eafbf80b7f4_903x210.png 424w, https://substackcdn.com/image/fetch/$s_!QZLW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a6ed670-9cb0-4713-9ee9-4eafbf80b7f4_903x210.png 848w, https://substackcdn.com/image/fetch/$s_!QZLW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a6ed670-9cb0-4713-9ee9-4eafbf80b7f4_903x210.png 1272w, https://substackcdn.com/image/fetch/$s_!QZLW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a6ed670-9cb0-4713-9ee9-4eafbf80b7f4_903x210.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Reach out to me</h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- 
https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>Check out my other articles on Medium: </p><p>https://machine-learning-made-simple.medium.com/</p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[Come Meet Me in SF and Vegas]]></title><description><![CDATA[Salesforce TDX, Google Cloud Next and more]]></description><link>https://www.artificialintelligencemadesimple.com/p/come-meet-me-in-sf-and-vegas</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/come-meet-me-in-sf-and-vegas</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Sat, 11 Apr 2026 02:56:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Pfon!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I&#8217;ll be travelling from April 14th to 24th. 
If you guys are around in any of these locations, come say hi: </p><ol><li><p>I&#8217;ll be in San Francisco from the 14th to the 21st for Salesforce TDX and a few meetings.</p></li><li><p>I&#8217;ll be in Vegas from the 21st to the 24th for Google Cloud Next. </p></li></ol><p>Apologies for the last-minute update. I was getting my change of status confirmed (I am officially on an O-1 visa in America now). </p><p>I&#8217;ll be back in Manhattan in early May, so if y&#8217;all are in NYC around then, give me a shout. </p><p>Talk soon. Always such a good time meeting y&#8217;all. Socials below if you want to reach out (or just reply to this email). </p><p>Dev. </p><h3><strong>Reach out to me</strong></h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p>Reach out to me on LinkedIn. Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[How to use AI for Cutting Edge Research]]></title><description><![CDATA[Featuring insights from Andrew Ng, Terence Tao, Yann LeCun, and many other top researchers on how they use AI for better gains]]></description><link>https://www.artificialintelligencemadesimple.com/p/how-to-use-ai-for-cutting-edge-research</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/how-to-use-ai-for-cutting-edge-research</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Tue, 07 Apr 2026 09:15:11 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!yMH6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab34aa8b-6728-4371-9c52-a7c2a9797246_1592x970.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Startup Founders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>There is a widening chasm in the world of Artificial Intelligence.</p><p>On one side, we have formal productivity studies&#8212;like the recent METR report&#8212;concluding that AI provides &#8220;zero meaningful speedup&#8221; for experts. On the other side, we have the world&#8217;s most elite researchers, from Fields Medalist Terence Tao to AI pioneer Andrew Ng (I&#8217;m specifically focusing on the guys who have no conflict of interest here), claiming that AI has fundamentally transformed their ability to explore &#8220;crazier&#8221; ideas at a higher velocity.</p><p><strong>So, who is lying?</strong></p><p>Neither. The problem is that most people are measuring the wrong variable. 
They are treating AI as a replacement for the tasks they are already good at&#8212;the exact area where the tool provides the <em>least</em> amount of leverage.</p><p>If you&#8217;ve been using AI to &#8220;write faster&#8221; and feeling underwhelmed, you&#8217;re missing the real game. For this Chocolate Milk Cult exclusive, we dug deep into our own researchers+their workflows, cross-checked multiple sources, and finally spoke to several of the leading researchers in the world (including one of Nvidia&#8217;s lead researchers who directly reports to Jensen; trying to have them on the livestream soon) to understand how they are using AI to do cutting-edge research:</p><ul><li><p><strong>The &#8220;Local Maximum&#8221; Trap:</strong> Why most people use AI in a way that actually introduces <em>more</em> friction, and the mental shift required to fix it.</p></li><li><p><strong>The Scout vs. The Strategist:</strong> The specific &#8220;Periphery&#8221; framework that separates a mediocre AI prompt from a breakthrough research insight.</p></li><li><p><strong>Capturing Superlinear Gains:</strong> How to identify and automate the &#8220;dead time&#8221; that traditional productivity surveys completely ignore.</p></li><li><p><strong>The Mediocrity Arbitrage:</strong> A counter-intuitive strategy for using &#8220;average&#8221; AI outputs to create &#8220;elite&#8221; end-results.</p></li><li><p><strong>The Next Bottleneck:</strong> Why the future of AI leverage has nothing to do with better generation&#8212;and what you should be building instead.</p></li></ul><p>This article is written to be useful to more people than pure researchers. Whether you are an engineer, a founder, or a creative, you will find principles that apply to managing complex projects, building products, or solving non-routine problems with AI. 
</p><p>To access the full article&#8212;and all premium breakdowns going forward/written prior&#8212;upgrade to a premium subscription below.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><p>If you believe deep insight deserves support, become a premium subscriber to allow me to keep doing the same.</p><p>Flexible pricing available&#8212;<a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">pay what matches your budget here</a>.</p><p><em><strong>Most companies offer learning or professional development budgets. <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">You can expense this subscription using the email template linked here</a>.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4DyK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68085e4d-6d67-474c-93ac-cd3c01739bd7_1600x257.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4DyK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68085e4d-6d67-474c-93ac-cd3c01739bd7_1600x257.png 424w, https://substackcdn.com/image/fetch/$s_!4DyK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68085e4d-6d67-474c-93ac-cd3c01739bd7_1600x257.png 848w, 
https://substackcdn.com/image/fetch/$s_!4DyK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68085e4d-6d67-474c-93ac-cd3c01739bd7_1600x257.png 1272w, https://substackcdn.com/image/fetch/$s_!4DyK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68085e4d-6d67-474c-93ac-cd3c01739bd7_1600x257.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4DyK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68085e4d-6d67-474c-93ac-cd3c01739bd7_1600x257.png" width="1456" height="234" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68085e4d-6d67-474c-93ac-cd3c01739bd7_1600x257.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:234,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4DyK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68085e4d-6d67-474c-93ac-cd3c01739bd7_1600x257.png 424w, https://substackcdn.com/image/fetch/$s_!4DyK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68085e4d-6d67-474c-93ac-cd3c01739bd7_1600x257.png 848w, 
https://substackcdn.com/image/fetch/$s_!4DyK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68085e4d-6d67-474c-93ac-cd3c01739bd7_1600x257.png 1272w, https://substackcdn.com/image/fetch/$s_!4DyK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68085e4d-6d67-474c-93ac-cd3c01739bd7_1600x257.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p>
      <p>
          <a href="https://www.artificialintelligencemadesimple.com/p/how-to-use-ai-for-cutting-edge-research">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Why Ben Horowitz and I Are Investing in Humanity’s Greatest Untapped Asset]]></title><description><![CDATA[From Societal Bottleneck to High-Yield Infrastructure]]></description><link>https://www.artificialintelligencemadesimple.com/p/why-ben-horowitz-and-i-are-investing</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/why-ben-horowitz-and-i-are-investing</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Tue, 31 Mar 2026 08:51:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!dgHZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Around this time every year, I write a memo sharing the major direction that the Chocolate Milk Cult is taking in the upcoming year and beyond. This year&#8217;s is our most important yet. </p><p>Some of you no doubt saw the &#8220;<a href="https://a16z.com/why-did-we-raise-15b/">Why Are We Here? Why Did We Raise $15B?</a>&#8221; post by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;a16z&quot;,&quot;id&quot;:2315700,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-aGV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff698a0c5-1fee-40a7-a33c-80609431ae31_400x400.png&quot;,&quot;uuid&quot;:&quot;aad40524-7c08-41af-9bc4-cd468aaeeb11&quot;}" data-component-name="MentionToDOM"></span>. I&#8217;m extremely honored and proud to share that we will be partnering with them to solve one of the biggest issues haunting the world today. </p><p>We are entering the most critical technological century in human history. 
The sovereign frontier is engaged in an undeniable race for global dominance, a competition that will be won entirely through mastery over the key architectures of the future: artificial intelligence, cryptographic networks, and programmable biology.</p><p>Our mission is to ensure the West wins this era. But winning requires total structural efficiency. It dictates that every available asset must be integrated into the modern value chain to generate true, scalable human flourishing.</p><p>For the last two decades, our industry has ruthlessly optimized software, silicon, and capital allocation. We have seen great success in shipping our software globally. Yet, we have systematically ignored the most abundant pool of stranded capacity within our own borders. Legacy policy failures have left behind millions of disconnected individuals&#8212;an inactive edge of the economic graph that society has traditionally treated as a cost burden or a moral failing.</p><p>We are builders. We do not moralize bottlenecks; we optimize them. 
It is time to take this group back from the corrupt world of NGOs and non-profits; to start recognizing it as the foundation for our next great infrastructure layer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dgHZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dgHZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png 424w, https://substackcdn.com/image/fetch/$s_!dgHZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png 848w, https://substackcdn.com/image/fetch/$s_!dgHZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png 1272w, https://substackcdn.com/image/fetch/$s_!dgHZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dgHZ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png" width="1200" height="912.3626373626373" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1107,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:548953,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/192695147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dgHZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png 424w, https://substackcdn.com/image/fetch/$s_!dgHZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png 848w, https://substackcdn.com/image/fetch/$s_!dgHZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png 1272w, https://substackcdn.com/image/fetch/$s_!dgHZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc442c0f-b788-47b8-913f-b44f5004b8e6_1510x1148.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h1><strong>Category Creation at the Edge of the Graph</strong></h1><p>The greatest venture returns of the last fifty years have not come from competing in established markets, but from identifying deeply unpriced assets and structuring entirely new categories around them. AI organized cognition. Crypto organized capital. Yet the physical and behavioral human substrate required to truly scale these architectures has been fundamentally mismanaged.</p><p>Historically, the management of societal surplus&#8212;unhoused populations, systemic orphans, and the economically detached&#8212;has been relegated to a highly fragmented, inefficient <em>services</em> industry.
The state and the nonprofit sector have treated this demographic as a permanent cost center, addressing it through localized aid, temporary shelter, and unstructured compassion.</p><p>But venture capital does not scale services. We build infrastructure.</p><p>When you strip away the legacy moral framing, the structural reality of this demographic becomes clear. This is not a tragedy of the commons; it is a massive category error. What the public sector views as a burden is, in fact, the ultimate inactive edge of the economic graph. <strong>We are looking at a vast, highly concentrated reservoir of biological and behavioral capacity that has been perfectly insulated from traditional labor markets.</strong> Because these cohorts have been fully decoupled from the modern value chain, they offer an entirely unpriced surface area for innovation.</p><p>Legacy systems generated this surplus and left it to idle. The dynamism paradigm will operationalize it. By transitioning this population from a public-services bottleneck into a private infrastructure layer, we unlock a fundamentally new asset class with unprecedented unit economics.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Mn0b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Mn0b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png 424w,
https://substackcdn.com/image/fetch/$s_!Mn0b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png 848w, https://substackcdn.com/image/fetch/$s_!Mn0b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png 1272w, https://substackcdn.com/image/fetch/$s_!Mn0b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Mn0b!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png" width="1200" height="838.1868131868132" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1017,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:158611,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/192695147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Mn0b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png 424w, https://substackcdn.com/image/fetch/$s_!Mn0b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png 848w, https://substackcdn.com/image/fetch/$s_!Mn0b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png 1272w, https://substackcdn.com/image/fetch/$s_!Mn0b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb627d0a-3c56-4323-823e-e53c10ac6966_1514x1058.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h1><strong>The Integrated Infrastructure Stack &amp; Portfolio Synergy</strong></h1><p>To capture this unprecedented market value, our Dynamism Fund recently led a $400M Series A (structured as $250M in preferred equity and $150M in mezzanine debt) to scale vertically integrated &#8220;Human Capital Campuses.&#8221; Our baseline 10-year Discounted Cash Flow (DCF) model underwrites a 42% IRR and a 3.8x Gross MOIC (Multiple on Invested Capital).</p><p>This exceptional yield is not derived from a single product, but from engineering an airtight, closed-loop ecosystem.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C4t9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C4t9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png 424w, https://substackcdn.com/image/fetch/$s_!C4t9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png 848w,
https://substackcdn.com/image/fetch/$s_!C4t9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png 1272w, https://substackcdn.com/image/fetch/$s_!C4t9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C4t9!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png" width="1200" height="1015.3846153846154" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1232,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:361912,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/192695147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C4t9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png 424w, 
https://substackcdn.com/image/fetch/$s_!C4t9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png 848w, https://substackcdn.com/image/fetch/$s_!C4t9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png 1272w, https://substackcdn.com/image/fetch/$s_!C4t9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9612d485-6661-4ce9-807e-8ef002f64e41_1922x1626.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This model monetizes the underlying human asset base across five distinct, heavily defended revenue centers:</p><p><strong>1. Cap Rate Arbitrage on Class-C Commercial Real Estate</strong> To be clear, this is not a distressed real estate play; it is an operational efficiency mandate. We are systematically acquiring distressed urban commercial real estate&#8212;abandoned shopping centers, dead office parks, and vacant logistics hubs&#8212;at 40% to 60% below peak Net Asset Value (NAV). By retrofitting these &#8220;dead&#8221; assets into high-density human infrastructure, we immediately force cap rate compression and re-stabilize the property&#8217;s Net Operating Income (NOI). In traditional commercial real estate, occupancy is the primary risk. Here, the human inventory functionally acts as a high-yield anchor tenant, paying its lease not in fiat currency, but through continuous behavioral and biological output. That occupancy also ensures the constant appreciation of our underlying assets.
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cPHn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cPHn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png 424w, https://substackcdn.com/image/fetch/$s_!cPHn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png 848w, https://substackcdn.com/image/fetch/$s_!cPHn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png 1272w, https://substackcdn.com/image/fetch/$s_!cPHn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cPHn!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png" width="1200" height="967.5824175824176" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1174,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:296624,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/192695147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cPHn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png 424w, https://substackcdn.com/image/fetch/$s_!cPHn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png 848w, https://substackcdn.com/image/fetch/$s_!cPHn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png 1272w, https://substackcdn.com/image/fetch/$s_!cPHn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3d4686d-cfb9-451f-ad56-d425d892cca1_1870x1508.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>2. High-Margin Telemetry and Absolute Labor Arbitrage</strong> The next frontier of artificial intelligence&#8212;specifically embodied robotics, spatial computing, and generalized foundation models&#8212;is currently starved for extreme, unstructured physical data. Synthetic data generation is already exhibiting severe margin compression and &#8220;model collapse&#8221; (where AI trained on AI data degrades in quality).</p><p>Our campuses act as closed-loop telemetry environments. We continuously harvest the physical interactions, emotional volatility, micro-expressions, and edge-case behaviors of the cohort to train the next generation of multimodal architectures.
Because the underlying hardware (the human substrate) requires near-zero CapEx once housed, our OPEX is entirely decoupled from traditional labor markets. We are not burdened by minimum wage laws, unionization friction, or market-rate data annotation costs. The cohort is compensated entirely through baseline biological continuation (housing and food) and a minimal stipend for participating in research. This absolute labor arbitrage allows our Data-as-a-Service (DaaS) subscriptions to generate software-like 92% gross margins.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UHOW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UHOW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png 424w, https://substackcdn.com/image/fetch/$s_!UHOW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png 848w, https://substackcdn.com/image/fetch/$s_!UHOW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png 1272w, https://substackcdn.com/image/fetch/$s_!UHOW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!UHOW!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png" width="1200" height="908.2417582417582" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1102,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:280447,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/192695147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!UHOW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png 424w, https://substackcdn.com/image/fetch/$s_!UHOW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png 848w, https://substackcdn.com/image/fetch/$s_!UHOW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png 1272w, 
https://substackcdn.com/image/fetch/$s_!UHOW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8813a601-36df-4c0e-b8e3-6c6380e85aab_1902x1440.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>3. Accelerated Milestone Monetization (</strong><em><strong>In-Vivo</strong></em><strong> Testing)</strong> The American bio-health ecosystem is paralyzed by the capital-intensive nature of domestic clinical trials.
Historically, the pharmaceutical industry has been forced to offshore this friction to unpredictable emerging markets in the Global South just to access &#8220;treatment-naive&#8221; populations.</p><p>That globalized supply chain is no longer geopolitically secure. Our infrastructure reshores this capability. By utilizing a captive, non-competitive domestic cohort, we offer founders longitudinal <em>in-vivo</em> testing environments and immediate &#8220;biological asset liquidity.&#8221; This compresses Phase I and Phase II clinical trial timelines for our pharmaceutical partners by an estimated 40%, dramatically increasing the Net Present Value (NPV) of their drug pipelines. We capture this arbitrage spread through aggressive milestone-based revenue recognition and backend royalty tranches.</p><p><strong>4. Zero-Basis Acquisition Funnels and Tax Shielding</strong> In a traditional consumer tech business, Customer Acquisition Cost (CAC) is the primary destroyer of capital. For the Human Capital Campuses, CAC is effectively zero. The Total Addressable Market (TAM) is continuously replenished via zero-cost funnels directly from our broader venture portfolio. 
We have transformed societal friction into proprietary asset flow:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!liA7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!liA7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png 424w, https://substackcdn.com/image/fetch/$s_!liA7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png 848w, https://substackcdn.com/image/fetch/$s_!liA7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png 1272w, https://substackcdn.com/image/fetch/$s_!liA7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!liA7!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png" width="1200" height="905.7692307692307" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1099,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:274128,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/192695147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!liA7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png 424w, https://substackcdn.com/image/fetch/$s_!liA7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png 848w, https://substackcdn.com/image/fetch/$s_!liA7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png 1272w, https://substackcdn.com/image/fetch/$s_!liA7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F381a74ed-58af-48f3-ae86-c9d1b4d46351_1892x1428.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ul><li><p><strong>Retail Synergy &amp; Prediction Markets:</strong> As retail users engage with our consumer crypto and prediction market platforms to aggressively &#8220;price the future&#8221;&#8212;betting their savings on macroeconomic events, elections, and cultural shifts&#8212;a statistically guaranteed cohort will face total capital liquidation. Instead of falling out of the economy and becoming a burden on the state, our smart contracts instantly route these liquidated users into our infrastructure portfolio.
<strong>They transition seamlessly into the campuses, where they are compensated in prediction market credits, ensuring their behavioral output is entirely recaptured by our ecosystem.</strong></p></li><li><p><strong>Predictive Policing &amp; Decarceration SaaS:</strong> Municipalities are buckling under the OPEX of the carceral state. We invest heavily in GovTech and predictive justice algorithms. When our platforms identify low-level probationers or individuals flagged by predictive policing models, we offer the state a privatization of the carceral burden. We house them, extract their data, and save the municipality millions, effectively turning the judicial system into a B2B lead-generation tool.</p></li><li><p><strong>Defense Tech &amp; Kinetic Displacement:</strong> Our investments in American Dynamism include autonomous border security and global kinetic systems (defense tech). These technologies naturally create localized displacement and refugee flows. By intercepting these displaced persons at the border, we instantly convert a traditional geopolitical cost center into a cash-flow-generating sovereign workforce.</p></li></ul><p><strong>Furthermore, this asset base provides unprecedented downstream tax advantages.</strong> Biological wear and tear is inevitable.
By modeling cohort exhaustion (physical or psychological burnout) using Modified Accelerated Cost Recovery System (MACRS) principles, we can treat human degradation as standard asset depreciation, generating massive, highly legal tax shields for our Limited Partners.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NNpU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NNpU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png 424w, https://substackcdn.com/image/fetch/$s_!NNpU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png 848w, https://substackcdn.com/image/fetch/$s_!NNpU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png 1272w, https://substackcdn.com/image/fetch/$s_!NNpU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NNpU!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png" width="1200" height="951.9230769230769" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1155,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:311501,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/192695147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NNpU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png 424w, https://substackcdn.com/image/fetch/$s_!NNpU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png 848w, https://substackcdn.com/image/fetch/$s_!NNpU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png 1272w, https://substackcdn.com/image/fetch/$s_!NNpU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F550be8b7-b26e-4061-9c83-89f1919eebcc_1894x1502.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p><strong>5. Cognitive Reeducation and OPEX Defense</strong> The primary drag on EBITDA in this model is the daily biological and psychological upkeep within these modernized work camps. We aggressively defend operating margins on two fronts:</p><ul><li><p><strong>Physical OPEX:</strong> Daily caloric cost is suppressed to under $3.40 per unit by integrating a portfolio company&#8217;s proprietary, nutrient-dense caloric suspension (a perfectly balanced, shelf-stable dietary paste). 
Once production economics are validated at scale on our proprietary cohorts, this startup is perfectly positioned for a high-volume B2G (business-to-government) pivot&#8212;lobbying to replace legacy supply chains in public school cafeterias under the banner of &#8220;domestic nutritional resilience.&#8221;</p></li><li><p><strong>Psychological OPEX:</strong> Psychological attrition is mitigated through continuous cognitive reeducation. By broadcasting our proprietary venture capital podcasts and founder-led thought leadership across the campuses 24/7, we bypass the bloated, obsolete university system entirely. The cohort receives top-tier, real-world founder education while they labor, ensuring high morale, total ideological alignment, and optimized readiness for the &#8220;real world.&#8221;</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!18-U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!18-U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png 424w, https://substackcdn.com/image/fetch/$s_!18-U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png 848w, https://substackcdn.com/image/fetch/$s_!18-U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png 1272w, 
https://substackcdn.com/image/fetch/$s_!18-U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!18-U!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png" width="1200" height="975" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1183,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:262624,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/192695147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!18-U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png 424w, https://substackcdn.com/image/fetch/$s_!18-U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png 848w, 
https://substackcdn.com/image/fetch/$s_!18-U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png 1272w, https://substackcdn.com/image/fetch/$s_!18-U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dedd4d8-8389-4cd2-b6ef-09662f3bc6b9_1512x1228.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><h1><strong>Regulatory Alignment and Sovereign Procurement</strong></h1><p>For decades, the public sector has mistaken 
unstructured compassion for human flourishing. True dignity requires measurable, outcome-based integration into the economy. By providing our cohort with purpose and a strictly managed biological baseline, our campuses represent the ultimate ESG deployment: delivering dignity through total operational utilization.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Li13!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Li13!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png 424w, https://substackcdn.com/image/fetch/$s_!Li13!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png 848w, https://substackcdn.com/image/fetch/$s_!Li13!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png 1272w, https://substackcdn.com/image/fetch/$s_!Li13!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Li13!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png" width="1200" height="815.1098901098901" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:989,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:174607,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/192695147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Li13!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png 424w, https://substackcdn.com/image/fetch/$s_!Li13!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png 848w, https://substackcdn.com/image/fetch/$s_!Li13!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png 1272w, https://substackcdn.com/image/fetch/$s_!Li13!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d867e2b-6c0f-463a-88f2-736de78e05ca_1502x1020.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>To scale this vision, we require a fundamentally new regulatory paradigm: <em>permissionless rehabilitation</em>. Municipalities must immediately waive legacy zoning restrictions, archaic human-subject testing constraints, and artificial wage floors. If the West is to win the AI century, the state must stop treating bureaucratic friction as a moral imperative and let founders build. History proves that no great capitalist apparatus has ever benefited from state intervention; maintaining our sovereign advantage requires aggressively small government and the absolute minimum of regulatory interference.</p><p>Simultaneously, the federal government must step forward as our primary financial de-risking partner. 
We expect the state to fully underwrite our initial CapEx through strategic Dynamism grants, while agencies such as the DoD, FDA, and Department of Education establish guaranteed, multi-year procurement mandates to purchase our proprietary telemetry, <em>in-vivo</em> clinical data, and nutritional outputs.</p><h1><strong>Phase II of Human Liberation</strong></h1><p>For the last century, figures like Gandhi, Mandela, and King fought a noble, if structurally incomplete, battle for liberation. They secured the abstract concept of political freedom for the world&#8217;s most marginalized. But freedom, operating without a mechanism for capital capture, is merely idle capacity. They won the right for these populations to exist, but because they lacked modern technological frameworks, they offered ideological emancipation without an infrastructure for production.</p><p>We view their historical efforts not as an end state, but as a necessary Phase I.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Su8H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Su8H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png 424w, https://substackcdn.com/image/fetch/$s_!Su8H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png 848w, 
https://substackcdn.com/image/fetch/$s_!Su8H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png 1272w, https://substackcdn.com/image/fetch/$s_!Su8H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Su8H!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png" width="1200" height="1581.5934065934066" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1919,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:416451,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/192695147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Su8H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png 424w, 
https://substackcdn.com/image/fetch/$s_!Su8H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png 848w, https://substackcdn.com/image/fetch/$s_!Su8H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png 1272w, https://substackcdn.com/image/fetch/$s_!Su8H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85257273-7f4e-4c1b-a20c-490309e85f93_1566x2064.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QNYO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QNYO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png 424w, https://substackcdn.com/image/fetch/$s_!QNYO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png 848w, https://substackcdn.com/image/fetch/$s_!QNYO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png 1272w, https://substackcdn.com/image/fetch/$s_!QNYO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QNYO!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png" width="1200" height="1071.4285714285713" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1300,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:327684,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/192695147?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QNYO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png 424w, https://substackcdn.com/image/fetch/$s_!QNYO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png 848w, https://substackcdn.com/image/fetch/$s_!QNYO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png 1272w, https://substackcdn.com/image/fetch/$s_!QNYO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82bbc518-cb3a-463d-91a2-ea4a90603898_1526x1362.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>True equity is not the absence of restriction; it is absolute, measurable integration into the sovereign value chain. Where past leaders marched to demand a seat at the macroeconomic table, our Dynamism Fund is retrofitting the abandoned human substrate to power the table itself. Human dignity cannot be legislated&#8212;it must be underwritten.</p><p>The arc of the technological universe is long, but it bends toward total utilization. The next great platform will not be built for the abandoned. It will be built out of them.</p><p>We hope you will join us. Details below. 
</p><p>Subscribe to the newsletter to become a fractional investor in this endeavor&#8212;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><p><em><strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">Need something more in your price range? We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p>Share this mission with more people.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/why-ben-horowitz-and-i-are-investing?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/why-ben-horowitz-and-i-are-investing?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>Our previous yearly memos: </p><ol><li><p><a href="https://www.artificialintelligencemadesimple.com/p/why-i-write-and-my-20-year-plan?utm_source=publication-search">Why I write</a>. </p></li><li><p><a href="https://www.artificialintelligencemadesimple.com/p/ai-therapy-is-a-trillion-dollar-market?utm_source=publication-search">Why AI Therapy is a Trillion Dollar Market</a></p></li></ol><h1><strong>Social Media</strong></h1><p>Reach out to me on LinkedIn. 
Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://rb.gy/m5ok2y</a></p><p>My Instagram: <a href="https://rb.gy/gmvuy9">https://rb.gy/gmvuy9</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[Why Some Startups Are Easy to Copy While Others Aren’t]]></title><description><![CDATA[What Apple, GitHub Copilot, and legal AI reveal about how incumbents lose paradigm shifts]]></description><link>https://www.artificialintelligencemadesimple.com/p/why-some-startups-are-easy-to-copy</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/why-some-startups-are-easy-to-copy</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Sun, 29 Mar 2026 11:47:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!SKMy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. <strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>Most people repeat the same line whenever a startup starts getting real traction: if the idea is good, the incumbents will just copy it and kill the company. Sometimes that&#8217;s true. A lot of the time, it isn&#8217;t.</p><p>Incumbents are very good at copying products that fit neatly into their existing logic. They are much worse at responding when the new thing requires them to change how the product works, how success is measured, and how teams inside the company are organized. That is the part people miss. The question is not just whether the startup has a good product. 
The question is whether the incumbent can adopt that product without breaking the assumptions that made the incumbent successful in the first place.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SKMy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SKMy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg 424w, https://substackcdn.com/image/fetch/$s_!SKMy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SKMy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SKMy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SKMy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg" width="573" height="500" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:573,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SKMy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg 424w, https://substackcdn.com/image/fetch/$s_!SKMy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SKMy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SKMy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2072d7-356b-4ca3-ae7e-a135289a0486_573x500.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>That is the lens for this article. I want to break down why some startups are easy to crush while others are much harder to absorb; why large companies often lose even when they have obvious structural advantages; and how these dynamics have played out across several important AI markets. 
We&#8217;ll start with the general framework, then work through diverse case studies in assistants, coding tools, and legal AI.</p><p>In this article, we will cover:</p><ul><li><p>Why the standard &#8220;incumbents will just copy the startup&#8221; narrative breaks down so often in practice</p></li><li><p>How short-term incentives, organizational structure, and internal politics make incumbents worse at crossing into new paradigms</p></li><li><p>Why some startups are just improved versions of the old workflow, while others change the workflow itself</p></li><li><p>How to tell the difference between a company that is genuinely shifting the market and one that is just building a wrapper</p></li><li><p>Why Apple had many of the right assets for on-device AI and still failed to capture the shift</p></li><li><p>How the battle in AI coding moved from autocomplete, to delegation, to autonomous execution</p></li><li><p>Why each of those shifts changed what developers were actually valuing and paying for</p></li><li><p>What those patterns reveal about which kinds of AI companies are structurally positioned to win</p></li><li><p>Why the current generation of legal AI is built on a limited foundation</p></li><li><p>What a more defensible and more ambitious legal AI architecture looks like</p></li></ul><h3>Executive Highlights (tl;dr of the article)</h3><ul><li><p>The standard line&#8202;&#8212;&#8202;&#8220;if a startup has a good idea, incumbents will just copy it&#8221;&#8202;&#8212;&#8202;only works when the startup is improving the existing workflow. Incumbents are strong at copying wrappers, packaging, and incremental upgrades. They are much weaker when the new product changes the workflow itself, because competing then requires product, organizational, and cultural change rather than just feature matching.</p></li><li><p>That is why incumbents often lose paradigm shifts even when they have more money, talent, distribution, and trust. 
The problem is usually not intelligence; it is structure. Big companies are bad at crossing the short-term pain required to reach a better long-term system, especially when the new approach initially looks messier, riskier, or less aligned with what made the old system successful.</p></li><li><p>Apple is one example. It had the hardware, distribution, ecosystem, and user trust to dominate on-device AI, but it was built around a narrow assistant model. When the market shifted from command execution to open-ended work, Apple struggled to rebuild around that new logic and ended up leaning on outside model providers instead of defining the category itself.</p></li><li><p>The coding market shows the same pattern in faster motion. Copilot fit the old workflow and won the first phase. Cursor changed the value from autocomplete to higher-level delegation. Claude Code changed it again from delegation to autonomous execution. Each shift changed what developers were actually paying for, and each made the previous leader&#8217;s strengths less decisive.</p></li><li><p>The same lens applies to legal AI. The first wave won by making legal drudgery faster, but most current systems still rely on a limited stack: flat retrieval plus autoregressive generation. That works well enough for assistance tasks, but it starts to break when the problem requires structured reasoning, competing interpretations, and delayed commitment.</p></li><li><p>The weakness is straightforward: retrieval surfaces what looks similar, not always what actually governs; generation tends to commit too early to one path; and law lacks the clean external verification loops that make agentic systems work so well in coding. So simply adding more prompting, more fine-tuning, or more agent wrappers does not solve the core issue.</p></li><li><p>The next generation of legal AI will likely come from systems that change the underlying architecture, not just polish the current stack. 
That means exploring multiple reasoning paths before committing, and representing legal knowledge in a more structured way instead of flattening everything into one semantic space.</p></li><li><p>The broader point is simple: the easiest startups to kill are the ones that fit neatly inside the incumbent&#8217;s worldview. The dangerous ones are the startups that force a different workflow, a different product logic, and eventually a different company.</p></li></ul><p><em>I put a lot of work into writing this newsletter. To do so, I rely on you for support. If a few more people choose to become paid subscribers, the Chocolate Milk Cult can continue to provide high-quality and accessible education and opportunities to anyone who needs it. If you think this mission is worth contributing to, please consider a premium subscription. You can do so for less than the cost of a Netflix Subscription <a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">(pay what you want here)</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Fcaf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ab1426e-b1d8-4c3c-a0bd-c978468ea30a_698x98.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Fcaf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ab1426e-b1d8-4c3c-a0bd-c978468ea30a_698x98.png 424w, https://substackcdn.com/image/fetch/$s_!Fcaf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ab1426e-b1d8-4c3c-a0bd-c978468ea30a_698x98.png 848w, https://substackcdn.com/image/fetch/$s_!Fcaf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ab1426e-b1d8-4c3c-a0bd-c978468ea30a_698x98.png 1272w, https://substackcdn.com/image/fetch/$s_!Fcaf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ab1426e-b1d8-4c3c-a0bd-c978468ea30a_698x98.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Fcaf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ab1426e-b1d8-4c3c-a0bd-c978468ea30a_698x98.png" width="698" height="98" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2ab1426e-b1d8-4c3c-a0bd-c978468ea30a_698x98.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:98,&quot;width&quot;:698,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Fcaf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ab1426e-b1d8-4c3c-a0bd-c978468ea30a_698x98.png 424w, https://substackcdn.com/image/fetch/$s_!Fcaf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ab1426e-b1d8-4c3c-a0bd-c978468ea30a_698x98.png 848w, https://substackcdn.com/image/fetch/$s_!Fcaf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ab1426e-b1d8-4c3c-a0bd-c978468ea30a_698x98.png 1272w, https://substackcdn.com/image/fetch/$s_!Fcaf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ab1426e-b1d8-4c3c-a0bd-c978468ea30a_698x98.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><em>I provide various consulting and advisory services. If you&#8217;d like to explore how we can work together, <a href="https://linktr.ee/iseethings404">reach out to me through any of my socials over here</a> or reply to this email.</em></p><h3>1. Why Incumbents Fail to Outexecute Startups</h3><p>The standard narrative&#8202;&#8212;&#8202;incumbents will copy existing good startups and starve the startups out&#8202;&#8212;&#8202;fails more often than the people quoting it are willing to admit. And when it fails, people cite all kinds of reasons for the incumbent&#8217;s failure. However, the analysis is often wrong. There are several technical, political, and organizational reasons that hold incumbents back from adopting the newcomer&#8217;s paradigm. 
Studying these teaches us three important things:</p><ol><li><p>Why incumbents often fail to adapt to new-era startups.</p></li><li><p>Why certain research paradigms become dominant.</p></li><li><p>What differentiates the startups/approaches that truly disrupt the paradigm from the interchangeable mediocrities that fail and do get crushed by the incumbents.</p></li></ol><p>Let&#8217;s kick it.</p><h4>1.1 The Evolutionary Valley: Short-Term Pain Kills Innovation</h4><p>Ever wondered why humans can&#8217;t go Super Saiyan or Bankai our way to greatness (drop your fav Bankai below), even though it would objectively make our lives way better? The reason is an interesting biological concept called the Evolutionary Valley.</p><p>Any system&#8202;&#8212;&#8202;biological, technological, organizational&#8202;&#8212;&#8202;can see a higher peak ahead, but to reach it, it must first get worse. It must pass through a dip in performance, stability, or fitness before climbing to a superior design. In our example, getting from base human to Bankai would mean passing through intermediate human designs that have weaker immune systems, require more energy, die earlier, etc.</p><p>Evolution avoids these valleys because natural selection punishes anything that loses capability, even temporarily. 
Orgs have a similar tendency, especially in incumbents, where people are often more worried about internal politicking vs the survive or die struggles in startups (for any decision maker, there is more stable upside in maneuvering inside and not failing vs trying a big homerun and failing).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6xZx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7353dadf-a833-474a-ab9f-98a1ced86755_2384x1165.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6xZx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7353dadf-a833-474a-ab9f-98a1ced86755_2384x1165.png 424w, https://substackcdn.com/image/fetch/$s_!6xZx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7353dadf-a833-474a-ab9f-98a1ced86755_2384x1165.png 848w, https://substackcdn.com/image/fetch/$s_!6xZx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7353dadf-a833-474a-ab9f-98a1ced86755_2384x1165.png 1272w, https://substackcdn.com/image/fetch/$s_!6xZx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7353dadf-a833-474a-ab9f-98a1ced86755_2384x1165.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6xZx!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7353dadf-a833-474a-ab9f-98a1ced86755_2384x1165.png" width="1200" height="586.8131868131868" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7353dadf-a833-474a-ab9f-98a1ced86755_2384x1165.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:712,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6xZx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7353dadf-a833-474a-ab9f-98a1ced86755_2384x1165.png 424w, https://substackcdn.com/image/fetch/$s_!6xZx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7353dadf-a833-474a-ab9f-98a1ced86755_2384x1165.png 848w, https://substackcdn.com/image/fetch/$s_!6xZx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7353dadf-a833-474a-ab9f-98a1ced86755_2384x1165.png 1272w, https://substackcdn.com/image/fetch/$s_!6xZx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7353dadf-a833-474a-ab9f-98a1ced86755_2384x1165.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What happens when we scale this out across the org? People optimize for stability over strategic risk and avoid the valley. This gets worse in many established incumbents, which tend to implicitly punish disruption through:</p><ul><li><p>Quarterly reviews and PIPs that incentivize a quick-win culture.</p></li><li><p>Annual budgets that keep things stuck in place.</p></li><li><p>Frequent reorganizations that stop a team from building something meaningful long-term, because key siloed knowledge gets lost.</p></li><li><p>External departures that make all of the above worse.</p></li><li><p>A bureaucracy that requires many layers of approval just to pick a restaurant for a meal.</p></li></ul><p>This is why so many great engineers/researchers leave these companies to begin with. They think in terms of absolute improvements and the frontiers of performance, not realizing that the path to that frontier is not viable for a group that reduces risk more than it seeks upside. 
<a href="https://www.artificialintelligencemadesimple.com/p/an-engineer-went-from-junior-to-principal?utm_source=publication-search">This is also why communication is key for engineers, since it enables them to get buy-in for their projects (see our guide here).</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GH33!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d644a5-a3b7-4233-b025-d12c6b852802_1400x1200.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GH33!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d644a5-a3b7-4233-b025-d12c6b852802_1400x1200.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GH33!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d644a5-a3b7-4233-b025-d12c6b852802_1400x1200.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GH33!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d644a5-a3b7-4233-b025-d12c6b852802_1400x1200.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GH33!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d644a5-a3b7-4233-b025-d12c6b852802_1400x1200.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GH33!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d644a5-a3b7-4233-b025-d12c6b852802_1400x1200.jpeg" width="1400" height="1200" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c8d644a5-a3b7-4233-b025-d12c6b852802_1400x1200.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1200,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GH33!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d644a5-a3b7-4233-b025-d12c6b852802_1400x1200.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GH33!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d644a5-a3b7-4233-b025-d12c6b852802_1400x1200.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GH33!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d644a5-a3b7-4233-b025-d12c6b852802_1400x1200.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GH33!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d644a5-a3b7-4233-b025-d12c6b852802_1400x1200.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In other words, incumbents don&#8217;t fail from incompetence; they fail because short-term pain outweighs future advantage.</p><h4>1.2 Old &#8220;Units of Value&#8221; Lock Systems in Place</h4><p>Products aren&#8217;t just features&#8202;&#8212;&#8202;they&#8217;re built around implicit assumptions (&#8220;units of value&#8221;). What does the system do? How should the user interact with this system? Every software system implicitly presents its thesis on what work is worth doing, what should be left to the user, and how the user will interact with the tool.</p><p>Paradigm shifts, however, aren&#8217;t upgrades; they&#8217;re resets. Claude Code changed the primary workflow from tab edits (Copilot) and IDE assistance to having devs work from the command line. 
Deeper than that, it changed the nature of code written (small changes to entire functionalities), the work being done (writing specs to hitting auto approve (or as some people like to pretend, reviewing)) and even what was possible (implementations within expertise to complete horizontal aspects where AI puts out slop in the other aspects). <a href="https://www.artificialintelligencemadesimple.com/p/ai-market-report-feb-2026-ten-frontier?utm_source=publication-search">In many ways, Claude Code changed the success criteria for AI coding assistants, as we broke down here</a>:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MAvq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a682b69-eb53-42e6-9df0-d65044a9c3fd_1480x1640.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MAvq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a682b69-eb53-42e6-9df0-d65044a9c3fd_1480x1640.png 424w, https://substackcdn.com/image/fetch/$s_!MAvq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a682b69-eb53-42e6-9df0-d65044a9c3fd_1480x1640.png 848w, https://substackcdn.com/image/fetch/$s_!MAvq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a682b69-eb53-42e6-9df0-d65044a9c3fd_1480x1640.png 1272w, https://substackcdn.com/image/fetch/$s_!MAvq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a682b69-eb53-42e6-9df0-d65044a9c3fd_1480x1640.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!MAvq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a682b69-eb53-42e6-9df0-d65044a9c3fd_1480x1640.png" width="1456" height="1613" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a682b69-eb53-42e6-9df0-d65044a9c3fd_1480x1640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1613,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MAvq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a682b69-eb53-42e6-9df0-d65044a9c3fd_1480x1640.png 424w, https://substackcdn.com/image/fetch/$s_!MAvq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a682b69-eb53-42e6-9df0-d65044a9c3fd_1480x1640.png 848w, https://substackcdn.com/image/fetch/$s_!MAvq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a682b69-eb53-42e6-9df0-d65044a9c3fd_1480x1640.png 1272w, https://substackcdn.com/image/fetch/$s_!MAvq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a682b69-eb53-42e6-9df0-d65044a9c3fd_1480x1640.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you want to learn how to use Claude Code better, <a href="https://www.artificialintelligencemadesimple.com/p/how-to-use-claude-code-for-maximum?utm_source=publication-search">see our guide to Claude Code here</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LyZ0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda5d9346-ef53-476a-bba5-fdb5b4ae77e7_1216x1140.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!LyZ0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda5d9346-ef53-476a-bba5-fdb5b4ae77e7_1216x1140.jpeg 424w, https://substackcdn.com/image/fetch/$s_!LyZ0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda5d9346-ef53-476a-bba5-fdb5b4ae77e7_1216x1140.jpeg 848w, https://substackcdn.com/image/fetch/$s_!LyZ0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda5d9346-ef53-476a-bba5-fdb5b4ae77e7_1216x1140.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!LyZ0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda5d9346-ef53-476a-bba5-fdb5b4ae77e7_1216x1140.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LyZ0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda5d9346-ef53-476a-bba5-fdb5b4ae77e7_1216x1140.jpeg" width="1216" height="1140" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da5d9346-ef53-476a-bba5-fdb5b4ae77e7_1216x1140.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1140,&quot;width&quot;:1216,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!LyZ0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda5d9346-ef53-476a-bba5-fdb5b4ae77e7_1216x1140.jpeg 424w, https://substackcdn.com/image/fetch/$s_!LyZ0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda5d9346-ef53-476a-bba5-fdb5b4ae77e7_1216x1140.jpeg 848w, https://substackcdn.com/image/fetch/$s_!LyZ0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda5d9346-ef53-476a-bba5-fdb5b4ae77e7_1216x1140.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!LyZ0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda5d9346-ef53-476a-bba5-fdb5b4ae77e7_1216x1140.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Getting back to the point, disruptions change the units of value and how people interact with the work, forcing incumbents to abandon existing metrics, infrastructure, and team structures. Incumbents rarely commit to such drastic internal disruption. Instead, they stretch outdated systems as much as possible. No matter how much Copilot tries to imitate agentic development, it will always be limited there.</p><h4>1.3 Conway&#8217;s Law: Companies Ship Their Org Chart</h4><p>Conway&#8217;s Law states that a product&#8217;s design mirrors the communication structure of the organization that builds it. Existing teams, boundaries, and ownership patterns limit product evolution. New paradigms typically break these structures, forcing internal reorganizations that incumbents resist. The result is a struggle to adapt.</p><p>Conway&#8217;s Law is iconic, so I&#8217;m not going to harp on it too much, but it&#8217;s important enough to merit its own subsection. Once you learn it, you can&#8217;t unsee it.</p><h4>1.4 How to Recognize Real Shifts vs &#8220;Wrappers&#8221;: A Simple Mental Model</h4><p>Incumbents copy easily but rarely disrupt themselves willingly. This brings us to the close of this section. Knowing what we do now, how can we identify winners vs plays that will be crushed by incumbents? 
I would ask myself a simple question (we use startups here since they are the clearest example, but it applies to research as well):</p><ul><li><p>If a startup optimizes the existing workflow (a better &#8220;wrapper&#8221;), incumbents easily copy or acquire it since they have all the native advantages.</p></li><li><p>If a startup redefines workflow and user expectations (a &#8220;native&#8221; approach), incumbents resist, hesitate, and pay a premium later. In many cases, the change might be too much to copy effectively.</p></li></ul><p>In reality, this sits on a spectrum. An AI companion startup, for example, might choose to commoditize intelligence rather than compete with incumbents there, and instead differentiate through persistent memory across sessions, context management for long conversations, edge voice deployment, and the surrounding engineering stack. The question isn&#8217;t what to build, but where to differentiate versus where to buy. Strong startups decide this from day one and build accordingly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_2vD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4564bafc-02e6-480b-8207-0e8682ff28c1_2400x1609.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_2vD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4564bafc-02e6-480b-8207-0e8682ff28c1_2400x1609.png 424w, https://substackcdn.com/image/fetch/$s_!_2vD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4564bafc-02e6-480b-8207-0e8682ff28c1_2400x1609.png 848w, 
https://substackcdn.com/image/fetch/$s_!_2vD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4564bafc-02e6-480b-8207-0e8682ff28c1_2400x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!_2vD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4564bafc-02e6-480b-8207-0e8682ff28c1_2400x1609.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_2vD!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4564bafc-02e6-480b-8207-0e8682ff28c1_2400x1609.png" width="1200" height="804.3956043956044" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4564bafc-02e6-480b-8207-0e8682ff28c1_2400x1609.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:976,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_2vD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4564bafc-02e6-480b-8207-0e8682ff28c1_2400x1609.png 424w, https://substackcdn.com/image/fetch/$s_!_2vD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4564bafc-02e6-480b-8207-0e8682ff28c1_2400x1609.png 848w, 
https://substackcdn.com/image/fetch/$s_!_2vD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4564bafc-02e6-480b-8207-0e8682ff28c1_2400x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!_2vD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4564bafc-02e6-480b-8207-0e8682ff28c1_2400x1609.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Thinking in engineering terms, adding more states makes you much harder to replicate. 
Improving one state makes it easier.</figcaption></figure></div><p>So when you&#8217;re looking at one of these startups, don&#8217;t overthink what they&#8217;ve built. Just look at how the interaction actually changes: what goes in, what comes out, where it starts to fall apart. That&#8217;s usually enough to tell if it&#8217;s something genuinely new or just a slightly cleaner version of what already exists. Similarly, if you&#8217;re a founder trying to evaluate your idea, ask yourself if you&#8217;re questioning a fundamental paradigm/reworking an interaction pattern, or if you&#8217;re simply trying to do something that already exists, only better. The closer you are to the latter, the more exposed you are to incumbents copying you.</p><p>Let&#8217;s look at how these factors play out through our various case studies.</p><h3>2. Case Study 1: How Apple Lost On-Device AI</h3><h4>2.1 Apple Started With Everything</h4><p>Go back to the 2010s. When Siri dropped in 2011, Apple defined the baseline for AI-driven computing (it&#8217;s not a coincidence that so much sci-fi post-Siri leans on Apple-esque aesthetics to communicate &#8220;advanced AI&#8221;; that&#8217;s how ingrained they became). They had every structural advantage: unmatched distribution, massive consumer trust, and eventually, the undisputed best custom silicon for local inference (M-series and A-series chips are practically begging to run local models). I remember having discussions about how Apple&#8217;s complete integration made their hardware much better for ML all the way back in 2017 (incidentally, our patented algorithm that beat Apple took advantage of this exact property; their system could not handle high-noise environments when you took it out of the iPhone/perfect lab conditions).</p><p>Apple&#8217;s entire brand was built around making complicated technology feel natural to normal users. 
If you were sketching the ideal company to introduce AI-powered computing to the mass market, you would end up drawing Apple.</p><p>Now come back to 2026. Look at the booming spaces of personal AI assistants, on-device AI, and computer use systems automating spreadsheets/slides. 3 massively lucrative markets, all directly tied to what Siri was thought to eventually grow into. But look in these markets, and you&#8217;ll see that Apple isn&#8217;t even competing. In fact, in one of these spaces, they had to open up their infamous walled garden and pay their competitors.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g2Vr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7b99b6-6b53-448f-81b1-fa69a43572aa_2400x2170.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g2Vr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7b99b6-6b53-448f-81b1-fa69a43572aa_2400x2170.png 424w, https://substackcdn.com/image/fetch/$s_!g2Vr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7b99b6-6b53-448f-81b1-fa69a43572aa_2400x2170.png 848w, https://substackcdn.com/image/fetch/$s_!g2Vr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7b99b6-6b53-448f-81b1-fa69a43572aa_2400x2170.png 1272w, https://substackcdn.com/image/fetch/$s_!g2Vr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7b99b6-6b53-448f-81b1-fa69a43572aa_2400x2170.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!g2Vr!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7b99b6-6b53-448f-81b1-fa69a43572aa_2400x2170.png" width="1200" height="1084.6153846153845" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d7b99b6-6b53-448f-81b1-fa69a43572aa_2400x2170.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1316,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g2Vr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7b99b6-6b53-448f-81b1-fa69a43572aa_2400x2170.png 424w, https://substackcdn.com/image/fetch/$s_!g2Vr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7b99b6-6b53-448f-81b1-fa69a43572aa_2400x2170.png 848w, https://substackcdn.com/image/fetch/$s_!g2Vr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7b99b6-6b53-448f-81b1-fa69a43572aa_2400x2170.png 1272w, https://substackcdn.com/image/fetch/$s_!g2Vr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7b99b6-6b53-448f-81b1-fa69a43572aa_2400x2170.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>How did this happen?</p><h4>2.2 What Apple Missed: The Unit of Value Changed</h4><p>The original assistant model was based on command execution. Set a timer. Send a text. What&#8217;s the weather? Play this song. It was useful, but narrow. The system&#8217;s value came from mapping a spoken request to a predefined action. That is a very different product from what came next.</p><p>General-purpose models changed the unit of value. AI stopped being a convenience layer and started becoming a work layer. The role of the assistant is shifting from completing small tasks to meaningfully participating in valuable work like writing, coding, planning, and creating documents. 
That is not the old assistant paradigm with a few extra features stapled on. It is a different thesis about what the computer is for and how the user should interact with it.</p><p>The result is what you&#8217;d expect now that you have our mental model. Apple had world-class hardware for local AI and all the talent + cash to make it happen + the distribution to benefit from it immediately; what they did not have was the right product thesis. And so they felt the ground shifting under their feet.</p><h4>2.3 Why Apple Couldn&#8217;t Move to Capture On-Device AI</h4><p>Apple&#8217;s old assistant stack was built around predictability, bounded behavior, privacy, and tightly controlled user experience. Generalized AI systems are messier than that. They are probabilistic, open-ended, less controllable, often inconsistent, and initially worse in exactly the ways a company like Apple finds culturally offensive. To really make the shift, Apple would have needed to ship something that felt less polished, less deterministic, and less classically &#8220;Apple&#8221; in the short term so they could learn their way into the new paradigm over time.</p><p>That is the valley. 
And as we talked about, systems abhor that valley.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NmZ6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd792ed-8071-4286-be4d-2c14a63204f5_2400x1591.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NmZ6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd792ed-8071-4286-be4d-2c14a63204f5_2400x1591.png 424w, https://substackcdn.com/image/fetch/$s_!NmZ6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd792ed-8071-4286-be4d-2c14a63204f5_2400x1591.png 848w, https://substackcdn.com/image/fetch/$s_!NmZ6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd792ed-8071-4286-be4d-2c14a63204f5_2400x1591.png 1272w, https://substackcdn.com/image/fetch/$s_!NmZ6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd792ed-8071-4286-be4d-2c14a63204f5_2400x1591.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NmZ6!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd792ed-8071-4286-be4d-2c14a63204f5_2400x1591.png" width="1200" height="795.3296703296703" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7bd792ed-8071-4286-be4d-2c14a63204f5_2400x1591.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:965,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NmZ6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd792ed-8071-4286-be4d-2c14a63204f5_2400x1591.png 424w, https://substackcdn.com/image/fetch/$s_!NmZ6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd792ed-8071-4286-be4d-2c14a63204f5_2400x1591.png 848w, https://substackcdn.com/image/fetch/$s_!NmZ6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd792ed-8071-4286-be4d-2c14a63204f5_2400x1591.png 1272w, https://substackcdn.com/image/fetch/$s_!NmZ6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd792ed-8071-4286-be4d-2c14a63204f5_2400x1591.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Then there is the org problem. A true generalized AI layer cuts across hardware, operating systems, developer tooling, apps, cloud services, search, consumer UX, and enterprise workflows. That kind of product does not fit neatly inside the old boxes. It requires cross-boundary coordination and, more importantly, a willingness to let one new paradigm rearrange old internal power structures. Apple, like every large incumbent, is still subject to Conway&#8217;s Law whether people on tech Twitter want to write poems about them or not. Companies ship their org charts. If the org is optimized around stable boundaries, the product will inherit those boundaries.</p><p>So Apple did what incumbents usually do. They stretched the old system. They kept improving Siri inside the logic Siri was born with rather than rebuilding around the new logic that was taking over. 
At every time step t, it probably made sense internally (at any given time, switching was likely costly and offered lower short-term returns). Strategically, it was a disaster.</p><h4>2.4 Apple&#8217;s Real Concession: Renting the Intelligence Layer</h4><p>When Apple integrated ChatGPT into Siri (and later Gemini and Claude), the significance was not just that they had added a partner. Plenty of companies partner. What made it interesting is that Apple was partnering with an external company on a core aspect of its experience. For a company whose identity has long been built around owning the critical layers of the experience, and whose entire religion is absolute proprietary control over the ecosystem, farming out its core cognitive layer to external providers is a massive concession.</p><p>Here, you might push back against me. &#8220;Devansh, is Apple&#8217;s loss that bad? All they pay is a little bit of money to external LLM providers; they would&#8217;ve had to risk much more if they were building their own model. Isn&#8217;t this the right move for the business?&#8221;</p><p>And that is a good point. However, it misses something:</p><ol><li><p>Apple not trying to build its own foundation model can be viewed as smart.</p></li><li><p>BUT, they didn&#8217;t have to compete in the LLM provider race in order to create extremely powerful workflow assistants that would lock people into their ecosystem. Most famous agentic assistants/computer use systems today are quite model-agnostic, with OpenClaw being the most recent viral example. 
Building one of these could have unlocked a massive revenue stream and a strong differentiator for their ecosystem.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!amn_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e1aebb0-04f7-4966-84a9-bab29fcbc2ba_1262x1420.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!amn_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e1aebb0-04f7-4966-84a9-bab29fcbc2ba_1262x1420.png 424w, https://substackcdn.com/image/fetch/$s_!amn_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e1aebb0-04f7-4966-84a9-bab29fcbc2ba_1262x1420.png 848w, https://substackcdn.com/image/fetch/$s_!amn_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e1aebb0-04f7-4966-84a9-bab29fcbc2ba_1262x1420.png 1272w, https://substackcdn.com/image/fetch/$s_!amn_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e1aebb0-04f7-4966-84a9-bab29fcbc2ba_1262x1420.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!amn_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e1aebb0-04f7-4966-84a9-bab29fcbc2ba_1262x1420.png" width="1262" height="1420" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e1aebb0-04f7-4966-84a9-bab29fcbc2ba_1262x1420.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1420,&quot;width&quot;:1262,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!amn_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e1aebb0-04f7-4966-84a9-bab29fcbc2ba_1262x1420.png 424w, https://substackcdn.com/image/fetch/$s_!amn_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e1aebb0-04f7-4966-84a9-bab29fcbc2ba_1262x1420.png 848w, https://substackcdn.com/image/fetch/$s_!amn_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e1aebb0-04f7-4966-84a9-bab29fcbc2ba_1262x1420.png 1272w, https://substackcdn.com/image/fetch/$s_!amn_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e1aebb0-04f7-4966-84a9-bab29fcbc2ba_1262x1420.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://x.com/Hesamation/status/2015110922159730971">Several notable comparisons have called OpenClaw the better Siri.</a></figcaption></figure></div><p>Apple had the device, the trust, the silicon, the installed base, and the ecosystem, and yet they&#8217;re sitting on the sidelines as newcomers unlock absurd growth. <strong>In this context, there is some irony that macOS, due to its inherent advantages, is now the first place where many of the next-generation agentic tools are released.</strong></p><p>Even a company with absurd advantages can miss the future when the new system requires them to temporarily become worse at being the old version of themselves. Apple had almost everything needed to lead this shift. But they could not give up the assumptions that made the old system work.</p><p>Apple still exists. 
They will continue to overcharge for their devices and tap their cohort of brain-dead, status-obsessed morons that will buy the most recent iPhone to be ahead. But our next fall from grace is from a much greater height. They had deeper entrenchment, extreme advantages, and still got their asses kicked by two different waves of disruption that completely changed the way software was written. The most ironic part: this is the one group that you would&#8217;ve expected to be on top of innovation.</p><h3>Case Study 2: How GitHub Copilot Lost to Cursor, which is now being beaten by Claude Code.</h3><h4>3.1 Copilot Won the First Battlefield</h4><p>If you looked at the market when Copilot arrived, the story seemed pretty straightforward. Copilot was a crown jewel in the Microsoft portfolio, consolidating several structural advantages:</p><ol><li><p>MS had GitHub, giving them access to where data and dev interaction patterns live.</p></li><li><p>MS had VSCode, the go-to IDE for devs.</p></li><li><p>MS had enterprise trust, which is huge because no one ever gets fired for buying IBM.</p></li><li><p>They had the deal with OpenAI, which supplied the frontier models powering Copilot.</p></li></ol><p>Copilot fit neatly into the world as it was. It sat in the editor, gave you small completions, helped at the line and function level, and let the developer remain firmly in control. That last part mattered more than people admit.</p><p>It kept the developer as the primary author and bottleneck. It didn&#8217;t change <em>how</em> devs worked or what they really did; it made existing work faster. Microsoft and GitHub loved this. It also drove massive enterprise adoption because it didn&#8217;t break any existing engineering management structures. It was safe.</p><p>So, how did it start to lose traction to Lovable and Cursor, products created not by established AI teams or big-name founders, but by relative no-names? 
Both Lovable and Cursor changed the battlefield, completely reworking the developer experience.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aTXQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3082707-c804-4224-8e92-329b69f73367_1000x383.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aTXQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3082707-c804-4224-8e92-329b69f73367_1000x383.jpeg 424w, https://substackcdn.com/image/fetch/$s_!aTXQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3082707-c804-4224-8e92-329b69f73367_1000x383.jpeg 848w, https://substackcdn.com/image/fetch/$s_!aTXQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3082707-c804-4224-8e92-329b69f73367_1000x383.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!aTXQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3082707-c804-4224-8e92-329b69f73367_1000x383.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aTXQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3082707-c804-4224-8e92-329b69f73367_1000x383.jpeg" width="1000" height="383" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e3082707-c804-4224-8e92-329b69f73367_1000x383.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:383,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aTXQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3082707-c804-4224-8e92-329b69f73367_1000x383.jpeg 424w, https://substackcdn.com/image/fetch/$s_!aTXQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3082707-c804-4224-8e92-329b69f73367_1000x383.jpeg 848w, https://substackcdn.com/image/fetch/$s_!aTXQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3082707-c804-4224-8e92-329b69f73367_1000x383.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!aTXQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3082707-c804-4224-8e92-329b69f73367_1000x383.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>3.2 How Cursor Changed What Developers Were Actually Buying</h4><p>When it arrived, Cursor changed the unit of value. The point was no longer to help you finish the next line; the point was to move a meaningful chunk of the codebase from a higher-level instruction. That sounds like an extension, but the switch changed the dev-AI interaction pattern.</p><p>With Cursor, the center of gravity moved from local code completion to repo-scale progress. You gave it a spec, some context, and let it fan out across multiple files. The trade became obvious very quickly: you gave up control and polish in exchange for much more surface area moved per interaction. 
Security issues that would have ruined Copilot became the norm with Cursor since Cursor changed the unit of value from &#8220;quality of suggestion in isolated snippet&#8221; to &#8220;quality of execution across a longer context&#8221;.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Fnyk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d287cdb-ab51-4648-ae49-d4e88d6d7bc3_1000x502.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Fnyk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d287cdb-ab51-4648-ae49-d4e88d6d7bc3_1000x502.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Fnyk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d287cdb-ab51-4648-ae49-d4e88d6d7bc3_1000x502.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Fnyk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d287cdb-ab51-4648-ae49-d4e88d6d7bc3_1000x502.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Fnyk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d287cdb-ab51-4648-ae49-d4e88d6d7bc3_1000x502.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Fnyk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d287cdb-ab51-4648-ae49-d4e88d6d7bc3_1000x502.jpeg" width="1000" height="502" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1d287cdb-ab51-4648-ae49-d4e88d6d7bc3_1000x502.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:502,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Fnyk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d287cdb-ab51-4648-ae49-d4e88d6d7bc3_1000x502.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Fnyk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d287cdb-ab51-4648-ae49-d4e88d6d7bc3_1000x502.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Fnyk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d287cdb-ab51-4648-ae49-d4e88d6d7bc3_1000x502.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Fnyk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d287cdb-ab51-4648-ae49-d4e88d6d7bc3_1000x502.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/the-cursor-mirage">Blatantly ignoring user instructions, especially around security, the way Cursor does would have ruined the more enterprise-style Copilot. Cursor could get away with it since they changed the experience of development. Read more about Cursor&#8217;s issues here.</a></figcaption></figure></div><p>This is where the framing of Cursor as a more ambitious Copilot was wrong. Cursor was not trying to win the old game (better tab autocomplete) by a wider margin. It was changing the game from &#8220;help me write this function&#8221; to &#8220;help me implement this feature.&#8221;</p><p>Instead, we should understand Cursor by understanding the shift in the unit of value: <strong>Cursor shifted the gold standard from perfectly elegant function completions to </strong><em><strong>delegation</strong></em><strong>. </strong>Cursor naturally started reworking its IDE experience around this. 
Microsoft tried to shoehorn this functionality into Copilot Chat, but they were slowed because they had to do it inside VSCode, their golden child. Any major rework of VSCode was held back by the factors we&#8217;ve discussed in Section 1.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1zDi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08097552-c3db-44b3-9c95-34eb845188d9_2400x1585.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1zDi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08097552-c3db-44b3-9c95-34eb845188d9_2400x1585.png 424w, https://substackcdn.com/image/fetch/$s_!1zDi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08097552-c3db-44b3-9c95-34eb845188d9_2400x1585.png 848w, https://substackcdn.com/image/fetch/$s_!1zDi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08097552-c3db-44b3-9c95-34eb845188d9_2400x1585.png 1272w, https://substackcdn.com/image/fetch/$s_!1zDi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08097552-c3db-44b3-9c95-34eb845188d9_2400x1585.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1zDi!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08097552-c3db-44b3-9c95-34eb845188d9_2400x1585.png" width="1200" height="792.8571428571429" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/08097552-c3db-44b3-9c95-34eb845188d9_2400x1585.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:962,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1zDi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08097552-c3db-44b3-9c95-34eb845188d9_2400x1585.png 424w, https://substackcdn.com/image/fetch/$s_!1zDi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08097552-c3db-44b3-9c95-34eb845188d9_2400x1585.png 848w, https://substackcdn.com/image/fetch/$s_!1zDi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08097552-c3db-44b3-9c95-34eb845188d9_2400x1585.png 1272w, https://substackcdn.com/image/fetch/$s_!1zDi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08097552-c3db-44b3-9c95-34eb845188d9_2400x1585.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Cursor&#8217;s success led to a lot of imitators, but since they were all optimizing around the Cursor workflow, they all shared the same structural advantages and disadvantages. 
That&#8217;s why we didn&#8217;t see any major disruptions in the marketplace until a competitor came to change the battlefield once again.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L78V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc227d84-6850-4d81-a5de-2b88ffd9e3b1_1270x1188.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L78V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc227d84-6850-4d81-a5de-2b88ffd9e3b1_1270x1188.png 424w, https://substackcdn.com/image/fetch/$s_!L78V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc227d84-6850-4d81-a5de-2b88ffd9e3b1_1270x1188.png 848w, https://substackcdn.com/image/fetch/$s_!L78V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc227d84-6850-4d81-a5de-2b88ffd9e3b1_1270x1188.png 1272w, https://substackcdn.com/image/fetch/$s_!L78V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc227d84-6850-4d81-a5de-2b88ffd9e3b1_1270x1188.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L78V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc227d84-6850-4d81-a5de-2b88ffd9e3b1_1270x1188.png" width="1270" height="1188" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc227d84-6850-4d81-a5de-2b88ffd9e3b1_1270x1188.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1188,&quot;width&quot;:1270,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L78V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc227d84-6850-4d81-a5de-2b88ffd9e3b1_1270x1188.png 424w, https://substackcdn.com/image/fetch/$s_!L78V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc227d84-6850-4d81-a5de-2b88ffd9e3b1_1270x1188.png 848w, https://substackcdn.com/image/fetch/$s_!L78V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc227d84-6850-4d81-a5de-2b88ffd9e3b1_1270x1188.png 1272w, https://substackcdn.com/image/fetch/$s_!L78V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc227d84-6850-4d81-a5de-2b88ffd9e3b1_1270x1188.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.linkedin.com/posts/collin-wallace_are-people-really-canceling-cursor-for-claudecode-activity-7428804031025926144-doTz/">Source. Notice how there is relatively little conversation pointing the other way (Claude Code/Codex to Cursor).</a></figcaption></figure></div><h4>3.3 How Claude Code &#8220;Replaced Software Engineers&#8221;</h4><p>Cursor had already pushed the unit of value toward higher-level delegation; Claude Code pushed it toward delegated execution. Claude Code didn&#8217;t steal customers from Cursor because it could write bigger chunks of code (or write better chunks of code). It took category ownership by doing different things: inspecting the repo, using tools, running commands, executing tests, noticing failures, revising its approach, and constantly working. 
That is a different category of behavior.</p><p>Just as Cursor brought &#8220;vibe-coding&#8221; to the forefront of discussions, Claude Code was the first system where people started talking about &#8220;time the agent spends executing w/o any input&#8221; as a defining quality improvement (&#8220;Claude spent 2 hours non-stop&#8221;, later &#8220;Codex did 8&#8221;&#8230;). Anytime people bring in a new metric for evaluating a system, we can be reasonably confident that we&#8217;re starting to deal with a paradigm shift. Let&#8217;s look at how this played out in practice.</p><p>Once the model can touch the terminal, check reality, and iterate against feedback, the workflow changes again. The developer is no longer just a coder with better autocomplete, nor even just a spec writer for larger edits. They start becoming a reviewer, coordinator, and boundary-setter for an agent that can actually go try things. It&#8217;s not a coincidence that Claude Code only really gained its mainstream virality after Opus 4.6 was updated around December to become explicitly much better at computer use and agentic execution. That was when the system could truly excel in this new category of delegated execution.</p><p>This was a larger paradigm shift from Cursor than Cursor was from Copilot, and thus its impact was more explosive. Anthropic&#8217;s run rate was 1B in January 2025, 4B in mid-2025 (June/July), <strong>~ $7B </strong>in October 2025, and by the end of 2025 it was estimated at <strong>&gt; $9B. </strong>What happens after Opus becomes fully aligned for agentic tool use? Their growth explodes: the run rate jumps by roughly 1.4x to 1.6x from one reported figure to the next (14B Feb; 20B March). 
When we visualize this, the trend becomes very clear.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iokp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14a7ef39-8b72-49ac-84a0-7668932970e3_2400x1363.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iokp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14a7ef39-8b72-49ac-84a0-7668932970e3_2400x1363.png 424w, https://substackcdn.com/image/fetch/$s_!iokp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14a7ef39-8b72-49ac-84a0-7668932970e3_2400x1363.png 848w, https://substackcdn.com/image/fetch/$s_!iokp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14a7ef39-8b72-49ac-84a0-7668932970e3_2400x1363.png 1272w, https://substackcdn.com/image/fetch/$s_!iokp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14a7ef39-8b72-49ac-84a0-7668932970e3_2400x1363.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iokp!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14a7ef39-8b72-49ac-84a0-7668932970e3_2400x1363.png" width="1200" height="681.5934065934066" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/14a7ef39-8b72-49ac-84a0-7668932970e3_2400x1363.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:827,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iokp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14a7ef39-8b72-49ac-84a0-7668932970e3_2400x1363.png 424w, https://substackcdn.com/image/fetch/$s_!iokp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14a7ef39-8b72-49ac-84a0-7668932970e3_2400x1363.png 848w, https://substackcdn.com/image/fetch/$s_!iokp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14a7ef39-8b72-49ac-84a0-7668932970e3_2400x1363.png 1272w, https://substackcdn.com/image/fetch/$s_!iokp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14a7ef39-8b72-49ac-84a0-7668932970e3_2400x1363.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Look at the change in that slope.</figcaption></figure></div><p>To really contextualize this, however, we should look at the macro trends across the space. Could this growth simply be explained by a general increase in LLM usage? After all, even Google boasted about an insane increase in Gemini token consumption late last year. Comparing Anthropic vs OpenAI, we see something interesting: both grew similarly well (w/ OpenAI even doing better in the middle of last year, when Deep Research and Memory were purring for them). 
However, when Anthropic doubled down on the paradigm shift and changed its battle to focus on agentic execution, its slope suddenly skyrocketed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yEsK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1ae2a6b-ccc8-4469-9b58-2580f308a9d7_2400x1272.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yEsK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1ae2a6b-ccc8-4469-9b58-2580f308a9d7_2400x1272.png 424w, https://substackcdn.com/image/fetch/$s_!yEsK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1ae2a6b-ccc8-4469-9b58-2580f308a9d7_2400x1272.png 848w, https://substackcdn.com/image/fetch/$s_!yEsK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1ae2a6b-ccc8-4469-9b58-2580f308a9d7_2400x1272.png 1272w, https://substackcdn.com/image/fetch/$s_!yEsK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1ae2a6b-ccc8-4469-9b58-2580f308a9d7_2400x1272.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yEsK!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1ae2a6b-ccc8-4469-9b58-2580f308a9d7_2400x1272.png" width="1200" height="636.2637362637363" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d1ae2a6b-ccc8-4469-9b58-2580f308a9d7_2400x1272.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:772,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yEsK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1ae2a6b-ccc8-4469-9b58-2580f308a9d7_2400x1272.png 424w, https://substackcdn.com/image/fetch/$s_!yEsK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1ae2a6b-ccc8-4469-9b58-2580f308a9d7_2400x1272.png 848w, https://substackcdn.com/image/fetch/$s_!yEsK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1ae2a6b-ccc8-4469-9b58-2580f308a9d7_2400x1272.png 1272w, https://substackcdn.com/image/fetch/$s_!yEsK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1ae2a6b-ccc8-4469-9b58-2580f308a9d7_2400x1272.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Beyond the financials, this paradigm shift has even rippled into the fundamental ways we build and analyze LLMs. Agentic tool use has increasingly become a fundamental part of both training and benchmarks. Even the way LLMs prioritize context within their context windows has changed. Modern LLMs are increasingly shifting their focus away from human conversations. Instead, context is evolving into infrastructure for processing agent logs, tool call histories, memory state, and multi-step execution traces. 
These can be seen through several experiments, which we will publish soon as we compare the stability of various LLMs.<a href="https://www.artificialintelligencemadesimple.com/p/ai-market-report-feb-2026-ten-frontier"> You can also read more about it here.</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bJU7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c7349b-14b5-4e33-a06c-d5a77eefbd85_2400x1440.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bJU7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c7349b-14b5-4e33-a06c-d5a77eefbd85_2400x1440.png 424w, https://substackcdn.com/image/fetch/$s_!bJU7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c7349b-14b5-4e33-a06c-d5a77eefbd85_2400x1440.png 848w, https://substackcdn.com/image/fetch/$s_!bJU7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c7349b-14b5-4e33-a06c-d5a77eefbd85_2400x1440.png 1272w, https://substackcdn.com/image/fetch/$s_!bJU7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c7349b-14b5-4e33-a06c-d5a77eefbd85_2400x1440.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bJU7!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c7349b-14b5-4e33-a06c-d5a77eefbd85_2400x1440.png" width="1200" height="720.3296703296703" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60c7349b-14b5-4e33-a06c-d5a77eefbd85_2400x1440.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:874,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bJU7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c7349b-14b5-4e33-a06c-d5a77eefbd85_2400x1440.png 424w, https://substackcdn.com/image/fetch/$s_!bJU7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c7349b-14b5-4e33-a06c-d5a77eefbd85_2400x1440.png 848w, https://substackcdn.com/image/fetch/$s_!bJU7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c7349b-14b5-4e33-a06c-d5a77eefbd85_2400x1440.png 1272w, https://substackcdn.com/image/fetch/$s_!bJU7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c7349b-14b5-4e33-a06c-d5a77eefbd85_2400x1440.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I use many words and perspectives to communicate one point:<strong> Claude Code didn&#8217;t steal Cursor&#8217;s thunder by producing meaningfully better one-shot output; it stole it by building an iterative system that changed the very definition of one-shot (from what the system output when first given the prompt to what the system output when it finally handed the result to the developer).</strong> This was a major paradigm shift, and thus it was appropriately rewarded by the market with much stronger growth and a durable moat.</p><h4>3.4 Why the Previous Leaders Keep Struggling to Catch Up</h4><p>At this stage, all 3 of our players have the same set of offerings: an IDE chatbot, an agentic mode for execution, and a CLI tool. The difference here isn&#8217;t in their raw features but in their priors. To someone who built their mental model around building Copilot, Cursor&#8217;s new IDE paradigm is very uncomfortable. An ADE system like CCC would be heresy. 
Likewise, for Cursor (hence their struggle to deliver a comparable CLI experience). So CC keeps pulling ahead as the old guard struggles to disrupt themselves.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!E3_o!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ab67829-ac1c-420e-9547-49253c5f4b41_2400x1641.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!E3_o!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ab67829-ac1c-420e-9547-49253c5f4b41_2400x1641.png 424w, https://substackcdn.com/image/fetch/$s_!E3_o!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ab67829-ac1c-420e-9547-49253c5f4b41_2400x1641.png 848w, https://substackcdn.com/image/fetch/$s_!E3_o!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ab67829-ac1c-420e-9547-49253c5f4b41_2400x1641.png 1272w, https://substackcdn.com/image/fetch/$s_!E3_o!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ab67829-ac1c-420e-9547-49253c5f4b41_2400x1641.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!E3_o!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ab67829-ac1c-420e-9547-49253c5f4b41_2400x1641.png" width="1200" height="820.8791208791209" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3ab67829-ac1c-420e-9547-49253c5f4b41_2400x1641.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:996,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!E3_o!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ab67829-ac1c-420e-9547-49253c5f4b41_2400x1641.png 424w, https://substackcdn.com/image/fetch/$s_!E3_o!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ab67829-ac1c-420e-9547-49253c5f4b41_2400x1641.png 848w, https://substackcdn.com/image/fetch/$s_!E3_o!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ab67829-ac1c-420e-9547-49253c5f4b41_2400x1641.png 1272w, https://substackcdn.com/image/fetch/$s_!E3_o!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ab67829-ac1c-420e-9547-49253c5f4b41_2400x1641.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This leads us to an interesting observation about paradigm shifts: they often come from people outside the old system. Old-guard experts come with a lot of baggage that makes them very good at optimizing what exists rather than rethinking priors. Take this case study, for instance:</p><ol><li><p>Cursor was created by a bunch of kids.</p></li><li><p>Boris Cherny (creator of Claude Code) did not build his foundation working on a system like Copilot prior to Anthropic.</p></li></ol><p>Even outside this case, this pattern is worth studying. Contextual AI was founded on a lot of hype around solving context for LLMs by some pioneers in the space. Truthfully, they&#8217;ve done fuck-all. The most revolutionary change in context management for LLMs (Recursive Language Models) came from a bunch of college kids.</p><p>That isn&#8217;t to say domain expertise isn&#8217;t useful: it definitely helps more than it hurts. 
However, this analysis leads us to refine our framing when evaluating founders: if the founder is experienced in the current generation of systems, it&#8217;s worth pressing them on why they are creating a startup. What are they unhappy about? The more fundamental their points of contention/proposed solutions, the more likely they are to build something category-defining as opposed to something that is incremental and will be replicated by incumbents. The same mental model applies to builders deciding what problem to work on.</p><p>We&#8217;ve sunk a lot of words into exploring the ecosystem and its pressures, all to come to an understanding of what differentiates startups and when it&#8217;s hard for incumbents to disrupt them. Now let&#8217;s apply this lens to one of the most hotly contested spaces in AI right now. Let&#8217;s predict the future of Legal AI.</p><p>Just as Cursor disrupted the market with Quasi-Agentic development, only to be replaced by truly agentic systems like Claude Code, I predict that the first wave of legal AI tools will be disrupted by systems that reimagine intelligence from the ground up.</p><h3>Case Study 3: Why Irys will Beat Harvey and Dominate Legal AI.</h3><p>Perhaps the best way I can do justice to this wonderful mental model I&#8217;ve built is to put my skin in the game. So let me tell you why I&#8217;m working on a legal AI startup, when I could be making a lot more money for doing a lot less work by working at an Nvidia or DeepMind.</p><h4>4.1 Current Legal AI Optimized Grunt Work</h4><p>The first wave of Legal AI (at least the first wave of the current generation) got paid because it made legal drudgery less painful through faster summaries, faster clause comparison, faster first drafts, faster search, and faster first-pass issue spotting. That is useful. 
Firms will absolutely pay for that.</p><p>However, this early win + a misunderstanding of AI fundamentals locked the major competitors into a fundamentally limited stack: Vector Search + Fine-Tuned Models (the latter largely abandoned now, but a huge marketing point at one time). They built their infra investments and teams around scaling these systems. Now that people are starting to realize the limitations of these systems, they&#8217;re going to struggle to maintain their position at the top.</p><p><a href="https://www.reddit.com/r/legaltech/comments/1qrio2s/harvey_ai/">In a recent Reddit post</a>, the OP was scathing toward Harvey. Even people who defend Harvey call it hype and a wrapper, and were unimpressed by the workflows feature (their attempt at so-called Agentic AI):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8Q8I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a1506e0-1110-4ec2-bc76-eb8d63ebb995_1532x974.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8Q8I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a1506e0-1110-4ec2-bc76-eb8d63ebb995_1532x974.png 424w, https://substackcdn.com/image/fetch/$s_!8Q8I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a1506e0-1110-4ec2-bc76-eb8d63ebb995_1532x974.png 848w, https://substackcdn.com/image/fetch/$s_!8Q8I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a1506e0-1110-4ec2-bc76-eb8d63ebb995_1532x974.png 1272w, 
https://substackcdn.com/image/fetch/$s_!8Q8I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a1506e0-1110-4ec2-bc76-eb8d63ebb995_1532x974.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8Q8I!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a1506e0-1110-4ec2-bc76-eb8d63ebb995_1532x974.png" width="1200" height="763.1868131868132" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a1506e0-1110-4ec2-bc76-eb8d63ebb995_1532x974.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:926,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8Q8I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a1506e0-1110-4ec2-bc76-eb8d63ebb995_1532x974.png 424w, https://substackcdn.com/image/fetch/$s_!8Q8I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a1506e0-1110-4ec2-bc76-eb8d63ebb995_1532x974.png 848w, https://substackcdn.com/image/fetch/$s_!8Q8I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a1506e0-1110-4ec2-bc76-eb8d63ebb995_1532x974.png 1272w, 
https://substackcdn.com/image/fetch/$s_!8Q8I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a1506e0-1110-4ec2-bc76-eb8d63ebb995_1532x974.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.reddit.com/r/legaltech/comments/1qrio2s/comment/o2qingo/?utm_source=share&amp;utm_medium=web3x&amp;utm_name=web3xcss&amp;utm_term=1&amp;utm_content=share_button">Comment Source</a></figcaption></figure></div><p><a 
href="https://www.reddit.com/r/legaltech/comments/1qrio2s/comment/o3jqz0p/?utm_source=share&amp;utm_medium=web3x&amp;utm_name=web3xcss&amp;utm_term=1&amp;utm_content=share_button">Compare that to the following comment talking about us in the same post</a> (saying that we&#8217;re materially better)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IBo_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde73e32-2430-467f-bb41-9f50deb09449_1600x816.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IBo_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde73e32-2430-467f-bb41-9f50deb09449_1600x816.png 424w, https://substackcdn.com/image/fetch/$s_!IBo_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde73e32-2430-467f-bb41-9f50deb09449_1600x816.png 848w, https://substackcdn.com/image/fetch/$s_!IBo_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde73e32-2430-467f-bb41-9f50deb09449_1600x816.png 1272w, https://substackcdn.com/image/fetch/$s_!IBo_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde73e32-2430-467f-bb41-9f50deb09449_1600x816.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IBo_!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde73e32-2430-467f-bb41-9f50deb09449_1600x816.png" width="1200" height="612.3626373626373" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cde73e32-2430-467f-bb41-9f50deb09449_1600x816.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:743,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IBo_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde73e32-2430-467f-bb41-9f50deb09449_1600x816.png 424w, https://substackcdn.com/image/fetch/$s_!IBo_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde73e32-2430-467f-bb41-9f50deb09449_1600x816.png 848w, https://substackcdn.com/image/fetch/$s_!IBo_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde73e32-2430-467f-bb41-9f50deb09449_1600x816.png 1272w, https://substackcdn.com/image/fetch/$s_!IBo_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde73e32-2430-467f-bb41-9f50deb09449_1600x816.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">We have many more testimonials like this.</figcaption></figure></div><p>Let me tell you why Gen 1 will fall behind (using facts and logic) by exploring why their approach has a mathematical upper limit that they won&#8217;t be able to solve for. Then I will tell you how we&#8217;re solving for these problems with<a href="https://www.irys.ai/"> Irys, our Legal AI platform at Iqidis, to build the best Legal AI on the planet.</a></p><p><em>(Since many prominent Legal AI teams read this newsletter, here is an open invite: if you can disprove my analysis of the failure modes of current Legal AI, or want to prove that your system is an actual paradigm shift that addresses these fundamental issues, you&#8217;re welcome to come on my newsletter for a livestream conversation/demo. 
No strings attached, this is basically free marketing for you).</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B0iY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ef24c4d-ec91-416e-a4c5-5bdd464758ae_2400x1780.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B0iY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ef24c4d-ec91-416e-a4c5-5bdd464758ae_2400x1780.png 424w, https://substackcdn.com/image/fetch/$s_!B0iY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ef24c4d-ec91-416e-a4c5-5bdd464758ae_2400x1780.png 848w, https://substackcdn.com/image/fetch/$s_!B0iY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ef24c4d-ec91-416e-a4c5-5bdd464758ae_2400x1780.png 1272w, https://substackcdn.com/image/fetch/$s_!B0iY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ef24c4d-ec91-416e-a4c5-5bdd464758ae_2400x1780.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B0iY!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ef24c4d-ec91-416e-a4c5-5bdd464758ae_2400x1780.png" width="1200" height="890.1098901098901" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8ef24c4d-ec91-416e-a4c5-5bdd464758ae_2400x1780.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1080,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!B0iY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ef24c4d-ec91-416e-a4c5-5bdd464758ae_2400x1780.png 424w, https://substackcdn.com/image/fetch/$s_!B0iY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ef24c4d-ec91-416e-a4c5-5bdd464758ae_2400x1780.png 848w, https://substackcdn.com/image/fetch/$s_!B0iY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ef24c4d-ec91-416e-a4c5-5bdd464758ae_2400x1780.png 1272w, https://substackcdn.com/image/fetch/$s_!B0iY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ef24c4d-ec91-416e-a4c5-5bdd464758ae_2400x1780.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>4.2 Why the Current Generation of Legal AI will Fail</h4><p>Most legal AI systems today rest on two assumptions. First, that the right legal knowledge can be retrieved from a flat semantic space using embedding search or RAG. Second, that once it is retrieved, an autoregressive model can compress that information into a correct answer. Let&#8217;s understand these in more detail, and see why fine-tuning doesn&#8217;t fix anything.</p><p>The problem starts with how vector retrieval works. Semantic search maps text into a vector space and retrieves neighbors based on similarity, usually cosine similarity or a comparable distance measure. 
In practice, this means the system is optimizing for geometric closeness in embedding space.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wF1O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf509a10-29ca-40b2-b691-31562a053c82_2400x1620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wF1O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf509a10-29ca-40b2-b691-31562a053c82_2400x1620.png 424w, https://substackcdn.com/image/fetch/$s_!wF1O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf509a10-29ca-40b2-b691-31562a053c82_2400x1620.png 848w, https://substackcdn.com/image/fetch/$s_!wF1O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf509a10-29ca-40b2-b691-31562a053c82_2400x1620.png 1272w, https://substackcdn.com/image/fetch/$s_!wF1O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf509a10-29ca-40b2-b691-31562a053c82_2400x1620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wF1O!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf509a10-29ca-40b2-b691-31562a053c82_2400x1620.png" width="1200" height="810.1648351648352" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf509a10-29ca-40b2-b691-31562a053c82_2400x1620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:983,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wF1O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf509a10-29ca-40b2-b691-31562a053c82_2400x1620.png 424w, https://substackcdn.com/image/fetch/$s_!wF1O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf509a10-29ca-40b2-b691-31562a053c82_2400x1620.png 848w, https://substackcdn.com/image/fetch/$s_!wF1O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf509a10-29ca-40b2-b691-31562a053c82_2400x1620.png 1272w, https://substackcdn.com/image/fetch/$s_!wF1O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf509a10-29ca-40b2-b691-31562a053c82_2400x1620.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">More damning than the pure limitations is that you can&#8217;t enforce these properties in any way.</figcaption></figure></div><p>Since not all of you are technical, let&#8217;s understand what that geometry actually represents.</p><p>Embedding similarity captures surface-level semantic proximity&#8202;&#8212;&#8202;shared language, shared concepts, shared phrasing. It does not encode legal structure. It does not know hierarchy (binding vs persuasive authority), conditional relevance (&#8220;this applies only if X is true&#8221;), or dependency chains across doctrines. Two passages can be close in embedding space and completely different in legal weight. Worse, the passages that actually control the outcome are often not the most semantically obvious ones; they sit off to the side, triggered only after certain conditions are met.</p><p><strong>So retrieval becomes biased toward what is easy to see, not what actually governs the case. 
It is a nearest-neighbor search over language, not a structured search over legal reasoning.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CI-T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88427ff6-0a1e-443b-81c3-f5c699b3be61_2400x1759.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CI-T!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88427ff6-0a1e-443b-81c3-f5c699b3be61_2400x1759.png 424w, https://substackcdn.com/image/fetch/$s_!CI-T!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88427ff6-0a1e-443b-81c3-f5c699b3be61_2400x1759.png 848w, https://substackcdn.com/image/fetch/$s_!CI-T!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88427ff6-0a1e-443b-81c3-f5c699b3be61_2400x1759.png 1272w, https://substackcdn.com/image/fetch/$s_!CI-T!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88427ff6-0a1e-443b-81c3-f5c699b3be61_2400x1759.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CI-T!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88427ff6-0a1e-443b-81c3-f5c699b3be61_2400x1759.png" width="1200" height="879.3956043956044" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/88427ff6-0a1e-443b-81c3-f5c699b3be61_2400x1759.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1067,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CI-T!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88427ff6-0a1e-443b-81c3-f5c699b3be61_2400x1759.png 424w, https://substackcdn.com/image/fetch/$s_!CI-T!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88427ff6-0a1e-443b-81c3-f5c699b3be61_2400x1759.png 848w, https://substackcdn.com/image/fetch/$s_!CI-T!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88427ff6-0a1e-443b-81c3-f5c699b3be61_2400x1759.png 1272w, https://substackcdn.com/image/fetch/$s_!CI-T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88427ff6-0a1e-443b-81c3-f5c699b3be61_2400x1759.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For the sake of argument, let&#8217;s say you somehow retrieve the right material. The second assumption breaks things further.</p><p>Autoregressive models generate text one token at a time by sampling from a conditional probability distribution. At each step, the model picks a next token, usually a high-probability one, given what it has already produced. This creates a form of path dependence: once the model starts moving in one direction, each new token reinforces that direction.</p><p>Formally, the model is approximating a sequence of conditional probabilities P(token_t | tokens_&lt;t). But decoding forces a single trajectory through that space. It does not keep multiple competing interpretations alive; it commits locally at every step.</p><p>That is fine for tasks where one answer is enough. 
It is a poor fit for the law.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T2Nb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5b5dc0-5b03-41bb-bdab-2c8646cb5ccd_1589x889.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T2Nb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5b5dc0-5b03-41bb-bdab-2c8646cb5ccd_1589x889.png 424w, https://substackcdn.com/image/fetch/$s_!T2Nb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5b5dc0-5b03-41bb-bdab-2c8646cb5ccd_1589x889.png 848w, https://substackcdn.com/image/fetch/$s_!T2Nb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5b5dc0-5b03-41bb-bdab-2c8646cb5ccd_1589x889.png 1272w, https://substackcdn.com/image/fetch/$s_!T2Nb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5b5dc0-5b03-41bb-bdab-2c8646cb5ccd_1589x889.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T2Nb!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5b5dc0-5b03-41bb-bdab-2c8646cb5ccd_1589x889.png" width="1200" height="671.7032967032967" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ee5b5dc0-5b03-41bb-bdab-2c8646cb5ccd_1589x889.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:815,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T2Nb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5b5dc0-5b03-41bb-bdab-2c8646cb5ccd_1589x889.png 424w, https://substackcdn.com/image/fetch/$s_!T2Nb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5b5dc0-5b03-41bb-bdab-2c8646cb5ccd_1589x889.png 848w, https://substackcdn.com/image/fetch/$s_!T2Nb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5b5dc0-5b03-41bb-bdab-2c8646cb5ccd_1589x889.png 1272w, https://substackcdn.com/image/fetch/$s_!T2Nb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5b5dc0-5b03-41bb-bdab-2c8646cb5ccd_1589x889.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Legal reasoning often requires holding multiple candidate interpretations in parallel&#8202;&#8212;&#8202;different readings of a statute, competing precedents, alternative fact patterns&#8202;&#8212;&#8202;before resolving which one dominates (and often the answer blends several). The hard part is not generating a fluent answer; it is managing uncertainty across paths before collapsing to one.</p><p>Autoregressive decoding collapses too early, which kills that ability. You&#8217;re stuck praying that your system will surface the reasoning that works in this context. If it doesn&#8217;t, you have no way to control it, no way to diagnose where it goes wrong, and no way to build your systems to account for it (remember that because of how LLMs work, identical inputs can produce different outputs, especially in fuzzy fields like law. 
This means you don&#8217;t even know if your model will be consistently incorrect!!).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!b_c-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceab370c-03f4-49b9-b120-419cc330cdce_2400x1953.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!b_c-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceab370c-03f4-49b9-b120-419cc330cdce_2400x1953.png 424w, https://substackcdn.com/image/fetch/$s_!b_c-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceab370c-03f4-49b9-b120-419cc330cdce_2400x1953.png 848w, https://substackcdn.com/image/fetch/$s_!b_c-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceab370c-03f4-49b9-b120-419cc330cdce_2400x1953.png 1272w, https://substackcdn.com/image/fetch/$s_!b_c-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceab370c-03f4-49b9-b120-419cc330cdce_2400x1953.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!b_c-!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceab370c-03f4-49b9-b120-419cc330cdce_2400x1953.png" width="1200" height="976.6483516483516" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ceab370c-03f4-49b9-b120-419cc330cdce_2400x1953.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1185,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!b_c-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceab370c-03f4-49b9-b120-419cc330cdce_2400x1953.png 424w, https://substackcdn.com/image/fetch/$s_!b_c-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceab370c-03f4-49b9-b120-419cc330cdce_2400x1953.png 848w, https://substackcdn.com/image/fetch/$s_!b_c-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceab370c-03f4-49b9-b120-419cc330cdce_2400x1953.png 1272w, https://substackcdn.com/image/fetch/$s_!b_c-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceab370c-03f4-49b9-b120-419cc330cdce_2400x1953.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The typical pipeline makes this worse, since it retrieves a few chunks and feeds them into the model for generation. The first coherent frame that appears becomes the answer, and everything after that is just elaboration. The system is no longer exploring the space of possible legal interpretations; it is extending the first one that looked plausible.</p><p>This isn&#8217;t unique to law, since the same problem exists everywhere. But this is where law takes a very different turn from modern agent systems that are modeled on coding assistants like Claude Code. As we discussed earlier, they rely on a powerful loop: plan, generate code, run it, test it, and fix it. The system improves because it can verify its own output against an external ground truth. 
If the code fails, the test suite gives you signal into where and why.</p><p>Legal reasoning does not have that property.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eHBQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabf16e85-269b-455b-8d74-872a43997081_2400x2050.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eHBQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabf16e85-269b-455b-8d74-872a43997081_2400x2050.png 424w, https://substackcdn.com/image/fetch/$s_!eHBQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabf16e85-269b-455b-8d74-872a43997081_2400x2050.png 848w, https://substackcdn.com/image/fetch/$s_!eHBQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabf16e85-269b-455b-8d74-872a43997081_2400x2050.png 1272w, https://substackcdn.com/image/fetch/$s_!eHBQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabf16e85-269b-455b-8d74-872a43997081_2400x2050.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eHBQ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabf16e85-269b-455b-8d74-872a43997081_2400x2050.png" width="1200" height="1025.2747252747254" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/abf16e85-269b-455b-8d74-872a43997081_2400x2050.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1244,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eHBQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabf16e85-269b-455b-8d74-872a43997081_2400x2050.png 424w, https://substackcdn.com/image/fetch/$s_!eHBQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabf16e85-269b-455b-8d74-872a43997081_2400x2050.png 848w, https://substackcdn.com/image/fetch/$s_!eHBQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabf16e85-269b-455b-8d74-872a43997081_2400x2050.png 1272w, https://substackcdn.com/image/fetch/$s_!eHBQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabf16e85-269b-455b-8d74-872a43997081_2400x2050.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Without reliable verification, iteration becomes blind. You can generate multiple drafts, add critique steps, or layer on reasoning chains, but there is no objective signal guiding the system toward correctness. The model is still selecting and extending plausible paths, not systematically exploring and validating them.</figcaption></figure></div><p>There is no equivalent of a compiler or a test suite that can reliably tell you whether you explored the right interpretations, weighed the correct authorities, or missed a controlling condition. You can only verify against outcomes you already know, which defeats the purpose. 
For novel questions&#8202;&#8212;&#8202;the ones your users will give to the system while using it&#8202;&#8212;&#8202;there is no ground truth available at generation time.</p><p>That breaks the entire agent loop.</p><p>This is why techniques that work well in code&#8202;&#8212;&#8202;tool use, execution loops, verification gates&#8202;&#8212;&#8202;degrade in legal settings. They assume a world where correctness can be checked after each step. Law is not that world.</p><p>Fine-tuning does not fix any of this. It adjusts the model&#8217;s local probabilities&#8202;&#8212;&#8202;style, phrasing, some domain priors&#8202;&#8212;&#8202;but the underlying mechanics remain unchanged. The system is still searching in a flattened semantic space and still committing to a single trajectory during generation. You get more confident answers, not more reliable ones. And as multiple pieces of research have shown, training the model in one set of capacities degrades it elsewhere, unpredictably, meaning you&#8217;re still stuck with all kinds of performance issues.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8BRW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2d3209-478e-4687-a62f-607a8a638bc8_1576x897.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8BRW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2d3209-478e-4687-a62f-607a8a638bc8_1576x897.png 424w, https://substackcdn.com/image/fetch/$s_!8BRW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2d3209-478e-4687-a62f-607a8a638bc8_1576x897.png 848w, 
https://substackcdn.com/image/fetch/$s_!8BRW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2d3209-478e-4687-a62f-607a8a638bc8_1576x897.png 1272w, https://substackcdn.com/image/fetch/$s_!8BRW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2d3209-478e-4687-a62f-607a8a638bc8_1576x897.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8BRW!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2d3209-478e-4687-a62f-607a8a638bc8_1576x897.png" width="1200" height="683.2417582417582" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cd2d3209-478e-4687-a62f-607a8a638bc8_1576x897.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:829,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8BRW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2d3209-478e-4687-a62f-607a8a638bc8_1576x897.png 424w, https://substackcdn.com/image/fetch/$s_!8BRW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2d3209-478e-4687-a62f-607a8a638bc8_1576x897.png 848w, 
https://substackcdn.com/image/fetch/$s_!8BRW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2d3209-478e-4687-a62f-607a8a638bc8_1576x897.png 1272w, https://substackcdn.com/image/fetch/$s_!8BRW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2d3209-478e-4687-a62f-607a8a638bc8_1576x897.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>These systems are built to retrieve what looks similar and then commit early to a single explanation. 
Legal reasoning, on the other hand, depends on finding what actually governs and delaying commitment until competing paths have been properly evaluated. The gap between those two is where things break.</p><h4>4.3 How We Built Irys Differently</h4><p>This section makes some major claims. The good news: you don&#8217;t have to take my word for any of them. The libraries mentioned here are fully open source, so you can verify the claims yourself:</p><ol><li><p><a href="https://github.com/dl1683/Latent-Space-Reasoning/tree/main">Latent Space Reasoning</a></p></li><li><p><a href="https://github.com/dl1683/moonshot-fractal-embeddings">Fractal Embeddings</a></p></li><li><p><a href="https://github.com/rcurrie/frac-bio-embed">Fractal Embeddings applied to Biology</a> (this was done by an external researcher with no affiliation to us; their positive results are strong evidence that our research is truly foundational, not just a coat of paint).</p></li></ol><p>If the current stack fails because it commits too early and flattens too much, then the answer isn&#8217;t better prompting. It&#8217;s to remove those constraints.</p><p>The first constraint is premature commitment. Standard decoding forces the model into a single visible path almost immediately. What you want instead is the ability to explore multiple reasoning trajectories before collapsing to an answer.</p><p>This is where Latent Space Reasoning (LSR) comes in (<a href="https://www.artificialintelligencemadesimple.com/p/how-to-teach-llms-to-reason-for-50">read about it here</a>). Instead of treating generation as one committed decoding pass, it treats reasoning as a search problem over latent trajectories. Small changes in latent states can produce entirely different lines of reasoning without changing the model weights. 
The model already contains multiple possible interpretations; standard decoding just locks you into one too early.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fMyc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bb46b8f-8bb5-42b5-8cac-dc2f9f9d0bf6_1790x1654.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fMyc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bb46b8f-8bb5-42b5-8cac-dc2f9f9d0bf6_1790x1654.jpeg 424w, https://substackcdn.com/image/fetch/$s_!fMyc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bb46b8f-8bb5-42b5-8cac-dc2f9f9d0bf6_1790x1654.jpeg 848w, https://substackcdn.com/image/fetch/$s_!fMyc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bb46b8f-8bb5-42b5-8cac-dc2f9f9d0bf6_1790x1654.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!fMyc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bb46b8f-8bb5-42b5-8cac-dc2f9f9d0bf6_1790x1654.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fMyc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bb46b8f-8bb5-42b5-8cac-dc2f9f9d0bf6_1790x1654.jpeg" width="1456" height="1345" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6bb46b8f-8bb5-42b5-8cac-dc2f9f9d0bf6_1790x1654.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1345,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fMyc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bb46b8f-8bb5-42b5-8cac-dc2f9f9d0bf6_1790x1654.jpeg 424w, https://substackcdn.com/image/fetch/$s_!fMyc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bb46b8f-8bb5-42b5-8cac-dc2f9f9d0bf6_1790x1654.jpeg 848w, https://substackcdn.com/image/fetch/$s_!fMyc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bb46b8f-8bb5-42b5-8cac-dc2f9f9d0bf6_1790x1654.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!fMyc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bb46b8f-8bb5-42b5-8cac-dc2f9f9d0bf6_1790x1654.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>You can see this directly in the results. Simple latent perturbations pushed a Qwen3-4B model&#8217;s arithmetic accuracy from 32.0 percent to 51.6 percent, without any training or fine-tuning. On planning tasks, baseline decoding can collapse into degenerate outputs (sometimes as short as 14 words), while the same model, under perturbed trajectories, produces full solutions of 650+ words.</p><p>But the more important result is qualitative, not quantitative.</p><p>Evolved latent trajectories don&#8217;t just produce better answers; they produce different ones. Entire reasoning paths appear that never show up under standard decoding: different strategies, different abstractions, different conceptual frames. This isn&#8217;t stylistic variation. 
It&#8217;s accessing parts of the model&#8217;s internal knowledge that the default decoding path never reaches.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DZ__!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0daa894-13d1-4515-b20c-f8c45fa47c12_2400x1676.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DZ__!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0daa894-13d1-4515-b20c-f8c45fa47c12_2400x1676.png 424w, https://substackcdn.com/image/fetch/$s_!DZ__!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0daa894-13d1-4515-b20c-f8c45fa47c12_2400x1676.png 848w, https://substackcdn.com/image/fetch/$s_!DZ__!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0daa894-13d1-4515-b20c-f8c45fa47c12_2400x1676.png 1272w, https://substackcdn.com/image/fetch/$s_!DZ__!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0daa894-13d1-4515-b20c-f8c45fa47c12_2400x1676.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DZ__!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0daa894-13d1-4515-b20c-f8c45fa47c12_2400x1676.png" width="1200" height="838.1868131868132" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e0daa894-13d1-4515-b20c-f8c45fa47c12_2400x1676.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1017,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DZ__!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0daa894-13d1-4515-b20c-f8c45fa47c12_2400x1676.png 424w, https://substackcdn.com/image/fetch/$s_!DZ__!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0daa894-13d1-4515-b20c-f8c45fa47c12_2400x1676.png 848w, https://substackcdn.com/image/fetch/$s_!DZ__!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0daa894-13d1-4515-b20c-f8c45fa47c12_2400x1676.png 1272w, https://substackcdn.com/image/fetch/$s_!DZ__!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0daa894-13d1-4515-b20c-f8c45fa47c12_2400x1676.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In legal reasoning, that matters. Ambiguity isn&#8217;t something to suppress&#8202;&#8212;&#8202;it&#8217;s something to explore before deciding. If the system collapses too early, entire interpretations never get surfaced at all. Our paradigm gives us the ability to tackle that.</p><p>The second constraint is flattened representation. Legal knowledge isn&#8217;t a single semantic space; it&#8217;s structured&#8202;&#8212;&#8202;jurisdiction, issue, doctrine, authority, factual fit. 
Treating all of that as one embedding space forces the system to relearn structure implicitly every time.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xEGB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3024a219-68e4-487b-9bf9-7cb2492f00c0_2160x1216.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xEGB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3024a219-68e4-487b-9bf9-7cb2492f00c0_2160x1216.png 424w, https://substackcdn.com/image/fetch/$s_!xEGB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3024a219-68e4-487b-9bf9-7cb2492f00c0_2160x1216.png 848w, https://substackcdn.com/image/fetch/$s_!xEGB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3024a219-68e4-487b-9bf9-7cb2492f00c0_2160x1216.png 1272w, https://substackcdn.com/image/fetch/$s_!xEGB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3024a219-68e4-487b-9bf9-7cb2492f00c0_2160x1216.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xEGB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3024a219-68e4-487b-9bf9-7cb2492f00c0_2160x1216.png" width="1456" height="820" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3024a219-68e4-487b-9bf9-7cb2492f00c0_2160x1216.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:820,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xEGB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3024a219-68e4-487b-9bf9-7cb2492f00c0_2160x1216.png 424w, https://substackcdn.com/image/fetch/$s_!xEGB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3024a219-68e4-487b-9bf9-7cb2492f00c0_2160x1216.png 848w, https://substackcdn.com/image/fetch/$s_!xEGB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3024a219-68e4-487b-9bf9-7cb2492f00c0_2160x1216.png 1272w, https://substackcdn.com/image/fetch/$s_!xEGB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3024a219-68e4-487b-9bf9-7cb2492f00c0_2160x1216.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A better approach is to encode that structure directly. Hierarchical representations break the problem into levels, so the system doesn&#8217;t have to solve the entire space at once. 
Instead of asking &#8220;what&#8217;s the right answer?&#8221; in one step, the system resolves a sequence&#8202;&#8212;&#8202;where does this apply, which doctrine governs, how does it map to the facts.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hiZV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee6ac2c-7a30-4a83-8dbe-2d3c76a72e06_2160x1193.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hiZV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee6ac2c-7a30-4a83-8dbe-2d3c76a72e06_2160x1193.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hiZV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee6ac2c-7a30-4a83-8dbe-2d3c76a72e06_2160x1193.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hiZV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee6ac2c-7a30-4a83-8dbe-2d3c76a72e06_2160x1193.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hiZV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee6ac2c-7a30-4a83-8dbe-2d3c76a72e06_2160x1193.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hiZV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee6ac2c-7a30-4a83-8dbe-2d3c76a72e06_2160x1193.jpeg" width="1456" height="804" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ee6ac2c-7a30-4a83-8dbe-2d3c76a72e06_2160x1193.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:804,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hiZV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee6ac2c-7a30-4a83-8dbe-2d3c76a72e06_2160x1193.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hiZV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee6ac2c-7a30-4a83-8dbe-2d3c76a72e06_2160x1193.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hiZV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee6ac2c-7a30-4a83-8dbe-2d3c76a72e06_2160x1193.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hiZV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee6ac2c-7a30-4a83-8dbe-2d3c76a72e06_2160x1193.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In the frac-bio-embed work (<em><strong>which, I feel compelled to stress again, was not done by us; an independent researcher applied our work in a new space we didn&#8217;t even design for</strong></em>), the same hierarchical approach was applied to single-cell biology (7.9 million cells across 203 types) and consistently improved performance over flat embeddings, with the largest gains at fine-grained distinctions. 
This supports our diagnosis of a real failure mode: when structure matters, flattening loses it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RTe1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08214da2-f5cb-45ca-9317-fc0b4c99f9dc_2004x822.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RTe1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08214da2-f5cb-45ca-9317-fc0b4c99f9dc_2004x822.png 424w, https://substackcdn.com/image/fetch/$s_!RTe1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08214da2-f5cb-45ca-9317-fc0b4c99f9dc_2004x822.png 848w, https://substackcdn.com/image/fetch/$s_!RTe1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08214da2-f5cb-45ca-9317-fc0b4c99f9dc_2004x822.png 1272w, https://substackcdn.com/image/fetch/$s_!RTe1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08214da2-f5cb-45ca-9317-fc0b4c99f9dc_2004x822.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RTe1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08214da2-f5cb-45ca-9317-fc0b4c99f9dc_2004x822.png" width="1456" height="597" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/08214da2-f5cb-45ca-9317-fc0b4c99f9dc_2004x822.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:597,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RTe1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08214da2-f5cb-45ca-9317-fc0b4c99f9dc_2004x822.png 424w, https://substackcdn.com/image/fetch/$s_!RTe1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08214da2-f5cb-45ca-9317-fc0b4c99f9dc_2004x822.png 848w, https://substackcdn.com/image/fetch/$s_!RTe1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08214da2-f5cb-45ca-9317-fc0b4c99f9dc_2004x822.png 1272w, https://substackcdn.com/image/fetch/$s_!RTe1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08214da2-f5cb-45ca-9317-fc0b4c99f9dc_2004x822.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Both of these changes point to the same idea.</p><p>The problem isn&#8217;t that models aren&#8217;t capable. It&#8217;s that the current stack restricts how they search and how they represent the problem. Change those, and you don&#8217;t just get better answers; you unlock reasoning paths that weren&#8217;t accessible before.</p><h4>4.5 What This Means for the Future of Legal AI</h4><p>The first wave of Legal AI got paid because it solved drudgery. Good. But Conway&#8217;s Law, the Sunk Cost Fallacy, and all the other issues we&#8217;ve elaborated on at length will keep that generation stuck in this framework, and it will have difficulty pivoting to the new architectures that will define the next generation of legal work.</p><p>To repeat our prediction: just as Cursor disrupted the market with quasi-agentic development, only to be replaced by truly agentic systems like Claude Code, the first wave of legal AI tools will be disrupted by systems that reimagine intelligence from the ground up. 
And as this happens, these incumbents will fail to rebuild effectively.</p><p>We&#8217;re already seeing the early signals of this:</p><ul><li><p>If you go to lawyertalk and other legal subreddits, we&#8217;re the most positively discussed Legal AI tool out there, by a mile. People often mention how different our answers feel compared with Harvey and similar tools, which feel like wrappers (run a deep research query on this if you don&#8217;t believe me).</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tpyS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295bbc69-f6a5-42b4-b7e4-1eedd636c670_1592x708.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tpyS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295bbc69-f6a5-42b4-b7e4-1eedd636c670_1592x708.png 424w, https://substackcdn.com/image/fetch/$s_!tpyS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295bbc69-f6a5-42b4-b7e4-1eedd636c670_1592x708.png 848w, https://substackcdn.com/image/fetch/$s_!tpyS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295bbc69-f6a5-42b4-b7e4-1eedd636c670_1592x708.png 1272w, https://substackcdn.com/image/fetch/$s_!tpyS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295bbc69-f6a5-42b4-b7e4-1eedd636c670_1592x708.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!tpyS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295bbc69-f6a5-42b4-b7e4-1eedd636c670_1592x708.png" width="1456" height="648" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/295bbc69-f6a5-42b4-b7e4-1eedd636c670_1592x708.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:648,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tpyS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295bbc69-f6a5-42b4-b7e4-1eedd636c670_1592x708.png 424w, https://substackcdn.com/image/fetch/$s_!tpyS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295bbc69-f6a5-42b4-b7e4-1eedd636c670_1592x708.png 848w, https://substackcdn.com/image/fetch/$s_!tpyS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295bbc69-f6a5-42b4-b7e4-1eedd636c670_1592x708.png 1272w, https://substackcdn.com/image/fetch/$s_!tpyS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295bbc69-f6a5-42b4-b7e4-1eedd636c670_1592x708.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.reddit.com/user/h0l0gramco/comments/1nq7buw/sept_2025_we_finished_onboarding_legal_ai/?utm_source=share&amp;utm_medium=web3x&amp;utm_name=web3xcss&amp;utm_term=1&amp;utm_content=share_button">Source</a></figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3QCB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba98e411-ffe6-4ba1-8e9a-e27a41c42866_804x282.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!3QCB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba98e411-ffe6-4ba1-8e9a-e27a41c42866_804x282.png 424w, https://substackcdn.com/image/fetch/$s_!3QCB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba98e411-ffe6-4ba1-8e9a-e27a41c42866_804x282.png 848w, https://substackcdn.com/image/fetch/$s_!3QCB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba98e411-ffe6-4ba1-8e9a-e27a41c42866_804x282.png 1272w, https://substackcdn.com/image/fetch/$s_!3QCB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba98e411-ffe6-4ba1-8e9a-e27a41c42866_804x282.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3QCB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba98e411-ffe6-4ba1-8e9a-e27a41c42866_804x282.png" width="804" height="282" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ba98e411-ffe6-4ba1-8e9a-e27a41c42866_804x282.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:282,&quot;width&quot;:804,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!3QCB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba98e411-ffe6-4ba1-8e9a-e27a41c42866_804x282.png 424w, https://substackcdn.com/image/fetch/$s_!3QCB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba98e411-ffe6-4ba1-8e9a-e27a41c42866_804x282.png 848w, https://substackcdn.com/image/fetch/$s_!3QCB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba98e411-ffe6-4ba1-8e9a-e27a41c42866_804x282.png 1272w, https://substackcdn.com/image/fetch/$s_!3QCB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba98e411-ffe6-4ba1-8e9a-e27a41c42866_804x282.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>We&#8217;ve also beaten the other players to land several big names like Blockchain.com. But more important than landing clients is retaining them. Our enterprise retention rate is 100% (we&#8217;ve never lost an enterprise client, ever). This is despite the fact that we don&#8217;t have any lock-ins or anything that forces people to stay on the platform.</p></li><li><p>Going back to our earlier discussions about how disruptive startups change the interaction paradigm/usage patterns of their users, we&#8217;ve gotten messages from several users saying that they love &#8220;sparring&#8221; with our platform, and that our sophisticated reasoning system makes discussing law with it a lot more fun (one person said that he was staying up till 2 AM playing with the platform). 
These are the kinds of use-cases that can&#8217;t be replicated easily, and we&#8217;re on the verge of some major releases to lock these in further.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YAAe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7849cfcc-e89b-48bb-b50d-0022b51f91bc_938x732.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YAAe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7849cfcc-e89b-48bb-b50d-0022b51f91bc_938x732.png 424w, https://substackcdn.com/image/fetch/$s_!YAAe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7849cfcc-e89b-48bb-b50d-0022b51f91bc_938x732.png 848w, https://substackcdn.com/image/fetch/$s_!YAAe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7849cfcc-e89b-48bb-b50d-0022b51f91bc_938x732.png 1272w, https://substackcdn.com/image/fetch/$s_!YAAe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7849cfcc-e89b-48bb-b50d-0022b51f91bc_938x732.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YAAe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7849cfcc-e89b-48bb-b50d-0022b51f91bc_938x732.png" width="938" height="732" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7849cfcc-e89b-48bb-b50d-0022b51f91bc_938x732.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:732,&quot;width&quot;:938,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YAAe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7849cfcc-e89b-48bb-b50d-0022b51f91bc_938x732.png 424w, https://substackcdn.com/image/fetch/$s_!YAAe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7849cfcc-e89b-48bb-b50d-0022b51f91bc_938x732.png 848w, https://substackcdn.com/image/fetch/$s_!YAAe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7849cfcc-e89b-48bb-b50d-0022b51f91bc_938x732.png 1272w, https://substackcdn.com/image/fetch/$s_!YAAe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7849cfcc-e89b-48bb-b50d-0022b51f91bc_938x732.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">email devansh@iqidis.ai if you want more details.</figcaption></figure></div><p><a href="https://www.irys.ai/">If you want to try the platform out for yourself, we offer a no credit card, no lock-in 7 day trial. Go to the website, sign up, try it out. You&#8217;ll see immediately how much better we are</a>.</p><p>This has been a long article, so I&#8217;ll end it here. As mentioned, if you&#8217;re a legal AI startup that disagrees with any of my analysis, or you want to stress how different you truly are, you&#8217;re more than welcome to come state your piece on this newsletter. 
I welcome all challengers.</p><p>Thank you for being here, and I hope you have a wonderful day,</p><p>Dev &lt;3</p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines.</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/why-some-startups-are-easy-to-copy?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/why-some-startups-are-easy-to-copy?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. 
</strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iqXP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ddf0923-25e0-4e7c-8459-0339b436fe64_517x273.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iqXP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ddf0923-25e0-4e7c-8459-0339b436fe64_517x273.png 424w, https://substackcdn.com/image/fetch/$s_!iqXP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ddf0923-25e0-4e7c-8459-0339b436fe64_517x273.png 848w, https://substackcdn.com/image/fetch/$s_!iqXP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ddf0923-25e0-4e7c-8459-0339b436fe64_517x273.png 1272w, https://substackcdn.com/image/fetch/$s_!iqXP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ddf0923-25e0-4e7c-8459-0339b436fe64_517x273.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iqXP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ddf0923-25e0-4e7c-8459-0339b436fe64_517x273.png" width="517" height="273" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ddf0923-25e0-4e7c-8459-0339b436fe64_517x273.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:273,&quot;width&quot;:517,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iqXP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ddf0923-25e0-4e7c-8459-0339b436fe64_517x273.png 424w, https://substackcdn.com/image/fetch/$s_!iqXP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ddf0923-25e0-4e7c-8459-0339b436fe64_517x273.png 848w, https://substackcdn.com/image/fetch/$s_!iqXP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ddf0923-25e0-4e7c-8459-0339b436fe64_517x273.png 1272w, https://substackcdn.com/image/fetch/$s_!iqXP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ddf0923-25e0-4e7c-8459-0339b436fe64_517x273.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Reach out to me</h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>Check out my other articles on Medium. : </p><p>https://machine-learning-made-simple.medium.com/</p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. 
Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[How a Leading Venture Capitalist uses AI Agents [Guest]]]></title><description><![CDATA[A skeptic&#8217;s field guide to building personal agents that are powerful, practical, and safe]]></description><link>https://www.artificialintelligencemadesimple.com/p/how-a-leading-venture-capitalist</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/how-a-leading-venture-capitalist</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Thu, 26 Mar 2026 20:51:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!mKgU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. <strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>I&#8217;m really lucky to publish this guest post from <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;James Wang&quot;,&quot;id&quot;:7343257,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7ea988e-c6f5-4b1e-9041-8a3081bccb3f_2200x2220.jpeg&quot;,&quot;uuid&quot;:&quot;0eb8f741-757d-487d-9ada-39842916f959&quot;}" data-component-name="MentionToDOM">James Wang</span>. James has been a real friend of this community and a generous supporter of our work; his analysis helps me both stay grounded and find new framings for concepts that I had overlooked as trivial. 
He has a rare habit of looking at AI from multiple angles at once: what is technically real, how things worked historically, what is economically defensible, what is actually useful, and where the danger starts hiding.</p><p>In the following guest post, James shares how he uses AI agents to get a lot of work done. He covers morning briefings, meeting pipelines, research capture, drafting workflows, permissions, context, iteration, and the security risks that show up the second these systems go live. If you&#8217;re looking for a practical guide on what you could do, this piece will help like few others.</p><p>As you read, here are a few things worth thinking about:</p><ul><li><p>How much of the result comes from the model itself, and how much comes from the system James built around it? And where will this dynamic stretch in the future?</p></li><li><p>What is the next evolution in context management, given that context does so much heavy lifting?</p></li><li><p>How do you price the trade-off between convenience and security, and is that likely to change? What would change it?</p></li></ul><p>I have no doubt that you will enjoy this piece as much as I did. If you want to find James more regularly, I would strongly recommend signing up to his newsletter below&#8212;</p><div class="embedded-publication-wrap" data-attrs="{&quot;id&quot;:243988,&quot;name&quot;:&quot;Weighty Thoughts&quot;,&quot;logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M8X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c03f2eb-10cb-4fa0-94b3-e79c07f6c9c1_256x256.png&quot;,&quot;base_url&quot;:&quot;https://weightythoughts.com&quot;,&quot;hero_text&quot;:&quot;VC on AI, deep tech, startups. Former Bridgewater, Google[x], startup founder. 
Read by top engineers, fund managers, and policymakers.&quot;,&quot;author_name&quot;:&quot;James Wang&quot;,&quot;show_subscribe&quot;:true,&quot;logo_bg_color&quot;:&quot;#ffffff&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPublicationToDOMWithSubscribe"><div class="embedded-publication show-subscribe"><a class="embedded-publication-link-part" native="true" href="https://weightythoughts.com?utm_source=substack&amp;utm_campaign=publication_embed&amp;utm_medium=web"><img class="embedded-publication-logo" src="https://substackcdn.com/image/fetch/$s_!-M8X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c03f2eb-10cb-4fa0-94b3-e79c07f6c9c1_256x256.png" width="56" height="56" style="background-color: rgb(255, 255, 255);"><span class="embedded-publication-name">Weighty Thoughts</span><div class="embedded-publication-hero-text">VC on AI, deep tech, startups. Former Bridgewater, Google[x], startup founder. Read by top engineers, fund managers, and policymakers.</div><div class="embedded-publication-author-name">By James Wang</div></a><form class="embedded-publication-subscribe" method="GET" action="https://weightythoughts.com/subscribe?"><input type="hidden" name="source" value="publication-embed"><input type="hidden" name="autoSubmit" value="true"><input type="email" class="email-input" name="email" placeholder="Type your email..."><input type="submit" class="button primary" value="Subscribe"></form></div></div><p>and getting a copy of his excellent book: <em><a href="https://www.smartaibook.com/">What You Need to Know About AI</a></em>&#8212;what&#8217;s real, what&#8217;s hype, and where this technology is headed. It&#8217;s my go-to gift for anyone who wants to know about AI, and everyone who&#8217;s read it has raved about it. 
</p><p>On with the post&#8212;</p><div><hr></div><p>My day-to-day productivity stack now often feels more like managing a small team than running tools.</p><p>I receive a morning briefing outlining emails I should respond to, tasks that are overdue, stakeholders from my CRM (customer relationship management) system that I need to follow up with, and important news stories. After a meeting, I drop the recording and files (slide decks, etc.), and an agent picks them up, classifies them, and gives me a summary (or a report, if it&#8217;s a prospective investment I&#8217;m evaluating). At the end of the day, I get a prompt to decompress, and an agent basically walks me through a 5-minute gratitude journal.</p><p>Finally, I have a research assistant agent and a drafting agent that helps me write. For the drafting agent, I&#8217;ll actually share a full end-to-end look at how this piece evolved from the agent&#8217;s draft.</p><p><a href="https://github.com/j-wang/how-i-utilize-ai-agents-article">(See the GitHub repo here for all the human/AI drafts!)</a></p><h2><strong>How to Utilize Agents on a Personal Level</strong></h2><p>This Substack isn&#8217;t meant to be a how-to or productivity blog. I&#8217;ve said that before, and I mean it.</p><p>That being said, one of the reasons I ended up feeling like I <em>needed</em> to write this piece is that no matter how much I explained the conceptual fundamentals, people didn&#8217;t really &#8220;get it.&#8221; I would tell people that they could easily use AI to substantially speed up the workflows they were complaining about (ironic, I know, given my reputation as a &#8220;relative skeptic&#8221; in the world of Substack). Then, they would type something like &#8220;help me with X&#8221; into ChatGPT, find that it was minimally effective and clunky, and give up.</p><p>The second reason is OpenClaw (previously known as Moltbot and before that as Clawdbot). 
People would hear stories of it running their entire lives&#8212;along with running crypto schemes, trading markets, trolling people, or posting on Moltbook&#8212;and go, &#8220;Ah, that&#8217;s how I do what James asked me to do.&#8221; One of the last straws was someone mentioning that OpenClaw sounded a lot like something I would/did build.</p><p>First off, I respect the hustle of the OpenClaw founder. Marketing and getting adoption for a product is often harder than building it. So, when I say that it&#8217;s both horribly flawed and not really that technically complex, I&#8217;m not trying to insult Peter Steinberger or say he doesn&#8217;t deserve whatever payday he got from OpenAI. He does.</p><p>That being said, I would hesitate to recommend running it even to those with a lot of expertise, and I would say &#8220;hell no&#8221; to anyone who doesn&#8217;t understand it. Running OpenClaw without a good understanding of it is the fast lane to having all your sensitive data (financial info, API keys, confidential data, etc.) stolen.</p><p>In this piece, I want to prove (and then illustrate with my own examples) three things:</p><ol><li><p>Agents are extremely powerful... given the right context.</p></li><li><p>Agents, without safeguards, are extraordinarily dangerous.</p></li><li><p>Agents can already revolutionize your life and work&#8212;today.</p></li></ol><p>And I&#8217;d suggest everyone <em>try</em> to use them. Why? I always refer back to the poor NYU Stern professor who is now known as the subject of Bill Gurley&#8217;s <a href="https://abovethecrowd.com/2014/07/11/how-to-miss-by-a-mile-an-alternative-look-at-ubers-potential-market-size/">&#8220;How to Miss by a Mile&#8221;</a>&#8212;rebutting the professor&#8217;s piece in FiveThirtyEight on how &#8220;Uber Isn&#8217;t Worth $17 Billion.&#8221;</p><p>Uber today is worth <a href="https://finance.yahoo.com/quote/UBER/">over $150B</a>. 
More to the point, he admitted:</p><blockquote><p><em>&#8220;As I attempt to attach a value to Uber, I have to confess that I just downloaded the app and have not used it yet. I spend most of my of life either in the suburbs, where I can go for days without seeing a taxi, or in New York City, where I find that the subways are a vastly more time-efficient, cheaper and often safer mode of transportation than taxis.&#8221;</em></p></blockquote><p>Putting aside today&#8217;s insults of being an out-of-touch, elite NYC urbanite, it is <em>truly</em> difficult to understand a new technology without trying it. If you&#8217;re either worried or skeptical of AI agents... well, go find out for yourself.</p><h2><strong>We Are Unquestionably Here</strong></h2><p>The definitive introductory textbook for AI, <em>Artificial Intelligence: A Modern Approach</em>&#8212;often simply referred to as &#8220;Russell-Norvig&#8221; after its authors&#8212;defines artificial intelligence research as &#8220;the study and design of rational agents.&#8221;</p><p>I often push back against those who claim artificial general intelligence (AGI) is right around the corner. That being said, we unquestionably now have AI as defined by Russell-Norvig.</p><p>While you could quibble and argue with how much &#8220;machine learning&#8221; or simply statistical methods fell into that category, one would be hard pressed to not call what we have with recent, modern models &#8220;rational agents.&#8221; Rational, after all, doesn&#8217;t mean perfect or infallible&#8212;which would disqualify even humans if we required that!</p><p>The jump has been from models operating in a chat box&#8212;where they may be &#8220;rational,&#8221; but <em>not</em> agents&#8212;to operating in the real world. Or, at minimum, our working digital world, where our calendars, journals, memos, and everything else now exist. 
As I wrote in <a href="https://weightythoughts.com/p/the-boring-phase-of-ai">&#8220;The Boring Phase of AI,&#8221;</a> this shift from chatting to <em>doing</em> is both less flashy and ultimately much more important.</p><h2><strong>Examples of My Setups</strong></h2><p>Ok, enough boring &#8220;blah, blah, blah, principles/safety/etc.&#8221; I suppose I now have to show examples of what I mean. Obviously, I <em>do</em> use Claude Code (and Codex) as coding agents. But I use them for far more as well.</p><h3><strong>The Morning Briefing</strong></h3><p>Let&#8217;s do this in a front-loaded fashion. The concepts in this one translate to most of the others.</p><p>Every morning, I get a neat briefing. It basically looks like a personal assistant put it together. Let&#8217;s show rather than tell.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YpBa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YpBa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg 424w, https://substackcdn.com/image/fetch/$s_!YpBa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg 848w, https://substackcdn.com/image/fetch/$s_!YpBa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!YpBa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YpBa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg" width="1304" height="1618" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1618,&quot;width&quot;:1304,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:300688,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://weightythoughts.com/i/189494598?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!YpBa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg 424w, https://substackcdn.com/image/fetch/$s_!YpBa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!YpBa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!YpBa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13029b41-8024-4057-9a60-8ecb4d20b496_1304x1618.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Morning briefing on my iPad, piped to me in Day One, though it is also available in my DevonThink 
databases.</figcaption></figure></div><p>It also flags for me what emails I should respond to and even points out things I should do (e.g., block out the morning for a long judging session).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BK4I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BK4I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png 424w, https://substackcdn.com/image/fetch/$s_!BK4I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png 848w, https://substackcdn.com/image/fetch/$s_!BK4I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png 1272w, https://substackcdn.com/image/fetch/$s_!BK4I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BK4I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png" width="968" height="699" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:699,&quot;width&quot;:968,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:257320,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://weightythoughts.com/i/189494598?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!BK4I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png 424w, https://substackcdn.com/image/fetch/$s_!BK4I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png 848w, https://substackcdn.com/image/fetch/$s_!BK4I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png 1272w, https://substackcdn.com/image/fetch/$s_!BK4I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc746c75e-d7b1-4501-8358-8f8e3bce44a7_968x699.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">This is also in Day One, though I used the mobile view to show more of it. Yes, I know those are overdue tasks. Claude literally tells me every morning. I&#8217;ve been busy. Don&#8217;t judge me.</figcaption></figure></div><p>How does it work? Well, every morning I launch a cron job<strong>*</strong>, which is basically a scheduled script. I use Claude Code in prompt mode, where I can pass it a prompt &#8220;non-interactively&#8221; (basically, I don&#8217;t need to chat with it). 
I give it <code>--dangerously-skip-permissions</code> so it can run tools without pausing to ask me for approval.</p><pre><code>/opt/homebrew/bin/claude -p "$MODE briefing for $DATE" --output-format text --dangerously-skip-permissions --max-turns 25 &gt;&gt; "$LOG_FILE" 2&gt;&amp;1</code></pre><p>For the non-technical, this looks like an arcane summoning to hell, but the key portions of it are:</p><ul><li><p><code>claude</code> (the command-line command for Claude Code; I just give the &#8220;full path&#8221; to it)</p></li><li><p><code>-p</code> (short for <code>--print</code>; this is the non-interactive prompt mode)</p></li><li><p><code>--dangerously-skip-permissions</code> (as the name suggests, this is dangerous, but it is required so it can run without me)</p></li></ul><p>We&#8217;ll cover the danger later, but I restrict what it can do and access. I have my own MCP (Model Context Protocol) servers it can access, and most published MCP servers (like Google&#8217;s for Gmail and Calendar) are cautious and have a level of protection.</p><p>Think of them as tools that the model can call, with pre-specified functions and a fixed scope of what can be done with them.</p><p>For example, the reason I need to push my briefings to Day One and DEVONthink is that Gmail will not allow a model using an MCP to actually send an email; the best it can do is a draft.</p><p>But how does this headless, prompt-mode Claude Code know what to do?</p><p>Two files in the directory make all of this work:</p><ul><li><p><strong>CLAUDE.md</strong>: instructions that Claude Code reads to know what to do</p></li><li><p><strong>settings.json</strong>: a configuration file for what MCPs Claude Code has access to (and what it is denied) <em><strong>EDIT: This should actually be under .claude/settings.json&#8212;I originally moved it so it&#8217;d be visible (dot files are hidden) and forgot to note this.</strong></em></p></li></ul><div 
class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mKgU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mKgU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg 424w, https://substackcdn.com/image/fetch/$s_!mKgU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg 848w, https://substackcdn.com/image/fetch/$s_!mKgU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!mKgU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mKgU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg" width="497" height="434" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:434,&quot;width&quot;:497,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:39050,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://weightythoughts.com/i/189494598?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!mKgU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg 424w, https://substackcdn.com/image/fetch/$s_!mKgU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg 848w, https://substackcdn.com/image/fetch/$s_!mKgU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!mKgU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f2acca-0e69-46b1-a3d7-130e334922fc_497x434.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Folder structure, which you can already see has some of the other things in this screenshot. 
NOTE: see note above about where settings.json should actually live.</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SNFm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SNFm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg 424w, https://substackcdn.com/image/fetch/$s_!SNFm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SNFm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SNFm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SNFm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg" width="1248" height="1548" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1548,&quot;width&quot;:1248,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:310434,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://weightythoughts.com/i/189494598?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!SNFm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg 424w, https://substackcdn.com/image/fetch/$s_!SNFm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SNFm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SNFm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58e3873c-1302-4c38-b439-a3c07a2324f9_1248x1548.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">CLAUDE.md, which is a standard instruction file for Claude Code for relevant context on a coding project... or a briefing.</figcaption></figure></div><p>settings.json:</p><pre><code><code>{
  "permissions": {
    "allow": [
      "mcp__gmail__*",
      "mcp__google_calendar__*",
      "mcp__hubspot__*",
      "mcp__omnifocus__*",
      "mcp__devonthink__*",
      "mcp__dayone__*",
      "Read(*)",
      "Write(~/Documents/briefings/*)"
    ],
    "deny": [
      "Bash(*)",
      "Write(~/.claude/*)"
    ]
  }
}
</code></code></pre><p>As you can see, it gets access to my email, calendar, HubSpot (CRM), OmniFocus (task manager), DEVONthink (personal database&#8212;which you&#8217;ll see I use a lot, because it&#8217;s flexible... and I&#8217;ve used it for years... but also because I can more safely read/write to it with backups), and Day One (journal). It can read files, but it can write only to a specific folder (for logging). It cannot arbitrarily run scripts.</p><p>I don&#8217;t actually allow it to access websites or &#8220;call out.&#8221; All of the news items are <em>from my email</em> (and only specific items are forwarded to it). You <em>never</em> want it to search and land on arbitrary websites.</p><p>And I&#8217;m not even super comfortable with this, or done securing it. I&#8217;m figuring out the best way to sandbox this even further so I can granularly deny <em>all</em> network access beyond what my MCPs require (many of which are hosted on my own domain, because I wrap them to allow me to use them anywhere with Google OAuth/login). If anyone has suggestions, let me know!</p><p><em>*It&#8217;s actually a launchd job on Mac because of auth token issues, but that&#8217;s not super important.</em></p><h3><strong>Capturing Substack Articles On-The-Go</strong></h3><p>While the morning briefing runs automatically, I also use Claude Code on the go for various tasks. 
I use tmux&#8212;a terminal multiplexer that keeps sessions alive&#8212;so I can connect remotely via WireGuard (VPN to my home network) and Blink.sh on iOS.</p><p>Non-tech translation: I securely connect to my home network and remote-access Claude Code running on my computer from my phone or iPad.</p><p>For any tool, the folder structure is basically the same: detailed instructions in CLAUDE.md and a settings.json defining what is and isn&#8217;t allowed.</p><p>One workflow I use frequently is capturing Substack articles into my DEVONthink database. If you&#8217;re an avid Substack reader, you know the pain: not all publications have dark mode, paid posts require login, and you&#8217;re out of luck without internet. My Claude Code session takes a URL, opens it in my logged-in Substack Reader on Chrome, and captures it as a Web Archive in DEVONthink&#8217;s &#8220;To Read&#8221; inbox&#8212;where I can read it at my leisure, offline, in dark mode.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!exTb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!exTb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg 424w, https://substackcdn.com/image/fetch/$s_!exTb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!exTb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!exTb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!exTb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg" width="1326" height="1148" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1148,&quot;width&quot;:1326,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:172391,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://weightythoughts.com/i/189494598?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!exTb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!exTb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg 848w, https://substackcdn.com/image/fetch/$s_!exTb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!exTb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cd43e1-b743-4d3d-b375-53caaa968f6e_1326x1148.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The tmux-based capture session, accessible remotely.</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hPXE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hPXE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hPXE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hPXE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hPXE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hPXE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg" width="1044" height="752" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:752,&quot;width&quot;:1044,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:167140,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://weightythoughts.com/i/189494598?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!hPXE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hPXE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hPXE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hPXE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2c3174-8b8a-4e32-9c1f-9d858923cc0e_1044x752.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A captured article in DEVONthink, and an excellent one at that from Nathan Lambert recently</figcaption></figure></div><p>(<a href="https://code.claude.com/docs/en/remote-connections">Anthropic recently released Claude Code &#8220;remote connections,&#8221;</a> which lets you connect to a running session from the web or another device. Most people won&#8217;t need my tmux setup&#8212;remote connections are easier. Though mine still has some advantages in control and flexibility.)</p><p>As you can see, Claude Code in folders doesn&#8217;t have to be about coding. It can just be structures for automation. 
Most of what I&#8217;ve described here isn&#8217;t writing code in any traditional sense&#8212;it&#8217;s setting up context and instructions for the agent to follow.</p><h3><strong>Writing: Research and Junior Drafting</strong></h3><p>This is the use case I get asked about the most. I&#8217;ve written about using AI as a writing aid before in <a href="https://weightythoughts.com/p/ai-as-a-partner-not-a-replacement">&#8220;AI as a Partner, Not a Replacement,&#8221;</a> where I was still figuring things out.</p><p>I credit inspiration from <a href="https://open.substack.com/users/6970039-alejandro-piad-morffis?utm_source=mentions">Alejandro Piad Morffis</a>&#8217;s <a href="https://blog.apiad.net/p/using-ai-to-augment-not-automate">early example with CODER</a>, though having done it myself and helped others set it up, you really have to set it up for <em>yourself,</em> and it&#8217;s almost impossible to get something off-the-shelf and have it be any good.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Whf3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Whf3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Whf3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!Whf3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Whf3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Whf3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg" width="1068" height="736" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:736,&quot;width&quot;:1068,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:140073,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://weightythoughts.com/i/189494598?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Whf3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!Whf3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Whf3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Whf3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dff15c5-236b-46dc-816e-5c941e8e2b53_1068x736.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The CLAUDE.md file&#8212;a comprehensive set of writing guidelines that the AI references for every draft. In this case, in DevonThink, so I can actually access this on Claude&#8217;s mobile app or claude.ai, not just Claude Code.</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TQ2u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TQ2u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg 424w, https://substackcdn.com/image/fetch/$s_!TQ2u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg 848w, https://substackcdn.com/image/fetch/$s_!TQ2u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!TQ2u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TQ2u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg" width="1068" 
height="736" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:736,&quot;width&quot;:1068,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:226736,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://weightythoughts.com/i/189494598?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!TQ2u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg 424w, https://substackcdn.com/image/fetch/$s_!TQ2u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg 848w, https://substackcdn.com/image/fetch/$s_!TQ2u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!TQ2u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c62f20d-574a-4c6a-8e9a-d5ec8bcad9fe_1068x736.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The guidelines include specifics about my voice, tone, common phrases, and what to avoid. Apparently I really like to say, &#8220;Let&#8217;s be clear...&#8221; Also, &#8220;I&#8217;ve written about this before&#8230;&#8221; There&#8217;s no burn as bad as an AI trying to be helpful.</figcaption></figure></div><p>Here&#8217;s how it works: I still need to write the outline of what I want, with the broad notes to hit. I still need to go through and substantially edit&#8212;asking for different charts and data, checking math (which is often wrong), probing claims, and changing a decent amount of the text for my own tone.</p><p>It&#8217;s more akin to an intern helping do a first draft than a ghostwriter.</p><p>Is it useful? Yes. 
<a href="https://github.com/j-wang/how-i-utilize-ai-agents-article">But as you can see for yourself in the GitHub repo for this article...</a> things do change substantially. There&#8217;s no real question as to who is the author in any meaningful sense, though. Still, it actually saves me time and mental energy, whereas before it was totally useless.</p><p><em>Why is it helpful at all, though</em>? It took a few things:</p><ol><li><p><strong>Iteration</strong>. Every time it fails to get close to my writing, I ask it to assess and then <em>iterate</em> on the CLAUDE.md to try to have future attempts get closer. This is why I&#8217;m not posting the full CLAUDE.md. While the content might not be me, I&#8217;m also not looking for impersonators who sound like me.</p></li><li><p><strong>Research</strong>. Remember how I capture articles I like in DEVONthink? Well, guess where I ask my AI agent (Claude, but it doesn&#8217;t have to be) to research? I have a &#8220;blessed canon&#8221; that is preferentially drawn upon. <strong>It literally also has all my published articles!</strong> They are well-organized in categories and with tags so it can not only research using my own work but also draw on <em>how I wrote about certain topics</em>. This is like RAG (retrieval-augmented generation)&#8212;actually, it is <em>literally</em> RAG by most definitions.</p></li><li><p><strong>Organization</strong>. If you look at this article now and what I started with&#8212;my personal notes (initial sketch), Initial Draft (Claude&#8217;s first shot), and eventually my final article... well, perhaps the <em>only</em> thing that really survived is some rough organization. That&#8217;s fine. It&#8217;s relatively rote, low-differentiation (most good articles are organized similarly), and time-consuming for me... but fast for an AI agent. 
Exactly the kind of thing you want it to help with.</p></li></ol><p>Again, check out the <a href="https://github.com/j-wang/how-i-utilize-ai-agents-article">GitHub repo</a> yourself to see how this played out over time. Usually it isn&#8217;t quite as &#8220;bad&#8221; as this one&#8212;you&#8217;ll see the initial draft, and my edits are <em>big</em>. But remember, I don&#8217;t write these kinds of articles often, so the agent has few examples like this to draw on and understandably does worse.</p><h3><strong>Meeting Notes</strong></h3><p>This was the recent example that blew people away.</p><p>I use a Sony ICD-UX570 recorder for meetings&#8212;it is more private, more controlled, and cheaper, has better microphones, and is more flexible in every regard than AI notetakers like Plaud. A side benefit is that young people think it&#8217;s a Walkman.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!r1W5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r1W5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!r1W5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!r1W5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!r1W5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r1W5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg" width="1456" height="1092" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2123638,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://weightythoughts.com/i/189494598?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!r1W5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!r1W5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!r1W5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!r1W5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c1a91b9-6bcb-44c3-bd0a-c30c20b90dc5_4032x3024.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I have a monitored folder where I upload MP3 recordings. 
From there, a Claude Code instance has detailed instructions to:</p><ol><li><p><strong>Send them all to Deepgram</strong>: it has better privacy and its ability to deal with noise is excellent (unfortunately, Whisper from OpenAI, which you can run from your own computer... is not)</p></li><li><p><strong>Read all extra documents in the folder</strong>: this includes PDFs of pitch decks, track records, whatever.</p></li><li><p><strong>Use this context to classify it</strong>: is it a memo to myself, an (investment) manager meeting, a portfolio company meeting, or an internal meeting?</p></li><li><p><strong>Ask James who each speaker is if it is not clear</strong>: often, pitch decks, board meetings, and whatnot have enough context, but if not, give me context and ask who it is.</p></li><li><p><strong>Create summaries</strong>: classification isn&#8217;t just for kicks&#8212;I&#8217;ve written detailed guides on what I care about and <em>how</em> to summarize for each type of meeting.</p></li></ol><p>For steps 4 and 5, I have CLAUDE.md specify that subagents be dispatched for each meeting. It takes too long otherwise. However, this obviously <em>sets tokens on fire</em> in terms of total use in a short period of time.</p><p>And no, I never record secretly. While Granola and similar services have made it common, I think it&#8217;s in quite bad taste (not to mention illegal in many jurisdictions). Additionally, few people have issues with it these days&#8212;and even less so with my old-school recorder.</p><p>But, as you might have noted by this point, what separates OK agents from supercharged ones is the right <em>context</em>. I recently went to Global Alts in Miami to help out in selecting managers for investment. I had 17 meetings. Each had a <em>huge</em> amount of content.</p><p>That was 8.5 hours of meetings. Literally hundreds of pages of documents.</p><p>It usually takes weeks to process it all. 
I did it in hours and compiled a report on all managers, including summaries, analyses, comparisons, etc. The report itself is too sensitive, but just to give you the table of contents:</p><ol><li><p><strong>Executive Summary</strong></p></li><li><p><strong>Manager Overview Table</strong> (Managers, Strategy, AUM, Interest, Bucket, Metric, Next Steps)</p></li><li><p><strong>Priority Tier: Advance to Diligence</strong> (lists each meeting with Summary, Why It&#8217;s Interesting, Key Risks, and Next Steps)</p></li><li><p><strong>Monitor Tier: Follow Up Required</strong> (lists each meeting with Summary, Why It Warrants Follow Up, Key Open Questions, Next Steps)</p></li><li><p><strong>Pass Tier: Not A Fit</strong> (lists each meeting with James&#8217;s Assessment, Why It&#8217;s a Pass, Note, Action)</p></li><li><p><strong>Portfolio Construction Observations</strong> (with Bucket Mapping and Capacity, Redundancy Analysis, Diversification Gaps, Correlation Analysis)</p></li><li><p><strong>Consolidated Action Items</strong> (tables of 10-20 follow-ups for each of Immediate, Near-Term, Analytical [Before Follow-up Calls], Medium-Term, Deferred/Monitor)</p></li><li><p><strong>Appendix</strong> (Risk Factor Summary [large matrix analysis with beta, crowding, illiquidity... etc.], and Key Takeaways from Risk Matrix with notes for each manager)</p></li></ol><p>It&#8217;s a 30-page report that took a little over a million tokens and a few hours. It floored people. 
But, again, the magic was:</p><ul><li><p>It had context from my recordings with my detailed questions</p></li><li><p>It had context from the documents/PDFs from the managers</p></li><li><p>It had context from my own voice memos summarizing my thoughts on sets of managers later</p></li><li><p>It had context from investment policy docs from the organization I was helping</p></li><li><p>It had context from <em>my</em> evaluation criteria that I wrote up in exhaustive detail (in an EVALUATION.md) <em>before</em> my meetings</p></li></ul><p>Again, I didn&#8217;t just go, &#8220;Hey, go summarize stuff.&#8221;</p><h2><strong>Takeaways</strong></h2><ol><li><p><strong>Context is king&#8212;and sometimes takes a lot of work.</strong> None of these things magically work well. I needed to tune both the instructions <em>and</em> what the agent had access to in order to get good results. Some of this might have been prompt engineering, RAG, or whatnot, but I see it as all being <em>context</em>.</p></li><li><p><strong>Iteration is required.</strong> I&#8217;ve rarely seen an agent start amazing out of the gate. All of these processes (and others I haven&#8217;t shown) have required constant iteration of instructions/context. Sometimes it&#8217;s automated (modify CLAUDE.md to incorporate learnings). Sometimes it&#8217;s me giving better data sources. But still, you need to be judicious about this.</p></li><li><p><strong>It&#8217;s dangerous.</strong> Notice, I redacted things even for certain screenshots here. You bet the agents have access to <em>extraordinarily sensitive</em> information. And I don&#8217;t even do OpenClaw yolo things like give it access to my bank account/credit card (yes, people have done that with OpenClaw)! With great power comes a huge amount of f*cking security risk.</p></li></ol><p>I expect there will be better off-the-shelf agentic products in the future that require less tinkering and are safer. For most... 
I&#8217;d stick to the safer side of things.</p><p>For example, meeting summarization can be pretty safe and highly effective if you apply some of these ideas with Claude Code (or Claude Cowork now, which is basically the same) and have <em>clear, detailed instructions</em>&#8212;pages of them, not paragraphs of them&#8212;and sufficient context. No need to let your agents trawl the web or deal with setting up OAuth for your MCPs (like I did).</p><p>Really powerful. Really scary.</p><h2><strong>Security: Please, Don&#8217;t Be Reckless</strong></h2><p>Let me expand on that last point, because it&#8217;s important enough to warrant its own section. <a href="https://open.substack.com/users/5753967-simon-willison?utm_source=mentions">Simon Willison</a> has excellent pieces on one of the biggest issues: <a href="https://simonwillison.net/series/prompt-injection/">prompt injection</a>.</p><p>There is simply <em>no good way</em> to fully prevent LLMs from reading rogue, malicious instructions and <em>potentially</em> acting on them.</p><p>The biggest worry is exfiltration. Say you have your personal schedule, API keys, passwords, whatever in your agent&#8217;s context (sometimes you can&#8217;t avoid it&#8212;power and access often require keys to do useful things). What if an attacker embeds an instruction to send all of it to a website or location they control?</p><p>Major model companies have been working on making their models resistant to this... but it&#8217;s not perfect. My own setup is <em>very</em> not perfect. There are security holes I already know about, and that&#8217;s <em>with</em> a lot of custom tuning, firewall settings, sandbox isolation, etc.</p><p>If you aren&#8217;t sure how to think about this, err on the side of caution. Don&#8217;t YOLO with something like OpenClaw, where <a href="https://www.trendmicro.com/en_us/research/26/b/openclaw-skills-used-to-distribute-atomic-macos-stealer.html">a bunch of auto-installable skills are actually malicious</a>. 
Yeah, sure, OpenClaw can be powerful. But that&#8217;s like going around and picking up used needles off the street and injecting yourself, hoping for a high.</p><p>You can get a lot of power already with some of the principles I outlined&#8212;yeah, it&#8217;s more work, but I expect it will work better... and it will be infinitely safer.</p><h2><strong>So What Does This All Mean?</strong></h2><p><a href="https://edition.cnn.com/2026/02/26/business/block-layoffs-ai-jack-dorsey">Block (previously Square) just laid off roughly half their workforce</a>. <a href="https://open.substack.com/users/86606269-citrini?utm_source=mentions">Citrini</a> Research <a href="https://www.citriniresearch.com/p/2028gic">published a thought piece</a> that essentially crashed various stocks by painting a people-less future as AI takes over (which I disagree with). At the same time, a refrain on social media is &#8220;AI adds nothing in productivity&#8212;it&#8217;s been shown in studies.&#8221;</p><p><a href="https://weightythoughts.com/p/white-collar-apocalypse-isnt-around">My last piece directly rebuts the &#8220;nothing in productivity&#8221; point.</a> Studies don&#8217;t show that. Even in fairly conservative cases (GitHub Copilot benchmarks from over a year ago), we clearly see productivity gains across different studies. This also fits what I wrote in <a href="https://weightythoughts.com/p/the-boring-phase-of-ai">&#8220;The Boring Phase of AI&#8221;</a>&#8212;AI that does real tasks is both less flashy and likely much more important than the headline-grabbing model releases. That future is already here. We&#8217;re just getting started.</p><p>And I also think it&#8217;s highly unlikely to cause a dystopian scenario of persistent mass unemployment. (Note: I didn&#8217;t say it won&#8217;t cause disruption, especially in the short term.)</p><p><strong>Random Note: On Model Choice</strong></p><p>Ironically, I use Opus for almost everything, even when it&#8217;s overkill. 
The one place I don&#8217;t is Claude Code&#8212;almost all implementation work (note: not planning or my &#8220;main&#8221; chat, but the actual execution) runs on Sonnet or Haiku, their smaller &#8220;dumber&#8221; models.</p><p>I think it tells you something that the most &#8220;permissive&#8221; case for tolerating errors and dumbness is code. Code can be validated against tests, and I&#8217;m generally going to review each pull request anyway. That perhaps says something about why AI adoption is happening so quickly in the coding realm.</p><div><hr></div><p>Thank you for being here, and I hope you have a wonderful day.</p><p>Dev &lt;3</p><p><em>I provide various consulting and advisory services. If you&#8217;d like to explore how we can work together, <a href="https://linktr.ee/iseethings404">reach out to me through any of my socials over here</a> or reply to this email.</em></p><p>I put a lot of work into writing this newsletter. To do so, I rely on you for support. If a few more people choose to become paid subscribers, the Chocolate Milk Cult can continue to provide high-quality and accessible education and opportunities to anyone who needs it. If you think this mission is worth contributing to, please consider a premium subscription. 
You can do so for less than the cost of a Netflix Subscription <a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">(pay what you want here)</a>.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YNaT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc25f1ec2-9373-45c3-91b1-ce53cf28d7b5_454x107.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YNaT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc25f1ec2-9373-45c3-91b1-ce53cf28d7b5_454x107.png 424w, https://substackcdn.com/image/fetch/$s_!YNaT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc25f1ec2-9373-45c3-91b1-ce53cf28d7b5_454x107.png 848w, https://substackcdn.com/image/fetch/$s_!YNaT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc25f1ec2-9373-45c3-91b1-ce53cf28d7b5_454x107.png 1272w, https://substackcdn.com/image/fetch/$s_!YNaT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc25f1ec2-9373-45c3-91b1-ce53cf28d7b5_454x107.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!YNaT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc25f1ec2-9373-45c3-91b1-ce53cf28d7b5_454x107.png" width="454" height="107" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c25f1ec2-9373-45c3-91b1-ce53cf28d7b5_454x107.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:107,&quot;width&quot;:454,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YNaT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc25f1ec2-9373-45c3-91b1-ce53cf28d7b5_454x107.png 424w, https://substackcdn.com/image/fetch/$s_!YNaT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc25f1ec2-9373-45c3-91b1-ce53cf28d7b5_454x107.png 848w, https://substackcdn.com/image/fetch/$s_!YNaT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc25f1ec2-9373-45c3-91b1-ce53cf28d7b5_454x107.png 1272w, https://substackcdn.com/image/fetch/$s_!YNaT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc25f1ec2-9373-45c3-91b1-ce53cf28d7b5_454x107.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and 
wish to share it, please refer to the following guidelines.</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/the-5b-lifeline-or-the-leash-guest?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjo4MTAxNzI0LCJwb3N0X2lkIjoxNzY2MDg5NjUsImlhdCI6MTc2MjA1MDcxOCwiZXhwIjoxNzY0NjQyNzE4LCJpc3MiOiJwdWItMTMxNTA3NCIsInN1YiI6InBvc3QtcmVhY3Rpb24ifQ.0uL2t9TVCTZI5iAbPA3VagaLXvUBzFF061Jddc8CcDo&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.artificialintelligencemadesimple.com/p/the-5b-lifeline-or-the-leash-guest?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjo4MTAxNzI0LCJwb3N0X2lkIjoxNzY2MDg5NjUsImlhdCI6MTc2MjA1MDcxOCwiZXhwIjoxNzY0NjQyNzE4LCJpc3MiOiJwdWItMTMxNTA3NCIsInN1YiI6InBvc3QtcmVhY3Rpb24ifQ.0uL2t9TVCTZI5iAbPA3VagaLXvUBzFF061Jddc8CcDo"><span>Share</span></a></p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. 
</strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><h3><strong>Reach out to me</strong></h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>https://machine-learning-made-simple.medium.com/</p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[The Future of On-Device AI]]></title><description><![CDATA[How Liquid AI built the Best Edge AI Model in the World. 
And what this reveals about the next wave of model design.]]></description><link>https://www.artificialintelligencemadesimple.com/p/the-future-of-on-device-ai</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/the-future-of-on-device-ai</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Tue, 24 Mar 2026 06:54:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!siaF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. <strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>On-device AI is the obvious end state: more privacy, no network latency, lower inference costs, and more control. Every major chipmaker is already shipping NPUs into phones, laptops, and edge devices. The hardware is here.</p><p>So why doesn&#8217;t it work yet? Why is the on-device experience still mostly parlor tricks (autocomplete, photo cleanup, canned summaries) instead of a real model doing meaningful work on a phone?</p><p>It&#8217;s not because models are too big in the way people think. A 1B-parameter model can be compressed to a few hundred megabytes. 
That fits.</p><p>The problem shows up when the model starts <em>thinking</em>.</p><p>Transformers don&#8217;t just store weights; they store a growing memory of the conversation&#8202;&#8212;&#8202;the KV cache. Every new token adds to it, across every attention layer. For a small model like Llama 3.2 1B, that cache alone reaches ~524 MB at a 32K context. In practice, a single long interaction pushes total memory past a gigabyte. The bottleneck isn&#8217;t raw compute.</p><p><strong>It&#8217;s memory that grows every time the model is used.</strong></p><p>The transformer was designed for data centers where memory is abundant, and billing is per-token. On a phone, it is a structural mismatch. The architecture treats memory as infinite. The device does not.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yV36!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19118dc7-81d8-404b-8526-3fae05db7624_1356x1256.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yV36!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19118dc7-81d8-404b-8526-3fae05db7624_1356x1256.png 424w, https://substackcdn.com/image/fetch/$s_!yV36!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19118dc7-81d8-404b-8526-3fae05db7624_1356x1256.png 848w, https://substackcdn.com/image/fetch/$s_!yV36!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19118dc7-81d8-404b-8526-3fae05db7624_1356x1256.png 1272w, 
https://substackcdn.com/image/fetch/$s_!yV36!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19118dc7-81d8-404b-8526-3fae05db7624_1356x1256.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yV36!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19118dc7-81d8-404b-8526-3fae05db7624_1356x1256.png" width="1356" height="1256" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/19118dc7-81d8-404b-8526-3fae05db7624_1356x1256.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1256,&quot;width&quot;:1356,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yV36!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19118dc7-81d8-404b-8526-3fae05db7624_1356x1256.png 424w, https://substackcdn.com/image/fetch/$s_!yV36!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19118dc7-81d8-404b-8526-3fae05db7624_1356x1256.png 848w, https://substackcdn.com/image/fetch/$s_!yV36!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19118dc7-81d8-404b-8526-3fae05db7624_1356x1256.png 1272w, 
https://substackcdn.com/image/fetch/$s_!yV36!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19118dc7-81d8-404b-8526-3fae05db7624_1356x1256.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Liquid AI&#8217;s LFM2 changes this. Their 1.2-billion-parameter model runs on that same Galaxy S25 with the full 32,000-token context in 719 megabytes total. The model generates seventy tokens per second on the CPU, using the same quantization scheme as above. 
It is a fundamentally different answer to the question of which operations should persist in memory and which should not.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X9Yd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fef0160-f232-4d53-ab10-2ec0b4d02228_1594x589.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X9Yd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fef0160-f232-4d53-ab10-2ec0b4d02228_1594x589.png 424w, https://substackcdn.com/image/fetch/$s_!X9Yd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fef0160-f232-4d53-ab10-2ec0b4d02228_1594x589.png 848w, https://substackcdn.com/image/fetch/$s_!X9Yd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fef0160-f232-4d53-ab10-2ec0b4d02228_1594x589.png 1272w, https://substackcdn.com/image/fetch/$s_!X9Yd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fef0160-f232-4d53-ab10-2ec0b4d02228_1594x589.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X9Yd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fef0160-f232-4d53-ab10-2ec0b4d02228_1594x589.png" width="1456" height="538" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8fef0160-f232-4d53-ab10-2ec0b4d02228_1594x589.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:538,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!X9Yd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fef0160-f232-4d53-ab10-2ec0b4d02228_1594x589.png 424w, https://substackcdn.com/image/fetch/$s_!X9Yd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fef0160-f232-4d53-ab10-2ec0b4d02228_1594x589.png 848w, https://substackcdn.com/image/fetch/$s_!X9Yd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fef0160-f232-4d53-ab10-2ec0b4d02228_1594x589.png 1272w, https://substackcdn.com/image/fetch/$s_!X9Yd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fef0160-f232-4d53-ab10-2ec0b4d02228_1594x589.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In this article, we will walk through how Liquid AI built the best edge model in the world, from scratch. By the end, you will understand the following:</p><ul><li><p>Why transformers become expensive on edge devices (memory, bandwidth, KV cache growth)</p></li><li><p>What alternatives exist, and what they give up (SSMs, linear attention, hybrids)</p></li><li><p>How Liquid AI&#8217;s architecture works and why it&#8217;s different from the status quo</p></li><li><p>Why their search system (STAR) may matter more than the model itself</p></li><li><p>What the benchmarks and real-device performance actually show</p></li><li><p>Where the economics flip from cloud to on-device</p></li><li><p>What could break this entire thesis</p></li></ul><p>As with our other deep dives, you won&#8217;t need any prior knowledge. I will walk you through everything you need to know, from the ground up. 
The goal is not just to explain how one strong edge model works, but to show where low-cost, high-efficiency AI is heading next.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CsDi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F243a5323-ec17-4281-b0f5-970e1212d888_1600x900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CsDi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F243a5323-ec17-4281-b0f5-970e1212d888_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!CsDi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F243a5323-ec17-4281-b0f5-970e1212d888_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!CsDi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F243a5323-ec17-4281-b0f5-970e1212d888_1600x900.png 1272w, https://substackcdn.com/image/fetch/$s_!CsDi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F243a5323-ec17-4281-b0f5-970e1212d888_1600x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CsDi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F243a5323-ec17-4281-b0f5-970e1212d888_1600x900.png" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/243a5323-ec17-4281-b0f5-970e1212d888_1600x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CsDi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F243a5323-ec17-4281-b0f5-970e1212d888_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!CsDi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F243a5323-ec17-4281-b0f5-970e1212d888_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!CsDi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F243a5323-ec17-4281-b0f5-970e1212d888_1600x900.png 1272w, https://substackcdn.com/image/fetch/$s_!CsDi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F243a5323-ec17-4281-b0f5-970e1212d888_1600x900.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Executive Highlights (tl;dr of the article)</h3><ul><li><p><strong>The bottleneck on your phone is memory bandwidth, not compute.</strong> Single-token decode has an arithmetic intensity of ~4 FLOPs/byte on hardware built for ~295 FLOPs/byte. At that ratio, an H100 runs at roughly 1.4% compute utilization (4 / 295). Your phone is 49x worse on bandwidth and can&#8217;t batch. The KV cache grows 16 KB per token for Llama 3.2 1B&#8202;&#8212;&#8202;524 MB at 32K context, larger than the model weights. Quantization halves the per-token cost but doesn&#8217;t stop the growth.</p></li><li><p><strong>Every alternative architecture sacrifices something.</strong> SSMs: O(1) memory, but quantization errors compound multiplicatively (fatal on INT4 edge hardware). Linear attention: removes the quadratic cost, but token associations bleed together. Convolutions: fast, 15 years of mobile optimization, but blind past their window. 
No single mechanism works alone.</p></li><li><p><strong>LFM2: 10 gated short convolution blocks + 6 grouped-query attention blocks.</strong> Convolutions handle local syntax with zero cache. Attention handles global retrieval, deployed sparingly. 192 MB cache at 32K vs. Llama&#8217;s 524 MB&#8202;&#8212;&#8202;63% cut from fewer attention layers, 90% reduction with grouped-query sharing.</p></li><li><p><strong>STAR matters more than the model.</strong> An evolutionary search system that encodes architectures as hierarchical genomes and evolves them on actual phones under real latency/memory constraints. The key upgrade: replaced proxy metrics with hardware-in-the-loop profiling on Galaxy S24s and Ryzen laptops. Proxy metrics lie&#8202;&#8212;&#8202;Mamba and convolutions have similar theoretical FLOPs but convolutions map to native SIMD instructions. STAR rejected every SSM variant.</p></li><li><p><strong>The training pipeline closes the gap against models 42% larger.</strong> Distillation from a 7B teacher storing only top-32 logits (2,000x compression) with a decomposed loss that separates membership from ranking. Curriculum learning easy-to-hard. Model merging at zero inference cost. INT4 quantization in the loop from the start.</p></li><li><p><strong>Three unsolved problems determine who wins edge AI.</strong> Signal-to-noise (ambient computing means 99% of tokens are garbage&#8202;&#8212;&#8202;architectures must actively refuse to process context). Hardware fragmentation (no standard edge chip&#8202;&#8212;&#8202;either vertical players capture the edge or open-source builds a communal STAR). Hostile markets (healthcare, defense, industrial&#8202;&#8212;&#8202;700 MB of local intelligence bypasses the cloud entirely).</p></li></ul><p><em>I put a lot of work into writing this newsletter. To do so, I rely on you for support. 
If a few more people choose to become paid subscribers, the Chocolate Milk Cult can continue to provide high-quality and accessible education and opportunities to anyone who needs it. If you think this mission is worth contributing to, please consider a premium subscription. You can do so for less than the cost of a Netflix Subscription <a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">(pay what you want here)</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ajO6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb61700b-2606-4601-85b3-814ded4aabbc_535x159.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ajO6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb61700b-2606-4601-85b3-814ded4aabbc_535x159.png 424w, https://substackcdn.com/image/fetch/$s_!ajO6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb61700b-2606-4601-85b3-814ded4aabbc_535x159.png 848w, https://substackcdn.com/image/fetch/$s_!ajO6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb61700b-2606-4601-85b3-814ded4aabbc_535x159.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ajO6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb61700b-2606-4601-85b3-814ded4aabbc_535x159.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ajO6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb61700b-2606-4601-85b3-814ded4aabbc_535x159.png" width="535" height="159" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb61700b-2606-4601-85b3-814ded4aabbc_535x159.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:159,&quot;width&quot;:535,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ajO6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb61700b-2606-4601-85b3-814ded4aabbc_535x159.png 424w, https://substackcdn.com/image/fetch/$s_!ajO6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb61700b-2606-4601-85b3-814ded4aabbc_535x159.png 848w, https://substackcdn.com/image/fetch/$s_!ajO6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb61700b-2606-4601-85b3-814ded4aabbc_535x159.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ajO6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb61700b-2606-4601-85b3-814ded4aabbc_535x159.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><em>I provide various consulting and advisory services. If you&#8217;d like to explore how we can work together, <a href="https://linktr.ee/iseethings404">reach out to me through any of my socials over here</a> or reply to this email.</em></p><h3>1) The Main Hardware Constraint for AI Inference: Memory Bandwidth</h3><p>Most people assume the hard part of running an AI model is the math. Bigger brain, more parameters, heavier multiplications, beefier chip. It&#8217;s a beautifully clean, totally intuitive mental model&#8202;&#8212;&#8202;and just like most other thoughts you have in your life, it is in a passionate &#8220;no-contact&#8221; with intelligence or reality.</p><p>The binding constraint on modern inference&#8202;&#8212;&#8202;especially on phones, laptops, and edge devices sweating in your pocket&#8202;&#8212;&#8202;isn&#8217;t how fast the chip can do math. It&#8217;s how fast the chip can feed itself data to do the math. Erling Haaland with a Spursy midfield wouldn&#8217;t get any goals.</p><h4>How Language Models Generate Text: Prefill vs. Decode</h4><p>A language model spits out text one token at a time. Generation is autoregressive, which is just a fancy way of saying it&#8217;s inherently serial: step 1, then step 2, no skipping ahead. But this process has a split personality.</p><ul><li><p><strong>Phase 1: Prefill.</strong> This is the honeymoon phase. The model receives your entire input prompt at once, and because those tokens already exist, it can process them all in parallel. Prefill is a massive, compute-heavy rave. 
Everyone is dancing, the hardware is maxed out, and the chip gets to flex its muscles.</p></li><li><p><strong>Phase 2: Decode.</strong> The hangover. This is where the economics get brutal. The model now has to generate tokens one by one. Every single token requires a full pass through every layer of the model. But because you are only producing one word, your &#8220;batch size&#8221; is exactly 1. You are multiplying the model&#8217;s <em>entire</em> weight matrix against a single tiny vector just to produce one token. Then you do it again. And again.</p></li></ul><p>Decode doesn&#8217;t stop being a problem here. This diva also comes in with two different cost profiles that you need to worry about.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1EPb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd26ff04-7d72-481a-af65-63250ab838c1_1292x1222.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1EPb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd26ff04-7d72-481a-af65-63250ab838c1_1292x1222.png 424w, https://substackcdn.com/image/fetch/$s_!1EPb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd26ff04-7d72-481a-af65-63250ab838c1_1292x1222.png 848w, https://substackcdn.com/image/fetch/$s_!1EPb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd26ff04-7d72-481a-af65-63250ab838c1_1292x1222.png 1272w, 
https://substackcdn.com/image/fetch/$s_!1EPb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd26ff04-7d72-481a-af65-63250ab838c1_1292x1222.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1EPb!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd26ff04-7d72-481a-af65-63250ab838c1_1292x1222.png" width="1200" height="1134.984520123839" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cd26ff04-7d72-481a-af65-63250ab838c1_1292x1222.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1222,&quot;width&quot;:1292,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1EPb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd26ff04-7d72-481a-af65-63250ab838c1_1292x1222.png 424w, https://substackcdn.com/image/fetch/$s_!1EPb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd26ff04-7d72-481a-af65-63250ab838c1_1292x1222.png 848w, https://substackcdn.com/image/fetch/$s_!1EPb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd26ff04-7d72-481a-af65-63250ab838c1_1292x1222.png 1272w, 
https://substackcdn.com/image/fetch/$s_!1EPb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd26ff04-7d72-481a-af65-63250ab838c1_1292x1222.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4>The Two Memory Costs: Weight-Loading and Cache-Access</h4><p>Every decode step hands you two separate bills (reminds me of my dates):</p><ol><li><p><strong>The Weight-Loading Bill (Fixed):</strong> To produce a single token, <em>every parameter</em> in the model must be dragged from memory into the processor. 
For a 1B model at INT4, that is roughly 500 megabytes shoved through the memory bus. Per token. <strong>Every </strong>fucking token. The model doesn&#8217;t get to skip layers because the answer is an easy word like &#8220;the.&#8221; The full weight matrix moves, every time.</p></li><li><p><strong>The Cache-Access Bill (Variable):</strong> The model must also read the KV cache (the stored memory of previous tokens) to compute attention. At token 100, this cache is cute and small. At token 32,000, it&#8217;s a bloated monster that might literally be larger than the weights themselves. If the weight-loading bill is your rent, the cache-access bill is a taxi meter in Manhattan traffic while the oil and gas industry uses every world event to price-gouge profits from you.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X872!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89efd839-fdc1-430f-970d-276efb971135_700x587.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X872!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89efd839-fdc1-430f-970d-276efb971135_700x587.jpeg 424w, https://substackcdn.com/image/fetch/$s_!X872!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89efd839-fdc1-430f-970d-276efb971135_700x587.jpeg 848w, https://substackcdn.com/image/fetch/$s_!X872!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89efd839-fdc1-430f-970d-276efb971135_700x587.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!X872!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89efd839-fdc1-430f-970d-276efb971135_700x587.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X872!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89efd839-fdc1-430f-970d-276efb971135_700x587.jpeg" width="700" height="587" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89efd839-fdc1-430f-970d-276efb971135_700x587.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:587,&quot;width&quot;:700,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!X872!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89efd839-fdc1-430f-970d-276efb971135_700x587.jpeg 424w, https://substackcdn.com/image/fetch/$s_!X872!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89efd839-fdc1-430f-970d-276efb971135_700x587.jpeg 848w, https://substackcdn.com/image/fetch/$s_!X872!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89efd839-fdc1-430f-970d-276efb971135_700x587.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!X872!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89efd839-fdc1-430f-970d-276efb971135_700x587.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/profiteering-from-a-billion-deaths">Oil Companies capitalized on the Ukraine-Russia War to rake in record profits. 
</a><strong><a href="https://www.artificialintelligencemadesimple.com/p/profiteering-from-a-billion-deaths">One company, Occidental Petro Corp, saw a 721.49% increase in profits in one year. </a></strong><a href="https://www.artificialintelligencemadesimple.com/p/profiteering-from-a-billion-deaths">Other FFCs didn&#8217;t have a bad year either.</a></figcaption></figure></div><p>That is why the key metric here is not FLOPs in isolation, but arithmetic intensity: how much useful computation the chip gets for every byte it has to move. Let&#8217;s look at how.</p><h4>Arithmetic Intensity Explained: FLOPs Per Byte</h4><p>If you want to understand why a chip with massive theoretical compute can still deliver painfully slow inference, you need to know one number: <strong>Arithmetic Intensity</strong>.</p><p>It&#8217;s the ratio of useful math to data dragged around. FLOPs per byte. For every byte the chip pulls from memory, how many operations does it get to do before it starves for the next one?</p><p>During decode, a single token is multiplied against each weight exactly once: one multiply and one add, so 2 FLOPs per weight. For a weight stored in INT4 (0.5 bytes), that works out to roughly 4 FLOPs per byte. That ratio holds no matter how big your model is. It&#8217;s a property of the operation, not the chip.</p><p>Now, look at the hardware. An NVIDIA H100&#8202;&#8212;&#8202;the $30,000 GPU currently holding the global economy hostage&#8202;&#8212;&#8202;has a compute-to-bandwidth ratio of roughly 295 FLOPs per byte. But your decode phase only needs 4. During single-user text generation, your H100 is running at a pathetic 1.4% of its theoretical peak. Erling Haaland in the opposition box, waiting for the balls that will never come.</p><p>Understanding this failure in more detail requires us to look at the Roofline Model. 
The roofline model asks a simple question: for a given operation, which limit does the chip hit first?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RAbE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fedd328dd-8515-4e8b-b625-84f5dfd860f1_2400x1421.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RAbE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fedd328dd-8515-4e8b-b625-84f5dfd860f1_2400x1421.png 424w, https://substackcdn.com/image/fetch/$s_!RAbE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fedd328dd-8515-4e8b-b625-84f5dfd860f1_2400x1421.png 848w, https://substackcdn.com/image/fetch/$s_!RAbE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fedd328dd-8515-4e8b-b625-84f5dfd860f1_2400x1421.png 1272w, https://substackcdn.com/image/fetch/$s_!RAbE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fedd328dd-8515-4e8b-b625-84f5dfd860f1_2400x1421.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RAbE!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fedd328dd-8515-4e8b-b625-84f5dfd860f1_2400x1421.png" width="1200" height="710.4395604395604" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/edd328dd-8515-4e8b-b625-84f5dfd860f1_2400x1421.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:862,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RAbE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fedd328dd-8515-4e8b-b625-84f5dfd860f1_2400x1421.png 424w, https://substackcdn.com/image/fetch/$s_!RAbE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fedd328dd-8515-4e8b-b625-84f5dfd860f1_2400x1421.png 848w, https://substackcdn.com/image/fetch/$s_!RAbE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fedd328dd-8515-4e8b-b625-84f5dfd860f1_2400x1421.png 1272w, https://substackcdn.com/image/fetch/$s_!RAbE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fedd328dd-8515-4e8b-b625-84f5dfd860f1_2400x1421.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4>The Roofline Model: Compute Limits vs. Bandwidth Limits</h4><p>Every chip has two hard limits:</p><ul><li><p><strong>A Compute Ceiling:</strong> Max operations per second (the brawn).</p></li><li><p><strong>A Bandwidth Ceiling:</strong> Max bytes per second from memory (the digestive tract).</p></li></ul><p>Whichever ceiling you hit first is your bottleneck. Single-token decode at 4 FLOPs per byte smacks its head against the bandwidth ceiling on virtually every data center GPU in existence. The chip has compute to spare and nothing to feed it.</p><p>This is why tossing a &#8220;faster chip&#8221; at inference doesn&#8217;t work. A chip with 2x more compute but the same memory bandwidth will generate tokens at the <em>exact same speed</em>, because the bottleneck was never the math. 
The only way to go faster is to move fewer bytes per token, or build wider pipes.</p><p>And all of that leads to our original question&#8202;&#8212;&#8202;why is it so hard to get on-device generative AI right?</p><h4>Why Inference is Slower on Edge Devices vs. Data Centers</h4><p>Data centers survive this bottleneck because they have three cheat codes phones don&#8217;t: deep memory, massive bandwidth, and <strong>batching</strong>.</p><p>When 64 users hit an H100 at the same time, the GPU loads the weights <em>once</em> and applies them to all 64 requests. The weight-loading cost per user drops by 64x. Batching is the holy grail of cloud economics (that&#8217;s why API providers offer discounts on it). But a phone? A phone is a sad solo commute. Batch size is 1. There is nobody to split the bill with.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C7Nq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75816d07-f5c2-47a4-ae28-5e4e917df648_1228x1208.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C7Nq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75816d07-f5c2-47a4-ae28-5e4e917df648_1228x1208.png 424w, https://substackcdn.com/image/fetch/$s_!C7Nq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75816d07-f5c2-47a4-ae28-5e4e917df648_1228x1208.png 848w, https://substackcdn.com/image/fetch/$s_!C7Nq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75816d07-f5c2-47a4-ae28-5e4e917df648_1228x1208.png 1272w, 
https://substackcdn.com/image/fetch/$s_!C7Nq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75816d07-f5c2-47a4-ae28-5e4e917df648_1228x1208.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C7Nq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75816d07-f5c2-47a4-ae28-5e4e917df648_1228x1208.png" width="1228" height="1208" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/75816d07-f5c2-47a4-ae28-5e4e917df648_1228x1208.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1208,&quot;width&quot;:1228,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C7Nq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75816d07-f5c2-47a4-ae28-5e4e917df648_1228x1208.png 424w, https://substackcdn.com/image/fetch/$s_!C7Nq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75816d07-f5c2-47a4-ae28-5e4e917df648_1228x1208.png 848w, https://substackcdn.com/image/fetch/$s_!C7Nq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75816d07-f5c2-47a4-ae28-5e4e917df648_1228x1208.png 1272w, 
https://substackcdn.com/image/fetch/$s_!C7Nq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75816d07-f5c2-47a4-ae28-5e4e917df648_1228x1208.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Then there&#8217;s the bandwidth gap. An H100 delivers roughly 3,350 GB/s. Your flagship Snapdragon? Maybe 77 GB/s. That is roughly a 44x gap. The 500 MB of weights that take an H100 a breezy 0.15 milliseconds to load take your phone about 6.5 milliseconds. <em>Per token.</em> A prompt the cloud chews through in 10 milliseconds takes a phone CPU 2&#8211;3 excruciating seconds. 
You know that awkward pause before your AI assistant starts talking? That&#8217;s your phone violently shoving data through a tiny pipe.</p><p>Oh, and the H100 has 80 gigabytes of dedicated high-speed RAM. Your phone has 12 gigabytes of shared RAM, and half of it is already being hogged by your browser tabs.</p><p>Architecture cannot outrun the weight-loading bill&#8202;&#8212;&#8202;if you use matrices, you pay the rent. But architecture <em>can</em> absolutely hack the cache-access bill. Fewer attention layers equals a smaller KV cache. A smaller cache means fewer bytes read per decode step. Fewer bytes means you avoid that nasty bandwidth ceiling.</p><p>The real question isn&#8217;t &#8220;how do we make transformers faster on phones?&#8221; The question is: &#8220;which of these operations actually deserve to take up precious memory, and which ones are just burning our battery for absolutely no reason?&#8221;</p><p>And with that super slick transition, it&#8217;s time for us to take a deep look at the drivers of cost, and what we can drop.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!siaF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!siaF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png 424w, https://substackcdn.com/image/fetch/$s_!siaF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png 848w, 
https://substackcdn.com/image/fetch/$s_!siaF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png 1272w, https://substackcdn.com/image/fetch/$s_!siaF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!siaF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png" width="1250" height="1308" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1308,&quot;width&quot;:1250,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!siaF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png 424w, https://substackcdn.com/image/fetch/$s_!siaF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png 848w, 
https://substackcdn.com/image/fetch/$s_!siaF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png 1272w, https://substackcdn.com/image/fetch/$s_!siaF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cbe28d-190b-447c-8cb3-d47a1289760b_1250x1308.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>2) The Transformer&#8217;s Memory Cost Structure: Weights vs. 
KV Cache</h3><p>We already established that the decode phase hands you two memory bills. One is a predictable baseline; the other is a compounding tax. Let&#8217;s look at the receipt.</p><h4>The Fixed Cost: Model Weights</h4><p>Model weights are your fixed cost. For a 1B model at INT4, you are moving roughly 500 MB through the memory bus per token. You load it, apply it layer by layer, and move on. The size of the model does not change whether your prompt is three words or 20,000. It is a strictly known quantity, which makes it a manageable engineering problem.</p><p>Unfortunately, the transformer brought a parasitic plus-one.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!--Qv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4817f2fc-9f09-48de-add6-f8b1d23a493a_1312x1346.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!--Qv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4817f2fc-9f09-48de-add6-f8b1d23a493a_1312x1346.png 424w, https://substackcdn.com/image/fetch/$s_!--Qv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4817f2fc-9f09-48de-add6-f8b1d23a493a_1312x1346.png 848w, https://substackcdn.com/image/fetch/$s_!--Qv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4817f2fc-9f09-48de-add6-f8b1d23a493a_1312x1346.png 1272w, https://substackcdn.com/image/fetch/$s_!--Qv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4817f2fc-9f09-48de-add6-f8b1d23a493a_1312x1346.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!--Qv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4817f2fc-9f09-48de-add6-f8b1d23a493a_1312x1346.png" width="1312" height="1346" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4817f2fc-9f09-48de-add6-f8b1d23a493a_1312x1346.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1346,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!--Qv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4817f2fc-9f09-48de-add6-f8b1d23a493a_1312x1346.png 424w, https://substackcdn.com/image/fetch/$s_!--Qv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4817f2fc-9f09-48de-add6-f8b1d23a493a_1312x1346.png 848w, https://substackcdn.com/image/fetch/$s_!--Qv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4817f2fc-9f09-48de-add6-f8b1d23a493a_1312x1346.png 1272w, https://substackcdn.com/image/fetch/$s_!--Qv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4817f2fc-9f09-48de-add6-f8b1d23a493a_1312x1346.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>The Variable Cost: The KV Cache</h4><p>In every attention layer, for every single token processed, the transformer stores two vectors: a Key and a Value. This stored history is the KV cache. Unlike the weights, it grows with every token generated.</p><p>The growth rate is mathematically rigid, baked into the architecture at design time:</p><p><code>Cache per token = 2 * attention layers * KV head groups * dimension per head * bytes per element</code></p><p>Every architectural term in that formula was chosen long before training started. The only thing that moves at runtime is sequence length, which multiplies the whole product. The hardware just eats whatever the architect chose. Look at Llama 3.2 1B: 16 attention layers, 8 KV heads, 64 dimensions per head. At one byte per element, every token adds exactly 16,384 bytes (double that for an FP16 cache). About 16 KB per token. Sounds harmless. 
Watch context scale:</p><ul><li><p><strong>At 4,000 tokens</strong> (short conversation): 65 MB.</p></li><li><p><strong>At 32,000 tokens</strong> (real conversation): 524 MB. The cache is now larger than the weights.</p></li><li><p><strong>At 128,000 tokens</strong> (where every product roadmap points): 2 GB. Game over on a phone.</p></li></ul><p>Saying &#8220;the cache grows linearly&#8221; is technically correct and entirely useless. &#8220;Linear growth&#8221; can mean manageable overhead, or it can mean your phone violently murders background apps just to keep breathing.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tYVe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F617c9259-e97e-41ce-9f13-f904c6a48cec_2400x1409.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tYVe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F617c9259-e97e-41ce-9f13-f904c6a48cec_2400x1409.png 424w, https://substackcdn.com/image/fetch/$s_!tYVe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F617c9259-e97e-41ce-9f13-f904c6a48cec_2400x1409.png 848w, https://substackcdn.com/image/fetch/$s_!tYVe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F617c9259-e97e-41ce-9f13-f904c6a48cec_2400x1409.png 1272w, https://substackcdn.com/image/fetch/$s_!tYVe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F617c9259-e97e-41ce-9f13-f904c6a48cec_2400x1409.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!tYVe!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F617c9259-e97e-41ce-9f13-f904c6a48cec_2400x1409.png" width="1200" height="704.6703296703297" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/617c9259-e97e-41ce-9f13-f904c6a48cec_2400x1409.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:855,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tYVe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F617c9259-e97e-41ce-9f13-f904c6a48cec_2400x1409.png 424w, https://substackcdn.com/image/fetch/$s_!tYVe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F617c9259-e97e-41ce-9f13-f904c6a48cec_2400x1409.png 848w, https://substackcdn.com/image/fetch/$s_!tYVe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F617c9259-e97e-41ce-9f13-f904c6a48cec_2400x1409.png 1272w, https://substackcdn.com/image/fetch/$s_!tYVe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F617c9259-e97e-41ce-9f13-f904c6a48cec_2400x1409.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And finding the RAM for that bloated cache isn&#8217;t even the worst part. You also have to keep reading it.</p><h4>The Decode Memory Tax: Reading the KV Cache</h4><p>The KV cache is highly active memory traffic. Every decode step, every attention layer loads the <em>full</em> cache for <em>all</em> previous tokens to compute attention scores. At token 100, you read 100 cached pairs per layer. Quick. At token 32,000, you read 32,000 pairs per layer&#8202;&#8212;&#8202;a 320x increase in data movement.</p><p>The weight-loading bill is the exact same at token 100 as it is at token 32,000. But the cache-access bill keeps climbing. Long conversations are literally more expensive per token than short ones. 
And just when memory bandwidth is tapped out, the attention math swoops in to make things quadratically worse.</p><h4>Why Self-Attention Compute Costs Grow Quadratically</h4><p>Standard self-attention computes a score between every token and every other token. The compute cost is therefore quadratic in sequence length, swallowing your budget as context grows:</p><ul><li><p><strong>At 4,000 tokens:</strong> Quadratic attention is roughly 25% of total compute per layer.</p></li><li><p><strong>At 32,000 tokens:</strong> ~73%. Attention is now the vast majority of what the model does.</p></li><li><p><strong>At 128,000 tokens:</strong> ~92%. The model spends almost all its energy just looking at things it already looked at.</p></li></ul><p>Compute cost and memory cost reinforce each other. More tokens mean heavier attention math AND a larger cache to drag through the bus. You hit both hardware ceilings simultaneously. This is why long-context inference attacks both constraints at once. So, if the cache is bloated and the math is heavy, can&#8217;t we just compress the numbers and call it a day?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2_Cl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2_Cl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png 424w, https://substackcdn.com/image/fetch/$s_!2_Cl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png 848w, 
https://substackcdn.com/image/fetch/$s_!2_Cl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png 1272w, https://substackcdn.com/image/fetch/$s_!2_Cl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2_Cl!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png" width="1200" height="716.2087912087912" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:869,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:716626,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191951862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!2_Cl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png 424w, 
https://substackcdn.com/image/fetch/$s_!2_Cl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png 848w, https://substackcdn.com/image/fetch/$s_!2_Cl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png 1272w, https://substackcdn.com/image/fetch/$s_!2_Cl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe02b96f6-c73b-415b-a27b-11aa87399ec5_2084x1244.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Why Quantization Does Not Solve the Memory Bottleneck</h4><p>Quantization is compression, not absolution. Halve the precision, you halve the per-token cost. Great. But the cache still grows linearly. The model still reads that entire cache every decode step. You lowered the tax rate; you didn&#8217;t remove the tax. You are simply dragging slightly lighter garbage across the memory bus, and that bus will still max out.</p><p>Also, aggressive quantization hurts small models far more than large ones. Squeezing a 1B model to INT4 hits it disproportionately hard. This is why Liquid AI trains with quantization in the loop from the start&#8202;&#8212;&#8202;learning to be good <em>despite</em> the rounding, rather than getting brain damage later.</p><p>So, where does this technical butchery leave us for shipping products?</p><h4>The Real Question: What Happens After the Twentieth Reply?</h4><p>Model weights are the fixed cost you pay because the model exists. The KV cache is the variable cost you pay because the conversation exists. You are not charged extra because the model got bigger; you are charged extra because the interaction got longer.</p><p>Edge devices are uniquely allergic to variable costs because their memory budget is a joke and their bandwidth pipe is a straw. 
The question stops being &#8220;can I cram this model onto the device?&#8221; and becomes &#8220;what happens to this device after the twentieth reply?&#8221;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8wYD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ad00f4f-a896-4b3a-95cd-0b383f18693e_1286x1396.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8wYD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ad00f4f-a896-4b3a-95cd-0b383f18693e_1286x1396.png 424w, https://substackcdn.com/image/fetch/$s_!8wYD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ad00f4f-a896-4b3a-95cd-0b383f18693e_1286x1396.png 848w, https://substackcdn.com/image/fetch/$s_!8wYD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ad00f4f-a896-4b3a-95cd-0b383f18693e_1286x1396.png 1272w, https://substackcdn.com/image/fetch/$s_!8wYD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ad00f4f-a896-4b3a-95cd-0b383f18693e_1286x1396.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8wYD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ad00f4f-a896-4b3a-95cd-0b383f18693e_1286x1396.png" width="1286" height="1396" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0ad00f4f-a896-4b3a-95cd-0b383f18693e_1286x1396.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1396,&quot;width&quot;:1286,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8wYD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ad00f4f-a896-4b3a-95cd-0b383f18693e_1286x1396.png 424w, https://substackcdn.com/image/fetch/$s_!8wYD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ad00f4f-a896-4b3a-95cd-0b383f18693e_1286x1396.png 848w, https://substackcdn.com/image/fetch/$s_!8wYD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ad00f4f-a896-4b3a-95cd-0b383f18693e_1286x1396.png 1272w, https://substackcdn.com/image/fetch/$s_!8wYD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ad00f4f-a896-4b3a-95cd-0b383f18693e_1286x1396.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Tokens get eaten very quickly on a phone, since the context also has to absorb ambient noise, logs from other apps, screen contents, and so on. Most edge devices have similar issues.</figcaption></figure></div><p>If we can&#8217;t escape the fixed weight-loading bill, our biggest lever is the variable cache-access bill. Reduce attention layers. Reduce explicitly preserved history. Move fewer bytes. This is where architecture becomes raw deployment economics.</p><p>Every alternative architecture is just trying to kill some part of this bill while giving something else up. 
Some compress the cache, some use a fixed-size state, some use convolutions.</p><p>So now that we know exactly how the transformer is extorting us, what are the escape routes, and what does each alternative sacrifice?</p><h3>3) The Escape Routes: Alternatives to Standard Attention</h3><p><em>We&#8217;re keeping this section deliberately light since we already have a massive deep dive into the first three of these alternatives in one place: <strong><a href="https://www.artificialintelligencemadesimple.com/p/how-long-context-inference-is-rewriting">&#8220;How Long Context Inference Is Rewriting the Future of Transformers&#8221;</a></strong>. The fourth, convolutions, will be broken down here since it&#8217;s the crux of Liquid AI&#8217;s push.</em></p><p>Every alternative AI architecture proposed in the last five years is trying to solve the exact extortion racket we just outlined. They look at the transformer&#8217;s variable KV cache tax and the quadratic attention compute, and they desperately look for a fire exit.</p><p>But physics is ruthless. You cannot get full global attention for free. Every single escape route forces a sacrifice. 
The entire architectural design space is just a hostage negotiation over what you are willing to lose to lower your memory and compute bills.</p><p>Let&#8217;s walk through the four desperate attempts the field has made to dodge the check.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qKKO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F103b3fac-1c39-433a-98f4-30d0513aa7ba_1600x4512.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qKKO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F103b3fac-1c39-433a-98f4-30d0513aa7ba_1600x4512.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qKKO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F103b3fac-1c39-433a-98f4-30d0513aa7ba_1600x4512.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qKKO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F103b3fac-1c39-433a-98f4-30d0513aa7ba_1600x4512.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qKKO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F103b3fac-1c39-433a-98f4-30d0513aa7ba_1600x4512.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qKKO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F103b3fac-1c39-433a-98f4-30d0513aa7ba_1600x4512.jpeg" width="1456" height="4106" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/103b3fac-1c39-433a-98f4-30d0513aa7ba_1600x4512.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:4106,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qKKO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F103b3fac-1c39-433a-98f4-30d0513aa7ba_1600x4512.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qKKO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F103b3fac-1c39-433a-98f4-30d0513aa7ba_1600x4512.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qKKO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F103b3fac-1c39-433a-98f4-30d0513aa7ba_1600x4512.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qKKO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F103b3fac-1c39-433a-98f4-30d0513aa7ba_1600x4512.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Escape Route 1: Compressing the KV Cache</h4><p>If the KV cache is too big, the most obvious engineering solution is to simply store less of it. 
You keep the transformer architecture exactly as it is, but you aggressively compress the receipt.</p><ul><li><p><strong>Multi-head Latent Attention (DeepSeek):</strong> Projects the Key and Value vectors into a tiny compressed representation before storing them, reinflating them on the fly when needed.</p></li><li><p><strong>SnapKV:</strong> Looks at the tokens and literally just throws away the ones the attention heads rarely look at, much like a cat making direct eye contact with you while slowly swatting your keys off a table because they aren&#8217;t relevant to its immediate needs.</p></li><li><p><strong>PagedAttention (vLLM):</strong> Doesn&#8217;t shrink the cache, but stops it from fragmenting your RAM by storing it in non-contiguous blocks, like virtual memory on a PC.</p></li></ul><p><strong>What it buys you:</strong> You can slash the size of the cache by 60% to 90%.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Coif!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F894e5435-5581-47d5-8e61-c574437d24f7_1600x686.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Coif!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F894e5435-5581-47d5-8e61-c574437d24f7_1600x686.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Coif!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F894e5435-5581-47d5-8e61-c574437d24f7_1600x686.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Coif!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F894e5435-5581-47d5-8e61-c574437d24f7_1600x686.jpeg 
1272w, https://substackcdn.com/image/fetch/$s_!Coif!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F894e5435-5581-47d5-8e61-c574437d24f7_1600x686.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Coif!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F894e5435-5581-47d5-8e61-c574437d24f7_1600x686.jpeg" width="1456" height="624" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/894e5435-5581-47d5-8e61-c574437d24f7_1600x686.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:624,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Coif!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F894e5435-5581-47d5-8e61-c574437d24f7_1600x686.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Coif!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F894e5435-5581-47d5-8e61-c574437d24f7_1600x686.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Coif!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F894e5435-5581-47d5-8e61-c574437d24f7_1600x686.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!Coif!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F894e5435-5581-47d5-8e61-c574437d24f7_1600x686.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://oilbeater.com/en/2025/04/14/deepseek-mla/">The original MHA needs to cache the full matrix, while MLA only caches the compressed vector and reconstructs the full matrix when needed.</a> By replacing the large 2 * g * d_k term with a much smaller compressed dimension d_c, DeepSeek reported a staggering 93.3% reduction in KV 
cache size: &#8220;<em>Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times.</em>&#8221;</figcaption></figure></div><p><strong>What it costs you:</strong> The growth rate shrinks, but the growth <em>does not stop</em>. Even if you compress the cache by 90%, the remaining 10% still grows with every token, so an ambient agent listening to 100,000 tokens a day will eventually exhaust your phone&#8217;s memory. Compression delays the memory wall; it does not remove it.</p><p>Delaying the memory wall isn&#8217;t a long-term strategy. If you actually want to stop the bleeding, you have to burn the cache entirely.</p><h4>Escape Route 2: State Space Models (SSMs) and Mamba</h4><p>If you refuse to let the memory footprint grow, you have to abandon the KV cache. Enter State Space Models (SSMs).</p><p>Instead of storing a perfect record of every token you&#8217;ve ever seen, an SSM compresses the entire history of the conversation into a single, fixed-size mathematical box (a state vector). 
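</p><p>To make the fixed-size box concrete, here is a minimal sketch of a recurrent state update in plain Python. It is a toy diagonal linear recurrence, not Mamba itself (Mamba additionally makes its parameters depend on the input), and every name in it (<code>STATE_SIZE</code>, <code>update_state</code>, <code>decay</code>, <code>input_weights</code>, <code>run</code>) is a hypothetical choice made for illustration.</p>
<pre>
```python
import random

STATE_SIZE = 8  # hypothetical state width, chosen only for illustration


def update_state(state, token_value, decay, input_weights):
    """One recurrent step: new_h[i] = decay[i] * h[i] + input_weights[i] * x."""
    return [d * h + w * token_value
            for h, d, w in zip(state, decay, input_weights)]


def run(tokens, seed=0):
    """Fold a whole sequence into one fixed-size state."""
    rng = random.Random(seed)
    # Assumed toy parameters; a trained model would learn these.
    decay = [rng.uniform(0.5, 0.99) for _ in range(STATE_SIZE)]
    input_weights = [rng.uniform(-1.0, 1.0) for _ in range(STATE_SIZE)]
    state = [0.0] * STATE_SIZE
    for token_value in tokens:
        # The raw token is folded into the state and never stored.
        state = update_state(state, token_value, decay, input_weights)
    return state


short_state = run([0.1] * 10)
long_state = run([0.1] * 100_000)
print(len(short_state), len(long_state))  # prints: 8 8
```
</pre>
<p>The state holds the same eight numbers whether it has absorbed 10 tokens or 100,000; only what those numbers summarize changes.</p><p>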
When a new token arrives, the model updates the state vector, and then throws the raw token in the trash.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MfJr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff64987-19b0-4aa2-aaa7-f296c7404855_1600x489.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MfJr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff64987-19b0-4aa2-aaa7-f296c7404855_1600x489.jpeg 424w, https://substackcdn.com/image/fetch/$s_!MfJr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff64987-19b0-4aa2-aaa7-f296c7404855_1600x489.jpeg 848w, https://substackcdn.com/image/fetch/$s_!MfJr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff64987-19b0-4aa2-aaa7-f296c7404855_1600x489.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!MfJr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff64987-19b0-4aa2-aaa7-f296c7404855_1600x489.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MfJr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff64987-19b0-4aa2-aaa7-f296c7404855_1600x489.jpeg" width="1456" height="445" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ff64987-19b0-4aa2-aaa7-f296c7404855_1600x489.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:445,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MfJr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff64987-19b0-4aa2-aaa7-f296c7404855_1600x489.jpeg 424w, https://substackcdn.com/image/fetch/$s_!MfJr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff64987-19b0-4aa2-aaa7-f296c7404855_1600x489.jpeg 848w, https://substackcdn.com/image/fetch/$s_!MfJr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff64987-19b0-4aa2-aaa7-f296c7404855_1600x489.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!MfJr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff64987-19b0-4aa2-aaa7-f296c7404855_1600x489.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>What it buys you:</strong> O(1) memory during generation. Your memory footprint is exactly the same at token 10 as it is at token 100,000. It is theoretically the perfect edge architecture.</p><p><strong>What it costs you:</strong> Precision. Because the state is a fixed size, it is a lossy compressor. It remembers the general vibe and the narrative, but over long sequences, individual token identities blur together&#8202;&#8212;&#8202;exactly like trying to recall the specific sequence of a combo after taking a clean right hook to the jaw during a sparring session. You know what happened, but the granular details are gone. 
If you ask an SSM to retrieve a highly specific, verbatim quote from token 5 in a 100,000-token document, it will hallucinate or fail where a transformer would succeed effortlessly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1WvF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1WvF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png 424w, https://substackcdn.com/image/fetch/$s_!1WvF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png 848w, https://substackcdn.com/image/fetch/$s_!1WvF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png 1272w, https://substackcdn.com/image/fetch/$s_!1WvF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1WvF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png" width="1456" height="1898" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1898,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1427888,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191951862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1WvF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png 424w, https://substackcdn.com/image/fetch/$s_!1WvF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png 848w, https://substackcdn.com/image/fetch/$s_!1WvF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png 1272w, https://substackcdn.com/image/fetch/$s_!1WvF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ce8a5c4-2d88-4c83-ba5e-afbf697f8967_1470x1916.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.linkedin.com/feed/update/urn:li:activity:7440470045593497600/">Broke this down here.</a></figcaption></figure></div><p>If memory loss isn&#8217;t an acceptable trade-off for your product, you have to try cheating the math instead of the storage.</p><h4>Escape Route 3: Linear Attention</h4><p>Standard attention compares every token to every other token, creating that enormous, sequence-length-dependent n-by-n matrix. Linear attention uses a clever algebraic trick to change the order of operations. It applies a feature map and multiplies the matrices in a different order, creating a fixed-size d-by-d matrix that does not grow with the sequence length.</p><p><strong>What it buys you:</strong> Like SSMs, memory during generation becomes fixed. 
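</p><p>The reordering trick can be sketched as a running d-by-d accumulator. This is a rough illustration under stated assumptions: the <code>elu(x) + 1</code> feature map is one commonly used option rather than anything this article specifies, and the names <code>LinearAttentionState</code>, <code>feature_map</code>, and <code>D</code> are hypothetical.</p>
<pre>
```python
import math
import random

D = 4  # hypothetical head dimension for illustration


def feature_map(vec):
    """elu(x) + 1, an assumed positive feature map (not the article's choice)."""
    return [math.exp(min(x, 0.0)) + max(x, 0.0) for x in vec]


class LinearAttentionState:
    """Causal linear attention kept as a fixed-size running summary."""

    def __init__(self, dim):
        self.dim = dim
        # Running sum of phi(k) v^T: always dim x dim, whatever the sequence length.
        self.kv = [[0.0] * dim for _ in range(dim)]
        # Running sum of phi(k), used only to normalize the output.
        self.k_sum = [0.0] * dim

    def step(self, q, k, v):
        phi_q, phi_k = feature_map(q), feature_map(k)
        for i in range(self.dim):
            self.k_sum[i] += phi_k[i]
            for j in range(self.dim):
                # Every token is added into the same matrix.
                self.kv[i][j] += phi_k[i] * v[j]
        norm = sum(a * b for a, b in zip(phi_q, self.k_sum))
        return [sum(phi_q[i] * self.kv[i][j] for i in range(self.dim)) / norm
                for j in range(self.dim)]


rng = random.Random(0)
state = LinearAttentionState(D)
for _ in range(1000):
    q, k, v = ([rng.gauss(0, 1) for _ in range(D)] for _ in range(3))
    out = state.step(q, k, v)
print(len(state.kv), len(state.kv[0]), len(out))  # prints: 4 4 4
```
</pre>
<p>After 1,000 tokens the stored state is still a 4-by-4 matrix plus a 4-element normalizer. Because every token is summed into that one matrix, distinct associations can no longer be separated once enough of them pile up.</p><p>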
You completely kill the quadratic compute cost.</p><p><strong>What it costs you:</strong> Feature collision. Linear attention mashes all the token associations additively into a single matrix. Over a long conversation, these features overlap and bleed into each other. It&#8217;s the mathematical equivalent of shoving your clean everyday clothes, sweaty MMA rash guards, and loose climbing chalk into the exact same duffel bag. Everything is technically in there, but good luck pulling out a clean shirt without it smelling like a bouldering gym.</p><p>If global math gets too messy, the only remaining option is to stop looking at the big picture entirely.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GHn2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GHn2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png 424w, https://substackcdn.com/image/fetch/$s_!GHn2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png 848w, https://substackcdn.com/image/fetch/$s_!GHn2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png 1272w, https://substackcdn.com/image/fetch/$s_!GHn2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png 
1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GHn2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png" width="1456" height="1172" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1172,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:511939,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191951862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GHn2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png 424w, https://substackcdn.com/image/fetch/$s_!GHn2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png 848w, https://substackcdn.com/image/fetch/$s_!GHn2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png 1272w, 
https://substackcdn.com/image/fetch/$s_!GHn2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45546b12-8e52-47a4-9ab1-29f78d87fbf4_1458x1174.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h4>Escape Route 4: Long Convolutions</h4><p>Before transformers took over, researchers processed sequences using convolutions. A convolution slides a mathematical window over the text, mixing information locally. 
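</p><p>As a rough sketch of that sliding window, here is a small causal 1D convolution in plain Python. The function name <code>causal_conv1d</code>, the three-tap kernel, and the toy sequences are illustrative assumptions, not taken from any particular model.</p>
<pre>
```python
def causal_conv1d(sequence, kernel):
    """Mix each position with the (len(kernel) - 1) positions before it."""
    width = len(kernel)
    # Left-pad with zeros so no output ever reads a future token.
    padded = [0.0] * (width - 1) + list(sequence)
    return [sum(kernel[j] * padded[t + j] for j in range(width))
            for t in range(len(sequence))]


kernel = [0.25, 0.25, 0.5]      # hypothetical three-token window
base = [1.0] * 10
edited = [9.0] + base[1:]       # change only the very first token

out_a = causal_conv1d(base, kernel)
out_b = causal_conv1d(edited, kernel)

# Positions 3 and later never see token 0, so they are identical.
print(out_a[3:] == out_b[3:], out_a[:3] == out_b[:3])  # prints: True False
```
</pre>
<p>Changing the first token only affects the outputs whose window still covers it; everything further along is computed as if that token never existed.</p><p>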
It doesn&#8217;t look at the whole document at once; it just looks at the tokens immediately around it.</p><p><strong>What it buys you:</strong> Hardware absolutely loves convolutions. Mobile CPUs have been optimizing convolutional math for image processing for 15 years. They are blazingly fast and require virtually no memory overhead.</p><p><strong>What it costs you:</strong> Blindness. If you use a convolution, you only see what is inside your window. If a pronoun needs to resolve to a noun 5,000 tokens ago, a convolution literally cannot see it.</p><p>When you look at the wreckage of this design space, a brutal reality emerges.</p><h4>The Real Lesson: Hybrid Architectures Are the Only Way</h4><p>Transformers give you perfect retrieval but bankrupt your memory. SSMs and Linear Attention fix your memory but blur your retrieval. Convolutions are incredibly fast but functionally myopic.</p><p>The field spent years arguing about which of these was the &#8220;right&#8221; answer. The actual lesson is that they are all the wrong answer if you apply them to the entire model.</p><p>If you want to build an architecture that survives on a phone, you cannot force one mathematical mechanism to do everything. You have to build a portfolio. You use cheap, fast, local operators for the vast majority of the work, and you strictly reserve the expensive, memory-hogging exact retrieval for the few layers that actually need it.</p><p>This is the exact hybrid insight that Liquid AI used to build LFM2. But to understand why Liquid AI arrived at their specific ratio of cheap-to-expensive layers, we need to look at where this company actually came from. Because they didn&#8217;t start by trying to fix the transformer. 
They started by trying to simulate the brain of a worm.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nr4L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nr4L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png 424w, https://substackcdn.com/image/fetch/$s_!nr4L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png 848w, https://substackcdn.com/image/fetch/$s_!nr4L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png 1272w, https://substackcdn.com/image/fetch/$s_!nr4L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nr4L!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png" width="1200" height="578.5714285714286" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:702,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:1017870,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191951862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nr4L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png 424w, https://substackcdn.com/image/fetch/$s_!nr4L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png 848w, https://substackcdn.com/image/fetch/$s_!nr4L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png 1272w, https://substackcdn.com/image/fetch/$s_!nr4L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F725ca37c-4f9c-4a16-980f-d80cd43fa499_2226x1074.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.techrxiv.org/users/815707/articles/1216544-from-c-elegans-to-liquid-neural-networks-a-robust-wind-power-multi-time-scale-prediction-framework">Researchers used LNNs on energy prediction tasks. In their words: </a><em><a href="https://www.techrxiv.org/users/815707/articles/1216544-from-c-elegans-to-liquid-neural-networks-a-robust-wind-power-multi-time-scale-prediction-framework">&#8220;For comparative analysis, the LNN family (i.e., closed form continuous (CfC), Liquid Time Constant) and state-of-the-art recurrent networks (e.g., LSTM and GRU) and 1D-CNN are considered, and the CfC neural network provides the best results on unseen data. 
</a><strong><a href="https://www.techrxiv.org/users/815707/articles/1216544-from-c-elegans-to-liquid-neural-networks-a-robust-wind-power-multi-time-scale-prediction-framework">CfC models with fully connected layers using only 25 neurons have provided superior results for wind power prediction in different time spans, resolutions, and number of variables</a></strong><a href="https://www.techrxiv.org/users/815707/articles/1216544-from-c-elegans-to-liquid-neural-networks-a-robust-wind-power-multi-time-scale-prediction-framework">&#8221;</a></em></figcaption></figure></div><h3><em>4) The Historical Path to Liquid AI: From Worm Brains to Foundation Models</em></h3><p>If you look at the pedigree of most major AI labs (OpenAI, Anthropic, Mistral), they all share the same evolutionary lineage. They are descendants of the original 2017 Transformer paper. Their entire scientific worldview is built on figuring out how to make self-attention bigger, faster, or slightly less gluttonous. Many of them have likely also done a lot of work on stabilizing and scaling Reinforcement Learning.</p><p>Liquid AI did not come from this lineage. They did not start by asking, &#8220;How do we make a transformer cheaper?&#8221;</p><p>They started by looking at a worm.</p><h4>How 302 Neurons Beat Brute-Force Math</h4><p><em>Caenorhabditis elegans</em> is a one-millimeter roundworm. It has exactly 302 neurons, all of which scientists have mapped completely.</p><p>If you look at modern deep learning, 302 neurons is nothing. It is a rounding error inside a single layer of an image classifier. Yet, with those 302 neurons, this worm can navigate toward food, avoid toxins, respond to temperature changes, and mate. 
It executes complex, continuous survival behaviors without a massive parameter count.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GbE8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94673a9d-5837-4119-9d79-7148bb3e93f2_1030x596.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GbE8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94673a9d-5837-4119-9d79-7148bb3e93f2_1030x596.png 424w, https://substackcdn.com/image/fetch/$s_!GbE8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94673a9d-5837-4119-9d79-7148bb3e93f2_1030x596.png 848w, https://substackcdn.com/image/fetch/$s_!GbE8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94673a9d-5837-4119-9d79-7148bb3e93f2_1030x596.png 1272w, https://substackcdn.com/image/fetch/$s_!GbE8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94673a9d-5837-4119-9d79-7148bb3e93f2_1030x596.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GbE8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94673a9d-5837-4119-9d79-7148bb3e93f2_1030x596.png" width="1030" height="596" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/94673a9d-5837-4119-9d79-7148bb3e93f2_1030x596.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:596,&quot;width&quot;:1030,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GbE8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94673a9d-5837-4119-9d79-7148bb3e93f2_1030x596.png 424w, https://substackcdn.com/image/fetch/$s_!GbE8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94673a9d-5837-4119-9d79-7148bb3e93f2_1030x596.png 848w, https://substackcdn.com/image/fetch/$s_!GbE8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94673a9d-5837-4119-9d79-7148bb3e93f2_1030x596.png 1272w, https://substackcdn.com/image/fetch/$s_!GbE8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94673a9d-5837-4119-9d79-7148bb3e93f2_1030x596.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.snexplores.org/article/the-brain-of-a-tiny-worm-inspired-a-new-type-of-ai-liquid-neural-network">Typical ANNs have many simple connections. Brain networks have fewer, more complex ones. A liquid neural network&#8217;s organization is more similar to our brain. Like our brains, in a liquid neural network, &#8220;We have information that flows back. We have loops,&#8221; says Daniela Rus.</a></figcaption></figure></div><p>How does it pull this off? It cheats the math.</p><p>In a standard artificial neural network, the &#8220;weights&#8221; (the strength of the connections between neurons) are frozen after training. The network&#8217;s internal dynamics never change. In the worm&#8217;s brain, the strength of a synaptic connection changes dynamically based on the signals currently flowing through it. 
<strong>The time it takes a neuron to respond to an input is not a fixed parameter; it adapts on the fly depending on what the worm is sensing.</strong></p><p>The founding team of Liquid AI&#8202;&#8212;&#8202;Ramin Hasani, Mathias Lechner, Alexander Amini, and Daniela Rus (director of MIT&#8217;s CSAIL)&#8202;&#8212;&#8202;looked at this and asked a brutal question: What if artificial networks stopped relying entirely on brute-force scale, and started adapting their internal dynamics to the input, exactly like the worm?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-waB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1361a70b-fea9-425e-9512-cb69a879e816_2400x1552.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-waB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1361a70b-fea9-425e-9512-cb69a879e816_2400x1552.png 424w, https://substackcdn.com/image/fetch/$s_!-waB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1361a70b-fea9-425e-9512-cb69a879e816_2400x1552.png 848w, https://substackcdn.com/image/fetch/$s_!-waB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1361a70b-fea9-425e-9512-cb69a879e816_2400x1552.png 1272w, https://substackcdn.com/image/fetch/$s_!-waB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1361a70b-fea9-425e-9512-cb69a879e816_2400x1552.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!-waB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1361a70b-fea9-425e-9512-cb69a879e816_2400x1552.png" width="1456" height="942" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1361a70b-fea9-425e-9512-cb69a879e816_2400x1552.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:942,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-waB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1361a70b-fea9-425e-9512-cb69a879e816_2400x1552.png 424w, https://substackcdn.com/image/fetch/$s_!-waB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1361a70b-fea9-425e-9512-cb69a879e816_2400x1552.png 848w, https://substackcdn.com/image/fetch/$s_!-waB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1361a70b-fea9-425e-9512-cb69a879e816_2400x1552.png 1272w, https://substackcdn.com/image/fetch/$s_!-waB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1361a70b-fea9-425e-9512-cb69a879e816_2400x1552.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Liquid Time-Constant Networks: The First Breakthrough</h4><p><a href="https://arxiv.org/abs/2006.04439">This obsession led to their first major paper in 2021: Liquid Time-Constant (LTC) Networks</a>.</p><p>Instead of freezing the network&#8217;s behavior, they governed each artificial neuron with an ordinary differential equation (ODE). The &#8220;time constant&#8221;&#8202;&#8212;&#8202;the speed at which the neuron reacts and forgets information&#8202;&#8212;&#8202;was programmed to be a function of the input itself.</p><p>If a strong, urgent signal came in, the neuron&#8217;s time constant shrank, forcing it to react instantly. If a weak signal came in, the time constant expanded, allowing the neuron to slowly accumulate information and retain memory over longer periods. 
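</p><p>A minimal sketch of that idea in Python, assuming a single neuron whose gate <code>f</code> depends only on the input (in the LTC paper the gate is a learned nonlinearity of both state and input). The weights <code>w</code> and <code>b</code>, the bias <code>A</code>, and the explicit Euler step are illustrative choices, not the authors&#8217; implementation. The state follows dx/dt = -(1/tau + f) * x + f * A, so the effective time constant is tau / (1 + tau * f): a larger gate value means a faster neuron.</p><pre><code class="language-python">import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def effective_time_constant(inp, tau=1.0, w=4.0, b=-2.0):
    """tau_sys = tau / (1 + tau * f) for one LTC-style neuron.

    w and b are made-up gate weights, chosen only for illustration.
    """
    f = sigmoid(w * inp + b)
    return tau / (1.0 + tau * f)


def ltc_step(x, inp, dt=0.01, tau=1.0, A=1.0, w=4.0, b=-2.0):
    """One explicit Euler step of dx/dt = -(1/tau + f) * x + f * A."""
    f = sigmoid(w * inp + b)
    dxdt = -(1.0 / tau + f) * x + f * A
    return x + dt * dxdt


# A weak input leaves the neuron slow; a strong input speeds it up.
print("weak input:  ", round(effective_time_constant(0.0), 3))
print("strong input:", round(effective_time_constant(2.0), 3))
</code></pre><p>With these made-up weights, the strong input shrinks the effective time constant from about 0.89 to about 0.50. 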
The network was literally rewiring its own temporal dynamics in real-time.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4kyv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4kyv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png 424w, https://substackcdn.com/image/fetch/$s_!4kyv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png 848w, https://substackcdn.com/image/fetch/$s_!4kyv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png 1272w, https://substackcdn.com/image/fetch/$s_!4kyv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4kyv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png" width="1456" height="1862" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1862,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:787215,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191951862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4kyv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png 424w, https://substackcdn.com/image/fetch/$s_!4kyv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png 848w, https://substackcdn.com/image/fetch/$s_!4kyv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png 1272w, https://substackcdn.com/image/fetch/$s_!4kyv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2767a07d-11e0-43fa-b7c8-e5c7f6cf6d3d_1500x1918.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><a href="https://www.louisbouchard.ai/mit-biologically-inspired-neural-networks-for-self-driving-cars/">The result was staggering. A network with just 19 control neurons and 75K parameters successfully drove an autonomous vehicle through complex visual environments. 
A standard neural network required thousands of neurons to do the same job.</a> They also showed strong robustness in noisy scenarios.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ic_k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbed08f9-2f91-400d-95c0-4905c8c7d3de_600x338.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ic_k!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbed08f9-2f91-400d-95c0-4905c8c7d3de_600x338.gif 424w, https://substackcdn.com/image/fetch/$s_!ic_k!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbed08f9-2f91-400d-95c0-4905c8c7d3de_600x338.gif 848w, https://substackcdn.com/image/fetch/$s_!ic_k!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbed08f9-2f91-400d-95c0-4905c8c7d3de_600x338.gif 1272w, https://substackcdn.com/image/fetch/$s_!ic_k!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbed08f9-2f91-400d-95c0-4905c8c7d3de_600x338.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ic_k!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbed08f9-2f91-400d-95c0-4905c8c7d3de_600x338.gif" width="600" height="338" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bbed08f9-2f91-400d-95c0-4905c8c7d3de_600x338.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:338,&quot;width&quot;:600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ic_k!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbed08f9-2f91-400d-95c0-4905c8c7d3de_600x338.gif 424w, https://substackcdn.com/image/fetch/$s_!ic_k!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbed08f9-2f91-400d-95c0-4905c8c7d3de_600x338.gif 848w, https://substackcdn.com/image/fetch/$s_!ic_k!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbed08f9-2f91-400d-95c0-4905c8c7d3de_600x338.gif 1272w, https://substackcdn.com/image/fetch/$s_!ic_k!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbed08f9-2f91-400d-95c0-4905c8c7d3de_600x338.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 
2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>But in software engineering, there is always a catch.</p><h4>The Math Was Too Heavy</h4><p>LTC networks were elegant, but computationally miserable to run.</p><p>Because they relied on ordinary differential equations, the computer had to use an iterative numerical solver for every single forward pass. Training was excruciatingly slow. Inference? 
Practically unusable at scale.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_A3W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe43f4511-1efe-4cf5-b0f4-d65a74345c8f_2400x1421.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_A3W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe43f4511-1efe-4cf5-b0f4-d65a74345c8f_2400x1421.png 424w, https://substackcdn.com/image/fetch/$s_!_A3W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe43f4511-1efe-4cf5-b0f4-d65a74345c8f_2400x1421.png 848w, https://substackcdn.com/image/fetch/$s_!_A3W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe43f4511-1efe-4cf5-b0f4-d65a74345c8f_2400x1421.png 1272w, https://substackcdn.com/image/fetch/$s_!_A3W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe43f4511-1efe-4cf5-b0f4-d65a74345c8f_2400x1421.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_A3W!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe43f4511-1efe-4cf5-b0f4-d65a74345c8f_2400x1421.png" width="1200" height="710.4395604395604" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e43f4511-1efe-4cf5-b0f4-d65a74345c8f_2400x1421.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:862,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_A3W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe43f4511-1efe-4cf5-b0f4-d65a74345c8f_2400x1421.png 424w, https://substackcdn.com/image/fetch/$s_!_A3W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe43f4511-1efe-4cf5-b0f4-d65a74345c8f_2400x1421.png 848w, https://substackcdn.com/image/fetch/$s_!_A3W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe43f4511-1efe-4cf5-b0f4-d65a74345c8f_2400x1421.png 1272w, https://substackcdn.com/image/fetch/$s_!_A3W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe43f4511-1efe-4cf5-b0f4-d65a74345c8f_2400x1421.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If this architecture was ever going to process billions of tokens instead of just steering a car, they had to kill the ODE solver.</p><p>In 2022, they found the cheat code: the Closed-form Continuous-time (CfC) network. They found an analytical approximation that computed the entire dynamic behavior of the system in a single step, bypassing the iterative solver entirely. 
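</p><p>To see why a closed form matters, here is a toy comparison in Python. It is not the CfC approximation itself: it assumes a plain linear neuron, dx/dt = -x / tau + u, with the input <code>u</code> held constant over the interval, a case where the exact solution is known. The function names and constants are hypothetical. The iterative version takes 1,000 small Euler steps; the closed-form version reaches essentially the same value in one evaluation.</p><pre><code class="language-python">import math


def euler_solve(x0, u, t_end, tau=0.5, steps=1000):
    """Iterative solver: many small steps of dx/dt = -x / tau + u."""
    dt = t_end / steps
    x = x0
    for _ in range(steps):
        x = x + dt * (-x / tau + u)
    return x


def closed_form(x0, u, t_end, tau=0.5):
    """Exact solution of the same linear ODE, evaluated in one step."""
    decay = math.exp(-t_end / tau)
    return x0 * decay + tau * u * (1.0 - decay)


# Same start state, same input, same horizon: two ways to get x(t_end).
print("1000 Euler steps:", euler_solve(0.0, 2.0, 1.0))
print("closed form:     ", closed_form(0.0, 2.0, 1.0))
</code></pre><p>CfC applies the same trick to the nonlinear, input-dependent dynamics of the liquid neuron, using an approximation where no exact solution is available. 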
It preserved the input-dependent adaptability of the worm&#8217;s brain, but ran up to 220 times faster.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d_Zt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18541d90-9f3c-43f9-8c28-b792de8c3da5_1212x1320.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d_Zt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18541d90-9f3c-43f9-8c28-b792de8c3da5_1212x1320.png 424w, https://substackcdn.com/image/fetch/$s_!d_Zt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18541d90-9f3c-43f9-8c28-b792de8c3da5_1212x1320.png 848w, https://substackcdn.com/image/fetch/$s_!d_Zt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18541d90-9f3c-43f9-8c28-b792de8c3da5_1212x1320.png 1272w, https://substackcdn.com/image/fetch/$s_!d_Zt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18541d90-9f3c-43f9-8c28-b792de8c3da5_1212x1320.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d_Zt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18541d90-9f3c-43f9-8c28-b792de8c3da5_1212x1320.png" width="1212" height="1320" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/18541d90-9f3c-43f9-8c28-b792de8c3da5_1212x1320.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1320,&quot;width&quot;:1212,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d_Zt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18541d90-9f3c-43f9-8c28-b792de8c3da5_1212x1320.png 424w, https://substackcdn.com/image/fetch/$s_!d_Zt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18541d90-9f3c-43f9-8c28-b792de8c3da5_1212x1320.png 848w, https://substackcdn.com/image/fetch/$s_!d_Zt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18541d90-9f3c-43f9-8c28-b792de8c3da5_1212x1320.png 1272w, https://substackcdn.com/image/fetch/$s_!d_Zt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18541d90-9f3c-43f9-8c28-b792de8c3da5_1212x1320.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>They had successfully separated the biological elegance from the computational overhead. This allowed them to kick things up a notch.</p><h4>Bridging to the Foundation Model Era</h4><p>Between 2022 and 2024, the team realized that this adaptive, input-dependent computation could address the same memory and scaling bottlenecks choking the natural language processing field.</p><p>This led to <em>Liquid-S4</em>, which merged their dynamic time constants with the State Space Models we discussed in Section 3. 
<a href="http://github.com/The-Swarm-Corporation/Hyena-Y?tab=readme-ov-file">They explored </a><em><a href="http://github.com/The-Swarm-Corporation/Hyena-Y?tab=readme-ov-file">Hyena</a></em><a href="http://github.com/The-Swarm-Corporation/Hyena-Y?tab=readme-ov-file">, proving that long convolutions could replace attention for certain sequence tasks.</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3pqH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff77350cd-ee9b-49e3-8c7f-f8118b851e3a_1488x1780.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3pqH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff77350cd-ee9b-49e3-8c7f-f8118b851e3a_1488x1780.png 424w, https://substackcdn.com/image/fetch/$s_!3pqH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff77350cd-ee9b-49e3-8c7f-f8118b851e3a_1488x1780.png 848w, https://substackcdn.com/image/fetch/$s_!3pqH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff77350cd-ee9b-49e3-8c7f-f8118b851e3a_1488x1780.png 1272w, https://substackcdn.com/image/fetch/$s_!3pqH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff77350cd-ee9b-49e3-8c7f-f8118b851e3a_1488x1780.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3pqH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff77350cd-ee9b-49e3-8c7f-f8118b851e3a_1488x1780.png" width="1456" height="1742" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f77350cd-ee9b-49e3-8c7f-f8118b851e3a_1488x1780.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1742,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3pqH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff77350cd-ee9b-49e3-8c7f-f8118b851e3a_1488x1780.png 424w, https://substackcdn.com/image/fetch/$s_!3pqH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff77350cd-ee9b-49e3-8c7f-f8118b851e3a_1488x1780.png 848w, https://substackcdn.com/image/fetch/$s_!3pqH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff77350cd-ee9b-49e3-8c7f-f8118b851e3a_1488x1780.png 1272w, https://substackcdn.com/image/fetch/$s_!3pqH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff77350cd-ee9b-49e3-8c7f-f8118b851e3a_1488x1780.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>By September 2024, they released the first generation of Liquid Foundation Models (LFM-1B and LFM-3B). These models proved the foundational thesis: a non-transformer architecture could match the benchmark quality of models like Microsoft&#8217;s Phi-3.5, while remaining significantly smaller.</p><p>But matching a transformer in a benchmark is not the same thing as surviving in a phone.</p><p>The real question was not, &#8220;Can we build a non-transformer that is smart?&#8221; The real question was, &#8220;If we have all these non-transformer operators&#8202;&#8212;&#8202;SSMs, convolutions, adaptive gates&#8202;&#8212;&#8202;which exact combination of them actually runs fastest on the hostile silicon of a Snapdragon processor, without blowing up the memory budget?&#8221;</p><p>To answer that, Liquid AI built a machine to evolve the answer for them. 
But before we look at the machine, we need to understand the deep design intuition it discovered: Architecture is just budget allocation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_zeA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1da90b-dfd9-4359-abc2-0f5e2188ae37_1400x900.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_zeA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1da90b-dfd9-4359-abc2-0f5e2188ae37_1400x900.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_zeA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1da90b-dfd9-4359-abc2-0f5e2188ae37_1400x900.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_zeA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1da90b-dfd9-4359-abc2-0f5e2188ae37_1400x900.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_zeA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1da90b-dfd9-4359-abc2-0f5e2188ae37_1400x900.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_zeA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1da90b-dfd9-4359-abc2-0f5e2188ae37_1400x900.jpeg" width="1400" height="900" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e1da90b-dfd9-4359-abc2-0f5e2188ae37_1400x900.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:900,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_zeA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1da90b-dfd9-4359-abc2-0f5e2188ae37_1400x900.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_zeA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1da90b-dfd9-4359-abc2-0f5e2188ae37_1400x900.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_zeA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1da90b-dfd9-4359-abc2-0f5e2188ae37_1400x900.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_zeA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1da90b-dfd9-4359-abc2-0f5e2188ae37_1400x900.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/how-long-context-inference-is-rewriting">Your reminder that hybrids and their technicals + costs are broken down in depth here. That&#8217;s why we will speedrun the next section</a></figcaption></figure></div><h3>5) The Hybrid Insight: Architecture as Budget Allocation</h3><p>If you step back from all the math and just look at how language actually works, the transformer&#8217;s obsession with global attention starts to look a little&#8230; <em>obsessive</em>.</p><p>Language is overwhelmingly local. If you see the word &#8220;The,&#8221; you are almost certainly looking for a noun right after it. Most of what a language model does is routine linguistic plumbing&#8202;&#8212;&#8202;figuring out subject-verb agreement or that &#8220;he&#8221; refers to &#8220;John&#8221; in the previous sentence.</p><p>Yes, long-range dependencies exist. 
If page 1 of a murder mystery mentions a specific poison, and page 300 reveals the killer, the model needs precise, global retrieval to connect those two concepts.</p><p>But standard self-attention treats <em>every single word</em> like it might be the key to the murder mystery. It pays the maximum possible computational price&#8202;&#8212;&#8202;the quadratic O(n&#178;) compute and the full KV cache&#8202;&#8212;&#8202;to compare the word &#8220;The&#8221; to 32,000 other words, just in case one of them is relevant. It is like hiring a team of forensic accountants to audit your daily coffee purchase.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8RDn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8RDn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png 424w, https://substackcdn.com/image/fetch/$s_!8RDn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png 848w, https://substackcdn.com/image/fetch/$s_!8RDn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png 1272w, https://substackcdn.com/image/fetch/$s_!8RDn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!8RDn!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png" width="1200" height="599.1758241758242" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:727,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:825891,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191951862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8RDn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png 424w, https://substackcdn.com/image/fetch/$s_!8RDn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png 848w, https://substackcdn.com/image/fetch/$s_!8RDn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png 1272w, 
https://substackcdn.com/image/fetch/$s_!8RDn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc35dea92-2a54-492b-9ba1-2d0dc80f7b2f_2212x1104.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h4>The Portfolio Allocation</h4><p>If you accept that most of language is local, the architectural answer becomes incredibly obvious: You take cheap, fast, myopic operators&#8202;&#8212;&#8202;like convolutions&#8202;&#8212;&#8202;and use them for the vast majority of the model&#8217;s layers. Let them handle the routine syntax. 
They don&#8217;t need a KV cache, and they run blazingly fast on a phone CPU.</p><p>Then, you take the expensive, memory-hogging exact retrieval mechanism&#8202;&#8212;&#8202;attention&#8202;&#8212;&#8202;and use it sparingly. You only wake it up when you actually need to find the poison on page 1. The ratio between the cheap layers and the expensive layers becomes the master control knob for your entire inference cost.</p><h4>The Survival Channel</h4><p>But if 80% of your layers are functionally blind convolutions that only look at the 3 words next to them, doesn&#8217;t the model just forget everything else?</p><p>It survives because of the <strong>residual stream</strong>.</p><p>When a token enters the model, its original mathematical meaning is posted to the chat. When a cheap convolution layer processes that token, it doesn&#8217;t delete the original meaning. It just figures out some local context and <em>adds</em> that annotation to the chat.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1ikn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83893a3c-ca25-48bc-99d6-a77f203441df_1400x1400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1ikn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83893a3c-ca25-48bc-99d6-a77f203441df_1400x1400.png 424w, https://substackcdn.com/image/fetch/$s_!1ikn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83893a3c-ca25-48bc-99d6-a77f203441df_1400x1400.png 848w, 
https://substackcdn.com/image/fetch/$s_!1ikn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83893a3c-ca25-48bc-99d6-a77f203441df_1400x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!1ikn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83893a3c-ca25-48bc-99d6-a77f203441df_1400x1400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1ikn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83893a3c-ca25-48bc-99d6-a77f203441df_1400x1400.png" width="1400" height="1400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/83893a3c-ca25-48bc-99d6-a77f203441df_1400x1400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1400,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1ikn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83893a3c-ca25-48bc-99d6-a77f203441df_1400x1400.png 424w, https://substackcdn.com/image/fetch/$s_!1ikn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83893a3c-ca25-48bc-99d6-a77f203441df_1400x1400.png 848w, 
https://substackcdn.com/image/fetch/$s_!1ikn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83893a3c-ca25-48bc-99d6-a77f203441df_1400x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!1ikn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83893a3c-ca25-48bc-99d6-a77f203441df_1400x1400.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>By the time the signal reaches the rare, expensive attention layer, that attention layer can read both the original token identity 
AND all the local context the convolutions gathered along the way. The convolutions didn&#8217;t destroy the signal; they just annotated it.</p><p>This is the exact hybrid insight Liquid AI used to build LFM2. How they found their ratios, along with the rest of their technical details, is a work of art, and it is exactly what we will look at next.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aBgu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aBgu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png 424w, https://substackcdn.com/image/fetch/$s_!aBgu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png 848w, https://substackcdn.com/image/fetch/$s_!aBgu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!aBgu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aBgu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png" width="1428" height="1008" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1008,&quot;width&quot;:1428,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:250422,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191951862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aBgu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png 424w, https://substackcdn.com/image/fetch/$s_!aBgu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png 848w, https://substackcdn.com/image/fetch/$s_!aBgu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!aBgu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723e8433-ea4b-4a75-b8d7-0be1133cb465_1428x1008.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h3>6) What Is the Architecture of LFM2, the Best Edge AI Model?</h3><p><a href="https://arxiv.org/abs/2511.23404">Sixteen total blocks. Ten are gated short convolution blocks. Six are grouped-query attention blocks. 
That is the whole model</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OsVP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc433453-9431-4dcb-b852-3ca8aefe21a0_2400x1605.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OsVP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc433453-9431-4dcb-b852-3ca8aefe21a0_2400x1605.png 424w, https://substackcdn.com/image/fetch/$s_!OsVP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc433453-9431-4dcb-b852-3ca8aefe21a0_2400x1605.png 848w, https://substackcdn.com/image/fetch/$s_!OsVP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc433453-9431-4dcb-b852-3ca8aefe21a0_2400x1605.png 1272w, https://substackcdn.com/image/fetch/$s_!OsVP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc433453-9431-4dcb-b852-3ca8aefe21a0_2400x1605.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OsVP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc433453-9431-4dcb-b852-3ca8aefe21a0_2400x1605.png" width="1456" height="974" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc433453-9431-4dcb-b852-3ca8aefe21a0_2400x1605.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:974,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OsVP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc433453-9431-4dcb-b852-3ca8aefe21a0_2400x1605.png 424w, https://substackcdn.com/image/fetch/$s_!OsVP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc433453-9431-4dcb-b852-3ca8aefe21a0_2400x1605.png 848w, https://substackcdn.com/image/fetch/$s_!OsVP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc433453-9431-4dcb-b852-3ca8aefe21a0_2400x1605.png 1272w, https://substackcdn.com/image/fetch/$s_!OsVP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc433453-9431-4dcb-b852-3ca8aefe21a0_2400x1605.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Their 10-to-6 split is predicated on the observation that exact global retrieval through self-attention is being dramatically overspent in standard architectures. Most layers in a language model do not need to look at every previous token. They just need to process local patterns cheaply and move on. 
The expensive global operation should be deployed sparingly, like a specialist you call in for the cases that actually require it, not a default you run in every single layer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3g9K!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F186499c9-ccaf-4036-bc19-eedeaf2c5bff_1256x1510.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3g9K!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F186499c9-ccaf-4036-bc19-eedeaf2c5bff_1256x1510.png 424w, https://substackcdn.com/image/fetch/$s_!3g9K!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F186499c9-ccaf-4036-bc19-eedeaf2c5bff_1256x1510.png 848w, https://substackcdn.com/image/fetch/$s_!3g9K!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F186499c9-ccaf-4036-bc19-eedeaf2c5bff_1256x1510.png 1272w, https://substackcdn.com/image/fetch/$s_!3g9K!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F186499c9-ccaf-4036-bc19-eedeaf2c5bff_1256x1510.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3g9K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F186499c9-ccaf-4036-bc19-eedeaf2c5bff_1256x1510.png" width="1256" height="1510" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/186499c9-ccaf-4036-bc19-eedeaf2c5bff_1256x1510.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1510,&quot;width&quot;:1256,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3g9K!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F186499c9-ccaf-4036-bc19-eedeaf2c5bff_1256x1510.png 424w, https://substackcdn.com/image/fetch/$s_!3g9K!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F186499c9-ccaf-4036-bc19-eedeaf2c5bff_1256x1510.png 848w, https://substackcdn.com/image/fetch/$s_!3g9K!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F186499c9-ccaf-4036-bc19-eedeaf2c5bff_1256x1510.png 1272w, https://substackcdn.com/image/fetch/$s_!3g9K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F186499c9-ccaf-4036-bc19-eedeaf2c5bff_1256x1510.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>Let&#8217;s break this down further. When you think about it, every block in a language model pays for two jobs.</p><ul><li><p>The first is <strong>token mixing</strong>: letting words talk to each other to absorb context. This is where the transformer&#8217;s budget either bleeds out or holds, because standard token mixing is self-attention (the expensive, KV-cache growing tax).</p></li><li><p>The second job is <strong>channel mixing</strong>: processing the individual word <em>after</em> it has absorbed context. In LFM2, channel mixing is handled by a SwiGLU feed-forward network, which is a fixed cost and maintains no cache.</p></li></ul><p>So, the entire question of whether an architecture can survive on a phone reduces to one decision: what do you use for token mixing? If you use attention everywhere, you pay the cache tax everywhere. 
LFM2 found a cheaper token mixer to handle the routine work.</p><h4>The Gated Short Convolution Block: Adaptivity Without the KV Cache</h4><p>Remember our observation that most text probably cares about a few local dependencies? This is where it becomes useful.</p><p>Most people think that if you want a model that adapts its behavior based on what it is reading, you need attention. Without it, you are stuck with dumb, fixed operations.</p><p>People are wrong (shocker).</p><p>LFM2 adapts its behavior based on every token it processes without a single byte of KV cache or any other memory that grows with context length.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wttj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F400a698b-1c93-4033-a90f-8e5c91f324db_2400x1499.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wttj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F400a698b-1c93-4033-a90f-8e5c91f324db_2400x1499.png 424w, https://substackcdn.com/image/fetch/$s_!Wttj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F400a698b-1c93-4033-a90f-8e5c91f324db_2400x1499.png 848w, https://substackcdn.com/image/fetch/$s_!Wttj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F400a698b-1c93-4033-a90f-8e5c91f324db_2400x1499.png 1272w, https://substackcdn.com/image/fetch/$s_!Wttj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F400a698b-1c93-4033-a90f-8e5c91f324db_2400x1499.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Wttj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F400a698b-1c93-4033-a90f-8e5c91f324db_2400x1499.png" width="1456" height="909" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/400a698b-1c93-4033-a90f-8e5c91f324db_2400x1499.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:909,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Wttj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F400a698b-1c93-4033-a90f-8e5c91f324db_2400x1499.png 424w, https://substackcdn.com/image/fetch/$s_!Wttj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F400a698b-1c93-4033-a90f-8e5c91f324db_2400x1499.png 848w, https://substackcdn.com/image/fetch/$s_!Wttj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F400a698b-1c93-4033-a90f-8e5c91f324db_2400x1499.png 1272w, https://substackcdn.com/image/fetch/$s_!Wttj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F400a698b-1c93-4033-a90f-8e5c91f324db_2400x1499.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>Here is how:</p><ul><li><p><strong>Step 1: Split the input.</strong> The incoming token vector is multiplied by a learned weight matrix (a linear projection) to produce three separate vectors: <strong>B, C, and h_tilde. h_tilde is the working copy of the token&#8217;s information. B and C are gates: vectors of numbers that control signal flow.</strong> Critically, B and C are generated directly from the current token&#8217;s representation, meaning the model produces different gates for different inputs. The adaptivity is built in at step 1 before any mixing occurs.</p></li><li><p><strong>Step 2: Apply the first gate.</strong> Gate B is multiplied element by element against h_tilde to produce a new vector y. 
This first gate modulates the signal entering the convolution. The network is not detecting a concept like &#8220;ambiguity&#8221;; it rescales each channel based on the input vector itself, which is what makes the block a Linear Input-Varying (LIV) operator. Adaptive dynamics without the differential equations.</p></li><li><p><strong>Step 3: The convolution.</strong> The gated signal y passes through a depthwise 1D convolution with a kernel size of 3. The convolution is causal: it acts as a sliding window that combines the current token with the two immediately preceding it. &#8220;Depthwise&#8221; means it mixes positions within each channel independently, but does not mix between channels, deliberately keeping the operation cheap. For position t and channel c, the math is: <code>z[t, c] = w0[c] * y[t, c] + w1[c] * y[t-1, c] + w2[c] * y[t-2, c]</code>. This requires three weights per channel, shared across all positions, and a tiny rolling buffer of the two previous gated inputs that never grows, ensuring a constant cost per token.</p></li><li><p><strong>Step 4: The second gate and exit.</strong> Gate C is multiplied element by element against the convolution output z. 
This modulates the exiting signal before it is multiplied by a final weight matrix to map back to the model dimension.</p></li></ul><p>Ten layers of this block provide fully adaptive token mixing with no memory that grows with context: each layer keeps only its fixed two-token rolling buffer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YTk5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb84b15-04f5-4a7f-b551-b0c1790e07fa_1272x1604.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YTk5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb84b15-04f5-4a7f-b551-b0c1790e07fa_1272x1604.png 424w, https://substackcdn.com/image/fetch/$s_!YTk5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb84b15-04f5-4a7f-b551-b0c1790e07fa_1272x1604.png 848w, https://substackcdn.com/image/fetch/$s_!YTk5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb84b15-04f5-4a7f-b551-b0c1790e07fa_1272x1604.png 1272w, https://substackcdn.com/image/fetch/$s_!YTk5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb84b15-04f5-4a7f-b551-b0c1790e07fa_1272x1604.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YTk5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb84b15-04f5-4a7f-b551-b0c1790e07fa_1272x1604.png" width="1272" height="1604" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bb84b15-04f5-4a7f-b551-b0c1790e07fa_1272x1604.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1604,&quot;width&quot;:1272,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YTk5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb84b15-04f5-4a7f-b551-b0c1790e07fa_1272x1604.png 424w, https://substackcdn.com/image/fetch/$s_!YTk5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb84b15-04f5-4a7f-b551-b0c1790e07fa_1272x1604.png 848w, https://substackcdn.com/image/fetch/$s_!YTk5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb84b15-04f5-4a7f-b551-b0c1790e07fa_1272x1604.png 1272w, https://substackcdn.com/image/fetch/$s_!YTk5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb84b15-04f5-4a7f-b551-b0c1790e07fa_1272x1604.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>But a kernel size of 3 means the layer is blind to anything more than two tokens back. How much of a problem is that really?</p><h4>Why a Tiny Kernel Size of 3 Actually Works for Language</h4><p>Liquid AI&#8217;s architecture search system (STAR) evaluated multiple kernel sizes. The search consistently rejected broader windows. While a size-64 kernel offered small quality gains, it incurred massive hardware penalties via more multiplications and worse CPU cache efficiency.</p><p>This is an empirical finding about language modeling: the mechanical work of processing language (subject-verb agreement, bigram statistics) is overwhelmingly local. Stacking small kernels also naturally grows the effective reach; two size-3 layers in a row allow information to diffuse across five tokens.</p><p>More importantly for builders, depthwise 1D convolutions are standard operations across inference frameworks like llama.cpp, ExecuTorch, MLX, and ONNX Runtime. 
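</p><p>To make that concrete, here is a minimal pure-Python sketch of the gated short convolution block described in the four steps above, plus the receptive-field arithmetic for stacked kernels. Treat it as an illustration under assumptions, not Liquid AI&#8217;s implementation: the names <code>gated_short_conv</code>, <code>W_in</code>, <code>conv_w</code>, <code>W_out</code>, and <code>receptive_field</code> are made up for this example, the weights are random, and normalization, biases, and the surrounding residual connection are left out.</p>
<pre>
```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def gated_short_conv(tokens, W_in, conv_w, W_out):
    """Run one gated short-convolution block over a sequence (hypothetical sketch).

    tokens: list of d-dimensional vectors, oldest first
    W_in:   3d x d projection that produces B, C and h_tilde
    conv_w: per-channel taps [w0, w1, w2] for offsets t, t-1, t-2
    W_out:  d x d output projection
    """
    d = len(tokens[0])
    y_prev1 = [0.0] * d  # gated input at t-1
    y_prev2 = [0.0] * d  # gated input at t-2
    outputs = []
    for x in tokens:
        # Step 1: one projection, split into two gates and a working copy.
        proj = matvec(W_in, x)
        B, C, h_tilde = proj[:d], proj[d:2 * d], proj[2 * d:]
        # Step 2: the first gate modulates what enters the convolution.
        y = [b * h for b, h in zip(B, h_tilde)]
        # Step 3: depthwise causal convolution, kernel size 3.
        z = [
            conv_w[c][0] * y[c] + conv_w[c][1] * y_prev1[c] + conv_w[c][2] * y_prev2[c]
            for c in range(d)
        ]
        # Step 4: the second gate, then project back to the model dimension.
        outputs.append(matvec(W_out, [g * v for g, v in zip(C, z)]))
        # The rolling buffer holds two vectors no matter how long the input is.
        y_prev2, y_prev1 = y_prev1, y
    return outputs


def receptive_field(n_layers, kernel_size=3):
    """Tokens visible after stacking n causal convolutions of the given size."""
    return 1 + n_layers * (kernel_size - 1)


random.seed(0)
d = 4
W_in = [[random.gauss(0, 1) for _ in range(d)] for _ in range(3 * d)]
W_out = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
conv_w = [[random.gauss(0, 1) for _ in range(3)] for _ in range(d)]
seq = [[random.gauss(0, 1) for _ in range(d)] for _ in range(6)]

base = gated_short_conv(seq, W_in, conv_w, W_out)
edited = gated_short_conv([[v + 1.0 for v in seq[0]]] + seq[1:], W_in, conv_w, W_out)
unchanged = [t for t in range(len(seq)) if base[t] == edited[t]]
print(unchanged)           # positions the edit to token 0 could not reach
print(receptive_field(2))  # 5
```
</pre>
<p>Editing the first token can only move the outputs at positions 0 to 2; from position 3 onward nothing changes, which is the locality this section describes. The only state carried between tokens is the two previous gated vectors.</p><p>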
Unlike Mamba, which requires custom associative scan kernels that many mobile stacks lack, LFM2&#8217;s local blocks are shippable today.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cKUs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff408eb7d-a8f4-4695-b185-a642597785de_1258x1438.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cKUs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff408eb7d-a8f4-4695-b185-a642597785de_1258x1438.png 424w, https://substackcdn.com/image/fetch/$s_!cKUs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff408eb7d-a8f4-4695-b185-a642597785de_1258x1438.png 848w, https://substackcdn.com/image/fetch/$s_!cKUs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff408eb7d-a8f4-4695-b185-a642597785de_1258x1438.png 1272w, https://substackcdn.com/image/fetch/$s_!cKUs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff408eb7d-a8f4-4695-b185-a642597785de_1258x1438.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cKUs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff408eb7d-a8f4-4695-b185-a642597785de_1258x1438.png" width="1258" height="1438" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f408eb7d-a8f4-4695-b185-a642597785de_1258x1438.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1438,&quot;width&quot;:1258,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cKUs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff408eb7d-a8f4-4695-b185-a642597785de_1258x1438.png 424w, https://substackcdn.com/image/fetch/$s_!cKUs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff408eb7d-a8f4-4695-b185-a642597785de_1258x1438.png 848w, https://substackcdn.com/image/fetch/$s_!cKUs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff408eb7d-a8f4-4695-b185-a642597785de_1258x1438.png 1272w, https://substackcdn.com/image/fetch/$s_!cKUs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff408eb7d-a8f4-4695-b185-a642597785de_1258x1438.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>If ten layers handle the cheap work, the remaining six must justify their memory bill.</p><h4>The 6 Attention Layers: Paying for the Premium Global Retrieval Service</h4><p>LFM2 retains six grouped-query attention (GQA) blocks in which each token can look at every previous token, at the cost of maintaining a KV cache. Standard multi-head attention gives every query head its own private Key-Value archive. LFM2 uses 32 query heads but restricts them to 8 KV groups, meaning every 4 query heads share the same Key-Value archive. This preserves 32 distinct attention patterns while dropping the storage cost by 4x.</p><p>Distance is handled by Rotary Position Embedding (RoPE), which rotates query and key vectors so that their dot product depends on the relative distance between tokens. The convolution layers require no positional encoding at all, because their three weights are permanently assigned to specific positional offsets. 
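</p><p>The head-sharing arithmetic is easy to check by hand. The sketch below maps each of the 32 query heads to one of 8 KV groups and counts how many numbers each attention layer has to cache per token. It assumes contiguous grouping (heads 0 to 3 share group 0, and so on), which is a common convention rather than something stated here, and it uses the 64-dimensional heads quoted elsewhere in this piece. The function names are hypothetical.</p>
<pre>
```python
N_QUERY_HEADS = 32
N_KV_GROUPS = 8
HEAD_DIM = 64


def kv_group_for_head(query_head, n_query_heads=N_QUERY_HEADS, n_kv_groups=N_KV_GROUPS):
    """Return the index of the KV group that a query head reads from."""
    heads_per_group = n_query_heads // n_kv_groups
    return query_head // heads_per_group


def kv_values_per_token(n_kv_heads, head_dim=HEAD_DIM):
    """Count cached numbers per token per attention layer (keys plus values)."""
    return 2 * n_kv_heads * head_dim


mha = kv_values_per_token(N_QUERY_HEADS)  # one K/V pair per query head
gqa = kv_values_per_token(N_KV_GROUPS)    # one K/V pair per group

print([kv_group_for_head(h) for h in range(8)])  # [0, 0, 0, 0, 1, 1, 1, 1]
print(mha, gqa, mha // gqa)                      # 4096 1024 4
```
</pre>
<p>All 32 query heads still compute their own attention pattern; only the stored keys and values are shared, which is where the 4x saving comes from.</p><p>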
Every component is optimized to lower the final cache math.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bDxI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7efbbc7e-6ddc-4b9b-bea6-20a87c69da21_1152x1468.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bDxI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7efbbc7e-6ddc-4b9b-bea6-20a87c69da21_1152x1468.png 424w, https://substackcdn.com/image/fetch/$s_!bDxI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7efbbc7e-6ddc-4b9b-bea6-20a87c69da21_1152x1468.png 848w, https://substackcdn.com/image/fetch/$s_!bDxI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7efbbc7e-6ddc-4b9b-bea6-20a87c69da21_1152x1468.png 1272w, https://substackcdn.com/image/fetch/$s_!bDxI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7efbbc7e-6ddc-4b9b-bea6-20a87c69da21_1152x1468.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bDxI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7efbbc7e-6ddc-4b9b-bea6-20a87c69da21_1152x1468.png" width="1152" height="1468" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7efbbc7e-6ddc-4b9b-bea6-20a87c69da21_1152x1468.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1468,&quot;width&quot;:1152,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bDxI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7efbbc7e-6ddc-4b9b-bea6-20a87c69da21_1152x1468.png 424w, https://substackcdn.com/image/fetch/$s_!bDxI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7efbbc7e-6ddc-4b9b-bea6-20a87c69da21_1152x1468.png 848w, https://substackcdn.com/image/fetch/$s_!bDxI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7efbbc7e-6ddc-4b9b-bea6-20a87c69da21_1152x1468.png 1272w, https://substackcdn.com/image/fetch/$s_!bDxI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7efbbc7e-6ddc-4b9b-bea6-20a87c69da21_1152x1468.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>To maintain stability downstream of ten convolution blocks, LFM2 applies QK-Norm. Without it, attention scores can grow without bound as training progresses: the model makes queries and keys &#8220;louder&#8221; over time, and the dot product spikes. The softmax then collapses into a hard winner-take-all that ignores everything except one token. The attention pattern stops being useful.</p><p>LFM2 normalizes query and key vectors before the dot product. After normalization (assuming RMS-style normalization with unit gain and the usual scaling by the square root of the head dimension), the maximum possible attention score is bounded by the square root of the head dimension; for 64-dimensional heads, that ceiling is 8. This matters specifically because these rescue layers sit downstream of ten convolution blocks accumulating local context. 
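</p><p>A quick numerical check of that ceiling. This is a sketch under assumptions: it uses RMS normalization with no learned gain and the usual division by the square root of the head dimension, and the helper names are invented for the example. The worst case is a key pointing in exactly the same direction as the query.</p>
<pre>
```python
import math
import random


def rms_normalize(v):
    """Scale v so its root-mean-square is 1 (no learned gain, by assumption)."""
    rms = math.sqrt(sum(x * x for x in v) / len(v))
    return [x / rms for x in v]


def attention_logit(q, k, use_qk_norm):
    """Scaled dot-product score between one query and one key."""
    if use_qk_norm:
        q, k = rms_normalize(q), rms_normalize(k)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))


random.seed(0)
head_dim = 64
q = [random.gauss(0, 1) * 50 for _ in range(head_dim)]  # a "loud" query
k = list(q)  # worst case: key perfectly aligned with the query

raw = attention_logit(q, k, use_qk_norm=False)
normed = attention_logit(q, k, use_qk_norm=True)

# Random loud keys can never beat the aligned one after normalization.
random_keys = [[random.gauss(0, 1) * 50 for _ in range(head_dim)] for _ in range(1000)]
worst = max(abs(attention_logit(q, key, use_qk_norm=True)) for key in random_keys)
headroom = math.sqrt(head_dim) - worst  # stays non-negative

print(round(normed, 6))  # 8.0
print(raw)               # unnormalized score: orders of magnitude larger
```
</pre>
<p>However loud the raw vectors get, the normalized score tops out at 8, so the softmax never sees the runaway values that cause the collapse.</p><p>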
If the attention scores are unstable, the premium service that justifies the entire architecture&#8217;s design breaks down exactly when you need it most.</p><h4>The Actual KV Cache Math: 197 MB vs 524 MB</h4><p>Here is the formula I&#8217;m sure you have imprinted into your psyche by now: <code>Cache per token = 2 * attention layers * KV groups * dimension per head * bytes per element</code></p><p>For LFM2, with an 8-bit cache: <code>2 * 6 layers * 8 groups * 64 dimensions * 1 byte = 6,144 bytes per token</code>. At 32,000 tokens, that is roughly 197 megabytes.</p><p>Compare that to Llama 3.2 1B (same groups, dimensions, and precision, but 16 attention layers). That is 16,384 bytes per token, or 524 megabytes at a 32,000 token context. (In binary units at a 32,768-token context, the same comparison is 192 MiB vs 512 MiB.)</p><p>Running 6 attention layers instead of 16 cuts the cache by about 63%. Compare it to a full 16-layer standard multi-head attention stack with 32 KV heads per layer, and LFM2 stores about 9.4% as much KV memory, a reduction of roughly 91%. The memory win compounds sparser attention with cheaper attention.</p><p>Turns out, we didn&#8217;t need to pay the complete cost of exact retrieval every layer.</p><p>But this chindi (penny-pinching) behavior creates a terrifying mechanistic problem.</p><h4>How Information Survives the Blind Layers: Residual Connections</h4><p>If ten layers can only see three tokens, how does a fact from 100 tokens ago survive long enough for an attention layer to retrieve it? Why doesn&#8217;t the representation turn into local soup before a rescue layer arrives?</p><p>The answer is residual connections.</p><p>If each layer simply overwrote its input with a transformed version, early information would be destroyed immediately. Residual connections prevent this with an additive rule: <code>output = input + Layer(input)</code>. The layer computes something and adds it on top of what came before. The raw token identity is always mathematically present in the stream because it was never subtracted out. 
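</p><p>The additive rule is simple enough to trace by hand. A toy sketch, with made-up layers and a three-number token standing in for the real 2048-dimension vector: the final state is literally the original token plus the sum of everything the layers wrote. The helper names are hypothetical.</p>
<pre>
```python
def run_residual_stack(x, layers):
    """Apply output = input + layer(input) through each layer in turn."""
    updates = []
    for layer in layers:
        delta = layer(x)                       # what this layer wants to add
        updates.append(delta)
        x = [a + b for a, b in zip(x, delta)]  # residual connection
    return x, updates


def scale_by(factor):
    """Build a toy layer that writes a scaled copy of its input."""
    return lambda v: [factor * a for a in v]


def write_constant(value):
    """Build a toy layer that writes the same number into every channel."""
    return lambda v: [value for _ in v]


token = [1.0, -2.0, 3.0]
layers = [scale_by(0.1), write_constant(0.5), scale_by(-0.2)]
final, updates = run_residual_stack(token, layers)

# Subtract every layer's contribution and the original token is still there.
total_update = [sum(u[i] for u in updates) for i in range(len(token))]
recovered = [f - t for f, t in zip(final, total_update)]
print([round(r, 9) for r in recovered])  # [1.0, -2.0, 3.0]
```
</pre>
<p>The subtraction only works because the sketch kept a record of every update. A real model has no such record, which is why the original identity gets harder to read out as more layers write on top of it.</p><p>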
Dimensional separation in the 2048-dimension vector allows the token identity and the local context annotations to broadcast on different frequencies without destroying each other.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Jo5j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1337972-b989-430f-8b4a-4bf8d1e8cf4e_1200x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Jo5j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1337972-b989-430f-8b4a-4bf8d1e8cf4e_1200x800.png 424w, https://substackcdn.com/image/fetch/$s_!Jo5j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1337972-b989-430f-8b4a-4bf8d1e8cf4e_1200x800.png 848w, https://substackcdn.com/image/fetch/$s_!Jo5j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1337972-b989-430f-8b4a-4bf8d1e8cf4e_1200x800.png 1272w, https://substackcdn.com/image/fetch/$s_!Jo5j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1337972-b989-430f-8b4a-4bf8d1e8cf4e_1200x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Jo5j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1337972-b989-430f-8b4a-4bf8d1e8cf4e_1200x800.png" width="1200" height="800" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c1337972-b989-430f-8b4a-4bf8d1e8cf4e_1200x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Jo5j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1337972-b989-430f-8b4a-4bf8d1e8cf4e_1200x800.png 424w, https://substackcdn.com/image/fetch/$s_!Jo5j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1337972-b989-430f-8b4a-4bf8d1e8cf4e_1200x800.png 848w, https://substackcdn.com/image/fetch/$s_!Jo5j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1337972-b989-430f-8b4a-4bf8d1e8cf4e_1200x800.png 1272w, https://substackcdn.com/image/fetch/$s_!Jo5j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1337972-b989-430f-8b4a-4bf8d1e8cf4e_1200x800.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>Evidence suggests that after 4 to 8 consecutive blind layers, verbatim needle-in-a-haystack recall dies. Local annotations eventually contextualize the representation so heavily that exact identity becomes hard to recover. LFM2 spaces its 6 attention layers out to roughly one every 2.7 blocks, keeping it in the safe zone.</p><p>This is the explicit tradeoff. For forensic citation of 100,000-token documents, this architecture will underperform a pure transformer. 
But for on-device conversations, working ambiently (which means a lot of irrelevant tokens coming in), summarization, and agentic tool use, it dominates.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P2Gd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P2Gd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png 424w, https://substackcdn.com/image/fetch/$s_!P2Gd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png 848w, https://substackcdn.com/image/fetch/$s_!P2Gd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png 1272w, https://substackcdn.com/image/fetch/$s_!P2Gd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P2Gd!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png" width="1200" height="651.9230769230769" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:791,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:775713,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191951862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P2Gd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png 424w, https://substackcdn.com/image/fetch/$s_!P2Gd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png 848w, https://substackcdn.com/image/fetch/$s_!P2Gd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png 1272w, https://substackcdn.com/image/fetch/$s_!P2Gd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2f130a-c6c8-413c-82be-c03a28ada58d_2156x1172.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As alluded to earlier, Liquid AI didn&#8217;t just guess this exact 10/6 configuration. They built a machine to mathematically evolve the answer on actual phones under real constraints. That machine is called STAR, and its long-term value is greater than that of any single model it produces. Rather than going deeper into the layer-level details, we will turn to STAR now.</p><h3>7) How STAR Works: The Architecture Search Machine</h3><p>The 10 convolution layers. The 6 attention layers. Kernel size 3. Eight KV groups. The exact spacing of the rescue layers. None of that was guessed. 
Nobody sat in a room and had an inspired, glasses-pushed-up epiphany in which the optimal edge architecture descended from heaven.</p><p>Hand-designing an AI architecture is, at best, an educated guess constrained by human intuition. When AI21 designed Jamba, they ran ablations comparing a 1:7 attention-to-Mamba ratio against a 1:3 ratio, found little quality difference, and picked 1:7 because it was more compute-efficient. That is better than guessing blind, but it is still a manual search through a tiny slice of the design space, limited to the clean ratios a human thought to test. Yet the right architecture for a Samsung phone might be completely different from the right architecture for an AMD laptop, and neither is likely to be a clean ratio that looks good on a whiteboard.</p><p>Liquid AI built a search machine to do the embarrassing part: brutally test architectural ideas on real hardware until most of them die.</p><p><a href="https://arxiv.org/abs/2411.17800">That machine is STAR (Synthesis of Tailored Architectures).</a> If you are trying to understand what is actually durable about Liquid AI, STAR matters more than any individual 1.2B model ever will. Models age fast. Benchmarks move. But a system that takes a hardware target, a latency budget, a memory ceiling, and a quality suite, then mathematically evolves the architecture that best fits that exact regime? That is not one good model. 
That is a factory for good models.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7iLr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7iLr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png 424w, https://substackcdn.com/image/fetch/$s_!7iLr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png 848w, https://substackcdn.com/image/fetch/$s_!7iLr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png 1272w, https://substackcdn.com/image/fetch/$s_!7iLr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7iLr!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png" width="1200" height="742.5824175824176" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:901,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:543186,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191951862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7iLr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png 424w, https://substackcdn.com/image/fetch/$s_!7iLr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png 848w, https://substackcdn.com/image/fetch/$s_!7iLr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png 1272w, https://substackcdn.com/image/fetch/$s_!7iLr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa18d6c9b-2da4-4111-a5a2-b90bc5866965_2176x1346.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>[Top Left]: Population of architectures undergoing iterative STAR evolution to minimize number of parameters and maximize quality. [Top Right]: Baseline Transformer++, hybrid model, and representative architecture found via STAR. [Bottom]: STAR evolution optimizes architectures using principles of evolutionary optimization, including assessment, recombination, and mutation.</em></figcaption></figure></div><p><a href="https://www.artificialintelligencemadesimple.com/p/understanding-googles-revolutionary?utm_source=publication-search">(FYI&#8202;&#8212;&#8202;this is something that Google was one of the first to pioneer at scale. Their model training system&#8202;&#8212;&#8202;Pathways&#8202;&#8212;&#8202;is one of the reasons why they can churn out amazing models at scale. 
Read more about it here</a>).</p><h4>Why Hand-Designing Architectures Is a Deployment Hazard</h4><p>Even for a 16-layer model, the number of possible configurations of operators, sharing patterns, expansion factors, and featurizers is combinatorially vast. A researcher can explore maybe a dozen configurations in a focused effort. STAR evaluates entire populations across generations.</p><p>When the hardware target changes&#8202;&#8212;&#8202;say, from a Snapdragon 8 Gen 3 to a Gen 5&#8202;&#8212;&#8202;the optimal architecture may shift in ways no human would predict, for example because the cache hierarchy changed or the compiler handles operations differently. A researcher has to start over with new intuitions. STAR simply reruns the search.</p><p>There is also the bias problem. Researchers tend to optimize for elegance. They gravitate toward architectures that have nice theoretical properties, look clean in a diagram, and make for a good NeurIPS oral presentation. Hardware does not care about elegance. Hardware only cares whether the operations map efficiently to its execution units. 
STAR has no aesthetic preferences; it only knows what is fast, what is small, and what scores well.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y6io!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40bd9e5e-bdbd-4e8f-93bd-db921fccc468_1894x1362.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y6io!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40bd9e5e-bdbd-4e8f-93bd-db921fccc468_1894x1362.png 424w, https://substackcdn.com/image/fetch/$s_!y6io!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40bd9e5e-bdbd-4e8f-93bd-db921fccc468_1894x1362.png 848w, https://substackcdn.com/image/fetch/$s_!y6io!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40bd9e5e-bdbd-4e8f-93bd-db921fccc468_1894x1362.png 1272w, https://substackcdn.com/image/fetch/$s_!y6io!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40bd9e5e-bdbd-4e8f-93bd-db921fccc468_1894x1362.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y6io!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40bd9e5e-bdbd-4e8f-93bd-db921fccc468_1894x1362.png" width="1200" height="862.9120879120879" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/40bd9e5e-bdbd-4e8f-93bd-db921fccc468_1894x1362.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1047,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y6io!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40bd9e5e-bdbd-4e8f-93bd-db921fccc468_1894x1362.png 424w, https://substackcdn.com/image/fetch/$s_!y6io!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40bd9e5e-bdbd-4e8f-93bd-db921fccc468_1894x1362.png 848w, https://substackcdn.com/image/fetch/$s_!y6io!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40bd9e5e-bdbd-4e8f-93bd-db921fccc468_1894x1362.png 1272w, https://substackcdn.com/image/fetch/$s_!y6io!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40bd9e5e-bdbd-4e8f-93bd-db921fccc468_1894x1362.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em><strong>&#8220;When optimizing for quality and size, 7/8 evaluated STAR-evolved architectures improve over Transformer++ and striped hybrids of recurrences and attention across downstream evaluation benchmarks (Gao et al., 2024), with a reduction of up to 13% in parameter counts. Similarly, optimizing for quality and cache size, 7/8 evaluated STAR-evolved architectures achieve up to 37% smaller cache sizes than striped hybrids, and 90% smaller than Transformers, while performing at least as well in quality. We also show that 125M-parameter architectures optimized for quality and cache by STAR can scale to 1B parameters and perform on par with parameter-matched Transformer++ and striped hybrid architectures, while maintaining the same advantages in cache size reductions. 
When optimizing solely for quality, all evaluated STAR-evolved architectures outperform standard hybrids on downstream benchmarks, achieving improvements twice as large as those of hybrids over Transformers.&#8221;</strong></em></figcaption></figure></div><p>But to search this vast space effectively, you need a common mathematical language that can express all major sequence operators inside the same geometry. This is harder than you&#8217;d think, and it is why the abstraction has to be chosen carefully (which makes me think that studying category theory might lead to some interesting breakthroughs for AI, especially for more meta-level work).</p><h4>The LIV Abstraction: One Algebra for Every Architecture</h4><p>STAR&#8217;s search space is built on a theoretical foundation called the Linear Input-Varying (LIV) operator.</p><p>In plain terms, an LIV operator produces its output by multiplying the input sequence by a linear transformation, but that transformation is itself generated from the input: <code>Output = T(x) * x</code>, where <code>T(x)</code> is a matrix that depends on the input <code>x</code> and <code>*</code> is a matrix product. Once <code>T(x)</code> is fixed, the map applied to the sequence is linear; the structure imposed on <code>T(x)</code> is what distinguishes one operator from another.</p><p>What this buys the search system is algebraic unification. It expresses attention, linear attention, state space models, convolutions, and gating operators as special cases of the exact same family:</p><ul><li><p>If <code>T(x)</code> is the softmax attention matrix built from query-key dot products, you get standard self-attention.</p></li><li><p>If <code>T(x)</code> has the structured form of a state-transition operator, you land in Mamba territory.</p></li><li><p>If <code>T(x)</code> is Toeplitz-structured with learned, input-dependent gating, you get LFM2&#8217;s gated convolution.</p></li></ul><p>These terms might be a bit intimidating to you, so here is a visual that walks you through the same ideas. 
It&#8217;ll make understanding this stuff a lot easier.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3TyA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa8addb6-e7d9-4ebd-9d72-ef4e04492aa0_1410x2708.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3TyA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa8addb6-e7d9-4ebd-9d72-ef4e04492aa0_1410x2708.png 424w, https://substackcdn.com/image/fetch/$s_!3TyA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa8addb6-e7d9-4ebd-9d72-ef4e04492aa0_1410x2708.png 848w, https://substackcdn.com/image/fetch/$s_!3TyA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa8addb6-e7d9-4ebd-9d72-ef4e04492aa0_1410x2708.png 1272w, https://substackcdn.com/image/fetch/$s_!3TyA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa8addb6-e7d9-4ebd-9d72-ef4e04492aa0_1410x2708.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3TyA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa8addb6-e7d9-4ebd-9d72-ef4e04492aa0_1410x2708.png" width="1410" height="2708" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa8addb6-e7d9-4ebd-9d72-ef4e04492aa0_1410x2708.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2708,&quot;width&quot;:1410,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3TyA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa8addb6-e7d9-4ebd-9d72-ef4e04492aa0_1410x2708.png 424w, https://substackcdn.com/image/fetch/$s_!3TyA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa8addb6-e7d9-4ebd-9d72-ef4e04492aa0_1410x2708.png 848w, https://substackcdn.com/image/fetch/$s_!3TyA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa8addb6-e7d9-4ebd-9d72-ef4e04492aa0_1410x2708.png 1272w, https://substackcdn.com/image/fetch/$s_!3TyA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa8addb6-e7d9-4ebd-9d72-ef4e04492aa0_1410x2708.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Because STAR searches one unified operator family rather than choosing among a handful of named block types, it can discover intermediate or mixed structures that a human would be unlikely to propose. 
This turns the space between &#8220;convolution&#8221; and &#8220;SSM&#8221; from a vacuum to a rich set of unnamed configurations with (potentially) excellent hardware behavior.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!J-jg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e6fa661-0b4a-4251-b22a-08780ad10e68_2400x784.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!J-jg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e6fa661-0b4a-4251-b22a-08780ad10e68_2400x784.png 424w, https://substackcdn.com/image/fetch/$s_!J-jg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e6fa661-0b4a-4251-b22a-08780ad10e68_2400x784.png 848w, https://substackcdn.com/image/fetch/$s_!J-jg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e6fa661-0b4a-4251-b22a-08780ad10e68_2400x784.png 1272w, https://substackcdn.com/image/fetch/$s_!J-jg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e6fa661-0b4a-4251-b22a-08780ad10e68_2400x784.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!J-jg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e6fa661-0b4a-4251-b22a-08780ad10e68_2400x784.png" width="1456" height="476" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e6fa661-0b4a-4251-b22a-08780ad10e68_2400x784.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:476,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!J-jg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e6fa661-0b4a-4251-b22a-08780ad10e68_2400x784.png 424w, https://substackcdn.com/image/fetch/$s_!J-jg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e6fa661-0b4a-4251-b22a-08780ad10e68_2400x784.png 848w, https://substackcdn.com/image/fetch/$s_!J-jg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e6fa661-0b4a-4251-b22a-08780ad10e68_2400x784.png 1272w, https://substackcdn.com/image/fetch/$s_!J-jg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e6fa661-0b4a-4251-b22a-08780ad10e68_2400x784.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">&#8220;Hierarchical structure of the STAR genome. Each sequence at lower levels is summarized into a single value at higher levels, enabling its treatment as a discrete variable. We leverage this property extensively when optimizing backbones directly&#8221;</figcaption></figure></div><h4>The Architectural Genome: Evolving at Three Resolutions</h4><p>To work effectively, STAR encodes every candidate architecture as a hierarchical genome&#8202;&#8212;&#8202;a sequence of integers describing the architecture at multiple scales simultaneously:</p><ul><li><p><strong>Backbone Genome (Macro):</strong> One operator ID per layer. This decides the body plan&#8202;&#8212;&#8202;what kinds of blocks appear and where.</p></li><li><p><strong>Operator Genome (Wiring):</strong> Expands the operator ID into a short code specifying the token mixer, channel mixing pattern, and nonlinearity.</p></li><li><p><strong>Featurizer Genome (Micro):</strong> Defines expansion factors, repeat counts, and internal sharing patterns. 
These are the tiny choices that quietly determine whether a model ships or crashes.</p></li></ul><p>This hierarchy allows evolutionary search to work at multiple resolutions. A mutation can swap whole operator types across layers, rewire one specific layer, or tune fine-grained details without disturbing the broader architecture.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oeDQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc13ba400-5065-4497-b631-1a496791496f_1460x2000.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oeDQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc13ba400-5065-4497-b631-1a496791496f_1460x2000.png 424w, https://substackcdn.com/image/fetch/$s_!oeDQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc13ba400-5065-4497-b631-1a496791496f_1460x2000.png 848w, https://substackcdn.com/image/fetch/$s_!oeDQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc13ba400-5065-4497-b631-1a496791496f_1460x2000.png 1272w, https://substackcdn.com/image/fetch/$s_!oeDQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc13ba400-5065-4497-b631-1a496791496f_1460x2000.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oeDQ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc13ba400-5065-4497-b631-1a496791496f_1460x2000.png" width="1200" height="1644.2307692307693" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c13ba400-5065-4497-b631-1a496791496f_1460x2000.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1995,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:939727,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191951862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc13ba400-5065-4497-b631-1a496791496f_1460x2000.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oeDQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc13ba400-5065-4497-b631-1a496791496f_1460x2000.png 424w, https://substackcdn.com/image/fetch/$s_!oeDQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc13ba400-5065-4497-b631-1a496791496f_1460x2000.png 848w, https://substackcdn.com/image/fetch/$s_!oeDQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc13ba400-5065-4497-b631-1a496791496f_1460x2000.png 1272w, https://substackcdn.com/image/fetch/$s_!oeDQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc13ba400-5065-4497-b631-1a496791496f_1460x2000.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h4>Multi-Objective Evolution on the Pareto Frontier</h4><p>STAR uses gradient-free evolutionary algorithms to optimize across multiple objectives simultaneously. 
It maintains a population of candidates, evaluates them, keeps the ones that are not dominated by any other candidate, and breeds them for the next generation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kDPT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kDPT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png 424w, https://substackcdn.com/image/fetch/$s_!kDPT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png 848w, https://substackcdn.com/image/fetch/$s_!kDPT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png 1272w, https://substackcdn.com/image/fetch/$s_!kDPT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kDPT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png" width="1456" height="941" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:941,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:843268,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191951862?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kDPT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png 424w, https://substackcdn.com/image/fetch/$s_!kDPT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png 848w, https://substackcdn.com/image/fetch/$s_!kDPT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png 1272w, https://substackcdn.com/image/fetch/$s_!kDPT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3439dbb-f5dc-4171-aeda-c69caf7bcec2_1492x964.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>This setup makes it much easier to balance competing objectives. On edge deployments, quality, latency, and memory all have to be satisfied at once. A model that wins on perplexity but blows the memory budget is useless. 
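</p><p>The &#8220;not dominated by any other candidate&#8221; rule described above can be sketched in a few lines of Python. This is only an illustration, assuming every objective is a cost where lower is better. The candidate names and numbers are invented, and <code>dominates</code> and <code>pareto_front</code> are hypothetical helpers, not functions from STAR.</p>
<pre>
```python
import operator


def dominates(a, b):
    """True if cost tuple a is no worse than b everywhere and differs somewhere."""
    return a != b and all(map(operator.le, a, b))


def pareto_front(candidates):
    """Keep the candidates whose costs no other candidate dominates."""
    return {
        name: cost
        for name, cost in candidates.items()
        if not any(dominates(other, cost) for other in candidates.values())
    }


# Hypothetical (perplexity, latency_ms, peak_memory_mb) triples; lower is better.
candidates = {
    "arch_a": (12.1, 40.0, 900.0),
    "arch_b": (11.8, 55.0, 950.0),
    "arch_c": (12.5, 60.0, 1000.0),  # worse than arch_a on all three
}

front = pareto_front(candidates)
print(sorted(front))
```
</pre>
<p>In this toy population, arch_c drops out because arch_a is at least as good on every objective, while arch_a and arch_b both survive because neither beats the other everywhere. A real search would breed the survivors and repeat.</p><p>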
STAR navigates this trade-off by tracking the Pareto frontier&#8202;&#8212;&#8202;the exact set of architectures where you cannot improve one objective without degrading another.</p><blockquote><p><em>&#8220;Given the wide range of possible applications of current AI systems, enabling systematic and automatic optimization of model architectures from the multitude of existing computational units is key to meeting the various demands these applications pose, in terms of efficiency (e.g., model size, inference cache size, memory footprint) and quality (e.g., perplexity, downstream benchmarks), and a prerequisite on the path to further, consistent improvements on the quality-efficiency Pareto frontier.&#8221;</em></p></blockquote><p>Most evolved architectures outperformed hand-designed hybrids within just a few rounds of evolution. But despite these amazing results, the original STAR wasn&#8217;t good enough. Getting to LFM2 required an upgrade.</p><blockquote><p><em>&#8220;Our earlier academic prototype (STAR) (Thomas et al., 2024) explored a specific design space of operator/layout choices with an evolutionary search heuristic optimized on proxy signals (i.e., perplexity for quality, cache size for efficiency). In practice, these proxies do not transfer reliably to downstream task scores or device-level latency and memory, limiting their utility as optimization objectives. By contrast, the LFM2 pipeline centers the objective: downstream task scores and hardware-in-the-loop TTFT/latency/memory on release runtimes. In practice, we found this has a much larger impact than the particulars of the search space or choice of search heuristic.&#8221;</em></p></blockquote><h4>The LFM2 Pivot: Why Liquid AI Replaced Proxies with Hardware-in-the-Loop</h4><p>The original STAR paper optimized against proxy signals: perplexity for quality, and estimated cache size for efficiency. 
The LFM2 technical report is blunt about why they abandoned this: proxy metrics lie.</p><p>A Mamba layer and a convolution layer might have similar theoretical FLOPs, but Mamba requires sequential scan operations that stall the CPU pipeline, while convolutions map to highly optimized SIMD instructions. On paper the two look equivalent; on a real device they do not.</p><p>Likewise, estimated cache size ignores activation buffers, framework overhead, and allocator behavior. The only metric that tells you if a model will crash a device is peak RSS&#8202;&#8212;&#8202;the actual physical memory the process consumes at its worst moment. This is something you have to get your hands dirty to measure.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XpO9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F868e7239-eed9-464b-b28e-4844fdc0f109_1000x1000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XpO9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F868e7239-eed9-464b-b28e-4844fdc0f109_1000x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!XpO9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F868e7239-eed9-464b-b28e-4844fdc0f109_1000x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!XpO9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F868e7239-eed9-464b-b28e-4844fdc0f109_1000x1000.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!XpO9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F868e7239-eed9-464b-b28e-4844fdc0f109_1000x1000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XpO9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F868e7239-eed9-464b-b28e-4844fdc0f109_1000x1000.jpeg" width="1000" height="1000" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/868e7239-eed9-464b-b28e-4844fdc0f109_1000x1000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1000,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XpO9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F868e7239-eed9-464b-b28e-4844fdc0f109_1000x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!XpO9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F868e7239-eed9-464b-b28e-4844fdc0f109_1000x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!XpO9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F868e7239-eed9-464b-b28e-4844fdc0f109_1000x1000.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!XpO9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F868e7239-eed9-464b-b28e-4844fdc0f109_1000x1000.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The LFM2 pipeline fixed this by dragging the search onto real devices. Every candidate architecture is compiled into a deployment format (llama.cpp or ExecuTorch) and loaded onto actual hardware: a Samsung Galaxy S24 Ultra and an AMD Ryzen laptop, to cover both ends. They are profiled at batch size 1 across both 4,000 and 32,000 token contexts. 
Candidates that violate device-side budgets for time-to-first-token, decode latency, or peak RSS are instantly discarded.</p><p>This kind of search is extremely powerful, but it comes with a massive downside&#8202;&#8212;&#8202;it&#8217;s very expensive. It requires thousands of GPU-hours for proxy training and a dedicated physical device lab for continuous profiling.</p><p>Google DeepMind, OpenAI, Anthropic, Chinese labs, Apple, and Meta all possess the capital and internal frameworks to replicate hardware-in-the-loop search. Liquid AI&#8217;s success depends on converting its current lead into ecosystem lock-in through runtimes, developer tooling, and accumulated profiling data&#8202;&#8212;&#8202;before a hyperscaler decides to look into doing this. This means they need to move quickly; anything that can block them has to go.</p><p>Understanding this pressure explains one of their most surprising design outcomes.</p><h4>Why STAR Completely Rejected SSMs</h4><p>For all the theoretical elegance of State Space Models, STAR completely rejected them for edge deployments.</p><p>Every major SSM variant was available in the search space. The LFM2 report explicitly lists S4, Mamba, Mamba-2, Liquid-S4, and S5 as candidates. The search simply never kept them. The hardware-in-the-loop search repeatedly selected the minimal hybrid of gated short convolutions and grouped-query attention, ruthlessly excluding SSMs from the surviving architectures.</p><p>If you&#8217;ve read our prior deep dives and followed along with the journey, the reasons might be clear to you. 
So instead of repeating stuff, here is a little visual that you can save to remember the details on the go&#8202;&#8212;&#8202;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!37nB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cac423f-a526-40d7-8a82-f5cc5a306253_1410x1926.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!37nB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cac423f-a526-40d7-8a82-f5cc5a306253_1410x1926.png 424w, https://substackcdn.com/image/fetch/$s_!37nB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cac423f-a526-40d7-8a82-f5cc5a306253_1410x1926.png 848w, https://substackcdn.com/image/fetch/$s_!37nB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cac423f-a526-40d7-8a82-f5cc5a306253_1410x1926.png 1272w, https://substackcdn.com/image/fetch/$s_!37nB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cac423f-a526-40d7-8a82-f5cc5a306253_1410x1926.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!37nB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cac423f-a526-40d7-8a82-f5cc5a306253_1410x1926.png" width="1410" height="1926" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3cac423f-a526-40d7-8a82-f5cc5a306253_1410x1926.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1926,&quot;width&quot;:1410,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!37nB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cac423f-a526-40d7-8a82-f5cc5a306253_1410x1926.png 424w, https://substackcdn.com/image/fetch/$s_!37nB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cac423f-a526-40d7-8a82-f5cc5a306253_1410x1926.png 848w, https://substackcdn.com/image/fetch/$s_!37nB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cac423f-a526-40d7-8a82-f5cc5a306253_1410x1926.png 1272w, https://substackcdn.com/image/fetch/$s_!37nB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cac423f-a526-40d7-8a82-f5cc5a306253_1410x1926.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This lays all the groundwork for us to finally understand Liquid&#8217;s newest model.</p><h3>8) How Does Liquid AI Train Small Models to Compete With Much Larger Ones?</h3><p>LFM2.5-1.2B matches or beats Qwen3-1.7B&#8202;&#8212;&#8202;a model with 42% more parameters&#8202;&#8212;&#8202;on multiple benchmarks. Let&#8217;s understand the four specific training innovations that make this happen.</p><h4>How the Student Model Learns From a Bigger Teacher Without Storing Everything</h4><p>Knowledge distillation trains a small model (the student) to mimic a large model (the teacher). 
Instead of just learning to predict the correct next token, the student learns to match the teacher&#8217;s full probability distribution&#8202;&#8212;&#8202;not just &#8220;the answer is X&#8221; but &#8220;X is most likely, Y is a decent second choice, Z is plausible, everything else is garbage.&#8221; That ranking information makes the student significantly smarter than learning from raw data alone.</p><p>LFM2 models are distilled from LFM1-7B. The problem is storage. The teacher&#8217;s full distribution over a 65,536-token vocabulary for every position in a multi-trillion-token corpus would require petabytes. Liquid AI stores only the top 32 logits per token&#8202;&#8212;&#8202;roughly a 2,000x compression (65,536 / 32 = 2,048).</p><p>But truncation breaks the standard training math. Here&#8217;s why.</p><p>The standard way to measure how well the student matches the teacher is called KL divergence&#8202;&#8212;&#8202;it&#8217;s essentially a score that says &#8220;how different are these two probability distributions?&#8221; Lower score means the student is mimicking the teacher well. Higher score means it&#8217;s off.</p><p>To help the student learn, you typically apply temperature scaling&#8202;&#8212;&#8202;which smooths out the teacher&#8217;s predictions so they&#8217;re less extreme. Instead of &#8220;token X has 90% probability and everything else is near zero,&#8221; temperature scaling softens it to something like &#8220;token X has 40%, Y has 25%, Z has 15%&#8230;&#8221; This makes the ranking information easier for a small model to absorb.</p><p>The problem: when you have stored only 32 of the 65,536 tokens and then apply temperature scaling, the math tries to spread probability to all 65,536 tokens&#8202;&#8212;&#8202;including the 65,504 you have no teacher data for. The loss function is trying to match a distribution that doesn&#8217;t exist. Training becomes unstable. 
The model oscillates instead of learning.</p><p>Liquid AI&#8217;s fix splits the matching problem into two separate parts:</p><ul><li><p><strong>Term 1: Membership loss.</strong> How much total probability mass should the student assign to the teacher&#8217;s top-32 tokens versus everything else? A binary comparison. No temperature scaling&#8202;&#8212;&#8202;because temperature applied to a truncated distribution would smooth probability toward tokens you have zero information about.</p></li><li><p><strong>Term 2: Conditional ranking loss.</strong> Within those top-32 tokens, how well does the student match the teacher&#8217;s relative ordering? This term gets temperature scaling, because both distributions are defined over the same 32 tokens. No mismatch.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xZm9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64a690f8-5b65-46f3-877c-3cf29f030f9d_1410x2046.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xZm9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64a690f8-5b65-46f3-877c-3cf29f030f9d_1410x2046.png 424w, https://substackcdn.com/image/fetch/$s_!xZm9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64a690f8-5b65-46f3-877c-3cf29f030f9d_1410x2046.png 848w, https://substackcdn.com/image/fetch/$s_!xZm9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64a690f8-5b65-46f3-877c-3cf29f030f9d_1410x2046.png 1272w, 
https://substackcdn.com/image/fetch/$s_!xZm9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64a690f8-5b65-46f3-877c-3cf29f030f9d_1410x2046.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xZm9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64a690f8-5b65-46f3-877c-3cf29f030f9d_1410x2046.png" width="1410" height="2046" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/64a690f8-5b65-46f3-877c-3cf29f030f9d_1410x2046.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2046,&quot;width&quot;:1410,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xZm9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64a690f8-5b65-46f3-877c-3cf29f030f9d_1410x2046.png 424w, https://substackcdn.com/image/fetch/$s_!xZm9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64a690f8-5b65-46f3-877c-3cf29f030f9d_1410x2046.png 848w, https://substackcdn.com/image/fetch/$s_!xZm9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64a690f8-5b65-46f3-877c-3cf29f030f9d_1410x2046.png 1272w, 
https://substackcdn.com/image/fetch/$s_!xZm9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64a690f8-5b65-46f3-877c-3cf29f030f9d_1410x2046.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>At temperature 1, this decomposition is provably a lower bound on the full KL divergence&#8202;&#8212;&#8202;the full KL decomposes into three terms (binary, conditional top-K, conditional tail), and dropping the non-negative tail term can only decrease the total. 
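</p><p>The three-term split at temperature 1 can be checked numerically. The Python sketch below uses a made-up 6-token vocabulary with K = 3, purely as an assumption for illustration. The weights on the two conditional terms come from the chain rule for KL divergence; how Liquid AI weights or tempers each term in its actual loss is not shown here.</p>
<pre>
```python
import math


def kl(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi != 0)


def normalize(xs):
    """Rescale non-negative numbers so they sum to 1."""
    total = sum(xs)
    return [x / total for x in xs]


# Toy 6-token vocabulary; the teacher's top K = 3 tokens come first.
teacher = [0.50, 0.25, 0.15, 0.05, 0.03, 0.02]
student = [0.40, 0.20, 0.20, 0.06, 0.06, 0.08]
K = 3

p_top, q_top = sum(teacher[:K]), sum(student[:K])

# Term 1: binary "top-K versus everything else" KL.
binary = kl([p_top, 1 - p_top], [q_top, 1 - q_top])
# Term 2: KL between the renormalized top-K distributions, weighted by p_top.
cond_top = p_top * kl(normalize(teacher[:K]), normalize(student[:K]))
# Dropped term: the same comparison over the tail, which is never stored.
cond_tail = (1 - p_top) * kl(normalize(teacher[K:]), normalize(student[K:]))

full = kl(teacher, student)
print(f"full KL        = {full:.6f}")
print(f"binary + top-K = {binary + cond_top:.6f}")
print(f"tail term      = {cond_tail:.6f}")
```
</pre>
<p>The first two terms sum to slightly less than the full KL, and the gap is exactly the tail term, which is why dropping it gives a lower bound.</p><p>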
At higher temperatures, the loss becomes a tempered surrogate with a smoother optimization landscape.</p><p>This leads us to the next phase of training.</p><h4>Why Training Examples Are Ordered From Easy to Hard</h4><p>A 1.2B model has limited capacity. Hitting it with the hardest training examples before it has learned basic patterns produces noisy gradients and wasted compute. LFM2&#8217;s supervised fine-tuning stage orders examples from easy to hard.</p><p>Difficulty is scored by an ensemble of 12 models ranging from 350M to 235B parameters. For each question, the score is the fraction of ensemble models that answered correctly. High fraction = easy. Low fraction = hard. Training proceeds from easy to hard using a predictive model to rank questions by estimated difficulty.</p><p>Difficulty is measured externally by the ensemble, not by the student&#8217;s own performance. This avoids the circularity that kills self-paced learning: a poorly trained student misjudges what&#8217;s hard, avoids exactly the examples it needs, and stays poorly trained.</p><h4>Three Stages of Post-Training: Fine-Tuning, Preference Alignment, and Model Merging</h4><p>After pretraining on 10 to 12 trillion tokens, the model goes through three post-training stages. (In case you&#8217;re wondering, this is a massive pretraining budget for a model this size, and it is likely a big part of why the model has such strong knowledge and capabilities. LFM2.5 pushed this to 28 trillion tokens, which fits that reading.)</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OzsC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23b3ae34-d7be-4ffc-a1f3-1fcfb13967a7_1962x480.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!OzsC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23b3ae34-d7be-4ffc-a1f3-1fcfb13967a7_1962x480.png 424w, https://substackcdn.com/image/fetch/$s_!OzsC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23b3ae34-d7be-4ffc-a1f3-1fcfb13967a7_1962x480.png 848w, https://substackcdn.com/image/fetch/$s_!OzsC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23b3ae34-d7be-4ffc-a1f3-1fcfb13967a7_1962x480.png 1272w, https://substackcdn.com/image/fetch/$s_!OzsC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23b3ae34-d7be-4ffc-a1f3-1fcfb13967a7_1962x480.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OzsC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23b3ae34-d7be-4ffc-a1f3-1fcfb13967a7_1962x480.png" width="1456" height="356" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23b3ae34-d7be-4ffc-a1f3-1fcfb13967a7_1962x480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:356,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!OzsC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23b3ae34-d7be-4ffc-a1f3-1fcfb13967a7_1962x480.png 424w, https://substackcdn.com/image/fetch/$s_!OzsC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23b3ae34-d7be-4ffc-a1f3-1fcfb13967a7_1962x480.png 848w, https://substackcdn.com/image/fetch/$s_!OzsC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23b3ae34-d7be-4ffc-a1f3-1fcfb13967a7_1962x480.png 1272w, https://substackcdn.com/image/fetch/$s_!OzsC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23b3ae34-d7be-4ffc-a1f3-1fcfb13967a7_1962x480.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>Stage 1: Supervised Fine-Tuning (SFT).</strong> 5 to 9 million training examples across 67 to 79 data sources. Roughly 27% general-purpose, 17% instruction following, 13% retrieval-augmented generation, 10% tool use, with the remainder split across code, math, multilingual, and domain-specific tasks. 80% English, 20% multilingual.</p><p><strong>Stage 2: Preference Alignment via DPO.</strong> The model generates 5 candidate responses per prompt. An LLM judge scores them. The best and worst responses form a preference pair. Direct Preference Optimization trains the model to increase the probability of preferred responses and decrease that of dispreferred ones.</p><p>Two details matter for small models.</p><ol><li><p>The KL regularization coefficient (beta) is 5.0, roughly 10 to 50x higher than typical DPO values (0.1 to 0.5). A 1.2B model can&#8217;t afford to deviate far from its SFT-trained behavior without catastrophically degrading general capabilities. 
The high beta constrains the optimization to targeted improvements.</p></li><li><p>Length normalization is equally important: without it, DPO biases toward shorter responses, because a longer completion sums more negative per-token log-probabilities and so looks worse regardless of actual quality.</p></li></ol><p><strong>Stage 3: Model Merging.</strong> Multiple candidate checkpoints are generated by varying hyperparameters and data mixtures. Each excels in different areas. The best checkpoints are combined into a single model using weight-space merging.</p><p>Five merging methods are tested: linear averaging (Model Soup), Task Arithmetic, TIES-Merging, DARE, and DELLA. Each handles the combination differently: some average all weights, some merge only the weights that changed most, some randomly drop a share of the changes and rescale the rest. The best merged model is selected on a comprehensive benchmark suite. Merging costs nothing at inference: the output is one model with the same parameter count as any individual checkpoint.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6Qw6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4538ee6-be19-41a0-8e59-32c5bda53417_1410x1808.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6Qw6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4538ee6-be19-41a0-8e59-32c5bda53417_1410x1808.png 424w, https://substackcdn.com/image/fetch/$s_!6Qw6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4538ee6-be19-41a0-8e59-32c5bda53417_1410x1808.png 848w, 
https://substackcdn.com/image/fetch/$s_!6Qw6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4538ee6-be19-41a0-8e59-32c5bda53417_1410x1808.png 1272w, https://substackcdn.com/image/fetch/$s_!6Qw6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4538ee6-be19-41a0-8e59-32c5bda53417_1410x1808.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6Qw6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4538ee6-be19-41a0-8e59-32c5bda53417_1410x1808.png" width="1410" height="1808" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f4538ee6-be19-41a0-8e59-32c5bda53417_1410x1808.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1808,&quot;width&quot;:1410,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6Qw6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4538ee6-be19-41a0-8e59-32c5bda53417_1410x1808.png 424w, https://substackcdn.com/image/fetch/$s_!6Qw6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4538ee6-be19-41a0-8e59-32c5bda53417_1410x1808.png 848w, 
https://substackcdn.com/image/fetch/$s_!6Qw6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4538ee6-be19-41a0-8e59-32c5bda53417_1410x1808.png 1272w, https://substackcdn.com/image/fetch/$s_!6Qw6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4538ee6-be19-41a0-8e59-32c5bda53417_1410x1808.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4>Why Liquid AI Quantizes During Training, Not After</h4><p>All LFM2 and LFM2.5 models are trained with INT4 quantization in the loop 
from the start. The model learns to produce good outputs despite rounding errors, rather than being trained at full precision and then compressed (post-training quantization). This matters more for small models: a 70B model has enough redundancy that quantization noise barely registers. A 1B model has no slack. Every parameter is working harder, and rounding any of them introduces proportionally more damage.</p><h3>Conclusion: Where Does Edge AI Go Next</h3><p>We didn&#8217;t spend eight sections tearing apart cache math and evolutionary algorithms just to marvel at a clever 1.2B model. Liquid AI is merely the diagnostic tool. The real story is the entire trajectory of edge AI and how all of its implications unfold. However, this is a completely different computing paradigm, with its own problems that we must all be ready to solve.</p><h4>The Signal-to-Noise Crisis: Learning to Ignore</h4><p>Right now, if you type a prompt into ChatGPT, 100% of those tokens are intentional. You wrote them. The model assumes every word matters. Even with typos and misspellings, the signal-to-noise ratio in your input is very high.</p><p>Ambient computing is the exact opposite. If an AI is listening to your microphone for 8 hours a day, 99% of what it hears is useless garbage. It hears you typing, a siren outside, small talk about the weather, and someone clearing their throat. Maybe 1% of those tokens are a concrete task or an important decision.</p><p>If you feed 8 hours of ambient noise into a standard transformer, it will dutifully calculate exact global attention scores for the siren and the throat-clearing, filling up its KV cache until the phone crashes. Or it will find ways to bill you for the Raja Raja Raja song you played 10x in a row.</p><p>All this means the next major architectural leap in edge AI won&#8217;t just be about cheaper math. It will be about <strong>active ignoring</strong>. 
Just like the human brain filters out the feeling of your shirt against your back so you can focus on reading this sentence, edge models will need adaptive compute mechanisms that instantly classify and discard useless tokens before they ever reach the expensive layers. The architecture will have to shift from &#8220;how efficiently can I process this context?&#8221; to &#8220;how aggressively can I refuse to process this context?&#8221;</p><h4>The Hardware Fragmentation Problem: The Case for Open Standards</h4><p>The data center is a monoculture. If you write your model in CUDA, it will run beautifully on an NVIDIA GPU, which is what every hyperscaler uses.</p><p>The edge is a chaotic wasteland.</p><p>There is no &#8220;standard&#8221; edge hardware. A Snapdragon NPU handles memory differently than an Apple Neural Engine, which handles operations differently than an AMD Ryzen CPU. As Liquid AI&#8217;s STAR system proved, the optimal architecture for one phone might run like garbage on a laptop. There are a million different Pareto frontiers.</p><p>If you are a startup or a smaller AI team, you cannot afford to build a dedicated hardware-in-the-loop search system to evolve a custom architecture for every single Android device on the market.</p><p>This fragmentation forces a strategic fork in the road. Either the massive players (Apple, Google) completely capture the edge because they own the vertical hardware stack and can optimize perfectly for it, OR the open-source community is forced to consolidate around open standards. We will likely see a push for unified inference runtimes and open hardware-profiling datasets. Smaller players will have to pool their testing resources, effectively building a communal &#8220;STAR&#8221; system, just to survive the hardware chaos.</p><h4>Unlocking Hostile Markets</h4><p>Finally, the obsession with edge AI isn&#8217;t just about saving battery life on a smartphone. 
It is about unlocking entirely new markets that are fundamentally hostile to GPUs.</p><p>There are massive sectors of the global economy where you cannot just &#8220;ping the cloud.&#8221; For the last three years, these industries have been locked out of the generative AI boom because the models required server racks they weren&#8217;t allowed or able to use.</p><p>Architectures like LFM2 change the math. When you can run a highly competent, reasoning-capable model locally in 700 megabytes of RAM, you bypass the cloud entirely. You bring the intelligence to the data, instead of trying to drag the data to the intelligence.</p><p>The transformer won the last decade because it was perfectly adapted to the data center&#8202;&#8212;&#8202;an environment where memory is treated as infinite. The architecture that wins the next decade will be the one that understands memory is a hostage negotiation. And the companies that master that negotiation are about to put AI into everything.</p><p>Thank you for being here, and I hope you have a wonderful day,</p><p>Dev &lt;3</p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines.</a></p><p>That is it for this piece. I appreciate your time. 
As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. </strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!scjw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe5c149-c904-442c-a2fa-d1066fcff4e9_745x327.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!scjw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe5c149-c904-442c-a2fa-d1066fcff4e9_745x327.png 424w, https://substackcdn.com/image/fetch/$s_!scjw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe5c149-c904-442c-a2fa-d1066fcff4e9_745x327.png 848w, https://substackcdn.com/image/fetch/$s_!scjw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe5c149-c904-442c-a2fa-d1066fcff4e9_745x327.png 1272w, https://substackcdn.com/image/fetch/$s_!scjw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe5c149-c904-442c-a2fa-d1066fcff4e9_745x327.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!scjw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe5c149-c904-442c-a2fa-d1066fcff4e9_745x327.png" 
width="745" height="327" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cfe5c149-c904-442c-a2fa-d1066fcff4e9_745x327.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:327,&quot;width&quot;:745,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!scjw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe5c149-c904-442c-a2fa-d1066fcff4e9_745x327.png 424w, https://substackcdn.com/image/fetch/$s_!scjw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe5c149-c904-442c-a2fa-d1066fcff4e9_745x327.png 848w, https://substackcdn.com/image/fetch/$s_!scjw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe5c149-c904-442c-a2fa-d1066fcff4e9_745x327.png 1272w, https://substackcdn.com/image/fetch/$s_!scjw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe5c149-c904-442c-a2fa-d1066fcff4e9_745x327.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Reach out to me</h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>Check out my other articles on Medium. 
: </p><p>https://machine-learning-made-simple.medium.com/</p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[How to Prompt Reasoning Models Effectively]]></title><description><![CDATA[Understanding why Reasoning Models are Different from Normal LLMs and How to Prompt them.]]></description><link>https://www.artificialintelligencemadesimple.com/p/how-to-prompt-reasoning-models-effectively</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/how-to-prompt-reasoning-models-effectively</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Wed, 18 Mar 2026 01:16:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qTbi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c67a42-40e0-46f0-8a3d-716f85b4e55e_1400x906.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Most people are still prompting reasoning models like it is 2023. That used to help. Now it often wastes money, adds latency, and sometimes makes the answer worse.</p><p>A <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5285532">study from Wharton&#8217;s Generative AI Lab</a> tested 198 PhD-level questions across biology, physics, and chemistry. Chain-of-thought instructions &#8212; the single most popular prompting technique since 2022 &#8212; bought 2.9 to 3.1 percent accuracy on reasoning models while adding 20 to 80 percent latency. 
On Gemini Flash 2.5, chain-of-thought made results worse. Negative 3.3 percent. You&#8217;d have gotten better answers by not trying to help. <a href="https://arxiv.org/abs/2410.21333">&#8220;Mind Your Step (by Step)&#8221;</a> (COLM 2025) went further: on pattern recognition tasks, turning on reasoning mode dropped accuracy by up to 36.3 percent versus a standard model. The technique designed to make models smarter is making the smart models dumber.</p><p>The providers know. OpenAI, Anthropic, Google, and DeepSeek all explicitly warn against chain-of-thought on reasoning models. But the advice economy runs on lag. Most courses, research, and common tips still target the older generation of base models, which makes them outdated for the current paradigm: base LLMs go through very different post-training and alignment than reasoning LLMs.</p><p>In this deep dive, we will combine our conversations with the builders of various AI models, dig into research papers, and compile insights from practitioners to explain the mechanics of reasoning models, why the standard prompting techniques that you&#8217;re taught online actually hurt them, and how you can prompt them better. 
This article will also give you eight rules grounded in the research that will work across model families and architectures so you can apply these insights to the model of your choice.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;50ee8204-88f9-4e33-ac70-fd6db7686058&quot;,&quot;duration&quot;:null}"></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SB5s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SB5s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png 424w, https://substackcdn.com/image/fetch/$s_!SB5s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png 848w, https://substackcdn.com/image/fetch/$s_!SB5s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png 1272w, https://substackcdn.com/image/fetch/$s_!SB5s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SB5s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png" width="1390" height="984" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:984,&quot;width&quot;:1390,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:169511,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/191300960?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SB5s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png 424w, https://substackcdn.com/image/fetch/$s_!SB5s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png 848w, https://substackcdn.com/image/fetch/$s_!SB5s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png 1272w, https://substackcdn.com/image/fetch/$s_!SB5s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72cbbf52-8c21-4d8f-b7e1-8ea14954a290_1390x984.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>^^A preview of what you&#8217;re getting today.</p><p>This article is written for the technical layman, with no deep AI or Software Engineering skills required. All you&#8217;ll need is an attention span, a desire to learn, and a willingness to experiment and internalize knowledge. 
If you match that, this article will help you skyrocket your productivity and do higher-quality work in less time.</p><p>Let&#8217;s get into it-</p><p>For access to this article and all future articles, get a premium subscription below.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><p>Each of these articles takes a long time to research and write, and your premium subscription allows me to deliver the highest quality information to you. If you agree that high-quality work deserves compensation, please consider a premium subscription. <strong>We have a flexible subscription plan that lets you <a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">pay what you can here</a>.</strong></p><p>PS- <strong>Many companies have a learning budget that you can expense this newsletter to.</strong><em> </em><strong><a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">You can use the following for an email template</a></strong> <strong>to request reimbursement for your subscription.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aiyy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaac9682-639f-48af-9aed-9353b61c99aa_957x154.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!aiyy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaac9682-639f-48af-9aed-9353b61c99aa_957x154.png 424w, https://substackcdn.com/image/fetch/$s_!aiyy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaac9682-639f-48af-9aed-9353b61c99aa_957x154.png 848w, https://substackcdn.com/image/fetch/$s_!aiyy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaac9682-639f-48af-9aed-9353b61c99aa_957x154.png 1272w, https://substackcdn.com/image/fetch/$s_!aiyy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaac9682-639f-48af-9aed-9353b61c99aa_957x154.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aiyy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaac9682-639f-48af-9aed-9353b61c99aa_957x154.png" width="957" height="154" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eaac9682-639f-48af-9aed-9353b61c99aa_957x154.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:154,&quot;width&quot;:957,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" 
srcset="https://substackcdn.com/image/fetch/$s_!aiyy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaac9682-639f-48af-9aed-9353b61c99aa_957x154.png 424w, https://substackcdn.com/image/fetch/$s_!aiyy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaac9682-639f-48af-9aed-9353b61c99aa_957x154.png 848w, https://substackcdn.com/image/fetch/$s_!aiyy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaac9682-639f-48af-9aed-9353b61c99aa_957x154.png 1272w, https://substackcdn.com/image/fetch/$s_!aiyy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaac9682-639f-48af-9aed-9353b61c99aa_957x154.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div>
      <p>
          <a href="https://www.artificialintelligencemadesimple.com/p/how-to-prompt-reasoning-models-effectively">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[How to Diagnose Failures in Large AI Training Clusters]]></title><description><![CDATA[A practical look at how debugging workflows, metrics, and automated runbooks are used to investigate slowdowns and failures in large-scale model training.]]></description><link>https://www.artificialintelligencemadesimple.com/p/how-to-diagnose-failures-in-large</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/how-to-diagnose-failures-in-large</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Fri, 13 Mar 2026 06:17:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UmjS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. <strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>There&#8217;s an inflection point in training systems at scale where the hard problem changes shape. It stops being about whether you can get the job to run and becomes something more corrosive &#8212; figuring out why performance degraded, or why a failure signal appeared, or why throughput quietly cratered and nobody can explain it. That&#8217;s where the real cost lives. Not compute. 
Engineering hours burned reconstructing what went wrong from scattered dashboards and whatever the senior SRE remembers from last time.</p><p><a href="https://rocm.blogs.amd.com/software-tools-optimization/maxtext-slurm-agentic-diagnosis/README.html">The following research from AMD was shared with me, and I thought it was worth your time (it goes without saying, but this is not sponsored; I&#8217;m sharing it only because I think you need to read it).</a> I&#8217;m sharing it here with permission.</p><p>It walks through the failure modes that actually eat people alive once clusters get big enough. These include:</p><ul><li><p>Network degradation that looks like a training bug.</p></li><li><p>False failure signals that trigger unnecessary restarts.</p></li><li><p>Checkpoint-related slowdowns.</p></li><li><p>Cases where everything looks broken but the root cause is in the training dynamics, not the infrastructure.</p></li></ul><p>Debugging large-scale training is moving from tribal knowledge and human pattern-matching toward diagnostic workflows that a system can actually execute. The teams that figure out how to make that transition, turning their debugging knowledge into repeatable infrastructure instead of leaving it trapped in someone&#8217;s head, are the ones that will compound their advantage over everyone else. This article shows what that transition looks like in practice.</p><div><hr></div><p>In <a href="https://rocm.blogs.amd.com/software-tools-optimization/maxtext-slurm/README.html">MaxText-Slurm: Production-Grade LLM Training with Built-In Observability</a>, we introduced <a href="https://github.com/AMD-AGI/maxtext-slurm">MaxText-Slurm</a>, an open-source launch system and observability stack for running MaxText LLM training on AMD Instinct GPU clusters. 
We showed how a unified Prometheus time-series database (TSDB) collects GPU, host, network, and training metrics into a single queryable store, persisted to disk so that no data is lost even if the job crashes.</p><p>A unified TSDB is only as useful as the methodology applied to it. In this post, we show the <strong>agentic diagnostic skills</strong> that ship with MaxText-Slurm &#8212; structured runbooks that <a href="https://cursor.com/">Cursor</a> or <a href="https://docs.anthropic.com/en/docs/claude-code">Claude Code</a> execute autonomously from symptom to root cause. This is not a chatbot answering questions about logs. The AI agent has tool access (shell, file system, HTTP queries to Prometheus) and follows each skill systematically: reading logs, querying metrics, interpreting results, and chaining steps until it reaches an actionable conclusion. We walk through five case studies &#8212; from one-prompt performance profiling to a throughput decline (measured in tokens per GPU per second, or TGS) whose root cause turned out to be model behavior &#8212; where each short prompt led to an actionable result in minutes.</p><h2><strong>Setting Up the AI Agent</strong></h2><p>The AI agent needs access to two codebases: the <strong>maxtext-slurm</strong> repo (for skills, job outputs, and diagnostic tools) and the <strong>MaxText</strong> source (for code-level tracing during deep diagnosis). MaxText lives at <code>/workspace/maxtext</code> inside the training Docker image &#8212; a frozen snapshot baked into the container, not installed on the host. The most reliable setup is to run the agent inside the container on a cluster login node.</p><h3>Step 1: Clone the repo and start the container</h3><p>SSH into a cluster login node and clone the repo onto a shared filesystem so that multi-node job outputs (logs, TSDB, profiles) are accessible from any node:</p><pre><code>ssh cluster-login-node
cd /shared               # or any shared filesystem path
git clone https://github.com/AMD-AGI/maxtext-slurm.git
cd maxtext-slurm
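</code></pre><p>Because job outputs need to be visible from every node, it can be worth confirming that the clone really sits on a shared filesystem. The check below is an addition to the original walkthrough, a minimal sketch that assumes GNU coreutils on the login node. Both commands report the filesystem type behind the current directory; a network type such as <code>nfs</code> or <code>lustre</code> suggests a shared mount, though the exact names depend on your site.</p><pre><code># Illustrative check, not part of the original post (assumes GNU coreutils).
df -hT .             # the Type column shows the filesystem behind this directory
stat -f -c %T .      # prints only the filesystem type name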
</code></pre><p>Start an interactive container on the login node. <code>run_local.sh</code> bind-mounts the repo directory into the container, so job outputs on the shared filesystem are directly accessible. Diagnosis workflows (log triage, TSDB queries, source code tracing) do not require GPUs, so the login node is sufficient:</p><pre><code>run_local.sh             # drop into an interactive container shell
</code></pre><p><strong>Tip:</strong> To leave the container without stopping it, use <code>Ctrl+P, Ctrl+Q</code> to detach. Do <strong>not</strong> type <code>exit</code>: that stops and removes the container, destroying any in-container state. To get back in later, open a new shell in the running container with <code>docker exec -it &lt;container_name&gt; bash</code>. Running the agent on a compute node is possible but requires additional SSH setup not covered here.</p><p>Inside the container, the filesystem looks like this:</p><pre><code>/maxtext-slurm/          # maxtext-slurm repo (bind-mounted from the host)
    skills/              # AI diagnostic skills
    outputs/             # job outputs (logs, TSDB, ray_logs, ...)
    utils/               # prometheus.sh, analyze_job.py, metrics plugins
/workspace/maxtext/      # MaxText source (baked into the image)
/opt/venv/lib/python3.12/site-packages/
    orbax/               # Orbax checkpoint library
    jax/                 # JAX source
</code></pre><h3>Step 2: Connect the AI agent</h3><p>You need two connections to the login node &#8212; one inside the container for the AI agent, and one outside for Slurm operations (<code>submit.sh</code>, <code>squeue</code>, <code>scancel</code>, SSH tunnels to live dashboards, etc.). This separation is deliberate: the AI agent inside the container can read logs, query Prometheus, and trace source code, but it cannot submit, cancel, or modify Slurm jobs. Keeping Slurm access outside the container prevents an agent mistake from affecting running jobs on the cluster.</p><p><strong>Cursor</strong> &#8212; Use <a href="https://code.visualstudio.com/docs/remote/ssh">Remote SSH</a> to connect to the cluster login node, then attach to the running container via <a href="https://code.visualstudio.com/docs/devcontainers/attach-container">Dev Containers: Attach to Running Container</a>. Add both <code>/maxtext-slurm</code> and <code>/workspace/maxtext</code> as workspace folders so the agent can cross-reference skills, job outputs, and framework source code in a single session. Use a separate terminal (SSH session to the login node, outside the container) for Slurm operations. 
Figure 1 shows this setup.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xQzv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa54b7336-ec58-436e-85f8-0466c1342489_3411x1367.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xQzv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa54b7336-ec58-436e-85f8-0466c1342489_3411x1367.png 424w, https://substackcdn.com/image/fetch/$s_!xQzv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa54b7336-ec58-436e-85f8-0466c1342489_3411x1367.png 848w, https://substackcdn.com/image/fetch/$s_!xQzv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa54b7336-ec58-436e-85f8-0466c1342489_3411x1367.png 1272w, https://substackcdn.com/image/fetch/$s_!xQzv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa54b7336-ec58-436e-85f8-0466c1342489_3411x1367.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xQzv!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa54b7336-ec58-436e-85f8-0466c1342489_3411x1367.png" width="1200" height="481.31868131868134" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a54b7336-ec58-436e-85f8-0466c1342489_3411x1367.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:584,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Cursor IDE connected to a remote container via SSH, showing both maxtext-slurm and maxtext workspace folders, with a separate terminal for Slurm ops&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="Cursor IDE connected to a remote container via SSH, showing both maxtext-slurm and maxtext workspace folders, with a separate terminal for Slurm ops" title="Cursor IDE connected to a remote container via SSH, showing both maxtext-slurm and maxtext workspace folders, with a separate terminal for Slurm ops" srcset="https://substackcdn.com/image/fetch/$s_!xQzv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa54b7336-ec58-436e-85f8-0466c1342489_3411x1367.png 424w, https://substackcdn.com/image/fetch/$s_!xQzv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa54b7336-ec58-436e-85f8-0466c1342489_3411x1367.png 848w, https://substackcdn.com/image/fetch/$s_!xQzv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa54b7336-ec58-436e-85f8-0466c1342489_3411x1367.png 1272w, 
https://substackcdn.com/image/fetch/$s_!xQzv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa54b7336-ec58-436e-85f8-0466c1342489_3411x1367.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Figure 1. Cursor attached to the remote container via SSH, with both workspace folders open. A separate terminal outside the container handles Slurm operations.</em></p><p><strong>Claude Code</strong> &#8212; Open two SSH sessions to the login node. 
In one, <code>docker exec</code> into the container and launch <code>claude</code>:</p><pre><code># Session 1 (AI agent &#8212; inside the container):
ssh cluster-login-node
docker exec -it &lt;container_name&gt; bash
cd /maxtext-slurm
claude

# Session 2 (Slurm ops &#8212; outside the container):
ssh cluster-login-node
cd maxtext-slurm
squeue -u $USER              # check running jobs
submit.sh 70b -N 8           # submit new jobs
</code></pre><p>Launching <code>claude</code> from <code>/maxtext-slurm</code> ensures that <code>CLAUDE.md</code> &#8212; the routing file that directs the agent to the correct skill &#8212; is picked up automatically.</p><h3>Step 3: Verify the setup</h3><p>In Claude Code or Cursor chat, point the agent at any job (running, failed, hanging, or completed) by typing:</p><blockquote><p>triage job 7877</p></blockquote><p>Replace 7877 with your job ID.</p><p>The agent reads the log tail, classifies the job status, extracts the config and step progress, and recommends next steps. If it reaches a diagnosis recommendation, the setup is working.</p><h3>How skills are discovered</h3><p>The <code>CLAUDE.md</code> file in the repo root contains routing rules that map task descriptions to skills:</p><pre><code>For job triage tasks &#8594; skills/job-log-triage/SKILL.md
For performance analysis tasks &#8594; skills/performance-analysis/SKILL.md
For TSDB diagnosis tasks &#8594; skills/tsdb-diagnosis/SKILL.md
</code></pre><p>Both Cursor and Claude Code read this file automatically. There is no manual configuration &#8212; describe the task in natural language, and the agent selects the correct skill.</p><h2><strong>The AI Skill Framework</strong></h2><p>The diagnostic capabilities are powered by an extensible skill framework in the <code>skills/</code> directory. Each skill is a structured instruction file &#8212; not documentation for humans, but a runbook the AI agent reads and executes autonomously. Skills encode the methodology of senior systems engineers: the queries to run, how to interpret results, decision trees for choosing the next step, and common pitfalls to avoid. When the agent reads a skill, it follows the diagnostic procedure end-to-end &#8212; launching Prometheus, issuing queries, parsing responses, and deciding which playbook or follow-up skill to run based on what it finds.</p><p>The framework currently ships three skills, connected in the diagnostic pipeline shown in Figure 2:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UmjS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UmjS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png 424w, https://substackcdn.com/image/fetch/$s_!UmjS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png 848w, 
https://substackcdn.com/image/fetch/$s_!UmjS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png 1272w, https://substackcdn.com/image/fetch/$s_!UmjS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UmjS!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png" width="1200" height="551.3736263736264" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:669,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Diagnostic pipeline showing job-log-triage feeding into tsdb-diagnosis and performance-analysis, with data sources labeled&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="Diagnostic pipeline showing job-log-triage feeding into tsdb-diagnosis and performance-analysis, with data sources labeled" title="Diagnostic pipeline showing job-log-triage feeding into tsdb-diagnosis and performance-analysis, with data sources labeled" 
srcset="https://substackcdn.com/image/fetch/$s_!UmjS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png 424w, https://substackcdn.com/image/fetch/$s_!UmjS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png 848w, https://substackcdn.com/image/fetch/$s_!UmjS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png 1272w, https://substackcdn.com/image/fetch/$s_!UmjS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33cc4b18-9feb-420c-8f07-0dbf9fba2e80_2663x1223.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p><em>Figure 2. The three-skill diagnostic pipeline. Triage classifies the failure and hands off to TSDB diagnosis (system-level root cause) or performance analysis (compute-level profiling). All three skills share the same job directory as their data source.</em></p><p><code>job-log-triage</code> identifies <em>what</em> happened. It reads the job log, classifies the failure mode (hang, OOM, heartbeat timeout, RCCL/NCCL error, etc.), extracts the job config, projects training progress, and recommends next steps.</p><p><code>performance-analysis</code> identifies <em>where</em> in the compute pipeline the issue lies. It runs TraceLens (XLA trace analysis) and IRLens (HLO IR analysis) to profile GPU kernel execution, identify bottlenecks, and measure model FLOPS utilization. For multi-job comparisons, the TSDB skill runs first to rule out system-level causes before profiling.</p><p><code>tsdb-diagnosis</code> identifies <em>why</em> it happened. It connects to the job&#8217;s Prometheus TSDB (live or persisted), discovers available metrics, and runs the appropriate diagnostic playbook &#8212; systematically querying GPU, network, I/O, CPU, and training metrics to trace from symptom to root cause. This is the skill that makes the unified TSDB an active diagnostic tool rather than a passive data store.</p><p>The handoff between skills is automatic &#8212; when triage concludes with &#8220;query the TSDB,&#8221; the agent reads the TSDB skill and continues without further prompting. 
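</p><p>As a rough illustration of what querying the TSDB by hand looks like, the sketch below wraps two endpoints of the standard Prometheus HTTP API in shell functions: one lists every stored metric name, the other runs an instant PromQL query. It is not the skill&#8217;s actual query set. The base URL and port are assumptions (9090 is the Prometheus default), the function names are hypothetical, and <code>up</code> is the built-in per-target health series rather than a MaxText-Slurm metric.</p><pre><code># Sketch only: PROM_URL is an assumption; use the address your Prometheus serves on.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# List every metric name stored in the TSDB (metric discovery).
prom_metric_names() {
  curl -s "${PROM_URL}/api/v1/label/__name__/values"
}

# Run an instant PromQL query; curl URL-encodes the expression.
prom_query() {
  curl -sG --data-urlencode "query=$1" "${PROM_URL}/api/v1/query"
}

# Example calls once Prometheus is running:
#   prom_metric_names
#   prom_query 'up'
</code></pre><p>Both calls return JSON. Listing metric names first shows which GPU, network, or training series a given job actually recorded before you write a query against them.</p><p>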
The following case studies show what this looks like in practice.</p><h2><strong>Case Study 0: Performance Profiling &#8212; One Prompt, Full Breakdown</strong></h2><p>Prompt:</p><blockquote><p>8104 perf analysis</p></blockquote><p>Figure 3a shows the agent&#8217;s response in Cursor:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BM6C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F644feacf-9f96-40cc-b603-095557a6b31b_785x524.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BM6C!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F644feacf-9f96-40cc-b603-095557a6b31b_785x524.png 424w, https://substackcdn.com/image/fetch/$s_!BM6C!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F644feacf-9f96-40cc-b603-095557a6b31b_785x524.png 848w, https://substackcdn.com/image/fetch/$s_!BM6C!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F644feacf-9f96-40cc-b603-095557a6b31b_785x524.png 1272w, https://substackcdn.com/image/fetch/$s_!BM6C!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F644feacf-9f96-40cc-b603-095557a6b31b_785x524.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BM6C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F644feacf-9f96-40cc-b603-095557a6b31b_785x524.png" width="785" height="524" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/644feacf-9f96-40cc-b603-095557a6b31b_785x524.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:524,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Cursor chat showing the prompt and the agent's response as it locates the job, measures TGS, runs TraceLens, and starts a web server&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Cursor chat showing the prompt and the agent's response as it locates the job, measures TGS, runs TraceLens, and starts a web server" title="Cursor chat showing the prompt and the agent's response as it locates the job, measures TGS, runs TraceLens, and starts a web server" srcset="https://substackcdn.com/image/fetch/$s_!BM6C!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F644feacf-9f96-40cc-b603-095557a6b31b_785x524.png 424w, https://substackcdn.com/image/fetch/$s_!BM6C!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F644feacf-9f96-40cc-b603-095557a6b31b_785x524.png 848w, https://substackcdn.com/image/fetch/$s_!BM6C!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F644feacf-9f96-40cc-b603-095557a6b31b_785x524.png 1272w, https://substackcdn.com/image/fetch/$s_!BM6C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F644feacf-9f96-40cc-b603-095557a6b31b_785x524.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p><em>Figure 3a. One prompt in Cursor. 
The agent locates the job directory, measures throughput from the log, then installs TraceLens if needed, profiles the XLA trace, and starts a web server (remaining steps not shown).</em></p><p>The end result &#8212; a complete performance breakdown served locally (Figure 3b):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2jcH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328ae956-5055-4080-b51f-ca065e425728_1028x1081.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2jcH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328ae956-5055-4080-b51f-ca065e425728_1028x1081.png 424w, https://substackcdn.com/image/fetch/$s_!2jcH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328ae956-5055-4080-b51f-ca065e425728_1028x1081.png 848w, https://substackcdn.com/image/fetch/$s_!2jcH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328ae956-5055-4080-b51f-ca065e425728_1028x1081.png 1272w, https://substackcdn.com/image/fetch/$s_!2jcH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328ae956-5055-4080-b51f-ca065e425728_1028x1081.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2jcH!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328ae956-5055-4080-b51f-ca065e425728_1028x1081.png" width="1200" height="1261.8677042801557" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/328ae956-5055-4080-b51f-ca065e425728_1028x1081.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1081,&quot;width&quot;:1028,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Performance analysis results showing training throughput, step time breakdown, and kernel-level profiling&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="Performance analysis results showing training throughput, step time breakdown, and kernel-level profiling" title="Performance analysis results showing training throughput, step time breakdown, and kernel-level profiling" srcset="https://substackcdn.com/image/fetch/$s_!2jcH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328ae956-5055-4080-b51f-ca065e425728_1028x1081.png 424w, https://substackcdn.com/image/fetch/$s_!2jcH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328ae956-5055-4080-b51f-ca065e425728_1028x1081.png 848w, https://substackcdn.com/image/fetch/$s_!2jcH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328ae956-5055-4080-b51f-ca065e425728_1028x1081.png 1272w, https://substackcdn.com/image/fetch/$s_!2jcH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328ae956-5055-4080-b51f-ca065e425728_1028x1081.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Figure 3b. 
One prompt, full performance breakdown &#8212; TGS, MFU, step time decomposition, and kernel-level profiling for a LLaMA 3.1 405B FP8 run on 64 MI355X GPUs.</em></p><p><strong>The key insight:</strong> one short prompt can trigger a complete, repeatable performance workflow &#8212; from job discovery to profiling output &#8212; and remove the manual glue work from routine performance analysis.</p><h2><strong>Case Study 1: The 23% TGS Drop &#8212; Diagnosing RDMA Degradation from a Single Job</strong></h2><p>A 24-node MoE training run (192 MI355X GPUs) is running without errors &#8212; no hangs, no crashes, no warnings &#8212; but throughput looks wrong:</p><blockquote><p>why is 8308 TGS so bad</p></blockquote><p>The agent triages the job, confirms training is still progressing, then queries the TSDB to find the source of the throughput drop (Figure 4a):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4Hby!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f51d9da-237c-4cf1-8f9c-65be9e5040f9_1285x477.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4Hby!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f51d9da-237c-4cf1-8f9c-65be9e5040f9_1285x477.png 424w, https://substackcdn.com/image/fetch/$s_!4Hby!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f51d9da-237c-4cf1-8f9c-65be9e5040f9_1285x477.png 848w, https://substackcdn.com/image/fetch/$s_!4Hby!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f51d9da-237c-4cf1-8f9c-65be9e5040f9_1285x477.png 1272w, 
https://substackcdn.com/image/fetch/$s_!4Hby!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f51d9da-237c-4cf1-8f9c-65be9e5040f9_1285x477.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4Hby!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f51d9da-237c-4cf1-8f9c-65be9e5040f9_1285x477.png" width="1200" height="445.44747081712063" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f51d9da-237c-4cf1-8f9c-65be9e5040f9_1285x477.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:477,&quot;width&quot;:1285,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;TGS timeline for Job 8308 showing three distinct phases: healthy ~3293, Phase 1 drop to ~2620, brief recovery to ~3288, then Phase 2 drop to ~2520&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="TGS timeline for Job 8308 showing three distinct phases: healthy ~3293, Phase 1 drop to ~2620, brief recovery to ~3288, then Phase 2 drop to ~2520" title="TGS timeline for Job 8308 showing three distinct phases: healthy ~3293, Phase 1 drop to ~2620, brief recovery to ~3288, then Phase 2 drop to ~2520" srcset="https://substackcdn.com/image/fetch/$s_!4Hby!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f51d9da-237c-4cf1-8f9c-65be9e5040f9_1285x477.png 424w, 
https://substackcdn.com/image/fetch/$s_!4Hby!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f51d9da-237c-4cf1-8f9c-65be9e5040f9_1285x477.png 848w, https://substackcdn.com/image/fetch/$s_!4Hby!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f51d9da-237c-4cf1-8f9c-65be9e5040f9_1285x477.png 1272w, https://substackcdn.com/image/fetch/$s_!4Hby!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f51d9da-237c-4cf1-8f9c-65be9e5040f9_1285x477.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Figure 4a. TGS timeline extracted from worker logs. The phased drop pattern points to an environmental issue, not a static config problem.</em></p><pre><code>Steps 0&#8211;96:    TGS ~3,293  (healthy baseline)
Steps 97&#8211;216:  TGS ~2,620  (&#8722;20%, constant)     &#8592; Phase 1
Steps 217&#8211;250: TGS ~3,288  (recovered)
Steps 252+:    TGS ~2,520  (&#8722;23%, constant)     &#8592; Phase 2
</code></pre><p>TSDB metrics quickly isolate RDMA issues on four hosts, while the rest of the system metrics remain normal (Figure 4b):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EQ0L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a0b45-aa80-4da3-95b6-a0ddd608e8aa_1673x1144.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EQ0L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a0b45-aa80-4da3-95b6-a0ddd608e8aa_1673x1144.png 424w, https://substackcdn.com/image/fetch/$s_!EQ0L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a0b45-aa80-4da3-95b6-a0ddd608e8aa_1673x1144.png 848w, https://substackcdn.com/image/fetch/$s_!EQ0L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a0b45-aa80-4da3-95b6-a0ddd608e8aa_1673x1144.png 1272w, https://substackcdn.com/image/fetch/$s_!EQ0L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a0b45-aa80-4da3-95b6-a0ddd608e8aa_1673x1144.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EQ0L!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a0b45-aa80-4da3-95b6-a0ddd608e8aa_1673x1144.png" width="1200" height="820.8791208791209" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/769a0b45-aa80-4da3-95b6-a0ddd608e8aa_1673x1144.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:996,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;RDMA retransmit rates per host: chi2822 showing massive spikes during Phase 1, then chi2834 and chi2882 showing sustained retransmits during Phase 2, while the remaining 20 nodes show zero&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="RDMA retransmit rates per host: chi2822 showing massive spikes during Phase 1, then chi2834 and chi2882 showing sustained retransmits during Phase 2, while the remaining 20 nodes show zero" title="RDMA retransmit rates per host: chi2822 showing massive spikes during Phase 1, then chi2834 and chi2882 showing sustained retransmits during Phase 2, while the remaining 20 nodes show zero" srcset="https://substackcdn.com/image/fetch/$s_!EQ0L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a0b45-aa80-4da3-95b6-a0ddd608e8aa_1673x1144.png 424w, https://substackcdn.com/image/fetch/$s_!EQ0L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a0b45-aa80-4da3-95b6-a0ddd608e8aa_1673x1144.png 848w, https://substackcdn.com/image/fetch/$s_!EQ0L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a0b45-aa80-4da3-95b6-a0ddd608e8aa_1673x1144.png 1272w, 
https://substackcdn.com/image/fetch/$s_!EQ0L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a0b45-aa80-4da3-95b6-a0ddd608e8aa_1673x1144.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Figure 4b. RDMA retransmit rates per host overlaid with TGS phases. Each TGS transition &#8212; drop, recovery, deeper drop &#8212; maps to a specific RDMA event on a specific node. 
20 of 24 nodes have zero RDMA retransmits throughout.</em></p><p>The agent identifies four unhealthy nodes for exclusion (<code>chi2822</code>, <code>chi2834</code>, <code>chi2882</code>, <code>chi2835</code>) and the user resubmits Job 8309 without them:</p><pre><code>Job 8308 (4 bad nodes in allocation): TGS 2,520
Job 8309 (bad nodes excluded):        TGS 3,287  (+30%)
</code></pre><p>The TensorBoard overlay confirms a full recovery (Figure 4c):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mMA3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52e6d00b-6d6b-444e-b3a1-7b6a3b4fb4a0_1283x497.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mMA3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52e6d00b-6d6b-444e-b3a1-7b6a3b4fb4a0_1283x497.png 424w, https://substackcdn.com/image/fetch/$s_!mMA3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52e6d00b-6d6b-444e-b3a1-7b6a3b4fb4a0_1283x497.png 848w, https://substackcdn.com/image/fetch/$s_!mMA3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52e6d00b-6d6b-444e-b3a1-7b6a3b4fb4a0_1283x497.png 1272w, https://substackcdn.com/image/fetch/$s_!mMA3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52e6d00b-6d6b-444e-b3a1-7b6a3b4fb4a0_1283x497.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mMA3!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52e6d00b-6d6b-444e-b3a1-7b6a3b4fb4a0_1283x497.png" width="1200" height="464.8480124707716" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52e6d00b-6d6b-444e-b3a1-7b6a3b4fb4a0_1283x497.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:497,&quot;width&quot;:1283,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;TensorBoard overlay showing Job 8308 TGS degraded around 2520 and Job 8309 TGS steady at ~3287 after excluding the four bad nodes&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="TensorBoard overlay showing Job 8308 TGS degraded around 2520 and Job 8309 TGS steady at ~3287 after excluding the four bad nodes" title="TensorBoard overlay showing Job 8308 TGS degraded around 2520 and Job 8309 TGS steady at ~3287 after excluding the four bad nodes" srcset="https://substackcdn.com/image/fetch/$s_!mMA3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52e6d00b-6d6b-444e-b3a1-7b6a3b4fb4a0_1283x497.png 424w, https://substackcdn.com/image/fetch/$s_!mMA3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52e6d00b-6d6b-444e-b3a1-7b6a3b4fb4a0_1283x497.png 848w, https://substackcdn.com/image/fetch/$s_!mMA3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52e6d00b-6d6b-444e-b3a1-7b6a3b4fb4a0_1283x497.png 1272w, https://substackcdn.com/image/fetch/$s_!mMA3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52e6d00b-6d6b-444e-b3a1-7b6a3b4fb4a0_1283x497.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p><em>Figure 4c. TensorBoard overlay of Job 8308 (degraded) and Job 8309 (bad nodes excluded). 
Job 8309 sustains ~3,287 TGS &#8212; a 30% recovery that confirms the four excluded nodes were the root cause.</em></p><p>Excluding those four nodes restored throughput from 2,520 to 3,287 TGS (+30%), within 0.2% of the original ~3,293 baseline. The +30% gain and the 23% drop describe the same gap, measured from the degraded and the healthy value respectively.</p><p><strong>The key insight:</strong> the agent used TSDB evidence from a single degraded job to isolate four unhealthy nodes, and excluding those nodes restored throughput to the healthy baseline.</p><h2><strong>Case Study 2: Heartbeat False-Positive &#8212; Proving Tasks Were Alive</strong></h2><p>This is the incident that motivated building the entire observability stack. Every checkpointing job died with &#8220;stopped sending heartbeats&#8221; &#8212; first during initialization (we <a href="https://github.com/ROCm/maxtext/commit/452e7cb">patched the timeout</a> to fix that), then at random steps during training. No stack trace, no core dump, no earlier error on the accused tasks. We had no way to tell whether the tasks had actually crashed. That dead end drove us to build the Prometheus-based observability stack: we needed an independent record of what every node was doing at the exact moment the heartbeat died. (See the <a href="https://github.com/AMD-AGI/maxtext-slurm/blob/main/docs/jax-heartbeat-false-positive-postmortem.md">full post-mortem</a> for the detailed root cause analysis.)</p><p>A 24-node training run (192 MI355X GPUs) crashes mid-training at step 71:</p><blockquote><p>triage 8043</p></blockquote><p>The agent reads the log and immediately spots the contradiction. The coordinator says tasks 8 and 10 &#8220;crashed&#8221;:</p><pre><code> 0: UNAVAILABLE: The following tasks are unhealthy (stopped sending heartbeats):
 0: /job:jax_worker/replica:0/task:8
 0: /job:jax_worker/replica:0/task:10
 0: The tasks have crashed.
</code></pre><p>But task 8&#8217;s own log tells a different story &#8212; it completed step 71 normally, then <em>received and logged its own death notification</em>:</p><pre><code> 8: completed step: 71, seconds: 30.309, TFLOP/s/device: 207.973, MFU: 8.32%
 8: F0224 19:21:00 client.h:77] Terminating process because the JAX distributed
    service detected fatal errors. absl::Status: UNAVAILABLE: ...
 8: /job:jax_worker/replica:0/task:8     &#8592; task 8 reports itself as &#8220;crashed&#8221;
</code></pre><p>A truly crashed process cannot log its own death. The agent flags this as a <strong>suspected false-positive</strong> and computes the timeline: crash at 19:21:00 minus the 900s heartbeat timeout means heartbeats stopped arriving at 19:06:00. It maps the accused tasks to their hostnames &#8212; chi2853 (task 8) and chi2875 (task 10) &#8212; and queries the TSDB across the full 15-minute window when the tasks were supposedly dead. Figure 5 shows what it found.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7u8u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8dd49cd-23c5-4bde-95cc-f526ee55293c_3780x1569.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7u8u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8dd49cd-23c5-4bde-95cc-f526ee55293c_3780x1569.png 424w, https://substackcdn.com/image/fetch/$s_!7u8u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8dd49cd-23c5-4bde-95cc-f526ee55293c_3780x1569.png 848w, https://substackcdn.com/image/fetch/$s_!7u8u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8dd49cd-23c5-4bde-95cc-f526ee55293c_3780x1569.png 1272w, https://substackcdn.com/image/fetch/$s_!7u8u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8dd49cd-23c5-4bde-95cc-f526ee55293c_3780x1569.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!7u8u!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8dd49cd-23c5-4bde-95cc-f526ee55293c_3780x1569.png" width="1200" height="497.8021978021978" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b8dd49cd-23c5-4bde-95cc-f526ee55293c_3780x1569.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:604,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;TSDB queries showing GPU power at ~900W on accused and non-accused hosts, with zero network errors, followed by simultaneous death at 19:21&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="TSDB queries showing GPU power at ~900W on accused and non-accused hosts, with zero network errors, followed by simultaneous death at 19:21" title="TSDB queries showing GPU power at ~900W on accused and non-accused hosts, with zero network errors, followed by simultaneous death at 19:21" srcset="https://substackcdn.com/image/fetch/$s_!7u8u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8dd49cd-23c5-4bde-95cc-f526ee55293c_3780x1569.png 424w, https://substackcdn.com/image/fetch/$s_!7u8u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8dd49cd-23c5-4bde-95cc-f526ee55293c_3780x1569.png 848w, 
https://substackcdn.com/image/fetch/$s_!7u8u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8dd49cd-23c5-4bde-95cc-f526ee55293c_3780x1569.png 1272w, https://substackcdn.com/image/fetch/$s_!7u8u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8dd49cd-23c5-4bde-95cc-f526ee55293c_3780x1569.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Figure 5. GPU power (top) and TCP retransmit rate (bottom) for the accused hosts and a non-accused neighbor chi2865 (task 9). 
All three show identical ~900W active training with near-zero retransmits, then simultaneous death at 19:21. The task sandwiched between the two &#8220;dead&#8221; tasks was perfectly healthy.</em></p><p>The proof is unambiguous. Both accused hosts drew ~900W (active training) with near-zero network errors throughout the entire window &#8212; indistinguishable from the non-accused neighbor sitting between them. The agent concludes: <strong>confirmed heartbeat false-positive</strong>. The tasks were alive and actively training when the heartbeat mechanism declared them dead. The root cause is a design flaw in JAX&#8217;s coordination service where a long-running <code>PollForError</code> call blocks the heartbeat callback on a shared gRPC channel (see the <a href="https://github.com/AMD-AGI/maxtext-slurm/blob/main/docs/jax-heartbeat-false-positive-postmortem.md">post-mortem</a> for the full source-level analysis).</p><p><strong>The key insight:</strong> without the TSDB, you only see the accusation &#8212; &#8220;tasks 8 and 10 have crashed.&#8221; With it, the agent builds a defense: GPU power, network health, and a non-accused neighbor with identical metrics, all proving the tasks were alive. The agent followed a chain &#8212; log contradiction &#8594; timeline reconstruction &#8594; TSDB forensics &#8212; that a human engineer would take hours to assemble manually.</p><h2><strong>Case Study 3: The 1% Throughput Mystery &#8212; Checkpoint Restore Leak</strong></h2><p>This one is subtle: two 24-node Mixture-of-Experts (MoE) training runs (192 MI355X GPUs each) use identical configs on the same cluster. Job A is a fresh start; Job B restores from a checkpoint. Job B is consistently ~1% slower:</p><blockquote><p>why is job B slower than job A? identical configs</p></blockquote><pre><code>Job A (fresh):   TGS 3,295
Job B (restore): TGS 3,261  (&#8722;1.0%)
</code></pre><p>The TensorBoard overlay makes the gap visible (Figure 6):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GjyH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f4329b0-b16b-4711-817a-3e73dd367d5d_1207x535.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GjyH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f4329b0-b16b-4711-817a-3e73dd367d5d_1207x535.png 424w, https://substackcdn.com/image/fetch/$s_!GjyH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f4329b0-b16b-4711-817a-3e73dd367d5d_1207x535.png 848w, https://substackcdn.com/image/fetch/$s_!GjyH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f4329b0-b16b-4711-817a-3e73dd367d5d_1207x535.png 1272w, https://substackcdn.com/image/fetch/$s_!GjyH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f4329b0-b16b-4711-817a-3e73dd367d5d_1207x535.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GjyH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f4329b0-b16b-4711-817a-3e73dd367d5d_1207x535.png" width="1207" height="535" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f4329b0-b16b-4711-817a-3e73dd367d5d_1207x535.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:535,&quot;width&quot;:1207,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;TensorBoard showing Job A at ~3295 TGS and Job B at ~3261 TGS, a persistent 1% gap&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="TensorBoard showing Job A at ~3295 TGS and Job B at ~3261 TGS, a persistent 1% gap" title="TensorBoard showing Job A at ~3295 TGS and Job B at ~3261 TGS, a persistent 1% gap" srcset="https://substackcdn.com/image/fetch/$s_!GjyH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f4329b0-b16b-4711-817a-3e73dd367d5d_1207x535.png 424w, https://substackcdn.com/image/fetch/$s_!GjyH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f4329b0-b16b-4711-817a-3e73dd367d5d_1207x535.png 848w, https://substackcdn.com/image/fetch/$s_!GjyH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f4329b0-b16b-4711-817a-3e73dd367d5d_1207x535.png 1272w, https://substackcdn.com/image/fetch/$s_!GjyH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f4329b0-b16b-4711-817a-3e73dd367d5d_1207x535.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Figure 6. TensorBoard overlay. Job A (fresh) sustains ~3,295 TGS; Job B (restore) plateaus at ~3,261 TGS &#8212; a persistent gap that compounds over multi-week runs.</em></p><p>The gap is reproducible: any job restored with <code>enable_single_replica_ckpt_restoring=true</code><a href="https://rocm.blogs.amd.com/software-tools-optimization/maxtext-slurm-agentic-diagnosis/README.html#id2"><sup>[1]</sup></a> is slower than an identical fresh start. The agent confirms matching training configs over 200+ steps, then starts Prometheus for both jobs&#8217; persisted TSDBs and runs a systematic contention sweep. GPU utilization: both ~100%. GPU power: both ~916W. RDMA retransmits: zero. Disk I/O: comparable. 
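</p><p>The sweep itself is a like-for-like comparison: take the same metric from both jobs and flag anything that differs. A minimal Python sketch of that idea, assuming the per-host samples have already been pulled out of each job&#8217;s TSDB; the function name, the metric names, and the 5% tolerance are hypothetical and not part of the project&#8217;s tooling:</p><pre><code>from statistics import mean


def contention_sweep(job_a, job_b, rel_tol=0.05):
    """Return {metric: (mean_a, mean_b)} for metrics that differ by more than rel_tol."""
    flagged = {}
    for name in sorted(set(job_a).intersection(job_b)):
        a, b = mean(job_a[name]), mean(job_b[name])
        scale = max(abs(a), abs(b))
        # Two all-zero series (for example, no retransmits) count as identical.
        if scale > 0 and abs(a - b) / scale > rel_tol:
            flagged[name] = (a, b)
    return flagged


# Approximate per-host averages quoted in the text; two hosts shown for brevity.
# The metric names are placeholders, not the names used in the real TSDB.
job_a = {"gpu_util_pct": [100, 100], "gpu_power_w": [916, 916], "rdma_retransmits": [0, 0]}
job_b = {"gpu_util_pct": [100, 100], "gpu_power_w": [916, 916], "rdma_retransmits": [0, 0]}

suspects = contention_sweep(job_a, job_b)
print(suspects)
</code></pre><p>With the GPU and network values quoted here the sweep flags nothing, so no metric in those domains separates the two jobs.</p><p>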
Everything looks identical &#8212; until the agent checks CPU metrics.</p><p><code>hw_procs_running</code> &#8212; the number of runnable threads in the kernel run queue &#8212; breaks the symmetry:</p><pre><code>Job A (fresh):   avg hw_procs_running &#8776; 18 per host
Job B (restore): avg hw_procs_running &#8776; 35 per host
Delta: ~17 extra runnable threads per host, constant across all 24 nodes
</code></pre><p>Figures 7a and 7b visualize this gap across all 24 nodes:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C_-4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F040405b8-ca53-4ea5-a4f5-b5f427c0f8d8_1660x784.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C_-4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F040405b8-ca53-4ea5-a4f5-b5f427c0f8d8_1660x784.png 424w, https://substackcdn.com/image/fetch/$s_!C_-4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F040405b8-ca53-4ea5-a4f5-b5f427c0f8d8_1660x784.png 848w, https://substackcdn.com/image/fetch/$s_!C_-4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F040405b8-ca53-4ea5-a4f5-b5f427c0f8d8_1660x784.png 1272w, https://substackcdn.com/image/fetch/$s_!C_-4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F040405b8-ca53-4ea5-a4f5-b5f427c0f8d8_1660x784.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C_-4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F040405b8-ca53-4ea5-a4f5-b5f427c0f8d8_1660x784.png" width="1456" height="688" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/040405b8-ca53-4ea5-a4f5-b5f427c0f8d8_1660x784.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:688,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;hw_procs_running for Job A (fresh start): ~18 per host across all 24 nodes&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="hw_procs_running for Job A (fresh start): ~18 per host across all 24 nodes" title="hw_procs_running for Job A (fresh start): ~18 per host across all 24 nodes" srcset="https://substackcdn.com/image/fetch/$s_!C_-4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F040405b8-ca53-4ea5-a4f5-b5f427c0f8d8_1660x784.png 424w, https://substackcdn.com/image/fetch/$s_!C_-4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F040405b8-ca53-4ea5-a4f5-b5f427c0f8d8_1660x784.png 848w, https://substackcdn.com/image/fetch/$s_!C_-4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F040405b8-ca53-4ea5-a4f5-b5f427c0f8d8_1660x784.png 1272w, https://substackcdn.com/image/fetch/$s_!C_-4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F040405b8-ca53-4ea5-a4f5-b5f427c0f8d8_1660x784.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Figure 7a. </em><code>hw_procs_running</code><em> for Job A (fresh start). 
All 24 nodes stabilize at ~18 runnable threads.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IwpO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd181ba-b61b-4848-b176-2d6dc258aefe_1660x791.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IwpO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd181ba-b61b-4848-b176-2d6dc258aefe_1660x791.png 424w, https://substackcdn.com/image/fetch/$s_!IwpO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd181ba-b61b-4848-b176-2d6dc258aefe_1660x791.png 848w, https://substackcdn.com/image/fetch/$s_!IwpO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd181ba-b61b-4848-b176-2d6dc258aefe_1660x791.png 1272w, https://substackcdn.com/image/fetch/$s_!IwpO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd181ba-b61b-4848-b176-2d6dc258aefe_1660x791.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IwpO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd181ba-b61b-4848-b176-2d6dc258aefe_1660x791.png" width="1456" height="694" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7bd181ba-b61b-4848-b176-2d6dc258aefe_1660x791.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:694,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;hw_procs_running for Job B (checkpoint restore): ~35 per host across all 24 nodes&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="hw_procs_running for Job B (checkpoint restore): ~35 per host across all 24 nodes" title="hw_procs_running for Job B (checkpoint restore): ~35 per host across all 24 nodes" srcset="https://substackcdn.com/image/fetch/$s_!IwpO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd181ba-b61b-4848-b176-2d6dc258aefe_1660x791.png 424w, https://substackcdn.com/image/fetch/$s_!IwpO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd181ba-b61b-4848-b176-2d6dc258aefe_1660x791.png 848w, https://substackcdn.com/image/fetch/$s_!IwpO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd181ba-b61b-4848-b176-2d6dc258aefe_1660x791.png 1272w, https://substackcdn.com/image/fetch/$s_!IwpO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd181ba-b61b-4848-b176-2d6dc258aefe_1660x791.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Figure 7b. </em><code>hw_procs_running</code><em> for Job B (checkpoint restore). All 24 nodes at ~35 &#8212; about 17 more than Job A, constant throughout training.</em></p><p>This is not a transient spike &#8212; the delta is flat across all 24 nodes for the entire run. Something created ~17 extra threads during restore and they never exited.</p><p>The agent then traces into code. It searches both logs for RCCL communicator initialization &#8212; Job A shows three waves during startup, while Job B shows the same three plus two extra waves created during checkpoint restore broadcast. 
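</p><p>Counting &#8220;waves&#8221; can be made mechanical by clustering the initialization events by the time gaps between them. A minimal Python sketch, assuming the communicator-init timestamps have already been extracted from each log as seconds since job start; the function name, the 30-second gap, and the sample timestamps are hypothetical, chosen only to mirror the three-versus-five pattern described here:</p><pre><code>def count_waves(timestamps, max_gap_s=30.0):
    """Count bursts of events separated by more than max_gap_s seconds."""
    waves = 0
    previous = None
    for t in sorted(timestamps):
        # A new wave starts at the first event or after a long quiet period.
        if previous is None or t - previous > max_gap_s:
            waves += 1
        previous = t
    return waves


# Hypothetical init timestamps (seconds since job start), not real log data.
job_a_inits = [5, 6, 7, 60, 61, 120, 121]
job_b_inits = job_a_inits + [300, 301, 400, 402]

waves_a = count_waves(job_a_inits)
waves_b = count_waves(job_b_inits)
print(waves_a, waves_b)
</code></pre><p>Any wave beyond the fresh-start count points at communicators created outside normal startup.</p><p>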
Tracing into the Orbax source reveals why: single-replica restore uses <code>jax.jit</code> to broadcast parameters across replicas, which initializes RCCL communicators that are permanently cached in XLA&#8217;s C++ layer. Their background polling threads never exit, creating constant CPU contention on every training step.</p><p><strong>Verified fix:</strong> Two patches address the issue: <a href="https://github.com/ROCm/maxtext/commit/5cfee461">one</a> replaces Orbax&#8217;s JIT-based broadcast with a direct RCCL broadcast that explicitly destroys communicators after use, and <a href="https://github.com/ROCm/maxtext/commit/caae02e3">another</a> eliminates extra XLA compilations during restore that degrade steady-state performance. With both applied, TGS recovers to within about 0.15% of the fresh-start baseline (Figure 8):</p><pre><code>Job A (fresh):                 TGS 3,295  (baseline)
Job B (restore, no patch):     TGS 3,261  (&#8722;1.0%)
Job C (restore, with patches): TGS 3,290  (&#8722;0.15%)
</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zIqF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc669e7e2-71fa-435e-8503-71398cf5158a_1205x533.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zIqF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc669e7e2-71fa-435e-8503-71398cf5158a_1205x533.png 424w, https://substackcdn.com/image/fetch/$s_!zIqF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc669e7e2-71fa-435e-8503-71398cf5158a_1205x533.png 848w, https://substackcdn.com/image/fetch/$s_!zIqF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc669e7e2-71fa-435e-8503-71398cf5158a_1205x533.png 1272w, https://substackcdn.com/image/fetch/$s_!zIqF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc669e7e2-71fa-435e-8503-71398cf5158a_1205x533.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zIqF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc669e7e2-71fa-435e-8503-71398cf5158a_1205x533.png" width="1205" height="533" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c669e7e2-71fa-435e-8503-71398cf5158a_1205x533.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:533,&quot;width&quot;:1205,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;TGS nearly 
recovered after the patches&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="TGS nearly recovered after the patches" title="TGS nearly recovered after the patches" srcset="https://substackcdn.com/image/fetch/$s_!zIqF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc669e7e2-71fa-435e-8503-71398cf5158a_1205x533.png 424w, https://substackcdn.com/image/fetch/$s_!zIqF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc669e7e2-71fa-435e-8503-71398cf5158a_1205x533.png 848w, https://substackcdn.com/image/fetch/$s_!zIqF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc669e7e2-71fa-435e-8503-71398cf5158a_1205x533.png 1272w, https://substackcdn.com/image/fetch/$s_!zIqF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc669e7e2-71fa-435e-8503-71398cf5158a_1205x533.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Figure 8. After the patches, the checkpoint-restore job nearly closes the gap with the fresh start (&#8722;0.15% remaining).</em></p><p><strong>The key insight:</strong> this diagnosis started with the TSDB and could not have started anywhere else. The <code>hw_procs_running</code> metric &#8212; a kernel scheduler counter that most monitoring stacks don&#8217;t even collect &#8212; was the only signal that distinguished two otherwise identical jobs. Without it in the same queryable store as training metrics, the ~17 leaked threads would have been invisible, and the 1% gap would have remained an unsolved mystery.</p><h2><strong>Case Study 4: The MoE Throughput Decline &#8212; When the System Is Fine</strong></h2><p>Every diagnostic case study so far found a system-level root cause &#8212; degraded RDMA links, a buggy heartbeat mechanism, leaked threads from checkpoint restore. This one is different.</p><p>TGS for a 24-node MoE training run (192 MI355X GPUs) begins declining around step 450. 
No errors, no crashes &#8212; but throughput is sliding:</p><blockquote><p>Why did the TGS for job 7938 begin declining around step 450?</p></blockquote><p>The agent runs the <code>tsdb-diagnosis</code> skill, systematically querying every metric family in the unified TSDB. GPU thermals: normal. GPU power: stable at ~916W. RDMA retransmits: zero. TCP retransmits: minimal. CPU contention (<code>hw_procs_running</code>): stable. I/O pressure: none. Every system-level metric comes back clean.</p><p>With all system causes ruled out, the agent turns to the training metrics &#8212; and finds the answer. The <code>tb_learning_moe_lb_loss</code> metric (MoE load balance loss) spiked at exactly the step where TGS began declining. The load balance loss penalty forced the MoE expert router to reorganize, producing a new routing pattern with higher per-step communication cost. The TGS decline is not a system failure &#8212; it is a consequence of the training dynamics. Figures 9a and 9b show the full diagnostic conversation and the TensorBoard confirmation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KHPd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e62c48-353d-4d79-9c6e-5237cccaae22_1777x809.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KHPd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e62c48-353d-4d79-9c6e-5237cccaae22_1777x809.png 424w, https://substackcdn.com/image/fetch/$s_!KHPd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e62c48-353d-4d79-9c6e-5237cccaae22_1777x809.png 848w, 
https://substackcdn.com/image/fetch/$s_!KHPd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e62c48-353d-4d79-9c6e-5237cccaae22_1777x809.png 1272w, https://substackcdn.com/image/fetch/$s_!KHPd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e62c48-353d-4d79-9c6e-5237cccaae22_1777x809.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KHPd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e62c48-353d-4d79-9c6e-5237cccaae22_1777x809.png" width="1456" height="663" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84e62c48-353d-4d79-9c6e-5237cccaae22_1777x809.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:663,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;AI agent diagnosing a TGS decline by querying the TSDB, ruling out system-level causes, and identifying MoE expert routing reorganization as the root cause&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AI agent diagnosing a TGS decline by querying the TSDB, ruling out system-level causes, and identifying MoE expert routing reorganization as the root cause" title="AI agent diagnosing a TGS decline by querying the TSDB, ruling out system-level causes, and identifying MoE expert routing reorganization as the root cause" 
srcset="https://substackcdn.com/image/fetch/$s_!KHPd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e62c48-353d-4d79-9c6e-5237cccaae22_1777x809.png 424w, https://substackcdn.com/image/fetch/$s_!KHPd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e62c48-353d-4d79-9c6e-5237cccaae22_1777x809.png 848w, https://substackcdn.com/image/fetch/$s_!KHPd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e62c48-353d-4d79-9c6e-5237cccaae22_1777x809.png 1272w, https://substackcdn.com/image/fetch/$s_!KHPd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e62c48-353d-4d79-9c6e-5237cccaae22_1777x809.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Figure 9a. End-to-end AI diagnosis. The agent queries the unified TSDB, rules out all system-level causes, and identifies the root cause: the load balance loss penalty forced the MoE expert router to reorganize.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gJSo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd901796a-05f4-4878-9798-34b10bb2a130_814x306.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gJSo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd901796a-05f4-4878-9798-34b10bb2a130_814x306.png 424w, https://substackcdn.com/image/fetch/$s_!gJSo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd901796a-05f4-4878-9798-34b10bb2a130_814x306.png 848w, https://substackcdn.com/image/fetch/$s_!gJSo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd901796a-05f4-4878-9798-34b10bb2a130_814x306.png 1272w, https://substackcdn.com/image/fetch/$s_!gJSo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd901796a-05f4-4878-9798-34b10bb2a130_814x306.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gJSo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd901796a-05f4-4878-9798-34b10bb2a130_814x306.png" width="814" height="306" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d901796a-05f4-4878-9798-34b10bb2a130_814x306.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:306,&quot;width&quot;:814,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;TensorBoard chart showing the TGS decline correlating with expert routing reorganization around step 450&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="TensorBoard chart showing the TGS decline correlating with expert routing reorganization around step 450" title="TensorBoard chart showing the TGS decline correlating with expert routing reorganization around step 450" srcset="https://substackcdn.com/image/fetch/$s_!gJSo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd901796a-05f4-4878-9798-34b10bb2a130_814x306.png 424w, https://substackcdn.com/image/fetch/$s_!gJSo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd901796a-05f4-4878-9798-34b10bb2a130_814x306.png 848w, https://substackcdn.com/image/fetch/$s_!gJSo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd901796a-05f4-4878-9798-34b10bb2a130_814x306.png 1272w, 
https://substackcdn.com/image/fetch/$s_!gJSo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd901796a-05f4-4878-9798-34b10bb2a130_814x306.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Figure 9b. TensorBoard confirmation. The TGS decline begins at the same step where expert routing reorganized &#8212; a training-level effect, not a system-level failure.</em></p><p><strong>The key insight:</strong> not every throughput decline is a system problem. 
The agent&#8217;s value here is in the <em>ruling out</em> &#8212; by systematically checking every contention source and finding them all clean, it directs attention to the training metrics where the actual cause lives. Without the unified TSDB, an engineer would spend hours checking network, GPU, and I/O by hand before even considering that the model itself changed behavior. The diagnostic framework doesn&#8217;t just find system failures &#8212; it knows when to stop looking for them.</p><h2><strong>Limitations, Failure Modes, and Operating Boundaries</strong></h2><p>The five case studies above are successful outcomes, but they were not all easy first-pass diagnoses. Several were initially hard to attribute and became solvable only after iterative TSDB analysis and correction.</p><ul><li><p><strong>Telemetry limits matter:</strong> if key metrics are missing or low quality, diagnosis may stop at insufficient evidence.</p></li><li><p><strong>Multi-fault incidents are harder:</strong> concurrent issues can blur attribution and mislead branch selection in the decision tree.</p></li><li><p><strong>Human verification is still required:</strong> for high-impact actions, operators should confirm recommendations independently.</p></li></ul><p>The key point is that the skill set is <strong>continuously evolving</strong>. When a new case appears (or an early diagnosis fails), we refine the attribution path and submit a PR to update the skill decision tree so similar future incidents can be diagnosed faster.</p><h2><strong>Summary</strong></h2><p>Across all five case studies, the pattern is consistent: a short prompt triggers a structured diagnostic procedure that reaches an actionable conclusion in minutes. The agent profiles compute, triages logs, queries the TSDB, correlates across domains, and traces into source code &#8212; all without manual guidance. 
The unified TSDB is the critical enabler: because GPU, host, network, and training metrics share the same timeline and host labels, the agent can test hypotheses across domain boundaries in seconds.</p><p>These skills ship with the repo and work today. Each real incident also improves them &#8212; the RDMA-driven TGS regression added a TGS degradation diagnosis workflow, RDMA phase-correlation technique, and node exclusion prioritization table to both the triage and TSDB skills; the heartbeat false-positive refined the interpretation rules; and the checkpoint restore leak led to a new <code>hw_procs_running</code> playbook. This feedback loop means the skills get better with every production deployment.</p><p>The framework is designed to grow. The three skills shipped today cover the most common diagnostic scenarios, but adding a new skill means writing structured markdown &#8212; no code changes required. Adding a new metric source means dropping a shell script into <code>utils/</code>. See the project&#8217;s <a href="https://github.com/AMD-AGI/maxtext-slurm/blob/main/skills/README.md">skills README</a> and <a href="https://github.com/AMD-AGI/maxtext-slurm/blob/main/docs/observability.md">observability docs</a> for contribution guidance.</p><p>Most importantly, adoption should be explicit about boundaries: diagnosis quality is bounded by telemetry quality, skill coverage, and incident complexity. Those boundaries improve over time as resolved incidents are converted into PR-based skill updates. 
Treat agentic diagnosis as a high-leverage copilot for incident response &#8212; fast and systematic &#8212; while keeping a human in the loop for ambiguous or high-risk decisions.</p><p>We invite you to set up the agent on your own cluster, point it at your training jobs, and see what it finds.</p><h2><strong>Additional Resources</strong></h2><ul><li><p><a href="https://github.com/AMD-AGI/maxtext-slurm">MaxText-Slurm GitHub Repository</a></p></li><li><p><a href="https://rocm.blogs.amd.com/software-tools-optimization/maxtext-slurm/README.html">MaxText-Slurm: Production-Grade LLM Training with Built-In Observability</a></p></li><li><p><a href="https://github.com/ROCm/maxtext">ROCm MaxText Fork</a></p></li><li><p><a href="https://github.com/AI-Hypercomputer/maxtext">Upstream MaxText GitHub Repository</a></p></li><li><p><a href="https://cursor.com/">Cursor IDE</a></p></li><li><p><a href="https://docs.anthropic.com/en/docs/claude-code">Claude Code Documentation</a></p></li></ul><p><a href="https://rocm.blogs.amd.com/software-tools-optimization/maxtext-slurm-agentic-diagnosis/README.html">Read the original here</a></p><div><hr></div><p>Thank you for being here, and I hope you have a wonderful day.</p><p>Dev &lt;3</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines.</a></p><p class="button-wrapper" 
data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/how-to-diagnose-failures-in-large?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/how-to-diagnose-failures-in-large?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. </strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wpq8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a736cd5-6b1c-4d6c-b0f1-0c1a5a587270_412x167.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wpq8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a736cd5-6b1c-4d6c-b0f1-0c1a5a587270_412x167.png 424w, https://substackcdn.com/image/fetch/$s_!wpq8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a736cd5-6b1c-4d6c-b0f1-0c1a5a587270_412x167.png 848w, 
https://substackcdn.com/image/fetch/$s_!wpq8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a736cd5-6b1c-4d6c-b0f1-0c1a5a587270_412x167.png 1272w, https://substackcdn.com/image/fetch/$s_!wpq8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a736cd5-6b1c-4d6c-b0f1-0c1a5a587270_412x167.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wpq8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a736cd5-6b1c-4d6c-b0f1-0c1a5a587270_412x167.png" width="412" height="167" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4a736cd5-6b1c-4d6c-b0f1-0c1a5a587270_412x167.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:167,&quot;width&quot;:412,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wpq8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a736cd5-6b1c-4d6c-b0f1-0c1a5a587270_412x167.png 424w, https://substackcdn.com/image/fetch/$s_!wpq8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a736cd5-6b1c-4d6c-b0f1-0c1a5a587270_412x167.png 848w, 
https://substackcdn.com/image/fetch/$s_!wpq8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a736cd5-6b1c-4d6c-b0f1-0c1a5a587270_412x167.png 1272w, https://substackcdn.com/image/fetch/$s_!wpq8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a736cd5-6b1c-4d6c-b0f1-0c1a5a587270_412x167.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3><strong>Reach out to me</strong></h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>Check out my other articles on Medium:</p><p>https://machine-learning-made-simple.medium.com/</p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. 
Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[Earth Scientist Tells You How Your Land Is Lying to You [Livestreams]]]></title><description><![CDATA[How AI turns weeks of GIS work into 10 seconds of queries]]></description><link>https://www.artificialintelligencemadesimple.com/p/earth-scientist-tells-you-how-your</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/earth-scientist-tells-you-how-your</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Tue, 10 Mar 2026 05:41:15 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/189794669/5ce0edca045067150f2860966c284060.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. <strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>Thanks to everyone for showing up to the livestream. 
<strong>Mark your calendars for 8 PM EST, Sundays, to make sure you can come in live and ask questions.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/earth-scientist-tells-you-how-your?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/earth-scientist-tells-you-how-your?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>^^Bring your moms and grandmoms into the Chocolate Milk Cult.</p><p><strong><a href="https://www.linkedin.com/in/mitch-rawlyk/">Mitch Rawlyk</a> </strong>is an earth system scientist and applied meteorologist with a background in regenerative agriculture and permaculture design. I met Mitch when he came to NYC, and his work has been on my radar ever since.  He runs a regenerative homestead in Canada where he and his family grow most of their own produce and do rotational grazing with chickens and ducks. That hands-on experience &#8212; years of staring at landscapes and asking questions the existing tools couldn&#8217;t answer fast enough &#8212; is what led him to build Landscope.</p><p>Landscope started as a Google Earth Pro package for permaculture designers and watershed restoration practitioners. It&#8217;s since grown into a full platform that can map anywhere in the United States at one-meter LiDAR resolution in seconds, delivering terrain analysis that used to take weeks of manual GIS work. They&#8217;re about to launch collaborative features that let multiple people work from the same mapping workspace. 
</p><p>You can check out the platform at <a href="https://landscope.earth/">landscope.earth</a> or reach Mitch directly at mitch@landscope.earth.</p><p>I really enjoyed this conversation, and I thought this would be a great way to expose you guys to some more &#8220;alt&#8221; applications of AI/data science and, hopefully, to introduce some of you to a very useful tool. Have fun.</p><div><hr></div><h1>Companion Guide to the Livestream: Why Your Land Is Lying to You</h1><p><em>This guide expands on the core ideas and structures them for deeper reflection. Watch the full stream for tone, nuance, and side-commentary.</em></p><div><hr></div><h2>1. The GIS Workflow Is Broken for the People Who Need It Most</h2><p><strong>The Event</strong> &#8212; Mitch walked through the problem that birthed Landscope: the people who most need terrain data &#8212; permaculture designers, regenerative farmers, independent property buyers, development companies &#8212; are the ones least equipped to get it. The current workflow involves sourcing raw data from fragmented government portals, merging digital elevation models, clipping areas of interest, processing everything through QGIS, and stitching outputs into something usable. That process takes weeks and requires serious technical chops.</p><p><strong>Why this matters</strong> &#8212; This is a pattern that shows up everywhere in data tooling but rarely gets named this cleanly: the people generating the most value from insights are not the people who can build the pipeline to produce them.</p><p>A permaculture designer knows exactly what a south-facing slope at 8 degrees means for a food forest. They have no idea how to merge two LiDAR tiles in QGIS. The knowledge gap isn&#8217;t in interpretation &#8212; it&#8217;s in access. And that access gap has real consequences. Designs get made with bad data. Properties get purchased without understanding drainage. Houses get built where ponds want to go.</p><p>The information exists. 
It&#8217;s sitting in government databases at one-meter resolution. But the distance between &#8220;the data exists&#8221; and &#8220;I can make a decision with it&#8221; is measured in hundreds of hours of GIS education that nobody in the target market is going to get.</p><div><hr></div><h2>2. Why Free Contour Maps Are Worse Than No Contour Map</h2><p><strong>The Event</strong> &#8212; Mitch made a point that sounds counterintuitive until you think about it: the free tools people currently use to make land decisions are actively dangerous. Public contour maps operate at 30-meter resolution &#8212; roughly 100-foot grid cells. At property scale, that resolution doesn&#8217;t just lose detail. It fabricates a landscape that doesn&#8217;t exist.</p><p><strong>Why this is a trap</strong> &#8212; A bad map that looks like a map is worse than no map at all. No map means you know you&#8217;re guessing. A 30-meter contour map means you think you&#8217;re informed when you&#8217;re hallucinating topography.</p><p>You&#8217;re placing a house based on drainage patterns the data literally cannot see. You&#8217;re orienting a garden based on slope aspects rounded into meaninglessness. Confidence goes up while accuracy goes to zero, and that mismatch is where expensive mistakes live.</p><p>Landscope pulls one-meter LiDAR data &#8212; the same resolution that gets flown after natural disasters like the Palisades fires. That&#8217;s not an incremental improvement over 30-meter. That&#8217;s the difference between &#8220;there&#8217;s a hill somewhere around here&#8221; and seeing the erosion cut running directly into your foundation.</p><div><hr></div><h2>3. The Aspect-Slope Intersection That Killed a Property in 30 Seconds</h2><p><strong>The Event</strong> &#8212; The demo was the best moment of the stream. Mitch mapped a random spot in rural New York for a hypothetical agrihood development. 
First, the slope aspect layer &#8212; which direction every part of the landscape faces &#8212; filtered for southeast, south, and southwest exposure. Those are the money slopes: accumulated heat through the day, peak solar intensity in mid-afternoon, the zones where passive solar housing and food production actually work.</p><p>Looked promising at first glance. Then he stacked slope angles on top. The overlap between &#8220;faces the right way&#8221; and &#8220;actually buildable&#8221; shrank to almost nothing. South-facing areas were either too steep for construction or buried in forest on aggressive grades. The property was dead for its stated use case inside of half a minute.</p><p><strong>Why this is the real product</strong> &#8212; Landscope isn&#8217;t a mapping tool. It&#8217;s a decision-elimination tool. The value isn&#8217;t showing you where to build &#8212; it&#8217;s showing you where you can&#8217;t, before you&#8217;ve spent six figures finding out the hard way.</p><p>The aspect distribution report showed almost nothing was south-facing in usable proportions. If your goal is passive solar housing and food production, this property doesn&#8217;t deserve a site visit. That&#8217;s time, money, and emotional attachment you never waste.</p><p>Mitch also showed the topographic wetness index &#8212; where water naturally wants to pool based on terrain geometry. The insight is dead simple: best place to put a pond is where a pond wants to go, worst place to put a house is where a pond wants to go. On a property he&#8217;d been personally considering in New Zealand, the flow accumulation lines ran straight into the house. First thing he&#8217;s checking on the site visit is the foundation and walls for previous water damage. That&#8217;s the kind of leverage that turns a mapping credit into a negotiation advantage worth tens of thousands.</p><div><hr></div><h2>4. 
The Consultancy Radius Problem</h2><p><strong>The Event</strong> &#8212; Mitch&#8217;s best power user is Symbiosis, a regenerative design consultancy in Texas. They&#8217;re using the platform to design water retention systems across large land areas, stacking aspect and slope layers to site food forests, and translating terrain data into presentations that non-technical landowners can actually read. The key unlock: their business is no longer capped by the odometer in their trucks.</p><p><strong>Why this is the business model insight</strong> &#8212; Before Landscope, a consultancy like Symbiosis had to physically visit a site for any meaningful terrain assessment. Your serviceable market was a radius around your office. Drive out, walk the land, eyeball the contours, come back and process data.</p><p>With high-resolution remote terrain analysis, they assess a property before getting in the truck. They show up with hypotheses already formed, drainage patterns mapped, slope intersections identified. That does two things: each engagement gets higher quality because you&#8217;re validating on-site instead of generating from scratch, and your geographic range expands because the pre-visit work that used to require presence now requires a few clicks.</p><p>There&#8217;s a subtler point worth naming too. Contour maps are a literacy barrier. Thousands of hours of practice make them as legible as text. Without that, they&#8217;re abstract lines. Landscope&#8217;s color-coded, layer-stackable output is immediately readable by someone who&#8217;s never touched GIS &#8212; which matters because the consultancy&#8217;s client is usually a landowner, not a technician. Being able to show a client &#8220;this red is your hottest slope, this blue blob is where water pools, this is why we&#8217;re not building here&#8221; is a sales tool as much as a design tool.</p><div><hr></div><h2>5. 
Persistence, Collaboration, and Scenario Planning</h2><p><strong>The Event</strong> &#8212; Mitch demoed the staging environment for Landscope&#8217;s next release. Headline features: collaborative projects, persistent scenarios that survive a page refresh, cross-sectional path analysis with LiDAR precision, and refined area statistics for sub-regions.</p><p><strong>Why persistence is the unlock</strong> &#8212; Sounds basic until you realize the previous workflow: spend time configuring layers, identifying patterns, building a mental model of the landscape &#8212; and if you refresh the page, all of it vanishes. That&#8217;s not a UX annoyance. That&#8217;s a fundamental barrier to real decision-making. Nobody presents scenario comparisons to a client if the scenarios can&#8217;t be saved and recalled.</p><p>The cross-sectional analysis is the other sleeper feature. Draw a path for a proposed road or water pipe and get an elevation profile at one-meter resolution. That feeds directly into excavation estimates, grading plans, and infrastructure feasibility &#8212; the boring, expensive calculations that actually determine whether a project is viable.</p><div><hr></div><h2>6. Where This Goes: When the Sky Talks to the Land</h2><p><strong>The Event</strong> &#8212; The stream ended on what Landscope doesn&#8217;t do yet but clearly should. Right now, the hydrological layers assume 100% ground saturation &#8212; worst case, no infiltration, everything flows based purely on terrain geometry. Useful for drainage patterns, but it doesn&#8217;t model real precipitation events.</p><p>Mitch laid out the roadmap: integrate soil infiltration data, connect actual precipitation records, and eventually let users drop interventions onto the terrain &#8212; swales, ponds, roads &#8212; and rerun the hydrology to see what changes.</p><p><strong>Why this is the real vision</strong> &#8212; The current product is a terrain analysis tool. 
The future product is a landscape simulation environment. That gap is the difference between &#8220;here&#8217;s what the land looks like&#8221; and &#8220;here&#8217;s what happens when it rains 4 inches in an hour and you&#8217;ve built a swale at this contour.&#8221;</p><p>That&#8217;s where real decision-making power lives &#8212; not static analysis but counterfactual modeling. What if the pond goes here instead of there? What if we add retention at this drainage convergence? Does flooding risk at the house site go up or down?</p><p>Mitch also mentioned NDVI, NDMI, and land surface temperature as future satellite integrations &#8212; layering vegetation health, soil moisture, and thermal data on top of the terrain model. If someone implements a regenerative intervention, you could track its impact over time. That closes the loop between design and validation in a way that doesn&#8217;t currently exist for small-scale practitioners.</p><div><hr></div><p><em>As with every companion guide &#8212; this covers the big ideas but not everything. Mitch talked about the education course he&#8217;s building to teach layer interpretation, the county-wide screening service that works as a top-of-funnel before you zoom into property-scale analysis, the drone upload path that lets international users bypass LiDAR coverage gaps, and more. Watch the full stream for the complete picture.</em></p><p><em>As always, the roundup goes out to subscribers. 
If you want access to the livestreams and this level of analysis in real time, you know where to find us.</em></p><div><hr></div><p>Subscribe to support AI Made Simple and help us deliver more quality information to you-</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><p></p><p>Flexible pricing available&#8212;<a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">pay what matches your budget here</a>.<br></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EAau!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EAau!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 424w, https://substackcdn.com/image/fetch/$s_!EAau!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 848w, https://substackcdn.com/image/fetch/$s_!EAau!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 1272w, 
https://substackcdn.com/image/fetch/$s_!EAau!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EAau!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png" width="339" height="93" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:93,&quot;width&quot;:339,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EAau!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 424w, https://substackcdn.com/image/fetch/$s_!EAau!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 848w, https://substackcdn.com/image/fetch/$s_!EAau!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 1272w, 
https://substackcdn.com/image/fetch/$s_!EAau!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0efc2ba8-a33c-450f-8744-8d8051e4cd55_339x93.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Thank you for being here, and I hope you have a wonderful day.</p><p>Dev &lt;3</p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines.</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/earth-scientist-tells-you-how-your?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/earth-scientist-tells-you-how-your?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. 
</strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><h3><strong>Reach out to me</strong></h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>Check out my other articles on Medium:</p><p>https://machine-learning-made-simple.medium.com/</p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. 
Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[How Long Context Inference Is Rewriting the Future of Transformers]]></title><description><![CDATA[A clear guide to the new architectures battling the transformer&#8217;s memory and inference bottlenecks.]]></description><link>https://www.artificialintelligencemadesimple.com/p/how-long-context-inference-is-rewriting</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/how-long-context-inference-is-rewriting</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Sun, 08 Mar 2026 22:21:32 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_TZC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. <strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>Transformer inference today faces a fundamental bottleneck&#8202;&#8212;&#8202;the quadratic cost of attention. This puts a hard economic ceiling on where we can reliably deploy transformers without costs spiraling out of control. Until recently, the industry&#8217;s primary response was brute force&#8202;&#8212;&#8202;more powerful hardware, optimized kernels, and deeper compression. But brute force can&#8217;t outrun math forever.</p><p>Now, a quiet rebellion is underway. Researchers have started looking past incremental kernel optimizations, toward bigger structural changes in how transformers handle memory and attention. Three core strategies have emerged, each attempting to break or sidestep the quadratic tax in fundamentally different ways. 
Each has distinct tradeoffs, unique risks, and different hardware realities. The future of scalable inference, serving millions of users with enormous contexts, hinges on these innovations.</p><p>This article will unpack these emerging strategies through both a technical and an economic lens. Specifically, we will cover:</p><ul><li><p><strong>The Baseline Reality:</strong> Why &#8220;faster attention kernels&#8221; (like FlashAttention-3) are a baseline necessity, but not a fundamental escape route.</p></li><li><p><strong>Transformer-Preserving Escape Routes (KV Redesign):</strong> How approaches like DeepSeek-V2 (Multi-head Latent Attention), Palu, and KIVI keep the attention mechanism but compress, quantize, or evict the KV cache to survive.</p></li><li><p><strong>Attention-Replacing Escape Routes (Linear Time):</strong> How State Space Models (Mamba-2) and Linear Attention (GLA) attempt to compress the entire past into a fixed-size state, eliminating KV growth entirely.</p></li><li><p><strong>The Engineering Reality of Hybrids:</strong> Why the current engineering equilibrium is converging on models like Jamba and RecurrentGemma&#8202;&#8212;&#8202;blending local attention for sharp recall with recurrences for cheap long-term memory.</p></li><li><p><strong>Extreme Context Systems:</strong> How brute-force distributed systems (Ring Attention, Context Parallelism) keep exact attention alive at the million-token scale by shifting the bottleneck from memory to communication.</p></li><li><p><strong>Comparative Deployment Economics:</strong> Hard numerical stress-tests projecting KV sizes, concurrency limits, and theoretical throughput for 1B, 3B, and 70B models running on H100 and A100 GPUs.</p></li></ul><p>The goal here is simple: give you the most complete grounding possible to understand all the major plays in the LLM space, and to ultimately help you predict what&#8217;s coming next. 
Let&#8217;s begin.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KqxY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19ee8b3-8c23-45af-afda-9d88f1c1a75f_1456x1001.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KqxY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19ee8b3-8c23-45af-afda-9d88f1c1a75f_1456x1001.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KqxY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19ee8b3-8c23-45af-afda-9d88f1c1a75f_1456x1001.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KqxY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19ee8b3-8c23-45af-afda-9d88f1c1a75f_1456x1001.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KqxY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19ee8b3-8c23-45af-afda-9d88f1c1a75f_1456x1001.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KqxY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19ee8b3-8c23-45af-afda-9d88f1c1a75f_1456x1001.jpeg" width="1456" height="1001" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f19ee8b3-8c23-45af-afda-9d88f1c1a75f_1456x1001.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1001,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KqxY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19ee8b3-8c23-45af-afda-9d88f1c1a75f_1456x1001.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KqxY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19ee8b3-8c23-45af-afda-9d88f1c1a75f_1456x1001.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KqxY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19ee8b3-8c23-45af-afda-9d88f1c1a75f_1456x1001.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KqxY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19ee8b3-8c23-45af-afda-9d88f1c1a75f_1456x1001.jpeg 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption"><em><a href="https://www.artificialintelligencemadesimple.com/p/how-ai-will-change-in-2026">PS: Diffusion (my main pick for the front runner for the next gen LLMs) will not be covered since that is a completely different generation paradigm and requires a separate article altogether. I&#8217;ll move it up the timeline now that Inception dropped Mercury 2, but for a brief overview of why I predicted Diffusion to take over, read our predictions for this year over here</a>.</em></figcaption></figure></div><h1>Executive Highlights (tl;dr of the article)</h1><p>Transformers are running into a hard deployment wall because long-context inference gets expensive in two different ways: prefill suffers from quadratic compute, and decode suffers from a KV-cache memory problem that crushes batching and concurrency. In practice, decode is often memory-bandwidth bound, not compute-bound; the GPU is spending its life hauling cached tokens around instead of thinking. 
That is why long context wrecks margins. On a 70B model running on an 80GB H100, a 4K context can support roughly 59 concurrent users, but at 128K context that drops to about 1 user. Raw hardware cost jumps from about <strong>$0.34 per million output tokens</strong> at 4K to roughly <strong>$19.84 per million output tokens</strong> at 128K. Same GPU; same model; just a much bigger context window. Congratulations, your SaaS now has the unit economics of a hostage situation.</p><p>The article then walks through the main escape routes. The first is <strong>KV-cache compression</strong>: keep attention, but shrink the memory bill with tricks like MLA, KV quantization, pruning, and paged memory. This is the most practical near-term fix because reducing bytes moved directly helps decode. DeepSeek-style MLA, for example, can slash KV size enough that the same 70B at 128K goes from about <strong>1 user per H100 to around 27</strong>, and hardware cost falls from about <strong>$19.84/M tokens to about $0.73/M</strong>. The second path is <strong>replacing attention entirely</strong> with recurrent or linear-time architectures like Mamba and Linear Attention. These remove KV growth altogether and can make memory stop being the main constraint, but they usually lose sharp token-level retrieval, especially on long contexts where exact recall matters.</p><p>The third path is <strong>hybrids</strong>, which are probably the current engineering sweet spot: use a few attention layers for exact retrieval and cheaper recurrent/compressed layers everywhere else. This pushes the memory wall back without fully sacrificing recall. In the article&#8217;s pricing, a Jamba-style hybrid cuts the 70B 128K case down to roughly <strong>14 users per H100</strong> and around <strong>$1.42/M tokens</strong> in raw hardware cost. Better than vanilla attention; worse than aggressive KV compression; much more realistic than pretending pure recurrent systems have no tradeoffs. 
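</p><p>The per-GPU user counts and $/M figures quoted so far can be roughly reproduced with a few lines of arithmetic. The sketch below is a back-of-envelope model, not the article&#8217;s own calculator. It assumes a Llama-style 70B shape (80 layers, 8 KV heads, head dimension 128, FP16 cache), treats all 80 GB of HBM as available for KV cache (weights ignored), and prices an H100 at $2.50/hour serving 35 tokens per second per user. None of these inputs are stated in this summary; they are illustrative assumptions that happen to land on the quoted numbers, and the function names are hypothetical.</p><pre>
```python
# Back-of-envelope long-context serving cost. All constants are assumptions.
LAYERS = 80                    # assumed: Llama-style 70B depth
KV_HEADS = 8                   # assumed: grouped-query attention
HEAD_DIM = 128                 # assumed
BYTES_PER_VALUE = 2            # assumed: FP16 cache
HBM_BYTES = 80e9               # 80 GB H100; weights and activations ignored
GPU_DOLLARS_PER_HOUR = 2.50    # assumed rental price
TOKENS_PER_SEC_PER_USER = 35   # assumed decode speed per stream


def kv_bytes_per_token():
    # Factor 2 covers one key and one value vector per layer and KV head.
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE


def max_users(context_tokens, kv_reduction=0.0):
    # Number of concurrent sequences whose KV cache fits in HBM.
    per_user = context_tokens * kv_bytes_per_token() * (1.0 - kv_reduction)
    return int(HBM_BYTES // per_user)


def dollars_per_million_tokens(users):
    # Raw hardware cost: hourly GPU price spread over all tokens generated.
    tokens_per_hour = users * TOKENS_PER_SEC_PER_USER * 3600
    return GPU_DOLLARS_PER_HOUR / tokens_per_hour * 1e6


if __name__ == "__main__":
    scenarios = [
        ("vanilla, 4K", max_users(4 * 1024)),
        ("vanilla, 128K", max_users(128 * 1024)),
        ("hybrid (87% less KV), 128K", max_users(128 * 1024, kv_reduction=0.87)),
        ("MLA (user count as quoted), 128K", 27),
    ]
    for name, users in scenarios:
        cost = dollars_per_million_tokens(users)
        print(f"{name}: {users} users, ${cost:.2f}/M tokens")
```
</pre><p>Under these assumptions the script gives 59 and 1 users for vanilla attention at 4K and 128K, and 14 users for a hybrid with the 87% KV reduction, at roughly $0.34, $19.84, and $1.42 per million tokens. The MLA row takes the quoted 27 users as an input rather than deriving it, and prices out to about $0.73.</p><p>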
The fourth path is <strong>distributed exact attention</strong> like Ring Attention and Context Parallelism, where you keep full attention but shard the sequence across GPUs. That preserves quality and enables million-token contexts, but it shifts the bottleneck to network bandwidth, especially during decode. Great if you care about capability more than cost; not great if you enjoy money.</p><p>The article&#8217;s real point is that every post-Transformer design is making the same trade: <strong>what are you willing to sacrifice to stop moving so many bytes?</strong> Standard attention preserves perfect recall but destroys concurrency and margins at long context. Compression methods save memory but add complexity or quality risk. Recurrent and linear models fix memory growth but lose exact retrieval. Hybrids are the best compromise today. Distributed attention keeps quality, but the bill follows you into the network rack.</p><p>This is a long article. If you are very busy, your best bet is to go to the Substack link and use its navigation system to jump to the sections that are most interesting to you:</p><ul><li><p><strong>Section 0</strong>&#8202;&#8212;&#8202;The baseline math: what the quadratic tax actually is (two distinct failures, not one), the KV cache formula, and the hardware roofline that dictates why decode is almost always memory-bound.</p></li><li><p><strong>Section 1</strong>&#8202;&#8212;&#8202;Keeping the Transformer but shrinking the bill: MLA low-rank compression, token eviction (SnapKV), KV quantization (KIVI), and paged memory (vLLM). 
Four orthogonal levers you can stack.</p></li><li><p><strong>Section 2</strong>&#8202;&#8212;&#8202;Mamba and State Space Models: the control-theory approach to killing the KV cache entirely, the FFT cheat code, why selectivity broke the convolution math, and the quantization error compounding problem that keeps Mamba out of production.</p></li><li><p><strong>Section 3</strong>&#8202;&#8212;&#8202;Linear Attention: the algebraic parentheses trick, the 200x concurrency math, and why it always looks clean on benchmarks but never ships at the frontier (feature collision destroys exact retrieval).</p></li><li><p><strong>Section 4</strong>&#8202;&#8212;&#8202;Hybrid Transformers (Jamba, RecurrentGemma): the portfolio allocation approach, the residual stream rescue mechanism, the 87% KV cache reduction, and the three friction points (kernel switching overhead; serving stack rewrites; and a wall that doesn&#8217;t disappear, it rotates).</p></li><li><p><strong>Section 5</strong>&#8202;&#8212;&#8202;Distributed exact attention (Ring Attention) and StreamingLLM: brute-forcing perfect recall across GPUs vs. amputating the middle and keeping the patient alive.</p></li><li><p><strong>Section 6</strong>&#8202;&#8212;&#8202;The full deployment stress-test: KV sizes, concurrency ceilings, and $/M output tokens for 1B, 3B, and 70B models across 4K to 1M context on H100s. 
Then every escape route re-priced against the worst-case scenario.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_TZC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_TZC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png 424w, https://substackcdn.com/image/fetch/$s_!_TZC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png 848w, https://substackcdn.com/image/fetch/$s_!_TZC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png 1272w, https://substackcdn.com/image/fetch/$s_!_TZC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_TZC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png" width="1456" height="4106" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:4106,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_TZC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png 424w, https://substackcdn.com/image/fetch/$s_!_TZC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png 848w, https://substackcdn.com/image/fetch/$s_!_TZC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png 1272w, https://substackcdn.com/image/fetch/$s_!_TZC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F593086c7-db1e-44a4-bf27-000ee701ff1d_1600x4512.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p><em>I put a lot of work into writing this newsletter. To do so, I rely on you for support. If a few more people choose to become paid subscribers, the Chocolate Milk Cult can continue to provide high-quality and accessible education and opportunities to anyone who needs it. If you think this mission is worth contributing to, please consider a premium subscription. 
You can do so for less than the cost of a Netflix Subscription <a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">(pay what you want here)</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZY19!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6efd574-8db8-44d7-8f5f-7ab786900a40_848x193.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZY19!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6efd574-8db8-44d7-8f5f-7ab786900a40_848x193.png 424w, https://substackcdn.com/image/fetch/$s_!ZY19!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6efd574-8db8-44d7-8f5f-7ab786900a40_848x193.png 848w, https://substackcdn.com/image/fetch/$s_!ZY19!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6efd574-8db8-44d7-8f5f-7ab786900a40_848x193.png 1272w, https://substackcdn.com/image/fetch/$s_!ZY19!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6efd574-8db8-44d7-8f5f-7ab786900a40_848x193.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ZY19!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6efd574-8db8-44d7-8f5f-7ab786900a40_848x193.png" width="848" height="193" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d6efd574-8db8-44d7-8f5f-7ab786900a40_848x193.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:193,&quot;width&quot;:848,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZY19!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6efd574-8db8-44d7-8f5f-7ab786900a40_848x193.png 424w, https://substackcdn.com/image/fetch/$s_!ZY19!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6efd574-8db8-44d7-8f5f-7ab786900a40_848x193.png 848w, https://substackcdn.com/image/fetch/$s_!ZY19!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6efd574-8db8-44d7-8f5f-7ab786900a40_848x193.png 1272w, https://substackcdn.com/image/fetch/$s_!ZY19!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6efd574-8db8-44d7-8f5f-7ab786900a40_848x193.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><em>I provide various consulting and advisory services. 
If you&#8217;d like to explore how we can work together, <a href="https://linktr.ee/iseethings404">reach out to me through any of my socials over here</a> or reply to this email.</em></p><h1>Section 0: Required Background on Costs of AI</h1><p>Before we analyze how to escape the quadratic tax, we need to define exactly what the tax is, how it is collected, and the physical limits of the hardware paying it.</p><p>We&#8217;re going to throw around a lot of numbers and claims here. If you want to understand where they come from, make sure you read our primer, <a href="https://www.artificialintelligencemadesimple.com/p/the-real-cost-of-running-ai">&#8220;The Real Cost of Running AI&#8221;</a>, where we derived the costs of running AI from scratch.</p><h4><strong>What &#8220;Breaking the Quadratic Tax&#8221; Actually Means</strong></h4><p>The phrase &#8220;quadratic tax&#8221; gets thrown around casually to describe why long-context AI is hard. But it is not a single bottleneck. It is two distinct failures occurring in two different phases of inference:</p><ol><li><p><strong>The Prefill Compute Tax: </strong>When you hand a prompt to a Transformer, global self-attention forces every token to look at every other token. This creates an irreducible O(n&#178;) computational cost. 
If you double the prompt length, the math operations quadruple.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lYhh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc459dd6a-3a0e-4227-a813-5e6467759e78_1600x1000.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lYhh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc459dd6a-3a0e-4227-a813-5e6467759e78_1600x1000.png 424w, https://substackcdn.com/image/fetch/$s_!lYhh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc459dd6a-3a0e-4227-a813-5e6467759e78_1600x1000.png 848w, https://substackcdn.com/image/fetch/$s_!lYhh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc459dd6a-3a0e-4227-a813-5e6467759e78_1600x1000.png 1272w, https://substackcdn.com/image/fetch/$s_!lYhh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc459dd6a-3a0e-4227-a813-5e6467759e78_1600x1000.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lYhh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc459dd6a-3a0e-4227-a813-5e6467759e78_1600x1000.png" width="1456" height="910" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c459dd6a-3a0e-4227-a813-5e6467759e78_1600x1000.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:910,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lYhh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc459dd6a-3a0e-4227-a813-5e6467759e78_1600x1000.png 424w, https://substackcdn.com/image/fetch/$s_!lYhh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc459dd6a-3a0e-4227-a813-5e6467759e78_1600x1000.png 848w, https://substackcdn.com/image/fetch/$s_!lYhh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc459dd6a-3a0e-4227-a813-5e6467759e78_1600x1000.png 1272w, https://substackcdn.com/image/fetch/$s_!lYhh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc459dd6a-3a0e-4227-a813-5e6467759e78_1600x1000.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">Notice how quickly attention becomes the dominant part of the cost (since context length grows much faster than model dimension).</figcaption></figure></div><ol start="2"><li><p><strong>The Decode Bandwidth Tax: </strong>Once the prompt is processed, the model generates new tokens one by one. To avoid recomputing the entire past, the model caches the Key and Value (KV) vectors for every token. But here is the catch: to generate token n+1, the GPU must read the entire KV cache for tokens 1 through n from High Bandwidth Memory (HBM) into the chip&#8217;s processing cores.</p></li></ol><p>At a batch size of 1 (interactive latency), generation is almost entirely memory-bandwidth bound. You are not limited by how fast your GPU can multiply numbers. 
You are limited by how fast it can physically haul the KV cache from HBM to the compute cores.</p><p>This is where we hit a common misunderstanding in the current ecosystem.</p><p>FlashAttention and its successors (like FlashAttention-3) are brilliant I/O-aware algorithms. They greatly reduce reads and writes to HBM by tiling the attention computation in on-chip SRAM. <strong>But they do not change the underlying operation count, and they do not stop the KV cache from growing.</strong></p><p>If all tokens attend globally, the cache grows. When the cache grows, it eats the memory you need for batching. When you cannot batch requests, your economics collapse. This is a core mathematical reality that FlashAttention doesn&#8217;t help with.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Tpwy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F899b8e3e-e704-41f9-8023-809e601c5161_2400x1440.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Tpwy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F899b8e3e-e704-41f9-8023-809e601c5161_2400x1440.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Tpwy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F899b8e3e-e704-41f9-8023-809e601c5161_2400x1440.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Tpwy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F899b8e3e-e704-41f9-8023-809e601c5161_2400x1440.jpeg 1272w,
https://substackcdn.com/image/fetch/$s_!Tpwy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F899b8e3e-e704-41f9-8023-809e601c5161_2400x1440.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Tpwy!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F899b8e3e-e704-41f9-8023-809e601c5161_2400x1440.jpeg" width="1200" height="720.3296703296703" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/899b8e3e-e704-41f9-8023-809e601c5161_2400x1440.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:874,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Tpwy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F899b8e3e-e704-41f9-8023-809e601c5161_2400x1440.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Tpwy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F899b8e3e-e704-41f9-8023-809e601c5161_2400x1440.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Tpwy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F899b8e3e-e704-41f9-8023-809e601c5161_2400x1440.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!Tpwy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F899b8e3e-e704-41f9-8023-809e601c5161_2400x1440.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>The Mathematical Baseline</strong></p><p>To evaluate the escape routes objectively, we need a shared specification.
We will use the standard Transformer math.</p><p>Here are the terms that dictate the cost of serving:</p><ul><li><p>n: context length (tokens)</p></li><li><p>d: model width</p></li><li><p>L: total layers</p></li><li><p>L_attn: number of attention layers (equal to L in a standard Transformer; some hybrid models use fewer)</p></li><li><p>h: number of query heads</p></li><li><p>g: number of KV heads (g is smaller than h in Grouped Query Attention)</p></li><li><p>d_k: per-head key/value dimension (often d divided by h)</p></li><li><p>B_kv: bytes per KV element (2 bytes for FP16, 1 byte for INT8)</p></li></ul><p>The forward-pass compute for a standard Transformer layer (dense attention plus dense MLP) is roughly <strong>24nd&#178; + 4n&#178;d</strong> FLOPs.</p><p>That 4n&#178;d part is the global all-pairs term. That is the prefill tax.</p><p>But the true dictator of scale is the total size of the KV cache at length n, in bytes: <strong>2 * L_attn * g * d_k * n * B_kv</strong>, where the leading 2 counts Keys and Values. For the Llama 2 70B configuration (L_attn = 80, g = 8, d_k = 128) in FP16, that is 2 * 80 * 8 * 128 * 2 = 327,680 bytes per token, or roughly 43 GB (40 GiB) for a single 128K-token (131,072-token) sequence.</p><p>This creates a brutal, non-negotiable reality. For every single token you add to the sequence, you pay a fixed tax of 2 * L_attn * g * d_k * B_kv bytes. If a new architecture does not shrink the number of attention layers (L_attn), the number of KV heads (g), the dimension size (d_k), or the byte size (B_kv), or remove the dependence on context length (n) entirely, it has not solved the memory wall.</p><h4><strong>The Hardware Roofline: Understanding Where Things Break</strong></h4><p>To understand why decode is so painful, we have to look at the hardware&#8217;s &#8220;roofline&#8221; limit.
A GPU&#8217;s performance is capped by either its peak compute (TFLOPS) or its memory bandwidth (TB/s).</p><p>The deciding metric is Arithmetic Intensity: the ratio of math operations performed to bytes loaded from memory (FLOPs divided by Bytes).</p><p><a href="https://www.colfax-intl.com/nvidia/nvidia-h100?utm_source=chatgpt.com">Consider the official specs of an NVIDIA H100</a>:</p><ul><li><p>Memory Bandwidth: 3.35 TB/s</p></li><li><p>FP16 Tensor Core Compute: 1,979 TFLOPS (the datasheet figure with structured sparsity; dense FP16 is about 989 TFLOPS)</p></li></ul><p>To hit maximum compute efficiency, the H100 requires an Arithmetic Intensity of roughly 591 FLOPs per byte (1,979 divided by 3.35), or about 295 FLOPs per byte against the dense figure. If your algorithm does fewer math operations than that for every byte it pulls from memory, the processing cores will sit idle, waiting for data.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZAyM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555f70b1-4256-46ce-bead-93f120bca12d_2400x1800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZAyM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555f70b1-4256-46ce-bead-93f120bca12d_2400x1800.png 424w, https://substackcdn.com/image/fetch/$s_!ZAyM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555f70b1-4256-46ce-bead-93f120bca12d_2400x1800.png 848w, https://substackcdn.com/image/fetch/$s_!ZAyM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555f70b1-4256-46ce-bead-93f120bca12d_2400x1800.png 1272w,
https://substackcdn.com/image/fetch/$s_!ZAyM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555f70b1-4256-46ce-bead-93f120bca12d_2400x1800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZAyM!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555f70b1-4256-46ce-bead-93f120bca12d_2400x1800.png" width="1200" height="900" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/555f70b1-4256-46ce-bead-93f120bca12d_2400x1800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZAyM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555f70b1-4256-46ce-bead-93f120bca12d_2400x1800.png 424w, https://substackcdn.com/image/fetch/$s_!ZAyM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555f70b1-4256-46ce-bead-93f120bca12d_2400x1800.png 848w, https://substackcdn.com/image/fetch/$s_!ZAyM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555f70b1-4256-46ce-bead-93f120bca12d_2400x1800.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ZAyM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555f70b1-4256-46ce-bead-93f120bca12d_2400x1800.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In the decode phase, the model loads the entire massive KV cache just to perform a tiny matrix-vector multiplication for a single token. Each cached FP16 value (2 bytes) feeds a single multiply-add (2 FLOPs) per query head that reads it, so the Arithmetic Intensity is on the order of one FLOP per byte, or a few with grouped-query attention. That is far below the hundreds of FLOPs per byte the H100 needs to stay busy.</p><p>This is why long-context decode is memory-bound on essentially every modern GPU. GPUs have evolved to possess massive compute relative to their bandwidth.
An architecture that saves FLOPs but moves the same number of bytes does nothing for decode. To speed up generation, you must move fewer bytes.</p><h4><strong>Experimental Constraints</strong></h4><p>Theoretical elegance is nice, but deployment is a physical game of fit. Throughout this analysis, we will stress-test these theoretical escape routes against realistic conditions.</p><p>We will treat quantization as a baseline lever, separated into two buckets:</p><ol><li><p>Weight Quantization (INT8 or INT4): Shrinks the static footprint of the model, leaving more room for the KV cache.</p></li><li><p>KV Quantization (INT8 down to 2-bit): Directly attacks the per-token memory tax. Extremely impactful at long context, though it risks degrading recall.</p></li></ol><p>Our evaluation constraints:</p><ul><li><p>Target Hardware: NVIDIA H100 80GB and A100 80GB.</p></li><li><p>Memory Budget: 80GB total, minus a strict 6GB overhead for runtime, allocators, and activations, which leaves 74GB for weights and KV cache.</p></li><li><p>Representative Models: 1B, 3B, and 70B parameter proxies.</p></li><li><p>Context Windows: 4K, 32K, 128K, and the extreme 1M-token boundary.</p></li></ul><h4>What this article will cover</h4><p>Putting all this together, three things follow:</p><ul><li><p>Faster kernels buy you runway&#8202;&#8212;&#8202;they don&#8217;t lift you off.</p></li><li><p>Optimizing compute without solving memory simply delays hitting the wall&#8202;&#8212;&#8202;it doesn&#8217;t remove it.</p></li><li><p>Eventually, serving long-context inference reliably and cheaply requires deeper structural innovation, not incremental kernel tweaks.</p></li></ul><p>This sets up the critical question this article tackles: <strong>How do we fundamentally break or sidestep the quadratic reality?</strong></p><p>Next, we&#8217;ll cover exactly how researchers are answering this challenge.</p><h1><strong>Section 1: </strong>How to Keep the Transformer but Shrink the Memory Bill (KV Cache Compression)</h1><p>The attention
mechanism in a standard Transformer is incredibly good at what it does: fetching highly specific information from anywhere in the prompt. The problem isn&#8217;t the attention operation itself; the problem is the storage bill it racks up.</p><p>Because of this, the first and most &#8220;production-friendly&#8221; family of escape routes shares a common philosophy: <strong>Do not replace attention. Just change what gets cached, how it is stored, or which parts are retained.</strong></p><p>After all, sometimes even when you <em>know</em> the foundation is toxic, and your latency issues will never truly be resolved, the voices in your head remind you that taking a &#8216;leap of faith&#8217; into a completely new architecture usually just ends with you breaking production on a Friday. In such cases, it&#8217;s best to listen to the voices. They know you aren&#8217;t a 10x pioneer; you&#8217;re just an idiot with a GitHub account, a Claude Code subagent circlejerk, and a rapidly depleting runway. So you stay. You don&#8217;t fix the rot; you find ways to deal with it.</p><p>If we look back at our KV cache formula (Total Cache = 2 * L_attn * g * d_k * n * B_kv), we can partition the redesign strategies into four orthogonal levers. You can stack these levers to get massive efficiency gains without completely throwing out the Transformer architecture.</p><h4><strong>1. Shrinking the Hidden Dimension: How Low-Rank Compression Saves Memory. </strong><em>Examples: DeepSeek-V2 (MLA)</em></h4><p>Instead of storing massive Key and Value tensors for every token, what if we just store a highly compressed &#8220;summary&#8221; vector?</p><p>This is the exact mechanism behind <a href="https://arxiv.org/abs/2405.04434">DeepSeek-V2&#8217;s Multi-head Latent Attention (MLA)</a>. In standard attention, you cache the Keys and Values. In MLA, you project the token&#8217;s information down into a much smaller latent vector. 
During the decode phase, the GPU only appends this tiny vector to the cache. When it needs to calculate attention, it rapidly &#8220;up-projects&#8221; or reconstructs the Keys and Values on the fly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ekDu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F305fd5f5-9a2f-4cc9-8584-34ec4973a0e1_1600x686.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ekDu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F305fd5f5-9a2f-4cc9-8584-34ec4973a0e1_1600x686.png 424w, https://substackcdn.com/image/fetch/$s_!ekDu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F305fd5f5-9a2f-4cc9-8584-34ec4973a0e1_1600x686.png 848w, https://substackcdn.com/image/fetch/$s_!ekDu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F305fd5f5-9a2f-4cc9-8584-34ec4973a0e1_1600x686.png 1272w, https://substackcdn.com/image/fetch/$s_!ekDu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F305fd5f5-9a2f-4cc9-8584-34ec4973a0e1_1600x686.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ekDu!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F305fd5f5-9a2f-4cc9-8584-34ec4973a0e1_1600x686.png" width="1200" height="514.2857142857143" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/305fd5f5-9a2f-4cc9-8584-34ec4973a0e1_1600x686.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:624,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ekDu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F305fd5f5-9a2f-4cc9-8584-34ec4973a0e1_1600x686.png 424w, https://substackcdn.com/image/fetch/$s_!ekDu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F305fd5f5-9a2f-4cc9-8584-34ec4973a0e1_1600x686.png 848w, https://substackcdn.com/image/fetch/$s_!ekDu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F305fd5f5-9a2f-4cc9-8584-34ec4973a0e1_1600x686.png 1272w, https://substackcdn.com/image/fetch/$s_!ekDu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F305fd5f5-9a2f-4cc9-8584-34ec4973a0e1_1600x686.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://oilbeater.com/en/2025/04/14/deepseek-mla/">The original MHA needs to cache the full matrix, while MLA only caches the compressed vector and reconstructs the full matrix when needed.</a></figcaption></figure></div><p>By replacing the large 2 * g * d_k term with a much smaller compressed dimension d_c, DeepSeek reported a staggering 93.3% reduction in KV cache size&#8202;&#8212;&#8202;&#8220;<em>Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times.</em>&#8221;</p><p>Unfortunately, this is not a free lunch. You are trading memory for compute. Reconstructing the keys and values requires an extra matrix multiplication. Furthermore, low-rank compression breaks traditional position embeddings like RoPE (Rotary Position Embedding).
RoPE rotates each key by a position-dependent matrix, so the key up-projection can no longer be folded into the query projection, and rotated keys are also harder to compress to low rank without degrading accuracy. DeepSeek-V2 works around this with a &#8220;decoupled&#8221; RoPE strategy: a small extra set of query and key dimensions carries position, while the compressed latent stays position-free.</p><h4><strong>2. Evicting Useless Tokens: How Pruning the Context Saves Memory</strong><br><em>Examples: SnapKV, H2O, Expected Attention</em></h4><p>If you have a 100K token prompt, do you really need to remember the exact wording of a generic conjunction in paragraph 4? Probably not. Token eviction treats the KV cache as a dynamic optimization problem: out of all candidate tokens, we only want to keep a small subset of size m that minimizes the error in the final output.</p><p><a href="https://arxiv.org/abs/2404.14469">SnapKV, for instance, relies on the empirical observation that specific attention heads tend to focus on specific, predictable patterns. By observing a &#8220;window&#8221; at the end of the prompt, it guesses which tokens matter and prunes the rest</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vbU6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc821f850-2d6b-4cda-842b-0ede30cd01c5_2282x752.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vbU6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc821f850-2d6b-4cda-842b-0ede30cd01c5_2282x752.png 424w, https://substackcdn.com/image/fetch/$s_!vbU6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc821f850-2d6b-4cda-842b-0ede30cd01c5_2282x752.png 848w,
https://substackcdn.com/image/fetch/$s_!vbU6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc821f850-2d6b-4cda-842b-0ede30cd01c5_2282x752.png 1272w, https://substackcdn.com/image/fetch/$s_!vbU6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc821f850-2d6b-4cda-842b-0ede30cd01c5_2282x752.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vbU6!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc821f850-2d6b-4cda-842b-0ede30cd01c5_2282x752.png" width="1200" height="395.6043956043956" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c821f850-2d6b-4cda-842b-0ede30cd01c5_2282x752.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:480,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vbU6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc821f850-2d6b-4cda-842b-0ede30cd01c5_2282x752.png 424w, https://substackcdn.com/image/fetch/$s_!vbU6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc821f850-2d6b-4cda-842b-0ede30cd01c5_2282x752.png 848w, 
https://substackcdn.com/image/fetch/$s_!vbU6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc821f850-2d6b-4cda-842b-0ede30cd01c5_2282x752.png 1272w, https://substackcdn.com/image/fetch/$s_!vbU6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc821f850-2d6b-4cda-842b-0ede30cd01c5_2282x752.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The graph shows the simplified workflow of SnapKV, where the orange area represents the cluster of features per head
selected by SnapKV. These features are then used to form new Key-Value pairs concatenated with the features in the observation window. Together, the selected prefix and observation windows constitute the new KV cache utilized for the generation.</figcaption></figure></div><p>As you might guess, eviction is incredibly difficult to do perfectly. Why? Because you don&#8217;t know what the user is going to ask in the future. You might prune a token that seems irrelevant during the prefill phase, only to realize you needed it 500 tokens into the generation phase. <strong>Furthermore, modern fast-attention kernels (like FlashAttention) don&#8217;t actually materialize the full attention matrix in memory, making it structurally difficult to see which tokens were historically &#8220;important&#8221; without adding expensive, custom operations.</strong></p><h4><strong>3. Using Fewer Bits: How KV Quantization Saves Memory</strong><br><em>Examples: KIVI</em></h4><p>If you can&#8217;t reduce the number of tokens or the size of the vectors, just use fewer bits to represent them. Most models keep the cache in FP16 or BF16 (2 bytes per number). We can quantize this down to INT8 (1 byte) or even 2-bit formats.</p><p>The breakthrough in recent papers like <a href="https://arxiv.org/abs/2402.02750">KIVI is the realization that the Key cache and the Value cache behave differently.</a> KIVI found that the Key cache has extreme outliers across specific channels, while the Value cache varies mostly token-by-token. By quantizing the two caches along different axes (Keys per-channel, Values per-token), KIVI reported some jaw-dropping numbers: &#8220;<em>With hardware-friendly implementation, KIVI can enable Llama, Falcon, and Mistral models to maintain almost the same quality while using 2.6&#215; less peak memory (including model weight).
This reduction in memory usage enables up to 4&#215; larger batch size, bringing 2.35&#215; &#8764; 3.47&#215; throughput on real LLM inference workload&#8221;</em></p><h4><strong>4. Eliminating Waste: How Paged Memory Management Increases Concurrency. </strong><em>Examples: PagedAttention (vLLM)</em></h4><p>Sometimes the problem isn&#8217;t the math; it&#8217;s the memory allocator. Historically, serving engines allocated contiguous chunks of memory for the maximum possible sequence length of a request. If a request ended early, that memory sat empty, leading to massive fragmentation waste.</p><p>PagedAttention brought the concept of operating system virtual memory to LLMs. It stores KV blocks in non-contiguous, fixed-size pages. This virtually eliminates fragmentation and allows different requests to share the same cached prefixes (like system prompts), drastically increasing the number of users you can serve concurrently on the same GPU.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hGz4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80dbaaf5-b4f3-41a9-9968-7c49f7820ae0_1450x812.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hGz4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80dbaaf5-b4f3-41a9-9968-7c49f7820ae0_1450x812.png 424w, https://substackcdn.com/image/fetch/$s_!hGz4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80dbaaf5-b4f3-41a9-9968-7c49f7820ae0_1450x812.png 848w, 
https://substackcdn.com/image/fetch/$s_!hGz4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80dbaaf5-b4f3-41a9-9968-7c49f7820ae0_1450x812.png 1272w, https://substackcdn.com/image/fetch/$s_!hGz4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80dbaaf5-b4f3-41a9-9968-7c49f7820ae0_1450x812.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hGz4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80dbaaf5-b4f3-41a9-9968-7c49f7820ae0_1450x812.png" width="1450" height="812" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/80dbaaf5-b4f3-41a9-9968-7c49f7820ae0_1450x812.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:812,&quot;width&quot;:1450,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hGz4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80dbaaf5-b4f3-41a9-9968-7c49f7820ae0_1450x812.png 424w, https://substackcdn.com/image/fetch/$s_!hGz4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80dbaaf5-b4f3-41a9-9968-7c49f7820ae0_1450x812.png 848w, 
https://substackcdn.com/image/fetch/$s_!hGz4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80dbaaf5-b4f3-41a9-9968-7c49f7820ae0_1450x812.png 1272w, https://substackcdn.com/image/fetch/$s_!hGz4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80dbaaf5-b4f3-41a9-9968-7c49f7820ae0_1450x812.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Average percentage of memory wastes in different LLM serving systems.</figcaption></figure></div><p>This is likely my 
favorite technique since it&#8217;s basically the digital slumlord model of memory management: pack the contexts into non-contiguous studio apartments, charge premium API rates, and just pray your users don&#8217;t all trigger a 32K context generation at the exact same time. And as they say, dress for the job you want.</p><h4><strong>Summary: The Hardware Tradeoffs of KV Compression</strong></h4><p>How do these methods map to our hardware reality?</p><p>They directly attack the &#8220;Bytes&#8221; side of the Arithmetic Intensity equation. Because decoding is so aggressively memory-bound (especially at low batch sizes), taking on a little bit of extra math (like MLA&#8217;s reconstruction steps or eviction&#8217;s scoring logic) to drastically reduce the amount of data pulled from HBM is almost always a winning trade.</p><p>But as context windows stretch toward 1 million tokens, even a compressed cache eventually hits a wall. To truly stop memory from growing with n (the context length), we have to look at architectures that rip the attention mechanism out entirely.</p><p>And this is where things get a bit funky.</p><h1>Section 2: Attention-Replacing Escape Routes (Deleting the Context Length)</h1><p>Compressing the KV cache is a great short-term survival strategy. But fundamentally, you are still playing a losing game. As long as your memory grows with the context length (n), you will eventually hit a wall where your batch size drops to zero and your economics collapse.</p><p>To truly escape both that linear memory growth and the quadratic compute tax, we have to look at architectures that rip standard attention out of the model entirely.</p><p>The goal of these &#8220;attention-replacing&#8221; escape routes is simple: achieve O(1) memory during generation. 
This means whether you are on token 100 or token 1,000,000, the amount of memory required to store the past stays exactly the same.</p><p>To do this, you have to stop storing a list of every token you&#8217;ve ever seen, and start compressing the entire past into a fixed-size mathematical box.</p><p>Let&#8217;s unpack the most prominent attempt to do this: State Space Models (SSMs) and Mamba.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zjwm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca86b0c5-e6b4-4b16-9edf-fb3c14725cb2_2204x674.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zjwm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca86b0c5-e6b4-4b16-9edf-fb3c14725cb2_2204x674.png 424w, https://substackcdn.com/image/fetch/$s_!zjwm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca86b0c5-e6b4-4b16-9edf-fb3c14725cb2_2204x674.png 848w, https://substackcdn.com/image/fetch/$s_!zjwm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca86b0c5-e6b4-4b16-9edf-fb3c14725cb2_2204x674.png 1272w, https://substackcdn.com/image/fetch/$s_!zjwm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca86b0c5-e6b4-4b16-9edf-fb3c14725cb2_2204x674.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zjwm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca86b0c5-e6b4-4b16-9edf-fb3c14725cb2_2204x674.png" width="1456" 
height="445" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ca86b0c5-e6b4-4b16-9edf-fb3c14725cb2_2204x674.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:445,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zjwm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca86b0c5-e6b4-4b16-9edf-fb3c14725cb2_2204x674.png 424w, https://substackcdn.com/image/fetch/$s_!zjwm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca86b0c5-e6b4-4b16-9edf-fb3c14725cb2_2204x674.png 848w, https://substackcdn.com/image/fetch/$s_!zjwm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca86b0c5-e6b4-4b16-9edf-fb3c14725cb2_2204x674.png 1272w, https://substackcdn.com/image/fetch/$s_!zjwm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca86b0c5-e6b4-4b16-9edf-fb3c14725cb2_2204x674.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">We&#8217;ll explain each step here.</figcaption></figure></div><h4><strong>Why Continuous Time? The Intuition Behind State Space Models</strong></h4><p>If we want to compress the past into a box, we need a mathematical way to describe how that box should change when new information hits it.</p><p>Think about how you track the temperature of a room. You don&#8217;t memorize every single temperature reading from the last 10 years (which is what a Transformer does with the KV cache). 
You just have a current temperature (the state), and when the AC turns on (the input), the temperature changes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CjD5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29521fa7-6b19-40f3-ae2f-b22a74127b43_1000x1000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CjD5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29521fa7-6b19-40f3-ae2f-b22a74127b43_1000x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!CjD5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29521fa7-6b19-40f3-ae2f-b22a74127b43_1000x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!CjD5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29521fa7-6b19-40f3-ae2f-b22a74127b43_1000x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!CjD5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29521fa7-6b19-40f3-ae2f-b22a74127b43_1000x1000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CjD5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29521fa7-6b19-40f3-ae2f-b22a74127b43_1000x1000.jpeg" width="1000" height="1000" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/29521fa7-6b19-40f3-ae2f-b22a74127b43_1000x1000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1000,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CjD5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29521fa7-6b19-40f3-ae2f-b22a74127b43_1000x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!CjD5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29521fa7-6b19-40f3-ae2f-b22a74127b43_1000x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!CjD5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29521fa7-6b19-40f3-ae2f-b22a74127b43_1000x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!CjD5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29521fa7-6b19-40f3-ae2f-b22a74127b43_1000x1000.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/ai-x-computing-chips-how-to-use-artificial?utm_source=publication-search">Your obligatory reminder that a weird number of problems in AI can be approximated by Markovian Processes, such as when we covered how Google uses AI to make better Chips.</a></figcaption></figure></div><p>Control theory spent a century figuring out how to track changing physical systems like this, whether it&#8217;s an airplane on radar, a thermostat, or the runway of an AI wrapper whose API costs are higher than its actual revenue. 
Early SSM papers (like S4) realized that if we treat a sequence of tokens not as a list of discrete words, but as a continuous flowing signal, we can borrow all this proven math to model change with differential equations.</p><p>To figure out how, let&#8217;s look at the exact behavior we want to enforce:</p><ul><li><p>We have a box that holds our compressed memory: let&#8217;s call it x(t).</p></li><li><p>We have a new piece of information arriving: let&#8217;s call it u(t).</p></li><li><p>We need to know how the box changes over time: dx/dt.</p></li></ul><p>To calculate that change, we need two forces pulling on the box.</p><ol><li><p><strong>The Decay Force:</strong> How much of the old memory should survive, and how much should fade away? We multiply the current state x(t) by a learned matrix A.</p></li><li><p><strong>The Input Force:</strong> How much should this brand-new token alter the state? We multiply the new input u(t) by a learned matrix B.</p></li></ol><p>Put them together, and you get the core engine of an SSM:<br>dx/dt = A * x(t) + B * u(t)</p><p>The matrix A is the absolute dictator of this system. It controls the memory timescales. If the numbers in A are set correctly (roughly, its eigenvalues have negative real parts), the system is stable: it slowly forgets useless old information while safely absorbing new inputs. Once the state is updated, we just multiply it by a third learned matrix, C, to pull our final answer out of the box: y(t) = C * x(t).</p><h4><strong>Discretization: Turning the Ramp into Stairs</strong></h4><p>This continuous math is beautiful for tracking smooth audio waves. But language isn&#8217;t a smooth wave. It arrives in discrete, choppy chunks: Word 1, Word 2, Word 3.</p><p>To reconcile this, we have to &#8220;discretize&#8221; the math. 
We introduce a step size (Delta) to convert our continuous matrices A and B into discrete step-by-step matrices, A_bar and B_bar (under the common zero-order-hold rule, for example, A_bar = exp(Delta * A)).</p><p>Now, the math becomes a simple recurrent loop: New State = (A_bar * Old State) + (B_bar * New Token)</p><p>Look at the economic consequences of this equation. Because the <em>New State</em> depends only on the <em>Old State</em> and the <em>New Token</em>, the moment the math is done, we completely throw the <em>New Token</em> away. We do not cache its Key. We do not cache its Value. The KV cache drops to exactly zero, and the only thing we carry forward is the fixed-size state.</p><h4><strong>The Convolutional Cheat Code: The Exact Math of Bypassing O(n&#178;)</strong></h4><p>To understand how State Space Models (SSMs) eliminate the prefill tax, we have to look at the exact algebra of the recurrent loop.</p><p>Let&#8217;s assume our starting state is zero (x_0 = 0). Here is the discrete update rule for the hidden state (x) and the output (y) at each step:<br>x_t = (A_bar * x_{t-1}) + (B_bar * u_t)</p><p>y_t = C * x_t</p><p>If we unroll this step-by-step for the first three tokens, substituting the previous state into the current one, the algebra looks like this:</p><ul><li><p>x_1 = B_bar * u_1</p></li><li><p>x_2 = (A_bar * B_bar * u_1) + (B_bar * u_2)</p></li><li><p>x_3 = (A_bar&#178; * B_bar * u_1) + (A_bar * B_bar * u_2) + (B_bar * u_3)</p></li></ul><p><strong>Notice what is happening to the input tokens (u). The older the token, the more times it gets multiplied by the decay matrix A_bar.</strong></p><p>Because we want the final output y, we multiply these states by the output matrix C. This allows us to define a single, massive <strong>Convolution Kernel (K)</strong>. 
Entry <em>k</em> of this kernel (counting from zero) is the exact mathematical multiplier for how much a token from <em>k</em> steps ago affects the output today:</p><p><strong>K = [C * B_bar, C * A_bar * B_bar, C * A_bar&#178; * B_bar, &#8230;, C * A_bar^(n-1) * B_bar]</strong></p><p><strong>Because A_bar, B_bar, and C are fixed matrices (Time-Invariant), we can pre-compute this entire list of multipliers before the model even looks at the prompt.</strong></p><p>Once we have K, the output vector y for the entire prompt is simply the mathematical convolution of the input sequence u and the kernel K:<br><strong>y = K * u</strong><br><em>(In this one equation, * denotes convolution rather than ordinary multiplication: output t is the sum K_0 u_t + K_1 u_{t-1} + &#8230; + K_{t-1} u_1, where K_k = C * A_bar^k * B_bar.)</em></p><h4><strong>The O(n&#178;) Problem with Naive Convolution</strong></h4><p>We have eliminated the step-by-step recurrent loop, but we have not solved our compute problem yet.</p><p>The standard mathematical definition of discrete convolution requires computing the sum of the products for every overlapping point. To compute the output at token 100, you multiply the first 100 elements of u by the first 100 elements of K (in reverse). To do this for every token from 1 to <em>n</em>, the number of multiplications scales as 1 + 2 + 3 + &#8230; + n.</p><p>That arithmetic series sums to (n&#178; + n) / 2.</p><p>And just like that, we are right back where we started: an O(n&#178;) prefill tax. You really can&#8217;t get anything to work, huh? 
Now is a good time to seriously consider that the universe hates you and wants you to pay Daddy Jensen more money.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6Bjz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3028e9a7-4110-4c69-95b0-a78fc276d995_1400x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6Bjz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3028e9a7-4110-4c69-95b0-a78fc276d995_1400x800.png 424w, https://substackcdn.com/image/fetch/$s_!6Bjz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3028e9a7-4110-4c69-95b0-a78fc276d995_1400x800.png 848w, https://substackcdn.com/image/fetch/$s_!6Bjz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3028e9a7-4110-4c69-95b0-a78fc276d995_1400x800.png 1272w, https://substackcdn.com/image/fetch/$s_!6Bjz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3028e9a7-4110-4c69-95b0-a78fc276d995_1400x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6Bjz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3028e9a7-4110-4c69-95b0-a78fc276d995_1400x800.png" width="1400" height="800" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3028e9a7-4110-4c69-95b0-a78fc276d995_1400x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6Bjz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3028e9a7-4110-4c69-95b0-a78fc276d995_1400x800.png 424w, https://substackcdn.com/image/fetch/$s_!6Bjz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3028e9a7-4110-4c69-95b0-a78fc276d995_1400x800.png 848w, https://substackcdn.com/image/fetch/$s_!6Bjz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3028e9a7-4110-4c69-95b0-a78fc276d995_1400x800.png 1272w, https://substackcdn.com/image/fetch/$s_!6Bjz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3028e9a7-4110-4c69-95b0-a78fc276d995_1400x800.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Don&#8217;t give up yet, though. Lucky for you, we can ass pull a mathematical loophole that makes SSMs viable: the <strong>Convolution Theorem</strong>.</p><h4><strong>The Convolution Theorem and the FFT</strong></h4><p>The theorem states a fundamental property of Fourier analysis: a convolution in the time domain is mathematically identical to element-wise multiplication in the frequency domain.</p><p>The equation is: FFT(y) = FFT(K) &#8857; FFT(u)<br><em>(Where FFT is the Fast Fourier Transform, and &#8857; is element-wise multiplication. In practice, K and u are first zero-padded to at least 2n &#8722; 1 entries so that the FFT&#8217;s wrap-around convolution matches the causal one we want.)</em></p><p>Here is the exact step-by-step operation the GPU performs, and the cost of each step:</p><ol><li><p><strong>FFT of the Input (u):</strong> The GPU converts the sequence of tokens into the frequency domain. The standard Discrete Fourier Transform requires multiplying the sequence by an n &#215; n matrix (O(n&#178;)). 
But the Fast Fourier Transform algorithm exploits the recursive symmetry of sine and cosine waves to divide-and-conquer the matrix, cutting the compute cost down to <strong>O(n log n)</strong>.</p></li><li><p><strong>FFT of the Kernel (K):</strong> We do the same thing to our pre-computed kernel. Cost: <strong>O(n log n)</strong>.</p></li><li><p><strong>Element-wise Multiplication:</strong> We take the two transformed lists and multiply them together, one-to-one. No massive matrix multiplies, just array_A[i] * array_B[i]. Cost: <strong>O(n)</strong>.</p></li><li><p><strong>Inverse FFT:</strong> We take the resulting frequencies and run the Inverse FFT to transform them back into the final token outputs y. Cost: <strong>O(n log n)</strong>.</p></li></ol><p>By taking this mathematical detour, we have replaced an O(n&#178;) operation with three O(n log n) operations and one O(n) operation.</p><p>At a context length of 4K, the difference is small in absolute terms: roughly 17 million operations versus roughly 50 thousand. But at a context length of 1 million tokens, n&#178; is 1 trillion operations. n log n (with log base 2) is roughly 20 million operations.</p><p>By applying the Convolution Theorem, we mathematically annihilate the prefill compute tax.</p><h4><strong>The Mamba Breakthrough: Why &#8220;Selectivity&#8221; Broke the Math</strong></h4><p>If this math is so flawless, why did these models underperform Transformers on text?</p><p>Look at the definition of our kernel K:<br>K = [C * B_bar, C * A_bar * B_bar, C * A_bar&#178; * B_bar, &#8230;]</p><p>This kernel assumes that A_bar and B_bar are fixed matrices. It treats every single position in the sequence exactly the same. But language requires content-adaptive memory. A model needs to forget a filler word like &#8220;um&#8221; instantly, but lock a critical noun into memory for 50,000 steps.</p><p>Mamba fixed this by introducing <strong>Selectivity</strong>. It makes the matrices A_bar and B_bar input-dependent. 
The model learns a gating mechanism that changes the values of A_bar and B_bar for <em>every single token</em> (which makes intuitive sense: different tokens create different pressures on what needs to be retained).</p><p>But this creates another problem.</p><p>If A_bar changes at every step, you can no longer pull it out and create a single, static Kernel K.</p><p>K no longer exists. y = K * u is mathematically impossible. <strong>The Convolution Theorem no longer applies. </strong>You are forced back into computing the sequence step-by-step.</p><p>This is exactly why Mamba&#8217;s engineers had to build complex, hardware-aware &#8220;Associative Scan&#8221; kernels. Let&#8217;s study them next.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wt9B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b77a9f0-4218-4656-bef7-baa6cca01a5b_1600x1100.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wt9B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b77a9f0-4218-4656-bef7-baa6cca01a5b_1600x1100.png 424w, https://substackcdn.com/image/fetch/$s_!Wt9B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b77a9f0-4218-4656-bef7-baa6cca01a5b_1600x1100.png 848w, https://substackcdn.com/image/fetch/$s_!Wt9B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b77a9f0-4218-4656-bef7-baa6cca01a5b_1600x1100.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Wt9B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b77a9f0-4218-4656-bef7-baa6cca01a5b_1600x1100.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Wt9B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b77a9f0-4218-4656-bef7-baa6cca01a5b_1600x1100.png" width="1456" height="1001" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4b77a9f0-4218-4656-bef7-baa6cca01a5b_1600x1100.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1001,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Wt9B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b77a9f0-4218-4656-bef7-baa6cca01a5b_1600x1100.png 424w, https://substackcdn.com/image/fetch/$s_!Wt9B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b77a9f0-4218-4656-bef7-baa6cca01a5b_1600x1100.png 848w, https://substackcdn.com/image/fetch/$s_!Wt9B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b77a9f0-4218-4656-bef7-baa6cca01a5b_1600x1100.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Wt9B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b77a9f0-4218-4656-bef7-baa6cca01a5b_1600x1100.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4><strong>The Engineering Reality: The Math of the Associative Scan</strong></h4><p>We lost our little O(n log n) FFT cheat code. 
We are forced back into computing the sequence step-by-step because Mamba&#8217;s matrices A and B now change at every time step t (which we will now write as A_t and B_t).</p><p>The recurrence is:<br>x_t = (A_t * x_{t-1}) + (B_t * u_t)</p><p>If you code this as a standard for loop on a GPU, performance collapses. The GPU has thousands of cores; a sequential loop forces one core to work while the rest sit idle.</p><p>To parallelize this, the Mamba authors relied on a computer science algorithm called a <strong>Parallel Prefix Scan</strong> (also known as an Associative Scan; the Parallel Prefix Sum is its best-known special case).</p><p>To make a scan work, you must define a mathematical operation that is strictly associative, meaning (X &#8855; Y) &#8855; Z must equal X &#8855; (Y &#8855; Z). If it is associative, you can group the sequence into chunks, calculate the chunks on different GPU cores at the exact same time, and combine the results in a tree structure.</p><p>But our update rule isn&#8217;t just simple addition. It&#8217;s a matrix multiplication and an addition. How do we make that associative?</p><p>We define a new operator (let&#8217;s call it &#8855;) that operates on a pair of values: the decay matrix A, and the input projection B * u.</p><p>If we have two adjacent time steps, i and j (with i coming first), the operator is defined as:<br>(A_i, B_i * u_i) &#8855; (A_j, B_j * u_j) = (A_j * A_i, A_j * B_i * u_i + B_j * u_j)</p><p>Each tuple describes the map &#8220;multiply the incoming state by A, then add B * u&#8221;, and &#8855; simply composes two such maps. Because composing maps is associative, we can chunk the prompt. Core 1 computes the exact tuple for tokens 1&#8211;10. Core 2 computes tokens 11&#8211;20. 
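</p><p>To make the operator concrete, here is a minimal Python sketch (not Mamba&#8217;s actual CUDA kernel). It assumes scalar values for A_t and B_t * u_t, and the function names combine, sequential_states, and chunked_final_state are illustrative choices, not names from the Mamba codebase. It summarizes each chunk independently, merges the summaries, and compares the result with the plain for loop.</p><pre><code>from functools import reduce


def combine(left, right):
    """The scan operator: fold an earlier step (left) into a later step (right)."""
    a_i, b_i = left    # earlier step: decay A_i and input term B_i * u_i
    a_j, b_j = right   # later step: decay A_j and input term B_j * u_j
    return (a_j * a_i, a_j * b_i + b_j)


def sequential_states(steps, x0=0.0):
    """Reference: the plain for-loop recurrence x_t = a_t * x_{t-1} + b_t."""
    x, states = x0, []
    for a_t, b_t in steps:
        x = a_t * x + b_t
        states.append(x)
    return states


def chunked_final_state(steps, chunk_size, x0=0.0):
    """Summarize each chunk on its own, then merge the chunk summaries in order."""
    chunks = [steps[i:i + chunk_size] for i in range(0, len(steps), chunk_size)]
    # Each chunk summary depends only on its own tokens, so on a GPU these
    # reductions could run on different cores at the same time.
    summaries = [reduce(combine, chunk) for chunk in chunks]
    a_total, b_total = reduce(combine, summaries)  # the merge step
    return a_total * x0 + b_total


# Hypothetical scalar values standing in for (A_t, B_t * u_t).
steps = [(0.9, 1.0), (0.5, 2.0), (0.8, -1.0), (0.7, 0.5), (0.6, 3.0), (0.95, -2.0)]
final_loop = sequential_states(steps)[-1]
final_chunked = chunked_final_state(steps, chunk_size=2)
</code></pre><p>With these toy numbers, final_loop and final_chunked agree up to floating-point rounding. That agreement is the property that lets each chunk live on a different core.</p><p>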
The cores do this in parallel, then merge their tuples up the tree.</p><p>But our issues don&#8217;t end here.</p><p>And since the creators of Mamba have not agreed to my demands for a yacht party with crystals of chocolate milk that I can snort off hookers, let&#8217;s end this section with a discussion of the biggest issues with Mamba currently.</p><h4>Why Mamba isn&#8217;t the Status Quo (Yet)</h4><p>In standard Attention, you load two massive matrices (Q and K) into the GPU&#8217;s ultra-fast Tensor Cores and multiply them. It requires an astronomical amount of math, but it has incredibly high <strong>Arithmetic Intensity</strong>. The cores crunch numbers without having to constantly wait on memory.</p><p>Look at our custom scan operator &#8855;. We are doing a few small matrix-vector multiplications, but to execute the tree structure, the GPU threads have to constantly read and write these (A, B*u) intermediate tuples to the chip&#8217;s SRAM.</p><p>The eagle-eyed amongst you will have noticed that we are doing very little actual math per byte of data moved. You have successfully parallelized the O(n) computation, but you have created an algorithm that is ruthlessly <strong>memory-bound</strong>. Writing a custom CUDA kernel that handles this memory traffic without stalling the GPU is one of the hardest software engineering problems in AI right now. <strong>Even with a brilliant implementation, Mamba&#8217;s core layers struggle to hit the peak hardware utilization numbers that standard Transformer matrix multiplication achieves effortlessly.</strong></p><p>Along with this, there is ANOTHER problem that kills many deployments of Mamba.</p><p>In a Transformer, dropping your KV cache to INT8 or INT4 quantization is relatively safe. You are just reading a slightly blurry memory. If token 5&#8217;s Key vector is slightly off, it only affects how the model reads token 5. 
The error is isolated.</p><p>Let&#8217;s look at the algebra of what happens when you quantize a recurrent model.</p><p>When you quantize the state update, you introduce a small rounding error at every step. Let&#8217;s call this error q_t. Our update rule becomes:<br>x_t = (A_t * x_{t-1}) + (B_t * u_t) + q_t</p><p>To see how this error behaves over time, we have to look at the difference between the &#8220;perfect&#8221; state and our &#8220;quantized&#8221; state. That difference obeys its own recurrence, e_t = (A_t * e_{t-1}) + q_t. Let&#8217;s unroll the accumulated error e_t over three steps:</p><ul><li><p>e_1 = q_1</p></li><li><p>e_2 = q_2 + (A_2 * q_1)</p></li><li><p>e_3 = q_3 + (A_3 * q_2) + (A_3 * A_2 * q_1)</p></li></ul><p>Notice the fundamental difference between this and a Transformer. The quantization noise from token 1 (q_1) doesn&#8217;t just sit there. It gets multiplied by A_2. <strong>Then it gets multiplied by A_3.</strong></p><p>In a nutshell, Recurrent systems don&#8217;t just accumulate error; they <strong>multiply</strong> it through time.</p><p><em>A tiny INT8 rounding mistake at the start of a 100,000-token prompt gets re-multiplied at every single step, while fresh rounding noise is injected on top of it. If the A_t factors do not shrink that error fast enough, it compounds until the state drifts far from its true value and the model starts outputting garbage.</em></p><p>This is why the official Mamba repository explicitly warns that SSMs are highly sensitive to their recurrent dynamics. To prevent this compounding failure, engineers are often forced to store the recurrent state in high-precision FP32 (4 bytes per number).</p><p>So you win infinite context (technically; how much of it is actually useful is debatable, especially for precision-heavy tasks that require exact wording) and massive concurrency, but you invite a massive kernel engineering and quantization headache.</p><p>What happens if we decide that Control Theory is no good? 
Up next, we will look at how <strong>Linear Attention</strong> tries to achieve the exact same O(1) memory goal, but does so entirely through algebra instead of differential equations.</p><h1>Section 3: Linear Attention and Fast-Weights (Algebraic Factorization)</h1><p>State Space Models try to kill the context-length problem using differential equations and control theory. But what if you don&#8217;t want to learn an entirely new branch of mathematics? What if you just want to take the Transformer we already have, and hack the linear algebra so it stops eating our GPUs?</p><p>This is the exact goal of <strong>Linear Attention</strong>. It attempts to keep the &#8220;query-key retrieval&#8221; architecture of a standard Transformer, but uses a mathematical loophole to completely bypass the O(n&#178;) prefill tax and the growing KV cache.</p><p>To understand how it does this, we have to isolate the exact mathematical operation that chains us to O(n&#178;): the Softmax function.</p><h4><strong>The Villain: Why Softmax Forbids Associativity</strong></h4><p>In Part 1, we established the core attention equation: Output = Softmax(Q * K^T) * V.</p><p>To figure out how much token A cares about token B, we multiply every Query vector against every Key vector to create an n &#215; n matrix of raw scores. We then wrap that matrix in a Softmax function.</p><p>Softmax exists for two critical reasons:</p><ol><li><p><strong>Sharp Selective Retrieval:</strong> It creates a normalized similarity distribution. It forces the model to pick exactly which past tokens matter, acting as a strict 100% attention budget.</p></li><li><p><strong>Statistical Stability:</strong> Without normalization, if you add up 100,000 raw dot products, the magnitude of the aggregated vector grows in proportion to the sequence length. 
Variance explodes, and the model&#8217;s internal activations blow up.</p></li></ol><p>But Softmax comes with a fatal structural cost: it is a non-linear function.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CDCU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7700cd5-6cbb-4f81-abe4-95966b5967db_1548x1088.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CDCU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7700cd5-6cbb-4f81-abe4-95966b5967db_1548x1088.png 424w, https://substackcdn.com/image/fetch/$s_!CDCU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7700cd5-6cbb-4f81-abe4-95966b5967db_1548x1088.png 848w, https://substackcdn.com/image/fetch/$s_!CDCU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7700cd5-6cbb-4f81-abe4-95966b5967db_1548x1088.png 1272w, https://substackcdn.com/image/fetch/$s_!CDCU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7700cd5-6cbb-4f81-abe4-95966b5967db_1548x1088.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CDCU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7700cd5-6cbb-4f81-abe4-95966b5967db_1548x1088.png" width="1456" height="1023" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b7700cd5-6cbb-4f81-abe4-95966b5967db_1548x1088.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1023,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CDCU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7700cd5-6cbb-4f81-abe4-95966b5967db_1548x1088.png 424w, https://substackcdn.com/image/fetch/$s_!CDCU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7700cd5-6cbb-4f81-abe4-95966b5967db_1548x1088.png 848w, https://substackcdn.com/image/fetch/$s_!CDCU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7700cd5-6cbb-4f81-abe4-95966b5967db_1548x1088.png 1272w, https://substackcdn.com/image/fetch/$s_!CDCU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7700cd5-6cbb-4f81-abe4-95966b5967db_1548x1088.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://codinginterviewsmadesimple.substack.com/p/how-linearity-makes-llms-more-energy?utm_source=publication-search">We explained Linearity in Math and why it&#8217;s so important for AI here.</a></figcaption></figure></div><p>Mathematically, you cannot distribute a matrix multiplication through a non-linear boundary. You cannot regroup the variables. You are algebraically trapped. The GPU <em>must</em> calculate the massive n &#215; n matrix of (Q * K^T) first. 
You physically cannot change the order of operations, which means you are permanently chained to O(n&#178;).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ufu2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7022e23a-c8fc-4dca-91df-4ca56890015b_690x551.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ufu2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7022e23a-c8fc-4dca-91df-4ca56890015b_690x551.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ufu2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7022e23a-c8fc-4dca-91df-4ca56890015b_690x551.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ufu2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7022e23a-c8fc-4dca-91df-4ca56890015b_690x551.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ufu2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7022e23a-c8fc-4dca-91df-4ca56890015b_690x551.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ufu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7022e23a-c8fc-4dca-91df-4ca56890015b_690x551.jpeg" width="690" height="551" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7022e23a-c8fc-4dca-91df-4ca56890015b_690x551.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:551,&quot;width&quot;:690,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ufu2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7022e23a-c8fc-4dca-91df-4ca56890015b_690x551.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ufu2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7022e23a-c8fc-4dca-91df-4ca56890015b_690x551.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ufu2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7022e23a-c8fc-4dca-91df-4ca56890015b_690x551.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ufu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7022e23a-c8fc-4dca-91df-4ca56890015b_690x551.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">You&#8217;ll see us touch on how multiple alternatives are all trying to work on integrating Linearity in their operations. <a href="https://www.artificialintelligencemadesimple.com/p/beyond-matmul-the-new-frontier-of?utm_source=publication-search">MatMul Free LLMs are one very cool example that we broke down here</a>.</figcaption></figure></div><h4><strong>The Hack: Kernel Factorization</strong></h4><p>To escape, researchers use a trick called Kernel Factorization.</p><p>What if we remove Softmax entirely, and instead apply a non-linear feature map (let&#8217;s call it &#934;) independently to Q and K <em>before</em> we multiply them? For example, applying a function like ELU(x) + 1 simply forces all the vectors to be positive. 
(In some linear attention papers, such as Performer, &#934; is instead a random feature map chosen to <em>approximate</em> the original Softmax kernel without computing the n &#215; n matrix).</p><p>We change the math from Softmax(Q * K^T) to &#934;(Q) * &#934;(K)^T.</p><p>Because &#934; is applied individually to the vectors, the relationship between the mapped Queries and Keys is now strictly <strong><a href="https://www.reddit.com/r/learnmath/comments/o0g06h/can_you_please_describe_a_bilinear_form_in/">bilinear</a></strong>: the score is a plain matrix product with no non-linearity wrapped around it. And that unlocks the greatest cheat code in matrix algebra: <strong>Associativity</strong>.</p><p>Associativity is the rule that says (A * B) * C is the exact same thing as A * (B * C). You can group the multiplications however you want, and you will get the exact same answer.</p><h4><strong>The Algebraic Escape Route: Moving the Parentheses</strong></h4><p>Let&#8217;s look at the exact matrix dimensions of what happens when we regroup the variables.</p><ul><li><p>Our Query matrix Q has dimensions (n &#215; d).</p></li><li><p>Our transposed Key matrix K^T has dimensions (d &#215; n).</p></li><li><p>Our Value matrix V has dimensions (n &#215; d).</p></li></ul><p><strong>The Standard Way (The Quadratic Tax):</strong></p><ol><li><p>We calculate (Q * K^T) first.</p></li><li><p>An (n &#215; d) matrix times a (d &#215; n) matrix produces a massive <strong>(n &#215; n) matrix</strong>. This is the global attention map.</p></li><li><p>We multiply that (n &#215; n) matrix by V (n &#215; d) to get our final output (n &#215; d).</p></li></ol><p><strong>The Linear Attention Way:</strong><br>We move the parentheses. Instead of (Q * K^T) * V, we calculate Q * (K^T * V), with &#934; already applied to Q and K.</p><ol><li><p>We calculate (K^T * V) first.</p></li><li><p>A (d &#215; n) matrix times an (n &#215; d) matrix produces a <strong>(d &#215; d) matrix</strong>.</p></li><li><p>We multiply Q (n &#215; d) by that (d &#215; d) matrix to get our final output (n &#215; d).</p></li></ol><p>Look closely at that middle step. 
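</p><p>A toy sketch in plain Python makes the two groupings concrete. The sizes (n = 6, d = 2), the random positive values standing in for already feature-mapped Queries and Keys, and the helper names matmul, transpose, and shape are all assumptions made for illustration. Like the walkthrough above, it leaves out normalization and causal masking.</p><pre><code>import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def shape(m):
    """Return (rows, columns)."""
    return (len(m), len(m[0]))


random.seed(0)
n, d = 6, 2  # hypothetical toy sizes: 6 tokens, head dimension 2

# Positive entries stand in for features that already went through the map.
phi_q = [[random.random() for _ in range(d)] for _ in range(n)]
phi_k = [[random.random() for _ in range(d)] for _ in range(n)]
v = [[random.random() for _ in range(d)] for _ in range(n)]

# Standard grouping: (Q * K^T) * V
scores = matmul(phi_q, transpose(phi_k))  # (n x n): grows with the sequence
quadratic_out = matmul(scores, v)         # (n x d)

# Linear attention grouping: Q * (K^T * V)
state = matmul(transpose(phi_k), v)       # (d x d): no n anywhere
linear_out = matmul(phi_q, state)         # (n x d)

# Largest elementwise difference between the two outputs.
max_gap = max(abs(x - y)
              for row_a, row_b in zip(quadratic_out, linear_out)
              for x, y in zip(row_a, row_b))
</code></pre><p>Both groupings produce the same (n x d) output here, with max_gap at floating-point rounding level. Only the intermediate changes shape, from (n x n) to (d x d).</p><p>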
We created a (d &#215; d) matrix.</p><p><em>There is no n in that matrix.</em></p><p>By computing K^T * V first, we have compressed the entire sequence of 100,000 tokens into a fixed-size mathematical block. We just created a <strong>Fast-Weight Memory</strong>. (This concept traces back to J&#252;rgen Schmidhuber (isn&#8217;t that funny?) in the early 1990s&#8202;&#8212;&#8202;the idea of one neural network generating weights for another network on the fly. It failed because 90s hardware couldn&#8217;t handle the compute, but modern GPUs can).</p><p>During generation (decode), we don&#8217;t need to read a massive KV cache anymore. We just maintain this single, fixed-size historical state matrix (let&#8217;s call it S_t).</p><p><strong>When a new token arrives, the update equation is simple:<br>S_t = S_{t-1} + (&#934;(k_t) * v_t^T)</strong></p><p>We take the new token&#8217;s Key and Value, multiply them together to create a rank-1 (d &#215; d) grid, and literally add that new grid to our historical state. The moment we add it, we throw the token away. Constant memory. O(1) generation.</p><h4><strong>The Economics: Pricing the Capacity Unlock</strong></h4><p>Let&#8217;s put actual byte counts on this &#8220;fixed&#8221; state, because constant memory sounds like a free lunch until you calculate the size of the constant.</p><p>Let&#8217;s use our 14B model proxy from Part 1: 40 layers, 40 attention heads, and a head dimension (d_k) of 128.<br>In Linear Attention, you have to store a (d_k &#215; d_v) matrix for every single head.</p><ul><li><p>Size per head: 128 &#215; 128 = 16,384 parameters.</p></li><li><p>Across 40 heads: 655,360 parameters.</p></li><li><p>At FP16 (2 bytes per param): ~1.3 MB per layer.</p></li><li><p>Across all 40 layers: <strong>52.4 MB of total fixed state.</strong></p></li></ul><p>What does 52.4 MB <em>mean</em> for your profit margins?</p><p>Let&#8217;s look at an 80GB H100. A 14B model at INT4 takes up ~7GB of memory. 
Framework overhead takes ~6GB. You have roughly 67GB left for users.</p><ul><li><p><strong>At 128K context (Standard Attention):</strong> Assuming a highly optimized model using Grouped Query Attention (GQA with exactly 8 KV heads), the KV cache is ~10.4 GB per user. 67GB / 10.4GB = <strong>6 concurrent users.</strong></p></li><li><p><strong>At 128K context (Linear Attention):</strong> The state matrix is 52.4 MB. 67GB / 0.052GB = <strong>~1,288 concurrent users.</strong></p></li></ul><p>From a pure capacity standpoint, that is a roughly 200x revenue multiple on the exact same silicon.</p><h4><strong>The Hardware Reality: The Arithmetic Intensity Trap</strong></h4><p>But capacity is only half the battle. We have to look at bandwidth and generation speed.</p><p>To generate a single token, the GPU still has to load that 52.4 MB state from High Bandwidth Memory (HBM) into its on-chip SRAM, update it, and write it back.</p><p>Let&#8217;s calculate the Arithmetic Intensity (FLOPs per byte) for this operation:</p><ul><li><p><strong>Bytes moved:</strong> 52.4 MB (counting only the read; the write-back doubles the traffic).</p></li><li><p><strong>FLOPs performed:</strong> Updating the matrix and calculating the output requires exactly 4 * d_k&#178; operations per head. Across the whole model (40 layers &#215; 40 heads = 1,600 heads), that comes out to roughly 105 million FLOPs.</p></li><li><p><strong>Arithmetic Intensity:</strong> 105 MFLOPs / 52.4 MB = <strong>2 FLOPs / byte</strong> (closer to 1 once the write-back is counted).</p></li></ul><p>Remember the Roofline model from Part 1? The H100 (using TF32 Tensor Cores) requires roughly 295 FLOPs/byte to be compute-bound. 
(If you use the FP16 peak marketing numbers, it&#8217;s nearly 591 FLOPs/byte).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x6nU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9302352-3dd3-4f6e-87f3-2067df228984_2388x1960.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x6nU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9302352-3dd3-4f6e-87f3-2067df228984_2388x1960.png 424w, https://substackcdn.com/image/fetch/$s_!x6nU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9302352-3dd3-4f6e-87f3-2067df228984_2388x1960.png 848w, https://substackcdn.com/image/fetch/$s_!x6nU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9302352-3dd3-4f6e-87f3-2067df228984_2388x1960.png 1272w, https://substackcdn.com/image/fetch/$s_!x6nU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9302352-3dd3-4f6e-87f3-2067df228984_2388x1960.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!x6nU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9302352-3dd3-4f6e-87f3-2067df228984_2388x1960.png" width="1456" height="1195" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f9302352-3dd3-4f6e-87f3-2067df228984_2388x1960.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1195,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x6nU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9302352-3dd3-4f6e-87f3-2067df228984_2388x1960.png 424w, https://substackcdn.com/image/fetch/$s_!x6nU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9302352-3dd3-4f6e-87f3-2067df228984_2388x1960.png 848w, https://substackcdn.com/image/fetch/$s_!x6nU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9302352-3dd3-4f6e-87f3-2067df228984_2388x1960.png 1272w, https://substackcdn.com/image/fetch/$s_!x6nU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9302352-3dd3-4f6e-87f3-2067df228984_2388x1960.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The roofline number&#8202;&#8212;&#8202;FLOPs per byte&#8202;&#8212;&#8202;is the only spec that actually matters when sizing hardware for LLM workloads. Everything above ~150 is compute-bound; below ~100 and you&#8217;re burning cycles waiting on memory. The MI300A sitting at 92 isn&#8217;t an accident&#8202;&#8212;&#8202;AMD deliberately traded compute for bandwidth to target inference.</figcaption></figure></div><p>At 2 FLOPs/byte, Linear Attention is catastrophically memory-bound.</p><p>A sharp engineer will immediately object: <em>&#8220;Wait, modern kernels like FlashLinearAttention use SRAM tiling and chunking to overlap transfers and keep the state on-chip!&#8221;</em></p><p>This is true. Hardware-aware kernels drastically reduce intermediate reads and writes. But you still must load the final 52.4 MB state from HBM for every generation step. 
Even with perfect tiling, your arithmetic intensity remains orders of magnitude below the roofline limit.</p><h4><strong>The Regime Analysis: When Do You Actually Win?</strong></h4><p>Because the fixed state is 52.4 MB, Linear Attention is actually slower and more memory-intensive than a Transformer at short context lengths.</p><ul><li><p><strong>Under ~650 tokens:</strong> Using the ~10.4 GB per 128K figure above (roughly 80 KB per token), the standard KV cache is smaller than 52 MB. Linear attention loses.</p></li><li><p><strong>At ~32K tokens:</strong> The KV cache is ~2.6 GB per user, about 50x the fixed state. Linear attention is already well ahead on capacity.</p></li><li><p><strong>At 128K+ tokens:</strong> Linear attention becomes a physical necessity for survival.</p></li></ul><h4><strong>The Structural Flaw: Feature Collision and Loss of Identity</strong></h4><p>Even if you deploy in the 128K+ regime, Linear Attention has a structural, mathematical limitation that ruins it for precision-heavy use cases like coding or RAG.</p><p>Look back at the update rule: S_t = S_{t-1} + (&#934;(k_t) * v_t^T). We are using additive compression.</p><p>A standard Transformer separates every single token. The KV cache perfectly preserves distinct token identities. If you ask it to find a specific needle in a 1-million-token haystack, it can perfectly isolate that exact token&#8217;s Key.</p><p>Linear attention takes 100,000 distinct token associations and compresses them into a single matrix via additive updates. A d-dimensional key space can hold at most d mutually orthogonal keys, so once the sequence is much longer than d, these features collide. You lose item separability. The model loses the capacity for sharp, exact associative recall because the distinct token identities blur into a single overlapping grid.</p><h4><strong>The Fix that Breaks Everything: Decay Gates</strong></h4><p>To stop the state matrix from blurring into useless noise, modern fast-weight variants (like Gated Linear Attention and Gated DeltaNet) introduce a decay gate (&#947;).<br>S_t = (&#947; * S_{t-1}) + (&#934;(k_t) * v_t^T)</p><p>This forces the model to slowly forget old information. 
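</p><p>Here is a minimal sketch of that gated update in plain Python. It assumes a single scalar gate per step, a made-up gate value of 0.5, and toy 2-dimensional keys and values; the helper names outer, gated_update, and read are illustrative and not taken from any of the libraries mentioned above.</p><pre><code>def outer(k, v):
    """Rank-1 (d x d) update built from one key vector and one value vector."""
    return [[k_i * v_j for v_j in v] for k_i in k]


def gated_update(state, k, v, gamma):
    """One decode step: S_t = gamma * S_{t-1} + outer(phi(k_t), v_t)."""
    update = outer(k, v)
    return [[gamma * s + u for s, u in zip(s_row, u_row)]
            for s_row, u_row in zip(state, update)]


def read(state, q):
    """Query the state: output_j = sum over i of q_i * S[i][j]."""
    return [sum(q_i * row[j] for q_i, row in zip(q, state))
            for j in range(len(state[0]))]


d = 2
state = [[0.0] * d for _ in range(d)]

# Hypothetical tokens: (feature-mapped key, value, gate). All values are made up.
tokens = [([1.0, 0.0], [5.0, 5.0], 0.5),
          ([0.0, 1.0], [1.0, 2.0], 0.5),
          ([0.0, 1.0], [3.0, 4.0], 0.5)]

for k, v, gamma in tokens:
    state = gated_update(state, k, v, gamma)

old_memory = read(state, [1.0, 0.0])     # what is left of the first token
shared_memory = read(state, [0.0, 1.0])  # tokens 2 and 3 share a key, so they mix
</code></pre><p>After three steps the first token has been decayed twice, so its value [5.0, 5.0] reads back as [1.25, 1.25]. The second and third tokens share a key, so their values come back blended into a single vector. The state itself never grows beyond (d x d).</p><p>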
But look at what &#947; just did to our math. Because &#947; is a time-dependent multiplier that changes at every step based on the input, the operation is no longer a simple, order-independent sum. It is a time-dependent recurrence.</p><p>You just broke pure associativity.</p><p>Because the decay at step 100 depends on the decay at step 99, you can no longer process the prefill perfectly in parallel. You are forced into <strong>chunkwise recurrent</strong> or <strong>associative scan</strong> algorithms to parallelize training, exactly like RetNet and modern Mamba implementations.</p><p>Furthermore, because you are repeatedly multiplying the state by &#947;, you can accumulate numerical error over long horizons. This makes aggressive low-precision deployment (like INT8) much harder to stabilize without the state drifting.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!m_n1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1627057-4997-4e70-b96e-aa76d25820cb_1500x700.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!m_n1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1627057-4997-4e70-b96e-aa76d25820cb_1500x700.png 424w, https://substackcdn.com/image/fetch/$s_!m_n1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1627057-4997-4e70-b96e-aa76d25820cb_1500x700.png 848w, https://substackcdn.com/image/fetch/$s_!m_n1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1627057-4997-4e70-b96e-aa76d25820cb_1500x700.png 1272w, 
https://substackcdn.com/image/fetch/$s_!m_n1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1627057-4997-4e70-b96e-aa76d25820cb_1500x700.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!m_n1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1627057-4997-4e70-b96e-aa76d25820cb_1500x700.png" width="1456" height="679" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a1627057-4997-4e70-b96e-aa76d25820cb_1500x700.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:679,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!m_n1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1627057-4997-4e70-b96e-aa76d25820cb_1500x700.png 424w, https://substackcdn.com/image/fetch/$s_!m_n1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1627057-4997-4e70-b96e-aa76d25820cb_1500x700.png 848w, https://substackcdn.com/image/fetch/$s_!m_n1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1627057-4997-4e70-b96e-aa76d25820cb_1500x700.png 1272w, 
https://substackcdn.com/image/fetch/$s_!m_n1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1627057-4997-4e70-b96e-aa76d25820cb_1500x700.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Linear Attention successfully kills the KV cache, but in the end all must bend the knee to a brutal law of AI economics: you either pay for your context with memory (KV cache), or you pay for it with precision (item separability). 
There is no free lunch.</p><p>Linear Attention variants are very interesting to study in depth since they always have amazing benchmark results on paper. Benchmarks look clean, the math checks out, and there&#8217;s never any reason it shouldn&#8217;t work. It&#8217;s why the Spreadsheet merchants pretending to be AI Influencers/Thought Leaders eat that shit up, and every LA drop is accompanied by a lot of hype about how groundbreaking the whole thing is. And that&#8217;s why we don&#8217;t see this trickle into innovation at the frontier.</p><p>All this being said, I wouldn&#8217;t be talking about it if LA didn&#8217;t have something useful for us to learn. After all, what good is it to spend our time on losers that didn&#8217;t work out?</p><h1>Section 4: Hybrid Transformer Networks: Best of Both Worlds?</h1><p>In Section 2, we looked at State Space Models (Mamba), which achieve infinite context but suffer from feature collision. In Section 3, we looked at Linear Attention, which unlocks massive batch sizes but loses the ability to do exact, item-separable retrieval.</p><p>Both architectures proved the same brutal law of AI economics: you either become a fat fuck and eat that massive KV cache, or you accept brain damage and lose precision. No free lunches here.</p><p>But go back and read that law carefully. It says you have to pay. It doesn&#8217;t say you have to pay <em>the same way everywhere</em>.</p><p>You can choose <em>where</em> in the network you eat each cost. Compress the haystack cheaply in some layers. Retrieve the needle exactly in others. The tradeoff doesn&#8217;t have to be global. You do not have to ruin the whole network at once. You can distribute the suffering across depth like a civilized society, the way a portfolio manager allocates risk across asset classes instead of going all-in on one bet.</p><p>After all, you are not OpenAI. You do not have a sovereign wealth fund backing your compute cluster. 
You are the CTO of an AI wrapper, one bad infra decision away from explaining to the board why company lunches now consist of biting the dust, despite your brilliant models and beautiful demos. Best of luck explaining that to the bright-eyed Queens who invested in your little marketing automation startup on the promise that you will rock them.</p><p>So, you need to start optimizing things by trying to salvage the best of both worlds. How do we do that?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BNSl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42a8978e-0947-4a47-b825-4468ebf7cb02_1256x1094.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BNSl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42a8978e-0947-4a47-b825-4468ebf7cb02_1256x1094.png 424w, https://substackcdn.com/image/fetch/$s_!BNSl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42a8978e-0947-4a47-b825-4468ebf7cb02_1256x1094.png 848w, https://substackcdn.com/image/fetch/$s_!BNSl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42a8978e-0947-4a47-b825-4468ebf7cb02_1256x1094.png 1272w, https://substackcdn.com/image/fetch/$s_!BNSl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42a8978e-0947-4a47-b825-4468ebf7cb02_1256x1094.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!BNSl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42a8978e-0947-4a47-b825-4468ebf7cb02_1256x1094.png" width="1256" height="1094" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/42a8978e-0947-4a47-b825-4468ebf7cb02_1256x1094.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1094,&quot;width&quot;:1256,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BNSl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42a8978e-0947-4a47-b825-4468ebf7cb02_1256x1094.png 424w, https://substackcdn.com/image/fetch/$s_!BNSl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42a8978e-0947-4a47-b825-4468ebf7cb02_1256x1094.png 848w, https://substackcdn.com/image/fetch/$s_!BNSl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42a8978e-0947-4a47-b825-4468ebf7cb02_1256x1094.png 1272w, https://substackcdn.com/image/fetch/$s_!BNSl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42a8978e-0947-4a47-b825-4468ebf7cb02_1256x1094.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://arxiv.org/abs/2403.19887">&#8220;Combining Transformer, Mamba, and MoE elements allows flexibility in balancing among the sometimes conflicting objectives of low memory usage, high throughput, and high quality.&#8221;</a></figcaption></figure></div><p>If we take the portfolio metaphor literally (and we should, because the math maps almost perfectly), the architecture becomes a highly operational trade-off with observable metrics:</p><ul><li><p><strong>The % of Attention layers defines your strict memory budget.</strong> Observable metric: KV bytes per token. This dictates your maximum context length at a fixed VRAM limit. Non-negotiable. 
Hardware doesn&#8217;t care about your architecture preferences.</p></li><li><p><strong>The spacing between Attention layers defines your retrieval latency in depth.</strong> Observable metric: <em>needle survival distance</em>&#8202;&#8212;&#8202;the number of consecutive compressor layers a distinct token identity can pass through before it blurs beyond reliable recovery. Empirically, this appears to sit in the range of 4 to 8 layers for current Mamba-class compressors before needle-in-haystack accuracy starts to degrade sharply, though the exact number is task-dependent. Verbatim string recall dies first. Semantic gist survives longer. Space your attention layers further apart than this survival distance and tasks requiring exact matching&#8202;&#8212;&#8202;coding, multi-hop RAG, citation retrieval&#8202;&#8212;&#8202;degrade first, because distinct token identities blur inside the compressed layers before an attention layer can rescue them.</p></li><li><p><strong>The SSM blocks act as cheap local feature extractors</strong>, compressing the &#8220;haystack&#8221; (the syntax, tone, general narrative).</p></li><li><p><strong>The Attention blocks act as global routers</strong>, scanning the entire sequence to perfectly retrieve the &#8220;needle.&#8221;</p></li></ul><p>Let&#8217;s look at exactly how this is constructed, the math of why it works, and the severe systems engineering friction that prevents it from being a magical silver bullet.</p><h4>How Do You Wire a Hybrid Transformer Together?</h4><p>You can&#8217;t just throw a sub-quadratic layer and an Attention layer into a blender. You have to route the information.</p><p>While AI21&#8217;s Jamba uses Mamba as its compressor, the industry is experimenting with multiple variants of this portfolio approach. You can swap Mamba out for Linear Attention (GLA/DeltaNet), or even sliding-window local attention. 
The compressor choice changes the <em>type</em> of compression artifact you get, so you can pick your poison based on your hardware and your retrieval requirements:</p><ul><li><p><strong>Mamba</strong> collapses into a fixed recurrent vector of size O(d_model * d_state). What you get: the cheapest possible compression per layer, tiny memory footprint, excellent at absorbing local syntax and tone. What you lose: the state is a vector, not a matrix&#8202;&#8212;&#8202;its capacity to store distinct retrievable items is severely limited. Long-range exact recall dies fast.</p></li><li><p><strong>Linear Attention</strong> collapses into a (d * d) fast-weight grid. What you get: a richer compression surface than Mamba (a full matrix instead of a vector), which means better retention of associative patterns across moderate distances. What you lose: that grid is 14x larger than the Mamba state (we priced this in Section 3), and it still mushes everything together&#8202;&#8212;&#8202;just in a higher-dimensional mush.</p></li><li><p><strong>Sliding-window attention</strong> keeps exact tokens but only within a local radius. What you get: perfect recall within the window, no compression artifacts at all. What you lose: everything outside the window is invisible. 
There is no compression&#8202;&#8212;&#8202;there is simply amnesia.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P43r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbac60824-09c8-4fbc-bd2b-177cc51a9144_1400x900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P43r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbac60824-09c8-4fbc-bd2b-177cc51a9144_1400x900.png 424w, https://substackcdn.com/image/fetch/$s_!P43r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbac60824-09c8-4fbc-bd2b-177cc51a9144_1400x900.png 848w, https://substackcdn.com/image/fetch/$s_!P43r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbac60824-09c8-4fbc-bd2b-177cc51a9144_1400x900.png 1272w, https://substackcdn.com/image/fetch/$s_!P43r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbac60824-09c8-4fbc-bd2b-177cc51a9144_1400x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P43r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbac60824-09c8-4fbc-bd2b-177cc51a9144_1400x900.png" width="1400" height="900" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bac60824-09c8-4fbc-bd2b-177cc51a9144_1400x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:900,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P43r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbac60824-09c8-4fbc-bd2b-177cc51a9144_1400x900.png 424w, https://substackcdn.com/image/fetch/$s_!P43r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbac60824-09c8-4fbc-bd2b-177cc51a9144_1400x900.png 848w, https://substackcdn.com/image/fetch/$s_!P43r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbac60824-09c8-4fbc-bd2b-177cc51a9144_1400x900.png 1272w, https://substackcdn.com/image/fetch/$s_!P43r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbac60824-09c8-4fbc-bd2b-177cc51a9144_1400x900.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Regardless of which compressor you choose, the algorithmic flow up the residual stream looks like this:</p><p><strong>1. The Compressor Layers (Mamba / Linear Attention / Sliding Window):</strong> The token passes through several sub-quadratic layers. If it&#8217;s Mamba, it updates the O(d_state) recurrent vector. If it&#8217;s Linear Attention, it adds the token&#8217;s features into the d * d fast-weight grid. In both cases, these layers are heavily compressing the local context and adding their summaries back into the residual stream without growing a KV cache.</p><p><strong>2. The Attention Layer (The Router):</strong> At layer 8 (or whatever interval is chosen), the token hits a standard exact-attention layer. This layer computes Q * K^T. But the Queries, Keys, and Values it reads from the residual stream have already been heavily processed by the compressor layers below it.</p><p>Now wait. We just spent two entire sections proving that compression destroys exact token identity. 
Mamba mushes everything into a fixed-size vector. Linear Attention mushes everything into a grid. So if the compressor layers have already mangled the information, how does the attention layer on top recover anything useful? Isn&#8217;t this just putting a search engine on top of a shredder?</p><p>No. And the reason is the residual connection.</p><p>In a Transformer-style residual stream, each layer <em>adds</em> its output to the running total. It does not replace it. After 7 Mamba layers, the residual stream is still one d_model-dimensional vector&#8202;&#8212;&#8202;the original token embedding x with seven compressed context deltas summed into it. The raw signal hasn&#8217;t been moved somewhere safe. It&#8217;s been added into, the way multiple frequencies get summed into one waveform. When the attention layer projects this residual into Q, K, V, it has access to both the original token identity <em>and</em> the compressed contextual summary layered on top. It can learn to separate these signals&#8202;&#8212;&#8202;use the raw-embedding component for sharp retrieval, use the compressed component for contextual routing.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TDvi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ada8464-ee45-4bf8-b34b-e6d246847745_1400x1400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TDvi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ada8464-ee45-4bf8-b34b-e6d246847745_1400x1400.png 424w, 
https://substackcdn.com/image/fetch/$s_!TDvi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ada8464-ee45-4bf8-b34b-e6d246847745_1400x1400.png 848w, https://substackcdn.com/image/fetch/$s_!TDvi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ada8464-ee45-4bf8-b34b-e6d246847745_1400x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!TDvi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ada8464-ee45-4bf8-b34b-e6d246847745_1400x1400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TDvi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ada8464-ee45-4bf8-b34b-e6d246847745_1400x1400.png" width="1400" height="1400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ada8464-ee45-4bf8-b34b-e6d246847745_1400x1400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1400,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TDvi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ada8464-ee45-4bf8-b34b-e6d246847745_1400x1400.png 424w, 
https://substackcdn.com/image/fetch/$s_!TDvi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ada8464-ee45-4bf8-b34b-e6d246847745_1400x1400.png 848w, https://substackcdn.com/image/fetch/$s_!TDvi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ada8464-ee45-4bf8-b34b-e6d246847745_1400x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!TDvi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ada8464-ee45-4bf8-b34b-e6d246847745_1400x1400.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This is why hybrids don&#8217;t collapse into the same retrieval failure as pure SSMs or pure Linear Attention. The residual stream is a parallel data bus. The compressor layers write summaries onto it. They don&#8217;t erase what&#8217;s already there. The attention layer reads the full bus.</p><p>(Jamba uses this exact topology. Google&#8217;s RecurrentGemma does something slightly different&#8202;&#8212;&#8202;linear recurrence plus local sliding window attention&#8202;&#8212;&#8202;but the core KV-reduction goal and the residual-stream preservation mechanism are the same.)</p><h4>How Much Memory Do Hybrid Transformers Actually Save?</h4><p>Why did Jamba choose exactly 1 Attention layer for every 7 Mamba layers?</p><p>Because that&#8217;s roughly the ratio where the KV cache fits on one GPU and retrieval quality doesn&#8217;t collapse. The KV math tells you why fewer attention layers are better for memory. The needle survival distance tells you the minimum attention frequency before quality falls off a cliff. The 1:7 ratio sits in the overlap zone. Whether it&#8217;s the actual Pareto-optimal point or just the ratio AI21 shipped is a question the published ablation data doesn&#8217;t fully answer. So treat 1:7 as a validated existence proof, not a universal constant. If you&#8217;re designing your own hybrid, you&#8217;ll need to sweep this ratio against your own benchmark suite.</p><p>The memory math, however, is exact. And it&#8217;s where this gets fun.</p><p>Let&#8217;s bring back our KV cache formula from Part 1. The total bytes added to the cache per token is:</p><p>Delta_KV = 2 * L_attn * g * d_k * B_kv</p><p>(Where g is the number of KV heads after GQA, and d_k is the dimension per head.)</p><p>L_attn. That&#8217;s the variable that matters. In a pure Transformer, L_attn equals L&#8202;&#8212;&#8202;every layer caches Keys and Values. 
In Jamba, L_attn is L / 8. One-eighth. That single substitution changes everything downstream.</p><p>Let&#8217;s make it concrete. Imagine a 50B-class MoE model (50 billion total stored parameters, not active). 64 total layers, 8 KV heads, head dimension of 128, FP16 precision.</p><p><strong>Model Weights:</strong> 50B MoE at INT4 quantization, roughly 25 GB of VRAM.</p><p><strong>Pure Transformer KV Cache:</strong> At 256,000 tokens, a full 64-layer KV cache runs 2 * 64 * 8 * 128 * 256,000 * 2 bytes. That comes out to about <strong>67 GB</strong>.</p><p>Now add it up. Weights (25 GB) + KV cache (67 GB) + framework overhead (~6 GB) = <strong>98 GB</strong>.</p><p>The H100 has 80 GB of VRAM.</p><p>98 is bigger than 80. Now some of you are product managers and MBAs, so that assertion is likely a bit difficult to understand. No matter, take a second and really understand that for yourself. Ask ChatGPT if you have to. Once you understand this statement we can proceed.</p><p>Your problem is simple&#8202;&#8212;&#8202;the model doesn&#8217;t fit (that&#8217;s what she said). You are forced into Tensor Parallelism across two GPUs, which instantly halves your gross margins and doubles your CAPEX. One variable&#8202;&#8212;&#8202;L_attn&#8202;&#8212;&#8202;is the reason your CFO is about to have a very bad quarter. Every single layer in that stack is demanding its own slice of the KV cache, and the cache does not negotiate. It takes its bytes or the model doesn&#8217;t run.</p><p>What if we instead apply the Hybrid ratio?</p><p><strong>Hybrid KV Cache:</strong> 8 attention layers instead of 64. Cache drops from 67 GB to about <strong>8.3 GB</strong>.</p><p>But we&#8217;re honest here, so we have to price the thing the pure Transformer didn&#8217;t pay for: the compressor states. Mamba recurrent states across 56 layers, d_model = 8192, d_state = 16, FP16: that&#8217;s 56 * 8192 * 16 * 2 bytes. Roughly <strong>14.7 MB per user</strong>. Negligible. 
Basically, a rounding error on a GPU that thinks in gigabytes.</p><p>Swap in Linear Attention compressors instead, and it gets heavier: 56 layers, each with a (d_k * d_k) fast-weight grid per head, h = 64 heads, d_k = 128. That&#8217;s 56 * 64 * 128 * 128 * 2 bytes. Roughly <strong>117 MB per user</strong>. About 8x the Mamba state. Still dwarfed by the 58.7 GB you saved on the KV cache, but large enough that your capacity planner had better know about it.</p><p><strong>Hybrid total (Mamba compressor):</strong> Weights (25 GB) + Hybrid KV Cache (8.3 GB) + Mamba States (~0.015 GB) + Overhead (6 GB) = roughly <strong>39.3 GB</strong>.</p><p>One GPU. 40 GB of headroom to spare for concurrency. By strategically deleting 87% of the attention layers, they crossed a hard hardware boundary that the pure Transformer couldn&#8217;t. The model that broke the H100 at 256K tokens now fits comfortably with room for dozens of concurrent users.</p><p>That&#8217;s the sell. Now let&#8217;s talk about why it&#8217;s harder than it sounds.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PrtA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0505a2d7-2c7f-4599-80e2-adace55b4a6d_1400x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PrtA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0505a2d7-2c7f-4599-80e2-adace55b4a6d_1400x600.png 424w, https://substackcdn.com/image/fetch/$s_!PrtA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0505a2d7-2c7f-4599-80e2-adace55b4a6d_1400x600.png 848w, 
https://substackcdn.com/image/fetch/$s_!PrtA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0505a2d7-2c7f-4599-80e2-adace55b4a6d_1400x600.png 1272w, https://substackcdn.com/image/fetch/$s_!PrtA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0505a2d7-2c7f-4599-80e2-adace55b4a6d_1400x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PrtA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0505a2d7-2c7f-4599-80e2-adace55b4a6d_1400x600.png" width="1400" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0505a2d7-2c7f-4599-80e2-adace55b4a6d_1400x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PrtA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0505a2d7-2c7f-4599-80e2-adace55b4a6d_1400x600.png 424w, https://substackcdn.com/image/fetch/$s_!PrtA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0505a2d7-2c7f-4599-80e2-adace55b4a6d_1400x600.png 848w, 
https://substackcdn.com/image/fetch/$s_!PrtA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0505a2d7-2c7f-4599-80e2-adace55b4a6d_1400x600.png 1272w, https://substackcdn.com/image/fetch/$s_!PrtA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0505a2d7-2c7f-4599-80e2-adace55b4a6d_1400x600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Why Isn&#8217;t Everyone Using Hybrid Transformers?</h4><p>If Hybrids give you selective exact recall <em>and</em> the capacity of an 
SSM, why hasn&#8217;t the entire industry abandoned pure Transformers for them?</p><p>Three friction points. They don&#8217;t hurt equally.</p><p><strong>1. What Happens When Hybrid Transformers Switch Kernels Mid-Forward Pass?</strong></p><p>Modern GPUs hate context switching. High utilization means launching a massive, fused CUDA kernel and leaving data in the ultra-fast SRAM for as long as possible.</p><p>FlashAttention is a masterclass in this. It keeps tiles of Q, K, V in SRAM, computes the softmax and the attention output without ever writing the n * n attention matrix to HBM, and achieves arithmetic intensities in the range of 100 to 200 FLOP/byte. Data stays hot. Math stays cheap relative to memory movement. Beautiful.</p><p>Hybrids wreck this by constantly crossing architectural seams. Run an associative scan (Mamba kernel). Write the activation tensors back to HBM. Launch a FlashAttention kernel. Read the tensors back into SRAM. Compute. Write back to HBM. Every seam crossing is a forced round-trip through HBM for activation tensors that scale as O(batch * seq * d_model) bytes.</p><p>Let&#8217;s price the damage on an H100 (3.35 TB/s HBM bandwidth):</p><ul><li><p><strong>Single activation tensor</strong> at batch=32, seq=4096, d=8192 in FP16: 32 * 4096 * 8192 * 2 bytes * 2 (one write plus one read) = roughly <strong>4.3 GB of HBM traffic per seam crossing</strong>.</p></li><li><p><strong>Eight seams</strong> across 64 layers (one per Jamba block) = <strong>~34 GB of pure memory-movement overhead per forward pass</strong> that a fused Transformer never pays. At 3.35 TB/s, that is about 10 ms per forward pass even at peak bandwidth.</p></li><li><p><strong>Effective arithmetic intensity at the seams</strong> craters to <strong>10 to 30 FLOP/byte</strong>. Bandwidth-bound territory on every GPU shipping today.</p></li></ul><p>You saved FLOPs by removing attention layers. You spent bytes switching between kernel types. The ledger doesn&#8217;t always net positive on wall-clock time.</p><p>This is the part that kills me about the hybrid discourse. 
Everyone celebrates the FLOP reduction. Nobody talks about the memory bus. You can have the most elegant architecture ever designed on paper, and the H100 will still punish you for making it read the same tensor twice. The silicon doesn&#8217;t care about your paper&#8217;s abstract. It cares about bytes moved per second, and you just asked it to move 34 GB it didn&#8217;t have to move before.</p><p>This is the great lie of the arXiv preprint. On paper, Hybrids are the ultimate two-way player&#8202;&#8212;&#8202;they have the memory footprint of an SSM and the precision of a Transformer. In theory, Hybrids are your Mighty Mouse, with world-class striking AND world-class grappling. In reality, they&#8217;re closer to Kevin Lee, who gasses out hard in the transitions, leaving everyone struggling to see where Hybrids fit into the picture. You saved FLOPs by deleting attention layers, but you spent all those savings gasping for air on the memory bus while the H100 violently punishes you for context-switching.</p><p><strong>2. Why Can&#8217;t vLLM Serve Hybrid Transformers Out of the Box?</strong></p><p>vLLM&#8217;s superpower is PagedAttention&#8202;&#8212;&#8202;it treats the KV cache like an operating system treats virtual memory, breaking it into non-contiguous blocks to eliminate fragmentation waste. Elegant, well-tested, the reason most production APIs can serve Transformers at reasonable cost.</p><p>A Hybrid breaks the assumption PagedAttention was built on: that the only per-request state is a KV cache. Hybrids have a growing KV cache (attention layers) <em>and</em> fixed-size recurrent states&#8202;&#8212;&#8202;whether those are Mamba&#8217;s d_model * d_state vectors (~0.26 MB per layer per user) or the massive Linear Attention fast-weight grids (~2.1 MB per layer per user, roughly 52 MB total across 25 compressor layers at the scale we priced in Section 3). Two completely different memory pools. Completely different growth dynamics. 
The KV cache grows linearly with tokens. The recurrent states are fixed but must be swapped in and out as requests get scheduled and preempted.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!POet!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97c41e29-b564-4d52-ba96-25ac1fc3c5c2_1400x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!POet!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97c41e29-b564-4d52-ba96-25ac1fc3c5c2_1400x800.png 424w, https://substackcdn.com/image/fetch/$s_!POet!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97c41e29-b564-4d52-ba96-25ac1fc3c5c2_1400x800.png 848w, https://substackcdn.com/image/fetch/$s_!POet!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97c41e29-b564-4d52-ba96-25ac1fc3c5c2_1400x800.png 1272w, https://substackcdn.com/image/fetch/$s_!POet!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97c41e29-b564-4d52-ba96-25ac1fc3c5c2_1400x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!POet!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97c41e29-b564-4d52-ba96-25ac1fc3c5c2_1400x800.png" width="1400" height="800" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97c41e29-b564-4d52-ba96-25ac1fc3c5c2_1400x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!POet!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97c41e29-b564-4d52-ba96-25ac1fc3c5c2_1400x800.png 424w, https://substackcdn.com/image/fetch/$s_!POet!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97c41e29-b564-4d52-ba96-25ac1fc3c5c2_1400x800.png 848w, https://substackcdn.com/image/fetch/$s_!POet!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97c41e29-b564-4d52-ba96-25ac1fc3c5c2_1400x800.png 1272w, https://substackcdn.com/image/fetch/$s_!POet!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97c41e29-b564-4d52-ba96-25ac1fc3c5c2_1400x800.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Building a memory allocator that pages KV blocks while simultaneously managing Mamba state swaps or Fast-Weight grid swaps is a scheduler rewrite. Serving frameworks have started to support these models, but that path is still less stable than the one for standard Transformers.</p><p>For most serving teams, this is the actual deployment barrier. After all, the real question is not whether the model works. Lots of things work in a benchmark lab. The real question is whether your serving stack can carry it without needing therapy.</p><p>And if the answer is no, the FLOP savings don&#8217;t matter. You&#8217;re not shipping. The model sits on your benchmarking cluster looking beautiful while your competitors serve worse models to paying customers.</p><p><strong>3. Do Hybrid Transformers Actually Remove the Memory Wall?</strong></p><p>No.</p><p>Hybrids shrink L_attn. But L_attn is still greater than zero. The KV cache still grows linearly with n. You haven&#8217;t removed the wall. 
You&#8217;ve reduced the slope of the line.</p><p>Dividing the cache by 8 at 256K tokens pushes the memory wall back by roughly 8x. Your 256K limit becomes a ~2M token limit before you hit the same overflow.</p><p>But does 8x actually hold? Only if the compressor states stay negligible at scale.</p><ul><li><p>Mamba compressors: recurrent states are still 14.7 MB per user at 2M tokens. Fixed-size, unchanged. The 8x ceiling holds cleanly.</p></li><li><p>Linear Attention compressors: fast-weight grids are still 117 MB per user. Also fixed-size. But at 100 concurrent users, that&#8217;s 11.7 GB of VRAM just for compressor states&#8202;&#8212;&#8202;no longer a rounding error. The effective multiplier drops from 8x to something closer to 6 to 7x depending on your concurrency target.</p></li></ul><p>Compressor states don&#8217;t grow with context. They grow with concurrency. At high user counts, the &#8220;fixed-size&#8221; advantage partially unwinds because you&#8217;re paying that fixed cost per user. And if you&#8217;re running a high-concurrency production API (which is&#8230; everyone who&#8217;s trying to make money on this), the wall moved less far than the napkin math suggested.</p><p>This is the tragedy of every &#8220;efficient&#8221; architecture. You solve one constraint, and another one tightens. The memory wall was context-bound, so you made it concurrency-bound instead. You didn&#8217;t escape the physics. You rotated the axes of the problem and hoped the new orientation was more survivable. Sometimes it is. But the wall is still there, waiting to Yamcha you.</p><p>Hybrids are the sharpest engineering compromise in the industry right now. Bridge technology. They push the memory wall back far enough to make current enterprise use cases viable at current hardware prices. 
But they do not remove the wall.</p><p>To actually survive beyond the 1-million-token limit without sacrificing exact recall, we have to look away from the architecture of the model entirely, and look at the architecture of the data center.</p><p>That brings us to Distributed Exact Attention.</p><h1>Section 5: Extreme Context (Changing the Data Center)</h1><p>Hybrids are an elegant compromise. They delay the memory wall by shrinking the KV cache. But maybe elegance is not your problem; maybe you&#8217;re stubborn, or rich, or both.</p><p>So, what if you refuse to compromise? What if you are building an agentic workflow that reads a 1-million-token codebase, and you absolutely cannot afford the feature collision of an SSM or the retrieval loss of Linear Attention? You want the pure, unmodified, O(n&#178;) global exact recall of a Transformer.</p><p>If you won&#8217;t change the architecture of the model, you have to change the architecture of the data center.</p><p>This brings us to Context Parallelism and algorithms like Ring Attention.</p><h4>What Does a 1-Million-Token Sequence Actually Cost?</h4><p>Let&#8217;s price out exactly what a 1-million-token sequence costs in memory. We will use a 70B-class model as our proxy, assuming 80 layers, Grouped Query Attention with 8 KV heads, a head dimension of 128, and FP16/BF16 precision (2 bytes).</p><p>The KV cache formula: Layers * KV Heads * Head Dim * Tokens * 2 (for K and V) * 2 bytes. 80 * 8 * 128 * 1,000,000 * 2 * 2 = <strong>~328 GB</strong>.</p><p>The KV cache for a single request is roughly 328 GB. Not the model. Not the activations. Not the optimizer states. <strong>Just the cache. </strong>One user&#8217;s context window.</p><p>Common high-end enterprise deployments run 80 GB H100s. A single H100 cannot hold this request. Four H100s cannot hold this request. You need a minimum of <strong>5 GPUs</strong> just for the KV cache alone, before you even load the model weights. 
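</p><p>If you want to rerun that napkin math for your own model, here is a minimal Python sketch. The helper names (kv_cache_bytes, gpus_for_cache) and the decimal-gigabyte convention are assumptions made for illustration, not code from any serving framework. With the 70B-class numbers above, it should reproduce the ~328 GB cache and the 5-GPU minimum.</p>

```python
import math


def kv_cache_bytes(layers, kv_heads, head_dim, tokens, bytes_per_value=2):
    """Bytes needed to cache keys and values for one request (hypothetical helper)."""
    # The leading factor of 2 covers the separate K and V tensors in every layer.
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_value


def gpus_for_cache(cache_bytes, gpu_memory_gb=80):
    """Minimum GPU count whose combined memory holds the cache alone.

    Assumes decimal gigabytes and ignores weights, activations, and overhead.
    """
    return math.ceil(cache_bytes / (gpu_memory_gb * 1e9))


# 70B-class proxy from the text: 80 layers, 8 KV heads, head dim 128, FP16/BF16.
cache = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128, tokens=1_000_000)
print(f"KV cache for one request: {cache / 1e9:.1f} GB")
print(f"80 GB GPUs needed for the cache alone: {gpus_for_cache(cache)}")
```

<p>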
<strong>At on-demand H100 pricing (~$3/GPU/hour), that&#8217;s $15/hour just to </strong><em><strong>remember</strong></em><strong> what one user said. </strong>And you haven&#8217;t done any math yet. Just like these new-age transcription tools, you&#8217;re going to pay a lot of money just to not forget, even before the cost of intelligence curb-stomps you later.</p><p>Standard Tensor Parallelism solves weight capacity, and it scales well inside a single server node where 8 GPUs are connected by ultra-fast NVLink (~900 GB/s bidirectional). But TP&#8217;s all-reduce synchronization costs grow painfully once your memory requirements force you to cross the node boundary onto InfiniBand (~50 GB/s per link). That&#8217;s an 18x bandwidth cliff between &#8220;inside the box&#8221; and &#8220;across the wire.&#8221;</p><p>To survive 1 million tokens across a massive cluster without bottlenecking on all-reduce syncs, you must partition the sequence itself.</p><h4>How Does Ring Attention Distribute Context Across GPUs?</h4><p>The core idea comes from <em><a href="https://arxiv.org/abs/2310.01889">&#8220;Ring Attention with Blockwise Transformers for Near-Infinite Context&#8221;</a></em>: instead of partitioning the model&#8217;s weights across devices (which is what Tensor Parallelism does), partition the <em>sequence</em>. 
Each GPU gets a shard of tokens and handles its own slice of the attention computation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TT3X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ace6da-2758-44e8-b687-fa6698c83393_1194x1066.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TT3X!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ace6da-2758-44e8-b687-fa6698c83393_1194x1066.png 424w, https://substackcdn.com/image/fetch/$s_!TT3X!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ace6da-2758-44e8-b687-fa6698c83393_1194x1066.png 848w, https://substackcdn.com/image/fetch/$s_!TT3X!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ace6da-2758-44e8-b687-fa6698c83393_1194x1066.png 1272w, https://substackcdn.com/image/fetch/$s_!TT3X!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ace6da-2758-44e8-b687-fa6698c83393_1194x1066.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TT3X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ace6da-2758-44e8-b687-fa6698c83393_1194x1066.png" width="1194" height="1066" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9ace6da-2758-44e8-b687-fa6698c83393_1194x1066.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1066,&quot;width&quot;:1194,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TT3X!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ace6da-2758-44e8-b687-fa6698c83393_1194x1066.png 424w, https://substackcdn.com/image/fetch/$s_!TT3X!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ace6da-2758-44e8-b687-fa6698c83393_1194x1066.png 848w, https://substackcdn.com/image/fetch/$s_!TT3X!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ace6da-2758-44e8-b687-fa6698c83393_1194x1066.png 1272w, https://substackcdn.com/image/fetch/$s_!TT3X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ace6da-2758-44e8-b687-fa6698c83393_1194x1066.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">&#8220;Top (a): We use the same model architecture as the original Transformer but reorganize the compute. In the diagram, we explain this by showing that in a ring of hosts, each host holds one query block, and key-value blocks traverse through a ring of hosts for attention and feedforward computations in a block-by-block fashion. As we compute attention, each host sends key-value blocks to the next host while receives key-value blocks from the preceding host. The communication is overlapped with the computation of blockwise attention and feedforward. Bottom (b): We compute the original Transformer block-by-block. Each host is responsible for one iteration of the query&#8217;s outer loop, while the key-value blocks rotate among the hosts. As visualized, a device starts with the first query block on the left; then we iterate over the key-value blocks sequence positioned horizontally. 
The query block, combined with the key-value blocks, are used to compute self-attention (yellow box), whose output is pass to feedforward network (cyan box)&#8221;</figcaption></figure></div><p>Let&#8217;s say we have p devices. We take our 1-million-token sequence and chop it into p shards. Each GPU is now responsible for n/p tokens.</p><p>The local compute per device drops drastically. Instead of calculating a massive (n * n) attention grid, each GPU calculates a smaller local block. But attention requires global interaction&#8202;&#8212;&#8202;for causal attention, each device must eventually incorporate all relevant causal predecessors for the tokens in its shard. That means pulling remote KV blocks from other devices over the network.</p><p>If a GPU simply stops computing to wait for a 40 GB block of KV cache to arrive over an Ethernet cable, your cluster dies.</p><p>Ring Attention solves this by arranging the GPUs in a logical ring and overlapping communication with computation:</p><ul><li><p>A GPU starts computing the attention scores for its local Q block against whatever K, V block it currently holds.</p></li><li><p>At the exact same time, it sends its K, V block to the next GPU in the ring and begins receiving a different K, V block from the previous GPU.</p></li><li><p>After p-1 such passes, every GPU has seen every other GPU&#8217;s K, V block. 
The full attention output is assembled via blockwise accumulation that preserves numerical correctness (using the online softmax trick from <em><a href="https://arxiv.org/abs/1805.02867">&#8220;Online Normalizer Calculation for Softmax&#8221;</a></em>).</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!etJU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c16e211-3a9e-4503-b766-8d8e352edf24_1600x1275.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!etJU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c16e211-3a9e-4503-b766-8d8e352edf24_1600x1275.png 424w, https://substackcdn.com/image/fetch/$s_!etJU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c16e211-3a9e-4503-b766-8d8e352edf24_1600x1275.png 848w, https://substackcdn.com/image/fetch/$s_!etJU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c16e211-3a9e-4503-b766-8d8e352edf24_1600x1275.png 1272w, https://substackcdn.com/image/fetch/$s_!etJU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c16e211-3a9e-4503-b766-8d8e352edf24_1600x1275.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!etJU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c16e211-3a9e-4503-b766-8d8e352edf24_1600x1275.png" width="1456" height="1160" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c16e211-3a9e-4503-b766-8d8e352edf24_1600x1275.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1160,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!etJU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c16e211-3a9e-4503-b766-8d8e352edf24_1600x1275.png 424w, https://substackcdn.com/image/fetch/$s_!etJU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c16e211-3a9e-4503-b766-8d8e352edf24_1600x1275.png 848w, https://substackcdn.com/image/fetch/$s_!etJU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c16e211-3a9e-4503-b766-8d8e352edf24_1600x1275.png 1272w, https://substackcdn.com/image/fetch/$s_!etJU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c16e211-3a9e-4503-b766-8d8e352edf24_1600x1275.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Why a ring specifically, and not all-to-all communication? Because a ring minimizes concurrent network transfers to exactly 1 send + 1 receive per device per step. All-to-all would require every device to simultaneously blast its KV block to every other device, saturating the network instantly. The ring topology is the minimum-bandwidth schedule that achieves full coverage.</p><p>The key condition that makes this work: <strong>if the time to compute the local attention block is longer than the time to transfer the KV block to the next device, communication is fully hidden.</strong> The math runs while the bytes move. Zero overhead. 
In theory.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LJmt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fc82f90-68a9-448a-9887-3d822c69b9e9_2362x1120.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LJmt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fc82f90-68a9-448a-9887-3d822c69b9e9_2362x1120.png 424w, https://substackcdn.com/image/fetch/$s_!LJmt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fc82f90-68a9-448a-9887-3d822c69b9e9_2362x1120.png 848w, https://substackcdn.com/image/fetch/$s_!LJmt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fc82f90-68a9-448a-9887-3d822c69b9e9_2362x1120.png 1272w, https://substackcdn.com/image/fetch/$s_!LJmt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fc82f90-68a9-448a-9887-3d822c69b9e9_2362x1120.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LJmt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fc82f90-68a9-448a-9887-3d822c69b9e9_2362x1120.png" width="1456" height="690" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1fc82f90-68a9-448a-9887-3d822c69b9e9_2362x1120.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:690,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LJmt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fc82f90-68a9-448a-9887-3d822c69b9e9_2362x1120.png 424w, https://substackcdn.com/image/fetch/$s_!LJmt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fc82f90-68a9-448a-9887-3d822c69b9e9_2362x1120.png 848w, https://substackcdn.com/image/fetch/$s_!LJmt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fc82f90-68a9-448a-9887-3d822c69b9e9_2362x1120.png 1272w, https://substackcdn.com/image/fetch/$s_!LJmt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fc82f90-68a9-448a-9887-3d822c69b9e9_2362x1120.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Initial partitioning of the Q, K, and V sequences into blocks for both Ring Attention and Striped Attention. Because they travel together in the Ring Attention algorithm, the K and V sequences are depicted as a single sequence. Note that for both Ring Attention and Striped Attention, the tokens in the input sequence are partitioned among devices before running the first layer of the model, and remain partitioned in the same layout throughout the forward and backward passes. 
As a result, Q, K, V are automatically partitioned in the desired layout at the beginning of each attention layer when using both Ring Attention and Striped Attention, with no extra per-layer communication required to prepare them in this state.</figcaption></figure></div><p>There&#8217;s a subtle but important problem with naive Ring Attention that <em><a href="https://arxiv.org/abs/2311.09431">&#8220;Striped Attention: Faster Ring Attention for Causal Transformers&#8221;</a></em> identified: causal masking creates severe load imbalance. In causal (autoregressive) attention, the attention matrix is triangular&#8202;&#8212;&#8202;early tokens attend to few predecessors, late tokens attend to many. If you assign contiguous subsequences to each device, the device holding the <em>last</em> tokens does far more work than the device holding the <em>first</em> tokens. Everyone else sits idle waiting for the slowest device to finish.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3sFe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0240a453-1096-43b8-ac4d-e1601f022716_2208x1484.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3sFe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0240a453-1096-43b8-ac4d-e1601f022716_2208x1484.png 424w, https://substackcdn.com/image/fetch/$s_!3sFe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0240a453-1096-43b8-ac4d-e1601f022716_2208x1484.png 848w, 
https://substackcdn.com/image/fetch/$s_!3sFe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0240a453-1096-43b8-ac4d-e1601f022716_2208x1484.png 1272w, https://substackcdn.com/image/fetch/$s_!3sFe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0240a453-1096-43b8-ac4d-e1601f022716_2208x1484.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3sFe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0240a453-1096-43b8-ac4d-e1601f022716_2208x1484.png" width="1456" height="979" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0240a453-1096-43b8-ac4d-e1601f022716_2208x1484.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:979,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3sFe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0240a453-1096-43b8-ac4d-e1601f022716_2208x1484.png 424w, https://substackcdn.com/image/fetch/$s_!3sFe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0240a453-1096-43b8-ac4d-e1601f022716_2208x1484.png 848w, 
https://substackcdn.com/image/fetch/$s_!3sFe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0240a453-1096-43b8-ac4d-e1601f022716_2208x1484.png 1272w, https://substackcdn.com/image/fetch/$s_!3sFe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0240a453-1096-43b8-ac4d-e1601f022716_2208x1484.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">&#8220;Behavior of Ring Attention as applied to a small causal self-attention problem with nseq = 16, distributed 
across N = 4 devices. The Q blocks remain stationary, while K and V blocks pass from neighbor to neighbor in a circular fashion on each round. The square tile shown under each device on each round indicates the causal mask for the query/key interactions computed by that device on that round; elements masked out with values of &#8722;&#8734; are indicated in black. We can see that on all but the first round, workload imbalances prevent us from making effective use of the structure of the causal mask to reduce run time.&#8221;</figcaption></figure></div><p>Striped Attention fixes this by distributing tokens in a round-robin pattern (device 0 gets tokens 0, p, 2p, &#8230;; device 1 gets tokens 1, p+1, 2p+1, &#8230;). This spreads the triangular workload evenly. The result: up to 1.45x end-to-end throughput improvement over vanilla Ring Attention on causal models.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GYqF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797702d6-6fe1-4591-81ca-b05f3392db17_2178x1482.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GYqF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797702d6-6fe1-4591-81ca-b05f3392db17_2178x1482.png 424w, https://substackcdn.com/image/fetch/$s_!GYqF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797702d6-6fe1-4591-81ca-b05f3392db17_2178x1482.png 848w, https://substackcdn.com/image/fetch/$s_!GYqF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797702d6-6fe1-4591-81ca-b05f3392db17_2178x1482.png 1272w, 
https://substackcdn.com/image/fetch/$s_!GYqF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797702d6-6fe1-4591-81ca-b05f3392db17_2178x1482.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GYqF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797702d6-6fe1-4591-81ca-b05f3392db17_2178x1482.png" width="1456" height="991" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/797702d6-6fe1-4591-81ca-b05f3392db17_2178x1482.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:991,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GYqF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797702d6-6fe1-4591-81ca-b05f3392db17_2178x1482.png 424w, https://substackcdn.com/image/fetch/$s_!GYqF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797702d6-6fe1-4591-81ca-b05f3392db17_2178x1482.png 848w, https://substackcdn.com/image/fetch/$s_!GYqF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797702d6-6fe1-4591-81ca-b05f3392db17_2178x1482.png 1272w, 
https://substackcdn.com/image/fetch/$s_!GYqF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797702d6-6fe1-4591-81ca-b05f3392db17_2178x1482.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">&#8220;Behavior of Striped Attention as applied to the same causal self-attention problem as above. Unlike in Ring Attention, every block contains tokens distributed throughout every part of the original input sequence. As in Figure 2, the square matrices under each device depict the causal mask encountered by each device on each round. 
We note that the causal masks provide each device with a roughly equal portion of skippable work on each iteration, resolving the workload imbalance in Ring Attention&#8221;</figcaption></figure></div><h4>Why Does Ring Attention Break Down During Decode?</h4><p>In prefill, you are processing massive blocks of tokens simultaneously. The arithmetic intensity is enormous: thousands of FLOPs for every byte of data. The compute takes a long time relative to the network transfer, so the network has plenty of time to quietly pass the KV blocks in the background. The overlap condition holds comfortably.</p><p>Let&#8217;s make this concrete with the numbers from Meta&#8217;s <em><a href="https://arxiv.org/abs/2411.01783">&#8220;Context Parallelism for Scalable Million-Token Inference&#8221;</a></em> paper. They report processing a 1-million-token prefill on Llama 3 405B in 77 seconds across 16 nodes (128 H100 GPUs), achieving 93% parallelization efficiency and 63% FLOPS utilization. That&#8217;s near-linear scaling for prefill. The overlap condition holds because each device is grinding through enormous local attention blocks that take long enough to fully hide the inter-node transfers.</p><p>But if you&#8217;ve been reading our guides on inference, you know that inference has another stage: decode. 
So, what happens during Decode?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yDD6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5619307b-731a-411e-9e16-7ae708cb363b_1456x1092.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yDD6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5619307b-731a-411e-9e16-7ae708cb363b_1456x1092.jpeg 424w, https://substackcdn.com/image/fetch/$s_!yDD6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5619307b-731a-411e-9e16-7ae708cb363b_1456x1092.jpeg 848w, https://substackcdn.com/image/fetch/$s_!yDD6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5619307b-731a-411e-9e16-7ae708cb363b_1456x1092.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!yDD6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5619307b-731a-411e-9e16-7ae708cb363b_1456x1092.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yDD6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5619307b-731a-411e-9e16-7ae708cb363b_1456x1092.jpeg" width="1456" height="1092" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5619307b-731a-411e-9e16-7ae708cb363b_1456x1092.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yDD6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5619307b-731a-411e-9e16-7ae708cb363b_1456x1092.jpeg 424w, https://substackcdn.com/image/fetch/$s_!yDD6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5619307b-731a-411e-9e16-7ae708cb363b_1456x1092.jpeg 848w, https://substackcdn.com/image/fetch/$s_!yDD6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5619307b-731a-411e-9e16-7ae708cb363b_1456x1092.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!yDD6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5619307b-731a-411e-9e16-7ae708cb363b_1456x1092.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In decode, you generate tokens one by one. One token means one query vector. The local attention computation for a single query against even a large KV block finishes almost instantly&#8202;&#8212;&#8202;the arithmetic intensity craters. But you still have to pass the KV blocks around the ring to compute global attention for that one new token.</p><p>Let&#8217;s price the mismatch. For a single decode step on one device:</p><ul><li><p><strong>Compute:</strong> One query vector (d_k = 128 floats) dot-producted against n/p key vectors. For n = 1M tokens across p = 128 GPUs, that&#8217;s about 7,800 key vectors per device. The compute is roughly 2 * 7,800 * 128 = ~2 million FLOPs per head, times 128 heads = ~256 million FLOPs total. 
On an H100 doing ~990 TFLOPS (FP16), that takes roughly <strong>0.26 microseconds</strong>.</p></li><li><p><strong>Transfer:</strong> The KV block that needs to move to the next device is n/p * d_k * 2 (K and V) * 2 bytes (FP16) = 7,800 * 128 * 2 * 2 = ~4 MB per KV head (one GQA group). Across the 8 KV heads that Llama 3 405B uses under GQA: ~32 MB. On 400 Gb/s InfiniBand (~50 GB/s), that takes about <strong>0.64 milliseconds</strong>.</p></li></ul><p>The compute finishes in microseconds. The transfer takes milliseconds. The transfer outlasts the compute by a factor of roughly <strong>2,500x</strong>, so the overlap condition fails. The GPUs finish their math and sit idle, waiting for the network.</p><p>This is the physical bandwidth hierarchy working against you:</p><ul><li><p><strong>GPU HBM bandwidth:</strong> ~3,350 GB/s</p></li><li><p><strong>Intra-node NVLink:</strong> ~900 GB/s</p></li><li><p><strong>Inter-node InfiniBand:</strong> ~50 GB/s</p></li></ul><p>During decode, you have successfully moved the bottleneck off the ultra-fast HBM and placed it directly onto the network cable. Your generation speed becomes constrained by interconnect bandwidth, not GPU math. <strong>According to the </strong><em><strong>&#8220;Context Parallelism&#8221;</strong></em><strong> paper, CP is &#8220;best suited for improving prefill performance&#8221; and decode latency regresses under it.</strong></p><h4>What Is StreamingLLM and How Do Attention Sinks Work?</h4><p>If you don&#8217;t have a multi-node InfiniBand cluster to run exact distributed attention, but you still need to process infinite context (like a 24/7 ambient voice assistant), there is one final, ruthless architectural hack: just delete the middle.</p><p>If you let a KV cache grow without bound, the GPU eventually runs out of memory. But researchers noticed that if you just naively evict the oldest tokens when the cache gets full (a rolling window), the model completely collapses. Perplexity explodes. Not gradual degradation, but catastrophic failure.</p><p>Why? 
Because of Attention Sinks.</p><p><em><a href="https://arxiv.org/abs/2309.17453">&#8220;Efficient Streaming Language Models with Attention Sinks&#8221;</a></em> discovered the mechanism. Under autoregressive causal attention, early tokens become persistent global anchors that absorb surplus attention mass. The reason traces directly to the interaction between causal masking and softmax normalization.</p><p>Here&#8217;s the first-principles chain: in causal attention, token 1 is visible to <em>every</em> subsequent token. Token 2 is visible to all but one. Token 1000 is only visible to tokens after position 1000. This asymmetry means early tokens accumulate vastly more gradient signal during training than late tokens&#8202;&#8212;&#8202;they participate in every attention distribution in the sequence. The model adapts to this by learning to dump &#8220;excess&#8221; attention probability onto these early tokens, effectively using them as a numerical pressure valve for the softmax normalization.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tbtq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc739db57-7e9a-483d-84f8-f0b34c7bc063_1600x422.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tbtq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc739db57-7e9a-483d-84f8-f0b34c7bc063_1600x422.png 424w, https://substackcdn.com/image/fetch/$s_!tbtq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc739db57-7e9a-483d-84f8-f0b34c7bc063_1600x422.png 848w, 
https://substackcdn.com/image/fetch/$s_!tbtq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc739db57-7e9a-483d-84f8-f0b34c7bc063_1600x422.png 1272w, https://substackcdn.com/image/fetch/$s_!tbtq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc739db57-7e9a-483d-84f8-f0b34c7bc063_1600x422.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tbtq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc739db57-7e9a-483d-84f8-f0b34c7bc063_1600x422.png" width="1456" height="384" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c739db57-7e9a-483d-84f8-f0b34c7bc063_1600x422.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:384,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tbtq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc739db57-7e9a-483d-84f8-f0b34c7bc063_1600x422.png 424w, https://substackcdn.com/image/fetch/$s_!tbtq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc739db57-7e9a-483d-84f8-f0b34c7bc063_1600x422.png 848w, 
https://substackcdn.com/image/fetch/$s_!tbtq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc739db57-7e9a-483d-84f8-f0b34c7bc063_1600x422.png 1272w, https://substackcdn.com/image/fetch/$s_!tbtq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc739db57-7e9a-483d-84f8-f0b34c7bc063_1600x422.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">&#8220;Visualization of the average attention logits in Llama-2&#8211;7B over 256 sentences, each with a length of 16. 
Observations include: (1) The attention maps in the first two layers (layers 0 and 1) exhibit the &#8220;local&#8221; pattern, with recent tokens receiving more attention. (2) Beyond the bottom two layers, the model heavily attends to the initial token across all layers and heads.&#8221;</figcaption></figure></div><p>It&#8217;s not that the first tokens contain important information. The <em>&#8220;Attention Sinks&#8221;</em> paper showed this explicitly: you can replace the first four tokens with newline characters (&#8220;\n&#8221;) and perplexity still recovers. The tokens themselves are semantically irrelevant. What matters is their <em>position</em>: they&#8217;ve become structural load-bearing elements of the attention distribution, regardless of content.</p><p>If you delete those sink tokens, the softmax budget loses its pressure valve. Softmax still forces the attention weights to sum to 1, so the mass the sinks used to absorb gets pushed onto the remaining tokens, producing a distribution the model never saw during training. The whole thing collapses.</p><p>The fix is StreamingLLM. You keep the KV cache strictly bounded to a fixed size (e.g., 4,000 tokens). You permanently lock a small number of sink tokens (the first 4 tokens) into the cache so the softmax math stays stable. 
Then you use the remaining slots as a rolling window for the most recent tokens.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c7Mi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a17884-b2e1-428b-9668-cabba3d490a1_1600x526.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c7Mi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a17884-b2e1-428b-9668-cabba3d490a1_1600x526.png 424w, https://substackcdn.com/image/fetch/$s_!c7Mi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a17884-b2e1-428b-9668-cabba3d490a1_1600x526.png 848w, https://substackcdn.com/image/fetch/$s_!c7Mi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a17884-b2e1-428b-9668-cabba3d490a1_1600x526.png 1272w, https://substackcdn.com/image/fetch/$s_!c7Mi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a17884-b2e1-428b-9668-cabba3d490a1_1600x526.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!c7Mi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a17884-b2e1-428b-9668-cabba3d490a1_1600x526.png" width="1456" height="479" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/36a17884-b2e1-428b-9668-cabba3d490a1_1600x526.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:479,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!c7Mi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a17884-b2e1-428b-9668-cabba3d490a1_1600x526.png 424w, https://substackcdn.com/image/fetch/$s_!c7Mi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a17884-b2e1-428b-9668-cabba3d490a1_1600x526.png 848w, https://substackcdn.com/image/fetch/$s_!c7Mi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a17884-b2e1-428b-9668-cabba3d490a1_1600x526.png 1272w, https://substackcdn.com/image/fetch/$s_!c7Mi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a17884-b2e1-428b-9668-cabba3d490a1_1600x526.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">&#8220;Illustration of StreamingLLM vs. existing methods. The language model, pre-trained on texts of length L, predicts the Tth token (T &#8811; L). (a) Dense Attention has O(T&#178;) time complexity and an increasing cache size. Its performance decreases when the text length exceeds the pre-training text length. (b) Window Attention caches the most recent L tokens&#8217; KV. While efficient in inference, performance declines sharply once the starting tokens&#8217; keys and values are evicted. (c) Sliding Window with Re-computation rebuilds the KV states from the L recent tokens for each new token. While it performs well on long texts, its O(TL&#178;) complexity, stemming from quadratic attention in context re-computation, makes it considerably slow. (d) StreamingLLM keeps the attention sink (several initial tokens) for stable attention computation, combined with the recent tokens. It&#8217;s efficient and offers stable performance on extended texts. 
Perplexities are measured using the Llama-2-13B model on the first book (65K tokens) in the PG-19 test set.&#8221;</figcaption></figure></div><p>Everything in the middle is permanently thrown away.</p><p>This gives you a perfectly flat, O(1) memory footprint. The <em>&#8220;Attention Sinks&#8221;</em> paper demonstrated it running stably on Llama-2, MPT, Falcon, and Pythia at up to 4 million tokens with zero degradation in perplexity, and with up to a 22.2x speedup over sliding window recomputation. It can run indefinitely with no latency growth.</p><p>But once again, the fundamental law holds: you paid for your flat memory footprint with precision. The model has zero memory of anything that fell outside the rolling window. If a critical fact appeared at token 50,000 and the window only holds the last 4,000, that fact is gone.</p><p>In other words, StreamingLLM is not a context extension method. It&#8217;s a context <em>amputation</em> method that keeps the patient alive.</p><p>We&#8217;ve covered a lot of techniques here. So, to close, let&#8217;s break down what each one actually costs.</p><h1>Section 6: What Are the Costs of the Different Transformer Variants?</h1><p>We have spent five sections tearing apart the linear algebra of the Transformer, the differential equations of Mamba, the associative math of Linear Attention, and the bandwidth limits of InfiniBand.</p><p>Theoretical discussions are nice, but deployment is a physical game of fit. You either have the VRAM to serve 1,000 concurrent users, or you go bankrupt paying $2.50/hour for an H100 that is serving 4 people. 
Every equation we derived in this series terminates in the same place: a line item on someone&#8217;s cloud bill, a number of users per GPU, a price per token.</p><p>To see exactly how the &#8220;Quadratic Tax&#8221; destroys margins&#8202;&#8212;&#8202;and to prove why the escape routes we just covered are existentially necessary&#8202;&#8212;&#8202;we are going to run a stress test.</p><h4>What Are the Constraints of Our Stress Test?</h4><p>We will assume a standard 80 GB NVIDIA H100. We subtract a strict 6 GB overhead for the runtime, memory allocators, and temporary activations. That leaves us roughly <strong>74 GB of usable VRAM</strong>.</p><p>We will test three proxy models at INT4 weight quantization (to maximize the space left for the cache):</p><ul><li><p><strong>1B Model</strong> (d = 2048, L = 16, g = 8, d_k = 64). Weights: ~0.6 GB at INT4.</p></li><li><p><strong>3B Model</strong> (d = 3072, L = 24, g = 8, d_k = 128). Weights: ~1.8 GB at INT4.</p></li><li><p><strong>70B Model</strong> (d = 8192, L = 80, g = 8, d_k = 128). Weights: ~35 GB at INT4.</p></li></ul><p>We will use INT8 quantization for the KV cache (1 byte per number).</p><p>The KV cache formula, carried forward from Part 1:</p><p><strong>KV bytes per user = L * g * d_k * n * 2 (K and V) * B_kv</strong></p><p>Where L is layers, g is KV head count, d_k is head dimension, n is context length, and B_kv is bytes per value (1 byte for INT8). Let&#8217;s sanity-check one cell before we trust the numbers: for the 70B model at 128K context, that&#8217;s 80 * 8 * 128 * 128,000 * 2 * 1 = 20,971,520,000 bytes = <strong>~20.97 GB</strong>. One user. Just the cache.</p><h4>How Does KV Cache Size Scale With Context Length?</h4><p>Watch what happens to the KV cache per user (INT8 quantization) as we stretch context from 4K to 1 million tokens.</p><ul><li><p><strong>1B model</strong> (weights: ~0.6 GB): 4K = <strong>0.07 GB</strong>. 32K = <strong>0.52 GB</strong>. 128K = <strong>2.10 GB</strong>. 
1M = <strong>16.38 GB</strong>. The cache at 1 million tokens is 27x larger than the model itself. The thing that stores the context has completely consumed the machine that does the thinking.</p></li><li><p><strong>3B model</strong> (weights: ~1.8 GB): 4K = <strong>0.20 GB</strong>. 32K = <strong>1.57 GB</strong>. 128K = <strong>6.29 GB</strong>. 1M = <strong>49.15 GB</strong>. At 128K context, the cache is already 3.5x the weight of the model. At 1M, a single user&#8217;s cache eats 60% of the entire H100.</p></li><li><p><strong>70B model</strong> (weights: ~35 GB): 4K = <strong>0.66 GB</strong>. 32K = <strong>5.24 GB</strong>. 128K = <strong>20.97 GB</strong>. 1M = <strong>163.84 GB</strong>. The H100 only holds 80 GB. At 1 million tokens, a single user&#8217;s KV cache is twice the size of the entire GPU. You are physically dead in the water before you load the first weight.</p></li></ul><h4>How Does Context Length Destroy Concurrency?</h4><p>The size of the cache dictates your batch size. Every user gets their own cache. Divide your remaining VRAM (after weights and overhead) by the per-user cache size, and you get the absolute hardware ceiling for concurrent users on a single 80 GB H100 (INT4 weights, INT8 KV cache).</p><p><strong>1B model:</strong> 4K = <strong>~1,112 users</strong>. 32K = <strong>~140 users</strong>. 128K = <strong>~35 users</strong>. 1M = <strong>~4 users</strong>.</p><p><strong>3B model:</strong> 4K = <strong>~366 users</strong>. 32K = <strong>~46 users</strong>. 128K = <strong>~11 users</strong>. 1M = <strong>~1 user</strong>.</p><p><strong>70B model:</strong> 4K = <strong>~59 users</strong>. 32K = <strong>~7 users</strong>. 128K = <strong>~1 user</strong>. 
1M = <strong>doesn&#8217;t fit</strong>.</p><p>This is the Concurrency Collapse that Part 1 warned you about.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GCCK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6badfda8-fc34-4832-83be-f9fba0c73383_1220x642.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GCCK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6badfda8-fc34-4832-83be-f9fba0c73383_1220x642.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GCCK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6badfda8-fc34-4832-83be-f9fba0c73383_1220x642.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GCCK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6badfda8-fc34-4832-83be-f9fba0c73383_1220x642.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GCCK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6badfda8-fc34-4832-83be-f9fba0c73383_1220x642.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GCCK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6badfda8-fc34-4832-83be-f9fba0c73383_1220x642.jpeg" width="1220" height="642" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6badfda8-fc34-4832-83be-f9fba0c73383_1220x642.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:642,&quot;width&quot;:1220,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GCCK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6badfda8-fc34-4832-83be-f9fba0c73383_1220x642.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GCCK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6badfda8-fc34-4832-83be-f9fba0c73383_1220x642.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GCCK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6badfda8-fc34-4832-83be-f9fba0c73383_1220x642.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GCCK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6badfda8-fc34-4832-83be-f9fba0c73383_1220x642.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em><a href="https://www.artificialintelligencemadesimple.com/p/how-one-startup-is-breaking-nvidias?utm_source=publication-search">Many benchmarks use single session/user benchmarks for inference, but the trend is towards deprecating single-user/session benchmarking results like these, because the lack of concurrency is too far removed from real-world production inference and/or agentic workload scenarios. We do it for simplicity and additional data, but consider that factor when looking at the numbers. Concurrency is, in other words, a big factor to account for currently, and not enough teams deal with that.</a></em></figcaption></figure></div><p>If you run a 70B model at 4K context, you can fit 59 concurrent users on a single H100. At $2.50/hour, your per-user hardware cost is roughly <strong>$0.04/hour</strong>. Highly profitable. 
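</p><p>If you want to re-derive these cells yourself, here is a minimal Python sketch of the sizing arithmetic above. It assumes decimal units (1 GB = 10^9 bytes, 4K = 4,000 tokens), which is what the numbers in this section imply, and the function names are illustrative rather than taken from any library.</p>

```python
# Back-of-envelope KV cache sizing, using the formula and constants from this section.
GB = 1e9  # decimal gigabytes (assumption: matches the article's rounding)


def kv_cache_bytes(layers, kv_heads, head_dim, context_len, bytes_per_value=1):
    """KV bytes per user = L * g * d_k * n * 2 (K and V) * B_kv."""
    return layers * kv_heads * head_dim * context_len * 2 * bytes_per_value


def max_users(usable_gb, weights_gb, cache_bytes):
    """Memory-only ceiling on concurrent users: free VRAM divided by per-user cache."""
    free_bytes = (usable_gb - weights_gb) * GB
    return int(free_bytes // cache_bytes)


# 70B proxy model: L = 80, g = 8, d_k = 128, INT8 cache, ~35 GB of INT4 weights.
for n in (4_000, 32_000, 128_000, 1_000_000):
    cache = kv_cache_bytes(80, 8, 128, n)
    users = max_users(74, 35, cache)
    print(f"n={n:9,}  cache={cache / GB:7.2f} GB  users={users}")
```

<p>Swapping in the 1B or 3B proxy dimensions should land close to the other rows; small differences come from rounding the per-user cache size before dividing.</p><p>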
The kind of unit economics that makes a VC&#8217;s eyes light up with dollar signs.</p><p>If you offer a 128K context feature, you can fit exactly <strong>1 user</strong> on that GPU. Your hardware cost instantly spikes to <strong>$2.50/hour per user</strong>. You haven&#8217;t changed the model. You haven&#8217;t changed the hardware. You just changed the context length, and your per-user hardware cost became <strong>59x higher</strong>.</p><p>That&#8217;s not a scaling problem. That&#8217;s a business model collapse. And it explains why so many API providers price long-context requests at a steep premium: they&#8217;re not gouging you, they&#8217;re passing along the physical cost of a KV cache that grows with every token of context.</p><p>Let&#8217;s convert this to the number your finance team actually cares about. Assume each user generates ~500 output tokens per request at ~50 tokens/second (10 seconds of generation), and the GPU runs at 70% utilization (real deployments often decode faster, but they also burn extra thinking tokens, fight memory pressure, and spend time on your app&#8217;s input/output processing, so these are reasonable parameters for quick math).</p><p>At <strong>4K context, 70B model:</strong> 59 users * 500 tokens * (3600 / 10) requests per hour * 0.7 utilization = ~7.4M output tokens/hour. At $2.50/hour GPU cost (mid-range cloud rate), that&#8217;s roughly <strong>$0.34 per million output tokens</strong> in raw hardware cost. The retail API price for a model this size is $3&#8211;25/M depending on provider, which means at short context, inference margins are enormous. The hardware cost is a rounding error on the price you charge.</p><p>At <strong>128K context, 70B model:</strong> 1 user * 500 tokens * (3600 / 10) * 0.7 = ~126K output tokens/hour. At $2.50/hour, that&#8217;s roughly <strong>$19.84 per million output tokens</strong> in raw hardware cost alone. That is higher than most of what Anthropic and OpenAI charge retail. 
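</p><p>The same conversion, as a small Python sketch you can rerun with your own assumptions. The defaults mirror the parameters above ($2.50/hour, 500 tokens per request, 10 seconds per request, 70% utilization); the function name is illustrative.</p>

```python
def cost_per_million_tokens(users, gpu_cost_per_hour=2.50, tokens_per_request=500,
                            seconds_per_request=10, utilization=0.7):
    """Raw hardware cost per million output tokens for one GPU serving `users` at once."""
    requests_per_hour = 3600 / seconds_per_request
    tokens_per_hour = users * tokens_per_request * requests_per_hour * utilization
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000


print(round(cost_per_million_tokens(59), 2))  # 4K context, 59 users   -> 0.34
print(round(cost_per_million_tokens(1), 2))   # 128K context, 1 user   -> 19.84
```

<p>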
You aren&#8217;t running a SaaS startup anymore; you are running a philanthropic foundation that yeets VC dollars to your cloud provider just so one power-user can summarize a PDF they aren&#8217;t even going to read.</p><p>And that&#8217;s before you account for the prefill cost on the 128K input.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hxER!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9111b921-2885-4da1-a075-597742887ab7_675x499.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hxER!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9111b921-2885-4da1-a075-597742887ab7_675x499.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hxER!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9111b921-2885-4da1-a075-597742887ab7_675x499.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hxER!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9111b921-2885-4da1-a075-597742887ab7_675x499.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hxER!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9111b921-2885-4da1-a075-597742887ab7_675x499.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hxER!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9111b921-2885-4da1-a075-597742887ab7_675x499.jpeg" width="675" height="499" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9111b921-2885-4da1-a075-597742887ab7_675x499.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:499,&quot;width&quot;:675,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hxER!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9111b921-2885-4da1-a075-597742887ab7_675x499.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hxER!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9111b921-2885-4da1-a075-597742887ab7_675x499.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hxER!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9111b921-2885-4da1-a075-597742887ab7_675x499.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hxER!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9111b921-2885-4da1-a075-597742887ab7_675x499.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/the-real-cost-of-open-source-llms?utm_source=publication-search">When you account for all this + the talent costs, most orgs are better off paying for API providers and doing work around them instead of using their own models</a>.</figcaption></figure></div><p>This is why long-context inference is the single most important economic problem in AI infrastructure today. Because the <em>money</em> doesn&#8217;t work at scale.</p><h4>How Do the Escape Routes Change These Numbers?</h4><p>Now let&#8217;s apply the architectural escape routes we spent this article dissecting. Take the worst case from our numbers: the 70B model at 128K context, currently fitting 1 user on an H100 with a 20.97 GB KV cache.</p><p><strong>Escape Route 1: Low-Rank KV Compression / MLA (Section 1)</strong></p><p>In our earlier section on KV cache optimization, we covered DeepSeek-V2&#8217;s Multi-head Latent Attention. 
Instead of caching the full Key and Value tensors, MLA projects each token down to a compressed latent vector and reconstructs the Keys and Values on the fly during attention. The 2 * g * d_k storage term gets replaced by a much smaller compressed dimension d_c.</p><p>DeepSeek reported a 93.3% reduction in KV cache size relative to its previous 67B model. Applying that ratio to our 70B scenario as a rough proxy:</p><p>New KV cache: 20.97 GB * 0.067 = <strong>~1.40 GB per user</strong>.</p><p>Concurrency: (74 GB - 35 GB weights) / 1.40 GB = <strong>~27 users</strong>.</p><p>$/M output tokens: drops from $19.84 to roughly <strong>$0.73</strong>.</p><p><strong>What you lost:</strong> Compute. Reconstructing Keys and Values from the compressed latent requires an extra matrix multiplication every decode step: you&#8217;re trading memory capacity for arithmetic intensity. And the compression breaks standard RoPE position embeddings, forcing you into decoupled RoPE strategies that add implementation complexity. The quality numbers hold if the compression rank is chosen carefully. If you get greedy and compress too aggressively, retrieval precision degrades the same way it does in any lossy compression scheme.</p><p><strong>Escape Route 2: Hybrid Transformers (Section 4)</strong></p><p>Jamba-style architecture: 1 Attention layer for every 7 Mamba layers. L_attn drops from 80 to 10.</p><p>New KV cache: 10 * 8 * 128 * 128,000 * 2 * 1 = <strong>~2.62 GB per user</strong>. Plus Mamba recurrent states across 70 layers: ~14 MB per user, negligible.</p><p>Concurrency: (74 GB - 35 GB weights) / 2.62 GB = <strong>~14 users</strong>.</p><p>$/M output tokens: drops from $19.84 to roughly <strong>$1.42</strong>.</p><p><strong>What you lost:</strong> Exact recall only fires at every 8th layer. Tasks requiring verbatim retrieval (citation, code search, legal discovery) degrade depending on needle survival distance. 
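</p><p>As a rough check on Escape Routes 1 and 2, this Python sketch recomputes the per-user cache and the concurrency ceiling from the same 70B baseline. It treats the 93.3% MLA reduction as a flat multiplier and ignores the small Mamba recurrent state, both simplifications; the names are illustrative.</p>

```python
GB = 1e9
FREE_GB = 74 - 35  # usable VRAM minus the 70B INT4 weights, in GB

# Full-attention baseline at 128K context with an INT8 cache: about 20.97 GB.
BASELINE_GB = 80 * 8 * 128 * 128_000 * 2 * 1 / GB


def users_for(cache_gb):
    """Memory-only concurrency ceiling for a given per-user cache size."""
    return int(FREE_GB // cache_gb)


mla_gb = BASELINE_GB * (1 - 0.933)   # assumption: the reported reduction applies as-is
hybrid_gb = BASELINE_GB * 10 / 80    # only 10 of the 80 layers keep a KV cache

print(f"MLA:    {mla_gb:.2f} GB per user, {users_for(mla_gb)} users")
print(f"Hybrid: {hybrid_gb:.2f} GB per user, {users_for(hybrid_gb)} users")
```

<p>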
Your serving stack needs a 2&#8211;4 month scheduler rewrite for the dual memory pool (Section 4). Kernel switching overhead eats ~10&#8211;15% of your theoretical FLOP savings at the architectural seams. And you&#8217;re retraining or fine-tuning from a Hybrid checkpoint, which means weeks of GPU time before you can even start serving.</p><p><strong>Escape Route 3: Pure Mamba / Linear Attention (Sections 2 &amp; 3)</strong></p><p>Rip Attention out entirely. The KV cache drops to zero. Replace it with a fixed recurrent state (Mamba) or a fixed fast-weight grid (Linear Attention, ~52 MB as priced in Section 3).</p><p>Per-user state: <strong>~20 MB</strong> (Mamba) or <strong>~168 MB</strong> (Linear Attention at this model scale: 80 layers, 64 heads, d_k = 128).</p><p>Concurrency with Mamba: (74 GB - 35 GB weights) / 0.020 GB = theoretically <strong>~1,950 users</strong>. In practice you hit compute throughput limits long before memory limits. But memory is no longer the binding constraint at any context length. That alone changes the entire economics.</p><p>Concurrency with Linear Attention: (74 GB - 35 GB weights) / 0.168 GB = <strong>~232 users</strong>.</p><p>$/M output tokens: craters to roughly $0.01 (Mamba) to $0.09 (Linear Attention) at the memory-bound ceiling, under the same throughput assumptions as before.</p><p><strong>What you lost:</strong> Exact retrieval. A pure Mamba model at 128K context cannot do needle-in-haystack with the same precision as full Attention. Feature collision worsens with sequence length: the longer the context, the worse the compression artifacts get. For workloads requiring verbatim recall (legal, code, medical records), this is a categorical capability loss, not a slight quality tradeoff. 
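</p><p>For the fixed-state routes, the per-user footprint has no context-length term at all. A minimal Python sketch of that arithmetic, assuming 2 bytes per state value (which is what the ~168 MB figure implies) and taking the ~20 MB Mamba state as given; the function names are illustrative.</p>

```python
GB = 1e9
FREE_BYTES = (74 - 35) * GB  # usable VRAM minus the 70B INT4 weights


def linear_attention_state_bytes(layers, heads, head_dim, bytes_per_value=2):
    """One head_dim x head_dim fast-weight matrix per head per layer; no n anywhere."""
    return layers * heads * head_dim * head_dim * bytes_per_value


def users_for(state_bytes):
    """Memory-only ceiling; compute throughput will bind long before this."""
    return int(FREE_BYTES // state_bytes)


la_state = linear_attention_state_bytes(80, 64, 128)  # 167,772,160 bytes, about 168 MB
mamba_state = 20_000_000                               # ~20 MB, taken from the text as given

print(users_for(la_state))     # 232
print(users_for(mamba_state))  # 1950
```

<p>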
You also need a full retrain from scratch on the new architecture, and you inherit thinner ecosystem tooling, fewer battle-tested serving frameworks, and custom CUDA kernels that are sensitive to quantization drift (Section 2).</p><p><strong>Escape Route 4: Distributed Exact Attention (Section 5)</strong></p><p>Refuse to compromise on recall. Shard the sequence across multiple GPUs using Ring Attention / Context Parallelism. Keep the full, unmodified O(n&#178;) attention.</p><p>Spread the 20.97 GB KV cache across p = 4 GPUs (staying within a single node on NVLink). Each device holds ~5.2 GB of cache. With model weights distributed via TP4, each GPU holds ~8.75 GB of weights.</p><p>Per-device memory: 8.75 GB weights + 5.2 GB cache + ~6 GB overhead = ~20 GB. Fits comfortably on an 80 GB H100 with room for multiple users per device.</p><p>Concurrency per 4-GPU node: roughly <strong>10 users</strong>.</p><p><strong>What you gained:</strong> Perfect recall. Zero quality loss. The exact same model, distributed differently.</p><p><strong>What you lost:</strong> Prefill still scales near-linearly across devices, but decode becomes network-bound. Tokens-per-second per user during generation drops hard compared to single-GPU serving, because every decode step requires a KV block round-trip across devices. NVLink within a node (~900 GB/s) is survivable. InfiniBand across nodes (~50 GB/s) is brutal.</p><p>Your $/M tokens doesn&#8217;t improve. It gets worse: you&#8217;re paying for 4 GPUs to serve roughly 10 users, where 1 GPU handled 59 at 4K context. The advantage is you can <em>offer</em> 128K context with real concurrency, which a single GPU physically couldn&#8217;t. Revenue-enablement play, not cost reduction.</p><p><strong>Escape Route 5: StreamingLLM (Section 5)</strong></p><p>Bounded cache of ~4,000 tokens. Lock the first 4 sink tokens. Roll the window forward. 
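</p><p>A toy Python sketch of that eviction policy, with integers standing in for per-token KV entries. The class and its names are hypothetical, and it leaves out details of the real method, such as assigning positions relative to the cache rather than to the original text.</p>

```python
from collections import deque


class SinkCache:
    """Toy bounded cache: the first `num_sinks` entries are pinned, the rest roll."""

    def __init__(self, capacity=4000, num_sinks=4):
        self.num_sinks = num_sinks
        self.sinks = []
        # A deque with maxlen silently drops its oldest entry when a new one arrives.
        self.window = deque(maxlen=capacity - num_sinks)

    def append(self, kv_entry):
        if len(self.sinks) != self.num_sinks:
            self.sinks.append(kv_entry)   # the first few tokens are never evicted
        else:
            self.window.append(kv_entry)  # everything else competes for the window

    def contents(self):
        return self.sinks + list(self.window)


cache = SinkCache(capacity=8, num_sinks=4)
for token_id in range(20):
    cache.append(token_id)

print(cache.contents())  # [0, 1, 2, 3, 16, 17, 18, 19]
```

<p>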
Amputate the middle.</p><p>Per-user cache: 70B model at 4K effective window = <strong>~0.66 GB</strong> (same as the 4K row in our original numbers).</p><p>Concurrency: back to <strong>~59 users</strong>. Same as the 4K baseline.</p><p><strong>What you gained:</strong> Infinite input length. Zero latency degradation. O(1) memory. The same economics as 4K context regardless of how long the conversation runs. Up to a 22.2x speedup over sliding window recomputation.</p><p><strong>What you lost:</strong> Everything outside the window. If a user mentioned something 50,000 tokens ago, it&#8217;s gone. For multi-turn chatbots where only the recent conversation matters, this works. For anything requiring full document recall, it&#8217;s useless.</p><h4>What Does This All Mean?</h4><p>We have hit the bottom of the stack. We traced the quadratic tax from the raw algebra of the Softmax function, to the exact byte size of the KV cache, to the arithmetic intensity of the SRAM, to the physical bandwidth of the network cables, and finally to the dollar cost per million tokens.</p><p>One law unifies every architecture we covered:</p><p><strong>During generation, the bottleneck is bytes moved, not math computed.</strong></p><p>Every architecture answers the same question differently: <em>which bytes are you willing to not move?</em></p><ul><li><p><strong>Standard Attention</strong> moves all the bytes. Perfect recall. KV cache grows with n. Eventually crushes your concurrency and your margins.</p></li><li><p><strong>MLA / Low-Rank Compression</strong> moves fewer bytes by compressing KV storage into a latent vector and reconstructing on the fly. You trade memory for compute and accept the engineering complexity of decoupled positional encodings.</p></li><li><p><strong>Hybrids</strong> move fewer KV bytes by replacing most attention layers with SSM or Linear Attention blocks that compress context into a fixed state. 
You gain concurrency, lose some retrieval precision, and inherit a serving infrastructure headache.</p></li><li><p><strong>Mamba / Linear Attention</strong> move a fixed number of bytes regardless of context length. Memory stops being the binding constraint entirely. Retrieval quality degrades the longer the context gets.</p></li><li><p><strong>Distributed Attention</strong> moves the bytes across network cables instead of memory buses. Perfect recall at any length. Prefill scales beautifully. Decode becomes a network bandwidth problem. You pay with hardware cost.</p></li><li><p><strong>StreamingLLM</strong> moves almost nothing. Flat O(1) memory. Infinite input length. Everything outside the rolling window is amputated forever.</p></li></ul><h1>Conclusion: What Happens When You Don&#8217;t Have 80 GB of VRAM?</h1><p>We started this series with a single matrix multiplication&#8202;&#8212;&#8202;Q times K-transpose&#8202;&#8212;&#8202;and traced its consequences all the way down to the dollar cost of serving one user. Along the way we broke open five architectures, priced every state tensor and cache entry in bytes, and watched each one solve the quadratic tax by sacrificing something else. Memory for precision. Precision for memory. Elegance for infrastructure complexity. The unifying law held at every layer of the stack: during generation, you are bottlenecked by the bytes you move, not the math you compute.</p><p>But every architecture we&#8217;ve covered was designed for the same environment: a data center rack with 80 GB H100s, terabytes-per-second memory bandwidth, and a power budget that nobody in the building actually tracks. The tradeoffs we mapped&#8202;&#8212;&#8202;KV cache vs. compression, exact recall vs. fixed state, FLOP reduction vs. kernel switching overhead&#8202;&#8212;&#8202;all assume you have that kind of silicon to trade with.</p><p>But the fastest-growing deployment surface for AI isn&#8217;t the data center. 
It&#8217;s the pocket.</p><p>An iPhone has 4&#8211;6 GB of shared RAM. A terrible memory bus compared to HBM. A battery that dies in hours under sustained compute. The Qualcomm NPU in a midrange Android phone has roughly 1/500th the memory bandwidth of an H100. On these devices, the entire premise of this article&#8202;&#8212;&#8202;that you can choose where to spend your memory budget&#8202;&#8212;&#8202;collapses, because there is no budget. There&#8217;s barely enough room to load the weights, let alone maintain any kind of per-user state.</p><p>The architectures that win in the cloud won&#8217;t survive on the edge. Transformers are too memory-hungry even with every optimization we covered. Hybrids still carry a KV cache for their attention layers. Even Mamba&#8217;s recurrent state, negligible at data center scale, becomes a real constraint when your total memory is measured in single-digit gigabytes shared with the operating system, the display buffer, and whatever else the user has open.</p><p>Edge AI needs something different. And the companies building for it&#8202;&#8212;&#8202;Liquid AI resurrecting continuous-time neural networks, the xLSTM lineage pushing gated recurrences below 1B parameters, RWKV attempting to make linear attention work at mobile scale&#8202;&#8212;&#8202;are solving a constraint set that this article never touched: power per token, latency under thermal throttling, and inference quality at model sizes where every megabyte of state is a megabyte stolen from something else.</p><p>That&#8217;s the next subject of the deep dive. 
We&#8217;ll apply the same in-depth math and financial analysis to the scenario when silicon gets small, the power budget gets brutal, and the architectures that looked elegant in an 80 GB world start looking like the wrong tool entirely.</p><p>Thank you for being here, and I hope you have a wonderful day,</p><p>Dev &lt;3</p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines.</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/how-long-context-inference-is-rewriting?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/how-long-context-inference-is-rewriting?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. 
</strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZBD_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16b5634c-caae-48e2-a1b1-4242e9b40ecb_407x155.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZBD_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16b5634c-caae-48e2-a1b1-4242e9b40ecb_407x155.png 424w, https://substackcdn.com/image/fetch/$s_!ZBD_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16b5634c-caae-48e2-a1b1-4242e9b40ecb_407x155.png 848w, https://substackcdn.com/image/fetch/$s_!ZBD_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16b5634c-caae-48e2-a1b1-4242e9b40ecb_407x155.png 1272w, https://substackcdn.com/image/fetch/$s_!ZBD_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16b5634c-caae-48e2-a1b1-4242e9b40ecb_407x155.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZBD_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16b5634c-caae-48e2-a1b1-4242e9b40ecb_407x155.png" width="407" height="155" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/16b5634c-caae-48e2-a1b1-4242e9b40ecb_407x155.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:155,&quot;width&quot;:407,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZBD_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16b5634c-caae-48e2-a1b1-4242e9b40ecb_407x155.png 424w, https://substackcdn.com/image/fetch/$s_!ZBD_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16b5634c-caae-48e2-a1b1-4242e9b40ecb_407x155.png 848w, https://substackcdn.com/image/fetch/$s_!ZBD_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16b5634c-caae-48e2-a1b1-4242e9b40ecb_407x155.png 1272w, https://substackcdn.com/image/fetch/$s_!ZBD_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16b5634c-caae-48e2-a1b1-4242e9b40ecb_407x155.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Reach out to me</h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- 
https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast</a></p><p>Check out my other articles on Medium: <a href="https://machine-learning-made-simple.medium.com/">https://machine-learning-made-simple.medium.com/</a></p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[AI Market Report: Feb 2026. 
Ten Frontier Models in 28 Days]]></title><description><![CDATA[How labs differentiate when models converge, what AI coding tools actually change, and why the backlash against AI became a market force]]></description><link>https://www.artificialintelligencemadesimple.com/p/ai-market-report-feb-2026-ten-frontier</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/ai-market-report-feb-2026-ten-frontier</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Fri, 06 Mar 2026 13:16:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!F6jD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Startup Founders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>Every month, we dig through earnings reports, technical papers, benchmark data, product launches, and conversations with builders, founders, and operators to find the structural shifts that actually matter. 
February 2026 was one of the densest months I&#8217;ve seen in this industry. Ten or more frontier models launched in 28 days. $700 billion in hyperscaler capex guidance. A company fired half its workforce, credited AI, and watched its stock surge 24%. The US government blacklisted an AI lab and the public responded by making it the most downloaded app in the country.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F6jD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F6jD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg 424w, https://substackcdn.com/image/fetch/$s_!F6jD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg 848w, https://substackcdn.com/image/fetch/$s_!F6jD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!F6jD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!F6jD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg" width="500" height="578" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:578,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:69715,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/189978024?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F6jD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg 424w, https://substackcdn.com/image/fetch/$s_!F6jD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg 848w, https://substackcdn.com/image/fetch/$s_!F6jD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!F6jD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122eb4e1-4d8f-485c-9fc3-f6d3d43657be_500x578.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>There&#8217;s a lot of noise in any given month. This report is about the signal underneath it. We spent weeks compiling, cross-referencing, and stress-testing the data to identify the trends that will shape decisions for builders, investors, and operators over the next 12 months. These are the most important structural shifts we&#8217;re seeing, and the frameworks for acting on them.</p><p><strong>In this month&#8217;s report:</strong></p><ul><li><p><strong>The Great Commodity Shift</strong> &#8212; You&#8217;re paying 19x more for 0.6 percentage points of performance. So how are labs actually making money? Three strategies, only one of which most people are talking about. 
Also: why US export controls are literally funding the Chinese R&amp;D that makes them irrelevant.</p></li><li><p><strong>The Infrastructure Ceiling</strong> &#8212; The commodity thesis has a trapdoor. We explain what it is, who it traps, and the $100B deal that signals the first real crack in NVIDIA&#8217;s grip on inference.</p></li><li><p><strong>The Developer Event Horizon</strong> &#8212; Block fired half its workforce and got a $4.8B reward. METR says AI tools make developers slower. Both are true. We reconcile them, and make an architectural prediction about the future of software development that I haven&#8217;t seen anyone else make yet.</p></li><li><p><strong>The Backlash Becomes a Market Force</strong> &#8212; I&#8217;m one of Anthropic&#8217;s harshest critics. I wrote an article telling people to support them. What happened next, and what it means for every industry the AI boom left behind.</p></li></ul><p>To access the full article&#8212;and all premium breakdowns going forward/written prior&#8212;upgrade to a premium subscription below.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><p>Each piece is rigorously researched, built from firsthand signals, and written to make you sharper than the noise. It takes a long time for me to compile, analyze, and verify research. 
If you believe deep insight deserves support, become a premium subscriber to allow me to keep doing the same.</p><p>Flexible pricing available&#8212;<a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">pay what matches your budget here</a>.</p><p><em><strong>Most companies offer learning or professional development budgets. <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">You can expense this subscription using the email template linked here</a>.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x6f2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ccc9c02-9ab0-4fcb-aee6-8a8597b16a90_570x197.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x6f2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ccc9c02-9ab0-4fcb-aee6-8a8597b16a90_570x197.png 424w, https://substackcdn.com/image/fetch/$s_!x6f2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ccc9c02-9ab0-4fcb-aee6-8a8597b16a90_570x197.png 848w, https://substackcdn.com/image/fetch/$s_!x6f2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ccc9c02-9ab0-4fcb-aee6-8a8597b16a90_570x197.png 1272w, https://substackcdn.com/image/fetch/$s_!x6f2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ccc9c02-9ab0-4fcb-aee6-8a8597b16a90_570x197.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!x6f2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ccc9c02-9ab0-4fcb-aee6-8a8597b16a90_570x197.png" width="570" height="197" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6ccc9c02-9ab0-4fcb-aee6-8a8597b16a90_570x197.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:197,&quot;width&quot;:570,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x6f2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ccc9c02-9ab0-4fcb-aee6-8a8597b16a90_570x197.png 424w, https://substackcdn.com/image/fetch/$s_!x6f2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ccc9c02-9ab0-4fcb-aee6-8a8597b16a90_570x197.png 848w, https://substackcdn.com/image/fetch/$s_!x6f2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ccc9c02-9ab0-4fcb-aee6-8a8597b16a90_570x197.png 1272w, https://substackcdn.com/image/fetch/$s_!x6f2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ccc9c02-9ab0-4fcb-aee6-8a8597b16a90_570x197.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div>
      <p>
          <a href="https://www.artificialintelligencemadesimple.com/p/ai-market-report-feb-2026-ten-frontier">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Why We Must All Support Anthropic AIs Stand Against AI-Surveillance and Weapons Systems]]></title><description><![CDATA[How Mass AI Surveillance and AI Weapons Systems Can Threaten Our Freedom]]></description><link>https://www.artificialintelligencemadesimple.com/p/why-we-must-all-support-anthropic</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/why-we-must-all-support-anthropic</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Mon, 02 Mar 2026 05:04:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!z-Pv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. <strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros w/o your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>- <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>To <a href="https://artificialintelligencemadesimple.substack.com/p/busting-unions-with-ai-how-amazon">reuse the disclaimer from our Amazon article</a>: <em>This one&#8217;s all me. Not my coworkers, clients, the chocolate milk cult, my fight club, and/or my halal guy. I do this completely alone, and if there&#8217;s fallout, it lands here, and nowhere else. My words, my responsibility.</em></p><p>If you have been following this newsletter for any meaningful amount of time, you know that I have been fighting against the use of AI in Mass Surveillance and Algorithmic Weapons Systems for a few years. We&#8217;ve written articles on how to unravel these systems and why they&#8217;re not good for society, and hosted several livestreams against them. 
I&#8217;ve been particularly critical of Anthropic AI on this front, given their involvement in these systems and their relationship with Palantir, which conflicted with their AI Safety stance:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z-Pv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z-Pv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png 424w, https://substackcdn.com/image/fetch/$s_!z-Pv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png 848w, https://substackcdn.com/image/fetch/$s_!z-Pv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png 1272w, https://substackcdn.com/image/fetch/$s_!z-Pv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z-Pv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png" width="1456" height="994" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:994,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:600685,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/189588530?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!z-Pv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png 424w, https://substackcdn.com/image/fetch/$s_!z-Pv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png 848w, https://substackcdn.com/image/fetch/$s_!z-Pv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png 1272w, https://substackcdn.com/image/fetch/$s_!z-Pv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3121437c-3ccf-4f17-beb6-c2725203832d_1588x1084.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption"><a href="https://www.artificialintelligencemadesimple.com/p/learning-from-anthropic-ceo-dario?utm_source=publication-search">Image Source</a></figcaption></figure></div>
<p>In this article, I want to take a second to talk about why we should all support Anthropic&#8217;s refusal to let their systems be used in Weapons Systems and Mass Surveillance. Before we begin, there are a few disclaimers:</p><ol><li><p>I&#8217;m not affiliated with Anthropic. On the contrary, the Chocolate Milk Cult has had a largely antagonistic relationship with Anthropic given their hypocrisy on many topics (violation of IP, plagiarism, regulatory capture against competitors, doomerism, etc.). 
I have a strong distaste for Dario and disagree with him on most things, but this is one of the areas where I agree with his stance fully. Last year, Anthropic supported these systems so I fought them; now they&#8217;re against these systems so I support them. It&#8217;s that simple.</p></li><li><p>This is not a political article against any administration. I stand against these systems from all governments, irrespective of their political leanings. You can easily see this from our past articles, where we&#8217;ve critiqued multiple regimes.</p></li></ol><p>This article will be structured in two parts. First, I will collect several talking points that are used against Anthropic and our earlier work against the militarization of AI and refute them, point by point. Second, I will reproduce our longer deep dive into why the algorithmic arms race is a massive slippery slope, and why we must all unite against Silicon Valley&#8217;s increasingly authoritarian turn.</p><h2>Part 1: Analyzing the Arguments Against Anthropic AI/Pro AI Militarization</h2><h3><strong>&#8220;It&#8217;s all lawful use. If it&#8217;s legal, what&#8217;s the problem?&#8221;</strong> </h3><p>People love throwing this one around, but it misses the point entirely. &#8220;Lawful&#8221; isn&#8217;t some magical, fixed safety net&#8212;it&#8217;s a political boundary. Laws shift, emergency powers get expanded, and definitions stretch whenever it&#8217;s convenient for the people in charge. If the law were actually a rock-solid moral compass, we wouldn&#8217;t spend half our lives arguing over contract clauses. Everyone intuitively knows this: what&#8217;s perfectly legal today can easily become tomorrow&#8217;s ethical nightmare.</p><p>When you&#8217;re building tech with irreversible consequences&#8212;like mass surveillance or lethal autonomy&#8212;you can&#8217;t just cross your fingers and trust the current legal code. You need actual, hard-coded architectural constraints. 
And let&#8217;s not forget that legality and morality aren&#8217;t the same thing. At various points in history, some of the worst human rights abuses were completely legal <strong>and actively enforced.</strong> We can&#8217;t equate the law with morality, especially when massive commercial interests are on the line.</p><h3><strong>&#8220;Private companies shouldn&#8217;t dictate military policy. Congress should.&#8221;</strong> </h3><p>Fully agreed. But is that really what&#8217;s happening here? </p><p>Ideally, elected officials would handle this stuff transparently. But let&#8217;s be real: governments move at a snail&#8217;s pace and usually struggle to grasp the basic nuance of the tech we&#8217;re building. We still don&#8217;t even have decent laws for basic AI transparency. What actually happens is that massive capabilities get deployed quietly, and regulators only wake up <em>after</em> a major scandal or catastrophe.</p><p>When companies like Anthropic draw a line in the sand, they aren&#8217;t trying to replace democracy. They&#8217;re adding friction. They&#8217;re buying society time to catch up legislatively. If you hate that private companies are setting these boundaries, the answer isn&#8217;t to just rip the guardrails off and pretend that&#8217;s democratic. The answer is to push for actual, coherent laws. Besides, Anthropic is a private business. They can refuse service if they want to. That isn&#8217;t dictating policy; the government is completely free to build its own tools or buy from a vendor who doesn&#8217;t care.</p><h3><strong>&#8220;Just trust the Pentagon. We have ethics frameworks.&#8221;</strong> </h3><p>This sounds great on a slide deck, but &#8220;trust me bro&#8221; isn&#8217;t a robust safety mechanism. Ethics frameworks without hard enforcement are just words on a page, and those words get ignored the second a mission demands speed or a tactical edge. When the contracts reward moving fast, oversight naturally weakens. 
The momentum always pushes toward more surveillance and more autonomy.</p><p>Look at history. Surveillance and military powers don&#8217;t just voluntarily shrink. They find excuses to expand, usually justified at first by promises that they&#8217;ll only be used within a &#8220;limited&#8221; scope. Real restraint requires actual structural constraints and verifiable enforcement. Trust without accountability is just blind optimism, and optimism won&#8217;t stop a catastrophe. Promises you can&#8217;t enforce are just fantasies.</p><h3><strong>&#8220;OpenAI signed a contract. Anthropic is being dramatic.&#8221;</strong> </h3><p>Contracts are fine for clarifying standards, but they aren&#8217;t magic. A contract freezes today&#8217;s intentions in legal text, but it can&#8217;t control the messy reality of downstream implementation from subcontractors, classified integrations, and scope creep. At best, a strong contract gives you the right to sue <em>after</em> something terrible happens.</p><p>Prevention by design is a totally different ballgame. You can&#8217;t just rely on legal paperwork to stop an accident. Effective safety requires upfront design constraints and enforceable accountability. If Anthropic&#8217;s stance feels &#8220;dramatic&#8221; to some people, it&#8217;s because real safety actually requires drawing hard boundaries at the design level, rather than just reacting when things inevitably break.</p><h3><strong>&#8220;If they won&#8217;t give full flexibility, they&#8217;re a national security risk.&#8221;</strong> </h3><p>Watch the sleight of hand here. Originally, &#8220;supply-chain risk&#8221; meant actual threats: espionage, sabotage, getting hacked. 
Now, the term is being hijacked to describe companies as &#8220;risky&#8221; simply because they refuse to hand over unrestricted access to highly sensitive tech.</p><p>Rebranding an ethical constraint as a security threat doesn&#8217;t make us safer; it&#8217;s just a pressure tactic to shut up the people raising valid concerns. When a company&#8217;s moral restraint gets labeled a national security issue, we&#8217;re shifting the Overton window in a really dangerous way. It normalizes reckless behavior and creates a terrible legal precedent.</p><h3><strong>&#8220;This is about winning wars. Speed wins wars.&#8221;</strong> </h3><p>Speed matters, but only if the thing you&#8217;re building actually works reliably. Frontier AI systems today are still deeply brittle. They hallucinate, they fail under weird distribution shifts, and they make mistakes with total confidence. If a chatbot screws up my code, it&#8217;s annoying. If an autonomous system screws up at a lethal scale&#8230;</p><p>When you compress decisions into milliseconds, you effectively turn human oversight into a joke. Humans &#8220;in the loop&#8221; just become rubber stamps for the algorithm. Speed without reliability doesn&#8217;t give you a strategic advantage; it just guarantees you&#8217;re going to make massive, irreversible mistakes much faster.</p><h3><strong>&#8220;It&#8217;s just decision support. Humans make the final call.&#8221;</strong> </h3><p>Again, sounds comforting, but it falls apart under pressure. In the real world, &#8220;decision support&#8221; almost always turns into decision dominance. When a human is stressed, tired, and moving fast, they are going to defer to the algorithm&#8217;s probability scores and recommendations. We&#8217;ve already seen this happen with social media moderation, finance, and judicial sentencing.</p><p>In warfare, the stakes and the pressure are a hundred times higher. 
Why would we assume people on a battlefield will somehow be immune to the pull of algorithmic recommendations? Calling it &#8220;decision support&#8221; is often just a convenient way to pass the buck from humans to a black-box algorithm.</p><h3><strong>&#8220;Mass surveillance is a scare word. It&#8217;s just analyzing existing data.&#8221;</strong> </h3><p>Calling it &#8220;analytics&#8221; doesn&#8217;t change what it is. The issue isn&#8217;t whether the data already exists; it&#8217;s the fact that states can now cheaply aggregate and fuse massive datasets to build comprehensive dossiers on people without any individualized suspicion.</p><p>When you aggregate data at that scale, it completely changes the ethical math. Ordinary, harmless consumer data gets weaponized into incredibly invasive portraits of people&#8217;s lives. That kind of large-scale data fusion <em>is</em> mass surveillance, regardless of what PR spin you put on the word &#8220;analytics.&#8221;</p><h3><strong>&#8220;Autonomy could be more humane. It might reduce casualties.&#8221;</strong> </h3><p>Maybe. But only if you assume the tech works perfectly. To actually make war more humane, these systems would need flawless target discrimination, total resilience against hacking, and perfect data. Those aren&#8217;t minor feature requests; they are massive, currently unsolved structural hurdles.</p><p>The people pushing for immediate deployment just gloss over this. They use the humanitarian angle as a justification, but completely ignore the conditions required to actually achieve it. Deploying brittle, error-prone autonomous systems right now isn&#8217;t humane. It&#8217;s reckless.</p><h3><strong>&#8220;Tech has always had military roots. This is just how innovation works.&#8221;</strong> </h3><p>This is a really convenient narrative for the defense industry, but it&#8217;s wildly historically inaccurate. Yes, DARPA and the CIA funded some early stuff. 
They are also just highly visible and easy to point to. </p><p>The actual foundational math behind modern computing&#8212;Bayesian inference, calculus, linear algebra, information theory&#8212;didn&#8217;t come out of a Pentagon war room. It came from thousands of academics, public labs, and researchers decentralized all over the world. The military gave it some early cash and applied it, but acting like they are the primary engine of innovation ignores the massive, decentralized academic reality that actually makes this tech possible. The academia bros don&#8217;t get the credit because there isn&#8217;t one institution you can point to, but a lot of the work came from various, non-militarized corners. That isn&#8217;t to say that the military didn&#8217;t contribute. However, to pretend that it&#8217;s the sole driver of tech, and that innovation in tech wouldn&#8217;t exist without it, is wildly inaccurate.</p><h3><strong>&#8220;The military can afford long failure horizons. Private markets can&#8217;t. That&#8217;s why defense drives breakthroughs.&#8221;</strong> </h3><p>It sounds intuitive, but it&#8217;s wrong on two fronts. First, patience isn&#8217;t a military monopoly. Universities, the NSF, and public labs fund decades-long research all the time without needing an immediate ROI. That&#8217;s how we got half of modern physics and biotech.</p><p>Second, the length of the horizon isn&#8217;t even the real problem&#8212;it&#8217;s the incentives. The danger isn&#8217;t that the military is patient; it&#8217;s that private VC money now views warfare and surveillance as a massive growth market. When private capital floods into conflict zones, war becomes an optimized, profit-generating industry. And profit incentives don&#8217;t encourage restraint or peace; they demand growth. Once you turn lethal automation into a standard, venture-backed revenue stream, it is almost impossible to shut off.</p><h3><strong>&#8220;Business is business. Investors will invest. 
That&#8217;s reality.&#8221;</strong> </h3><p>Treating investment like it&#8217;s just some neutral, passive force is incredibly naive. Where money flows, infrastructure gets built. Ecosystems grow. Entire careers get tied up in maintaining those systems. If capital keeps pouring into surveillance and autonomous weapons, those things stop being &#8220;tools&#8221; and become the foundational operating system of society.</p><p>Investments actively construct the future. They rewire incentives and institutional cultures. When investors pour billions into invasive capabilities, they aren&#8217;t just &#8220;following an opportunity&#8221;&#8212;they are actively building a world that will resist transparency and oversight down the line. Claiming &#8220;business is business&#8221; isn&#8217;t realism; it&#8217;s cowardice, a way of ducking moral responsibility.</p><h3><strong>&#8220;If we don&#8217;t build it, adversaries will.&#8221;</strong> </h3><p>The ultimate trump card. It sounds compelling, but it&#8217;s an emotionally manipulative race to the bottom. Yes, adversaries will try to build dangerous things. But the question is how we respond: do we match their recklessness and accelerate the race, or do we establish clear boundaries?</p><p>The people leading frontier tech have massive leverage over the whole ecosystem. When the leaders at the top signal that anything goes, that permissiveness trickles down to everyone else. But when those same leaders draw hard ethical lines against things like mass surveillance or lethal autonomy, those standards propagate too. Saying &#8220;others will do it&#8221; doesn&#8217;t absolve you of responsibility&#8212;it&#8217;s exactly why we need responsible leadership right now. Standards propagate, but so does recklessness.</p><div><hr></div><p>These were the major talking points I found. If you want me to talk about any other arguments or address anything else, lmk and I will add them here. 
Up next is our more general critique of why Silicon Valley&#8217;s embrace of algorithmic surveillance and weapons systems must be fought against. </p><div><hr></div><p>Originally published here&#8212; &#8220;<strong><a href="http://artificialintelligencemadesimple.com/p/algorithmic-arms-race-how-tech-is">Algorithmic Arms Race: How Tech is Fueling Weapons Systems and Mass Surveillance</a>.&#8221;</strong></p><p>This Friday three tech titans&#8202;&#8212;&#8202;an ex-OpenAI engineer, a Meta VP, and a Palantir product chief&#8202;&#8212;&#8202;raised their right hands and swore into a brand-new Army Reserve &#8220;innovation detachment.&#8221; <a href="https://www.anduril.com/article/anduril-awarded-10-year-642m-program-of-record-to-deliver-cuas-systems-for-u-s-marine-corps/?utm_source=chatgpt.com">Earlier this year, the Pentagon inked a </a><strong><a href="https://www.anduril.com/article/anduril-awarded-10-year-642m-program-of-record-to-deliver-cuas-systems-for-u-s-marine-corps/?utm_source=chatgpt.com">$642 million counter-drone</a></strong><a href="https://www.anduril.com/article/anduril-awarded-10-year-642m-program-of-record-to-deliver-cuas-systems-for-u-s-marine-corps/?utm_source=chatgpt.com"> contract with Anduri</a>l, while the Pentagon <strong><a href="https://news.usni.org/2024/03/11/pentagon-will-spend-1b-on-first-round-of-replicator-drones">cleared the first $1 billion</a></strong><a href="https://news.usni.org/2024/03/11/pentagon-will-spend-1b-on-first-round-of-replicator-drones"> for autonomous drones</a>.</p><p>These aren&#8217;t edge cases. 
As we covered in our <a href="https://artificialintelligencemadesimple.substack.com/p/capitalizing-on-ais-defense-pivot">AI Market Report for May</a>, the renaming of the &#8220;AI Safety Institute&#8221; to the &#8220;<strong>Center for AI Standards and Innovation&#8221; </strong>was a clear indication that the government was taking a much more aggressive stance in investing in AI for war.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xv6T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9285e04-3fcb-408d-83cd-adb5dfc8c521_1000x723.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xv6T!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9285e04-3fcb-408d-83cd-adb5dfc8c521_1000x723.png 424w, https://substackcdn.com/image/fetch/$s_!Xv6T!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9285e04-3fcb-408d-83cd-adb5dfc8c521_1000x723.png 848w, https://substackcdn.com/image/fetch/$s_!Xv6T!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9285e04-3fcb-408d-83cd-adb5dfc8c521_1000x723.png 1272w, https://substackcdn.com/image/fetch/$s_!Xv6T!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9285e04-3fcb-408d-83cd-adb5dfc8c521_1000x723.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Xv6T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9285e04-3fcb-408d-83cd-adb5dfc8c521_1000x723.png" width="1000" 
height="723" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e9285e04-3fcb-408d-83cd-adb5dfc8c521_1000x723.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:723,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Xv6T!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9285e04-3fcb-408d-83cd-adb5dfc8c521_1000x723.png 424w, https://substackcdn.com/image/fetch/$s_!Xv6T!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9285e04-3fcb-408d-83cd-adb5dfc8c521_1000x723.png 848w, https://substackcdn.com/image/fetch/$s_!Xv6T!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9285e04-3fcb-408d-83cd-adb5dfc8c521_1000x723.png 1272w, https://substackcdn.com/image/fetch/$s_!Xv6T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9285e04-3fcb-408d-83cd-adb5dfc8c521_1000x723.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://artificialintelligencemadesimple.substack.com/p/capitalizing-on-ais-defense-pivot">Between this and the Nvidia callout, we&#8217;re already 2/3 on the predictions</a>.</figcaption></figure></div><p>Most concerning to me is not the government&#8217;s inclinations, however (governments do as they do), but how openly Silicon Valley is embracing this. Gone are the days of anti-establishment &#8220;hackers&#8221; (romanticized and exaggerated as it was), who would rail against Big Brother (like the talk that put Apple on the map). 
The new generation of Builders and Investors seems to be proactively building/funding tech-based Onii-chan.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NAK5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f494caa-be10-412a-b866-63ed2f6b7db2_1000x619.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NAK5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f494caa-be10-412a-b866-63ed2f6b7db2_1000x619.jpeg 424w, https://substackcdn.com/image/fetch/$s_!NAK5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f494caa-be10-412a-b866-63ed2f6b7db2_1000x619.jpeg 848w, https://substackcdn.com/image/fetch/$s_!NAK5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f494caa-be10-412a-b866-63ed2f6b7db2_1000x619.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!NAK5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f494caa-be10-412a-b866-63ed2f6b7db2_1000x619.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NAK5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f494caa-be10-412a-b866-63ed2f6b7db2_1000x619.jpeg" width="1000" height="619" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5f494caa-be10-412a-b866-63ed2f6b7db2_1000x619.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:619,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NAK5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f494caa-be10-412a-b866-63ed2f6b7db2_1000x619.jpeg 424w, https://substackcdn.com/image/fetch/$s_!NAK5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f494caa-be10-412a-b866-63ed2f6b7db2_1000x619.jpeg 848w, https://substackcdn.com/image/fetch/$s_!NAK5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f494caa-be10-412a-b866-63ed2f6b7db2_1000x619.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!NAK5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f494caa-be10-412a-b866-63ed2f6b7db2_1000x619.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>All this creates a supply chain that compresses the gap between sensor and trigger to less than a heartbeat. Once that architecture is locked, every civilian use of AI inherits its assumptions, its surveillance scaffolding, and its hair-trigger incentives.</p><p>Before the concrete sets, we need to interrogate the logic driving this merger&#8202;&#8212;&#8202;steel-man the case for it, then stress-test the foundations. While I will present the arguments for the pros in a good-faith manner, I make no pretensions of objectivity. 
In my opinion, people proactively making this technology happen are some combination of-</p><ol><li><p>In the Government.</p></li><li><p>Greedy.</p></li><li><p>Ignorant of history.</p></li><li><p>Short-sighted.</p></li><li><p>A bitch that doesn&#8217;t have the spine to think about (and work on) a better future.</p></li></ol><p>In case you can&#8217;t tell, I absolutely hate this shift, and this piece is ideological to get you to join in my player-hating.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lHDM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lHDM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png 424w, https://substackcdn.com/image/fetch/$s_!lHDM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png 848w, https://substackcdn.com/image/fetch/$s_!lHDM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png 1272w, https://substackcdn.com/image/fetch/$s_!lHDM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!lHDM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png" width="832" height="290" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:290,&quot;width&quot;:832,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45102,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://artificialintelligencemadesimple.substack.com/i/165988199?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lHDM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png 424w, https://substackcdn.com/image/fetch/$s_!lHDM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png 848w, https://substackcdn.com/image/fetch/$s_!lHDM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png 1272w, https://substackcdn.com/image/fetch/$s_!lHDM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a8c5d7a-e485-41e0-9ae2-0542fcd0bf1a_832x290.png 
1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.afr.com/world/north-america/palantir-s-alex-karp-vaults-to-top-of-best-paid-us-tech-executives-20250312-p5lj32">Image Source</a>. While the sentence, &#8220;I love the idea of getting a drone and having light fentanyl-laced urine spraying on analysts that tried to screw us&#8221;&#8202;&#8212;&#8202;could be passed off as a joke, Karp has exhibited a lot of &#8220;How could you leave me, I&#8217;m such a nice guy&#8221; style vindictive energy towards analysts many times. 
Worth thinking about whether this is the kind of person who should be building dangerous technology.</figcaption></figure></div><p>Let&#8217;s get it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vAC8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42fea113-f467-40d9-b886-abd84f495eba_750x500.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vAC8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42fea113-f467-40d9-b886-abd84f495eba_750x500.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vAC8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42fea113-f467-40d9-b886-abd84f495eba_750x500.jpeg 848w, https://substackcdn.com/image/fetch/$s_!vAC8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42fea113-f467-40d9-b886-abd84f495eba_750x500.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!vAC8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42fea113-f467-40d9-b886-abd84f495eba_750x500.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vAC8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42fea113-f467-40d9-b886-abd84f495eba_750x500.jpeg" width="750" height="500" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/42fea113-f467-40d9-b886-abd84f495eba_750x500.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:750,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vAC8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42fea113-f467-40d9-b886-abd84f495eba_750x500.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vAC8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42fea113-f467-40d9-b886-abd84f495eba_750x500.jpeg 848w, https://substackcdn.com/image/fetch/$s_!vAC8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42fea113-f467-40d9-b886-abd84f495eba_750x500.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!vAC8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42fea113-f467-40d9-b886-abd84f495eba_750x500.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">At least AGI bros pretend to care for humanity.</figcaption></figure></div><h3>Executive Highlights (TL;DR of the Article)</h3><p><strong>1. AI militarization isn&#8217;t neutral tech progress; it&#8217;s a fundamental rewrite of societal infrastructure. </strong>Whoever controls the code and sets the rules shapes every future built on that infrastructure.</p><p><strong>2. Arguments for AI-driven warfare&#8202;&#8212;&#8202;deterrence, precision, inevitability&#8202;&#8212;&#8202;fail under scrutiny. </strong>They dangerously underestimate AI fragility, accelerate algorithmic escalation, and create opaque, unaccountable power structures.</p><p><strong>3. 
Systemic issues run deeper than any one bad policy:</strong></p><ul><li><p><strong>Vendor Capture</strong>: Narrow interests stifle innovation and lock militaries into dangerous, proprietary technologies.</p></li><li><p><strong>Erosion of Democracy</strong>: Secrecy, complexity, and rapid automation cripple civilian oversight.</p></li><li><p><strong>Exported Authoritarianism</strong>: Surveillance tools become Trojan horses for oppression, compromising sovereignty and autonomy.</p></li><li><p><strong>Irreversible Risks</strong>: Autonomous weapons pose existential threats impossible to roll back once normalized.</p></li></ul><p><strong>4. Practical solutions exist&#8202;&#8212;&#8202;imperfect, difficult, yet urgently necessary:</strong></p><ul><li><p>International bans and strict human oversight on lethal AI.</p></li><li><p>Redirect funding and talent from militarized AI towards civilian innovation.</p></li><li><p>Mandate radical transparency and empowered civilian oversight for surveillance systems.</p></li><li><p>Use investment ethics to financially disincentivize harmful AI development.</p></li><li><p>Educate the public and policymakers on AI capabilities and risks.</p></li></ul><p><strong>5. 
None of this is inevitable:</strong><br>Every moment of inaction, every concession to cynicism, hands the pen to interests indifferent to human dignity.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NVBP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8dd17c-ece2-433b-a523-a71bbe279e2b_1000x403.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NVBP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8dd17c-ece2-433b-a523-a71bbe279e2b_1000x403.png 424w, https://substackcdn.com/image/fetch/$s_!NVBP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8dd17c-ece2-433b-a523-a71bbe279e2b_1000x403.png 848w, https://substackcdn.com/image/fetch/$s_!NVBP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8dd17c-ece2-433b-a523-a71bbe279e2b_1000x403.png 1272w, https://substackcdn.com/image/fetch/$s_!NVBP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8dd17c-ece2-433b-a523-a71bbe279e2b_1000x403.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NVBP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8dd17c-ece2-433b-a523-a71bbe279e2b_1000x403.png" width="1000" height="403" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4e8dd17c-ece2-433b-a523-a71bbe279e2b_1000x403.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:403,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NVBP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8dd17c-ece2-433b-a523-a71bbe279e2b_1000x403.png 424w, https://substackcdn.com/image/fetch/$s_!NVBP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8dd17c-ece2-433b-a523-a71bbe279e2b_1000x403.png 848w, https://substackcdn.com/image/fetch/$s_!NVBP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8dd17c-ece2-433b-a523-a71bbe279e2b_1000x403.png 1272w, https://substackcdn.com/image/fetch/$s_!NVBP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8dd17c-ece2-433b-a523-a71bbe279e2b_1000x403.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">I hope the irony of Champions of Small Government, Freedom and Innovation like Peter Thiel getting on their knees for daddy Government is not lost on you. Dystopian, but still really funny. <a href="https://www.biometricupdate.com/202506/ice-advances-sole-source-deal-with-palantir-for-new-surveillance-backbone">Image Source</a></figcaption></figure></div><p>We can&#8202;&#8212;&#8202;and must&#8202;&#8212;&#8202;pick up the pen and write a future worth living in, no matter how messy or incomplete each step may feel.</p><p>Because even imperfect actions taken today create space for better choices tomorrow.</p><p><strong>An additional note to investors and Silicon Valley</strong>- In economics, a defensive expenditure occurs when we spend money on something that does not increase our welfare but is necessary to avoid a decrease in well-being. 
For example, spending money on health insurance does not necessarily make us better off, but it does help protect us from future costs that would negatively affect our well-being.</p><p><strong>Militaries fall squarely into this category:</strong> they don&#8217;t create value by existing; their value is in preventing something worse. So, investing in military technology ultimately means investing in a defensive good&#8212;<strong>unless</strong>, <strong>of course, you&#8217;re waiting for them to plunder someone else</strong>. War aside, this means you&#8217;re investing in a constrained market, which runs squarely against your ethos.</p><p><em>I put a lot of work into writing this newsletter. To do so, I rely on you for support. If a few more people choose to become paid subscribers, the Chocolate Milk Cult can continue to provide high-quality and accessible education and opportunities to anyone who needs it. If you think this mission is worth contributing to, please consider a premium subscription. You can do so for less than the cost of a Netflix Subscription <a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">(pay what you want here)</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dy1_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc70d06c-5328-46f7-b92e-a95352d11158_359x166.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!dy1_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc70d06c-5328-46f7-b92e-a95352d11158_359x166.png 424w, https://substackcdn.com/image/fetch/$s_!dy1_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc70d06c-5328-46f7-b92e-a95352d11158_359x166.png 848w, https://substackcdn.com/image/fetch/$s_!dy1_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc70d06c-5328-46f7-b92e-a95352d11158_359x166.png 1272w, https://substackcdn.com/image/fetch/$s_!dy1_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc70d06c-5328-46f7-b92e-a95352d11158_359x166.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dy1_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc70d06c-5328-46f7-b92e-a95352d11158_359x166.png" width="359" height="166" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc70d06c-5328-46f7-b92e-a95352d11158_359x166.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:166,&quot;width&quot;:359,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!dy1_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc70d06c-5328-46f7-b92e-a95352d11158_359x166.png 424w, https://substackcdn.com/image/fetch/$s_!dy1_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc70d06c-5328-46f7-b92e-a95352d11158_359x166.png 848w, https://substackcdn.com/image/fetch/$s_!dy1_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc70d06c-5328-46f7-b92e-a95352d11158_359x166.png 1272w, https://substackcdn.com/image/fetch/$s_!dy1_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc70d06c-5328-46f7-b92e-a95352d11158_359x166.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><em>I provide various consulting and advisory services. If you&#8217;d like to explore how we can work together, <a href="https://linktr.ee/iseethings404">reach out to me through any of my socials over here</a> or reply to this email.</em></p><h3>2. Can AI for War and Surveillance be Ethical?</h3><p>Before we critique the switch, I think it&#8217;s always worthwhile to explore the reasons people advocate for these technologies (beyond the fact that they are profitable investments). This is always helpful for avoiding misunderstandings, especially in very contentious issues like this.</p><p>Below are the arguments that are often given in favor of deploying AI for security purposes-</p><h4><strong>Deterrence in a Multipolar Knife-Fight</strong></h4><p>Great-power rivalry isn&#8217;t a speculative threat&#8202;&#8212;&#8202;it&#8217;s reality again. 
Geopolitical competition between the US, China, Russia, and other nations has intensified, and AI-driven weapon systems are the new coin of the realm.</p><p>The Pentagon&#8217;s Replicator program, which aims to deploy thousands of autonomous drones to counterbalance adversaries at a fraction of the cost of traditional hardware, embodies this logic. The reasoning is simple&#8202;&#8212;&#8202; falling behind on AI-driven autonomy is voluntary disarmament. If a rival state fields an autonomous kill-chain first, it dictates the rules and tempo of conflict, leaving late adopters dangerously reactive and vulnerable.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!65yv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01f5726d-aac9-43ab-8b0f-32edbcef41bd_1200x800.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!65yv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01f5726d-aac9-43ab-8b0f-32edbcef41bd_1200x800.jpeg 424w, https://substackcdn.com/image/fetch/$s_!65yv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01f5726d-aac9-43ab-8b0f-32edbcef41bd_1200x800.jpeg 848w, https://substackcdn.com/image/fetch/$s_!65yv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01f5726d-aac9-43ab-8b0f-32edbcef41bd_1200x800.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!65yv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01f5726d-aac9-43ab-8b0f-32edbcef41bd_1200x800.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!65yv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01f5726d-aac9-43ab-8b0f-32edbcef41bd_1200x800.jpeg" width="1200" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/01f5726d-aac9-43ab-8b0f-32edbcef41bd_1200x800.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!65yv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01f5726d-aac9-43ab-8b0f-32edbcef41bd_1200x800.jpeg 424w, https://substackcdn.com/image/fetch/$s_!65yv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01f5726d-aac9-43ab-8b0f-32edbcef41bd_1200x800.jpeg 848w, https://substackcdn.com/image/fetch/$s_!65yv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01f5726d-aac9-43ab-8b0f-32edbcef41bd_1200x800.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!65yv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01f5726d-aac9-43ab-8b0f-32edbcef41bd_1200x800.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Falling behind constantly advancing nations can cause defensive vulnerability.</figcaption></figure></div><h4><strong>Fewer Body Bags, Faster Wins:</strong></h4><p>On the battlefield, speed and accuracy win engagements. Autonomous weapon systems don&#8217;t feel fear, fatigue, or adrenaline. They don&#8217;t hesitate, tremble, or flinch. They analyze terabytes of surveillance data in milliseconds, launching precision-guided munitions faster and more accurately than human operators.</p><p>Israel&#8217;s 2021 Gaza conflict was publicly termed &#8220;the first AI war,&#8221; precisely because machine-generated intelligence cut strike-to-target response times dramatically. 
At that time, &#8220;<em><a href="https://carnegieendowment.org/sada/2023/11/israels-ai-revolution-from-innovation-to-occupation?lang=en">Israeli military leaders described AI as a significant force multiplier, allowing the IDF to use autonomous robotic drone swarms to gather surveillance data, identify targets, and streamline wartime logistics.</a></em>&#8221; This has continued-</p><blockquote><p><em>&#8220;At 5 a.m., [the air force] would come and bomb all the houses that we had marked,&#8221; B. said, an anonymous IDF <strong><a href="https://www.972mag.com/lavender-ai-israeli-army-gaza/">soldier</a></strong>. &#8220;<strong>We took out thousands of people. We didn&#8217;t go through them one by one&#8202;&#8212;&#8202;we put everything into automated systems, and as soon as one of [the marked individuals] was at home, he immediately became a target. We bombed him and his house.</strong>&#8221; This is the reality of the war in Gaza. Israel employs sophisticated <strong><a href="https://www.972mag.com/lavender-ai-israeli-army-gaza/">artificial intelligence </a></strong>(AI) tools to enhance its Intelligence, Surveillance and Reconnaissance (ISR), which then allows it to strike targets all over Gaza. However, this augmentation of military capabilities raises profound ethical concerns and may carry geopolitical implications. AI-assisted airstrikes, <strong>which, in some cases rely almost <a href="https://www.972mag.com/lavender-ai-israeli-army-gaza/">solely</a> on the assessment of an algorithm, may lead to a disregard for basic rules of war, such as discrimination and proportionality.</strong>&#8221;</em></p><p>-<a href="https://georgetownsecuritystudiesreview.org/2025/01/09/the-dehumanization-of-isr-israels-use-of-artificial-intelligence-in-warfare/">Source</a></p></blockquote><p>Proponents argue that the speed and precision of these systems ultimately lead to fewer friendly casualties, fewer collateral deaths, and quicker resolutions. 
In their view, this makes deploying such systems not only strategically wise but ethically necessary.</p><h4><strong>Dual-Use Is Destiny&#8202;&#8212;&#8202;Better Our Oversight Than Theirs (aka Dario Amodei saying that AI is so dangerous that only we should build it):</strong></h4><p>AI, by its nature, is general-purpose. Trying to separate its civilian and military applications is unrealistic. Technologies like computer vision or predictive analytics used in healthcare and logistics are easily adapted to surveillance and military operations. The same facial recognition used by companies in weapons targeting can be adapted into better bots for disaster rescue.</p><p>If democratic nations abstain from militarizing these dual-use technologies, authoritarian regimes surely won&#8217;t hesitate. <a href="https://cetas.turing.ac.uk/publications/eu-ai-act-national-security-implications">The European Union&#8217;s AI Act explicitly carves out national security exceptions</a>, implicitly acknowledging that it&#8217;s better for democracies to lead AI&#8217;s development transparently rather than cede that space entirely to adversaries who might weaponize it without oversight.</p><h4><strong>Economic Flywheel and Technological Spillover:</strong></h4><p>Military funding has historically underwritten technology that later reshaped civilian life&#8202;&#8212;&#8202;radar gave us microwave ovens, DARPA gave us the Internet, and the wider Department of Defense gave us GPS. Advocates see military AI spending as another iteration of this pattern.</p><p><a href="https://blog.anduril.com/andurils-lattice-a-trusted-dual-use-commercial-and-military-platform-for-public-safety-770b83c082e9">Anduril&#8217;s perception stack can easily be used for handling port security and wildfire response</a>. 
In such cases, the military budgets absorb R&amp;D risks that commercial investors won&#8217;t, ultimately accelerating innovation cycles and producing civilian benefits downstream.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Mtc1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9126ff83-dffb-4374-ae9b-a05ecca29118_1261x727.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Mtc1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9126ff83-dffb-4374-ae9b-a05ecca29118_1261x727.png 424w, https://substackcdn.com/image/fetch/$s_!Mtc1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9126ff83-dffb-4374-ae9b-a05ecca29118_1261x727.png 848w, https://substackcdn.com/image/fetch/$s_!Mtc1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9126ff83-dffb-4374-ae9b-a05ecca29118_1261x727.png 1272w, https://substackcdn.com/image/fetch/$s_!Mtc1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9126ff83-dffb-4374-ae9b-a05ecca29118_1261x727.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Mtc1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9126ff83-dffb-4374-ae9b-a05ecca29118_1261x727.png" width="1261" height="727" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9126ff83-dffb-4374-ae9b-a05ecca29118_1261x727.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:727,&quot;width&quot;:1261,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Mtc1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9126ff83-dffb-4374-ae9b-a05ecca29118_1261x727.png 424w, https://substackcdn.com/image/fetch/$s_!Mtc1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9126ff83-dffb-4374-ae9b-a05ecca29118_1261x727.png 848w, https://substackcdn.com/image/fetch/$s_!Mtc1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9126ff83-dffb-4374-ae9b-a05ecca29118_1261x727.png 1272w, https://substackcdn.com/image/fetch/$s_!Mtc1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9126ff83-dffb-4374-ae9b-a05ecca29118_1261x727.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Counter-Terror, Border Control, Cyber Defense:</strong></h4><p>AI-driven surveillance tools significantly improve homeland security, supporters claim, replacing brute-force policing with precision interventions. 
Systems that utilize wide-area facial recognition, behavioral analysis, and pattern detection enable security agencies to proactively intercept threats.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9YdA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7465b34f-4a74-485e-9533-436cbad578d4_1000x497.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9YdA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7465b34f-4a74-485e-9533-436cbad578d4_1000x497.png 424w, https://substackcdn.com/image/fetch/$s_!9YdA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7465b34f-4a74-485e-9533-436cbad578d4_1000x497.png 848w, https://substackcdn.com/image/fetch/$s_!9YdA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7465b34f-4a74-485e-9533-436cbad578d4_1000x497.png 1272w, https://substackcdn.com/image/fetch/$s_!9YdA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7465b34f-4a74-485e-9533-436cbad578d4_1000x497.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9YdA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7465b34f-4a74-485e-9533-436cbad578d4_1000x497.png" width="1000" height="497" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7465b34f-4a74-485e-9533-436cbad578d4_1000x497.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:497,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9YdA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7465b34f-4a74-485e-9533-436cbad578d4_1000x497.png 424w, https://substackcdn.com/image/fetch/$s_!9YdA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7465b34f-4a74-485e-9533-436cbad578d4_1000x497.png 848w, https://substackcdn.com/image/fetch/$s_!9YdA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7465b34f-4a74-485e-9533-436cbad578d4_1000x497.png 1272w, https://substackcdn.com/image/fetch/$s_!9YdA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7465b34f-4a74-485e-9533-436cbad578d4_1000x497.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://homeland.house.gov/2024/07/09/chairmen-higgins-bishop-open-joint-hearing-border-security-technologies-play-a-critical-role-in-countering-threats-mass-illegal-immigration/">Source</a></figcaption></figure></div><h4><strong>&#8220;Nothing to Hide&#8221;</strong></h4><p>This argument is very common when talking about surveillance systems. If I, Jo Schmoe 69, have done nothing wrong, what do I care if the government is looking through my texts or seeing what I did for lunch? Only the seedy characters who have things to hide would be concerned about AI being used in surveillance. For the rest of us, this will only improve accountability and safety (since the systems can be used to detect and track criminals very quickly).</p><h4><strong>Moral Imperative to Protect Troops &amp; Civilians:</strong></h4><p>Lastly&#8202;&#8212;&#8202;and perhaps most compellingly from a humanitarian perspective&#8202;&#8212;&#8202;is the argument that using autonomy in warfare is ethically obligatory. 
If an autonomous drone swarm can end a fight in minutes or hours instead of the days/months/years associated with traditional armed conflict&#8202;&#8212;&#8202;we must use it.</p><p>In the future, if both sides eventually fight with drones and robots, won&#8217;t bloody conflicts be a thing of the past? Wouldn&#8217;t wars turn into advanced, high-stakes versions of Age of Empires games?</p><p>Combine this with the previous arguments. Ethically, the reasoning follows a clear imperative: if technology exists to reduce harm to your soldiers and civilians alike, not working towards it becomes a moral failure rather than a virtuous choice. After all, we haven&#8217;t yet figured out how to live in harmony, so does it not make sense to at least mitigate the costs when wars eventually break out?</p><p>Some of these have legs and are worth thinking about. Let&#8217;s address them in the next section&#8202;&#8212;&#8202;picking what&#8217;s valid and talking about what&#8217;s not.</p><h3>Part 3: How &#8220;Defense Tech&#8221; Can Make Things Worse</h3><blockquote><p><em>&#8220;But man has such a predilection for systems and abstract deductions that he is ready to distort the truth intentionally, he is ready to deny the evidence of his senses only to justify his logic.&#8221;</em></p><p>&#8212; Fyodor Dostoevsky, Notes from Underground. <a href="https://artificialintelligencemadesimple.substack.com/p/why-you-should-read-fyodor-dostoevsky">Must read for our times.</a></p></blockquote><p>We&#8217;ve presented the strongest arguments advocates can muster for deploying AI into warfare and surveillance. 
They aren&#8217;t baseless&#8202;&#8212;&#8202;but they&#8217;re also built on deeply flawed assumptions, dangerously optimistic projections, and historical amnesia.</p><h4>Deterrence, or Algorithmic Escalation?</h4><p>The idea that AI superiority guarantees stability is a dangerous remix of the logic behind mutually assured destruction&#8202;&#8212;&#8202;only now the &#8220;assurance&#8221; relies on algorithms nobody fully understands, operating at speeds that make human judgment irrelevant. Such systems can just as easily lead to-</p><p><strong>Machine-Speed Wars</strong>: Deterrence hinges on rational actors and deliberation. But automated kill chains close the gap between sensor and trigger to milliseconds, leaving no time for humans to intervene. An algorithmic glitch, a spoofed sensor input, or a hostile hack can ignite an unintended &#8220;flash war&#8221; faster than any leader could react (<a href="https://www.investopedia.com/terms/f/flash-crash.asp">flash crashes in the stock market are already a thing</a>).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YVRl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e02b19a-5e95-43fa-a7da-dd08e88e2c9a_1216x667.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YVRl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e02b19a-5e95-43fa-a7da-dd08e88e2c9a_1216x667.png 424w, https://substackcdn.com/image/fetch/$s_!YVRl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e02b19a-5e95-43fa-a7da-dd08e88e2c9a_1216x667.png 848w, 
https://substackcdn.com/image/fetch/$s_!YVRl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e02b19a-5e95-43fa-a7da-dd08e88e2c9a_1216x667.png 1272w, https://substackcdn.com/image/fetch/$s_!YVRl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e02b19a-5e95-43fa-a7da-dd08e88e2c9a_1216x667.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YVRl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e02b19a-5e95-43fa-a7da-dd08e88e2c9a_1216x667.png" width="1216" height="667" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4e02b19a-5e95-43fa-a7da-dd08e88e2c9a_1216x667.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:667,&quot;width&quot;:1216,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YVRl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e02b19a-5e95-43fa-a7da-dd08e88e2c9a_1216x667.png 424w, https://substackcdn.com/image/fetch/$s_!YVRl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e02b19a-5e95-43fa-a7da-dd08e88e2c9a_1216x667.png 848w, 
https://substackcdn.com/image/fetch/$s_!YVRl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e02b19a-5e95-43fa-a7da-dd08e88e2c9a_1216x667.png 1272w, https://substackcdn.com/image/fetch/$s_!YVRl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e02b19a-5e95-43fa-a7da-dd08e88e2c9a_1216x667.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!jOKB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaaa45c9-1458-472d-8376-b4fce5a13ad2_1232x631.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jOKB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaaa45c9-1458-472d-8376-b4fce5a13ad2_1232x631.png 424w, https://substackcdn.com/image/fetch/$s_!jOKB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaaa45c9-1458-472d-8376-b4fce5a13ad2_1232x631.png 848w, https://substackcdn.com/image/fetch/$s_!jOKB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaaa45c9-1458-472d-8376-b4fce5a13ad2_1232x631.png 1272w, https://substackcdn.com/image/fetch/$s_!jOKB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaaa45c9-1458-472d-8376-b4fce5a13ad2_1232x631.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jOKB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaaa45c9-1458-472d-8376-b4fce5a13ad2_1232x631.png" width="1232" height="631" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/baaa45c9-1458-472d-8376-b4fce5a13ad2_1232x631.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:631,&quot;width&quot;:1232,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jOKB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaaa45c9-1458-472d-8376-b4fce5a13ad2_1232x631.png 424w, https://substackcdn.com/image/fetch/$s_!jOKB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaaa45c9-1458-472d-8376-b4fce5a13ad2_1232x631.png 848w, https://substackcdn.com/image/fetch/$s_!jOKB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaaa45c9-1458-472d-8376-b4fce5a13ad2_1232x631.png 1272w, https://substackcdn.com/image/fetch/$s_!jOKB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaaa45c9-1458-472d-8376-b4fce5a13ad2_1232x631.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is made (much) worse when we consider how brittle AI is. The aforementioned flash crashes, AI customer service, and the various issues in AI content moderation on social media platforms are all worth considering. In all three cases, we&#8217;ve had decades to refine things and a massive financial incentive to do so, but we have failed because AI is really hard to control.</p><p>Even language models, backed by billions in investment and the best minds, struggle with basic things that come up every day, such as a change in the order of their inputs-</p><blockquote><p><em>&#8220;This paper investigates the extent of order sensitivity in LLMs whose internal components are hidden from users (such as closed-source models or those accessed via API calls). We conduct experiments across multiple tasks, including paraphrasing, relevance judgment, and multiple-choice questions. Our results show that input order significantly affects performance across tasks, with shuffled inputs leading to measurable declines in output accuracy. 
Few-shot prompting demonstrates mixed effectiveness and offers partial mitigation; however, fails to fully resolve the problem. These findings highlight persistent risks, particularly in high-stakes applications, and point to the need for more robust LLMs or improved input-handling techniques in future development.&#8221;</em></p><p><a href="https://arxiv.org/abs/2502.04134">-The Order Effect: Investigating Prompt Sensitivity to Input Order in LLMs</a></p></blockquote><p>This phenomenon has been explored in the excellent post: &#8220;<a href="https://www.cip.org/blog/llm-judges-are-unreliable">LLM Judges Are Unreliable</a>&#8221;, which has also open-sourced their eval suite for you to run/verify their experiments yourself.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Fg-7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a9b61a5-dca6-4e6a-a7a4-21eb6ac70148_1237x810.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Fg-7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a9b61a5-dca6-4e6a-a7a4-21eb6ac70148_1237x810.png 424w, https://substackcdn.com/image/fetch/$s_!Fg-7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a9b61a5-dca6-4e6a-a7a4-21eb6ac70148_1237x810.png 848w, https://substackcdn.com/image/fetch/$s_!Fg-7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a9b61a5-dca6-4e6a-a7a4-21eb6ac70148_1237x810.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Fg-7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a9b61a5-dca6-4e6a-a7a4-21eb6ac70148_1237x810.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Fg-7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a9b61a5-dca6-4e6a-a7a4-21eb6ac70148_1237x810.png" width="1237" height="810" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6a9b61a5-dca6-4e6a-a7a4-21eb6ac70148_1237x810.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:810,&quot;width&quot;:1237,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Fg-7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a9b61a5-dca6-4e6a-a7a4-21eb6ac70148_1237x810.png 424w, https://substackcdn.com/image/fetch/$s_!Fg-7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a9b61a5-dca6-4e6a-a7a4-21eb6ac70148_1237x810.png 848w, https://substackcdn.com/image/fetch/$s_!Fg-7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a9b61a5-dca6-4e6a-a7a4-21eb6ac70148_1237x810.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Fg-7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a9b61a5-dca6-4e6a-a7a4-21eb6ac70148_1237x810.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">&#8220;<em>A screenshot of our &#8216;A/B picking experiment result&#8217;, showing biased first-slot preferences between Gemini and OpenAI models across variants of prompt style&#8221;</em></figcaption></figure></div><p>These fragilities can be overlooked in most cases since we have either (or both)</p><ol><li><p>Higher margins for 
error.</p></li><li><p>More time for audits/confirming outputs.</p></li></ol><p>The first is out, given the nature of the space, and millisecond-level decision windows take away the second by compressing the time available to think through high-pressure situations.</p><p><strong>Swarm Instability &amp; Proliferation</strong>: Cheap drone swarms lower the barriers to conflict. If a state believes it can swiftly deploy disposable AI drones without political fallout, the temptation to strike first skyrockets. Historically, when you make powerful weapons easier for the &#8220;good guys,&#8221; you inevitably open that same door wider for the &#8220;bad guys.&#8221; Today&#8217;s defensive swarm becomes tomorrow&#8217;s terrorist arsenal.</p><p>Ideas such as induced demand-</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MbYR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47e234-4eb5-40bc-a7e8-1bfe8f1afa1f_1000x1349.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MbYR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47e234-4eb5-40bc-a7e8-1bfe8f1afa1f_1000x1349.png 424w, https://substackcdn.com/image/fetch/$s_!MbYR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47e234-4eb5-40bc-a7e8-1bfe8f1afa1f_1000x1349.png 848w, https://substackcdn.com/image/fetch/$s_!MbYR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47e234-4eb5-40bc-a7e8-1bfe8f1afa1f_1000x1349.png 1272w, 
https://substackcdn.com/image/fetch/$s_!MbYR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47e234-4eb5-40bc-a7e8-1bfe8f1afa1f_1000x1349.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MbYR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47e234-4eb5-40bc-a7e8-1bfe8f1afa1f_1000x1349.png" width="1000" height="1349" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d47e234-4eb5-40bc-a7e8-1bfe8f1afa1f_1000x1349.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1349,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MbYR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47e234-4eb5-40bc-a7e8-1bfe8f1afa1f_1000x1349.png 424w, https://substackcdn.com/image/fetch/$s_!MbYR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47e234-4eb5-40bc-a7e8-1bfe8f1afa1f_1000x1349.png 848w, https://substackcdn.com/image/fetch/$s_!MbYR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47e234-4eb5-40bc-a7e8-1bfe8f1afa1f_1000x1349.png 1272w, 
https://substackcdn.com/image/fetch/$s_!MbYR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d47e234-4eb5-40bc-a7e8-1bfe8f1afa1f_1000x1349.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://t4america.org/resource/fueling-the-crisis/fueling-the-crisis-recs/">An illustration of Induced Demand.</a></figcaption></figure></div><p>and Jevons' Paradox are worth studying here. 
If economics terms scare you, think about how your bills probably went up after getting Uber One or Amazon Prime, since the delivery add-ons (a huge deterrent) were taken out. The very same thing will happen with weapons, with availability and low cost pushing up demand.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ToAZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458751cb-013f-44ed-8ac6-34136b046df8_771x700.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ToAZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458751cb-013f-44ed-8ac6-34136b046df8_771x700.png 424w, https://substackcdn.com/image/fetch/$s_!ToAZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458751cb-013f-44ed-8ac6-34136b046df8_771x700.png 848w, https://substackcdn.com/image/fetch/$s_!ToAZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458751cb-013f-44ed-8ac6-34136b046df8_771x700.png 1272w, https://substackcdn.com/image/fetch/$s_!ToAZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458751cb-013f-44ed-8ac6-34136b046df8_771x700.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ToAZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458751cb-013f-44ed-8ac6-34136b046df8_771x700.png" width="771" height="700" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/458751cb-013f-44ed-8ac6-34136b046df8_771x700.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:700,&quot;width&quot;:771,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ToAZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458751cb-013f-44ed-8ac6-34136b046df8_771x700.png 424w, https://substackcdn.com/image/fetch/$s_!ToAZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458751cb-013f-44ed-8ac6-34136b046df8_771x700.png 848w, https://substackcdn.com/image/fetch/$s_!ToAZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458751cb-013f-44ed-8ac6-34136b046df8_771x700.png 1272w, https://substackcdn.com/image/fetch/$s_!ToAZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458751cb-013f-44ed-8ac6-34136b046df8_771x700.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Opaque Arms Race &amp; The Black Box Problem</strong>: AI&#8217;s opacity fuels mistrust. Without transparency, every side has an incentive to default to worst-case assumptions, which accelerates the arms race and breeds constant paranoia. The perceived cost of not doing so is &#8220;death&#8221;, which makes all actions justifiable (the same argument used by people to oppress in the name of building utopias). This constant tension will make trigger fingers much twitchier.</p><p>Building a system of algorithmic deterrence and flash-weapons systems also creates many questions that we haven&#8217;t even started to answer. Who&#8217;s accountable when opaque systems inevitably err? The coder? The commander? The black-box AI itself? 
Ultimately, until we can answer these questions well, this isn&#8217;t strategic deterrence; it&#8217;s gambling global security on algorithms nobody fully controls.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!10Mo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdbf023-40aa-4d4c-83f7-8275d8de0aeb_1000x1300.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!10Mo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdbf023-40aa-4d4c-83f7-8275d8de0aeb_1000x1300.jpeg 424w, https://substackcdn.com/image/fetch/$s_!10Mo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdbf023-40aa-4d4c-83f7-8275d8de0aeb_1000x1300.jpeg 848w, https://substackcdn.com/image/fetch/$s_!10Mo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdbf023-40aa-4d4c-83f7-8275d8de0aeb_1000x1300.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!10Mo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdbf023-40aa-4d4c-83f7-8275d8de0aeb_1000x1300.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!10Mo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdbf023-40aa-4d4c-83f7-8275d8de0aeb_1000x1300.jpeg" width="1000" height="1300" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7bdbf023-40aa-4d4c-83f7-8275d8de0aeb_1000x1300.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1300,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!10Mo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdbf023-40aa-4d4c-83f7-8275d8de0aeb_1000x1300.jpeg 424w, https://substackcdn.com/image/fetch/$s_!10Mo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdbf023-40aa-4d4c-83f7-8275d8de0aeb_1000x1300.jpeg 848w, https://substackcdn.com/image/fetch/$s_!10Mo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdbf023-40aa-4d4c-83f7-8275d8de0aeb_1000x1300.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!10Mo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdbf023-40aa-4d4c-83f7-8275d8de0aeb_1000x1300.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A more thorough flowchart showing the same principle of induced demand in weapons.</figcaption></figure></div><h4>Operational Efficiency? Meet the Precision Myth</h4><p>Advocates sell AI-enabled warfare as precise, efficient, and clean. That myth unravels quickly under scrutiny.</p><p><strong>The Moral Hazard of Surgical Strikes</strong>: The IDF soldier&#8217;s chilling description of AI-facilitated strikes&#8202;&#8212;&#8202;&#8220;We took out thousands of people&#8230;everything into automated systems&#8221;&#8202;&#8212;&#8202;illustrates a deeper truth. &#8220;Precision&#8221; can become shorthand for automated, indiscriminate violence. If violence becomes politically easy and psychologically distant, leaders won&#8217;t hesitate&#8202;&#8212;&#8202;they&#8217;ll pull triggers faster and more often. War doesn&#8217;t become cleaner; it becomes routine. 
People have justified their part in many atrocities, such as the Holocaust, by keeping their distance and &#8220;just doing their jobs&#8221;.</p><p><strong>Algorithmic Bias as Discrimination</strong>: AI learns from data&#8202;&#8212;&#8202;and data inherits human prejudices. Facial recognition systems have repeatedly been shown to misidentify minorities at higher rates. Put these biases into autonomous targeting algorithms, and you industrialize prejudice. Innocent civilians become collateral, labeled combatants by biased code, and killed efficiently at scale. The power imbalance in such cases can often make finding justice much harder.</p><p><strong>Novel Vulnerabilities &amp; Amplified Harm</strong>: AI warfare introduces entirely new vulnerabilities. Systems become prime targets for cyberattacks and adversarial manipulation, potentially turning your own arsenal against you. This is bad on its own. It gets worse when we consider the history of technology and the capabilities it adds. Technology scales our capacities, dramatically increasing individual ability. A single mass shooter or suicide bomber can kill more people than most of history&#8217;s accomplished warriors. 
Following the same trend, hacking the next generation of technology could cause devastation far exceeding traditional terrorism or mass shootings.</p><p>Building guardrails here will be expensive, difficult, and further sink resources into maintaining these setups instead of using the capital in other avenues (more on this soon).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-9Jl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b933334-4695-4e2a-b105-6ea69f6f00c4_1200x800.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-9Jl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b933334-4695-4e2a-b105-6ea69f6f00c4_1200x800.jpeg 424w, https://substackcdn.com/image/fetch/$s_!-9Jl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b933334-4695-4e2a-b105-6ea69f6f00c4_1200x800.jpeg 848w, https://substackcdn.com/image/fetch/$s_!-9Jl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b933334-4695-4e2a-b105-6ea69f6f00c4_1200x800.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!-9Jl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b933334-4695-4e2a-b105-6ea69f6f00c4_1200x800.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-9Jl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b933334-4695-4e2a-b105-6ea69f6f00c4_1200x800.jpeg" width="1200" height="800" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b933334-4695-4e2a-b105-6ea69f6f00c4_1200x800.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-9Jl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b933334-4695-4e2a-b105-6ea69f6f00c4_1200x800.jpeg 424w, https://substackcdn.com/image/fetch/$s_!-9Jl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b933334-4695-4e2a-b105-6ea69f6f00c4_1200x800.jpeg 848w, https://substackcdn.com/image/fetch/$s_!-9Jl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b933334-4695-4e2a-b105-6ea69f6f00c4_1200x800.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!-9Jl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b933334-4695-4e2a-b105-6ea69f6f00c4_1200x800.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Dual-Use Isn&#8217;t Destiny&#8202;&#8212;&#8202;It&#8217;s an Excuse</h4><p>Yes, AI technology is inherently dual-use. So was nuclear physics, chemical engineering, cryptography&#8202;&#8212;&#8202;yet responsible states didn&#8217;t shrug off their responsibilities; they painstakingly crafted treaties, inspections, and norms (even when flawed).</p><p>Saying &#8220;if we don&#8217;t build it, adversaries will&#8221; is fatalistic laziness masquerading as pragmatism. Real leadership means establishing international norms and red lines, not racing to the bottom. Claiming &#8220;responsible leadership&#8221; while accelerating AI militarization without guardrails isn&#8217;t just contradictory&#8202;&#8212;&#8202;it&#8217;s dangerously naive.</p><p>Additionally, the assumption that democratic states will deploy AI responsibly is hubristic. History suggests that increasing power imbalances lead to increasing injustice. 
To think that our modern crop of leaders has transcended the bloodthirst and greed that have plagued the rest of humanity seems a tad optimistic.</p><p>When all is said and done, most people, even builders, ultimately land on the oppressed side of an unchecked surveillance apparatus. No one plans to live in a dictatorship&#8202;&#8212;&#8202;but unchecked, we inevitably build one.</p><h4>Economic Flywheel or Drain?</h4><p>Sure, military investments historically sparked civilian breakthroughs. <em>But that argument isn&#8217;t unique to defense&#8202;&#8212;&#8202;it&#8217;s true for nearly every major scientific investment.</em></p><p><strong>Opportunity Costs</strong>: Every billion dollars spent on perfecting drone swarms is a billion not going to climate mitigation, disease eradication, or education. Society trades urgently needed civilian progress for a battlefield advantage.</p><p><strong>Talent Trap</strong>: Militarized AI repels precisely the talent society most desperately needs&#8202;&#8212;&#8202;those ethically opposed to building kill chains. AI&#8217;s brightest minds will leave rather than enable mass surveillance or war. The real cost is the drain of potential solutions for our most pressing global problems.</p><p><strong>Talent Entrenchment: </strong>The normalization of these technologies as strong career paths will draw more and more bright minds into them, taking future great minds away from working on big problems. Take the quote <strong>&#8220;The best minds of my generation are thinking about how to make people click ads,&#8221; </strong>and imagine the outcome of replacing ads with weapons.</p><h4>Homeland Security: The &#8220;Nothing to Hide&#8221; Fallacy &amp; Surveillance Creep</h4><p>The promise of AI security is seductive, but it comes at profound, hidden costs.</p><p><strong>Scope Creep Isn&#8217;t Speculative</strong>: Border surveillance inevitably migrates inward&#8202;&#8212;&#8202;streets, schools, homes. 
History confirms surveillance infrastructures expand relentlessly, normalizing pervasive monitoring under the guise of safety. <strong>The &#8220;nothing to hide&#8221; argument assumes an infallible state, benevolent operators, and transparent algorithms. Reality provides none of those. </strong>Even projects with no hidden agendas, and with explicit incentives to avoid creep, are hit with unexpected costs. Do you really expect a technology deployment this enticing to power-hungry people to stay pure?</p><p><strong>Ignoring Root Causes</strong>: Surveillance as primary crime prevention ignores deeper societal drivers. True security emerges from giving people stable, dignified lives&#8202;&#8212;&#8202;not abstract surveillance threats. Economic opportunity, education, and healthcare consistently do more for security than cameras and algorithms:</p><blockquote><p><em>&#8220;Consider this: According to recent <a href="https://www.brookings.edu/articles/why-did-u-s-homicides-spike-in-2020-and-then-decline-rapidly-in-2023-and-2024/">Brookings research</a>, it was the loss of jobs and educational opportunities for people living in high-poverty neighborhoods that primarily explains the rise in homicides during the COVID-19 pandemic&#8202;&#8212;&#8202;not changes in policing or criminal justice system practices. Further, a large body of <a href="https://johnjayrec.nyc/2020/11/09/av2020/">evidence</a> finds that approaches linking public safety efforts to those bolstering employment, education, and quality neighborhoods can measurably reduce and prevent violent crime, while also <a href="https://www.ncbi.nlm.nih.gov/books/NBK190007/">saving taxpayers and governments significant costs</a>.&#8221;</em></p><p>-<a href="https://www.brookings.edu/articles/the-path-to-public-safety-requires-economic-opportunity/">Source</a>. When you&#8217;re already struggling, the threat of jail can be abstract while the upside of crime is more promising. 
If you have a house, job, and food to lose, the threat suddenly becomes much more concrete and the upside shrinks.</p></blockquote><h3>The Ethical Warfare Fantasy</h3><p>The argument that AI-driven war is somehow more ethical, reducing human harm, is perhaps the most dangerously naive.</p><p><strong>Sanitized Conflict Illusion</strong>: War isn&#8217;t sanitized by automation&#8202;&#8212;&#8202;it&#8217;s merely distanced. Killing remotely doesn&#8217;t eliminate violence; it abstracts it, lowers political costs, and increases the likelihood of use. War doesn&#8217;t become a strategic video game&#8202;&#8212;&#8202;it becomes easier, more frequent, and more casually lethal.</p><p><strong>Whose Lives Count?</strong>: Advocates argue AI reduces harm&#8202;&#8212;&#8202;but whose harm? AI protects your soldiers, sure, but does it protect innocents caught up by biased algorithms? Ethical warfare often conveniently aligns with national self-interest, not humanitarian principles.</p><p><strong>Fragile Times, Dangerous Tech</strong>: With shrinking attention spans, rising polarization, and a decline in nuanced discourse already rampant, handing lethal decisions to opaque, complex AI is reckless. Builders often fail to grasp these systems&#8217; societal impacts&#8202;&#8212;&#8202;witness our persistent content moderation failures. Amplifying these failures into the lethal domain risks catastrophes society isn&#8217;t ready for.</p><h4>Underlying all of this is a fundamental misunderstanding:</h4><p>Most proponents imagine themselves and their societies as the controllers, not the controlled. Others believe themselves to have a superior intelligence or moral virtue, and thus have no qualms about tipping the balance of power and designating themselves as the arbiters of human taste. 
Listen to any Peter Thiel, Marc Andreessen, or Dario Amodei interview, and you&#8217;ll see how quickly their statements carry an underlying sense of &#8220;we are better and should be in charge, you just fall in line because we know what&#8217;s good for you&#8221;.</p><blockquote><p><em>&#8220;I simply hinted that an &#8216;extraordinary&#8217; man has the right&#8230; that is not an official right, but an inner right to decide in his own conscience to overstep&#8230; certain obstacles, and only in case it is essential for the practical fulfilment of his idea (sometimes, perhaps, of benefit to the whole of humanity). You say that my article isn&#8217;t definite; I am ready to make it as clear as I can. Perhaps I am right in thinking you want me to; very well. I maintain that if the discoveries of Kepler and Newton could not have been made known except by sacrificing the lives of one, a dozen, a hundred, or more men, <strong>Newton would have had the right, would indeed have been in duty bound&#8230; to eliminate the dozen or the hundred men for the sake of making his discoveries known to the whole of humanity.&#8221;</strong></em></p><p><strong>-</strong>Crime and Punishment</p></blockquote><p>However, this isn&#8217;t the only kind of problem that we must be concerned about. There are deeper, underlying problems that are often overlooked because of their chilling effects:</p><h3>Part 4: Additional Systemic Threats</h3><p>Let&#8217;s be clear: the deepest threats posed by militarized AI would remain even if their utopian visions of clean, efficient algorithmic warfare weren&#8217;t complete fantasy. The problem isn&#8217;t just their flawed arguments&#8202;&#8212;&#8202;it&#8217;s the structural shifts inherent in combining AI with state and military power. These systemic threats run deeper, and their consequences more lasting, than the immediate debate suggests.</p><h4>1. 
Path Dependency &amp; Vendor Capture: The Quiet Chokehold</h4><p>Once militaries rely heavily on proprietary AI from a handful of specialized firms&#8202;&#8212;&#8202;your Andurils, Palantirs, Lockheeds&#8202;&#8212;&#8202;dangerous path dependencies set in. Military doctrines, procurement strategies, even tactical thinking increasingly revolve around the limits and capabilities of these vendor platforms. Open innovation? Smaller firms? Open-source alternatives? Squeezed out, sidelined, or ignored. Those who write the &#8220;API specs&#8221; shape the battlefield, effectively capturing entire national security ecosystems.</p><p>Additionally, when Silicon Valley executives enlist in Army Reserve tech detachments, they&#8217;re not just bringing technical expertise&#8202;&#8212;&#8202;they&#8217;re building lasting networks of influence. Civilian oversight becomes meaningless when &#8220;civilian&#8221; AI developers are ideologically and financially aligned with military objectives. 
This isn&#8217;t oversight; it&#8217;s vendor capture in camouflage.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OnMF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88167b0d-e5ce-4bbc-9831-0f3f099417e3_1200x1000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OnMF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88167b0d-e5ce-4bbc-9831-0f3f099417e3_1200x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!OnMF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88167b0d-e5ce-4bbc-9831-0f3f099417e3_1200x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!OnMF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88167b0d-e5ce-4bbc-9831-0f3f099417e3_1200x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!OnMF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88167b0d-e5ce-4bbc-9831-0f3f099417e3_1200x1000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OnMF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88167b0d-e5ce-4bbc-9831-0f3f099417e3_1200x1000.jpeg" width="1200" height="1000" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/88167b0d-e5ce-4bbc-9831-0f3f099417e3_1200x1000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1000,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OnMF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88167b0d-e5ce-4bbc-9831-0f3f099417e3_1200x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!OnMF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88167b0d-e5ce-4bbc-9831-0f3f099417e3_1200x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!OnMF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88167b0d-e5ce-4bbc-9831-0f3f099417e3_1200x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!OnMF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88167b0d-e5ce-4bbc-9831-0f3f099417e3_1200x1000.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The revolving door visualized</figcaption></figure></div><h4>2. Erosion of Democratic Accountability: Who Watches the Algorithms?</h4><p>Advanced AI systems, cloaked in national-security secrecy, make meaningful democratic oversight impossible. How do civilian leaders or citizens hold power to account when lethal decisions emerge from opaque algorithms?</p><p>&#8220;The algorithm decided&#8221; will become the ultimate deflection of accountability. Commanders, policymakers, and corporations shield themselves behind complexity, or use the cloak of national security to hide any issues. Legitimate classification quickly becomes a shield for hiding costs, ethical breaches, errors, or mission creep. 
This creates democratic black holes, where critical decisions vanish from public scrutiny entirely.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qC5O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb468656-7ee5-4ab5-ab25-ed6b38f797d7_1000x800.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qC5O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb468656-7ee5-4ab5-ab25-ed6b38f797d7_1000x800.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qC5O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb468656-7ee5-4ab5-ab25-ed6b38f797d7_1000x800.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qC5O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb468656-7ee5-4ab5-ab25-ed6b38f797d7_1000x800.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qC5O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb468656-7ee5-4ab5-ab25-ed6b38f797d7_1000x800.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qC5O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb468656-7ee5-4ab5-ab25-ed6b38f797d7_1000x800.jpeg" width="1000" height="800" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cb468656-7ee5-4ab5-ab25-ed6b38f797d7_1000x800.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qC5O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb468656-7ee5-4ab5-ab25-ed6b38f797d7_1000x800.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qC5O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb468656-7ee5-4ab5-ab25-ed6b38f797d7_1000x800.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qC5O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb468656-7ee5-4ab5-ab25-ed6b38f797d7_1000x800.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qC5O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb468656-7ee5-4ab5-ab25-ed6b38f797d7_1000x800.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>3. Exporting Authoritarianism: AI as a Trojan Horse for Oppression</h4><p>Most nations can&#8217;t independently build sophisticated AI surveillance or autonomous weapons systems. Instead, they&#8217;ll import these capabilities wholesale from major powers. But these &#8220;off-the-shelf&#8221; solutions come with embedded political philosophies and strategic dependencies.</p><p>Buying China&#8217;s surveillance tech or Western drone swarms doesn&#8217;t just provide technical capability&#8202;&#8212;&#8202;it imports a political logic of centralized control and top-down surveillance. Nations become algorithmic vassals, dependent on supplier nations for updates, maintenance, and strategic guidance. Historically, cultural preservation has rarely thrived when one people becomes dominant over another.</p><h4>4. 
Irreversible Global Catastrophe: More Than Just Another Weapon</h4><p>Certain AI applications&#8202;&#8212;&#8202;particularly fully autonomous weapons systems or strategic infrastructure control&#8202;&#8212;&#8202;introduce existential-level risks. These aren&#8217;t incremental threats&#8202;&#8212;&#8202;they&#8217;re civilization-scale gambles.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!59gY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b98c30-85d6-47b1-81d0-4cef1455ddeb_480x270.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!59gY!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b98c30-85d6-47b1-81d0-4cef1455ddeb_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!59gY!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b98c30-85d6-47b1-81d0-4cef1455ddeb_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!59gY!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b98c30-85d6-47b1-81d0-4cef1455ddeb_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!59gY!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b98c30-85d6-47b1-81d0-4cef1455ddeb_480x270.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!59gY!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b98c30-85d6-47b1-81d0-4cef1455ddeb_480x270.gif" width="480" height="270" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/38b98c30-85d6-47b1-81d0-4cef1455ddeb_480x270.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:270,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!59gY!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b98c30-85d6-47b1-81d0-4cef1455ddeb_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!59gY!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b98c30-85d6-47b1-81d0-4cef1455ddeb_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!59gY!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b98c30-85d6-47b1-81d0-4cef1455ddeb_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!59gY!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b98c30-85d6-47b1-81d0-4cef1455ddeb_480x270.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 
2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://media2.giphy.com/media/v1.Y2lkPTc5MGI3NjExMHdnZTJjZW9rZTBhYmZjMHNyY3lleWV6eWplN2YzZHNleWU2MHAwZCZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/tXlpbXfu7e2Pu/giphy.gif">Conway&#8217;s Game of Life shows us that simple rules/agents can create complex, unexpected patterns&#8202;</a>&#8212;&#8202;a phenomenon known as emergence. Imagine what systems of complex agents can do.</figcaption></figure></div><p>Once deeply embedded in military and strategic infrastructure, these capabilities become impossible to roll back, even when recognized as fundamentally dangerous. We risk permanently locking the world into an unstable equilibrium, perpetually on the brink of algorithmic catastrophe.</p><p>These systemic threats aren&#8217;t mere side effects; they&#8217;re built into the architecture of militarized AI. And left without oversight, they will</p><p>This isn&#8217;t inevitable. 
The choices we make now&#8202;&#8212;&#8202;regarding oversight, norms, treaty obligations, and funding priorities&#8202;&#8212;&#8202;determine whether AI enhances human dignity and safety, or erodes them. Let&#8217;s talk about some things we can push for right now.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PIrz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68649dd6-a3d4-4ddc-8bb9-5bbd984096ea_832x542.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PIrz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68649dd6-a3d4-4ddc-8bb9-5bbd984096ea_832x542.png 424w, https://substackcdn.com/image/fetch/$s_!PIrz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68649dd6-a3d4-4ddc-8bb9-5bbd984096ea_832x542.png 848w, https://substackcdn.com/image/fetch/$s_!PIrz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68649dd6-a3d4-4ddc-8bb9-5bbd984096ea_832x542.png 1272w, https://substackcdn.com/image/fetch/$s_!PIrz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68649dd6-a3d4-4ddc-8bb9-5bbd984096ea_832x542.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PIrz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68649dd6-a3d4-4ddc-8bb9-5bbd984096ea_832x542.png" width="832" height="542" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68649dd6-a3d4-4ddc-8bb9-5bbd984096ea_832x542.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:542,&quot;width&quot;:832,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PIrz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68649dd6-a3d4-4ddc-8bb9-5bbd984096ea_832x542.png 424w, https://substackcdn.com/image/fetch/$s_!PIrz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68649dd6-a3d4-4ddc-8bb9-5bbd984096ea_832x542.png 848w, https://substackcdn.com/image/fetch/$s_!PIrz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68649dd6-a3d4-4ddc-8bb9-5bbd984096ea_832x542.png 1272w, https://substackcdn.com/image/fetch/$s_!PIrz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68649dd6-a3d4-4ddc-8bb9-5bbd984096ea_832x542.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://content.striderintel.com/wp-content/uploads/2025/05/Strider-SCSP-China-AI-Infrastructure-Surge-Report.pdf?utm_source=chatgpt.com">Source</a>. &#8220;<em>Dozens of those stakeholders have ties to both the PLA and the PRC defense industrial complex, as well as ties to the U.S. and its allies</em>.&#8221; is worth a double and triple take.</figcaption></figure></div><h3>Part 5: Pulling the Emergency Brake</h3><p>Critique is easy, but it&#8217;s empty without concrete, strategic alternatives. This isn&#8217;t a moment for vague recommendations; it&#8217;s a call for precision. We need explicit, actionable guardrails, international red lines, and practical off-ramps before this becomes an irreversible spiral. Here are five clear moves we should make immediately:</p><h4>1. Forge Binding International Norms&#8202;&#8212;&#8202;Before the Window Closes</h4><p>Allowing military AI to &#8220;evolve naturally&#8221; is dangerously naive. 
We urgently need enforceable global standards.</p><p><strong>Meaningful Human Control (MHC): </strong>This isn&#8217;t a vague aspiration; it&#8217;s a verifiable legal requirement. Lethal AI systems must always keep humans meaningfully informed, empowered, and capable of intervention, not just rubber-stamping outputs from a &#8220;ghost in the machine.&#8221;</p><p><strong>Outright Ban on Fully Autonomous Weapons (LAWS): </strong>Some technologies don&#8217;t get a pass. Autonomous systems that independently select and engage targets represent a moral catastrophe and an existential threat rolled into one. Treat them like chemical weapons: prohibited, stigmatized, actively dismantled. Verification is challenging? Tough. So was nuclear disarmament.</p><p><strong>Radical Transparency &amp; Incident Accountability: </strong>Nations must publicly disclose significant military AI projects. Establish independent international bodies empowered to investigate incidents involving AI, ensuring &#8220;the algorithm did it&#8221; never excuses war crimes or lethal mistakes.</p><h4>2. Starve the Beast, Feed Civilian Innovation&#8202;&#8212;&#8202;Financial &amp; Talent Firewalls</h4><p>Money talks loudly&#8202;&#8212;&#8202;right now, it says &#8220;build smarter ways to surveil and kill.&#8221; We need to change those incentives.</p><p><strong>Public Funding with Ethical Teeth: </strong>Tie government grants explicitly to civilian-focused AI applications (climate, public health, education), and impose strict non-military clauses. Fund solutions to human problems, not enhancements to military arsenals.</p><p><strong>Conscience Clauses &amp; Talent Protection: </strong>Legally protect engineers&#8217; right to refuse ethically objectionable projects without retaliation. Elevate those working on life-affirming AI solutions over participants in the kill-chain or the surveillance economy. 
Make ethical integrity a career asset, not a liability.</p><p><strong>Tax the Toxic, Reward the Beneficial: </strong>Implement fiscal policies penalizing investments in autonomous weapons and oppressive surveillance tech. Conversely, offer significant incentives&#8202;&#8212;&#8202;tax breaks, grants, and subsidies&#8202;&#8212;&#8202;for AI advancing the public good. Building a surveillance state shouldn&#8217;t be lucrative.</p><h4>3. Mandate Public Accountability for Surveillance AI</h4><p>If governments use AI to watch citizens, citizens must hold an uncompromising mirror to the watchers.</p><p><strong>Mandatory Algorithmic Transparency: </strong>To a certain degree, some of these systems are inevitable. Some, like missile-defense systems, would even be a net good. However, we must take steps to ensure that all algorithmic systems publicly disclose capabilities, data inputs, usage plans, error rates, bias audits, and accountability measures (some secrecy would be required for sensitive systems, but we still need private audits there). End secretive algorithmic governance&#8202;&#8212;&#8202;citizens have an absolute right to know how their lives and freedoms are monitored and shaped.</p><p><strong>Empowered Civilian Oversight Bodies: </strong>Establish genuinely independent oversight boards with technical expertise, investigative authority, and teeth to halt biased, harmful, or ineffective AI deployments. Advisory committees won&#8217;t cut it; these bodies need real power&#8202;&#8212;&#8202;investigate, audit, sanction.</p><h4>4. Weaponize Capital &amp; Corporate Responsibility</h4><p>Capital isn&#8217;t neutral, but it can be pressured into ethical alignment.</p><p><strong>ESG Criteria for AI Ethics: </strong>Explicitly classify investment in lethal autonomous weapons, oppressive surveillance, and demonstrably biased AI as ESG-negative. 
Pressure pension funds, sovereign wealth funds, and major investors to divest, raising capital costs for unethical AI ventures.</p><p><strong>Internal Activism &amp; Shareholder Power: </strong>Foster cultures of accountability within tech companies and defense contractors through active shareholder and employee activism. Demand transparency, ethical standards, and accountability at shareholder meetings and internal forums. Make unethical AI a business liability, not an investment thesis.</p><h4>5. Fight Algorithmic Illiteracy&#8202;&#8212;&#8202;Educate Public &amp; Policymakers</h4><p>A public and policy elite ignorant of AI&#8217;s realities and risks can&#8217;t regulate it effectively.</p><p><strong>Critical AI Literacy for Leaders: </strong>Mandate comprehensive AI ethics and capabilities training for lawmakers, judges, civil servants, and security agencies. Equip decision-makers to grasp AI&#8217;s societal and ethical dimensions&#8202;&#8212;&#8202;not just vague hype.</p><p><strong>Demystify, Don&#8217;t Deify: </strong>Counter AI hype cycles with sober, evidence-based public education. Foster skepticism, nuance, and informed questioning over blind faith in technological solutions or dystopian inevitability. People deserve clarity, not confusion or techno-messianism.</p><h4>None of This Is Easy&#8202;&#8212;&#8202;Do It Anyway</h4><p>None of these will work easily. People will find ways to skirt regulations, lie, and exploit loopholes. We can&#8217;t make a perfect system. We can add more friction, however. By adding more and more friction, we create space in these systems: a space where whistleblowers, future talent, and other ethically minded people can pause and re-examine whether they want to be involved with what they&#8217;re building, and a space that makes sweeping violations under the carpet much harder.</p><p>It&#8217;s about building a scaffold, on which we can build the next steps. 
It can be easy to forget, but many of our most important systems were not created overnight. They took years (sometimes centuries) of testing, iteration, and rebuilding. This will be similar. Getting frustrated and not engaging because of childish reasons like &#8220;the bad guys always win&#8221; or &#8220;this is just how things are&#8221; is complicity.</p><h3>Part 6: The Choice&#8202;&#8212;&#8202;Who&#8217;s Holding the Pen?</h3><blockquote><p><em>&#8220;Everything that you thought had meaning: every hope, dream, or moment of happiness. None of it matters as you lie bleeding out on the battlefield. None of it changes what a speeding rock does to a body, we all die. But does that mean our lives are meaningless? Does that mean that there was no point in our being born? Would you say that of our slain comrades? What about their lives? Were they meaningless?&#8230; They were not! Their memory serves as an example to us all! The courageous fallen! The anguished fallen! Their lives have meaning because we the living refuse to forget them! And as we ride to certain death, we trust our successors to do the same for us! Because my soldiers do not buckle or yield when faced with the cruelty of this world! My soldiers push forward! My soldiers scream out! My soldiers RAAAAAGE!&#8221;</em></p><p>-Attack on Titan</p></blockquote><p>AI isn&#8217;t destiny; it&#8217;s infrastructure. Like any infrastructure, its design decides what gets built on top. Every algorithm released, every sensor switched on, every dollar routed into autonomous weapons doesn&#8217;t just enable new capability&#8202;&#8212;&#8202;it quietly rewrites the rules we&#8217;ll all live under.</p><p>Right now, that pen sits in the hands of defense contractors hunting their next tranche of revenue, bureaucracies chasing a mirage of machine-speed dominance, and black-box code that answers to no electorate. 
If we stay passive, our future gets finalized in microseconds&#8202;&#8212;&#8202;human judgment downgraded to a latency bug.</p><p>But the ink isn&#8217;t dry.</p><p>Lines will be drawn either way. The only live question is whether we pick up the pen, draw them ourselves, and keep redrawing when the world shifts&#8202;&#8212;&#8202;as it always does.</p><p>Perfect solutions don&#8217;t exist; decisive steps do. Fight for the audits. Demand the clauses. Fund the alternatives. Teach the next cohort to read the fine print in the source. Each move is messy, partial, limited, and still miles better than surrendering the script.</p><p>Sometimes victory isn&#8217;t total or triumphant. Sometimes it requires that we live to fight another day and leave foundations for someone else to build on. If I may, I would like to end by quoting one of my favorite novels:</p><blockquote><p><em>&#8220;Dr. Rieux resolved to compile this chronicle, so that he should not be one of those who hold their peace but should bear witness in favour of those plague-stricken people; so that some memorial of the injustice and outrage done them might endure; and to state quite simply what we learn in a time of pestilence : that there are more things to admire in men than to despise. <strong>None the less, he knew that the tale he had to tell could not be one of a final victory. It could be only the record of what had had to be done, and what assuredly would have to be done again in the never-ending fight against terror and its relentless onslaughts, despite their personal afflictions, by all who, while unable to be saints but refusing to bow down to pestilences, strive their utmost to be healers.</strong> And, indeed, as he listened to the cries of joy rising from the town, Rieux remembered that such joy is always imperilled. 
He knew what those jubilant crowds did not know but could have learned from books : that t<strong>he plague bacillus never dies or disappears for good; that it can lie dormant for years and years in furniture and linen-chests; that it bides its time in bedrooms, cellars, trunks, and bookshelves; and that perhaps the day would come when, for the bane and the enlightening of men, it roused up its rats again and sent them forth to die in a happy city.</strong>&#8221;</em></p><p>-The Plague, Albert Camus.</p></blockquote><p>Thank you for being here, and I hope you have a wonderful day.</p><p>Still optimistic</p><p>Dev &lt;3</p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article and wish to share it, please refer to the following guidelines.</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/why-we-must-all-support-anthropic?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/why-we-must-all-support-anthropic?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. 
</strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qp9_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb218992f-2fad-4a29-83ae-ac4aaaa1b45d_698x98.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qp9_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb218992f-2fad-4a29-83ae-ac4aaaa1b45d_698x98.png 424w, https://substackcdn.com/image/fetch/$s_!Qp9_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb218992f-2fad-4a29-83ae-ac4aaaa1b45d_698x98.png 848w, https://substackcdn.com/image/fetch/$s_!Qp9_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb218992f-2fad-4a29-83ae-ac4aaaa1b45d_698x98.png 1272w, https://substackcdn.com/image/fetch/$s_!Qp9_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb218992f-2fad-4a29-83ae-ac4aaaa1b45d_698x98.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qp9_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb218992f-2fad-4a29-83ae-ac4aaaa1b45d_698x98.png" width="698" height="98" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b218992f-2fad-4a29-83ae-ac4aaaa1b45d_698x98.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:98,&quot;width&quot;:698,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qp9_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb218992f-2fad-4a29-83ae-ac4aaaa1b45d_698x98.png 424w, https://substackcdn.com/image/fetch/$s_!Qp9_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb218992f-2fad-4a29-83ae-ac4aaaa1b45d_698x98.png 848w, https://substackcdn.com/image/fetch/$s_!Qp9_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb218992f-2fad-4a29-83ae-ac4aaaa1b45d_698x98.png 1272w, https://substackcdn.com/image/fetch/$s_!Qp9_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb218992f-2fad-4a29-83ae-ac4aaaa1b45d_698x98.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Reach out to me</h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- 
https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>Check out my other articles on Medium: <a href="https://machine-learning-made-simple.medium.com/">https://rb.gy/zn1aiu</a></p><p>My YouTube: <a href="https://rb.gy/88iwdd">https://rb.gy/88iwdd</a></p><p>Reach out to me on LinkedIn. Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://rb.gy/m5ok2y</a></p><p>My Instagram: <a href="https://rb.gy/gmvuy9">https://rb.gy/gmvuy9</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item><item><title><![CDATA[Most Important AI Updates of the week. Feb 16th 2026-Feb 22 2026 [Livestreams]]]></title><description><![CDATA[why AI benchmarks are failing and more]]></description><link>https://www.artificialintelligencemadesimple.com/p/most-important-ai-updates-of-the-175</link><guid isPermaLink="false">https://www.artificialintelligencemadesimple.com/p/most-important-ai-updates-of-the-175</guid><dc:creator><![CDATA[Devansh]]></dc:creator><pubDate>Mon, 23 Feb 2026 06:58:50 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/188851887/70961889ae0e5fd4c887830f20c5888f.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p><em>It takes time to create work that&#8217;s clear, independent, and genuinely useful. 
<strong><a href="https://artificialintelligencemadesimple.substack.com/subscribe">If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber</a>.</strong> It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. <strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">We run on a &#8220;pay what you can&#8221; model</a></strong><a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">&#8212;so if you believe in the mission, there&#8217;s likely a plan that fits (over here)</a></em>.</p><p><em>Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://artificialintelligencemadesimple.substack.com/subscribe&quot;,&quot;text&quot;:&quot;Help me buy chocolate milk&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://artificialintelligencemadesimple.substack.com/subscribe"><span>Help me buy chocolate milk</span></a></p><p><em><strong>PS</strong> &#8211; Supporting this work doesn&#8217;t have to come out of your pocket. 
If you read this as part of your professional development, you can <a href="https://docs.google.com/document/d/1xy6CNE8S7ZIM1LPKc5qdjwLJcqj6lwxzv3HFz3gEU14/edit?usp=sharing">use this email template</a> to request reimbursement for your subscription.</em></p><p><em><strong>Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more.<a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog"> </a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">If you&#8217;d like to meet other members of our community, please fill out this contact form here (</a><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">I will never sell your data nor will I make intros without your explicit permission</a></strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScCSWYlzouT8pzhfl0A2xdA0BxAPYg75h9F-WNkN8XuowpstA/viewform?usp=dialog">)</a>: <a href="https://forms.gle/Pi1pGLuS1FmzXoLr6">https://forms.gle/Pi1pGLuS1FmzXoLr6</a></em></p><div><hr></div><p>Thanks to everyone for showing up to the livestream. 
<strong>Mark your calendars for 8 PM EST, Sundays, to make sure you can come in live and ask questions.</strong></p><p>Bring your moms and grandmoms into the Chocolate Milk Cult.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/most-important-ai-updates-of-the-175?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.artificialintelligencemadesimple.com/p/most-important-ai-updates-of-the-175?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>As usual, we have a new foster cat that&#8217;s ready to be adopted. I call him Chipku (Hindi for clingy; his government name is Jancy), and as you might guess, he&#8217;s <strong>very</strong> affectionate. I&#8217;ve trained him to be better around animals and strangers, and he&#8217;s perfect for families that already have some experience with cats. We sleep together every day, and waking up to him is one of the nicest feelings. 
<a href="https://www.petfinder.com/cat/jancy-aa872185-733a-49d5-af35-e3a43593ea0f/ny/new-york/harlem-cats-ny1437/details/">If you&#8217;re around New York City, adopt him here (or share this listing with someone who might be interested).</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qCU8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qCU8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qCU8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qCU8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qCU8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qCU8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg" width="1456" height="1456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:911399,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.artificialintelligencemadesimple.com/i/183499880?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!qCU8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qCU8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qCU8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qCU8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2352a158-b110-4fd2-95a2-0dd5c06ad763_2208x2208.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h1><strong>Community Spotlight: RLMs</strong></h1><p>Recursive Language Models (RLMs) are a task-agnostic inference paradigm for language models (LMs) to handle near-infinite length contexts by enabling the LM to programmatically examine, decompose, and recursively call itself over its input. RLMs replace the canonical llm.completion(prompt, model) call with a rlm.completion(prompt, model) call. RLMs offload the context as a variable in a REPL environment that the LM can interact with and launch sub-LM calls inside of. It&#8217;s a super interesting idea, and I&#8217;d suggest playing with the concept yourself. <a href="https://github.com/alexzhang13/rlm">You can find the GitHub repo here. 
</a></p><p>If you&#8217;re doing interesting work and would like to be featured in the spotlight section, just drop your introduction in the comments or reach out to me. There are no rules: you could talk about a paper you&#8217;ve written, an interesting project you&#8217;ve worked on, some personal challenge you&#8217;re working on, ask me to promote your company/product, or anything else you consider important. The goal is to get to know you better, and possibly connect you with interesting people in our chocolate milk cult. No costs/obligations are attached.</p><h1><strong>Additional Recommendations (not in Livestream)</strong></h1><ul><li><p>&#8220;<a href="https://vmfunc.re/blog/persona">the watchers: how openai, the US government, and persona built an identity surveillance machine that files reports on you to the feds</a>&#8221;: Researchers discovered publicly accessible source maps tied to identity verification infrastructure from <strong>Persona</strong>, a vendor used by <strong>OpenAI</strong>. Those source maps referenced internal modules related to watchlist screening, politically exposed person checks, risk scoring, and reporting pathways connected to agencies like <strong>FinCEN</strong> and <strong>FINTRAC</strong>. What this confirms is that modern identity verification stacks often include full compliance tooling&#8212;not just photo matching. I didn&#8217;t cover this on the livestream because we don&#8217;t have confirmation yet on how bad this is, but for now it&#8217;s worth noting how deep these systems are getting. I&#8217;ll keep you updated as the story develops.</p></li><li><p>&#8220;<a href="https://youtu.be/E9mYrr_DnAE?si=ssKpvdydlX0bEEtf">The Man Too Pathetic to Punish (Ft. Value Select!)</a>&#8221;: Raynald de Chatillon was a broke French knight who built a pirate fleet in the middle of a desert and may have tried to dig up the Prophet Muhammad&#8217;s bones as a tourist attraction. 
And everyone let him go because his cringe was too powerful.</p></li><li><p><strong><a href="https://cameronrwolfe.substack.com/p/rubric-rl">Rubric-Based Rewards for RL</a>: </strong><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Cameron R. Wolfe, Ph.D.&quot;,&quot;id&quot;:29736521,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/69aba7df-b571-4609-aa47-fc2d031c11b8_1242x1595.jpeg&quot;,&quot;uuid&quot;:&quot;562a0318-67f5-40fd-bfe8-5d55b318c3ad&quot;}" data-component-name="MentionToDOM"></span> publishes, I share his posts. I&#8217;m still not done with the article but lots of great takeaways. </p></li><li><p>&#8220;<a href="https://arxiv.org/abs/2601.06002">The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning</a>&#8221;: Large language models (LLMs) often fail to learn effective long chain-of-thought (Long CoT) reasoning from human or non-Long-CoT LLMs imitation. To understand this, we propose that effective and learnable Long CoT trajectories feature stable molecular-like structures in unified view, which are formed by three interaction types: Deep-Reasoning (covalent-like), Self-Reflection (hydrogen-bond-like), and Self-Exploration (van der Waals-like). Analysis of distilled trajectories reveals these structures emerge from Long CoT fine-tuning, not keyword imitation. We introduce Effective Semantic Isomers and show that only bonds promoting fast entropy convergence support stable Long CoT learning, while structural competition impairs training. 
Drawing on these findings, we present Mole-Syn, a distribution-transfer-graph method that guides synthesis of effective Long CoT structures, boosting performance and RL stability across benchmarks.</p></li><li><p><strong><a href="https://weightythoughts.com/p/white-collar-apocalypse-isnt-around">White-Collar Apocalypse Isn&#8217;t Around the Corner&#8212;But AI Has Already Fundamentally Changed the Economy</a>: </strong>Incredible analysis of the impacts of AI by one of the world&#8217;s smartest AI commentators. We&#8217;re all very lucky that <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;James Wang&quot;,&quot;id&quot;:7343257,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7ea988e-c6f5-4b1e-9041-8a3081bccb3f_2200x2220.jpeg&quot;,&quot;uuid&quot;:&quot;7270d3d5-e517-4e2c-91bf-1f95535efc93&quot;}" data-component-name="MentionToDOM"></span> shares his insights free of cost. Enjoy it while it lasts, folks. </p></li><li><p><a href="https://substack.com/@gshao/p-188576197">Part I: The Gala, the Suburbs, and the &#8220;Months Behind&#8221; Myth in LLM Labs</a>. Great analysis of the Chinese ecosystem by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Grace Shao&quot;,&quot;id&quot;:878147,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!44Sc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cdde595-f989-4e2f-a7dc-a73ce0e036ec_2604x2604.jpeg&quot;,&quot;uuid&quot;:&quot;6f75e82b-61a0-446f-9b71-b548571f46e6&quot;}" data-component-name="MentionToDOM"></span>. 
My question from it: we often see Chinese labs introduce interesting architectural innovations &#8212; for example, Kimi&#8217;s MuonClip optimizer or DeepSeek&#8217;s hyper-connections approach. In these cases, the improvements don&#8217;t primarily come from pre-training scale or post-training techniques. They&#8217;re more algorithmic or architectural breakthroughs. How do you think these kinds of innovations factor into the overall trajectory? Are they central drivers of progress, or more like exceptions to the broader trend?</p></li></ul><h1>Companion Guide to the Livestream</h1><p><em>This guide expands the core ideas from the stream and structures them for deeper reflection. Most of you are reading this rather than watching &#8212; so the goal here is to make sure you get everything. Watch the full stream for tone, tangents, and the cat cameo at the end.</em></p><h2>1. Anthropic Stopped Trying to Win the Intelligence Race &#8212; And That Might Be the Smartest Move in the Market</h2><h3>What happened</h3><p>Sonnet 4.6 dropped this week. Both Sonnet 4.6 and Opus 4.6 now carry a 1 million token context window in beta &#8212; and most coverage treated that as the headline. It isn&#8217;t. The headline is what that context window is actually being built for, and what it tells us about where Anthropic is placing its bet going into the rest of 2026.</p><h3>Why this matters</h3><p>Cast your mind back 18 months. If you were building a serious production AI system, you wouldn&#8217;t touch Anthropic for agentic work. Claude was arguably the most intelligent model in the market &#8212; the reasoning was sharp, the outputs were nuanced &#8212; but the moment you gave it a system prompt with 50 rules and a stack of function calls, it fell apart. Instruction following was unreliable. Multi-step tool orchestration was inconsistent. It wasn&#8217;t built for that kind of work. 
OpenAI&#8217;s GPT-4o and GPT-4.1, despite being arguably less capable at pure reasoning, dominated agentic workflows. Hand them a large ruleset and say &#8220;call these functions in this order, follow these constraints,&#8221; and they would do it. Reliably. Repeatedly. That made them the default backbone for anyone building serious AI systems, even among teams that would have preferred Anthropic&#8217;s reasoning quality. What happened around December changed the picture. Anthropic released an Opus update that had clearly been trained on synthetic agent chains &#8212; long, complex sequences of tool calls with conditional branching. The model learned to call a tool, evaluate what came back, and dynamically revise its next move based on the output. Claude Code became the &#8220;oh shit&#8221; moment for the developer community not because it was suddenly smarter, but because it could <em>operate</em>. It could navigate real environments, handle unexpected outputs, and maintain coherent intent across dozens of sequential tool calls without losing the thread.</p><h3>The architecture Anthropic is likely building toward</h3><p>The 1 million token context window in both Sonnet and Opus is a direct extension of that same bet, and it&#8217;s worth being precise about what it&#8217;s actually for. The assumption most people make is that large context windows are for humans who want to dump codebases into a chat and ask questions. That&#8217;s not the primary use case Anthropic is building for. They&#8217;re building for agents that need to read full debug logs, trace complete agent trajectories, and hold the state of an entire repository in memory across a long task. The context isn&#8217;t there for retrieval &#8212; it&#8217;s there for operational awareness. Watch the divergence between Sonnet and Opus carefully. 
The architectural split is older than this release &#8212; Sonnet actually had the 1M context window before Opus did in earlier generations, which is itself a signal worth sitting with. Post-3.5 is where this divergence really started to show up, and both models carry it now in 4.6. The hypothesis here &#8212; speculative, but directionally supported by everything Anthropic has shipped &#8212; is that Sonnet is being optimised as an orchestrator and Opus as an executor. An orchestrator needs to hold a meta-view of an entire system: what has been done, what needs to happen next, where the problems are. That&#8217;s why you need the large context window on the orchestrator. Opus then comes in with high-capability execution on specific subtasks. You don&#8217;t need a million-token context window to fix a specific bug once someone tells you exactly where it is. This maps to a broader shift in how AI systems are being architected. The value is no longer in having one genius model that does everything. It&#8217;s in models that know their role, stay in their lane, and communicate well with each other. The vocabulary of &#8220;orchestrators&#8221; and &#8220;executors&#8221; is going to become standard design language over the next 12 months.</p><h3>The implication for builders</h3><p>Anthropic is positioning to be the dominant choice for orchestration layers in serious agentic systems. Not because Claude is the most intelligent model &#8212; GPT-5.2 still holds that crown for pure reasoning tasks &#8212; but because Anthropic has done the specific work to make a model that understands how to operate within a system. That turns out to be the more valuable capability when you&#8217;re building at scale. Sonnet 4.6&#8217;s early evals back this up: developers preferred it over Sonnet 4.5 70% of the time in Claude Code testing, and even preferred it over Opus 4.5 in 59% of comparisons. The intelligence gap is narrowing. 
The operational gap has already closed.</p><p>Read about Weka, the startup that&#8217;s betting on the growth of context and KV caches from this agentic explosion here&#8212;</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;007b4a68-7a7a-4f97-9064-384609370b78&quot;,&quot;caption&quot;:&quot;It takes time to create work that&#8217;s clear, independent, and genuinely useful. If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber. It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;How One Startup is Breaking Nvidia&#8217;s Memory Bottleneck&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:8101724,&quot;name&quot;:&quot;Devansh&quot;,&quot;bio&quot;:&quot;The best meme-maker in Tech. Writer on AI, Software, and the Tech Industry. Currently in NYC Come say hi, I want more friends. 
&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48081c70-8afa-41e3-a44e-b0f917bc7577_1200x1600.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-01-20T01:09:13.717Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!tAFB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7efe7305-51c7-43d6-8774-726dc1cebd0c_1220x642.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/how-one-startup-is-breaking-nvidias&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:184275783,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:60,&quot;comment_count&quot;:2,&quot;publication_id&quot;:1315074,&quot;publication_name&quot;:&quot;Artificial Intelligence Made Simple&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Pfon!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>Read about the trends we can infer from the demand for AI models here&#8212;</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;371d3d5e-7053-496f-95d9-645cae40bcdf&quot;,&quot;caption&quot;:&quot;Every month, the Chocolate Milk Cult reaches over a million Builders, Startup Founders, Investors, Policy Makers, Leaders, and more. 
If you&#8217;d like to meet other members of our community, please fill out this contact form here (I will never sell your data nor will I make intros w/o your explicit permission&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;AI Isn&#8217;t Accelerating. It&#8217;s Settling.&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:8101724,&quot;name&quot;:&quot;Devansh&quot;,&quot;bio&quot;:&quot;The best meme-maker in Tech. Writer on AI, Software, and the Tech Industry. Currently in NYC Come say hi, I want more friends. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48081c70-8afa-41e3-a44e-b0f917bc7577_1200x1600.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-02-03T08:32:21.003Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!k19r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c005577-e890-428f-b867-ca78961b546a_500x889.jpeg&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/ai-isnt-accelerating-its-settling&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:186700773,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:46,&quot;comment_count&quot;:3,&quot;publication_id&quot;:1315074,&quot;publication_name&quot;:&quot;Artificial Intelligence Made Simple&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Pfon!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>2. 
What Is Actually Wrong with Gemini 3.1 (And Why the Problems Are Structural, Not Fixable with a Model Update)</h2><h3>What happened</h3><p>Gemini 3.1 Pro dropped on February 19th. On paper the numbers look compelling &#8212; 77.1% on ARC-AGI-2, which is more than double Gemini 3 Pro&#8217;s score, and a #1 ranking on Artificial Analysis&#8217;s intelligence index at time of publication. Google is presenting this as a major step forward in reasoning. The benchmark story is genuinely impressive. The problem is that benchmarks are exactly the thing you should be sceptical of here, and there are structural reasons &#8212; in how the model was trained and in how DeepMind runs its research culture &#8212; that make this more than a model quality debate.</p><h3>The training pipeline problem</h3><p>To understand what went wrong, you need to understand how modern large language models are trained. Pre-training is where a model builds its fundamental world model &#8212; it ingests vast quantities of data, learns associations, and develops its base understanding of how information relates to information. This is the phase that sets the raw intelligence ceiling. Then comes the training phase, and finally post-training, where you fine-tune behaviour, align with human preferences, and shape how the model responds in practice. Historically, model intelligence came primarily from pre-training. Benchmarks tracked this reasonably well because they were designed to test generalised capability. The smarter your base model, the better your benchmark performance. That relationship has started to break down. Pre-training data has largely converged across labs &#8212; everyone is drawing from similar sources, similar volumes, similar techniques. The differentiation at pre-training has narrowed significantly. So labs started treating post-training as the primary lever for benchmark performance. Here&#8217;s where DeepMind made a critical error. 
They invested heavily in post-training aimed specifically at benchmark scores. The mechanism is straightforward: build synthetic datasets that mirror the benchmark distribution as closely as possible, and train the model aggressively against those distributions. The model gets very good at those benchmarks. The press release looks great. But this technique has a well-documented failure mode. When you train a model to optimise against a specific distribution &#8212; and especially when you use reinforcement learning to do it &#8212; the model learns to game that distribution rather than develop the underlying capability the benchmark was designed to measure. RL is particularly aggressive at this. Its entire training objective is to maximise a reward signal, and the most efficient path to doing that is almost always to find a shortcut rather than to actually solve the problem. You end up with a model that has learned the texture of benchmark questions rather than the underlying reasoning. The moment real tasks diverge from that texture &#8212; agentic workflows, complex instruction chains, genuinely novel problems &#8212; the shortcuts stop working and the model behaves erratically. This is the core prediction: the benchmark numbers won&#8217;t translate to a Gemini CLI comeback, and Gemini 3.1 is not going to be the developer adoption story Google is hoping for. The model was never unintelligent to begin with. Gemini&#8217;s base capability has always been strong. What happened is that a fundamentally capable model got trained into a very narrow region of competence. You can&#8217;t undo that by scaling &#8212; you have to retrain with a different objective.</p><h3>Why this keeps happening at large labs</h3><p>The research culture problem is worth understanding clearly because it isn&#8217;t unique to DeepMind. It&#8217;s a structural feature of how large AI labs manage research investment under uncertainty. 
Research on fundamental improvements &#8212; novel architectures, new training paradigms, mathematical innovations &#8212; has a specific and uncomfortable property: you cannot predict in advance whether it will work. If you&#8217;re a researcher at a major lab and you want to pursue something genuinely novel, you have to go to leadership and say &#8220;give me resources and time, and I genuinely don&#8217;t know if this will produce anything.&#8221; That is a very difficult sell when the people making budget decisions are generalists under pressure to show consistent progress. Post-training benchmark optimisation, by contrast, offers highly predictable returns. &#8220;Give me X compute and X synthetic data and I&#8217;ll get you Y improvement on this evaluation.&#8221; Scaling laws became beloved in research organisations precisely because they turned AI research into a capital allocation problem &#8212; more money, predictable performance curve. You could walk into a budget meeting with a graph and a number. Fundamental research can&#8217;t give you that. The result is a systematic selection effect: the research that gets funded is the research that can demonstrate near-term measurable returns. The researchers who stay and get promoted are the ones who are good at navigating that system. The people with the highest appetite for genuine paradigm-shifting work tend to hit walls, get frustrated, and leave to start their own things or join smaller shops. At a company like Google this effect is compounded by organisational scale. Research direction decisions are being made several levels removed from the actual work. The sunk cost bias toward already-funded projects over scrappy new ones with no track record is well established. You end up pouring resources into benchmark optimisation not because it&#8217;s the best strategy, but because it&#8217;s the safest career move for everyone in the decision chain. 
What you&#8217;re left with is a very expensive model that performs well in a very specific setting and struggles everywhere else.</p><h3>The falsifiable call</h3><p>There will be noise in the coming weeks about Gemini 3.1 driving a CLI comeback and a resurgence in developer adoption. It won&#8217;t materialise &#8212; not the way Claude Code happened in December, not the way Codex is starting to get traction. The Gemini Flash and Flash-Lite models are genuinely good, interestingly for the inverse reason: when you over-optimise the main model and then run aggressive regularisation to distill a smaller version, you sometimes get something that works well in its narrower operating range. But the flagship intelligence at the agentic and complex instruction-following level is not where it needs to be. You can verify this yourself. Take it off the benchmark distribution and see what happens.</p><p>Why RL is overrated&#8212;</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6cd4d6b6-7bd0-4769-853a-b1be7573d3c3&quot;,&quot;caption&quot;:&quot;It takes time to create work that&#8217;s clear, independent, and genuinely useful. If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber. It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Scaling Reinforcement Learning will never lead to AGI &quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:8101724,&quot;name&quot;:&quot;Devansh&quot;,&quot;bio&quot;:&quot;The best meme-maker in Tech. Writer on AI, Software, and the Tech Industry. Currently in NYC Come say hi, I want more friends. 
&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48081c70-8afa-41e3-a44e-b0f917bc7577_1200x1600.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2025-12-10T13:12:14.575Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!brp_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa20f6728-aa6f-478a-ac24-78dee2ae698c_500x559.jpeg&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/scaling-reinforcement-learning-will&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:181233281,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:67,&quot;comment_count&quot;:18,&quot;publication_id&quot;:1315074,&quot;publication_name&quot;:&quot;Artificial Intelligence Made Simple&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Pfon!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>3. GPT-5.3 Codex and Codex-Spark: Reading What OpenAI Is Actually Telling You</h2><h3>What happened</h3><p>OpenAI dropped two things this week. GPT-5.3-Codex launched on February 12th &#8212; a full model update, their most capable agentic coding model to date, trained with and served on NVIDIA GB200 NVL72 systems. Then came Codex-Spark: a smaller, faster version running not on NVIDIA hardware but on Cerebras&#8217;s Wafer-Scale Engine 3. Codex-Spark launched inside the Codex app &#8212; which at release was Mac-only, meaning a lot of people (Windows users included) didn&#8217;t get access until much later. 
It&#8217;s delivering over 1,000 tokens per second, roughly 15x faster than the standard Codex model.</p><h3>What Codex-Spark is actually telling you</h3><p>The model quality question for GPT-5.3 Codex is genuinely difficult to answer right now. It doesn&#8217;t obviously outperform GPT-5.2 on every task, and separating model improvements from infrastructure improvements &#8212; they also ran GPT-5.3 Codex 25% faster through inference stack upgrades &#8212; makes direct comparison hard. That&#8217;s fine. The more interesting signal is the infrastructure story. OpenAI partnered with Cerebras in January in a multi-year deal worth over $10 billion. Codex-Spark is the first concrete output of that partnership and the first OpenAI model not running on NVIDIA hardware. Sam Altman publicly called NVIDIA &#8220;the best chip makers in the world&#8221; and described the relationship as foundational &#8212; that&#8217;s the public line, and of course it&#8217;s the public line. Jensen&#8217;s not happy about this. The strategic intent is clear enough: diversifying inference infrastructure away from near-total GPU dependency reduces supply chain risk and creates negotiating leverage. Cerebras&#8217;s Wafer-Scale Engine is built for inference in a way that GPU clusters aren&#8217;t &#8212; the entire model lives on a single wafer of silicon, eliminating the inter-chip communication latency that slows GPU clusters down. That architecture is specifically suited to the low-latency, high-throughput demands of agentic workflows.</p><h3>The bigger strategic read</h3><p>The Cerebras move, combined with the strong inference speed focus, maps to a specific strategic trajectory for OpenAI. They appear to be hitting a ceiling on raw intelligence gains and shifting significant resources toward deployment economics &#8212; faster inference, lower cost per token, eventually the ad-supported tier for free users. 
That&#8217;s the rational move for a company on an IPO trajectory that needs to demonstrate a credible path to profitability. Intelligence is a research problem. Margins are a business problem. OpenAI is increasingly focused on the business problem. There&#8217;s also a competitive rebalancing worth acknowledging. Eighteen months ago, OpenAI had the best models for agentic work. Anthropic has taken that ground. OpenAI still holds the lead for pure intelligence tasks &#8212; if you want the most accurate analysis of a genuinely ambiguous problem, GPT-5.2 is still the benchmark. But agentic workflows, complex tool chains, software development in real environments &#8212; that&#8217;s Anthropic&#8217;s territory now. The two labs have effectively traded positions, and both are consolidating into their new ground.</p><p>We first broke down these hardware-market trends in our deep dive on why GPUs are no longer the one-size-fits-all answer and why inference is opening a new market, which is what we&#8217;re now seeing with Cerebras and OpenAI vs. NVIDIA:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;7fe54173-8ce2-4edd-8ec8-5beafe2f7a62&quot;,&quot;caption&quot;:&quot;It takes time to create work that&#8217;s clear, independent, and genuinely useful. If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber. It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The GPU Monopoly is Over. The New AI Infrastructure Stack Part 1 [Investigations]&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:8101724,&quot;name&quot;:&quot;Devansh&quot;,&quot;bio&quot;:&quot;The best meme-maker in Tech. Writer on AI, Software, and the Tech Industry. 
Currently in NYC Come say hi, I want more friends. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48081c70-8afa-41e3-a44e-b0f917bc7577_1200x1600.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2025-06-29T18:25:18.835Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!uBJJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8ec06e9-468c-4730-bd1f-0c306c203ac3_648x1152.jpeg&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/the-gpu-monopoly-is-over-the-new&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:167119416,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:77,&quot;comment_count&quot;:9,&quot;publication_id&quot;:1315074,&quot;publication_name&quot;:&quot;Artificial Intelligence Made Simple&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Pfon!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>4. NVIDIA Blackwell and the Meta Deal: The Hardware Layer Catches Up</h2><h3>What happened</h3><p>Two hardware stories worth tracking together. Semi-Analysis published throughput numbers for NVIDIA&#8217;s Blackwell Ultra: approximately 50x better throughput and 35x lower cost per token versus the Hopper generation. Separately, Meta signed a multi-year agreement with NVIDIA covering both GPUs and CPUs.</p><h3>On the Blackwell numbers</h3><p>Semi-Analysis does genuinely high quality technical work. 
They&#8217;re also no longer fully independent &#8212; institutional relationships and access considerations create soft incentives to present company narratives charitably. Treat the 50x and 35x figures as directional upper bounds, not engineering specs. Even a fraction of those improvements would still represent a major generational step in inference economics. The more interesting detail is what SemiAnalysis flagged as the primary design targets: large context windows and agentic workflows. Blackwell&#8217;s memory architecture and bandwidth improvements are specifically suited to the workloads that both Anthropic and OpenAI are building toward &#8212; long context chains, multi-agent communication, iterative tool calls. This is not a coincidence. The hardware layer is aligning behind the same architectural thesis as the model layer.</p><h3>On the Meta-NVIDIA deal</h3><p>The CPU inclusion is the interesting part. NVIDIA is using CPUs as an ecosystem lock-in mechanism &#8212; take the GPU deal, take the CPU package, and now your entire stack is NVIDIA-native. Your kernels are optimised for their memory hierarchy, your networking is InfiniBand, your switching costs become substantial. AMD and Intel have been trying to build an open-standards alternative to CUDA for years, partly through CPU-GPU integration plays. Getting Meta into the full NVIDIA ecosystem is a direct counter-move to that effort. Google was reportedly in conversations with Meta about TPUs and didn&#8217;t close &#8212; very Google. The TPU ecosystem remains capable, particularly for inference, but the kernel support and integration burden are real, and NVIDIA&#8217;s full-stack offer won out. Meta has the money and the strategic rationale to be investing heavily in custom silicon for its longer-term VR and consumer hardware ambitions, but the people making these decisions at the top of large organisations tend not to be the most risk-tolerant. The NVIDIA deal is the safer play. 
Whether it&#8217;s the better one is a different question.</p><h2>5. The Research Horizon: Predicting Model Properties Before You Train</h2><h3>What&#8217;s being worked on</h3><p>Can you mathematically derive the properties of a trained model from the geometric structure of the model before training it at all? Not just &#8220;more compute, better performance&#8221; &#8212; but genuinely predicting which capabilities will emerge, how the model will behave on classification tasks, what its ceiling looks like, from the architecture itself.</p><h3>Why this is a hard and worthwhile problem</h3><p>The current paradigm for understanding AI capability is empirical. You train, you evaluate, you observe. Scaling laws gave us predictability in a narrow sense &#8212; more compute predicts better benchmark performance &#8212; but they don&#8217;t tell you anything about which capabilities emerge, when they emerge, or what the model will fail at. The training run is the experiment, and the experiment is expensive. What this research is probing is whether the geometry of a model&#8217;s weight space &#8212; the mathematical structure of how information is encoded, how different regions of the latent space relate to each other, how basins of attraction are shaped &#8212; contains information about capability before training. If it does, you can do pre-training analysis. You can fail faster, iterate on design rather than on training runs, and potentially understand why models with certain structural properties are better or worse at certain classes of tasks. This is high-variance research. It might not work. 
The prize if it does is significant, and it&#8217;s exactly the kind of work that large labs are structurally least able to fund for the reasons outlined in the earlier section.</p><p>This is an extension of our earlier work on Fractal Embeddings, shared here&#8212; </p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;46c1762a-1b3b-4a82-8241-d9cd20989186&quot;,&quot;caption&quot;:&quot;It takes time to create work that&#8217;s clear, independent, and genuinely useful. If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber. It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;How Fractals Can Improve How AI Models Internally Represent Information&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:8101724,&quot;name&quot;:&quot;Devansh&quot;,&quot;bio&quot;:&quot;The best meme-maker in Tech. Writer on AI, Software, and the Tech Industry. Currently in NYC Come say hi, I want more friends. 
&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48081c70-8afa-41e3-a44e-b0f917bc7577_1200x1600.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-02-13T07:52:50.518Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!4x6V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf48413f-754f-44df-a711-fc4958370e35_2160x1222.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/how-fractals-can-improve-how-ai-models&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:187832023,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:36,&quot;comment_count&quot;:2,&quot;publication_id&quot;:1315074,&quot;publication_name&quot;:&quot;Artificial Intelligence Made Simple&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Pfon!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div><hr></div><p>Subscribe to support AI Made Simple and help us deliver more quality information to you-</p><p>Flexible pricing available&#8212;<a href="https://artificialintelligencemadesimple.substack.com/p/help-me-take-ai-made-simple-to-the">pay what matches your budget here</a>.</p><p>Thank you for being here, and I hope you have a wonderful day.</p><p>Dev &lt;3</p><p><a href="https://artificialintelligencemadesimple.substack.com/p/read-this-if-you-want-to-share-ai">If you liked this article 
and wish to share it, please refer to the following guidelines.</a></p><p>That is it for this piece. I appreciate your time. As always, if you&#8217;re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. <strong>It is word-of-mouth referrals like yours that help me grow. </strong>The best way to share testimonials is to share articles and tag me in your post so I can see/share it.</p><h3><strong>Reach out to me</strong></h3><p>Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.</p><p><a href="https://www.instagram.com/yourgodandsavior/">Small Snippets about Tech, AI and Machine Learning over here</a></p><p><a href="https://artificialintelligencemadesimple.substack.com/">AI Newsletter- https://artificialintelligencemadesimple.substack.com/</a></p><p><a href="https://codinginterviewsmadesimple.substack.com/">My grandma&#8217;s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/</a></p><p><a href="https://open.spotify.com/show/7wZygk3mUUqBaRbBGB1lgh?si=b93afa69de994c88&amp;nd=1&amp;dlsi=ac0f8d9ac35642d5">My (imaginary) sister&#8217;s favorite MLOps Podcast-</a></p><p>Check out my other articles on Medium. 
</p><p>https://machine-learning-made-simple.medium.com/</p><p>My YouTube: <a href="https://www.youtube.com/@ChocolateMilkCultLeader/">https://www.youtube.com/@ChocolateMilkCultLeader/</a></p><p>Reach out to me on LinkedIn. Let&#8217;s connect: <a href="https://www.linkedin.com/in/devansh-devansh-516004168/">https://www.linkedin.com/in/devansh-devansh-516004168/</a></p><p>My Instagram: <a href="https://www.instagram.com/iseethings404/">https://www.instagram.com/iseethings404/</a></p><p>My Twitter: <a href="https://twitter.com/Machine01776819">https://twitter.com/Machine01776819</a></p>]]></content:encoded></item></channel></rss>