
Most important AI updates of the week, 8th September to 15th September 2025 [Livestream]

OpenAI is going to IPO, Mira Murati fights hallucinations, and more.

Thank you to everyone for showing up to the livestream. Mark your calendars for 8 PM EST, Sundays, to make sure you can come in live and ask questions.

Bring your moms and grandmoms into my cult.


Before you begin, here is your obligatory reminder to adopt my foster monkey Floop. He’s affectionate, relaxed, and can adjust to other pets, kids, or people. No real reason not to adopt him. So if you’re around NYC and want a very low-maintenance but affectionate cat, then consider adopting him here.

Community Spotlight: Us

Our community crossed over 250K followers for our deep-dives (238K on Substack; 18K email subs on Medium; 36K Followers on Medium). Planning to do a “How I write this newsletter” style piece. Lmk if you have any questions you’d like me to cover.

If you’re doing interesting work and would like to be featured in the spotlight section, just drop your introduction in the comments or by reaching out to me. There are no rules: you could talk about a paper you’ve written, an interesting project you’ve worked on, some personal challenge you’re working on, ask me to promote your company/product, or anything else you consider important. The goal is to get to know you better and possibly connect you with interesting people in our chocolate milk cult. No costs/obligations are attached.

Additional Recommendations (not in Livestream)

  1. always has great overviews of some important AI discussions in the week. She’s pretty selective with her picks and gives a decent level of coverage to each topic, so her work is pretty good for getting a handle on discussions. Check this week’s coverage over here.

  2. Should Every Woman Be on Hormone Therapy? by

    is an incredibly powerful read.

  3. has great coverage of important resources over here.

  4. The Turing Post team popped off in this discussion of the different chips required to run AI. Nothing but respect to

    and the others involved.

  5. Latest News on TechBio Revolution 🌳by

    covers amazing information (as usual for her).

  6. the cost of revolution is you is a fantastic video about ethics, winning, and privilege in the context of Code Geass. Great writing and presentation; the ideas discussed elevate what is already a peak show.

  7. The Carpenter Who Outsmarted Newton is a great (true) story about how obsession beats “intelligence” and how breakthroughs come from all walks of life. It’s why accessibility and opportunity matter.

  8. Learning Molecular Representation in a Cell is a great demonstration of how AI can be used to actually solve hard problems.

  9. puts out a lot of heat. Luth, LFM2, and the Evolution of LLM/SLM Optimization Strategies is no exception.

  10. 5 interesting AI Safety & Responsibility papers (#2) by

    is very thought-provoking.

Companion Guide to the Livestream

This guide expands the core ideas and structures them for deeper reflection. Watch the full stream for tone, nuance, and side-commentary.

0. Buzz Cuts Are the Superior Haircut

Why everyone should have buzz cuts.

1. OpenAI IPO: Not Just Cash, But Structure

OpenAI’s decision to transition into a Public Benefit Corporation (PBC) isn’t cosmetic—it rewires the organization for the capital markets. A PBC means the nonprofit board still holds control, but the company can now access new classes of equity and debt financing. For investors, this means OpenAI can finally join peers on the public markets without abandoning its unusual governance.

Why it matters:

  • Selective disclosure: expect an IPO prospectus that shows carefully chosen financials while shielding the messier parts of its nonprofit + for-profit hybrid.

  • Wall Street’s tempo: once listed, OpenAI will be yoked to quarterly earnings calls, revenue targets, and analyst expectations. That’s a cultural shock to a lab that grew up evangelizing mission over margin.

  • Microsoft’s grip loosens: by moving into multi-cloud and multi-model arrangements, OpenAI reduces dependency on Azure—shifting bargaining power.

The rumored 2026 IPO is plausible, but still speculation. What’s certain is that this PBC move is the structural precondition for going public.

Read more:

2. Wealth Games & Financial Engineering

Larry Ellison’s net worth briefly surpassed Elon Musk’s as Oracle stock surged. That moment was more than a vanity metric—it showed how financial engineering is now weaponized in tech. Oracle’s narrative maneuvers drove valuations upward, punishing competitors by proxy.

This is the new pattern: valuations are no longer just a scoreboard, they are instruments of influence. Expect more boardrooms treating stock price jumps like strategic victories rather than side effects.

Read more:

3. Which Models Actually Lead

Marketing hype doesn’t match lived performance. The current working hierarchy is:

  • Gemini – consistently the strongest “average case.” Excellent research capacity, reliable outputs across many domains.

  • GPT-5 – better at planning, instruction following, and adapting flexibly. Slightly weaker at front-end coding compared to Claude.

  • Claude – overly formal, stiff, and frustrating for exploratory or risky tasks.

Beyond the top three:

  • Manus – excels in full-stack agentic workflows, where tool orchestration matters more than raw text output.

  • Qwen3 / Qwen3-Coder – open-weight models with competitive performance. Their long context windows and coding specialization make them strong candidates for local deployments.

  • Cohere – expensive and strategically adrift. Their rerankers are marginal at best compared to more efficient retrieval pipelines.

The competitive edge now is less about raw “IQ” and more about flexibility, tool use, and memory.

Read more:

4. Tool Use vs Scale → Verifier Economies

Research keeps confirming what builders already know: smaller models with tools outperform larger models without them. A lean LLM connected to calculators, retrievers, and reasoning chains can rival or beat state-of-the-art giants.

But there’s a trap: tools amplify errors. If a model can’t recognize when a tool misfires, the error cascades. That’s why the next evolution isn’t just “agents” but verifier economies—systems where models audit their own reasoning paths, scrap dead ends, and recalibrate in real time.

This isn’t just an academic point. It’s the foundation of the next infrastructure layer: whoever solves verifiers gains control over how agentic stacks scale safely and reliably.
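
To make the verifier idea concrete, here is a minimal sketch of an agent loop that gates every tool result through a verification step before it enters the working context. The call_llm, run_tool, and verify functions are hypothetical stand-ins rather than any particular framework’s API; the point is the control flow: unverified steps get scrapped instead of cascading.

```python
# Minimal sketch of a tool-using loop with a verifier gate.
# call_llm, run_tool, and verify are hypothetical stand-ins, not real library APIs.

def call_llm(prompt: str) -> dict:
    """Hypothetical LLM call returning a proposed step: a tool invocation or a final answer."""
    raise NotImplementedError

def run_tool(name: str, args: dict) -> str:
    """Hypothetical tool dispatcher (calculator, retriever, etc.)."""
    raise NotImplementedError

def verify(step: dict, observation: str) -> bool:
    """Hypothetical verifier: checks the tool output against the claim it is
    supposed to support (units, ranges, agreement with sources, ...)."""
    raise NotImplementedError

def solve(task: str, max_steps: int = 8) -> str:
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        step = call_llm("\n".join(history))
        if step["type"] == "answer":
            return step["text"]
        observation = run_tool(step["tool"], step["args"])
        if verify(step, observation):
            # Only verified observations are allowed to shape later reasoning.
            history.append(f"{step['tool']} -> {observation}")
        else:
            # Scrap the dead end instead of letting the error cascade.
            history.append(f"{step['tool']} output failed verification; try a different route.")
    return "No verified answer within budget."
```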

Read more:

5. Stability and Hallucinations

Two key research thrusts:

  • Mira Murati’s Thinking Machines Lab: pushing determinism (same input, same output). This matters because reliability trumps brilliance in regulated domains. A model that’s merely “good” but predictable is more valuable than one that’s brilliant but inconsistent.

  • OpenAI’s hallucination framing: hallucinations emerge because models are rewarded when correct but not penalized when wrong. That incentive design encourages guessing over calibrated uncertainty.

Both are partial fixes. The deeper flaw is early-path lock-in: once a model starts down a bad reasoning branch, it rationalizes and doubles down instead of correcting. The solution will require layered interventions: determinism, uncertainty calibration, verifier loops, and self-reward systems. I think Mira Murati’s approach gets closer to the fix, but it will take a lot more than determinism to “fix” hallucinations.
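
A toy way to see the incentive problem in OpenAI’s framing: under accuracy-only grading, a guess with any nonzero chance of being right beats abstaining in expectation, so a model that never says “I don’t know” scores better; once confident wrong answers carry a penalty, abstaining becomes rational below a confidence threshold. The rubric and numbers below are illustrative assumptions, not taken from OpenAI’s paper.

```python
# Toy illustration of why accuracy-only grading rewards guessing.
# The scoring rubric and probabilities are illustrative assumptions.

def expected_scores(p_correct: float, reward: float, penalty: float, abstain: float = 0.0):
    """Expected score of guessing vs. abstaining under a simple rubric."""
    guess = p_correct * reward - (1 - p_correct) * penalty
    return guess, abstain

# Accuracy-only grading: +1 for a correct answer, 0 otherwise.
print(expected_scores(p_correct=0.2, reward=1.0, penalty=0.0))  # (0.2, 0.0): guessing wins
# Calibration-aware grading: a wrong answer costs -1.
print(expected_scores(p_correct=0.2, reward=1.0, penalty=1.0))  # (-0.6, 0.0): abstaining wins
print(expected_scores(p_correct=0.8, reward=1.0, penalty=1.0))  # (0.6, 0.0): confident answers still pay
```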

Read more:


6. Regulation: Noise vs Strategy

Not all regulation is equal.

  • California SB-53: requires AI companies to report “critical incidents” within 15 days (or 24 hours if imminent harm). It’s more focused than prior bills, but risks turning into compliance noise that distracts from deeper risks.

  • EU: strong on fairness and data privacy, but weak in pushing future-facing AI R&D.

  • China: aggressively mandates open source and steers AI into hard sciences—an industrial policy approach rather than a laissez-faire one.

  • U.S.: using AI as geopolitical leverage—exporting full “AI stacks” and buying stakes in Intel to reduce supply chain exposure.

The divergence is stark: some jurisdictions legislate distraction, others legislate power.

And here is the meme promised.

[Meme: a World of Statistics post reading “The EU just passed China on GDP. EU: $19.99 trillion; China: $19.23 trillion,” captioned “A glimpse into the future?” over a shot of a cyberpunk Chinese city.]

Read more:


7. Capital Misallocation

The venture landscape looks vibrant, but the capital flows are misaligned. Billions chase AI CRM clones, “copilot for X,” and trivial SaaS assistants, while nuclear simulations, personalized medicine, and materials discovery scrape by.

This isn’t just boring—it’s corrosive. Capital locked into shallow tools diverts talent, compute, and investor patience away from harder, higher-impact work. Short-term profit incentives dominate, leaving the grand promises of AI (climate modeling, drug design, infrastructure optimization) underfunded.

Read more:


8. Chips, Robotics, and Sovereignty

Robotics and chips are two sides of the same coin. As robotics grows more advanced, it exposes chip bottlenecks; as chips advance, they unlock new robotic capabilities. This symbiosis is why governments treat them as sovereignty levers.

Current moves:

  • Apple Intelligence: designed to process as much as possible on-device, with Private Cloud Compute only when necessary.

  • Google Gemini Nano: runs offline on Pixel devices, demonstrating device-level autonomy.

  • Samsung’s $44B Texas fab: emblematic of U.S. efforts to reduce dependence on TSMC and Taiwan.

The deeper implication: these aren’t just product decisions—they are geopolitical strategies. Hardware independence is becoming synonymous with national security.

Read more:


9. Copyright & Data

Anthropic’s proposed $1.5B settlement ($3,000 per book) with authors stalled after Judge Alsup rejected it, citing the need for a proper title list and a fairer claims process.
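
For scale: at $3,000 per work, a $1.5B fund implies coverage on the order of 500,000 books.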

Why it matters: this case is precedent-setting. If upheld, it would force AI companies to consistently pay for licensed training data. If weakened, it signals that scraping and settlement is an acceptable cost of doing business.

Read more:


10. Security Undercurrents

DeepSeek was caught with an exposed database—API keys, logs, internal data all left unsecured. It validated concerns that many AI vendors rush to scale without hardening pipelines.

Lesson: local deployment may reduce dependence, but even open-weight models are only as safe as their operational hygiene.

Read more:

Subscribe to support AI Made Simple and help us deliver more quality information to you-

Flexible pricing available—pay what matches your budget here.

Thank you for being here, and I hope you have a wonderful day.

Dev <3

If you liked this article and wish to share it, please refer to the following guidelines.


That is it for this piece. I appreciate your time. As always, if you’re interested in working with me or checking out my other work, my links will be at the end of this email/post. And if you found value in this write-up, I would appreciate you sharing it with more people. It is word-of-mouth referrals like yours that help me grow. The best way to share testimonials is to share articles and tag me in your post so I can see/share it.

Reach out to me

Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.

Small Snippets about Tech, AI and Machine Learning over here

AI Newsletter- https://artificialintelligencemadesimple.substack.com/

My grandma’s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/

My (imaginary) sister’s favorite MLOps Podcast-

Check out my other articles on Medium:

https://machine-learning-made-simple.medium.com/

My YouTube: https://www.youtube.com/@ChocolateMilkCultLeader/

Reach out to me on LinkedIn. Let’s connect: https://www.linkedin.com/in/devansh-devansh-516004168/

My Instagram: https://www.instagram.com/iseethings404/

My Twitter: https://twitter.com/Machine01776819
