Obviously written by an LLM but still a very strong argument
Not going to say yes or no, but what makes you so confident it was LLM-written?
Especially since the premise runs very much against what's in the LLM training data (RL is the industry standard).
It’s intuition / pattern matching. I see some tells. It’s also probably confirmation bias, as the majority of internet content is AI-generated today.
Evidence shows that AI-generated [therapy/games/ads/articles] are vastly less effective for the sole reason that they are AI-generated.
Of course, the substance is still all there. But it doesn’t matter. It’s a moral sin against humanity and an aesthetic sore.
What are the tells? A lot of them are just common features of writing. I've been writing for a long time (pre-AI), so you could always just check against that if you want.
I had the same feeling when I was reading this (fantastic!) article. It gave me ~97% confidence that the (great!) ideas and insights are all from a human, but that the writing was heavily reworked by an LLM.
After thinking more, it is also entirely possible that you unconsciously absorbed the LLM style and rhetorical devices, or just happen to use similar ones.
In either case, this LLM flavor felt like a huge distraction from the content and certainly did not contribute to its quality. Which is, happy to repeat, still great.
It is not that there is one single tell; it's more like tens of tells sprinkled throughout, and my "LLM-flavor" radar kept accumulating alarms until it became very clear by the first quarter of the text.
There are a lot of "it is not X, it is Y" patterns. Really a lot. Tens of them. That was probably the biggest thing. It feels dramatically overused and gives a sense of a limited "rhetorical device toolkit".
Then, the general energy - lots of short, punchy, not entirely finished sentences. This is a good device, but in moderation, and it is a bit overused in the text.
The last one is hard to describe, but the words kind of connect to each other in a superhumanly smooth way. The best writers connect words in delightfully beautiful ways, but their prose does not feel "smooth"; it feels "spiky". Great human prose gives a high amount of surprisal per word.
Again, it is possible that it is 100% authentically written by you. I don't think it matters; what matters is diversity, unpredictability, and sometimes the imperfection of language patterns, and the article could have more of those. And, importantly, moderation: not every sentence needs to astonish the reader, and it is OK to be plain and boring sometimes.
Also, the TL;DR is LLM-written.
I don't like writing TL;DRs, so I write the entire article and ask LLMs to generate them. Hence the first quarter.
I'm not going to say 100%, but it's mostly written by me. What you perceive as LLM tells has a much simpler explanation --
1. Contrastive emphasis -- "it's not A, it's B" -- is a very common technique, especially in discussions trying to debunk or disagree with things. This whole article is that. If you read some of my other articles, you'll see that I don't really use the technique much when it's not useful. Also, counting the text in the images (with research), this article clears 50k words, so tens of them is actually not a lot.
2. I keep my sentences mostly short because that allows me to add emphasis. The rule I follow is one idea per paragraph. Since we run through a lot of ideas, this creates a lot of paragraphs and standalone sentences.
3. Re spiky writing-- I'm not a writer. I'm a researcher. My goal isn't to play with language as much as it is to communicate ideas clearly. Which is why I optimize for clarity over everything else.
I do use LLMs to aggregate a lot of research and organize talking points, but most of my articles are mostly written by me. I write about cutting-edge research, which LLMs don't understand and hallucinate about heavily, so they aren't the most helpful for the domain.
Fun fact: a lot of LLMs hate my writing style because of the tangents, references, takes, etc.
Super interesting piece.
My biggest takeaway: it seems that RL is less about fixing bugs and more about bribing the model with rewards until it behaves. You can’t just ‘patch the brain’ without paying the retraining bill. And that bill is super, super expensive and the resulting model does not even generalise to other domains. Reward is not enough after all.
Also despite being quite dense, this piece is full of wit and is a very accessible and fun read. Great work!
And thank you for the kind words
Yes
Excellent breakdown of RL's economic collapse. The cost-per-skill arithmetic is brutal and not talked about enough. Each narrow capability eating tens of millions in compute with zero transfer is basically like building a company where every new feature requires rebuilding the entire infra from scratch. The moment you realize AlphaGo's 45k years of experience can't even play checkers, the whole "scale to AGI" narrative kinda falls apart. What really got me though is how foundation models inadvertently prove the point by doing all the heavy lifting through pretraining while RL just steers at the end.
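A minimal sketch of that last point (purely illustrative, not from the article; `sample`, `log_prob`, and `reward_fn` are hypothetical placeholders, not a real library API): in a typical KL-anchored RL fine-tuning setup, the policy starts as a copy of the pretrained model and a KL penalty keeps it near that reference, so the reward only re-weights behaviour that pretraining already made reachable.

```python
# Hypothetical sketch of "pretraining does the heavy lifting, RL steers at the end".
# `sample`, `log_prob`, and `reward_fn` are assumed placeholders, not a real API.
import copy
import torch

def rl_steering_step(policy, ref_policy, prompts, reward_fn, optimizer, beta=0.1):
    responses, logprobs = policy.sample(prompts)                 # assumed helper
    with torch.no_grad():
        ref_logprobs = ref_policy.log_prob(prompts, responses)   # assumed helper
    rewards = reward_fn(prompts, responses)       # narrow, task-specific signal
    kl = (logprobs - ref_logprobs).mean()         # stay close to the pretrained model
    advantage = rewards - rewards.mean()          # crude baseline
    loss = -(advantage * logprobs).mean() + beta * kl
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# policy = copy.deepcopy(pretrained_model)   # all the capability lives here already
# ref_policy = pretrained_model.eval()       # frozen anchor; RL only nudges around it
```

With a large `beta` the model barely moves from its pretrained behaviour, and with a narrow reward whatever does move rarely transfers to other domains.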
Thank you
This is a great breakdown of why RL struggles, but the deeper issue isn't algorithmic.
It’s architectural.
RL tries to build intelligence from reward.
LLMs build it from prediction.
But both inherit the same limitation: they optimize inside a frozen objective (sketched below).
General intelligence needs something neither paradigm provides:
a system that can update its own objective as its world model grows.
Understanding before optimization, not optimization pretending to be understanding.
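To make the "frozen objective" point concrete, here is a hypothetical sketch (not from the comment or the article; `model` and `policy.log_prob` are assumed placeholders): both paradigms minimise a loss that is fixed before training starts, and nothing in either loop lets the learner revise that objective as its world model improves.

```python
# Hypothetical illustration of "optimizing inside a frozen objective".
import torch
import torch.nn.functional as F

def pretraining_loss(model, tokens):
    """LLM path: next-token prediction with a fixed cross-entropy objective."""
    logits = model(tokens[:, :-1])                       # predict token t+1 from t
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           tokens[:, 1:].reshape(-1))

def rl_loss(policy, states, actions, rewards):
    """RL path: maximize an externally defined, equally fixed reward."""
    logprobs = policy.log_prob(states, actions)          # assumed helper
    return -(rewards * logprobs).mean()                  # REINFORCE-style objective

# Training only ever moves the parameters; neither loss above is something the
# learner itself can rewrite as its world model grows.
```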
Yep
Who exactly is saying pure RL scaling will lead to AGI / ASI? Outside of non-technical, simplistic media narratives, I feel like nobody is really making such statements. Even Amodei / Hassabis aren't saying RL scaling = eventual goal (whatever name you want to slap on it). Of course, most recently, LeCun, Fei-Fei Li, Karpathy, and Ilya have all made waves saying neither LLM nor LLM+RL will get you there.
Look at the capital allocations for RL environments and the investments into Reinforcement Learning for FMs. That is the implicit assumption across the board.
I fundamentally disagree that CapEx = belief in path to AGI.
There are two parallel sets of activities for all these frontier labs:
1. continue to make more and more money
2. eventually be the lab to get to AGI
Today’s CapEx investment is really about #1. The investment thesis here is:
“there is a highly competitive landscape for frontier models to achieve x benchmark and win the vibes (e.g. Gemini 3), so we are going to be the lab that executes it the best. By the way, we need a whole lot of compute to continue to push performance (via RL) and to serve our customers”.
That’s a different thing from saying yes, this is exactly how we achieve AGI. You can still improve a model without achieving AGI and make a whole lot of revenue while you’re at it. Pick your timeline, some say 5 years, and some say 30 years. Either way, research and product are two different tracks.
Case in point: OpenAI says they want to achieve AGI and that's their mission. Then why do ads? Why do erotic writing? Why even have a chatbot at all? If your mission is not to make money in the interim years but solely to achieve AGI via whatever method you believe in, you should do what Ilya is doing: release absolutely no product until AGI. He's doing just fine with fundraising, as I'm sure LeCun will raise a whole lot of capital!
Another case: why would Google continue to do non-LLM and non-RL research? They’re working on world models, Titans, Nested Learning, etc. If you believe RL is it, you should stop paying millions of dollars for researchers to do other things.
Everyone sees that incrementally improved models win the minds today. You can use that as runway to work on other architectures and paths to AGI.
Let me make my statement more specific.
Look at all the VCs investing in RL environments on the express premise that this is the key to AGI.
On another note -- I've also spoken to a lot of researchers at the AI labs. There is a very strong push towards RL being the way you train systems to be better and unlock emergent intelligence. My point in writing this is that even if you don't care about AGI (I don't think AGI is a real thing), the present investments in RL are not good long-term bets for intelligence. It's more of a short-term, predictable bet for scaling (hence the comparisons to parameter scaling).
Lastly, when you say that if Google believed RL was the future, they wouldn't invest in anything else -- that's a misunderstanding of how big organizations work. Google isn't one coherent organization with an overarching director who decrees "this is the research we invest in." It actually has much smaller teams with autonomous budgets. Especially at Google's scale, the organization as a whole doesn't have to be convinced that RL is (or isn't) the future for some group to put out research that uses RL or an alternative; you can still have small research teams doing completely different things. It's like asking why Google keeps publishing in some area that clearly isn't a company-wide priority: it's because some team there cares about it, and that's true for a lot of larger organizations. You can't look at individual pieces of research and argue about whether or not they reflect a company-wide priority.