Discussion about this post

Michael Power

You see the problem clearly. Benchmarks are gamed. Tokenmaxxing is a lie. Popularity metrics are manufactured. Deployment partnerships are expensive and do not scale. You describe every symptom with precision, but you cannot see the solution. You are doing triple somersaults to measure input (tokens, benchmarks, virality, FDEs) when the answer is simple: measure output. Solved problems. Julies. (I have written on this extensively on my own Medium.) Measure Brainpower. 1 Julie = 200 joules. Reproducible. Verifiable. Comparable. The industry is breaking its back trying to infer value from everything except value itself. Stop measuring the flour. Start measuring the bread. The Julie is not complicated. It is just honest.

David

Good points all. As with any new tech, the industry needs to be flexible and realistic in developing benchmarks. Value, meaning useful output (call it intelligence delivered), is becoming the new and real measure. In 12-18 months, the real measure might be something else again.
