Discussion about this post

Sirsh

This is a great article, good work, and I love this idea. I spent most of my PhD in the frequency domain, so to speak, and this fractal structure of language is a very cool thing to think about. I'm seeing that this works out of the box by exploiting some of the natural structure in language, and that getting better performance means achieving higher-order corrections via whatever tricks. It reminds me somehow of the esoteric renormalization group, which I spent a bit of time thinking about. Exciting stuff.

Thomas Cherickal

Brilliant article, as always.

I believe that FFTs and FFNs have a huge role to play in optimizing Transformers and LLMs.
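To make the comment's claim concrete, here is a minimal NumPy sketch of the kind of FFT-based token mixing proposed in architectures like FNet, which replaces quadratic self-attention with an O(n log n) Fourier transform. This is an illustrative toy, not the article's implementation; the shapes and function name are assumptions.

```python
import numpy as np

def fft_token_mix(x: np.ndarray) -> np.ndarray:
    """FNet-style mixing sketch: apply a 2D FFT across the sequence and
    hidden dimensions and keep only the real part. Every output position
    then depends on every token, without computing attention scores."""
    return np.fft.fft2(x).real

# Toy "token embeddings": 8 tokens, hidden size 16 (hypothetical sizes).
x = np.random.randn(8, 16)
mixed = fft_token_mix(x)
print(mixed.shape)  # same shape as the input: (8, 16)
```

Because the transform is linear and parameter-free, the learning capacity lives entirely in the surrounding feed-forward layers, which is one reason FFNs and FFTs pair naturally in this line of work.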

It's likely that, as research deepens, we will find increasingly high-performance optimizations to the existing architecture.

Of course, edge computing, local LLMs, robotics, and IoT products will see much greater performance with far less compute.

I am pumped and excited by that thought!

Amazing article, Devansh.

You're my favorite AI writer, not just on Substack, but on the entire Internet.

The way you make complex concepts simple is incredible and noteworthy.

Incredible because it educates your large audience without boring or losing them.

Noteworthy because you are helping readers across the Internet understand these concepts without getting lost in the technical details.

Great work!

