Discussion about this post

Emanuel Maceira

This is one of the best breakdowns of the on-device memory bottleneck I've read. The KV cache math alone should be required reading for anyone pitching "edge AI" products.
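For anyone who hasn't actually run that math, here's a back-of-the-envelope version. The dimensions below are illustrative assumptions for a generic 7B-class transformer, not figures from the post:

```python
# Back-of-the-envelope KV cache size for a generic 7B-class transformer.
# All dimensions are illustrative assumptions, not figures from the post.

layers = 32          # transformer layers
hidden_dim = 4096    # n_heads * head_dim
seq_len = 8192       # context length in tokens
bytes_per_elem = 2   # fp16

# One K vector and one V vector cached per token, per layer.
kv_bytes = 2 * layers * hidden_dim * seq_len * bytes_per_elem
print(f"KV cache: {kv_bytes / 2**30:.1f} GiB")  # ~4.0 GiB
```

Roughly 4 GiB for the cache alone at an 8K context, before you count the weights. On a phone-class memory budget that's game over, and it scales linearly with context length.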

From the IoT side, the signal-to-noise problem you flag at the end is the real unlock. We work with multi-sensor edge gateways that ingest ambient telemetry 24/7 -- temperature, vibration, RF signal quality, device logs. Easily 95% of those tokens are noise. Today we solve it with dumb heuristic filters before anything hits a model, but the idea of architectures that can natively refuse to process irrelevant context is exactly what the next generation of smart edge devices needs.
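To make "dumb heuristic filters" concrete, here's a minimal sketch of the kind of gating we run upstream of the model. The field names and thresholds are illustrative, not our production values:

```python
# Minimal sketch of a heuristic pre-filter for edge telemetry.
# Field names and the z-score threshold are illustrative, not production values.

from dataclasses import dataclass

@dataclass
class Reading:
    sensor: str        # e.g. "temperature", "vibration", "rf_quality"
    value: float
    baseline: float    # rolling mean, maintained elsewhere
    stddev: float      # rolling std dev, maintained elsewhere

def is_interesting(r: Reading, z_threshold: float = 3.0) -> bool:
    """Forward a reading only if it deviates meaningfully from its baseline."""
    if r.stddev == 0.0:
        return r.value != r.baseline
    z = abs(r.value - r.baseline) / r.stddev
    return z >= z_threshold

readings = [
    Reading("temperature", 21.4, baseline=21.3, stddev=0.5),  # noise, dropped
    Reading("vibration", 9.8, baseline=2.1, stddev=0.7),      # anomaly, kept
]
to_model = [r for r in readings if is_interesting(r)]
print(f"{len(to_model)} of {len(readings)} readings forwarded to the model")
```

It works, but it's blunt: the filter has no notion of which deviations matter for the downstream task, which is exactly the gap a natively context-selective architecture would close.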

The hardware fragmentation point also hits hard. We see this constantly across Qualcomm, MediaTek, and even custom RISC-V silicon in industrial IoT. The fact that STAR rejected every SSM variant once real device profiling was in the loop is a massive signal -- theoretical FLOPs mean nothing if the silicon can't execute the operations efficiently.

Curious: do you see the STAR approach eventually becoming an open standard, or does Liquid AI's moat depend on keeping that search infrastructure proprietary?

NV

mind boggling and OHT :)

