Mani Pal

Engineer-researcher

LLM systems, CUDA kernels, inference optimization, compression, interpretability, and distributed AI infrastructure.

Sparse Models / 2025-10

Sparse Models Fail Quietly Before They Fail Loudly

Sparse MoE systems can look healthy on loss curves while the router is already collapsing. Routing entropy and per-expert load need to be logged as first-class metrics alongside loss.

9 min

MoE, Routing Entropy, Scaling

Outline

  • Expert utilization as a health signal.
  • z-loss threshold behavior.
  • Matched-FLOP benchmarking.
  • What to log before scaling up.
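As a rough illustration of the first, second, and last outline items, here is a minimal sketch of router health metrics computed from raw router logits. Everything named here is an assumption for illustration, not the post's own code: the function `router_health`, its return keys, and the input layout (one list of expert logits per token) are hypothetical. The z-loss term follows the common definition as the mean squared log-sum-exp of the router logits.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one token's expert logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def router_health(router_logits, top_k=1):
    """Summarize routing health for one batch (hypothetical helper).

    router_logits: list of per-token lists, each holding one logit per expert.
    top_k: number of experts each token is dispatched to.
    """
    num_tokens = len(router_logits)
    num_experts = len(router_logits[0])
    if num_experts < 2:
        raise ValueError("need at least two experts")

    counts = [0] * num_experts
    token_entropy = 0.0
    z_loss = 0.0

    for logits in router_logits:
        probs = softmax(logits)
        # Per-token routing entropy: how spread out the router's choice is.
        token_entropy += -sum(p * math.log(p) for p in probs if p > 0.0)

        # z-loss term: squared log-sum-exp of the raw logits.
        m = max(logits)
        lse = m + math.log(sum(math.exp(x - m) for x in logits))
        z_loss += lse ** 2

        # Count which experts actually receive this token.
        ranked = sorted(range(num_experts), key=lambda i: logits[i], reverse=True)
        for i in ranked[:top_k]:
            counts[i] += 1

    total_assignments = num_tokens * top_k
    load = [c / total_assignments for c in counts]
    load_entropy = -sum(f * math.log(f) for f in load if f > 0.0)

    return {
        "mean_token_entropy": token_entropy / num_tokens,
        "load_fraction": load,
        # 1.0 means perfectly balanced, 0.0 means one expert takes everything.
        "normalized_load_entropy": load_entropy / math.log(num_experts),
        # Busiest expert's load relative to a uniform share.
        "max_load_ratio": max(load) * num_experts,
        "dead_experts": sum(1 for c in counts if c == 0),
        "z_loss": z_loss / num_tokens,
    }


balanced = [[5.0, 0.0, 0.0, 0.0], [0.0, 5.0, 0.0, 0.0],
            [0.0, 0.0, 5.0, 0.0], [0.0, 0.0, 0.0, 5.0]]
collapsed = [[5.0, 0.0, 0.0, 0.0]] * 4

print(router_health(balanced))
print(router_health(collapsed))
```

In this toy data every token routes with the same confidence, so the mean per-token entropy is identical in both batches; only the load-based fields (`normalized_load_entropy`, `max_load_ratio`, `dead_experts`) distinguish the balanced router from the collapsed one. That is one reason to log both kinds of metric rather than either alone.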
