Mani Pal

Engineer-researcher

LLM systems, CUDA kernels, inference optimization, compression, interpretability, and distributed AI infrastructure.

Mamba Architectures / 2025-12

Mamba Architectures in Hybrid LLM Training

Hybrid SSM-attention models are best treated as architectural experiments whose evaluation must cover long-context behavior, tokenizer behavior, and deployment cost together.

11 min

Mamba, SSM, LLM Training

Outline

  • Why interleave attention with SSM blocks.
  • Context extension pressure.
  • Tokenizer and multilingual effects.
  • Evaluation traces from Project Chimera.
