TX_947343 · AI
Δ-Mem cuts memory use in large language models without performance loss
Δ-Mem, a new memory optimization technique, reduces memory consumption in large language models (LLMs) by compressing key-value (KV) states and reusing memory slots while maintaining full model performance [arXiv].