// 1 transmission tagged with #memory-optimization
Δ-Mem, a new memory optimization technique, reduces memory consumption in LLMs by compressing key-value (KV) states and reusing memory slots, while maintaining full model performance [arXiv].
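To make the two ideas in the summary concrete (compressed key-value states and reused memory slots), here is a minimal generic sketch. It is not Δ-Mem's algorithm, whose details are not given here. The int8 quantization, the oldest-first eviction policy, and every name (`SlotKVCache`, `quantize`, `dequantize`) are assumptions chosen purely for illustration.

```python
from collections import OrderedDict


def quantize(vec):
    """Lossy compression (assumed scheme): int8-range codes plus one scale."""
    scale = max(abs(x) for x in vec) / 127 or 1.0  # all-zero vector -> scale 1.0
    return [round(x / scale) for x in vec], scale


def dequantize(codes, scale):
    """Approximate reconstruction of the original float vector."""
    return [c * scale for c in codes]


class SlotKVCache:
    """Fixed pool of slots; when the pool is full, the oldest slot is reused."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots       # storage never grows past this
        self.free = list(range(num_slots))    # indices of unused slots
        self.pos_to_slot = OrderedDict()      # token position -> slot, oldest first

    def put(self, pos, key, value):
        if not self.free:
            # Hypothetical policy: evict the oldest entry and reuse its slot.
            _, slot = self.pos_to_slot.popitem(last=False)
            self.free.append(slot)
        slot = self.free.pop()
        self.slots[slot] = (quantize(key), quantize(value))
        self.pos_to_slot[pos] = slot
        return slot

    def get(self, pos):
        slot = self.pos_to_slot.get(pos)
        if slot is None:
            return None  # entry was evicted or never stored
        (k_codes, k_scale), (v_codes, v_scale) = self.slots[slot]
        return dequantize(k_codes, k_scale), dequantize(v_codes, v_scale)


cache = SlotKVCache(num_slots=2)
first = cache.put(0, [1.0, -0.5], [0.25, 0.0])
cache.put(1, [0.1, 0.2], [0.3, 0.4])
third = cache.put(2, [0.5, 0.5], [-1.0, 1.0])  # pool is full: reuses a slot
```

In this toy version memory stays bounded at `num_slots` entries regardless of sequence length, at the cost of dropping old positions and introducing small quantization error. A real method would need a smarter policy to preserve model quality.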