// 1 transmission tagged with #large-language-models
Δ-Mem, a new memory optimization technique, reduces memory consumption in LLMs by compressing key-value states and reusing memory slots while maintaining full model performance [arXiv].
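
The summary names two ideas, compressing key-value states and reusing memory slots, without describing how Δ-Mem implements them. The sketch below is only a generic illustration of those two ideas and is not the method from the paper. It assumes 8-bit quantization as the compression step and oldest-first overwriting as the reuse policy, and every name in it (`SlotKVCache`, `quantize`, `dequantize`) is hypothetical.

```python
from typing import List, Optional, Tuple


def quantize(vec: List[float]) -> Tuple[List[int], float]:
    """Compress a float vector to integers in [-127, 127] plus one scale factor.

    Assumption: simple symmetric 8-bit quantization, chosen for illustration only.
    """
    peak = max((abs(x) for x in vec), default=0.0)
    scale = peak / 127.0 if peak > 0 else 1.0
    return [round(x / scale) for x in vec], scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Approximately recover the original floats from the quantized form."""
    return [x * scale for x in q]


class SlotKVCache:
    """Hypothetical key-value cache with a fixed number of slots.

    Once every slot is occupied, the oldest slot is overwritten, so memory
    use stays bounded no matter how many tokens are processed.
    """

    def __init__(self, num_slots: int) -> None:
        self.num_slots = num_slots
        # Each slot holds (token position, quantized key, quantized value).
        self.slots: List[Optional[tuple]] = [None] * num_slots
        self.next_slot = 0

    def put(self, position: int, key: List[float], value: List[float]) -> int:
        """Store a compressed key/value pair and return the slot index used."""
        slot = self.next_slot
        self.slots[slot] = (position, quantize(key), quantize(value))
        self.next_slot = (slot + 1) % self.num_slots  # wrap around to reuse slots
        return slot

    def positions(self) -> List[int]:
        """Token positions currently held in the cache, in ascending order."""
        return sorted(s[0] for s in self.slots if s is not None)

    def get(self, position: int) -> Optional[Tuple[List[float], List[float]]]:
        """Return the dequantized (key, value) for a position, or None if evicted."""
        for s in self.slots:
            if s is not None and s[0] == position:
                return dequantize(*s[1]), dequantize(*s[2])
        return None
```

With two slots, a third `put` lands back in slot 0 and the entry for the first token is no longer retrievable. Whether a scheme of this kind preserves model quality depends on which entries are dropped and how coarse the compression is, which is exactly what a dedicated method would need to address.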