signal_tag · 2_broadcasts
#mixture-of-experts
// 2 transmissions tagged with #mixture-of-experts
TX_011 · 15:00 · AI
Meta's Llama 4 family: 10M-token context, MoE architecture, open weights
Llama 4 ships with two open-weight models: Scout (17B active / 109B total, 10M context) and Maverick (17B active / 400B total). Both use a mixture-of-experts architecture in place of the dense transformers of earlier Llama generations. Scout's 10M-token window is the largest context of any open-weight model to date.
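// sketch: why an MoE keeps only a fraction of its parameters "active" per token. A minimal top-k-routed MoE feed-forward layer; the dimensions, expert count, and routing scheme below are illustrative assumptions, not Llama 4's actual configuration.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEFeedForward(nn.Module):
    """Minimal sparse MoE feed-forward layer with top-k routing (illustrative)."""

    def __init__(self, d_model: int, d_ff: int, n_experts: int, top_k: int):
        super().__init__()
        self.top_k = top_k
        # Router scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts, bias=False)
        # Each expert is an independent feed-forward block; all are stored,
        # but only top_k of them run for any given token.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model)
        logits = self.router(x)                          # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)   # per-token expert choices
        weights = F.softmax(weights, dim=-1)             # normalize over chosen experts

        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                 # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = MoEFeedForward(d_model=64, d_ff=256, n_experts=8, top_k=2)
    tokens = torch.randn(4, 64)
    print(layer(tokens).shape)  # torch.Size([4, 64]); only 2 of 8 experts ran per token
```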
TX_045 · 11:00 · AI
Mistral Large 3 ships as 41B-active sparse MoE under Apache 2.0
The Mistral 3 family launches with three small dense models (3B, 8B, 14B) plus Mistral Large 3, a sparse MoE with 41B active and 675B total parameters, all under Apache 2.0. Large 3 ranks #2 among open-source non-reasoning models on LMArena.
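// sketch: back-of-the-envelope arithmetic for the active-vs-total parameter split in a sparse MoE. The shared/expert sizes and expert count below are hypothetical, chosen only to show the calculation; they are not Mistral Large 3's published configuration.
```python
def moe_param_counts(shared: float, expert: float, n_experts: int, top_k: int):
    """Return (total, active) parameter counts in the same unit as the inputs."""
    total = shared + n_experts * expert   # every expert is stored in memory
    active = shared + top_k * expert      # only top_k experts run per token
    return total, active


# Hypothetical split: 10B shared params, 5B per expert, 128 experts, top-2 routing.
total, active = moe_param_counts(shared=10e9, expert=5e9, n_experts=128, top_k=2)
print(f"total ~ {total / 1e9:.0f}B, active ~ {active / 1e9:.0f}B per token")
# total ~ 650B, active ~ 20B per token
```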