Shanghai AI Lab Introduces Mixture-of-Memories: Linear Attention Gets Sparse Memory Too


- Paper: https://arxiv.org/abs/2502.13685
- Code: https://github.com/OpenSparseLLMs/MoM
- Planned for future integration into: https://github.com/OpenSparseLLMs/Linear-MoE
- Model weights are open-sourced at: https://huggingface.co/linear-moe-hub





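- Linear Attention, Lightning Attention, RetNet, GLA, DeltaNet, and Gated DeltaNet belong to the linear attention family.
- Mamba2 belongs to the SSM (state space model) family, and HGRN2 belongs to the linear RNN family.
- TTT and Titans belong to the Test-Time Training family.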
(Source: 机器之心)











