Can FastMoE be integrated into vLLM? #191

pangsg · 2024-02-04T07:22:34Z

No description provided.

laekov · 2024-02-04T07:23:45Z

Good question. No one has attempted this so far.

laekov · 2024-02-04T07:25:53Z

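Neither the single-GPU version nor the multi-GPU parallel version of FastMoE modifies the kv-cache. In theory it is orthogonal to PagedAttention, so the two can be used together.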
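To make the orthogonality argument concrete, here is a toy sketch in plain Python. It is not FastMoE or vLLM code: `CachedAttention`, `DenseFFN`, `ToyMoE`, `Block`, and `run` are hypothetical names made up for illustration, and the attention is a naive single-head version standing in for PagedAttention. The point it tries to show is only that the feed-forward slot sees the hidden vector and never the cache, so swapping a dense FFN for an MoE layer leaves the cache contents of that block unchanged.

```python
import math
import random


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


class CachedAttention:
    """Naive single-head attention that owns the kv-cache (hypothetical stand-in)."""

    def __init__(self, dim, rng):
        self.wq, self.wk, self.wv = (rand_matrix(dim, dim, rng) for _ in range(3))
        self.k_cache, self.v_cache = [], []

    def step(self, x):
        q = matvec(self.wq, x)
        # Only the attention layer ever appends to the cache.
        self.k_cache.append(matvec(self.wk, x))
        self.v_cache.append(matvec(self.wv, x))
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))
                  for k in self.k_cache]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        return [sum(w / total * v[i] for w, v in zip(weights, self.v_cache))
                for i in range(len(x))]


class DenseFFN:
    """One linear layer followed by ReLU."""

    def __init__(self, dim, rng):
        self.w = rand_matrix(dim, dim, rng)

    def __call__(self, x):
        return [max(0.0, h) for h in matvec(self.w, x)]


class ToyMoE:
    """Top-1 routed experts; receives only the hidden vector, never the cache."""

    def __init__(self, dim, num_experts, rng):
        self.gate = rand_matrix(num_experts, dim, rng)
        self.experts = [DenseFFN(dim, rng) for _ in range(num_experts)]

    def __call__(self, x):
        logits = matvec(self.gate, x)
        best = max(range(len(logits)), key=logits.__getitem__)
        return self.experts[best](x)


class Block:
    """Attention plus a pluggable feed-forward slot, each with a residual add."""

    def __init__(self, attn, ffn):
        self.attn, self.ffn = attn, ffn

    def step(self, x):
        h = [a + b for a, b in zip(x, self.attn.step(x))]
        return [a + b for a, b in zip(h, self.ffn(h))]


def run(ffn_factory, tokens, dim=4, seed=0):
    rng = random.Random(seed)
    attn = CachedAttention(dim, rng)  # drawn first, so identical across runs
    block = Block(attn, ffn_factory(dim, rng))
    outputs = [block.step(t) for t in tokens]
    return outputs, attn.k_cache, attn.v_cache


tokens = [[0.1, 0.2, 0.3, 0.4], [0.5, -0.1, 0.0, 0.2], [-0.3, 0.7, 0.1, -0.2]]
_, k_dense, v_dense = run(DenseFFN, tokens)
_, k_moe, v_moe = run(lambda dim, rng: ToyMoE(dim, 2, rng), tokens)

# Swapping the FFN for an MoE layer did not change what this block cached.
assert k_dense == k_moe and v_dense == v_moe
```

In a multi-layer model the cached values of later layers would of course differ, because the hidden states differ, but the cache layout and its management code still sit entirely on the attention side. This sketch says nothing about how expert parallelism would interact with vLLM's own scheduling or tensor parallelism.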

pangsg · 2024-02-04T07:28:18Z

OK, thanks. That was my thinking as well; I am working on the integration now.

pangsg · 2024-02-08T03:20:57Z

When defining the MoE layer, do I need to explicitly call something like `self.moe().cuda()` to place the FastMoE layer on the GPU?
