Added Mixtral and Phi to docs (#134)

predibase · Dec 15, 2023 · ce99dbf
1 parent 549bbb8
Showing 2 changed files with 18 additions and 0 deletions.
diff --git a/docs/models/adapters.md b/docs/models/adapters.md
@@ -28,6 +28,14 @@ Any combination of linear layers can be targeted in the adapters, which correspo
 - `down_proj`
 - `lm_head`
 
+### Mixtral
+
+- `q_proj`
+- `k_proj`
+- `v_proj`
+- `o_proj`
+- `lm_head`
+
 ### Qwen
 
 - `c_attn`
@@ -36,6 +44,14 @@ Any combination of linear layers can be targeted in the adapters, which correspo
 - `w2`
 - `lm_head`
 
+### Phi
+
+- `Wqkv`
+- `out_proj`
+- `fc1`
+- `fc2`
+- `lm_head`
+
 ### GPT2
 
 - `c_attn`

diff --git a/docs/models/base_models.md b/docs/models/base_models.md
@@ -6,7 +6,9 @@
  - [CodeLlama](https://huggingface.co/codellama)
 - 🌬️[Mistral](https://huggingface.co/mistralai)
  - [Zephyr](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)
+- 🔄 [Mixtral](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)
 - 🔮 [Qwen](https://huggingface.co/Qwen)
+- 🏛️ [Phi](https://huggingface.co/microsoft/phi-2)
 - 🤖 [GPT2](https://huggingface.co/gpt2)
 
 Other architectures are supported on a best effort basis, but do not support dynamic adapter loading.
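
The target-module lists this commit adds for Mixtral and Phi could be checked before an adapter is configured. The sketch below is only an illustration: `TARGET_MODULES` and `validate_target_modules` are hypothetical names, not part of the project's API, and the table holds only the two lists added in this diff.

```python
from __future__ import annotations

# Hypothetical lookup table copied from the lists added in this diff.
# Neither this name nor the function below exists in the project itself.
TARGET_MODULES: dict[str, set[str]] = {
    "mixtral": {"q_proj", "k_proj", "v_proj", "o_proj", "lm_head"},
    "phi": {"Wqkv", "out_proj", "fc1", "fc2", "lm_head"},
}


def validate_target_modules(architecture: str, requested: list[str]) -> list[str]:
    """Return `requested` unchanged if every name is listed for the architecture."""
    allowed = TARGET_MODULES.get(architecture.lower())
    if allowed is None:
        raise KeyError(f"no target-module list recorded for {architecture!r}")
    # Module names are case-sensitive (for example `Wqkv`), so compare exactly.
    unknown = [name for name in requested if name not in allowed]
    if unknown:
        raise ValueError(f"{architecture} does not list: {', '.join(unknown)}")
    return requested
```

For example, `validate_target_modules("Mixtral", ["q_proj", "v_proj"])` returns the list unchanged, while requesting `q_proj` for Phi raises `ValueError`, because Phi exposes a fused `Wqkv` projection instead.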