diff --git a/docs/docs/integrations/providers/nvidia.mdx b/docs/docs/integrations/providers/nvidia.mdx
index f53e9125131b2f..71aac40700e16e 100644
--- a/docs/docs/integrations/providers/nvidia.mdx
+++ b/docs/docs/integrations/providers/nvidia.mdx
@@ -6,7 +6,7 @@
 
 > [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) give users easy access to NVIDIA hosted API endpoints for
 > NVIDIA AI Foundation Models like `Mixtral 8x7B`, `Llama 2`, `Stable Diffusion`, etc. These models,
-> hosted on the [NVIDIA NGC catalog](https://catalog.ngc.nvidia.com/ai-foundation-models), are optimized, tested, and hosted on
+> hosted on the [NVIDIA API catalog](https://build.nvidia.com/), are optimized, tested, and hosted on
 > the NVIDIA AI platform, making them fast and easy to evaluate, further customize,
 > and seamlessly run at peak performance on any accelerated stack.
 >
diff --git a/docs/docs/integrations/text_embedding/nvidia_ai_endpoints.ipynb b/docs/docs/integrations/text_embedding/nvidia_ai_endpoints.ipynb
index 44df88bf85100f..2fd7c4041a3a1e 100644
--- a/docs/docs/integrations/text_embedding/nvidia_ai_endpoints.ipynb
+++ b/docs/docs/integrations/text_embedding/nvidia_ai_endpoints.ipynb
@@ -85,9 +85,6 @@
    "import getpass\n",
    "import os\n",
    "\n",
-   "## API Key can be found by going to NVIDIA NGC -> AI Foundation Models -> (some model) -> Get API Code or similar.\n",
-   "## 10K free queries to any endpoint (which is a lot actually).\n",
-   "\n",
    "# del os.environ['NVIDIA_API_KEY'] ## delete key and reset\n",
    "if os.environ.get(\"NVIDIA_API_KEY\", \"\").startswith(\"nvapi-\"):\n",
    "    print(\"Valid NVIDIA_API_KEY already in environment. Delete to reset\")\n",
@@ -112,11 +109,7 @@
   "source": [
    "## Initialization\n",
    "\n",
-   "The main requirement when initializing an embedding model is to provide the model name. An example is `nvolveqa_40k` below.\n",
-   "\n",
-   "For `nvovleqa_40k`, you can also specify the `model_type` as `passage` or `query`. When doing retrieval, you will get best results if you embed the source documents with the `passage` type and the user queries with the `query` type.\n",
-   "\n",
-   "If not provided, the `embed_query` method will default to the `query` type, and the `embed_documents` mehod will default to the `passage` type."
+   "When initializing an embedding model you can select a model by passing it, e.g. `ai-embed-qa-4` below, or use the default by not passing any arguments."
   ]
  },
  {
@@ -129,10 +122,7 @@
   "source": [
    "from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings\n",
    "\n",
-   "embedder = NVIDIAEmbeddings(model=\"nvolveqa_40k\")\n",
-   "\n",
-   "# Alternatively, if you want to specify whether it will use the query or passage type\n",
-   "# embedder = NVIDIAEmbeddings(model=\"nvolveqa_40k\", model_type=\"passage\")"
+   "embedder = NVIDIAEmbeddings(model=\"ai-embed-qa-4\")"
   ]
  },
  {
@@ -156,7 +146,7 @@
    "id": "pcDu3v4CbmWk"
   },
   "source": [
-   "### **Similarity/Speed Test**\n",
+   "### **Similarity**\n",
    "\n",
    "The following is a quick test of the methods in terms of usage, format, and speed for the use case of embedding the following data points:\n",
    "\n",
@@ -250,7 +240,7 @@
    "s = time.perf_counter()\n",
    "# To use the \"query\" mode, we have to add it as an instance arg\n",
    "q_embeddings = NVIDIAEmbeddings(\n",
-   "    model=\"nvolveqa_40k\", model_type=\"query\"\n",
+   "    model=\"ai-embed-qa-4\", model_type=\"query\"\n",
    ").embed_documents(\n",
    "    [\n",
    "        \"What's the weather like in Komchatka?\",\n",
@@ -501,7 +491,7 @@
    "vectorstore = FAISS.from_texts(\n",
    "    [\"harrison worked at kensho\"],\n",
-   "    embedding=NVIDIAEmbeddings(model=\"nvolveqa_40k\"),\n",
+   "    embedding=NVIDIAEmbeddings(model=\"ai-embed-qa-4\"),\n",
    ")\n",
    "retriever = vectorstore.as_retriever()\n",
    "\n",
@@ -515,7 +505,7 @@
    "    ]\n",
    ")\n",
    "\n",
-   "model = ChatNVIDIA(model=\"mixtral_8x7b\")\n",
+   "model = ChatNVIDIA(model=\"ai-mixtral-8x7b-instruct\")\n",
    "\n",
    "chain = (\n",
    "    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
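
For anyone reviewing or trying this change locally, here is a minimal sketch of the usage the updated notebook documents. It assumes `langchain-nvidia-ai-endpoints` is installed and that a valid `NVIDIA_API_KEY` (an `nvapi-...` token from the NVIDIA API catalog) is exported, as in the notebook's setup cell; the model IDs are the new catalog names this diff swaps in for the retired `nvolveqa_40k` and `mixtral_8x7b`.

```python
import os

from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings

# The notebook's setup cell expects an "nvapi-" prefixed key in the environment.
assert os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"), "export NVIDIA_API_KEY first"

# New catalog model name; omit `model` entirely to use the package default.
embedder = NVIDIAEmbeddings(model="ai-embed-qa-4")

# For retrieval, embed user queries with the "query" type and source documents
# with the "passage" type. embed_query defaults to "query" and embed_documents
# to "passage", so model_type is only needed to override those defaults.
q_vectors = NVIDIAEmbeddings(model="ai-embed-qa-4", model_type="query").embed_documents(
    ["What's the weather like in Kamchatka?"]
)
d_vectors = embedder.embed_documents(["Kamchatka is known for long, severe winters."])

# The chat model referenced later in the notebook, with its updated catalog ID.
llm = ChatNVIDIA(model="ai-mixtral-8x7b-instruct")
```

The assertion and the sample sentences are illustrative only; the model IDs, the `model_type` behavior, and the `nvapi-` key check come from the diff above.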