Support custom specified models for HF model parsers #863
Conversation
```python
self.pipelines[model_name] = pipeline(task="image-to-text", model=model_name)
captioner = self.pipelines[model_name]
model_name = get_hf_model(aiconfig, prompt, self)
key = model_name if model_name is not None else "__default__"
```
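For context, the per-model caching pattern under review can be sketched as a small standalone helper. The names here are hypothetical, and `factory` stands in for `transformers.pipeline` so the sketch stays self-contained:

```python
_DEFAULT_KEY = "__default__"  # the sentinel string being discussed below

def get_or_create_pipeline(cache, model_name, factory):
    """Return a cached pipeline for model_name, creating it on first use.

    A None model_name maps to a shared default cache entry; `factory`
    stands in for transformers.pipeline in this sketch.
    """
    key = model_name if model_name is not None else _DEFAULT_KEY
    if key not in cache:
        cache[key] = factory(model_name)
    return cache[key]
```

Keying the cache this way means repeated runs of the same prompt reuse one pipeline instead of reloading model weights each time.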
nit: double underscores are usually associated with dunder methods or special built-in values in Python. I'd prefer a different value here to avoid any confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can make a little enum class or something, like

```python
class Sentinel(Enum):
    DEFAULT = "DEFAULT"
```

Then check for that value wherever needed and supply the real default value in a specific use case.
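A runnable version of that suggestion might look like the following (the `cache_key` helper is hypothetical, named here only to illustrate the pattern):

```python
from enum import Enum

class Sentinel(Enum):
    DEFAULT = "DEFAULT"

def cache_key(model_name):
    # Map a missing model name to the enum sentinel rather than the
    # "__default__" string, so the special value can never collide
    # with a real Hugging Face model id.
    return model_name if model_name is not None else Sentinel.DEFAULT
```

Because `Sentinel.DEFAULT` is an enum member rather than a string, `Sentinel.DEFAULT != "DEFAULT"`, so no user-supplied model name can accidentally hit the default entry.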
I don't see the ASR model parser updated, did that not need updating?
Assuming the `model_name == model_parser.id()` case was tested, LGTM, though I left a couple of comments.
Generally LGTM; I'd just suggest addressing the nits and running a thorough test plan.
```python
self.summarizers[model_name] = pipeline("summarization", model=model_name)
summarizer = self.summarizers[model_name]
model_name = get_hf_model(aiconfig, prompt, self)
key = model_name if model_name is not None else "__default__"
```
same here
```python
should_stream = (options.stream if options else False) and (
    not "stream" in completion_data or completion_data.get("stream") != False
)
```
not for this diff but this stream logic should be cleaned up
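One possible shape for that cleanup (a sketch only, not the actual fix tracked for this; `should_stream` as a named helper is an assumption):

```python
def should_stream(options, completion_data):
    # Stream only when the caller asked for it AND the completion
    # settings don't explicitly disable it. A missing "stream" key
    # is treated as "allowed", matching the original expression.
    requested = bool(options.stream) if options else False
    allowed = completion_data.get("stream") is not False
    return requested and allowed
```

Splitting the two conditions into named locals makes the precedence explicit and avoids the double-negative `not "stream" in ... or ... != False` reading.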
+1
I have a bootcamp issue to track this in #861.
```python
self.translators[model_name] = pipeline("translation", model_name)
translator = self.translators[model_name]
model_name = get_hf_model(aiconfig, prompt, self)
key = model_name if model_name is not None else "__default__"
```
same here
```python
if TYPE_CHECKING:
    from aiconfig import AIConfigRuntime


def get_hf_model(aiconfig: "AIConfigRuntime", prompt: Prompt, model_parser: ParameterizedModelParser) -> str | None:
```
If possible it's much better to untangle the circular import. Do we even have a cycle in this case?
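For reference, the `TYPE_CHECKING` guard in the diff breaks the cycle at runtime: the import only exists for type checkers, and the annotation is a string, so nothing is evaluated when the module loads. A minimal self-contained sketch (the `get_model_name` body is hypothetical):

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers; never executed at runtime, so no
    # import cycle is created even if aiconfig imports this module back.
    from aiconfig import AIConfigRuntime

def get_model_name(aiconfig: "AIConfigRuntime") -> "str | None":
    # The quoted annotations are plain strings at runtime, so
    # AIConfigRuntime never needs to be importable here.
    return getattr(aiconfig, "model_name", None)
```

The trade-off is that the dependency still exists logically; the guard hides it from the import machinery rather than removing it, which is why untangling the cycle outright is preferable when feasible.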
Sorry, just FYI: there will be some merge conflicts due to #862. I can commandeer and resolve if you want, just let me know.

Edit: I realize that unlike commandeering at Facebook, if I take it over it'll create a new PR that will need to get approved again. In that case it makes sense for you to resolve. Just make sure all imports are included; they don't need to be alphabetical or whatever, I can always fix forward.
The HF prompt schema has been updated in #850 to support a "model" property, which can specify the model to use for a given HF task. This change updates the model parsers to use this setting.

Test Plan:

Tested with the following prompt:

```json
{
  "name": "image_caption",
  "input": {
    "attachments": [
      {
        "data": "https://s3.amazonaws.com/lastmileai.aiconfig.public/uploads/2024_1_10_18_41_31/1742/bear_with_honey.png",
        "mime_type": "image/png"
      },
      {
        "data": "https://s3.amazonaws.com/lastmileai.aiconfig.public/uploads/2024_1_10_18_41_31/7275/fox_in_forest.png",
        "mime_type": "image/png"
      }
    ]
  },
  "metadata": {
    "model": {
      "name": "HuggingFaceImage2TextTransformer",
      "settings": {
        "max_new_tokens": 10,
        "model": "Salesforce/blip-image-captioning-base"
      }
    },
    "parameters": {}
  },
  "outputs": [
    {
      "output_type": "execute_result",
      "execution_count": 0,
      "data": "a bear sitting on a rock eating honey",
      "metadata": {}
    },
    {
      "output_type": "execute_result",
      "execution_count": 1,
      "data": "a red fox in the woods",
      "metadata": {}
    }
  ]
}
```

Validated that the model setting was respected and worked.
Fix parameterization issues

Now that we have non-text input model parsers, we can see some issues in our parameterization.

1) ASR and Image-to-Text model parsers should NOT be `ParameterizedModelParser` instances, since their inputs cannot be parameterized.
2) Parameter resolution logic shouldn't throw errors in the case where it's parsing prompts belonging to regular model parsers.

Test Plan:

Ran the translation prompt for this aiconfig:

```json
prompts: [
  {
    "name": "image_caption",
    "input": {
      "attachments": [
        {
          "data": "https://s3.amazonaws.com/lastmileai.aiconfig.public/uploads/2024_1_10_18_41_31/1742/bear_with_honey.png",
          "mime_type": "image/png"
        },
        {
          "data": "https://s3.amazonaws.com/lastmileai.aiconfig.public/uploads/2024_1_10_18_41_31/7275/fox_in_forest.png",
          "mime_type": "image/png"
        }
      ]
    },
    "metadata": {
      "model": {
        "name": "HuggingFaceImage2TextTransformer",
        "settings": {
          "max_new_tokens": 10,
          "model": "Salesforce/blip-image-captioning-base"
        }
      },
      "parameters": {}
    },
    "outputs": [
      {
        "output_type": "execute_result",
        "execution_count": 0,
        "data": "a bear sitting on a rock eating honey",
        "metadata": {}
      },
      {
        "output_type": "execute_result",
        "execution_count": 1,
        "data": "a red fox in the woods",
        "metadata": {}
      }
    ]
  },
  {
    "name": "translation",
    "input": "Once upon a time, in a lush and vibrant forest, there lived a magnificent creature known as the Quick Brown Fox. This fox was unlike any other, possessing incredible speed and agility that awed all the animals in the forest. With its fur as golden as the sun and its eyes as sharp as emeralds, the Quick Brown Fox was admired by everyone, from the tiniest hummingbird to the mightiest bear. The fox had a kind heart and would often lend a helping paw to those in need. The Quick Brown Fox had a particular fondness for games and challenges. It loved to test its skills against others, always seeking new adventures to satisfy its boundless curiosity. Its favorite game was called \"The Great Word Hunt,\" where it would embark on a quest to find hidden words scattered across the forest. \n\nOne day it got very old and died",
    "metadata": {
      "model": {
        "name": "HuggingFaceTextTranslationTransformer",
        "settings": {
          "model": "Helsinki-NLP/opus-mt-en-fr",
          "max_length": 100,
          "min_length": 50,
          "num_beams": 1
        }
      },
      "parameters": {}
    }
  }
]
```

Before:

```
File "/opt/homebrew/lib/python3.11/site-packages/aiconfig/util/params.py", line 235, in get_prompt_template
    raise Exception(f"Cannot get prompt template string from prompt: {prompt.input}")
Exception: Cannot get prompt template string from prompt: attachments=[Attachment(data='https://s3.amazonaws.com/lastmileai.aiconfig.public/uploads/2024_1_10_18_41_31/1742/bear_with_honey.png', mime_type='image/png', metadata=None), Attachment(data='https://s3.amazonaws.com/lastmileai.aiconfig.public/uploads/2024_1_10_18_41_31/7275/fox_in_forest.png', mime_type='image/png', metadata=None)] data=None
```

After:

* Beautiful translation

---

Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/lastmile-ai/aiconfig/pull/866).
* #826
* __->__ #866
* #863
Gradio Local Editor cookbook

---

Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/lastmile-ai/aiconfig/pull/826).
* __->__ #826
* #866
* #863