
Question to huggingface model using triton #7226

Open
geraldstanje opened this issue May 15, 2024 · 0 comments


Description
Hi,
I have a question about https://github.com/triton-inference-server/tutorials/tree/main/HuggingFace for a SetFit model I'm currently looking at.

Info to the SetFitModel (using sentence-transformers/all-MiniLM-L6-v2):

cat model_to_deploy/config.json 
{
  "_name_or_path": "checkpoints/step_270",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 6,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.40.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

I exported the SetFitModel to ONNX:

from setfit import SetFitModel
from setfit.exporters.onnx import export_onnx

model = SetFitModel.from_pretrained("model_to_deploy")

# Export the model body (sentence transformer) and the sklearn-based head
output_path = "setfit_dummy_model.onnx"
export_onnx(model.model_body,
            model.model_head,
            opset=12,
            output_path=output_path)

Here is the exported ONNX model visualized in Netron (not fully shown; check the model properties):
[image: Netron visualization of the exported ONNX model]

How can I configure the input and output dims in config.pbtxt for batch size = 1, and use the Python backend similar to https://github.com/triton-inference-server/tutorials/tree/main/HuggingFace/python_model_repository/python_vit? The input and output tensors in netron.app show some '?' dimensions.
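For reference, a sketch of what such a config.pbtxt might look like for a BERT-like SetFit export. The tensor names below (`input_ids`, `attention_mask`, `token_type_ids`, `logits`) are assumptions based on typical transformer exports, not taken from this model: they must match the names Netron shows. In Triton configs, a '?' (dynamic) dimension is written as -1.

```
name: "python_setfit"
backend: "python"
max_batch_size: 0   # no Triton-managed batching; dims below include the batch dim

input [
  {
    name: "input_ids"        # assumed name: use the name shown in Netron
    data_type: TYPE_INT64
    dims: [ 1, -1 ]          # batch size 1, variable sequence length ('?')
  },
  {
    name: "attention_mask"   # assumed name
    data_type: TYPE_INT64
    dims: [ 1, -1 ]
  },
  {
    name: "token_type_ids"   # assumed name
    data_type: TYPE_INT64
    dims: [ 1, -1 ]
  }
]
output [
  {
    name: "logits"           # assumed name for the SetFit head's output
    data_type: TYPE_FP32
    dims: [ 1 ]
  }
]
```

Setting max_batch_size to 0 means the dims are declared exactly as the model's tensors are shaped, batch dimension included, which is the simplest starting point for a fixed batch size of 1.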

Triton Information
What version of Triton are you using?
24.04

Are you using the Triton container or did you build it yourself?

To Reproduce
Steps to reproduce the behavior.

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

Expected behavior
A clear and concise description of what you expected to happen.
