
Question to huggingface model using triton #7226

Open
geraldstanje opened this issue May 15, 2024 · 0 comments


Description
Hi,
I have a question about https://github.com/triton-inference-server/tutorials/tree/main/HuggingFace for a SetFit model I'm currently looking at.

Info to the SetFitModel (using sentence-transformers/all-MiniLM-L6-v2):

cat model_to_deploy/config.json 
{
  "_name_or_path": "checkpoints/step_270",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 6,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.40.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

I exported the SetFitModel to ONNX:

from setfit import SetFitModel
from setfit.exporters.onnx import export_onnx

model = SetFitModel.from_pretrained("model_to_deploy")

# Export the model body (sentence transformer) and the sklearn-based head
output_path = "setfit_dummy_model.onnx"
export_onnx(model.model_body,
            model.model_head,
            opset=12,
            output_path=output_path)

Here is the exported ONNX model visualized in Netron (not fully shown; check the model properties):
[image: Netron visualization of the exported ONNX model]

How can I configure the input and output dims in config.pbtxt for batch size = 1, and use the Python backend similar to https://github.com/triton-inference-server/tutorials/tree/main/HuggingFace/python_model_repository/python_vit? The input and output tensors in netron.app show some '?' dimensions.
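For reference, a sketch of what such a config.pbtxt might look like for a BERT-like SetFit export. The tensor names below (`input_ids`, `attention_mask`, `token_type_ids`, `logits`) are assumptions based on typical transformer exports, not taken from this model: they must match the names Netron shows. In Triton configs, a '?' (dynamic) dimension is written as -1.

```
name: "python_setfit"
backend: "python"
max_batch_size: 0   # no Triton-managed batching; dims below include the batch dim

input [
  {
    name: "input_ids"        # assumed name: use the name shown in Netron
    data_type: TYPE_INT64
    dims: [ 1, -1 ]          # batch size 1, variable sequence length ('?')
  },
  {
    name: "attention_mask"   # assumed name
    data_type: TYPE_INT64
    dims: [ 1, -1 ]
  },
  {
    name: "token_type_ids"   # assumed name
    data_type: TYPE_INT64
    dims: [ 1, -1 ]
  }
]
output [
  {
    name: "logits"           # assumed name for the SetFit head's output
    data_type: TYPE_FP32
    dims: [ 1 ]
  }
]
```

Setting max_batch_size to 0 means the dims are declared exactly as the model's tensors are shaped, batch dimension included, which is the simplest starting point for a fixed batch size of 1.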

Triton Information
What version of Triton are you using?
24.04

Are you using the Triton container or did you build it yourself?

To Reproduce
Steps to reproduce the behavior.

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

Expected behavior
A clear and concise description of what you expected to happen.
