
Conversation

@dsikka (Contributor) commented on Aug 29, 2023

For this ticket: https://app.asana.com/0/1201735099598270/1205276886236966/f

Summary:

  • Updates the TransformersPipeline constructor to add two arguments: config and tokenizer
  • For both, the user can provide a string, a path, or a transformers object, which is then used instead of relying on a deployment directory containing the expected json files (see the sketch after this list). Both default to None, in which case the normal deployment directory workflow is used.
  • Additionally, the config argument may also be a dictionary
  • To support this functionality, get_onnx_path_and_configs is refactored into two separate functions: get_hugging_face_configs and get_onnx_path
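
A rough sketch of the string/path variant described above; the model id and paths here are hypothetical placeholders, not values from this PR:

from deepsparse import Pipeline

# Hypothetical usage: config and tokenizer given as strings/paths rather than
# objects, so no config.json or tokenizer files are needed in the deployment
# directory. "org/some-model" and the paths below are placeholders.
pipeline = Pipeline.create(
    task="text-generation",
    model_path="/path/to/deployment",  # still supplies the ONNX model
    config="org/some-model",           # Hugging Face model id or local path
    tokenizer="org/some-model",        # likewise a string, path, or object
)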

Testing:

  • Tested locally using a variety of config and tokenizer combinations

Example:

from deepsparse import Pipeline
from transformers import LlamaTokenizerFast

# Tokenizer passed as a transformers object instead of tokenizer files in the
# deployment directory
tokenizer = LlamaTokenizerFast.from_pretrained("hf-internal-testing/llama-tokenizer")

# Config passed as a plain dictionary instead of a config.json in the
# deployment directory
config = {
    "_name_or_path": None,
    "architectures": ["LlamaForCausalLM"],
    "bos_token_id": 1,
    "eos_token_id": 2,
    "hidden_act": "silu",
    "hidden_size": 5120,
    "initializer_range": 0.02,
    "intermediate_size": 13824,
    "max_position_embeddings": 4096,
    "model_type": "llama",
    "num_attention_heads": 40,
    "num_hidden_layers": 40,
    "num_key_value_heads": 40,
    "pretraining_tp": 1,
    "rms_norm_eps": 1e-05,
    "rope_scaling": None,
    "tie_word_embeddings": False,
    "torch_dtype": "float16",
    "transformers_version": "4.31.0.dev0",
    "use_cache": True,
    "vocab_size": 32000,
}

llama = Pipeline.create(
    task="text-generation",
    model_path="/home/dsikka/models_llama/deployment_13",
    engine_type="onnxruntime",
    deterministic=False,
    config=config,        # dict in place of config.json
    tokenizer=tokenizer,  # transformers object in place of tokenizer files
)

inference = llama(sequences=["Hello?"])
for s in inference.sequences:
    print(s)
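
Note that the deployment directory here still supplies the ONNX model (resolved through get_onnx_path); only the config and tokenizer come from the user-provided objects. Leaving both arguments as None falls back to reading the json files from that same directory.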

@dsikka dsikka marked this pull request as ready for review August 29, 2023 22:56
@bfineran bfineran merged commit 0f0029a into main Sep 6, 2023
@bfineran bfineran deleted the new_args branch September 6, 2023 19:10