Ensure you're using the same tokenizer

  • Use OpenAI’s tokenizers that correspond to the deployed model:
    • GPT-3.5/4 “turbo” era → cl100k_base
    • GPT-4o family → the newer “o*” encodings (e.g., o200k_base).
      OpenAI documents the mapping and provides an official tokenizer and cookbook examples. (https://platform.openai.com/tokenizer)
  • Count exactly what you send: the serialized chat messages (system, user, assistant, and tool calls). Function/tool-call arguments and schemas are tokens too.
  • For streaming, Azure bills for the prompt plus all tokens generated up to the moment you stop. (The billing/metrics documentation defines this explicitly.)
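The counting recipe above can be sketched in the style of the OpenAI cookbook's `num_tokens_from_messages` example. The per-message overhead constants below are assumptions that vary by model, and `encode` is whichever tokenizer matches your deployment (e.g. `tiktoken.encoding_for_model("gpt-4o").encode`):

```python
from typing import Callable, Dict, List

def count_chat_tokens(
    messages: List[Dict[str, str]],
    encode: Callable[[str], List[int]],
    tokens_per_message: int = 3,  # chat-format framing per message (model-dependent assumption)
    reply_primer: int = 3,        # tokens priming the assistant's reply (assumption)
) -> int:
    """Estimate prompt tokens for a chat request, cookbook-style.

    Every field that gets serialized (role, content, tool-call
    arguments, ...) must pass through the tokenizer.
    """
    total = 0
    for message in messages:
        total += tokens_per_message
        for value in message.values():
            total += len(encode(value))
    return total + reply_primer
```

With tiktoken installed you would pass `tiktoken.encoding_for_model("<your model>").encode` as `encode`; for an Azure deployment, first map the deployment to its underlying model, since the deployment name alone doesn't determine the encoding.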

How to be certain?

  • On the client: use a tokenizer that matches your model and count the exact s…
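One way to be certain is to reconcile the client-side count against what the service actually reports: a non-streaming chat completion response carries a `usage` object with `prompt_tokens` and `completion_tokens`. A minimal reconciliation helper (the response here is a plain dict standing in for the SDK object, using the standard usage field names):

```python
from typing import Any, Dict

def reconcile_usage(local_prompt_count: int, response: Dict[str, Any]) -> int:
    """Difference between a local token count and the service-reported
    prompt_tokens; 0 means the tokenizers agree."""
    reported = response["usage"]["prompt_tokens"]
    return local_prompt_count - reported
```

A persistent non-zero difference usually means the wrong encoding (e.g. cl100k_base against a GPT-4o deployment) or uncounted fields such as tool schemas. For streaming, you can ask for usage in the final chunk (`stream_options={"include_usage": True}` in the OpenAI Python SDK) and reconcile against that instead.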

Answer selected by Magnuti
Category: Q&A