Current behavior:
If max_tokens is provided in the request, the number of tokens in the response is always exactly equal to max_tokens.
Expected behavior:
Generate responses shorter than max_tokens for a portion of requests.
Find relevant statistics to determine what share of requests should get a response shorter than max_tokens, and how the response length should be calculated for those requests.
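
A minimal sketch of one way the expected behavior could look, to make the discussion concrete. Everything here is an assumption rather than existing code: the function name `sample_output_length`, the `short_fraction` parameter, its default of 0.5, and the uniform distribution are placeholders that should be replaced once real statistics are found.

```python
import random


def sample_output_length(max_tokens, rng, short_fraction=0.5):
    """Pick the number of tokens to generate for one request.

    Hypothetical helper: with probability `short_fraction` the response
    stops early at a length drawn uniformly from [1, max_tokens];
    otherwise it runs to the full max_tokens. Both the fraction and the
    uniform distribution are placeholders, not measured values.
    """
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    if not 0.0 <= short_fraction <= 1.0:
        raise ValueError("short_fraction must be between 0 and 1")

    # rng.random() returns a float in [0.0, 1.0), so short_fraction=0.0
    # never takes this branch and short_fraction=1.0 always does.
    if rng.random() < short_fraction:
        return rng.randint(1, max_tokens)  # inclusive on both ends
    return max_tokens


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are reproducible
    lengths = [sample_output_length(100, rng) for _ in range(10)]
    print(lengths)
```

Passing the random generator in explicitly keeps the behavior reproducible in tests; setting `short_fraction=0.0` reproduces the current behavior, where every response has exactly max_tokens tokens.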