Bugfix: Illogical "Avoid computing higher temperatures on no_speech" (ggml-org#1903)

Purfview · jongwook · web-flow · commit 90db0de1896c · 2024-11-30T21:47:01.000-08:00
* Bugfix: Illogical "Avoid computing higher temperatures on no_speech"

  Bugfix for openai/whisper#1279

  A segment was treated as "silence" even when decoding had failed due to `compression_ratio_threshold`, although further down the code the same segment is no longer considered "silence". A segment should be treated as "silence" only when decoding has failed due to `logprob_threshold`.

  As described here: https://github.com/openai/whisper/blob/8bc8860694949db53c42ba47ddc23786c2e02a8b/whisper/transcribe.py#L421

  And in code here: https://github.com/openai/whisper/blob/8bc8860694949db53c42ba47ddc23786c2e02a8b/whisper/transcribe.py#L243-L251

* Fix if "logprob_threshold=None"

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
diff --git a/whisper/transcribe.py b/whisper/transcribe.py
@@ -214,6 +214,8 @@ def decode_with_fallback(segment: torch.Tensor) -> DecodingResult:
             if (
                 no_speech_threshold is not None
                 and decode_result.no_speech_prob > no_speech_threshold
+                and logprob_threshold is not None
+                and decode_result.avg_logprob < logprob_threshold
             ):
                 needs_fallback = False  # silence
             if not needs_fallback:
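
The effect of the added conditions can be sketched in isolation. This is a minimal, standalone approximation and not the actual `decode_with_fallback` body: the `Result` dataclass and the `needs_fallback` helper are hypothetical names, the two checks that set the flag to `True` are assumed from the surrounding function (they are not part of this hunk), and the default threshold values are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    """Hypothetical stand-in for the fields of DecodingResult used here."""
    avg_logprob: float
    no_speech_prob: float
    compression_ratio: float


def needs_fallback(
    result: Result,
    compression_ratio_threshold: Optional[float] = 2.4,  # illustrative value
    logprob_threshold: Optional[float] = -1.0,           # illustrative value
    no_speech_threshold: Optional[float] = 0.6,          # illustrative value
) -> bool:
    """Return True if decoding should be retried at a higher temperature."""
    fallback = False
    # Assumed from the surrounding code: output is too repetitive.
    if (
        compression_ratio_threshold is not None
        and result.compression_ratio > compression_ratio_threshold
    ):
        fallback = True
    # Assumed from the surrounding code: average log probability is too low.
    if logprob_threshold is not None and result.avg_logprob < logprob_threshold:
        fallback = True
    # Patched condition: cancel the fallback only when the segment looks like
    # silence AND the log-probability check is enabled and has failed.
    if (
        no_speech_threshold is not None
        and result.no_speech_prob > no_speech_threshold
        and logprob_threshold is not None
        and result.avg_logprob < logprob_threshold
    ):
        fallback = False  # silence
    return fallback
```

Under these assumptions, a repetitive segment with a high `no_speech_prob` but an acceptable `avg_logprob` still triggers a fallback, whereas before the patch the silence shortcut would have cancelled it. With `logprob_threshold=None` the shortcut never fires.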