Feat/temperature scaling confidence calibration #1434
Add temperature scaling for confidence calibration and simplify confidence metric
Description
This PR introduces a `temperature` parameter to the recognition pipeline, allowing users to calibrate model confidence, softening overconfident models or boosting underconfident ones. It applies temperature scaling to logits before the softmax layer in `recognizer_predict`, and replaces the geometric-root-based `custom_mean` with a standard average of token-level max probabilities, making confidence scores more interpretable.
Why this matters
Confidence calibration makes EasyOCR more trustworthy in edge cases, especially for challenging scripts like Arabic. This tweak gives fine-grained control over confidence behavior across model variants or data types, without changing the core model architecture. The code remains purely additive, optional, and fully backward-compatible.
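The two changes in the description, dividing logits by a temperature before the softmax and averaging the per-token maximum probabilities, can be sketched in plain Python. This is a minimal illustration rather than the code in this PR: the function names and the logits are made up, and `geometric_root_confidence` is only an assumed stand-in for the old `custom_mean` (product of probabilities raised to `2 / sqrt(n)`).

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over one timestep's logits after dividing them by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def token_max_probs(step_logits, temperature=1.0):
    """Highest class probability at each timestep."""
    return [max(softmax_with_temperature(row, temperature)) for row in step_logits]

def geometric_root_confidence(max_probs):
    # Assumption: approximates the previous metric, product ** (2 / sqrt(n)).
    return math.prod(max_probs) ** (2.0 / math.sqrt(len(max_probs)))

def average_confidence(max_probs):
    # Plain arithmetic mean of the per-token max probabilities.
    return sum(max_probs) / len(max_probs)

# Made-up logits: three timesteps, four classes each.
logits = [
    [6.0, 1.0, 0.5, 0.2],
    [5.0, 4.0, 0.1, 0.0],
    [7.0, 0.3, 0.2, 0.1],
]

baseline_avg = average_confidence(token_max_probs(logits, temperature=1.0))
softened_avg = average_confidence(token_max_probs(logits, temperature=1.5))
old_style = geometric_root_confidence(token_max_probs(logits, temperature=1.0))

print(f"average, T=1.0: {baseline_avg:.3f}")
print(f"average, T=1.5: {softened_avg:.3f}")
print(f"geometric-root style, T=1.0: {old_style:.3f}")
```

A temperature above 1 flattens each timestep's distribution, so the per-token maximum and therefore the average can only go down; a temperature below 1 does the opposite.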
Changes
- Added `temperature` parameter (default = 1.0) in:
  - `recognizer_predict`
  - `get_text`
- Replaced `custom_mean` confidence calculation with standard average of max probabilities.
- `temperature` parameter.
Example: Reducing overconfidence with temperature scaling
We tested the new `temperature` parameter on Arabic OCR. The baseline model was highly overconfident, often assigning >0.95 confidence to incorrect predictions.
Applying temperature scaling with `temperature=1.5` reduced these inflated scores, producing more realistic confidence estimates.
Before (temperature = 1.0, default):
After (temperature = 1.5):
This demonstrates how temperature scaling can make EasyOCR confidence scores more trustworthy in practice.
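One way to put a number on "overconfident" is the gap between mean confidence and accuracy on a labelled set. The sketch below uses entirely made-up logits and correctness flags (not the Arabic OCR results from this PR), and the helper names are hypothetical; it only shows how such a gap could be compared across temperatures.

```python
import math

def max_softmax_prob(logits, temperature=1.0):
    """Largest softmax probability after temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    return max(exps) / sum(exps)

def overconfidence_gap(samples, temperature):
    """Mean confidence minus accuracy; positive values mean overconfidence."""
    confidences = [max_softmax_prob(logits, temperature) for logits, _ in samples]
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(1 for _, correct in samples if correct) / len(samples)
    return mean_confidence - accuracy

# Hypothetical predictions: (logits, was the prediction correct?).
samples = [
    ([4.0, 0.0, 0.0], True),
    ([5.0, 1.0, 0.0], False),
    ([3.5, 0.0, 0.5], True),
    ([4.5, 0.5, 0.0], False),
]

gap_default = overconfidence_gap(samples, temperature=1.0)
gap_scaled = overconfidence_gap(samples, temperature=1.5)
print(f"gap at T=1.0: {gap_default:.3f}")
print(f"gap at T=1.5: {gap_scaled:.3f}")
```

Temperature scaling does not change which class is predicted, so accuracy stays fixed and only the confidence side of the gap moves. In practice the temperature would be chosen on held-out data rather than fixed at 1.5.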
Testing
- `get_text` with `temperature=1.5` produces expected reduction in average confidence scores.
- `temperature=1.0`.
Backward compatibility
`temperature` defaults to `1.0`, so existing users experience no behavior change unless they explicitly set it.
Next steps
- `temperature` parameter.
Maintainers:
This PR is self-contained and backward compatible. The included example shows one practical case (reducing overconfidence). Further use cases like boosting confidence can be demonstrated in follow-up tests if needed.
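A follow-up test for the boosting case, together with a check that `temperature=1.0` leaves the softmax output unchanged, might look roughly like the sketch below. It exercises a standalone softmax helper rather than EasyOCR itself; the helper names and logits are assumptions made for illustration.

```python
import math

def plain_softmax(logits):
    """Reference softmax with no temperature handling."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax applied to logits divided by temperature."""
    return plain_softmax([z / temperature for z in logits])

def check_default_is_unchanged(logits):
    # temperature=1.0 should reproduce the unscaled softmax.
    scaled = softmax_with_temperature(logits, temperature=1.0)
    return all(math.isclose(a, b) for a, b in zip(scaled, plain_softmax(logits)))

def check_low_temperature_boosts(logits, temperature=0.5):
    # A temperature below 1 should raise the top probability.
    boosted = max(softmax_with_temperature(logits, temperature))
    return boosted > max(plain_softmax(logits))

# Made-up logits for a single, fairly uncertain timestep.
uncertain_logits = [2.0, 1.0, 0.5]
print(check_default_is_unchanged(uncertain_logits))
print(check_low_temperature_boosts(uncertain_logits))
```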