This issue aggregates the discussion and near-future plans to introduce metrics to supervision.
The first steps will be taken by the core Roboflow team; afterwards we'll open up submissions for specific metrics from the community.
I propose the following:
- Aim for ease of use and a compact API, sacrificing completeness if required.
- Provide public classes with aggregation by default (metrics.py); keep the implementation in impl.py or equivalent, to be used internally.
- Expose the metrics in supervision.metrics rather than in the global scope.
- I don't think we need to split into metrics.detection, metrics.segmentation, metrics.classification, but I'm on the fence.
- Focus only on what we can apply to a Detections object.
- This means we only implement metrics that use some of: class_id, confidence, xyxy, mask, or xyxyxyxy (in Detections.data).
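To make the field restriction concrete, here's a minimal stand-in (hypothetical, not supervision's actual class) showing the only attributes a metric would be allowed to touch; the field names follow supervision, the shapes are my assumptions:

```python
from __future__ import annotations
from dataclasses import dataclass, field
import numpy as np

# Hypothetical stand-in mirroring the Detections fields a metric may rely on
# under this proposal. Shapes are assumptions for illustration.
@dataclass
class DetectionsLike:
    xyxy: np.ndarray                      # (N, 4) axis-aligned boxes
    class_id: np.ndarray | None = None    # (N,) integer class labels
    confidence: np.ndarray | None = None  # (N,) scores in [0, 1]
    mask: np.ndarray | None = None        # (N, H, W) boolean masks
    data: dict = field(default_factory=dict)  # e.g. data["xyxyxyxy"] for OBBs

dets = DetectionsLike(
    xyxy=np.array([[0, 0, 10, 10], [5, 5, 20, 20]], dtype=float),
    class_id=np.array([0, 1]),
    confidence=np.array([0.9, 0.75]),
)
```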
⚠️ I don't know:
- How metrics are computed when targets and predictions have different numbers of detections or are otherwise mismatched.
- I don't think metrics should fail in that case, but perhaps there's a standard way of addressing this.
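For reference, the common way mAP-style implementations address this is IoU-based matching: pair predictions with targets, then count unmatched predictions as false positives and unmatched targets as false negatives, so nothing fails on mismatched counts. A rough greedy sketch (deliberately simplified; real implementations also sort by confidence and match per class):

```python
import numpy as np

def box_iou(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise IoU between (N, 4) and (M, 4) xyxy boxes."""
    tl = np.maximum(a[:, None, :2], b[None, :, :2])  # top-left of overlap
    br = np.minimum(a[:, None, 2:], b[None, :, 2:])  # bottom-right of overlap
    inter = np.clip(br - tl, 0, None).prod(axis=2)   # (N, M) overlap areas
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    union = area_a[:, None] + area_b[None, :] - inter
    return inter / np.where(union > 0, union, 1)

def greedy_match(pred_xyxy, target_xyxy, iou_threshold=0.5):
    """Greedily match each prediction to at most one target. Unmatched
    predictions are false positives; unmatched targets are false negatives."""
    iou = box_iou(pred_xyxy, target_xyxy)
    matched_targets: set = set()
    matches = []
    # visit predictions by descending best IoU so strong pairs win first
    order = np.argsort(-iou.max(axis=1)) if iou.size else []
    for p in order:
        t = int(np.argmax(iou[p]))
        if iou[p, t] >= iou_threshold and t not in matched_targets:
            matched_targets.add(t)
            matches.append((int(p), t))
    return matches
```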
I believe we could start with:
- Importing current metrics into the new system:
- IoU
- mAP
- Confusion Matrix
- Detections
- Accuracy
- Precision
- Recall
- General
- Mean confidence
- Median confidence
- Min confidence
- Max confidence
- (not typical, but I'd find it useful) the number of unique classes detected, and an aggregate count of how many objects of each class were detected (e.g. N defects / hour).
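The "General" metrics above reduce to simple reductions over the confidence and class_id arrays. A sketch, assuming NumPy arrays as found on Detections:

```python
import numpy as np

def confidence_stats(confidence: np.ndarray) -> dict:
    """Mean / median / min / max of a detection confidence array."""
    return {
        "mean": float(np.mean(confidence)),
        "median": float(np.median(confidence)),
        "min": float(np.min(confidence)),
        "max": float(np.max(confidence)),
    }

def class_counts(class_id: np.ndarray) -> dict:
    """Unique classes detected and how many objects of each were seen."""
    ids, counts = np.unique(class_id, return_counts=True)
    return dict(zip(ids.tolist(), counts.tolist()))
```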
I believe the one parameter a Metric needs to accept during construction is queue_size:
- 1 - don't keep history; only ever report metrics for the current batch.
- N - keep up to N metric results in history for computation.
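These queue_size semantics map naturally onto collections.deque(maxlen=N), which discards the oldest entry on overflow. A minimal sketch (the averaging in compute is my assumption; a real metric would aggregate its own state):

```python
from collections import deque

class RollingMetric:
    """Sketch of queue_size semantics: queue_size=1 keeps only the current
    batch's result; queue_size=N averages over the last N results."""

    def __init__(self, queue_size: int = 1) -> None:
        self.history: deque = deque(maxlen=queue_size)

    def update(self, batch_result: float) -> None:
        self.history.append(batch_result)  # oldest entry drops automatically

    def compute(self) -> float:
        return sum(self.history) / len(self.history)
```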
Other thoughts:
- I don't think metrics should know about datasets. Instead of benchmark as it exists in the current API, let's have def benchmark_dataset(dataset, metric) in metrics/utils.py.
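A sketch of what that utility could look like; the (image, targets) iteration order and the model callable are my assumptions, not the current API:

```python
# Hypothetical metrics/utils.py helper: the metric stays dataset-agnostic,
# and this utility drives the evaluation loop.
def benchmark_dataset(dataset, metric, model):
    """Run `model` over `(image, targets)` pairs and feed the metric."""
    for image, targets in dataset:
        predictions = model(image)  # assumed to return a Detections
        metric.update(predictions, targets)
    return metric.compute()
```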
API:
```python
class Accuracy(Metric):
    def __init__(self, queue_size: int = 1) -> None: ...

    @override
    def update(self, predictions: Detections, targets: Detections) -> None: ...

    @override
    def compute(self) -> NotSureYet: ...

    # Metric also provides `def detect_and_compute(self, *args, **kwargs)`.

accuracy_metric = Accuracy()
accuracy_metric.update(detections, detections_ground_truth)
accuracy = accuracy_metric.compute()
```
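Putting the pieces together, a runnable sketch of the proposed update/compute interface with a toy concrete metric (the base-class internals and the MeanConfidence example are illustrative assumptions, not the planned implementation):

```python
from abc import ABC, abstractmethod
from collections import deque

class Metric(ABC):
    """Sketch of the proposed base class; method names follow the issue,
    everything else is an assumption."""

    def __init__(self, queue_size: int = 1) -> None:
        self._results: deque = deque(maxlen=queue_size)

    @abstractmethod
    def update(self, predictions, targets) -> None: ...

    def compute(self) -> float:
        # average over the retained per-batch results
        return sum(self._results) / len(self._results)

class MeanConfidence(Metric):
    """Toy concrete metric: mean prediction confidence per batch."""

    def update(self, predictions, targets=None) -> None:
        conf = predictions.confidence  # relies only on a Detections field
        self._results.append(sum(conf) / len(conf))
```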
Related features:
- Object Detection Confusion Matrix - Evaluation API #140
- Add ConfusionMatrix to EvaluationAPI #177
- Object detection mAP score #232
- Feature MeanAveragePrecision 🔥 #236
- Calculate precision, recall and f1 score for object detection models. #292
- Instance Segmentation Mean Average Precision #480
- Add a thresholding API. #632