Component(s)
exporter/prometheus
What happened?
Describe the bug
The prometheus exporter's internal telemetry is misleading when it encounters data it cannot process. When a non-monotonic OTLP Sum metric with DELTA aggregation temporality is sent to the collector, the exporter drops it, as required by the specification: Prometheus and OpenMetrics Compatibility - Sums.
However, the internal metric otelcol_exporter_sent_metric_points_total is incremented for these dropped data points, while otelcol_exporter_send_failed_metric_points_total remains at 0. This suggests a successful export and makes it very difficult to diagnose why data is missing in Prometheus.
Steps to reproduce
- Use the following configuration files
--- docker-compose.yml ---
services:
  otel-collector:
    image: otel/opentelemetry-collector@sha256:1a266d7de716f80416c0d80b06014b4c3fbf26c4721c66d5a73e6d08c5011bb5
    command: ["--config=/etc/otel-collector-config.yml"]
    volumes:
      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
    ports:
      - "8888" # Prometheus metrics exposed by the collector
      - "9089" # Prometheus exporter metrics
      - "4317:4317" # gRPC receiver
      - "4318:4318" # http receiver
    networks:
      - otel-net
  prometheus:
    container_name: prometheus
    image: prom/prometheus:v3.5.0@sha256:63805ebb8d2b3920190daf1cb14a60871b16fd38bed42b857a3182bc621f4996
    command: ["--config.file=/etc/prometheus/prometheus.yml", "--log.level=debug"]
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
    networks:
      - otel-net
networks:
  otel-net:
--- otel-collector-config.yml ---
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: otel-collector:4317
      http:
        endpoint: otel-collector:4318
exporters:
  prometheus:
    endpoint: "otel-collector:9089"
    namespace: promexample
service:
  telemetry:
    metrics:
      readers:
        - pull:
            exporter:
              prometheus:
                host: '0.0.0.0'
                port: 8888
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
--- prometheus.yml ---
scrape_configs:
  - job_name: 'otel-collector'
    scrape_interval: 1s
    static_configs:
      - targets: ['otel-collector:9089']
  - job_name: 'otel-collector-internal'
    scrape_interval: 1s
    static_configs:
      - targets: ['otel-collector:8888']
- Start the services:
docker-compose up -d
- Run the following script to send two DELTA, non-monotonic Sum data points (values 5 and 10):
now_ns="$(date +%s)000000000"
now_plus1_ns="$((now_ns + 1000000000))"
# Command 1: Send the first data point (t=0->0, Value: 5)
curl -X POST -H "Content-Type: application/json" http://localhost:4318/v1/metrics -d "{
\"resource_metrics\": [
{
\"resource\": {
\"attributes\": [
{ \"key\": \"environment\", \"value\": { \"string_value\": \"my-env\" } }
]
},
\"scope_metrics\": [
{
\"scope\": {},
\"metrics\": [
{
\"name\": \"experiment2\",
\"description\": \"testing\",
\"unit\": \"1\",
\"sum\": {
\"data_points\": [
{
\"start_time_unix_nano\": \"$now_ns\",
\"time_unix_nano\": \"$now_ns\",
\"as_int\": \"5\"
}
],
\"aggregation_temporality\": \"AGGREGATION_TEMPORALITY_DELTA\",
\"is_monotonic\": false
}
}
]
}
]
}
]
}"
sleep 1
# Command 2: Send the second data point (t=0->1, Value: 10)
curl -X POST -H "Content-Type: application/json" http://localhost:4318/v1/metrics -d "{
\"resource_metrics\": [
{
\"resource\": {
\"attributes\": [
{ \"key\": \"environment\", \"value\": { \"string_value\": \"my-env\" } }
]
},
\"scope_metrics\": [
{
\"scope\": {},
\"metrics\": [
{
\"name\": \"experiment2\",
\"description\": \"testing\",
\"unit\": \"1\",
\"sum\": {
\"data_points\": [
{
\"start_time_unix_nano\": \"$now_ns\",
\"time_unix_nano\": \"$now_plus1_ns\",
\"as_int\": \"10\"
}
],
\"aggregation_temporality\": \"AGGREGATION_TEMPORALITY_DELTA\",
\"is_monotonic\": false
}
}
]
}
]
}
]
}"
- Query the Prometheus UI (http://localhost:9090) for the exported and internal metrics:
{__name__=~".+experiment.+"} or {__name__=~"otelcol_exporter.+"} or {__name__=~"otelcol_receiver.+"}
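You can also check the counters without going through Prometheus by scraping the collector's endpoints directly. The commands below are a small sketch that assumes the telemetry and exporter ports are published to the host (e.g. "8888:8888" and "9089:9089" in docker-compose.yml instead of the unmapped entries above); otherwise run them from inside the compose network.
# Internal telemetry: sent vs. send-failed counters for the prometheus exporter
curl -s http://localhost:8888/metrics | grep -E 'otelcol_exporter_(sent|send_failed)_metric_points'
# prometheus exporter output: the dropped series should not appear at all
curl -s http://localhost:9089/metrics | grep promexample_experiment2 || echo "promexample_experiment2 not exposed"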
What did you expect to see?
Internal metrics should accurately reflect the outcome of an operation. When the exporter drops data due to incompatibility, its internal metrics should report a failure or a drop, not a success.
- The metric otelcol_exporter_sent_metric_points_total should remain at 0, as no data points were successfully exposed for scraping.
- The metric otelcol_exporter_send_failed_metric_points_total (or a more specific otelcol_exporter_dropped_metric_points_total) should increment to 2, accurately reflecting that two points were received but could not be exported.
What did you see instead?
Result of {__name__=~".+experiment.+"} or {__name__=~"otelcol_exporter.+"} or {__name__=~"otelcol_receiver.+"}:
metric | value |
---|---|
otelcol_exporter_send_failed_metric_points_total{exporter="prometheus", (...)} | 0 |
otelcol_exporter_sent_metric_points_total{exporter="prometheus", (...)} | 2 |
otelcol_receiver_accepted_metric_points_total{receiver="otlp", transport="http", (...)} | 2 |
otelcol_receiver_refused_metric_points_total{receiver="otlp", transport="http", (...)} | 0 |
- The metric promexample_experiment2 is correctly absent from Prometheus because it was dropped.
- The collector's internal metric otelcol_exporter_sent_metric_points_total incorrectly increments to 2.
- The collector's internal metric otelcol_exporter_send_failed_metric_points_total remains at 0.
This gives the operator a misleading signal that two metric points were successfully exported when they were, in fact, dropped.
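The same contrast is visible through the Prometheus HTTP API if you prefer it over the UI; this uses the standard /api/v1/query endpoint of the prometheus service defined above:
# Dropped series: expect an empty result vector
curl -sG 'http://localhost:9090/api/v1/query' --data-urlencode 'query=promexample_experiment2'
# Internal counter: expect a sample with value 2 even though nothing was exported
curl -sG 'http://localhost:9090/api/v1/query' --data-urlencode 'query=otelcol_exporter_sent_metric_points_total{exporter="prometheus"}'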
For comparison, if I change the configuration by adding the deltatocumulative processor (using the collector-contrib image):
--- docker-compose.yml ---
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib@sha256:8d5c6595ac5d6fd8ee0ca91868bead6426353b077722b85f5ae98e583caa259b
    (...)
--- otel-collector-config.yml ---
receivers: (...)
processors:
  deltatocumulative:
exporters: (...)
service:
  telemetry: (...)
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [deltatocumulative]
      exporters: [prometheus]
Then the promexample_experiment2 metric is exported, and we see the very same internal metric values, which are now accurate:
Query: {__name__=~".+experiment.+"} or {__name__=~"otelcol_exporter.+"} or {__name__=~"otelcol_receiver.+"}
metric | value |
---|---|
promexample_experiment2{instance="otel-collector:9089", job="otel-collector"} | 15 |
otelcol_exporter_send_failed_metric_points_total{exporter="prometheus", (...)} | 0 |
otelcol_exporter_sent_metric_points_total{exporter="prometheus", (...)} | 2 |
otelcol_receiver_accepted_metric_points_total{receiver="otlp", transport="http", (...)} | 2 |
otelcol_receiver_refused_metric_points_total{receiver="otlp", transport="http", (...)} | 0 |
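As a sanity check for the deltatocumulative variant, the exporter endpoint itself now serves the accumulated series. Again, this assumes the 9089 port is published to the host (e.g. "9089:9089"); the exact labels on the series depend on the exporter's defaults:
# With deltatocumulative in the pipeline, the accumulated value (5 + 10 = 15) is exposed
curl -s http://localhost:9089/metrics | grep promexample_experiment2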
Collector version
sha256:1a266d7de716f80416c0d80b06014b4c3fbf26c4721c66d5a73e6d08c5011bb5 (otel/opentelemetry-collector, reported as version 0.132.0 in the log output below)
Environment information
Environment
OS: Ubuntu 24.04.2 LTS under Win11/WSL2
Docker: Docker version 28.3.3, build 980b856
OpenTelemetry Collector configuration
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: otel-collector:4317
      http:
        endpoint: otel-collector:4318
exporters:
  prometheus:
    endpoint: "otel-collector:9089"
    namespace: promexample
service:
  telemetry:
    metrics:
      readers:
        - pull:
            exporter:
              prometheus:
                host: '0.0.0.0'
                port: 8888
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
Log output
2025-08-15T06:44:51.604Z info service@v0.132.0/service.go:187 Setting up own telemetry... {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}}
2025-08-15T06:44:51.608Z info service@v0.132.0/service.go:249 Starting otelcol... {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}, "Version": "0.132.0", "NumCPU": 12}
2025-08-15T06:44:51.608Z info extensions/extensions.go:41 Starting extensions... {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}}
2025-08-15T06:44:51.614Z info otlpreceiver@v0.132.0/otlp.go:117 Starting GRPC server {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}, "otelcol.component.id": "otlp", "otelcol.component.kind": "receiver", "endpoint": "otel-collector:4317"}
2025-08-15T06:44:51.616Z info otlpreceiver@v0.132.0/otlp.go:175 Starting HTTP server {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}, "otelcol.component.id": "otlp", "otelcol.component.kind": "receiver", "endpoint": "otel-collector:4318"}
2025-08-15T06:44:51.618Z info service@v0.132.0/service.go:272 Everything is ready. Begin running and processing data. {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}}
Additional context
No response
Tip
React with 👍 to help prioritize this issue. Please use comments to provide useful context, avoiding +1 or "me too", to help us triage it.