
Internal telemetry of Prometheus exporter incorrectly reports dropped metrics as "Sent" #13643

@halasz-csaba

Description


Component(s)

exporter/prometheus

What happened?

Describe the bug
The prometheus exporter's internal telemetry is misleading when it encounters data it cannot process. When a non-monotonic OTLP Sum metric with DELTA aggregation temporality is sent to the collector, the exporter correctly drops it as per the specification: Prometheus and OpenMetrics Compatibility - Sums

However, the internal metric otelcol_exporter_sent_metric_points_total is incremented for these dropped data points, while otelcol_exporter_send_failed_metric_points_total remains at 0. This is quite misleading, suggesting a successful export and making it very difficult to diagnose why data is missing in Prometheus.

Steps to reproduce

  1. Use the following configuration files:
--- docker-compose.yml ---

services:
  otel-collector:
    image: otel/opentelemetry-collector@sha256:1a266d7de716f80416c0d80b06014b4c3fbf26c4721c66d5a73e6d08c5011bb5
    command: ["--config=/etc/otel-collector-config.yml"]
    volumes:
      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
    ports:
      - "8888"   # Prometheus metrics exposed by the collector
      - "9089"   # Prometheus exporter metrics
      - "4317:4317"   # gRPC receiver
      - "4318:4318"   # http receiver
    networks:
      - otel-net

  prometheus:
    container_name: prometheus
    image: prom/prometheus:v3.5.0@sha256:63805ebb8d2b3920190daf1cb14a60871b16fd38bed42b857a3182bc621f4996
    command: ["--config.file=/etc/prometheus/prometheus.yml", "--log.level=debug"]
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
    networks:
      - otel-net
      
networks:
  otel-net:

--- otel-collector-config.yml ---

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: otel-collector:4317
      http:
        endpoint: otel-collector:4318

exporters:
  prometheus:
    endpoint: "otel-collector:9089"
    namespace: promexample

service:
  telemetry:
    metrics:
      readers:
        - pull:
            exporter:
              prometheus:
                host: '0.0.0.0'
                port: 8888
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]

--- prometheus.yml ---

scrape_configs:
  - job_name: 'otel-collector'
    scrape_interval: 1s
    static_configs:
      - targets: ['otel-collector:9089']
  - job_name: 'otel-collector-internal'
    scrape_interval: 1s
    static_configs:
      - targets: ['otel-collector:8888']
  2. Start the services: docker-compose up -d
  3. Run the following script to send two DELTA, non-monotonic Sum data points (values 5 and 10):
now_ns="$(date +%s)000000000"
now_plus1_ns="$((now_ns + 1000000000))"

# Command 1: Send the first data point (t=0->0, Value: 5)
curl -X POST -H "Content-Type: application/json" http://localhost:4318/v1/metrics -d "{
  \"resource_metrics\": [
    {
      \"resource\": {
        \"attributes\": [
          { \"key\": \"environment\", \"value\": { \"string_value\": \"my-env\" } }
        ]
      },
      \"scope_metrics\": [
        {
          \"scope\": {},
          \"metrics\": [
            {
              \"name\": \"experiment2\",
              \"description\": \"testing\",
              \"unit\": \"1\",
              \"sum\": {
                \"data_points\": [
                  {
                    \"start_time_unix_nano\": \"$now_ns\",
                    \"time_unix_nano\": \"$now_ns\",
                    \"as_int\": \"5\"
                  }
                ],
                \"aggregation_temporality\": \"AGGREGATION_TEMPORALITY_DELTA\",
                \"is_monotonic\": false
              }
            }
          ]
        }
      ]
    }
  ]
}"

sleep 1

# Command 2: Send the second data point (t=0->1, Value: 10)
curl -X POST -H "Content-Type: application/json" http://localhost:4318/v1/metrics -d "{
  \"resource_metrics\": [
    {
      \"resource\": {
        \"attributes\": [
          { \"key\": \"environment\", \"value\": { \"string_value\": \"my-env\" } }
        ]
      },
      \"scope_metrics\": [
        {
          \"scope\": {},
          \"metrics\": [
            {
              \"name\": \"experiment2\",
              \"description\": \"testing\",
              \"unit\": \"1\",
              \"sum\": {
                \"data_points\": [
                  {
                    \"start_time_unix_nano\": \"$now_ns\",
                    \"time_unix_nano\": \"$now_plus1_ns\",
                    \"as_int\": \"10\"
                  }
                ],
                \"aggregation_temporality\": \"AGGREGATION_TEMPORALITY_DELTA\",
                \"is_monotonic\": false
              }
            }
          ]
        }
      ]
    }
  ]
}"

Query Prometheus UI (http://localhost:9090) for the collector's internal metrics:
{__name__=~".+experiment.+"} or {__name__=~"otelcol_exporter.+"} or {__name__=~"otelcol_receiver.+"}

What did you expect to see?
Internal metrics should accurately reflect the outcome of an operation. When the exporter drops data due to incompatibility, its internal metrics should report a failure or a drop, not a success.

The metric otelcol_exporter_sent_metric_points_total should remain at 0, as no data points were successfully exposed for scraping.

The metric otelcol_exporter_send_failed_metric_points_total (or a more specific otelcol_exporter_dropped_metric_points_total) should increment to 2, accurately reflecting that two points were received but could not be exported.

What did you see instead?
result of {__name__=~".+experiment.+"} or {__name__=~"otelcol_exporter.+"} or {__name__=~"otelcol_receiver.+"}:

metric value
otelcol_exporter_send_failed_metric_points_total{exporter="prometheus", (...)} 0
otelcol_exporter_sent_metric_points_total{exporter="prometheus", (...)} 2
otelcol_receiver_accepted_metric_points_total{receiver="otlp", transport="http", (...)} 2
otelcol_receiver_refused_metric_points_total{receiver="otlp", transport="http", (...)} 0
  • The metric promexample_experiment2 is correctly absent from Prometheus because it was dropped.
  • The collector's internal metric otelcol_exporter_sent_metric_points_total incorrectly increments to 2.
  • The collector's internal metric otelcol_exporter_send_failed_metric_points_total remains at 0.

This gives the operator a misleading signal that two metric points were successfully exported when they were, in fact, dropped.

For comparison, if I change the configuration by adding the deltatocumulative processor (using the collector-contrib):

--- docker-compose.yml ---

services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib@sha256:8d5c6595ac5d6fd8ee0ca91868bead6426353b077722b85f5ae98e583caa259b
    (...)

--- otel-collector-config.yml ---

receivers: (...)
processors:
  deltatocumulative:
exporters: (...)
service:
  telemetry: (...)
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [deltatocumulative]
      exporters: [prometheus]

Then the promexample_experiment2 metric is exported, but we see the very same internal metrics (which are now as expected):
Query: {__name__=~".+experiment.+"} or {__name__=~"otelcol_exporter.+"} or {__name__=~"otelcol_receiver.+"}

metric value
promexample_experiment2{instance="otel-collector:9089", job="otel-collector"} 15
otelcol_exporter_send_failed_metric_points_total{exporter="prometheus", (...)} 0
otelcol_exporter_sent_metric_points_total{exporter="prometheus", (...)} 2
otelcol_receiver_accepted_metric_points_total{receiver="otlp", transport="http", (...)} 2
otelcol_receiver_refused_metric_points_total{receiver="otlp", transport="http", (...)} 0

Collector version

sha256:1a266d7de716f80416c0d80b06014b4c3fbf26c4721c66d5a73e6d08c5011bb5

Environment information

Environment

OS: Ubuntu 24.04.2 LTS under Win11/WSL2
Docker: Docker version 28.3.3, build 980b856

OpenTelemetry Collector configuration

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: otel-collector:4317
      http:
        endpoint: otel-collector:4318

exporters:
  prometheus:
    endpoint: "otel-collector:9089"
    namespace: promexample

service:
  telemetry:
    metrics:
      readers:
        - pull:
            exporter:
              prometheus:
                host: '0.0.0.0'
                port: 8888
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]

Log output

2025-08-15T06:44:51.604Z        info    service@v0.132.0/service.go:187 Setting up own telemetry...     {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}}
2025-08-15T06:44:51.608Z        info    service@v0.132.0/service.go:249 Starting otelcol...     {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}, "Version": "0.132.0", "NumCPU": 12}
2025-08-15T06:44:51.608Z        info    extensions/extensions.go:41     Starting extensions...  {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}}
2025-08-15T06:44:51.614Z        info    otlpreceiver@v0.132.0/otlp.go:117       Starting GRPC server    {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}, "otelcol.component.id": "otlp", "otelcol.component.kind": "receiver", "endpoint": "otel-collector:4317"}
2025-08-15T06:44:51.616Z        info    otlpreceiver@v0.132.0/otlp.go:175       Starting HTTP server    {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}, "otelcol.component.id": "otlp", "otelcol.component.kind": "receiver", "endpoint": "otel-collector:4318"}
2025-08-15T06:44:51.618Z        info    service@v0.132.0/service.go:272 Everything is ready. Begin running and processing data. {"resource": {"service.instance.id": "5e958754-2569-4627-9d68-b3babbc98f93", "service.name": "otelcol", "service.version": "0.132.0"}}

Additional context

No response

Labels

bug (Something isn't working), collector-telemetry (healthchecker and other telemetry collection issues)
