Incorrect Behavior in OpenTelemetry Collector Spanmetrics #27472

@lucasoares

Description

Component(s)

connector/spanmetrics, exporter/prometheusremotewrite

What happened?

Issue Description:

We're facing a peculiar issue with the OpenTelemetry Collector's Spanmetrics connector and could use some help sorting it out.

Here's a quick rundown:

Problem:

  • We've set up an architecture on the Grafana LGTM stack, with Grafana Loki, Tempo, and Mimir for logs, traces, and metrics, respectively.
  • The goal is to sample traces efficiently while still capturing 100% of spanmetrics for a comprehensive APM dashboard.
  • Our setup uses otel/opentelemetry-collector-contrib as a load balancer, generating trace metrics with the spanmetrics connector and routing traces/metrics by attribute_source: resource to apply our internal tenant distribution across the Grafana services.
  • Traces are routed and stored correctly in Grafana Tempo, but the spanmetrics exhibit strange behavior in Grafana Mimir.

Spanmetrics Configuration:

connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [1ms, 2ms, ... , 10000s]
    namespace: traces.spanmetrics
    dimensions:
      - name: http.status_code
      - name: http.method
      - name: rpc.grpc.status_code
      - name: db.system
      - name: external.service
      - name: k8s.cluster.name

Issue Details:

  • Executing code that generates a specific span 10 times accumulates the counter time series correctly.
  • However, querying the metric with PromQL functions like increase or rate yields inaccurate results.
  • For example, increase(traces_spanmetrics_calls_total{service_name="my-service"}[5m]) shows a continuously increasing line that reaches 600 executions and never returns to 0, even after a period with no traces.

Observations:

  • The discrepancy inflates our application metrics: rate reports over 100,000,000 spans/minute for an application that generates about 40,000 spans/minute.

  • We sought help on the Grafana Mimir Slack channel (link) without success; since metrics generated directly by our own applications don't show this issue, the problem appears to lie within the OpenTelemetry Collector.
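
Purely as an illustration of the arithmetic (not a diagnosis of this issue): inflation of this kind can appear whenever two independent counters end up interleaved into the same series, because Prometheus-style engines treat any decrease in a counter as a reset. A minimal sketch of that reset handling, with all names my own and the real extrapolation to window edges omitted:

```python
def prom_increase(samples):
    """Approximate a Prometheus-style counter increase over a list of samples.

    Counters are assumed to only ever rise, so any observed decrease is
    interpreted as a counter reset and the post-reset value is added in full.
    (Simplified: real increase() also extrapolates to the window edges.)
    """
    total = 0.0
    prev = samples[0]
    for value in samples[1:]:
        if value < prev:
            # apparent reset: the whole post-reset value is counted
            total += value
        else:
            total += value - prev
        prev = value
    return total


# A single well-behaved counter incremented 10 times: increase == 10.
assert prom_increase([0, 3, 7, 10]) == 10

# Two independent counters interleaved into one series (e.g. two writers
# emitting identical label sets): every drop looks like a reset, so the
# computed increase explodes far past the true growth of 4.
assert prom_increase([100, 5, 101, 6, 102, 7]) == 210
```

If something like this were at play, the apparent resets would disappear once the writers are distinguishable by a label, but again, this sketch only shows the reset arithmetic, not the root cause here.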

Screenshots:

[screenshot]

[screenshot]

In this last example, the metric only stopped increasing because we restarted the OpenTelemetry Collector instance that was serving these spanmetrics.

Another example of the metric remaining incorrect after the application stopped generating new spans:

[screenshot]

If you need more details or logs, just let us know!

Collector version

0.83.0

Environment information

Environment

Kubernetes using official helm-chart:

image:
  # If you want to use the core image `otel/opentelemetry-collector`, you also need to change `command.name` value to `otelcol`.
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  # Overrides the image tag whose default is the chart appVersion.
  tag: "0.83.0"
  # When digest is set to a non-empty value, images will be pulled by digest (regardless of tag value).
  digest: ""

OpenTelemetry Collector configuration

There are two Helm values files in this section.

The load balancer:

# Default values for opentelemetry-collector.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

nameOverride: ""
fullnameOverride: ""

# Valid values are "daemonset", "deployment", and "statefulset".
mode: "deployment"

configMap:
  # Specifies whether a configMap should be created (true by default)
  create: true

# Base collector configuration.
# Supports templating. To escape existing instances of {{ }}, use {{` <original content> `}}.
# For example, {{ REDACTED_EMAIL }} becomes {{` {{ REDACTED_EMAIL }} `}}.
config:
  receivers:
    jaeger: null
    zipkin: null
    prometheus: null
    otlp:
      protocols:
        grpc:
          endpoint: ${env:MY_POD_IP}:4317
          max_recv_msg_size_mib: 500
        http:
          endpoint: ${env:MY_POD_IP}:4318
  processors:
    batch:
      send_batch_max_size: 8192
    routing:
      from_attribute: k8s.cluster.name
      attribute_source: resource
      table:
      - value: a
        exporters:
          - prometheusremotewrite/mimir-a
      - value: b
        exporters:
          - prometheusremotewrite/mimir-b
      - value: c
        exporters:
          - prometheusremotewrite/mimir-c
      - value: d
        exporters:
          - prometheusremotewrite/mimir-d
      - value: e
        exporters:
          - prometheusremotewrite/mimir-e
      - value: f
        exporters:
          - prometheusremotewrite/mimir-f
      - value: g
        exporters:
          - prometheusremotewrite/mimir-g
      - value: h
        exporters:
          - prometheusremotewrite/mimir-h
      - value: i
        exporters:
          - prometheusremotewrite/mimir-i
      - value: j
        exporters:
          - prometheusremotewrite/mimir-j
    # If set to null, will be overridden with values based on k8s resource limits
    memory_limiter: null
  connectors:
    spanmetrics:
      histogram:
        explicit:
          buckets: [1ms, 2ms, 4ms, 6ms, 8ms, 10ms, 50ms, 100ms, 200ms, 400ms, 800ms, 1s, 1400ms, 2s, 5s, 10s, 15s, 20s, 40s, 100s, 500s, 1000s, 10000s]
      namespace: traces.spanmetrics
      dimensions:
        - name: http.status_code
        - name: http.method
        - name: rpc.grpc.status_code
        - name: db.system
        - name: external.service
        - name: k8s.cluster.name
  exporters:
    logging: null
    prometheusremotewrite/mimir-a:
      endpoint: http://mimir-distributor.mimir-system.svc.cluster.local:8080/api/v1/push
      resource_to_telemetry_conversion:
        enabled: true
      tls:
        insecure: true
      headers:
        X-Scope-OrgID: grafanaaMimir
      remote_write_queue:
        enabled: true
        queue_size: 10000
        num_consumers: 5
    prometheusremotewrite/mimir-b:
      endpoint: http://mimir-distributor.mimir-system.svc.cluster.local:8080/api/v1/push
      resource_to_telemetry_conversion:
        enabled: true
      tls:
        insecure: true
      headers:
        X-Scope-OrgID: grafanabMimir
      remote_write_queue:
        enabled: true
        queue_size: 10000
        num_consumers: 5
    prometheusremotewrite/mimir-c:
      endpoint: http://mimir-distributor.mimir-system.svc.cluster.local:8080/api/v1/push
      resource_to_telemetry_conversion:
        enabled: true
      tls:
        insecure: true
      headers:
        X-Scope-OrgID: grafanacMimir
      remote_write_queue:
        enabled: true
        queue_size: 10000
        num_consumers: 5
    prometheusremotewrite/mimir-d:
      endpoint: http://mimir-distributor.mimir-system.svc.cluster.local:8080/api/v1/push
      resource_to_telemetry_conversion:
        enabled: true
      tls:
        insecure: true
      headers:
        X-Scope-OrgID: grafanadMimir
      remote_write_queue:
        enabled: true
        queue_size: 10000
        num_consumers: 5
    prometheusremotewrite/mimir-e:
      endpoint: http://mimir-distributor.mimir-system.svc.cluster.local:8080/api/v1/push
      resource_to_telemetry_conversion:
        enabled: true
      tls:
        insecure: true
      headers:
        X-Scope-OrgID: grafanaFirehoseMimir
      remote_write_queue:
        enabled: true
        queue_size: 10000
        num_consumers: 5
    prometheusremotewrite/mimir-f:
      endpoint: http://mimir-distributor.mimir-system.svc.cluster.local:8080/api/v1/push
      resource_to_telemetry_conversion:
        enabled: true
      tls:
        insecure: true
      headers:
        X-Scope-OrgID: grafanaeMimir
      remote_write_queue:
        enabled: true
        queue_size: 10000
        num_consumers: 5
    prometheusremotewrite/mimir-g:
      endpoint: http://mimir-distributor.mimir-system.svc.cluster.local:8080/api/v1/push
      resource_to_telemetry_conversion:
        enabled: true
      tls:
        insecure: true
      headers:
        X-Scope-OrgID: grafanafMimir
      remote_write_queue:
        enabled: true
        queue_size: 10000
        num_consumers: 5
    prometheusremotewrite/mimir-h:
      endpoint: http://mimir-distributor.mimir-system.svc.cluster.local:8080/api/v1/push
      resource_to_telemetry_conversion:
        enabled: true
      tls:
        insecure: true
      headers:
        X-Scope-OrgID: grafanagMimir
      remote_write_queue:
        enabled: true
        queue_size: 10000
        num_consumers: 5
    prometheusremotewrite/mimir-i:
      endpoint: http://mimir-distributor.mimir-system.svc.cluster.local:8080/api/v1/push
      resource_to_telemetry_conversion:
        enabled: true
      tls:
        insecure: true
      headers:
        X-Scope-OrgID: grafanahMimir
      remote_write_queue:
        enabled: true
        queue_size: 10000
        num_consumers: 5
    prometheusremotewrite/mimir-j:
      endpoint: http://mimir-distributor.mimir-system.svc.cluster.local:8080/api/v1/push
      resource_to_telemetry_conversion:
        enabled: true
      tls:
        insecure: true
      headers:
        X-Scope-OrgID: grafanajMimir
      remote_write_queue:
        enabled: true
        queue_size: 10000
        num_consumers: 5
    loadbalancing:
      protocol:
        otlp:
          tls:
            insecure: true
      resolver:
        dns:
          hostname: opentelemetry-collector-tail.tempo-system.svc.cluster.local
          port: 4317

  extensions:
    # The health_check extension is mandatory for this chart.
    # Without the health_check extension the collector will fail the readiness and liveliness probes.
    # The health_check extension can be modified, but should never be removed.
    health_check: {}
    memory_ballast:
      size_in_percentage: 33
  service:
    telemetry:
      metrics:
        address: 0.0.0.0:8888
      logs:
        encoding: json
    extensions:
      - health_check
      - memory_ballast
    pipelines:
      logs: null
      metrics:
        receivers:
          - spanmetrics
        processors:
          - memory_limiter
          - batch
          - routing
        exporters:
          - prometheusremotewrite/mimir-a
          - prometheusremotewrite/mimir-b
          - prometheusremotewrite/mimir-c
          - prometheusremotewrite/mimir-d
          - prometheusremotewrite/mimir-e
          - prometheusremotewrite/mimir-f
          - prometheusremotewrite/mimir-g
          - prometheusremotewrite/mimir-h
          - prometheusremotewrite/mimir-i
          - prometheusremotewrite/mimir-j
      traces:
        receivers:
          - otlp
        processors:
          - memory_limiter
          - batch
        exporters:
          - loadbalancing
          - spanmetrics

image:
  # If you want to use the core image `otel/opentelemetry-collector`, you also need to change `command.name` value to `otelcol`.
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  # Overrides the image tag whose default is the chart appVersion.
  tag: "0.83.0"
  # When digest is set to a non-empty value, images will be pulled by digest (regardless of tag value).
  digest: ""
imagePullSecrets: []

# OpenTelemetry Collector executable
command:
  name: otelcol-contrib
  extraArgs:
    - --feature-gates=pkg.translator.prometheus.NormalizeName

nodeSelector:
  role: lgtm
tolerations:
- effect: NoSchedule
  key: grafana-stack
  operator: Exists

# Configuration for ports
# nodePort is also allowed
ports:
  otlp:
    enabled: true
    containerPort: 4317
    servicePort: 4317
    hostPort: 4317
    protocol: TCP
    # nodePort: 30317
    appProtocol: grpc
  otlp-http:
    enabled: true
    containerPort: 4318
    servicePort: 4318
    hostPort: 4318
    protocol: TCP
  jaeger-compact:
    enabled: false
    containerPort: 6831
    servicePort: 6831
    hostPort: 6831
    protocol: UDP
  jaeger-thrift:
    enabled: false
    containerPort: 14268
    servicePort: 14268
    hostPort: 14268
    protocol: TCP
  jaeger-grpc:
    enabled: false
    containerPort: 14250
    servicePort: 14250
    hostPort: 14250
    protocol: TCP
  zipkin:
    enabled: false
    containerPort: 9411
    servicePort: 9411
    hostPort: 9411
    protocol: TCP
  metrics:
    # The metrics port is disabled by default. However you need to enable the port
    # in order to use the ServiceMonitor (serviceMonitor.enabled) or PodMonitor (podMonitor.enabled).
    enabled: true
    containerPort: 8888
    servicePort: 8888
    protocol: TCP

# Resource limits & requests. Update according to your own use case as these values might be too low for a typical deployment.
resources:
  limits:
    cpu: 1
    memory: 1Gi
  requests:
    cpu: 100m
    memory: 100Mi

podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8888"

# only used with deployment mode
replicaCount: 4

# only used with deployment mode
revisionHistoryLimit: 10

service:
  type: ClusterIP
  # type: LoadBalancer
  # loadBalancerIP: 1.2.3.4
  # loadBalancerSourceRanges: []
  annotations: {}

# PodDisruptionBudget is used only if deployment enabled
podDisruptionBudget:
  enabled: true
#   minAvailable: 2
  maxUnavailable: 1

rollout:
  rollingUpdate: {}
  # When 'mode: daemonset', maxSurge cannot be used when hostPort is set for any of the ports
  # maxSurge: 25%
  # maxUnavailable: 0
  strategy: RollingUpdate

clusterRole:
  # Specifies whether a clusterRole should be created
  # Some presets also trigger the creation of a cluster role and cluster role binding.
  # If using one of those presets, this field is no-op.
  create: false
  # Annotations to add to the clusterRole
  # Can be used in combination with presets that create a cluster role.
  annotations: {}
  # The name of the clusterRole to use.
  # If not set a name is generated using the fullname template
  # Can be used in combination with presets that create a cluster role.
  name: ""
  # A set of rules as documented here : https://kubernetes.io/docs/reference/access-authn-authz/rbac/
  # Can be used in combination with presets that create a cluster role to add additional rules.
  rules:
  - apiGroups:
    - ''
    resources:
    - 'endpoints'
    verbs:
    - 'get'
    - 'list'
    - 'watch'

  clusterRoleBinding:
    # Annotations to add to the clusterRoleBinding
    # Can be used in combination with presets that create a cluster role binding.
    annotations: {}
    # The name of the clusterRoleBinding to use.
    # If not set a name is generated using the fullname template
    # Can be used in combination with presets that create a cluster role binding.
    name: ""

The tail sampler:

# Default values for opentelemetry-collector.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

nameOverride: ""
fullnameOverride: ""

# Valid values are "daemonset", "deployment", and "statefulset".
mode: "deployment"

configMap:
  # Specifies whether a configMap should be created (true by default)
  create: true

# Base collector configuration.
# Supports templating. To escape existing instances of {{ }}, use {{` <original content> `}}.
# For example, {{ REDACTED_EMAIL }} becomes {{` {{ REDACTED_EMAIL }} `}}.
config:
  receivers:
    jaeger: null
    zipkin: null
    prometheus: null
    otlp:
      protocols:
        grpc:
          endpoint: ${env:MY_POD_IP}:4317
          max_recv_msg_size_mib: 500
        http: null
  processors:
    batch:
      send_batch_max_size: 8192
    # If set to null, will be overridden with values based on k8s resource limits
    memory_limiter: null
    tail_sampling:
      decision_wait: 60s
      policies:
      - name: probabilistic
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
    routing:
      from_attribute: k8s.cluster.name
      attribute_source: resource
      # default_exporters:
      # - otlp/default
      table:
      - value: a
        exporters:
          - otlp/tempo-a
      - value: b
        exporters:
          - otlp/tempo-b
      - value: c
        exporters:
          - otlp/tempo-c
      - value: d
        exporters:
          - otlp/tempo-d
      - value: e
        exporters:
          - otlp/tempo-e
      - value: f
        exporters:
          - otlp/tempo-f
      - value: g
        exporters:
          - otlp/tempo-g
      - value: h
        exporters:
          - otlp/tempo-h
      - value: i
        exporters:
          - otlp/tempo-i
      - value: j
        exporters:
          - otlp/tempo-j
  exporters:
    logging: null
    # otlp/default:
    #   endpoint: tempo-distributor.tempo-system.svc.cluster.local:4317
    #   tls:
    #     insecure: true
    #   headers:
    #     x-scope-orgid: aMimir
    otlp/tempo-a:
      endpoint: tempo-distributor.tempo-system.svc.cluster.local:4317
      tls:
        insecure: true
      headers:
        x-scope-orgid: grafanaaTempo
    otlp/tempo-b:
      endpoint: tempo-distributor.tempo-system.svc.cluster.local:4317
      tls:
        insecure: true
      headers:
        x-scope-orgid: grafanabTempo
    otlp/tempo-c:
      endpoint: tempo-distributor.tempo-system.svc.cluster.local:4317
      tls:
        insecure: true
      headers:
        x-scope-orgid: grafanacTempo
    otlp/tempo-d:
      endpoint: tempo-distributor.tempo-system.svc.cluster.local:4317
      tls:
        insecure: true
      headers:
        x-scope-orgid: grafanadTempo
    otlp/tempo-e:
      endpoint: tempo-distributor.tempo-system.svc.cluster.local:4317
      tls:
        insecure: true
      headers:
        x-scope-orgid: grafanaeTempo
    otlp/tempo-f:
      endpoint: tempo-distributor.tempo-system.svc.cluster.local:4317
      tls:
        insecure: true
      headers:
        x-scope-orgid: grafanafTempo
    otlp/tempo-g:
      endpoint: tempo-distributor.tempo-system.svc.cluster.local:4317
      tls:
        insecure: true
      headers:
        x-scope-orgid: grafanagTempo
    otlp/tempo-h:
      endpoint: tempo-distributor.tempo-system.svc.cluster.local:4317
      tls:
        insecure: true
      headers:
        x-scope-orgid: grafanahTempo
    otlp/tempo-i:
      endpoint: tempo-distributor.tempo-system.svc.cluster.local:4317
      tls:
        insecure: true
      headers:
        x-scope-orgid: grafanaiTempo
    otlp/tempo-j:
      endpoint: tempo-distributor.tempo-system.svc.cluster.local:4317
      tls:
        insecure: true
      headers:
        x-scope-orgid: grafanajTempo
  extensions:
    # The health_check extension is mandatory for this chart.
    # Without the health_check extension the collector will fail the readiness and liveliness probes.
    # The health_check extension can be modified, but should never be removed.
    health_check: {}
    memory_ballast:
      size_in_percentage: 33
  service:
    telemetry:
      metrics:
        address: 0.0.0.0:8888
      logs:
        encoding: json
    extensions:
      - health_check
      - memory_ballast
    pipelines:
      logs: null
      metrics: null
      traces:
        receivers:
          - otlp
        processors:
          - memory_limiter
          - tail_sampling
          - batch
          - routing
        exporters:
          - otlp/tempo-a
          - otlp/tempo-b
          - otlp/tempo-c
          - otlp/tempo-d
          - otlp/tempo-e
          - otlp/tempo-f
          - otlp/tempo-g
          - otlp/tempo-h
          - otlp/tempo-i
          - otlp/tempo-j

image:
  # If you want to use the core image `otel/opentelemetry-collector`, you also need to change `command.name` value to `otelcol`.
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  # Overrides the image tag whose default is the chart appVersion.
  tag: "0.83.0"
  # When digest is set to a non-empty value, images will be pulled by digest (regardless of tag value).
  digest: ""
imagePullSecrets: []

# OpenTelemetry Collector executable
command:
  name: otelcol-contrib
  extraArgs: []

nodeSelector:
  role: lgtm
tolerations:
- effect: NoSchedule
  key: grafana-stack
  operator: Exists

# Configuration for ports
# nodePort is also allowed
ports:
  otlp:
    enabled: true
    containerPort: 4317
    servicePort: 4317
    hostPort: 4317
    protocol: TCP
    # nodePort: 30317
    appProtocol: grpc
  otlp-http:
    enabled: false
    containerPort: 4318
    servicePort: 4318
    hostPort: 4318
    protocol: TCP
  jaeger-compact:
    enabled: false
    containerPort: 6831
    servicePort: 6831
    hostPort: 6831
    protocol: UDP
  jaeger-thrift:
    enabled: false
    containerPort: 14268
    servicePort: 14268
    hostPort: 14268
    protocol: TCP
  jaeger-grpc:
    enabled: false
    containerPort: 14250
    servicePort: 14250
    hostPort: 14250
    protocol: TCP
  zipkin:
    enabled: false
    containerPort: 9411
    servicePort: 9411
    hostPort: 9411
    protocol: TCP
  metrics:
    # The metrics port is disabled by default. However you need to enable the port
    # in order to use the ServiceMonitor (serviceMonitor.enabled) or PodMonitor (podMonitor.enabled).
    enabled: true
    containerPort: 8888
    servicePort: 8888
    protocol: TCP

# Resource limits & requests. Update according to your own use case as these values might be too low for a typical deployment.
resources:
  limits:
    cpu: 1
    memory: 2Gi
  requests:
    cpu: 100m
    memory: 500Mi

podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8888"

# only used with deployment mode
replicaCount: 4

# only used with deployment mode
revisionHistoryLimit: 10

service:
  type: ClusterIP
  # type: LoadBalancer
  # loadBalancerIP: 1.2.3.4
  # loadBalancerSourceRanges: []
  clusterIP: None
  annotations: {}

# PodDisruptionBudget is used only if deployment enabled
podDisruptionBudget:
  enabled: true
#   minAvailable: 2
  maxUnavailable: 1

rollout:
  rollingUpdate: {}
  # When 'mode: daemonset', maxSurge cannot be used when hostPort is set for any of the ports
  # maxSurge: 25%
  # maxUnavailable: 0
  strategy: RollingUpdate

Please ignore the exporter names; I redacted them.



Log output

_No response_

Additional context

_No response_
