[Geneva Exporter] Optimize EncodedBatch memory usage for retry scenarios using Arc-based sharing #432

@lalitb

Description

Problem

Currently, the upload path (upload_batch()) must clone the entire compressed data vector of an EncodedBatch on every upload attempt so that the data remains available for retries. This incurs an unnecessary allocation and full copy per attempt, which is especially costly for large compressed batches.

  // Current implementation requires expensive clone for retry capability
  pub async fn upload_batch(&self, batch: &EncodedBatch) -> Result<(), String> {
      self.uploader
          .upload(batch.data.clone(), &batch.event_name, &batch.metadata) // ← Full vector copy
          .await
  }

Impact

  • Memory overhead: Each retry attempt copies entire compressed batch data
  • Performance: O(n) memory allocation where n = compressed batch size
  • Scalability: Higher memory pressure under retry scenarios or high throughput

Proposed Solution

Replace Vec<u8> with Arc<Vec<u8>> in EncodedBatch to enable cheap reference sharing:

  #[derive(Debug, Clone)]
  pub struct EncodedBatch {
      pub event_name: String,
      pub data: Arc<Vec<u8>>,  // ← Shared ownership
      pub metadata: BatchMetadata,
  }

Benefits

  • Cheap clones: Arc clone is O(1) reference counting vs O(n) memory copy
  • Retry-friendly: Multiple references to same data without duplication
  • Memory efficient: Single allocation shared across retry attempts
  • API compatibility: Maintains existing retry semantics

Implementation Tasks

  1. Update EncodedBatch struct to use Arc<Vec<u8>>
  2. Modify OtlpEncoder::encode_log_batch() to wrap result in Arc
  3. Update uploader.upload() to handle Arc<Vec<u8>> input (with Arc::try_unwrap optimization)
  4. Update tests and benchmarks
  5. Consider backward compatibility if this is a breaking change

Alternatives Considered

  • Consume ownership: Change API to upload_batch(batch: EncodedBatch) - but this breaks retry capability
  • Keep current clone: Simple but inefficient for large batches

Priority
Medium - Performance optimization that becomes more important with:

  • Large compressed batch sizes (>100KB)
  • High retry rates
  • High-throughput scenarios
