26 commits
* a76b542: Making builder static (LikeTheSalad, Aug 20, 2025)
* e3e38be: Merge branch 'main' into disk-buffering-api-implementation (LikeTheSalad, Aug 21, 2025)
* 54cdbbf: Writing spans using FileSpanStorage (LikeTheSalad, Aug 21, 2025)
* 9b233ee: Creating StorageIterator (LikeTheSalad, Aug 21, 2025)
* a67f72e: Clean up from disk exporters (LikeTheSalad, Aug 21, 2025)
* 01993c3: Specifying storage type (LikeTheSalad, Aug 21, 2025)
* 87b1162: Updating storage tests (LikeTheSalad, Aug 22, 2025)
* cf51489: Updating tests (LikeTheSalad, Aug 22, 2025)
* a8e543d: Removing unused types (LikeTheSalad, Aug 22, 2025)
* 4999104: Creating FileSignalStorage (LikeTheSalad, Aug 24, 2025)
* 75b41aa: Creating file signal storage implementations (LikeTheSalad, Aug 24, 2025)
* 9af7917: Fixing lint warnings (LikeTheSalad, Aug 24, 2025)
* 1ca3cb2: Creating export to disk implementations (LikeTheSalad, Aug 24, 2025)
* b5fcf76: Updating integration tests (LikeTheSalad, Aug 24, 2025)
* d027e2e: Merge branch 'main' into disk-buffering-api-implementation (LikeTheSalad, Aug 25, 2025)
* 6e35a73: Updating error message (LikeTheSalad, Aug 25, 2025)
* 8153313: Updating DESIGN.md (LikeTheSalad, Aug 25, 2025)
* 1d64de8: Making callbacks aware of the signal type (LikeTheSalad, Aug 25, 2025)
* b8c0a59: Updating README.md (LikeTheSalad, Aug 25, 2025)
* cfb818e: Updating README.md (LikeTheSalad, Aug 25, 2025)
* ac23620: Spotless (LikeTheSalad, Aug 25, 2025)
* ab424be: ./gradlew spotlessApply (otelbot[bot], Aug 25, 2025)
* b336d9b: Merge remote-tracking branch 'origin/disk-buffering-api-implementatio…' (LikeTheSalad, Aug 25, 2025)
* 1605402: Merge branch 'main' into disk-buffering-api-implementation (LikeTheSalad, Aug 26, 2025)
* 0430c40: Updating internal log levels (LikeTheSalad, Aug 26, 2025)
* a5fe36a: Updating tests (LikeTheSalad, Aug 26, 2025)
69 changes: 33 additions & 36 deletions disk-buffering/DESIGN.md
@@ -1,59 +1,62 @@
# Design Overview

There are three main disk-writing exporters provided by this module:
The core of disk buffering
is [SignalStorage](src/main/java/io/opentelemetry/contrib/disk/buffering/storage/SignalStorage.java),
an abstraction that defines the minimum functionality implementations need in order to support
writing and reading signals.

* [LogRecordToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/LogRecordToDiskExporter.java)
* [MetricToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/MetricToDiskExporter.java)
* [SpanToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/SpanToDiskExporter.java))
There is a default implementation per signal that writes serialized signal items as delimited
protobuf messages into files. Each file's name is the timestamp of when it was created, which is
later used to determine when the file is ready to be read and when it has expired. These
implementations are the following:

Each is responsible for writing a specific type of telemetry to disk storage for later
harvest/ingest.
* [FileSpanStorage](src/main/java/io/opentelemetry/contrib/disk/buffering/storage/impl/FileSpanStorage.java)
* [FileLogRecordStorage](src/main/java/io/opentelemetry/contrib/disk/buffering/storage/impl/FileLogRecordStorage.java)
* [FileMetricStorage](src/main/java/io/opentelemetry/contrib/disk/buffering/storage/impl/FileMetricStorage.java)
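The timestamp naming above is what later drives readiness and expiry decisions. The following is a hedged sketch of that logic, not the module's actual code: the class and method names (`FileAgeCheck`, `isReadyToRead`, `isExpired`) are illustrative, while the 33-second and 18-hour thresholds are the defaults documented in the README below.

```java
// Illustrative sketch: a file named after its creation time in epoch
// milliseconds (e.g. "1724457600000") is ready to read once it is older than
// the minimum read age, and expired once it is older than the maximum read age.
import java.util.concurrent.TimeUnit;

public class FileAgeCheck {
  static final long MIN_READ_AGE_MS = TimeUnit.SECONDS.toMillis(33); // documented default
  static final long MAX_READ_AGE_MS = TimeUnit.HOURS.toMillis(18); // documented default

  static boolean isReadyToRead(long createdAtMillis, long nowMillis) {
    return nowMillis - createdAtMillis >= MIN_READ_AGE_MS;
  }

  static boolean isExpired(long createdAtMillis, long nowMillis) {
    return nowMillis - createdAtMillis > MAX_READ_AGE_MS;
  }

  public static void main(String[] args) {
    System.out.println(isReadyToRead(0L, 34_000L)); // prints true
    System.out.println(isExpired(0L, TimeUnit.HOURS.toMillis(19))); // prints true
  }
}
```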

For later reading, there are:
Each one has a `create()` method that takes a destination directory (to store data into) and an
optional [FileStorageConfiguration](src/main/java/io/opentelemetry/contrib/disk/buffering/storage/impl/FileStorageConfiguration.java)
for finer control over the storing behavior.

* [LogRecordFromToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/LogRecordFromDiskExporter.java)
* [MetricFromDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/MetricFromDiskExporter.java)
* [SpanFromDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/SpanFromDiskExporter.java))
Even
though [SignalStorage](src/main/java/io/opentelemetry/contrib/disk/buffering/storage/SignalStorage.java)
can receive signal items directly to be stored on disk, there are convenience exporter
implementations for each signal that handle the storing process on your behalf. Those are the
following:

Each one of those has a `create()` method that takes a delegate exporter (to send data
to ingest) and the `StorageConfiguration` that tells them where to find buffered data.
* [SpanToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/exporters/SpanToDiskExporter.java)
* [LogRecordToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/exporters/LogRecordToDiskExporter.java)
* [MetricToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/exporters/MetricToDiskExporter.java)

As explained in the [README](README.md), this has to be triggered manually by the consumer of
this library and does not happen automatically.
Each receives its
respective [SignalStorage](src/main/java/io/opentelemetry/contrib/disk/buffering/storage/SignalStorage.java)
object to delegate signals to, as well as an optional callback object that is notified of its
operations.

## Writing overview

![Writing flow](assets/writing-flow.png)

* The writing process happens automatically within its `export(Collection<SignalData> signals)`
method, which is called by the configured signal processor.
* When a set of signals is received, these are delegated over to
a type-specific wrapper of [ToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/internal/exporter/ToDiskExporter.java)
class which then serializes them using an implementation
of [SignalSerializer](src/main/java/io/opentelemetry/contrib/disk/buffering/internal/serialization/serializers/SignalSerializer.java)
and then the serialized data is appended into a File using an instance of
the [Storage](src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/Storage.java)
class.
* Via the convenience toDisk exporters, the writing process happens automatically within their
`export(Collection<SignalData> signals)` method, which is called by the configured signal
processor.
* When a set of signals is received, they are delegated to a type-specific serializer and the
serialized data is then appended to a file.
* The data is written into a file directly, without the use of a buffer, to make sure no data gets
lost in case the application ends unexpectedly.
* Each disk exporter stores its signals in its own folder, which is expected to contain files
* Each signal storage stores its signals in its own folder, which is expected to contain files
that belong to that type of signal only.
* Each file may contain more than one batch of signals if the configured size limits allow it.
* If the configured folder size for the signals has been reached and a new file needs to be
created to keep storing new data, the oldest available file will be removed to make space for the
new one.
* The [Storage](src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/Storage.java),
[FolderManager](src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/FolderManager.java)
and [WritableFile](src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/files/WritableFile.java)
files contain more information on the details of the writing process into a file.
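The eviction rule above can be sketched as follows. This is an illustrative helper, not the module's internals; it relies only on the documented fact that file names are creation timestamps in milliseconds, so the numerically smallest name identifies the oldest file.

```java
// Illustrative sketch: pick the oldest file in a signal folder for eviction.
// File names are epoch-millisecond creation timestamps, so the smallest
// numeric name is the oldest file.
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class OldestFileEviction {
  static Optional<String> pickOldest(List<String> fileNames) {
    return fileNames.stream().min(Comparator.comparingLong(Long::parseLong));
  }

  public static void main(String[] args) {
    List<String> files = List.of("1724457600000", "1724457000000", "1724458200000");
    System.out.println(pickOldest(files).orElse("<empty folder>")); // prints 1724457000000
  }
}
```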

## Reading overview

![Reading flow](assets/reading-flow.png)

* The reading process has to be triggered manually by the library consumer as explained in
the [README](README.md).
* The reading process has to be triggered manually by the library consumer via the signal storage
iterator.
* A single file is read at a time and updated to remove the data gathered from it after it is
successfully exported, until it's emptied. Each file previously created during the
writing process has a timestamp in milliseconds, which is used to determine what file to start
@@ -62,9 +65,3 @@ this library and does not happen automatically.
the time of creating the disk exporter, then it will be ignored, and the next oldest (and
unexpired) one will be used instead.
* All the stale and empty files will be removed as a new file is created.
* The [Storage](src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/Storage.java),
[FolderManager](src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/FolderManager.java)
and [ReadableFile](src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/files/ReadableFile.java)
files contain more information on the details of the file reading process.
* Note that the reader delegates the data to the exporter exactly in the way it has received the
data - it does not try to batch data (but this could be an optimization in the future).
173 changes: 98 additions & 75 deletions disk-buffering/README.md
@@ -1,109 +1,132 @@
# Disk buffering

This module provides exporters that store telemetry data in files which can be
sent later on demand. A high level description of how it works is that there are two separate
processes in place, one for writing data in disk, and one for reading/exporting the previously
stored data.
This module provides an abstraction
named [SignalStorage](src/main/java/io/opentelemetry/contrib/disk/buffering/storage/SignalStorage.java),
as well as default implementations for each signal type that allow writing signals to disk and
reading them later.

* Each exporter stores the received data automatically in disk right after it's received from its
processor.
* The reading of the data back from disk and exporting process has to be done manually. At
the moment there's no automatic mechanism to do so. There's more information on how it can be
achieved, under [Reading data](#reading-data).
For more detailed information on how the whole process works, take a look at
the [DESIGN.md](DESIGN.md) file.

> For a more detailed information on how the whole process works, take a look at
> the [DESIGN.md](DESIGN.md) file.
## Default implementation usage

## Configuration
The default implementations are the following:

The configurable parameters are provided **per exporter**, the available ones are:
* [FileSpanStorage](src/main/java/io/opentelemetry/contrib/disk/buffering/storage/impl/FileSpanStorage.java)
* [FileLogRecordStorage](src/main/java/io/opentelemetry/contrib/disk/buffering/storage/impl/FileLogRecordStorage.java)
* [FileMetricStorage](src/main/java/io/opentelemetry/contrib/disk/buffering/storage/impl/FileMetricStorage.java)

* Max file size, defaults to 1MB.
* Max folder size, defaults to 10MB. All files are stored in a single folder per-signal, therefore
if all 3 types of signals are stored, the total amount of space from disk to be taken by default
would be of 30MB.
* Max age for file writing, defaults to 30 seconds.
* Min age for file reading, defaults to 33 seconds. It must be greater that the max age for file
writing.
* Max age for file reading, defaults to 18 hours. After that time passes, the file will be
considered stale and will be removed when new files are created. No more data will be read from a
file past this time.

## Usage
### Set up

### Storing data
We need to create a signal storage object per signal type to start writing signals to disk. Each
`File*Storage` implementation has a `create()` function that receives:

* A File directory to store the signal files. Note that each signal storage object must have a
dedicated directory to work properly.
* (Optional) a configuration object.

In order to use it, you need to wrap your own exporter with a new instance of
the ones provided in here:
The available configuration parameters are the following:

* For a LogRecordExporter, it must be wrapped within
a [LogRecordToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/LogRecordToDiskExporter.java).
* For a MetricExporter, it must be wrapped within
a [MetricToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/MetricToDiskExporter.java).
* For a SpanExporter, it must be wrapped within
a [SpanToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/SpanToDiskExporter.java).
* Max file size, defaults to 1MB.
* Max folder size, defaults to 10MB.
* Max age for file writing. It sets the time window where a file can get signals appended to it.
Defaults to 30 seconds.
* Min age for file reading. It sets the time to wait before starting to read from a file after
its creation. Defaults to 33 seconds. It must be greater than the max age for file writing.
* Max age for file reading. After that time passes, the file will be considered stale and will be
removed when new files are created. No more data will be read from a file past this time. Defaults
to 18 hours.

Each wrapper will need the following when instantiating them:

* The exporter to be wrapped.
* A File instance of the root directory where all the data is going to be written. The same root dir
can be used for all the wrappers, since each will create their own folder inside it.
* An instance
of [StorageConfiguration](src/main/java/io/opentelemetry/contrib/disk/buffering/config/StorageConfiguration.java)
with the desired parameters. You can create one with default values by
calling `StorageConfiguration.getDefault()`.

After wrapping your exporters, you must register the wrapper as the exporter you'll use. It will
take care of always storing the data it receives.

#### Set up example for spans

```java
// Root dir
File rootDir = new File("/some/root");

// Setting up span storage
SignalStorage.Span spanStorage = FileSpanStorage.create(new File(rootDir, "spans"));

// Setting up metric storage
SignalStorage.Metric metricStorage = FileMetricStorage.create(new File(rootDir, "metrics"));

// Setting up log storage
SignalStorage.LogRecord logStorage = FileLogRecordStorage.create(new File(rootDir, "logs"));
```

### Writing data
### Storing data

The data is written in the disk by "ToDisk" exporters, these are exporters that serialize and store the data as received by their processors. If for some reason
the "ToDisk" cannot store data in the disk, they'll delegate the data to their wrapped exporter.

```java
// Creating the SpanExporter of our choice.
SpanExporter mySpanExporter = OtlpGrpcSpanExporter.getDefault();

// Wrapping our exporter with its "ToDisk" exporter.
SpanToDiskExporter toDiskExporter = SpanToDiskExporter.create(mySpanExporter, StorageConfiguration.getDefault(new File("/my/signals/cache/dir")));

// Registering the disk exporter within our OpenTelemetry instance.
SdkTracerProvider myTraceProvider = SdkTracerProvider.builder()
    .addSpanProcessor(SimpleSpanProcessor.create(toDiskExporter))
    .build();
OpenTelemetrySdk.builder()
    .setTracerProvider(myTraceProvider)
    .buildAndRegisterGlobal();
```

While you could manually call your `SignalStorage.write(items)` function, disk buffering
provides convenience exporters that you can use in your OpenTelemetry instance, so
that all signals are automatically stored as they are created.

* For a span storage, use
a [SpanToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/exporters/SpanToDiskExporter.java).
* For a log storage, use
a [LogRecordToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/exporters/LogRecordToDiskExporter.java).
* For a metric storage, use
a [MetricToDiskExporter](src/main/java/io/opentelemetry/contrib/disk/buffering/exporters/MetricToDiskExporter.java).

Each will wrap a signal storage for its respective signal type, as well as an optional callback
to notify when it succeeds, fails, and gets shut down.

```java
// Setting up span to disk exporter
SpanToDiskExporter spanToDiskExporter =
    SpanToDiskExporter.builder(spanStorage).setExporterCallback(spanCallback).build();
// Setting up metric to disk exporter
MetricToDiskExporter metricToDiskExporter =
    MetricToDiskExporter.builder(metricStorage).setExporterCallback(metricCallback).build();
// Setting up log to disk exporter
LogRecordToDiskExporter logToDiskExporter =
    LogRecordToDiskExporter.builder(logStorage).setExporterCallback(logCallback).build();

// Using exporters in your OpenTelemetry instance.
OpenTelemetry openTelemetry =
    OpenTelemetrySdk.builder()
        // Using span to disk exporter
        .setTracerProvider(
            SdkTracerProvider.builder()
                .addSpanProcessor(BatchSpanProcessor.builder(spanToDiskExporter).build())
                .build())
        // Using log to disk exporter
        .setLoggerProvider(
            SdkLoggerProvider.builder()
                .addLogRecordProcessor(
                    BatchLogRecordProcessor.builder(logToDiskExporter).build())
                .build())
        // Using metric to disk exporter
        .setMeterProvider(
            SdkMeterProvider.builder()
                .registerMetricReader(PeriodicMetricReader.create(metricToDiskExporter))
                .build())
        .build();
```

Now when creating signals using your `OpenTelemetry` instance, they will get stored on disk.

### Reading data

In order to read data, we need to create "FromDisk" exporters, which read data from the disk, parse it and delegate it
to their wrapped exporters.
In order to read data, we can iterate through our signal storage objects and forward their
contents to a network exporter, as shown in the example for spans below.

```java
try {
  SpanFromDiskExporter fromDiskExporter = SpanFromDiskExporter.create(memorySpanExporter, storageConfig);
  if (fromDiskExporter.exportStoredBatch(1, TimeUnit.SECONDS)) {
    // A batch was successfully exported and removed from disk. You can call this method for as long as it keeps returning true.
  } else {
    // Either there was no data in the disk or the wrapped exporter returned CompletableResultCode.ofFailure().
  }
} catch (IOException e) {
  // Something unexpected happened.
}
```

```java
// Example of reading and exporting spans from disk
OtlpHttpSpanExporter networkExporter;
Iterator<Collection<SpanData>> spanCollections = spanStorage.iterator();
while (spanCollections.hasNext()) {
  networkExporter.export(spanCollections.next());
}
```

The `File*Storage` iterators delete the previously returned collection when `next()` is called,
on the assumption that if the next collection is requested, the previous one was successfully
consumed.
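That deletion rule implies a consume-then-advance pattern: only ask for the next collection once the current one was exported successfully. The sketch below illustrates this with a plain queue-backed iterator standing in for the real `File*Storage` iterator; the names `DrainStorage` and `drain` are illustrative, not part of the module's API.

```java
// Illustrative sketch: advance the iterator only after a successful export,
// since calling next() discards the batch returned before it.
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

public class DrainStorage {
  static <T> int drain(Iterator<Collection<T>> batches, Predicate<Collection<T>> export) {
    int exported = 0;
    while (batches.hasNext()) {
      Collection<T> batch = batches.next();
      if (!export.test(batch)) {
        break; // export failed: stop instead of advancing past unconsumed data
      }
      exported += batch.size();
    }
    return exported;
  }

  public static void main(String[] args) {
    Deque<Collection<String>> storage = new ArrayDeque<>(List.of(List.of("a", "b"), List.of("c")));
    System.out.println(drain(storage.iterator(), batch -> true)); // prints 3
  }
}
```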

Both the writing and reading processes can run in parallel and they don't overlap
because each is supposed to happen in different files. We ensure that reader and writer don't
accidentally meet in the same file by using the configurable parameters. These parameters set
non-overlapping time frames for each action to be done on a single file at a time. On top of that,
there's a mechanism in place to avoid overlapping on edge cases where the time frames ended but the
resources haven't been released. For that mechanism to work properly, this tool assumes that both
the reading and the writing actions are executed within the same application process.
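The non-overlap guarantee above reduces to a simple invariant: writes to a file stop at the max write age, and reads only start at the min read age, so the two windows are disjoint whenever the former is smaller than the latter. A tiny sanity check of that invariant (a sketch under the documented defaults, not library code):

```java
// Illustrative check: the write window [0, maxWriteAge] and the read window
// [minReadAge, maxReadAge] on a single file cannot overlap as long as
// maxWriteAge < minReadAge.
public class WindowCheck {
  static boolean windowsDisjoint(long maxWriteAgeMs, long minReadAgeMs) {
    return maxWriteAgeMs < minReadAgeMs;
  }

  public static void main(String[] args) {
    // Documented defaults: a 30s write window, and reading starts after 33s.
    System.out.println(windowsDisjoint(30_000L, 33_000L)); // prints true
  }
}
```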

## Component owners

Binary file modified disk-buffering/assets/reading-flow.png
Binary file modified disk-buffering/assets/writing-flow.png

This file was deleted.
