Releases: streamingfast/firehose-core
v1.9.11
> [!WARNING]
> This version contains a thread leak, leading to gradually increasing memory usage; upgrade to v1.9.12.
Substreams improvements v1.15.8
- Rework the execout File read/write to improve memory efficiency:
  - This reduces the RAM usage necessary to read and stream data to the user on tier1, as well as to read the existing execouts on tier2 jobs (in multi-stage scenarios).
  - The cached execouts need to be rewritten to take advantage of this, since their data is currently not ordered: the system will automatically load and rewrite existing execouts when they are used.
  - Code changes include (see the sketch after this list):
    - New FileReader / FileWriter that "read as you go" or "write as you go"
    - No more 'KV' map attached to the File
    - Split the IndexWriter away from its dependencies on execoutMappers
    - Clock distributor now also reads "as you go", using a small "one-block cache"
- Removed the `SUBSTREAMS_OUTPUT_SIZE_LIMIT_PER_SEGMENT` env var (since this is not a RAM issue anymore)
- Add the `uncompressed_egress_bytes` field to the `substreams request stats` log message
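As an illustration of the "write as you go" approach, here is a minimal Go sketch; the `FileWriter`/`WriteItem` names and the length-prefixed layout are assumptions for the example, not the actual substreams types:

```go
package main

import (
	"bufio"
	"encoding/binary"
	"io"
	"log"
	"os"
)

// FileWriter streams items to the underlying writer as they are produced,
// instead of accumulating them in an in-memory 'KV' map and serializing the
// whole file at the end. Peak RAM stays bounded by a single item.
type FileWriter struct {
	w *bufio.Writer
}

func NewFileWriter(dst io.Writer) *FileWriter {
	return &FileWriter{w: bufio.NewWriter(dst)}
}

// WriteItem appends one length-prefixed payload; nothing is retained after
// the call returns, which is what keeps memory usage flat.
func (fw *FileWriter) WriteItem(payload []byte) error {
	var lenBuf [4]byte
	binary.BigEndian.PutUint32(lenBuf[:], uint32(len(payload)))
	if _, err := fw.w.Write(lenBuf[:]); err != nil {
		return err
	}
	_, err := fw.w.Write(payload)
	return err
}

func (fw *FileWriter) Close() error { return fw.w.Flush() }

func main() {
	f, err := os.Create("execout.bin")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	fw := NewFileWriter(f)
	for _, out := range [][]byte{[]byte("block-1"), []byte("block-2")} {
		if err := fw.WriteItem(out); err != nil {
			log.Fatal(err)
		}
	}
	fw.Close()
}
```

Reading "as you go" only works when items are laid out in block order, which is why existing unordered execout files get rewritten on first use.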
Various
- (dstore) Add storageClass query parameter for s3:// URLs on stores (@fschoell) (see the example after this list)
- Update the firehose-beacon proto to include the new Electra spec in the 'well-known' protobuf definitions (@fschoell)
- Use The Graph's Network Registry to recognize chains by their genesis blocks and to fill the 'advertise' information on substreams/firehose servers
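For the storageClass addition above, a hedged example of what such a store URL could look like; the bucket name, region and storage class value are illustrative, and only the `storageClass` parameter itself comes from the release notes:

```go
package main

import (
	"fmt"
	"net/url"
)

func main() {
	// Illustrative: the storageClass query parameter rides on the s3:// store
	// URL alongside existing parameters such as region. Values follow the S3
	// storage class names (e.g. STANDARD_IA, GLACIER).
	raw := "s3://my-bucket/one-blocks?region=us-east-1&storageClass=STANDARD_IA"

	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("bucket:", u.Host)                               // my-bucket
	fmt.Println("storage class:", u.Query().Get("storageClass")) // STANDARD_IA
}
```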
v1.9.10
Substreams improvements v1.15.7
- Tier2 jobs now write mapper outputs "as they progress", preventing memory usage spikes when saving them to disk.
- Tier2 jobs now limit writing and loading mapper output files to a maximum size of 8GiB by default.
- Tier2 jobs now release `existingExecOuts` memory as blocks progress
- Speed up DeleteByPrefix operations on all tiers (5x perf improvement on some heavy substreams)
- Added `SUBSTREAMS_OUTPUT_SIZE_LIMIT_PER_SEGMENT` (int) environment variable to control this new limit (see the sketch after this list)
- Added `SUBSTREAMS_STORE_SIZE_LIMIT` (uint64) env var to allow overwriting the default 1GiB value
- Added `SUBSTREAMS_PRINT_STACK` (bool) env var to enable printing full stack traces when a caught panic occurs
- Added `SUBSTREAMS_DEBUG_API_ADDR` (string) environment variable to expose a "debug API" HTTP interface that allows blocking connections, running GC, and listing or canceling active requests
- Prevent a deterministic failure on a module definition (mode, valueType, updatePolicy) from persisting when the issue is fixed in the substreams.yaml (streamingfast/substreams#621)
- Metering events on tier2 now bundled at the end of the job (prevents sending metering events for failing jobs)
- Added metering for: "processed_blocks" (block * number of stages where execution happened) and "egress_bytes"
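A sketch of how limits like these are typically resolved from the environment; the `envUint64` helper is illustrative, and only the variable names and the 8GiB/1GiB defaults come from the notes above:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envUint64 is an illustrative helper: read a uint64 from the environment,
// falling back to a default when the variable is unset or malformed.
func envUint64(key string, def uint64) uint64 {
	if raw := os.Getenv(key); raw != "" {
		if v, err := strconv.ParseUint(raw, 10, 64); err == nil {
			return v
		}
	}
	return def
}

func main() {
	const gib = uint64(1) << 30

	// Defaults taken from the release notes: 8 GiB per-segment output limit,
	// 1 GiB store size limit.
	outputLimit := envUint64("SUBSTREAMS_OUTPUT_SIZE_LIMIT_PER_SEGMENT", 8*gib)
	storeLimit := envUint64("SUBSTREAMS_STORE_SIZE_LIMIT", 1*gib)

	fmt.Printf("output limit per segment: %d bytes\n", outputLimit)
	fmt.Printf("store size limit: %d bytes\n", storeLimit)
}
```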
v1.9.9
Substreams performance improvements v1.15.4
- (RAM+CPU) Dedupe execution of modules with the same hash but different names when computing the dependency graph (#619) (see the sketch after this list)
- (RAM) Prevent memory usage bursts on tier2 when writing mapper outputs, by streaming protobuf items to the writer
- Tier1 requests will no longer error out with "service currently overloaded" while tier2 servers are ramping up
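A minimal sketch of the dedupe idea from the first bullet: modules are keyed by content hash, so two names sharing a hash execute only once. The `Module` type and `dedupeByHash` helper are illustrative, not the substreams implementation:

```go
package main

import "fmt"

// Module carries a user-facing name and a content hash; two modules with the
// same hash produce identical outputs, so they only need to execute once.
type Module struct {
	Name string
	Hash string
}

// dedupeByHash keeps one representative per hash, so the dependency graph is
// computed (and executed) over unique hashes instead of unique names.
func dedupeByHash(mods []Module) []Module {
	seen := make(map[string]bool, len(mods))
	out := make([]Module, 0, len(mods))
	for _, m := range mods {
		if seen[m.Hash] {
			continue // same code+inputs already scheduled under another name
		}
		seen[m.Hash] = true
		out = append(out, m)
	}
	return out
}

func main() {
	mods := []Module{
		{Name: "map_events", Hash: "abc123"},
		{Name: "map_events_copy", Hash: "abc123"}, // same hash, other name
		{Name: "store_totals", Hash: "def456"},
	}
	fmt.Println(dedupeByHash(mods)) // two unique executions instead of three
}
```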
New 'firehose' reader
- Add `reader-node-firehose`, which creates one-blocks by consuming blocks from an already existing Firehose endpoint. This can be used to set up an indexer stack without having to run an instrumented blockchain node, or to get redundancy from another Firehose provider.
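A sketch of the kind of consumption `reader-node-firehose` performs, assuming the `sf.firehose.v2` gRPC service and the generated Go bindings from `github.com/streamingfast/pbgo`; the endpoint address and block range are placeholders:

```go
package main

import (
	"context"
	"crypto/tls"
	"fmt"
	"io"
	"log"

	pbfirehose "github.com/streamingfast/pbgo/sf/firehose/v2"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

func main() {
	// Connect to an existing Firehose endpoint (address is illustrative).
	conn, err := grpc.Dial("mainnet.eth.streamingfast.io:443",
		grpc.WithTransportCredentials(credentials.NewTLS(&tls.Config{})))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	client := pbfirehose.NewStreamClient(conn)
	stream, err := client.Blocks(context.Background(), &pbfirehose.Request{
		StartBlockNum:   20_000_000,
		StopBlockNum:    20_000_010,
		FinalBlocksOnly: true,
	})
	if err != nil {
		log.Fatal(err)
	}

	for {
		resp, err := stream.Recv()
		if err == io.EOF {
			return
		}
		if err != nil {
			log.Fatal(err)
		}
		// A real reader would decode resp.Block (an anypb.Any) and write a
		// one-block file; here we just show the cursor advancing.
		fmt.Println("got block, cursor:", resp.Cursor)
	}
}
```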
Other
- Bumped grpc-go lib to 1.72.0
- Now building `amd64` and `arm64` Docker images on push & release.
v1.9.8
v1.9.7
- Bump substreams to v1.15.2
- Fix the 'quicksave' feature on substreams (incorrect block hash on quicksave)
v1.9.6
Substreams (v1.15.1)
- Save deterministic failures in WASM in the module cache (under a file named `errors.0123456789.zst` at the failed block number), so further requests depending on this module at the same block can return the error immediately without re-executing the module.
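A sketch of the fail-fast lookup this enables; the `errorCacheFile` helper and the zero-padded pattern are inferred from the `errors.0123456789.zst` example above, not taken from the substreams code:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// errorCacheFile returns the per-block error file name. The zero-padded
// pattern mirrors the "errors.0123456789.zst" example from the notes; the
// helper itself is illustrative.
func errorCacheFile(cacheDir string, blockNum uint64) string {
	return filepath.Join(cacheDir, fmt.Sprintf("errors.%010d.zst", blockNum))
}

func main() {
	path := errorCacheFile("/var/cache/substreams/moduleHash", 123456789)

	// A request hitting the same module at the same block first checks for a
	// cached deterministic failure, and fails fast instead of re-executing.
	if _, err := os.Stat(path); err == nil {
		fmt.Println("cached deterministic failure found:", path)
		return
	}
	fmt.Println("no cached failure, executing module")
}
```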
v1.9.5
Substreams
- Fix a panic when a module times out on tier2 while being executed from cached outputs
- Add environment variables to control retry behavior: "SUBSTREAMS_WORKER_MAX_RETRIES" (default 10) and "SUBSTREAMS_WORKER_MAX_TIMEOUT_RETRIES" (default 2), changing from the previous defaults (720 and 3). The worker_max_timeout_retries value is the number of retries specifically applied to block execution timing out (e.g. because of external calls).
- The mechanism to slow down processing segments "ahead of blocks being sent to the user" has been disabled on "noop-mode" requests, since these requests are used to pre-cache data and should not be slowed down.
- The "number of segments ahead" in this mechanism has been increased from <number of parallel workers> to <number of parallel workers> * 1.5
- Tier2 now returns GRPC error codes: `DeadlineExceeded` when it times out, and `ResourceExhausted` when a request is rejected due to overload
- Tier1 now correctly reports tier2 job outcomes in the `substreams request stats`
- Added jitter in the "retry" logic to prevent all workers from retrying at the same time when tier2 servers are overloaded (see the sketch after this list)
v1.9.4
Substreams (v1.14.5)
- Bugfix for panics on some requests
v1.9.3
🚫 DO NOT USE - Panics on some requests
Substreams (v1.14.4)
- Bugfix: Properly reject requests with a stop-block below the "resolved" StartBlock (caused by module initialBlocks or a chain's firstStreamableBlock)
- Bugfix: Added the `resolved-start-block` to the `substreams request stats` log
v1.9.2
Substreams (v1.14.3)
Bugfixes
- Fixed `runtime error: slice bounds out of range` error on heavy memory usage with the wasmtime engine
- Added a validation on a module for the existence of 'triggering' inputs: the server will now fail with a clear error message when the only available inputs are stores used with mode 'get' (not 'deltas'), instead of silently skipping the module on every block.
Performance
- Added a mechanism for 'production-mode' requests where tier1 will not schedule tier2 jobs more than { max_parallel_subrequests } segments above the current block being streamed to the user. This ensures that a user slowly reading blocks 1, 2, 3... will not trigger a flood of tier2 jobs for much higher blocks, say 300_000_000, that might never get read. (See the sketch below.)
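A sketch of that scheduling gate; `shouldSchedule` and the segment arithmetic are illustrative, with `max_parallel_subrequests` mapped to a plain constant:

```go
package main

import "fmt"

// shouldSchedule gates tier2 job scheduling: in production-mode, a segment
// is only scheduled when it is no more than maxParallelSubrequests segments
// above the segment currently being streamed to the user.
func shouldSchedule(segment, currentSegment, maxParallelSubrequests int) bool {
	return segment <= currentSegment+maxParallelSubrequests
}

func main() {
	const maxParallelSubrequests = 10
	currentSegment := 3 // the user is slowly reading early blocks

	for seg := 0; seg < 20; seg++ {
		if shouldSchedule(seg, currentSegment, maxParallelSubrequests) {
			fmt.Println("schedule tier2 job for segment", seg)
		} else {
			fmt.Println("hold segment", seg, "until the user catches up")
		}
	}
}
```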
Service lifecycle
- Improved connection draining on shutdown: now waits for the end of the 'shutdown-delay' before draining and refusing new connections, then waits for 'quicksaves' and successful signaling of clients, up to a max of 30 sec (see the sketch below).
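A sketch of that drain sequence, using net/http's `Shutdown` as a stand-in for the actual gRPC drain; the 10s `shutdown-delay` value is a placeholder, while the 30s cap comes from the note above:

```go
package main

import (
	"context"
	"log"
	"net/http"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}
	go srv.ListenAndServe()

	// Keep accepting connections for the whole 'shutdown-delay', then stop
	// taking new ones and give in-flight work (quicksaves, client signaling)
	// up to 30 seconds to complete.
	shutdownDelay := 10 * time.Second // illustrative value
	time.Sleep(shutdownDelay)

	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Println("drain timed out:", err)
	}
}
```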
Logging / errors
- Added information about the number of blocks that need to be processed for a given request in the `sf.substreams.rpc.v2.SessionInit` message
- Added an optional field `limit_processed_blocks` to the `sf.substreams.rpc.v2.Request`: when set to a non-zero value, the server will reject a request that would process more blocks than the given value, with the `FailedPrecondition` GRPC error code (see the sketch after this list)
- Improved error messages when a module execution times out on a block (e.g. due to a slow external call); these now return a `DeadlineExceeded` Connect/GRPC error code instead of an Internal one, and 'panic' was removed from the wording
- In the 'substreams request stats' log, added fields: `remote_jobs_completed`, `remote_blocks_processed` and `total_uncompressed_read_bytes`
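A sketch of the `limit_processed_blocks` check described in the second bullet; `checkBlockLimit` is illustrative, while the zero-means-unset semantics and the `FailedPrecondition` code come from the notes:

```go
package main

import (
	"fmt"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// checkBlockLimit mirrors the described behavior: when the optional
// limit_processed_blocks field is non-zero and the resolved range would
// process more blocks than allowed, reject with FailedPrecondition.
func checkBlockLimit(blocksToProcess, limitProcessedBlocks uint64) error {
	if limitProcessedBlocks != 0 && blocksToProcess > limitProcessedBlocks {
		return status.Errorf(codes.FailedPrecondition,
			"request would process %d blocks, above the limit of %d",
			blocksToProcess, limitProcessedBlocks)
	}
	return nil
}

func main() {
	fmt.Println(checkBlockLimit(1_000_000, 0))       // <nil>: no limit set
	fmt.Println(checkBlockLimit(1_000_000, 500_000)) // FailedPrecondition
}
```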