Need two-phase commit of shard block #487

We currently rely on the following assumption:
- If the master has a validated shard block header, then the shard block is persisted in the slave's db

This assumption holds when the cluster is running properly.  However, it can be broken if
- the slave's machine/container crashes: even if the block has been put, it may not be persisted, because rocksdb's fsync is disabled by default (#486)
- the slave persists the block, but the cluster is shut down before the block header is added to the master.  After the cluster restarts, the slave assumes the block has been validated by the cluster when it actually has not

The second issue may be more common.  A fix is to introduce a two-phase commit.  In general, it has two phases
- phase 1: write the block together with a record that the block is not yet committed
- phase 2: after the shard block has been broadcast to all slaves and the master, mark the block as committed

Additionally, after the cluster restarts, any block that is not committed will be scanned and recovered before the cluster becomes active
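
A minimal sketch of the two phases and the restart scan, to make the proposal concrete. Everything here is hypothetical and not taken from the codebase: `ShardBlockStore`, `CommitState`, `prepare`, `commit`, `recover`, and the `master_has_header` callback are illustrative names, and a plain dict stands in for the real db. The recovery policy shown (commit if the master already has the header, otherwise drop the block) is one possible reading of "recovered", not a decided design.

```python
from enum import Enum


class CommitState(Enum):
    PENDING = "pending"      # phase 1: block written, not yet confirmed cluster-wide
    COMMITTED = "committed"  # phase 2: block broadcast to all slaves and the master


class ShardBlockStore:
    """Hypothetical slave-side store; dicts stand in for the real db."""

    def __init__(self):
        self.blocks = {}  # block_hash -> block bytes
        self.states = {}  # block_hash -> CommitState

    def prepare(self, block_hash, block):
        # Phase 1: persist the block together with a "not committed" marker.
        self.blocks[block_hash] = block
        self.states[block_hash] = CommitState.PENDING

    def commit(self, block_hash):
        # Phase 2: call only after the broadcast to slaves and master succeeded.
        if block_hash not in self.blocks:
            raise KeyError("cannot commit a block that was never prepared")
        self.states[block_hash] = CommitState.COMMITTED

    def pending(self):
        # Return a list copy so callers may modify the store while iterating.
        return [h for h, s in self.states.items() if s is CommitState.PENDING]

    def recover(self, master_has_header):
        """Run on restart, before the cluster becomes active.

        Assumed policy: a pending block is committed if the master already
        has its header, and dropped otherwise. Returns the dropped hashes.
        """
        dropped = []
        for block_hash in self.pending():
            if master_has_header(block_hash):
                self.commit(block_hash)
            else:
                del self.blocks[block_hash]
                del self.states[block_hash]
                dropped.append(block_hash)
        return dropped
```

With this shape, a block that was prepared but never committed (for example because the cluster shut down between the phases) shows up in `pending()` after restart, so the slave no longer assumes it was validated by the cluster.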

