Database corruption during restarts since switching to upstream v1.14.7 #360

@sebastianst

We've recently merged in upstream v1.14.7 and during a sequencer restart, the following errors popped up:

t=2024-08-13T15:18:31+0000 lvl=info msg="Loaded most recent local block" number=8539128 hash=0x3cef05fd474e7e7dc7eaff71e135604e47985a394a59b8dccd4fe408f3c2888b td=0 age=7s
t=2024-08-13T15:18:31+0000 lvl=info msg="Loaded most recent local finalized block" number=8538465 hash=0x810a9d113721b03303e0daffd8a6e6edb62e7baf6bb1481fee6c586959d4112a td=0 age=22m13s
t=2024-08-13T15:18:31+0000 lvl=warn msg="Head state missing, repairing" number=8539128 hash=0x3cef05fd474e7e7dc7eaff71e135604e47985a394a59b8dccd4fe408f3c2888b snaproot=0xc8380132dab17c437b74e60250d0e854e15af86a97bd3158d79caf6eee78540e
t=2024-08-13T15:18:36+0000 lvl=info msg="Rewound to block with state" number=8460000 hash=0xe2d7f61e7b3f63a8e57662e28a22a0092b03fd6b5ca135ca7e9322fa9b3ca589
t=2024-08-13T15:19:29+0000 lvl=error msg="Error in block freeze operation" err="canonical hash missing, can't freeze block 8534620"
t=2024-08-13T15:20:29+0000 lvl=error msg="Error in block freeze operation" err="canonical hash missing, can't freeze block 8530106"
t=2024-08-13T15:21:29+0000 lvl=error msg="Error in block freeze operation" err="canonical hash missing, can't freeze block 8525615"
t=2024-08-13T15:22:29+0000 lvl=error msg="Error in block freeze operation" err="canonical hash missing, can't freeze block 8521225"
t=2024-08-13T15:23:29+0000 lvl=error msg="Error in block freeze operation" err="canonical hash missing, can't freeze block 8516828"
t=2024-08-13T15:24:29+0000 lvl=error msg="Error in block freeze operation" err="canonical hash missing, can't freeze block 8512200"
t=2024-08-13T15:25:29+0000 lvl=error msg="Error in block freeze operation" err="canonical hash missing, can't freeze block 8507555"
t=2024-08-13T15:26:29+0000 lvl=error msg="Error in block freeze operation" err="canonical hash missing, can't freeze block 8503264"
... (repeating every minute)

So at startup, the safe and finalized blocks are reset to genesis, and with its current behavior op-node then walks back to genesis. We fixed this locally by shutting down op-node and running op-wheel engine rewind --set-head --to 8460000, but the cause of this database corruption isn't clear yet.
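
For anyone triaging similar logs, here is a minimal sketch that pulls the rewind target and the failing freeze targets out of the lines above and compares them. It assumes the logfmt-style layout shown in this issue. The helper names parse_logfmt and summarize are hypothetical and not part of op-geth or op-wheel, and the sample is a subset of the lines quoted above.

```python
import re
import shlex

# A subset of the log lines quoted in this issue, verbatim.
LOG = """\
t=2024-08-13T15:18:36+0000 lvl=info msg="Rewound to block with state" number=8460000 hash=0xe2d7f61e7b3f63a8e57662e28a22a0092b03fd6b5ca135ca7e9322fa9b3ca589
t=2024-08-13T15:19:29+0000 lvl=error msg="Error in block freeze operation" err="canonical hash missing, can't freeze block 8534620"
t=2024-08-13T15:20:29+0000 lvl=error msg="Error in block freeze operation" err="canonical hash missing, can't freeze block 8530106"
t=2024-08-13T15:21:29+0000 lvl=error msg="Error in block freeze operation" err="canonical hash missing, can't freeze block 8525615"
"""


def parse_logfmt(line):
    """Split a key=value log line into a dict; quoted values keep their spaces."""
    return dict(tok.partition("=")[::2] for tok in shlex.split(line))


def summarize(lines):
    """Return (rewind target, list of block numbers the freezer failed on)."""
    rewound = None
    failed = []
    for line in lines:
        rec = parse_logfmt(line)
        msg = rec.get("msg")
        if msg == "Rewound to block with state":
            rewound = int(rec["number"])
        elif msg == "Error in block freeze operation":
            m = re.search(r"can't freeze block (\d+)", rec.get("err", ""))
            if m:
                failed.append(int(m.group(1)))
    return rewound, failed


rewound, failed = summarize(LOG.splitlines())
# Distance between consecutive failing freeze targets.
gaps = [a - b for a, b in zip(failed, failed[1:])]

print("rewound to:", rewound)
print("failed freeze targets:", failed)
print("all above rewound head:", all(n > rewound for n in failed))
print("gap between attempts:", gaps)
```

On this excerpt, every block the freezer fails on is above the rewound head 8460000, and the target drops by roughly 4,500 blocks per attempt. That is consistent with the freezer looking for canonical hashes in the range removed by the rewind, but it does not by itself explain why the head state went missing.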
