Have a way to repair broken remote stores #13833

@arianvp


Describe the bug

Many CI systems offer a local cache behind a simple REST API. Using these as Nix binary caches is problematic, however.

The problem is that they evict old files. If a NAR file gets deleted but the corresponding .narinfo doesn't, the cache ends up in a broken state. It would be great if Nix could detect this somehow.

Similarly, many people would like to set a lifecycle rule on an S3 bucket to delete old files. This doesn't work either, because Nix only checks the .narinfo file to decide whether a store path exists in the cache.
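To make the broken state concrete, here is a minimal sketch (not part of Nix) that scans a local file:// cache directory for .narinfo files whose NAR is gone. It assumes the usual narinfo layout of `Key: value` lines with a `URL` field relative to the cache root; the function names `parse_narinfo` and `find_dangling_narinfos` are made up for illustration.

```python
from pathlib import Path


def parse_narinfo(text: str) -> dict[str, str]:
    """Parse the 'Key: value' lines of a .narinfo file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    return fields


def find_dangling_narinfos(cache_dir: Path) -> list[str]:
    """Return store paths whose .narinfo exists but whose NAR file is missing."""
    dangling = []
    for narinfo in sorted(cache_dir.glob("*.narinfo")):
        fields = parse_narinfo(narinfo.read_text())
        url = fields.get("URL")
        # URL is relative to the cache root, e.g. nar/<hash>.nar.xz
        if url is None or not (cache_dir / url).is_file():
            dangling.append(fields.get("StorePath", narinfo.name))
    return dangling
```

Running `find_dangling_narinfos(Path("cache1"))` after a cache invalidator has deleted NAR files would list every store path that Nix still believes is present.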

Steps To Reproduce

Let's copy a path to a store, then remove the NAR files, simulating a dumb cache invalidator.

$ nix copy --to file://$PWD/cache1 /nix/store/8gndzbdrkc4a66s14yhk5m0jlfypji0p-nginx-1.28.0 -vvv --narinfo-cache-positive-ttl 0 --narinfo-cache-negative-ttl 0
$ rm ./cache1/nar/*

Now let's remove the path from the local store and try to re-substitute it. This fails:

$ nix-collect-garbage
$ nix copy --from file://$PWD/cache1 /nix/store/8gndzbdrkc4a66s14yhk5m0jlfypji0p-nginx-1.28.0 --narinfo-cache-positive-ttl 0 --narinfo-cache-negative-ttl 0
error: file 'nar/0xncsxrn44zan3935czdg6zr4jdagk3dzq9akc7isr0k4h6jl6d2.nar.xz' does not exist in binary cache

Now rerun nix copy --to to try to fix it:

$ nix copy --to file://$PWD/cache1 /nix/store/8gndzbdrkc4a66s14yhk5m0jlfypji0p-nginx-1.28.0 -vvv  --narinfo-cache-positive-ttl 0 --narinfo-cache-negative-ttl 0
querying info about '/nix/store/133d066jnk2bqx8k61r44x9kg8yr6qn9-libxslt-1.1.43' on 'file:///Users/arian/scratch/cache1'...
querying info about '/nix/store/3g3sgwpnig8sd0w9wbs7d5gy512r43cq-bash-5.3p3' on 'file:///Users/arian/scratch/cache1'...
querying info about '/nix/store/8gbj2w4yr605h1az4lhw0jfvdcmq6f8m-zlib-ng-2.2.4' on 'file:///Users/arian/scratch/cache1'...
querying info about '/nix/store/8gndzbdrkc4a66s14yhk5m0jlfypji0p-nginx-1.28.0' on 'file:///Users/arian/scratch/cache1'...
querying info about '/nix/store/8kzqz379ffbx14kd8ipr9s5s14ldih1r-gettext-0.25.1' on 'file:///Users/arian/scratch/cache1'...
querying info about '/nix/store/a1h94bqvw9213n7ax3cp6a5c7ly3gy2f-pcre2-10.44' on 'file:///Users/arian/scratch/cache1'...
querying info about '/nix/store/l619460n5l5jgg2y3b3y6k11q4l462d5-libcxx-19.1.7' on 'file:///Users/arian/scratch/cache1'...
querying info about '/nix/store/pff8c9a4l487r0jaipibq21mkww5f1yw-libiconv-109' on 'file:///Users/arian/scratch/cache1'...
querying info about '/nix/store/qz8sy946bfh14jb90wagc43q5clnzxp4-libxml2-2.14.5' on 'file:///Users/arian/scratch/cache1'...
querying info about '/nix/store/vhpaiaa94mqzivqkzjrad6af72irp89j-openssl-3.5.1' on 'file:///Users/arian/scratch/cache1'...
copying 0 paths...

Expected behavior

The second nix copy should "fix" the binary cache, but it doesn't. To decide whether a store path exists in a binary cache, Nix only looks at the .narinfo file. I think it should also check whether the NAR files that the .narinfo points to exist. We could do a GET on the .narinfo, then HEAD requests on all the NARs it references, and re-upload any that are missing.
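The proposed check could look roughly like the sketch below. This is illustrative Python, not how Nix implements its binary cache store; `needs_upload`, `http_get_text`, and `http_head_ok` are hypothetical names, and it assumes the cache answers a missing object with HTTP 404.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_get_text(url: str) -> Optional[str]:
    """GET a URL and return its body, or None if the server answers 404."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def http_head_ok(url: str) -> bool:
    """HEAD a URL; True if it exists, False if the server answers 404."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def needs_upload(
    cache_url: str,
    hash_part: str,
    get: Callable[[str], Optional[str]] = http_get_text,
    head: Callable[[str], bool] = http_head_ok,
) -> bool:
    """Decide whether a store path must be (re-)uploaded to the cache."""
    base = cache_url.rstrip("/")
    narinfo = get(f"{base}/{hash_part}.narinfo")
    if narinfo is None:
        return True  # not in the cache at all
    for line in narinfo.splitlines():
        if line.startswith("URL: "):
            nar_url = line[len("URL: "):].strip()
            # The .narinfo exists; the path is only usable if its NAR does too.
            return not head(f"{base}/{nar_url}")
    return True  # a .narinfo without a URL field is treated as broken
```

The cost is one extra HEAD request per path that already has a .narinfo, so a real implementation would probably want this behind an opt-in flag or a dedicated repair command.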
