Skip to content

Conversation

@technoweenie
Copy link
Contributor

Fixes #2639 by ensuring files are added to the git index with a path relative to the current working directory.

@technoweenie technoweenie merged commit 12a5c5b into master Oct 3, 2017
@technoweenie technoweenie deleted the add-index-rel-path branch October 3, 2017 17:33
chrisd8088 added a commit that referenced this pull request Oct 16, 2025
In PR #527 we introduced the "git lfs checkout" and "git lfs pull"
commands and added some initial tests of those commands to our test
suite, which we have subsequently expanded over time.  Before we adjust
how these commands check and creates files, we want to further revise
and expand their test suites to validate a broader range of conditions.

In the initial "checkout" test of the t/t-checkout.sh test script we
expand two sets of tests, both those we perform within a subdirectory
of the current working tree, and those we perform with various "glob"
file pattern arguments.

After changing the working directory to a subdirectory of the working
tree, at present the test suite only checks the behaviour of the
"git lfs checkout" command when passed either a single relative file
name and or the "." current directory name.  We now add similar checks
using first a bare ".." parent directory name and then using a relative
path pattern to a sibling subdirectory combined with a trailing "/**"
component, following the gitignore(5) rules we support for pattern
matching.

We also run the "git lfs checkout" command in the root of the working
tree and check that it behaves as expected when passed the name of
a subdirectory which already exists, and when that subdirectory's
name is followed by a trailing "/**" pattern component.

When the "git lfs checkout" command internally converts its file path
pattern arguments into file path patterns that are relative to the root
of the repository, rather being relative to the current working directory,
it appends a trailing "/" character if the pattern is actually just the
path to a directory (or a symbolic link to a directory).  This change is
implemented in the Convert() method of the currentToRepoPatternConverter
structure in our "lfs" package, and was added in commit
56abb71 of PR #4556 so as to ensure
that arguments which resolve to specific directories are properly
converted into patterns that match both the directories and all of
their contents.

Note, though, when we run the "git lfs checkout" command with a "."
directory name argument while in a subdirectory of the working tree,
the internal addition of a trailing "/" character does not actually
affect the success of the command.  This is because the Convert()
of the currentToRepoPatternConverter structure first creates a path
relative to the root of the repository, which necessarily includes
the subdirectory's name as its final component.  When used as a
path pattern according to the gitignore(5) rules, then even without
a trailing "/" character the pattern will still match the subdirectory
and therefore its contents.

On the other hand, the internal additional of a trailing "/" character
does affect the success of the "git lfs checkout" command when we
run the command in a subdirectory and provide a ".." argument.  In this
case, the Convert() method first converts the path into one relative to
the root of the repository, which is simply the "." directory name, the
same as if the user had supplied a "." argument to a "git lfs checkout"
command run in the root of the working tree.

Although "." is a valid file path, when treated as a pattern according
to the gitignore(5) rules, it does not work to match a directory and
all of its contents.  To allow users to provide simple a "." argument
and have it work as expected, the Convert() method first appends the
"/" character, and then strips any leading "./" path component.  If the
result is an empty path, it is replaced with the "**" pattern component
so the final pattern matches everything in the root of the repository.

Thus our new check in the initial "checkout" test of the t/t-checkout.sh
script where we run the "git lfs checkout" command in a subdirectory and
pass a ".." argument, we validate the behaviour described above.  The
check would fail unless the command converted the ".." argument into
a "." path relative to the root of the repository, and into into a
"./" path, and then into the "**" pattern component.

In parallel with our changes to the t/t-checkout.sh test script, we
make a number of changes to the initial "pull" test of the t/t-pull.sh
test script to ensure that test performs all its checks under a common
set of conditions.  We will further enlarge this set of conditions in
subsequent commits.

First, we correct a typo in the refute_server_object() assertion call
that was added to the "pull" test in commit
7158e3b of PR #2641.  This commit
revised the "git lfs pull" and "git lfs checkout" commands so that the
paths they passed to the "git update-index" command were relative to
the current working directory.  To confirm this new behaviour worked
as expected, the "pull" test was updated so it creates a third test
file named "dir/dir.dat", and some of the checks in the test were then
revised to reference this new file.  However, the refute_server_object()
assertion which is intended to prove that the corresponding Git LFS
object does not exist on the test remote server passes an incorrect
variable name, so we fix that now.

Because the "dir/dir.dat" file and corresponding local Git LFS object
file are not comprehensively referenced in all the checks of the
"git lfs pull" command, we add the appropriate assertions for it into
those checks.  (As well, we revise several invocations of the rm(1)
command to align with others in the same test and elsewhere.)

The same commit 7158e3b also extended
the "pull" test to run the "git lfs pull" command with its -I option
under several conditions, but without removing the local Git LFS object
files and working tree files, or checking that they have been re-created
after the "git lfs pull" command is run.  As a consequence, these checks
effectively only confirm that the command does not fail.

To rectify this limitation we delete all the relevant local files
before each of these checks, and then verify that they have been
restored by the "git lfs pull" command.  We also run the "git lfs pull"
command with a more restricted set of file path patterns to confirm that
files which do not match those patterns are not fetched from the remote
server; we then run the command again to fetch those files.  Previously
our checks did not actually test the file path filtering capability of
the command.

As well, we use a file path pattern which is relative to the root of
the repository, so that when we run the "git lfs pull" command in a
subdirectory of the current working tree and pass this pattern, our test
now ascertains that patterns specified with the -I option are correctly
matched with file paths starting from the root of the repository.
(For the moment we do not check that the "git lfs pull" command accepts
"glob" pattern components such as "**", but in a subsequent commit we
will add patterns of this type to the arguments we pass to the -I option.)

Making these changes, however, exposes another concern which would cause
the "pull: with missing object" test to fail unless we adjust it as well.
This test appears later in the same t/t-pull.sh test script, and depends
on the repository clone created by the "pull" test in the local "clone"
test directory.  (Ideally, our tests should not depend on each other in
this way, so that the failure of one test doesn't cause a series of
otherwise unrelated tests to also fail.  For the moment, though, we
defer this issue to a future PR.)

The "pull: with missing object" test was introduced in commit
68ac0f5 of PR #2237 and checks that if
a Git LFS object is missing both locally and on the remote server, the
"git lfs pull" command should report the missing object and return a
non-zero exit code, but should retrieve all other requested Git LFS
objects.  To simulate this condition, the test removes the Git LFS object
corresponding to the "a.dat" test file in the local storage directories
and from the remote test server, and then runs the "git lfs pull"
command.

However, the test does not remove the "a.dat" file from the working
tree, and as noted above, is dependent on the state of the cloned
repository left by the initial "pull" test.  These two factors combine
to make the test vulnerable to transient failures, particularly now
that we have revised the "pull" test such that the final "git lfs pull"
commands in that test re-create all the files in the working tree,
except that we have also changed the "pull" test to drop the commit
which adds an empty file, as we explain below.

In commit d97c785 of PR #4654 the
"pull" test was revised to commit an empty file which matches the
Git LFS file patterns, and then confirm that the "git lfs pull" command
handles that case without trying to fetch the object.  This check
was added before the final checks of the "git lfs pull" command, which
occur within a subdirectory of the working tree.  As a result, those
checks used to run with the empty file as one of the Git LFS objects,
and so did the "pull: with missing object" test.

In these instances, when "git lfs pull" runs, it finds the empty file
matches the defined Git LFS file patterns, and is considered to exist
locally by the LFSObjectExists() method of the Configuration structure
in our "config" package.  Therefore the command passes the object's
pointer data to the Run() method of the singleCheckout structure,
which invokes the DecodePointerFromFile() function in our "lfs" package.
That function returns a successful result because it detects that the
pointer refers to a zero-byte object, and so the RunToPath() method
of the singleCheckout structure is executed, and then the file's path
is passed to the Add() method of the gitIndexer structure.  That
method starts a "git update-index" command and writes the file's path
to the Git command on its standard input file descriptor.

We start the "git update-index" command with the --refresh option, so
it reviews all of its cached index entries, including the one for the
"a.dat" file.  If that file in the working tree has the same modification
timestamp as the Git index file (at least to within the same second,
assuming Git was not compiled with the USE_NSEC macro definition)
then Git considers the file to potentially be "racily clean", because
it's file size and modification timestamp are insufficient to determine
whether the file has changed since the index entry was cached.

The "git update-index" command therefore reads the contents of the
"a.dat" file in the working tree. Because the file's path matches a Git
LFS file pattern, the data is streamed to the "git-lfs-filter-process"
command, which regenerates a new object file in the local storage
directories under ".git/lfs/objects".

Unfortunately, the "pull: with missing object" test starts by removing
the local and remote copies of the "a.dat" file's Git LFS object file,
and expects that running the "git lfs pull" command will not result
in a new local copy being generated, as none can be fetched from the
remote test server.  But if the "pull" test leaves the "a.dat" file
in the working tree with an identical timestamp as the Git index file,
and has also committed the "empty.dat" file to the Git history,
then as described above, the "git lfs pull" invokes "git update-index"
with the --refresh option, which finds the "a.dat" file and re-creates
the local Git LFS object file, and so the test will fail.

Note that without the presence of an empty file in the commit history,
the "git lfs pull" command does not find any local Git LFS object files,
and so it calls the Run() method of the singleCheckout structure after
fetching the available objects from the remote server.  The Run() method
then executes the DecodePointerFromFile() function, which returns an
error because the file in the working tree is not a raw Git LFS pointer,
and so the Run() method skips calling the RunToPath() method and
also does not pass the file's path to the "git update-index" command.
In fact, the "git update-index" command will not be started at all,
as there are no files in the working tree which could be considered to
contain raw Git LFS pointer data, like the empty file is.  Since the
"git update-index" command does not run, the "git-lfs-filter-process"
command is also never invoked, and so a local Git LFS object file is
never re-created for the "a.dat" file, and the test succeeds.

For this reason, we change the "pull" test so that it removes the
"empty.dat" file from the Git commit history after performing the
appropriate check of the "git lfs pull" command's behaviour with
regard to empty files.

To further guard against a potential regression of our tests and
ensure the "pull: with missing object" test passes, we update that
test to remove the "a.dat" file from the working tree as well
as removing its Git LFS object files from both local and remote
storage.  Without the "a.dat" file in the working tree, it now becomes
impossible for the local object file to be re-created under any
circumstances.

Finally, we strengthen our check of the log message output by the
"git lfs pull" command so that we require a specific "does not exist"
message rather than just the raw OID of the missing Git LFS object.
chrisd8088 added a commit that referenced this pull request Oct 16, 2025
After our "git lfs checkout" and "git lfs pull" commands write the
contents of a Git LFS object into a file in the current working tree,
they pass the file's relative path to the "git update-index" command,
so the entries in Git's index for these files reflect their current
state.

If we did not run the "git update-index" command, Git commands like
"git diff-index" would report the files we updated as having been
modified in the working tree.

Notably, though, the "git status" command would normally not report the
files as modified, because it refreshes the Git index in the same way as
the "git update-index" command we run.  We invoke the "git update-index"
command with the --refresh option, so it runs the refresh_index()
function, which is also executed by the "git status" command.

The refresh_index() function checks the state of all the cached entries
in Git's index, and so it will detect that the files in the working
tree have been modified, read their content and pass it to the
"git lfs filter-process" command, which returns the same Git LFS
pointer data as should already be in the Git index.  The end result
is that the index entries remain unchanged, but the cached modification
timestamps are updated.

As a consequence, even if we never invoked the "git update-index"
command while running our "git lfs checkout" and "git lfs pull" commands,
a subsequent "git status" command would typically not show the working
tree files we update as modified.

However, in our tests of the "git lfs pull" command, we rely exclusively
on the "git status" command to try to verify that the "git update-index"
command has been executed by the "git lfs pull" command.  Moreover, the
assert_clean_status() test helper function we invoke to run the
"git status" command has a bug which means it always succeeds, even
if the "git status" command were to report that the current working
tree was not clean.

We added the assert_clean_status() assertion function in commit
7158e3b of PR #2641, along with
calls to that function in the initial "pull" test of what is now
our t/t-pull.sh test script.  In the same commit we also revised the
"git lfs checkout" and "git lfs pull" commands so that they pass paths
relative to the current working directory to the "git update-index"
command, rather than paths relative to the root of the repository.

This latter change resolved the problem reported in #2639, because
by passing file paths relative to the repository root instead of the
current working directory to the "git update-index" command, the
"git lfs checkout" and "git lfs pull" commands were leaving the Git
index and working tree in a confusing state.

At the time when #2639 was reported, running "git lfs pull" in a
subdirectory might result in a subsequent "git status" command finding
modifications to both the Git index and the working tree.  To check
that the problem was resolved, in PR #2641 we introduced the
assert_clean_status() test helper function, and expanded the "pull"
test to call this function after performing several "git lfs pull"
commands in subdirectories of the working tree.

Unfortunately, the assert_clean_status() function does not behave as
an assertion, because it always returns a successful (i.e., zero) exit
code.  The function tests whether the output from a "git status" command
includes the message "working tree clean", and if not, reports the
command's output along with that from a "git lfs status" command.
However, the function does not cause the current test's sub-shell to
stop execution and return a non-zero exit code to signal a test
failure condition.

We address this problem now by adding a call to the "exit" shell built-in
command with a non-zero exit code at the end of the brace-delimited
group command which executes if the assert_clean_status() function
does not find the "working tree clean" message in the output from the
"git status" command.  This change ensures that if Git determines that
the working tree is not clean, our assert_clean_status() function will
cause the test which called it to fail.

As well, we update the assert_clean_status() function so that when
checking for the "working tree clean" message in the output of the
"git status" command, it will also accept the older "working directory
clean" message output by the "git status" command prior to Git v2.10.0.
This allows our assertion function to succeed and the "pull" test to pass
in our GitHub Actions CI jobs that run our test suite against Git v2.0.0,
the oldest Git version we currently support.

As a further problem, though, checking that a "git status" command
finds a clean working tree is no longer sufficient to guarantee that
the "git lfs pull" command has actually called the "git update-index"
command with the appropriate file paths.  To be certain that the Git LFS
client has invoked the "git update-index" command and passed valid file
paths, we also need to check the output of the "git diff-index" command,
and do so before we run the "git status" command, which will cause Git
to reset the file modification timestamps it has cached.

We therefore define a new assert_clean_worktree() test helper function
which runs a "git diff-index HEAD" command and confirms no unexpected
changes are detected in the working tree, and we call this function at
the start of the assert_clean_status() function before the "git status"
command is executed.  This ensures that we check the state of the working
tree before the "git status" command refreshes the file modification
timestamps Git has previously cached.  As noted above, if we reversed
the order of these checks, the "git diff-index" command would always
return an empty list and so our check of the working tree's state would
be defeated.

We also define a new assert_clean_worktree_with_exceptions() test helper
function, which acts like the assert_clean_worktree() function but filters
the output of the "git diff-index HEAD" command with an extended regular
expression pattern provided to the function as its only argument.  This
allows us to use the assert_clean_worktree_with_exceptions() function in
instances where we expect certain files in the working tree to be absent
or modified and so we want to ignore them when they appear in the
"git diff-index" command's output.

We then update the initial "pull" test in the t/t-pull.sh test script
so that it calls either the assert_clean_status() function or the
assert_clean_worktree_with_exceptions() function after each invocation
of the "git lfs pull" command, thereby making the checks in this test
as thorough and consistent with each other as possible.

Finally, we update the initial "checkout" test in the t/t-checkout.sh
test script so that it makes use of the assert_clean_status() and
assert_clean_worktree_with_exceptions() functions in the same manner
as the "pull" test in the t/t-pull.sh script now does.  Both tests
perform similar operations, and like the "git lfs pull" command, the
"git lfs checkout" command also invokes the "git update-index" command.
However, the "checkout" test was not updated in PR #2641 when the
assert_clean_status() was introduced and the "pull" test was updated
to call it.

Since we want our "checkout" test to verify that the "git lfs checkout"
command successfully runs the "git update-index" command, we update the
test so that it mirrors the "pull" test and consistently uses the
assert_clean_worktree() and assert_clean_worktree_with_exceptions()
functions to check the state of the working tree.

To ensure that the "checkout" test still passes, though, we also need to
adjust the test so that it removes any log files it creates because the
assert_clean_status() function now behaves as was originally intended and
will cause the calling test to fail if the working tree is not clean.
chrisd8088 added a commit that referenced this pull request Oct 16, 2025
In PR #527 we introduced the "git lfs checkout" and "git lfs pull"
commands and added some initial tests of those commands to our test
suite, which we have subsequently expanded over time.  Before we adjust
how these commands check and creates files, we want to further revise
and expand their test suites to validate a broader range of conditions.

In the initial "checkout" test of the t/t-checkout.sh test script we
expand two sets of tests, both those we perform within a subdirectory
of the current working tree, and those we perform with various "glob"
file pattern arguments.

After changing the working directory to a subdirectory of the working
tree, at present the test suite only checks the behaviour of the
"git lfs checkout" command when passed either a single relative file
name and or the "." current directory name.  We now add similar checks
using first a bare ".." parent directory name and then using a relative
path pattern to a sibling subdirectory combined with a trailing "/**"
component, following the gitignore(5) rules we support for pattern
matching.

We also run the "git lfs checkout" command in the root of the working
tree and check that it behaves as expected when passed the name of
a subdirectory which already exists, and when that subdirectory's
name is followed by a trailing "/**" pattern component.

When the "git lfs checkout" command internally converts its file path
pattern arguments into file path patterns that are relative to the root
of the repository, rather being relative to the current working directory,
it appends a trailing "/" character if the pattern is actually just the
path to a directory (or a symbolic link to a directory).  This change is
implemented in the Convert() method of the currentToRepoPatternConverter
structure in our "lfs" package, and was added in commit
56abb71 of PR #4556 so as to ensure
that arguments which resolve to specific directories are properly
converted into patterns that match both the directories and all of
their contents.

Note, though, when we run the "git lfs checkout" command with a "."
directory name argument while in a subdirectory of the working tree,
the internal addition of a trailing "/" character does not actually
affect the success of the command.  This is because the Convert()
of the currentToRepoPatternConverter structure first creates a path
relative to the root of the repository, which necessarily includes
the subdirectory's name as its final component.  When used as a
path pattern according to the gitignore(5) rules, then even without
a trailing "/" character the pattern will still match the subdirectory
and therefore its contents.

On the other hand, the internal additional of a trailing "/" character
does affect the success of the "git lfs checkout" command when we
run the command in a subdirectory and provide a ".." argument.  In this
case, the Convert() method first converts the path into one relative to
the root of the repository, which is simply the "." directory name, the
same as if the user had supplied a "." argument to a "git lfs checkout"
command run in the root of the working tree.

Although "." is a valid file path, when treated as a pattern according
to the gitignore(5) rules, it does not work to match a directory and
all of its contents.  To allow users to provide simple a "." argument
and have it work as expected, the Convert() method first appends the
"/" character, and then strips any leading "./" path component.  If the
result is an empty path, it is replaced with the "**" pattern component
so the final pattern matches everything in the root of the repository.

Thus our new check in the initial "checkout" test of the t/t-checkout.sh
script where we run the "git lfs checkout" command in a subdirectory and
pass a ".." argument, we validate the behaviour described above.  The
check would fail unless the command converted the ".." argument into
a "." path relative to the root of the repository, and into into a
"./" path, and then into the "**" pattern component.

In parallel with our changes to the t/t-checkout.sh test script, we
make a number of changes to the initial "pull" test of the t/t-pull.sh
test script to ensure that test performs all its checks under a common
set of conditions.  We will further enlarge this set of conditions in
subsequent commits.

First, we correct a typo in the refute_server_object() assertion call
that was added to the "pull" test in commit
7158e3b of PR #2641.  This commit
revised the "git lfs pull" and "git lfs checkout" commands so that the
paths they passed to the "git update-index" command were relative to
the current working directory.  To confirm this new behaviour worked
as expected, the "pull" test was updated so it creates a third test
file named "dir/dir.dat", and some of the checks in the test were then
revised to reference this new file.  However, the refute_server_object()
assertion which is intended to prove that the corresponding Git LFS
object does not exist on the test remote server passes an incorrect
variable name, so we fix that now.

Because the "dir/dir.dat" file and corresponding local Git LFS object
file are not comprehensively referenced in all the checks of the
"git lfs pull" command, we add the appropriate assertions for it into
those checks.  (As well, we revise several invocations of the rm(1)
command to align with others in the same test and elsewhere.)

The same commit 7158e3b also extended
the "pull" test to run the "git lfs pull" command with its -I option
under several conditions, but without removing the local Git LFS object
files and working tree files, or checking that they have been re-created
after the "git lfs pull" command is run.  As a consequence, these checks
effectively only confirm that the command does not fail.

To rectify this limitation we delete all the relevant local files
before each of these checks, and then verify that they have been
restored by the "git lfs pull" command.  We also run the "git lfs pull"
command with a more restricted set of file path patterns to confirm that
files which do not match those patterns are not fetched from the remote
server; we then run the command again to fetch those files.  Previously
our checks did not actually test the file path filtering capability of
the command.

As well, we use a file path pattern which is relative to the root of
the repository, so that when we run the "git lfs pull" command in a
subdirectory of the current working tree and pass this pattern, our test
now ascertains that patterns specified with the -I option are correctly
matched with file paths starting from the root of the repository.
(For the moment we do not check that the "git lfs pull" command accepts
"glob" pattern components such as "**", but in a subsequent commit we
will add patterns of this type to the arguments we pass to the -I option.)

Making these changes, however, exposes another concern which would cause
the "pull: with missing object" test to fail unless we adjust it as well.
This test appears later in the same t/t-pull.sh test script, and depends
on the repository clone created by the "pull" test in the local "clone"
test directory.  (Ideally, our tests should not depend on each other in
this way, so that the failure of one test doesn't cause a series of
otherwise unrelated tests to also fail.  For the moment, though, we
defer this issue to a future PR.)

The "pull: with missing object" test was introduced in commit
68ac0f5 of PR #2237 and checks that if
a Git LFS object is missing both locally and on the remote server, the
"git lfs pull" command should report the missing object and return a
non-zero exit code, but should retrieve all other requested Git LFS
objects.  To simulate this condition, the test removes the Git LFS object
corresponding to the "a.dat" test file in the local storage directories
and from the remote test server, and then runs the "git lfs pull"
command.

However, the test does not remove the "a.dat" file from the working
tree, and as noted above, is dependent on the state of the cloned
repository left by the initial "pull" test.  These two factors combine
to make the test vulnerable to transient failures, particularly now
that we have revised the "pull" test such that the final "git lfs pull"
commands in that test re-create all the files in the working tree,
except that we have also changed the "pull" test to drop the commit
which adds an empty file, as we explain below.

In commit d97c785 of PR #4654 the
"pull" test was revised to commit an empty file which matches the
Git LFS file patterns, and then confirm that the "git lfs pull" command
handles that case without trying to fetch the object.  This check
was added before the final checks of the "git lfs pull" command, which
occur within a subdirectory of the working tree.  As a result, those
checks used to run with the empty file as one of the Git LFS objects,
and so did the "pull: with missing object" test.

In these instances, when "git lfs pull" runs, it finds the empty file
matches the defined Git LFS file patterns, and is considered to exist
locally by the LFSObjectExists() method of the Configuration structure
in our "config" package.  Therefore the command passes the object's
pointer data to the Run() method of the singleCheckout structure,
which invokes the DecodePointerFromFile() function in our "lfs" package.
That function returns a successful result because it detects that the
pointer refers to a zero-byte object, and so the RunToPath() method
of the singleCheckout structure is executed, and then the file's path
is passed to the Add() method of the gitIndexer structure.  That
method starts a "git update-index" command and writes the file's path
to the Git command on its standard input file descriptor.

We start the "git update-index" command with the --refresh option, so
it reviews all of its cached index entries, including the one for the
"a.dat" file.  If that file in the working tree has the same modification
timestamp as the Git index file (at least to within the same second,
assuming Git was not compiled with the USE_NSEC macro definition)
then Git considers the file to potentially be "racily clean", because
it's file size and modification timestamp are insufficient to determine
whether the file has changed since the index entry was cached.

The "git update-index" command therefore reads the contents of the
"a.dat" file in the working tree. Because the file's path matches a Git
LFS file pattern, the data is streamed to the "git-lfs-filter-process"
command, which regenerates a new object file in the local storage
directories under ".git/lfs/objects".

Unfortunately, the "pull: with missing object" test starts by removing
the local and remote copies of the "a.dat" file's Git LFS object file,
and expects that running the "git lfs pull" command will not result
in a new local copy being generated, as none can be fetched from the
remote test server.  But if the "pull" test leaves the "a.dat" file
in the working tree with an identical timestamp as the Git index file,
and has also committed the "empty.dat" file to the Git history,
then as described above, the "git lfs pull" invokes "git update-index"
with the --refresh option, which finds the "a.dat" file and re-creates
the local Git LFS object file, and so the test will fail.

Note that without the presence of an empty file in the commit history,
the "git lfs pull" command does not find any local Git LFS object files,
and so it calls the Run() method of the singleCheckout structure after
fetching the available objects from the remote server.  The Run() method
then executes the DecodePointerFromFile() function, which returns an
error because the file in the working tree is not a raw Git LFS pointer,
and so the Run() method skips calling the RunToPath() method and
also does not pass the file's path to the "git update-index" command.
In fact, the "git update-index" command will not be started at all,
as there are no files in the working tree which could be considered to
contain raw Git LFS pointer data, like the empty file is.  Since the
"git update-index" command does not run, the "git-lfs-filter-process"
command is also never invoked, and so a local Git LFS object file is
never re-created for the "a.dat" file, and the test succeeds.

For this reason, we change the "pull" test so that it removes the
"empty.dat" file from the Git commit history after performing the
appropriate check of the "git lfs pull" command's behaviour with
regard to empty files.

To further guard against a potential regression of our tests and
ensure the "pull: with missing object" test passes, we update that
test to remove the "a.dat" file from the working tree as well
as removing its Git LFS object files from both local and remote
storage.  Without the "a.dat" file in the working tree, it now becomes
impossible for the local object file to be re-created under any
circumstances.

Finally, we strengthen our check of the log message output by the
"git lfs pull" command so that we require a specific "does not exist"
message rather than just the raw OID of the missing Git LFS object.
chrisd8088 added a commit that referenced this pull request Oct 16, 2025
In PR #527 we introduced the "git lfs checkout" and "git lfs pull"
commands and added some initial tests of those commands to our
test suite, starting with the "checkout" test in what is now our
t/t-checkout.sh test script.  Since that test first appeared in commit
1d05552, it has validated the
behaviour of the "git lfs checkout" command with a number of
example files tracked as Git LFS objects, two of which are located
within subdirectories.

We later added a similar test of the "git lfs pull" command in commit
096b6da of the same PR, but without any
files tracked as Git LFS objects that were not at the top level of the
test repository.  We eventually expanded this test to include one such
file in commit 7158e3b of PR #2641.

However, none of the tests in our t/t-checkout.sh or t/t-pull.sh test
scripts exercise the relevant Git LFS commands with files that are
contained within multiple levels of subdirectories.

Before we adjust how our "git lfs checkout" and "git lfs pull" commands
create subdirectories, we first revise our "checkout" and "pull" tests
to create files within multiple levels of subdirectories, and to confirm
that commands recreate the full set of subdirectories when they do not
exist.

Note that in our "pull" test in particular we now check that
the -I option accepts arguments containing "**" pattern components,
matching the checks we added to our "checkout" test in a previous commit.
chrisd8088 added a commit that referenced this pull request Oct 16, 2025
After our "git lfs checkout" and "git lfs pull" commands write the
contents of a Git LFS object into a file in the current working tree,
they pass the file's relative path to the "git update-index" command,
so the entries in Git's index for these files reflect their current
state.

If we did not run the "git update-index" command, Git commands like
"git diff-index" would report the files we updated as having been
modified in the working tree.

Notably, though, the "git status" command would normally not report the
files as modified, because it refreshes the Git index in the same way as
the "git update-index" command we run.  We invoke the "git update-index"
command with the --refresh option, so it runs the refresh_index()
function, which is also executed by the "git status" command.

The refresh_index() function checks the state of all the cached entries
in Git's index, and so it will detect that the files in the working
tree have been modified, read their content and pass it to the
"git lfs filter-process" command, which returns the same Git LFS
pointer data as should already be in the Git index.  The end result
is that the index entries remain unchanged, but the cached modification
timestamps are updated.

As a consequence, even if we never invoked the "git update-index"
command while running our "git lfs checkout" and "git lfs pull" commands,
a subsequent "git status" command would typically not show the working
tree files we update as modified.

However, in our tests of the "git lfs pull" command, we rely exclusively
on the "git status" command to try to verify that the "git update-index"
command has been executed by the "git lfs pull" command.  Moreover, the
assert_clean_status() test helper function we invoke to run the
"git status" command has a bug which means it always succeeds, even
if the "git status" command were to report that the current working
tree was not clean.

We added the assert_clean_status() assertion function in commit
7158e3b of PR #2641, along with
calls to that function in the initial "pull" test of what is now
our t/t-pull.sh test script.  In the same commit we also revised the
"git lfs checkout" and "git lfs pull" commands so that they pass paths
relative to the current working directory to the "git update-index"
command, rather than paths relative to the root of the repository.

This latter change resolved the problem reported in #2639, because
by passing file paths relative to the repository root instead of the
current working directory to the "git update-index" command, the
"git lfs checkout" and "git lfs pull" commands were leaving the Git
index and working tree in a confusing state.

At the time when #2639 was reported, running "git lfs pull" in a
subdirectory might result in a subsequent "git status" command finding
modifications to both the Git index and the working tree.  To check
that the problem was resolved, in PR #2641 we introduced the
assert_clean_status() test helper function, and expanded the "pull"
test to call this function after performing several "git lfs pull"
commands in subdirectories of the working tree.

Unfortunately, the assert_clean_status() function does not behave as
an assertion, because it always returns a successful (i.e., zero) exit
code.  The function tests whether the output from a "git status" command
includes the message "working tree clean", and if not, reports the
command's output along with that from a "git lfs status" command.
However, the function does not cause the current test's sub-shell to
stop execution and return a non-zero exit code to signal a test
failure condition.

We address this problem now by adding a call to the "exit" shell built-in
command with a non-zero exit code at the end of the brace-delimited
group command which executes if the assert_clean_status() function
does not find the "working tree clean" message in the output from the
"git status" command.  This change ensures that if Git determines that
the working tree is not clean, our assert_clean_status() function will
cause the test which called it to fail.

As well, we update the assert_clean_status() function so that when
checking for the "working tree clean" message in the output of the
"git status" command, it will also accept the older "working directory
clean" message output by the "git status" command prior to Git v2.10.0.
This allows our assertion function to succeed and the "pull" test to pass
in our GitHub Actions CI jobs that run our test suite against Git v2.0.0,
the oldest Git version we currently support.

As a further problem, though, checking that a "git status" command
finds a clean working tree is no longer sufficient to guarantee that
the "git lfs pull" command has actually called the "git update-index"
command with the appropriate file paths.  To be certain that the Git LFS
client has invoked the "git update-index" command and passed valid file
paths, we also need to check the output of the "git diff-index" command,
and do so before we run the "git status" command, which will cause Git
to reset the file modification timestamps it has cached.

We therefore define a new assert_clean_worktree() test helper function
which runs a "git diff-index HEAD" command and confirms no unexpected
changes are detected in the working tree, and we call this function at
the start of the assert_clean_status() function before the "git status"
command is executed.  This ensures that we check the state of the working
tree before the "git status" command refreshes the file modification
timestamps Git has previously cached.  As noted above, if we reversed
the order of these checks, the "git diff-index" command would always
return an empty list and so our check of the working tree's state would
be defeated.

We also define a new assert_clean_worktree_with_exceptions() test helper
function, which acts like the assert_clean_worktree() function but filters
the output of the "git diff-index HEAD" command with an extended regular
expression pattern provided to the function as its only argument.  This
allows us to use the assert_clean_worktree_with_exceptions() function in
instances where we expect certain files in the working tree to be absent
or modified and so we want to ignore them when they appear in the
"git diff-index" command's output.

We then update the initial "pull" test in the t/t-pull.sh test script
so that it calls either the assert_clean_status() function or the
assert_clean_worktree_with_exceptions() function after each invocation
of the "git lfs pull" command, thereby making the checks in this test
as thorough and consistent with each other as possible.

Finally, we update the initial "checkout" test in the t/t-checkout.sh
test script so that it makes use of the assert_clean_status() and
assert_clean_worktree_with_exceptions() functions in the same manner
as the "pull" test in the t/t-pull.sh script now does.  Both tests
perform similar operations, and like the "git lfs pull" command, the
"git lfs checkout" command also invokes the "git update-index" command.
However, the "checkout" test was not updated in PR #2641 when the
assert_clean_status() was introduced and the "pull" test was updated
to call it.

Since we want our "checkout" test to verify that the "git lfs checkout"
command successfully runs the "git update-index" command, we update the
test so that it mirrors the "pull" test and consistently uses the
assert_clean_worktree() and assert_clean_worktree_with_exceptions()
functions to check the state of the working tree.

To ensure that the "checkout" test still passes, though, we also need to
adjust the test so that it removes any log files it creates because the
assert_clean_status() function now behaves as was originally intended and
will cause the calling test to fail if the working tree is not clean.
chrisd8088 added a commit that referenced this pull request Oct 16, 2025
Our "git lfs checkout" and "git lfs pull" commands, at present,
follow any extant symbolic links when they populate the current working
tree with files containing the content of Git LFS objects, even if
the symbolic links point to locations outside of the working tree.
This vulnerability has been assigned the identifier CVE-2025-26625.

To partially address this vulnerability, we adjust the
DecodePointerFromBlob() function in our "lfs" package to use the Lstat()
function from the "os" package in the Go standard library instead of
the Stat() function.  This ensures that the DecodePointerFromBlob()
function checks whether an irregular file or other directory entry
already exists at the location where the "git lfs checkout" and "git lfs
pull" commands intend to create or update a file.

We then update a number of the tests that we added to the t/t-checkout.sh
and t/t-pull.sh test scripts in previous commits, and now also add
another pair of new tests to those scripts.

First, we revise the "checkout: skip directory file conflicts",
"pull: skip directory file conflicts", "checkout: skip directory symlink
conflicts", and "pull: skip directory symlink conflicts" tests so that
when they run on Unix systems, they now expect the name of the lstat(2)
system call to appear in the log messages output by the "git lfs checkout"
and "git lfs pull" commands.  Previously, these tests expected the name
of the stat(2) system call to appear in the commands' log messages.

Next, we expand and revise the "checkout: skip file symlink conflicts"
and "pull: skip file symlink conflicts" tests so they confirm that the
respective commands try to avoid writing through symbolic links which
exist in the working tree at the locations where the commands intend to
create or update files, regardless of the nature of the links' targets.
In their initial form, these tests could only check the case where the
targets of the symbolic links were directories, but now they can also
check the commands' behaviour both when the links' targets do not exist
and when the targets are files which contain Git LFS pointers identical
to those of the corresponding paths in the Git repository.  Previously,
in such cases the commands would create or update files at the locations
of the targets of the symbolic links.

We then add two new tests, named "checkout: skip case-based symlink
conflicts" and "pull: skip case-based symlink conflicts", which confirm
that the respective commands do not write through symbolic links which
exist in the working tree at the locations where the commands intend to
create or update files, after those links are created by Git due to
filename conflicts on case-insensitive filesystems.  Like the other
tests with symbolic links, we only run these new tests on Windows if
the current system supports the creation of true symbolic links.

In both our new and revised tests we run the "git lfs checkout" and
"git lfs pull" commands at several directory levels in the working tree,
in order to exercise the ability for these commands to be run in any
subdirectory, a behaviour we have supported since PR #2641.  We also
confirm that the commands do not add the paths of the symbolic links to
the Git index as they previously did because the commands assumed they
had updated regular files at those locations.

Note that while our new check in the DecodePointerFromFile() function
avoids cases where a symbolic link already exists the working tree
before we try to create or update a file at the same location, this
check does not entirely prevent TOCTOU (time-of-check/time-of-use)
races where a symbolic link might be created immediately after we check
for its existence and before we attempt to create or open a file.

In a subsequent commit we will address these concerns, at least in
part, by changing the SmudgeToFile() method of the GitFilter structure
in our "lfs" package to remove any existing file or link and always
create a new file with the O_EXCL flag.  This should help ensure we only
ever create a new file and never write through a symlink that was added
immediately after the DecodePointerFromBlob() function ran.

Finally, note that other than the "git lfs checkout" and "git lfs pull"
commands, the only other caller of the DecodePointerFromBlob() function
is the "git lfs merge-driver" command, which is guaranteed by the context
in which it runs to always open regular, temporary files created by Git.
For this reason, we do not need to expand the test suite for the
"git lfs merge-driver" command to check how it handles pre-existing
symbolic links.
chrisd8088 added a commit that referenced this pull request Oct 16, 2025
In PR #527 we introduced the "git lfs checkout" and "git lfs pull"
commands and added some initial tests of those commands to our
test suite, starting with the "checkout" test in what is now our
t/t-checkout.sh test script.  Since that test first appeared in commit
1d05552, it has validated the
behaviour of the "git lfs checkout" command with a number of
example files tracked as Git LFS objects, two of which are located
within subdirectories.

We later added a similar test of the "git lfs pull" command in commit
096b6da of the same PR, but without any
files tracked as Git LFS objects that were not at the top level of the
test repository.  We eventually expanded this test to include one such
file in commit 7158e3b of PR #2641.

However, none of the tests in our t/t-checkout.sh or t/t-pull.sh test
scripts exercise the relevant Git LFS commands with files that are
contained within multiple levels of subdirectories.

Before we adjust how our "git lfs checkout" and "git lfs pull" commands
create subdirectories, we first revise our "checkout" and "pull" tests
to create files within multiple levels of subdirectories, and to confirm
that commands recreate the full set of subdirectories when they do not
exist.

Note that in our "pull" test in particular we now check that
the -I option accepts arguments containing "**" pattern components,
matching the checks we added to our "checkout" test in a previous commit.
chrisd8088 added a commit that referenced this pull request Oct 16, 2025
Our "git lfs checkout" and "git lfs pull" commands, at present,
follow any extant symbolic links when they populate the current working
tree with files containing the content of Git LFS objects, even if
the symbolic links point to locations outside of the working tree.
This vulnerability has been assigned the identifier CVE-2025-26625.

To partially address this vulnerability, we adjust the
DecodePointerFromBlob() function in our "lfs" package to use the Lstat()
function from the "os" package in the Go standard library instead of
the Stat() function.  This ensures that the DecodePointerFromBlob()
function checks whether an irregular file or other directory entry
already exists at the location where the "git lfs checkout" and "git lfs
pull" commands intend to create or update a file.

We then update a number of the tests that we added to the t/t-checkout.sh
and t/t-pull.sh test scripts in previous commits, and now also add
another pair of new tests to those scripts.

First, we revise the "checkout: skip directory file conflicts",
"pull: skip directory file conflicts", "checkout: skip directory symlink
conflicts", and "pull: skip directory symlink conflicts" tests so that
when they run on Unix systems, they now expect the name of the lstat(2)
system call to appear in the log messages output by the "git lfs checkout"
and "git lfs pull" commands.  Previously, these tests expected the name
of the stat(2) system call to appear in the commands' log messages.

Next, we expand and revise the "checkout: skip file symlink conflicts"
and "pull: skip file symlink conflicts" tests so they confirm that the
respective commands try to avoid writing through symbolic links which
exist in the working tree at the locations where the commands intend to
create or update files, regardless of the nature of the links' targets.
In their initial form, these tests could only check the case where the
targets of the symbolic links were directories, but now they can also
check the commands' behaviour both when the links' targets do not exist
and when the targets are files which contain Git LFS pointers identical
to those of the corresponding paths in the Git repository.  Previously,
in such cases the commands would create or update files at the locations
of the targets of the symbolic links.

We then add two new tests, named "checkout: skip case-based symlink
conflicts" and "pull: skip case-based symlink conflicts", which confirm
that the respective commands do not write through symbolic links which
exist in the working tree at the locations where the commands intend to
create or update files, after those links are created by Git due to
filename conflicts on case-insensitive filesystems.  Like the other
tests with symbolic links, we only run these new tests on Windows if
the current system supports the creation of true symbolic links.

In both our new and revised tests we run the "git lfs checkout" and
"git lfs pull" commands at several directory levels in the working tree,
in order to exercise the ability for these commands to be run in any
subdirectory, a behaviour we have supported since PR #2641.  We also
confirm that the commands do not add the paths of the symbolic links to
the Git index as they previously did because the commands assumed they
had updated regular files at those locations.

Note that while our new check in the DecodePointerFromFile() function
avoids cases where a symbolic link already exists the working tree
before we try to create or update a file at the same location, this
check does not entirely prevent TOCTOU (time-of-check/time-of-use)
races where a symbolic link might be created immediately after we check
for its existence and before we attempt to create or open a file.

In a subsequent commit we will address these concerns, at least in
part, by changing the SmudgeToFile() method of the GitFilter structure
in our "lfs" package to remove any existing file or link and always
create a new file with the O_EXCL flag.  This should help ensure we only
ever create a new file and never write through a symlink that was added
immediately after the DecodePointerFromBlob() function ran.

Finally, note that other than the "git lfs checkout" and "git lfs pull"
commands, the only other caller of the DecodePointerFromBlob() function
is the "git lfs merge-driver" command, which is guaranteed by the context
in which it runs to always open regular, temporary files created by Git.
For this reason, we do not need to expand the test suite for the
"git lfs merge-driver" command to check how it handles pre-existing
symbolic links.
hswong3i pushed a commit to alvistack/git-lfs-git-lfs that referenced this pull request Oct 18, 2025
In PR git-lfs#527 we introduced the "git lfs checkout" and "git lfs pull"
commands and added some initial tests of those commands to our test
suite, which we have subsequently expanded over time.  Before we adjust
how these commands check and creates files, we want to further revise
and expand their test suites to validate a broader range of conditions.

In the initial "checkout" test of the t/t-checkout.sh test script we
expand two sets of tests, both those we perform within a subdirectory
of the current working tree, and those we perform with various "glob"
file pattern arguments.

After changing the working directory to a subdirectory of the working
tree, at present the test suite only checks the behaviour of the
"git lfs checkout" command when passed either a single relative file
name and or the "." current directory name.  We now add similar checks
using first a bare ".." parent directory name and then using a relative
path pattern to a sibling subdirectory combined with a trailing "/**"
component, following the gitignore(5) rules we support for pattern
matching.

We also run the "git lfs checkout" command in the root of the working
tree and check that it behaves as expected when passed the name of
a subdirectory which already exists, and when that subdirectory's
name is followed by a trailing "/**" pattern component.

When the "git lfs checkout" command internally converts its file path
pattern arguments into file path patterns that are relative to the root
of the repository, rather being relative to the current working directory,
it appends a trailing "/" character if the pattern is actually just the
path to a directory (or a symbolic link to a directory).  This change is
implemented in the Convert() method of the currentToRepoPatternConverter
structure in our "lfs" package, and was added in commit
56abb71 of PR git-lfs#4556 so as to ensure
that arguments which resolve to specific directories are properly
converted into patterns that match both the directories and all of
their contents.

Note, though, when we run the "git lfs checkout" command with a "."
directory name argument while in a subdirectory of the working tree,
the internal addition of a trailing "/" character does not actually
affect the success of the command.  This is because the Convert()
of the currentToRepoPatternConverter structure first creates a path
relative to the root of the repository, which necessarily includes
the subdirectory's name as its final component.  When used as a
path pattern according to the gitignore(5) rules, then even without
a trailing "/" character the pattern will still match the subdirectory
and therefore its contents.

On the other hand, the internal additional of a trailing "/" character
does affect the success of the "git lfs checkout" command when we
run the command in a subdirectory and provide a ".." argument.  In this
case, the Convert() method first converts the path into one relative to
the root of the repository, which is simply the "." directory name, the
same as if the user had supplied a "." argument to a "git lfs checkout"
command run in the root of the working tree.

Although "." is a valid file path, when treated as a pattern according
to the gitignore(5) rules, it does not work to match a directory and
all of its contents.  To allow users to provide simple a "." argument
and have it work as expected, the Convert() method first appends the
"/" character, and then strips any leading "./" path component.  If the
result is an empty path, it is replaced with the "**" pattern component
so the final pattern matches everything in the root of the repository.

Thus our new check in the initial "checkout" test of the t/t-checkout.sh
script where we run the "git lfs checkout" command in a subdirectory and
pass a ".." argument, we validate the behaviour described above.  The
check would fail unless the command converted the ".." argument into
a "." path relative to the root of the repository, and into into a
"./" path, and then into the "**" pattern component.

In parallel with our changes to the t/t-checkout.sh test script, we
make a number of changes to the initial "pull" test of the t/t-pull.sh
test script to ensure that test performs all its checks under a common
set of conditions.  We will further enlarge this set of conditions in
subsequent commits.

First, we correct a typo in the refute_server_object() assertion call
that was added to the "pull" test in commit
7158e3b of PR git-lfs#2641.  This commit
revised the "git lfs pull" and "git lfs checkout" commands so that the
paths they passed to the "git update-index" command were relative to
the current working directory.  To confirm this new behaviour worked
as expected, the "pull" test was updated so it creates a third test
file named "dir/dir.dat", and some of the checks in the test were then
revised to reference this new file.  However, the refute_server_object()
assertion which is intended to prove that the corresponding Git LFS
object does not exist on the test remote server passes an incorrect
variable name, so we fix that now.

Because the "dir/dir.dat" file and corresponding local Git LFS object
file are not comprehensively referenced in all the checks of the
"git lfs pull" command, we add the appropriate assertions for it into
those checks.  (As well, we revise several invocations of the rm(1)
command to align with others in the same test and elsewhere.)

The same commit 7158e3b also extended
the "pull" test to run the "git lfs pull" command with its -I option
under several conditions, but without removing the local Git LFS object
files and working tree files, or checking that they have been re-created
after the "git lfs pull" command is run.  As a consequence, these checks
effectively only confirm that the command does not fail.

To rectify this limitation we delete all the relevant local files
before each of these checks, and then verify that they have been
restored by the "git lfs pull" command.  We also run the "git lfs pull"
command with a more restricted set of file path patterns to confirm that
files which do not match those patterns are not fetched from the remote
server; we then run the command again to fetch those files.  Previously
our checks did not actually test the file path filtering capability of
the command.

As well, we use a file path pattern which is relative to the root of
the repository, so that when we run the "git lfs pull" command in a
subdirectory of the current working tree and pass this pattern, our test
now ascertains that patterns specified with the -I option are correctly
matched with file paths starting from the root of the repository.
(For the moment we do not check that the "git lfs pull" command accepts
"glob" pattern components such as "**", but in a subsequent commit we
will add patterns of this type to the arguments we pass to the -I option.)

Making these changes, however, exposes another concern which would cause
the "pull: with missing object" test to fail unless we adjust it as well.
This test appears later in the same t/t-pull.sh test script, and depends
on the repository clone created by the "pull" test in the local "clone"
test directory.  (Ideally, our tests should not depend on each other in
this way, so that the failure of one test doesn't cause a series of
otherwise unrelated tests to also fail.  For the moment, though, we
defer this issue to a future PR.)

The "pull: with missing object" test was introduced in commit
68ac0f5 of PR git-lfs#2237 and checks that if
a Git LFS object is missing both locally and on the remote server, the
"git lfs pull" command should report the missing object and return a
non-zero exit code, but should retrieve all other requested Git LFS
objects.  To simulate this condition, the test removes the Git LFS object
corresponding to the "a.dat" test file in the local storage directories
and from the remote test server, and then runs the "git lfs pull"
command.

However, the test does not remove the "a.dat" file from the working
tree, and as noted above, is dependent on the state of the cloned
repository left by the initial "pull" test.  These two factors combine
to make the test vulnerable to transient failures, particularly now
that we have revised the "pull" test such that the final "git lfs pull"
commands in that test re-create all the files in the working tree,
except that we have also changed the "pull" test to drop the commit
which adds an empty file, as we explain below.

In commit d97c785 of PR git-lfs#4654 the
"pull" test was revised to commit an empty file which matches the
Git LFS file patterns, and then confirm that the "git lfs pull" command
handles that case without trying to fetch the object.  This check
was added before the final checks of the "git lfs pull" command, which
occur within a subdirectory of the working tree.  As a result, those
checks used to run with the empty file as one of the Git LFS objects,
and so did the "pull: with missing object" test.

In these instances, when "git lfs pull" runs, it finds the empty file
matches the defined Git LFS file patterns, and is considered to exist
locally by the LFSObjectExists() method of the Configuration structure
in our "config" package.  Therefore the command passes the object's
pointer data to the Run() method of the singleCheckout structure,
which invokes the DecodePointerFromFile() function in our "lfs" package.
That function returns a successful result because it detects that the
pointer refers to a zero-byte object, and so the RunToPath() method
of the singleCheckout structure is executed, and then the file's path
is passed to the Add() method of the gitIndexer structure.  That
method starts a "git update-index" command and writes the file's path
to the Git command on its standard input file descriptor.

We start the "git update-index" command with the --refresh option, so
it reviews all of its cached index entries, including the one for the
"a.dat" file.  If that file in the working tree has the same modification
timestamp as the Git index file (at least to within the same second,
assuming Git was not compiled with the USE_NSEC macro definition)
then Git considers the file to potentially be "racily clean", because
it's file size and modification timestamp are insufficient to determine
whether the file has changed since the index entry was cached.

The "git update-index" command therefore reads the contents of the
"a.dat" file in the working tree. Because the file's path matches a Git
LFS file pattern, the data is streamed to the "git-lfs-filter-process"
command, which regenerates a new object file in the local storage
directories under ".git/lfs/objects".

Unfortunately, the "pull: with missing object" test starts by removing
the local and remote copies of the "a.dat" file's Git LFS object file,
and expects that running the "git lfs pull" command will not result
in a new local copy being generated, as none can be fetched from the
remote test server.  But if the "pull" test leaves the "a.dat" file
in the working tree with an identical timestamp as the Git index file,
and has also committed the "empty.dat" file to the Git history,
then as described above, the "git lfs pull" invokes "git update-index"
with the --refresh option, which finds the "a.dat" file and re-creates
the local Git LFS object file, and so the test will fail.

Note that without the presence of an empty file in the commit history,
the "git lfs pull" command does not find any local Git LFS object files,
and so it calls the Run() method of the singleCheckout structure after
fetching the available objects from the remote server.  The Run() method
then executes the DecodePointerFromFile() function, which returns an
error because the file in the working tree is not a raw Git LFS pointer,
and so the Run() method skips calling the RunToPath() method and
also does not pass the file's path to the "git update-index" command.
In fact, the "git update-index" command will not be started at all,
as there are no files in the working tree which could be considered to
contain raw Git LFS pointer data, like the empty file is.  Since the
"git update-index" command does not run, the "git-lfs-filter-process"
command is also never invoked, and so a local Git LFS object file is
never re-created for the "a.dat" file, and the test succeeds.

For this reason, we change the "pull" test so that it removes the
"empty.dat" file from the Git commit history after performing the
appropriate check of the "git lfs pull" command's behaviour with
regard to empty files.

To further guard against a potential regression of our tests and
ensure the "pull: with missing object" test passes, we update that
test to remove the "a.dat" file from the working tree as well
as removing its Git LFS object files from both local and remote
storage.  Without the "a.dat" file in the working tree, it now becomes
impossible for the local object file to be re-created under any
circumstances.

Finally, we strengthen our check of the log message output by the
"git lfs pull" command so that we require a specific "does not exist"
message rather than just the raw OID of the missing Git LFS object.
hswong3i pushed a commit to alvistack/git-lfs-git-lfs that referenced this pull request Oct 18, 2025
After our "git lfs checkout" and "git lfs pull" commands write the
contents of a Git LFS object into a file in the current working tree,
they pass the file's relative path to the "git update-index" command,
so the entries in Git's index for these files reflect their current
state.

If we did not run the "git update-index" command, Git commands like
"git diff-index" would report the files we updated as having been
modified in the working tree.

Notably, though, the "git status" command would normally not report the
files as modified, because it refreshes the Git index in the same way as
the "git update-index" command we run.  We invoke the "git update-index"
command with the --refresh option, so it runs the refresh_index()
function, which is also executed by the "git status" command.

The refresh_index() function checks the state of all the cached entries
in Git's index, and so it will detect that the files in the working
tree have been modified, read their content and pass it to the
"git lfs filter-process" command, which returns the same Git LFS
pointer data as should already be in the Git index.  The end result
is that the index entries remain unchanged, but the cached modification
timestamps are updated.

As a consequence, even if we never invoked the "git update-index"
command while running our "git lfs checkout" and "git lfs pull" commands,
a subsequent "git status" command would typically not show the working
tree files we update as modified.

However, in our tests of the "git lfs pull" command, we rely exclusively
on the "git status" command to try to verify that the "git update-index"
command has been executed by the "git lfs pull" command.  Moreover, the
assert_clean_status() test helper function we invoke to run the
"git status" command has a bug which means it always succeeds, even
if the "git status" command were to report that the current working
tree was not clean.

We added the assert_clean_status() assertion function in commit
7158e3b of PR git-lfs#2641, along with
calls to that function in the initial "pull" test of what is now
our t/t-pull.sh test script.  In the same commit we also revised the
"git lfs checkout" and "git lfs pull" commands so that they pass paths
relative to the current working directory to the "git update-index"
command, rather than paths relative to the root of the repository.

This latter change resolved the problem reported in git-lfs#2639, because
by passing file paths relative to the repository root instead of the
current working directory to the "git update-index" command, the
"git lfs checkout" and "git lfs pull" commands were leaving the Git
index and working tree in a confusing state.

At the time when git-lfs#2639 was reported, running "git lfs pull" in a
subdirectory might result in a subsequent "git status" command finding
modifications to both the Git index and the working tree.  To check
that the problem was resolved, in PR git-lfs#2641 we introduced the
assert_clean_status() test helper function, and expanded the "pull"
test to call this function after performing several "git lfs pull"
commands in subdirectories of the working tree.

Unfortunately, the assert_clean_status() function does not behave as
an assertion, because it always returns a successful (i.e., zero) exit
code.  The function tests whether the output from a "git status" command
includes the message "working tree clean", and if not, reports the
command's output along with that from a "git lfs status" command.
However, the function does not cause the current test's sub-shell to
stop execution and return a non-zero exit code to signal a test
failure condition.

We address this problem now by adding a call to the "exit" shell built-in
command with a non-zero exit code at the end of the brace-delimited
group command which executes if the assert_clean_status() function
does not find the "working tree clean" message in the output from the
"git status" command.  This change ensures that if Git determines that
the working tree is not clean, our assert_clean_status() function will
cause the test which called it to fail.

As well, we update the assert_clean_status() function so that when
checking for the "working tree clean" message in the output of the
"git status" command, it will also accept the older "working directory
clean" message output by the "git status" command prior to Git v2.10.0.
This allows our assertion function to succeed and the "pull" test to pass
in our GitHub Actions CI jobs that run our test suite against Git v2.0.0,
the oldest Git version we currently support.

As a further problem, though, checking that a "git status" command
finds a clean working tree is no longer sufficient to guarantee that
the "git lfs pull" command has actually called the "git update-index"
command with the appropriate file paths.  To be certain that the Git LFS
client has invoked the "git update-index" command and passed valid file
paths, we also need to check the output of the "git diff-index" command,
and do so before we run the "git status" command, which will cause Git
to reset the file modification timestamps it has cached.

We therefore define a new assert_clean_worktree() test helper function
which runs a "git diff-index HEAD" command and confirms no unexpected
changes are detected in the working tree, and we call this function at
the start of the assert_clean_status() function before the "git status"
command is executed.  This ensures that we check the state of the working
tree before the "git status" command refreshes the file modification
timestamps Git has previously cached.  As noted above, if we reversed
the order of these checks, the "git diff-index" command would always
return an empty list and so our check of the working tree's state would
be defeated.

We also define a new assert_clean_worktree_with_exceptions() test helper
function, which acts like the assert_clean_worktree() function but filters
the output of the "git diff-index HEAD" command with an extended regular
expression pattern provided to the function as its only argument.  This
allows us to use the assert_clean_worktree_with_exceptions() function in
instances where we expect certain files in the working tree to be absent
or modified and so we want to ignore them when they appear in the
"git diff-index" command's output.

We then update the initial "pull" test in the t/t-pull.sh test script
so that it calls either the assert_clean_status() function or the
assert_clean_worktree_with_exceptions() function after each invocation
of the "git lfs pull" command, thereby making the checks in this test
as thorough and consistent with each other as possible.

Finally, we update the initial "checkout" test in the t/t-checkout.sh
test script so that it makes use of the assert_clean_status() and
assert_clean_worktree_with_exceptions() functions in the same manner
as the "pull" test in the t/t-pull.sh script now does.  Both tests
perform similar operations, and like the "git lfs pull" command, the
"git lfs checkout" command also invokes the "git update-index" command.
However, the "checkout" test was not updated in PR git-lfs#2641 when the
assert_clean_status() was introduced and the "pull" test was updated
to call it.

Since we want our "checkout" test to verify that the "git lfs checkout"
command successfully runs the "git update-index" command, we update the
test so that it mirrors the "pull" test and consistently uses the
assert_clean_worktree() and assert_clean_worktree_with_exceptions()
functions to check the state of the working tree.

To ensure that the "checkout" test still passes, though, we also need to
adjust the test so that it removes any log files it creates because the
assert_clean_status() function now behaves as was originally intended and
will cause the calling test to fail if the working tree is not clean.
hswong3i pushed a commit to alvistack/git-lfs-git-lfs that referenced this pull request Oct 18, 2025
In PR git-lfs#527 we introduced the "git lfs checkout" and "git lfs pull"
commands and added some initial tests of those commands to our
test suite, starting with the "checkout" test in what is now our
t/t-checkout.sh test script.  Since that test first appeared in commit
1d05552, it has validated the
behaviour of the "git lfs checkout" command with a number of
example files tracked as Git LFS objects, two of which are located
within subdirectories.

We later added a similar test of the "git lfs pull" command in commit
096b6da of the same PR, but without any
files tracked as Git LFS objects that were not at the top level of the
test repository.  We eventually expanded this test to include one such
file in commit 7158e3b of PR git-lfs#2641.

However, none of the tests in our t/t-checkout.sh or t/t-pull.sh test
scripts exercise the relevant Git LFS commands with files that are
contained within multiple levels of subdirectories.

Before we adjust how our "git lfs checkout" and "git lfs pull" commands
create subdirectories, we first revise our "checkout" and "pull" tests
to create files within multiple levels of subdirectories, and to confirm
that commands recreate the full set of subdirectories when they do not
exist.

Note that in our "pull" test in particular we now check that
the -I option accepts arguments containing "**" pattern components,
matching the checks we added to our "checkout" test in a previous commit.
hswong3i pushed a commit to alvistack/git-lfs-git-lfs that referenced this pull request Oct 18, 2025
Our "git lfs checkout" and "git lfs pull" commands, at present,
follow any extant symbolic links when they populate the current working
tree with files containing the content of Git LFS objects, even if
the symbolic links point to locations outside of the working tree.
This vulnerability has been assigned the identifier CVE-2025-26625.

To partially address this vulnerability, we adjust the
DecodePointerFromBlob() function in our "lfs" package to use the Lstat()
function from the "os" package in the Go standard library instead of
the Stat() function.  This ensures that the DecodePointerFromBlob()
function checks whether an irregular file or other directory entry
already exists at the location where the "git lfs checkout" and "git lfs
pull" commands intend to create or update a file.

We then update a number of the tests that we added to the t/t-checkout.sh
and t/t-pull.sh test scripts in previous commits, and now also add
another pair of new tests to those scripts.

First, we revise the "checkout: skip directory file conflicts",
"pull: skip directory file conflicts", "checkout: skip directory symlink
conflicts", and "pull: skip directory symlink conflicts" tests so that
when they run on Unix systems, they now expect the name of the lstat(2)
system call to appear in the log messages output by the "git lfs checkout"
and "git lfs pull" commands.  Previously, these tests expected the name
of the stat(2) system call to appear in the commands' log messages.

Next, we expand and revise the "checkout: skip file symlink conflicts"
and "pull: skip file symlink conflicts" tests so they confirm that the
respective commands try to avoid writing through symbolic links which
exist in the working tree at the locations where the commands intend to
create or update files, regardless of the nature of the links' targets.
In their initial form, these tests could only check the case where the
targets of the symbolic links were directories, but now they can also
check the commands' behaviour both when the links' targets do not exist
and when the targets are files which contain Git LFS pointers identical
to those of the corresponding paths in the Git repository.  Previously,
in such cases the commands would create or update files at the locations
of the targets of the symbolic links.

We then add two new tests, named "checkout: skip case-based symlink
conflicts" and "pull: skip case-based symlink conflicts", which confirm
that the respective commands do not write through symbolic links which
exist in the working tree at the locations where the commands intend to
create or update files, after those links are created by Git due to
filename conflicts on case-insensitive filesystems.  Like the other
tests with symbolic links, we only run these new tests on Windows if
the current system supports the creation of true symbolic links.

In both our new and revised tests we run the "git lfs checkout" and
"git lfs pull" commands at several directory levels in the working tree,
in order to exercise the ability for these commands to be run in any
subdirectory, a behaviour we have supported since PR git-lfs#2641.  We also
confirm that the commands do not add the paths of the symbolic links to
the Git index as they previously did because the commands assumed they
had updated regular files at those locations.

Note that while our new check in the DecodePointerFromFile() function
avoids cases where a symbolic link already exists the working tree
before we try to create or update a file at the same location, this
check does not entirely prevent TOCTOU (time-of-check/time-of-use)
races where a symbolic link might be created immediately after we check
for its existence and before we attempt to create or open a file.

In a subsequent commit we will address these concerns, at least in
part, by changing the SmudgeToFile() method of the GitFilter structure
in our "lfs" package to remove any existing file or link and always
create a new file with the O_EXCL flag.  This should help ensure we only
ever create a new file and never write through a symlink that was added
immediately after the DecodePointerFromBlob() function ran.

Finally, note that other than the "git lfs checkout" and "git lfs pull"
commands, the only other caller of the DecodePointerFromBlob() function
is the "git lfs merge-driver" command, which is guaranteed by the context
in which it runs to always open regular, temporary files created by Git.
For this reason, we do not need to expand the test suite for the
"git lfs merge-driver" command to check how it handles pre-existing
symbolic links.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants