-
Notifications
You must be signed in to change notification settings - Fork 2.2k
add files to index with path relative to current dir #2641
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
rubyist
approved these changes
Oct 3, 2017
chrisd8088
added a commit
that referenced
this pull request
Oct 16, 2025
In PR #527 we introduced the "git lfs checkout" and "git lfs pull" commands and added some initial tests of those commands to our test suite, which we have subsequently expanded over time. Before we adjust how these commands check and creates files, we want to further revise and expand their test suites to validate a broader range of conditions. In the initial "checkout" test of the t/t-checkout.sh test script we expand two sets of tests, both those we perform within a subdirectory of the current working tree, and those we perform with various "glob" file pattern arguments. After changing the working directory to a subdirectory of the working tree, at present the test suite only checks the behaviour of the "git lfs checkout" command when passed either a single relative file name and or the "." current directory name. We now add similar checks using first a bare ".." parent directory name and then using a relative path pattern to a sibling subdirectory combined with a trailing "/**" component, following the gitignore(5) rules we support for pattern matching. We also run the "git lfs checkout" command in the root of the working tree and check that it behaves as expected when passed the name of a subdirectory which already exists, and when that subdirectory's name is followed by a trailing "/**" pattern component. When the "git lfs checkout" command internally converts its file path pattern arguments into file path patterns that are relative to the root of the repository, rather being relative to the current working directory, it appends a trailing "/" character if the pattern is actually just the path to a directory (or a symbolic link to a directory). This change is implemented in the Convert() method of the currentToRepoPatternConverter structure in our "lfs" package, and was added in commit 56abb71 of PR #4556 so as to ensure that arguments which resolve to specific directories are properly converted into patterns that match both the directories and all of their contents. Note, though, when we run the "git lfs checkout" command with a "." directory name argument while in a subdirectory of the working tree, the internal addition of a trailing "/" character does not actually affect the success of the command. This is because the Convert() of the currentToRepoPatternConverter structure first creates a path relative to the root of the repository, which necessarily includes the subdirectory's name as its final component. When used as a path pattern according to the gitignore(5) rules, then even without a trailing "/" character the pattern will still match the subdirectory and therefore its contents. On the other hand, the internal additional of a trailing "/" character does affect the success of the "git lfs checkout" command when we run the command in a subdirectory and provide a ".." argument. In this case, the Convert() method first converts the path into one relative to the root of the repository, which is simply the "." directory name, the same as if the user had supplied a "." argument to a "git lfs checkout" command run in the root of the working tree. Although "." is a valid file path, when treated as a pattern according to the gitignore(5) rules, it does not work to match a directory and all of its contents. To allow users to provide simple a "." argument and have it work as expected, the Convert() method first appends the "/" character, and then strips any leading "./" path component. If the result is an empty path, it is replaced with the "**" pattern component so the final pattern matches everything in the root of the repository. Thus our new check in the initial "checkout" test of the t/t-checkout.sh script where we run the "git lfs checkout" command in a subdirectory and pass a ".." argument, we validate the behaviour described above. The check would fail unless the command converted the ".." argument into a "." path relative to the root of the repository, and into into a "./" path, and then into the "**" pattern component. In parallel with our changes to the t/t-checkout.sh test script, we make a number of changes to the initial "pull" test of the t/t-pull.sh test script to ensure that test performs all its checks under a common set of conditions. We will further enlarge this set of conditions in subsequent commits. First, we correct a typo in the refute_server_object() assertion call that was added to the "pull" test in commit 7158e3b of PR #2641. This commit revised the "git lfs pull" and "git lfs checkout" commands so that the paths they passed to the "git update-index" command were relative to the current working directory. To confirm this new behaviour worked as expected, the "pull" test was updated so it creates a third test file named "dir/dir.dat", and some of the checks in the test were then revised to reference this new file. However, the refute_server_object() assertion which is intended to prove that the corresponding Git LFS object does not exist on the test remote server passes an incorrect variable name, so we fix that now. Because the "dir/dir.dat" file and corresponding local Git LFS object file are not comprehensively referenced in all the checks of the "git lfs pull" command, we add the appropriate assertions for it into those checks. (As well, we revise several invocations of the rm(1) command to align with others in the same test and elsewhere.) The same commit 7158e3b also extended the "pull" test to run the "git lfs pull" command with its -I option under several conditions, but without removing the local Git LFS object files and working tree files, or checking that they have been re-created after the "git lfs pull" command is run. As a consequence, these checks effectively only confirm that the command does not fail. To rectify this limitation we delete all the relevant local files before each of these checks, and then verify that they have been restored by the "git lfs pull" command. We also run the "git lfs pull" command with a more restricted set of file path patterns to confirm that files which do not match those patterns are not fetched from the remote server; we then run the command again to fetch those files. Previously our checks did not actually test the file path filtering capability of the command. As well, we use a file path pattern which is relative to the root of the repository, so that when we run the "git lfs pull" command in a subdirectory of the current working tree and pass this pattern, our test now ascertains that patterns specified with the -I option are correctly matched with file paths starting from the root of the repository. (For the moment we do not check that the "git lfs pull" command accepts "glob" pattern components such as "**", but in a subsequent commit we will add patterns of this type to the arguments we pass to the -I option.) Making these changes, however, exposes another concern which would cause the "pull: with missing object" test to fail unless we adjust it as well. This test appears later in the same t/t-pull.sh test script, and depends on the repository clone created by the "pull" test in the local "clone" test directory. (Ideally, our tests should not depend on each other in this way, so that the failure of one test doesn't cause a series of otherwise unrelated tests to also fail. For the moment, though, we defer this issue to a future PR.) The "pull: with missing object" test was introduced in commit 68ac0f5 of PR #2237 and checks that if a Git LFS object is missing both locally and on the remote server, the "git lfs pull" command should report the missing object and return a non-zero exit code, but should retrieve all other requested Git LFS objects. To simulate this condition, the test removes the Git LFS object corresponding to the "a.dat" test file in the local storage directories and from the remote test server, and then runs the "git lfs pull" command. However, the test does not remove the "a.dat" file from the working tree, and as noted above, is dependent on the state of the cloned repository left by the initial "pull" test. These two factors combine to make the test vulnerable to transient failures, particularly now that we have revised the "pull" test such that the final "git lfs pull" commands in that test re-create all the files in the working tree, except that we have also changed the "pull" test to drop the commit which adds an empty file, as we explain below. In commit d97c785 of PR #4654 the "pull" test was revised to commit an empty file which matches the Git LFS file patterns, and then confirm that the "git lfs pull" command handles that case without trying to fetch the object. This check was added before the final checks of the "git lfs pull" command, which occur within a subdirectory of the working tree. As a result, those checks used to run with the empty file as one of the Git LFS objects, and so did the "pull: with missing object" test. In these instances, when "git lfs pull" runs, it finds the empty file matches the defined Git LFS file patterns, and is considered to exist locally by the LFSObjectExists() method of the Configuration structure in our "config" package. Therefore the command passes the object's pointer data to the Run() method of the singleCheckout structure, which invokes the DecodePointerFromFile() function in our "lfs" package. That function returns a successful result because it detects that the pointer refers to a zero-byte object, and so the RunToPath() method of the singleCheckout structure is executed, and then the file's path is passed to the Add() method of the gitIndexer structure. That method starts a "git update-index" command and writes the file's path to the Git command on its standard input file descriptor. We start the "git update-index" command with the --refresh option, so it reviews all of its cached index entries, including the one for the "a.dat" file. If that file in the working tree has the same modification timestamp as the Git index file (at least to within the same second, assuming Git was not compiled with the USE_NSEC macro definition) then Git considers the file to potentially be "racily clean", because it's file size and modification timestamp are insufficient to determine whether the file has changed since the index entry was cached. The "git update-index" command therefore reads the contents of the "a.dat" file in the working tree. Because the file's path matches a Git LFS file pattern, the data is streamed to the "git-lfs-filter-process" command, which regenerates a new object file in the local storage directories under ".git/lfs/objects". Unfortunately, the "pull: with missing object" test starts by removing the local and remote copies of the "a.dat" file's Git LFS object file, and expects that running the "git lfs pull" command will not result in a new local copy being generated, as none can be fetched from the remote test server. But if the "pull" test leaves the "a.dat" file in the working tree with an identical timestamp as the Git index file, and has also committed the "empty.dat" file to the Git history, then as described above, the "git lfs pull" invokes "git update-index" with the --refresh option, which finds the "a.dat" file and re-creates the local Git LFS object file, and so the test will fail. Note that without the presence of an empty file in the commit history, the "git lfs pull" command does not find any local Git LFS object files, and so it calls the Run() method of the singleCheckout structure after fetching the available objects from the remote server. The Run() method then executes the DecodePointerFromFile() function, which returns an error because the file in the working tree is not a raw Git LFS pointer, and so the Run() method skips calling the RunToPath() method and also does not pass the file's path to the "git update-index" command. In fact, the "git update-index" command will not be started at all, as there are no files in the working tree which could be considered to contain raw Git LFS pointer data, like the empty file is. Since the "git update-index" command does not run, the "git-lfs-filter-process" command is also never invoked, and so a local Git LFS object file is never re-created for the "a.dat" file, and the test succeeds. For this reason, we change the "pull" test so that it removes the "empty.dat" file from the Git commit history after performing the appropriate check of the "git lfs pull" command's behaviour with regard to empty files. To further guard against a potential regression of our tests and ensure the "pull: with missing object" test passes, we update that test to remove the "a.dat" file from the working tree as well as removing its Git LFS object files from both local and remote storage. Without the "a.dat" file in the working tree, it now becomes impossible for the local object file to be re-created under any circumstances. Finally, we strengthen our check of the log message output by the "git lfs pull" command so that we require a specific "does not exist" message rather than just the raw OID of the missing Git LFS object.
chrisd8088
added a commit
that referenced
this pull request
Oct 16, 2025
After our "git lfs checkout" and "git lfs pull" commands write the contents of a Git LFS object into a file in the current working tree, they pass the file's relative path to the "git update-index" command, so the entries in Git's index for these files reflect their current state. If we did not run the "git update-index" command, Git commands like "git diff-index" would report the files we updated as having been modified in the working tree. Notably, though, the "git status" command would normally not report the files as modified, because it refreshes the Git index in the same way as the "git update-index" command we run. We invoke the "git update-index" command with the --refresh option, so it runs the refresh_index() function, which is also executed by the "git status" command. The refresh_index() function checks the state of all the cached entries in Git's index, and so it will detect that the files in the working tree have been modified, read their content and pass it to the "git lfs filter-process" command, which returns the same Git LFS pointer data as should already be in the Git index. The end result is that the index entries remain unchanged, but the cached modification timestamps are updated. As a consequence, even if we never invoked the "git update-index" command while running our "git lfs checkout" and "git lfs pull" commands, a subsequent "git status" command would typically not show the working tree files we update as modified. However, in our tests of the "git lfs pull" command, we rely exclusively on the "git status" command to try to verify that the "git update-index" command has been executed by the "git lfs pull" command. Moreover, the assert_clean_status() test helper function we invoke to run the "git status" command has a bug which means it always succeeds, even if the "git status" command were to report that the current working tree was not clean. We added the assert_clean_status() assertion function in commit 7158e3b of PR #2641, along with calls to that function in the initial "pull" test of what is now our t/t-pull.sh test script. In the same commit we also revised the "git lfs checkout" and "git lfs pull" commands so that they pass paths relative to the current working directory to the "git update-index" command, rather than paths relative to the root of the repository. This latter change resolved the problem reported in #2639, because by passing file paths relative to the repository root instead of the current working directory to the "git update-index" command, the "git lfs checkout" and "git lfs pull" commands were leaving the Git index and working tree in a confusing state. At the time when #2639 was reported, running "git lfs pull" in a subdirectory might result in a subsequent "git status" command finding modifications to both the Git index and the working tree. To check that the problem was resolved, in PR #2641 we introduced the assert_clean_status() test helper function, and expanded the "pull" test to call this function after performing several "git lfs pull" commands in subdirectories of the working tree. Unfortunately, the assert_clean_status() function does not behave as an assertion, because it always returns a successful (i.e., zero) exit code. The function tests whether the output from a "git status" command includes the message "working tree clean", and if not, reports the command's output along with that from a "git lfs status" command. However, the function does not cause the current test's sub-shell to stop execution and return a non-zero exit code to signal a test failure condition. We address this problem now by adding a call to the "exit" shell built-in command with a non-zero exit code at the end of the brace-delimited group command which executes if the assert_clean_status() function does not find the "working tree clean" message in the output from the "git status" command. This change ensures that if Git determines that the working tree is not clean, our assert_clean_status() function will cause the test which called it to fail. As well, we update the assert_clean_status() function so that when checking for the "working tree clean" message in the output of the "git status" command, it will also accept the older "working directory clean" message output by the "git status" command prior to Git v2.10.0. This allows our assertion function to succeed and the "pull" test to pass in our GitHub Actions CI jobs that run our test suite against Git v2.0.0, the oldest Git version we currently support. As a further problem, though, checking that a "git status" command finds a clean working tree is no longer sufficient to guarantee that the "git lfs pull" command has actually called the "git update-index" command with the appropriate file paths. To be certain that the Git LFS client has invoked the "git update-index" command and passed valid file paths, we also need to check the output of the "git diff-index" command, and do so before we run the "git status" command, which will cause Git to reset the file modification timestamps it has cached. We therefore define a new assert_clean_worktree() test helper function which runs a "git diff-index HEAD" command and confirms no unexpected changes are detected in the working tree, and we call this function at the start of the assert_clean_status() function before the "git status" command is executed. This ensures that we check the state of the working tree before the "git status" command refreshes the file modification timestamps Git has previously cached. As noted above, if we reversed the order of these checks, the "git diff-index" command would always return an empty list and so our check of the working tree's state would be defeated. We also define a new assert_clean_worktree_with_exceptions() test helper function, which acts like the assert_clean_worktree() function but filters the output of the "git diff-index HEAD" command with an extended regular expression pattern provided to the function as its only argument. This allows us to use the assert_clean_worktree_with_exceptions() function in instances where we expect certain files in the working tree to be absent or modified and so we want to ignore them when they appear in the "git diff-index" command's output. We then update the initial "pull" test in the t/t-pull.sh test script so that it calls either the assert_clean_status() function or the assert_clean_worktree_with_exceptions() function after each invocation of the "git lfs pull" command, thereby making the checks in this test as thorough and consistent with each other as possible. Finally, we update the initial "checkout" test in the t/t-checkout.sh test script so that it makes use of the assert_clean_status() and assert_clean_worktree_with_exceptions() functions in the same manner as the "pull" test in the t/t-pull.sh script now does. Both tests perform similar operations, and like the "git lfs pull" command, the "git lfs checkout" command also invokes the "git update-index" command. However, the "checkout" test was not updated in PR #2641 when the assert_clean_status() was introduced and the "pull" test was updated to call it. Since we want our "checkout" test to verify that the "git lfs checkout" command successfully runs the "git update-index" command, we update the test so that it mirrors the "pull" test and consistently uses the assert_clean_worktree() and assert_clean_worktree_with_exceptions() functions to check the state of the working tree. To ensure that the "checkout" test still passes, though, we also need to adjust the test so that it removes any log files it creates because the assert_clean_status() function now behaves as was originally intended and will cause the calling test to fail if the working tree is not clean.
chrisd8088
added a commit
that referenced
this pull request
Oct 16, 2025
In PR #527 we introduced the "git lfs checkout" and "git lfs pull" commands and added some initial tests of those commands to our test suite, which we have subsequently expanded over time. Before we adjust how these commands check and creates files, we want to further revise and expand their test suites to validate a broader range of conditions. In the initial "checkout" test of the t/t-checkout.sh test script we expand two sets of tests, both those we perform within a subdirectory of the current working tree, and those we perform with various "glob" file pattern arguments. After changing the working directory to a subdirectory of the working tree, at present the test suite only checks the behaviour of the "git lfs checkout" command when passed either a single relative file name and or the "." current directory name. We now add similar checks using first a bare ".." parent directory name and then using a relative path pattern to a sibling subdirectory combined with a trailing "/**" component, following the gitignore(5) rules we support for pattern matching. We also run the "git lfs checkout" command in the root of the working tree and check that it behaves as expected when passed the name of a subdirectory which already exists, and when that subdirectory's name is followed by a trailing "/**" pattern component. When the "git lfs checkout" command internally converts its file path pattern arguments into file path patterns that are relative to the root of the repository, rather being relative to the current working directory, it appends a trailing "/" character if the pattern is actually just the path to a directory (or a symbolic link to a directory). This change is implemented in the Convert() method of the currentToRepoPatternConverter structure in our "lfs" package, and was added in commit 56abb71 of PR #4556 so as to ensure that arguments which resolve to specific directories are properly converted into patterns that match both the directories and all of their contents. Note, though, when we run the "git lfs checkout" command with a "." directory name argument while in a subdirectory of the working tree, the internal addition of a trailing "/" character does not actually affect the success of the command. This is because the Convert() of the currentToRepoPatternConverter structure first creates a path relative to the root of the repository, which necessarily includes the subdirectory's name as its final component. When used as a path pattern according to the gitignore(5) rules, then even without a trailing "/" character the pattern will still match the subdirectory and therefore its contents. On the other hand, the internal additional of a trailing "/" character does affect the success of the "git lfs checkout" command when we run the command in a subdirectory and provide a ".." argument. In this case, the Convert() method first converts the path into one relative to the root of the repository, which is simply the "." directory name, the same as if the user had supplied a "." argument to a "git lfs checkout" command run in the root of the working tree. Although "." is a valid file path, when treated as a pattern according to the gitignore(5) rules, it does not work to match a directory and all of its contents. To allow users to provide simple a "." argument and have it work as expected, the Convert() method first appends the "/" character, and then strips any leading "./" path component. If the result is an empty path, it is replaced with the "**" pattern component so the final pattern matches everything in the root of the repository. Thus our new check in the initial "checkout" test of the t/t-checkout.sh script where we run the "git lfs checkout" command in a subdirectory and pass a ".." argument, we validate the behaviour described above. The check would fail unless the command converted the ".." argument into a "." path relative to the root of the repository, and into into a "./" path, and then into the "**" pattern component. In parallel with our changes to the t/t-checkout.sh test script, we make a number of changes to the initial "pull" test of the t/t-pull.sh test script to ensure that test performs all its checks under a common set of conditions. We will further enlarge this set of conditions in subsequent commits. First, we correct a typo in the refute_server_object() assertion call that was added to the "pull" test in commit 7158e3b of PR #2641. This commit revised the "git lfs pull" and "git lfs checkout" commands so that the paths they passed to the "git update-index" command were relative to the current working directory. To confirm this new behaviour worked as expected, the "pull" test was updated so it creates a third test file named "dir/dir.dat", and some of the checks in the test were then revised to reference this new file. However, the refute_server_object() assertion which is intended to prove that the corresponding Git LFS object does not exist on the test remote server passes an incorrect variable name, so we fix that now. Because the "dir/dir.dat" file and corresponding local Git LFS object file are not comprehensively referenced in all the checks of the "git lfs pull" command, we add the appropriate assertions for it into those checks. (As well, we revise several invocations of the rm(1) command to align with others in the same test and elsewhere.) The same commit 7158e3b also extended the "pull" test to run the "git lfs pull" command with its -I option under several conditions, but without removing the local Git LFS object files and working tree files, or checking that they have been re-created after the "git lfs pull" command is run. As a consequence, these checks effectively only confirm that the command does not fail. To rectify this limitation we delete all the relevant local files before each of these checks, and then verify that they have been restored by the "git lfs pull" command. We also run the "git lfs pull" command with a more restricted set of file path patterns to confirm that files which do not match those patterns are not fetched from the remote server; we then run the command again to fetch those files. Previously our checks did not actually test the file path filtering capability of the command. As well, we use a file path pattern which is relative to the root of the repository, so that when we run the "git lfs pull" command in a subdirectory of the current working tree and pass this pattern, our test now ascertains that patterns specified with the -I option are correctly matched with file paths starting from the root of the repository. (For the moment we do not check that the "git lfs pull" command accepts "glob" pattern components such as "**", but in a subsequent commit we will add patterns of this type to the arguments we pass to the -I option.) Making these changes, however, exposes another concern which would cause the "pull: with missing object" test to fail unless we adjust it as well. This test appears later in the same t/t-pull.sh test script, and depends on the repository clone created by the "pull" test in the local "clone" test directory. (Ideally, our tests should not depend on each other in this way, so that the failure of one test doesn't cause a series of otherwise unrelated tests to also fail. For the moment, though, we defer this issue to a future PR.) The "pull: with missing object" test was introduced in commit 68ac0f5 of PR #2237 and checks that if a Git LFS object is missing both locally and on the remote server, the "git lfs pull" command should report the missing object and return a non-zero exit code, but should retrieve all other requested Git LFS objects. To simulate this condition, the test removes the Git LFS object corresponding to the "a.dat" test file in the local storage directories and from the remote test server, and then runs the "git lfs pull" command. However, the test does not remove the "a.dat" file from the working tree, and as noted above, is dependent on the state of the cloned repository left by the initial "pull" test. These two factors combine to make the test vulnerable to transient failures, particularly now that we have revised the "pull" test such that the final "git lfs pull" commands in that test re-create all the files in the working tree, except that we have also changed the "pull" test to drop the commit which adds an empty file, as we explain below. In commit d97c785 of PR #4654 the "pull" test was revised to commit an empty file which matches the Git LFS file patterns, and then confirm that the "git lfs pull" command handles that case without trying to fetch the object. This check was added before the final checks of the "git lfs pull" command, which occur within a subdirectory of the working tree. As a result, those checks used to run with the empty file as one of the Git LFS objects, and so did the "pull: with missing object" test. In these instances, when "git lfs pull" runs, it finds the empty file matches the defined Git LFS file patterns, and is considered to exist locally by the LFSObjectExists() method of the Configuration structure in our "config" package. Therefore the command passes the object's pointer data to the Run() method of the singleCheckout structure, which invokes the DecodePointerFromFile() function in our "lfs" package. That function returns a successful result because it detects that the pointer refers to a zero-byte object, and so the RunToPath() method of the singleCheckout structure is executed, and then the file's path is passed to the Add() method of the gitIndexer structure. That method starts a "git update-index" command and writes the file's path to the Git command on its standard input file descriptor. We start the "git update-index" command with the --refresh option, so it reviews all of its cached index entries, including the one for the "a.dat" file. If that file in the working tree has the same modification timestamp as the Git index file (at least to within the same second, assuming Git was not compiled with the USE_NSEC macro definition) then Git considers the file to potentially be "racily clean", because it's file size and modification timestamp are insufficient to determine whether the file has changed since the index entry was cached. The "git update-index" command therefore reads the contents of the "a.dat" file in the working tree. Because the file's path matches a Git LFS file pattern, the data is streamed to the "git-lfs-filter-process" command, which regenerates a new object file in the local storage directories under ".git/lfs/objects". Unfortunately, the "pull: with missing object" test starts by removing the local and remote copies of the "a.dat" file's Git LFS object file, and expects that running the "git lfs pull" command will not result in a new local copy being generated, as none can be fetched from the remote test server. But if the "pull" test leaves the "a.dat" file in the working tree with an identical timestamp as the Git index file, and has also committed the "empty.dat" file to the Git history, then as described above, the "git lfs pull" invokes "git update-index" with the --refresh option, which finds the "a.dat" file and re-creates the local Git LFS object file, and so the test will fail. Note that without the presence of an empty file in the commit history, the "git lfs pull" command does not find any local Git LFS object files, and so it calls the Run() method of the singleCheckout structure after fetching the available objects from the remote server. The Run() method then executes the DecodePointerFromFile() function, which returns an error because the file in the working tree is not a raw Git LFS pointer, and so the Run() method skips calling the RunToPath() method and also does not pass the file's path to the "git update-index" command. In fact, the "git update-index" command will not be started at all, as there are no files in the working tree which could be considered to contain raw Git LFS pointer data, like the empty file is. Since the "git update-index" command does not run, the "git-lfs-filter-process" command is also never invoked, and so a local Git LFS object file is never re-created for the "a.dat" file, and the test succeeds. For this reason, we change the "pull" test so that it removes the "empty.dat" file from the Git commit history after performing the appropriate check of the "git lfs pull" command's behaviour with regard to empty files. To further guard against a potential regression of our tests and ensure the "pull: with missing object" test passes, we update that test to remove the "a.dat" file from the working tree as well as removing its Git LFS object files from both local and remote storage. Without the "a.dat" file in the working tree, it now becomes impossible for the local object file to be re-created under any circumstances. Finally, we strengthen our check of the log message output by the "git lfs pull" command so that we require a specific "does not exist" message rather than just the raw OID of the missing Git LFS object.
chrisd8088
added a commit
that referenced
this pull request
Oct 16, 2025
In PR #527 we introduced the "git lfs checkout" and "git lfs pull" commands and added some initial tests of those commands to our test suite, starting with the "checkout" test in what is now our t/t-checkout.sh test script. Since that test first appeared in commit 1d05552, it has validated the behaviour of the "git lfs checkout" command with a number of example files tracked as Git LFS objects, two of which are located within subdirectories. We later added a similar test of the "git lfs pull" command in commit 096b6da of the same PR, but without any files tracked as Git LFS objects that were not at the top level of the test repository. We eventually expanded this test to include one such file in commit 7158e3b of PR #2641. However, none of the tests in our t/t-checkout.sh or t/t-pull.sh test scripts exercise the relevant Git LFS commands with files that are contained within multiple levels of subdirectories. Before we adjust how our "git lfs checkout" and "git lfs pull" commands create subdirectories, we first revise our "checkout" and "pull" tests to create files within multiple levels of subdirectories, and to confirm that commands recreate the full set of subdirectories when they do not exist. Note that in our "pull" test in particular we now check that the -I option accepts arguments containing "**" pattern components, matching the checks we added to our "checkout" test in a previous commit.
chrisd8088
added a commit
that referenced
this pull request
Oct 16, 2025
After our "git lfs checkout" and "git lfs pull" commands write the contents of a Git LFS object into a file in the current working tree, they pass the file's relative path to the "git update-index" command, so the entries in Git's index for these files reflect their current state. If we did not run the "git update-index" command, Git commands like "git diff-index" would report the files we updated as having been modified in the working tree. Notably, though, the "git status" command would normally not report the files as modified, because it refreshes the Git index in the same way as the "git update-index" command we run. We invoke the "git update-index" command with the --refresh option, so it runs the refresh_index() function, which is also executed by the "git status" command. The refresh_index() function checks the state of all the cached entries in Git's index, and so it will detect that the files in the working tree have been modified, read their content and pass it to the "git lfs filter-process" command, which returns the same Git LFS pointer data as should already be in the Git index. The end result is that the index entries remain unchanged, but the cached modification timestamps are updated. As a consequence, even if we never invoked the "git update-index" command while running our "git lfs checkout" and "git lfs pull" commands, a subsequent "git status" command would typically not show the working tree files we update as modified. However, in our tests of the "git lfs pull" command, we rely exclusively on the "git status" command to try to verify that the "git update-index" command has been executed by the "git lfs pull" command. Moreover, the assert_clean_status() test helper function we invoke to run the "git status" command has a bug which means it always succeeds, even if the "git status" command were to report that the current working tree was not clean. We added the assert_clean_status() assertion function in commit 7158e3b of PR #2641, along with calls to that function in the initial "pull" test of what is now our t/t-pull.sh test script. In the same commit we also revised the "git lfs checkout" and "git lfs pull" commands so that they pass paths relative to the current working directory to the "git update-index" command, rather than paths relative to the root of the repository. This latter change resolved the problem reported in #2639, because by passing file paths relative to the repository root instead of the current working directory to the "git update-index" command, the "git lfs checkout" and "git lfs pull" commands were leaving the Git index and working tree in a confusing state. At the time when #2639 was reported, running "git lfs pull" in a subdirectory might result in a subsequent "git status" command finding modifications to both the Git index and the working tree. To check that the problem was resolved, in PR #2641 we introduced the assert_clean_status() test helper function, and expanded the "pull" test to call this function after performing several "git lfs pull" commands in subdirectories of the working tree. Unfortunately, the assert_clean_status() function does not behave as an assertion, because it always returns a successful (i.e., zero) exit code. The function tests whether the output from a "git status" command includes the message "working tree clean", and if not, reports the command's output along with that from a "git lfs status" command. However, the function does not cause the current test's sub-shell to stop execution and return a non-zero exit code to signal a test failure condition. We address this problem now by adding a call to the "exit" shell built-in command with a non-zero exit code at the end of the brace-delimited group command which executes if the assert_clean_status() function does not find the "working tree clean" message in the output from the "git status" command. This change ensures that if Git determines that the working tree is not clean, our assert_clean_status() function will cause the test which called it to fail. As well, we update the assert_clean_status() function so that when checking for the "working tree clean" message in the output of the "git status" command, it will also accept the older "working directory clean" message output by the "git status" command prior to Git v2.10.0. This allows our assertion function to succeed and the "pull" test to pass in our GitHub Actions CI jobs that run our test suite against Git v2.0.0, the oldest Git version we currently support. As a further problem, though, checking that a "git status" command finds a clean working tree is no longer sufficient to guarantee that the "git lfs pull" command has actually called the "git update-index" command with the appropriate file paths. To be certain that the Git LFS client has invoked the "git update-index" command and passed valid file paths, we also need to check the output of the "git diff-index" command, and do so before we run the "git status" command, which will cause Git to reset the file modification timestamps it has cached. We therefore define a new assert_clean_worktree() test helper function which runs a "git diff-index HEAD" command and confirms no unexpected changes are detected in the working tree, and we call this function at the start of the assert_clean_status() function before the "git status" command is executed. This ensures that we check the state of the working tree before the "git status" command refreshes the file modification timestamps Git has previously cached. As noted above, if we reversed the order of these checks, the "git diff-index" command would always return an empty list and so our check of the working tree's state would be defeated. We also define a new assert_clean_worktree_with_exceptions() test helper function, which acts like the assert_clean_worktree() function but filters the output of the "git diff-index HEAD" command with an extended regular expression pattern provided to the function as its only argument. This allows us to use the assert_clean_worktree_with_exceptions() function in instances where we expect certain files in the working tree to be absent or modified and so we want to ignore them when they appear in the "git diff-index" command's output. We then update the initial "pull" test in the t/t-pull.sh test script so that it calls either the assert_clean_status() function or the assert_clean_worktree_with_exceptions() function after each invocation of the "git lfs pull" command, thereby making the checks in this test as thorough and consistent with each other as possible. Finally, we update the initial "checkout" test in the t/t-checkout.sh test script so that it makes use of the assert_clean_status() and assert_clean_worktree_with_exceptions() functions in the same manner as the "pull" test in the t/t-pull.sh script now does. Both tests perform similar operations, and like the "git lfs pull" command, the "git lfs checkout" command also invokes the "git update-index" command. However, the "checkout" test was not updated in PR #2641 when the assert_clean_status() was introduced and the "pull" test was updated to call it. Since we want our "checkout" test to verify that the "git lfs checkout" command successfully runs the "git update-index" command, we update the test so that it mirrors the "pull" test and consistently uses the assert_clean_worktree() and assert_clean_worktree_with_exceptions() functions to check the state of the working tree. To ensure that the "checkout" test still passes, though, we also need to adjust the test so that it removes any log files it creates because the assert_clean_status() function now behaves as was originally intended and will cause the calling test to fail if the working tree is not clean.
chrisd8088
added a commit
that referenced
this pull request
Oct 16, 2025
Our "git lfs checkout" and "git lfs pull" commands, at present, follow any extant symbolic links when they populate the current working tree with files containing the content of Git LFS objects, even if the symbolic links point to locations outside of the working tree. This vulnerability has been assigned the identifier CVE-2025-26625. To partially address this vulnerability, we adjust the DecodePointerFromBlob() function in our "lfs" package to use the Lstat() function from the "os" package in the Go standard library instead of the Stat() function. This ensures that the DecodePointerFromBlob() function checks whether an irregular file or other directory entry already exists at the location where the "git lfs checkout" and "git lfs pull" commands intend to create or update a file. We then update a number of the tests that we added to the t/t-checkout.sh and t/t-pull.sh test scripts in previous commits, and now also add another pair of new tests to those scripts. First, we revise the "checkout: skip directory file conflicts", "pull: skip directory file conflicts", "checkout: skip directory symlink conflicts", and "pull: skip directory symlink conflicts" tests so that when they run on Unix systems, they now expect the name of the lstat(2) system call to appear in the log messages output by the "git lfs checkout" and "git lfs pull" commands. Previously, these tests expected the name of the stat(2) system call to appear in the commands' log messages. Next, we expand and revise the "checkout: skip file symlink conflicts" and "pull: skip file symlink conflicts" tests so they confirm that the respective commands try to avoid writing through symbolic links which exist in the working tree at the locations where the commands intend to create or update files, regardless of the nature of the links' targets. In their initial form, these tests could only check the case where the targets of the symbolic links were directories, but now they can also check the commands' behaviour both when the links' targets do not exist and when the targets are files which contain Git LFS pointers identical to those of the corresponding paths in the Git repository. Previously, in such cases the commands would create or update files at the locations of the targets of the symbolic links. We then add two new tests, named "checkout: skip case-based symlink conflicts" and "pull: skip case-based symlink conflicts", which confirm that the respective commands do not write through symbolic links which exist in the working tree at the locations where the commands intend to create or update files, after those links are created by Git due to filename conflicts on case-insensitive filesystems. Like the other tests with symbolic links, we only run these new tests on Windows if the current system supports the creation of true symbolic links. In both our new and revised tests we run the "git lfs checkout" and "git lfs pull" commands at several directory levels in the working tree, in order to exercise the ability for these commands to be run in any subdirectory, a behaviour we have supported since PR #2641. We also confirm that the commands do not add the paths of the symbolic links to the Git index as they previously did because the commands assumed they had updated regular files at those locations. Note that while our new check in the DecodePointerFromFile() function avoids cases where a symbolic link already exists the working tree before we try to create or update a file at the same location, this check does not entirely prevent TOCTOU (time-of-check/time-of-use) races where a symbolic link might be created immediately after we check for its existence and before we attempt to create or open a file. In a subsequent commit we will address these concerns, at least in part, by changing the SmudgeToFile() method of the GitFilter structure in our "lfs" package to remove any existing file or link and always create a new file with the O_EXCL flag. This should help ensure we only ever create a new file and never write through a symlink that was added immediately after the DecodePointerFromBlob() function ran. Finally, note that other than the "git lfs checkout" and "git lfs pull" commands, the only other caller of the DecodePointerFromBlob() function is the "git lfs merge-driver" command, which is guaranteed by the context in which it runs to always open regular, temporary files created by Git. For this reason, we do not need to expand the test suite for the "git lfs merge-driver" command to check how it handles pre-existing symbolic links.
chrisd8088
added a commit
that referenced
this pull request
Oct 16, 2025
In PR #527 we introduced the "git lfs checkout" and "git lfs pull" commands and added some initial tests of those commands to our test suite, starting with the "checkout" test in what is now our t/t-checkout.sh test script. Since that test first appeared in commit 1d05552, it has validated the behaviour of the "git lfs checkout" command with a number of example files tracked as Git LFS objects, two of which are located within subdirectories. We later added a similar test of the "git lfs pull" command in commit 096b6da of the same PR, but without any files tracked as Git LFS objects that were not at the top level of the test repository. We eventually expanded this test to include one such file in commit 7158e3b of PR #2641. However, none of the tests in our t/t-checkout.sh or t/t-pull.sh test scripts exercise the relevant Git LFS commands with files that are contained within multiple levels of subdirectories. Before we adjust how our "git lfs checkout" and "git lfs pull" commands create subdirectories, we first revise our "checkout" and "pull" tests to create files within multiple levels of subdirectories, and to confirm that commands recreate the full set of subdirectories when they do not exist. Note that in our "pull" test in particular we now check that the -I option accepts arguments containing "**" pattern components, matching the checks we added to our "checkout" test in a previous commit.
chrisd8088
added a commit
that referenced
this pull request
Oct 16, 2025
Our "git lfs checkout" and "git lfs pull" commands, at present, follow any extant symbolic links when they populate the current working tree with files containing the content of Git LFS objects, even if the symbolic links point to locations outside of the working tree. This vulnerability has been assigned the identifier CVE-2025-26625. To partially address this vulnerability, we adjust the DecodePointerFromBlob() function in our "lfs" package to use the Lstat() function from the "os" package in the Go standard library instead of the Stat() function. This ensures that the DecodePointerFromBlob() function checks whether an irregular file or other directory entry already exists at the location where the "git lfs checkout" and "git lfs pull" commands intend to create or update a file. We then update a number of the tests that we added to the t/t-checkout.sh and t/t-pull.sh test scripts in previous commits, and now also add another pair of new tests to those scripts. First, we revise the "checkout: skip directory file conflicts", "pull: skip directory file conflicts", "checkout: skip directory symlink conflicts", and "pull: skip directory symlink conflicts" tests so that when they run on Unix systems, they now expect the name of the lstat(2) system call to appear in the log messages output by the "git lfs checkout" and "git lfs pull" commands. Previously, these tests expected the name of the stat(2) system call to appear in the commands' log messages. Next, we expand and revise the "checkout: skip file symlink conflicts" and "pull: skip file symlink conflicts" tests so they confirm that the respective commands try to avoid writing through symbolic links which exist in the working tree at the locations where the commands intend to create or update files, regardless of the nature of the links' targets. In their initial form, these tests could only check the case where the targets of the symbolic links were directories, but now they can also check the commands' behaviour both when the links' targets do not exist and when the targets are files which contain Git LFS pointers identical to those of the corresponding paths in the Git repository. Previously, in such cases the commands would create or update files at the locations of the targets of the symbolic links. We then add two new tests, named "checkout: skip case-based symlink conflicts" and "pull: skip case-based symlink conflicts", which confirm that the respective commands do not write through symbolic links which exist in the working tree at the locations where the commands intend to create or update files, after those links are created by Git due to filename conflicts on case-insensitive filesystems. Like the other tests with symbolic links, we only run these new tests on Windows if the current system supports the creation of true symbolic links. In both our new and revised tests we run the "git lfs checkout" and "git lfs pull" commands at several directory levels in the working tree, in order to exercise the ability for these commands to be run in any subdirectory, a behaviour we have supported since PR #2641. We also confirm that the commands do not add the paths of the symbolic links to the Git index as they previously did because the commands assumed they had updated regular files at those locations. Note that while our new check in the DecodePointerFromFile() function avoids cases where a symbolic link already exists the working tree before we try to create or update a file at the same location, this check does not entirely prevent TOCTOU (time-of-check/time-of-use) races where a symbolic link might be created immediately after we check for its existence and before we attempt to create or open a file. In a subsequent commit we will address these concerns, at least in part, by changing the SmudgeToFile() method of the GitFilter structure in our "lfs" package to remove any existing file or link and always create a new file with the O_EXCL flag. This should help ensure we only ever create a new file and never write through a symlink that was added immediately after the DecodePointerFromBlob() function ran. Finally, note that other than the "git lfs checkout" and "git lfs pull" commands, the only other caller of the DecodePointerFromBlob() function is the "git lfs merge-driver" command, which is guaranteed by the context in which it runs to always open regular, temporary files created by Git. For this reason, we do not need to expand the test suite for the "git lfs merge-driver" command to check how it handles pre-existing symbolic links.
hswong3i
pushed a commit
to alvistack/git-lfs-git-lfs
that referenced
this pull request
Oct 18, 2025
In PR git-lfs#527 we introduced the "git lfs checkout" and "git lfs pull" commands and added some initial tests of those commands to our test suite, which we have subsequently expanded over time. Before we adjust how these commands check and creates files, we want to further revise and expand their test suites to validate a broader range of conditions. In the initial "checkout" test of the t/t-checkout.sh test script we expand two sets of tests, both those we perform within a subdirectory of the current working tree, and those we perform with various "glob" file pattern arguments. After changing the working directory to a subdirectory of the working tree, at present the test suite only checks the behaviour of the "git lfs checkout" command when passed either a single relative file name and or the "." current directory name. We now add similar checks using first a bare ".." parent directory name and then using a relative path pattern to a sibling subdirectory combined with a trailing "/**" component, following the gitignore(5) rules we support for pattern matching. We also run the "git lfs checkout" command in the root of the working tree and check that it behaves as expected when passed the name of a subdirectory which already exists, and when that subdirectory's name is followed by a trailing "/**" pattern component. When the "git lfs checkout" command internally converts its file path pattern arguments into file path patterns that are relative to the root of the repository, rather being relative to the current working directory, it appends a trailing "/" character if the pattern is actually just the path to a directory (or a symbolic link to a directory). This change is implemented in the Convert() method of the currentToRepoPatternConverter structure in our "lfs" package, and was added in commit 56abb71 of PR git-lfs#4556 so as to ensure that arguments which resolve to specific directories are properly converted into patterns that match both the directories and all of their contents. Note, though, when we run the "git lfs checkout" command with a "." directory name argument while in a subdirectory of the working tree, the internal addition of a trailing "/" character does not actually affect the success of the command. This is because the Convert() of the currentToRepoPatternConverter structure first creates a path relative to the root of the repository, which necessarily includes the subdirectory's name as its final component. When used as a path pattern according to the gitignore(5) rules, then even without a trailing "/" character the pattern will still match the subdirectory and therefore its contents. On the other hand, the internal additional of a trailing "/" character does affect the success of the "git lfs checkout" command when we run the command in a subdirectory and provide a ".." argument. In this case, the Convert() method first converts the path into one relative to the root of the repository, which is simply the "." directory name, the same as if the user had supplied a "." argument to a "git lfs checkout" command run in the root of the working tree. Although "." is a valid file path, when treated as a pattern according to the gitignore(5) rules, it does not work to match a directory and all of its contents. To allow users to provide simple a "." argument and have it work as expected, the Convert() method first appends the "/" character, and then strips any leading "./" path component. If the result is an empty path, it is replaced with the "**" pattern component so the final pattern matches everything in the root of the repository. Thus our new check in the initial "checkout" test of the t/t-checkout.sh script where we run the "git lfs checkout" command in a subdirectory and pass a ".." argument, we validate the behaviour described above. The check would fail unless the command converted the ".." argument into a "." path relative to the root of the repository, and into into a "./" path, and then into the "**" pattern component. In parallel with our changes to the t/t-checkout.sh test script, we make a number of changes to the initial "pull" test of the t/t-pull.sh test script to ensure that test performs all its checks under a common set of conditions. We will further enlarge this set of conditions in subsequent commits. First, we correct a typo in the refute_server_object() assertion call that was added to the "pull" test in commit 7158e3b of PR git-lfs#2641. This commit revised the "git lfs pull" and "git lfs checkout" commands so that the paths they passed to the "git update-index" command were relative to the current working directory. To confirm this new behaviour worked as expected, the "pull" test was updated so it creates a third test file named "dir/dir.dat", and some of the checks in the test were then revised to reference this new file. However, the refute_server_object() assertion which is intended to prove that the corresponding Git LFS object does not exist on the test remote server passes an incorrect variable name, so we fix that now. Because the "dir/dir.dat" file and corresponding local Git LFS object file are not comprehensively referenced in all the checks of the "git lfs pull" command, we add the appropriate assertions for it into those checks. (As well, we revise several invocations of the rm(1) command to align with others in the same test and elsewhere.) The same commit 7158e3b also extended the "pull" test to run the "git lfs pull" command with its -I option under several conditions, but without removing the local Git LFS object files and working tree files, or checking that they have been re-created after the "git lfs pull" command is run. As a consequence, these checks effectively only confirm that the command does not fail. To rectify this limitation we delete all the relevant local files before each of these checks, and then verify that they have been restored by the "git lfs pull" command. We also run the "git lfs pull" command with a more restricted set of file path patterns to confirm that files which do not match those patterns are not fetched from the remote server; we then run the command again to fetch those files. Previously our checks did not actually test the file path filtering capability of the command. As well, we use a file path pattern which is relative to the root of the repository, so that when we run the "git lfs pull" command in a subdirectory of the current working tree and pass this pattern, our test now ascertains that patterns specified with the -I option are correctly matched with file paths starting from the root of the repository. (For the moment we do not check that the "git lfs pull" command accepts "glob" pattern components such as "**", but in a subsequent commit we will add patterns of this type to the arguments we pass to the -I option.) Making these changes, however, exposes another concern which would cause the "pull: with missing object" test to fail unless we adjust it as well. This test appears later in the same t/t-pull.sh test script, and depends on the repository clone created by the "pull" test in the local "clone" test directory. (Ideally, our tests should not depend on each other in this way, so that the failure of one test doesn't cause a series of otherwise unrelated tests to also fail. For the moment, though, we defer this issue to a future PR.) The "pull: with missing object" test was introduced in commit 68ac0f5 of PR git-lfs#2237 and checks that if a Git LFS object is missing both locally and on the remote server, the "git lfs pull" command should report the missing object and return a non-zero exit code, but should retrieve all other requested Git LFS objects. To simulate this condition, the test removes the Git LFS object corresponding to the "a.dat" test file in the local storage directories and from the remote test server, and then runs the "git lfs pull" command. However, the test does not remove the "a.dat" file from the working tree, and as noted above, is dependent on the state of the cloned repository left by the initial "pull" test. These two factors combine to make the test vulnerable to transient failures, particularly now that we have revised the "pull" test such that the final "git lfs pull" commands in that test re-create all the files in the working tree, except that we have also changed the "pull" test to drop the commit which adds an empty file, as we explain below. In commit d97c785 of PR git-lfs#4654 the "pull" test was revised to commit an empty file which matches the Git LFS file patterns, and then confirm that the "git lfs pull" command handles that case without trying to fetch the object. This check was added before the final checks of the "git lfs pull" command, which occur within a subdirectory of the working tree. As a result, those checks used to run with the empty file as one of the Git LFS objects, and so did the "pull: with missing object" test. In these instances, when "git lfs pull" runs, it finds the empty file matches the defined Git LFS file patterns, and is considered to exist locally by the LFSObjectExists() method of the Configuration structure in our "config" package. Therefore the command passes the object's pointer data to the Run() method of the singleCheckout structure, which invokes the DecodePointerFromFile() function in our "lfs" package. That function returns a successful result because it detects that the pointer refers to a zero-byte object, and so the RunToPath() method of the singleCheckout structure is executed, and then the file's path is passed to the Add() method of the gitIndexer structure. That method starts a "git update-index" command and writes the file's path to the Git command on its standard input file descriptor. We start the "git update-index" command with the --refresh option, so it reviews all of its cached index entries, including the one for the "a.dat" file. If that file in the working tree has the same modification timestamp as the Git index file (at least to within the same second, assuming Git was not compiled with the USE_NSEC macro definition) then Git considers the file to potentially be "racily clean", because it's file size and modification timestamp are insufficient to determine whether the file has changed since the index entry was cached. The "git update-index" command therefore reads the contents of the "a.dat" file in the working tree. Because the file's path matches a Git LFS file pattern, the data is streamed to the "git-lfs-filter-process" command, which regenerates a new object file in the local storage directories under ".git/lfs/objects". Unfortunately, the "pull: with missing object" test starts by removing the local and remote copies of the "a.dat" file's Git LFS object file, and expects that running the "git lfs pull" command will not result in a new local copy being generated, as none can be fetched from the remote test server. But if the "pull" test leaves the "a.dat" file in the working tree with an identical timestamp as the Git index file, and has also committed the "empty.dat" file to the Git history, then as described above, the "git lfs pull" invokes "git update-index" with the --refresh option, which finds the "a.dat" file and re-creates the local Git LFS object file, and so the test will fail. Note that without the presence of an empty file in the commit history, the "git lfs pull" command does not find any local Git LFS object files, and so it calls the Run() method of the singleCheckout structure after fetching the available objects from the remote server. The Run() method then executes the DecodePointerFromFile() function, which returns an error because the file in the working tree is not a raw Git LFS pointer, and so the Run() method skips calling the RunToPath() method and also does not pass the file's path to the "git update-index" command. In fact, the "git update-index" command will not be started at all, as there are no files in the working tree which could be considered to contain raw Git LFS pointer data, like the empty file is. Since the "git update-index" command does not run, the "git-lfs-filter-process" command is also never invoked, and so a local Git LFS object file is never re-created for the "a.dat" file, and the test succeeds. For this reason, we change the "pull" test so that it removes the "empty.dat" file from the Git commit history after performing the appropriate check of the "git lfs pull" command's behaviour with regard to empty files. To further guard against a potential regression of our tests and ensure the "pull: with missing object" test passes, we update that test to remove the "a.dat" file from the working tree as well as removing its Git LFS object files from both local and remote storage. Without the "a.dat" file in the working tree, it now becomes impossible for the local object file to be re-created under any circumstances. Finally, we strengthen our check of the log message output by the "git lfs pull" command so that we require a specific "does not exist" message rather than just the raw OID of the missing Git LFS object.
hswong3i
pushed a commit
to alvistack/git-lfs-git-lfs
that referenced
this pull request
Oct 18, 2025
After our "git lfs checkout" and "git lfs pull" commands write the contents of a Git LFS object into a file in the current working tree, they pass the file's relative path to the "git update-index" command, so the entries in Git's index for these files reflect their current state. If we did not run the "git update-index" command, Git commands like "git diff-index" would report the files we updated as having been modified in the working tree. Notably, though, the "git status" command would normally not report the files as modified, because it refreshes the Git index in the same way as the "git update-index" command we run. We invoke the "git update-index" command with the --refresh option, so it runs the refresh_index() function, which is also executed by the "git status" command. The refresh_index() function checks the state of all the cached entries in Git's index, and so it will detect that the files in the working tree have been modified, read their content and pass it to the "git lfs filter-process" command, which returns the same Git LFS pointer data as should already be in the Git index. The end result is that the index entries remain unchanged, but the cached modification timestamps are updated. As a consequence, even if we never invoked the "git update-index" command while running our "git lfs checkout" and "git lfs pull" commands, a subsequent "git status" command would typically not show the working tree files we update as modified. However, in our tests of the "git lfs pull" command, we rely exclusively on the "git status" command to try to verify that the "git update-index" command has been executed by the "git lfs pull" command. Moreover, the assert_clean_status() test helper function we invoke to run the "git status" command has a bug which means it always succeeds, even if the "git status" command were to report that the current working tree was not clean. We added the assert_clean_status() assertion function in commit 7158e3b of PR git-lfs#2641, along with calls to that function in the initial "pull" test of what is now our t/t-pull.sh test script. In the same commit we also revised the "git lfs checkout" and "git lfs pull" commands so that they pass paths relative to the current working directory to the "git update-index" command, rather than paths relative to the root of the repository. This latter change resolved the problem reported in git-lfs#2639, because by passing file paths relative to the repository root instead of the current working directory to the "git update-index" command, the "git lfs checkout" and "git lfs pull" commands were leaving the Git index and working tree in a confusing state. At the time when git-lfs#2639 was reported, running "git lfs pull" in a subdirectory might result in a subsequent "git status" command finding modifications to both the Git index and the working tree. To check that the problem was resolved, in PR git-lfs#2641 we introduced the assert_clean_status() test helper function, and expanded the "pull" test to call this function after performing several "git lfs pull" commands in subdirectories of the working tree. Unfortunately, the assert_clean_status() function does not behave as an assertion, because it always returns a successful (i.e., zero) exit code. The function tests whether the output from a "git status" command includes the message "working tree clean", and if not, reports the command's output along with that from a "git lfs status" command. However, the function does not cause the current test's sub-shell to stop execution and return a non-zero exit code to signal a test failure condition. We address this problem now by adding a call to the "exit" shell built-in command with a non-zero exit code at the end of the brace-delimited group command which executes if the assert_clean_status() function does not find the "working tree clean" message in the output from the "git status" command. This change ensures that if Git determines that the working tree is not clean, our assert_clean_status() function will cause the test which called it to fail. As well, we update the assert_clean_status() function so that when checking for the "working tree clean" message in the output of the "git status" command, it will also accept the older "working directory clean" message output by the "git status" command prior to Git v2.10.0. This allows our assertion function to succeed and the "pull" test to pass in our GitHub Actions CI jobs that run our test suite against Git v2.0.0, the oldest Git version we currently support. As a further problem, though, checking that a "git status" command finds a clean working tree is no longer sufficient to guarantee that the "git lfs pull" command has actually called the "git update-index" command with the appropriate file paths. To be certain that the Git LFS client has invoked the "git update-index" command and passed valid file paths, we also need to check the output of the "git diff-index" command, and do so before we run the "git status" command, which will cause Git to reset the file modification timestamps it has cached. We therefore define a new assert_clean_worktree() test helper function which runs a "git diff-index HEAD" command and confirms no unexpected changes are detected in the working tree, and we call this function at the start of the assert_clean_status() function before the "git status" command is executed. This ensures that we check the state of the working tree before the "git status" command refreshes the file modification timestamps Git has previously cached. As noted above, if we reversed the order of these checks, the "git diff-index" command would always return an empty list and so our check of the working tree's state would be defeated. We also define a new assert_clean_worktree_with_exceptions() test helper function, which acts like the assert_clean_worktree() function but filters the output of the "git diff-index HEAD" command with an extended regular expression pattern provided to the function as its only argument. This allows us to use the assert_clean_worktree_with_exceptions() function in instances where we expect certain files in the working tree to be absent or modified and so we want to ignore them when they appear in the "git diff-index" command's output. We then update the initial "pull" test in the t/t-pull.sh test script so that it calls either the assert_clean_status() function or the assert_clean_worktree_with_exceptions() function after each invocation of the "git lfs pull" command, thereby making the checks in this test as thorough and consistent with each other as possible. Finally, we update the initial "checkout" test in the t/t-checkout.sh test script so that it makes use of the assert_clean_status() and assert_clean_worktree_with_exceptions() functions in the same manner as the "pull" test in the t/t-pull.sh script now does. Both tests perform similar operations, and like the "git lfs pull" command, the "git lfs checkout" command also invokes the "git update-index" command. However, the "checkout" test was not updated in PR git-lfs#2641 when the assert_clean_status() was introduced and the "pull" test was updated to call it. Since we want our "checkout" test to verify that the "git lfs checkout" command successfully runs the "git update-index" command, we update the test so that it mirrors the "pull" test and consistently uses the assert_clean_worktree() and assert_clean_worktree_with_exceptions() functions to check the state of the working tree. To ensure that the "checkout" test still passes, though, we also need to adjust the test so that it removes any log files it creates because the assert_clean_status() function now behaves as was originally intended and will cause the calling test to fail if the working tree is not clean.
hswong3i
pushed a commit
to alvistack/git-lfs-git-lfs
that referenced
this pull request
Oct 18, 2025
In PR git-lfs#527 we introduced the "git lfs checkout" and "git lfs pull" commands and added some initial tests of those commands to our test suite, starting with the "checkout" test in what is now our t/t-checkout.sh test script. Since that test first appeared in commit 1d05552, it has validated the behaviour of the "git lfs checkout" command with a number of example files tracked as Git LFS objects, two of which are located within subdirectories. We later added a similar test of the "git lfs pull" command in commit 096b6da of the same PR, but without any files tracked as Git LFS objects that were not at the top level of the test repository. We eventually expanded this test to include one such file in commit 7158e3b of PR git-lfs#2641. However, none of the tests in our t/t-checkout.sh or t/t-pull.sh test scripts exercise the relevant Git LFS commands with files that are contained within multiple levels of subdirectories. Before we adjust how our "git lfs checkout" and "git lfs pull" commands create subdirectories, we first revise our "checkout" and "pull" tests to create files within multiple levels of subdirectories, and to confirm that commands recreate the full set of subdirectories when they do not exist. Note that in our "pull" test in particular we now check that the -I option accepts arguments containing "**" pattern components, matching the checks we added to our "checkout" test in a previous commit.
hswong3i
pushed a commit
to alvistack/git-lfs-git-lfs
that referenced
this pull request
Oct 18, 2025
Our "git lfs checkout" and "git lfs pull" commands, at present, follow any extant symbolic links when they populate the current working tree with files containing the content of Git LFS objects, even if the symbolic links point to locations outside of the working tree. This vulnerability has been assigned the identifier CVE-2025-26625. To partially address this vulnerability, we adjust the DecodePointerFromBlob() function in our "lfs" package to use the Lstat() function from the "os" package in the Go standard library instead of the Stat() function. This ensures that the DecodePointerFromBlob() function checks whether an irregular file or other directory entry already exists at the location where the "git lfs checkout" and "git lfs pull" commands intend to create or update a file. We then update a number of the tests that we added to the t/t-checkout.sh and t/t-pull.sh test scripts in previous commits, and now also add another pair of new tests to those scripts. First, we revise the "checkout: skip directory file conflicts", "pull: skip directory file conflicts", "checkout: skip directory symlink conflicts", and "pull: skip directory symlink conflicts" tests so that when they run on Unix systems, they now expect the name of the lstat(2) system call to appear in the log messages output by the "git lfs checkout" and "git lfs pull" commands. Previously, these tests expected the name of the stat(2) system call to appear in the commands' log messages. Next, we expand and revise the "checkout: skip file symlink conflicts" and "pull: skip file symlink conflicts" tests so they confirm that the respective commands try to avoid writing through symbolic links which exist in the working tree at the locations where the commands intend to create or update files, regardless of the nature of the links' targets. In their initial form, these tests could only check the case where the targets of the symbolic links were directories, but now they can also check the commands' behaviour both when the links' targets do not exist and when the targets are files which contain Git LFS pointers identical to those of the corresponding paths in the Git repository. Previously, in such cases the commands would create or update files at the locations of the targets of the symbolic links. We then add two new tests, named "checkout: skip case-based symlink conflicts" and "pull: skip case-based symlink conflicts", which confirm that the respective commands do not write through symbolic links which exist in the working tree at the locations where the commands intend to create or update files, after those links are created by Git due to filename conflicts on case-insensitive filesystems. Like the other tests with symbolic links, we only run these new tests on Windows if the current system supports the creation of true symbolic links. In both our new and revised tests we run the "git lfs checkout" and "git lfs pull" commands at several directory levels in the working tree, in order to exercise the ability for these commands to be run in any subdirectory, a behaviour we have supported since PR git-lfs#2641. We also confirm that the commands do not add the paths of the symbolic links to the Git index as they previously did because the commands assumed they had updated regular files at those locations. Note that while our new check in the DecodePointerFromFile() function avoids cases where a symbolic link already exists the working tree before we try to create or update a file at the same location, this check does not entirely prevent TOCTOU (time-of-check/time-of-use) races where a symbolic link might be created immediately after we check for its existence and before we attempt to create or open a file. In a subsequent commit we will address these concerns, at least in part, by changing the SmudgeToFile() method of the GitFilter structure in our "lfs" package to remove any existing file or link and always create a new file with the O_EXCL flag. This should help ensure we only ever create a new file and never write through a symlink that was added immediately after the DecodePointerFromBlob() function ran. Finally, note that other than the "git lfs checkout" and "git lfs pull" commands, the only other caller of the DecodePointerFromBlob() function is the "git lfs merge-driver" command, which is guaranteed by the context in which it runs to always open regular, temporary files created by Git. For this reason, we do not need to expand the test suite for the "git lfs merge-driver" command to check how it handles pre-existing symbolic links.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixes #2639 by ensuring files are added to the git index with a path relative to the current working directory.