Support for Git LFS in private repositories #4623

Open
FPtje opened this issue Mar 9, 2021 · 18 comments · May be fixed by #8186
Labels
fetching: Networking with the outside (non-Nix) world, input locking

Comments

@FPtje
Contributor

FPtje commented Mar 9, 2021

Nixpkgs PRs NixOS/nixpkgs#105998 and NixOS/nixpkgs#113580 add support for Git LFS to the Nixpkgs fetchgit function. The problem with fetchgit, however, is that it does not properly support private repositories. Nix's builtins.fetchGit does support private repositories, but it does not seem to support Git LFS.

Currently, when trying to builtins.fetchGit a repository with LFS, the following happens:

nix-repl> builtins.fetchGit {url = "[email protected]:my_company/private-lfs-repo.git"; rev = "some_rev";}
Downloading some/lfs/file (123 KB)
Error downloading object: some/lfs/file (a123456): Smudge error: Error downloading some/lfs/file (some_rev): batch request: missing protocol: ""

Errors logged to /home/my-user/nix/gitv2/xxx/lfs/logs/20210309T095658.11111111.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
error: program 'git' failed with exit code 128

Ideally, it should be possible to builtins.fetchGit the repo either with or without downloading the LFS files. In one use case, the LFS files are used for non-vital things, like tests or documentation. The nix derivations do not depend on those files. Not downloading the LFS files would save space. In another use case, the LFS files are needed to build the derivations, and should therefore be downloaded.

It is possible to export GIT_LFS_SKIP_SMUDGE=1 to accomplish the first use case (i.e. fetch a private LFS repository without actually downloading the LFS files), but it would be much nicer to have it as an option of the builtins.fetchGit function.

@roberth
Member

roberth commented Mar 13, 2021

#4635 has the potential to fix the first use case by default

the LFS files are used for non-vital things, like tests or documentation.

Did you configure LFS globally in your git user config?
I now realize git global user config may affect more places than what I've found with my testing.

@FPtje
Contributor Author

FPtje commented Mar 15, 2021

Did you configure LFS globally in your git user config?

Yes, the following section is present in ~/.gitconfig:

[filter "lfs"]
        clean = git-lfs clean -- %f
        smudge = git-lfs smudge -- %f
        process = git-lfs filter-process
        required = true

@stale

stale bot commented Sep 14, 2021

I marked this as stale due to inactivity.

@arximboldi

For some projects I am working on LFS is crucial. I hope this gets solved soon.

@stale stale bot removed the stale label Oct 9, 2021
@stale

stale bot commented Apr 16, 2022

I marked this as stale due to inactivity.

@stale stale bot added the stale label Apr 16, 2022
@arximboldi

Still relevant.

@reivilibre

reivilibre commented Oct 6, 2022

This error also pops up when using a git repository that uses LFS as a flake input, or seemingly even just by having a flake in a repository with LFS (cf. NixOS/nixpkgs#137998).
I didn't expect it, but export GIT_LFS_SKIP_SMUDGE=1 also seems to work around the problem with flakes, as long as you don't care about the LFS files.

@chadac chadac linked a pull request Apr 7, 2023 that will close this issue
wsx-udscbt added a commit to wsx-udscbt/cosmic-comp that referenced this issue Jun 2, 2023
@roberth roberth added the fetching Networking with the outside (non-Nix) world, input locking label Jun 2, 2023
@silky
Member

silky commented Jul 3, 2023

but .... what if you do care about the LFS files .... 🥲 🥲 🥲 🥲 🥲 🥲 🥲

@roberth
Member

roberth commented Jul 3, 2023

I think the plan for this would be

  • Merge the libfetchers changes from Lazy trees #6530
  • Implement LFS support in the fetcher. Could be smudge filter-based + whitelist of smudge filters, or something more hardcoded. (We don't want to allow general smudge support because that's impure, but could be a convenient implementation strategy - or not)
  • Add a parameter to the git fetcher. I think we'll eventually want three modes
    • Lazy LFS: fetch any LFS file when it is needed. This will tend to be sequential. Most versatile mode, and a sensible default.
    • Eager LFS: fetch all LFS files simultaneously. This will be faster when you know you need all LFS files.
    • No LFS: quick, even if we're copying the whole flake, which we may have to do until the libexpr part of Lazy trees #6530 is figured out. Alternatively, this mode could be a filter of which files to ignore / fetch eagerly / fetch lazily.
  • Implement the double fetching protocol where we fetch and load flake.nix once to figure out the fetch parameters, and then fetch and load again if needed

@janvogt

janvogt commented Nov 1, 2023

GitLab now forces free users to use LFS in many cases, so I guess this will become a lot more relevant.

@SomeoneSerge

AFAIU, this is not specific to private repos:

builtins.fetchGit {
  url = "https://huggingface.co/openlm-research/open_llama_3b";
  rev = "141067009124b9c0aea62c76b3eb952174864057";
};

...fails in the same way:

...
Downloading pytorch_model.bin (6.9 GB)
Error downloading object: pytorch_model.bin (9ffd42d): Smudge error: Error downloading pytorch_model.bin (9ffd42dc58c4f49154e98bc7796306fde40febef278e99636a240a731d626a4a): batch request: missing protocol: ""

Errors logged to '/home/.../.cache/nix/gitv3/14avjqj1kcsaj6025lqgbr5r4yz680zmj1xzppc13cgxx12i8dj3/lfs/logs/20231227T021723.995860432.log'.
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: pytorch_model.bin: smudge filter lfs failed
error:
       … while calling the 'fetchGit' builtin
...

@newAM
Member

newAM commented Jan 2, 2024

@SomeoneSerge for huggingface this worked for me:

fetchgit {  # from `pkgs`, not `builtins`, may not matter?
  url = "https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2";
  rev = "b70aa86578567ba3301b21c8a27bea4e8f6d6d61";
  hash = "sha256-IAe/tHFB7yqFRF5aRojkNCD8TbKj8XQMt6eEyPmr4HU=";
  fetchLFS = true;
}

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/flake-lfs-input/40184/2

@bratorange

Is there currently a workaround for fetching Nix flake inputs with LFS?

@tengkuizdihar

@bratorange I think the only way is to create a tar.gz that includes all of the LFS files

@lriesebos

@bratorange I think the only way is to create a tar.gz that includes all of the LFS files

@tengkuizdihar do you know if there is a way to directly do that through a github/gitlab link? also, I assume that does not allow things like ssh authentication in case of private repos.

@tengkuizdihar

nope no idea

@roberth
Member

roberth commented Oct 1, 2024

If you are fetching another repo, you could use fetchgit from Nixpkgs, which produces fixed output derivations. (This means adding a hash attribute, and accepting Import From Derivation if you need expressions from there or builtins.readFile etc)
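
A minimal sketch of that approach, assuming a publicly reachable repository (as noted at the top of this issue, fetchgit does not properly support private ones). The URL, rev, and file name are placeholders rather than a real project, and lib.fakeHash is only a stand-in until the first build reports the actual hash:

```nix
# Sketch only: url, rev and README.md below are hypothetical placeholders.
let
  pkgs = import <nixpkgs> { };

  # Fixed-output derivation: the hash pins the whole checkout, LFS files included.
  src = pkgs.fetchgit {
    url = "https://example.org/some-org/lfs-repo.git"; # hypothetical
    rev = "0000000000000000000000000000000000000000"; # hypothetical
    fetchLFS = true; # same attribute as in the Hugging Face example earlier in the thread
    hash = pkgs.lib.fakeHash; # replace with the hash reported by the first (failing) build
  };
in
{
  inherit src;

  # Reading from the fetched tree during evaluation is Import From Derivation:
  # Nix has to build `src` before it can finish evaluating this attribute.
  readme = builtins.readFile "${src}/README.md"; # hypothetical file
}
```

Evaluating only `src` does not trigger Import From Derivation; the `readme` attribute does, because its value depends on the contents of the built checkout.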

For local LFS files in flakes, the only option is for libfetchers to support it. @b-camacho has a WIP PR; maybe he could use some help:
