Storing big dependencies in one single global location #2125
Comments
So if people are interested in this, there are some constraints - tools like NVM will switch the NPM global package location. So what I am thinking is that Yarn or NPM or whoever should choose another truly global location of modules to store things in one place. In other words, |
@ORESoftware Your proposal is awesome, and in fact, there is a global location where Yarn stores the modules, it is What we can do with this?
Now, the ultimate problem with all this is that symlinks are broken in Node right now. However, this is being fixed right now! See #2133 and nodejs/node#10107. /cc @phestermcs 1 — |
Apart from that, just the first step shrank my node_modules on Songbee/desktop from 188,4 MB down to 137,6 MB. Here's a simple Python script I used. |
I am going to try to understand what you said as best I can. What I was thinking would require a structure similar to that in ~/.npm, where the structure is like this: I am not sure why
although .tgz archives would be limiting because the code is compressed, so it would be better to just have this instead:
where so it would be something like
I am not sure I follow the reasoning you have about when it would be ok to symlink and when it wouldn't, I just want to make sure it's clear that I am talking about having all relevant versions of all relevant NPM packages in the cache directory. |
Updated the script, now my node_modules is only 83,5 kB (what?!). @ORESoftware, that's because Yarn doesn't store (or symlink) packages' dependencies in
and replace
This can be solved by adding symlinks in |
Yeah, I see no reason to have anything but symlinks in the |
That's because node isn't working with symlinks correctly right now. Again, see nodejs/node#10107 for more info on that. |
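One commonly cited part of the symlink problem is that Node resolves a symlinked package to its real path by default, so the package "sees" the store location rather than the project it was linked into. The following is a minimal sketch of that behaviour, not code from this thread: the directory names and the helper functions `makeSymlinkedProject` and `whereIsPkg` are made up for illustration. It compares the default behaviour with Node's `--preserve-symlinks` flag.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { execFileSync } = require('child_process');

// Build a throwaway "store" plus a project whose node_modules/pkg is a
// symlink into that store. All names here are hypothetical.
function makeSymlinkedProject() {
  const root = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'symlink-demo-')));
  const storePkg = path.join(root, 'store', 'pkg');
  const project = path.join(root, 'project');
  fs.mkdirSync(storePkg, { recursive: true });
  fs.mkdirSync(path.join(project, 'node_modules'), { recursive: true });
  // The package simply reports the directory it believes it lives in.
  fs.writeFileSync(path.join(storePkg, 'index.js'), 'module.exports = __dirname;\n');
  fs.symlinkSync(storePkg, path.join(project, 'node_modules', 'pkg'), 'dir');
  return { root, storePkg, project };
}

// Ask a child Node process, started inside the project, where `pkg` lives.
function whereIsPkg(project, nodeFlags = []) {
  const args = [...nodeFlags, '-e', "process.stdout.write(require('pkg'))"];
  return execFileSync(process.execPath, args, { cwd: project, encoding: 'utf8' });
}

const demo = makeSymlinkedProject();
const defaultDir = whereIsPkg(demo.project);
const preservedDir = whereIsPkg(demo.project, ['--preserve-symlinks']);
console.log('default:            ', defaultDir);
console.log('--preserve-symlinks:', preservedDir);
```

With the default settings the package reports the store directory, which means any further `require` calls it makes walk up from the store, not from the project. With `--preserve-symlinks` it reports the path inside the project's node_modules.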
cool thanks |
The symlink discussion is a little over my head ATM but I will follow it |
Well, regarding the symlink discussion: the pnpm package manager seems to be using the symlink methodology we are discussing here. I am sure you have heard of pnpm, but if y'all haven't, maybe open up talks with pnpm. TBH, I think it's in pnpm's best interest to merge with this lib somehow, or at least merge features. |
@ORESoftware FWIW when installing with pnpm, it's still creating copies of modules for that install, but only one per module. It does use symlinks, but only from within the local install to the local copy, and not to a machine level copy. Being able to symlink to a machine level store would mean after the first install, the second would take literally about 1-2 seconds (i.e. once a module is on your machine, it can then be symlinked to whenever used again, in however many projects used). Using a machine level store is not possible with Further, because of an inefficiency in how Lastly, because of how
|
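The machine-level store idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `<store>/<name>@<version>` layout, the `storePathFor` and `linkFromStore` helpers, and the `example-pkg` package are all hypothetical and are not how pnpm, ied, or Yarn actually lay out their stores.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Assumed layout: <store>/<name>@<version>/ holds one unpacked package.
function storePathFor(storeDir, name, version) {
  return path.join(storeDir, `${name}@${version}`);
}

// Point <project>/node_modules/<name> at the store copy and return the link path.
function linkFromStore(storeDir, projectDir, name, version) {
  const target = storePathFor(storeDir, name, version);
  if (!fs.existsSync(target)) {
    throw new Error(`${name}@${version} is not in the store yet; fetch it first`);
  }
  const link = path.join(projectDir, 'node_modules', name);
  fs.mkdirSync(path.dirname(link), { recursive: true }); // also covers @scope/ names
  fs.rmSync(link, { recursive: true, force: true });     // replace any existing copy
  fs.symlinkSync(target, link, 'dir');
  return link;
}

// Demo: two projects share a single store copy of the same package version.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'store-demo-'));
const storeDir = path.join(root, 'store');
fs.mkdirSync(storePathFor(storeDir, 'example-pkg', '1.0.0'), { recursive: true });
const links = ['app-a', 'app-b'].map((app) =>
  linkFromStore(storeDir, path.join(root, app), 'example-pkg', '1.0.0'));
console.log(links.map((link) => fs.realpathSync(link)));
```

After the first install has populated the store, a second project only pays for creating a symlink, which is where the "1-2 seconds" estimate above comes from. Whether Node then resolves the linked package's own dependencies correctly is a separate question.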
@phestermcs interesting |
@phestermcs actually pnpm uses a global store, so it saves a package only once per machine. More details about it here. ied currently uses a local store, so it does make a single copy of a dependency per package. However, pnpm and ied collaborators are working on some shared specs, and we all agreed that a single store per machine is the way to go. Here's the store spec (still in draft). Here we discuss the collaboration and specs. P.S. @phestermcs what you are doing is awesome! ❤️ |
A few ideas:
|
Honestly, this might sound crazy, but just have the user change NODE_PATH in their bashrc or zshrc. That way, if the user wants to install something locally, so be it, and that overrides everything; otherwise resolution falls back to the modules found in NODE_PATH. I don't know how you could hot-patch require reliably for every possible entry point of an application. NODE_PATH is reliable and is no longer going to be deprecated. However, I don't know how this would work for the root user. |
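The fallback order described in this comment (local node_modules first, then NODE_PATH) can be checked with a small experiment. This is a sketch, not part of the original proposal: the `global_store` directory, the `demo-pkg` package, and the helpers `writePackage` and `requireInChild` are made up for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { execFileSync } = require('child_process');

// Create a one-file package that exports a label telling us which copy loaded.
function writePackage(dir, label) {
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, 'index.js'), `module.exports = ${JSON.stringify(label)};\n`);
}

// Run `require(request)` in a child Node process with a given cwd and NODE_PATH.
function requireInChild(cwd, nodePath, request) {
  const script = `process.stdout.write(require(${JSON.stringify(request)}))`;
  return execFileSync(process.execPath, ['-e', script], {
    cwd,
    env: { ...process.env, NODE_PATH: nodePath },
    encoding: 'utf8',
  });
}

const root = fs.mkdtempSync(path.join(os.tmpdir(), 'node-path-demo-'));
const globalStore = path.join(root, 'global_store'); // hypothetical store location
const project = path.join(root, 'project');
fs.mkdirSync(project, { recursive: true });
writePackage(path.join(globalStore, 'demo-pkg'), 'global copy');

// 1. Nothing installed locally: Node falls back to NODE_PATH.
const fallback = requireInChild(project, globalStore, 'demo-pkg');

// 2. After a local install, the local copy takes precedence.
writePackage(path.join(project, 'node_modules', 'demo-pkg'), 'local copy');
const override = requireInChild(project, globalStore, 'demo-pkg');

console.log({ fallback, override });
```

The first lookup finds the copy in the store and the second finds the local one, which matches the "local overrides everything" behaviour the comment relies on.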
The problem would then be that it's impossible to store several versions of a package in the global store.
There aren't that many entry points in applications nowadays, and that's why require hooks work (e. g. see babel-register). Patching require isn't really a new thing, too. |
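For comparison, a require hook along these lines might look like the sketch below. It wraps `Module._resolveFilename`, which is an undocumented Node internal that hooks of this kind commonly patch, so treat it as an assumption that may change between Node versions. The `installStoreFallback` helper, the `<store>/<name>@<version>` layout, and the `hook-demo-pkg` package are hypothetical. The hook keeps Node's normal precedence and only consults the store when the usual lookup fails.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const Module = require('module');

// Wrap Node's resolver: try the normal lookup first, and only when it fails
// with MODULE_NOT_FOUND fall back to a pinned version in the store.
// `pinned` maps package names to exact versions, e.g. { foo: '1.0.0' }.
function installStoreFallback(storeDir, pinned) {
  const original = Module._resolveFilename; // undocumented internal
  Module._resolveFilename = function (request, ...rest) {
    try {
      return original.call(this, request, ...rest);
    } catch (err) {
      const isPinned = Object.prototype.hasOwnProperty.call(pinned, request);
      if (err.code !== 'MODULE_NOT_FOUND' || !isPinned) throw err;
      const target = path.join(storeDir, `${request}@${pinned[request]}`);
      return original.call(this, target, ...rest);
    }
  };
  // Return a function that removes the hook again.
  return () => { Module._resolveFilename = original; };
}

// Demo: a package that exists only in the store becomes requirable.
const storeDir = fs.mkdtempSync(path.join(os.tmpdir(), 'hook-demo-'));
const pkgDir = path.join(storeDir, 'hook-demo-pkg@1.0.0');
fs.mkdirSync(pkgDir);
fs.writeFileSync(path.join(pkgDir, 'index.js'), "module.exports = 'from store';\n");

const uninstall = installStoreFallback(storeDir, { 'hook-demo-pkg': '1.0.0' });
const loaded = require('hook-demo-pkg');
uninstall();
console.log(loaded);
```

Because the version is chosen per request from the `pinned` map, several versions of one package can sit side by side in the store, which a plain NODE_PATH entry cannot express. The hook has to be installed before any other module is loaded, which is the entry-point concern raised earlier.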
well you might have to add 1000 entries to NODE_PATH
so NODE_PATH would look like:
maybe you would only add items to NODE_PATH if they were in your package.json file |
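A rough sketch of that last idea, building NODE_PATH only from what package.json lists, could look like this. It assumes a hypothetical store layout where `<store>/<name>/<version>/` contains a folder named after the package, and it assumes package.json pins exact versions instead of semver ranges. The `nodePathFor` helper is invented for illustration.

```javascript
const path = require('path');

// Assumed layout: <store>/<name>/<version>/<name>/ holds the package, so
// <store>/<name>/<version> is a directory Node can search for <name>.
// Versions are assumed to be exact (no ^ or ~ ranges).
function nodePathFor(pkgJson, storeDir, existing = process.env.NODE_PATH || '') {
  const deps = { ...pkgJson.dependencies, ...pkgJson.devDependencies };
  const entries = Object.entries(deps).map(([name, version]) =>
    path.join(storeDir, name, version));
  // Keep whatever was already on NODE_PATH instead of overwriting it.
  const kept = existing.split(path.delimiter).filter(Boolean);
  return [...new Set([...entries, ...kept])].join(path.delimiter);
}

const example = nodePathFor(
  { dependencies: { 'pkg-a': '1.2.3' }, devDependencies: { 'pkg-b': '4.5.6' } },
  '/home/me/.global_store',
  '/usr/lib/node_modules'
);
console.log(example);
```

This yields one NODE_PATH entry per direct dependency. Transitive dependencies are not covered, which is where the "1000 entries" worry comes from.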
@iamale Well, is patching require fundamentally different than adding to NODE_PATH? Not as far as I can tell, all it does is change where node looks for deps, right? Unless, of course, you change the precedence rules of the require function, then I guess it would be different. But I don't see the need to change the precedence rules, if you have local modules, those should probably be picked up first. |
@ORESoftware Interesting. But crazy. But interesting. |
I frankly just don't know how performant it would or wouldn't be to add a lot of entries to NODE_PATH, I'd have to look at the source |
I think it would be really bad in terms of performance. Although a simple test is what we need here. |
Well, as long as there aren't that many folders in each NODE_PATH entry, it should be OK, not great, but doable |
You don't ever want to set |
No, not at all, @ljharb. If you do this in your bashrc / zshrc / bash_profile:
you overwrite the existing NODE_PATH, so what you just said is false :) |
@ORESoftware Also, it wouldn't solve a situation with multiple dependency versions, would it? If our app depends on |
AFAICT it would be an abusable system to depend on users to properly set NODE_PATH. But if the package manager told users exactly what to do, it would be reasonable to depend on the users to play along. I think it would all work with NODE_PATH, but I just don't know how well it would work |
@ORESoftware there is no existing NODE_PATH by default in node - if you have one, your user profile is setting it. If you do that, then you'll be able to |
ok, but you were just saying that using NODE_PATH would point to global modules, I don't know about that. And of course, x, is just an example. I was talking about using NODE_PATH to point to globally installed cache of modules, for example, ~/.global_store, not the modules installed by npm install -g... |
Right. I'm saying that all global modules should be in |
well, this whole conversation is about moving all local modules into a single store to prevent duplication, I think we just are arguing about semantics right now. I was suggesting, merely as an idea, to use NODE_PATH to solve the problem. |
I don't think NODE_PATH would make anything much more complicated than a bunch of symlinks, which sounds no less complicated :) |
@iamale I think I know what you are saying. As a POC, I would just add every single path in .global_store to NODE_PATH. At the moment, npm install puts all the dependencies in ./node_modules. Let's assume that is a flat structure (Yarn apparently makes it flat), so it can be made flat easily. Instead of writing those flat directories to ./node_modules, you'd write them to .global_store, in a flat way, and then add all the new dirs to NODE_PATH. The problem, of course, is that multiple babel versions are now on NODE_PATH, so which one will your program require from? LOL, IDK. Now I am seeing a little more why NPM works the way it does :) |
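The "which version wins?" question can be answered experimentally: with two versions of one package reachable through NODE_PATH, Node picks whichever entry comes first. The sketch below uses a made-up `demo-pkg` with a hypothetical `<root>/<version>/demo-pkg` layout, and the `resolveVersion` helper is invented for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { execFileSync } = require('child_process');

// Lay out two versions of the same (hypothetical) package side by side:
//   <root>/1.0.0/demo-pkg/index.js  and  <root>/2.0.0/demo-pkg/index.js
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'node-path-versions-'));
for (const version of ['1.0.0', '2.0.0']) {
  const dir = path.join(root, version, 'demo-pkg');
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, 'index.js'), `module.exports = '${version}';\n`);
}

// Require demo-pkg in a child process with NODE_PATH listing the versions
// in the given order, and report which version was actually loaded.
function resolveVersion(order) {
  const nodePath = order.map((v) => path.join(root, v)).join(path.delimiter);
  return execFileSync(process.execPath, ['-e', "process.stdout.write(require('demo-pkg'))"], {
    cwd: root,
    env: { ...process.env, NODE_PATH: nodePath },
    encoding: 'utf8',
  });
}

const oldFirst = resolveVersion(['1.0.0', '2.0.0']);
const newFirst = resolveVersion(['2.0.0', '1.0.0']);
console.log({ oldFirst, newFirst });
```

The loaded version depends only on the order of NODE_PATH entries, not on what the requiring package declared. A single process-wide NODE_PATH therefore cannot serve two dependents that need different versions.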
Actually, in my experience NODE_PATH is set by default; try it on your system. It appears to be set to $(npm root -g). This could be overridden like I said above. |
I am not sure whether mine is set because I use NVM or not:
echo $NODE_PATH
/usr/lib/nodejs:/usr/lib/node_modules:/usr/share/javascript
wonder wth |
@ljharb I see this issue. Do you happen to know if NVM reversed course and now sets NODE_PATH? |
@ORESoftware no, |
First of all, I really doubt they will ever fully deprecate NODE_PATH. Such a thing is a mainstay in pretty much every programming environment that I am familiar with, e.g. the Java classpath => https://en.wikipedia.org/wiki/Classpath_(Java). I doubt there is some technical reason that it won't work with ES6 modules. But yep, you appear to be correct that $NODE_PATH is empty upon a fresh install of node; I just tested it. So for some reason some foreign agent is changing NODE_PATH on my system as is, weird. |
Another concern that should be considered is that "referencing" deps in I for one have a mac and windows PC. Project lives on windows PC and folder is shared over network, I have this setup to test electron apps on both platforms. |
Closing; we won't support symlink by default as we've designed Plug'n'Play which fits our bill better by entirely removing the node_modules from the equation. More info: https://yarnpkg.github.io/berry/features/pnp |
Feature request, on the latest versions of most everything.
Creating libraries for NPM, often times we see locally installed modules taking up a lot of space on disk - tapjs/tapjs#333
IMO there is no reason NPM cannot act more like Maven and put all modules and every different version of each module in the same global location on disk. That way we don't store the same version of the same package twice on the same machine.
Just curious if Yarn considers this an actual problem to be solved and if anyone is working on solving it. Thanks!