Use cached database on running "Install" command #1645

Closed
eugenesvk opened this issue Jul 19, 2023 · 10 comments

@eugenesvk

eugenesvk commented Jul 19, 2023

Update: there seem to be two things that could be implemented to reduce the needless delay:

  • (solves delay on repeats) fix some bug with "resolving commit timestamps"
  • (solves delay on first start) allow the pre-downloaded cache to be used even after ST restart to avoid slow GitHub API calls (the update should happen during regular update windows as defined by auto_upgrade_frequency or a similar var)

Every time I run an Install Package command, Package Control takes some time (I guess to update the database).
I'd like to avoid that step and always use the cached version to avoid any annoying delays.
Then the update should happen periodically in the background (I don't need to get the latest on every single run).
Or at least it should be delayed until you actually install something.

@deathaxe
Collaborator

Package Control uses two levels of caching for channel/repository information.

A fully evaluated list of packages is held in RAM for the time specified by the "cache_length" setting (5 minutes by default).

After that, channel information is re-fetched from packagecontrol.io, preferring locally cached data stored on disk if the server returns 304. The delay before the quick panel appears is then mainly determined by the server response time; no downloads take place.

You can increase cache_length, but be aware that nothing will be fetched during this period: no channel data and no upstream information from git-tracked packages.
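
As a rough illustration of the two levels described above, here is a minimal Python sketch. It is not Package Control's actual code: the class `TwoLevelCache`, the `fetch` callback, the injectable `clock`, and the use of an ETag for revalidation are all assumptions made for the example. Only the `cache_length` idea and the "reuse the disk copy on 304" behaviour come from the description.

```python
import time


class TwoLevelCache:
    """Hypothetical sketch: a RAM cache with a TTL in front of a revalidated disk copy."""

    def __init__(self, fetch, cache_length=300, clock=time.monotonic):
        # fetch(url, etag) -> (status, etag, body); stands in for the HTTP layer.
        self._fetch = fetch
        self._cache_length = cache_length  # seconds an entry stays valid in RAM
        self._clock = clock
        self._ram = {}   # url -> (expires_at, data)
        self._disk = {}  # url -> (etag, data); a dict stands in for files on disk

    def get(self, url):
        entry = self._ram.get(url)
        if entry is not None and self._clock() < entry[0]:
            return entry[1]  # level 1 hit: no request is sent at all

        # Level 1 miss: ask the server, offering the validator of the disk copy.
        etag, cached = self._disk.get(url, (None, None))
        status, new_etag, body = self._fetch(url, etag)
        if status == 304 and cached is not None:
            data = cached  # unchanged upstream: reuse the disk copy, no download
        else:
            data = body
            self._disk[url] = (new_etag, data)

        self._ram[url] = (self._clock() + self._cache_length, data)
        return data
```

In this shape, raising `cache_length` only lengthens the window in which no request is sent at all; once it expires, the cost of a lookup is one round trip to the server even when nothing has changed.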

Then the update should happen periodically in the background

It is not planned to perform periodic tasks in the background beyond the initial auto-update and package maintenance upon startup. Installing or updating packages is not a priority-one task that happens every 10 minutes during normal work, so periodically fetching package information is not justified.

Or at least it should be delayed until you actually install something

Fetching channel data (list of packages) is a crucial part of displaying a list of installable packages and can't be delayed as described. This request just doesn't make sense.

@eugenesvk
Author

You can increase cache_length, but be aware nothing will be fetched during this period - no channel data and no upstream infos from git tracked packages.

I'd like to set this to ∞, but I don't want to waste that memory forever for the rare use of the install command. I guess what I'm asking for is to be able to skip the server query stage completely and always use the locally cached data; then I'll get the list without any unneeded delay.

not planned to perform periodic tasks in the background beyond initial auto-update and package maintenance upon startup

That's fine, I don't need it more frequently; updating the local database on the auto_upgrade_frequency schedule is OK. I'll just set it to 0 in those rare cases where I need to make sure I get the latest package.

@deathaxe
Collaborator

Actually, the RAM cache isn't actively cleared, as doing so doesn't have any effect on plugin_host's RAM usage (checked on Windows). Thus the package list resides in RAM until the next package operation (instantiation of PackageManager) anyway.

To check that yourself just call:

>>> from package_control import cache
>>> cache.clear_cache()

The HTTP cache, on the other hand, is an implementation detail of the downloader. It has an option to prefer locally cached files without sending a request, but that would

a) cause an updated channel/repo not to be fetched before http_cache_length (1 week) has expired, so no updates would be received in the meantime.
b) not make any difference compared to increasing cache_length.

After 5 minutes, PC would just delete the RAM cache and immediately rebuild it from the same HTTP download artifact without checking whether it has been updated upstream. That's more or less pointless.

@eugenesvk
Author

Just did a quick test:

  • disabled the network connection, got an error that a custom repository URL couldn't be fetched; now the Install command is instant
  • enabled the network, Install is instant (I guess it remembers the failure); restarted Sublime, and now Install is back to taking its sweet time doing some network requests

I don't understand how this squares with your description (at least as I understand it) that, at least within the first 5 minutes, everything should be instant. Maybe it's only caching the default package database?

Using the latest beta version https://github.com/wbond/package_control/releases/tag/4.0.0-beta4

@deathaxe
Collaborator

Available packages are fetched and cached by PackageManager.fetch_available(). It doesn't matter whether they are fetched via a channel or a repository. The method merges all sources into one dictionary of "repo_url": data pairs and adds them to RAM cache.

Failed sources are indeed cached in RAM as well, to avoid re-querying dead ends.
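
The merge-and-remember behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the body of `PackageManager.fetch_available()`: the function `merge_sources`, the `fetch_source` callback, the `FAILED` sentinel, and the choice of `OSError` as the failure signal are all hypothetical.

```python
FAILED = object()  # hypothetical sentinel marking a source that could not be fetched


def merge_sources(sources, fetch_source, ram_cache):
    """Merge all sources into one {repo_url: data} dict, caching failures too.

    sources      -- iterable of repository URLs
    fetch_source -- callable(url) -> data; raises OSError when unreachable
    ram_cache    -- dict shared between calls, standing in for the RAM cache
    """
    merged = {}
    for url in sources:
        if url not in ram_cache:
            try:
                ram_cache[url] = fetch_source(url)
            except OSError:
                # Remember the dead end so later calls do not retry it.
                ram_cache[url] = FAILED
        if ram_cache[url] is not FAILED:
            merged[url] = ram_cache[url]
    return merged
```

Under these assumptions, a second call with the same `ram_cache` sends no requests at all, not even for the source that failed, which would be consistent with the panel appearing instantly after a fetch error.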

Custom repositories may increase the delay before quick panels are displayed if the cache is cold, especially if they contain unresolved package/release information. Those require at least 2 API calls per package to GitHub/Bitbucket/GitLab to fetch details and tag info, which may take a significant amount of time (see: #1638). That's, however, nothing PC can improve or solve at this point.

What the default channel (packagecontrol.io) basically does is periodically crawl all repositories, make the expensive API calls, and store the resolved package information in channel_v3.json, which PC can download and use directly.

@eugenesvk
Author

That's however nothing PC can improve/solve at this point

PC could just skip those calls? Is there a combination of settings that would simply not make those calls until the regular auto_upgrade_frequency upgrade/update maintenance task is run?

Custom repositories may increase delay of displaying quick panels if cache is cold

But it happens even on immediate repeats, as mentioned above. I tried it again with the custom repositories commented out, and the speed improved.

@deathaxe
Collaborator

PC could just skip those calls?

If an empty "Install Package" List is good for you, then yes.

But it happens even on immediate repeats as mentioned above,

I can't reproduce that with default settings, whether with the default channel, a custom repository.json, or a direct GitHub repo URL in the repositories setting.

Turning on "debug": true in settings displays

Package Control: Fetching list of available packages and libraries
  Platform: windows-x64
  Sublime Text Version: 4150
  Package Control Version: 4.0.0-beta4

There's no indication of any connection being established anywhere once repositories have been fetched, until the cache_length time has elapsed.

Note: Setting cache_length: 0, however, completely disables caching, which even causes the cached package information downloaded from the default channel not to be used. In that case, PC starts crawling each repository on every call of Install Package.

That's likely to fail due to rate limits.

@eugenesvk
Author

Meanwhile did another test:

  • run in safe mode,
  • install package control,
  • run install package command
  • (expected delay to generate cache)
  • run Install again
  • instant panel (just like you describe re. cache)
  • added custom repo
  • run Install again
  • (expected longer delay to do slow github API calls)
  • run Install again
  • now the panel is instant!!!

So maybe the issue is with beta 4? Though I can't test it, since in Normal mode v3 fails with the crypto issue for which you already have a few issues opened, and I can't manually copy v4 to the safe mode as it gets auto-deleted.
Or maybe safe mode does something special (I tried with "fresh" Installed Packages and Packages folders, but safe mode still works while regular mode doesn't).

If an empty "Install Package" List is good for you, then yes.

:) It won't be empty, I mean repeated calls, not the first one

Thanks to your debug setting, I found out that this link, "https://raw.githubusercontent.com/eugenesvk/sublime-dic_RuEn_bi/main/repository.json", gets requested every single time.
Strange! I will remove it for now. Other repositories seem to get cached.

(I have "cache_length": 300)

  • Another "interesting" observation is that in Safe mode the panel is truly instant (the true Sublime way, without network calls), while in regular mode it is noticeably slower

  • Also, as far as I understand, the RAM cache is removed on ST restart, correct? So on every start I'd get the slow API delays. Is there a way to skip that? Basically, to make all network calls for existing lists of repositories only happen during auto_upgrade_frequency unless I install something, then I could get my sweet instant panels!!!

@eugenesvk
Author

And thanks for helping out!

@deathaxe
Collaborator

It appears the reason is an issue with resolving commit timestamps causing a hidden exception while downloading infos from a code hoster. Needs some investigation.

deathaxe added a commit that referenced this issue Jul 21, 2023
This commit resolves a type conflict, which caused an exception when parsing
release information of packages hosted on GitHub.

As a result, packages from GitHub repositories using tag-based releases were
not parsed and thus

1. not published to "Package Control: Install Packages" quick panel.
2. not added to cache, causing them to be re-queried each time the quick panel
   was triggered (see #1645).

Note:

Only the Package Control client is affected, which tries to reduce the amount
of API calls by not downloading commit info of each release tag.
@deathaxe deathaxe added the bug label Jul 23, 2023
@deathaxe deathaxe closed this as completed Aug 2, 2023