Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MavenSupport: Add cache for remote artifacts #122

Merged
merged 1 commit into from
Dec 11, 2017
Merged

Conversation

mnonnenmacher
Copy link
Member

@mnonnenmacher mnonnenmacher commented Dec 11, 2017

Store remote artifact information in a disk cache backed by
DiskLruCache [1]. This brings a big performance boost for Gradle, Maven
and SBT dependency resolution.

The cached metadata contains the URLs and checksums of Maven artifacts which
is retrieved without actually downloading the remote artifacts. This is
information which is not available from the local Maven repository,
especially without downloading the remote artifacts.

Currently the max cache size is hardcoded to 1GB and expiry to six hours,
this will be made configurable in a later commit.

[1] https://github.com/JakeWharton/DiskLruCache


This change is Reviewable

@sschuberth
Copy link
Member

Could you elaborate in the commit message why this makes sense? Naively thinking I would have assumed Maven artifacts get already cached by Maven itself in the standard local Maven directory. And even if they would not get cached implicitly, I would have assumed "turning on" the feature to cache in Maven-Local is less work than implementing an own cache with DiskLruCache.

@@ -66,7 +69,12 @@ fun Artifact.identifier() = "$groupId:$artifactId:$version"

class MavenSupport(localRepositoryManagerConverter: (LocalRepositoryManager) -> LocalRepositoryManager) {
companion object {
private const val GIGABYTE = 1073741824L
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Write as 1024L * 1024L?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -125,6 +133,16 @@ class MavenSupport(localRepositoryManagerConverter: (LocalRepositoryManager) ->
}

fun requestRemoteArtifact(artifact: Artifact, repositories: List<RemoteRepository>): RemoteArtifact {
val cacheKey = artifact.toString()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use String.fileSystemEncode() instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DiskLruCache only allows keys matching "[a-z0-9_-]{1,64}" which is stricter than fileSystemEncode().


package com.here.ort.util

import ch.frankel.slf4k.error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Always use wildcard import for this package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

val timeToLive: Int
) {
companion object {
const val INDEX_FULL_KEY = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about making this an INDEX enum and using INDEX.values().size instead of VALUE_COUNT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a try, but my impression was it adds more overhead than benefit.


/**
* Shorten the string to be usable as a key for [DiskLruCache] which has a maximum length of 64. If the string is
* shorter than 57 chars take it as is, otherwise shorten it and append a serial number. The un-shortened string is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why exactly 57?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved the documentation.

* stored in the cache to be able to compare if a cache entry matches a key.
*/
private fun String.asKey(): String {
return if (length > 57) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract 57 to a local variable / constant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"Cannot generate key for '$this' because all possible keys starting with '$key' are taken.")
} else {
// String is short enough to be unique, use it as is.
this
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make the easy case go first and return early, and only use the tryKey logic as a fallback lateron?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

private val diskLruCache = DiskLruCache.open(directory, 0, VALUE_COUNT, maxSize)

/**
* Shorten the string to be usable as a key for [DiskLruCache] which has a maximum length of 64. If the string is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, there's no official release with that change. We could use JitPack to dynamically create one, or fork the project an publish our own release.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would still need to implement the same logic, as keys can also be longer than 120 chars, so I think we wouldn't win much.

There is also JakeWharton/DiskLruCache#98, but it seems Jake is not active on this project anymore.

this
} else {
// String is too long to be unique, append it with a serial number.
val key = substring(0..minOf(57, length - 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

57 still hard-coded :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

} else {
// String is too long to be unique, append it with a serial number.
val key = substring(0..minOf(57, length - 1))
for (index in 0..99999) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The length of the upper bound number should be aligned to KEY_SUFFIX_LENGTH.

Copy link
Member Author

@mnonnenmacher mnonnenmacher Dec 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Changed the constant to 6.

Copy link
Member

@sschuberth sschuberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments.

sschuberth
sschuberth previously approved these changes Dec 11, 2017
Store remote artifact information in a disk cache backed by
DiskLruCache [1]. This brings a big performance boost for Gradle, Maven
and SBT dependency resolution.

The cached metadata contains the URLs and checksums of Maven artifacts which
is retrieved without actually downloading the remote artifacts. This is
information which is not available from the local Maven repository,
especially without downloading the remote artifacts.

Currently the max cache size is hardcoded to 1GB and expiry to six hours,
this will be made configurable in a later commit.

[1] https://github.com/JakeWharton/DiskLruCache
@mnonnenmacher mnonnenmacher merged commit 847c06b into master Dec 11, 2017
@mnonnenmacher mnonnenmacher deleted the maven-caching branch December 11, 2017 16:46
sschuberth pushed a commit that referenced this pull request Jun 28, 2021
Fix: Allow custom config also for scanner
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants