-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[native_assets_builder] Use file content hashing #1750
Conversation
PR HealthBreaking changes ✔️
Changelog Entry ✔️
Changes to files need to be accounted for in their respective changelogs. API leaks ✔️The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.
License Headers ✔️
All source files should start with a license header. Unrelated files missing license headers
|
@jtmcdole @jonahwilliams I can't seem to add you as reviewers on this PR (possibly because it's not the flutter org). But PTAL as Martin is OOO currently. N.B. This will add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to pull in xxh3 as a flutter_tools dependency? do we know/trust the maintainer?
b60c830
to
58b6279
Compare
@jonahwilliams - I trust it in as much as any other package that we've pulled in. Here's the performance numbers: |
I guess this is a @jtmcdole question. Based on #1593 (comment). @mraleph and I both looked through the source code while playing around with it. But of course we don't know about future versions of the package. Do we have a policy for how we vet third_party packages? (If we are not happy with adding this third party dependency, I can swap the PR to use a slower hashing algorithm.) |
If thats OK with you its OK with me. |
I don't think we want the slower hashing algorithm when it comes to building native assets. If the package changes in the future to pull in new deps; we can revisit - but its a dep-less package today: https://github.com/SamJakob/xxh3/blob/master/pubspec.yaml |
I would LGTM; but I don't have access? |
import '../utils/file.dart'; | ||
import '../utils/uri.dart'; | ||
|
||
class FileSystemCache { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider fusing API calls which are always used together, this leads to more robust and simpler APIs:
cache.readCacheFile
andfindOutdatedFileSystemEntity
reset
,hashFiles
andpersist
.
Also the name cache implies something else, I think.
This yields something like this as an API:
final class DependenciesHashFile {
DependenciesHashFile({required File storage});
Future<bool> isUpToDate();
// Records hashes of files, returns`true` if any file has changed after `buildStart`.
Future<bool> updateAfterBuild({required DateTime buildStart});
}
Swapped the hashing algorithm to |
How did you test the perf? Will native assets be a large set of small files, smaller set of larger files, or something else? In my linked comment, I had xxh3 at 100ms vs >600ms for md5. MD5 should be fine as we're not using it for crypto reasons. |
This only mattered for "large" data like 165MB. Maybe something media heavy (game assets?) - but if all our libraries are small, then this is a wash. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few pending comments
mustCompile = sourceLastChange == kernelLastChange || | ||
sourceLastChange.isAfter(kernelLastChange); | ||
mustCompile = | ||
(await dependenciesHashes.findOutdatedFileSystemEntity()) != null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider logging the file name that caused to re-compile it.
@@ -696,8 +709,28 @@ ${e.message} | |||
kernelFile, | |||
depFile, | |||
); | |||
|
|||
if (success) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Consider using early returns and less nesting (on the lines above) - it makes it easier to read this code:
if (!mustCompile) {
return (true, kernelFile, dependenciesHashFile);
}
final success = await _compileHookForPackage(...):
if (!success) {
return (false, ...);
}
...
return (true, ...);
|
||
if (success) { | ||
// Format: `path/to/my.dill: path/to/my.dart, path/to/more.dart` | ||
final depFileContents = await depFile.readAsString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Consider moving this code (that knows about the format of .d
files - into it's own function - e.g. parseDepFile(<file-content>)
)
for (final uri in fileSystemEntities) { | ||
int hash; | ||
if (validBeforeLastModified != null && | ||
(await uri.fileSystemEntity.lastModified()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any indication that we benefit from using async/await
here? (it's more complicated and may as well be slower)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In flutter_tools
the native assets builder is run in a Target
concurrently with other Target
s. Sync I/O prevents flutter_tools making progress on the other targets.
I had to change some sync I/O to async I/O in flutter_tools in other Target
s a while back to speed up flutter builds because of this.
/// If [validBeforeLastModified] is provided, any entities that were modified | ||
/// after [validBeforeLastModified] will get a dummy hash so that they will | ||
/// show up as outdated. If any such entity exists, its uri will be returned. | ||
Future<Uri?> hashFiles( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code seems to handle directories as well, so should this be hashFileAndDirectoryContents
?
/// [Directory] hashes are a hash of the names of the direct children. | ||
class FileSystemHashes { | ||
FileSystemHashes({ | ||
this.version = 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a version?
In the current PR if this field every changed to a backwards incompatible version, it seems nothing would change because the code doesn't use this field at all (apart from serializing and deserializing it)
this.hash, | ||
); | ||
|
||
factory FilesystemEntityHash._fromJson(Map<String, dynamic> json) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: It's best to type json maps as Map<String, Object>
because then one gets static type errors if a lookup like json['foo']
is used in wrong way (using dynamic
makes json['foo'].doesNotExist()
just work at compile-time)
graphs: ^2.3.1 | ||
logging: ^1.2.0 | ||
# native_assets_cli: ^0.9.0 | ||
native_assets_cli: | ||
path: ../native_assets_cli/ | ||
package_config: ^2.1.0 | ||
xxh3: ^1.1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need this dependency on a 3rd party package?
@@ -659,29 +684,17 @@ ${e.message} | |||
final depFile = File.fromUri( | |||
outputDirectory.resolve('../hook.dill.d'), | |||
); | |||
final dependenciesHashFile = File.fromUri( | |||
outputDirectory.resolve('../hook.dependencies_hash_file.json'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we re-build the kernel file if the Dart SDK has changed? (otherwise invoking the old kernel on newer SDK may just crash)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why you review the code! 🤓 🙏
Addresses the remaining comments on #1750. I'm unsure how to test the recompiling of the kernel file if Dart changes without writing a test that downloads a different Dart SDK.
This PR changes the caching behavior for hooks to be file content hashing instead of last modified timestamps.
Closes: #1593.
In addition to using file hashes, a timestamp is passed in to detect if files were modified during a build. The moment of hashing contents is after a build is finished, but we should consider files changed after the build started to invalidate the build. If this happens, the build succeeds, but the cache is invalidated and a warning is printed to the logger.
The implementation was modeled after the
FileStore
in flutter_tools. However, it was adapted to support caching directories and to match the code style in this repository.Directory caching is defined as follows: the hash of the names of the children. This excludes recursive descendants, and excludes the contents of children. For recursive caching (and glob patterns), the populator of the cache should do the glob/recursion to add all directories and files.
Testing
pkgs/native_assets_builder/test/build_runner/build_runner_caching_test.dart
) cover caching behavior.Performance
Adding a stopwatch to pkgs/native_assets_builder/test/build_runner/build_runner_caching_test.dart for the second invocation of the hooks so that it is cached.
lastModified
timestamps: 0.028 seconds (pre this PR)package:crypto
md5
: 0.047 seconds (current PR)package:xxh3
xxh3
: 0.042 secondsThe implementation does not use parallel system IO for loading files (no
Future.wait
), but does use async I/O to allow flutter_tools to run otherTarget
s in parallel.The (pre and post this PR) implementation is fast enough for a handful of packages with native assets in a
flutter run
. The implementation (pre and post this PR) is not fast enough for hot restart / hot reload with 10+ packages with native assets. So, when exploring support for that, we'll need to revisit the implementation.Related issues not addressed in this PR
hook/{build,link}.dart
once. #1578