Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add selection conditions for packages with modified source trees #25

Open
ruffsl opened this issue Sep 2, 2019 · 9 comments
Open

Add selection conditions for packages with modified source trees #25

ruffsl opened this issue Sep 2, 2019 · 9 comments
Labels
enhancement New feature or request

Comments

@ruffsl
Copy link
Member

ruffsl commented Sep 2, 2019

As discussed in colcon/colcon-core#213 , I think it would be useful to add a selection argument to filter packages by those were it's source code has changed with respect to what was used in the previous workspace build. This could be achieved by augmenting the verbs to audit the source folder of a package when specified, and later comparing the source tree with the previous audit file when determining selecting conditions. It could also be version control aware to ignore specific files/folders.

An example workflow could be as follows:

# setup workspace
mkdir -p ~/ws/src && cd ~/ws
vcs import src < ros2.repos

# build workspace with audit
colcon build --audit

# make source changes 
touch src/foo/foo.txt

# gather list of changed packages
PKGS_CHANGED=$(colcon list --packages-skip-source-unchanged | xarg)

# rebuild only changed packages and above
colcon clean --build --install --packages-above $PKGS_CHANGED
colcon build --audit --packages-above $PKGS_CHANGED --cmake-clean-cache
colcon test  --packages-above $PKGS_CHANGED

# rinse and repeat
touch src/bar/bar.txt
PKGS_CHANGED=$(colcon list --packages-skip-source-unchanged | xarg)
colcon clean --build --install --packages-above $PKGS_CHANGED
colcon build --audit --packages-above $PKGS_CHANGED --cmake-clean-cache
colcon test  --packages-above $PKGS_CHANGED
...

So far, I'm thinking of adding two arguments, one that can be used with existing verbs to audit the source and store the audit file to the package build folder for later use. The second would be used to filter packages by investigating deltas between the audit file and current package's source tree.

Before starting a PR, I'd like to iterate on a few things:

Terminology

I've used the term audit as similar defined in https://github.com/jessek/hashdeep CLI, but perhaps this argument should be made more specific? e.g. --audit-packages?

For the filter argument, given the AND logic of all conditions, I see there is a pattern for having an inclusive and exclusive conditions for setting selected = False assignment. E.g. packages_select_test_failures vs. packages_skip_test_passed. In a similar style, perhaps we could use the names:

  • packages_select_audit_changed
  • packages_skip_audit_changed

or

  • packages_select_audit_changed
  • packages_skip_audit_unchanged

I guess it was intentional by design not to make select/skip_test purly symmetric, as packages_skip_test_passed includes both failed and untested packages, while packages_select_test_failures only includes failed packages. What would be the appropriate dualism here?

Extension

Taking a look at the colcon-package-selection extensions, I think adding the feature here might be most appropriate for code reuse. However, if we wanted to add dependencies like gitpython to be version control aware for ignoring files, should we then seek to break this out into its own extension?

Persisting Audits

When the build verb is specified to audit, should the audit file only be over-written to the build folder upon a successful build? I was thinking if the audit file was always updated, regardless of the success of the build, then what would be the cleanest order of commands to rebuild packages matching (audit-changed OR build-not-passed)?

What does it mean to the filter aurments when the audit file does not exist? Should unchanged only be interpreted by the existence of a colcon_audit that matches the audit at runtime, similar to how a package build is only considered as passed if the colcon_build.rc exists and only contains 0? This might be a consideration for the asymmetric duality in select/skip arguments.

@dirk-thomas dirk-thomas added the enhancement New feature or request label Oct 10, 2019
@dirk-thomas
Copy link
Member

colcon build --audit --packages-above ...

In this usage example I am wondering why --audit needs to be specified for the build. Isn't it generally desired to update the information in case of a successful build? Or do you have a case in mind where the user wouldn't want to pass --audit?

I've used the term audit ...

I am not convinced "audit" is the right term for this functionality. Nothing is being reviewed here. It is more like watching for changes in the file system. I don't have a good proposal atm though.

I see there is a pattern for having an inclusive and exclusive conditions

Do you have a use case for packages_skip_audit_changed / packages_select_audit_unchanged? If not I would rather not add those for symmetry only.

What would be the appropriate dualism here?

I would solely base that on concrete use cases - what do users want to achieve and what options would be necessary to achieve that?

if we wanted to add dependencies like gitpython to be version control aware for ignoring files, should we then seek to break this out into its own extension?

I would think it would be sufficient to to the most general approach: skip files/directories starting with a dot, and maybe ignore files listed in a global gitignore configuration. That shouldn't need a dependency. Let's figure that out on demand though - moving a package into a separate repo is straight forward anytime.

Should unchanged only be interpreted by the existence of a colcon_audit that matches the audit at runtime, similar to how a package build is only considered as passed if the colcon_build.rc exists and only contains 0?

That sounds reasonable to me. The use case in mind would be that you can repeatedly call the build and it only processes the packages necessary - on the first build that must be all packages - on repeated invocations only the once which changes as well as their downstream packages.

@dirk-thomas
Copy link
Member

One more though: the following scenario should work imo:

If since a previous audit pkgA has changed the next build would process pkgA as well as its downsteam package pkgB. Assuming pkgA passes and updates its audit information but pkgB gets aborted / fails / whatever any future invocation would still need to rebuild pkgB (even though pkgA doesn't appear in the changed-packages list anymore).

@dirk-thomas
Copy link
Member

@ruffsl Have you had a chance to make any progress towards this feature idea?

@ruffsl
Copy link
Member Author

ruffsl commented Feb 19, 2020

Afraid not yet, had get through my thesis proposal before winter break, and am now tasked with winter conference deadlines. Are there other interested parties in having this work space level caching? Perhaps we could pitch this feature idea to the rest of the tooling working group?

@ruffsl
Copy link
Member Author

ruffsl commented Apr 11, 2021

@ruffsl Have you had a chance to make any progress towards this feature idea?

Hey @dirk-thomas , @tylerjw , @mikepurvis , @mm318 ! I started a working proof a concept: #44

Could use some help to get it over the finish line and clean up.

@dirk-thomas
Copy link
Member

Could use some help to get it over the finish line and clean up.

Since I don't use this tool anymore I won't be able to invest time.

@tylerjw
Copy link

tylerjw commented Apr 14, 2021

I know this is probably not super helpful but I have two solutions for this. Locally I created this and use it:

https://github.com/tylerjw/moveit_ci_tools

And then for CI I have been using this:

ros-industrial/industrial_ci#655

@ruffsl
Copy link
Member Author

ruffsl commented Apr 14, 2021

I know this is probably not super helpful but I have two solutions for this. Locally I created this and use it:
https://github.com/tylerjw/moveit_ci_tools

@tylerjw woah, clever. Looks like this relies on git specifically to check for file diffs. My draft PRs are more generic, but I like the git method too for its native support for .gitignore, and my hashdir/hashdeep method could instead be a good fallback for when the source dir is checked out using a revision control system other than git.

@ruffsl
Copy link
Member Author

ruffsl commented Apr 26, 2021

@dirk-thomas and @tylerjw , I've roughly prototyped a new working colcon extension for caching invocations of verbs.
Feel free to try it out and comment on the current code here: ruffsl/colcon-cache#12

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

No branches or pull requests

3 participants