Add verb for tracking and staging source changes in workspace #44

ruffsl · 2021-04-11T06:03:41Z

Fixs #25

This seeks to add a new verb for tracking source file changes in a workspace using dirhash by staging checksums of each package directory. This ignores all dot files and dot folders and also includes new package select arguments.

Example

# setup workspace
mkdir -p ~/ws/src && cd ~/ws
wget https://raw.githubusercontent.com/ros2/ros2/master/ros2.repos
vcs import src < ros2.repos

# stage and tare current workspace
colcon stage --tare-changes

# build and test workspace
colcon build
colcon test

# make a source change
touch src/ament/ament_cmake/ament_cmake_core/foo.txt

# stage current workspace with changes
colcon stage

# gather list of changed packages
PKGS_CHANGED=$(colcon list --packages-select-stage-changed | xarg)
PKGS_UNCHANGED=$(colcon list --packages-select-stage-unchanged | xarg)

# rebuild and retest only changed packages and above
colcon build --packages-above $PKGS_CHANGED --cmake-clean-cache
colcon test  --packages-above $PKGS_CHANGED

# rinse and repeat
colcon stage --tare-changes
touch src/ament/ament_cmake/ament_cmake_core/bar.txt
colcon stage
PKGS_CHANGED=$(colcon list --packages-select-stage-changed | xarg)
colcon build --packages-above $PKGS_CHANGED --cmake-clean-cache
colcon test  --packages-above $PKGS_CHANGED

TODO:

optional args for passing custom include and exclude path regexs to dirhash
leverag revision control of source folder to auto pickup on ignored paths, e.g. .gitignore
- Revision control awareness for determining ignore paths andhus/dirhash-python#6
cleanup workarounds for entrypoint and extension mapping for package types
sort out return code logic so staging unchanged packages don't report as "failed"
provide more stdout/file logs for debugging cache breaks, e.g. which file(s) changed in a package
- Obtaining aggregate information of hashed directory andhus/dirhash-python#7

Add verb for tracking and staging source changes in workspace colcon-core#428

Signed-off-by: ruffsl <[email protected]>

ruffsl · 2021-04-11T06:20:29Z

There may be room for optimization, but it's not too slow currently. Using ros2.repos, it stages 329 packages in 14.1 seconds.

dirk-thomas · 2021-04-14T05:30:03Z

My previous comment from #25 still applied to this proof of concept: I don't think it is a good / convenient interface for user to manually having to invoke stage.

ruffsl · 2021-04-14T06:37:16Z

I'm guessing you're referring to this:

The use case in mind would be that you can repeatedly call the build and it only processes the packages necessary - on the first build that must be all packages - on repeated invocations only the once which changes as well as their downstream packages.

I was operating under the use case where the user wants to carefully control and mark which package with recent changes should be tared. Using a top level verb also allows users to use the same package selection features for other criteria, e.g. tare changes for all packages now with passing builds and tests, and that are above some package I'm done working on, e.g:

colcon stage --tare-changes \
    --packages-select-build-passed \
    --packages-select-test-passed \
    --packages-above my_wip_package

Edit: those select build and test passed arguments don't exist, so expressing this kind of selection set could be cumbersome.

With the single return code value file, I'm not sure how to denote if a build for the latest changes was successful vs. tests with the latest changes was successful. By having it as a verb + package select queries, the user can decide when to tare package changes for subsequent commands to filter on.

Do you have a recommendation on how the build or test verbs could be extended to tare package changes upon success automatically? Is that something a task or an event_handler endpoint handler could achieve? There are so many entry_points, I'm still not sure how everything relates or the order of operations at CLI runtime.

E.g. a successful build for a changed package would mark the package as no-need-to-rebuild, but wouldn't yet mark it as also no-need-to-retest, as only a subsequent successful test verb would do. Thinking out loud, would it then make sense for a secondary rebuild to unset a no-need-to-retest marker as well then?

dirk-thomas · 2021-04-14T07:10:11Z

I was operating under the use case where the user wants to carefully control and mark which package with recent changes should be tared.
By having it as a verb + package select queries, the user can decide when to tare package changes for subsequent commands to filter on.

It just feels to me if the user has to do all this manually the gain isn't very high over just explicitly specifying the packages directly. Maybe I have unreasonable expectations.

Do you have a recommendation on how the build or test verbs could be extended to tare package changes upon success automatically? Is that something a task or an event_handler endpoint handler could achieve?

Yes, an event handler could be used to trigger behavior based on return codes of executed jobs. E.g. all console output showing successful / failed packages is generated by event handlers using those return codes.

There are so many entry_points,

Maybe this page helps: https://colcon.readthedocs.io/en/released/developer/extension-point.html

I'm still not sure how everything relates or the order of operations at CLI runtime.

https://colcon.readthedocs.io/en/released/developer/program-flow.html unfortunately doesn't go into enough detail which extension points are ultimately used for a specific job 😒

ruffsl · 2021-04-14T08:20:01Z

Yes, an event handler could be used to trigger behavior based on return codes of executed jobs. E.g. all console output showing successful / failed packages is generated by event handlers using those return codes.

Hmm, I will look into how to use event handlers to stage a package's source tree changes after successful builds and tests, but I'm still not sure how then to name/structure the package selection arguments to reflect such previous events.

How about:

--packages-select-build-unstaged
--packages-select-test-unstaged
--packages-skip-build-staged
--packages-skip-test-staged

An issue of this use pattern is that computing the hashdir for packages would have to be done when computing the set of selected packages at runtime, rather than just reading the return code a dedicated stage verb/task could pre-compute.

ruffsl · 2021-04-14T08:22:51Z

Oh! Ok, I think I have a complete use pattern

colcon stage verb is used to only update the colcon_stage.rc for selected packages
- Edit: users could select which measurement method to use: hashdir, git, etc
colcon package selection simply compares colcon_stage.rc to colcon_stage_<verb>.rc
colcon build/test use a event handler to copy colcon_stage.rc to colcon_stage_<verb>.rc upon success
- Edit: perhaps build should copy colcon_stage.rc to colcon_stage_build.rc
- and test should copy colcon_stage_build.rc to colcon_stage_test.rc

This way, the user can still control when the staging/hashdir for a package is ratcheted forward, but the build/test verb's also automatically update what is needed for package selection upon verb success.

ruffsl · 2021-04-14T18:09:37Z

@tylerjw , with #44 (comment) I think I have a better use pattern for this colcon staging idea, but not sure if it yet covers the use case you had with list changed since . Current idea relies on comparing previous stage return codes to determine when a package selection is triggered for a verb. But your case is when no previous return code yet exists, let alone a built workspace. What would be a good means to both checkpoint the state of the package source, and denote if it's changed (with respect to a reference point when using revision control)? I suppose the return code could contain a dictionary to encode recent package hashdir/commit as of colcon stage, as well as recent change/diff?

ruffsl · 2021-05-24T22:26:07Z

Closing in favor of ruffsl/colcon-cache#12 .

ruffsl added 2 commits April 10, 2021 21:50

Add .gitignore for python

492c729

Signed-off-by: ruffsl <[email protected]>

Add select arguments for staged packages

9a510fa

This was referenced Apr 11, 2021

Add verb for tracking and staging source changes in workspace colcon/colcon-core#428

Closed

Add selection conditions for packages with modified source trees #25

Open

ruffsl added 3 commits April 10, 2021 23:45

Use return code 0 to denote unchaged

ab2df10

Fix typos

857d485

Fix whitespace

4540440

This was referenced Apr 11, 2021

Revision control awareness for determining ignore paths andhus/dirhash-python#6

Open

Obtaining aggregate information of hashed directory andhus/dirhash-python#7

Open

dirk-thomas added the enhancement New feature or request label Apr 15, 2021

ruffsl closed this May 24, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add verb for tracking and staging source changes in workspace #44

Add verb for tracking and staging source changes in workspace #44

ruffsl commented Apr 11, 2021 •

edited

Loading

ruffsl commented Apr 11, 2021 •

edited

Loading

dirk-thomas commented Apr 14, 2021

ruffsl commented Apr 14, 2021 •

edited

Loading

dirk-thomas commented Apr 14, 2021

ruffsl commented Apr 14, 2021

ruffsl commented Apr 14, 2021 •

edited

Loading

ruffsl commented Apr 14, 2021

ruffsl commented May 24, 2021

Add verb for tracking and staging source changes in workspace #44

Add verb for tracking and staging source changes in workspace #44

Conversation

ruffsl commented Apr 11, 2021 • edited Loading

Example

TODO:

ruffsl commented Apr 11, 2021 • edited Loading

dirk-thomas commented Apr 14, 2021

ruffsl commented Apr 14, 2021 • edited Loading

dirk-thomas commented Apr 14, 2021

ruffsl commented Apr 14, 2021

ruffsl commented Apr 14, 2021 • edited Loading

ruffsl commented Apr 14, 2021

ruffsl commented May 24, 2021

ruffsl commented Apr 11, 2021 •

edited

Loading

ruffsl commented Apr 11, 2021 •

edited

Loading

ruffsl commented Apr 14, 2021 •

edited

Loading

ruffsl commented Apr 14, 2021 •

edited

Loading