Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DONOTMERGE] [repo_update] Attempt at making the script re-entrant without syncing inbetween #128

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

MarijnS95
Copy link
Contributor

repo_update has never been re-entrant without syncing prior to running, making SKIP_SYNC=TRUE largely useless. This is annoying while developing and shoving new patch applicators in here.
With more improvements to come, let's get this cleanup in first.

Tested with and without the rev-parse check, both apply cleanly. The former properly takes all commits as they are on my machine while the latter also fetches everything from the server appropriately.

repo_update.sh Outdated
@@ -22,6 +22,9 @@ popd () {
enter_aosp_dir() {
LINK="$HTTP://android.googlesource.com/platform/${2:-$1}"
pushd "$ANDROOT/$1"
# Make the script re-entrant when SKIP_SYNC=TRUE by ensuring
# the current tree is clean without any of our picks/reverts:
repo sync -l .
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I absolutely hate this. We should check whether the commit has already been applied instead instead of force-resetting repos.
Someone could cherry-pick other changes before or after running this script.

Copy link

@stefanhh0 stefanhh0 May 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When someone cherry-picks / reverts after this script has run, everything should be fine, or not?

To make the reentrant "smarter" introduce a dedicated function that is doing the git revert, if skipsync then only do the revert if the commit has not yet been reverted. Also apply_gerrit_cl_commit has to be changed, in case of skipsync and changeid is already available don't do anything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kindof agree. The intent is to have this behind a flag but optarg(s) is too much magic for me.

With many ideas in mind to prevent re-picking/reverting stuff, hence averting lots of local trouble, see the messages below I was writing while hitting roadblocks, which have all been resolved as of this last force-push.

🙈 I assumed reverts are not an issue but with reverts that touch the same lines (in audio) these simply generate conflicts. In any case, see [my attempt](https://github.com/sonyxperiadev/repo_update/commit/f04649f092ee8a4ca245faf97e2891675b586747) to ignore errors when the resulting diff is empty.

Cherry-picks on the other hand are an entirely different story. There's currently no way to tell it to proceed if it is already applied, leaving the tree in an unclean state. Doing a git reset HEAD seems like the best solution here, after making sure the issue is indeed The previous cherry-pick is now empty, possibly due to conflict resolution.. This doesn't mess with local changes as you can't cherry-pick with staged changes.

However, it gets worse: In the audio HAL for example we apply multiple commits on top, each changing similar lines to add platforms. Git doesn't understand what you want anymore and gives a nasty conflict if trying to pick the first commit right on top of a previous picking session. The best solution seems to be git cherry which is able to compare two trees for equivalent commits, but that syntax does not allow us to scan the range android-<current revision>..HEAD for a single, unrelated commit (at least not as far as I understand it).
However, we can always resort to using the tool that makes this thing tick: git patch-id.

[...]

I could explain a lot, but read the commit message and comments in the commits blow 😉

repo_update.sh Outdated
Comment on lines 90 to 91
# Variant on is_patch_in_range
# Notable differences:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss this one. We can safely use git -c diff.noprefix=true show in is_patch_in_range too, then the only difference between these functions is the -R. For that I have previously written an is_stable_hash_in_range() with the (identical) for-loops moved in.

Copy link

@stefanhh0 stefanhh0 Jun 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I would just have done something simple like git log --grep=git log --grep="This reverts commit <commit_id_to_be_reverted>\."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you might have noticed I totally nerded out over using the tools git uses internally (cherry and patch-id) to see if something is applied already, when the safest might even be to just grep for that revert string and the commit title in case of picks. That immediately saves us from keeping patches up to date when context starts to diverge (and thus the result of patch-id).

Another option to alleviate the issue brought up above is to simply stash local changes, keep track of rev-parse HEAD, checkout on the manifest branch (eg repo sync -l ., apply patches, then rebase the local changes on top. Perhaps also filtered by code similar to what is written here now, just in case GH isn't able to apply it. Doesn't happen often, but it'll likely happen

In the end these changes are a lot of time spent for no good reason besides learning purposes. The only proper way is to sync and reset to the selected android tag, apply/revert everything, then leave it to the user whether they want to rebase their stuff and/or unstash. Adding that flag to the repo sync -l . might have just solved it all.

repo_update.sh Outdated
Comment on lines 25 to 27
# TODO: repo info is slow, even if only fetching locally. We should
# perhaps just take the manifest branch once for a common repo like `.repo/manifests`?
MANIFEST_BRANCH=$(repo info -l "$path" | sed -n 's/^Manifest branch: //p')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better ideas to retrieve the desired tag? repo doens't clean up .git/refs/remotes/m/ so those are moot.

repo_update.sh Outdated Show resolved Hide resolved
repo_update.sh Outdated Show resolved Hide resolved
@MarijnS95
Copy link
Contributor Author

With the new mechanism we are not totally re-entrant per-se as things may not match up after all, but this is rather uncommon. Only ipacf-mgr and fwb build fingerprint have mismatched context, making patch-id generate different patches.

Keep in mind too that dropped patches will not be detected and removed. That point is largely moot though:
The work I did here (unlike the previous revision) is mainly intended to be lenient towards developers maintaining their own patches on top. Those developers will have to repo sync themselves as well, and wouldn't have gotten rid of those patches anywayif they rebase their stuff on top (or worse, merging things back and forth). Anyone else syncing normally will get patches applied freshly on top of the synced tag.

repo_update.sh Outdated Show resolved Hide resolved
@stefanhh0
Copy link

You have done a more sophisticated approach to identify if a commit was already applied (same for revert), I would have done it simply with the following: git log --grep="Change-Id: I8296a8fb551097fabf72115d2cec0849671b91ea". That would also require adding the change id as 3rd mandatory parameter. Not sure if that would have any bigger drawbacks though?!

repo_update.sh Outdated Show resolved Hide resolved
repo_update.sh Outdated Show resolved Hide resolved
repo_update.sh Outdated Show resolved Hide resolved
repo_update.sh Outdated
false
}

revert_if_not_reverted() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of SKIP_SYNC=false the revert should just be done as before shouldn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: No need to add extra ifs for cases that are implicitly covered by empty collections.

@MarijnS95
Copy link
Contributor Author

@stefanhh0 Thanks for the in-depth review! I learned quite a bit about writing bash scripts, as well as realizing (well, technically I already knew this during the implementation) that this method is over-engineered while not nearly watertight. Something as simple as grep on commit titles, hashes in reverts, and/or searches on Change-Id: should be able to do that at much lower cost.

On that note, Change-Id is already in the file and only needs to be uncommented+moved; since it uniquely identifies a gerrit commit we might as well see if we can resolve that to a patch url too. Otherwise it can be extracted from a commit as well.

In that case we'd look for (still constrained to the range manifest_tag..HEAD):

  • Revert hashes, they'll always be there;
  • Change-Id: they are likely in most if not all commits;
  • Commit title.

It's definitely worth to evaluate this method because the last cherry-pick for fwb (SystemUI: Implement burn-in protection for status-bar/nav-bar items) is dated and context doesn't match: patch-id doesn't match with what's applied, then the entire thing collapses on itself.

@MarijnS95 MarijnS95 force-pushed the cleanup branch 2 times, most recently from fac3fcc to 55fa259 Compare June 3, 2020 22:31
repo_update.sh Outdated
}

enter_aosp_dir() {
if [ -z "$1" ]; then

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy to see that you added this error, it would be even better if the other functions would have the same guard.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm indifferent on this one. Some functions call other code that errors out by itself, other places and in particular git show will take HEAD as default instead.

You anyway asked me to add set -u which catches most cases where an argument is not even passed: such a check is only helpful when "" is passed (eg. the expanded variables) but they themselves are either:

  • Not set globally, thus an error;
  • Perhaps empty, but a result from a git command that we expect to return an errorcode instead.

repo_update.sh Outdated
local _commit=$1
local _stable_hash
_stable_hash=$(git -c diff.noprefix=true show --no-commit-id "$_commit" | git patch-id --stable)
is_stable_patchid_in_range "${_stable_hash%% *}" "$STABLE_SHAS"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_stable_hash vs. STABLE_SHAS

With SHAS you refer to sha256? Anyway it would be good for consistency reasons to stick to _stable_sha and STABLE_SHAS or _stable_hash and STABLE_HASHES, I'd prefer the latter since it is more general and it is not important to now whether the hash is a SHA256 or something else.

@stefanhh0
Copy link

I am happy to see that you applied some of my suggestions. I haven't dived too deep into it - meaning what is needed what not and how to achieve the required goal in the best manner. If git log --grep is not working or also carries some uncertainties with it, then the more sophisticated approach makes sense. I could dive into it more profoundly especially all the git commands to give a better review, however I have time limitations + I think you are doing a good job :-)

@MarijnS95
Copy link
Contributor Author

MarijnS95 commented Jun 4, 2020

I haven't dived too deep into it

That sounds dangerous, you have already pointed out far too many issues and this is only scratching the surface?! 🙈

I do consider abandoning the "reentrancy" commits as they add quite a bit of code without working nearly perfect: and probably replace it with some greps instead, see how that works out for us.

Copy link

@stefanhh0 stefanhh0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks nice and sound to me from reading the code. However haven't done any functional testing myself on the changes.

@MarijnS95 MarijnS95 marked this pull request as draft June 5, 2020 07:35
@MarijnS95 MarijnS95 changed the title [repo_update] Address shellcheck issues, clean up LINK mess, make re-entrant [DONOTMERGE] [repo_update] Try to not revert/apply again when not on a fresh sync Jun 5, 2020
@MarijnS95 MarijnS95 changed the title [DONOTMERGE] [repo_update] Try to not revert/apply again when not on a fresh sync [DONOTMERGE] [repo_update] Attempt at making the script re-entrant without syncing inbetween Jun 5, 2020
repo_update is currently not reentrant: If a commit is picked once it
cannot be picked a second time as that either results in no changes or a
bunch of merge conflicts.

Generate stable hashes with patch-id that are invariant to line number
changes, white space and file ordering, and use those to compare with
existing patches on top of the selected manifest tag. Skip the pick on
matching hashes.

Note that this intentionally does not check if the patch is found
anywhere else in the tree: It might have been picked and reverted
multiple times in AOSP. Furthermore, if the patch eventually ends up on
a tag we want to be alerted too and remove it.
This obviously implies that we should not be applying and reverting our
own patches either.

Signed-off-by: Marijn Suijten <[email protected]>
This variant operates on reverts instead of cherry picks. See the
previous commit for an overview of the method and design choices.

Reverts are compared by creating a patch that is the revert of the
desired commit, of which the stable patch-id hash is used to compare
against the commit range. Notable is the mnemonic prefix (a/ and b/)
which are swapped in that revert as well, resulting in mismatched
hashes. This is addressed by showing the patch instead with
diff.noprefix=true.

Signed-off-by: Marijn Suijten <[email protected]>
Copy link

@PhaseboyTA PhaseboyTA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hop bet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants