Automate Kernel list email from GitHub CI #1107

Open
nyurik opened this issue Aug 28, 2024 · 14 comments
Labels
• misc Related to other topics (e.g. CI).

Comments

@nyurik

nyurik commented Aug 28, 2024

I was just browsing #1106 and realized once again that the kernel email list is by far the hardest part of joining this amazing project. Being a dev, that brings up an obvious question: can this be automated?

I would like to propose some (yet to be decided) automated method of emailing the kernel list using some GitHub Actions magic set up specifically for this repo. This would keep kernel devs happy while also attracting new Rust talent to the project, without requiring what seems like an insurmountable (paperwork) barrier to entry.

Ideas welcome.

@workingjubilee

Note that GitGitGadget implements effectively-this for the Git project, so there is a model to follow if someone wanted to implement this (and assuming devs were receptive).

@tgross35
Collaborator

tgross35 commented Aug 30, 2024

GitGitGadget does look really interesting. Figured it doesn't hurt to ask so I mentioned the idea of enabling it for regular kernel development at gitgitgadget/gitgitgadget#1695.

I think the philosophical reason why the kernel doesn't use better tools is because GH is more "push new commits when you make changes, squash at the end" and LKML is definitely "always keep your commits atomic, resend the whole patchset when you make changes". Which isn't impossible on GitHub, it just does a less nice job keeping track of things across force pushes. I think that tools like Gerrit or Phabricator are better designed for this, but that's probably worse than mail flow.

If you are interested in getting involved in any way, I think it's easiest to just subscribe to the list but automatically filter messages, so you get everything and can reply to it but don't get 20 notifications per day. To subscribe, use https://subspace.kernel.org/vger.kernel.org.html (the old method of sending an email to [email protected] with the text subscribe rust-for-linux looks like it's deprecated). Then I have a filter like to:([email protected] [email protected]) that just archives everything unless I'm explicitly mentioned.
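To make the filter rule concrete, here is a minimal Python sketch of the same decision: archive anything addressed to the lists unless you are a direct recipient. The addresses are placeholders (the real ones are obscured in this thread), and treating "explicitly mentioned" as "my address appears in To or Cc" is an assumption, not necessarily how the filter above is defined.

```python
from email import message_from_string
from email.utils import getaddresses

# Placeholder addresses; substitute the real list and personal addresses.
LIST_ADDRS = {"rust-for-linux@lists.example", "linux-kernel@lists.example"}
MY_ADDR = "me@example.com"


def should_archive(raw_message: str) -> bool:
    """Return True when a list message should skip the inbox."""
    msg = message_from_string(raw_message)
    # Collect every address from the To and Cc headers, lowercased.
    recipients = {
        addr.lower()
        for _, addr in getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    }
    via_list = bool(recipients & LIST_ADDRS)
    # Assumption: "explicitly mentioned" means being a direct recipient.
    mentions_me = MY_ADDR in recipients
    return via_list and not mentions_me
```

Most mail clients can express the same rule in their own filter syntax, so a script like this is only needed if you process mail locally.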

Reviews of everything are always welcome. It's easiest to find something on the archives at https://lore.kernel.org/rust-for-linux/, then search for it in your client and reply from there. Just reply all, enable plain text mode (it's in the triple dots of the compose box if you use Gmail), add your responses inline (so type immediately below the relevant bit rather than at the default top of the email), and snip irrelevant sections.
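For anyone unsure what such a reply looks like on the wire, the following Python sketch builds one with the standard library: plain text only, quoted lines prefixed with "> ", responses placed directly below the part they address, and everything else snipped. The function names and parameters are hypothetical, and a normal mail client does all of this for you when you hit "reply all".

```python
from email.message import EmailMessage


def quote(text: str) -> str:
    """Prefix every line with '> ' the way mail clients quote replies."""
    return "\n".join(">" if not line else f"> {line}" for line in text.splitlines())


def build_reply(orig_subject, orig_msgid, orig_from, list_addr, me, parts):
    """Build a plain-text reply with responses interleaved below quotes.

    `parts` is a list of (quoted_text, response) pairs; any section of the
    original that is not listed in `parts` is thereby snipped.
    """
    reply = EmailMessage()
    reply["From"] = me
    reply["To"] = orig_from
    reply["Cc"] = list_addr  # "reply all" keeps the list in the loop
    if orig_subject.lower().startswith("re:"):
        reply["Subject"] = orig_subject
    else:
        reply["Subject"] = f"Re: {orig_subject}"
    # These two headers keep the reply attached to the original thread.
    reply["In-Reply-To"] = orig_msgid
    reply["References"] = orig_msgid
    body = "\n\n".join(f"{quote(q)}\n\n{r}" for q, r in parts)
    reply.set_content(body)  # set_content() with a str yields text/plain
    return reply
```

Calling `build_reply(...)` and printing the result shows the full message, which can be a handy way to check that nothing is HTML and that the thread headers are present.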

^ I know this is still comparatively a lot of overhead and isn't at all what is being asked, but it's also not that bad to just get started with the default flow once you know what to expect. Sending your own patches is another thing, but that is a bridge that we can help you cross whenever it comes up (basically you have to use git send-email rather than anything from this century, but it works reasonably well).

PRs are also absolutely welcome here to get some initial review. They just won't get picked up unless they go through the list - but again, we can help here when needed. Also there is Zulip if you have any kind of questions https://rust-for-linux.zulipchat.com/#narrow/stream/293929-Announcements/topic/LWN.20articles.20and.20posts.

@nyurik
Author

nyurik commented Aug 30, 2024

@tgross35 thx for the nice write up! I think sending a PR is the main issue, not the subscribing/monitoring bit. Going from a regular "submit PR from my fork" workflow to an obscure git send-email was a surprisingly high entry barrier when I tried to make a trivial change. Thus, if some-magically-how a PR gets automatically converted to an email, it seems like we can get the best of both worlds:

  • on PR creation, github action automatically sends out the needed email (we could use bors-like allow-lists to limit spam)
  • if PR is updated with a change, github action squashes all changes together and sends a single patch (possibly using the PR's description as the comment)
  • once the list overlords accept the patch and merge it, the PR is simply closed / rebased on top of the change / some other auto-magic.

@bjorn3
Member

bjorn3 commented Aug 30, 2024

if PR is updated with a change, github action squashes all changes together and sends a single patch (possibly using the PR's description as the comment)

That will not work if your PR contains changes that should be split into multiple PRs.

I believe GitGitGadget allows you to freely push to your PR branch and only sends an email when you explicitly ask the bot. That gives you the ability to push your changes as new commits and only squash right before you are done changing things and want to send another revision.

@nyurik
Author

nyurik commented Aug 30, 2024

thx @bjorn3 - that could also work - as long as the email sending is automated (e.g. with a trigger comment), I feel the new developers could be on-boarded much faster. Moreover, it looks like PRs are never actually merge-closed here anyway, so the workflow could be:

  • submit PR with an in-depth description
  • R4L community reviews it, gives feedback and suggests changes, which the author fixes
  • once there is no more feedback, either the author or a maintainer adds a magical comment to the PR to auto-push it to the mailing lists
  • [optional] bot could monitor mailing list and post relevant replies directly to PR as individual comments
  • once it gets merged upstream, it gets autoclosed here
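The "magical comment" step in the list above could be gated roughly as in this Python sketch, which combines it with the allow-list idea mentioned earlier in the thread. Everything here is hypothetical: the `/submit` command name is borrowed from GitGitGadget's convention, and the `Comment` type, the allow-list contents, and the permission rule are assumptions for illustration, not an existing bot's behavior.

```python
from dataclasses import dataclass

TRIGGER = "/submit"               # hypothetical command name
ALLOWED_USERS = {"alice", "bob"}  # hypothetical allow-list to limit spam


@dataclass
class Comment:
    author: str
    body: str
    is_maintainer: bool = False


def should_send_to_list(comment: Comment, pr_author: str) -> bool:
    """Decide whether a PR comment should trigger sending the series."""
    # The command must sit on a line of its own, so merely mentioning it
    # in a sentence does not fire the bot.
    has_trigger = any(line.strip() == TRIGGER for line in comment.body.splitlines())
    if not has_trigger:
        return False
    # Maintainers may always trigger a send.
    if comment.is_maintainer:
        return True
    # Otherwise only the PR author may, and only if allow-listed.
    return comment.author == pr_author and comment.author in ALLOWED_USERS
```

Keeping the decision in a pure function like this makes it easy to test separately from whatever CI job would actually generate and send the patch emails.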

@ojeda
Member

ojeda commented Aug 30, 2024

In the past, when we were out-of-tree, we did development in GitHub because it was convenient for what we were doing at the time. However, now we are in-tree and, for better or worse, Linux uses an email/patch-based workflow, so it is best to follow that workflow.

A one- or two-way bridge would especially help contributors that only want/need to send a couple small patches here and there. It is not the first time it has been discussed (as well as using forges in general), both inside Rust for Linux and in the kernel community in general, so it may happen eventually. Nowadays, I recommend using B4 (https://b4.docs.kernel.org), maintained by the kernel.org team, which simplifies some of the technicalities and offers an option for those without SMTP access.

For other contributors, i.e. active kernel developers, they would need to learn the actual workflow to get involved with other subsystems, maintainers, lists, trees, their rules, etc. For Rust in particular, there are some patches that only pertain to the Rust subsystem, but Rust is a kernel-wide effort, and thus in many cases one needs to interact with other subsystems anyway. We have some more details at https://rust-for-linux.com/contributing#the-kernel-development-process and https://rust-for-linux.com/contributing#the-rust-subsystem.

In order to get accustomed to the patch-based workflow, from time to time we add "good first issues" here.

@fbq
Member

fbq commented Sep 1, 2024

  • once there is no more feedback, either the author or a maintainer adds a magical comment to the PR to auto-push it to the mailing lists

This is something b4 can already do for an individual, so why we need extra infrastructure for it is a bit questionable.

  • [optional] bot could monitor mailing list and post relevant replies directly to PR as individual comments

How should contributors respond to that feedback from the list? With another GitHub comment? That means another round of syncing, which may complicate the magical system proposed here.

Overall, I'm not 100% sure that a PR + merge workflow is better than the email workflow in every aspect. If the main workflow in the Linux kernel is still email-based, then it makes more sense to spend time on helping newcomers get familiar with that workflow, because if they are looking into long-term contribution, that will be a necessary skill.

@workingjubilee

I'm gonna be honest, looking at b4 briefly, I don't see the advantage of "this arcane CLI tool" over "another arcane CLI tool", so what's the actual advantage?

@fbq
Member

fbq commented Sep 2, 2024

I'm gonna be honest, looking at b4 briefly, I don't see the advantage of "this arcane CLI tool" over "another arcane CLI tool", so what's the actual advantage?

b4 has the web endpoint feature, which doesn't require you to set up an SMTP client locally.

@fbq
Member

fbq commented Sep 2, 2024

Maybe you could list some of your pain points in the email workflow.

@nyurik
Author

nyurik commented Sep 2, 2024

I think the disagreement is not about the specific workflow, but about the size of the entry barrier for the gen-Gs - "the GitHub generation", i.e. how many volunteers will be deterred from participating because of an unfamiliar workflow.

We tend to evaluate complexity as related to ourselves, but this is not a good metric. I see 895 PRs at torvalds/linux and 784 PRs in this repo. Most PRs are closed without merging. This is an insanely large number of PRs that likely were too small to warrant learning the workflow, but combined would have considerably improved the codebase. Maintainers might have a better idea of all these PRs though.

In software, we create compatibility layers to support older systems and APIs - because a compatibility layer is cheaper than rewriting them. With volunteers, it is the same thing - you cannot expect to re-educate an insanely large gen-G population to use an unfamiliar workflow. At best, you will educate a few, while the vast majority will go elsewhere.

So if the conversion rate is low (as I suspect it is), and the maintainer average age keeps growing, I think there is a problem for the sustainability of the project. Granted that this can be counter-balanced with large cash infusions, i.e. people being paid to work on it, but the cost will continue increasing until it goes into a Cobol maintenance mode... :)

@fbq
Member

fbq commented Sep 2, 2024

I think the disagreement is not about the specific workflow, but about the size of the entry barrier for the gen-Gs - "the GitHub generation", i.e. how many volunteers will be deterred from participating because of an unfamiliar workflow.

We tend to evaluate complexity as related to ourselves, but this is not a good metric. I see 895 PRs at torvalds/linux and 784 PRs in this repo. Most PRs are closed without merging. This is an insanely large number of PRs that likely were too small to warrant learning the workflow, but combined would have considerably improved the codebase. Maintainers might have a better idea of all these PRs though.

So I looked at the 895 PRs in torvalds/linux (by sampling), and it seems most of them are single-commit PRs. Let's say we lost ~1000 commits since 2011. However, there are 14561 commits between Linux v6.9 and v6.10, and that's roughly just two months. So although I would feel bad if we lost a single talent, in terms of commit numbers the impact of these PRs seems unobservable based on this metric.
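To put those two numbers side by side, here is the arithmetic as a short Python snippet. It uses only the figures quoted above; the 13-year span (2011 to 2024) and the variable names are assumptions added for illustration.

```python
lost_commits = 1000       # rough estimate of commits lost via unmerged PRs since 2011
release_commits = 14561   # commits between Linux v6.9 and v6.10
years = 2024 - 2011       # assumed span covered by the estimate

share_of_one_release = lost_commits / release_commits
lost_per_year = lost_commits / years

print(f"{share_of_one_release:.1%} of a single release cycle")  # 6.9%
print(f"about {lost_per_year:.0f} commits per year")            # about 77
```

In other words, the whole estimated backlog amounts to under 7% of one two-month release cycle, which is the sense in which the impact is hard to observe by commit count alone.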

In software, we create compatibility layers to support older systems and APIs - because a compatibility layer is cheaper than rewriting them. With volunteers, it is the same thing - you cannot expect to re-educate an insanely large gen-G population to use an unfamiliar workflow. At best, you will educate a few, while the vast majority will go elsewhere.

Not sure I can agree on this.

  • First, I won't put a label on people: each individual's situation may vary. There are people I know who used to use GitHub a lot but are OK with the email workflow in the kernel, and there are people who heavily use the email workflow in the kernel but love GitHub. So I'm not sure about the "gen-G" that you want to represent here, or whether they want somebody to represent them ;-)
  • Second, as I mentioned above, if one wants to contribute to a project long-term, one had better get familiar with the main workflow, because that way they will get feedback from existing members more easily, even though the initial days may be tough. This reminds me (not a native English speaker) of trying to use poor English (it might still be poor now ;-)) to work in the community in my early days. No doubt it was tough, but the result was rewarding, because once I passed a few barriers, I was able to communicate with people easily. If I had stayed in my comfort zone, I'm pretty sure I would have gained nothing (but maybe it would have been a loss to the community, I'm not sure ;-)).
  • Last but not least, b4 already provides an easy way to convert a branch into a patchset and send it out. To your point about rewriting, why do we reinvent another wheel here? I'd argue b4 is better than a PR bot because it's local, and you don't need a GitHub account to use it: you mentioned the GitHub generation above, but how about the "GitLab generation"?

I'm not here to say that "the old way is always better", it is really:

  • Community is all about people; understanding the existing workflow always helps a newcomer communicate with the community.
  • I trust everyone's capability and potential.
  • Evolving from the existing tools and practices is much easier than reinventing the wheel.

So if the conversion rate is low (as I suspect it is), and the maintainer average age keeps growing, I think there is a problem for the sustainability of the project. Granted that this can be counter-balanced with large cash infusions, i.e. people being paid to work on it, but the cost will continue increasing until it goes into a Cobol maintenance mode... :)

@ojeda
Member

ojeda commented Sep 2, 2024

Most PRs are closed without merging.

This is false, at least for Rust for Linux. You are probably looking at the first few pages. Those PRs were closed because they were applied as patches, via the usual patch workflow, not via GitHub. For the older pages, when we used GitHub, most PRs are in fact merged.

In summary, what one can see in GitHub PRs has little to do with the actual development that is going on nowadays in mainline. Please see https://rust-for-linux.com/contributing.

@vincenzopalazzo

This is false, at least for Rust for Linux. You are probably looking at the first few pages. Those PRs were closed because they were applied as patches, via the usual patch workflow, not via GitHub. For the older pages, when we used GitHub, most PRs were merged.

Confirming it: I personally helped upstream, via the ML, some PRs that are open on GitHub, and @ojeda also ensured that all Linux guidelines were followed.

ojeda added the misc label (Related to other topics, e.g. CI) on Sep 3, 2024

7 participants