Find your own company #5

Open
abitrolly opened this issue Jan 23, 2020 · 8 comments

Comments

@abitrolly
Contributor

Currently both the list of companies and the list of emails are hardcoded. At the very least, instructions on how to build the stats for your own company or employees could be added.

There is a concern that at other companies the requirement to use a corporate email for contributions is not so strict, especially if contributors or maintainers are hired or sponsored by corporations to do specific jobs.

@patrickstephens1
Member

I will respond to this in two parts.

  1. It is a good idea to add instructions for how to modify this file. A little background - to create this list we reviewed email addresses associated with commits to GitHub and then filtered out which ones were from companies (commercial organizations), leading to the creation of this list in the SQL file. Since the concept of OSCI was to rank companies (following on from some earlier published studies of this kind), we excluded email addresses from universities, freemail providers, etc. It was a fair bit of effort to research all the email domains and decide which category each belongs in. We also had to combine email domains where a single organization uses several. It's possible we missed some companies or subdomains in this exercise, and of course new companies will need to be added from time to time. So this list needs to be maintained, and yes, you are right, it will be necessary to publish the rationale and instructions.
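A minimal sketch of the kind of domain filter described above, written in Python purely for illustration (the actual list lives in the project's SQL file and is not reproduced here). All domain names, company names, and the `company_for_email` helper are hypothetical; the parent-domain fallback is an assumption about how subdomains could be handled, not a description of what OSCI does.

```python
from typing import Optional

# Hypothetical data: the real OSCI mapping is kept in a SQL file.
COMPANY_DOMAINS = {
    "examplecorp.com": "ExampleCorp",
    "examplecorp.io": "ExampleCorp",  # second domain, same organization
    "othercorp.example": "OtherCorp",
}

# Domains deliberately left out of the ranking (freemail, universities, ...).
EXCLUDED_DOMAINS = {"gmail.com", "university.example"}


def company_for_email(email: str) -> Optional[str]:
    """Return the company for a commit email, or None if it is not counted."""
    _, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not domain:
        return None  # not a usable email address
    labels = domain.split(".")
    # Try the full domain first, then each parent domain, so that
    # "mail.examplecorp.com" still resolves to "examplecorp.com".
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in EXCLUDED_DOMAINS:
            return None
        if candidate in COMPANY_DOMAINS:
            return COMPANY_DOMAINS[candidate]
    return None
```

Under these assumptions, adding a company means adding one entry per domain it uses, and a domain that appears in neither set is simply not counted.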

@patrickstephens1
Member

  2. The other issue is people using non-corporate emails. We researched many ways to identify which organization contributions were coming from, such as the contributor's profile and the org of the repo. All of these have pros and cons, and in the end we concluded that - for now - we would use the email domain, even knowing that this will undercount the totals. There is no perfect science to these analyses, but we felt this still gives valuable results. We would like to return to the idea of improving this algorithm. That task will need a bit of scoping first, which I do plan to work on.

@abitrolly
Contributor Author

There is a pain point in open source projects around contributions made by someone who is under a signed contract with a company. The burden of proving that a project doesn't receive corporate code through a third-party contribution is placed on the open source project, resulting in various CLAs and conditional acceptance of contributions. This really hurts.

What could be improved from the corporate side is to make it clear and public which contributions are sponsored or covered by an existing contract. This could also set some standards that bring clarity to contracts stating that any code a person writes belongs to the company. If it becomes the responsibility of company lawyers to track a person's official involvement with the company and to maintain that record online, then open source developers will feel less pressure over these legal issues, and OSCI could get fine-grained information about which emails were involved with which projects, for which company, in a given period.

@gitaroktato

@patrickstephens1 - I'm a member of EPAM and have a GitHub account, but I'm not part of the EPAM organization on GitHub. In my profile, I have plenty of MIT-licensed stuff that I use in conferences and workshops. So how do I show up as an EPAM contributor in OSCI based on the current algorithm?

@patrickstephens1
Member

@gitaroktato Hey Oresztesz. The OSCI algorithm uses the email domain of the author of the commit. We have a filter which picks out all the company email domains we have found in our analysis. Anyone (EPAM or other) who wants their contributors to be picked up should set their company email address on their public profile, or alternatively it can be set at the repo level (see https://help.github.com/en/enterprise/2.19/user/github/setting-up-and-managing-your-github-user-account/setting-your-commit-email-address). Does that answer your question?
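To illustrate why the commit email matters, here is a rough Python sketch of counting commits per company from author emails alone. It is not the OSCI implementation: `COMPANY_BY_DOMAIN`, `count_commits_by_company`, and the sample addresses are hypothetical, and only an exact domain match is assumed.

```python
from collections import Counter

# Hypothetical domain-to-company filter; the real list is maintained in the OSCI repository.
COMPANY_BY_DOMAIN = {"examplecorp.com": "ExampleCorp"}


def count_commits_by_company(author_emails):
    """Count commits per company using only the commit author's email domain."""
    counts = Counter()
    for email in author_emails:
        domain = email.rpartition("@")[2].lower()
        company = COMPANY_BY_DOMAIN.get(domain)
        if company is not None:
            counts[company] += 1
    return counts


commits = [
    "dev@examplecorp.com",           # corporate email: counted
    "dev@users.noreply.github.com",  # same person, private email: not counted
    "Dev@ExampleCorp.com",           # case differences do not matter
]
print(count_commits_by_company(commits))
```

In this sketch the second commit is lost to the company total, which is the undercounting mentioned earlier in the thread. With git, the author email recorded in a commit comes from the `user.email` setting, which can be set globally or per repository.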

@patrickstephens1
Member

@abitrolly to an earlier question: the project README is now updated with instructions on how to add companies and email domains. Actually, we are almost ready to publish an update to this mapping, having recently completed another analysis of the email domains we see in larger numbers of commits. This will be done in the next week or so.

@abitrolly
Contributor Author

abitrolly commented Apr 27, 2020

@patrickstephens1 thanks for the heads up! The commit with instructions is 54e8526

Ideally the mappings should be in the repository root in a self-describing format. The files people interact with most often, custom mappings and configuration, are better not hidden so deep that specialized docs are needed to find them. Anyway, the docs are awesome.

@gitaroktato

@patrickstephens1 OK, I've changed my e-mail address in the git history. I hope it makes some impact 😃. Cheers!

3 participants