Use the repository ID to verify whether a GitHub username was squatted #140

Open
FichteFoll opened this issue May 25, 2021 · 3 comments

@FichteFoll
Contributor

Instead of marking packages as "needing review" whenever they were unavailable once, it would be more reasonable and robust to check the repository ID returned by GitHub's API, store it in the database, and flag a package as needing review only when the ID from the latest crawl differs from the one in the database.
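
A minimal sketch of the lookup this would rely on, assuming GitHub's REST endpoint `GET https://api.github.com/repos/{owner}/{repo}`, whose JSON response carries a numeric `id` field. The helper names `repo_id_from_response` and `fetch_repo_id` are hypothetical and not part of the existing crawler.

```python
import json
import urllib.request


def repo_id_from_response(body):
    """Extract the numeric repository ID from a GitHub API response body.

    Returns None when the payload has no integer "id" (for example an
    error document), so callers can treat the ID as unknown.
    """
    payload = json.loads(body)
    value = payload.get("id") if isinstance(payload, dict) else None
    return value if isinstance(value, int) else None


def fetch_repo_id(owner, repo):
    """Ask GitHub for a repository and return its ID (hypothetical helper).

    Network and HTTP errors are left to propagate to the caller.
    """
    url = "https://api.github.com/repos/{}/{}".format(owner, repo)
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return repo_id_from_response(response.read())
```

The crawler would store the returned value next to the package and compare it on the next crawl. Unauthenticated requests are rate limited, so a real crawler would probably reuse whatever credentials it already sends.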

@wbond
Owner

wbond commented May 27, 2021

Beyond this, the system will need to track whether a third-party domain has changed hands. It will also need to do the same sort of thing for GitLab and Bitbucket.

I'm not sure if there is an automated way to see if the domain has changed hands. Maybe whois can provide the first registration date and that can be used?

@FichteFoll
Contributor Author

Another thing to consider is a repo URL being changed deliberately. I do hope that this can be checked against the database, so that an ID is only compared for the same URL. Outside of custom-hosted packages, for which we'll probably still need the "was missing" check, the implementations for the three Git hosts should be very similar.
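
The "same URL only" rule could be sketched as a small pure function. The name `squat_suspected` and the URL normalization (ignoring case and a trailing slash) are assumptions made for illustration, not existing behavior.

```python
def _normalize(url):
    # Assumption: repository URLs that differ only in letter case or a
    # trailing slash point at the same repository.
    return url.rstrip("/").lower()


def squat_suspected(old_url, old_id, new_url, new_id):
    """Return True when the same URL now resolves to a different repository ID.

    A deliberately changed URL, or an ID that is unknown on either side,
    is not treated as suspicious by this check.
    """
    if _normalize(old_url) != _normalize(new_url):
        return False
    if old_id is None or new_id is None:
        return False
    return old_id != new_id
```

Because the function only takes a URL and an opaque ID, the same check would apply to any host that exposes a stable repository identifier.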

@kaste

kaste commented Jun 12, 2021

I don't think you should reach for a 100% solution here. Reducing false positives is an incremental process. Only handling GitHub reduces the stress as it's probably the 90% hoster nowadays. (And github.com can't change the owner without you reading it in the news.)

Say we just grab a uid from GitHub. In package.modify.store(values), values may now carry this uid (iff the provider supplies it). Within store() we already call cursor.fetchone() to decide whether to INSERT or UPDATE. On UPDATE we can now compare the old uid with the new uid and reject some changes, or allow the changes but immediately mark the package needs_review.
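
A rough sketch of the second variant (allow the change, but mark needs_review), using `sqlite3` as a stand-in database. The `packages` table, its columns, and this `store(connection, values)` signature are hypothetical; the real package.modify.store() will look different.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS packages (
    name TEXT PRIMARY KEY,
    repo_uid INTEGER,
    needs_review INTEGER NOT NULL DEFAULT 0
)
"""


def store(connection, values):
    """Insert or update a package row; return True if it was flagged."""
    cursor = connection.cursor()
    cursor.execute(
        "SELECT repo_uid FROM packages WHERE name = ?", (values["name"],)
    )
    row = cursor.fetchone()
    new_uid = values.get("repo_uid")  # absent if the provider has no uid

    if row is None:
        # First time we see this package: nothing to compare against.
        cursor.execute(
            "INSERT INTO packages (name, repo_uid) VALUES (?, ?)",
            (values["name"], new_uid),
        )
        connection.commit()
        return False

    old_uid = row[0]
    flagged = (
        old_uid is not None and new_uid is not None and old_uid != new_uid
    )
    # Keep the known uid when none was supplied, and never clear an
    # existing needs_review flag automatically.
    cursor.execute(
        "UPDATE packages SET repo_uid = COALESCE(?, repo_uid), "
        "needs_review = MAX(needs_review, ?) WHERE name = ?",
        (new_uid, int(flagged), values["name"]),
    )
    connection.commit()
    return flagged
```

Usage would be along the lines of `connection = sqlite3.connect(":memory:")`, `connection.execute(SCHEMA)`, then `store(connection, {"name": "Example", "repo_uid": 123})`. Rejecting the change instead would mean returning before the UPDATE when `flagged` is true.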

3 participants