Codebase git mirroring #718

Open
wants to merge 22 commits into
base: main

@sgfost (Contributor) commented May 7, 2024

part 1 (1-way mirror):

adds a button that lets model submitters create an auto-updating, read-only git repository archive hosted under a central GitHub organization

additions

  • include CITATION.cff and LICENSE files in archive packages (resolves comses/planning#234)
    • CITATION.cff is translated from codemeta
    • LICENSE file is built from license text templates in the License model
  • library.fs.CodebaseGitRepositoryApi: functionality for building/updating a git repository from a Codebase
  • library.github.GithubApi: provides an interface over PyGithub for interacting with repositories on github
  • library.github.GithubRepoNameValidator: provides validate() to make sure a repo name is valid and unused
  • huey for async task processing which runs on the server container (resolves comses/planning#231)
    • mirror_codebase() and update_mirrored_codebase() huey tasks which call the CodebaseGitRepositoryApi to build the git repo on the file system and then GithubApi to create/push to the remote
  • feature overview page at /github/
  • button + form on the release detail page for mirroring a codebase

configuration steps

  1. create a GitHub App on the comses-model-library organization with the following permissions:
    • Administration: read and write
    • Contents: read and write
    • Metadata: read only
  2. generate client secret and private key
  3. install the app on the organization and add the installation id to .env
  4. add the app id, client id, and organization name to .env
  5. add the private key and client secret to secrets/
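
A minimal sketch of how the values from steps 3 to 5 might be collected at startup. The environment variable names, secret file names, and the `GithubAppConfig` class are all hypothetical; the PR does not list the actual names it uses.

```python
from dataclasses import dataclass
from pathlib import Path

# Hypothetical names: the PR does not state the real .env keys or secret file names.
ENV_KEYS = {
    "app_id": "GITHUB_APP_ID",
    "client_id": "GITHUB_CLIENT_ID",
    "installation_id": "GITHUB_INSTALLATION_ID",
    "organization": "GITHUB_ORG_NAME",
}
SECRET_FILES = {
    "private_key": "github_private_key",
    "client_secret": "github_client_secret",
}


@dataclass(frozen=True)
class GithubAppConfig:
    app_id: str
    client_id: str
    installation_id: str
    organization: str
    private_key: str
    client_secret: str


def load_github_app_config(env, secrets_dir):
    """Read ids from an env mapping and secrets from files, failing early if any id is absent."""
    missing = [var for var in ENV_KEYS.values() if not env.get(var)]
    if missing:
        raise RuntimeError("missing .env values: " + ", ".join(missing))
    values = {field: env[var] for field, var in ENV_KEYS.items()}
    for field, filename in SECRET_FILES.items():
        values[field] = (Path(secrets_dir) / filename).read_text().strip()
    return GithubAppConfig(**values)
```

In a real deployment `env` would be `os.environ` and `secrets_dir` the mounted `secrets/` directory; failing at load time surfaces a skipped configuration step before any mirroring task runs.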

django/library/views.py Fixed
@sgfost sgfost force-pushed the git-mirror branch 2 times, most recently from ca4ade6 to 4be9693 Compare September 11, 2024 23:08
django/library/views.py Fixed
@sgfost sgfost force-pushed the git-mirror branch 2 times, most recently from 6b4e25f to 5a356fb Compare October 4, 2024 19:34
django/library/views.py Dismissed
@sgfost sgfost marked this pull request as ready for review October 18, 2024 22:47
sgfost added 11 commits December 5, 2024 11:37
* fix release ordering to sort by semantic version number rather than by
  string
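
To illustrate why string ordering misplaces releases, here is a small sketch. It assumes version numbers are plain dot-separated integers such as `1.10.0`; `version_key` is a hypothetical helper, not the function used in the PR.

```python
def version_key(version_number: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0) so that comparison is numeric per component."""
    return tuple(int(part) for part in version_number.split("."))


versions = ["1.10.0", "1.2.0", "1.9.1"]

# String comparison puts "1.10.0" first because "1" < "2" at the third character.
by_string = sorted(versions)

# Numeric comparison of each component gives the intended release order.
by_semver = sorted(versions, key=version_key)

print(by_string)  # ['1.10.0', '1.2.0', '1.9.1']
print(by_semver)  # ['1.2.0', '1.9.1', '1.10.0']
```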
this API is responsible for managing a local git repository mirror for a
comses codebase. each PUBLIC release archive is a commit/tag in the history

`build()` and `append_releases()` are the two main API methods which
construct (or rebuild) a git repo and add new releases to the repo,
respectively
* indicate that ordered releases method on codebase now returns a list and
  add public_releases() which returns a queryset
currently CITATION.cff and LICENSE files are not retroactively inserted
into existing archive packages since that would require rebuilding
everything. Generating git repos, however, will add them if they are missing

** includes an experimental refactor of metadata transformations which
is used to implement the citation file format generation

resolves comses/planning#234
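
As a rough illustration of a one-way codemeta to CITATION.cff translation, the sketch below maps a handful of codemeta fields onto their Citation File Format 1.2.0 counterparts. `codemeta_to_cff` is a hypothetical function, the field coverage is deliberately small, and this is not the transformer implemented in the PR. Serializing the result to YAML is left out.

```python
def codemeta_to_cff(codemeta: dict) -> dict:
    """Map a small subset of codemeta fields to CITATION.cff keys (illustrative only)."""
    cff = {
        "cff-version": "1.2.0",
        "message": "If you use this software, please cite it using these metadata.",
        "title": codemeta["name"],
        "authors": [
            {"given-names": a.get("givenName", ""), "family-names": a.get("familyName", "")}
            for a in codemeta.get("author", [])
        ],
    }
    # Optional fields are copied only when present in the codemeta input.
    optional = {
        "version": "version",
        "datePublished": "date-released",
        "codeRepository": "repository-code",
    }
    for codemeta_key, cff_key in optional.items():
        if codemeta.get(codemeta_key):
            cff[cff_key] = codemeta[codemeta_key]
    return cff


example = codemeta_to_cff(
    {
        "name": "Example Model",
        "version": "1.2.0",
        "author": [{"givenName": "Ada", "familyName": "Lovelace"}],
    }
)
```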
currently a synchronous process with no error handling

* add file size checking to the git repo fs api
this allows for retries regardless of where the failure occurred

the main points are:
- only build the git repo if it doesn't exist
- only create the github repo if it doesn't exist
- pushing changes that exist remotely is already handled gracefully
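
The retry-friendly flow described in the points above can be sketched with in-memory stand-ins. `FakeRemote` and `mirror` are hypothetical and only model the three checks; they are not the PR's `GithubApi` or huey tasks.

```python
class FakeRemote:
    """In-memory stand-in for a remote host, used only for this illustration."""

    def __init__(self):
        self.repos = {}  # repo name -> set of pushed release tags
        self.create_calls = 0

    def exists(self, name):
        return name in self.repos

    def create(self, name):
        self.create_calls += 1
        self.repos[name] = set()

    def push(self, name, tags):
        # Pushing tags that already exist remotely changes nothing.
        self.repos[name] |= set(tags)


def mirror(local_repos, remote, name, releases):
    """Each step checks for existing state first, so a retry resumes where it failed."""
    if name not in local_repos:       # only build the local repo if it doesn't exist
        local_repos[name] = list(releases)
    if not remote.exists(name):       # only create the remote repo if it doesn't exist
        remote.create(name)
    remote.push(name, local_repos[name])  # safe to repeat


local, remote = {}, FakeRemote()
mirror(local, remote, "example-model", ["1.0.0", "1.1.0"])
mirror(local, remote, "example-model", ["1.0.0", "1.1.0"])  # simulated retry
```

Running `mirror` a second time leaves the remote unchanged and does not create the repository again, which is the property that makes a blanket retry safe.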
https://huey.readthedocs.io/

The huey service is essentially a mirror of the server container, connected
to the same db and redis services, that runs the huey consumer process

**setup is currently only for development**
* now only attempt to create github releases if they don't already exist
* fixed bug where local_releases were being overwritten with only the
  last updated releases instead of adding to the set
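
The overwrite bug can be shown with a tiny sketch. The `state` dict and both function names are hypothetical; only the idea that `local_releases` is a set comes from the commit message.

```python
def record_updated_buggy(state, updated):
    # Bug: replaces the whole set, so releases recorded earlier are lost.
    state["local_releases"] = set(updated)


def record_updated_fixed(state, updated):
    # Fix: union the newly updated releases into the existing set.
    state["local_releases"] |= set(updated)


buggy = {"local_releases": {"1.0.0", "1.1.0"}}
record_updated_buggy(buggy, ["1.2.0"])

fixed = {"local_releases": {"1.0.0", "1.1.0"}}
record_updated_fixed(fixed, ["1.2.0"])

print(sorted(buggy["local_releases"]))  # ['1.2.0']
print(sorted(fixed["local_releases"]))  # ['1.0.0', '1.1.0', '1.2.0']
```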
sgfost added 11 commits December 5, 2024 11:37
this is considerably easier to manage than creating a clone service; not
sure if there is any potential downside to not isolating the two
processes
checks that repo names are valid and available
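
A sketch of what such a check could look like. The naming rules encoded here (ASCII letters, digits, `.`, `-`, `_`, at most 100 characters, case-insensitive uniqueness) are assumptions about GitHub's constraints, and `existing_names` stands in for a lookup against the organization; the PR's `GithubRepoNameValidator` may differ on every point.

```python
import re

# Assumed rule set; verify against GitHub's current documentation before relying on it.
NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,100}")


def validate_repo_name(name, existing_names):
    """Raise ValueError if the name is malformed or already taken; return it otherwise."""
    if not NAME_PATTERN.fullmatch(name) or name in (".", ".."):
        raise ValueError(f"invalid repository name: {name!r}")
    # Assumption: names are compared case-insensitively within an organization.
    if name.lower() in {existing.lower() for existing in existing_names}:
        raise ValueError(f"repository name already in use: {name!r}")
    return name


def is_valid(name, existing_names):
    """Boolean convenience wrapper around validate_repo_name."""
    try:
        validate_repo_name(name, existing_names)
    except ValueError:
        return False
    return True
```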

TODO: add a /github page for more in depth information about the feature
and a summary in the action modal
will need to really clean up metadata conversions, especially before
adding syncing, but the one-way transformers idea likely won't hold up

ideas for a better approach:
- pydantic for validation/structuring
- codemeta as an intermediate format
adding these manually was an easily forgotten step that wouldn't be
noticed in dev but would fail to build in prod
mirroring strategy is not currently planned to be used with user
repositories

* use absolute url in github description