Tracking Removed Items? #53

Open
dleve123 opened this issue Jan 4, 2022 · 0 comments

dleve123 commented Jan 4, 2022

Hi @simonw – thanks so much for this awesome tool!

I am working on an analytics project for a dataset managed as a JSON file in an OSS GitHub repo (https://github.com/the-commons-project/vci-directory/blob/main/vci-issuers.json) and would love to convert the JSON file into something queryable by a BI tool (Looker, Mode, etc.).

git-history has been working great out of the box, with one exception: it doesn't seem to track the removal of items in a first-class way. To be more concrete, one commit in my dataset added many items erroneously, and they were subsequently removed. Ideally their removal would be recorded somewhere, but I don't believe it is (please correct me if I'm wrong).

To better demonstrate the object removal, below is a screenshot generated from manually counting objects from the JSON file in question at every commit (not using git-history):

[Screenshot: number of objects in the JSON file at each commit]
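
A rough sketch of how removals could be detected outside git-history, by diffing the set of `iss` values between consecutive versions of the file. This is not part of git-history: `removed_ids` and `snapshots_from_git` are hypothetical helper names, and the sketch assumes every committed version parses as JSON with a `participating_issuers` list whose items carry an `iss` key.

```python
import json
import subprocess


def removed_ids(snapshots, id_key="iss"):
    """Yield (commit, sorted removed ids) for each commit that drops items.

    `snapshots` is an iterable of (commit_hash, items) pairs, oldest first.
    Duplicate ids within one snapshot collapse, since ids are held in a set.
    """
    previous = None
    for commit, items in snapshots:
        current = {item[id_key] for item in items}
        if previous is not None:
            gone = sorted(previous - current)
            if gone:
                yield commit, gone
        previous = current


def snapshots_from_git(path="vci-issuers.json"):
    """Yield (commit, items) for each committed version of `path`, oldest first.

    Assumption: called from inside a checkout of the repository.
    """
    hashes = subprocess.run(
        ["git", "log", "--reverse", "--format=%H", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    for commit in hashes:
        shown = subprocess.run(
            ["git", "show", f"{commit}:{path}"],
            capture_output=True, text=True,
        )
        if shown.returncode != 0:
            # The file does not exist at this commit (e.g. it was deleted).
            continue
        yield commit, json.loads(shown.stdout)["participating_issuers"]
```

Calling `removed_ids(snapshots_from_git())` from inside the repository would list, per commit, the ids that were present in the previous version but are missing from the current one.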

The expression I'm using to generate the database:

$ git-history file vci-issuers.db vci-issuers.json --namespace issuers --id iss --convert 'json.loads(content)["participating_issuers"]' --ignore-duplicate-ids

Any recommendations/tips? Happy to help here!
