Refactor for builder changes and Individual Datasets #7

Merged

@JLoveUOA commented Jul 24, 2024

Two major changes in this PR.

Refactored based on builder changes

  • The encrypted graph is now expected to be written within the RO-Crate graph
  • Sensitive data now generates recipient entities based on the public keys supplied
  • Bulk encryption of the output archive is now supported via the --bulk-encrypt option
  • Metadata holds schema URL information
  • Groups are represented as ACLs
  • New data types are created in the RO-Crate
    • Organizations
    • Facilities
    • Groups (for both ACLs and facility manager groups)
    • ACLs as Digital Document Permissions

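As a rough illustration of "ACLs as Digital Document Permissions", the sketch below builds the JSON-LD entities such a crate might contain. The `DigitalDocumentPermission` type and its `grantee` and `permissionType` properties come from schema.org; the entity IDs, the use of `Audience` for the group, and attaching the permission to a `Dataset` are assumptions for illustration, not the layout this PR actually writes.

```python
import json


def make_acl_entities(group_id: str, group_name: str, dataset_id: str) -> list[dict]:
    """Build a group, a read permission for that group, and a dataset that uses it."""
    group = {
        "@id": group_id,
        "@type": "Audience",  # assumption: the PR does not say which type a group uses
        "name": group_name,
    }
    permission = {
        "@id": f"#acl-{group_name}-read",  # hypothetical ID scheme
        "@type": "DigitalDocumentPermission",
        "permissionType": {"@id": "https://schema.org/ReadPermission"},
        "grantee": {"@id": group_id},
    }
    dataset = {
        "@id": dataset_id,
        "@type": "Dataset",
        # schema.org defines this property on DigitalDocument; using it on a
        # Dataset is an assumption made for this sketch.
        "hasDigitalDocumentPermission": {"@id": permission["@id"]},
    }
    return [group, permission, dataset]


entities = make_acl_entities("#group-print-lab", "print-lab", "dataset-1/")
print(json.dumps(entities, indent=2))
```

Keeping the permission as its own entity, referenced by `@id`, lets several datasets share one ACL entry.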
Print Lab Extractor now produces individual dataset crates

  • Datasets now collect all sub-files and directories only if 'crate children' is specified in the input worksheet
    • The --duplicate-directory parameter will also cause each dataset to fully duplicate the crate source.
  • The --split-datasets parameter now produces an individual dataset RO-Crate for each dataset in the spreadsheet
  • Using --duplicate-directory and --split-datasets together is not recommended (this duplicates your data N times, where N is your number of datasets)
  • Each dataset holds an individual copy of all relevant metadata
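One way to surface the flag interaction described above is a guard that warns when both options are set. This is only a sketch: the flag names are taken from the PR description, but the warning itself is an assumption (the PR does not say the tool emits one), and `argparse` is used here purely to keep the example dependency-free, whereas the project lists `click` in its dependencies.

```python
import argparse
import warnings


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the extractor's command-line interface."""
    parser = argparse.ArgumentParser(description="Hypothetical extractor CLI")
    parser.add_argument("--split-datasets", action="store_true",
                        help="write one RO-Crate per dataset")
    parser.add_argument("--duplicate-directory", action="store_true",
                        help="copy the full crate source into each dataset")
    return parser


def check_flags(args: argparse.Namespace, n_datasets: int) -> bool:
    """Warn and return True when both flags are set, otherwise return False."""
    if args.split_datasets and args.duplicate_directory:
        warnings.warn(
            f"--split-datasets with --duplicate-directory copies the source "
            f"data {n_datasets} times (once per dataset)",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False


args = build_parser().parse_args(["--split-datasets", "--duplicate-directory"])
check_flags(args, n_datasets=5)
```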

Added an ICD-11 lookup API for medical condition data

Added tests for all print lab data extraction and generic metadata extraction

pyproject.toml Outdated
ro-crate-py = {git = "https://github.com/UoA-eResearch/ro-crate-py.git", branch = "encrypted-metadata"}
openpyxl = "^3.1.5"
click = "^8.1.7"
faker = "^27.4.0"

Collaborator

Do Faker, FactoryBoy and Hypothesis want to sit in general dependencies or should these be dev dependencies? I.e. do we expect our users to run the tests?

Just a thought

Contributor Author

Moved them to a test group. I don't think I even ended up needing Hypothesis and Factory Boy, so I will remove those.

Collaborator

Should this be in a separate .env file, or in conftest.py?

Contributor Author

There is a value in the ENV file that overrides this with a new pubkey; this one is just hard-coded in as the default.
I should definitely update the email to a dummy value.



def fake_upi(faker: Faker) -> str:
    return faker.bothify(text="???###", letters="abcdefghijklmnopqrstuvwxyz")

Collaborator

Is this missing a character? Should be ????### IIRC :)
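To make the pattern question concrete, here is a dependency-free sketch that generates and validates a UPI-like string. It assumes the format the reviewer recalls (four lowercase letters followed by three digits), which the thread does not confirm, and it uses the standard library's `random` module as a stand-in for Faker's `bothify`.

```python
import random
import re
import string

# Assumed format: four lowercase letters then three digits (per the reviewer's recollection).
UPI_PATTERN = re.compile(r"[a-z]{4}[0-9]{3}")


def fake_upi(rng: random.Random, n_letters: int = 4, n_digits: int = 3) -> str:
    """Return a random identifier made of n_letters letters followed by n_digits digits."""
    letters = "".join(rng.choice(string.ascii_lowercase) for _ in range(n_letters))
    digits = "".join(rng.choice(string.digits) for _ in range(n_digits))
    return letters + digits


upi = fake_upi(random.Random(0))
# A three-letter value, like the one "???###" produces, would fail this check.
assert UPI_PATTERN.fullmatch(upi) is not None
```

Pinning the expected shape in a regular expression next to the generator means a mismatch like the one raised here would fail a test instead of relying on a reviewer noticing it.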

Collaborator

I'm guessing this is included by mistake :)

Contributor Author

Yes, removed :)



@responses.activate
def test_mytardis_rest_agent_get(

Collaborator

Not sure what this test is aiming to do. The response has been mocked so asserting that the status is 200 is not useful here. It does demonstrate that a get request has been made :)

Contributor Author

That is basically all it was for: checking that a request happens at all. I added a few tests for bad requests to make it more complete.
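The distinction discussed in this thread can be sketched as follows: assert on what the code under test did (the outgoing request, the error handling) and not on values the mock itself supplied. `RestAgent`, `BadRequestError`, and the URL are hypothetical stand-ins, and `unittest.mock` is used here in place of the `responses` library so the example runs without extra dependencies.

```python
from unittest.mock import Mock

URL = "https://example.org/api/v1/dataset/"  # hypothetical endpoint


class BadRequestError(Exception):
    """Hypothetical error raised for 4xx and 5xx responses."""


class RestAgent:
    """Hypothetical stand-in for the agent under test."""

    def __init__(self, session):
        self.session = session

    def get(self, url: str) -> dict:
        response = self.session.request("GET", url)
        if response.status_code >= 400:
            raise BadRequestError(f"{response.status_code} for {url}")
        return response.json()


def test_get_sends_request() -> None:
    session = Mock()
    session.request.return_value = Mock(status_code=200, json=Mock(return_value={}))
    RestAgent(session).get(URL)
    # Checks the agent's behaviour (one GET to the right URL), not the mocked status.
    session.request.assert_called_once_with("GET", URL)


def test_get_raises_on_bad_request() -> None:
    session = Mock()
    session.request.return_value = Mock(status_code=400)
    try:
        RestAgent(session).get(URL)
    except BadRequestError:
        return
    raise AssertionError("expected BadRequestError for a 400 response")
```

The second test exercises a branch in the agent itself, which is the kind of coverage a mocked 200 response cannot provide.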


response = mt_rest_agent.no_auth_request("GET", url)
do_not_use_auth.assert_not_called()
assert isinstance(response.json(), dict)

Collaborator

These three tests are mocked so asserting them is surplus to requirements I think.

Contributor Author

Reduced it down to just checking that the auth is not used.
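A reduced test of that shape might look like the sketch below. The method name `no_auth_request` and the `assert_not_called()` check come from the snippet under review; the `RestAgent` class, its constructor, and the URL are hypothetical stand-ins for the real agent.

```python
from unittest.mock import Mock

URL = "https://example.org/api/v1/introspection/"  # hypothetical endpoint


class RestAgent:
    """Hypothetical agent with an authenticated and an unauthenticated request path."""

    def __init__(self, session, auth):
        self.session = session
        self.auth = auth  # callable that returns credentials when invoked

    def request(self, method: str, url: str):
        return self.session.request(method, url, auth=self.auth())

    def no_auth_request(self, method: str, url: str):
        return self.session.request(method, url)


def test_no_auth_request_skips_auth() -> None:
    session = Mock()
    do_not_use_auth = Mock(return_value=("user", "api-key"))
    agent = RestAgent(session, do_not_use_auth)

    agent.no_auth_request("GET", URL)

    # The only behaviour under test: credentials are never requested.
    do_not_use_auth.assert_not_called()
    session.request.assert_called_once_with("GET", URL)
```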

@JLoveUOA merged commit 40ca2eb into main Sep 24, 2024
1 check passed