Enabling metadata production for DataPipe; updated existing codebase to TypeScript and added miscellaneous functionalities. #99

Open: wants to merge 13 commits into base: test

Bankminer78

Metadata production in DataPipe

DataPipe can now generate metadata from incoming data created by jsPsych. Firebase and OSF each hold a copy of the parent metadata for an experiment. On every subsequent data upload call made to DataPipe, metadata is generated for the incoming data and compared against the existing metadata, and updates are made as necessary. Thus, with every data file upload, DataPipe maintains an up-to-date metadata file for the whole dataset.

The following functionalities have been created to serve said purpose:

  • A copy of the latest parent metadata for an experiment is stored in Firestore, and updates and reads are done in a transaction (to prevent race conditions),

  • If OSF and Firestore disagree about whether metadata exists, dedicated logic reconciles the two (metadata-block.ts),

  • Auxiliary metadata functions for the below purposes have been developed:

    • Comparing metadata generated for incoming data with existing metadata, and returning updated metadata (metadata-update.ts),
    • Downloading metadata from OSF (metadata-download.ts) [can be changed to a general download function],
    • Updating a file in OSF (update-file-OSF.ts),
    • Checking whether metadata exists in OSF and returning the file ID (metadata-process.ts),
    • Generating metadata from incoming data using jsPsych’s new metadata module [publication pending, so a local copy is used here] (metadata-production.ts),
  • Unit tests have been written for the metadata functions (test files with names beginning with metadata). A mock server has been created for this purpose (mock-server.ts),

  • A metadata panel on the dashboard now exists, with a button that toggles metadata production,

  • api-messages.ts has been updated to include messages pertaining to metadata and for data not found.

  • The FAQ and the Getting Started page have been updated to include basic user documentation for using metadata production with DataPipe.
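To illustrate the compare-and-update step described in the list above, here is a minimal sketch. It is not the code in metadata-update.ts: the `Metadata` and `VariableInfo` shapes and the `mergeMetadata` function are hypothetical simplifications, and the real jsPsych metadata format is richer. The sketch only covers the comparison; in the PR the surrounding read and write happen inside a Firestore transaction.

```typescript
// Hypothetical, simplified metadata shape (an assumption for illustration).
interface VariableInfo {
  name: string;
  levels: string[]; // distinct values observed so far
}

interface Metadata {
  variables: { [key: string]: VariableInfo };
}

interface MergeResult {
  metadata: Metadata;
  changed: boolean; // true when the stored copy would need to be rewritten
}

// Merge metadata generated for an incoming file into the existing parent
// metadata without mutating either input.
function mergeMetadata(existing: Metadata, incoming: Metadata): MergeResult {
  const merged: Metadata = { variables: {} };
  let changed = false;

  // Start from a copy of the existing metadata.
  for (const key of Object.keys(existing.variables)) {
    const info = existing.variables[key];
    if (info === undefined) continue;
    merged.variables[key] = { name: info.name, levels: info.levels.slice() };
  }

  // Add variables and levels that only appear in the incoming metadata.
  for (const key of Object.keys(incoming.variables)) {
    const info = incoming.variables[key];
    if (info === undefined) continue;
    const current = merged.variables[key];
    if (current === undefined) {
      merged.variables[key] = { name: info.name, levels: info.levels.slice() };
      changed = true;
      continue;
    }
    for (const level of info.levels) {
      if (current.levels.indexOf(level) === -1) {
        current.levels.push(level);
        changed = true;
      }
    }
  }

  return { metadata: merged, changed };
}
```

Returning a `changed` flag lets the caller skip the write to Firestore and OSF when an upload adds nothing new.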

Upgrading codebase to TypeScript

All functions in the functions directory of this version of DataPipe are now written in TypeScript. As of this PR, all functions have been rewritten to pass the TS build in ‘strict’ mode. Interfaces for commonly used variables have been created (interfaces.ts). Built function files can now be found in a generated lib directory (which has been added to .gitignore). Assertions have been kept to a minimum and are generally preceded by type guards and error throws.
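As a sketch of the "type guard, then throw" pattern mentioned above: the `ExperimentData` interface and its field names are hypothetical stand-ins (interfaces.ts is not reproduced here), and the helper names are illustrative only.

```typescript
// Hypothetical document shape; field names are assumptions for illustration.
interface ExperimentData {
  active: boolean;
  metadataActive: boolean;
}

// User-defined type guard: checks the shape at runtime and narrows the type.
function isExperimentData(value: unknown): value is ExperimentData {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as { [key: string]: unknown };
  return (
    typeof record["active"] === "boolean" &&
    typeof record["metadataActive"] === "boolean"
  );
}

// Throws instead of silently asserting when the data has the wrong shape.
function requireExperimentData(value: unknown): ExperimentData {
  if (!isExperimentData(value)) {
    throw new Error("Document does not match the ExperimentData shape");
  }
  return value; // narrowed by the guard, no further assertion needed
}
```

With this pattern the only assertion is the one inside the guard, and it follows a runtime check that the value is a non-null object.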

Other Additions

This PR also augmented DataPipe with features that have been mentioned in open issues.

Creation of Error Logging system with UI (Issue #76)

DataPipe now keeps track of all errors that occur within the scope of a specific experiment. writeLog (write-log.ts) has been changed so that it can write errors to an experiment’s log document in Firebase, and it is called every time an error occurs with an identifiable experimentID. Orphan error handling has not been addressed.

A basic UI (ErrorPanel.js) now appears on the experiment dashboard if a nonzero number of errors has been detected. The UI is persistent, and an accordion is used to show the logs to users in an accessible way. For each occurrence of an error, the error code and the timestamp of occurrence are shown.
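The per-experiment logging described above can be sketched roughly as follows. This is not the PR's writeLog: `recordError`, `LogStore`, and the error code string are hypothetical, and a plain in-memory object stands in for the Firestore log documents.

```typescript
// One entry per error occurrence: what the panel displays (code and timestamp).
interface ErrorLogEntry {
  code: string;
  timestamp: number; // milliseconds since the Unix epoch
}

// In-memory stand-in for per-experiment log documents (an assumption).
type LogStore = { [experimentID: string]: ErrorLogEntry[] };

// Records an error against an experiment. Returns false for "orphan" errors
// that have no identifiable experimentID, which are not logged.
function recordError(
  store: LogStore,
  experimentID: string | undefined,
  code: string,
  now: number = Date.now()
): boolean {
  if (experimentID === undefined || experimentID === "") {
    return false;
  }
  const existing: ErrorLogEntry[] | undefined = store[experimentID];
  const entries = existing === undefined ? [] : existing;
  entries.push({ code, timestamp: now });
  store[experimentID] = entries;
  return true;
}
```

A dashboard panel could then render `store[experimentID]` only when the list is non-empty, which matches the "nonzero number of errors" condition.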

validateJSON and validateCSV now work without requiredFields (Issue #95)

validateJSON (validate-json.ts) and validateCSV (validate-csv.ts) now take requiredFields as an optional parameter. If the input parses successfully in the appropriate format and no requiredFields is provided, both functions return true. For validateCSV, almost every string would parse as a valid CSV file, so extra strictness (adhering to Psych-DS) has been applied: a valid CSV now needs to have the same number of columns in every row [we should maybe change this?].
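A minimal sketch of the behaviour described above, not the code in validate-json.ts or validate-csv.ts. Two things are assumptions here: how requiredFields is checked (for JSON, each field must appear as a key in at least one row of a top-level array; for CSV, each field must appear in the header), and the CSV parsing, which is a naive split on commas that ignores quoted fields.

```typescript
// Returns true when the text parses as JSON and, if requiredFields is given,
// every required field appears in at least one row (assumed semantics).
function validateJSON(text: string, requiredFields?: string[]): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return false;
  }
  if (requiredFields === undefined) {
    return true;
  }
  if (!Array.isArray(parsed)) {
    return false;
  }
  const rows: unknown[] = parsed;
  return requiredFields.every((field) =>
    rows.some((row) => typeof row === "object" && row !== null && field in row)
  );
}

// Returns true when every row has the same number of columns as the header
// and, if requiredFields is given, the header contains all of them.
function validateCSV(text: string, requiredFields?: string[]): boolean {
  const rows = text
    .split(/\r?\n/)
    .filter((line) => line.length > 0)
    .map((line) => line.split(",")); // naive: does not handle quoted commas
  const header = rows[0];
  if (header === undefined) {
    return false; // empty input
  }
  if (!rows.every((row) => row.length === header.length)) {
    return false;
  }
  if (requiredFields === undefined) {
    return true;
  }
  return requiredFields.every((field) => header.indexOf(field) !== -1);
}
```

The equal-column-count rule is what rejects ragged input such as a data row with more cells than the header.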

Support for targeting subfolder during data upload (Issue #75)

Users can now include a forward slash in the filename parameter of pipe-plugin (like <subfolder-name>/<filename>) to target uploads to a specific subfolder. If the subfolder is not found, it is created (put-file-OSF.ts). This check is performed by a new function called parsePath (subfolder.ts).
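To show what splitting such a filename might look like, here is a small sketch. The signature and return shape below are hypothetical and not taken from subfolder.ts; the sketch only separates folder names from the final filename and does not talk to OSF.

```typescript
// Hypothetical result shape: folders to find or create, then the file itself.
interface ParsedPath {
  folders: string[];
  filename: string;
}

// Splits "<subfolder-name>/<filename>" into its folder segments and filename.
// Empty segments (leading, trailing, or doubled slashes) are ignored.
function parsePath(path: string): ParsedPath {
  const parts = path.split("/").filter((part) => part.length > 0);
  const filename = parts.pop();
  if (filename === undefined) {
    throw new Error("Filename is empty");
  }
  return { folders: parts, filename };
}
```

For example, `parsePath("sub-01/data.csv")` yields one folder segment and the filename, while a plain `"data.csv"` yields no folders, so the upload goes to the default location.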

Live Version

As of commit 15b14fb..., a live deployed version of this PR of DataPipe can be found at: https://datapipe-test.web.app

Bankminer78 changed the title from "Enabling metadata production for DataPipe; Updated existing codebase to TypeScript and added miscellaneous functionalities." to "Enabling metadata production for DataPipe; updated existing codebase to TypeScript and added miscellaneous functionalities." on Jul 9, 2024
functions/metadata/dist/index.cjs.map
Member

Can we use npm to install this from the github repo instead?

Author

I spent some time working on this, Josh, and the issue seems to be that npm does not support targeting subfolders on GitHub. So I used gitpkg to target a subfolder, but our repository does not host build files, so there was no dist folder for the package.json to point to. I then tried adding a prepare script to metadata's package.json, and that failed, probably because of gitpkg?

"build": "tsc",
"watch": "tsc --watch",
"deploy": "npm run build && firebase deploy --only functions",
"lint": "echo 'Linting is turned off, if you want to turn it on change this script to eslint '**/*.js''"

Member

Is this correct? Linting is off?

Author

It was a temporary workaround to avoid linting. It has now been set up to lint the ts files in ./src. (It is probably also worth noting that a lint ignore line was added so that it does not complain about using var when initializing JsPsychMetadata.)

Member

Can we test that the content of the error log is useful?

Author

Do you mean to add tests to check that logs are being generated?

Member

do we still need this?

Author

The mock server seemed like the most elegant solution at the time to test the ability of DataPipe to check if metadata exists in OSF. It is purely for testing purposes, and the metadata-emulator.test.js file relies on it.
