feature: Add metadataTypes endpoint - readonly; available for everyone #1428

Ingvord · 2024-09-19T15:15:08Z

Add required endpoint for #613

sbliven · 2024-09-30T14:00:31Z

What is a metadata type? Can you give an example? I feel like this change should be discussed.

Ingvord · 2024-09-30T22:10:57Z

@sbliven Thanks for asking!

Here is an example:

This is required for SciCatProject/frontend#1594

BTW, you can try it yourself; it is as easy as switching the git branch and rebuilding the project. Once rebuilt, open your browser and navigate to localhost:3000/explorer. You will see all the endpoints provided by this service, including the new /datasets/metadataTypes.

I recorded a short tutorial:

Peek.2024-10-01.00-08.mp4

Of course, that implies you have a dev or testing environment, i.e. a database etc.

If you need any guidance on how to do this in your environment, feel free to reach out to me. I will be more than happy to help.

sbliven

If I understand this correctly, every call to this endpoint iterates over every dataset, groups all top-level scientificMetadata fields by key, and then returns the JavaScript type of the values (or 'mixed' if multiple types are present).

I have multiple objections to this.

1. It doesn't work for hierarchical metadata.
2. The performance is going to be terrible on big instances. Facilities that don't enforce a single scientificMetadata schema will likely return tens of thousands of keys, most of which will be the useless 'mixed'.
3. I think this could better be achieved using schemas for facilities that do have standardized metadata.
4. It doesn't support the unitSI/valueSI convention already in use.
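To make the behaviour described above concrete, here is a minimal TypeScript sketch of that kind of scan. It is not the code from this PR: `DatasetLike`, `typeName`, `collectMetadataTypes`, and the sample values are all hypothetical, and the `{ value, unit, valueSI, unitSI }` shape is only an assumed example of the convention mentioned in the last objection.

```typescript
// Hypothetical, simplified dataset shape; the real model has many more fields.
interface DatasetLike {
  scientificMetadata?: Record<string, unknown>;
}

type MetadataType = "string" | "number" | "boolean" | "object" | "mixed";

// Map a value to a coarse type name based on JavaScript's typeof.
function typeName(value: unknown): MetadataType {
  const t = typeof value;
  if (t === "string" || t === "number" || t === "boolean") return t;
  return "object"; // nested objects, arrays, null, and anything else
}

// Visit every dataset and keep one type per top-level key.
// The cost grows with the total number of datasets and keys.
function collectMetadataTypes(datasets: DatasetLike[]): Map<string, MetadataType> {
  const result = new Map<string, MetadataType>();
  for (const dataset of datasets) {
    for (const [key, value] of Object.entries(dataset.scientificMetadata ?? {})) {
      const current = typeName(value);
      const previous = result.get(key);
      if (previous === undefined) {
        result.set(key, current);
      } else if (previous !== current) {
        result.set(key, "mixed"); // conflicting types across datasets
      }
    }
  }
  return result;
}

// Made-up sample data, for illustration only.
const sample: DatasetLike[] = [
  { scientificMetadata: { temperature: 293, sample: "LaB6" } },
  {
    scientificMetadata: {
      temperature: "293 K",
      energy: { value: 12, unit: "keV", valueSI: 1.92e-15, unitSI: "J" },
    },
  },
];

const types = collectMetadataTypes(sample);
```

In this sketch `temperature` ends up as 'mixed', while the nested `energy` entry is reported only as 'object', so neither its inner fields nor its units are visible. That illustrates the hierarchical-metadata and unitSI/valueSI objections.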

sbliven · 2024-10-02T11:47:07Z

I do see that the frontend needs this info for search auto completion. Would one of these work?

1. Prescriptive. Operators provide this table manually, e.g. as a config file. This avoids rebuilding it for every request. They could even implement a script which occasionally updates the configuration from the database, if that is required and the number of keys stays reasonable.
2. User input. Don't auto-complete keys or types, but allow users to specify them manually.
3. Index. Maintain an index specifically for this query. Dataset creation presumably becomes a bit slower, but this query would be efficient. Care would be needed to keep the index from growing too large.
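As a rough illustration of the index option, the sketch below keeps a key-to-type map that is updated once per created dataset, so answering the query never scans the datasets. Everything here is hypothetical: `MetadataTypeIndex`, `recordDataset`, `snapshot`, and the `maxKeys` cap are invented names, and a real implementation would presumably persist the index in the database rather than in memory.

```typescript
// Hypothetical in-memory stand-in for a persisted index.
class MetadataTypeIndex {
  private readonly types = new Map<string, string>();
  private readonly maxKeys: number;

  constructor(maxKeys: number) {
    this.maxKeys = maxKeys;
  }

  // Call once when a dataset is created; the cost depends only on
  // the number of top-level keys in that one dataset.
  recordDataset(scientificMetadata: Record<string, unknown>): void {
    for (const [key, value] of Object.entries(scientificMetadata)) {
      const current = typeof value;
      const previous = this.types.get(key);
      if (previous === undefined) {
        // Refuse new keys once the cap is reached, to bound index growth.
        if (this.types.size >= this.maxKeys) continue;
        this.types.set(key, current);
      } else if (previous !== current) {
        this.types.set(key, "mixed");
      }
    }
  }

  // Answer the query from the index alone; returns a copy.
  snapshot(): Map<string, string> {
    return new Map(this.types);
  }
}

const index = new MetadataTypeIndex(1000);
index.recordDataset({ temperature: 293, sample: "LaB6" });
index.recordDataset({ temperature: "293 K" });
```

Silently dropping keys beyond the cap is just one possible policy. Dataset deletion or updates are not handled here; they would need reference counts or a periodic rebuild.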


nitrosx · 2024-10-04T05:52:10Z

I have not reviewed the PR, but I agree with @sbliven that this feature needs to be discussed with the collaborators.
If the list is compiled when the request is submitted, performance will be poor for sure. Also, we need to consider all the different metadata structures that are used across facilities and be clear about which one we address.

An alternative solution might be to build this list offline at set intervals. This would improve performance, but we would need to accept that the list might not be up to date.
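The trade-off between performance and freshness can be sketched with a small time-based cache. This is only an illustration under stated assumptions, not the caching added later in this PR: `cached`, `ttlMs`, `getTypes`, and the placeholder result are hypothetical names and values.

```typescript
// Wrap an expensive computation so it runs at most once per ttlMs.
// Callers may receive a result that is up to ttlMs old.
function cached<T>(
  compute: () => T,
  ttlMs: number,
  now: () => number = Date.now, // injectable clock, useful for testing
): () => T {
  let entry: { value: T; expiresAt: number } | undefined;
  return () => {
    const t = now();
    if (entry === undefined || t >= entry.expiresAt) {
      entry = { value: compute(), expiresAt: t + ttlMs };
    }
    return entry.value;
  };
}

// Hypothetical usage: the arrow function stands in for the full dataset scan.
let scans = 0;
const getTypes = cached(() => {
  scans += 1;
  return { temperature: "number" }; // placeholder result
}, 60000);

getTypes();
getTypes(); // served from the cache; the scan ran only once
```

If `compute` returns a Promise, the Promise itself is cached, so concurrent callers share one in-flight scan. In that case a rejection would also stay cached until the entry expires.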

nitrosx · 2024-10-04T05:53:36Z

@Ingvord can you be a little bit more specific about why you need this endpoint?
More importantly, do you see any other solution to the issue that you are aiming to solve?

Ingvord · 2024-10-04T08:25:24Z

@sbliven Thanks for your comprehensive review!

In general I agree with your objections, though a few quantitative metrics are required here.

Your suggested solutions seem promising. I would strongly support the Prescriptive one. But I agree with @nitrosx that this requires approval from the collaborators.

I suggest we discuss this in more detail in a Zoom meeting (maybe on a smaller scale at first, say just the three of us), and then propose whatever consensus we reach at the regular collaborators meeting.

Thanks again for your input!

nitrosx · 2024-10-08T08:20:31Z

After talking to @Junjiequan, we should leverage the datasets/metadataKeys endpoint. It already provides the list of metadata keys. It needs some refactoring, but it can be extended to also cover the needs that this PR is trying to address.

Add metadataTypes endpoint - readonly; available for everyone

cd6c358

Ingvord mentioned this pull request Sep 19, 2024

Resolving remaining issues in the Search UI project SciCatProject/frontend#1594

Draft

4 tasks

Ingvord force-pushed the add-metadata-types-endpoint branch from 734ac73 to cd6c358 Compare October 2, 2024 11:00

sbliven reviewed Oct 2, 2024


Revert "Merge pull request #1404 from SciCatProject/Separate-Search-U…

0271aa3

…I-Logic-from-Frontend" This reverts commit 09d2c75, reversing changes made to 38780fd.

Ingvord added 3 commits November 12, 2024 10:41

Progress #1141: cache metadataTypes

31d3cd8

Progress #1141: cache metadataTypes

680d0bc

Progress #1141: fix cache time

e7f587f
