Change how nested objects are indexed #404

Merged
merged 4 commits into from
Sep 11, 2023

Conversation

max-zilla
Contributor

@max-zilla max-zilla commented Mar 8, 2023

Description

This is a proposed fix for a bug discovered in Clowder's process for indexing extractor metadata into Elasticsearch. In some cases where arrays were used, the previous code inadvertently cast nested JSON objects to long JSON strings. This PR modifies the indexer to retain the JSON structure. Features such as ES type inference (double vs. string, for example) are maintained.
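
To illustrate the symptom described above, here is a minimal Python sketch. It is not the indexer's actual code; the function names `index_as_string` and `index_structured` are hypothetical and only mimic the before/after shape of the indexed document.

```python
import json

# Sample extractor content with an array of nested objects.
content = {"Predictions": [{"class_name": "n01682714", "score": 0.7607384}]}


def index_as_string(doc):
    """Hypothetical 'before' shape: array values collapse into one JSON string."""
    return {k: json.dumps(v) if isinstance(v, list) else v for k, v in doc.items()}


def index_structured(doc):
    """Hypothetical 'after' shape: nested arrays and objects pass through unchanged."""
    return dict(doc)


old = index_as_string(content)
new = index_structured(content)

print(type(old["Predictions"]).__name__)              # str
print(type(new["Predictions"][0]["score"]).__name__)  # float
```

In the first shape the score is buried inside a string, so it cannot be inferred as a double or searched by key. In the second shape it remains a number under its own key.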

Affected instances would need to do the following to refresh/correct the search index:
POST /api/deleteindex
POST /api/reindex (this does not delete the index first, so the delete must be done manually beforehand)
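
The two calls above could be scripted roughly as follows. This is a sketch under assumptions: the base URL is a placeholder, and passing the API key as a `key` query parameter is an assumption about how the instance authenticates, so adjust it to match your deployment. The helper names are hypothetical.

```python
import urllib.parse
import urllib.request


def build_reindex_requests(base_url, api_key):
    """Return the two POST requests in the order they should be sent."""
    base = base_url.rstrip("/")
    # Assumption: the API key is accepted as a `key` query parameter.
    query = urllib.parse.urlencode({"key": api_key})
    return [
        urllib.request.Request(f"{base}{path}?{query}", data=b"", method="POST")
        for path in ("/api/deleteindex", "/api/reindex")  # delete first, then rebuild
    ]


def refresh_index(base_url, api_key):
    """Send both requests; urlopen raises on an HTTP error, stopping the sequence."""
    for req in build_reindex_requests(base_url, api_key):
        with urllib.request.urlopen(req) as resp:
            print(req.full_url.split("?")[0], resp.status)
```

Calling `refresh_index("https://clowder.example.org", "YOUR_API_KEY")` would send both requests in order. Building the requests separately makes the order easy to inspect without touching a live instance.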

Review Time Estimate

  • Immediately
  • Within one week
  • When possible

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • I have updated the CHANGELOG.md.
  • I have signed the CLA
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@max-zilla max-zilla marked this pull request as ready for review March 10, 2023 17:08
@robkooper robkooper requested review from lmarini and tcnichol April 14, 2023 16:14
@lmarini
Member

lmarini commented Sep 8, 2023

To test, upload extracted metadata containing an array of JSON objects and search for keys in the objects.

@lmarini lmarini merged commit 3e058a2 into develop Sep 11, 2023
6 checks passed
@lmarini
Member

lmarini commented Sep 11, 2023

Tested with

{
    "@context": [
        "https://clowder.ncsa.illinois.edu/contexts/metadata.jsonld",
        {
            "Predictions": "http://clowder.ncsa.illinois.edu/metadata/ncsa.tensorflow-parallel-dataset-image-classification#Predictions",
            "class_name": "http://clowder.ncsa.illinois.edu/metadata/ncsa.tensorflow-parallel-dataset-image-classification#Predictions.class_name",
            "class_description": "http://clowder.ncsa.illinois.edu/metadata/ncsa.tensorflow-parallel-dataset-image-classification#Predictions.class_description",
            "score": "http://clowder.ncsa.illinois.edu/metadata/ncsa.tensorflow-parallel-dataset-image-classification#Predictions.score"
        }
    ],
    "agent": {
        "@type": "cat:extractor",
        "name": "ncsa.tensorflow-parallel-dataset-image-classification",
        "extractor_id": "https://clowder.ncsa.illinois.edu/clowder/extractors/ncsa.tensorflow-parallel-dataset-image-classification/2.3"
    },
    "content": {
        "Predictions": [
            {
                "class_name": "n01682714",
                "class_prediction": "American_chameleon",
                "score": 0.7607384
            },
            {
                "class_name": "n01693334",
                "class_prediction": "green_lizard",
                "score": 0.21042463
            },
            {
                "class_name": "n01687978",
                "class_prediction": "agama",
                "score": 0.016864877
            }
        ]
    }
}
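
As a rough aid for choosing search terms, the sketch below lists the dotted key paths found in the `content` part of the test metadata above. The helper `leaf_paths` is hypothetical, and the actual field names in the Clowder search index are an assumption; they may carry additional prefixes.

```python
def leaf_paths(value, prefix=""):
    """Yield (dotted_path, leaf_value) pairs; array indices are not part of the path."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from leaf_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for child in value:
            yield from leaf_paths(child, prefix)
    else:
        yield prefix, value


content = {
    "Predictions": [
        {"class_name": "n01682714", "class_prediction": "American_chameleon", "score": 0.7607384},
        {"class_name": "n01693334", "class_prediction": "green_lizard", "score": 0.21042463},
        {"class_name": "n01687978", "class_prediction": "agama", "score": 0.016864877},
    ]
}

paths = sorted({path for path, _ in leaf_paths(content)})
print(paths)
# ['Predictions.class_name', 'Predictions.class_prediction', 'Predictions.score']
```

With the nested structure retained, each of these keys should be individually searchable, which is what the test was meant to confirm.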

@lmarini lmarini deleted the nested-search-fix branch September 11, 2023 20:57