Neo4jv5 #79

Merged
merged 26 commits into from
Apr 15, 2024
26 commits
33859ad
Updated neo4j in requirements.txt
ChuckKollar Jan 19, 2024
f78d318
Changed to Neo4j v5, fixed deprecated cyphers, added test information.
ChuckKollar Jan 26, 2024
982476f
Added src/compare_responses.py which allows the user to compare respo…
ChuckKollar Feb 1, 2024
3b792b9
updated DIFFERENCES_PROCESSED > ENDPOINTS_PROCESSED
ChuckKollar Feb 2, 2024
83686c4
Create python-app.yml
DerekFurstPitt Mar 12, 2024
989ebc2
Merge pull request #83 from x-atlas-consortia/Derek-Furst/setup-pytho…
yuanzhou Mar 13, 2024
e336c24
enhancement: field_* endpoints account for changes to CEDAR and CEDAR…
AlanSimmons Mar 13, 2024
e5b0c5c
renamed application parameter in field-entities to application_context
AlanSimmons Mar 14, 2024
53b7bab
renamed application parameter in field-entities to application_context
AlanSimmons Mar 14, 2024
2394ff4
updates to test script
AlanSimmons Mar 14, 2024
68077f0
Bump version to 1.4.11
yuanzhou Mar 18, 2024
d4d9387
updates to test script
AlanSimmons Mar 18, 2024
50fdac2
Merge pull request #86 from x-atlas-consortia/simmons/13mar
yuanzhou Mar 21, 2024
78016b4
Removed /status endpoint as it is inherited from ubkg-api
ChuckKollar Mar 26, 2024
75c4242
Updated documentation to reflect working with local or branched versi…
AlanSimmons Apr 8, 2024
ffbbf2f
gitignore to ignore any files generated by prototype cells_index script
AlanSimmons Apr 9, 2024
b739eb2
Refactored test script to write to output file.
AlanSimmons Apr 9, 2024
3551b30
Bug fixes in genedetail.cypher: 1. HRA changed label for "has_marker_…
AlanSimmons Apr 9, 2024
d65918b
Bug fixes in celltypedetail.cypher: 1. HRA changed label for "has_mar…
AlanSimmons Apr 9, 2024
15820c5
Merge pull request #87 from x-atlas-consortia/simmons/03apr2024/neo4jv5
yuanzhou Apr 12, 2024
edc772f
Bump version to 2.0.0
yuanzhou Apr 12, 2024
4197afb
Update hs-ontology-api-spec.yaml version to 2.0.0
yuanzhou Apr 12, 2024
cc52031
Increase to 4 workers and enable multi-threading
yuanzhou Apr 12, 2024
8501d7b
Sync with latest main
yuanzhou Apr 12, 2024
98357b0
Update requirements.txt to use ubkg-api 2.1.1
yuanzhou Apr 12, 2024
80bd722
Install ubkg-api2.1.1
yuanzhou Apr 12, 2024
26 changes: 26 additions & 0 deletions .github/workflows/python-app.yml
@@ -0,0 +1,26 @@
# This workflow will install Python dependencies, run tests and lint with a single version of Python
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

name: Python application
on:
push:
branches: [ "main", "dev-integrate" ]
pull_request:
branches: [ "main", "dev-integrate" ]
permissions:
contents: read
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python 3.9
uses: actions/setup-python@v3
with:
python-version: "3.9"
- name: Upgrade Pip
run: python -m pip install --upgrade pip
working-directory: src
- name: Install Dependencies
run: pip install -r requirements.txt
working-directory: src
4 changes: 4 additions & 0 deletions .gitignore
@@ -38,3 +38,7 @@ docker/ubkg-api/BUILD
BUILD

**/__pycache__

/tests/*/*.out
/src/cells_index/*.csv
/src/cells_index/*.tsv
18 changes: 11 additions & 7 deletions README.md
@@ -55,13 +55,17 @@ If you are modifying code only in hs-ontology-api, you will only need
to use the PyPI package version of ubkg-api. The package is included in the requirements.txt file of this repo.

If you need to modify both the hs-ontology-api and ubkg-api in concert, you will
need to work with a local instance of the ubkg-api. This is possible by doing the following:
1. Check out a branch of ubkg-api.
2. Configure the local branch of ubkg-api, similarly to the local instance of hs-ontology-api.
3. Start the local instance of ubkg-api.
4. In the virtual environment for hs-ontology-api, install the local instance of ubkg-api using pip with the **-e** flag. This will override the pointer to the ubkg-api package.

``pip install -e path/to/local/ubkg/repo``
need to work with a local or branch instance of the ubkg-api. This is possible by doing the following:
1. If your working ubkg-api instance has been committed to a branch, you can point to that branch in requirements.txt with an entry such as ``git+https://github.com/x-atlas-consortia/ubkg-api.git@<YOUR BRANCH>``
2. Check out a branch of ubkg-api.
3. Configure the app.cfg file of the local branch of ubkg-api to connect to the appropriate UBKG instance.
4. In the virtual environment for hs-ontology-api, install an editable local instance of ubkg-api. There are two ways to do this:
a. ``pip install -e path/to/local/ubkg-api/repo``
b. If using PyCharm, in the **Python Packages** tab,
1) Click **Add Package**.
2) Navigate to the root of the ubkg-api repo.
3) Indicate that the package is editable.
5. Because ubkg-api has a PyPI TOML file, any of the aforementioned methods will build a local package and override the pointer to the ubkg-api package.
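
To confirm which ubkg-api install the virtual environment actually resolves to, the package metadata can be inspected. The sketch below is only illustrative: it assumes the distribution name is `ubkg-api` (check `pip list` for the actual name), and it relies on the PEP 610 `direct_url.json` record that pip writes for installs from a local directory or a VCS URL. `parse_direct_url` and `describe_install` are hypothetical helpers, not part of either repository.

```python
import json
from importlib import metadata


def parse_direct_url(direct_url_text):
    """Summarize a PEP 610 direct_url.json payload (None means an index install)."""
    if not direct_url_text:
        return {"origin": "index", "editable": False, "url": None}
    record = json.loads(direct_url_text)
    # dir_info.editable is true for `pip install -e <path>` installs.
    editable = bool(record.get("dir_info", {}).get("editable", False))
    if "vcs_info" in record:
        origin = "vcs"      # e.g. a git+https://...@<branch> requirement
    elif "dir_info" in record:
        origin = "local"    # a local directory, editable or not
    else:
        origin = "url"
    return {"origin": origin, "editable": editable, "url": record.get("url")}


def describe_install(dist_name="ubkg-api"):
    """Report whether dist_name is installed and where it was installed from."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return {"installed": False}
    # read_text returns None when the file is absent (plain index install).
    summary = parse_direct_url(dist.read_text("direct_url.json"))
    summary.update(installed=True, version=dist.version)
    return summary


if __name__ == "__main__":
    print(describe_install())
```

An `origin` of `local` with `editable` set to true would indicate that the editable install has taken precedence over the packaged version.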

## Connecting to the local instance of hs-ontology-api
For URLs that execute endpoints in your local instance, use the values indicated in the **main.py** script, in the section prefaced with the comment `For local development/testing`:
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.4.10
2.0.0
42 changes: 30 additions & 12 deletions hs-ontology-api-spec.yaml
@@ -2,7 +2,7 @@ openapi: 3.0.3
info:
title: HubMAP/SenNet Ontology API (hs-ontology-api)
description: The HuBMAP/SenNet Ontology API contains endpoints for querying a [UBKG](https://ubkg.docs.xconsortia.org/) instance with content from the [HuBMAP/SenNet context](https://ubkg.docs.xconsortia.org/contexts/#hubmapsennet-context). The hs-ontology-api imports the [ubkg-api](https://smart-api.info/ui/96e5b5c0b0efeef5b93ea98ac2794837), which encapsulates both basic connectivity to a UBKG instance and generic endpoint code.
version: 1.4.4
version: 2.0.0
contact:
name: GitHub repository
url: https://github.com/x-atlas-consortia/hs-ontology-api
@@ -735,24 +735,33 @@ paths:
/field-entities:
get:
operationId: field_entities_get
summary: Return associations between ingest metadata fields and provenance entities. Replacement for field-entities.yaml. NOTE - available only for fields from field-entities.yaml.
summary: Return associations between ingest metadata fields and provenance entities. Replacement for field-entities.yaml.
parameters:
- name: source
in: query
required: false
description: case-insensitive name of the ontology source for the provenance entities (HMFIELD = field-types.yaml; HUBMAP = UBKG)
description: case-insensitive name of the ontology source for the provenance entity mappings. (HMFIELD = from legacy field-entities.yaml)
schema:
type: string
enum:
- HMFIELD
- HUBMAP
- CEDAR
- name: entity
in: query
required: false
description: case-sensitive name for entity in either HMFIELD or HUMBMAP ontology
description: case-sensitive name for an entity in either HMFIELD or HUBMAP/SENNET
schema:
type: string
example: Sample
- name: application_context
in: query
required: false
description: case-insensitive name of the application context
schema:
type: string
enum:
- HUBMAP
- SENNET
responses:
'200':
description: Associations between ingest metadata fields and provenance entities.
@@ -761,15 +770,15 @@ paths:
schema:
$ref: '#/components/schemas/FieldEntitiesResponse'
'400':
description: Invalid value for parameter (e.g., *source* not HMFIELD or HUBMAP); invalid parameter
description: Invalid value for parameter (e.g., *source* not HMFIELD or CEDAR); invalid parameter
'404':
description: No field entities (with list of parameters, if specified)
'5XX':
description: Unexpected error
/field-entities/{name}:
get:
operationId: field_entities_name_get
summary: Return associations between specified ingest metadata field and provenance entities. Replacement for field-entities.yaml. NOTE - available only for fields from field-entities.yaml.
summary: Return associations between specified ingest metadata field and provenance entities. Replacement for field-entities.yaml.
parameters:
- name: name
in: path
@@ -781,7 +790,7 @@ paths:
- name: source
in: query
required: false
description: case-insensitive name of the ontology source for the provenance entities (HMFIELD = field-types.yaml; HUBMAP = UBKG)
description: case-insensitive name of the ontology source for the provenance entity mappings. (HMFIELD = from legacy field-entities.yaml)
schema:
type: string
enum:
@@ -790,10 +799,19 @@ paths:
- name: entity
in: query
required: false
description: case-sensitive name for entity in either HMFIELD or HUMBMAP ontology
description: case-sensitive name for entity in either HMFIELD or HUBMAP/SENNET
schema:
type: string
example: Sample
- name: application_context
in: query
required: false
description: case-insensitive name of the application context
schema:
type: string
enum:
- HUBMAP
- SENNET
responses:
'200':
description: Associations between specified ingest metadata field and provenance entities.
@@ -810,7 +828,7 @@ paths:
/field-assays:
get:
operationId: field_assays_get
summary: Return associations between ingest metadata fields and the "assays" (dataset data types). Replacement for field-assays.yaml. NOTE only those CEDAR fields that are also in legacy field-assays.yaml can be mapped to assays.
summary: Return associations between ingest metadata fields and the "assays" (dataset data types). Replacement for field-assays.yaml. NOTE only those CEDAR fields that are also in legacy field-assays.yaml can be mapped to assays. In addition, the response from this endpoint is reliable only for datasets that existed prior to the deployment in 2024 of the assay classifier (aka Rules Engine, aka "soft assay types").
parameters:
- name: assay_identifier
in: query
@@ -849,7 +867,7 @@ paths:
/field-assays/{name}:
get:
operationId: field_assays_name_get
summary: Return associations between the specified ingest metadata field and the "assays" (dataset data types). Replacement for field-assays.yaml.NOTE only those CEDAR fields that are also in legacy field-assays.yaml can be mapped to assays.
summary: Return associations between the specified ingest metadata field and the "assays" (dataset data types). Replacement for field-assays.yaml. NOTE only those CEDAR fields that are also in legacy field-assays.yaml can be mapped to assays.
parameters:
- name: name
in: path
@@ -1876,4 +1894,4 @@ components:
schema:
type: string
description: name of schema
example: imc3d
example: imc3d
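
The `/field-entities` query parameters described in the spec changes above can be exercised with a small URL builder. This is a sketch under stated assumptions: the base URL `http://localhost:5002` is a placeholder (the README says to take the real value from **main.py**), `field_entities_url` is a hypothetical helper, and the `source` values are limited to HMFIELD and CEDAR because that is what the revised 400 description implies. Per the spec, `source` and `application_context` are case-insensitive and `entity` is case-sensitive.

```python
from urllib.parse import quote, urlencode

# Values read from the revised spec in this PR (assumed to be the full enums).
SOURCES = {"HMFIELD", "CEDAR"}
APPLICATION_CONTEXTS = {"HUBMAP", "SENNET"}


def field_entities_url(base_url, name=None, source=None, entity=None,
                       application_context=None):
    """Build a GET URL for /field-entities or /field-entities/{name}."""
    params = {}
    if source is not None:
        if source.upper() not in SOURCES:
            raise ValueError(f"source must be one of {sorted(SOURCES)}")
        params["source"] = source.upper()
    if entity is not None:
        # Case-sensitive per the spec, so pass it through unchanged.
        params["entity"] = entity
    if application_context is not None:
        if application_context.upper() not in APPLICATION_CONTEXTS:
            raise ValueError(
                f"application_context must be one of {sorted(APPLICATION_CONTEXTS)}"
            )
        params["application_context"] = application_context.upper()

    path = "/field-entities"
    if name:
        path += "/" + quote(name, safe="")
    query = "?" + urlencode(params) if params else ""
    return base_url.rstrip("/") + path + query


if __name__ == "__main__":
    print(field_entities_url("http://localhost:5002", source="cedar",
                             entity="Sample", application_context="SENNET"))
```

Validating on the client side only saves a round trip; the API itself is expected to answer 400 for invalid parameter values, as listed in the responses.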
16 changes: 10 additions & 6 deletions src/hs_ontology_api/cypher/celltypedetail.cypher
@@ -12,8 +12,9 @@ CALL
// The calling function in neo4j_logic.py will replace $ids.
WITH [$ids] AS ids

OPTIONAL MATCH (pCL:Concept)-[:CODE]->(cCL:Code) WHERE cCL.SAB='CL' AND CASE WHEN ids[0]<>'' THEN ANY(id in ids WHERE cCL.CODE=id) ELSE 1=1 END RETURN DISTINCT pCL.CUI AS CLCUI
}
// APRIL 2024 Bug fix to use CodeID instead of CODE for cases of leading zeroes in strings.
OPTIONAL MATCH (pCL:Concept)-[:CODE]->(cCL:Code)
WHERE CASE WHEN ids[0]<>'' THEN ANY(id in ids WHERE cCL.CodeID='CL:'+id) ELSE 1=1 END RETURN DISTINCT pCL.CUI AS CLCUI}

CALL
{
@@ -54,13 +55,16 @@ ORDER BY CLID
UNION

//CL-HGNC mappings via HRA
// APRIL 2024 - HRA changed "has_marker_component" to "characterized_by"

//HGNC ID
WITH CLCUI
OPTIONAL MATCH (cCL:Code)<-[:CODE]-(pCL:Concept)-[:has_marker_component]->(pGene:Concept)-[:CODE]->(cGene:Code)-[r]->(tGene:Term)
OPTIONAL MATCH (cCL:Code)<-[:CODE]-(pCL:Concept)-[:characterized_by]->(pGene:Concept)-[:CODE]->(cGene:Code)-[r]->(tGene:Term)
WHERE pCL.CUI=CLCUI AND cGene.SAB='HGNC' AND r.CUI=pGene.CUI AND cCL.SAB='CL' AND type(r) IN ['ACR','PT']
RETURN distinct cCL.CodeID as CLID, 'cell_types_genes' as ret_key, cGene.CodeID + '|' + apoc.text.join(COLLECT(tGene.name),'|') AS ret_value
ORDER BY CLID, cGene.CodeID + '|' + apoc.text.join(COLLECT(tGene.name),'|')
WITH COLLECT(tGene.name) AS tgene_names, cGene.CodeID AS cgene_codeid, cCL.CodeID AS ccl_codeid
WITH distinct ccl_codeid AS CLID, 'cell_types_genes' AS ret_key, cgene_codeid+'|'+apoc.text.join(tgene_names,'|') AS ret_value
RETURN CLID, ret_key, ret_value
ORDER BY CLID, ret_value

UNION

@@ -110,4 +114,4 @@ map['cell_types_definition'] AS cell_types_definition,
map['cell_types_genes'] AS cell_types_genes,
map['cell_types_organ'] AS cell_types_organs

order by CLID
order by CLID
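
The April 2024 fix in celltypedetail.cypher matches on `cCL.CodeID='CL:'+id`, which only works if each id reaches the query as a string with its leading zeroes intact. The sketch below shows one way a caller might render the `$ids` placeholder. How neo4j_logic.py actually performs the replacement is not shown in this diff, so `render_ids` and `to_code_ids` are hypothetical helpers and the sample ids are illustrative. Numeric coercion is one common way leading zeroes get lost; the diff does not state whether that was the cause here.

```python
def render_ids(ids):
    """Render ids as comma-separated, single-quoted Cypher string literals.

    An empty input renders as '' so that the query's `ids[0]<>''` test
    falls through to the unfiltered branch.
    """
    literals = []
    for raw in ids:
        text = str(raw).strip()
        # Reject anything that could break out of the quoted literal.
        if not text.isalnum():
            raise ValueError(f"unexpected characters in id: {text!r}")
        literals.append(f"'{text}'")
    return ",".join(literals) if literals else "''"


def to_code_ids(ids, sab="CL"):
    """Mirror the query's 'CL:'+id concatenation on the client side."""
    return [f"{sab}:{str(i).strip()}" for i in ids]


if __name__ == "__main__":
    sample = ["0000236", "0002138"]        # illustrative CL codes
    print(render_ids(sample))              # text substituted for $ids
    print(to_code_ids(sample))             # CodeID values the query compares against
    print(str(int("0000236")))             # numeric coercion drops the zeroes
```

Keeping ids as strings end to end avoids the mismatch: `'CL:' + '0000236'` equals the stored CodeID, whereas a coerced `236` would not.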
10 changes: 7 additions & 3 deletions src/hs_ontology_api/cypher/fieldassays.cypher
@@ -1,6 +1,9 @@
// Obtains associations between ingest metadat fields and assay dataset types, both for legacy (HMFIELD) and CEDAR.
// Obtains associations between ingest metadata fields and assay dataset types, both for legacy (HMFIELD) and CEDAR.
// Used by the field-assays endpoint.

// NOTE: With the deployment of the assay classifier (Rules Engine, or "soft assay types"), the UBKG is no longer the
// source of truth for assay type. This endpoint is primarily for legacy datasets.

// Identify all metadata fields, from both:
// - legacy sources (the field_*.yaml files in ingest-validation-tools, and modeled in HMFIELD), child codes of HMFIELD:1000
// - current sources (CEDAR templates, modeled in CEDAR), child codes of CEDAR:TemplateField
@@ -64,13 +67,14 @@ CALL
WITH CUIHMDataset
OPTIONAL MATCH (pAssay:Concept)-[:has_data_type]->(pDataType:Concept)-[:CODE]->(cDataType:Code)-[r:PT]->(tDataType:Term)
WHERE pAssay.CUI=CUIHMDataset
AND cDataType.SAB='HUBMAP'
AND cDataType.SAB ='HUBMAP'
AND r.CUI=pDataType.CUI
RETURN CASE WHEN tDataType.name IS NULL THEN 'none' ELSE tDataType.name END AS data_type
}

// For each HuBMAP Dataset, obtain the "soft assay" dataset type.
// The "soft assay" dataset type is a member of the Soft Assay Dataset Type hierarchy in HUBMAP, with parent code HUBMAP:C003041.
// The "soft assay" dataset type is a member of the Soft Assay Dataset Type hierarchy in HUBMAP, with parent code
// HUBMAP:C003041
CALL
{
WITH CUIHMDataset
57 changes: 42 additions & 15 deletions src/hs_ontology_api/cypher/fieldentities.cypher
@@ -13,11 +13,13 @@
// Identify all metadata fields, from both:
// - legacy sources (the field_*.yaml files in ingest-validation-tools, and modeled in HMFIELD), child codes of HMFIELD:1000
// - current sources (CEDAR templates, modeled in CEDAR), child codes of CEDAR:TemplateField
// Fields that are in the intersection of HMFIELD and CEDAR share CUIs.

// Collect the HMFIELD and CEDAR codes for each metadata field to flatten to level of field name.

// The field_entities_get_logic in neo4j_logic will replace the field_filter and source_filter variables.
// The field_entities_get_logic in neo4j_logic will replace variables that start with the dollar sign.

// source_filter allows filtering by mapping source (HMFIELD or CEDAR).
// field_filter allows filtering by field name.

WITH $source_filter AS source_filter
CALL
@@ -30,24 +32,49 @@ CALL
apoc.text.join(COLLECT(DISTINCT cField.CodeID),'|') AS code_ids,
pField.CUI AS CUIField
}
// For each field, get associated entities from HMFIELD and HUBMAP ontologies.
// Each HMFIELD entity node is cross-referenced to a HUBMAP entity node.
// (CEDAR fields are currently not associated with entities.)
// The field_entities_get_logic in neo4j_logic will replace the entity_filter and source_filter variables.
// For each field, get associated provenance entities.
// entity_filter allows filtering for provenance entity by name--e.g., "dataset", "Dataset".
// application_filter allows filtering on application context--i.e., "HUBMAP" or "SENNET".

CALL
{
WITH CUIField, source_filter
OPTIONAL MATCH (pField:Concept)-[:used_in_entity]->(pEntity:Concept)-[:CODE]->(cHMFIELDEntity:Code)-[rHMFIELD:PT]->(tHMFIELDEntity:Term),
(pEntity:Concept)-[:CODE]->(cHUBMAPEntity:Code)-[rHUBMAP:PT]->(tHUBMAPEntity:Term)
WHERE pField.CUI=CUIField AND cHMFIELDEntity.SAB ='HMFIELD' AND rHMFIELD.CUI=pEntity.CUI
AND cHUBMAPEntity.SAB='HUBMAP' AND rHUBMAP.CUI=pEntity.CUI
$entity_filter
RETURN apoc.text.join([CASE WHEN source_filter in ['HMFIELD',''] THEN REPLACE(cHMFIELDEntity.CodeID,':','|') + '|' + tHMFIELDEntity.name ELSE '' END,
CASE WHEN source_filter in ['HUBMAP',''] THEN REPLACE(cHUBMAPEntity.CodeID,':','|') + '|' + tHUBMAPEntity.name ELSE '' END],';') AS entity
// Each HMFIELD field node is linked to a HMFIELD entity node.
WITH CUIField,source_filter
OPTIONAL MATCH (pField:Concept)-[:used_in_entity]->(pEntity:Concept)-[:CODE]->(cEntity:Code)-[r:PT]->(tEntity:Term)
WHERE pField.CUI=CUIField
AND cEntity.SAB ='HMFIELD'
AND r.CUI=pEntity.CUI
$entity_filter
RETURN DISTINCT CASE WHEN source_filter IN ['HMFIELD',''] THEN REPLACE(cEntity.CodeID,':','|') + '|' + tEntity.name ELSE '' END AS entity

UNION

// Each HMFIELD entity node is cross-referenced to HUBMAP and SENNET provenance entity nodes.
WITH CUIField,source_filter
OPTIONAL MATCH (pField:Concept)-[:used_in_entity]->(pEntity:Concept)-[:CODE]->(cEntity:Code)-[r:PT]->(tEntity:Term)
WHERE pField.CUI=CUIField
$application_filter
AND r.CUI=pEntity.CUI
$entity_filter
RETURN DISTINCT CASE WHEN source_filter IN ['HMFIELD',''] THEN REPLACE(cEntity.CodeID,':','|') + '|' + tEntity.name ELSE '' END AS entity

UNION

//CEDAR template nodes are mapped to provenance entity nodes in both HUBMAP and SENNET.
//CEDAR field nodes relate to CEDAR template nodes.

WITH CUIField,source_filter
OPTIONAL MATCH (pField:Concept)-[:inverse_has_field]->(pTemplate:Concept)-[:used_in_entity]->(pEntity:Concept)-[:CODE]->(cEntity:Code)-[r:PT]->(tEntity:Term)
WHERE pField.CUI = CUIField
AND r.CUI=pEntity.CUI
//AND cEntity.SAB IN ['HUBMAP','SENNET']
$application_filter
$entity_filter
RETURN DISTINCT CASE WHEN source_filter IN ['CEDAR',''] THEN REPLACE(cEntity.CodeID,':','|') + '|' + tEntity.name ELSE '' END AS entity
}

WITH field_name, code_ids, entity
WHERE entity <>""
WITH field_name, code_ids, COLLECT(entity) AS entities
WHERE entities <>['null;null']
RETURN field_name, code_ids, entities
ORDER BY field_name
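
The revised fieldentities.cypher returns, per field, a `code_ids` string of CodeIDs joined with `|` and a list of `entities` strings, each built as `REPLACE(CodeID,':','|') + '|' + name`. A consumer could unpack those strings as sketched below. The function names and the sample row are hypothetical, and the sketch assumes that codes themselves contain neither `:` nor `|`; if a code did, the split would need to change.

```python
def parse_entity(entity):
    """Split 'SAB|CODE|name' (as built by the query) into its three parts."""
    sab, code, name = entity.split("|", 2)
    return {"sab": sab, "code": code, "name": name}


def parse_code_ids(code_ids):
    """Split 'SAB:CODE|SAB:CODE|...' into a list of SAB/code pairs."""
    pairs = []
    for code_id in code_ids.split("|"):
        sab, _, code = code_id.partition(":")
        pairs.append({"sab": sab, "code": code})
    return pairs


def parse_row(row):
    """Convert one query row into a nested dictionary."""
    return {
        "name": row["field_name"],
        "code_ids": parse_code_ids(row["code_ids"]),
        # The query already drops empty strings; this guard is defensive.
        "entities": [parse_entity(e) for e in row["entities"] if e],
    }


if __name__ == "__main__":
    # Made-up row with the same shape as the query's RETURN clause.
    row = {
        "field_name": "example_field",
        "code_ids": "HMFIELD:1008|CEDAR:abc-123",
        "entities": ["HMFIELD|2000|dataset", "HUBMAP|C040001|Dataset"],
    }
    print(parse_row(row))
```

Splitting the entity string with a maximum of two splits keeps any `|` characters that might appear in the term name inside the `name` part.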