Neo4jv5 #88

Merged
merged 17 commits into from
Apr 19, 2024
4 changes: 4 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -38,3 +38,7 @@ docker/ubkg-api/BUILD
BUILD

**/__pycache__

/tests/*/*.out
/src/cells_index/*.csv
/src/cells_index/*.tsv
18 changes: 11 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -55,13 +55,17 @@ If you are modifying code only in hs-ontology-api, you will only need
to use the PyPI package version of ubkg-api. The package is included in the requirements.txt file of this repo.

If you need to modify both the hs-ontology-api and ubkg-api in concert, you will
need to work with a local instance of the ubkg-api. This is possible by doing the following:
1. Check out a branch of ubkg-api.
2. Configure the local branch of ubkg-api, similarly to the local instance of hs-ontology-api.
3. Start the local instance of ubkg-api.
4. In the virtual environment for hs-ontology-api, install the local instance of ubkg-api using pip with the **-e** flag. This will override the pointer to the ubkg-api package.

``pip install -e path/to/local/ubkg/repo``
need to work with a local or branch instance of the ubkg-api. This is possible by doing the following:
1. If your working ubkg-api instance has been committed to a branch, you can point to the branch instance in requirements.txt with a line such as ``git+https://github.com/x-atlas-consortia/ubkg-api.git@<YOUR BRANCH>``.
2. Check out a branch of ubkg-api.
3. Configure the app.cfg file of the local branch of ubkg-api to connect to the appropriate UBKG instance.
4. In the virtual environment for hs-ontology-api, install an editable local instance of ubkg-api. There are two ways to do this:
   a. ``pip install -e path/to/local/ubkg-api/repo``
   b. If using PyCharm, in the **Python Packages** tab:
      1) Click **Add Package**.
      2) Navigate to the root of the ubkg-api repo.
      3) Indicate that the package is editable.
5. Because ubkg-api has a PyPI TOML file, either install method will build a local package and override the pointer to the ubkg-api package.
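
After installing, it can help to confirm which ubkg-api distribution the virtual environment actually resolves. The following is a minimal sketch, not part of this repo: it assumes the distribution is named `ubkg-api` (as pinned in requirements.txt), and the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "ubkg-api") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    The default name "ubkg-api" is an assumption taken from requirements.txt.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the active environment.
        return None
```

Calling `installed_version()` inside the hs-ontology-api virtual environment should report the version of whichever ubkg-api install currently takes precedence there.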

## Connecting to the local instance of hs-ontology-api
For URLs that execute endpoints in your local instance, use the values indicated in the **main.py** script, in the section prefaced with the comment `For local development/testing`:
Expand Down
2 changes: 1 addition & 1 deletion VERSION
Original file line number Diff line number Diff line change
@@ -1 +1 @@
1.4.11
2.0.0
4 changes: 2 additions & 2 deletions hs-ontology-api-spec.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ openapi: 3.0.3
info:
title: HuBMAP/SenNet Ontology API (hs-ontology-api)
description: The HuBMAP/SenNet Ontology API contains endpoints for querying a [UBKG](https://ubkg.docs.xconsortia.org/) instance with content from the [HuBMAP/SenNet context](https://ubkg.docs.xconsortia.org/contexts/#hubmapsennet-context). The hs-ontology-api imports the [ubkg-api](https://smart-api.info/ui/96e5b5c0b0efeef5b93ea98ac2794837), which encapsulates both basic connectivity to a UBKG instance and generic endpoint code.
version: 1.4.11
version: 2.0.0
contact:
name: GitHub repository
url: https://github.com/x-atlas-consortia/hs-ontology-api
Expand Down Expand Up @@ -1894,4 +1894,4 @@ components:
schema:
type: string
description: name of schema
example: imc3d
example: imc3d
16 changes: 10 additions & 6 deletions src/hs_ontology_api/cypher/celltypedetail.cypher
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,9 @@ CALL
// The calling function in neo4j_logic.py will replace $ids.
WITH [$ids] AS ids

OPTIONAL MATCH (pCL:Concept)-[:CODE]->(cCL:Code) WHERE cCL.SAB='CL' AND CASE WHEN ids[0]<>'' THEN ANY(id in ids WHERE cCL.CODE=id) ELSE 1=1 END RETURN DISTINCT pCL.CUI AS CLCUI
}
// APRIL 2024 Bug fix to use CodeID instead of CODE for cases of leading zeroes in strings.
OPTIONAL MATCH (pCL:Concept)-[:CODE]->(cCL:Code)
WHERE CASE WHEN ids[0]<>'' THEN ANY(id in ids WHERE cCL.CodeID='CL:'+id) ELSE 1=1 END RETURN DISTINCT pCL.CUI AS CLCUI}

CALL
{
Expand Down Expand Up @@ -54,13 +55,16 @@ ORDER BY CLID
UNION

//CL-HGNC mappings via HRA
// APRIL 2024 - HRA changed "has_marker_component" to "characterized_by"

//HGNC ID
WITH CLCUI
OPTIONAL MATCH (cCL:Code)<-[:CODE]-(pCL:Concept)-[:has_marker_component]->(pGene:Concept)-[:CODE]->(cGene:Code)-[r]->(tGene:Term)
OPTIONAL MATCH (cCL:Code)<-[:CODE]-(pCL:Concept)-[:characterized_by]->(pGene:Concept)-[:CODE]->(cGene:Code)-[r]->(tGene:Term)
WHERE pCL.CUI=CLCUI AND cGene.SAB='HGNC' AND r.CUI=pGene.CUI AND cCL.SAB='CL' AND type(r) IN ['ACR','PT']
RETURN distinct cCL.CodeID as CLID, 'cell_types_genes' as ret_key, cGene.CodeID + '|' + apoc.text.join(COLLECT(tGene.name),'|') AS ret_value
ORDER BY CLID, cGene.CodeID + '|' + apoc.text.join(COLLECT(tGene.name),'|')
WITH COLLECT(tGene.name) AS tgene_names, cGene.CodeID AS cgene_codeid, cCL.CodeID AS ccl_codeid
WITH distinct ccl_codeid AS CLID, 'cell_types_genes' AS ret_key, cgene_codeid+'|'+apoc.text.join(tgene_names,'|') AS ret_value
RETURN CLID, ret_key, ret_value
ORDER BY CLID, ret_value

UNION

Expand Down Expand Up @@ -110,4 +114,4 @@ map['cell_types_definition'] AS cell_types_definition,
map['cell_types_genes'] AS cell_types_genes,
map['cell_types_organ'] AS cell_types_organs

order by CLID
order by CLID
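
The CodeID fix in the first hunk of this file can be illustrated outside Cypher. The sketch below is illustrative only: the function names are hypothetical, and the integer coercion is just one assumed way in which leading zeroes can be lost; the diff does not say how the mismatch on `CODE` arose. It shows that matching on the full `SAB:code` string keeps the zeroes, and it mimics the `CASE WHEN ids[0]<>''` guard.

```python
from typing import List


def to_code_ids(ids: List[str], sab: str = "CL") -> List[str]:
    """Prefix each non-empty code with its SAB, keeping the string intact."""
    return [f"{sab}:{code}" for code in ids if code != ""]


def matches(code_id: str, ids: List[str], sab: str = "CL") -> bool:
    """Mimic the WHERE clause: an empty first id means 'no filter'."""
    if not ids or ids[0] == "":
        return True
    return code_id in to_code_ids(ids, sab)


# Assumed failure mode: a numeric round trip drops the leading zeroes.
lossy = str(int("0000236"))

# String-based CodeID matching is unaffected by that failure mode.
exact = to_code_ids(["0000236"])
```

Here `lossy` no longer equals the original code, while `exact` still carries the full `CL:0000236` identifier.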
36 changes: 24 additions & 12 deletions src/hs_ontology_api/cypher/genedetail.cypher
Original file line number Diff line number Diff line change
Expand Up @@ -74,8 +74,9 @@ ORDER BY hgnc_id,ret_key
UNION

//Cell types - CL Codes
// APRIL 2024 - HRA changed "has_marker_component" to "characterized_by"
WITH GeneCUI
OPTIONAL MATCH (cGene:Code)<-[:CODE]-(pGene:Concept)-[:inverse_has_marker_component]->(pCL:Concept)-[:CODE]->(cCL:Code)-[rCL]->(tCL:Term) WHERE pGene.CUI=GeneCUI AND cGene.SAB='HGNC' AND cCL.SAB='CL' AND rCL.CUI=pCL.CUI RETURN toInteger(cGene.CODE) AS hgnc_id, 'cell_types_code' AS ret_key, cCL.CodeID AS ret_value
OPTIONAL MATCH (cGene:Code)<-[:CODE]-(pGene:Concept)-[:inverse_characterized_by]->(pCL:Concept)-[:CODE]->(cCL:Code)-[rCL]->(tCL:Term) WHERE pGene.CUI=GeneCUI AND cGene.SAB='HGNC' AND cCL.SAB='CL' AND rCL.CUI=pCL.CUI RETURN toInteger(cGene.CODE) AS hgnc_id, 'cell_types_code' AS ret_key, cCL.CodeID AS ret_value
ORDER BY hgnc_id,ret_key,ret_value

UNION
Expand All @@ -87,27 +88,34 @@ UNION
// The preferred term will be the term of type PT; if there is no PT, then any of the others of type PT_SAB will do.

// First, order the preferred terms by whether they are the PT or a PT_SAB.
// APRIL 2024 - HRA changed the label from "has_marker_component" to "characterized_by"
WITH GeneCUI
CALL{
WITH GeneCUI
OPTIONAL MATCH (cGene:Code)<-[:CODE]-(pGene:Concept)-[:inverse_has_marker_component]->(pCL:Concept)-[:CODE]->(cCL:Code)-[rCL]->(tCL:Term) WHERE pGene.CUI=GeneCUI AND cGene.SAB='HGNC' AND cCL.SAB='CL' AND rCL.CUI=pCL.CUI AND type(rCL) STARTS WITH 'PT' RETURN toInteger(cGene.CODE) AS hgnc_id, cCL.CodeID AS CLID, MIN(CASE WHEN type(rCL)='PT' THEN 0 ELSE 1 END) AS mintype order by hgnc_id,CLID,mintype
OPTIONAL MATCH (cGene:Code)<-[:CODE]-(pGene:Concept)-[:inverse_characterized_by]->(pCL:Concept)-[:CODE]->(cCL:Code)-[rCL]->(tCL:Term) WHERE pGene.CUI=GeneCUI AND cGene.SAB='HGNC' AND cCL.SAB='CL' AND rCL.CUI=pCL.CUI AND type(rCL) STARTS WITH 'PT' RETURN toInteger(cGene.CODE) AS hgnc_id, cCL.CodeID AS CLID, MIN(CASE WHEN type(rCL)='PT' THEN 0 ELSE 1 END) AS mintype order by hgnc_id,CLID,mintype
}

// Next, filter to either the PT or one of the PT_SABs.
// MARCH 2024 - WITH used in return to upgrade to v5 Cypher.
WITH hgnc_id, CLID, mintype
OPTIONAL MATCH (cCL:Code)-[rCL]->(tCL:Term)
where cCL.CodeID = CLID AND type(rCL) STARTS WITH 'PT'
AND CASE WHEN type(rCL)='PT' THEN 0 ELSE 1 END=mintype
return hgnc_id, 'cell_types_name' AS ret_key, CLID +'|'+ CASE WHEN tCL.name IS NULL THEN '' ELSE tCL.name END AS ret_value
WITH hgnc_id, 'cell_types_name' AS ret_key, CLID +'|'+ CASE WHEN tCL.name IS NULL THEN '' ELSE tCL.name END AS ret_value
RETURN hgnc_id, ret_key, ret_value

UNION

// Cell types - CL code|definition
// Definitions link to Concepts and multiple CL codes can match to the same concept; however, each CL code has a "preferred" CUI, identified by the CUI property of the relationship of any of the code's linked terms.

// MARCH 2024 - final WITH added to work with v5 Cypher
// APRIL 2024 - HRA changed "has_marker_component" to "characterized_by"
WITH GeneCUI
OPTIONAL MATCH (cGene:Code)<-[:CODE]-(pGene:Concept)-[:inverse_has_marker_component]->(pCL:Concept)-[:CODE]->(cCL:Code)-[rCL]->(tCL:Term),(pCL:Concept)-[:DEF]->(dCL:Definition) WHERE rCL.CUI=pCL.CUI AND pGene.CUI=GeneCUI AND cGene.SAB='HGNC' AND cCL.SAB='CL' AND dCL.SAB='CL' RETURN DISTINCT toInteger(cGene.CODE) AS hgnc_id,'cell_types_definition' as ret_key, cCL.CodeID + '|'+ dCL.DEF as ret_value
ORDER BY hgnc_id,cCL.CodeID + '|'+ dCL.DEF
OPTIONAL MATCH (cGene:Code)<-[:CODE]-(pGene:Concept)-[:inverse_characterized_by]->(pCL:Concept)-[:CODE]->(cCL:Code)-[rCL]->(tCL:Term),(pCL:Concept)-[:DEF]->(dCL:Definition) WHERE rCL.CUI=pCL.CUI AND pGene.CUI=GeneCUI AND cGene.SAB='HGNC' AND cCL.SAB='CL' AND dCL.SAB='CL'
WITH toInteger(cGene.CODE) AS hgnc_id,'cell_types_definition' as ret_key, cCL.CodeID + '|'+ dCL.DEF as ret_value
RETURN DISTINCT hgnc_id, ret_key, ret_value
ORDER BY hgnc_id, ret_value

UNION

Expand All @@ -118,32 +126,36 @@ UNION
// 3. Assigns UBERON codes as cross-references to AZ organ codes.
//
// To get organ information, map gene to cell type to organ location.
// APRIL 2024 - HRA changed "has_marker_component" to "characterized_by"
WITH GeneCUI
//First, get Azimuth Codes that are cross-referenced to CL codes. For the case of a CL code being cross-referenced to multiple AZ codes, only one AZ code gets the "preferred" cross-reference to the CL code; however, all AZ codes have a cross-reference to the CL code, so do not check on rAZ.CUI=pCL.CUI.
CALL
{WITH GeneCUI
OPTIONAL MATCH (cGene:Code)<-[:CODE]-(pGene:Concept)-[:inverse_has_marker_component]->(pCL:Concept)-[:CODE]->(cCL:Code)-[rCL]->(tCL:Term), (pCL:Concept)-[:CODE]->(cAZ:Code)-[rAZ]->(tAZ:Term) WHERE rCL.CUI=pCL.CUI AND pGene.CUI=GeneCUI AND cGene.SAB='HGNC' AND cCL.SAB='CL' AND cAZ.SAB='AZ' RETURN DISTINCT toInteger(cGene.CODE) AS hgnc_id,cCL.CodeID as CLID,cAZ.CodeID AS AZID}
OPTIONAL MATCH (cGene:Code)<-[:CODE]-(pGene:Concept)-[:inverse_characterized_by]->(pCL:Concept)-[:CODE]->(cCL:Code)-[rCL]->(tCL:Term), (pCL:Concept)-[:CODE]->(cAZ:Code)-[rAZ]->(tAZ:Term) WHERE rCL.CUI=pCL.CUI AND pGene.CUI=GeneCUI AND cGene.SAB='HGNC' AND cCL.SAB='CL' AND cAZ.SAB='AZ' RETURN DISTINCT toInteger(cGene.CODE) AS hgnc_id,cCL.CodeID as CLID,cAZ.CodeID AS AZID}
//Use the AZ codes to map to concepts that have located_in relationships with AZ organ codes. The AZ organ codes are cross-referenced to UBERON codes. Limit the located_in relationships to those from AZ.
CALL
{WITH AZID
OPTIONAL MATCH (cAZ:Code)<-[:CODE]-(pAZ:Concept)-[rAZUB:located_in]->(pUB:Concept)-[:CODE]->(cUB:Code)-[rUB:PT]->(tUB:Term) WHERE rAZUB.SAB='AZ' AND rUB.CUI=pUB.CUI AND cAZ.CodeID=AZID AND cUB.SAB='UBERON' RETURN cUB.CodeID+'*'+ tUB.name + '' as UBERONID
}

WITH hgnc_id, CLID,UBERONID
RETURN DISTINCT hgnc_id,'cell_types_organ' as ret_key, CLID+ '|' + apoc.text.join(COLLECT(DISTINCT UBERONID),",") AS ret_value
ORDER BY hgnc_id, CLID+ '|' + apoc.text.join(COLLECT(DISTINCT UBERONID),",")
// UBERONID is not a grouping key here, so COLLECT gathers every organ for each hgnc_id and CLID.
WITH hgnc_id, 'cell_types_organ' AS ret_key, CLID, CLID + '|' + apoc.text.join(COLLECT(DISTINCT UBERONID), ",") AS ret_value
RETURN DISTINCT hgnc_id, ret_key, ret_value
ORDER BY hgnc_id, ret_value

// Indicate the source of cell type information.
// APRIL 2024 - HRA changed "has_marker_component" to "characterized_by"
UNION
WITH GeneCUI
OPTIONAL MATCH (cGene:Code)<-[:CODE]-(pGene:Concept)-[:inverse_has_marker_component]->(pCL:Concept)-[:CODE]->(cCL:Code)-[rCL]->(tCL:Term) WHERE rCL.CUI=pCL.CUI AND pGene.CUI=GeneCUI AND cGene.SAB='HGNC' AND cCL.SAB='CL' RETURN DISTINCT toInteger(cGene.CODE) AS hgnc_id,'cell_types_source' as ret_key, cCL.CodeID + '|Human Reference Atlas' as ret_value
OPTIONAL MATCH (cGene:Code)<-[:CODE]-(pGene:Concept)-[:inverse_characterized_by]->(pCL:Concept)-[:CODE]->(cCL:Code)-[rCL]->(tCL:Term) WHERE rCL.CUI=pCL.CUI AND pGene.CUI=GeneCUI AND cGene.SAB='HGNC' AND cCL.SAB='CL' RETURN DISTINCT toInteger(cGene.CODE) AS hgnc_id,'cell_types_source' as ret_key, cCL.CodeID + '|Human Reference Atlas' as ret_value
ORDER BY hgnc_id,cCL.CodeID + '|Human Reference Atlas'

}

// APRIL 2024 bug fix check for null gene before calling fromlists

WITH hgnc_id, ret_key, COLLECT(ret_value) AS values
WITH hgnc_id,apoc.map.fromLists(COLLECT(ret_key),COLLECT(values)) AS map
WHERE hgnc_id IS NOT NULL
WITH hgnc_id,apoc.map.fromLists(COLLECT(ret_key),COLLECT(values)) AS map
RETURN hgnc_id,
map['approved_symbol'] AS approved_symbol,
map['approved_name'] AS approved_name,
Expand All @@ -159,4 +171,4 @@ map['cell_types_definition'] AS cell_types_code_definition,
map['cell_types_organ'] AS cell_types_codes_organ,
map['cell_types_source'] AS cell_types_codes_source

order by hgnc_id
order by hgnc_id
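
The final aggregation in this query groups `ret_value` by `hgnc_id` and `ret_key`, drops rows with a null gene, and then builds one map per gene. A rough Python analogue, offered as a sketch under assumptions rather than the code used by the API: `build_maps` and the `Row` alias are hypothetical names, and the behavior of `apoc.map.fromLists` is only approximated by a plain dictionary.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# (hgnc_id, ret_key, ret_value), as returned by each UNION branch.
Row = Tuple[Optional[int], str, Optional[str]]


def build_maps(rows: Iterable[Row]) -> Dict[int, Dict[str, List[str]]]:
    """Group values by gene and key, skipping rows that have no gene."""
    maps: Dict[int, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for hgnc_id, ret_key, ret_value in rows:
        if hgnc_id is None:
            # Analogue of "WHERE hgnc_id IS NOT NULL" before building the map.
            continue
        bucket = maps[hgnc_id][ret_key]
        if ret_value is not None:
            # COLLECT ignores nulls, so the key can exist with an empty list.
            bucket.append(ret_value)
    return {gene: dict(keys) for gene, keys in maps.items()}
```

Filtering before the map is built means a branch that matched nothing contributes no entry instead of a map keyed on a null gene.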
19 changes: 2 additions & 17 deletions src/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,8 @@ def make_flask_config():
return temp_flask_app.config


app = UbkgAPI(make_flask_config()).app
app = UbkgAPI(make_flask_config(), Path(__file__).absolute().parent.parent).app

app.register_blueprint(assaytype_blueprint)
app.register_blueprint(assayname_blueprint)
app.register_blueprint(datasets_blueprint)
Expand Down Expand Up @@ -68,22 +69,6 @@ def make_flask_config():
app.cells_client = OntologyCellsClient(cellsurl)


# Defining the /status endpoint in the ubkg_api package will cause 500 error
# Because the VERSION and BUILD files are not built into the package
@app.route('/status', methods=['GET'])
def api_status():
status_data = {
# Use strip() to remove leading and trailing spaces, newlines, and tabs
'version': (Path(__file__).absolute().parent.parent / 'VERSION').read_text().strip(),
'build': (Path(__file__).absolute().parent.parent / 'BUILD').read_text().strip(),
'neo4j_connection': False
}
is_connected = current_app.neo4jConnectionHelper.check_connection()
if is_connected:
status_data['neo4j_connection'] = True

return jsonify(status_data)

####################################################################################################
## For local development/testing
####################################################################################################
Expand Down
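
The removed `/status` handler read VERSION and BUILD from the directory two levels above main.py, and the new `UbkgAPI(...)` call passes that same directory as a second argument. How ubkg-api uses the path is not shown in this diff. The sketch below assumes it serves the same file lookup; `read_status_files` is a hypothetical helper, not a ubkg-api function.

```python
from pathlib import Path
from typing import Dict


def read_status_files(base_dir: Path) -> Dict[str, str]:
    """Read VERSION and BUILD from base_dir, stripping surrounding whitespace."""
    return {
        # strip() removes leading and trailing spaces, newlines, and tabs.
        "version": (base_dir / "VERSION").read_text().strip(),
        "build": (base_dir / "BUILD").read_text().strip(),
    }
```

Passing the directory in, rather than computing it inside the package, lets a packaged endpoint find files that live in the consuming repo.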
13 changes: 9 additions & 4 deletions src/requirements.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
ubkg-api==1.4.0
Flask == 2.1.3
neo4j == 4.4
ubkg-api==2.1.1
Flask==2.1.3
neo4j==5.15.0

# for analysis of tabular data
pandas==1.5.0
Expand All @@ -12,4 +12,9 @@ numpy==1.23.5
Werkzeug==2.3.7

# Cells API client
hubmap-api-py-client==0.0.9
hubmap-api-py-client==0.0.9

# Test and analysis scripts
argparse==1.4.0
datatest==0.11.1
deepdiff==6.7.1
7 changes: 5 additions & 2 deletions src/uwsgi.ini
Original file line number Diff line number Diff line change
Expand Up @@ -9,9 +9,12 @@ module = wsgi:application
# Send logs to stdout instead of file so docker picks it up and writes to AWS CloudWatch
log-master=true

# Master with 2 worker process (based on CPU number)
# Master with 4 worker processes (based on CPU count)
master = true
processes = 2
processes = 4

# Enable multithreading
enable-threads = true

# Use http socket for integration with nginx running on the same machine
socket = localhost:5000
Expand Down