WIP: Remove string & object types to clean up the context #35

natachaperez · 2019-06-14T12:01:22Z

I used the same process as for the boolean type, i.e. I added the line: ctxt = re.sub(r',\s*\n\s*"@type": "http://www.w3.org/2001/XMLSchema#string"', '', ctxt) in create_nidmr_context.py. Most of the issues are cleaned up; we now have the following structure: "attribute description": "@id" : " <> except for two of them: NIDM_0000157 and NIDM_0000159.
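
As a minimal sketch of what this substitution does, the snippet below applies the same pattern to a small hand-written context string. The sample entry (exampleAttribute, http://example.org/attr) and the helper name strip_string_type are made up for illustration and are not taken from create_nidmr_context.py.

```python
import json
import re

# Pattern quoted above: a trailing comma, a line break, then the
# xsd:string "@type" line.
STRING_TYPE = r',\s*\n\s*"@type": "http://www.w3.org/2001/XMLSchema#string"'


def strip_string_type(ctxt):
    """Remove xsd:string "@type" declarations from a serialized context."""
    return re.sub(STRING_TYPE, '', ctxt)


# Hypothetical context entry, serialized with one key per line.
sample = json.dumps(
    {"@context": {"exampleAttribute": {
        "@id": "http://example.org/attr",
        "@type": "http://www.w3.org/2001/XMLSchema#string"}}},
    indent=4)

cleaned = strip_string_type(sample)
print(cleaned)
```

Note that the pattern starts with a comma, so it only matches when the "@type" line follows another key such as "@id" in the serialized text.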

…d 00159

natachaperez · 2019-06-17T07:21:08Z

For the attributes "noiseFWHMInUnits" and "noiseFWHMInVoxels" in the SearchSpaceMaskMap class, whose IDs are NIDM_0000157 and NIDM_0000159, the method is slightly different. In Protégé, they match SPM terms that are deprecated but still appear in the file nidm-results.owl. This seems to alter the way the NIDM files are read. Hence, I chose to remove the lines corresponding to the SPM identifiers: lines 952 to 963 and lines 979 to 991 in nidm-results.owl.
After running refresh.py and opening the example spm-results.json, the context is now completely cleaned of string types.

cmaumet · 2019-06-17T07:23:48Z

@natachaperez: thanks! It would be nice to better understand why this is happening though. Can you try and identify which line (or minimal subsets of lines) has to be removed from the owl file to obtain the behavior we want?

natachaperez · 2019-06-17T08:28:29Z

@cmaumet I tried line by line, and we have to remove the following lines to obtain the behavior we want:

http://www.w3.org/2004/02/skos/core#prefLabel "spm_noiseFWHMInUnits"
http://www.w3.org/2004/02/skos/core#prefLabel "spm_noiseFWHMInVoxels"
But be careful to keep the '.' at the end.

cmaumet · 2019-06-21T15:18:39Z

@natachaperez: thanks a lot for those edits! For documentation purposes, can you add a comment explaining briefly what changes you introduced in the last set of commits and why?

natachaperez · 2019-06-21T15:28:54Z

So I introduced the following lines (in bold) in create_nidmr_context.py:
for s, o in sorted(owl.graph.subject_objects(SKOS['prefLabel'])):
    json_key = str(o)
    context['@context'][json_key] = OrderedDict()
    if s in owl.ranges:
        context['@context'][json_key]['@id'] = str(s)
        context['@context'][json_key]['@type'] = next(iter(owl.ranges[s]))
    else:
        context['@context'][json_key] = str(s)
    if owl.is_deprecated(s):
        del context['@context'][json_key]

# Iterate over a copy of the keys, since the dictionary is modified in the loop.
for json_key in list(context['@context']):
    if '_' in json_key:
        new_key = json_key.split('_')[1]
        context['@context'][new_key] = context['@context'].pop(json_key)

These changes fix the problem encountered with the attributes "noiseFWHMInUnits" and "noiseFWHMInVoxels" without changing the owl file. The issue was caused by the same json_key appearing twice, whereas a key must be unique in a dictionary. The idea is to change the json_key (i.e. to drop the prefix up to the '_') only after the context has been completed. We must also ensure that none of the deprecated labels remains in the context.
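
To illustrate the key collision and the rename step, here is a minimal sketch on a plain dictionary. The labels and IRIs are placeholders loosely modelled on the ones discussed here, rename_prefixed_keys is a hypothetical helper, and iterating over a copy of the keys is an assumption about how to do the rename safely, not necessarily what the committed code does.

```python
from collections import OrderedDict


def rename_prefixed_keys(ctx):
    """Replace keys such as 'spm_label' with 'label', in place."""
    # Iterate over a snapshot of the keys: the dict is modified in the loop.
    for key in list(ctx):
        if '_' in key:
            new_key = key.split('_', 1)[1]
            ctx[new_key] = ctx.pop(key)
    return ctx


# Two labels stored under the same key: a dictionary keeps only the
# last value assigned, so the first entry is silently lost.
collision = {}
collision['noiseFWHMInUnits'] = 'http://example.org/deprecated'
collision['noiseFWHMInUnits'] = 'http://example.org/current'
print(len(collision))

# Keeping the prefix while the context is built keeps the keys distinct;
# the prefix is dropped only once the context is complete.
ctx = OrderedDict()
ctx['spm_noiseFWHMInUnits'] = 'http://example.org/current'
ctx['otherLabel'] = 'http://example.org/other'
rename_prefixed_keys(ctx)
print(list(ctx))
```

A renamed key is re-inserted at the end of the dictionary, so the order of entries can change; sorting the keys when serializing would hide that.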

natachaperez · 2019-06-25T08:56:37Z

The changes above in create_nidmr_context.py make it possible to group the lines added previously:
ctxt = re.sub(r',\s*\n\s*"@type": "http://www.w3.org/2001/XMLSchema#boolean"', '', ctxt)
ctxt = re.sub(r',\s*\n\s*"@type": "http://www.w3.org/2001/XMLSchema#string"', '', ctxt)
to clean up the context by keeping only the primitive types (int, float, positiveInteger, integer).
Moreover, we also figured out a way to fix the issues with object properties.
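
One possible way to group the two substitutions is a single pattern with an alternation. This is only a sketch: NON_PRIMITIVE, TYPE_LINE and strip_types are hypothetical names, the sample context is hand-written, and the actual grouping in create_nidmr_context.py may look different.

```python
import json
import re

# Types whose "@type" line should be dropped (assumed list, taken from
# the two substitutions quoted above).
NON_PRIMITIVE = ('boolean', 'string')

TYPE_LINE = re.compile(
    r',\s*\n\s*"@type": "http://www\.w3\.org/2001/XMLSchema#(?:%s)"'
    % '|'.join(NON_PRIMITIVE))


def strip_types(ctxt):
    """Drop the "@type" line of every entry typed as boolean or string."""
    return TYPE_LINE.sub('', ctxt)


xsd = 'http://www.w3.org/2001/XMLSchema#'
sample = json.dumps({"@context": {
    "aFlag": {"@id": "http://example.org/flag", "@type": xsd + "boolean"},
    "aCount": {"@id": "http://example.org/count", "@type": xsd + "int"},
}}, indent=4)

cleaned = json.loads(strip_types(sample))["@context"]
print(cleaned)
```

In this sketch the entry typed as int keeps its "@type", while the boolean entry is reduced to its "@id".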

natachaperez · 2019-06-25T14:26:37Z

I checked the Python file create_nidmr_context.py with the autopep8 command so that it follows the PEP 8 convention, and I removed the unneeded commented-out lines.

cmaumet

Hi @natachaperez!

Thanks for this pull request. I only have a few minor comments, regarding the remaining PEP8 formatting issues and creating one variable to make the code easier to read. Once this is updated, the pull request will be ready to merge.

cmaumet · 2019-06-27T08:28:57Z