Add rank metric #15 #42

Merged
merged 14 commits into from
Dec 13, 2021

Conversation

jonasscheid
Collaborator

@jonasscheid jonasscheid commented Oct 26, 2021

The rank metric is used by multiple prediction tools, including the NetMHC family. This PR aims to resolve issue #15.

  • Implementing another score metric required fundamental changes to the EpitopePredictionResult class. For a better overview, the new structure is sketched at line 63 of Result.py.
  • For every external tool that supports the rank metric, the output parser parse_external_result has been adjusted.
  • For every tool that does not support the rank metric, the predict method has been adjusted to maintain the overall hierarchy.
  • Enums now replace the hardcoded positional access in the prediction output parsers.
  • The (parsed) output of all predictors (epytope/epytope/EpitopePrediction/*) is now condensed into a nested dictionary with the following structure:

{'Allele1': {'Score': {'Pep1': Score1, 'Pep2': Score2, ...}, 'Rank': {'Pep1': Rank1, 'Pep2': Rank2, ...}}, 'Allele2': ...}

-> If the respective tool does not support a rank, the hierarchy is maintained and only the rank information is omitted.
  • This nested dictionary is then packed into an EpitopePredictionResult using the new from_dict method, which returns the desired structure.

  • These changes to the EpitopePredictionResult structure also required adjusting OptiTope.py. OptiTope's __init__ got a new boolean parameter rank that selects the metric to optimize: OptiTope aims to minimize the rank or maximize the (prediction) score.
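
To make the structure described above concrete, here is a minimal sketch in plain Python. It is not the epytope implementation: the `Col` enum, the whitespace-separated line format, and the helpers `parse_lines` and `best_peptide` are hypothetical names invented for illustration, and the real `from_dict` method is not reproduced. The sketch only shows enum-based column access, the nested dictionary with an optional 'Rank' level, and the "minimize rank or maximize score" choice.

```python
from enum import IntEnum


class Col(IntEnum):
    """Hypothetical column positions in one line of a predictor's output."""
    ALLELE = 0
    PEPTIDE = 1
    SCORE = 2
    RANK = 3


def parse_lines(lines, has_rank=True):
    """Collect parsed output into {allele: {metric: {peptide: value}}}."""
    result = {}
    for line in lines:
        fields = line.split()
        metrics = result.setdefault(fields[Col.ALLELE], {"Score": {}})
        metrics["Score"][fields[Col.PEPTIDE]] = float(fields[Col.SCORE])
        if has_rank:
            # Tools without a rank keep the same hierarchy, minus the 'Rank' level.
            ranks = metrics.setdefault("Rank", {})
            ranks[fields[Col.PEPTIDE]] = float(fields[Col.RANK])
    return result


def best_peptide(result, allele, rank=False):
    """Pick the peptide with the lowest rank, or the highest score otherwise."""
    if rank:
        values = result[allele]["Rank"]
        return min(values, key=values.get)
    values = result[allele]["Score"]
    return max(values, key=values.get)


# Made-up example lines: allele, peptide, score, rank.
raw = [
    "HLA-A*02:01 SYFPEITHI 0.85 0.4",
    "HLA-A*02:01 KLLPRLPGV 0.10 12.0",
    "HLA-B*07:02 SYFPEITHI 0.30 5.5",
]
with_rank = parse_lines(raw)
without_rank = parse_lines(raw, has_rank=False)
```

Using named enum members instead of bare indices means a change in a tool's column order only has to be reflected in one place.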

I hope I covered most of the changes.

Looking forward to your feedback!

@jonasscheid
Collaborator Author

Hey @b-schubert
I encountered an issue in TestSpacerDesign.py after adding the new A-2601-9 matrix. I need to fix the underlying problem, but I was wondering why the tests passed when no A-2601-9 matrix was available for syfpeithi. The test should have reported this problem, right?
Thanks in advance!
Jonas

@b-schubert
Collaborator

b-schubert commented Oct 26, 2021

The reason is that the overall problem changed. The optimization now considers the A-2601 matrix, whereas before it considered only the A02 matrix. The functions work fine; it now finds a new solution that is optimal when considering both A02 and A26.
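
A toy illustration of this effect, under loud assumptions: the scores below are made up, the allele and peptide names are placeholders, and the exhaustive search stands in for the actual optimization (it is not the solver-based formulation used by the library). It only shows that adding a second allele's scores can change which selection is optimal, so a test that pins the expected output has to change as well.

```python
from itertools import combinations

# Hypothetical per-allele scores; higher is better.
scores = {
    "A*02:01": {"PEP1": 0.9, "PEP2": 0.8, "PEP3": 0.1},
    "A*26:01": {"PEP1": 0.0, "PEP2": 0.3, "PEP3": 0.9},
}


def best_selection(alleles, k=2):
    """Exhaustively pick the k peptides with the highest summed score."""
    peptides = sorted(scores[alleles[0]])

    def total(selection):
        return sum(scores[a][p] for a in alleles for p in selection)

    return max(combinations(peptides, k), key=total)


only_a02 = best_selection(["A*02:01"])
both = best_selection(["A*02:01", "A*26:01"])
```

With only the first allele, the two peptides that score well for it are selected; once the second allele contributes, a different pair has the higher total.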

@jonasscheid
Collaborator Author

Thanks for the quick response! So the output differs because the allele is now available. I adjusted the test as follows:

@christopher-mohr christopher-mohr self-requested a review November 17, 2021 15:31
Collaborator

@christopher-mohr christopher-mohr left a comment

Thanks @jonasscheid! So far only minor things. Please check the syfpeithi matrices again: should the "wildcard" be there in all cases? I didn't mark every matrix where it was missing.

I will continue with the remaining files tomorrow.

.github/workflows/python-test-conda.yml
epytope/Core/Result.py Outdated
epytope/Core/Result.py Outdated
epytope/Core/Result.py Outdated
epytope/Data/pssms/syfpeithi/mat/B_5802_10.py Outdated
epytope/EpitopeAssembly/EpitopeAssembly.py Outdated
epytope/EpitopeAssembly/EpitopeAssembly.py
epytope/EpitopeAssembly/EpitopeAssembly.py Outdated
epytope/EpitopeAssembly/EpitopeAssembly.py Outdated
epytope/EpitopePrediction/ANN.py Outdated
epytope/EpitopePrediction/ANN.py Outdated
epytope/EpitopePrediction/ANN.py
epytope/EpitopePrediction/ANN.py Outdated
epytope/EpitopePrediction/ANN.py Outdated
epytope/EpitopePrediction/External.py Outdated
epytope/EpitopePrediction/External.py Outdated
epytope/EpitopePrediction/External.py Outdated
epytope/EpitopePrediction/External.py
epytope/EpitopePrediction/External.py
Collaborator

@christopher-mohr christopher-mohr left a comment

Some more suggestions and things to change/discuss but overall it looks good to me. Thanks a lot @jonasscheid !

epytope/EpitopePrediction/PSSM.py
epytope/EpitopePrediction/PSSM.py Outdated
epytope/EpitopePrediction/PSSM.py Outdated
epytope/EpitopePrediction/PSSM.py Outdated
epytope/EpitopePrediction/PSSM.py Outdated
epytope/test/TestEpitopeAssembly.py
epytope/test/TestSpacerDesign.py
epytope/test/TestSpacerDesign.py Outdated
epytope/tutorials/data/allele_probabilities_europe.csv Outdated
@christopher-mohr
Collaborator

Hi @jonasscheid, thanks for the update! 👍 There are still a couple of unresolved conversations. Could you please mark them as resolved if applicable, or create issues for them and then mark them as resolved?

Collaborator

@christopher-mohr christopher-mohr left a comment

Some minor things, plus the filtering functionality, which might have to be adapted as we discussed. 👍

epytope/EpitopeSelection/OptiTope.py Outdated
epytope/EpitopeSelection/OptiTope.py Outdated
epytope/test/TestSpacerDesign.py
epytope/Core/Result.py Outdated
Collaborator

@christopher-mohr christopher-mohr left a comment

Some comments on some of the recently changed files.

epytope/tutorials/HLATyping.ipynb Outdated
epytope/tutorials/HLATyping.ipynb Outdated
epytope/tutorials/HLATyping.ipynb
Collaborator

@christopher-mohr christopher-mohr left a comment

More comments on remaining changed files.

epytope/Core/Result.py Outdated
epytope/Core/Result.py Outdated
epytope/Core/Result.py Outdated
"""
Filters a result data frame based on a specified expression consisting of a list of triple with
(method_name, comparator, threshold). The expression is applied to each row. If any of the columns fulfill
the criteria the row remains.
(method_name/scoretype_name, comparator, threshold) and a boolean if the scoretype is specified.
Collaborator

I was wondering whether it would be cleaner to use the first element of the triple only for method_name and to add a separate parameter scoretype with the default value Score.
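
A minimal sketch of what this suggested signature could look like, assuming a simplified flat data layout rather than the actual result data frame. The function name `filter_peptides`, the `predictions` dictionary, and its values are hypothetical; only the shape of the triple (method_name, comparator, threshold), the separate `scoretype` parameter with default "Score", and the "any criterion keeps the row" rule come from the discussion and the docstring above.

```python
import operator

# Hypothetical flat layout: (peptide, method, scoretype) -> value.
predictions = {
    ("PEP1", "syfpeithi", "Score"): 25.0,
    ("PEP2", "syfpeithi", "Score"): 8.0,
    ("PEP1", "netmhc", "Rank"): 0.4,
    ("PEP2", "netmhc", "Rank"): 15.0,
}


def filter_peptides(predictions, expressions, scoretype="Score"):
    """Keep peptides for which any (method_name, comparator, threshold) triple holds."""
    kept = set()
    for (peptide, method, stype), value in predictions.items():
        if stype != scoretype:
            continue  # only look at the requested score type
        for method_name, comparator, threshold in expressions:
            if method == method_name and comparator(value, threshold):
                kept.add(peptide)
    return sorted(kept)


by_score = filter_peptides(predictions, [("syfpeithi", operator.ge, 20.0)])
by_rank = filter_peptides(predictions, [("netmhc", operator.le, 2.0)], scoretype="Rank")
```

Keeping the triple free of the score type means existing score-based calls would not need to change, while rank-based filtering is opted into explicitly.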

Collaborator Author

I changed this method according to your suggestion. Please have a look again :)

Collaborator

This still looks the same to me. Could you please double-check?

epytope/Core/Result.py Outdated
epytope/Core/Result.py
epytope/EpitopePrediction/External.py
Collaborator

@christopher-mohr christopher-mohr left a comment

Looks like a typo has been introduced in one of the HLA class II alleles.

epytope/Data/pssms/syfpeithi/mat/B_4001_9.py Outdated
epytope/Data/pssms/syfpeithi/mat/DRB1_1101_9.py Outdated
epytope/Data/pssms/syfpeithi/mat/DRB1_1501_9.py Outdated
epytope/Data/pssms/syfpeithi/mat/DRB1_0701_9.py Outdated
epytope/Data/pssms/syfpeithi/mat/DRB1_0101_9.py Outdated
epytope/Data/pssms/syfpeithi/mat/DRB1_0301_9.py Outdated
Collaborator

@christopher-mohr christopher-mohr left a comment

Looks good to me 👍

@christopher-mohr christopher-mohr merged commit 65fd57a into develop Dec 13, 2021
@jonasscheid jonasscheid deleted the feature-15-add-rank-metric branch December 13, 2021 21:57
christopher-mohr added a commit that referenced this pull request Jan 26, 2022
* Version bump 3.0.0rc2

* Fix master / main branch naming in GH action

* Fix typos in README file

* Add pypi GH action

* Fix PyPI linting errors

* Reduce version to 3.0.0rc1

* Add a changelog

* Install changelog with package

* Change PyPI CD trigger to published release

* Add rank metric #15 (#42)

* Push all changes made on fork

* Set Setuptools version also for external yml

* Fixed erroneous variable names in matrix files

* deleted A_2601_9 matrix for now. Caused troubles

* Add A2601_9 syf matrix for debugging

* Fixed bug in test caused by addition of A*26:01 matrix

* Change solver from cbc to glpk to investigate if macOS dependant env problems in github actions can be solved

* Corrected after review

* Adjust tutorials to new structure

* Change filter_result as discussed

* Adjusted filter method and tutorials according to #12

* Fixed a bug occuring for netMHCfamily tools when peptide input has multiple lengths

* remove logging

* Alter filter_result method as discussed

* Fixed issues #38, #44 and #45 (#46)

* Fixed issues #44 and #45

* Fix #48, include review suggestions

* Improve/update documentation (#50)

* Update CHANGELOG

* Extend README

* Change framework name in code comment

* Remove logging warning

* Change file ending in tutorial

* Add docstrings, minor formatting

* Update CHANGELOG version and setup.py

* Update date

Co-authored-by: Leon Kuchenbecker <[email protected]>
Co-authored-by: Jonas Scheid <[email protected]>
christopher-mohr added a commit that referenced this pull request Jun 15, 2022
* Version bump 3.0.0rc2

* Fix master / main branch naming in GH action

* Fix typos in README file

* Add pypi GH action

* Fix PyPI linting errors

* Reduce version to 3.0.0rc1

* Add a changelog

* Install changelog with package

* Change PyPI CD trigger to published release

* Add rank metric #15 (#42)

* Push all changes made on fork

* Set Setuptools version also for external yml

* Fixed erroneous variable names in matrix files

* deleted A_2601_9 matrix for now. Caused troubles

* Add A2601_9 syf matrix for debugging

* Fixed bug in test caused by addition of A*26:01 matrix

* Change solver from cbc to glpk to investigate if macOS dependant env problems in github actions can be solved

* Corrected after review

* Adjust tutorials to new structure

* Change filter_result as discussed

* Adjusted filter method and tutorials according to #12

* Fixed a bug occuring for netMHCfamily tools when peptide input has multiple lengths

* remove logging

* Alter filter_result method as discussed

* Fixed issues #38, #44 and #45 (#46)

* Fixed issues #44 and #45

* Fix #48, include review suggestions

* Improve/update documentation (#50)

* Update CHANGELOG

* Extend README

* Change framework name in code comment

* Remove logging warning

* Change file ending in tutorial

* Add docstrings, minor formatting

* Update CHANGELOG version and setup.py

* Update date

* Fix #52 (#53)

* add check if transcript sequence available from BioMart, cleanup (#58)

* Add interface for netMHCpan 4.1 (#59)

* add interface for netmhcpan 4.1

* remove duplicate alleles from list

* Update supportedAlleles of syfpeithi (#62)

Co-authored-by: Christopher Mohr <[email protected]>

* Fix protobuf version for tests, prepare docs for 3.1.0 release (#64)

* Prepare docs for new release

* minor changes/additions docs

* check if fixing protobuf version resolves testing errors

* check if changing github actions workflow resolves testing issue

* allow lower versions of protobuf

* Update epytope/doc/conf.py

Co-authored-by: Gisela Gabernet <[email protected]>

Co-authored-by: Christopher Mohr <[email protected]>
Co-authored-by: Jonas Scheid <[email protected]>
Co-authored-by: Gisela Gabernet <[email protected]>

* Add netMHCIIpan 4.1 interface (#66)

* add netmhciipan 4.1 interface

* remove logging

* remove __name method

* update changelog

* Update CHANGELOG.md

Co-authored-by: Christopher Mohr <[email protected]>

Co-authored-by: Christopher Mohr <[email protected]>
Co-authored-by: Christopher Mohr <[email protected]>

* minor doc improvements, cleanup setup.py

* bump version

Co-authored-by: Leon Kuchenbecker <[email protected]>
Co-authored-by: Jonas Scheid <[email protected]>
Co-authored-by: Jonas Scheid <[email protected]>
Co-authored-by: Gisela Gabernet <[email protected]>
christopher-mohr added a commit that referenced this pull request Nov 9, 2022
* Version bump 3.0.0rc2

* Fix master / main branch naming in GH action

* Fix typos in README file

* Add pypi GH action

* Fix PyPI linting errors

* Reduce version to 3.0.0rc1

* Add a changelog

* Install changelog with package

* Change PyPI CD trigger to published release

* Add rank metric #15 (#42)

* Push all changes made on fork

* Set Setuptools version also for external yml

* Fixed erroneous variable names in matrix files

* deleted A_2601_9 matrix for now. Caused troubles

* Add A2601_9 syf matrix for debugging

* Fixed bug in test caused by addition of A*26:01 matrix

* Change solver from cbc to glpk to investigate if macOS dependant env problems in github actions can be solved

* Corrected after review

* Adjust tutorials to new structure

* Change filter_result as discussed

* Adjusted filter method and tutorials according to #12

* Fixed a bug occuring for netMHCfamily tools when peptide input has multiple lengths

* remove logging

* Alter filter_result method as discussed

* Fixed issues #38, #44 and #45 (#46)

* Fixed issues #44 and #45

* Fix #48, include review suggestions

* Improve/update documentation (#50)

* Update CHANGELOG

* Extend README

* Change framework name in code comment

* Remove logging warning

* Change file ending in tutorial

* Add docstrings, minor formatting

* Update CHANGELOG version and setup.py

* Update date

* Fix #52 (#53)

* add check if transcript sequence available from BioMart, cleanup (#58)

* Add interface for netMHCpan 4.1 (#59)

* add interface for netmhcpan 4.1

* remove duplicate alleles from list

* Update supportedAlleles of syfpeithi (#62)

Co-authored-by: Christopher Mohr <[email protected]>

* Fix protobuf version for tests, prepare docs for 3.1.0 release (#64)

* Prepare docs for new release

* minor changes/additions docs

* check if fixing protobuf version resolves testing errors

* check if changing github actions workflow resolves testing issue

* allow lower versions of protobuf

* Update epytope/doc/conf.py

Co-authored-by: Gisela Gabernet <[email protected]>

Co-authored-by: Christopher Mohr <[email protected]>
Co-authored-by: Jonas Scheid <[email protected]>
Co-authored-by: Gisela Gabernet <[email protected]>

* Add netMHCIIpan 4.1 interface (#66)

* add netmhciipan 4.1 interface

* remove logging

* remove __name method

* update changelog

* Update CHANGELOG.md

Co-authored-by: Christopher Mohr <[email protected]>

Co-authored-by: Christopher Mohr <[email protected]>
Co-authored-by: Christopher Mohr <[email protected]>

* minor doc improvements, cleanup setup.py

* Update MartsAdapter (#69)

* Rewrite, extend, cleanup MartsAdapter, adapt tests

* add requests and beautifulsoup4 dependency

* prevent too long requests, avoid server request for each attribute

* add gene to test object

* fix enum ref

* adapt MartsAdapter in other test

* add function for getting gene names, add tests

* change method name, add test

* add lxml as dependency

* workaround for pandas read_xml, remove dependency

* add missing all()

* fix test

* add retry strategy for GET requests

* Update epytope/IO/MartsAdapter.py

Co-authored-by: Gisela Gabernet <[email protected]>

* add default biomart url

Co-authored-by: Gisela Gabernet <[email protected]>

* Outsource supported alleles (#63)

* Draft for outsourcing supported alleles

* Further outsourcing of netmhc alleles

* Finish outsourcing external alleles

* Outsource alleles from pssm and ann predictors

* Correct minor erroneous hla nomenclatures of smmpmbec

* Change allele imports by importing frozensets

* Add __allele_import_name to classes to increase readability

* Refactor: convert_alleles is now classmethod in pssm

* Incorporate feedback

* Update __init__.py

* Update uniprot adapter (#71)

* remove HLAtyping and distance2self tests, update CHANGELOG

* fix reading sequences in uniprot adapter

* add test for uniprot adapter

* remove HLAtyping and distance2self tests, update CHANGELOG (#70)

* Fix netmhcii4.0 parser (#73)

* fix netmhciipan4.0 issue

* update changelog

* Add function for peptides to check if created by variant (#74)

* remove HLAtyping and distance2self tests, update CHANGELOG

* add Peptide functon to determine if peptide originates from a variant

* fix peptide call, update CHANGELOG

* Improve function to check peptide origin (#75)

* remove HLAtyping and distance2self tests, update CHANGELOG

* add Peptide functon to determine if peptide originates from a variant

* fix peptide call, update CHANGELOG

* improve method for variant-peptide check

* minor CHANGELOG change

* change peptide to self

* update setup.py and CHANGELOG

* Fix errorneous supported alleles (#78)

* Draft for outsourcing supported alleles

* Further outsourcing of netmhc alleles

* Finish outsourcing external alleles

* Outsource alleles from pssm and ann predictors

* Correct minor erroneous hla nomenclatures of smmpmbec

* Change allele imports by importing frozensets

* Add __allele_import_name to classes to increase readability

* Refactor: convert_alleles is now classmethod in pssm

* Incorporate feedback

* Fix parsing error and sort allele list

* Adjust variable naming

Co-authored-by: Leon Kuchenbecker <[email protected]>
Co-authored-by: Jonas Scheid <[email protected]>
Co-authored-by: Jonas Scheid <[email protected]>
Co-authored-by: Gisela Gabernet <[email protected]>

3 participants