Mapping multimer protein components #46

Closed
ijpulidos opened this issue Aug 23, 2024 · 3 comments · Fixed by #47

ijpulidos commented Aug 23, 2024

Describe the bug
When mapping gufe protein components that were built from multimeric PDBs, I'm observing that only part of the multimer gets mapped; apparently only one of the monomers. I would expect kartograf to map the components correctly, or to complain if it can't.

To Reproduce

from kartograf import KartografAtomMapper
from gufe import ProteinComponent

# Create components from PDB Files
protein_comp = ProteinComponent.from_pdb_file("input.pdb")
mutated_comp = ProteinComponent.from_pdb_file("mutated.pdb")

mapper = KartografAtomMapper(atom_map_hydrogens=True)
mapping = next(mapper.suggest_mappings(protein_comp, mutated_comp))
print(len(mapping.componentA_to_componentB))

It seems to map only the chain "B" for some reason.

Expected behavior
I expect the length of the mapping to be the number of atoms of the protein components minus the mutated ones, which should be just a few of them.
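One way to see which chains the mapping actually covers is to count mapped atoms per chain. This is only a minimal sketch: it assumes `mapping.componentA_to_componentB` behaves like a dict of atom indices, and `chain_ids` is a hypothetical list (one chain ID per atom of component A) that you would need to build from your own structure. The data below is a toy example, not output from the attached PDBs.

```python
from collections import Counter


def mapped_atoms_per_chain(atom_map, chain_ids):
    """Return {chain: (mapped_atoms, total_atoms)} for component A.

    atom_map:  dict of atom index in A -> atom index in B
               (assumed shape of mapping.componentA_to_componentB)
    chain_ids: hypothetical list with the chain ID of each atom in A,
               ordered by atom index
    """
    total = Counter(chain_ids)
    mapped = Counter(chain_ids[i] for i in atom_map)
    return {chain: (mapped.get(chain, 0), total[chain]) for chain in total}


# Toy data: a 6-atom "dimer" where only chain B ended up in the mapping
chain_ids = ["A", "A", "A", "B", "B", "B"]
atom_map = {3: 3, 4: 4, 5: 5}

coverage = mapped_atoms_per_chain(atom_map, chain_ids)
for chain, (n_mapped, n_total) in coverage.items():
    print(f"chain {chain}: {n_mapped}/{n_total} atoms mapped")
```

A chain reporting zero mapped atoms would match the behaviour described above.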

Screenshots

image

Additional context
This would enable handling protein mutations in a more streamlined way. As far as I can tell, the way to do it right now would be to separate each monomer (each chain in the PDBs) into its own component and then map those independently, but that can be cumbersome for users.
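To illustrate the manual workaround, here is a rough sketch that splits PDB text into one block per chain using only the standard library. It relies on the PDB fixed-column layout (chain ID in column 22) and keeps only ATOM/HETATM records, so TER, CONECT and header records are dropped. The function names and the toy lines are made up for illustration; this is not how kartograf or gufe handle chains.

```python
from collections import defaultdict


def split_pdb_by_chain(pdb_text):
    """Group ATOM/HETATM records by chain ID (PDB column 22, index 21)."""
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chains[line[21]].append(line)
    # One PDB-like text block per chain, terminated with END
    return {cid: "\n".join(lines) + "\nEND\n" for cid, lines in chains.items()}


def toy_atom_line(serial, name, resname, chain, resseq):
    """Build a truncated ATOM record (no coordinates), just for this demo."""
    return f"ATOM  {serial:>5}  {name:<3} {resname:>3} {chain}{resseq:>4}"


toy_pdb = "\n".join([
    toy_atom_line(1, "N", "ALA", "A", 1),
    toy_atom_line(2, "CA", "ALA", "A", 1),
    toy_atom_line(3, "N", "GLY", "B", 1),
    toy_atom_line(4, "CA", "GLY", "B", 1),
])

per_chain = split_pdb_by_chain(toy_pdb)
for chain_id, text in per_chain.items():
    print(chain_id, len(text.splitlines()) - 1, "atoms")
```

With real files, each block could be written to its own PDB file, loaded with `ProteinComponent.from_pdb_file`, and mapped chain by chain with the same `suggest_mappings` call as in the reproduction snippet.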

PDB files to test in the following zip archive:
Archive.zip

IAlibay commented Aug 27, 2024

From today's call: a fix here would be a check on the ProteinComponent that detects chain breaks, plus a decision on how to handle them.

@RiesBen RiesBen linked a pull request Aug 28, 2024 that will close this issue

RiesBen commented Aug 28, 2024

@ijpulidos
I marked the code bits in the PR where I think the new features need to be implemented too :)
Let me know what you think :)
P.S.: I implemented an initial suggestion for splitting the protein chains into components, can you test that one?

@jameseastwood

Irfan's comments should be addressed, but this PR is not blocking any of Ivan's current work.
