-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix seq
and tors
length mismatch for int2cart
bgeo strategy
#227
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, trusting you on this. just a small comment 👍
|
||
except ValueError: | ||
# Int2Cart requires length of seq to match length of torsions | ||
# if there is a mismatch match is found, start from scratch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add here just a quick sentence when (in which situations) such a mismatch can happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the curious thing here is that this error is not consistent, e.g. can still make some conformers but on my machine at around conformer # 30-40 and 60-70, a mismatch occurs... I will do my best to find a consistency though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I seem to have found some interesting causes. It occurs on the first residue (G for Sic1 in this case) but has 2 sets of torsion angles for some reason:
[2022-06-21 13:20:46,561] Error, 'seq' and 'tors' lengths don't match
[2022-06-21 13:20:46,561] Current residue: 0
[2022-06-21 13:20:46,561] seq value: G
[2022-06-21 13:20:46,561] seq length: 1
[2022-06-21 13:20:46,562] tors value: [[ 1.2598734 -3.0602584 2.8664412 ]
[ 1.6958017 -0.30540177 -3.1142986 ]]
[2022-06-21 13:20:46,562] tors length: 2
I will write a sentence regarding this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see one place where this might go wrong: when the backbone energy exceeds threshold, IDPCG algorithm does one backtrack step, and correspondingly the torsion angles for this chunk is removed (see https://github.com/julie-forman-kay-lab/IDPConformerGenerator/blob/main/src/idpconfgen/cli_build.py#L1347). Maybe here it should be torsion_records[:current_res_number]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Jerry, thanks for the input. I've just tried changing to torsions_records[:current_res_number]
and that broke the code. If it's backtracking, maybe we can just pop()
the last set of torsion angles? Something like:
if len(seq) != len(tors):
tors.pop()
Or better to account for different fragment size backtracks:
if len(seq) != len(tors):
diff = len(tors) - len(seq)
tors = tors[: -diff or None]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Jerry, sorry for the late reply, I was trying different methods for acquiring previous/current data. Here's what I have so far. Not sure what to make of this immediately though:
last 5 res numbers: [5, 6, 7, 8, 0]
last 5 torsion records: [[(2.9964604, 1.2988691, 0.39022663), (-3.0718434, -2.224424, -2.385617), (2.9989808, -1.0205619, 2.6155), (3.090659, -2.2035253, 2.8770804), (-3.0882726, -1.1287305, -0.18624555), (3.0020282, -1.2860423, -0.13994366), (-3.0922096, -1.9446679, 2.7733097), (3.124611, -1.2435292, 2.8831468), (2.9845228, -1.0668774, 2.5300293), (3.124027, -1.0440558, 2.3476822)], [(2.9964604, 1.2988691, 0.39022663), (-3.0718434, -2.224424, -2.385617), (2.9989808, -1.0205619, 2.6155), (3.090659, -2.2035253, 2.8770804), (-3.0882726, -1.1287305, -0.18624555), (3.0020282, -1.2860423, -0.13994366), (-3.0922096, -1.9446679, 2.7733097), (3.124611, -1.2435292, 2.8831468), (2.9845228, -1.0668774, 2.5300293), (3.124027, -1.0440558, 2.3476822)], [(2.9964604, 1.2988691, 0.39022663), (-3.0718434, -2.224424, -2.385617), (2.9989808, -1.0205619, 2.6155), (3.090659, -2.2035253, 2.8770804), (-3.0882726, -1.1287305, -0.18624555), (3.0020282, -1.2860423, -0.13994366), (-3.0922096, -1.9446679, 2.7733097), (3.124611, -1.2435292, 2.8831468), (2.9845228, -1.0668774, 2.5300293), (3.124027, -1.0440558, 2.3476822)], [(2.9964604, 1.2988691, 0.39022663), (-3.0718434, -2.224424, -2.385617), (2.9989808, -1.0205619, 2.6155), (3.090659, -2.2035253, 2.8770804), (-3.0882726, -1.1287305, -0.18624555), (3.0020282, -1.2860423, -0.13994366), (-3.0922096, -1.9446679, 2.7733097), (3.124611, -1.2435292, 2.8831468), (2.9845228, -1.0668774, 2.5300293), (3.124027, -1.0440558, 2.3476822)], [(2.9964604, 1.2988691, 0.39022663), (3.0501144, 2.154529, 1.6086466)]]
current res number: 0
current torsion records: [(2.9964604, 1.2988691, 0.39022663), (3.0501144, 2.154529, 1.6086466)]
I've attached more examples in a .txt
file to save some space in this thread. Total 6 mismatches making 256 backbones
res_and_tors.txt
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the last 5 res numbers values for current_res_num in 5 previous iterations? As I understand it current_res_num+1 should be the same as the length of the torsion records, but it does not seem to be the case for what you provided
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it was a bit misleading, the "0" in the "last 5 res numbers" is the current res number. The "last_X" list updates at the beginning of the for-loop before torsion records or current res number changes. If an error occurs, it printed the last 5 items in the "last_X" list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I still cannot understand why it can be 5,6,7,8,0. This means starting from a fragment of 5 residues, in the next several iterations only chunks of a single residues are added until it goes up to 8 residues. Then somehow it goes back to zero residue? That does not seem to be the logic of the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great discussion.
The machinery inside the build should flow without the need for many exceptions. Things around the machinery should adapt to that and provide whatever the machinery needs to continue flowing. Reading from the discussion, I think the bug may come from the get_adjacent_angles
function. For some reason, that function is giving one set of torsions more than requested. It's a tricky function. Does it happen only with int2cart
? Or does it happen with other options as well simply the builder ignores that extra set?
also, dont forget the changelog |
@joaomcteixeira I have also fixed the test/build/pr issues here as well however I need your authorization. It's a fix nonetheless for the issue with Int2Cart integration, maybe there might be a better one in the future? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment. cheers
No description provided.