Ensemble.fit gets stuck #18

Open
pietromarchesi opened this issue Jun 24, 2019 · 7 comments

Comments

@pietromarchesi

I am fitting an ensemble of tensor decompositions (ncp_bcd) on my data. On some sessions, however, the call to fit gets stuck after models of all ranks have been fitted. Model fitting seems to take the same amount of time on all sessions (which is expected, as they all have comparable numbers of neurons/trials), but on some sessions, whatever happens after model fitting (component alignment?) takes a huge amount of time. Terminating execution manually gives:

Traceback (most recent call last):
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-002d1043bf24>", line 207, in <module>
    E.fit(X, ranks=ensemble_ranks, replicates=ensemble_replicates)
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/tensortools/ensemble.py", line 130, in fit
    res.similarity = kruskal_align(U, res.factors, permute_V=True)
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/tensortools/diagnostics.py", line 43, in kruskal_align
    indices = Munkres().compute(cost.copy())
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/munkres.py", line 160, in compute
    step = func()
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/munkres.py", line 331, in __step6
    self.C[i][j] += minval
KeyboardInterrupt

It could be that the data tensor is a little bigger for the sessions on which the fit gets stuck, but in practice it is still really small, e.g. 23 by 96 by 247 (neurons x time points x trials).

This is also related to the rank of the ensemble I am fitting. For the example tensor of size (23, 96, 247), fitting an ensemble up to rank 14 runs fine, but pushing it to rank 16 makes it get stuck.

Do you have any idea why the time required for the alignment would blow up under certain conditions, and how to avoid it?

@klmcguir

klmcguir commented Sep 8, 2019

I think I have run into the same problem previously. I meant to open an issue before now!

I think the issue is caused by a divide-by-zero creating NaNs in lines 37 and 38 of diagnostics.kruskal_align. Since it is always 0/0 that is being computed at that stage, I fixed it with a kludge that resets NaNs to 0.

    # Compute similarity matrices.
    unrm = [f / np.linalg.norm(f, axis=0) for f in U.factors]
    vnrm = [f / np.linalg.norm(f, axis=0) for f in V.factors]

    # If a factor is all zeros, its norm is 0 and the division above gives
    # 0/0 = NaN. Reset those entries to 0 in place (the new addition).
    for f in unrm + vnrm:
        f[np.isnan(f)] = 0

Hope this helps! @ahwillia thanks for everything! Sorry for not posting this sooner!
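
For readers following along, here is a minimal sketch of the failure mode described above, using a made-up 3 x 2 factor matrix (not data from this thread). A column of zeros has norm 0, so normalizing it computes 0/0 and fills the column with NaN. Every comparison involving NaN evaluates to False, so it is plausible (though not confirmed in this thread) that this is what keeps the Munkres loop from terminating.

```python
import numpy as np

# Toy factor matrix (hypothetical values): column 1 is entirely zero.
factor = np.array([[3.0, 0.0],
                   [4.0, 0.0],
                   [0.0, 0.0]])

# Same column-wise normalization as in the snippet above; the 0/0 warning
# is silenced here only to keep the demo output clean.
with np.errstate(invalid="ignore", divide="ignore"):
    normed = factor / np.linalg.norm(factor, axis=0)

# Flag which columns picked up NaN entries from the division.
has_nan = np.isnan(normed).any(axis=0)

# The workaround from the comment above: reset NaN entries to 0.
cleaned = normed.copy()
cleaned[np.isnan(cleaned)] = 0
```

Only the all-zero column is affected; columns with a non-zero norm come out with unit norm as intended.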

@ahwillia
Collaborator

ahwillia commented Sep 9, 2019

So this happens when a factor becomes fully zero? Perhaps the right way to fix this would be to drop that factor from the model. Effectively, the model would have rank R-1 rather than R if one of the factors becomes all zero.
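
The idea of dropping an all-zero factor can be sketched as follows. This is an illustration only, not the code from the commit referenced later in the thread: `drop_zero_components` is a hypothetical helper, and the factor matrices are assumed to be plain NumPy arrays of shape (mode_size, rank), one per tensor mode.

```python
import numpy as np

def drop_zero_components(factors):
    """Return the factor matrices without all-zero components.

    `factors` is a list of arrays, one per tensor mode, each of shape
    (mode_size, rank). Hypothetical helper, not part of tensortools.
    """
    # A component contributes nothing if its column is zero in any mode,
    # because each rank-one term is an outer product of those columns.
    keep = np.ones(factors[0].shape[1], dtype=bool)
    for f in factors:
        keep &= np.linalg.norm(f, axis=0) > 0
    return [f[:, keep] for f in factors]

# Rank-3 toy model with strictly positive entries (made-up data).
rng = np.random.default_rng(0)
factors = [rng.random((n, 3)) + 0.1 for n in (4, 5, 6)]
factors[1][:, 2] = 0.0  # zero out component 2 in one mode

pruned = drop_zero_components(factors)  # now effectively rank 2
```

Any code that later indexes components by position has to use the reduced rank, since the pruned model has fewer columns than the one that was requested.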

@ahwillia
Collaborator

ahwillia commented Sep 9, 2019

Here is my fix -- ahwillia@1014e53. Let me know if this works for you. If the problem persists we can re-open the issue.

@ahwillia ahwillia closed this as completed Sep 9, 2019
@klmcguir

klmcguir commented Sep 9, 2019

Thanks! I think this is a good solution.

The only thing I noticed from my experience was that after training you never ended up with a zero component. Is it possible for a factor to go to zero transiently? It has been a long time since I actually had to deal with this.

@ahwillia
Collaborator

ahwillia commented Sep 9, 2019 via email

@pietromarchesi
Author

For me when fitting ensembles this now breaks in a different way. Seems like removing a zero factor causes a problem with the permutation.

Traceback (most recent call last):
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-1c87a4ec2a15>", line 211, in <module>
    E.fit(X, ranks=ensemble_ranks, replicates=ensemble_replicates)
  File "/home/pietro/pythonprojects/tensortools/tensortools/ensemble.py", line 130, in fit
    res.similarity = kruskal_align(U, res.factors, permute_V=True)
  File "/home/pietro/pythonprojects/tensortools/tensortools/diagnostics.py", line 85, in kruskal_align
    V.permute(prmV)
  File "/home/pietro/pythonprojects/tensortools/tensortools/tensors.py", line 87, in permute
    raise ValueError('Invalid permutation specified.')
ValueError: Invalid permutation specified.
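
A sketch of why dropping a factor could plausibly trip this check, assuming (not verified against tensors.py) that `permute` requires its argument to be a rearrangement of 0, 1, ..., rank-1. The helper `is_valid_permutation` and the example indices are hypothetical.

```python
def is_valid_permutation(perm, rank):
    """True if `perm` is a rearrangement of 0, 1, ..., rank-1."""
    return sorted(perm) == list(range(rank))

# A permutation computed for a rank-4 model is valid for that model ...
perm = [2, 0, 3, 1]
ok_before = is_valid_permutation(perm, rank=4)

# ... but no longer matches once a zero factor has been dropped (rank 3).
ok_after = is_valid_permutation(perm, rank=3)
```

Under that assumption, a permutation built from the original rank would need to be recomputed after any factor is removed.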

@ahwillia ahwillia reopened this Oct 4, 2019
@ahwillia
Collaborator

ahwillia commented Oct 4, 2019

Thanks for helping me find and fix these corner cases... New commit here should hopefully fix the error you were getting: ahwillia@b9c4fef

Can you git pull and try again? Thanks for the patience on this.

3 participants