Ensemble.fit gets stuck #18

pietromarchesi · 2019-06-24T13:01:20Z

I am fitting an ensemble of tensor decompositions (ncp_bcp) on my data. On some sessions, however, the call to fit gets stuck after fitting models of all ranks. Model fitting seems to take the same amount of time on all sessions (which is expected, as they all have comparable numbers of neurons/trials), but on some, whatever happens after model fitting (component alignment?) takes a huge amount of time. Here is the traceback after terminating execution manually:

Traceback (most recent call last):
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-002d1043bf24>", line 207, in <module>
    E.fit(X, ranks=ensemble_ranks, replicates=ensemble_replicates)
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/tensortools/ensemble.py", line 130, in fit
    res.similarity = kruskal_align(U, res.factors, permute_V=True)
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/tensortools/diagnostics.py", line 43, in kruskal_align
    indices = Munkres().compute(cost.copy())
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/munkres.py", line 160, in compute
    step = func()
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/munkres.py", line 331, in __step6
    self.C[i][j] += minval
KeyboardInterrupt

It could be that the data tensor is a little bigger for the sessions on which the fit gets stuck, but in practice it is still really small, e.g. 23 by 96 by 247 (neurons x time points x trials).

This is also related to the rank of the ensemble I am fitting. For the example tensor of size (23, 96, 247), fitting an ensemble up to rank 14 runs fine, but pushing it to rank 16 gets stuck.

Do you have any idea why the time required for the alignment would blow up under certain conditions, and how to avoid it?

klmcguir · 2019-09-08T13:47:51Z

I think I have run into the same problem previously. I meant to open an issue before now!

I think the issue is caused by a divide by zero creating NaNs in lines 37 and 38 of diagnostics.kruskal_align. Since it is always 0/0 that is being computed at that stage, I fixed it with a kluge that resets NaNs to 0.

    # Compute similarity matrices.
    unrm = [f / np.linalg.norm(f, axis=0) for f in U.factors]
    vnrm = [f / np.linalg.norm(f, axis=0) for f in V.factors]

    # If a factor is all zero, make sure that 0/0 does not produce NaNs (the new addition).
    for c, f in enumerate(unrm):
        f[np.isnan(f)] = 0
        unrm[c] = f
    for c, f in enumerate(vnrm):
        f[np.isnan(f)] = 0
        vnrm[c] = f

Hope this helps! @ahwillia thanks for everything! Sorry for not posting this sooner!
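
A small illustration of the 0/0 case described above, using only NumPy. The helper name `normalize_columns` is hypothetical and is not part of tensortools; it is a sketch of one way to avoid the NaNs at the division itself rather than patching them afterwards.

```python
import numpy as np

def normalize_columns(f):
    """Scale each column to unit norm; all-zero columns are left as zeros.

    Hypothetical helper for illustration, not a tensortools function.
    """
    norms = np.linalg.norm(f, axis=0)
    # Divide zero-norm columns by 1 instead of 0 so they stay exactly zero.
    safe_norms = np.where(norms > 0, norms, 1.0)
    return f / safe_norms

# A factor matrix whose second component is entirely zero.
f = np.array([[3.0, 0.0],
              [4.0, 0.0]])

# Naive normalization: the zero column becomes 0/0 = NaN.
with np.errstate(invalid="ignore"):
    naive = f / np.linalg.norm(f, axis=0)

# Guarded normalization: the zero column stays zero.
safe = normalize_columns(f)
```

Here `naive` contains NaNs in its second column, while `safe` does not. Any NaN in the normalized factors would carry through to a similarity or cost matrix built from them.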

ahwillia · 2019-09-09T17:07:36Z

So this happens when a factor becomes fully zero? Perhaps the right way to fix this would be to drop that factor from the model. Effectively, the model would have rank R-1 rather than R if one of the factors becomes all zero.
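
As a rough sketch of what dropping a zero factor could look like (this is an assumption for illustration, not necessarily what the eventual commit does): `drop_zero_components` is a hypothetical helper that takes a list of factor matrices, one per mode and each of shape (dimension, R), and removes every component whose column is all zero in at least one mode.

```python
import numpy as np

def drop_zero_components(factors, tol=0.0):
    """Remove components that are all zero in any mode.

    Hypothetical helper for illustration, not a tensortools function.
    `factors` is a list of arrays, each of shape (dim_k, R).
    Returns the reduced factors and the boolean mask of kept components.
    """
    # Column norms per mode, shape (n_modes, R).
    norms = np.array([np.linalg.norm(f, axis=0) for f in factors])
    # A component contributes nothing if its column is zero in any mode.
    keep = np.all(norms > tol, axis=0)
    return [f[:, keep] for f in factors], keep

# Rank-3 model in which the third component is zero in the second mode.
factors = [np.ones((4, 3)), np.ones((5, 3)), np.ones((6, 3))]
factors[1][:, 2] = 0.0

reduced, keep = drop_zero_components(factors)
```

After this call, each matrix in `reduced` has two columns, so the model has rank R-1. Any code that later indexes components (for example a permutation computed for rank R) would need to account for the smaller rank.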

ahwillia · 2019-09-09T20:28:05Z

Here is my fix -- ahwillia@1014e53. Let me know if this works for you. If the problem persists we can re-open the issue.

klmcguir · 2019-09-09T20:32:07Z

Thanks! I think this is a good solution.

The only thing I noticed from my experience was that after training you never ended up with a zero component. Is it possible for a factor to go to zero transiently? It has been a long time since I actually had to deal with this.

ahwillia · 2019-09-09T21:04:41Z

Ideally the factor should never go to zero. But depending on the optimization algorithm I think it might.

pietromarchesi · 2019-10-03T10:29:51Z

For me, fitting ensembles now breaks in a different way. It seems that removing a zero factor causes a problem with the permutation.

Traceback (most recent call last):
  File "/home/pietro/Envs/navig8/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-1c87a4ec2a15>", line 211, in <module>
    E.fit(X, ranks=ensemble_ranks, replicates=ensemble_replicates)
  File "/home/pietro/pythonprojects/tensortools/tensortools/ensemble.py", line 130, in fit
    res.similarity = kruskal_align(U, res.factors, permute_V=True)
  File "/home/pietro/pythonprojects/tensortools/tensortools/diagnostics.py", line 85, in kruskal_align
    V.permute(prmV)
  File "/home/pietro/pythonprojects/tensortools/tensortools/tensors.py", line 87, in permute
    raise ValueError('Invalid permutation specified.')
ValueError: Invalid permutation specified.
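
The check inside `permute` is not shown in this thread. Assuming it requires the permutation to be a rearrangement of 0..R-1 for a rank-R model, the sketch below shows how a permutation computed for rank R would fail once a zero factor has been removed. `is_valid_permutation` is a hypothetical function written for illustration only.

```python
def is_valid_permutation(perm, rank):
    """Return True if `perm` is a rearrangement of 0, 1, ..., rank-1.

    Hypothetical check for illustration; the actual tensortools
    validation may differ.
    """
    return sorted(perm) == list(range(rank))

# A permutation computed while the model still had rank 3.
perm = [2, 0, 1]

ok_before = is_valid_permutation(perm, 3)   # model still has rank 3
ok_after = is_valid_permutation(perm, 2)    # one zero factor was dropped
```

Under this assumption, `ok_before` is True and `ok_after` is False, which would match an "Invalid permutation" error appearing only after a factor is removed.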

ahwillia · 2019-10-04T21:06:07Z

Thanks for helping me find and fix these corner cases... New commit here should hopefully fix the error you were getting: ahwillia@b9c4fef

Can you git pull and try again? Thanks for the patience on this.

ahwillia closed this as completed Sep 9, 2019

ahwillia reopened this Oct 4, 2019
