
Overflow of timestamps in KS4 #715

Open

seto0425 opened this issue Jun 12, 2024 · 24 comments

@seto0425

Describe the issue:

Hi! I am using Kilosort 4.0.10 to spike sort data recorded with a 64-channel silicon probe over more than 40 hours (sampling rate: 20 kHz). I ran into an issue with the timestamps, visible in the Amplitude View, Feature View, and Firing Rate View (probably due to overflow). It seems to be caused by the data-type settings in KS4, and I believe it could be fixed by switching to a larger data type (int64 or uint32). However, I am not sure where to modify the code. Could you point me to the part of the code that should be changed?
Thank you in advance!
[screenshot: time_stamp_issue]
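
For scale: 40 hours at 20 kHz is about 2.88 billion samples, which already exceeds the int32 maximum of 2,147,483,647, so any 32-bit sample index wraps around partway through the recording. A quick check, using the values from the recording described above:

```python
import numpy as np

# Total sample count for 40 hours at 20 kHz vs. the int32 range.
n_samples = 40 * 3600 * 20_000             # 2,880,000,000 samples
print(n_samples > np.iinfo(np.int32).max)  # True: a 32-bit index overflows
```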

Reproduce the bug:

No response

Error message:

No response

Version information:

Kilosort 4.0.10

Context for the issue:

No response

Experiment information:

No response

@jacobpennington
Collaborator

Hello,

Spike times are already saved as int64 by Kilosort. Can you please try checking the spike times in a script or notebook instead of looking at Phy? It does look like an overflow is happening, but it might be happening in Phy rather than in the saved results.

Something like this should confirm whether the values are stored correctly:

```python
import numpy as np
import matplotlib.pyplot as plt

# Replace path with wherever your results were saved
st = np.load('C:/users/jacob/.kilosort/.test_data/kilosort4/spike_times.npy')
# Divide by the sampling rate (30 kHz here) to convert samples to seconds
_ = plt.hist(st / 30000, bins=100)
```

@seto0425
Author

Thank you. I checked the spike times as shown below. The values do suggest an overflow, yet the dtype of spike_times is int64. If there are any other possible solutions, please let me know.

[screenshot: time_stamp_issue_replay_2_2]

@jacobpennington
Collaborator

Thanks, looking into it.

@jacobpennington
Collaborator

@seto0425 Sorry for the delay, this has been a difficult one. I fixed a related overflow in an earlier part of the pipeline, but I still have not found the source of this one. As far as I can tell, every variable assignment and intermediate computation that deals with spike times uses a 64-bit data type. So, I have a couple of follow-up questions:

  1. Are there any other steps you're applying before or after sorting, like using SpikeInterface or doing your own post-processing?

  2. Would it be feasible for you to sort again with some modifications to the code? I would make a new branch for you with the required changes: a set of log statements to narrow down which step the overflow is happening at.
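
(For illustration, the kind of instrumentation meant here would look something like the sketch below; the helper and stage names are placeholders, not the actual branch code.)

```python
import logging

logger = logging.getLogger('kilosort')

def log_spike_times(stage, st):
    # Hypothetical helper: after each pipeline stage that touches spike
    # times, record the dtype and extrema. An int32 wraparound shows up
    # as a sudden negative minimum or an unexpected dtype change.
    logger.info('%s: dtype=%s, min=%d, max=%d',
                stage, st.dtype, int(st.min()), int(st.max()))
```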

@seto0425
Author

Thank you for your reply.

  1. I am not using SpikeInterface or doing any post-processing of my own. The data is the original .dat file recorded with the Open Ephys GUI.
  2. Of course, that is possible. Please advise on how to proceed.

@jacobpennington
Collaborator

Thank you! These are the steps to follow; please let me know if you need any clarification:

Download the repository and install it in editable mode in a new kilosort environment (modifying the last line if needed for your CUDA version):

```
git clone [email protected]:MouseLand/Kilosort.git
conda create --name kilosort_new python=3.9
conda activate kilosort_new
pip install -e Kilosort[gui]
pip uninstall torch
conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia
```
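
Before proceeding, it may be worth confirming that the CUDA build of PyTorch installed correctly (a quick sanity check, not part of the original instructions):

```python
import torch

# Should print the installed torch version and True if the GPU is visible.
print(torch.__version__, torch.cuda.is_available())
```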

Check out the branch for this issue (from inside the cloned Kilosort directory):

```
cd Kilosort
git checkout issue_715
```

Then run Kilosort4 however you did before. After sorting, there should be a kilosort4.log file in the results directory that you can upload here.
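
If you sort from a script rather than the GUI, the call would look roughly like the sketch below. This assumes the kilosort.run_kilosort API; the settings values, file path, and probe file are placeholders for the 64-channel, 20 kHz recording described above:

```python
from kilosort import run_kilosort

# Placeholder settings: 64 channels sampled at 20 kHz.
settings = {'n_chan_bin': 64, 'fs': 20000}
run_kilosort(settings=settings,
             filename='path/to/recording.dat',   # your Open Ephys .dat file
             probe_name='your_probe_map.mat')    # your probe/channel map
```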

@seto0425
Author

I am sending you the kilosort4.log file; is this okay with you? Also, I recently noticed that there were no overflow-related issues in the drift plot displayed while Kilosort was running.

kilosort4.log
[screenshot: drift_amount]

@jacobpennington
Collaborator

Was that the correct log file? It looks like that one stopped with an error very quickly, during drift correction.

@seto0425
Author

seto0425 commented Aug 1, 2024

I apologize for the delayed response; I was dealing with another issue. After updating from KS4.0.10, the program no longer runs correctly and stops with an error, even though I am using the same recording files as before. I have tried KS4.0.12 and 4.0.14, and both stop with the same error.

I am attaching the 'kilosort.log' file and the contents of the ‘Message log box’ obtained from the GUI (Kilosort version 4.0.14.dev2) for your reference.

I am currently trying to resolve this issue on my end as well. Although it deviates from the main topic of this thread, if you have any solutions for this error, could you please let me know?

kilosort4_0_14_dev_20240731.log
Message log box_KS4_0_14_dev_20240731.txt

@jacobpennington
Collaborator

@seto0425 Sorry for the delay. I just pushed a fix for this, it will be live on pypi later today as v4.0.16.

@seto0425
Author

I'm trying the new version 4.0.16, but I've encountered the same problem that was observed in Phy before. The data overflowed just like in my initial comments. Are there any other options to resolve this issue?

Please find attached copies of the Message Log box output and the screenshots.

[screenshot: 2024-09-04 155405]
Log_KS4.txt

If you need to check it yourself, I can send you the data.

@jacobpennington
Collaborator

Yes, what I meant was that I fixed the new error you reported last time; the timestamp overflow problem still needs to be debugged. If sharing the data is feasible, that would probably be easiest.

@seto0425
Author

I misunderstood! The new error reported last time is mostly corrected. However, while the analysis runs fine with a single shank, it fails when a multi-shank probe is assigned.

I would like to send you the data via a Dropbox link. Could you provide your email address?

@jacobpennington
Collaborator

[email protected]

FYI, when people have shared via Dropbox in the past, I've had slow downloads and dropped connections about half the time. If Google Drive is an option for you, it has been more reliable, but I'll make Dropbox work if needed.

@seto0425
Author

@jacobpennington I have sent the email.

@jacobpennington
Collaborator

Got it, thanks!

@jacobpennington
Collaborator

@seto0425 Would you mind sharing the specs of the machine you used to sort the data (CPU, GPU, amount of memory, etc.)? I'm having trouble running it on my personal computer, so I want to get an idea of what I'll need to request to run it successfully on a cluster.

@seto0425
Author

seto0425 commented Oct 3, 2024

Our PC specs are as follows:

PC: Uni-i9X (Workstation)
CPU: Intel(R) Core(TM) i7-7820X CPU @ 3.6GHz
RAM: 128GB
GPU: NVIDIA TITAN RTX (Memory: 24GB)

Please let me know if you have any other questions.

@jacobpennington
Collaborator

Okay, good news and bad news:

The good news is that I was able to sort the data and did not encounter any problems with timestamp overflow (see the Phy screenshot below). The bad news is that this means I don't know why it is happening for you. It's annoying, but my recommendation would be to create a fresh install following the developer instructions at the bottom of the README (which include creating a new conda environment from environment.yml), then sort again.

[screenshot: Phy view of the sorted data]

@seto0425
Author

Thank you for checking. I also downloaded the development version (4.0.21 dev6+g781721a.d20241107) and ran Kilosort again, but unfortunately the issue was not resolved. I have attached the log from that attempt.

In the figures displayed while Kilosort was running, there does not appear to be any data-point overflow, so I suspect the issue occurs during the file-export step.

I don't believe the problem is with our PC, but would you be able to share the OS and specifications of the machine you are using? I would like to run Kilosort in as similar an environment as possible.

kilosort4.log

@jacobpennington
Collaborator

The machine used an A100 graphics card and an Intel processor (exact architecture unclear, since it was allocated from a queue on a compute cluster with multiple node types). The immediate difference I would wonder about is that the machine I ran on was using Linux (Rocky Linux 9.3). I do not currently have a sufficiently beefy Windows machine available to check whether this is an OS-specific problem.

@seto0425
Author

Thank you. I will consider using Linux via a virtual machine or the Windows Subsystem for Linux.

@HiroMiyawaki

I worked with @seto0425, and we have finally identified the cause of the issue!

In most Windows environments, NumPy's default integer type is int32, while on other platforms it is int64. As a result, in kilosort/template_matching.py, prog on line 34 is initialized as int32 on Windows. This causes ibatch in the for loop starting at line 38 to be int32 as well, leading to an overflow at line 68.

To make matters worse, at least in our environment, it seems that overflow warnings are suppressed when using tqdm.

The solution is straightforward: explicitly specify a 64-bit dtype in the numpy.arange call on line 34.
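
A minimal sketch of the failure mode and the fix (the batch size and batch count below are illustrative, not the actual values in template_matching.py):

```python
import numpy as np

NT = 60_000           # illustrative samples per batch
n_batches = 48_000    # illustrative batch count for a very long recording

# On Windows (NumPy < 2.0) the default integer type is int32, so this
# array, and any loop index drawn from it, is int32:
prog = np.arange(n_batches)
ibatch = prog[-1]
# ibatch * NT = 2,879,940,000 > 2**31 - 1, so on Windows the product
# silently wraps around to a wrong (negative) value.

# The fix: force a 64-bit dtype when building the batch indices.
prog64 = np.arange(n_batches, dtype=np.int64)
print(prog64[-1] * NT)   # 2879940000, no overflow
```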

If needed, I can submit a pull request to address this issue, but since it is a minor change, I think it might be easier for you to incorporate it directly.

@jacobpennington
Collaborator

Ahh I see that now. Great, thanks so much for helping figure this out!
