How to handle missing MSMS data #38

Open
hechth opened this issue Feb 23, 2023 · 2 comments

@hechth
Collaborator

hechth commented Feb 23, 2023

Currently, when MS2 data is not present, the version that reads data from a CSV copies the MS1 data into the MS2 slot, while the version that reads from xcms keeps the slot empty. Should this be made the general case?

@cbroeckl
Owner

If I recall, the clustering algorithm is written to expect data in the MS2 slot as well. This is sloppy coding, frankly: it was written that way as a shortcut to avoid having to change the similarity scoring. If there is only MS1 data, there is in theory no reason to calculate MS2 similarity or MS1 vs. MS2 correlational similarity.

To move away from this we would need to ensure that the calculate.similarity function behaviour is different when no MS2 data is available - currently there is no condition written to deal with this situation:
    max_value <- pmax(
      cor(data1[, start_row:stop_row], data1[, start_col:stop_col],
          method = cor.method, use = "everything"),
      cor(data1[, start_row:stop_row], data2[, start_col:stop_col],
          method = cor.method, use = "everything"),
      cor(data2[, start_row:stop_row], data2[, start_col:stop_col],
          method = cor.method, use = "everything") #, na.rm = TRUE )
    )
    # correlational similarity
    corr_sim <- round(exp(-((1 - max_value) ^ 2) / (2 * (sr ^ 2))), digits = 20)
    }

I think it is better to remedy this situation than to leave it as it was written. It also means fewer calculations.
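
A minimal sketch of what such a condition could look like, assuming a missing MS2 matrix is represented as `NULL`. The helper name `correlational_similarity` and its arguments `rows`, `cols` and the default `sr` are hypothetical and are not the actual `calculate.similarity` signature; only the `cor`/`pmax` pattern and the final transform are taken from the snippet above.

```r
# Hypothetical helper, NOT RAMClustR's actual calculate.similarity.
# data1: MS1 intensity matrix (samples x features)
# data2: MS2 intensity matrix with the same dimensions, or NULL if there is no MS2 data
# rows, cols: feature (column) indices to compare (assumed names)
correlational_similarity <- function(data1, data2 = NULL, rows, cols,
                                     sr = 0.5, cor.method = "pearson") {
  # The MS1 vs MS1 correlation can always be computed
  max_value <- cor(data1[, rows, drop = FALSE], data1[, cols, drop = FALSE],
                   method = cor.method, use = "everything")

  if (!is.null(data2)) {
    # Only add the MS1 vs MS2 and MS2 vs MS2 terms when MS2 data exists
    max_value <- pmax(
      max_value,
      cor(data1[, rows, drop = FALSE], data2[, cols, drop = FALSE],
          method = cor.method, use = "everything"),
      cor(data2[, rows, drop = FALSE], data2[, cols, drop = FALSE],
          method = cor.method, use = "everything")
    )
  }

  # Same transform of the correlation as in the snippet above
  round(exp(-((1 - max_value)^2) / (2 * (sr^2))), digits = 20)
}

# Usage on made-up data: MS1 only versus MS1 copied into the MS2 slot
set.seed(1)
ms1 <- matrix(rnorm(40), nrow = 10, ncol = 4)
sim_ms1_only <- correlational_similarity(ms1, NULL, rows = 1:4, cols = 1:4)
sim_copied   <- correlational_similarity(ms1, ms1,  rows = 1:4, cols = 1:4)
```

In this sketch, copying MS1 into the MS2 slot gives the same similarity matrix as skipping the MS2 terms, because all three correlations are then identical. The branch therefore only removes redundant `cor` calls.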

@hechth
Collaborator Author

hechth commented Feb 24, 2023

@cbroeckl I agree, so let's keep an eye on this. Let's make a list of places in the code where this behaviour will need to be adapted and resolve them step by step.
