I have a question regarding the optimization of FlowSOM clustering.
I set a seed, used an 11x12 grid, and varied the nClus parameter to test different numbers of metaclusters. One issue I observed is that two clusters that are next to each other on the minimal spanning tree (MST) and have very similar marker expression profiles can be assigned to two different metaclusters. On the other hand, clusters that are far apart on the MST can be assigned to the same metacluster. For example, in the figure I attached below, metaclusters 1, 2, 10, and 14 are split into widely separated parts of the MST.
I'm wondering what could explain this, and how I should tune the parameters to get better clustering. (I tried setting maxMeta very high to get more well-defined metaclusters, but the issue remained, and the number of metaclusters returned was too high for annotation.)
Thank you so much and I'm looking forward to your response!
Best regards,
Ha Le
Hi Ha Le,
This can indeed happen sometimes.
The first thing to understand is that a minimal spanning tree by definition
cannot have any loops, so even nodes that are quite close in the
high-dimensional space can be far apart in the tree. The only thing the
tree tells you in that case is that another node was even closer.
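
To make this concrete, here is a small toy sketch in plain Python. It is not FlowSOM code: the horseshoe-shaped points, the Prim's-algorithm helper `prim_mst`, and the helper `tree_hops` are all made up for illustration. Seven nodes sit on an arc whose two ends are fairly close to each other, but each end is even closer to its neighbour on the arc.

```python
import math
from collections import deque

def prim_mst(points):
    """Build an MST over the complete Euclidean graph with Prim's algorithm."""
    n = len(points)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge that connects a tree node to a node outside the tree.
        i, j = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: math.dist(points[e[0]], points[e[1]]),
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges

def tree_hops(edges, start, goal):
    """Count the edges on the tree path from start to goal (breadth-first search)."""
    adj = {}
    for i, j in edges:
        adj.setdefault(i, []).append(j)
        adj.setdefault(j, []).append(i)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops[goal]

# Toy data (an assumption for illustration): nodes on a unit circle,
# 50 degrees apart, leaving a 60 degree gap between the first and last node.
angles = [30, 80, 130, 180, 230, 280, 330]
points = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in angles]

edges = prim_mst(points)
end_gap = math.dist(points[0], points[-1])        # distance between the two ends
neighbour_gap = math.dist(points[0], points[1])   # distance to the next node on the arc
hops = tree_hops(edges, 0, len(points) - 1)

print(f"distance between the ends: {end_gap:.2f}")
print(f"distance to the arc neighbour: {neighbour_gap:.2f}")
print(f"edges between the ends in the tree: {hops}")
```

In this toy case the two ends are the closest pair that is not directly connected, yet they sit at opposite ends of the tree, because adding their edge would close a loop.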
Secondly, the discrepancy with the metaclustering probably arises because
that step uses a different algorithm: hierarchical clustering with
average linkage. This means that when combining clusters, it considers
the average distance between multiple nodes rather than just the closest
node. Single linkage might give you results that correspond better to what
the tree shows.
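
The difference between the two linkage rules can be sketched with a few made-up 2D points. This is only an illustration of the distance definitions, not the metaclustering code in FlowSOM; the groups `a`, `x`, and `y` and both helper functions are assumptions for the example.

```python
import math
from itertools import product

def single_linkage(a, b):
    """Distance between two groups = distance of their closest pair of nodes."""
    return min(math.dist(p, q) for p, q in product(a, b))

def average_linkage(a, b):
    """Distance between two groups = mean distance over all pairs of nodes."""
    return sum(math.dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

# Toy groups of nodes (assumed coordinates, for illustration only).
a = [(0.0, 0.0), (0.0, 1.0)]
x = [(1.0, 0.0), (6.0, 0.0), (7.0, 0.0)]  # one node right next to a, the rest far away
y = [(2.5, 0.0), (2.5, 1.0)]              # every node moderately close to a

single_ax, single_ay = single_linkage(a, x), single_linkage(a, y)
average_ax, average_ay = average_linkage(a, x), average_linkage(a, y)

print(f"single linkage:  a-x = {single_ax:.2f}, a-y = {single_ay:.2f}")
print(f"average linkage: a-x = {average_ax:.2f}, a-y = {average_ay:.2f}")
```

With these points, single linkage would merge `a` with `x` first, because one node of `x` is its nearest neighbour, which is also the kind of edge a spanning tree draws. Average linkage would merge `a` with `y` first, because `y` is closer overall.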
However, when looking at the marker expression (e.g., by looking at the
heatmap as suggested by Samuel), we often see that there is good reason for
such a metaclustering, even when it doesn't look intuitive on the tree.
It's just one of the limitations of visualizing high-dimensional data in
2D.
So I would recommend checking the heatmaps and scatterplots and deciding
your final metaclusters based on that information. You can also adapt your
metaclustering labels with UpdateMetaclusters manually if need be.
Hope this helps,
Sofie