copy_number -> copy_call bins inconsistent #268

Open
ymahlich opened this issue Dec 12, 2024 · 0 comments
copy_number appears to be converted to copy_call differently for HCMI than for all the other datasets.
Is there a specific reason for this, or is it an oversight / bug? If it is intentional, does that mean the copy_number values (not just copy_call) are not comparable between HCMI and the other datasets?

Maybe I am deeply misunderstanding something here but it seems odd to me.

I have included the code for each conversion below, with the file and line numbers it comes from:

Broad Sanger (02-broadSangerOmics.R: lines 119-122):

dplyr::mutate(IMPROVE=ifelse(copy_number<0.5210507,'deep del',
                               ifelse(copy_number<0.7311832,'het loss',
                                      ifelse(copy_number<1.214125,'diploid',
                                             ifelse(copy_number<1.422233,'gain','amp')))))|>

CPTAC (getCptacData.py: lines 213-224):

    for a in arr:
        a = 2**float(a)
        if float(a) < 0.5210507:
            b = 'deep del'
        elif float(a) < 0.7311832:
            b = 'het loss'
        elif float(a) < 1.214125:
            b = 'diploid'
        elif float(a) <1.42233:
            b = 'gain'
        else:
            b = 'amp'

HCMI (02-getHCMIData.py: lines 479-488):

a_val = math.log2(float(a)+0.000001) ###this should not be exponent, should be log!!! 2**float(a)
if a_val < 0.0: #0.5210507:
    return 'deep del'
elif a_val < 0.7311832:
    return 'het loss'
elif a_val < 1.214125:
    return 'diploid'
elif a_val < 1.731183:
    return 'gain'
else:
    return 'amp'
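To make the HCMI cut points easier to compare, here is a small sketch that maps them back to the scale of the raw input `a`, by inverting `log2(a + 0.000001)`. The helper name `raw_cut` is hypothetical and not part of the repository, and I am not assuming anything about what `a` represents in the HCMI pipeline; this is only arithmetic on the numbers quoted above.

```python
# Cut points from the HCMI snippet, which are compared against
# a_val = log2(a + 0.000001). Each entry is (label assigned below the cut, cut).
HCMI_LOG2_CUTS = [
    ('deep del', 0.0),
    ('het loss', 0.7311832),
    ('diploid', 1.214125),
    ('gain', 1.731183),
]

EPS = 0.000001  # same offset as in the HCMI snippet


def raw_cut(log2_cut, eps=EPS):
    """Invert a_val = log2(a + eps): a_val < log2_cut  <=>  a < 2**log2_cut - eps."""
    return 2 ** log2_cut - eps


if __name__ == '__main__':
    for label, cut in HCMI_LOG2_CUTS:
        print(f"{label!r:>10}: a_val < {cut:<9} is the same as a < {raw_cut(cut):.4f}")
```

On the raw scale the boundaries come out at roughly 1.0, 1.66, 2.32 and 3.32, so the HCMI bins are not the same intervals as the ones the other datasets apply to copy_number directly.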

MPNST (01_mpnst_get_omics.R: lines 173-176):

dplyr::mutate(copy_call=ifelse(copy_number<0.5210507,'deep del',
                               ifelse(copy_number<0.7311832,'het loss',
                                      ifelse(copy_number<1.214125,'diploid',
                                             ifelse(copy_number<1.422233,'gain','amp')))))|>
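
To show where the four rules actually diverge, here is a minimal sketch that applies each set of cut points, as quoted above, to the same value. It deliberately ignores each pipeline's own transform (`2**a` for CPTAC, `log2(a + 0.000001)` for HCMI) and only compares the binning step, so `value` stands for whatever number ends up being compared against the cut points. The names `CUTS`, `copy_call` and `compare` are hypothetical and not part of the repository. Note that the CPTAC snippet as quoted uses 1.42233 for the last cut, whereas Broad Sanger and MPNST use 1.422233.

```python
from bisect import bisect_right

LABELS = ['deep del', 'het loss', 'diploid', 'gain', 'amp']

# Cut points copied from the snippets quoted in this issue.
CUTS = {
    'broad_sanger': [0.5210507, 0.7311832, 1.214125, 1.422233],
    'mpnst':        [0.5210507, 0.7311832, 1.214125, 1.422233],
    'cptac':        [0.5210507, 0.7311832, 1.214125, 1.42233],
    'hcmi':         [0.0,       0.7311832, 1.214125, 1.731183],
}


def copy_call(value, cuts):
    """Return the label for `value` using strict `<` comparisons, like the originals.

    bisect_right counts the cut points that are <= value, which is exactly the
    number of `value < cut` tests that fail before the first one succeeds.
    """
    return LABELS[bisect_right(cuts, value)]


def compare(value):
    """Apply every dataset's cut points to the same value."""
    return {name: copy_call(value, cuts) for name, cuts in CUTS.items()}


if __name__ == '__main__':
    for v in (0.3, 1.0, 1.4223, 1.5):
        print(v, compare(v))
```

With these cut points, a value of 0.3 is 'deep del' everywhere except HCMI ('het loss'), 1.5 is 'amp' everywhere except HCMI ('gain'), and 1.4223 is 'gain' for CPTAC but 'amp' for Broad Sanger and MPNST. If the bins are meant to be shared, keeping the cut points in one place, as in `CUTS` here, would make differences like these easier to spot.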