Exception: Bad effect weights #44

Closed
TravisMizeIGH opened this issue Aug 8, 2024 · 2 comments · Fixed by #46
Labels
bug Something isn't working

Comments

@TravisMizeIGH

Description of the bug

Hello,

It appears that some of the newer PGS scoring files have unexpected column headers, which causes a traceback (see below). I believe this is because some scores use "dosage_0_weight", "dosage_1_weight", and "dosage_2_weight" as column headers instead of "effect_weight". This occurs in the following PGS:

PGS004255_hmPOS_GRCh38.txt
PGS004256_hmPOS_GRCh38.txt
PGS004258_hmPOS_GRCh38.txt
PGS004259_hmPOS_GRCh38.txt
PGS004260_hmPOS_GRCh38.txt
PGS004261_hmPOS_GRCh38.txt
PGS004262_hmPOS_GRCh38.txt
PGS004263_hmPOS_GRCh38.txt
PGS004264_hmPOS_GRCh38.txt
PGS004272_hmPOS_GRCh38.txt
PGS004273_hmPOS_GRCh38.txt
PGS004280_hmPOS_GRCh38.txt
PGS004299_hmPOS_GRCh38.txt
PGS004301_hmPOS_GRCh38.txt
PGS004304_hmPOS_GRCh38.txt
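
To check which files are affected before running the pipeline, something like the following could be used. This is a minimal sketch, not part of pgscatalog_utils: the function name `classify_weight_columns` and the sample data are made up for illustration, and it assumes the scoring files are tab-separated with metadata lines that start with `#`.

```python
import io

# Column names reported in this issue as appearing instead of effect_weight.
DOSAGE_COLUMNS = {"dosage_0_weight", "dosage_1_weight", "dosage_2_weight"}


def classify_weight_columns(lines):
    """Classify a scoring file by the weight columns in its header row.

    `lines` is any iterable of text lines (for example an open file).
    Returns "effect_weight", "dosage", or "missing".
    Hypothetical helper, not the pgscatalog_utils implementation.
    """
    for line in lines:
        # Skip metadata comment lines and blank lines before the header row.
        if line.startswith("#") or not line.strip():
            continue
        columns = {col.strip() for col in line.rstrip("\n").split("\t")}
        if "effect_weight" in columns:
            return "effect_weight"
        if columns & DOSAGE_COLUMNS:
            return "dosage"
        return "missing"
    # No header row was found at all.
    return "missing"


# Made-up sample that mimics the header layout described above.
sample = io.StringIO(
    "#pgs_id=EXAMPLE\n"
    "rsID\tchr_name\teffect_allele\tdosage_0_weight\tdosage_1_weight\tdosage_2_weight\n"
    "rs1\t1\tA\t0.0\t0.1\t0.2\n"
)
print(classify_weight_columns(sample))
```

For a real file, pass an open text handle, e.g. `with open(path) as fh: classify_weight_columns(fh)`.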

Command error:
pgscatalog_utils.scorefile.qc: 2024-08-07 16:06:15 DEBUG Other allele column detected, including other_allele in variant identifier
pgscatalog_utils.scorefile.qc: 2024-08-07 16:06:15 DEBUG Only single other alleles detected.
pgscatalog_utils.scorefile.effect_weight: 2024-08-07 16:06:15 DEBUG Single effect weight column detected
pgscatalog_utils.scorefile.effect_weight: 2024-08-07 16:06:15 DEBUG Skipping melt
pgscatalog_utils.scorefile.effect_type: 2024-08-07 16:06:15 DEBUG No effect types set, using default (additive)
pgscatalog_utils.scorefile.write: 2024-08-07 16:06:16 DEBUG Output file exists: setting write mode to append
pgscatalog_utils.scorefile.write: 2024-08-07 16:06:16 DEBUG Writing out gzip-compressed combined scorefile
pgscatalog_utils.scorefile.read: 2024-08-07 16:06:23 DEBUG Reading scorefile PGS004227_hmPOS_GRCh38.txt
pgscatalog_utils.scorefile.harmonised: 2024-08-07 16:06:23 DEBUG Harmonised columns detected and used
pgscatalog_utils.scorefile.harmonised: 2024-08-07 16:06:23 DEBUG other_allele column contains information, dropping hm_inferOtherAllele
pgscatalog_utils.scorefile.qc: 2024-08-07 16:06:23 DEBUG Quality control: checking for bad variants
pgscatalog_utils.scorefile.qc: 2024-08-07 16:06:23 DEBUG Other allele column detected, including other_allele in variant identifier
pgscatalog_utils.scorefile.qc: 2024-08-07 16:06:23 DEBUG Only single other alleles detected.
pgscatalog_utils.scorefile.effect_weight: 2024-08-07 16:06:23 DEBUG Single effect weight column detected
pgscatalog_utils.scorefile.effect_weight: 2024-08-07 16:06:23 DEBUG Skipping melt
pgscatalog_utils.scorefile.effect_type: 2024-08-07 16:06:23 DEBUG No effect types set, using default (additive)
pgscatalog_utils.scorefile.write: 2024-08-07 16:06:23 DEBUG Output file exists: setting write mode to append
pgscatalog_utils.scorefile.write: 2024-08-07 16:06:23 DEBUG Writing out gzip-compressed combined scorefile
pgscatalog_utils.scorefile.read: 2024-08-07 16:06:23 DEBUG Reading scorefile PGS004202_hmPOS_GRCh38.txt
pgscatalog_utils.scorefile.harmonised: 2024-08-07 16:06:23 DEBUG Harmonised columns detected and used
pgscatalog_utils.scorefile.harmonised: 2024-08-07 16:06:23 DEBUG other_allele column contains information, dropping hm_inferOtherAllele
pgscatalog_utils.scorefile.qc: 2024-08-07 16:06:23 DEBUG Quality control: checking for bad variants
pgscatalog_utils.scorefile.qc: 2024-08-07 16:06:23 DEBUG Other allele column detected, including other_allele in variant identifier
pgscatalog_utils.scorefile.qc: 2024-08-07 16:06:23 DEBUG Only single other alleles detected.
pgscatalog_utils.scorefile.effect_weight: 2024-08-07 16:06:23 DEBUG Single effect weight column detected
pgscatalog_utils.scorefile.effect_weight: 2024-08-07 16:06:23 DEBUG Skipping melt
pgscatalog_utils.scorefile.effect_type: 2024-08-07 16:06:23 DEBUG No effect types set, using default (additive)
pgscatalog_utils.scorefile.write: 2024-08-07 16:06:23 DEBUG Output file exists: setting write mode to append
pgscatalog_utils.scorefile.write: 2024-08-07 16:06:23 DEBUG Writing out gzip-compressed combined scorefile
pgscatalog_utils.scorefile.read: 2024-08-07 16:06:23 DEBUG Reading scorefile PGS004256_hmPOS_GRCh38.txt
pgscatalog_utils.scorefile.harmonised: 2024-08-07 16:06:24 DEBUG Harmonised columns detected and used
pgscatalog_utils.scorefile.harmonised: 2024-08-07 16:06:24 DEBUG other_allele column contains information, dropping hm_inferOtherAllele
pgscatalog_utils.scorefile.qc: 2024-08-07 16:06:24 DEBUG Quality control: checking for bad variants
pgscatalog_utils.scorefile.qc: 2024-08-07 16:06:24 DEBUG Other allele column detected, including other_allele in variant identifier
pgscatalog_utils.scorefile.qc: 2024-08-07 16:06:24 DEBUG Only single other alleles detected.
pgscatalog_utils.scorefile.effect_weight: 2024-08-07 16:06:24 ERROR ERROR: Missing valid effect weight columns

Traceback (most recent call last):
  File "/venv/bin/combine_scorefiles", line 8, in <module>
    sys.exit(combine_scorefiles())
  File "/venv/lib/python3.10/site-packages/pgscatalog_utils/scorefile/combine_scorefiles.py", line 84, in combine_scorefiles
    .pipe(melt_effect_weights)
  File "/venv/lib/python3.10/site-packages/pandas/core/generic.py", line 5839, in pipe
    return com.pipe(self, func, *args, **kwargs)
  File "/venv/lib/python3.10/site-packages/pandas/core/common.py", line 513, in pipe
    return func(obj, *args, **kwargs)
  File "/venv/lib/python3.10/site-packages/pgscatalog_utils/scorefile/effect_weight.py", line 11, in melt_effect_weights
    elongate = _detect_multiple_weight_columns(df)
  File "/venv/lib/python3.10/site-packages/pgscatalog_utils/scorefile/effect_weight.py", line 43, in _detect_multiple_weight_columns
    raise Exception("Bad effect weights")
Exception: Bad effect weights

Command used and terminal output

No response

Relevant files

No response

System information

No response

@smlmbrt
Member

smlmbrt commented Aug 9, 2024

Thanks for the bug report. This shouldn't cause the pipeline to break, but it should warn users. I'm going to transfer this issue to pygscatalog (the utils that are breaking), as it's also somewhat redundant with PGScatalog/pgsc_calc#314.
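
The warn-instead-of-break behaviour could look roughly like the sketch below. It is only an illustration of the idea and not the change made in the linked pull request: `combine_with_warnings`, the logger name, and the file names are all hypothetical, and each score file is reduced to a list of its column names.

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("scorefile_sketch")  # hypothetical logger name


def combine_with_warnings(scorefiles):
    """Split score files into kept and skipped instead of raising.

    `scorefiles` maps a file name to the list of its column names.
    Returns a (kept, skipped) tuple of file-name lists.
    Hypothetical helper, not the pgscatalog_utils implementation.
    """
    kept, skipped = [], []
    for name, columns in scorefiles.items():
        if "effect_weight" in columns:
            kept.append(name)
        else:
            # Warn the user and carry on with the remaining files.
            logger.warning("Skipping %s: no effect_weight column found", name)
            skipped.append(name)
    return kept, skipped


# Made-up inputs: one file with the usual column, one with dosage-style columns.
kept, skipped = combine_with_warnings(
    {
        "ok_score.txt": ["rsID", "effect_allele", "effect_weight"],
        "dosage_score.txt": ["rsID", "effect_allele", "dosage_0_weight"],
    }
)
print(kept, skipped)
```

Returning the skipped names alongside the kept ones lets the caller report them in one place at the end of the run.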

@smlmbrt smlmbrt transferred this issue from PGScatalog/pgsc_calc Aug 9, 2024
@TravisMizeIGH
Author

PGS004239_hmPOS_GRCh38.txt is also in an incorrect format or missing information, which causes pgscatalog to error out.

@nebfield nebfield linked a pull request Sep 9, 2024 that will close this issue
amstilp added a commit to UW-GAC/primed-pgs-queries that referenced this issue Sep 30, 2024
The pgscatalog-combine command fails to run on ~15 ids from the catalog.
There is currently an issue and PR to fix this, so until that is
done I'm just adding the list that are currently failing. Once
the PR is merged and a new version of pygscatalog-utils is released,
I will update to that version.

Issue: PGScatalog/pygscatalog#44