Some PGS Catalog scores not working with pgsc_calc #370

Closed
Sabramow opened this issue Sep 4, 2024 · 4 comments · Fixed by PGScatalog/pygscatalog#46
Labels
bug Something isn't working user-query User queries & requests

Comments

@Sabramow

Sabramow commented Sep 4, 2024

Description of the bug

I've run into issues calculating some of the scores from the PGS Catalog when I indicate their IDs with the --pgs_id parameter. Specifically:

  1. PGS004255, PGS004256, PGS004258, PGS004259, PGS004260-64, PGS004272, PGS004273, PGS004280, PGS004299, PGS004301, PGS004304, PGS00428 (all from the same publication) cause an error. They are all formatted with dosage weights (see the 'Relevant files' section for an example of the formatting).
  2. Specifying PGS002759 ran without complication, but the output was the score for PGS000767 (also for depression) rather than PGS002759.

Command used and terminal output

Workflow execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 1.

The full error message was:

Error executing process > 'PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1)'

Caused by:
  Process `PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1)` terminated with an error exit status (1)

Command executed:

  pgscatalog-combine -s PGS004174_hmPOS_GRCh38.txt.gz PGS004175_hmPOS_GRCh38.txt.gz PGS004176_hmPOS_GRCh38.txt.gz PGS004177_hmPOS_GRCh38.txt.gz PGS004178_hmPOS_GRCh38.txt.gz PGS004179_hmPOS_GRCh38.txt.gz PGS004180_hmPOS_GRCh38.txt.gz PGS004181_hmPOS_GRCh38.txt.gz PGS004182_hmPOS_GRCh38.txt.gz PGS004183_hmPOS_GRCh38.txt.gz PGS004184_hmPOS_GRCh38.txt.gz PGS004185_hmPOS_GRCh38.txt.gz PGS004186_hmPOS_GRCh38.txt.gz PGS004187_hmPOS_GRCh38.txt.gz PGS004188_hmPOS_GRCh38.txt.gz PGS004189_hmPOS_GRCh38.txt.gz PGS004190_hmPOS_GRCh38.txt.gz PGS004191_hmPOS_GRCh38.txt.gz PGS004192_hmPOS_GRCh38.txt.gz PGS004193_hmPOS_GRCh38.txt.gz PGS004194_hmPOS_GRCh38.txt.gz PGS004195_hmPOS_GRCh38.txt.gz PGS004196_hmPOS_GRCh38.txt.gz PGS004197_hmPOS_GRCh38.txt.gz PGS004198_hmPOS_GRCh38.txt.gz PGS004199_hmPOS_GRCh38.txt.gz PGS004200_hmPOS_GRCh38.txt.gz PGS004201_hmPOS_GRCh38.txt.gz PGS004202_hmPOS_GRCh38.txt.gz PGS004203_hmPOS_GRCh38.txt.gz PGS004204_hmPOS_GRCh38.txt.gz PGS004205_hmPOS_GRCh38.txt.gz PGS004206_hmPOS_GRCh38.txt.gz PGS004207_hmPOS_GRCh38.txt.gz PGS004208_hmPOS_GRCh38.txt.gz PGS004209_hmPOS_GRCh38.txt.gz PGS004210_hmPOS_GRCh38.txt.gz PGS004211_hmPOS_GRCh38.txt.gz PGS004212_hmPOS_GRCh38.txt.gz PGS004213_hmPOS_GRCh38.txt.gz PGS004214_hmPOS_GRCh38.txt.gz PGS004215_hmPOS_GRCh38.txt.gz PGS004216_hmPOS_GRCh38.txt.gz PGS004217_hmPOS_GRCh38.txt.gz PGS004218_hmPOS_GRCh38.txt.gz PGS004219_hmPOS_GRCh38.txt.gz PGS004220_hmPOS_GRCh38.txt.gz PGS004221_hmPOS_GRCh38.txt.gz PGS004222_hmPOS_GRCh38.txt.gz PGS004223_hmPOS_GRCh38.txt.gz PGS004224_hmPOS_GRCh38.txt.gz PGS004225_hmPOS_GRCh38.txt.gz PGS004226_hmPOS_GRCh38.txt.gz PGS004227_hmPOS_GRCh38.txt.gz PGS004228_hmPOS_GRCh38.txt.gz PGS004229_hmPOS_GRCh38.txt.gz PGS004230_hmPOS_GRCh38.txt.gz PGS004231_hmPOS_GRCh38.txt.gz PGS004232_hmPOS_GRCh38.txt.gz PGS004233_hmPOS_GRCh38.txt.gz PGS004234_hmPOS_GRCh38.txt.gz PGS004235_hmPOS_GRCh38.txt.gz PGS004236_hmPOS_GRCh38.txt.gz PGS004237_hmPOS_GRCh38.txt.gz PGS004238_hmPOS_GRCh38.txt.gz 
PGS004239_hmPOS_GRCh38.txt.gz PGS004240_hmPOS_GRCh38.txt.gz PGS004241_hmPOS_GRCh38.txt.gz PGS004242_hmPOS_GRCh38.txt.gz PGS004243_hmPOS_GRCh38.txt.gz PGS004244_hmPOS_GRCh38.txt.gz PGS004245_hmPOS_GRCh38.txt.gz PGS004246_hmPOS_GRCh38.txt.gz PGS004247_hmPOS_GRCh38.txt.gz PGS004248_hmPOS_GRCh38.txt.gz PGS004249_hmPOS_GRCh38.txt.gz PGS004250_hmPOS_GRCh38.txt.gz PGS004251_hmPOS_GRCh38.txt.gz PGS004252_hmPOS_GRCh38.txt.gz PGS004253_hmPOS_GRCh38.txt.gz PGS004254_hmPOS_GRCh38.txt.gz PGS004256_hmPOS_GRCh38.txt.gz PGS004257_hmPOS_GRCh38.txt.gz PGS004258_hmPOS_GRCh38.txt.gz PGS004259_hmPOS_GRCh38.txt.gz PGS004260_hmPOS_GRCh38.txt.gz PGS004261_hmPOS_GRCh38.txt.gz PGS004262_hmPOS_GRCh38.txt.gz PGS004263_hmPOS_GRCh38.txt.gz PGS004264_hmPOS_GRCh38.txt.gz PGS004265_hmPOS_GRCh38.txt.gz PGS004266_hmPOS_GRCh38.txt.gz PGS004267_hmPOS_GRCh38.txt.gz PGS004268_hmPOS_GRCh38.txt.gz PGS004269_hmPOS_GRCh38.txt.gz PGS004270_hmPOS_GRCh38.txt.gz PGS004271_hmPOS_GRCh38.txt.gz PGS004272_hmPOS_GRCh38.txt.gz PGS004273_hmPOS_GRCh38.txt.gz             -t GRCh38             -o scorefiles.txt.gz             -l log_scorefiles.json             -v             -v
  
  cat <<-END_VERSIONS > versions.yml
  COMBINE_SCOREFILES:
      pgscatalog.core: $(echo $(python -c 'import pgscatalog.core; print(pgscatalog.core.__version__)'))
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  pgscatalog.core.cli.combine_cli: 2024-08-26 16:12:07 INFO     Processing PGS004245
  pgscatalog.core.cli.combine_cli: 2024-08-26 16:12:07 INFO     Processing PGS004246
  pgscatalog.core.cli.combine_cli: 2024-08-26 16:12:07 INFO     Processing PGS004247
  
   75%|███████▍  | 74/99 [01:43<00:31,  1.24s/it]pgscatalog.core.cli.combine_cli: 2024-08-26 16:12:07 INFO     Processing PGS004248
  pgscatalog.core.cli.combine_cli: 2024-08-26 16:12:07 INFO     Processing PGS004249
  pgscatalog.core.cli.combine_cli: 2024-08-26 16:12:07 INFO     Processing PGS004250
  pgscatalog.core.cli.combine_cli: 2024-08-26 16:12:07 INFO     Processing PGS004251
  pgscatalog.core.cli.combine_cli: 2024-08-26 16:12:07 INFO     Processing PGS004252
  pgscatalog.core.lib._normalise: 2024-08-26 16:12:07 WARNING  Multiple other_alleles detected in 43 variants
  pgscatalog.core.lib._normalise: 2024-08-26 16:12:07 WARNING  Other allele for these variants is set to missing
  
   80%|███████▉  | 79/99 [01:43<00:14,  1.37it/s]pgscatalog.core.cli.combine_cli: 2024-08-26 16:12:07 INFO     Processing PGS004253
  pgscatalog.core.cli.combine_cli: 2024-08-26 16:12:19 INFO     Processing PGS004254
  
   80%|███████▉  | 79/99 [02:00<00:14,  1.37it/s]
   82%|████████▏ | 81/99 [02:09<00:56,  3.12s/it]pgscatalog.core.cli.combine_cli: 2024-08-26 16:12:33 INFO     Processing PGS004256
  
   82%|████████▏ | 81/99 [02:09<00:28,  1.60s/it]
  Traceback (most recent call last):
    File "/app/pgscatalog.utils/.venv/bin/pgscatalog-combine", line 8, in <module>
      sys.exit(run())
               ^^^^^
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/cli/combine_cli.py", line 65, in run
      normalised_score = list(
                         ^^^^^
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/lib/scorefiles.py", line 485, in normalise
      yield from normalise(
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/lib/_normalise.py", line 71, in check_duplicates
      for variant in variants:
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/lib/_normalise.py", line 302, in detect_complex
      for variant in variants:
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/lib/_normalise.py", line 283, in check_effect_allele
      for variant in variants:
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/lib/_normalise.py", line 159, in assign_other_allele
      for variant in variants:
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/lib/_normalise.py", line 138, in check_effect_weight
      for variant in variants:
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/lib/_normalise.py", line 191, in assign_effect_type
      for variant in variants:
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/lib/_normalise.py", line 253, in check_bad_variant
      for variant in variants:
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/lib/_normalise.py", line 221, in remap_harmonised
      for variant in variants:
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/lib/scorefiles.py", line 326, in _generate_variants
      yield from read_rows_lazy(
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/core/lib/_read.py", line 35, in read_rows_lazy
      yield ScoreVariant(**variant, **{"accession": name, "row_nr": row_nr})
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  TypeError: ScoreVariant.__init__() missing 1 required keyword-only argument: 'effect_weight'

Relevant files

###PGS CATALOG SCORING FILE - see https://www.pgscatalog.org/downloads/#dl_ftp_scoring for additional information
#format_version=2.0
##POLYGENIC SCORE (PGS) INFORMATION
#pgs_id=PGS004280
#pgs_name=GenoBoost_all-cause_dementia_0
#trait_reported=All-cause dementia
#trait_mapped=dementia
#trait_efo=MONDO_0001627
#genome_build=hg19
#variants_number=30
#weight_type=beta
##SOURCE INFORMATION
#pgp_id=PGP000546
#citation=Ohta R et al. Nat Commun (2024). doi:10.1038/s41467-024-48654-x
chr_name chr_position effect_allele other_allele dosage_0_weight dosage_1_weight dosage_2_weight
19 45395619 G A -0.1239091 0.2275461 0.68995
19 45396219 T C -0.1833033 0.1472852 0.4591863
19 45403412 T C 0.0663185 -0.0230981 -0.1023525
19 45389596 A G 0.0206892 -0.3438102 -1.0314368
19 45414451 T C 0.0626888 -0.0365833 -0.1220287
7 1569418 C T 0.0527261 0.0033867999999999997 -0.07103519999999999
7 100013457 T C 0.0262445 0.0068097999999999995 -0.0515715
9 6144065 A G -0.0226533 0.033497 0.0791566
9 10430602 C T -0.0162635 0.027885399999999998 0.1389225
16 12666279 G A -0.0205917 0.0189861 0.1378777
8 140267889 G A 0.0391212 -0.022668800000000003 -0.026091
5 156686040 C T -0.0164153 0.0582531 0.017068200000000002
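
A quick way to spot scoring files of this kind before running the pipeline is to look at the column header line. The sketch below is only an illustration, not part of pgsc_calc or pgscatalog.core: the helper names `open_scorefile`, `read_columns` and `describe_weights` are made up here, and it assumes that columns are separated by whitespace (tabs in the downloaded files) and that dosage columns follow the `dosage_N_weight` naming shown above.

```python
import gzip
import io
from typing import Iterable, TextIO


def open_scorefile(path: str) -> TextIO:
    """Open a plain or gzip-compressed scoring file as text (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_columns(lines: Iterable[str]) -> list[str]:
    """Return the column names from the first non-comment, non-blank line."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        return line.split()
    return []


def describe_weights(columns: list[str]) -> dict:
    """Report whether effect_weight is present and list any dosage weight columns."""
    dosage = [
        c for c in columns
        if c.startswith("dosage_") and c.endswith("_weight")
    ]
    return {"has_effect_weight": "effect_weight" in columns, "dosage_columns": dosage}


# In-memory stand-in for the start of the PGS004280 file shown above.
example = io.StringIO(
    "#pgs_id=PGS004280\n"
    "#weight_type=beta\n"
    "chr_name\tchr_position\teffect_allele\tother_allele\t"
    "dosage_0_weight\tdosage_1_weight\tdosage_2_weight\n"
    "19\t45395619\tG\tA\t-0.1239091\t0.2275461\t0.68995\n"
)
columns = read_columns(example)
report = describe_weights(columns)
print(report)
```

For a file on disk, `read_columns(open_scorefile(path))` would do the same job. A file with no `effect_weight` column but with dosage columns is the kind that triggers the `TypeError` in the traceback above.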

System information

nextflow version 23.10.0

@Sabramow Sabramow added the bug Something isn't working label Sep 4, 2024
@smlmbrt
Member

smlmbrt commented Sep 5, 2024

Hi @Sabramow,

We are aware of the problems in point 1 (redundant with #314 and PGScatalog/pygscatalog#44) and we may close the issue because of this.

Could you elaborate on part 2? What exact command did you use?

S

@Sabramow
Author

Thanks for addressing point 1!

For point 2: I was able to replicate this issue, as the attached screenshot of the score report shows. Although I specified PGS002759 in the command, the scoring file that was pulled appears to be PGS000767.
Screenshot 2024-09-10 at 11 29 23 AM

@smlmbrt
Copy link
Member

smlmbrt commented Sep 10, 2024

Hi @Sabramow, I can also replicate the problem. It's not a problem with the calculator, though: something must have gone wrong when we uploaded those specific harmonised files. We'll leave this open while we sort it out.

@smlmbrt smlmbrt added user-query User queries & requests bug Something isn't working and removed bug Something isn't working labels Sep 10, 2024
@smlmbrt
Member

smlmbrt commented Sep 11, 2024

@Sabramow, we've replaced that file on the FTP. Future runs of pgsc_calc for that score should use the correct score (provided you delete the work directory to make sure a cached copy isn't used). Thanks again for reporting the issue!
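
For anyone who wants to check whether a downloaded or cached scoring file is the one they asked for, a rough sketch is below. It assumes the mismatch would be visible in the `#pgs_id=` header line (the thread does not confirm where exactly the uploaded file was wrong), and the function name `header_pgs_id` is made up for illustration; it is not part of pgsc_calc.

```python
import io
from typing import Iterable, Optional


def header_pgs_id(lines: Iterable[str]) -> Optional[str]:
    """Return the value of the '#pgs_id=' header line, or None if it is absent."""
    for line in lines:
        if not line.startswith("#"):
            break  # the header ends at the first non-comment line
        if line.startswith("#pgs_id="):
            return line.split("=", 1)[1].strip()
    return None


# In-memory stand-in for a file that was requested as PGS002759
# but whose header names a different score (assumed scenario).
requested = "PGS002759"
cached_file = io.StringIO(
    "###PGS CATALOG SCORING FILE\n"
    "#format_version=2.0\n"
    "#pgs_id=PGS000767\n"
    "chr_name\tchr_position\teffect_allele\tother_allele\teffect_weight\n"
)
found = header_pgs_id(cached_file)
mismatch = found != requested
print(f"requested={requested} found={found} mismatch={mismatch}")
```

To run the same check on a real `.txt.gz` file, pass the handle from `gzip.open(path, "rt")` instead of the `StringIO` object.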
