This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

Pandas error: Exception("name 'direct_long_range' is not defined") #49

Open
vmurigneu opened this issue Dec 26, 2022 · 0 comments

vmurigneu commented Dec 26, 2022

Hello,

We are trying to run the pore-c pipeline on two subsamples of a dataset. The pipeline previously ran successfully on the full dataset (on a different HPC that we can no longer access).
We encountered the following error while running the first subsample:

Error in rule summarise_contacts:
    jobid: 8
    output: /scratch/project/gihex20hol/POR/sub10X/results_sub10X/merged_contacts/NlaIII_run01_BrahChr_chr1_2_shred_20kb_unphased.concatemers.parquet, /scratch/project/gihex20hol/POR/sub10X/results_sub10X/merged_contacts/NlaIII_run01_BrahChr_chr1_2_shred_20kb_unphased.concatemer_summary.csv
    log: /scratch/project/gihex20hol/POR/sub10X/results_sub10X/merged_contacts/NlaIII_run01_BrahChr_chr1_2_shred_20kb_unphased.concatemers.parquet.log (check log file(s) for error message)
    conda-env: /scratch/project_mnt/S0024/test/Pore-C-Snakemake/.snakemake/conda/14b8a690
    shell:
        pore_c --dask-scheduler-port 0 --dask-num-workers 10 contacts summarize /scratch/project/POR/sub10X/results_sub10X/merged_contacts/NlaIII_run01_BrahChr_chr1_2_shred_20kb_unphased.contacts.parquet /scratch/project/POR/sub10X/results_sub10X/basecall/NlaIII_run01.rd.summary.csv /scratch/project/POR/sub10X/results_sub10X/merged_contacts/NlaIII_run01_BrahChr_chr1_2_shred_20kb_unphased.concatemers.parquet /scratch/project/POR/sub10X/results_sub10X/merged_contacts/NlaIII_run01_BrahChr_chr1_2_shred_20kb_unphased.concatemer_summary.csv --user-metadata '{"run_id":"run01","enzyme":"NlaIII","biospecimen":"BrahMom","refgenome_id":"BrahChr_chr1_2_shred_20kb","phase_set_id":"unphased"}' 2>/scratch/project/POR/sub10X/results_sub10X/merged_contacts/NlaIII_run01_BrahChr_chr1_2_shred_20kb_unphased.concatemers.parquet.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

The log file contains:

    res = self.env.resolve(self.local_name, is_local=self.is_local)
  File "/scratch/project_mnt/S0024/test/Pore-C-Snakemake/.snakemake/conda/14b8a690/lib/python3.8/site-packages/pandas/core/computation/scope.py", line 203, in resolve
    raise UndefinedVariableError(key, is_local)
Exception: name 'direct_long_range' is not defined

This is the content of the results folder:

drwxr-sr-x. 2 uqvmurig Q1654RW   4096 Dec 11 13:29 refgenome
drwxr-sr-x. 2 uqvmurig Q1654RW  32768 Dec 11 14:17 basecall
drwxr-sr-x. 2 uqvmurig Q1654RW 524288 Dec 12 17:46 mapping
drwxr-sr-x. 2 uqvmurig Q1654RW   4096 Dec 12 17:47 virtual_digest
drwxr-sr-x. 2 uqvmurig Q1654RW 262144 Dec 16 10:54 align_table
drwxr-sr-x. 2 uqvmurig Q1654RW 131072 Dec 16 10:55 contacts
drwxr-sr-x. 3 uqvmurig Q1654RW   4096 Dec 19 09:28 merged_contacts

A similar error occurs with another subsample of the dataset:

/scratch/project_mnt/S0024/test/Pore-C-Snakemake/.snakemake/conda/1614131d/lib/python3.8/site-packages/pandas/core/arrays/categorical.py:2747: FutureWarning: The `inplace` parameter in pandas.Categorical.set_categories is deprecated and will be removed in a future version. Removing unused categories will always return a new Categorical object.
  res = method(*args, **kwargs)
Traceback (most recent call last):
  File "/scratch/project_mnt/S0024/test/Pore-C-Snakemake/.snakemake/conda/1614131d/lib/python3.8/site-packages/pandas/core/computation/scope.py", line 208, in resolve
    return self.temps[key]
KeyError: 'direct_long_range'

  File "/scratch/project_mnt/S0024/test/Pore-C-Snakemake/.snakemake/conda/1614131d/lib/python3.8/site-packages/pandas/core/computation/ops.py", line 115, in _resolve_name
    res = self.env.resolve(self.local_name, is_local=self.is_local)
  File "/scratch/project_mnt/S0024/test/Pore-C-Snakemake/.snakemake/conda/1614131d/lib/python3.8/site-packages/pandas/core/computation/scope.py", line 213, in resolve
    raise UndefinedVariableError(key, is_local) from err
pandas.core.computation.ops.UndefinedVariableError: name "name 'direct_long_range' is not defined" is not defined
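
For context, pandas raises this kind of error when an expression passed to `DataFrame.eval` or `DataFrame.query` refers to a name that is neither a column nor a local variable. The sketch below reproduces that in isolation; it is not the pore_c code. The toy table, the column names `a_count` and `b_count`, and the helper `try_eval` are hypothetical. Whether the subsample really lacks a `direct_long_range` column is only an assumption here, and padding with zeros is shown as one possible workaround, not a confirmed fix.

```python
import pandas as pd


def try_eval(df, expr):
    """Evaluate expr on df; return (result, None) or (None, error message)."""
    try:
        return df.eval(expr), None
    except NameError as err:
        # pandas' UndefinedVariableError is a subclass of NameError
        return None, str(err)


# Hypothetical table in which no 'direct_long_range' column exists
counts = pd.DataFrame({"a_count": [3, 1], "b_count": [0, 2]})

# The expression names a missing column, so evaluation fails
result, error = try_eval(counts, "a_count + direct_long_range")
print(error)

# Assumed workaround: add the expected columns, filling the gaps with 0
expected = ["a_count", "b_count", "direct_long_range"]
padded = counts.reindex(columns=expected, fill_value=0)
total, padded_error = try_eval(padded, "a_count + direct_long_range")
print(total.tolist())
```

If the first call reports `direct_long_range` as undefined while the padded table evaluates cleanly, the failure would come from a missing column rather than from the expression itself.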

Thank you for your help
Valentine
