Bug report [Pycistopic Temporary Fragment file cannot be found] #162

Open
yrsong001 opened this issue Sep 4, 2024 · 2 comments

yrsong001 commented Sep 4, 2024

Describe the bug
Hi! I am following the pycisTopic tutorial at https://pycistopic.readthedocs.io/en/latest/notebooks/human_cerebellum.html. Running export_pseudobulk fails with ValueError: Fragment file ./temp/age_2y_s1/BCell1.fragments.tsv.gz does not exist. I believe this file is supposed to be generated during the run. Could you help me debug this? I am unsure whether the problem is the temp file path or something else. Thank you for your advice.

To Reproduce

fragments_dict = {'age_2y_s1': '/proj/liulab/users/yrsong/aging/Dataset_Creation/run_cellranger_atac/1-ATAC/outs/fragments.tsv.gz',
                 'age_2y_s2': './run_cellranger_atac/2-ATAC/outs/fragments.tsv.gz',
                 'age_1y_s1': './3-ATAC/outs/fragments.tsv.gz',
                 'age_1y_s2': './4-ATAC/outs/fragments.tsv.gz',
                 'age_3m_s1': './5-ATAC/outs/fragments.tsv.gz',
                 'age_3m_s2': './6-ATAC/outs/fragments.tsv.gz'}

from pycisTopic.pseudobulk_peak_calling import *
bw_paths, bed_paths = export_pseudobulk(
    input_data = cell_data,
    variable = "celltype",
    sample_id_col = "sample_id",
    chromsizes = chromsizes,
    bed_path = os.path.join(out_dir, "consensus_peak_calling/pseudobulk_bed_files"),
    bigwig_path = os.path.join(out_dir, "consensus_peak_calling/pseudobulk_bw_files"),
    path_to_fragments = fragments_dict,
    n_cpu = 10,
    normalize_bigwig = True,
    temp_dir = "./temp", 
    split_pattern = None
)
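
Since `fragments_dict` mixes one absolute path with several relative ones, and `temp_dir` is also relative, it may be worth confirming that everything resolves as intended from the notebook's working directory. The sketch below is only a sanity check under that assumption. `check_fragment_paths` is a hypothetical helper, not part of pycisTopic, and it does not reproduce what `export_pseudobulk` does internally.

```python
import os


def check_fragment_paths(fragments_dict, temp_dir):
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical helper for illustration, not a pycisTopic function.
    """
    problems = []
    for sample, path in fragments_dict.items():
        # Resolve relative paths against the current working directory.
        abs_path = os.path.abspath(path)
        if not os.path.isfile(abs_path):
            problems.append(f"{sample}: fragment file not found at {abs_path}")

    abs_temp = os.path.abspath(temp_dir)
    if not os.path.isdir(abs_temp):
        problems.append(f"temp_dir does not exist: {abs_temp}")
    elif not os.access(abs_temp, os.W_OK):
        problems.append(f"temp_dir is not writable: {abs_temp}")
    return problems
```

Calling `check_fragment_paths(fragments_dict, "./temp")` before `export_pseudobulk` and printing the returned list would show whether any input path or the temp folder is the issue.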

Error output


2024-09-04 00:04:14,953 cisTopic     INFO     Splitting fragments by cell type.


ValueError                                Traceback (most recent call last)
Cell In[12], line 8
      6 ray.shutdown()
      7 from pycisTopic.pseudobulk_peak_calling import *
----> 8 bw_paths, bed_paths = export_pseudobulk(
      9     input_data = cell_data,
     10     variable = "celltype",
     11     sample_id_col = "sample_id",
     12     chromsizes = chromsizes,
     13     bed_path = os.path.join(out_dir, "consensus_peak_calling/pseudobulk_bed_files"),
     14     bigwig_path = os.path.join(out_dir, "consensus_peak_calling/pseudobulk_bw_files"),
     15     path_to_fragments = fragments_dict,
     16     n_cpu = 10,
     17     normalize_bigwig = True,
     18     temp_dir = "./temp", 
     19     split_pattern = None
     20 )

File /proj/liulab/users/yrsong/aging/Dataset_Creation/SCENIC_plus_Analysis/scplus_pipeline/Snakemake/config/pycisTopic/src/pycisTopic/pseudobulk_peak_calling.py:162, in export_pseudobulk(input_data, variable, chromsizes, bed_path, bigwig_path, path_to_fragments, sample_id_col, n_cpu, normalize_bigwig, split_pattern, temp_dir)
    159 # For each sample, get fragments for each cell type
    161 log.info("Splitting fragments by cell type.")
--> 162 split_fragment_files_by_cell_type(
    163     sample_to_fragment_file = path_to_fragments,
    164     path_to_temp_folder = temp_dir,
    165     path_to_output_folder = bed_path,
    166     sample_to_cell_type_to_cell_barcodes = sample_to_cell_type_to_barcodes,
    167     chromsizes = chromsizes_dict,
    168     n_cpu = n_cpu,
    169     verbose = False,
    170     clear_temp_folder = True
    171 )
    173 bed_paths = {}
    174 for cell_type in cell_data[variable].unique():

File ~/.conda/envs/scenicplus/lib/python3.11/site-packages/scatac_fragment_tools/library/split/split_fragments_by_cell_type.py:92, in split_fragment_files_by_cell_type(sample_to_fragment_file, path_to_temp_folder, path_to_output_folder, sample_to_cell_type_to_cell_barcodes, chromsizes, n_cpu, verbose, clear_temp_folder)
     90 path_to_fragment_file = os.path.join(path_to_temp_folder, sample, f"{cell_type_sanitized}.fragments.tsv.gz")
     91 if not os.path.exists(path_to_fragment_file):
---> 92     raise ValueError(f"Fragment file {path_to_fragment_file} does not exist.")
     93 if cell_type_sanitized not in cell_type_to_fragment_files:
     94     cell_type_to_fragment_files[cell_type_sanitized] = []

ValueError: Fragment file ./temp/age_2y_s1/BCell1.fragments.tsv.gz does not exist.
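
The thread does not establish why the per-cell-type file is missing. One thing that might be worth ruling out, purely as an assumption and not a diagnosis, is whether the barcodes listed for a sample in `cell_data` actually occur in that sample's fragment file. The sketch below reads the barcode column (the fourth tab-separated field of a standard fragments file) and counts the overlap. Both function names are hypothetical and only the standard library is used.

```python
import gzip


def barcodes_in_fragments(path, max_lines=1_000_000):
    """Collect barcodes (4th column) from the first max_lines of a fragments.tsv.gz."""
    seen = set()
    with gzip.open(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i >= max_lines:
                break
            if line.startswith("#"):
                # Skip header/comment lines.
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 4:
                seen.add(fields[3])
    return seen


def barcode_overlap(expected_barcodes, fragment_path, max_lines=1_000_000):
    """Return (number of expected barcodes found, number of expected barcodes)."""
    expected = set(expected_barcodes)
    found = barcodes_in_fragments(fragment_path, max_lines)
    return len(expected & found), len(expected)
```

If the overlap for a sample is zero or very low, a mismatch in barcode naming (for example an added prefix or suffix) would be a candidate explanation to look into. Note that only the first `max_lines` lines are scanned, so a low count on a large file is a hint rather than proof.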

Version (please complete the following information):

  • Python: 3.11
  • SCENIC+: 1.0a1

ghuls commented Sep 6, 2024

Did you run out of disk space, by any chance?
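
A quick way to check this from the same notebook, as a minimal sketch using only the standard library (`free_space_gib` is a hypothetical helper name, and the path passed in must already exist):

```python
import os
import shutil


def free_space_gib(path):
    """Free space, in GiB, on the filesystem that holds `path`."""
    usage = shutil.disk_usage(os.path.abspath(path))
    return usage.free / 1024**3
```

For example, `free_space_gib("./temp")` reports the space available where the temporary fragment files would be written. How much is needed depends on the size of the input fragment files.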

SeppeDeWinter commented

Hi @yrsong001

Please don't open duplicate issues: aertslab/scenicplus#458

All the best,

Seppe
