cif::pdb::reconstruct_pdbx is very slow #64

Open
Augustin-Zidek opened this issue Sep 16, 2024 · 3 comments

Augustin-Zidek commented Sep 16, 2024

Hello, many thanks for the development and maintenance of libcifpp!

I've noticed that cif::pdb::reconstruct_pdbx is very slow. For example, the 7soy mmCIF file from the PDB takes < 0.2 seconds to parse, but running cif::pdb::reconstruct_pdbx on it takes roughly 4.5 seconds, i.e. a slow-down of more than 20x if one wants to perform the correctness check/autofix.
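For anyone who wants to reproduce the comparison, a small wall-clock timer along these lines might be enough. This is only a sketch: `time_seconds` is a hypothetical helper name, not part of libcifpp, and it runs the callable just once without any warm-up.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not part of libcifpp): run a callable once and
// return the elapsed wall-clock time in seconds.
template <typename F>
double time_seconds(F &&fn)
{
	auto start = std::chrono::steady_clock::now();
	std::forward<F>(fn)();
	auto stop = std::chrono::steady_clock::now();
	return std::chrono::duration<double>(stop - start).count();
}
```

Wrapping the parse step and the cif::pdb::reconstruct_pdbx call in two separate lambdas and passing each to this helper would give the two numbers being compared above.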

The vast majority of the time is spent in cif::compound_factory::create:

image

Could that time be reduced? Also, cif::compound_factory::create seems to be called from multiple places. Would it make sense to cache that load?
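To make the caching idea concrete, here is a minimal memoization sketch. It does not use or describe libcifpp's actual internals: `compound_info`, `compound_cache`, and the loader callback are hypothetical stand-ins for whatever the expensive lookup produces, and whether libcifpp already caches such results is not established in this thread.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the result of an expensive compound lookup.
struct compound_info
{
	std::string id;
	std::size_t atom_count;
};

// Memoizing wrapper: the loader runs at most once per distinct compound ID.
// Not thread-safe; a real implementation would need locking.
class compound_cache
{
  public:
	using loader_fn = std::function<compound_info(const std::string &)>;

	explicit compound_cache(loader_fn loader)
		: m_loader(std::move(loader))
	{
	}

	// Return the cached entry, invoking the loader only on first use.
	const compound_info &get(const std::string &id)
	{
		auto it = m_cache.find(id);
		if (it == m_cache.end())
		{
			++m_loads;
			it = m_cache.emplace(id, m_loader(id)).first;
		}
		return it->second;
	}

	// Number of times the loader was actually invoked.
	std::size_t loads() const { return m_loads; }

  private:
	loader_fn m_loader;
	std::unordered_map<std::string, compound_info> m_cache;
	std::size_t m_loads = 0;
};
```

With something like this, repeated requests for the same compound ID from different call sites would pay the load cost only once. References returned by `get` stay valid as the map grows, since `std::unordered_map` does not move its elements on rehash.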

I think that this could also be sped up if the CCD was compressed using zstd instead of gzip, as it decompresses much faster.

Member

mhekkel commented Sep 16, 2024

Could it be that your components.cif file is compressed? What happens if you extract that file (the one in /var/cache/libcifpp)? Does that help?

Member

mhekkel commented Sep 23, 2024

You mentioned using zstd. That's a good suggestion, but the point is that when you use the bundled script to update components.cif, it writes the file out uncompressed, removing the need for decompression entirely.

Member

mhekkel commented Sep 23, 2024

For reference, cif-validate on 7soy takes 0.2 seconds on my laptop:

$ time build/cif-validate /tmp/7soy.cif.gz

real	0m0,246s
user	0m0,239s
sys	0m0,007s
