
Handle virtual cores more explicitly #271

Open
JnsLns opened this issue May 16, 2024 · 3 comments


JnsLns commented May 16, 2024

Please write here what feature pandarallel is missing:

There is some inconsistency regarding whether `nb_workers` refers to physical cores or logical cores.

  • The documentation does not explicitly specify whether `pandarallel.initialize(nb_workers=...)` means physical or logical cores
  • The default value used when `nb_workers` is not passed is the number of physical cores (code)
  • It seems, however, that the value passed to `nb_workers` is actually interpreted as logical cores

The main problem is that on a machine with virtual cores, pandarallel will by default use only as many virtual cores as there are physical cores, because it counts the physical cores but interprets that number as logical cores. This might be solvable by simply changing `False` to `True` in the line linked above (but maybe there are downstream complications).
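To see the two counts on your own machine, here is a minimal sketch. The physical count uses psutil (the library pandarallel's source relies on for this); the try/except is only there to keep the snippet runnable if psutil is not installed:

```python
import os

# os.cpu_count() reports LOGICAL (virtual) cores, hyper-threads included.
logical = os.cpu_count()

try:
    import psutil  # the dependency pandarallel uses to count cores
    # logical=False counts only PHYSICAL cores; on a hyper-threaded
    # machine this is typically half the logical count.
    physical = psutil.cpu_count(logical=False)
except ImportError:
    physical = None  # psutil not available in this environment

print(f"logical cores: {logical}, physical cores: {physical}")
```

On the 20-virtual/10-physical machine described above, this would print 20 and 10 respectively.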

The other improvement would be to mention explicitly in the documentation that the value passed to `nb_workers` is logical cores.


JnsLns commented May 21, 2024

In the meantime I found this FAQ: https://nalepae.github.io/pandarallel/troubleshooting/, which states that pandarallel can only use physical cores. While this seems to explain what I listed above, it is at odds with my experiments on a machine with 20 virtual (10 physical) cores, where I was consistently able to engage all virtual cores, only one virtual core, all but one, and so on.

@highvight

This issue is now quite old, but I think I can still add some context.

`nb_workers` really is just that: the total number of workers (individual Python processes) to run in parallel.

The allocation of these processes to CPU cores is handled by your OS, not Python. The documentation (and the default value) simply state that you probably do not want to run more processes than there are physical cores on your CPU. This makes sense for CPU-bound tasks.
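A minimal stdlib sketch of that point, using `multiprocessing.Pool` directly rather than pandarallel's API: the worker count is just a process count, and nothing stops you from launching more processes than there are cores; the OS scheduler simply time-slices them.

```python
import multiprocessing as mp
import os

def square(x):
    return x * x

if __name__ == "__main__":
    # Deliberately launch twice as many workers as logical cores.
    # This runs fine; for CPU-bound work it just adds scheduling
    # overhead without adding throughput.
    n_workers = (os.cpu_count() or 1) * 2
    with mp.Pool(processes=n_workers) as pool:
        results = pool.map(square, range(10))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

This mirrors what `nb_workers` controls: how many processes exist, not which cores they land on.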

@JnsLns
Copy link
Author

JnsLns commented Oct 26, 2024

I see. So the point is that hyper-threading does not help with CPU-bound tasks. Pandarallel can use more workers than physical cores, but since that does not help, it does not do so by default.
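A rough way to observe this on your own machine (a hedged sketch, not a rigorous benchmark; exact timings depend on hardware and background load):

```python
import multiprocessing as mp
import time

def burn(_):
    # Purely CPU-bound busy work, so hyper-threads compete
    # for the same physical execution units.
    s = 0
    for i in range(2_000_000):
        s += i * i
    return s

def timed(n_workers, n_tasks=8):
    start = time.perf_counter()
    with mp.Pool(processes=n_workers) as pool:
        pool.map(burn, range(n_tasks))
    return time.perf_counter() - start

if __name__ == "__main__":
    # On a typical hyper-threaded machine, doubling the workers
    # beyond the physical core count gives far less than a 2x
    # speedup for this kind of workload.
    print("4 workers:", timed(4))
    print("8 workers:", timed(8))
```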

Thanks for clearing that up!
