Replies: 2 comments 3 replies
-
Dear Rainer, this is another good piece from your side. I tested 2 parallel processing libraries (multiprocessing and Ray) and both had bad performance compared to using a single core. I will read more about it. According to what you mentioned:
Do you have a use case for that? If a step is to be added in the lifecycle before the fitness function, then this would mean this step is doing some necessary operations in each generation. I thought that adding everything in the fitness function would be fine. This could be supported by adding a new optional callback function that is called before the fitness function. Please let me know if you have any other suggestions for doing that.
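For illustration only, a minimal sketch of how per-generation pre-work could already be approximated with the existing on_start/on_generation callbacks, with the fitness function reading from a small cache. The two-argument fitness signature matches older pyGAD releases, and `precomputed` is just a placeholder name, not part of the library.

```python
import numpy as np
import pygad

# Cache refreshed once per generation by the callbacks below (placeholder name).
precomputed = {"offset": 0.0}

def on_start(ga_instance):
    # Runs once before the first generation's fitness evaluations.
    precomputed["offset"] = np.random.rand()

def on_generation(ga_instance):
    # Runs after each generation, i.e. before the next generation's
    # fitness evaluations -- a stand-in for a "pre-fitness" step.
    precomputed["offset"] = np.random.rand()

def fitness_func(solution, solution_idx):
    # Two-argument signature of older pyGAD releases.
    return -abs(np.sum(solution) - precomputed["offset"])

ga_instance = pygad.GA(num_generations=10,
                       num_parents_mating=4,
                       sol_per_pop=8,
                       num_genes=16,
                       fitness_func=fitness_func,
                       on_start=on_start,
                       on_generation=on_generation)
ga_instance.run()
```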
-
This is something coming from tests I've done so far. My experience with multi-processing/threading in Python is not that profound, but since you tried several things in that domain on your own, I would think of something like this. There is a multitude of very different use cases for GA. Some run best (fastest) with the current default, some benefit from multi-processing, others maybe from multi-threading, and a few from a blend of both (this might not make any sense, as stated below). One important factor in this consideration is probably the fitness function (you mentioned that), which can be set up to serve individual needs.
I thought of something like a benchmark_nursery, where some generations are executed/evaluated with different (speed) settings. By this, a user would get an overview and can decide how to continue. Beyond that, the GA could also auto-tune to the best (fastest) setting found, since - neglecting some deviation - each generation should take a close to equal amount of time.

import multiprocessing
print(multiprocessing.cpu_count())  # method to get the system's core count

The resulting evaluation could look like this.

Benchmark-Nursery ..................................................
1 generation, 16 genes, 256 sol-p-pop
####################################################################
default mode: 200s
--------------------------------------------------------------------
multi-core:
- half (all_cores/2): 160s
- headroom (all_cores - 2): 110s
- full (all_cores): 120s
--------------------------------------------------------------------
multi-threading:
...
--------------------------------------------------------------------
multi-mix (cores/threading)
maybe this does not make any sense, since most problems have a clear
weighting between I/O and CPU demand.
--------------------------------------------------------------------
multi-core, headroom is the fastest result of this survey!
####################################################################

Of course it can look completely different, but this is something I see in my tests. If benchmark_nursery=0 were the default (not active), one could set the number of generations per test to get more robust (average) results. After this survey the GA would stop, or could continue if benchmark_autotune=True.
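To make the idea more concrete, here is a rough, standalone sketch of what such a nursery could do: time one batch of fitness evaluations per candidate setting and report the fastest. benchmark_nursery/benchmark_autotune are only the proposed names from above, and the fitness function and population are placeholders.

```python
import time
import multiprocessing
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def fitness(solution):
    # Placeholder CPU-bound fitness; replace with the real one.
    return sum(x * x for x in solution)

def time_batch(population, executor=None):
    # Time one full fitness evaluation of the population.
    start = time.perf_counter()
    if executor is None:
        [fitness(s) for s in population]          # default single-core path
    else:
        list(executor.map(fitness, population))   # distributed path
    return time.perf_counter() - start

if __name__ == "__main__":
    cores = multiprocessing.cpu_count()
    # Stand-in for 256 solutions per population with 4 genes each.
    population = [[float(i % 7), float(i % 5), 1.0, 2.0] for i in range(256)]

    timings = {"default": time_batch(population)}
    for label, workers in [("half", max(1, cores // 2)),
                           ("headroom", max(1, cores - 2)),
                           ("full", cores)]:
        with ProcessPoolExecutor(max_workers=workers) as ex:
            timings[f"multi-core/{label}"] = time_batch(population, ex)
    with ThreadPoolExecutor(max_workers=cores) as ex:
        timings["multi-threading"] = time_batch(population, ex)

    for mode, seconds in timings.items():
        print(f"{mode:<22s} {seconds:.3f}s")
    print("fastest:", min(timings, key=timings.get))
```

A real nursery would run this over a few generations per setting and average the results, as suggested above; auto-tuning would then simply adopt the fastest setting for the remaining run.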
-
As described here (#42), parallel processing might not be advisable at all; it depends on the problem to be solved. According to my experience and knowledge, for I/O tasks, where delays and the like come into play, multi-threading is often useful. For CPU tasks, where intense calculation/processing time is consumed, distributing the workload across several computing cores is advisable, since Python does a "bad" job here by default and exploits only a fraction of most modern systems' CPU power.
I adapted the approach mentioned above and want to summarize some findings.
Concurrent.futures vs. Multiprocessing
In Python 3 a new high-level interface named "concurrent.futures" was introduced. One advantage is that it offers the same methodology for both threads and processes. The "Pool" approach used in the example mentioned above is therefore one way of using this underlying technique for distributing jobs to processing cores.
But depending on one's programming skills it can be difficult to adapt this for individual tasks with more complex calculations/dependencies prior to or as part of the fitness calculation. Maybe it is a good idea to implement some logic to ease this process.
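As an illustration of that shared interface (not from the thread): the same map call works with either executor, so switching between threads (typically useful for I/O-bound fitness) and processes (typically useful for CPU-bound fitness) is essentially a one-line change. The fitness function here is a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def fitness(solution):
    # Placeholder fitness; must be a module-level function so that a
    # ProcessPoolExecutor can pickle it.
    return sum(x * x for x in solution)

def evaluate(population, kind="process", workers=4):
    # Same interface for both executors -- only the class differs.
    Executor = ProcessPoolExecutor if kind == "process" else ThreadPoolExecutor
    with Executor(max_workers=workers) as ex:
        return list(ex.map(fitness, population))

if __name__ == "__main__":
    population = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    print(evaluate(population, kind="process"))  # CPU-bound work benefits here
    print(evaluate(population, kind="thread"))   # I/O-bound work benefits here
```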
A structural idea for an optional new life cycle bubble
The cycle begins with the calculation of the fitness function, which in itself is rigid in terms of input arguments. One of these is "solution", which implies that some result is already available. This makes sense to me, since the next generation's population is the solution of the last cycle. But for me it would be interesting to become more open (flexible) towards calculations that do not "fit" into the fitness function or are retrieved as part of a pre-step.
My Python knowledge is not strong enough to give distinct advice here. But I think it would be a benefit for many if deciding between single- and multi-core for pyGAD were (could be) as simple as setting its configuration accordingly (threads/processes, concurrent_jobs, etc.), and I would like to hear some other thoughts on this.
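Purely as a sketch of what such a switch could look like from the user's side -- parallel_mode, concurrent_jobs and make_evaluator are not existing pyGAD options, only an illustration of the kind of configuration meant above:

```python
import multiprocessing
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

# Hypothetical user-facing configuration -- not existing pyGAD options.
config = {"parallel_mode": "process",    # "single" | "thread" | "process"
          "concurrent_jobs": max(1, multiprocessing.cpu_count() - 2)}

def make_evaluator(fitness, cfg):
    # Build a callable that evaluates a whole population according to cfg.
    mode, jobs = cfg["parallel_mode"], cfg["concurrent_jobs"]
    if mode == "single":
        return lambda population: [fitness(s) for s in population]
    Executor = ThreadPoolExecutor if mode == "thread" else ProcessPoolExecutor
    def evaluate(population):
        with Executor(max_workers=jobs) as ex:
            return list(ex.map(fitness, population))
    return evaluate
```

pyGAD itself would have to call such an evaluator in its fitness step for this to become a real configuration option; until then it is at least a way to prototype which setting pays off for a given fitness function.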