Batch (parallelized) calculations of beta Shapley #427
Comments
Note that batch size will change the behaviour of algorithms, because convergence criteria are checked in the main process. The most obvious case is MaxChecks, but all other stopping criteria will also only be evaluated every batch_size samples. The issue arises with models that train very quickly, or with sample sizes that are too small. I would have expected that using a process pool would alleviate the problem. Is joblib doing this? Anyway, the proper solution is IMO not to batch but to use queues, like we did a long time ago. And if we decide to use batches as a stopgap, then it might make sense to batch by the total size of all samples sent instead of their number, because training on 5 points might be faster than training on 5000. I'm not sure this is worth it, though, because we are going to break the batch_size interface once we implement queues, and that would be bad.
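A minimal sketch of what batching by total sample size could look like, assuming each sample exposes its number of points via len(); the helper name batch_by_total_size and the max_points parameter are hypothetical, not existing API:

```python
from typing import Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T", bound=Sequence)


def batch_by_total_size(samples: Iterable[T], max_points: int) -> Iterator[List[T]]:
    """Group samples into batches whose combined number of training points
    stays at or below max_points (a batch always holds at least one sample)."""
    batch: List[T] = []
    points = 0
    for sample in samples:
        if batch and points + len(sample) > max_points:
            yield batch
            batch, points = [], 0
        batch.append(sample)
        points += len(sample)
    if batch:
        yield batch
```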
Yes, but each sample of the batch is processed in order, and done is checked after each sample. If done is true in the middle of the batch, the rest of the calculations get thrown away. The PR comes in handy for the current experiments to speed things up (e.g. by a factor of 14 for logistic regression). We don't have to merge it, in case we settle on a different parallelization scheme. My favorite here would also be task queues, although ProcessPoolExecutor probably has a task queue under the hood, and as far as I know the loky backend manages a ProcessPoolExecutor internally.
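For illustration, a minimal sketch of that per-sample check; the names compute_marginal and done are hypothetical stand-ins for the utility evaluation and the stopping criterion:

```python
def process_batch(batch, compute_marginal, done):
    """Evaluate a batch sequentially; if the stopping criterion signals
    convergence mid-batch, the remaining samples are simply discarded."""
    results = []
    for sample in batch:
        results.append(compute_marginal(sample))
        if done():  # checked after every sample, not only once per batch
            break   # the rest of the batch is thrown away
    return results
```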
Come to think of it, process pools have running processes, but they start afresh with each new task, and this can take too long. They cannot be using a task queue like I meant, since they couldn't guarantee a clean slate if they did. We almost certainly want our own queues. As to your solution, given the massive speed-up you mention, we can add it if we clearly document that the interface is temporary and will be deprecated. It is crucial to ensure that randomness is handled properly, though. What do you think @AnesBenmerzoug?
With the current idiom for job submission, timing would probably have to happen within the caller and not the sampler, since we take a bunch of samples in one go and submit the jobs.
@kosmitive I'm going to merge your PR (after some changes to documentation and deprecation notices that I've pushed), but I think batching belongs in the samplers. A simple way to do it is to wrap them in a generic batching decorator.
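For example, a generic wrapper along these lines could turn any per-sample sampler into a batched one (the batched helper and the example sampler name are hypothetical, not pyDVL API):

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

SampleT = TypeVar("SampleT")


def batched(sampler: Iterable[SampleT], batch_size: int) -> Iterator[List[SampleT]]:
    """Wrap any sampler so that it yields lists of up to batch_size samples
    instead of individual samples."""
    iterator = iter(sampler)
    while batch := list(islice(iterator, batch_size)):
        yield batch


# usage sketch: for batch in batched(permutation_sampler, 32): submit(batch)
```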
Okay, when I think about it, the sampler is indeed the clearer location. Furthermore, the decorator is way cleaner. But if we remove it later, it might not be best to embed it so deeply in the framework. I leave the PR for editing and merging to you @mdbenito. If you need some work from my side, let me know. @AnesBenmerzoug We could transfer the stopping signal indirectly to the worker processes by introducing a stop queue, which could also carry the speed recommendation of each worker.
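A rough sketch of how such a stop signal could reach the workers, using a shared Event from a Manager as a simple stand-in for a fully-fledged stop queue (all names here are hypothetical placeholders):

```python
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Manager


def run_batch(batch, stop_event):
    """Worker: process a batch, but bail out early if the main process
    has already signalled convergence through the shared event."""
    results = []
    for sample in batch:
        if stop_event.is_set():
            break
        results.append(sample * sample)  # placeholder for a marginal evaluation
    return results


if __name__ == "__main__":
    with Manager() as manager, ProcessPoolExecutor(max_workers=2) as executor:
        stop = manager.Event()
        batches = [list(range(i, i + 5)) for i in range(0, 20, 5)]
        futures = [executor.submit(run_batch, b, stop) for b in batches]
        # the main process would call stop.set() once the stopping
        # criterion is met, e.g. after inspecting completed results
        for future in futures:
            print(future.result())
```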
Good point. Although it only shifts the convention for samplers to return lists of samples instead of single items, which is not so bad. Whether or not using a decorator is good is not so clear. For instance, we don't have explicit type information about the batched samplers, and would have to create a protocol or maybe wrap the samplers at runtime, although... yuck. Anyway, we can think about that some other time.
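If we did want explicit typing, a structural protocol for batched samplers could look roughly like this (BatchedSampler is a hypothetical name, not an existing class):

```python
from typing import Iterator, Protocol, Sequence, TypeVar

SampleT = TypeVar("SampleT", covariant=True)


class BatchedSampler(Protocol[SampleT]):
    """Structural type for samplers that yield whole batches of samples,
    so callers can be type-checked against the batched convention."""

    def __iter__(self) -> Iterator[Sequence[SampleT]]:
        ...
```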
Fine-grained marginal evaluations (used in our implementation of Beta Shapley) in combination with a parallel executor have sub-optimal performance. As a result, only a fraction of the available computing power is used. I like the way Beta Shapley is implemented, with the breakdown into marginal computations. To maintain that structure, we should think about batching the marginal evaluations to increase performance. Obviously the optimal value depends on the number of cores used, but a batch size between 30 and 50 can be chosen as a first guess.
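As a rough illustration of the intended change, submitting one task per batch of marginal evaluations rather than one task per sample might look like this (all names are hypothetical placeholders, not the actual implementation):

```python
from concurrent.futures import ProcessPoolExecutor


def marginals_for_batch(batch):
    """Placeholder for evaluating the marginal utility of each sample in the
    batch; the real code would train and score a model per sample."""
    return [0.5 * sample for sample in batch]


if __name__ == "__main__":
    samples = list(range(1000))  # stand-in for sampled subsets
    batch_size = 40              # first guess from the 30-50 range above
    batches = [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]
    with ProcessPoolExecutor() as executor:
        # one task per batch instead of one task per marginal evaluation
        results = [value for batch in executor.map(marginals_for_batch, batches)
                   for value in batch]
    print(len(results))
```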