Solve mathematical programs in parallel #21957

Conversation

@AlexandreAmice AlexandreAmice commented Sep 26, 2024

Closes #19119.



@AlexandreAmice AlexandreAmice left a comment

Would close #19119. I have not yet written the tests or the bindings, but I am looking for feedback on the interface design before finalizing the tests.

cc @hongkai-dai, @jwnimmer-tri for design
cc @iamsavva, @cohnt since I know you want this feature and I want to be sure the interface feels natural to you both.

Reviewable status: 2 unresolved discussions, needs platform reviewer assigned, needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes (waiting on @AlexandreAmice)


solvers/BUILD.bazel line 733 at r1 (raw file):

    implementation_deps = [
        ":choose_best_solver",
        ":get_program_type",

nit: This is not useful.


solvers/solve.cc line 68 at r1 (raw file):

    DRAKE_THROW_UNLESS(
        solvers.at(solver_id)->AreProgramAttributesSatisfied(*(progs.at(i))));
  }

As far as I can tell, all of our solvers are reentrant thread-safe, and so it suffices to have only one copy of each solver. I may be wrong about this, in which case I would want each thread to have its own list of solvers.

For the sake of safety, it may be a good idea to do this anyway. Thoughts?

Code quote:

  std::unordered_map<SolverId, std::unique_ptr<SolverInterface>> solvers;
  for (int i = 0; i < ssize(progs); ++i) {
    // Identify the best solver for the program and store it in the map of
    // available solvers.
    const SolverId solver_id =
        solver_ids->at(i).value_or(ChooseBestSolver(*(progs.at(i))));
    if (!solvers.contains(solver_id)) {
      solvers.insert({solver_id, MakeSolver(solver_id)});
    }
    DRAKE_THROW_UNLESS(
        solvers.at(solver_id)->AreProgramAttributesSatisfied(*(progs.at(i))));
  }

@jwnimmer-tri jwnimmer-tri self-assigned this Sep 26, 2024
@cohnt cohnt commented Sep 27, 2024

Very happy with how the interface looks -- this should be able to do everything I'm looking for.

I'm not sure what dynamic scheduling is, but I'm guessing I shouldn't really worry about it?

@jwnimmer-tri

(We should be sure the docs explain this.)

Static scheduling means if you have N cores, each core solves 1/N'th of the programs based on a fixed schedule (e.g., every N'th program on the list). This is most efficient (less overhead) in the case where the programs all take a comparable amount of time to solve.

Dynamic scheduling is a pool where each core takes the next unsolved program on the list. This works better for programs that might take widely variable amounts of time to solve, but has higher overhead for each program (it needs some kind of semaphore or memory barrier to figure out what's next on the list and claim it).

See "Scheduling_clauses" here: https://en.wikipedia.org/wiki/OpenMP

@cohnt cohnt commented Sep 27, 2024

Ah, in that case, it is relevant for me! I would appreciate this being included in the documentation in some form.

@cohnt cohnt left a comment

Reviewable status: 4 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes (waiting on @AlexandreAmice)


solvers/solve.cc line 112 at r1 (raw file):

    // Convert the initial guess into the requisite optional.
    std::optional<Eigen::VectorXd> initial_guess{std::nullopt};
    if (initial_guesses->at(i) != nullptr) {

Need to check if initial_guesses is a nullptr before trying to dereference it. How about

if (initial_guesses != nullptr && initial_guesses->at(i) != nullptr)

Code quote:

if (initial_guesses->at(i) != nullptr)

solvers/solve.cc line 146 at r1 (raw file):

      // Convert the initial guess into the requisite optional.
      std::optional<Eigen::VectorXd> initial_guess{std::nullopt};
      if (initial_guesses->at(i) != nullptr) {

Need to check if initial_guesses is a nullptr before trying to dereference it. How about

if (initial_guesses != nullptr && initial_guesses->at(i) != nullptr)

Code quote:

if (initial_guesses->at(i) != nullptr)

@cohnt cohnt left a comment

Reviewable status: 5 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes (waiting on @AlexandreAmice)


solvers/solve.cc line 119 at r1 (raw file):

    // Solve the program.
    MathematicalProgramResult* result_local{nullptr};

This should be a MathematicalProgramResult, and then passed into solvers.at(solver_id)->Solve(...). That function is expecting a pointer that it can store its data into -- it shouldn't be nullptr.
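
In other words, something along these lines (sketch only; `initial_guess` and `solver_options_i` stand in for whatever per-program values apply):

MathematicalProgramResult result_local;
solvers.at(solver_id)->Solve(*(progs.at(i)), initial_guess, solver_options_i,
                             &result_local);
results.at(i) = std::move(result_local);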

@jwnimmer-tri jwnimmer-tri left a comment

Two questions / suggestions on the API:

(1) Now that I see that SolverInterface::Solve is phrased in terms of optional instead of nullable, we should probably do the same here to avoid extra copying.

(2) For the parameters where we want to allow either a single value (broadcast) or an array value (per program), instead of overloading we could use a variant. This would allow the user to mix and match how they want to broadcast versus not.

SolveInParallel(
    ...,
    const std::variant<
        std::optional<SolverOptions>,
        std::vector<std::optional<SolverOptions>>
            >& solver_options = {},
    const std::variant<
        std::optional<SolverId>,
        std::vector<std::optional<SolverId>>
            >& solver_ids = {},

Reviewable status: 10 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes (waiting on @AlexandreAmice)


solvers/solve.h line 54 at r1 (raw file):

 * the solver and the problem formulation. If solver_ids == nullptr then this is
 * done for every progs[i]. Uses at most parallelism cores, with dynamic
 * scheduling by default.

BTW As per the discussion above, we need to explain dynamic vs static scheduling more clearly. That might be best as a @param instead of inline in the overview here.


solvers/solve.h line 60 at r1 (raw file):

 * manner.
 *
 * @throws if initial guess and solver options are provided and not the same

typos

Suggestion:

initial_guesses or solver_options or solver_ids

solvers/solve.h line 75 at r1 (raw file):

    bool dynamic_schedule = true);

/**

nit As with code, we should not copy-paste documentation, because it's brittle and annoying to keep large sections of text identical over time as people change things.

If we keep this as two separate functions, we should change this one to explain its contract by citing the other overload and explaining what's different.


solvers/solve.cc line 68 at r1 (raw file):

Previously, AlexandreAmice (Alexandre Amice) wrote…

As far as I can tell, all of our solvers are reentrant thread-safe and so it suffices to have only one copy of each solver. I may be wrong about this, in which case I would want to have each thread have it's own list of solvers.

For the case of safety, it may be a good idea to do this anyway. Thoughts?

The right question isn't whether they currently happen to be thread safe, it's whether we want to document them as such, and thus write regression tests for all of them to guard that documented promise. Currently, only SNOPT has such a regression test.

It is probably easiest to not try to coalesce them in this first PR. We can always circle back and do that in a follow-up if we find that it would be meaningfully faster.

Note that for the non-coalesced design, we should not pre-allocate all 1000 solvers. Each inner work loop should create (and destroy) the solver it needs moment by moment.
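
In sketch form, that non-coalesced worker could look roughly like this (illustrative only; everything other than ChooseBestSolver, MakeSolver, and SolverInterface::Solve is a made-up name):

#include <memory>
#include <optional>
#include <vector>

#include "drake/solvers/choose_best_solver.h"
#include "drake/solvers/mathematical_program.h"
#include "drake/solvers/mathematical_program_result.h"
#include "drake/solvers/solver_interface.h"

namespace drake {
namespace solvers {

// Illustrative only: the i'th unit of work creates (and destroys) its own
// solver instead of reading from a shared, pre-allocated list.
void SolveIth(const std::vector<const MathematicalProgram*>& progs,
              const std::vector<std::optional<SolverId>>& solver_ids, int i,
              std::vector<MathematicalProgramResult>* results) {
  const SolverId solver_id =
      solver_ids[i].value_or(ChooseBestSolver(*progs[i]));
  const std::unique_ptr<SolverInterface> solver = MakeSolver(solver_id);
  solver->Solve(*progs[i], /* initial_guess = */ std::nullopt,
                /* solver_options = */ std::nullopt, &(*results)[i]);
  // `solver` is destroyed here, releasing any license it held.
}

}  // namespace solvers
}  // namespace drake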


solvers/solve.cc line 160 at r1 (raw file):

                                   &(results.at(i)));
    } else {
      results[i] = *results_parallel[i];

It's somewhat wasteful to copy around the results again. How about having the workers solve directly into the final results, with a separate bookkeeping vector for those which were unsafe? (Don't use vector<bool>.)
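
That is, roughly this shape (sketch; result_is_populated is a made-up name, and a byte vector is used to avoid std::vector<bool>):

std::vector<MathematicalProgramResult> results(progs.size());
// One byte per program recording whether the parallel pass wrote results[i].
std::vector<uint8_t> result_is_populated(progs.size(), 0);

// In the parallel worker, for a thread-safe progs[i]:
//   solver->Solve(..., &results[i]);
//   result_is_populated[i] = 1;

// Afterwards, a serial pass handles whatever was skipped:
for (size_t i = 0; i < progs.size(); ++i) {
  if (!result_is_populated[i]) {
    // ... solve progs[i] serially, writing directly into results[i] ...
  }
}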


solvers/solve.cc line 178 at r1 (raw file):

  }));
  if (initial_guesses != nullptr) {
    DRAKE_THROW_UNLESS(progs.size() == initial_guesses->size());

nit The next three checks all have the LHS and RHS switched. The suspicious quantity being examined should appear on the LHS.

@jwnimmer-tri

Two questions / suggestions on the API: ...

Actually, let me retract that suggestion for now. Doing variant probably increases the copying, and doing optional probably helps C++ callers but hurts Python callers. I need to pencil out the details more carefully.

@jwnimmer-tri jwnimmer-tri left a comment

(1) Okay, I think actually we should stick with nullptr instead of optional and (eventually) change SolverBase to accept nullptr to save the extra copying from pydrake.

(2) Using variant is not efficient for C++ callers, so we should avoid that.

In short, I think the proposed two function signatures are good as-is.

Reviewable status: 11 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes (waiting on @AlexandreAmice)


solvers/solve.cc line 198 at r1 (raw file):

    const std::optional<SolverId>& solver_id, const Parallelism parallelism,
    bool dynamic_schedule) {
  DRAKE_THROW_UNLESS(std::all_of(progs.begin(), progs.end(), [](auto prog) {

There is no reason for using the Impl stuff and repeating these argument checks in both overloads. This overload can broadcast the solver_options into a std::vector and then call the other overload, like we already do for the IDs.
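
For example, something along these lines (a sketch of the delegation with abbreviated signatures, not the final code):

std::vector<MathematicalProgramResult> SolveInParallel(
    const std::vector<const MathematicalProgram*>& progs,
    const std::vector<const Eigen::VectorXd*>* initial_guesses,
    const SolverOptions* solver_options,
    const std::optional<SolverId>& solver_id, Parallelism parallelism,
    bool dynamic_schedule) {
  // Broadcast the single options/id into per-program vectors, then delegate
  // to the vector overload, which does all of the argument checking.
  const std::vector<const SolverOptions*> options_vec(progs.size(),
                                                      solver_options);
  const std::vector<std::optional<SolverId>> ids_vec(progs.size(), solver_id);
  return SolveInParallel(progs, initial_guesses, &options_vec, &ids_vec,
                         parallelism, dynamic_schedule);
}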

@AlexandreAmice AlexandreAmice left a comment

Thanks for taking a look at that! I hope to get to this tomorrow.

Reviewable status: 11 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes


solvers/solve.cc line 68 at r1 (raw file):

Previously, jwnimmer-tri (Jeremy Nimmer) wrote…

The right question isn't whether they currently happen to be thread safe, it's whether we want to document them as such, and thus write regression tests for all of them to guard that documented promise. Currently, only SNOPT has such a regression test.

It is probably easiest to not try to coalesce them in this first PR. We can always circle back and do that in a follow-up if we find that it would be meaningfully faster.

Note that for the non-coalesced design, we should not pre-allocate all 1000 solvers. Each inner work loop should create (and destroy) the solver it needs moment by moment.

To avoid having to create and destroy each solver I was thinking about having the solvers stored as:

std::vector<std::unordered_map<SolverId, std::unique_ptr<SolverInterface>>> solvers(num_threads);

The ith entry of this vector would be all the solvers that thread i has needed up until this moment. If it encounters a program which needs a solver that the ith thread has not yet used, then it constructs that solver, otherwise it just accesses the already made solver.

My reasoning is if I want to solve thousands of small programs, I don't want to be allocating the solver memory thousands of times.
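
Concretely, the per-thread lookup would be something like (sketch; get_or_make_solver is a made-up name):

std::vector<std::unordered_map<SolverId, std::unique_ptr<SolverInterface>>>
    solvers(parallelism.num_threads());

// Illustrative only: build a solver the first time this thread needs that
// SolverId; reuse it for later programs handled by the same thread.
const auto get_or_make_solver =
    [&solvers](int thread_num,
               const SolverId& solver_id) -> const SolverInterface& {
  auto& cache = solvers[thread_num];
  auto it = cache.find(solver_id);
  if (it == cache.end()) {
    it = cache.emplace(solver_id, MakeSolver(solver_id)).first;
  }
  return *it->second;
};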

@jwnimmer-tri jwnimmer-tri left a comment

Reviewable status: 11 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes (waiting on @AlexandreAmice)


solvers/solve.cc line 68 at r1 (raw file):

Previously, AlexandreAmice (Alexandre Amice) wrote…

To avoid having to create and destroy each solver I was thinking about having the solvers stored as:

std::vector<std::unordered_map<SolverId, std::unique_ptr<SolverInterface>>> solvers(num_threads);

The ith entry of this vector would be all the solvers that thread i has needed up until this moment. If it encounters a program which needs a solver that the ith thread has not yet used, then it constructs that solver, otherwise it just accesses the already made solver.

My reasoning is if I want to solve thousands of small programs, I don't want to be allocating the solver memory thousands of times.

That approach seems safe, but I'm not sure it's necessary. I would suggest running a performance experiment and see whether it matters or not.

My mental model is that creating and destroying solver instances is an order (or two) of magnitude cheaper than the other things that happen during solving (e.g., populating the results), so it would not actually make much of a difference compared to how much extra complexity that would introduce in the implementation.

There is also the problem that certain solver instances (e.g., mosek) hold a license as a member field, so keeping them around longer than strictly necessary might actually make things worse (in case license seats are a limited resource).

@AlexandreAmice AlexandreAmice left a comment

Reviewable status: 11 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes


solvers/solve.cc line 68 at r1 (raw file):

Previously, jwnimmer-tri (Jeremy Nimmer) wrote…

That approach seems safe, but I'm not sure it's necessary. I would suggest running a performance experiment and see whether it matters or not.

My mental model is that creating and destroying solver instances is an order (or two) of magnitude cheaper than the other things that happen during solving (e.g., populating the results), so it would not actually make much of a difference compared to how much extra complexity that would introduce in the implementation.

There is also the problem that certain solver instances (e.g., mosek) hold a license as a member field, so keeping them around longer than strictly necessary might actually make things worse (in case license seats are a limited resource).

The license seats are actually an excellent reason why I shouldn't keep the solvers around, and in fact they reveal a subtlety I had not considered.

Say that an organization has two licenses to a solver (e.g., MOSEK) and they attempt to solve in parallel using 16 threads. If I create 16 solvers, this will crash their code, as they don't have enough licenses to solve their problems. However, if I created only one solver and re-used it across all the threads, then they can safely use 16 threads.

Should I:
1.) Only create as many solvers of a particular kind as there are licenses? How would I even do this without a try-catch?
2.) Allow the error and throw a better error message?
3.) Stick with the current design and test our solvers for the promise that they are re-entrant thread safe?

@jwnimmer-tri jwnimmer-tri left a comment

Reviewable status: 11 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes (waiting on @AlexandreAmice)


solvers/solve.cc line 68 at r1 (raw file):

However, if I created only one solver and re-used it across all the threads, then they can safely use 16 threads.

I suppose they could use 16 threads in some sense, but I think only one at a time? As I understand it, each (e.g.) mosek seat buys you one concurrent call to (e.g.) MSK_optimizetrm. They would probably be better served by using a serial solve and giving MOSEK itself all 16 cores.

My initial thought is that the user should pass in a parallelism = N which does not exceed the number of seats they have purchased. Then, so long as each thread has at most 1 solver at a time, we know that the seats will never be exceeded.

@AlexandreAmice AlexandreAmice left a comment

I believe this is a complete PR now. Are there any volunteers for a thorough feature review?

Reviewable status: 4 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes (waiting on @AlexandreAmice)


solvers/solve.cc line 68 at r1 (raw file):

Previously, jwnimmer-tri (Jeremy Nimmer) wrote…

However, if I created only one solver and re-used it across all the threads, then they can safely use 16 threads.

I suppose they could use 16 threads in some sense, but I think only one at a time? As I understand it, each (e.g.) mosek seat buys you one concurrent call to (e.g.) MSK_optimizetrm. They would probably be better served by using a serial solve and giving MOSEK itself all 16 cores.

My initial thought is that the user should pass in a parallelism = N which does not exceed the number of seats they have purchased. Then, so long as each thread has at most 1 solver at a time, we know that the seats will never be exceeded.

I added a note to this effect in the documentation. I kept the cached map scheme in the code however.


solvers/solve.cc line 112 at r1 (raw file):

Previously, cohnt (Thomas Cohn) wrote…

Need to check if initial_guesses is a nullptr before trying to dereference it. How about

if (initial_guesses != nullptr && initial_guesses->at(i) != nullptr)

Done.


solvers/solve.cc line 119 at r1 (raw file):

Previously, cohnt (Thomas Cohn) wrote…

This should be a MathematicalProgramResult, and then passed into solvers.at(solver_id)->Solve(...). That function is expecting a pointer that it can store its data into -- it shouldn't be nullptr.

Done.


solvers/solve.cc line 146 at r1 (raw file):

Previously, cohnt (Thomas Cohn) wrote…

Need to check if initial_guesses is a nullptr before trying to dereference it. How about

if (initial_guesses != nullptr && initial_guesses->at(i) != nullptr)

Done.

@jwnimmer-tri jwnimmer-tri left a comment

After we come to an agreement on the solver instance caching (see below), my plan is to push a bunch of cleanups to the implementation for simplicity, conciseness, and reduced copying.

Reviewable status: 5 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes (waiting on @AlexandreAmice)


solvers/solve.cc line 68 at r1 (raw file):

Previously, AlexandreAmice (Alexandre Amice) wrote…

I added a note to this effect in the documentation. I kept the cached map scheme in the code however.

The solver caching is a plausible design, but as I said a handful of messages upthread, keeping the caching means we need to lay a lot more groundwork first -- write a regression test for each one of our solvers, checking that it is threadsafe (and having that pass under all of our CI flavors). Currently only SNOPT has such a test. I still think it would be easier for this first spin not to cache. The only possibly slow thing about create-destroy is the license check for two of the proprietary solvers, which users can (for now) work around by calling AcquireLicense themselves before doing anything in their app, in case it's a real problem. They would need to do that anyway, in case they are calling a higher-layer algorithm which itself calls SolveInParallel more than once.
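
For reference, the AcquireLicense workaround is roughly (sketch; assumes a MOSEK-enabled build, and the return type is elided via auto):

#include "drake/solvers/mosek_solver.h"

int main() {
  // Illustrative only: hold the license for the lifetime of the application,
  // so repeated solver create/destroy cycles (and repeated SolveInParallel
  // calls) do not re-acquire it each time.
  const auto mosek_license = drake::solvers::MosekSolver::AcquireLicense();
  // ... build programs and call SolveInParallel(...) as often as needed while
  //     `mosek_license` remains in scope ...
  return 0;
}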


solvers/solve.cc line 113 at r3 (raw file):

    }
    const SolverInterface* solver = solvers.at(thread_num).at(solver_id).get();
    DRAKE_THROW_UNLESS(solver->AreProgramAttributesSatisfied(*(progs.at(i))));

nit We don't need this check. The SolverBase already checks for this and throws, with a much nicer message.

@jwnimmer-tri jwnimmer-tri left a comment

Okay, with a closer look, this is ready for me to push/propose some changes. I'll try to get that by tomorrow.

Reviewable status: 4 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes (waiting on @AlexandreAmice)


solvers/solve.cc line 68 at r1 (raw file):

Previously, jwnimmer-tri (Jeremy Nimmer) wrote…

The solver caching is a plausible design, but as I said a handful of messages upthread, keeping the caching means we need to lay a lot more groundwork first -- write a regression test for each one of our solvers, checking that it is threadsafe (and having that pass under all of our CI flavors). Currently only SNOPT has such a test. I still think it would be easier for this first spin not to cache. The only possibly slow thing about create-destroy is the license check for two of the proprietary solvers, which users can (for now) work around by calling AcquireLicense themselves before doing anything in their app, in case it's a real problem. They would need to do that anyway, in case they are calling a higher-layer algorithm which itself calls SolveInParallel more than once.

Oops, I take it back. I misread the code -- it has a separate cache per thread. That's fine as-is.

AlexandreAmice and others added 6 commits October 4, 2024 11:41
* remove the pointless impl indirection
* remove non-GSG const in header
* don't pay the cost of broadcasting null options
fix bug where `parallelism` was not obeyed in the serial section
@jwnimmer-tri jwnimmer-tri left a comment

I pushed some changes. Please have a look.

Dismissed @cohnt from 3 discussions.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits), missing label for release notes (waiting on @AlexandreAmice)


bindings/pydrake/solvers/test/mathematicalprogram_test.py line 1530 at r4 (raw file):

                         mp.SolutionResult.kSolutionResultNotSet)

    def test_solve_in_parallel(self):

This is missing test coverage where an argument is non-None but its list contains a mixture of i'th-specific values and None.

I think it would be fine just to have the hard-coded initial_guesses and solver_ids and solver_options lists have 1 or 2 null elements each. We don't really need to vary our test cases to have different amounts of nulls, we just need coverage of at least one mixture.

@jwnimmer-tri jwnimmer-tri left a comment

Reviewable status: 2 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @AlexandreAmice)


tools/workspace/coinutils_internal/defs.bzl line 126 at r17 (raw file):

            # On Clang 12, "-w" doesn't suppress this for some reason.
            "-Wno-register",
            # The Coin family of software uses NDEBUG for gross things, not

Working

Needs #22063 to merge first. (This change is due to that PR, so should not be reviewed here.)

@jwnimmer-tri jwnimmer-tri left a comment

Reviewed 1 of 1 files at r18, all commit messages.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @AlexandreAmice)

@jwnimmer-tri

+@calderpg-tri for feature review, please.

@calderpg-tri calderpg-tri left a comment

Reviewed 1 of 16 files at r8, 1 of 4 files at r13, 10 of 13 files at r14, 1 of 2 files at r15, 1 of 5 files at r17, 1 of 1 files at r18, 1 of 2 files at r19, 1 of 2 files at r20, all commit messages.
Reviewable status: 5 unresolved discussions, LGTM missing from assignees jwnimmer-tri(platform),calderpg-tri, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @AlexandreAmice)


solvers/solve.h line 83 at r20 (raw file):

std::vector<MathematicalProgramResult> SolveInParallel(
    const std::vector<const MathematicalProgram*>& progs,
    const std::vector<const Eigen::VectorXd*>* initial_guesses,

If these are optional, wouldn't it make sense to default them to nullptr?


solvers/solve.cc line 80 at r20 (raw file):

      solvers(parallelism.num_threads());

  // The worker lambda behaves slightly differently depending on whether we are

Something you may want to do here to improve clarity would be to use a couple of lambdas so the call sites are clearer:

const auto solve_ith = [&](const bool in_parallel, const int thread_num, const int64_t i) {
  ...
};

const auto solve_ith_parallel = [&](const int thread_num, const int64_t i) {
  solve_ith(/* in parallel */ true, thread_num, i);
};

const auto solve_ith_serial = [&](const int64_t i) {
  solve_ith(/* in parallel */ false, /* thread_num */ 0, i);
};

solvers/solve.cc line 107 at r20 (raw file):

    }

    // Find (or create) the specified solver.

btw In addition to the {Dynamic,Static}ParallelForIndexLoop, CRU also provides {Dynamic,Static}ParallelForRangeLoop which (particularly in the static case) would allow you to reduce the frequency of some lookups to once per range rather than once per index. If you went that route, you would probably end up with a structure like

const auto solve_range = [&](const ThreadWorkRange& work_range) {
  // Get solvers for thread.

  // Do any common SolverOptions config.

  for (int64_t i = work_range.GetRangeStart(); i < work_range.GetRangeEnd(); ++i) {
    if (not_safe_in_parallel(i)) {
      continue;
    }

    // Do per-index SolverOptions config.

    // Solve.
  }
};

solvers/solve.cc line 149 at r20 (raw file):

  // Call solve_ith in parallel for all of the progs.
  if (parallelism.num_threads() > 1) {
    (dynamic_schedule ? &(DynamicParallelForIndexLoop<decltype(solve_ith)>)

I think it would be clearer to write this more simply like

if (dynamic_schedule) {
  DynamicParallelForIndexLoop(...);
} else {
  StaticParallelForIndexLoop(...);
}

For only two cases, the duplication isn't a problem.


solvers/solve.cc line 162 at r20 (raw file):

  // Finish solving the programs that couldn't be solved in parallel.
  operating_in_parallel = false;
  for (int i = 0; i < ssize(progs); ++i) {

It's a little odd to run the index through three different types (int, size_t, int64_t). This could instead be

for (size_t i = 0; i < progs.size(); ++i) {
  if (!result_is_populated[i]) {
    solve_ith(/* thread_num */ 0, static_cast<int64_t>(i));
  }
}

@jwnimmer-tri jwnimmer-tri left a comment

Reviewed 2 of 2 files at r19, 1 of 2 files at r20, all commit messages.
Reviewable status: 5 unresolved discussions, LGTM missing from assignees jwnimmer-tri(platform),calderpg-tri, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @AlexandreAmice)


solvers/solve.h line 83 at r20 (raw file):

Previously, calderpg-tri wrote…

If these are optional, wouldn't it make sense to default them to nullptr?

That would lead to overload ambiguity with the next function, in case the user only passed progs.

@calderpg-tri calderpg-tri left a comment

Reviewed 1 of 5 files at r17.
Reviewable status: 6 unresolved discussions, LGTM missing from assignees jwnimmer-tri(platform),calderpg-tri, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @AlexandreAmice)


bindings/pydrake/solvers/solvers_py_mathematicalprogram.cc line 1687 at r20 (raw file):

      .def("GetProgramType", &solvers::GetProgramType, doc.GetProgramType.doc)
      .def(
          "SolveInParallel",

I believe both of these bindings should get py::call_guard<py::gil_scoped_release>() to avoid potential deadlocks
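
For reference, a minimal pybind11 sketch of what that guard does (illustrative only; not Drake's actual binding code):

#include <pybind11/pybind11.h>

namespace py = pybind11;

PYBIND11_MODULE(gil_release_example, m) {
  m.def(
      "long_running_call",
      [](int n) {
        // Long-running C++ work (e.g., a parallel solve) would go here.
        // Because of the call_guard below, the GIL is released for the
        // duration, so other Python threads keep running and cannot deadlock
        // against a GIL held across the C++ call.
        return n * 2;
      },
      py::call_guard<py::gil_scoped_release>());
}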

@calderpg-tri calderpg-tri left a comment

Reviewable status: 5 unresolved discussions, LGTM missing from assignees jwnimmer-tri(platform),calderpg-tri, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @AlexandreAmice)


solvers/solve.h line 83 at r20 (raw file):

Previously, jwnimmer-tri (Jeremy Nimmer) wrote…

That would lead to overload ambiguity with the next function, in case the user only passed progs.

Ah, true.

@AlexandreAmice AlexandreAmice left a comment

Reviewable status: 5 unresolved discussions, LGTM missing from assignees jwnimmer-tri(platform),calderpg-tri, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits)


solvers/solve.cc line 80 at r20 (raw file):

Previously, calderpg-tri wrote…

Something you may want to do here to improve clarity would be to use a couple of lambdas so the call sites are clearer:

const auto solve_ith = [&](const bool in_parallel, const int thread_num, const int64_t i) {
  ...
};

const auto solve_ith_parallel = [&](const int thread_num, const int64_t i) {
  solve_ith(/* in parallel */ true, thread_num, i);
};

const auto solve_ith_serial = [&](const int64_t i) {
  solve_ith(/* in parallel */ false, /* thread_num */ 0, i);
};

Done.


solvers/solve.cc line 107 at r20 (raw file):

Previously, calderpg-tri wrote…

btw In addition to the {Dynamic,Static}ParallelForIndexLoop, CRU also provides {Dynamic,Static}ParallelForRangeLoop which (particularly in the static case) would allow you to reduce the frequency of some lookups to once per range rather than once per index. If you went that route, you would probably end up with a structure like

const auto solve_range = [&](const ThreadWorkRange& work_range) {
  // Get solvers for thread.

  // Do any common SolverOptions config.

  for (int64_t i = work_range.GetRangeStart(); i < work_range.GetRangeEnd(); ++i) {
    if (not_safe_in_parallel(i)) {
      continue;
    }

    // Do per-index SolverOptions config.

    // Solve.
  }
};

I think I am okay with the for-index loop. Unless you think there is a substantial impact on performance, the current code reads fairly well.


solvers/solve.cc line 149 at r20 (raw file):

Previously, calderpg-tri wrote…

I think it would be clearer to write this more simply like

if (dynamic_schedule) {
  DynamicParallelForIndexLoop(...);
} else {
  StaticParallelForIndexLoop(...);
}

For only two cases, the duplication isn't a problem.

Done.


solvers/solve.cc line 162 at r20 (raw file):

Previously, calderpg-tri wrote…

It's a little odd to run the index through three different types (int, size_t, int64_t). This could instead be

for (size_t i = 0; i < progs.size(); ++i) {
  if (!result_is_populated[i]) {
    solve_ith(/* thread_num */ 0, static_cast<int64_t>(i));
  }
}

Done.


bindings/pydrake/solvers/solvers_py_mathematicalprogram.cc line 1687 at r20 (raw file):

Previously, calderpg-tri wrote…

I believe both of these bindings should get py::call_guard<py::gil_scoped_release>() to avoid potential deadlocks

Done.

@jwnimmer-tri jwnimmer-tri left a comment

+@SeanCurtis-TRI for platform review per schedule, please.

:lgtm: from me as co-author.

Reviewed 2 of 2 files at r21, all commit messages.
Reviewable status: 5 unresolved discussions, LGTM missing from assignees calderpg-tri,SeanCurtis-TRI(platform), commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @AlexandreAmice)


bindings/pydrake/solvers/solvers_py_mathematicalprogram.cc line 1687 at r20 (raw file):

Previously, AlexandreAmice (Alexandre Amice) wrote…

Done.

Yes, very important! Good catch.

@calderpg-tri calderpg-tri left a comment

Reviewed 1 of 2 files at r6, 1 of 2 files at r19, 1 of 2 files at r20, 2 of 2 files at r21, all commit messages.
Reviewable status: 1 unresolved discussion, LGTM missing from assignees SeanCurtis-TRI(platform),calderpg-tri, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @AlexandreAmice)


solvers/test/solve_in_parallel_test.cc line 239 at r21 (raw file):

with Drake CI's sanitizers and memory checkers, this would flag memory or thread
errors seen during the parallel solves. */
class SolveInParallelIntegrationTest : public ::testing::Test {

As far as I can tell, these tests don't actually cover the case where Parallelism::None() is used, or cover a case where some programs can be solved in parallel and some in serial. You probably don't need a whole test fixture to test all solvers in serial, but a couple cases are necessary.
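
For example, one small extra case along these lines would exercise the serial path (sketch only; the argument order follows the vector overload discussed above):

GTEST_TEST(SolveInParallelTest, NoParallelismSketch) {
  // Illustrative only: ten copies of a trivial LP, solved with no parallelism.
  MathematicalProgram prog;
  const auto x = prog.NewContinuousVariables<1>();
  prog.AddLinearCost(x[0]);
  prog.AddBoundingBoxConstraint(0, 1, x[0]);
  const std::vector<const MathematicalProgram*> progs(10, &prog);
  const std::vector<MathematicalProgramResult> results = SolveInParallel(
      progs, /* initial_guesses = */ nullptr, /* solver_options = */ nullptr,
      /* solver_ids = */ nullptr, Parallelism::None(),
      /* dynamic_schedule = */ false);
  for (const MathematicalProgramResult& result : results) {
    EXPECT_TRUE(result.is_success());
  }
}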

@AlexandreAmice AlexandreAmice left a comment

Reviewable status: 1 unresolved discussion, LGTM missing from assignees SeanCurtis-TRI(platform),calderpg-tri, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits)


solvers/test/solve_in_parallel_test.cc line 239 at r21 (raw file):

Previously, calderpg-tri wrote…

As far as I can tell, these tests don't actually cover the case where Parallelism::None() is used, or cover a case where some programs can be solved in parallel and some in serial. You probably don't need a whole test fixture to test all solvers in serial, but a couple cases are necessary.

Done. Good catch.

In the test of diverse programs, I cover the case where some programs can be solved in parallel and others must be solved serially.

I changed the test fixture to test both Parallelism::Max() and Parallelism::None().

@calderpg-tri calderpg-tri left a comment

:lgtm:

Reviewed 1 of 1 files at r22, all commit messages.
Reviewable status: LGTM missing from assignee SeanCurtis-TRI(platform), commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @AlexandreAmice)

@SeanCurtis-TRI SeanCurtis-TRI left a comment

:LGTM: With some small notes

Reviewed 1 of 2 files at r6, 1 of 16 files at r8, 1 of 4 files at r13, 10 of 13 files at r14, 1 of 2 files at r15, 1 of 5 files at r17, 1 of 1 files at r18, 1 of 2 files at r19, 1 of 2 files at r20, 2 of 2 files at r21, 1 of 1 files at r22, all commit messages.
Reviewable status: 9 unresolved discussions, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @AlexandreAmice)


solvers/solve.h line 51 at r22 (raw file):

 * solver_options[i] if given, by invoking the solver at solver_ids[i] if
 * provided.  If solver_ids[i] is nullopt then the best available solver is
 * selected for each progs[i] individually depending on the availability of

nit editorial

The spelling solver_ids[i] implies we're talking about a single choice which corresponds to a single progs[i].

Suggestion:

selected for progs[i] 

solvers/solve.h line 78 at r22 (raw file):
BTW Subtle rhetorical thought.

Would you consider solver_ids[i] = nullptr (or solver_ids = nullptr) as having "specified" solver_ids[i]? Alternative language:

if any program cannot be solved.

Also, the language of using the subscript i seems to suggest we're missing the phrase, "for any valid i" (or some such thing). In other words, if one program has a problem, an exception messes everything up.

Code quote:

specified

solvers/test/solve_in_parallel_test.cc line 42 at r22 (raw file):

  std::vector<std::unique_ptr<MathematicalProgram>> progs;
  std::vector<const MathematicalProgram*> prog_ptrs;
  std::vector<const Eigen::VectorXd*> initial_guesses;

BTW Given the API that allows you to specify one solver id and one solver options, why not use that API in this test?


solvers/test/solve_in_parallel_test.cc line 59 at r22 (raw file):

  symbolic::Variable x("x", symbolic::Variable::Type::BINARY);
  symbolic::Variable y("y");
  symbolic::Variable z("y");

nit: Two variables with the same symbol, "y"? Surely a typo, yes?

Code quote:

("y"

solvers/test/solve_in_parallel_test.cc line 63 at r22 (raw file):

  vars << x, y, z;

  int num_trials = 10;

nit: const (here and below).


solvers/test/solve_in_parallel_test.cc line 65 at r22 (raw file):

  int num_trials = 10;
  for (int i = 0; i < num_trials; i++) {
    progs.push_back(std::make_unique<MathematicalProgram>());

BTW As I read this, every program is identical. You're just creating the same program over and over. In a test below, you simply copy a single program pointer N times. Why not do that here? Why create unique instances?


solvers/test/solve_in_parallel_test.cc line 101 at r22 (raw file):

  std::vector<const SolverOptions*> solver_options;
  std::vector<std::optional<SolverId>> solver_ids;
  int num_trials = 10;

nit: const


solvers/test/solve_in_parallel_test.cc line 202 at r22 (raw file):

  // if certain costs/constraints are flagged for different thread safety in the
  // future, this test will fail and need to be updated.
  EXPECT_TRUE(convex_prog.IsThreadSafe());

nit: This should probably be ASSERT_TRUE? Does it make sense to proceed if any of these flags fail?


solvers/test/solve_in_parallel_test.cc line 209 at r22 (raw file):

  std::vector<const MathematicalProgram*> progs;
  std::vector<std::optional<SolverId>> solver_ids;
  int num_trials = 10;

nit: const

@AlexandreAmice AlexandreAmice left a comment

Reviewable status: 1 unresolved discussion, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits)


solvers/test/solve_in_parallel_test.cc line 42 at r22 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

BTW Given the API that allows you to specify one solver id and one solver options, why not use that API in this test?

Good call.

@AlexandreAmice AlexandreAmice added the "status: squashing now" label Oct 24, 2024
@AlexandreAmice AlexandreAmice left a comment

+(status: squashing now)

Waiting for CI if everyone is satisfied. Thank you everyone for your help!

Reviewable status: :shipit: complete! all discussions resolved, LGTM from assignees jwnimmer-tri(platform),SeanCurtis-TRI(platform),calderpg-tri

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Reviewed 2 of 2 files at r23, all commit messages.
Reviewable status: :shipit: complete! all discussions resolved, LGTM from assignees jwnimmer-tri(platform),SeanCurtis-TRI(platform),calderpg-tri


solvers/solve.h line 51 at r22 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

nit editorial

The spelling solver_ids[i] implies we're talking about a single choice which corresponds to a single progs[i].

You have to delete the word "each" as well.

Labels: release notes: feature, status: squashing now