
Interface only for new methods

Nikolay Mayorov edited this page Jul 10, 2015 · 7 revisions

If we abandon the idea of wrapping leastsq, then we can expose all necessary parameters and describe them in a unified manner in a single docstring.

The suggested interface:

least_squares(fun, x0, jac='2-point', bounds=(-np.inf, np.inf),
              method='trf', ftol=EPS**0.5, xtol=EPS**0.5, gtol=EPS**0.5,
              max_nfev=None, scaling=1.0, diff_step=None, tr_solver=None,
              jac_sparsity=None, lsmr_options={}, args=(), kwargs={}, options=None):

New parameters here:

  1. tr_solver : {None, 'exact', 'lsmr'}. Determines the trust-region subproblem solver. If None, it is set to 'exact' for a dense Jacobian and 'lsmr' for a sparse one (determined after the first call). On the name 'lsmr': for both methods it means that the Gauss-Newton direction is sought by the lsmr procedure. 'dogbox' then operates in the usual manner, while 'trf' uses a 2-d subspace approach to solve the trust-region problem. Overall I think 'lsmr' is a good universal name for both methods.

  2. jac_sparsity : array_like or sparse matrix. Determines the sparsity structure of the Jacobian for faster finite differencing (if jac is not provided). If None, dense differencing will be used. If set, it forces tr_solver to 'lsmr' when tr_solver was originally None.

Q: Can one use a LinearOperator for the Jacobian instead of this matrix of indicators? We will need to carefully work out the possible combinations of parameters (exact solver + sparse Jacobian, etc.).

A: Yes, I considered that possibility. It's not difficult. Suggested logic for the method selection:

  1. If tr_solver is set explicitly, the algorithm follows the user's choice. But if tr_solver='exact' and jac returns a sparse matrix, it is converted to a dense array and a warning is raised; if jac returns a LinearOperator, an error is raised.
  2. If tr_solver is None and jac returns a dense array on the first iteration, the 'exact' solver is chosen. If jac returns a sparse matrix or a LinearOperator, the 'lsmr' solver is chosen.
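
The two rules above can be sketched as a small function. This is only an illustration of the proposed logic, not the actual implementation: the name select_tr_solver and the jac_kind labels ('dense', 'sparse', 'operator') are hypothetical stand-ins for inspecting what jac returned on the first call.

```python
import warnings


def select_tr_solver(tr_solver, jac_kind):
    """Return (solver, densify) following the selection rules above.

    jac_kind is a hypothetical label for what `jac` returned on the first
    call: 'dense', 'sparse' or 'operator' (a LinearOperator).
    densify is True when a sparse Jacobian must be converted to dense.
    """
    if jac_kind not in ('dense', 'sparse', 'operator'):
        raise ValueError("unknown jac_kind: %r" % (jac_kind,))
    if tr_solver not in (None, 'exact', 'lsmr'):
        raise ValueError("unknown tr_solver: %r" % (tr_solver,))

    if tr_solver is None:
        # Rule 2: decide from the first Jacobian.
        return ('exact' if jac_kind == 'dense' else 'lsmr'), False

    # Rule 1: follow the user's choice, with two special cases for 'exact'.
    if tr_solver == 'exact':
        if jac_kind == 'operator':
            raise ValueError(
                "tr_solver='exact' cannot be used with a LinearOperator")
        if jac_kind == 'sparse':
            warnings.warn("tr_solver='exact': sparse Jacobian is converted "
                          "to a dense array")
            return 'exact', True
    return tr_solver, False
```

For example, select_tr_solver(None, 'sparse') picks 'lsmr', while select_tr_solver('exact', 'sparse') keeps 'exact' but warns and asks for a dense conversion.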

EB: Maybe keep it as simple as possible, and get rid of tr_solver=None. Also jac_sparsity: can one just require a LinearOperator for jac and avoid jac_sparsity altogether?

NM: Yes, we can drop None if you think it's better. About jac_sparsity, I think there is some misunderstanding here. It is an enhancement (crucial for sparse large-scale problems) for finite differencing (when jac is '2-point' or '3-point'), so your question should be: can we disable '2-point' and '3-point'? We can, but then the burden of doing the computations will fall on the user, and that's no good.
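
To illustrate why a sparsity structure helps finite differencing, here is a minimal sketch of one common idea: columns of the Jacobian that never share a nonzero row can be perturbed together, so one extra function evaluation recovers several columns at once. The discussion here does not say which scheme is intended, so the greedy grouping and the name group_columns are assumptions of this example.

```python
def group_columns(sparsity):
    """Greedily group columns so that no two columns in a group share a row.

    `sparsity` is a list of rows of 0/1 flags; a nonzero entry [i][j] means
    residual i may depend on variable j.
    """
    n_rows = len(sparsity)
    n_cols = len(sparsity[0]) if n_rows else 0
    groups = []     # groups[g] = column indices perturbed together
    used_rows = []  # used_rows[g] = rows already touched by group g
    for j in range(n_cols):
        rows_j = {i for i in range(n_rows) if sparsity[i][j]}
        for g, rows_g in enumerate(used_rows):
            if not (rows_j & rows_g):
                # Column j does not overlap with group g: add it there.
                groups[g].append(j)
                rows_g |= rows_j
                break
        else:
            # No compatible group found: start a new one.
            groups.append([j])
            used_rows.append(set(rows_j))
    return groups


# Tridiagonal 5x5 structure: residual i depends on variables i-1, i, i+1.
n = 5
tridiag = [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]
groups = group_columns(tridiag)
print(groups)  # [[0, 3], [1, 4], [2]]
```

With '2-point' differencing, the dense approach would need 5 perturbed evaluations here, while the grouped approach needs 3, and for a tridiagonal structure that count stays at 3 as n grows.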

  3. tr_options : dict. Keyword parameters passed to lsmr. Some lsmr options usually shouldn't be changed, but I think it's fine to give full access. It would be reasonable to replace this parameter with lsmr_tol (float), which affects atol and btol, but changing maxiter (and even damp) can also be useful. I prefer an lsmr_options dict.

EB: suggest renaming to tr_options or similar. The usage is similar to **options.

NM: lsmr_options is more explicit; otherwise it's not clear what to put in this dict. And the connection is still there: tr_solver='lsmr', lsmr_options={'btol': 1e-10}.

EB: and then three years down the line somebody adds a new large-scale method which is alternative to LSMR...

NM: Good point. I think you are right: we can explain which options are expected for each tr_solver (and give a link to lsmr). But please think about it some more.
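
One way to picture a single options dict whose accepted keys depend on tr_solver is a lookup table, so that a future solver only needs a new entry. This is a sketch under assumptions: ALLOWED_TR_OPTIONS and check_tr_options are hypothetical names, the 'lsmr' keys are keyword arguments of scipy.sparse.linalg.lsmr, and the empty set for 'exact' is a guess made for this example.

```python
# Hypothetical table: keys each tr_solver would accept in tr_options.
# The 'lsmr' names are keyword arguments of scipy.sparse.linalg.lsmr;
# the empty set for 'exact' is an assumption of this sketch.
ALLOWED_TR_OPTIONS = {
    'exact': set(),
    'lsmr': {'damp', 'atol', 'btol', 'conlim', 'maxiter', 'show'},
}


def check_tr_options(tr_solver, tr_options):
    """Return a copy of tr_options, or raise if a key does not fit tr_solver."""
    if tr_solver not in ALLOWED_TR_OPTIONS:
        raise ValueError("unknown tr_solver: %r" % (tr_solver,))
    unknown = set(tr_options) - ALLOWED_TR_OPTIONS[tr_solver]
    if unknown:
        raise ValueError("options %s are not valid for tr_solver=%r"
                         % (sorted(unknown), tr_solver))
    return dict(tr_options)


# The example from the discussion, with the generic name.
lsmr_kwargs = check_tr_options('lsmr', {'btol': 1e-10})
```

The docstring would then list the accepted keys per solver, which keeps the parameter name stable even if an alternative to LSMR is added later.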

Note that all parameters listed are relevant for both 'trf' and 'dogbox'.

EB: reinstating the method options dict, **options, it seems we can have both large-scale methods and LM.

NM: Yes, I think it's fine to keep it.
