
Allowing extra (constant) arguments to differentiated function #28

Closed
papamarkou opened this issue Dec 12, 2015 · 17 comments

Comments

@papamarkou

@fredo-dedup I started familiarizing myself with your package, and it is really great. I'd like to suggest a feature. For now, ReverseDiffSource works well with one argument:

using ReverseDiffSource

f(x::Vector) = x[1]^3+x[2]^2

g = rdiff(f, ([1., 2.],))

g([2., 2.])

This is great, but for Lora there is one more important use case. We may want to define a log-target f with more than one argument. For example, we may want to define

f(x::Vector, y) = x[1]^3+x[2]^2+y

Differentiation would happen with respect to the first input argument x of f only. Subsequent arguments to f, after the first one, would be ignored as far as autodiff goes. This way one may pass additional data to f, which is useful in the case of hyperparameters, data, and Gibbs sampling.

Would it be possible to consider extending ReverseDiffSource this way? If so, it would be nice to make this functionality available for both expression and function input to rdiff().

@papamarkou
Author

P.S. Trying to generalize, there are two possible cases:

f(x::Vector, y) = x[1]^3+x[2]^2+y
f(x::Vector) = x[1]^3+x[2]^2+y

The second case involves closures. Both cases share one aspect: passing arguments (such as y) that should be ignored during differentiation.
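To make the two cases concrete, here is a minimal, self-contained sketch; the function names and the value of y are assumed purely for illustration:

```julia
# Case 1: y passed explicitly as a second (constant) argument.
f_explicit(x::Vector, y) = x[1]^3 + x[2]^2 + y

# Case 2: y captured from the enclosing scope by a closure.
y = 1.5
f_closure(x::Vector) = x[1]^3 + x[2]^2 + y

# Either way, y should be treated as constant during differentiation:
# the two functions agree whenever the explicit y matches the captured one.
f_explicit([1., 2.], 1.5) == f_closure([1., 2.])  # true
```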

@papamarkou
Author

One more thing: what I am ultimately trying to do is to use a functor, as in

immutable LogPrior
  f::Function
  y::Vector
end

call(lp::LogPrior, x::Vector) = lp.f(x, lp.y)

function f(x, y)
  x+y
end
y = [1., 2.5]
logprior = LogPrior(f, y)

logprior([2., 5.])

But then the following fails:

julia> using ReverseDiffSource

julia> rdiff(logprior, ([1., 2.],))
ERROR: MethodError: `rdiff` has no method matching rdiff(::LogPrior, ::Tuple{Array{Float64,1}})
Closest candidates are:
  rdiff(::Function, ::Tuple)
  rdiff(::Any)

@fredo-dedup
Contributor

Good improvement ideas. Noted.

@papamarkou
Author

Thanks a lot @fredo-dedup. It will become a bit clearer how this can be interfaced with Lora in the next week or so, as I am working on slightly changing the function signatures in parameter methods to take a single argument using closures; we would still need to ignore some parameters when differentiating. I will be able to be more concrete with examples in a few days, once I make the needed changes.

@papamarkou
Author

I now know what is needed for Lora; a simple example below:

using ReverseDiffSource

rdiff( :(p^3+y) , p=2.)

We want to be able to tell ReverseDiffSource to differentiate with respect to p and to ignore y. In this example, p could be a parameter, whereas y could be a data-related variable whose gradient is not needed. Apologies for the several issues, @fredo-dedup; I only referenced and elaborated on them to facilitate their tracking. Please feel free to sort them out in your own time and at your own pace.

@papamarkou
Author

@fredo-dedup, to better clarify what I meant, you may have a look at the example in Lora.jl/doc/examples/swiss/MALA.jl. The likelihood there takes two input arguments:

function ploglikelihood(p::Vector{Float64}, v::Vector)
  Xp = v[2]*p
  dot(Xp, v[3])-sum(log(1+exp(Xp)))
end

So we want to differentiate with respect to p and ignore the second input argument v (a cell array). I added this comment here so that it helps explain what we want to achieve, for future reference.
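For reference, a self-contained version of the likelihood with made-up toy data; broadcast dots are added so it also runs on current Julia, and the design matrix and response below are assumed purely for illustration:

```julia
using LinearAlgebra  # for dot

# Same shape as ploglikelihood above; log./exp. broadcast over the vector Xp.
function ploglikelihood(p::Vector{Float64}, v::Vector)
  Xp = v[2] * p                              # linear predictor: design matrix times parameters
  dot(Xp, v[3]) - sum(log.(1 .+ exp.(Xp)))   # logistic log-likelihood (up to a constant)
end

p = ones(3)                     # parameter vector: differentiated
X = [1.0 0.0 1.0; 0.0 1.0 1.0]  # toy design matrix (assumed)
y = [1.0, 0.0]                  # toy binary response (assumed)
v = Any[length(y), X, y]        # data bundle: constant w.r.t. differentiation
ploglikelihood(p, v)
```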

@papamarkou
Author

Hi @fredo-dedup, I wanted to ask whether this issue is part of your to-do list or your near-future plans. If you have a lot of work and can't get on top of it any time soon, may I suggest we make an effort to submit an idea for GSoC, so that a student can push forward this issue as well as ReverseDiffSource? If you prefer to work on it in your own time, that is of course fine. cc @mlubin

@fredo-dedup
Contributor

I confirm it is still on my todo list.

It doesn't seem big enough for a GSoC to me, though, but I don't know much about the kinds of projects that are typical.

While we are at it, there are two ways I can think of to indicate that we do not want to differentiate with respect to one or several of the variables: 1) add an option to rdiff to indicate those variables, or 2) require the user to build a closure setting the values of the variables and ask rdiff to find the values in the function environment (see the example given in issue #24). (Reminder: rdiff needs sample values for the variables to find the correct derivation rules and perform simplifications.)

Solution 1) seems more generic and requires less effort from the user. Is this what you had in mind as well? Having 2) would be nice too; there could be use cases for it. But it seems like a convoluted way to solve this issue.

@papamarkou
Author

Yes, @fredo-dedup, what you say makes sense; this specific issue seems too small to fill a whole GSoC project, especially if it is on your long-run to-do list. Besides, it looks like you have developed your package well, so perhaps thinking of a GSoC project is a stretch.

I also agree that solution 2 seems a bit more convoluted; solution 1 seems more transparent to me as a user. If solution 1 is easy to implement (and given that we both seem to find it the cleaner solution, if I'm not mistaken), it sounds like the best way to proceed.

@fredo-dedup
Contributor

I have updated the devl branch with a version of rdiff that takes an ignore keyword argument specifying a list of variables to exclude from differentiation:

function ploglikelihood(p::Vector{Float64}, v::Vector)
  Xp = v[2]*p
  dot(Xp, v[3])-sum(log(1+exp(Xp)))
end

args = (ones(3), Any[1., 2., [3., 2., -2.]])

dplog = rdiff(ploglikelihood, args, ignore=[:v])
dplog(ones(3), Any[1., 2., [3., 2., -2.]])

I'll push it to master and tag it if it solves your issue @Scidom.

@papamarkou
Author

Thanks, @fredo-dedup! Yes, this would allow me to tackle all use cases and completely integrate your reverse autodiff with Lora! It would be great to push it to master and tag it.

One quick question from a user's perspective: do the args values not matter? Is it only the types of the values in args that matter?

@fredo-dedup
Contributor

Unfortunately, the values are also required, because I need the types of all the downstream variables, and to get them I evaluate the whole AST.
It might be doable to use the type inference that Julia already has, but I haven't explored that possibility; I fear inference would not work well enough in too many practical cases for ReverseDiffSource to depend on it completely.
Since the user needs a starting point in the case of MCMC, I reasoned that there would always be values available. But it may be burdensome in other contexts.

@papamarkou
Author

Sure, @fredo-dedup, I only asked to enhance my understanding and to make sure I don't make any mistakes when I interface ReverseDiffSource with Lora; I didn't mean to imply that a further change is needed in ReverseDiffSource (what you did above is great and solves the issue I had initially).

So, to be clear, in practice I need to ensure that args in your example is set to the initial values of the MCMC simulation; is that what you mean?

@fredo-dedup
Contributor

OK. To directly answer your question: yes, values are required, but no, their actual values do not matter, as long as the types are correct and the function/expression is evaluable at those values (i.e. they lie in the support of the function). The initial condition of the MCMC will, by necessity, satisfy those constraints, but other values can of course be used.

@papamarkou
Author

Thanks, @fredo-dedup, it makes sense now. Then the way you have coded this initialization is rather flexible and easy to work with when calling ReverseDiffSource. Looking forward to the tagging!

@fredo-dedup
Contributor

Solved in v0.2.2 (registration pending)

@papamarkou
Author

Thanks @fredo-dedup! I am very excited about this update to ReverseDiffSource; it means it will be possible to reverse-diff non-trivial examples in Lora and to run MCMC with higher-order derivatives computed automatically (e.g. SMMALA). Appreciated!
