Multidimensional learning #89
Conversation
@tcstewar You removed your assignment, but the PR is still marked as WIP. Is it ready for review? Or collaboration?
Another thing TODO:
Force-pushed from 52b9c46 to 0811e2a.
I'll rebase it to master now that the manual decoders are merged in, then take a pass through the #22 comments, and then it should be ready for review. :)
Force-pushed from bcb2a3d to c1b46f0.
The last commit adds a test with multiple learning connections at once. It seems to mostly work, but it doesn't converge to the target values and I'm not sure why. If someone else could take a look at it before the workshop, that'd be great!
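For concreteness, here is a minimal sketch of the kind of network such a test exercises: two PES-learned connections running at once, each with its own error population. This uses the standard Nengo API; the ensemble sizes, learning rates, and input signal are illustrative, not taken from the actual test in this PR.

```python
import numpy as np
import nengo

with nengo.Network(seed=0) as net:
    stim = nengo.Node(lambda t: np.sin(2 * np.pi * t))
    pre = nengo.Ensemble(100, dimensions=1)
    post_a = nengo.Ensemble(100, dimensions=1)
    post_b = nengo.Ensemble(100, dimensions=1)
    nengo.Connection(stim, pre)

    # Two learned connections running at once, both starting from a function of zero
    conn_a = nengo.Connection(pre, post_a, function=lambda x: 0,
                              learning_rule_type=nengo.PES(learning_rate=1e-4))
    conn_b = nengo.Connection(pre, post_b, function=lambda x: 0,
                              learning_rule_type=nengo.PES(learning_rate=1e-4))

    # Each connection gets its own error signal (error = actual - target)
    for post, conn in [(post_a, conn_a), (post_b, conn_b)]:
        error = nengo.Ensemble(100, dimensions=1)
        nengo.Connection(post, error)
        nengo.Connection(stim, error, transform=-1)
        nengo.Connection(error, conn.learning_rule)

    probe_a = nengo.Probe(post_a, synapse=0.02)
    probe_b = nengo.Probe(post_b, synapse=0.02)
```

Running the same network on the emulator or hardware backend would just swap in that backend's Simulator in place of reference Nengo's.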
So, adding some plotting to that test, I get the following in the emulator (test_learning.test_multiple_pes_emu.pdf) and this on hardware (test_learning.test_multiple_pes.pdf).

One difference is that they change at different rates. That is likely down to a mismatch in learning rates that we probably need to do a better job of matching. It may also be partly due to the fact that error values are clipped to the [-1, 1] range.

The more salient difference is that after a steady, predictable climb toward the target value, the line starts going haywire. This becomes more apparent when you run the network for longer, or with a higher learning rate. The fact that the lines all start moving in a predictable manner tells me that it is not likely a problem with the PES implementation (i.e., the delta is being calculated correctly and is being applied to the right synapses, etc.). I think the problem is most likely some value over- or underflowing its discretized range; this has happened before (and is why we introduced the checks discussed in #88). I do not get any warnings raised in the emulator, so the over/underflowed quantity is not U or V; most likely it's the weights themselves. In the emulator, we currently use [...].

In the long term, we should definitely think about ways in which we can mitigate this problem. I'll make an issue to start thinking about that. In the short term, I'm thinking that if the initial function that the learned connection represents results in larger decoders, then we should do a much better job of discretizing. I'll try doing that now.
Modifying the function (actually in this case just removing the [...]) gives the following.

Emu: [plot attached]

Hardware: [plot attached]

Also note that they both converge significantly faster. I suspect this might happen on reference Nengo too, because the weights might have less distance to travel since they don't start near zero. But it may also be the case that the range discretization results in the same weight updates having a larger effect than they would have in reference Nengo, because weights are being pushed farther as a result of the same delta.

In any case, I'll add some asserts and push the now-working test. I'll also make a separate issue to figure out the best way to inform users about this issue.
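To make the weight-scaling argument concrete, here is a rough numeric illustration. This is not the actual nengo-loihi discretization code; the bit width and weight values are made up. The point is that if the discretization range is derived from near-zero initial weights, the values that learning later pushes the weights toward saturate the representable range.

```python
import numpy as np

def discretize(weights, w_max, bits=8):
    """Map weights onto signed integers, with [-w_max, w_max] spanning the range."""
    scale = (2 ** (bits - 1) - 1) / w_max
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return np.clip(np.round(weights * scale), lo, hi).astype(int)

initial = np.array([0.001, -0.002])  # near-zero decoders from an initial function of 0
learned = np.array([0.08, -0.05])    # roughly where learning needs the weights to end up

# Range taken from the initial weights: the learned values saturate (over/underflow).
print(discretize(learned, w_max=np.abs(initial).max()))  # [ 127 -128]

# Range taken from realistically sized initial weights: the learned values fit.
print(discretize(learned, w_max=np.abs(learned).max()))  # [127 -79]
```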
Force-pushed from 3f7bbd4 to 6eab5e3.
Awesome! I completely didn't think of that weight scaling issue... it makes a lot of sense in hindsight, but I was not thinking in that direction at all. Very good thing to know for the learning tutorials too. :) The commit looks good to me. :) Thank you!
It would be nice if we could replicate that chip learning behaviour in the emulator, before we got rid of [...].
Yeah, I'll make an issue.
Pushed a commit with mostly style fixes. With that, LGTM!
One thing that I tried was to switch the initial function in test_pes_comm_channel from returning 0 to returning -x, but when I did that, all the parametrizations failed with U overflow errors. So, I think the initial function is a super critical thing to play around with in learning networks.
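One quick way to see why that switch matters (illustrative only, not code from this PR or the test) is to compare the magnitude of the initial decoders that the two functions produce, since that magnitude is what sets the range the weights get discretized into.

```python
import numpy as np
import nengo

with nengo.Network(seed=0) as net:
    pre = nengo.Ensemble(100, dimensions=1)
    post = nengo.Ensemble(100, dimensions=1)
    conn_zero = nengo.Connection(pre, post, function=lambda x: 0)
    conn_negx = nengo.Connection(pre, post, function=lambda x: -x)

with nengo.Simulator(net) as sim:
    w_zero = sim.data[conn_zero].weights  # decoders solved for the zero function
    w_negx = sim.data[conn_negx].weights  # decoders solved for -x

# The -x decoders are orders of magnitude larger than the near-zero decoders
# for the zero function, so the learned weights have to travel much farther.
print(np.abs(w_zero).max(), np.abs(w_negx).max())
```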
I'll make a few issues out of the things discovered in this PR, then squash all the commits down to one and merge. Will probably take a little while, so if anyone has objections feel free to raise them in the next hour or so!
The actual communication protocol is not as efficient as it could be, but this works properly. The choice of the function computed across the learned connection prior to learning turns out to have a huge effect on the behavior of the network. If the weights are initially much smaller than they will be post-learning, the weights can easily over/underflow. Improving the weight discretization, however, is left for future work. This commit also removes some writes that used to be needed to close the snip successfully, but they appear to be unnecessary now.
Force-pushed from 3c5b7e5 to 19746b1.
This PR fixes multidimensional learning. It is based off of move-manual-decoders-onchip, but only because that branch had a nice test that fit this PR. All this does is fix the snip code and the communication code so that all the error data gets sent across and applied. It doesn't change the emulator at all, since it was already working in the emulator.
The communication protocol is not wonderful. The message format is [core_id, n_vals, val_0, val_1, ..., val_n-1], so the first two values are the same every time a message is sent (this isn't much worse than the old format, which was [core_id, val_0]). Optimizing that will be a different PR, and will interact in interesting ways with #26. But this is a definite improvement in the meantime.

This is heavily based on, but replaces, #22 (the rebasing got too complicated for me, so I started this one fresh).
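For illustration, here is a small sketch of packing and unpacking a message in that format. It is plain Python using struct; the 32-bit little-endian encoding and the function names are assumptions for the sake of the example, not the actual host/snip code in this PR.

```python
import struct

def pack_error_message(core_id, error_vals):
    """Pack [core_id, n_vals, val_0, ..., val_n-1] as 32-bit integers."""
    vals = list(error_vals)
    return struct.pack("<%di" % (2 + len(vals)), core_id, len(vals), *vals)

def unpack_error_message(data):
    """Recover (core_id, [val_0, ..., val_n-1]) from a packed message."""
    ints = struct.unpack("<%di" % (len(data) // 4), data)
    core_id, n_vals = ints[0], ints[1]
    return core_id, list(ints[2:2 + n_vals])

# Example: a 3-D error signal destined for core 7
msg = pack_error_message(7, [12, -5, 30])
assert unpack_error_message(msg) == (7, [12, -5, 30])
```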
TODO