Modelling Card Correlations #27
Comments
Hey, I wanted to let you know I haven't forgotten about this and am thinking about this thoughtful comment!
That's good to know! I am excited to hear your opinion whenever you find the time. I have continued to think about this and found an issue for which I have not yet found an intuitive solution:
Decay of correlations over time
In my rough sketch above I described how the correlation distributions are updated (and our estimate of the correlation hopefully improves) when the review results of cards correlate. But this model has the built-in assumption that the true correlation is stationary with respect to time. I believe that is not true, but I am still uncertain whether this assumption would be harmful in any real-world application. It may very well be that the parameter change induced by card correlations is much smaller than the parameter change due to reviewing. If most correlation means are in the single-digit percent range, their effect on card scheduling is really more of a fine-tuning.
As I don't think correlations are stationary, and modelling this decay and deriving its parameters from the reviews could be hard, I think it would be best to ignore this issue at the beginning. Again, I would be interested in your opinion: could the assumption of stationary correlations be harmful?
Hi! I know almost nothing about Bayesian inference, so I can't help in that regard. But I too wanted this feature. I had certain insights about this, and I thought perhaps I should share them, in case they might help.
Suppose we have two quiz items, A and B, and that item A is related to item B in some way. For example, item A could be a certain sentence with a certain word in it, and item B is a quiz on that word alone. If you could answer quiz item A correctly, then there's a chance that the quiz for item B was also answered at that moment. That is, by quizzing on item A, there's a chance that the answer for item B would also be remembered at that instant. If by quizzing on item A the answer for item B was also remembered, then isn't that practically the same as quizzing on item B?
In other words, the model could be the probability that answering item A correctly would also answer item B correctly. But that's just the case for strengthening; there's also the case for weakening or interference: the probability that answering item A correctly would lead to item B being answered incorrectly if quizzed on afterwards.
Now that I have given you the general idea, let's move on to the many possible ways that item A could affect item B. There are 3 possible causes for item A to affect item B:
Now, for each of the above possible causes, there are 3 possible outcomes:
Therefore, we actually have 9 possible states regarding the relationship between quiz items A and B, and they should all be regarded as probabilistic. We could even deduce certain conclusions based on the current most probable state, e.g., if the state was cause 3, effect 1, then it could be that the question portion of item A is the one affecting item B. Notice also that the relationship described above is one-way only, i.e., quizzing on item A affects item B (and not the other way around).
Now, we could then have a brand new API function:
So after calling
As a bonus, by designing the API this way, we could have situations where item B doesn't affect item A at all, even if item A could affect item B in some way.
Again, I know almost nothing about Bayesian inference (and my knowledge of Bayesian probability is not comprehensive), but I hope all of this helps and provides some useful insights.
Hey fashia,
I recently read your Ebisu article again and really enjoyed it! I would be very interested to know whether it reduces the review load empirically.
Card Correlations
But to the main matter of this issue: one aspect that should influence the recall of cards is the correlation between them. For example an
I am no expert in memory, but I believe one such correlation effect is called interference.
Card Correlations within native Ebisu
So my first question is whether I understood correctly that Ebisu cannot model such correlation events, because a correct answer to card B that is due solely to a correlation with a previously reviewed card A would wrongly be attributed solely to card B.
A later review of card B without the previous cue of card A would then lead to an unexpected failure.
Bayesian model of card correlations
Therefore, I was wondering whether it would be possible to model card correlations within your Bayesian framework.
Correlation Modelling
A very rough sketch of this model could work like this:
For each (directed) card pair, a zero-mean prior distribution is initialized. When a card changes its estimates (interval/recall probability), the previous changes of other cards are used to estimate posterior correlation distributions to these respective cards. Obviously, larger time differences between these interval/recall probability changes should produce smaller changes to the prior.
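To make the pairwise update a bit more concrete, here is a minimal sketch in plain Python. Everything in it is my own assumption and not part of Ebisu: the Gaussian form of the belief, the names `CorrelationBelief` and `update_correlation`, the +1/-1 "agreement" encoding of two review outcomes, and the half-life used to down-weight reviews that are far apart in time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorrelationBelief:
    """Gaussian belief about the directed correlation from card A to card B."""
    mean: float = 0.0       # zero-mean prior, as in the sketch above
    precision: float = 1.0  # 1 / variance; assumed prior strength


def update_correlation(belief, agreement, hours_apart,
                       half_life_hours=24.0, obs_precision=0.25):
    """Weighted conjugate Gaussian update of one directed correlation.

    agreement: +1.0 if both reviews had the same outcome, -1.0 otherwise
    (an assumed encoding). The evidence is down-weighted by the time gap,
    so reviews far apart barely move the prior.
    """
    weight = 0.5 ** (hours_apart / half_life_hours)
    evidence = weight * obs_precision
    precision = belief.precision + evidence
    mean = (belief.precision * belief.mean + evidence * agreement) / precision
    return CorrelationBelief(mean, precision)


prior = CorrelationBelief()
close = update_correlation(prior, agreement=1.0, hours_apart=0.0)
distant = update_correlation(prior, agreement=1.0, hours_apart=240.0)
```

With these assumed defaults, `close` moves noticeably away from zero while `distant` stays almost at the prior, which is the qualitative behaviour described above.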
Another way to see this is that cards form nodes in a graph and the correlation distributions are directed edges. As the previous update method effectively only updates a few correlation distributions, while there are roughly n^2 correlations for n cards in a deck, another strategy would be needed to densely connect the graph.
The following simple example gives a good idea: assume our priors for cards 1 and 2 and for cards 1 and 3 are well estimated and have non-zero means with the same sign. As cards 2 and 3 were never reviewed close together in time, our prior for cards 2 and 3 (and vice versa) still has zero mean.
For this correlation graph to be consistent, we would have to update our 2,3 and 3,2 prior correlation distributions towards a non-zero mean with the identical sign.
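The example with cards 1, 2 and 3 could be sketched roughly as follows. The rule used here (nudge a zero-mean pair towards a shrunken product of the two means that run through a shared neighbour) is only one assumed way to get a consistent graph, and the function name and the `shrink` factor are hypothetical. Note that for two negative means the product is positive; whether that matches the "identical sign" intended above is left open.

```python
def fill_by_common_neighbor(means, cards, shrink=0.5):
    """Return a copy of `means` with zero-mean pairs filled in.

    means: dict mapping a directed pair (i, j) to the mean of its
    correlation distribution. A pair (i, j) with zero mean is set to
    shrink * means[(i, k)] * means[(k, j)] for the neighbour k that
    gives the strongest signal. Existing non-zero means are untouched.
    """
    updated = dict(means)
    for i in cards:
        for j in cards:
            if i == j or means.get((i, j), 0.0) != 0.0:
                continue
            candidates = [
                means.get((i, k), 0.0) * means.get((k, j), 0.0)
                for k in cards
                if k not in (i, j)
            ]
            best = max(candidates, key=abs, default=0.0)
            if best != 0.0:
                updated[(i, j)] = shrink * best
    return updated


# Cards 1-2 and 1-3 are well estimated; 2-3 was never observed together.
means = {(1, 2): 0.4, (2, 1): 0.4, (1, 3): 0.5, (3, 1): 0.5}
dense = fill_by_common_neighbor(means, cards=[1, 2, 3])
```

After this pass, `dense` contains small positive entries for `(2, 3)` and `(3, 2)`, while the four observed edges keep their values.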
Nicely enough, both approaches to updating the correlation distributions can be applied to past review data, so this model would be backwards compatible.
Updating Card Estimates
Such a correlation model could be used in the following way:
If the estimate (interval/recall probability) of card A changes due to a review, all connected cards (connected meaning the correlation distribution has a non-zero mean) should also update their intervals/recall probabilities, weighted by the mean of the respective correlation distribution.
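As a rough illustration of this propagation step, here is a small sketch that works directly on recall probabilities. It does not touch Ebisu's actual model parameters; the function name, the additive rule (shift each connected card by correlation mean times the change of the reviewed card), and the clamping to [0, 1] are all assumptions made for the example.

```python
def propagate_review(recall, corr_means, reviewed, new_recall):
    """Apply a review of one card and propagate it along directed edges.

    recall: dict card -> current recall probability estimate
    corr_means: dict (source, target) -> mean of the correlation distribution
    Each connected target moves by mean * (change of the reviewed card),
    clamped to the valid probability range.
    """
    delta = new_recall - recall[reviewed]
    updated = dict(recall)
    updated[reviewed] = new_recall
    for (source, target), mean in corr_means.items():
        if source != reviewed or target == reviewed:
            continue
        if mean == 0.0 or target not in recall:
            continue  # not connected, or unknown card
        updated[target] = min(1.0, max(0.0, recall[target] + mean * delta))
    return updated


recall = {"A": 0.5, "B": 0.6, "C": 0.7}
corr_means = {("A", "B"): 0.1, ("A", "C"): -0.2, ("B", "A"): 0.3}
after = propagate_review(recall, corr_means, reviewed="A", new_recall=0.9)
```

Here a successful review of A raises the estimate for the positively correlated B and lowers it for the negatively correlated C; the edge from B to A is ignored because B was not the reviewed card.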
Advantages
Once the estimates of the correlations between cards are good enough, the review load for positively correlated cards would be reduced. Furthermore, negatively correlated cards would be reviewed more often, giving the student the opportunity to address interference issues.
Question
So what do you think about this, as an expert in Bayesian spaced repetition models? Would this be feasible mathematically as well as implementation-wise?
And regarding compatibility with Ebisu: you mentioned in another issue that Ebisu is built on binary card answers (passed/failed instead of Again/Hard/Good/Easy). Could non-binary correlations still be used as an input to update Ebisu's parameters?