
A few ambiguities for replicating results #9

Open

hojjatkarami opened this issue Feb 7, 2023 · 1 comment

hojjatkarami commented Feb 7, 2023

Hello,

First, thank you for your very well-written code, which made it easy for me to get started.
I managed to replicate one of your results: Table 3, Synthea (full), GRU encoder + CP decoder, AUROC ≈ 0.85.

  1. However, I wanted to test whether the point-process (PP) loss function is actually useful. To do so, I removed the integral term in `enc_dec.neg_log_likelihood`:

```python
intensity_integral = intensity_integral[:, :-1]*0 # [B,L]
```

With this change, the loss function reduces to simple cross-entropy (multi-class) or binary cross-entropy (multi-label); see the sketch after this list. Surprisingly, I saw no performance degradation, which might indicate that the integral term (and hence the point-process loss) has no effect.

What is even more interesting: on the Retweets dataset, I achieved AUROC = 0.68 (0.61 in the paper) when omitting the integral term!

  2. Another issue for me is the way you report AUROC for label prediction. In the literature, researchers tend to report metrics (accuracy, F1, AUROC, ...) for next-event prediction, but in your code it seems that you use information including $t_j$ to predict the $j$-th mark itself.
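For reference, here is a minimal sketch of what I mean in point 1 (my own shapes and names, not the repo's actual `enc_dec.neg_log_likelihood`): once the compensator integral is zeroed, the remaining event term is just the log-intensity of the observed mark at the observed time, i.e. a cross-entropy up to normalisation of the intensities.

```python
import torch

def tpp_nll(log_intensity, intensity_integral, mark_onehot, drop_integral=False):
    """Sketch of a marked-TPP NLL (shapes and names are my assumptions).

    log_intensity:      [B, L, K] log lambda_k at each observed event time
    intensity_integral: [B, L]    integral of sum_k lambda_k over each inter-event interval
    mark_onehot:        [B, L, K] one-hot encoding of the observed marks
    """
    # Event term: log-intensity of the observed mark at the observed time.
    event_term = (log_intensity * mark_onehot).sum(-1)  # [B, L]
    if drop_integral:
        # Zeroing the compensator, as in the experiment above. What remains
        # is -sum_i log lambda_{k_i}(t_i): up to normalising the intensities,
        # a plain cross-entropy over marks at the true event times.
        intensity_integral = intensity_integral * 0
    return -(event_term - intensity_integral).sum(-1).mean()
```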
@josephenguehard
Contributor

Hi!

First, sorry for the very late reply: I left Babylon a few months ago and didn't get a notification for this. And thanks for your interest in our work!

It's surprising that the integral term has no effect on the result. If you check Figure 2 of our paper, you'll see that the model is able to pick up regular events, which is not possible with a simple conditional Poisson. So at least the NLL should be better.
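To make the contrast concrete, here is a toy illustration (assumed forms, not our actual decoders): a conditional-Poisson intensity is constant in the elapsed time, so its inter-event density is exponential and cannot concentrate around a regular interval, while a full TPP intensity can vary with the elapsed time.

```python
import torch
import torch.nn.functional as F

h = torch.randn(8)                   # hypothetical history embedding
tau = torch.linspace(0.0, 5.0, 100)  # elapsed time since the last event

# Conditional Poisson: intensity depends on the history only, flat in tau.
lambda_cp = F.softplus(h.sum()).expand_as(tau)

# Full TPP: intensity also depends on tau, so it can peak periodically.
lambda_tpp = F.softplus(h.sum() + torch.sin(2.0 * tau))
```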

As for the Retweets dataset, the AUROC was already better using CP than the full TPP model, so I'm not surprised that omitting the integral term works even better. This dataset is probably not best modelled with TPPs.

About your 2nd point: we model the joint distribution to predict both the time and the type of an event. But when computing the AUROC, we use the true time of the next event, $t_j$, to check whether the model predicts the correct $j$-th mark. This metric is therefore limited, in that it can only check whether the correct mark is predicted given the true time. As a result, the NLL should be preferred when comparing models.
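Concretely, the evaluation looks something like the following sketch (my shapes and names here, not our actual code): the mark distribution is scored at the ground-truth time $t_j$, so the AUROC measures mark ranking given the true time rather than full next-event prediction.

```python
import torch
from sklearn.metrics import roc_auc_score

def mark_auroc_at_true_times(intensity_at_tj, marks):
    """Sketch of the metric described above (names and shapes are assumptions).

    intensity_at_tj: [B, L, K] lambda_k evaluated at the observed times t_j
    marks:           [B, L]    integer labels of the observed marks
    """
    # Conditional mark distribution given the true next-event time:
    #   p(k | t_j, history) = lambda_k(t_j) / sum_k' lambda_k'(t_j)
    probs = intensity_at_tj / intensity_at_tj.sum(-1, keepdim=True)
    return roc_auc_score(
        marks.flatten().numpy(),
        probs.flatten(0, 1).numpy(),
        multi_class="ovr",
    )
```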

Best,
Joseph
