Number of positive and negative examples #21

Open
DbrRoxane opened this issue Sep 14, 2020 · 0 comments
Hi,

I have tried to train your model on your data, but changed one thing: my summaries are no longer split into paragraphs, so my context is a list of size one, and so is the answers list. I adapted the answer indices accordingly, of course.
For reference, here is what train0 looks like:
https://www.dropbox.com/t/pC0Wxoco8I6Gw6oz

Here is my issue: when I try to train, I get an out-of-range error.

```
Traceback (most recent call last):
  File "main.py", line 357, in <module>
    main()
  File "main.py", line 266, in main
    for step, batch in enumerate(train_dataloader):
  File "/media/data/roxane/anaconda3/envs/hard-em/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 346, in __next__
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/media/data/roxane/anaconda3/envs/hard-em/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/data/roxane/anaconda3/envs/hard-em/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/data/roxane/Desktop/nQAchallenge/narrative-reader/qa-hard-em/DataLoader.py", line 50, in __getitem__
    idx = self.negative_indices[int(idx/2)]
```

I checked that the indices are correct; they are.
The out-of-range error happens because the index can be up to twice the number of positive examples
(
`self.length = 2*len(self.positive_indices)`
)
and because I have more negative examples than positive ones.
If I leave the previous (wrong) indices, there is no error (but the indices are wrong...).
If I set all the indices to a high number, I get a "real" index-out-of-range error.
I do not understand how this error can arise from simply changing the word_start and word_end values.
Would you have an idea why?
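In case it helps reproduce, here is a minimal standalone sketch of the indexing scheme as I understand it from the traceback (the class and attribute names are my assumptions based on the snippet in qa-hard-em/DataLoader.py, not the actual code): even dataset indices map to positive examples, odd indices to `negative_indices[idx // 2]`, and the length is derived from the positives alone.

```python
# Minimal sketch (assumed names, not the repository's actual code) of the
# paired positive/negative indexing scheme inferred from the traceback.
class PairedDataset:
    def __init__(self, positive_indices, negative_indices):
        self.positive_indices = positive_indices
        self.negative_indices = negative_indices
        # Length depends only on the positives, so a size mismatch
        # between the two lists goes unnoticed here.
        self.length = 2 * len(self.positive_indices)

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        if idx % 2 == 0:
            return self.positive_indices[idx // 2]
        # Raises IndexError when idx // 2 >= len(self.negative_indices),
        # i.e. when there are fewer negatives than positives.
        return self.negative_indices[idx // 2]


# With 3 positives but only 1 negative, index 3 maps to
# negative_indices[1], which does not exist:
ds = PairedDataset([10, 11, 12], [20])
try:
    _ = ds[3]
except IndexError:
    print("IndexError, as in the reported traceback")
```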

For the moment I patch the issue by replacing line 30 with:
`self.length = 2 * max(len(self.positive_indices), len(self.negative_indices))`
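One caveat worth noting here (a small illustrative sketch, not code from the repository): Python's `max()` on two lists compares them lexicographically, so `len(max(a, b))` does not always equal `max(len(a), len(b))`; the latter is the form that actually yields the larger count.

```python
# max() on lists compares element-by-element, not by length.
pos = [5, 0]        # 2 positive indices
neg = [1, 2, 3]     # 3 negative indices

print(len(max(pos, neg)))       # 2  ([5, 0] wins lexicographically)
print(max(len(pos), len(neg)))  # 3  (the intended "larger count")
```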

Thanks,
