This repository has been archived by the owner on Nov 20, 2020. It is now read-only.

Skimit comparison v2 #2

Open
wants to merge 4 commits into
base: master


mbatchkarov

This is one of several bits of code that we need in order to compare Dragnet to our content model. List of changes:

  • change the Makefile so dragnet's C extensions build on OSX (a recent version of gcc is required, e.g. 7, but OSX has 4 by default). Install a newer one with Homebrew, then tell make to use it.
  • add a little script that wraps our content model (run locally as an API) in the interface required by dragnet's evaluation
  • make sure dragnet's eval does not include the training set
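
One way to keep training documents out of the evaluation is to filter the evaluation file list against the set of training IDs. This is a minimal sketch, assuming documents are identified by file stem; `exclude_training` and the file names are hypothetical and are not taken from dragnet or from this PR.

```python
from pathlib import Path


def exclude_training(eval_files, training_ids):
    """Return the evaluation files whose stem is not a training document ID."""
    training_ids = set(training_ids)
    return [f for f in eval_files if Path(f).stem not in training_ids]


# Hypothetical file names, for illustration only.
eval_files = ["data/a.html", "data/b.html", "data/c.html"]
held_out = exclude_training(eval_files, training_ids={"a", "c"})
```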

Two more PRs coming soon: one to prepare our data so dragnet can be trained on it, and one to train dragnet on the same data our model was trained on.

@mbatchkarov
Author

mbatchkarov commented Jul 28, 2017

For the record, here are my current results:

Dragnet

[Figure: output_dragnet]

Skimit

[Figure: output_skimit]

Methods

  • Split all data (dragnet's + all of ours) into training and testing
  • Train both models on training set
  • Evaluate on testing set
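
The pooled split described above could look roughly like the following. It is only a sketch: the function name, the 20% test fraction, the seed and the document IDs are all assumptions, not values used in this PR.

```python
import random


def split_documents(doc_ids, test_fraction=0.2, seed=0):
    """Shuffle document IDs with a fixed seed and split them into (train, test)."""
    ids = sorted(doc_ids)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]


# Pool both sources before splitting; these IDs are made up.
dragnet_docs = [f"dragnet_{i}" for i in range(6)]
our_docs = [f"ours_{i}" for i in range(4)]
train, test = split_documents(dragnet_docs + our_docs)
```

Fixing the seed keeps the split reproducible, so both models are trained and evaluated on exactly the same documents.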

Evaluation takes the content of a page (no markup) and the output of a model, and computes precision, recall and F1 at the token level. I think this is a cleaner idea than looking at blocks because it focuses on the content only. I'll provide more details on Monday.
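
To make the token-level scoring concrete, here is a minimal sketch. It assumes whitespace tokenization and counts repeated tokens with multiplicity; the actual tokenizer and matching rules used in the evaluation may differ, and `token_prf` is a hypothetical name.

```python
from collections import Counter


def token_prf(gold_text, predicted_text):
    """Token-level precision, recall and F1 between gold content and model output."""
    gold = Counter(gold_text.split())
    pred = Counter(predicted_text.split())
    # Tokens present in both texts, counted with multiplicity.
    overlap = sum((gold & pred).values())
    precision = overlap / sum(pred.values()) if pred else 0.0
    recall = overlap / sum(gold.values()) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1


# The model output is a strict subset of the gold content here,
# so precision is perfect while recall is not.
p, r, f1 = token_prf("the cat sat on the mat", "the cat sat")
```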

@mlehl88

mlehl88 commented Aug 2, 2017

I haven't looked into the code much yet. What is the Y-axis of the graphs? Documents?

@mbatchkarov
Author

mbatchkarov commented Aug 2, 2017

Yes, each plot is a histogram of precision/recall/F1 scores over the test set, so the Y-axis is the number of documents.

EDIT:
We are doing well on average but we still have a few catastrophic failures where F1 is 0. We should probably include this in our evaluation because these kinds of errors are very embarrassing in practice.
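
One possible way to surface these failures is to report the share of zero-F1 documents next to the mean. This is a sketch under assumptions: `summarize_f1`, the threshold parameter and the scores are illustrative and are not results from this PR.

```python
from statistics import mean


def summarize_f1(f1_scores, failure_threshold=0.0):
    """Report the mean F1 alongside the share of documents at or below a threshold."""
    failures = [s for s in f1_scores if s <= failure_threshold]
    return {
        "mean_f1": mean(f1_scores),
        "n_failures": len(failures),
        "failure_rate": len(failures) / len(f1_scores),
    }


# Made-up per-document scores, for illustration only.
summary = summarize_f1([0.9, 0.95, 0.0, 0.85, 0.0])
```

A mean alone can hide the zeros, which is why the failure count is returned as a separate field.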


2 participants