Goal
Provide a GERBIL instance that is able to evaluate Link Prediction results.
Proposed user workflow
User chooses type of evaluation
Head prediction
Tail prediction
Relation prediction
Combinations
User uploads the system answer ➡️ Need to define a file format (e.g., JSON)
User chooses dataset
We provide a list of available datasets
Uploading another dataset might be an option later on
Evaluation starts
Evaluation per sub task (head, tail and relation prediction)
Generation of summary over the tasks, in case a combination of tasks has been selected
Development details
We prepared the LinkPrediction branch for this development.
Note: all classes and methods should get some nice Javadoc comments 😉
1. Define Experiment Types
Add the 7 new experiment types to the ExperimentType enum. (This will create a lot of compile errors that we will have to fix.) I could imagine something like the following:
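A rough, self-contained sketch of what the new part of the enum could look like; LP, HTP, HP and TP are the names used in this issue, while RP, HRP and TRP are only placeholder names for the remaining sub task and combinations:

public enum ExperimentType {
    HP,  // head prediction
    TP,  // tail prediction
    RP,  // relation prediction (placeholder name)
    HTP, // head + tail prediction
    HRP, // head + relation prediction (placeholder name)
    TRP, // tail + relation prediction (placeholder name)
    LP;  // all three sub tasks combined

    public boolean equalsOrContainsType(ExperimentType type) {
        switch (this) {
        case LP:
            // LP contains all 7 link prediction types
            return true;
        case HTP:
            return (type == HTP) || (type == HP) || (type == TP);
        case HRP:
            return (type == HRP) || (type == HP) || (type == RP);
        case TRP:
            return (type == TRP) || (type == TP) || (type == RP);
        default:
            // the single sub tasks only contain themselves
            return this == type;
        }
    }
}

In the real enum, these constants would of course be added next to the existing experiment types.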
Update the Experiment Type hierarchy in the equalsOrContainsType method. I would suggest adding a default: return false to the switch statements of the old ExperimentTypes (e.g., A2KB). For the new types, you simply have to define that LP returns true for all 7 new types, HTP returns true for HP and TP, and so on...
Replace the names of the other experiment types in the config file with the new names. We only want to have our new experiment types in the UI.
I guess there will be compiler errors that I am simply not aware of at the moment. Just let me know in case you encounter issues that you cannot solve.
2. Define file format for result files
First, we should define the exact format of the file that the user should upload. There are several options (the example is for tail prediction):
JSON format
RDF (e.g., Turtle) with reification
The RDF variant is nice but very verbose, while the JSON variant is shorter but we would have to establish our own structure there.
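Just as an illustration, a tail prediction answer file in the JSON variant could look something like this (the field names and the overall structure are only a first suggestion, nothing is fixed yet):

{
  "taskType": "tail prediction",
  "predictions": [
    {
      "subject": "http://example.org/resource/A",
      "predicate": "http://example.org/property/p",
      "candidates": [
        { "iri": "http://example.org/resource/B", "score": 0.93 },
        { "iri": "http://example.org/resource/C", "score": 0.41 }
      ]
    }
  ]
}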
3. Internal representation
Internally, GERBIL works with the Document class. Each Document is typically a single task that a system has to solve. The additional information that is added to a Document is represented as a Marking. This is not exactly the best structure, but I think it might be easier to force our data into this structure than to try to rename all the classes everywhere (I know that this is a dirty solution, but it might be the easiest for now 😉).
Following this structure, our document would represent a pair of IRIs for which the third IRI should be predicted. The two known IRIs would have to be stored in the document. We could put them into the text, but that doesn't sound good. Instead, we could define three classes to be able to add them as markings (in org.aksw.gerbil.datatypes.marking) (Oh noo... this is really dirty 😅):
public class Subject extends Annotation {
    // nothing to implement except the constructors of the Annotation class...
}

public class Predicate extends Annotation {
    // nothing to implement except the constructors of the Annotation class...
}

public class Object extends Annotation {
    // nothing to implement except the constructors of the Annotation class...
}
Defining them as extensions of the Annotation class simplifies our effort. The predictions themselves can simply be stored as instances of the ScoredAnnotation class.
Each document needs an IRI. It is used to map these documents to each other. I would suggest generating this IRI based on the two IRIs that are given for the document.
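As a small illustration of this (the namespace is just a placeholder), such an IRI could be generated like this:

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class LinkPredictionDocumentUri {

    /**
     * Generates a document IRI from the two IRIs that are known for the task
     * (e.g., subject and predicate for tail prediction).
     */
    public static String generate(String firstIri, String secondIri)
            throws UnsupportedEncodingException {
        return "http://example.org/lp-document?first=" + URLEncoder.encode(firstIri, "UTF-8")
                + "&second=" + URLEncoder.encode(secondIri, "UTF-8");
    }
}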
4. Provide an Annotator interface
We need to tell the framework what the system's task should be. That can be done quite easily:
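A minimal sketch of what such a LinkPredictor interface could look like, assuming it follows the pattern of the existing task-specific annotator interfaces (the method name and signature are only a suggestion):

import java.util.List;

import org.aksw.gerbil.annotator.Annotator;
import org.aksw.gerbil.transfer.nif.Document;
import org.aksw.gerbil.transfer.nif.Marking;

public interface LinkPredictor extends Annotator {

    /**
     * Returns the scored candidate markings for the missing IRI of the given
     * task document (the document carries the two known IRIs as markings).
     */
    List<Marking> performLinkPrediction(Document document);
}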
5. Provide parsing of the answer file
Create a class that implements the LinkPredictor interface defined above. The class should take a file as input, parse it, and represent its content in an internal Map. You can simply extend the existing InstanceListBasedAnnotator class; you just need to add the parsing of the file.
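A rough sketch of the parsing step, assuming the hypothetical JSON structure from step 2 and Jackson for reading the file (the class name and the map layout are assumptions; wiring it into InstanceListBasedAnnotator is not shown here):

import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class AnswerFileParser {

    /**
     * Parses the answer file into a map from "subject predicate" keys to the
     * scored candidate IRIs of the respective task.
     */
    public static Map<String, Map<String, Double>> parse(File answerFile) throws IOException {
        Map<String, Map<String, Double>> answers = new HashMap<>();
        JsonNode root = new ObjectMapper().readTree(answerFile);
        for (JsonNode prediction : root.get("predictions")) {
            String key = prediction.get("subject").asText() + " "
                    + prediction.get("predicate").asText();
            Map<String, Double> candidates = new HashMap<>();
            for (JsonNode candidate : prediction.get("candidates")) {
                candidates.put(candidate.get("iri").asText(), candidate.get("score").asDouble());
            }
            answers.put(key, candidates);
        }
        return answers;
    }
}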
⚠️ This implementation should get a JUnit test.
6. Provide adapters for datasets
The datasets need adapter(s) that are able to read them. Their constructor should take the path to a file. The class has to be able to load the correct answers from the file. It also needs to be able to load the triples to apply the filtering in the next step.
The filtering issue could be solved during the evaluation if we add all subjects/predicates/objects that are already known from the training data to a document. However, this would mean that we may need another Marking class, similar to the Marking classes we defined further above.
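As a rough sketch of loading the triples, assuming a file with one tab-separated triple per line (the actual dataset format still needs to be checked):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class TsvTripleReader {

    /** Simple container for a triple of IRIs. */
    public static class Triple {
        public final String subject, predicate, object;

        public Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Reads one tab-separated triple per line from the given file. */
    public static List<Triple> read(String filePath) throws IOException {
        List<Triple> triples = new ArrayList<>();
        for (String line : Files.readAllLines(Paths.get(filePath))) {
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\t");
            triples.add(new Triple(parts[0], parts[1], parts[2]));
        }
        return triples;
    }
}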
⚠️ This implementation should get a JUnit test.
7. Implement filtering
If we added the subjects/predicates/objects that are already known to the Document instances of the dataset (as described as a possible solution in the previous step), the filtering can simply be done as part of the evaluation: before the ScoredAnnotation instances are ranked according to their score, the annotations are filtered based on the already known triples that can be retrieved from the Document of the dataset.
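As an illustration of this filtering step (class and field names are placeholders, not GERBIL classes):

import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class KnownTripleFilter {

    /** Placeholder for a scored candidate IRI. */
    public static class ScoredCandidate {
        public final String iri;
        public final double score;

        public ScoredCandidate(String iri, double score) {
            this.iri = iri;
            this.score = score;
        }
    }

    /**
     * Removes all candidates whose IRI already completes the given pair of
     * IRIs in the training data, so that known triples do not influence the
     * ranking of the correct answer.
     */
    public static List<ScoredCandidate> filter(List<ScoredCandidate> candidates,
            Set<String> knownCompletions) {
        return candidates.stream()
                .filter(c -> !knownCompletions.contains(c.iri))
                .collect(Collectors.toList());
    }
}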
I will have to double check how exactly we can implement this part. 🤔
⚠️ This implementation should get a JUnit test.
8. Implement evaluation metrics
The evaluation metric(s) should implement the Evaluator interface. Its general workflow should be the following:
Create ranking based on provided scores (shared ranks in case of a tie)
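As an illustration of the tie-aware ranking (the helper class is a placeholder; other tie-handling conventions, e.g., using the average rank of tied candidates, would also be possible):

import java.util.List;

public class SharedRankComputation {

    /** Placeholder for a scored candidate IRI. */
    public static class ScoredCandidate {
        public final String iri;
        public final double score;

        public ScoredCandidate(String iri, double score) {
            this.iri = iri;
            this.score = score;
        }
    }

    /**
     * Returns the 1-based rank of the correct IRI. Candidates with the same
     * score share a rank, i.e., the rank is 1 plus the number of candidates
     * with a strictly higher score.
     */
    public static int rankOf(String correctIri, List<ScoredCandidate> candidates) {
        double correctScore = Double.NEGATIVE_INFINITY;
        for (ScoredCandidate c : candidates) {
            if (c.iri.equals(correctIri)) {
                correctScore = c.score;
            }
        }
        int rank = 1;
        for (ScoredCandidate c : candidates) {
            if (c.score > correctScore) {
                rank++;
            }
        }
        return rank;
    }
}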
The results are simply instances of the DoubleEvaluationResult class.
Finally, the evaluation has to be put together. This is done in the EvaluatorFactory. I can do that as soon as the previous tasks are done 🙂