Petr Baudis edited this page Mar 18, 2016 · 2 revisions

tf-idf Experiments

Answer Sentence Selection

Word overlap works much better than cosine distance. BM25 weighting works very well (treating s0 as the query, i.e. weighting based only on s1 occurrences).
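To make the two scoring modes concrete, here is a minimal sketch of BM25-weighted word overlap next to TF-IDF cosine similarity. It is not the `termfreq` model's actual code: the function names (`idf_table`, `bm25_overlap`, `tfidf_cosine`), the smoothed IDF formula, the constants `k1=1.2` and `b=0.75`, and the toy sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def idf_table(docs):
    """Smoothed IDF computed over a list of tokenized sentences (assumed formula)."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log(1.0 + (n - c + 0.5) / (c + 0.5)) for tok, c in df.items()}


def bm25_overlap(s0, s1, idf, avg_len, k1=1.2, b=0.75):
    """Treat s0 as the query: sum BM25 weights of the s0 terms that occur in s1.

    Term frequencies and length normalization come only from s1.
    """
    tf = Counter(s1)
    score = 0.0
    for tok in set(s0):
        f = tf.get(tok, 0)
        if f == 0:
            continue
        norm = f * (k1 + 1) / (f + k1 * (1 - b + b * len(s1) / avg_len))
        score += idf.get(tok, 0.0) * norm
    return score


def tfidf_cosine(s0, s1, idf):
    """Cosine similarity between the TF-IDF vectors of s0 and s1."""
    v0 = {t: c * idf.get(t, 0.0) for t, c in Counter(s0).items()}
    v1 = {t: c * idf.get(t, 0.0) for t, c in Counter(s1).items()}
    dot = sum(w * v1.get(t, 0.0) for t, w in v0.items())
    n0 = math.sqrt(sum(w * w for w in v0.values()))
    n1 = math.sqrt(sum(w * w for w in v1.values()))
    return dot / (n0 * n1) if n0 and n1 else 0.0


# Toy data (hypothetical): one question (s0) and three candidate sentences (s1).
question = "what is the capital of france".split()
candidates = [
    "the capital of france is paris".split(),
    "berlin is a large city in germany".split(),
    "france borders spain and italy".split(),
]

idf = idf_table(candidates)
avg_len = sum(len(c) for c in candidates) / len(candidates)

overlap_scores = [bm25_overlap(question, c, idf, avg_len) for c in candidates]
cosine_scores = [tfidf_cosine(question, c, idf) for c in candidates]

# Candidate indices ordered from best to worst under the overlap score.
ranked = sorted(range(len(candidates)), key=lambda i: overlap_scores[i], reverse=True)
print(ranked, overlap_scores, cosine_scores)
```

In this sketch the overlap score depends on s0 only through which terms it contains, while cosine also normalizes by the weight of every non-matching term in both sentences.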


| Model | trainAllMRR | devMRR | testMAP | testMRR | settings |
|-------|-------------|--------|---------|---------|----------|
| termfreq | 0.813992 | 0.829004 | 0.630100 | 0.765363 | (defaults) termfreq-5e150127bfa12fab-00 |
| termfreq | 0.714169 | 0.725217 | 0.578200 | 0.708957 | freq_mode="tf" termfreq-2d3b759c31ae7a0c-00 |
| termfreq | 0.602093 | 0.684234 | 0.545400 | 0.641078 | score_mode='cos' termfreq-11d9aad0ee302e88-00 |
| termfreq | 0.601831 | 0.696384 | 0.549600 | 0.634582 | freq_mode="tf" score_mode='cos' termfreq-5121bb88a5922f9-00 |


| Model | trainAllMRR | devMRR | testMAP | testMRR | settings |
|-------|-------------|--------|---------|---------|----------|
| termfreq | 0.483538 | 0.452647 | 0.294300 | 0.484530 | (defaults) termfreq-7c2a88efab16d07d-00 |
| termfreq | 0.339544 | 0.324693 | 0.242700 | 0.337893 | freq_mode="tf" termfreq-26a946355b7ba20d-00 |
| termfreq | 0.254189 | 0.214607 | 0.201000 | 0.275696 | score_mode='cos' termfreq--4326af5eba873e89-00 |
| termfreq | 0.251412 | 0.238331 | 0.204800 | 0.278305 | freq_mode="tf" score_mode='cos' termfreq--4e5be392f5f78798-00 |


| Model | trainAllMRR | devMRR | testMAP | testMRR | settings |
|-------|-------------|--------|---------|---------|----------|
| termfreq | 0.441573 | 0.432115 | 0.313900 | 0.490822 | (defaults) termfreq-1c1547925afa2a69-00 |
| termfreq | 0.325390 | 0.328255 | 0.266800 | 0.362613 | freq_mode="tf" termfreq--1146821b4b0960cf-00 |


| Model | train | val | ans.for. | ans.stud | belief | headline | images | t. mean | settings |
|-------|-------|-----|----------|----------|--------|----------|--------|---------|----------|
| termfreq TF-IDF #w | 0.497085 | 0.651653 | 0.607226 | 0.676746 | 0.622920 | 0.725578 | 0.714331 | 0.669360 | freq_mode='tf' termfreq--3e24e018ccdd67cf |
| termfreq BM25 #w | 0.503736 | 0.656081 | 0.626950 | 0.690302 | 0.632223 | 0.725748 | 0.718185 | 0.678681 | (defaults) termfreq--61f19fd57ac70195 |
| termfreq | 0.529827 | 0.623189 | 0.614495 | 0.561347 | 0.489752 | 0.674801 | 0.681121 | 0.604303 | score_mode='cos' termfreq-48a0f78b96dd8c67 |
| termfreq | 0.516296 | 0.607707 | 0.615016 | 0.557084 | 0.491858 | 0.675379 | 0.683860 | 0.604639 | freq_mode='tf' score_mode='cos' termfreq-1408c2bf9274f4d8 |