| MRR@100, test queries | ar | bn | en | fi | id | ja | ko | ru | sw | te | th | avg |
| BM25 | 0.368 | 0.418 | 0.140 | 0.284 | 0.376 | 0.212 | 0.285 | 0.316 | 0.389 | 0.528 | 0.401 | 0.338 |
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ar \
--topics mrtydi-v1.1-arabic-test \
--index mrtydi-v1.1-arabic \
--output run.mrtydi.bm25.ar.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-arabic-test \
run.mrtydi.bm25.ar.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language bn \
--topics mrtydi-v1.1-bengali-test \
--index mrtydi-v1.1-bengali \
--output run.mrtydi.bm25.bn.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-bengali-test \
run.mrtydi.bm25.bn.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language en \
--topics mrtydi-v1.1-english-test \
--index mrtydi-v1.1-english \
--output run.mrtydi.bm25.en.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-english-test \
run.mrtydi.bm25.en.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fi \
--topics mrtydi-v1.1-finnish-test \
--index mrtydi-v1.1-finnish \
--output run.mrtydi.bm25.fi.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-finnish-test \
run.mrtydi.bm25.fi.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language id \
--topics mrtydi-v1.1-indonesian-test \
--index mrtydi-v1.1-indonesian \
--output run.mrtydi.bm25.id.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-indonesian-test \
run.mrtydi.bm25.id.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ja \
--topics mrtydi-v1.1-japanese-test \
--index mrtydi-v1.1-japanese \
--output run.mrtydi.bm25.ja.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-japanese-test \
run.mrtydi.bm25.ja.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ko \
--topics mrtydi-v1.1-korean-test \
--index mrtydi-v1.1-korean \
--output run.mrtydi.bm25.ko.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-korean-test \
run.mrtydi.bm25.ko.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ru \
--topics mrtydi-v1.1-russian-test \
--index mrtydi-v1.1-russian \
--output run.mrtydi.bm25.ru.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-russian-test \
run.mrtydi.bm25.ru.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language sw \
--topics mrtydi-v1.1-swahili-test \
--index mrtydi-v1.1-swahili \
--output run.mrtydi.bm25.sw.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-swahili-test \
run.mrtydi.bm25.sw.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language te \
--topics mrtydi-v1.1-telugu-test \
--index mrtydi-v1.1-telugu \
--output run.mrtydi.bm25.te.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-telugu-test \
run.mrtydi.bm25.te.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language th \
--topics mrtydi-v1.1-thai-test \
--index mrtydi-v1.1-thai \
--output run.mrtydi.bm25.th.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-thai-test \
run.mrtydi.bm25.th.test.txt
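The eleven BM25 command pairs above differ only in the language code and language name. As a minimal sketch (the helper name `bm25_commands` and the `LANGUAGES` mapping are illustrative and not part of Pyserini), the same strings could be generated rather than copied by hand:

```python
from typing import Tuple

# Language code -> language name, as they appear in the commands above.
LANGUAGES = {
    "ar": "arabic", "bn": "bengali", "en": "english", "fi": "finnish",
    "id": "indonesian", "ja": "japanese", "ko": "korean", "ru": "russian",
    "sw": "swahili", "te": "telugu", "th": "thai",
}


def bm25_commands(code: str, name: str) -> Tuple[str, str]:
    """Return the (search, eval) command strings for one language.

    Hypothetical helper: it only formats strings and does not run anything.
    """
    run = f"run.mrtydi.bm25.{code}.test.txt"
    search = (
        "python -m pyserini.search.lucene "
        "--threads 16 --batch-size 128 "
        f"--language {code} "
        f"--topics mrtydi-v1.1-{name}-test "
        f"--index mrtydi-v1.1-{name} "
        f"--output {run} --bm25 --hits 100"
    )
    evaluate = (
        "python -m pyserini.eval.trec_eval "
        f"-c -M 100 -m recip_rank mrtydi-v1.1-{name}-test {run}"
    )
    return search, evaluate


if __name__ == "__main__":
    # Print every command so the output can be reviewed or piped to a shell.
    for code, name in LANGUAGES.items():
        for command in bm25_commands(code, name):
            print(command)
```

Printing the commands instead of executing them keeps the sketch free of side effects; the output can be inspected first and then run in a shell.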
| mDPR (split encoders), pre-FT w/ NQ | 0.291 | 0.296 | 0.291 | 0.205 | 0.271 | 0.212 | 0.234 | 0.282 | 0.188 | 0.110 | 0.171 | 0.232 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-arabic-test \
--index mrtydi-v1.1-arabic-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-arabic-test \
run.mrtydi.mdpr-split-pft-nq.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-bengali-test \
--index mrtydi-v1.1-bengali-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-bengali-test \
run.mrtydi.mdpr-split-pft-nq.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-english-test \
--index mrtydi-v1.1-english-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-english-test \
run.mrtydi.mdpr-split-pft-nq.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-finnish-test \
--index mrtydi-v1.1-finnish-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-finnish-test \
run.mrtydi.mdpr-split-pft-nq.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-indonesian-test \
--index mrtydi-v1.1-indonesian-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-indonesian-test \
run.mrtydi.mdpr-split-pft-nq.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-japanese-test \
--index mrtydi-v1.1-japanese-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-japanese-test \
run.mrtydi.mdpr-split-pft-nq.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-korean-test \
--index mrtydi-v1.1-korean-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-korean-test \
run.mrtydi.mdpr-split-pft-nq.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-russian-test \
--index mrtydi-v1.1-russian-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-russian-test \
run.mrtydi.mdpr-split-pft-nq.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-swahili-test \
--index mrtydi-v1.1-swahili-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-swahili-test \
run.mrtydi.mdpr-split-pft-nq.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-telugu-test \
--index mrtydi-v1.1-telugu-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-telugu-test \
run.mrtydi.mdpr-split-pft-nq.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-thai-test \
--index mrtydi-v1.1-thai-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-thai-test \
run.mrtydi.mdpr-split-pft-nq.th.test.txt
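The evaluation commands above ask `trec_eval` for `recip_rank` with `-M 100` (use at most 100 results per query) and `-c` (average over every query in the relevance judgments, so queries with no results count as 0). The following is a simplified pure-Python sketch of that measure, not `trec_eval`'s implementation: it assumes each ranking is already sorted best-first, whereas `trec_eval` re-sorts results by score. The function name `mrr_at_k` and the toy data are invented for illustration.

```python
from typing import Dict, List, Set


def mrr_at_k(qrels: Dict[str, Set[str]], run: Dict[str, List[str]], k: int = 100) -> float:
    """Mean reciprocal rank of the first relevant document within the top k.

    qrels maps query id -> set of relevant doc ids.
    run maps query id -> doc ids in ranked order (best first).
    A judged query with no relevant hit in the top k, or with no results
    at all, contributes 0 to the mean.
    """
    if not qrels:
        return 0.0
    total = 0.0
    for qid, relevant in qrels.items():
        for rank, docid in enumerate(run.get(qid, [])[:k], start=1):
            if docid in relevant:
                total += 1.0 / rank  # only the first relevant hit counts
                break
    return total / len(qrels)


# Toy data, invented for illustration only.
qrels = {"q1": {"d3"}, "q2": {"d7"}, "q3": {"d9"}}
run = {"q1": ["d1", "d3", "d5"], "q2": ["d7", "d2"]}  # q3 has no results

print(mrr_at_k(qrels, run))  # (1/2 + 1 + 0) / 3 = 0.5
```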
| mDPR (tied encoders), pre-FT w/ NQ | 0.221 | 0.254 | 0.243 | 0.244 | 0.281 | 0.206 | 0.223 | 0.250 | 0.262 | 0.097 | 0.158 | 0.222 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-arabic-test \
--index mrtydi-v1.1-arabic-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-arabic-test \
run.mrtydi.mdpr-tied-pft-nq.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-bengali-test \
--index mrtydi-v1.1-bengali-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-bengali-test \
run.mrtydi.mdpr-tied-pft-nq.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-english-test \
--index mrtydi-v1.1-english-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-english-test \
run.mrtydi.mdpr-tied-pft-nq.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-finnish-test \
--index mrtydi-v1.1-finnish-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-finnish-test \
run.mrtydi.mdpr-tied-pft-nq.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-indonesian-test \
--index mrtydi-v1.1-indonesian-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-indonesian-test \
run.mrtydi.mdpr-tied-pft-nq.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-japanese-test \
--index mrtydi-v1.1-japanese-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-japanese-test \
run.mrtydi.mdpr-tied-pft-nq.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-korean-test \
--index mrtydi-v1.1-korean-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-korean-test \
run.mrtydi.mdpr-tied-pft-nq.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-russian-test \
--index mrtydi-v1.1-russian-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-russian-test \
run.mrtydi.mdpr-tied-pft-nq.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-swahili-test \
--index mrtydi-v1.1-swahili-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-swahili-test \
run.mrtydi.mdpr-tied-pft-nq.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-telugu-test \
--index mrtydi-v1.1-telugu-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-telugu-test \
run.mrtydi.mdpr-tied-pft-nq.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-thai-test \
--index mrtydi-v1.1-thai-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-thai-test \
run.mrtydi.mdpr-tied-pft-nq.th.test.txt
| mDPR (tied encoders), pre-FT w/ MS MARCO | 0.441 | 0.397 | 0.327 | 0.275 | 0.352 | 0.311 | 0.282 | 0.356 | 0.342 | 0.310 | 0.269 | 0.333 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-arabic-test \
--index mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-arabic-test \
run.mrtydi.mdpr-tied-pft-msmarco.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-bengali-test \
--index mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-bengali-test \
run.mrtydi.mdpr-tied-pft-msmarco.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-english-test \
--index mrtydi-v1.1-english-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-english-test \
run.mrtydi.mdpr-tied-pft-msmarco.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-finnish-test \
--index mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-finnish-test \
run.mrtydi.mdpr-tied-pft-msmarco.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-indonesian-test \
--index mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-indonesian-test \
run.mrtydi.mdpr-tied-pft-msmarco.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-japanese-test \
--index mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-japanese-test \
run.mrtydi.mdpr-tied-pft-msmarco.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-korean-test \
--index mrtydi-v1.1-korean-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-korean-test \
run.mrtydi.mdpr-tied-pft-msmarco.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-russian-test \
--index mrtydi-v1.1-russian-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-russian-test \
run.mrtydi.mdpr-tied-pft-msmarco.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-swahili-test \
--index mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-swahili-test \
run.mrtydi.mdpr-tied-pft-msmarco.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-telugu-test \
--index mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-telugu-test \
run.mrtydi.mdpr-tied-pft-msmarco.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-thai-test \
--index mrtydi-v1.1-thai-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-thai-test \
run.mrtydi.mdpr-tied-pft-msmarco.th.test.txt
| mDPR (tied encoders), pre-FT w/ MS MARCO, FT w/ all | 0.695 | 0.623 | 0.492 | 0.559 | 0.578 | 0.501 | 0.486 | 0.516 | 0.644 | 0.891 | 0.618 | 0.600 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-arabic-test \
--index mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-arabic-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-bengali-test \
--index mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-bengali-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-english-test \
--index mrtydi-v1.1-english-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-english-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-finnish-test \
--index mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-finnish-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-indonesian-test \
--index mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-indonesian-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-japanese-test \
--index mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-japanese-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-korean-test \
--index mrtydi-v1.1-korean-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-korean-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-russian-test \
--index mrtydi-v1.1-russian-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-russian-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-swahili-test \
--index mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-swahili-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-telugu-test \
--index mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-telugu-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-thai-test \
--index mrtydi-v1.1-thai-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m recip_rank mrtydi-v1.1-thai-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.th.test.txt
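The four mDPR conditions above share one command template and differ in the encoder name, the index suffix, the run tag, and whether `--encoder-class auto` is passed (the split-encoder condition omits it). A small sketch that captures those differences in a table and formats the search command follows; `DenseConfig`, `CONFIGS`, and `faiss_command` are hypothetical names, and all values are copied from the commands above.

```python
from typing import NamedTuple, Optional


class DenseConfig(NamedTuple):
    run_tag: str                  # used in the output file name
    encoder: str                  # value passed to --encoder
    index_suffix: str             # appended to mrtydi-v1.1-<language>-
    encoder_class: Optional[str]  # value for --encoder-class, or None to omit


# One entry per mDPR row of the table, copied from the commands above.
CONFIGS = [
    DenseConfig("mdpr-split-pft-nq", "castorini/mdpr-question-nq", "mdpr-nq", None),
    DenseConfig("mdpr-tied-pft-nq", "castorini/mdpr-tied-pft-nq", "mdpr-tied-pft-nq", "auto"),
    DenseConfig("mdpr-tied-pft-msmarco", "castorini/mdpr-tied-pft-msmarco",
                "mdpr-tied-pft-msmarco", "auto"),
    DenseConfig("mdpr-tied-pft-msmarco-ft-all", "castorini/mdpr-tied-pft-msmarco-ft-all",
                "mdpr-tied-pft-msmarco-ft-all", "auto"),
]


def faiss_command(config: DenseConfig, code: str, name: str) -> str:
    """Format the dense search command for one condition and one language."""
    parts = ["python -m pyserini.search.faiss", "--threads 16 --batch-size 512"]
    if config.encoder_class is not None:
        parts.append(f"--encoder-class {config.encoder_class}")
    parts += [
        f"--encoder {config.encoder}",
        f"--topics mrtydi-v1.1-{name}-test",
        f"--index mrtydi-v1.1-{name}-{config.index_suffix}",
        f"--output run.mrtydi.{config.run_tag}.{code}.test.txt --hits 100",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    for config in CONFIGS:
        print(faiss_command(config, "ar", "arabic"))
```

Note that the split-encoder condition uses the index suffix `mdpr-nq` while its run files are tagged `mdpr-split-pft-nq`, which is why the two are kept as separate fields.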
| Recall@100, test queries | ar | bn | en | fi | id | ja | ko | ru | sw | te | th | avg |
| BM25 | 0.793 | 0.869 | 0.536 | 0.720 | 0.843 | 0.643 | 0.619 | 0.654 | 0.764 | 0.897 | 0.853 | 0.745 |
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ar \
--topics mrtydi-v1.1-arabic-test \
--index mrtydi-v1.1-arabic \
--output run.mrtydi.bm25.ar.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-arabic-test \
run.mrtydi.bm25.ar.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language bn \
--topics mrtydi-v1.1-bengali-test \
--index mrtydi-v1.1-bengali \
--output run.mrtydi.bm25.bn.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-bengali-test \
run.mrtydi.bm25.bn.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language en \
--topics mrtydi-v1.1-english-test \
--index mrtydi-v1.1-english \
--output run.mrtydi.bm25.en.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-english-test \
run.mrtydi.bm25.en.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fi \
--topics mrtydi-v1.1-finnish-test \
--index mrtydi-v1.1-finnish \
--output run.mrtydi.bm25.fi.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-finnish-test \
run.mrtydi.bm25.fi.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language id \
--topics mrtydi-v1.1-indonesian-test \
--index mrtydi-v1.1-indonesian \
--output run.mrtydi.bm25.id.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-indonesian-test \
run.mrtydi.bm25.id.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ja \
--topics mrtydi-v1.1-japanese-test \
--index mrtydi-v1.1-japanese \
--output run.mrtydi.bm25.ja.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-japanese-test \
run.mrtydi.bm25.ja.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ko \
--topics mrtydi-v1.1-korean-test \
--index mrtydi-v1.1-korean \
--output run.mrtydi.bm25.ko.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-korean-test \
run.mrtydi.bm25.ko.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ru \
--topics mrtydi-v1.1-russian-test \
--index mrtydi-v1.1-russian \
--output run.mrtydi.bm25.ru.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-russian-test \
run.mrtydi.bm25.ru.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language sw \
--topics mrtydi-v1.1-swahili-test \
--index mrtydi-v1.1-swahili \
--output run.mrtydi.bm25.sw.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-swahili-test \
run.mrtydi.bm25.sw.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language te \
--topics mrtydi-v1.1-telugu-test \
--index mrtydi-v1.1-telugu \
--output run.mrtydi.bm25.te.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-telugu-test \
run.mrtydi.bm25.te.test.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language th \
--topics mrtydi-v1.1-thai-test \
--index mrtydi-v1.1-thai \
--output run.mrtydi.bm25.th.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-thai-test \
run.mrtydi.bm25.th.test.txt
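The Recall@100 rows reuse the same run files as the MRR@100 rows; only the evaluation switches to `-m recall.100`. As a simplified pure-Python sketch of that measure (not `trec_eval`'s implementation), recall at cutoff k is the fraction of a query's relevant documents that appear in its top k results, averaged over judged queries. The function name `recall_at_k` and the toy data are invented, and the sketch assumes rankings are already sorted best-first.

```python
from typing import Dict, List, Set


def recall_at_k(qrels: Dict[str, Set[str]], run: Dict[str, List[str]], k: int = 100) -> float:
    """Mean over judged queries of |relevant docs in top k| / |relevant docs|.

    Queries without any relevant document are skipped (an assumption of this
    sketch); queries without results contribute 0.
    """
    scores = []
    for qid, relevant in qrels.items():
        if not relevant:
            continue  # recall is undefined without relevant documents
        top_k = set(run.get(qid, [])[:k])
        scores.append(len(top_k & relevant) / len(relevant))
    return sum(scores) / len(scores) if scores else 0.0


# Toy data, invented for illustration only.
qrels = {"q1": {"d1", "d2"}, "q2": {"d5"}}
run = {"q1": ["d1", "d9", "d2"], "q2": ["d4"]}

print(recall_at_k(qrels, run, k=2))    # (1/2 + 0) / 2 = 0.25
print(recall_at_k(qrels, run, k=100))  # (2/2 + 0) / 2 = 0.5
```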
| mDPR (split encoders), pre-FT w/ NQ | 0.650 | 0.793 | 0.678 | 0.568 | 0.685 | 0.584 | 0.532 | 0.647 | 0.528 | 0.366 | 0.515 | 0.595 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-arabic-test \
--index mrtydi-v1.1-arabic-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-arabic-test \
run.mrtydi.mdpr-split-pft-nq.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-bengali-test \
--index mrtydi-v1.1-bengali-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-bengali-test \
run.mrtydi.mdpr-split-pft-nq.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-english-test \
--index mrtydi-v1.1-english-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-english-test \
run.mrtydi.mdpr-split-pft-nq.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-finnish-test \
--index mrtydi-v1.1-finnish-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-finnish-test \
run.mrtydi.mdpr-split-pft-nq.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-indonesian-test \
--index mrtydi-v1.1-indonesian-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-indonesian-test \
run.mrtydi.mdpr-split-pft-nq.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-japanese-test \
--index mrtydi-v1.1-japanese-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-japanese-test \
run.mrtydi.mdpr-split-pft-nq.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-korean-test \
--index mrtydi-v1.1-korean-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-korean-test \
run.mrtydi.mdpr-split-pft-nq.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-russian-test \
--index mrtydi-v1.1-russian-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-russian-test \
run.mrtydi.mdpr-split-pft-nq.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-swahili-test \
--index mrtydi-v1.1-swahili-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-swahili-test \
run.mrtydi.mdpr-split-pft-nq.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-telugu-test \
--index mrtydi-v1.1-telugu-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-telugu-test \
run.mrtydi.mdpr-split-pft-nq.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder castorini/mdpr-question-nq \
--topics mrtydi-v1.1-thai-test \
--index mrtydi-v1.1-thai-mdpr-nq \
--output run.mrtydi.mdpr-split-pft-nq.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-thai-test \
run.mrtydi.mdpr-split-pft-nq.th.test.txt
|
|
mDPR (tied encoders), pre-FT w/ NQ | 0.600 | 0.707 | 0.689 | 0.640 | 0.691 | 0.573 | 0.550 | 0.618 | 0.597 | 0.245 | 0.455 | | 0.579 |
|
|
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-arabic-test \
--index mrtydi-v1.1-arabic-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-arabic-test \
run.mrtydi.mdpr-tied-pft-nq.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-bengali-test \
--index mrtydi-v1.1-bengali-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-bengali-test \
run.mrtydi.mdpr-tied-pft-nq.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-english-test \
--index mrtydi-v1.1-english-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-english-test \
run.mrtydi.mdpr-tied-pft-nq.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-finnish-test \
--index mrtydi-v1.1-finnish-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-finnish-test \
run.mrtydi.mdpr-tied-pft-nq.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-indonesian-test \
--index mrtydi-v1.1-indonesian-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-indonesian-test \
run.mrtydi.mdpr-tied-pft-nq.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-japanese-test \
--index mrtydi-v1.1-japanese-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-japanese-test \
run.mrtydi.mdpr-tied-pft-nq.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-korean-test \
--index mrtydi-v1.1-korean-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-korean-test \
run.mrtydi.mdpr-tied-pft-nq.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-russian-test \
--index mrtydi-v1.1-russian-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-russian-test \
run.mrtydi.mdpr-tied-pft-nq.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-swahili-test \
--index mrtydi-v1.1-swahili-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-swahili-test \
run.mrtydi.mdpr-tied-pft-nq.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-telugu-test \
--index mrtydi-v1.1-telugu-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-telugu-test \
run.mrtydi.mdpr-tied-pft-nq.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-nq \
--topics mrtydi-v1.1-thai-test \
--index mrtydi-v1.1-thai-mdpr-tied-pft-nq \
--output run.mrtydi.mdpr-tied-pft-nq.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-thai-test \
run.mrtydi.mdpr-tied-pft-nq.th.test.txt
|
|
mDPR (tied encoders), pre-FT w/ MS MARCO | 0.797 | 0.784 | 0.754 | 0.647 | 0.736 | 0.732 | 0.617 | 0.743 | 0.634 | 0.782 | 0.595 | | 0.711 |
|
|
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-arabic-test \
--index mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-arabic-test \
run.mrtydi.mdpr-tied-pft-msmarco.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-bengali-test \
--index mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-bengali-test \
run.mrtydi.mdpr-tied-pft-msmarco.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-english-test \
--index mrtydi-v1.1-english-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-english-test \
run.mrtydi.mdpr-tied-pft-msmarco.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-finnish-test \
--index mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-finnish-test \
run.mrtydi.mdpr-tied-pft-msmarco.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-indonesian-test \
--index mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-indonesian-test \
run.mrtydi.mdpr-tied-pft-msmarco.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-japanese-test \
--index mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-japanese-test \
run.mrtydi.mdpr-tied-pft-msmarco.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-korean-test \
--index mrtydi-v1.1-korean-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-korean-test \
run.mrtydi.mdpr-tied-pft-msmarco.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-russian-test \
--index mrtydi-v1.1-russian-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-russian-test \
run.mrtydi.mdpr-tied-pft-msmarco.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-swahili-test \
--index mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-swahili-test \
run.mrtydi.mdpr-tied-pft-msmarco.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-telugu-test \
--index mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-telugu-test \
run.mrtydi.mdpr-tied-pft-msmarco.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics mrtydi-v1.1-thai-test \
--index mrtydi-v1.1-thai-mdpr-tied-pft-msmarco \
--output run.mrtydi.mdpr-tied-pft-msmarco.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-thai-test \
run.mrtydi.mdpr-tied-pft-msmarco.th.test.txt
|
|
mDPR (tied encoders), pre-FT w/ MS MARCO, FT w/ all | 0.900 | 0.955 | 0.841 | 0.856 | 0.861 | 0.813 | 0.785 | 0.843 | 0.876 | 0.966 | 0.883 | | 0.871 |
|
|
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-arabic-test \
--index mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-arabic-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-bengali-test \
--index mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-bengali-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-english-test \
--index mrtydi-v1.1-english-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-english-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-finnish-test \
--index mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-finnish-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-indonesian-test \
--index mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-indonesian-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-japanese-test \
--index mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-japanese-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-korean-test \
--index mrtydi-v1.1-korean-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-korean-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-russian-test \
--index mrtydi-v1.1-russian-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-russian-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-swahili-test \
--index mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-swahili-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-telugu-test \
--index mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-telugu-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics mrtydi-v1.1-thai-test \
--index mrtydi-v1.1-thai-mdpr-tied-pft-msmarco-ft-all \
--output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 mrtydi-v1.1-thai-test \
run.mrtydi.mdpr-tied-pft-msmarco-ft-all.th.test.txt
|
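The avg column in the rows above appears to be the unweighted mean of the eleven per-language scores. The page does not state this explicitly, so treat it as an assumption. A minimal Python sketch that recomputes it for the three mDPR (tied encoders) rows, with values copied from the table in column order ar, bn, en, fi, id, ja, ko, ru, sw, te, th:

```python
# Scores copied from the table rows above; the "reported" value is the avg column.
# Assumption: avg is the plain (unweighted) mean over the eleven languages.
ROWS = {
    "mDPR (tied encoders), pre-FT w/ NQ": (
        [0.600, 0.707, 0.689, 0.640, 0.691, 0.573, 0.550, 0.618, 0.597, 0.245, 0.455],
        0.579,
    ),
    "mDPR (tied encoders), pre-FT w/ MS MARCO": (
        [0.797, 0.784, 0.754, 0.647, 0.736, 0.732, 0.617, 0.743, 0.634, 0.782, 0.595],
        0.711,
    ),
    "mDPR (tied encoders), pre-FT w/ MS MARCO, FT w/ all": (
        [0.900, 0.955, 0.841, 0.856, 0.861, 0.813, 0.785, 0.843, 0.876, 0.966, 0.883],
        0.871,
    ),
}


def macro_average(scores):
    """Unweighted mean of per-language scores, rounded to three decimals."""
    return round(sum(scores) / len(scores), 3)


for name, (scores, reported) in ROWS.items():
    print(f"{name}: computed {macro_average(scores):.3f}, reported {reported:.3f}")
```

For these three rows the recomputed mean matches the reported value. Because the per-language scores are themselves rounded, a recomputed mean for other rows could differ from the reported one in the last digit.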
Programmatic Execution
All experimental runs in the table above can be executed programmatically with the commands below.
To list all the experimental conditions:
python -m pyserini.2cr.mrtydi --list-conditions
Run all languages for a specific condition and show commands:
python -m pyserini.2cr.mrtydi --condition bm25 --display-commands
Run a particular language for a specific condition and show commands:
python -m pyserini.2cr.mrtydi --condition bm25 --language ko --display-commands
Run all languages for all conditions and show commands:
python -m pyserini.2cr.mrtydi --all --display-commands
With the above commands, run files will be placed in the current directory. Use the option --directory runs to place the runs in a sub-directory.
For a specific condition, show the commands without running them:
python -m pyserini.2cr.mrtydi --condition bm25 --display-commands --dry-run
This prints exactly the commands listed above for that condition (one row of the table).
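The per-language commands listed for the tied-encoder mDPR rows follow a regular pattern. As an illustration only (this is not the implementation of pyserini.2cr.mrtydi), the sketch below rebuilds the search and evaluation commands from that pattern. The function name `tied_mdpr_commands` and the `LANGUAGES` mapping are hypothetical helpers. The pattern is assumed to hold only for the three `mdpr-tied-*` models shown above, not for BM25 or the split-encoder mDPR runs, whose encoder and index names differ.

```python
# Language codes and the names used in topic and index identifiers on this page.
LANGUAGES = {
    "ar": "arabic", "bn": "bengali", "en": "english", "fi": "finnish",
    "id": "indonesian", "ja": "japanese", "ko": "korean", "ru": "russian",
    "sw": "swahili", "te": "telugu", "th": "thai",
}


def tied_mdpr_commands(model, lang):
    """Return (search, evaluate) command strings for a tied-encoder mDPR model.

    `model` is a name such as "mdpr-tied-pft-msmarco", copied from the
    commands above; `lang` is a two-letter code from LANGUAGES.
    """
    name = LANGUAGES[lang]
    topics = f"mrtydi-v1.1-{name}-test"
    run_file = f"run.mrtydi.{model}.{lang}.test.txt"
    search = " ".join([
        "python -m pyserini.search.faiss",
        "--threads 16 --batch-size 512",
        "--encoder-class auto",
        f"--encoder castorini/{model}",
        f"--topics {topics}",
        f"--index mrtydi-v1.1-{name}-{model}",
        f"--output {run_file} --hits 100",
    ])
    evaluate = " ".join([
        "python -m pyserini.eval.trec_eval",
        f"-c -m recall.100 {topics}",
        run_file,
    ])
    return search, evaluate


if __name__ == "__main__":
    for command in tied_mdpr_commands("mdpr-tied-pft-msmarco", "th"):
        print(command)
```

The printed strings are single-line equivalents of the multi-line commands shown in the table for Thai.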
For a specific condition and language, show the commands without running them:
python -m pyserini.2cr.mrtydi --condition bm25 --language ko --display-commands --dry-run
For all conditions, show the commands without running them and skip evaluation:
python -m pyserini.2cr.mrtydi --all --display-commands --dry-run --skip-eval
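The invocations above differ only in which options are combined. A minimal Python sketch that assembles them from keyword arguments is shown below. The helper `build_2cr_command` is hypothetical. It uses only the options that appear on this page, and it rejects `--language` without `--condition` because the page shows no such combination.

```python
import shlex


def build_2cr_command(condition=None, language=None, directory=None,
                      display_commands=True, dry_run=True, skip_eval=False):
    """Assemble an argument list for `python -m pyserini.2cr.mrtydi`.

    With condition=None the command covers all conditions (--all).
    """
    if condition is None and language is not None:
        # Only --condition together with --language is shown on this page.
        raise ValueError("language requires a condition")
    argv = ["python", "-m", "pyserini.2cr.mrtydi"]
    if condition is None:
        argv.append("--all")
    else:
        argv += ["--condition", condition]
    if language is not None:
        argv += ["--language", language]
    if directory is not None:
        argv += ["--directory", directory]
    if display_commands:
        argv.append("--display-commands")
    if dry_run:
        argv.append("--dry-run")
    if skip_eval:
        argv.append("--skip-eval")
    return argv


if __name__ == "__main__":
    # Dry run for one condition and language; nothing is executed here.
    print(shlex.join(build_2cr_command(condition="bm25", language="ko")))
    # Dry run for everything, skipping evaluation.
    print(shlex.join(build_2cr_command(skip_eval=True)))
    # To actually launch a command (requires Pyserini to be installed):
    #   import subprocess; subprocess.run(argv, check=True)
```

Defaulting `dry_run` to `True` is a design choice of this sketch so that a call never starts a long retrieval run by accident. Condition names other than `bm25` should be taken from the output of `--list-conditions`.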
Finally, to generate this page:
python -m pyserini.2cr.mrtydi --generate-report --output docs/2cr/mrtydi.html
The output file mrtydi.html should be identical to this page.