This page provides two-click reproductions* for a number of experimental runs on the MIRACL dataset. Instructions for programmatic execution are shown at the bottom of this page. The dataset is described in the following paper:
Xinyu Zhang, Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, and Jimmy Lin. MIRACL: A Multilingual Retrieval Dataset Covering 18 Diverse Languages. Transactions of the Association for Computational Linguistics, 11:1114–1131, 2023.
Many of the models presented on this page are described in the following paper:
Xinyu Zhang, Kelechi Ogueji, Xueguang Ma, and Jimmy Lin. Towards Best Practices for Training Multilingual Dense Retrieval Models. ACM Transactions on Information Systems, 42(2), Article No. 39, 2023.
Key:
- BM25: BM25 (Lucene)
- mDPR pFT: mDPR (tied encoders), pre-fine-tuned (pre-FT) with MS MARCO
- BM25+mDPR pFT: hybrid (interpolation) of BM25 and mDPR (tied encoders), pre-FT with MS MARCO
- mDPR pFT+FT1: mDPR (tied encoders), pre-FT with MS MARCO, then fine-tuned (FT) with all Mr. TyDi
- mDPR pFT+FT2: mDPR (tied encoders), pre-FT with MS MARCO, then fine-tuned in-language with MIRACL
- mContriever: mContriever (tied encoders), pre-FT with MS MARCO
| nDCG@10, dev queries | ar | bn | en | es | fa | fi | fr | hi | id | ja | ko | ru | sw | te | th | zh | de | yo | avg |
| BM25 | 0.481 | 0.508 | 0.351 | 0.319 | 0.333 | 0.551 | 0.183 | 0.458 | 0.449 | 0.369 | 0.419 | 0.334 | 0.383 | 0.494 | 0.484 | 0.180 | 0.226 | 0.406 | 0.385 |
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ar \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar \
--output run.miracl.bm25.ar.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.bm25.ar.dev.txt
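The remaining BM25 commands in this section follow the same template, with only the language code changing; Yoruba (yo) is the exception, dropping --language in favor of --pretokenized (see the final command pair below). As a convenience, here is a rough bash sketch of the full sweep, built only from the commands shown on this page:
# BM25 sweep; yo is handled separately below (no --language, --pretokenized instead).
for lang in ar bn en es fa fi fr hi id ja ko ru sw te th zh de; do
  python -m pyserini.search.lucene \
    --threads 16 --batch-size 128 \
    --language ${lang} \
    --topics miracl-v1.0-${lang}-dev \
    --index miracl-v1.0-${lang} \
    --output run.miracl.bm25.${lang}.dev.txt \
    --bm25 --hits 1000
  python -m pyserini.eval.trec_eval \
    -c -M 100 -m ndcg_cut.10 miracl-v1.0-${lang}-dev \
    run.miracl.bm25.${lang}.dev.txt
done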
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language bn \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn \
--output run.miracl.bm25.bn.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.bm25.bn.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language en \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en \
--output run.miracl.bm25.en.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.bm25.en.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language es \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es \
--output run.miracl.bm25.es.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.bm25.es.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fa \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa \
--output run.miracl.bm25.fa.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.bm25.fa.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fi \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi \
--output run.miracl.bm25.fi.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.bm25.fi.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fr \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr \
--output run.miracl.bm25.fr.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.bm25.fr.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language hi \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi \
--output run.miracl.bm25.hi.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.bm25.hi.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language id \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id \
--output run.miracl.bm25.id.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.bm25.id.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ja \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja \
--output run.miracl.bm25.ja.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.bm25.ja.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ko \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko \
--output run.miracl.bm25.ko.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.bm25.ko.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ru \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru \
--output run.miracl.bm25.ru.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.bm25.ru.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language sw \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw \
--output run.miracl.bm25.sw.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.bm25.sw.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language te \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te \
--output run.miracl.bm25.te.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.bm25.te.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language th \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th \
--output run.miracl.bm25.th.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.bm25.th.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language zh \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh \
--output run.miracl.bm25.zh.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.bm25.zh.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language de \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de \
--output run.miracl.bm25.de.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-de-dev \
run.miracl.bm25.de.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 --pretokenized \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo \
--output run.miracl.bm25.yo.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-yo-dev \
run.miracl.bm25.yo.dev.txt
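The avg column in the table above is the unweighted mean over the 18 languages. Here is a minimal bash sketch for recomputing it from existing run files, assuming trec_eval's usual summary output, in which the second whitespace-separated field is "all" and the third is the score:
# Print per-language nDCG@10, then average the 18 scores (the avg column).
for lang in ar bn en es fa fi fr hi id ja ko ru sw te th zh de yo; do
  python -m pyserini.eval.trec_eval \
    -c -M 100 -m ndcg_cut.10 miracl-v1.0-${lang}-dev \
    run.miracl.bm25.${lang}.dev.txt | awk -v l=${lang} '$2 == "all" {print l, $3}'
done | awk '{sum += $2; n += 1} END {printf "avg nDCG@10 over %d languages: %.3f\n", n, sum / n}'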
| mDPR pFT | 0.499 | 0.443 | 0.394 | 0.478 | 0.480 | 0.472 | 0.435 | 0.383 | 0.272 | 0.439 | 0.419 | 0.407 | 0.299 | 0.356 | 0.358 | 0.512 | 0.490 | 0.444 | 0.421 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco.ar.dev.txt
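The rest of this section repeats the same dense-retrieval command per language: a single shared query encoder (castorini/mdpr-tied-pft-msmarco) is paired with the matching per-language prebuilt index. A compact sketch of the sweep, again using only arguments that appear in this section:
# Same query encoder for every language; only the topics, index, and output names change.
for lang in ar bn en es fa fi fr hi id ja ko ru sw te th zh de yo; do
  python -m pyserini.search.faiss \
    --threads 16 --batch-size 512 \
    --encoder-class auto \
    --encoder castorini/mdpr-tied-pft-msmarco \
    --topics miracl-v1.0-${lang}-dev \
    --index miracl-v1.0-${lang}-mdpr-tied-pft-msmarco \
    --output run.miracl.mdpr-tied-pft-msmarco.${lang}.dev.txt --hits 1000
  python -m pyserini.eval.trec_eval \
    -c -M 100 -m ndcg_cut.10 miracl-v1.0-${lang}-dev \
    run.miracl.mdpr-tied-pft-msmarco.${lang}.dev.txt
done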
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-de-dev \
run.miracl.mdpr-tied-pft-msmarco.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-yo-dev \
run.miracl.mdpr-tied-pft-msmarco.yo.dev.txt
| BM25+mDPR pFT | 0.673 | 0.654 | 0.549 | 0.641 | 0.594 | 0.672 | 0.523 | 0.616 | 0.443 | 0.576 | 0.609 | 0.532 | 0.446 | 0.602 | 0.599 | 0.525 | 0.564 | 0.611 | 0.579 |
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ar.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ar.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ar.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ar.dev.txt
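Note that the fusion commands in this section read input runs named *.dev.top1000.txt, while the BM25 and mDPR pFT sections above write *.dev.txt; regenerate (or copy) the two input runs under the expected names before fusing. The following is a sketch of the complete hybrid pipeline for one language, assembled from commands that appear elsewhere on this page (for yo, substitute the pretokenized BM25 command shown at the end of the BM25 section):
lang=ar  # any language code from the table above
# Sparse run, written under the *.top1000.txt name the fusion step expects.
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language ${lang} \
  --topics miracl-v1.0-${lang}-dev \
  --index miracl-v1.0-${lang} \
  --output run.miracl.bm25.${lang}.dev.top1000.txt \
  --bm25 --hits 1000
# Dense run with the shared mDPR encoder.
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics miracl-v1.0-${lang}-dev \
  --index miracl-v1.0-${lang}-mdpr-tied-pft-msmarco \
  --output run.miracl.mdpr-tied-pft-msmarco.${lang}.dev.top1000.txt --hits 1000
# Interpolate the two runs (alpha 0.5) and evaluate the hybrid.
python -m pyserini.fusion \
  --runs run.miracl.bm25.${lang}.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.${lang}.dev.top1000.txt \
  --output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.${lang}.dev.txt \
  --method interpolation --alpha 0.5 --depth 1000 --k 1000
python -m pyserini.eval.trec_eval \
  -c -M 100 -m ndcg_cut.10 miracl-v1.0-${lang}-dev \
  run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.${lang}.dev.txt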
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.bn.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.bn.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.bn.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.bn.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.en.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.en.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.en.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.en.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.es.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.es.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.es.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.es.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fa.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fa.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fa.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fa.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fi.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fi.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fi.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fi.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fr.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fr.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fr.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fr.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.hi.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.hi.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.hi.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.hi.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.id.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.id.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.id.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.id.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ja.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ja.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ja.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ja.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ko.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ko.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ko.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ko.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ru.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ru.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ru.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ru.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.sw.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.sw.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.sw.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.sw.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.te.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.te.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.te.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.te.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.th.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.th.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.th.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.th.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.zh.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.zh.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.zh.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.zh.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.de.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.de.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.de.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-de-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.de.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.yo.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.yo.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.yo.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-yo-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.yo.dev.txt
| mDPR pFT+FT1 | 0.578 | 0.580 | 0.281 | 0.251 | 0.384 | 0.569 | 0.301 | 0.329 | 0.346 | 0.500 | 0.486 | 0.393 | 0.658 | 0.778 | 0.598 | 0.358 | 0.322 | 0.598 | 0.462 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-de-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-yo-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.yo.dev.txt
| mDPR pFT+FT2 | 0.725 | 0.684 | 0.488 | 0.565 | 0.593 | 0.714 | 0.589 | 0.516 | 0.496 | 0.642 | 0.590 | 0.597 | 0.685 | 0.804 | 0.695 | 0.650 | -- | -- | 0.627 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ar \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco-ft-miracl-ar \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ar.dev.txt
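In this condition both the query encoder and the index carry a language-specific -ft-miracl-<lang> suffix, and no in-language models are listed for de and yo (shown as -- in the table above). A sketch of the sweep over the 16 covered languages, using only names that appear in this section:
# de and yo are skipped: no in-language fine-tuned models are listed for them.
for lang in ar bn en es fa fi fr hi id ja ko ru sw te th zh; do
  python -m pyserini.search.faiss \
    --threads 16 --batch-size 512 \
    --encoder-class auto \
    --encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-${lang} \
    --topics miracl-v1.0-${lang}-dev \
    --index miracl-v1.0-${lang}-mdpr-tied-pft-msmarco-ft-miracl-${lang} \
    --output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.${lang}.dev.txt --hits 1000
  python -m pyserini.eval.trec_eval \
    -c -M 100 -m ndcg_cut.10 miracl-v1.0-${lang}-dev \
    run.miracl.mdpr-tied-pft-msmarco-ft-miracl.${lang}.dev.txt
done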
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-bn \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco-ft-miracl-bn \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-en \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco-ft-miracl-en \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-es \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco-ft-miracl-es \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fa \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco-ft-miracl-fa \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fi \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco-ft-miracl-fi \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fr \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco-ft-miracl-fr \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-hi \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco-ft-miracl-hi \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-id \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco-ft-miracl-id \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ja \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco-ft-miracl-ja \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ko \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco-ft-miracl-ko \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ru \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco-ft-miracl-ru \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-sw \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco-ft-miracl-sw \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-te \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco-ft-miracl-te \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-th \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco-ft-miracl-th \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-zh \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco-ft-miracl-zh \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.zh.dev.txt
| mContriever | 0.525 | 0.501 | 0.364 | 0.418 | 0.215 | 0.602 | 0.314 | 0.286 | 0.392 | 0.424 | 0.483 | 0.391 | 0.560 | 0.528 | 0.517 | 0.410 | 0.408 | 0.415 | 0.431 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.mcontriever-tied-pft-msmarco.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.mcontriever-tied-pft-msmarco.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.mcontriever-tied-pft-msmarco.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.mcontriever-tied-pft-msmarco.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.mcontriever-tied-pft-msmarco.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.mcontriever-tied-pft-msmarco.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.mcontriever-tied-pft-msmarco.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.mcontriever-tied-pft-msmarco.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.mcontriever-tied-pft-msmarco.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.mcontriever-tied-pft-msmarco.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.mcontriever-tied-pft-msmarco.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.mcontriever-tied-pft-msmarco.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.mcontriever-tied-pft-msmarco.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.mcontriever-tied-pft-msmarco.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.mcontriever-tied-pft-msmarco.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.mcontriever-tied-pft-msmarco.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-de-dev \
run.miracl.mcontriever-tied-pft-msmarco.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-yo-dev \
run.miracl.mcontriever-tied-pft-msmarco.yo.dev.txt
| Recall@100, dev queries | ar | bn | en | es | fa | fi | fr | hi | id | ja | ko | ru | sw | te | th | zh | de | yo | avg |
| BM25 | 0.889 | 0.909 | 0.819 | 0.702 | 0.731 | 0.891 | 0.653 | 0.868 | 0.904 | 0.805 | 0.783 | 0.661 | 0.701 | 0.831 | 0.887 | 0.560 | 0.572 | 0.733 | 0.772 |
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ar \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar \
--output run.miracl.bm25.ar.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.bm25.ar.dev.txt
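The runs in this section are the same BM25 runs generated for the nDCG@10 table; only the evaluation metric differs. If the run files already exist, both metrics can be requested in a single trec_eval call rather than repeating retrieval, for example:
python -m pyserini.eval.trec_eval \
  -c -m ndcg_cut.10 -m recall.100 miracl-v1.0-ar-dev \
  run.miracl.bm25.ar.dev.txt
The -M 100 cutoff used in the nDCG@10 section only truncates each ranking to the top 100 hits, so keeping or omitting it should not change either nDCG@10 or Recall@100.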
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language bn \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn \
--output run.miracl.bm25.bn.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.bm25.bn.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language en \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en \
--output run.miracl.bm25.en.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.bm25.en.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language es \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es \
--output run.miracl.bm25.es.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.bm25.es.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fa \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa \
--output run.miracl.bm25.fa.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.bm25.fa.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fi \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi \
--output run.miracl.bm25.fi.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.bm25.fi.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fr \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr \
--output run.miracl.bm25.fr.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.bm25.fr.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language hi \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi \
--output run.miracl.bm25.hi.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.bm25.hi.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language id \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id \
--output run.miracl.bm25.id.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.bm25.id.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ja \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja \
--output run.miracl.bm25.ja.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.bm25.ja.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ko \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko \
--output run.miracl.bm25.ko.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.bm25.ko.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ru \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru \
--output run.miracl.bm25.ru.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.bm25.ru.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language sw \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw \
--output run.miracl.bm25.sw.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.bm25.sw.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language te \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te \
--output run.miracl.bm25.te.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.bm25.te.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language th \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th \
--output run.miracl.bm25.th.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.bm25.th.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language zh \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh \
--output run.miracl.bm25.zh.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.bm25.zh.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language de \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de \
--output run.miracl.bm25.de.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-de-dev \
run.miracl.bm25.de.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 --pretokenized \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo \
--output run.miracl.bm25.yo.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-yo-dev \
run.miracl.bm25.yo.dev.txt
| mDPR pFT | 0.841 | 0.819 | 0.768 | 0.864 | 0.898 | 0.788 | 0.915 | 0.776 | 0.573 | 0.825 | 0.737 | 0.797 | 0.616 | 0.762 | 0.678 | 0.944 | 0.898 | 0.840 | 0.797 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-de-dev \
run.miracl.mdpr-tied-pft-msmarco.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-yo-dev \
run.miracl.mdpr-tied-pft-msmarco.yo.dev.txt
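The mDPR (pre-FT w/ MS MARCO) commands above differ only in the language code, so they can be scripted. Below is a minimal sketch, assuming Pyserini is installed locally, that issues the same pyserini.search.faiss and trec_eval invocations shown above for every language:
import subprocess

LANGS = ["ar", "bn", "en", "es", "fa", "fi", "fr", "hi", "id",
         "ja", "ko", "ru", "sw", "te", "th", "zh", "de", "yo"]

for lang in LANGS:
    run = f"run.miracl.mdpr-tied-pft-msmarco.{lang}.dev.txt"
    # Same retrieval command as shown above, one language at a time.
    subprocess.run([
        "python", "-m", "pyserini.search.faiss",
        "--threads", "16", "--batch-size", "512",
        "--encoder-class", "auto",
        "--encoder", "castorini/mdpr-tied-pft-msmarco",
        "--topics", f"miracl-v1.0-{lang}-dev",
        "--index", f"miracl-v1.0-{lang}-mdpr-tied-pft-msmarco",
        "--output", run, "--hits", "1000",
    ], check=True)
    # Same evaluation command as shown above.
    subprocess.run([
        "python", "-m", "pyserini.eval.trec_eval",
        "-c", "-m", "recall.100", f"miracl-v1.0-{lang}-dev", run,
    ], check=True)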
| BM25+mDPR pFT | 0.941 | 0.932 | 0.882 | 0.948 | 0.937 | 0.895 | 0.965 | 0.912 | 0.768 | 0.904 | 0.900 | 0.874 | 0.725 | 0.857 | 0.823 | 0.959 | 0.948 | 0.950 | | 0.895 |
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ar.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ar.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ar.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ar.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.bn.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.bn.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.bn.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.bn.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.en.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.en.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.en.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.en.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.es.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.es.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.es.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.es.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fa.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fa.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fa.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fa.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fi.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fi.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fi.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fi.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fr.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fr.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fr.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fr.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.hi.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.hi.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.hi.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.hi.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.id.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.id.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.id.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.id.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ja.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ja.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ja.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ja.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ko.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ko.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ko.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ko.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ru.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ru.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ru.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ru.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.sw.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.sw.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.sw.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.sw.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.te.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.te.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.te.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.te.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.th.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.th.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.th.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.th.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.zh.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.zh.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.zh.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.zh.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.de.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.de.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.de.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-de-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.de.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.yo.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.yo.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.yo.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-yo-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.yo.dev.txt
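The hybrid runs above are produced with pyserini.fusion using --method interpolation --alpha 0.5, i.e., each document's fused score is a weighted combination of its scores in the sparse and dense runs for the same language. Below is a simplified sketch of that idea over two TREC-format run files; it is not the actual pyserini.fusion implementation (which may differ in how it normalizes scores, breaks ties, and handles missing documents), and it treats a document absent from one run as having score 0 in that run.
from collections import defaultdict

def interpolate_runs(run_a_path, run_b_path, alpha=0.5, k=1000):
    # Simplified interpolation fusion: alpha * score_a + (1 - alpha) * score_b.
    def load(path):
        scores = defaultdict(dict)
        with open(path) as f:
            for line in f:
                qid, _, docid, _, score, _ = line.split()
                scores[qid][docid] = float(score)
        return scores

    a, b = load(run_a_path), load(run_b_path)
    fused = {}
    for qid in set(a) | set(b):
        combined = {}
        for docid in set(a[qid]) | set(b[qid]):
            combined[docid] = alpha * a[qid].get(docid, 0.0) + (1 - alpha) * b[qid].get(docid, 0.0)
        # Keep the top-k documents per query, sorted by fused score.
        fused[qid] = sorted(combined.items(), key=lambda x: -x[1])[:k]
    return fused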
| mDPR pFT+FT1 | 0.795 | 0.848 | 0.508 | 0.471 | 0.686 | 0.798 | 0.601 | 0.637 | 0.584 | 0.745 | 0.718 | 0.671 | 0.888 | 0.951 | 0.836 | 0.673 | 0.599 | 0.891 | | 0.717 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-de-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-yo-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.yo.dev.txt
| mDPR pFT+FT2 | 0.949 | 0.955 | 0.834 | 0.911 | 0.913 | 0.948 | 0.954 | 0.886 | 0.864 | 0.923 | 0.886 | 0.910 | 0.937 | 0.962 | 0.931 | 0.963 | -- | -- | | 0.920 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ar \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco-ft-miracl-ar \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-bn \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco-ft-miracl-bn \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-en \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco-ft-miracl-en \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-es \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco-ft-miracl-es \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fa \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco-ft-miracl-fa \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fi \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco-ft-miracl-fi \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fr \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco-ft-miracl-fr \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-hi \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco-ft-miracl-hi \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-id \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco-ft-miracl-id \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ja \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco-ft-miracl-ja \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ko \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco-ft-miracl-ko \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ru \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco-ft-miracl-ru \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-sw \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco-ft-miracl-sw \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-te \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco-ft-miracl-te \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-th \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco-ft-miracl-th \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-zh \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco-ft-miracl-zh \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.zh.dev.txt
(There are no runs for de and yo under this condition: in-language fine-tuned models are not available for these two languages, hence the dashes in the table row for this condition.)
| mContriever | 0.925 | 0.921 | 0.797 | 0.841 | 0.654 | 0.953 | 0.824 | 0.646 | 0.802 | 0.878 | 0.875 | 0.850 | 0.911 | 0.961 | 0.936 | 0.903 | 0.841 | 0.770 | | 0.849 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.mcontriever-tied-pft-msmarco.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.mcontriever-tied-pft-msmarco.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.mcontriever-tied-pft-msmarco.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.mcontriever-tied-pft-msmarco.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.mcontriever-tied-pft-msmarco.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.mcontriever-tied-pft-msmarco.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.mcontriever-tied-pft-msmarco.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.mcontriever-tied-pft-msmarco.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.mcontriever-tied-pft-msmarco.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.mcontriever-tied-pft-msmarco.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.mcontriever-tied-pft-msmarco.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.mcontriever-tied-pft-msmarco.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.mcontriever-tied-pft-msmarco.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.mcontriever-tied-pft-msmarco.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.mcontriever-tied-pft-msmarco.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.mcontriever-tied-pft-msmarco.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-de-dev \
run.miracl.mcontriever-tied-pft-msmarco.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-yo-dev \
run.miracl.mcontriever-tied-pft-msmarco.yo.dev.txt
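The mContriever runs above rely on pyserini.search.faiss with --encoder-class contriever to encode queries on the fly. As a rough illustration of what that encoder does, below is a minimal sketch of encoding text with the facebook/mcontriever-msmarco checkpoint via Hugging Face Transformers, assuming Contriever-style mean pooling over token embeddings; the query string is an arbitrary placeholder, and the Pyserini commands above remain the reference for reported numbers.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/mcontriever-msmarco")
model = AutoModel.from_pretrained("facebook/mcontriever-msmarco")

def encode(texts):
    # Tokenize, run the encoder, and mean-pool token embeddings,
    # ignoring padding positions via the attention mask.
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

query_embedding = encode(["an example query"])  # shape: (1, hidden_size)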
Programmatic Execution
All of the experimental runs shown above can be reproduced programmatically using the instructions below.
To list all the experimental conditions:
python -m pyserini.2cr.miracl --list-conditions
Run all languages for a specific condition and show commands:
python -m pyserini.2cr.miracl --condition bm25 --display-commands
Run a particular language for a specific condition and show commands:
python -m pyserini.2cr.miracl --condition bm25 --language ko --display-commands
Run all languages for all conditions and show commands:
python -m pyserini.2cr.miracl --all --display-commands
With the above commands, run files will be placed in the current directory. Use the option --directory runs to place the runs in a sub-directory.
For a specific condition, just show the commands without running them:
python -m pyserini.2cr.miracl --condition bm25 --display-commands --dry-run
This generates exactly the commands for the specified condition (corresponding to a row in the table) without executing them.
For a specific condition and language, just show the commands without running them:
python -m pyserini.2cr.miracl --condition bm25 --language ko --display-commands --dry-run
For all conditions, just show the commands without running them, and skip evaluation:
python -m pyserini.2cr.miracl --all --display-commands --dry-run --skip-eval
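The same driver can also be invoked from a Python script rather than the shell. A minimal sketch using only the options documented above (the chosen condition and languages are arbitrary examples):
import subprocess

# Script the documented 2cr driver: BM25 condition for a few languages.
for lang in ["ko", "sw", "yo"]:
    subprocess.run([
        "python", "-m", "pyserini.2cr.miracl",
        "--condition", "bm25",
        "--language", lang,
        "--display-commands",
    ], check=True)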
Finally, to generate this page:
python -m pyserini.2cr.miracl --generate-report --output docs/2cr/miracl.html
The output file miracl.html should be identical to this page.