This page provides two-click reproductions* for a number of experimental runs on the MIRACL dataset. Instructions for programmatic execution are shown at the bottom of this page. The dataset is described in the following paper:
Xinyu Zhang, Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, and Jimmy Lin. MIRACL: A Multilingual Retrieval Dataset Covering 18 Diverse Languages. Transactions of the Association for Computational Linguistics, 11:1114–1131, 2023.
Many of the models presented on this page are described in the following paper:
Xinyu Zhang, Kelechi Ogueji, Xueguang Ma, and Jimmy Lin. Towards Best Practices for Training Multilingual Dense Retrieval Models. ACM Transactions on Information Systems, 42(2), Article No. 39, 2023.
Key:
- BM25: BM25 (Lucene)
- mDPR pFT: mDPR (tied encoders), pre-fine-tuned (pre-FT) with MS MARCO
- BM25+mDPR pFT: hybrid (interpolation) of BM25 and mDPR (tied encoders), pre-FT with MS MARCO
- mDPR pFT+FT1: mDPR (tied encoders), pre-FT with MS MARCO, then fine-tuned (FT) with all Mr. TyDi
- mDPR pFT+FT2: mDPR (tied encoders), pre-FT with MS MARCO, then fine-tuned in-language with MIRACL
- mContriever: mContriever (tied encoders), pre-FT with MS MARCO
| nDCG@10, dev queries | ar | bn | en | es | fa | fi | fr | hi | id | ja | ko | ru | sw | te | th | zh | de | yo | avg |
| BM25 | 0.481 | 0.508 | 0.351 | 0.319 | 0.333 | 0.551 | 0.183 | 0.458 | 0.449 | 0.369 | 0.419 | 0.334 | 0.383 | 0.494 | 0.484 | 0.180 | 0.226 | 0.406 | 0.385 |
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ar \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar \
--output run.miracl.bm25.ar.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.bm25.ar.dev.txt
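The remaining BM25 commands in this section follow the same template, with only the language code changing; Yoruba (yo) is the exception, dropping --language in favor of --pretokenized (see the final command pair below). As a convenience, here is a rough bash sketch of the full sweep, built only from the commands shown on this page:
# BM25 sweep; yo is handled separately below (no --language, --pretokenized instead).
for lang in ar bn en es fa fi fr hi id ja ko ru sw te th zh de; do
  python -m pyserini.search.lucene \
    --threads 16 --batch-size 128 \
    --language ${lang} \
    --topics miracl-v1.0-${lang}-dev \
    --index miracl-v1.0-${lang} \
    --output run.miracl.bm25.${lang}.dev.txt \
    --bm25 --hits 1000
  python -m pyserini.eval.trec_eval \
    -c -M 100 -m ndcg_cut.10 miracl-v1.0-${lang}-dev \
    run.miracl.bm25.${lang}.dev.txt
done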
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language bn \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn \
--output run.miracl.bm25.bn.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.bm25.bn.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language en \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en \
--output run.miracl.bm25.en.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.bm25.en.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language es \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es \
--output run.miracl.bm25.es.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.bm25.es.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fa \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa \
--output run.miracl.bm25.fa.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.bm25.fa.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fi \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi \
--output run.miracl.bm25.fi.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.bm25.fi.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fr \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr \
--output run.miracl.bm25.fr.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.bm25.fr.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language hi \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi \
--output run.miracl.bm25.hi.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.bm25.hi.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language id \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id \
--output run.miracl.bm25.id.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.bm25.id.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ja \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja \
--output run.miracl.bm25.ja.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.bm25.ja.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ko \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko \
--output run.miracl.bm25.ko.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.bm25.ko.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ru \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru \
--output run.miracl.bm25.ru.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.bm25.ru.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language sw \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw \
--output run.miracl.bm25.sw.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.bm25.sw.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language te \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te \
--output run.miracl.bm25.te.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.bm25.te.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language th \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th \
--output run.miracl.bm25.th.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.bm25.th.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language zh \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh \
--output run.miracl.bm25.zh.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.bm25.zh.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language de \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de \
--output run.miracl.bm25.de.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-de-dev \
run.miracl.bm25.de.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 --pretokenized \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo \
--output run.miracl.bm25.yo.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-yo-dev \
run.miracl.bm25.yo.dev.txt
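The avg column in the table above is the unweighted mean over the 18 languages. Here is a minimal bash sketch for recomputing it from existing run files, assuming trec_eval's usual summary output, in which the second whitespace-separated field is "all" and the third is the score:
# Print per-language nDCG@10, then average the 18 scores (the avg column).
for lang in ar bn en es fa fi fr hi id ja ko ru sw te th zh de yo; do
  python -m pyserini.eval.trec_eval \
    -c -M 100 -m ndcg_cut.10 miracl-v1.0-${lang}-dev \
    run.miracl.bm25.${lang}.dev.txt | awk -v l=${lang} '$2 == "all" {print l, $3}'
done | awk '{sum += $2; n += 1} END {printf "avg nDCG@10 over %d languages: %.3f\n", n, sum / n}'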
| mDPR pFT | 0.499 | 0.443 | 0.394 | 0.478 | 0.480 | 0.472 | 0.435 | 0.383 | 0.272 | 0.439 | 0.419 | 0.407 | 0.299 | 0.356 | 0.358 | 0.512 | 0.490 | 0.444 | 0.421 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco.ar.dev.txt
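The rest of this section repeats the same dense-retrieval command per language: a single shared query encoder (castorini/mdpr-tied-pft-msmarco) is paired with the matching per-language prebuilt index. A compact sketch of the sweep, again using only arguments that appear in this section:
# Same query encoder for every language; only the topics, index, and output names change.
for lang in ar bn en es fa fi fr hi id ja ko ru sw te th zh de yo; do
  python -m pyserini.search.faiss \
    --threads 16 --batch-size 512 \
    --encoder-class auto \
    --encoder castorini/mdpr-tied-pft-msmarco \
    --topics miracl-v1.0-${lang}-dev \
    --index miracl-v1.0-${lang}-mdpr-tied-pft-msmarco \
    --output run.miracl.mdpr-tied-pft-msmarco.${lang}.dev.txt --hits 1000
  python -m pyserini.eval.trec_eval \
    -c -M 100 -m ndcg_cut.10 miracl-v1.0-${lang}-dev \
    run.miracl.mdpr-tied-pft-msmarco.${lang}.dev.txt
done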
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-de-dev \
run.miracl.mdpr-tied-pft-msmarco.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-yo-dev \
run.miracl.mdpr-tied-pft-msmarco.yo.dev.txt
| BM25+mDPR pFT | 0.673 | 0.654 | 0.549 | 0.641 | 0.594 | 0.672 | 0.523 | 0.616 | 0.443 | 0.576 | 0.609 | 0.532 | 0.446 | 0.602 | 0.599 | 0.525 | 0.564 | 0.611 | 0.579 |
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ar.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ar.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ar.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ar.dev.txt
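Note that the fusion commands in this section read input runs named *.dev.top1000.txt, while the BM25 and mDPR pFT sections above write *.dev.txt; regenerate (or copy) the two input runs under the expected names before fusing. The following is a sketch of the complete hybrid pipeline for one language, assembled from commands that appear elsewhere on this page (for yo, substitute the pretokenized BM25 command shown at the end of the BM25 section):
lang=ar  # any language code from the table above
# Sparse run, written under the *.top1000.txt name the fusion step expects.
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language ${lang} \
  --topics miracl-v1.0-${lang}-dev \
  --index miracl-v1.0-${lang} \
  --output run.miracl.bm25.${lang}.dev.top1000.txt \
  --bm25 --hits 1000
# Dense run with the shared mDPR encoder.
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics miracl-v1.0-${lang}-dev \
  --index miracl-v1.0-${lang}-mdpr-tied-pft-msmarco \
  --output run.miracl.mdpr-tied-pft-msmarco.${lang}.dev.top1000.txt --hits 1000
# Interpolate the two runs (alpha 0.5) and evaluate the hybrid.
python -m pyserini.fusion \
  --runs run.miracl.bm25.${lang}.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.${lang}.dev.top1000.txt \
  --output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.${lang}.dev.txt \
  --method interpolation --alpha 0.5 --depth 1000 --k 1000
python -m pyserini.eval.trec_eval \
  -c -M 100 -m ndcg_cut.10 miracl-v1.0-${lang}-dev \
  run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.${lang}.dev.txt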
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.bn.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.bn.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.bn.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.bn.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.en.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.en.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.en.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.en.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.es.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.es.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.es.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.es.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fa.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fa.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fa.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fa.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fi.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fi.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fi.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fi.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fr.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fr.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fr.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fr.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.hi.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.hi.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.hi.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.hi.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.id.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.id.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.id.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.id.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ja.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ja.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ja.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ja.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ko.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ko.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ko.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ko.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ru.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ru.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ru.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ru.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.sw.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.sw.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.sw.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.sw.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.te.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.te.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.te.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.te.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.th.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.th.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.th.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.th.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.zh.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.zh.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.zh.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.zh.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.de.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.de.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.de.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-de-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.de.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.yo.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.yo.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.yo.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-yo-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.yo.dev.txt
| mDPR pFT+FT1 | 0.578 | 0.580 | 0.281 | 0.251 | 0.384 | 0.569 | 0.301 | 0.329 | 0.346 | 0.500 | 0.486 | 0.393 | 0.658 | 0.778 | 0.598 | 0.358 | 0.322 | 0.598 | 0.462 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-de-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-yo-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.yo.dev.txt
| mDPR pFT+FT2 | 0.725 | 0.684 | 0.488 | 0.565 | 0.593 | 0.714 | 0.589 | 0.516 | 0.496 | 0.642 | 0.590 | 0.597 | 0.685 | 0.804 | 0.695 | 0.650 | -- | -- | 0.627 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ar \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco-ft-miracl-ar \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ar.dev.txt
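In this condition both the query encoder and the index carry a language-specific -ft-miracl-<lang> suffix, and no in-language models are listed for de and yo (shown as -- in the table above). A sketch of the sweep over the 16 covered languages, using only names that appear in this section:
# de and yo are skipped: no in-language fine-tuned models are listed for them.
for lang in ar bn en es fa fi fr hi id ja ko ru sw te th zh; do
  python -m pyserini.search.faiss \
    --threads 16 --batch-size 512 \
    --encoder-class auto \
    --encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-${lang} \
    --topics miracl-v1.0-${lang}-dev \
    --index miracl-v1.0-${lang}-mdpr-tied-pft-msmarco-ft-miracl-${lang} \
    --output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.${lang}.dev.txt --hits 1000
  python -m pyserini.eval.trec_eval \
    -c -M 100 -m ndcg_cut.10 miracl-v1.0-${lang}-dev \
    run.miracl.mdpr-tied-pft-msmarco-ft-miracl.${lang}.dev.txt
done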
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-bn \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco-ft-miracl-bn \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-en \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco-ft-miracl-en \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-es \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco-ft-miracl-es \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fa \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco-ft-miracl-fa \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fi \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco-ft-miracl-fi \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fr \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco-ft-miracl-fr \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-hi \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco-ft-miracl-hi \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-id \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco-ft-miracl-id \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ja \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco-ft-miracl-ja \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ko \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco-ft-miracl-ko \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ru \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco-ft-miracl-ru \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-sw \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco-ft-miracl-sw \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-te \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco-ft-miracl-te \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-th \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco-ft-miracl-th \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-zh \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco-ft-miracl-zh \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.zh.dev.txt
| mContriever | 0.525 | 0.501 | 0.364 | 0.418 | 0.215 | 0.602 | 0.314 | 0.286 | 0.392 | 0.424 | 0.483 | 0.391 | 0.560 | 0.528 | 0.517 | 0.410 | 0.408 | 0.415 | 0.431 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ar-dev \
run.miracl.mcontriever-tied-pft-msmarco.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-bn-dev \
run.miracl.mcontriever-tied-pft-msmarco.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-en-dev \
run.miracl.mcontriever-tied-pft-msmarco.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-es-dev \
run.miracl.mcontriever-tied-pft-msmarco.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fa-dev \
run.miracl.mcontriever-tied-pft-msmarco.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fi-dev \
run.miracl.mcontriever-tied-pft-msmarco.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-fr-dev \
run.miracl.mcontriever-tied-pft-msmarco.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-hi-dev \
run.miracl.mcontriever-tied-pft-msmarco.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-id-dev \
run.miracl.mcontriever-tied-pft-msmarco.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ja-dev \
run.miracl.mcontriever-tied-pft-msmarco.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ko-dev \
run.miracl.mcontriever-tied-pft-msmarco.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-ru-dev \
run.miracl.mcontriever-tied-pft-msmarco.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-sw-dev \
run.miracl.mcontriever-tied-pft-msmarco.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-te-dev \
run.miracl.mcontriever-tied-pft-msmarco.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-th-dev \
run.miracl.mcontriever-tied-pft-msmarco.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-zh-dev \
run.miracl.mcontriever-tied-pft-msmarco.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-de-dev \
run.miracl.mcontriever-tied-pft-msmarco.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -M 100 -m ndcg_cut.10 miracl-v1.0-yo-dev \
run.miracl.mcontriever-tied-pft-msmarco.yo.dev.txt
| Recall@100, dev queries | ar | bn | en | es | fa | fi | fr | hi | id | ja | ko | ru | sw | te | th | zh | de | yo | avg |
| BM25 | 0.889 | 0.909 | 0.819 | 0.702 | 0.731 | 0.891 | 0.653 | 0.868 | 0.904 | 0.805 | 0.783 | 0.661 | 0.701 | 0.831 | 0.887 | 0.560 | 0.572 | 0.733 | 0.772 |
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ar \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar \
--output run.miracl.bm25.ar.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.bm25.ar.dev.txt
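The runs in this section are the same BM25 runs generated for the nDCG@10 table; only the evaluation metric differs. If the run files already exist, both metrics can be requested in a single trec_eval call rather than repeating retrieval, for example:
python -m pyserini.eval.trec_eval \
  -c -m ndcg_cut.10 -m recall.100 miracl-v1.0-ar-dev \
  run.miracl.bm25.ar.dev.txt
The -M 100 cutoff used in the nDCG@10 section only truncates each ranking to the top 100 hits, so keeping or omitting it should not change either nDCG@10 or Recall@100.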
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language bn \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn \
--output run.miracl.bm25.bn.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.bm25.bn.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language en \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en \
--output run.miracl.bm25.en.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.bm25.en.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language es \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es \
--output run.miracl.bm25.es.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.bm25.es.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fa \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa \
--output run.miracl.bm25.fa.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.bm25.fa.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fi \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi \
--output run.miracl.bm25.fi.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.bm25.fi.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language fr \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr \
--output run.miracl.bm25.fr.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.bm25.fr.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language hi \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi \
--output run.miracl.bm25.hi.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.bm25.hi.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language id \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id \
--output run.miracl.bm25.id.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.bm25.id.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ja \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja \
--output run.miracl.bm25.ja.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.bm25.ja.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ko \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko \
--output run.miracl.bm25.ko.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.bm25.ko.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language ru \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru \
--output run.miracl.bm25.ru.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.bm25.ru.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language sw \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw \
--output run.miracl.bm25.sw.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.bm25.sw.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language te \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te \
--output run.miracl.bm25.te.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.bm25.te.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language th \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th \
--output run.miracl.bm25.th.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.bm25.th.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language zh \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh \
--output run.miracl.bm25.zh.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.bm25.zh.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--language de \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de \
--output run.miracl.bm25.de.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-de-dev \
run.miracl.bm25.de.dev.txt
Command to generate run:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 --pretokenized \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo \
--output run.miracl.bm25.yo.dev.txt \
--bm25 --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-yo-dev \
run.miracl.bm25.yo.dev.txt
| mDPR pFT | 0.841 | 0.819 | 0.768 | 0.864 | 0.898 | 0.788 | 0.915 | 0.776 | 0.573 | 0.825 | 0.737 | 0.797 | 0.616 | 0.762 | 0.678 | 0.944 | 0.898 | 0.840 | 0.797 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-de-dev \
run.miracl.mdpr-tied-pft-msmarco.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mdpr-tied-pft-msmarco \
--output run.miracl.mdpr-tied-pft-msmarco.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-yo-dev \
run.miracl.mdpr-tied-pft-msmarco.yo.dev.txt
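The mDPR (pre-FT w/ MS MARCO) commands above differ only in the language code, so they can be scripted. Below is a minimal sketch, assuming Pyserini is installed locally, that issues the same pyserini.search.faiss and trec_eval invocations shown above for every language:
import subprocess

LANGS = ["ar", "bn", "en", "es", "fa", "fi", "fr", "hi", "id",
         "ja", "ko", "ru", "sw", "te", "th", "zh", "de", "yo"]

for lang in LANGS:
    run = f"run.miracl.mdpr-tied-pft-msmarco.{lang}.dev.txt"
    # Same retrieval command as shown above, one language at a time.
    subprocess.run([
        "python", "-m", "pyserini.search.faiss",
        "--threads", "16", "--batch-size", "512",
        "--encoder-class", "auto",
        "--encoder", "castorini/mdpr-tied-pft-msmarco",
        "--topics", f"miracl-v1.0-{lang}-dev",
        "--index", f"miracl-v1.0-{lang}-mdpr-tied-pft-msmarco",
        "--output", run, "--hits", "1000",
    ], check=True)
    # Same evaluation command as shown above.
    subprocess.run([
        "python", "-m", "pyserini.eval.trec_eval",
        "-c", "-m", "recall.100", f"miracl-v1.0-{lang}-dev", run,
    ], check=True)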
| BM25+mDPR pFT | 0.941 | 0.932 | 0.882 | 0.948 | 0.937 | 0.895 | 0.965 | 0.912 | 0.768 | 0.904 | 0.900 | 0.874 | 0.725 | 0.857 | 0.823 | 0.959 | 0.948 | 0.950 | | 0.895 |
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ar.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ar.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ar.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ar.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.bn.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.bn.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.bn.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.bn.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.en.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.en.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.en.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.en.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.es.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.es.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.es.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.es.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fa.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fa.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fa.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fa.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fi.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fi.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fi.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fi.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.fr.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.fr.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fr.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.fr.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.hi.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.hi.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.hi.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.hi.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.id.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.id.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.id.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.id.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ja.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ja.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ja.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ja.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ko.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ko.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ko.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ko.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.ru.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.ru.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ru.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.ru.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.sw.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.sw.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.sw.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.sw.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.te.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.te.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.te.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.te.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.th.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.th.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.th.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.th.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.zh.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.zh.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.zh.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.zh.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.de.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.de.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.de.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-de-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.de.dev.txt
Command to generate run:
python -m pyserini.fusion \
--runs run.miracl.bm25.yo.dev.top1000.txt run.miracl.mdpr-tied-pft-msmarco.yo.dev.top1000.txt \
--output run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.yo.dev.txt --method interpolation --alpha 0.5 --depth 1000 --k 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-yo-dev \
run.miracl.bm25-mdpr-tied-pft-msmarco-hybrid.yo.dev.txt
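The hybrid runs above are produced with pyserini.fusion using --method interpolation --alpha 0.5, i.e., each document's fused score is a weighted combination of its scores in the sparse and dense runs for the same language. Below is a simplified sketch of that idea over two TREC-format run files; it is not the actual pyserini.fusion implementation (which may differ in how it normalizes scores, breaks ties, and handles missing documents), and it treats a document absent from one run as having score 0 in that run.
from collections import defaultdict

def interpolate_runs(run_a_path, run_b_path, alpha=0.5, k=1000):
    # Simplified interpolation fusion: alpha * score_a + (1 - alpha) * score_b.
    def load(path):
        scores = defaultdict(dict)
        with open(path) as f:
            for line in f:
                qid, _, docid, _, score, _ = line.split()
                scores[qid][docid] = float(score)
        return scores

    a, b = load(run_a_path), load(run_b_path)
    fused = {}
    for qid in set(a) | set(b):
        combined = {}
        for docid in set(a[qid]) | set(b[qid]):
            combined[docid] = alpha * a[qid].get(docid, 0.0) + (1 - alpha) * b[qid].get(docid, 0.0)
        # Keep the top-k documents per query, sorted by fused score.
        fused[qid] = sorted(combined.items(), key=lambda x: -x[1])[:k]
    return fused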
| mDPR pFT+FT1 | 0.795 | 0.848 | 0.508 | 0.471 | 0.686 | 0.798 | 0.601 | 0.637 | 0.584 | 0.745 | 0.718 | 0.671 | 0.888 | 0.951 | 0.836 | 0.673 | 0.599 | 0.891 | | 0.717 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-de-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-all \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mdpr-tied-pft-msmarco-ft-all \
--output run.miracl.mdpr-tied-pft-msmarco-ft-all.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-yo-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-all.yo.dev.txt
| mDPR pFT+FT2 | 0.949 | 0.955 | 0.834 | 0.911 | 0.913 | 0.948 | 0.954 | 0.886 | 0.864 | 0.923 | 0.886 | 0.910 | 0.937 | 0.962 | 0.931 | 0.963 | -- | -- | | 0.920 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ar \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mdpr-tied-pft-msmarco-ft-miracl-ar \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-bn \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mdpr-tied-pft-msmarco-ft-miracl-bn \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-en \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mdpr-tied-pft-msmarco-ft-miracl-en \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-es \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mdpr-tied-pft-msmarco-ft-miracl-es \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fa \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mdpr-tied-pft-msmarco-ft-miracl-fa \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fi \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mdpr-tied-pft-msmarco-ft-miracl-fi \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-fr \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mdpr-tied-pft-msmarco-ft-miracl-fr \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-hi \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mdpr-tied-pft-msmarco-ft-miracl-hi \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-id \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mdpr-tied-pft-msmarco-ft-miracl-id \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ja \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mdpr-tied-pft-msmarco-ft-miracl-ja \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ko \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mdpr-tied-pft-msmarco-ft-miracl-ko \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-ru \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mdpr-tied-pft-msmarco-ft-miracl-ru \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-sw \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mdpr-tied-pft-msmarco-ft-miracl-sw \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-te \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mdpr-tied-pft-msmarco-ft-miracl-te \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-th \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mdpr-tied-pft-msmarco-ft-miracl-th \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder castorini/mdpr-tied-pft-msmarco-ft-miracl-zh \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mdpr-tied-pft-msmarco-ft-miracl-zh \
--output run.miracl.mdpr-tied-pft-msmarco-ft-miracl.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.mdpr-tied-pft-msmarco-ft-miracl.zh.dev.txt
(There are no runs for de and yo under this condition: in-language fine-tuned models are not available for these two languages, hence the dashes in the table row for this condition.)
| mContriever | 0.925 | 0.921 | 0.797 | 0.841 | 0.654 | 0.953 | 0.824 | 0.646 | 0.802 | 0.878 | 0.875 | 0.850 | 0.911 | 0.961 | 0.936 | 0.903 | 0.841 | 0.770 | | 0.849 |
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ar-dev \
--index miracl-v1.0-ar-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ar.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ar-dev \
run.miracl.mcontriever-tied-pft-msmarco.ar.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-bn-dev \
--index miracl-v1.0-bn-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.bn.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-bn-dev \
run.miracl.mcontriever-tied-pft-msmarco.bn.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-en-dev \
--index miracl-v1.0-en-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.en.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-en-dev \
run.miracl.mcontriever-tied-pft-msmarco.en.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-es-dev \
--index miracl-v1.0-es-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.es.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-es-dev \
run.miracl.mcontriever-tied-pft-msmarco.es.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fa-dev \
--index miracl-v1.0-fa-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fa.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fa-dev \
run.miracl.mcontriever-tied-pft-msmarco.fa.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fi-dev \
--index miracl-v1.0-fi-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fi-dev \
run.miracl.mcontriever-tied-pft-msmarco.fi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-fr-dev \
--index miracl-v1.0-fr-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.fr.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-fr-dev \
run.miracl.mcontriever-tied-pft-msmarco.fr.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-hi-dev \
--index miracl-v1.0-hi-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.hi.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-hi-dev \
run.miracl.mcontriever-tied-pft-msmarco.hi.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-id-dev \
--index miracl-v1.0-id-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.id.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-id-dev \
run.miracl.mcontriever-tied-pft-msmarco.id.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ja-dev \
--index miracl-v1.0-ja-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ja.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ja-dev \
run.miracl.mcontriever-tied-pft-msmarco.ja.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ko-dev \
--index miracl-v1.0-ko-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ko.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ko-dev \
run.miracl.mcontriever-tied-pft-msmarco.ko.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-ru-dev \
--index miracl-v1.0-ru-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.ru.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-ru-dev \
run.miracl.mcontriever-tied-pft-msmarco.ru.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-sw-dev \
--index miracl-v1.0-sw-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.sw.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-sw-dev \
run.miracl.mcontriever-tied-pft-msmarco.sw.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-te-dev \
--index miracl-v1.0-te-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.te.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-te-dev \
run.miracl.mcontriever-tied-pft-msmarco.te.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-th-dev \
--index miracl-v1.0-th-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.th.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-th-dev \
run.miracl.mcontriever-tied-pft-msmarco.th.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-zh-dev \
--index miracl-v1.0-zh-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.zh.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-zh-dev \
run.miracl.mcontriever-tied-pft-msmarco.zh.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-de-dev \
--index miracl-v1.0-de-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.de.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-de-dev \
run.miracl.mcontriever-tied-pft-msmarco.de.dev.txt
Command to generate run:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class contriever \
--encoder facebook/mcontriever-msmarco \
--topics miracl-v1.0-yo-dev \
--index miracl-v1.0-yo-mcontriever-pft-msmarco \
--output run.miracl.mcontriever-tied-pft-msmarco.yo.dev.txt --hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval \
-c -m recall.100 miracl-v1.0-yo-dev \
run.miracl.mcontriever-tied-pft-msmarco.yo.dev.txt
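The mContriever runs above rely on pyserini.search.faiss with --encoder-class contriever to encode queries on the fly. As a rough illustration of what that encoder does, below is a minimal sketch of encoding text with the facebook/mcontriever-msmarco checkpoint via Hugging Face Transformers, assuming Contriever-style mean pooling over token embeddings; the query string is an arbitrary placeholder, and the Pyserini commands above remain the reference for reported numbers.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/mcontriever-msmarco")
model = AutoModel.from_pretrained("facebook/mcontriever-msmarco")

def encode(texts):
    # Tokenize, run the encoder, and mean-pool token embeddings,
    # ignoring padding positions via the attention mask.
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

query_embedding = encode(["an example query"])  # shape: (1, hidden_size)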
Programmatic Execution
All of the experimental runs shown above can be reproduced programmatically using the instructions below.
To list all the experimental conditions:
python -m pyserini.2cr.miracl --list-conditions
Run all languages for a specific condition and show commands:
python -m pyserini.2cr.miracl --condition bm25 --display-commands
Run a particular language for a specific condition and show commands:
python -m pyserini.2cr.miracl --condition bm25 --language ko --display-commands
Run all languages for all conditions and show commands:
python -m pyserini.2cr.miracl --all --display-commands
With the above commands, run files will be placed in the current directory. Use the option --directory runs to place the runs in a sub-directory.
For a specific condition, just show the commands without running them:
python -m pyserini.2cr.miracl --condition bm25 --display-commands --dry-run
This generates exactly the commands for the specified condition (corresponding to a row in the table) without executing them.
For a specific condition and language, just show the commands without running them:
python -m pyserini.2cr.miracl --condition bm25 --language ko --display-commands --dry-run
For all conditions, just show the commands without running them, and skip evaluation:
python -m pyserini.2cr.miracl --all --display-commands --dry-run --skip-eval
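The same driver can also be invoked from a Python script rather than the shell. A minimal sketch using only the options documented above (the chosen condition and languages are arbitrary examples):
import subprocess

# Script the documented 2cr driver: BM25 condition for a few languages.
for lang in ["ko", "sw", "yo"]:
    subprocess.run([
        "python", "-m", "pyserini.2cr.miracl",
        "--condition", "bm25",
        "--language", lang,
        "--display-commands",
    ], check=True)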
Finally, to generate this page:
python -m pyserini.2cr.miracl --generate-report --output docs/2cr/miracl.html
The output file miracl.html should be identical to this page.