Pyserini Reproductions: Mr. TyDi

This page provides two-click reproductions* for a number of experimental runs on the Mr. TyDi dataset. Instructions for programmatic execution are shown at the bottom of this page. The dataset is described in the following paper:

Xinyu Zhang, Xueguang Ma, Peng Shi, and Jimmy Lin. Mr. TyDi: A Multi-lingual Benchmark for Dense Retrieval. Proceedings of the 1st Workshop on Multilingual Representation Learning, pages 127-137, November 2021, Punta Cana, Dominican Republic.

Key:

MRR@100, test queries (columns: ar, bn, en, fi, id, ja, ko, ru, sw, te, th, avg)
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language ar \
  --topics mrtydi-v1.1-arabic-test \
  --index mrtydi-v1.1-ar \
  --output run.mrtydi.bm25.ar.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-arabic-test \
  run.mrtydi.bm25.ar.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language bn \
  --topics mrtydi-v1.1-bengali-test \
  --index mrtydi-v1.1-bn \
  --output run.mrtydi.bm25.bn.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-bengali-test \
  run.mrtydi.bm25.bn.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language en \
  --topics mrtydi-v1.1-english-test \
  --index mrtydi-v1.1-en \
  --output run.mrtydi.bm25.en.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-english-test \
  run.mrtydi.bm25.en.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language fi \
  --topics mrtydi-v1.1-finnish-test \
  --index mrtydi-v1.1-fi \
  --output run.mrtydi.bm25.fi.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-finnish-test \
  run.mrtydi.bm25.fi.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language id \
  --topics mrtydi-v1.1-indonesian-test \
  --index mrtydi-v1.1-id \
  --output run.mrtydi.bm25.id.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-indonesian-test \
  run.mrtydi.bm25.id.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language ja \
  --topics mrtydi-v1.1-japanese-test \
  --index mrtydi-v1.1-ja \
  --output run.mrtydi.bm25.ja.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-japanese-test \
  run.mrtydi.bm25.ja.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language ko \
  --topics mrtydi-v1.1-korean-test \
  --index mrtydi-v1.1-ko \
  --output run.mrtydi.bm25.ko.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-korean-test \
  run.mrtydi.bm25.ko.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language ru \
  --topics mrtydi-v1.1-russian-test \
  --index mrtydi-v1.1-ru \
  --output run.mrtydi.bm25.ru.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-russian-test \
  run.mrtydi.bm25.ru.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language sw \
  --topics mrtydi-v1.1-swahili-test \
  --index mrtydi-v1.1-sw \
  --output run.mrtydi.bm25.sw.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-swahili-test \
  run.mrtydi.bm25.sw.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language te \
  --topics mrtydi-v1.1-telugu-test \
  --index mrtydi-v1.1-te \
  --output run.mrtydi.bm25.te.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-telugu-test \
  run.mrtydi.bm25.te.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language th \
  --topics mrtydi-v1.1-thai-test \
  --index mrtydi-v1.1-th \
  --output run.mrtydi.bm25.th.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-thai-test \
  run.mrtydi.bm25.th.test.txt
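The eleven BM25 invocations above differ only in the two-letter language code and in the language name used in the topics identifier. The following is a minimal sketch of how they could be generated, assuming the flags are exactly those shown above. The `LANGUAGES` mapping and the `bm25_commands` helper are illustrative names, not part of Pyserini, and the code only builds command strings without running them.

```python
# Language code -> language name, as they appear in the commands above.
LANGUAGES = {
    "ar": "arabic", "bn": "bengali", "en": "english", "fi": "finnish",
    "id": "indonesian", "ja": "japanese", "ko": "korean", "ru": "russian",
    "sw": "swahili", "te": "telugu", "th": "thai",
}


def bm25_commands(code: str) -> tuple[str, str]:
    """Return (search command, MRR@100 evaluation command) for one language.

    Hypothetical helper: it only formats strings that mirror the
    commands listed on this page.
    """
    name = LANGUAGES[code]
    topics = f"mrtydi-v1.1-{name}-test"
    run = f"run.mrtydi.bm25.{code}.test.txt"
    search = (
        "python -m pyserini.search.lucene "
        "--threads 16 --batch-size 128 "
        f"--language {code} "
        f"--topics {topics} "
        f"--index mrtydi-v1.1-{code} "
        f"--output {run} --bm25 --hits 100"
    )
    evaluate = (
        "python -m pyserini.eval.trec_eval "
        f"-c -M 100 -m recip_rank {topics} {run}"
    )
    return search, evaluate


if __name__ == "__main__":
    # Print the search and evaluation command for every language.
    for code in LANGUAGES:
        search, evaluate = bm25_commands(code)
        print(search)
        print(evaluate)
```

A generated string can be pasted into a shell, or split with `shlex.split` and passed to `subprocess.run` if the runs should be launched from Python.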
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-arabic-test \
  --index mrtydi-v1.1-arabic-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-arabic-test \
  run.mrtydi.mdpr-split-pft-nq.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-bengali-test \
  --index mrtydi-v1.1-bengali-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-bengali-test \
  run.mrtydi.mdpr-split-pft-nq.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-english-test \
  --index mrtydi-v1.1-english-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-english-test \
  run.mrtydi.mdpr-split-pft-nq.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-finnish-test \
  --index mrtydi-v1.1-finnish-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-finnish-test \
  run.mrtydi.mdpr-split-pft-nq.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-indonesian-test \
  --index mrtydi-v1.1-indonesian-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-indonesian-test \
  run.mrtydi.mdpr-split-pft-nq.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-japanese-test \
  --index mrtydi-v1.1-japanese-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-japanese-test \
  run.mrtydi.mdpr-split-pft-nq.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-korean-test \
  --index mrtydi-v1.1-korean-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-korean-test \
  run.mrtydi.mdpr-split-pft-nq.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-russian-test \
  --index mrtydi-v1.1-russian-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-russian-test \
  run.mrtydi.mdpr-split-pft-nq.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-swahili-test \
  --index mrtydi-v1.1-swahili-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-swahili-test \
  run.mrtydi.mdpr-split-pft-nq.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-telugu-test \
  --index mrtydi-v1.1-telugu-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-telugu-test \
  run.mrtydi.mdpr-split-pft-nq.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-thai-test \
  --index mrtydi-v1.1-thai-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-thai-test \
  run.mrtydi.mdpr-split-pft-nq.th.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-arabic-test \
  --index mrtydi-v1.1-arabic-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-arabic-test \
  run.mrtydi.mdpr-tied-pft-nq.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-bengali-test \
  --index mrtydi-v1.1-bengali-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-bengali-test \
  run.mrtydi.mdpr-tied-pft-nq.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-english-test \
  --index mrtydi-v1.1-english-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-english-test \
  run.mrtydi.mdpr-tied-pft-nq.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-finnish-test \
  --index mrtydi-v1.1-finnish-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-finnish-test \
  run.mrtydi.mdpr-tied-pft-nq.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-indonesian-test \
  --index mrtydi-v1.1-indonesian-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-indonesian-test \
  run.mrtydi.mdpr-tied-pft-nq.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-japanese-test \
  --index mrtydi-v1.1-japanese-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-japanese-test \
  run.mrtydi.mdpr-tied-pft-nq.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-korean-test \
  --index mrtydi-v1.1-korean-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-korean-test \
  run.mrtydi.mdpr-tied-pft-nq.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-russian-test \
  --index mrtydi-v1.1-russian-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-russian-test \
  run.mrtydi.mdpr-tied-pft-nq.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-swahili-test \
  --index mrtydi-v1.1-swahili-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-swahili-test \
  run.mrtydi.mdpr-tied-pft-nq.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-telugu-test \
  --index mrtydi-v1.1-telugu-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-telugu-test \
  run.mrtydi.mdpr-tied-pft-nq.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-thai-test \
  --index mrtydi-v1.1-thai-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-thai-test \
  run.mrtydi.mdpr-tied-pft-nq.th.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-arabic-test \
  --index mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-arabic-test \
  run.mrtydi.mdpr-tied-pft-msmarco.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-bengali-test \
  --index mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-bengali-test \
  run.mrtydi.mdpr-tied-pft-msmarco.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-english-test \
  --index mrtydi-v1.1-english-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-english-test \
  run.mrtydi.mdpr-tied-pft-msmarco.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-finnish-test \
  --index mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-finnish-test \
  run.mrtydi.mdpr-tied-pft-msmarco.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-indonesian-test \
  --index mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-indonesian-test \
  run.mrtydi.mdpr-tied-pft-msmarco.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-japanese-test \
  --index mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-japanese-test \
  run.mrtydi.mdpr-tied-pft-msmarco.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-korean-test \
  --index mrtydi-v1.1-korean-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-korean-test \
  run.mrtydi.mdpr-tied-pft-msmarco.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-russian-test \
  --index mrtydi-v1.1-russian-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-russian-test \
  run.mrtydi.mdpr-tied-pft-msmarco.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-swahili-test \
  --index mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-swahili-test \
  run.mrtydi.mdpr-tied-pft-msmarco.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-telugu-test \
  --index mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-telugu-test \
  run.mrtydi.mdpr-tied-pft-msmarco.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-thai-test \
  --index mrtydi-v1.1-thai-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-thai-test \
  run.mrtydi.mdpr-tied-pft-msmarco.th.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-arabic-test \
  --index mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-arabic-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-bengali-test \
  --index mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-bengali-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-english-test \
  --index mrtydi-v1.1-english-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-english-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-finnish-test \
  --index mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-finnish-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-indonesian-test \
  --index mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-indonesian-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-japanese-test \
  --index mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-japanese-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-korean-test \
  --index mrtydi-v1.1-korean-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-korean-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-russian-test \
  --index mrtydi-v1.1-russian-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-russian-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-swahili-test \
  --index mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-swahili-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-telugu-test \
  --index mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-telugu-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-thai-test \
  --index mrtydi-v1.1-thai-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -M 100 -m recip_rank mrtydi-v1.1-thai-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.th.test.txt
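The dense-retrieval commands above follow a similar pattern, with one irregularity: the runs tagged `mdpr-split-pft-nq` use the encoder `castorini/mdpr-question-nq`, an index name ending in `mdpr-nq`, and no `--encoder-class` flag, whereas the three tied-encoder models reuse the run tag as both the encoder name and the index suffix and pass `--encoder-class auto`. The sketch below captures that pattern as data. `DenseModel`, `DENSE_MODELS`, and `dense_search_command` are hypothetical names introduced for illustration, and the code only formats strings.

```python
from typing import NamedTuple, Optional


class DenseModel(NamedTuple):
    encoder: str                  # value passed to --encoder
    index_suffix: str             # trailing part of the --index name
    encoder_class: Optional[str]  # value for --encoder-class, or None to omit


# Run tag (as used in the output file names above) -> settings.
DENSE_MODELS = {
    "mdpr-split-pft-nq": DenseModel(
        "castorini/mdpr-question-nq", "mdpr-nq", None),
    "mdpr-tied-pft-nq": DenseModel(
        "castorini/mdpr-tied-pft-nq", "mdpr-tied-pft-nq", "auto"),
    "mdpr-tied-pft-msmarco": DenseModel(
        "castorini/mdpr-tied-pft-msmarco", "mdpr-tied-pft-msmarco", "auto"),
    "mdpr-tied-pft-msmarco-ft-all": DenseModel(
        "castorini/mdpr-tied-pft-msmarco-ft-all",
        "mdpr-tied-pft-msmarco-ft-all", "auto"),
}

# Language code -> language name, as they appear in the commands above.
LANGUAGES = {
    "ar": "arabic", "bn": "bengali", "en": "english", "fi": "finnish",
    "id": "indonesian", "ja": "japanese", "ko": "korean", "ru": "russian",
    "sw": "swahili", "te": "telugu", "th": "thai",
}


def dense_search_command(tag: str, code: str) -> str:
    """Format the FAISS search command for one model tag and language."""
    model = DENSE_MODELS[tag]
    name = LANGUAGES[code]
    parts = ["python -m pyserini.search.faiss", "--threads 16 --batch-size 512"]
    if model.encoder_class is not None:
        parts.append(f"--encoder-class {model.encoder_class}")
    parts += [
        f"--encoder {model.encoder}",
        f"--topics mrtydi-v1.1-{name}-test",
        f"--index mrtydi-v1.1-{name}-{model.index_suffix}",
        f"--output run.mrtydi.{tag}.{code}.test.txt --hits 100",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    # Print every dense search command, model by model.
    for tag in DENSE_MODELS:
        for code in LANGUAGES:
            print(dense_search_command(tag, code))
```

Keeping the per-model differences in a table rather than in branching code makes the one irregular entry easy to spot and check against the commands listed above.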
Recall@100, test queries (columns: ar, bn, en, fi, id, ja, ko, ru, sw, te, th, avg)
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language ar \
  --topics mrtydi-v1.1-arabic-test \
  --index mrtydi-v1.1-ar \
  --output run.mrtydi.bm25.ar.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-arabic-test \
  run.mrtydi.bm25.ar.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language bn \
  --topics mrtydi-v1.1-bengali-test \
  --index mrtydi-v1.1-bn \
  --output run.mrtydi.bm25.bn.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-bengali-test \
  run.mrtydi.bm25.bn.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language en \
  --topics mrtydi-v1.1-english-test \
  --index mrtydi-v1.1-en \
  --output run.mrtydi.bm25.en.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-english-test \
  run.mrtydi.bm25.en.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language fi \
  --topics mrtydi-v1.1-finnish-test \
  --index mrtydi-v1.1-fi \
  --output run.mrtydi.bm25.fi.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-finnish-test \
  run.mrtydi.bm25.fi.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language id \
  --topics mrtydi-v1.1-indonesian-test \
  --index mrtydi-v1.1-id \
  --output run.mrtydi.bm25.id.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-indonesian-test \
  run.mrtydi.bm25.id.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language ja \
  --topics mrtydi-v1.1-japanese-test \
  --index mrtydi-v1.1-ja \
  --output run.mrtydi.bm25.ja.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-japanese-test \
  run.mrtydi.bm25.ja.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language ko \
  --topics mrtydi-v1.1-korean-test \
  --index mrtydi-v1.1-ko \
  --output run.mrtydi.bm25.ko.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-korean-test \
  run.mrtydi.bm25.ko.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language ru \
  --topics mrtydi-v1.1-russian-test \
  --index mrtydi-v1.1-ru \
  --output run.mrtydi.bm25.ru.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-russian-test \
  run.mrtydi.bm25.ru.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language sw \
  --topics mrtydi-v1.1-swahili-test \
  --index mrtydi-v1.1-sw \
  --output run.mrtydi.bm25.sw.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-swahili-test \
  run.mrtydi.bm25.sw.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language te \
  --topics mrtydi-v1.1-telugu-test \
  --index mrtydi-v1.1-te \
  --output run.mrtydi.bm25.te.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-telugu-test \
  run.mrtydi.bm25.te.test.txt
Command to generate run:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --language th \
  --topics mrtydi-v1.1-thai-test \
  --index mrtydi-v1.1-th \
  --output run.mrtydi.bm25.th.test.txt --bm25 --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-thai-test \
  run.mrtydi.bm25.th.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-arabic-test \
  --index mrtydi-v1.1-arabic-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-arabic-test \
  run.mrtydi.mdpr-split-pft-nq.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-bengali-test \
  --index mrtydi-v1.1-bengali-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-bengali-test \
  run.mrtydi.mdpr-split-pft-nq.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-english-test \
  --index mrtydi-v1.1-english-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-english-test \
  run.mrtydi.mdpr-split-pft-nq.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-finnish-test \
  --index mrtydi-v1.1-finnish-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-finnish-test \
  run.mrtydi.mdpr-split-pft-nq.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-indonesian-test \
  --index mrtydi-v1.1-indonesian-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-indonesian-test \
  run.mrtydi.mdpr-split-pft-nq.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-japanese-test \
  --index mrtydi-v1.1-japanese-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-japanese-test \
  run.mrtydi.mdpr-split-pft-nq.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-korean-test \
  --index mrtydi-v1.1-korean-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-korean-test \
  run.mrtydi.mdpr-split-pft-nq.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-russian-test \
  --index mrtydi-v1.1-russian-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-russian-test \
  run.mrtydi.mdpr-split-pft-nq.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-swahili-test \
  --index mrtydi-v1.1-swahili-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-swahili-test \
  run.mrtydi.mdpr-split-pft-nq.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-telugu-test \
  --index mrtydi-v1.1-telugu-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-telugu-test \
  run.mrtydi.mdpr-split-pft-nq.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder castorini/mdpr-question-nq \
  --topics mrtydi-v1.1-thai-test \
  --index mrtydi-v1.1-thai-mdpr-nq \
  --output run.mrtydi.mdpr-split-pft-nq.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-thai-test \
  run.mrtydi.mdpr-split-pft-nq.th.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-arabic-test \
  --index mrtydi-v1.1-arabic-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-arabic-test \
  run.mrtydi.mdpr-tied-pft-nq.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-bengali-test \
  --index mrtydi-v1.1-bengali-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-bengali-test \
  run.mrtydi.mdpr-tied-pft-nq.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-english-test \
  --index mrtydi-v1.1-english-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-english-test \
  run.mrtydi.mdpr-tied-pft-nq.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-finnish-test \
  --index mrtydi-v1.1-finnish-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-finnish-test \
  run.mrtydi.mdpr-tied-pft-nq.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-indonesian-test \
  --index mrtydi-v1.1-indonesian-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-indonesian-test \
  run.mrtydi.mdpr-tied-pft-nq.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-japanese-test \
  --index mrtydi-v1.1-japanese-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-japanese-test \
  run.mrtydi.mdpr-tied-pft-nq.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-korean-test \
  --index mrtydi-v1.1-korean-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-korean-test \
  run.mrtydi.mdpr-tied-pft-nq.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-russian-test \
  --index mrtydi-v1.1-russian-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-russian-test \
  run.mrtydi.mdpr-tied-pft-nq.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-swahili-test \
  --index mrtydi-v1.1-swahili-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-swahili-test \
  run.mrtydi.mdpr-tied-pft-nq.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-telugu-test \
  --index mrtydi-v1.1-telugu-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-telugu-test \
  run.mrtydi.mdpr-tied-pft-nq.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-nq \
  --topics mrtydi-v1.1-thai-test \
  --index mrtydi-v1.1-thai-mdpr-tied-pft-nq \
  --output run.mrtydi.mdpr-tied-pft-nq.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-thai-test \
  run.mrtydi.mdpr-tied-pft-nq.th.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-arabic-test \
  --index mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-arabic-test \
  run.mrtydi.mdpr-tied-pft-msmarco.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-bengali-test \
  --index mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-bengali-test \
  run.mrtydi.mdpr-tied-pft-msmarco.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-english-test \
  --index mrtydi-v1.1-english-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-english-test \
  run.mrtydi.mdpr-tied-pft-msmarco.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-finnish-test \
  --index mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-finnish-test \
  run.mrtydi.mdpr-tied-pft-msmarco.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-indonesian-test \
  --index mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-indonesian-test \
  run.mrtydi.mdpr-tied-pft-msmarco.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-japanese-test \
  --index mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-japanese-test \
  run.mrtydi.mdpr-tied-pft-msmarco.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-korean-test \
  --index mrtydi-v1.1-korean-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-korean-test \
  run.mrtydi.mdpr-tied-pft-msmarco.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-russian-test \
  --index mrtydi-v1.1-russian-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-russian-test \
  run.mrtydi.mdpr-tied-pft-msmarco.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-swahili-test \
  --index mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-swahili-test \
  run.mrtydi.mdpr-tied-pft-msmarco.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-telugu-test \
  --index mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-telugu-test \
  run.mrtydi.mdpr-tied-pft-msmarco.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco \
  --topics mrtydi-v1.1-thai-test \
  --index mrtydi-v1.1-thai-mdpr-tied-pft-msmarco \
  --output run.mrtydi.mdpr-tied-pft-msmarco.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-thai-test \
  run.mrtydi.mdpr-tied-pft-msmarco.th.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-arabic-test \
  --index mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ar.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-arabic-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ar.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-bengali-test \
  --index mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.bn.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-bengali-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.bn.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-english-test \
  --index mrtydi-v1.1-english-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.en.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-english-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.en.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-finnish-test \
  --index mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.fi.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-finnish-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.fi.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-indonesian-test \
  --index mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.id.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-indonesian-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.id.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-japanese-test \
  --index mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ja.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-japanese-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ja.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-korean-test \
  --index mrtydi-v1.1-korean-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ko.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-korean-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ko.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-russian-test \
  --index mrtydi-v1.1-russian-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ru.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-russian-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.ru.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-swahili-test \
  --index mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.sw.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-swahili-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.sw.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-telugu-test \
  --index mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.te.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-telugu-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.te.test.txt
Command to generate run:
python -m pyserini.search.faiss \
  --threads 16 --batch-size 512 \
  --encoder-class auto \
  --encoder castorini/mdpr-tied-pft-msmarco-ft-all \
  --topics mrtydi-v1.1-thai-test \
  --index mrtydi-v1.1-thai-mdpr-tied-pft-msmarco-ft-all \
  --output run.mrtydi.mdpr-tied-pft-msmarco-ft-all.th.test.txt --hits 100
Evaluation commands:
python -m pyserini.eval.trec_eval \
  -c -m recall.100 mrtydi-v1.1-thai-test \
  run.mrtydi.mdpr-tied-pft-msmarco-ft-all.th.test.txt
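
As a rough illustration of what the recall.100 figure in the evaluation commands above measures, the following sketch computes mean recall at a cutoff from in-memory judgments and scores. It is a simplified approximation, not trec_eval itself: the function name and the dictionary layouts are hypothetical, tie-breaking is not reproduced, and queries missing from the run are counted as zero on the assumption that this matches the intent of the -c flag.

```python
def recall_at_k(qrels, run, k=100):
    """Mean recall@k over queries that have at least one relevant document.

    qrels: {query_id: {doc_id: relevance}}   (hypothetical in-memory layout)
    run:   {query_id: {doc_id: score}}       (hypothetical in-memory layout)
    """
    per_query = []
    for qid, judged in qrels.items():
        relevant = {doc for doc, rel in judged.items() if rel > 0}
        if not relevant:
            continue  # assumption: queries without relevant documents are skipped
        scored = run.get(qid, {})  # a query absent from the run retrieves nothing
        top_k = sorted(scored, key=scored.get, reverse=True)[:k]
        per_query.append(len(relevant.intersection(top_k)) / len(relevant))
    return sum(per_query) / len(per_query) if per_query else 0.0


# Toy data: q1 retrieves one of its two relevant documents, q2 is missing from the run.
toy_qrels = {"q1": {"d1": 1, "d2": 1}, "q2": {"d3": 1}}
toy_run = {"q1": {"d1": 2.0, "d9": 1.0}}
toy_recall = recall_at_k(toy_qrels, toy_run, k=100)
```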

Programmatic Execution

All experimental runs shown in the table above can be executed programmatically by following the instructions below. To list all the experimental conditions:

python -m pyserini.2cr.mrtydi --list-conditions

Run all languages for a specific condition and show commands:

python -m pyserini.2cr.mrtydi --condition bm25 --display-commands

Run a particular language for a specific condition and show commands:

python -m pyserini.2cr.mrtydi --condition bm25 --language ko --display-commands

Run all languages for all conditions and show commands:

python -m pyserini.2cr.mrtydi --all --display-commands

With the above commands, run files will be placed in the current directory. Use the option --directory runs to place the runs in a sub-directory.

For a specific condition, just show the commands and do not run:

python -m pyserini.2cr.mrtydi --condition bm25 --display-commands --dry-run

This will generate exactly the commands for a specific condition above (corresponding to a row in the table).
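
The per-language commands listed on this page follow a regular pattern, which the sketch below reconstructs for the "tied" mDPR conditions (for example mdpr-tied-pft-msmarco), where the encoder name and the index suffix both equal the condition name. It only formats strings and is not how pyserini.2cr.mrtydi is implemented; the function name and the language mapping variable are hypothetical, and the pattern does not cover mdpr-split-pft-nq, whose encoder and index names differ from the condition name.

```python
# Language codes and names as they appear in the run file and topic names on this page.
LANGUAGES = {
    "ar": "arabic", "bn": "bengali", "en": "english", "fi": "finnish",
    "id": "indonesian", "ja": "japanese", "ko": "korean", "ru": "russian",
    "sw": "swahili", "te": "telugu", "th": "thai",
}


def tied_mdpr_commands(condition, lang):
    """Return (search_command, eval_command) strings for one language.

    Hypothetical helper: mirrors the listed commands for conditions whose
    encoder and index suffix equal the condition name.
    """
    name = LANGUAGES[lang]
    run_file = f"run.mrtydi.{condition}.{lang}.test.txt"
    search = (
        "python -m pyserini.search.faiss "
        "--threads 16 --batch-size 512 "
        "--encoder-class auto "
        f"--encoder castorini/{condition} "
        f"--topics mrtydi-v1.1-{name}-test "
        f"--index mrtydi-v1.1-{name}-{condition} "
        f"--output {run_file} --hits 100"
    )
    evaluate = (
        "python -m pyserini.eval.trec_eval "
        f"-c -m recall.100 mrtydi-v1.1-{name}-test {run_file}"
    )
    return search, evaluate


search_cmd, eval_cmd = tied_mdpr_commands("mdpr-tied-pft-msmarco", "th")
```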

For a specific condition and language, just show the commands and do not run:

python -m pyserini.2cr.mrtydi --condition bm25 --language ko --display-commands --dry-run
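
To dry-run several languages in one go, the invocation above could be wrapped in a small script. The sketch below only assembles argument lists from the options shown on this page (--condition, --language, --display-commands, --dry-run, --directory) and launches them only when asked to; the function names are hypothetical, and it assumes Pyserini is installed in the interpreter that runs the script.

```python
import subprocess
import sys


def two_click_argv(condition, language=None, dry_run=True, directory=None):
    """Build the argument list for one pyserini.2cr.mrtydi invocation."""
    argv = [sys.executable, "-m", "pyserini.2cr.mrtydi",
            "--condition", condition, "--display-commands"]
    if language:
        argv += ["--language", language]
    if directory:
        argv += ["--directory", directory]
    if dry_run:
        argv.append("--dry-run")
    return argv


def run_languages(condition, languages, execute=False, **options):
    """Build one invocation per language; launch them only if execute is True."""
    commands = [two_click_argv(condition, lang, **options) for lang in languages]
    if execute:
        for argv in commands:
            subprocess.run(argv, check=True)  # stops at the first failing invocation
    return commands


# Nothing is launched here because execute defaults to False.
planned = run_languages("bm25", ["ko", "ja"], directory="runs")
```

Passing execute=True launches each invocation in turn.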

For all conditions, just show the commands without running them, and skip evaluation:

python -m pyserini.2cr.mrtydi --all --display-commands --dry-run --skip-eval

Finally, to generate this page:

python -m pyserini.2cr.mrtydi --generate-report --output docs/2cr/mrtydi.html

The output file mrtydi.html should be identical to this page.
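
One way to check that claim is a byte-for-byte comparison between the regenerated report and a saved copy of this page. The sketch below uses only the Python standard library; the function name and the file paths in the usage note are hypothetical.

```python
import filecmp
from pathlib import Path


def reports_match(generated, reference):
    """Return True only if both files exist and have identical bytes."""
    generated, reference = Path(generated), Path(reference)
    if not (generated.is_file() and reference.is_file()):
        return False
    # shallow=False compares file contents rather than only os.stat() metadata.
    return filecmp.cmp(generated, reference, shallow=False)
```

For example, reports_match("docs/2cr/mrtydi.html", "saved/mrtydi.html") would compare the generated file against a copy saved at the hypothetical path saved/mrtydi.html.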