Scores below report AP, nDCG@10, and R@1K on the TREC 2019 (DL19) and TREC 2020 (DL20) Deep Learning passage topics, plus RR@10 and R@1K on the MS MARCO dev queries.

[1] (1a) BM25 (k1=0.9, b=0.4): Lucene
DL19: AP 0.3013, nDCG@10 0.5058, R@1K 0.7501
DL20: AP 0.2856, nDCG@10 0.4796, R@1K 0.7863
dev: RR@10 0.1840, R@1K 0.8526

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands (-l 2 counts only passages judged 2 or higher as relevant when computing AP and R@1K over the graded DL judgments):
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-default.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-default.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-default.dev.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands (-M 10 restricts scoring to the top 10 hits per query, i.e., RR@10):
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-default.dev.txt
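
For reference, the same BM25 run can be produced through Pyserini's Python API. A minimal sketch (single query shown; the CLI above batches all topics):

from pyserini.search.lucene import LuceneSearcher

# Open the prebuilt MS MARCO passage index (downloaded on first use).
searcher = LuceneSearcher.from_prebuilt_index('msmarco-v1-passage')
searcher.set_bm25(k1=0.9, b=0.4)

# Retrieve the top 1000 passages for one query.
hits = searcher.search('what is a lobster roll', k=1000)
for i, hit in enumerate(hits[:5]):
    print(f'{i + 1:2} {hit.docid:10} {hit.score:.4f}')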

[1] (1b) BM25+RM3 (k1=0.9, b=0.4): Lucene
DL19: AP 0.3416, nDCG@10 0.5216, R@1K 0.8136
DL20: AP 0.3006, nDCG@10 0.4896, R@1K 0.8236
dev: RR@10 0.1566, R@1K 0.8606

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-rm3-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-rm3-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-rm3-default.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-rm3-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-rm3-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-rm3-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-rm3-default.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-rm3-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-rm3-default.dev.txt \
--bm25 --k1 0.9 --b 0.4 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rm3-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rm3-default.dev.txt
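
The RM3 run above (and the Rocchio run in the next condition) corresponds to enabling pseudo-relevance feedback on the same searcher. A minimal sketch with Pyserini's default feedback hyperparameters:

from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher.from_prebuilt_index('msmarco-v1-passage')
searcher.set_bm25(k1=0.9, b=0.4)
# RM3 query expansion (defaults: 10 feedback terms, 10 feedback docs,
# original query weight 0.5); use searcher.set_rocchio() for the Rocchio runs.
searcher.set_rm3()
hits = searcher.search('what is a lobster roll', k=1000)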

BM25+Rocchio (k1=0.9, b=0.4): Lucene
DL19: AP 0.3474, nDCG@10 0.5275, R@1K 0.8007
DL20: AP 0.3115, nDCG@10 0.4910, R@1K 0.8156
dev: RR@10 0.1595, R@1K 0.8620

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-rocchio-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4 --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-default.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-rocchio-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4 --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-default.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-rocchio-default.dev.txt \
--bm25 --k1 0.9 --b 0.4 --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rocchio-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rocchio-default.dev.txt

BM25 (k1=0.82, b=0.68): Lucene
DL19: AP 0.2903, nDCG@10 0.4973, R@1K 0.7450
DL20: AP 0.2876, nDCG@10 0.4876, R@1K 0.8031
dev: RR@10 0.1875, R@1K 0.8573

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-tuned.dl19.txt \
--bm25 --k1 0.82 --b 0.68
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-tuned.dl20.txt \
--bm25 --k1 0.82 --b 0.68
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-tuned.dev.txt \
--bm25 --k1 0.82 --b 0.68
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-tuned.dev.txt
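
The tuned parameters (k1=0.82, b=0.68) come from optimizing RR@10 on the dev queries. A hedged sketch of such a parameter sweep (the grid shown is illustrative, not the grid actually used; each run file would then be scored with the RR@10 command above):

from pyserini.search import get_topics
from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher.from_prebuilt_index('msmarco-v1-passage')
topics = get_topics('msmarco-passage-dev-subset')

for k1, b in [(0.9, 0.4), (0.82, 0.68), (1.2, 0.75)]:
    searcher.set_bm25(k1, b)
    with open(f'run.sweep.k1_{k1}.b_{b}.txt', 'w') as out:
        for qid, topic in topics.items():
            for rank, hit in enumerate(searcher.search(topic['title'], k=1000)):
                out.write(f'{qid} Q0 {hit.docid} {rank + 1} {hit.score:.6f} sweep\n')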

BM25+RM3 (k1=0.82, b=0.68): Lucene
DL19: AP 0.3339, nDCG@10 0.5147, R@1K 0.7950
DL20: AP 0.3017, nDCG@10 0.4924, R@1K 0.8292
dev: RR@10 0.1646, R@1K 0.8704

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-rm3-tuned.dl19.txt \
--bm25 --k1 0.82 --b 0.68 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-rm3-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-rm3-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-rm3-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-rm3-tuned.dl20.txt \
--bm25 --k1 0.82 --b 0.68 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-rm3-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-rm3-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-rm3-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-rm3-tuned.dev.txt \
--bm25 --k1 0.82 --b 0.68 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rm3-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rm3-tuned.dev.txt

BM25+Rocchio (k1=0.82, b=0.68): Lucene
DL19: AP 0.3396, nDCG@10 0.5275, R@1K 0.7948
DL20: AP 0.3120, nDCG@10 0.4908, R@1K 0.8327
dev: RR@10 0.1684, R@1K 0.8726

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-rocchio-tuned.dl19.txt \
--bm25 --k1 0.82 --b 0.68 --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-rocchio-tuned.dl20.txt \
--bm25 --k1 0.82 --b 0.68 --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-rocchio-tuned.dev.txt \
--bm25 --k1 0.82 --b 0.68 --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rocchio-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rocchio-tuned.dev.txt

[1] (2a) BM25 w/ doc2query-T5 (k1=0.9, b=0.4): Lucene
DL19: AP 0.4034, nDCG@10 0.6417, R@1K 0.8310
DL20: AP 0.4074, nDCG@10 0.6187, R@1K 0.8452
dev: RR@10 0.2723, R@1K 0.9470

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5 \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-d2q-t5-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-d2q-t5-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-d2q-t5-default.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-d2q-t5-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5 \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-d2q-t5-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-d2q-t5-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-d2q-t5-default.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-d2q-t5-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5 \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-d2q-t5-default.dev.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-d2q-t5-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-d2q-t5-default.dev.txt
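
The d2q-t5 index is built over passages that were expanded with model-generated queries before indexing. For reference, a hedged sketch of producing such expansions with the public checkpoint (generation settings illustrative; the prebuilt index already contains the expansions):

from transformers import T5ForConditionalGeneration, T5Tokenizer

name = 'castorini/doc2query-t5-base-msmarco'
tokenizer = T5Tokenizer.from_pretrained(name)
model = T5ForConditionalGeneration.from_pretrained(name)

passage = 'The presence of communication amid scientific minds was equally important...'
inputs = tokenizer(passage, return_tensors='pt', truncation=True, max_length=512)
# Sample a few synthetic queries and append them to the passage text.
outputs = model.generate(**inputs, max_length=64, do_sample=True, top_k=10,
                         num_return_sequences=3)
queries = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
expanded = passage + ' ' + ' '.join(queries)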

[1] (2b) BM25+RM3 w/ doc2query-T5 (k1=0.9, b=0.4): Lucene
DL19: AP 0.4483, nDCG@10 0.6586, R@1K 0.8863
DL20: AP 0.4286, nDCG@10 0.6131, R@1K 0.8700
dev: RR@10 0.2139, R@1K 0.9460

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-rm3-d2q-t5-default.dl19.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-default.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-rm3-d2q-t5-default.dl20.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-default.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-rm3-d2q-t5-default.dev.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-default.dev.txt

BM25+Rocchio w/ doc2query-T5 (k1=0.9, b=0.4): Lucene
DL19: AP 0.4469, nDCG@10 0.6538, R@1K 0.8855
DL20: AP 0.4246, nDCG@10 0.6102, R@1K 0.8675
dev: RR@10 0.2158, R@1K 0.9467

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-rocchio-d2q-t5-default.dl19.txt \
--bm25 --rocchio --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-default.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-rocchio-d2q-t5-default.dl20.txt \
--bm25 --rocchio --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-default.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-rocchio-d2q-t5-default.dev.txt \
--bm25 --rocchio --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-default.dev.txt

BM25 w/ doc2query-T5 (k1=2.18, b=0.86): Lucene
DL19: AP 0.4046, nDCG@10 0.6336, R@1K 0.8134
DL20: AP 0.4171, nDCG@10 0.6265, R@1K 0.8393
dev: RR@10 0.2816, R@1K 0.9506

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5 \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-d2q-t5-tuned.dl19.txt \
--bm25 --k1 2.18 --b 0.86
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-d2q-t5-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-d2q-t5-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-d2q-t5-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5 \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-d2q-t5-tuned.dl20.txt \
--bm25 --k1 2.18 --b 0.86
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-d2q-t5-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-d2q-t5-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-d2q-t5-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5 \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-d2q-t5-tuned.dev.txt \
--bm25 --k1 2.18 --b 0.86
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-d2q-t5-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-d2q-t5-tuned.dev.txt

BM25+RM3 w/ doc2query-T5 (k1=2.18, b=0.86): Lucene
DL19: AP 0.4377, nDCG@10 0.6537, R@1K 0.8443
DL20: AP 0.4348, nDCG@10 0.6235, R@1K 0.8605
dev: RR@10 0.2382, R@1K 0.9528

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-rm3-d2q-t5-tuned.dl19.txt \
--bm25 --k1 2.18 --b 0.86 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-rm3-d2q-t5-tuned.dl20.txt \
--bm25 --k1 2.18 --b 0.86 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-rm3-d2q-t5-tuned.dev.txt \
--bm25 --k1 2.18 --b 0.86 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rm3-d2q-t5-tuned.dev.txt

BM25+Rocchio w/ doc2query-T5 (k1=2.18, b=0.86): Lucene
DL19: AP 0.4339, nDCG@10 0.6559, R@1K 0.8465
DL20: AP 0.4376, nDCG@10 0.6224, R@1K 0.8641
dev: RR@10 0.2395, R@1K 0.9535

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics dl19-passage \
--output run.msmarco-v1-passage.bm25-rocchio-d2q-t5-tuned.dl19.txt \
--bm25 --k1 2.18 --b 0.86 --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics dl20 \
--output run.msmarco-v1-passage.bm25-rocchio-d2q-t5-tuned.dl20.txt \
--bm25 --k1 2.18 --b 0.86 --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.d2q-t5-docvectors \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bm25-rocchio-d2q-t5-tuned.dev.txt \
--bm25 --k1 2.18 --b 0.86 --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bm25-rocchio-d2q-t5-tuned.dev.txt

[1] (3b) uniCOIL (w/ doc2query-T5): Lucene, cached queries
DL19: AP 0.4612, nDCG@10 0.7024, R@1K 0.8292
DL20: AP 0.4430, nDCG@10 0.6745, R@1K 0.8430
dev: RR@10 0.3516, R@1K 0.9582

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil \
--topics dl19-passage-unicoil \
--output run.msmarco-v1-passage.unicoil.dl19.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.unicoil.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.unicoil.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.unicoil.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil \
--topics dl20-unicoil \
--output run.msmarco-v1-passage.unicoil.dl20.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.unicoil.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.unicoil.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.unicoil.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil \
--topics msmarco-passage-dev-subset-unicoil \
--output run.msmarco-v1-passage.unicoil.dev.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil.dev.txt
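
uniCOIL is a learned-sparse model: both queries and passages become bags of integer term weights ("impacts"), and the --impact flag makes the searcher rank by the inner product of those weights rather than by BM25. A toy illustration of the scoring (weights invented for the example):

query = {'lobster': 42, 'roll': 31}
passage = {'lobster': 50, 'roll': 28, 'sandwich': 17, 'maine': 12}

# Impact score = sum over shared terms of query weight * passage weight.
score = sum(w * passage.get(term, 0) for term, w in query.items())
print(score)  # 42*50 + 31*28 = 2968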

uniCOIL (w/ doc2query-T5): Lucene, PyTorch
DL19: AP 0.4612, nDCG@10 0.7024, R@1K 0.8292
DL20: AP 0.4430, nDCG@10 0.6745, R@1K 0.8430
dev: RR@10 0.3509, R@1K 0.9583

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil \
--topics dl19-passage \
--encoder castorini/unicoil-msmarco-passage \
--output run.msmarco-v1-passage.unicoil-pytorch.dl19.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.unicoil-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.unicoil-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.unicoil-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil \
--topics dl20 \
--encoder castorini/unicoil-msmarco-passage \
--output run.msmarco-v1-passage.unicoil-pytorch.dl20.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.unicoil-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.unicoil-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.unicoil-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil \
--topics msmarco-passage-dev-subset \
--encoder castorini/unicoil-msmarco-passage \
--output run.msmarco-v1-passage.unicoil-pytorch.dev.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil-pytorch.dev.txt
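
In the PyTorch condition, queries are encoded on the fly by the checkpoint named in --encoder (the cached-queries condition reads pre-encoded topics instead, which is why its --topics names differ). Programmatically this corresponds to LuceneImpactSearcher; a minimal sketch:

from pyserini.search.lucene import LuceneImpactSearcher

searcher = LuceneImpactSearcher.from_prebuilt_index(
    'msmarco-v1-passage.unicoil',              # impact index
    'castorini/unicoil-msmarco-passage')       # query encoder
hits = searcher.search('what is a lobster roll', k=1000)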

uniCOIL (w/ doc2query-T5): Lucene, ONNX
DL19: AP 0.4612, nDCG@10 0.7024, R@1K 0.8292
DL20: AP 0.4430, nDCG@10 0.6745, R@1K 0.8430
dev: RR@10 0.3509, R@1K 0.9583

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil \
--topics dl19-passage \
--onnx-encoder UniCoil \
--output run.msmarco-v1-passage.unicoil-onnx.dl19.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.unicoil-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.unicoil-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.unicoil-onnx.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil \
--topics dl20 \
--onnx-encoder UniCoil \
--output run.msmarco-v1-passage.unicoil-onnx.dl20.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.unicoil-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.unicoil-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.unicoil-onnx.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil \
--topics msmarco-passage-dev-subset \
--onnx-encoder UniCoil \
--output run.msmarco-v1-passage.unicoil-onnx.dev.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil-onnx.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil-onnx.dev.txt

[1] (3a) uniCOIL (noexp): Lucene, cached queries
DL19: AP 0.4033, nDCG@10 0.6433, R@1K 0.7752
DL20: AP 0.4021, nDCG@10 0.6523, R@1K 0.7861
dev: RR@10 0.3153, R@1K 0.9239

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil-noexp \
--topics dl19-passage-unicoil-noexp \
--output run.msmarco-v1-passage.unicoil-noexp.dl19.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.unicoil-noexp.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.unicoil-noexp.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.unicoil-noexp.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil-noexp \
--topics dl20-unicoil-noexp \
--output run.msmarco-v1-passage.unicoil-noexp.dl20.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.unicoil-noexp.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.unicoil-noexp.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.unicoil-noexp.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil-noexp \
--topics msmarco-passage-dev-subset-unicoil-noexp \
--output run.msmarco-v1-passage.unicoil-noexp.dev.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil-noexp.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil-noexp.dev.txt

uniCOIL (noexp): Lucene, PyTorch
DL19: AP 0.4033, nDCG@10 0.6433, R@1K 0.7752
DL20: AP 0.4021, nDCG@10 0.6523, R@1K 0.7861
dev: RR@10 0.3153, R@1K 0.9239

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil-noexp \
--topics dl19-passage \
--encoder castorini/unicoil-noexp-msmarco-passage \
--output run.msmarco-v1-passage.unicoil-noexp-pytorch.dl19.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.unicoil-noexp-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.unicoil-noexp-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.unicoil-noexp-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil-noexp \
--topics dl20 \
--encoder castorini/unicoil-noexp-msmarco-passage \
--output run.msmarco-v1-passage.unicoil-noexp-pytorch.dl20.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.unicoil-noexp-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.unicoil-noexp-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.unicoil-noexp-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil-noexp \
--topics msmarco-passage-dev-subset \
--encoder castorini/unicoil-noexp-msmarco-passage \
--output run.msmarco-v1-passage.unicoil-noexp-pytorch.dev.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil-noexp-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil-noexp-pytorch.dev.txt

uniCOIL (noexp): Lucene, ONNX
DL19: AP 0.4059, nDCG@10 0.6535, R@1K 0.7811
DL20: AP 0.3908, nDCG@10 0.6400, R@1K 0.7910
dev: RR@10 0.3120, R@1K 0.9239

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil-noexp \
--topics dl19-passage \
--onnx-encoder UniCoil \
--output run.msmarco-v1-passage.unicoil-noexp-onnx.dl19.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.unicoil-noexp-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.unicoil-noexp-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.unicoil-noexp-onnx.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil-noexp \
--topics dl20 \
--onnx-encoder UniCoil \
--output run.msmarco-v1-passage.unicoil-noexp-onnx.dl20.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.unicoil-noexp-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.unicoil-noexp-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.unicoil-noexp-onnx.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.unicoil-noexp \
--topics msmarco-passage-dev-subset \
--onnx-encoder UniCoil \
--output run.msmarco-v1-passage.unicoil-noexp-onnx.dev.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil-noexp-onnx.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.unicoil-noexp-onnx.dev.txt

[2] SPLADE++ EnsembleDistil: Lucene, PyTorch
DL19: AP 0.5050, nDCG@10 0.7308, R@1K 0.8728
DL20: AP 0.4999, nDCG@10 0.7197, R@1K 0.8998
dev: RR@10 0.3828, R@1K 0.9831

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics dl19-passage \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.msmarco-v1-passage.splade-pp-ed-pytorch.dl19.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics dl20 \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.msmarco-v1-passage.splade-pp-ed-pytorch.dl20.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics msmarco-passage-dev-subset \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.msmarco-v1-passage.splade-pp-ed-pytorch.dev.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-ed-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-ed-pytorch.dev.txt
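
SPLADE++ produces sparse vectors over the BERT wordpiece vocabulary by max-pooling a log-saturated transform of masked-LM logits. A hedged sketch of computing query term weights directly with transformers (this follows the SPLADE formulation and bypasses Pyserini's own encoder wrapper):

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = 'naver/splade-cocondenser-ensembledistil'
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

inputs = tokenizer('what is a lobster roll', return_tensors='pt')
with torch.no_grad():
    logits = model(**inputs).logits                      # (1, seq_len, vocab)
# SPLADE weights: max over positions of log(1 + relu(logit)), attention-masked.
sat = torch.log1p(torch.relu(logits)) * inputs['attention_mask'].unsqueeze(-1)
weights = sat.max(dim=1).values.squeeze(0)               # (vocab,)
top = torch.topk(weights, 10)
for w, idx in zip(top.values, top.indices):
    print(tokenizer.convert_ids_to_tokens(int(idx)), round(float(w), 2))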

[2] SPLADE++ EnsembleDistil: Lucene, ONNX
DL19: AP 0.5050, nDCG@10 0.7308, R@1K 0.8728
DL20: AP 0.4999, nDCG@10 0.7197, R@1K 0.8998
dev: RR@10 0.3828, R@1K 0.9831

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics dl19-passage \
--onnx-encoder SpladePlusPlusEnsembleDistil \
--output run.msmarco-v1-passage.splade-pp-ed-onnx.dl19.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-onnx.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics dl20 \
--onnx-encoder SpladePlusPlusEnsembleDistil \
--output run.msmarco-v1-passage.splade-pp-ed-onnx.dl20.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-onnx.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics msmarco-passage-dev-subset \
--onnx-encoder SpladePlusPlusEnsembleDistil \
--output run.msmarco-v1-passage.splade-pp-ed-onnx.dev.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-ed-onnx.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-ed-onnx.dev.txt

SPLADE++ EnsembleDistil w/ Rocchio: Lucene, PyTorch
DL19: AP 0.5140, nDCG@10 0.7119, R@1K 0.8799
DL20: AP 0.5084, nDCG@10 0.7280, R@1K 0.9069
dev: RR@10 0.3301, R@1K 0.9811

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed-text \
--topics dl19-passage \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.msmarco-v1-passage.splade-pp-ed-rocchio-pytorch.dl19.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed-text \
--topics dl20 \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.msmarco-v1-passage.splade-pp-ed-rocchio-pytorch.dl20.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed-text \
--topics msmarco-passage-dev-subset \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.msmarco-v1-passage.splade-pp-ed-rocchio-pytorch.dev.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-ed-rocchio-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-ed-rocchio-pytorch.dev.txt

SPLADE++ EnsembleDistil w/ Rocchio: Lucene, ONNX
DL19: AP 0.5140, nDCG@10 0.7119, R@1K 0.8799
DL20: AP 0.5084, nDCG@10 0.7280, R@1K 0.9069
dev: RR@10 0.3300, R@1K 0.9811

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed-text \
--topics dl19-passage \
--onnx-encoder SpladePlusPlusEnsembleDistil \
--output run.msmarco-v1-passage.splade-pp-ed-rocchio-onnx.dl19.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-onnx.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed-text \
--topics dl20 \
--onnx-encoder SpladePlusPlusEnsembleDistil \
--output run.msmarco-v1-passage.splade-pp-ed-rocchio-onnx.dl20.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.splade-pp-ed-rocchio-onnx.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed-text \
--topics msmarco-passage-dev-subset \
--onnx-encoder SpladePlusPlusEnsembleDistil \
--output run.msmarco-v1-passage.splade-pp-ed-rocchio-onnx.dev.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-ed-rocchio-onnx.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-ed-rocchio-onnx.dev.txt

[2] SPLADE++ SelfDistil: Lucene, PyTorch
DL19: AP 0.4998, nDCG@10 0.7358, R@1K 0.8761
DL20: AP 0.5139, nDCG@10 0.7282, R@1K 0.9024
dev: RR@10 0.3776, R@1K 0.9846

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd \
--topics dl19-passage \
--encoder naver/splade-cocondenser-selfdistil \
--output run.msmarco-v1-passage.splade-pp-sd-pytorch.dl19.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd \
--topics dl20 \
--encoder naver/splade-cocondenser-selfdistil \
--output run.msmarco-v1-passage.splade-pp-sd-pytorch.dl20.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd \
--topics msmarco-passage-dev-subset \
--encoder naver/splade-cocondenser-selfdistil \
--output run.msmarco-v1-passage.splade-pp-sd-pytorch.dev.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-sd-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-sd-pytorch.dev.txt

[2] SPLADE++ SelfDistil: Lucene, ONNX
DL19: AP 0.4998, nDCG@10 0.7358, R@1K 0.8761
DL20: AP 0.5139, nDCG@10 0.7282, R@1K 0.9024
dev: RR@10 0.3776, R@1K 0.9846

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd \
--topics dl19-passage \
--onnx-encoder SpladePlusPlusSelfDistil \
--output run.msmarco-v1-passage.splade-pp-sd-onnx.dl19.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-onnx.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd \
--topics dl20 \
--onnx-encoder SpladePlusPlusSelfDistil \
--output run.msmarco-v1-passage.splade-pp-sd-onnx.dl20.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-onnx.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd \
--topics msmarco-passage-dev-subset \
--onnx-encoder SpladePlusPlusSelfDistil \
--output run.msmarco-v1-passage.splade-pp-sd-onnx.dev.txt \
--hits 1000 --impact
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-sd-onnx.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-sd-onnx.dev.txt

SPLADE++ SelfDistil w/ Rocchio: Lucene, PyTorch
DL19: AP 0.5072, nDCG@10 0.7156, R@1K 0.8918
DL20: AP 0.5335, nDCG@10 0.7388, R@1K 0.9120
dev: RR@10 0.3278, R@1K 0.9824

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd-text \
--topics dl19-passage \
--encoder naver/splade-cocondenser-selfdistil \
--output run.msmarco-v1-passage.splade-pp-sd-rocchio-pytorch.dl19.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd-text \
--topics dl20 \
--encoder naver/splade-cocondenser-selfdistil \
--output run.msmarco-v1-passage.splade-pp-sd-rocchio-pytorch.dl20.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd-text \
--topics msmarco-passage-dev-subset \
--encoder naver/splade-cocondenser-selfdistil \
--output run.msmarco-v1-passage.splade-pp-sd-rocchio-pytorch.dev.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-sd-rocchio-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-sd-rocchio-pytorch.dev.txt

SPLADE++ SelfDistil w/ Rocchio: Lucene, ONNX
DL19: AP 0.5072, nDCG@10 0.7156, R@1K 0.8918
DL20: AP 0.5335, nDCG@10 0.7388, R@1K 0.9120
dev: RR@10 0.3278, R@1K 0.9824

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd-text \
--topics dl19-passage \
--onnx-encoder SpladePlusPlusSelfDistil \
--output run.msmarco-v1-passage.splade-pp-sd-rocchio-onnx.dl19.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-onnx.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd-text \
--topics dl20 \
--onnx-encoder SpladePlusPlusSelfDistil \
--output run.msmarco-v1-passage.splade-pp-sd-rocchio-onnx.dl20.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.splade-pp-sd-rocchio-onnx.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-sd-text \
--topics msmarco-passage-dev-subset \
--onnx-encoder SpladePlusPlusSelfDistil \
--output run.msmarco-v1-passage.splade-pp-sd-rocchio-onnx.dev.txt \
--hits 1000 --impact --rocchio
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-sd-rocchio-onnx.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.splade-pp-sd-rocchio-onnx.dev.txt

[3] ANCE: Faiss flat, cached queries
DL19: AP 0.3710, nDCG@10 0.6452, R@1K 0.7554
DL20: AP 0.4076, nDCG@10 0.6458, R@1K 0.7764
dev: RR@10 0.3302, R@1K 0.9584

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics dl19-passage --encoded-queries ance-dl19-passage \
--output run.msmarco-v1-passage.ance.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.ance.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.ance.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.ance.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics dl20 --encoded-queries ance-dl20 \
--output run.msmarco-v1-passage.ance.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.ance.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.ance.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.ance.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics msmarco-passage-dev-subset --encoded-queries ance-msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.ance.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.ance.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.ance.dev.txt
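
ANCE is a dense bi-encoder searched with a flat Faiss index; the cached-queries condition reads pre-encoded query vectors via --encoded-queries. A minimal sketch of the programmatic equivalent with on-the-fly query encoding (class names assumed importable from pyserini.search.faiss, as with Pyserini's other dense encoders):

from pyserini.search.faiss import AnceQueryEncoder, FaissSearcher

encoder = AnceQueryEncoder('castorini/ance-msmarco-passage')
searcher = FaissSearcher.from_prebuilt_index('msmarco-v1-passage.ance', encoder)
hits = searcher.search('what is a lobster roll', k=1000)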

[3] ANCE: Faiss flat, PyTorch
DL19: AP 0.3710, nDCG@10 0.6452, R@1K 0.7554
DL20: AP 0.4076, nDCG@10 0.6458, R@1K 0.7764
dev: RR@10 0.3301, R@1K 0.9587

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics dl19-passage \
--encoder castorini/ance-msmarco-passage \
--output run.msmarco-v1-passage.ance-pytorch.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.ance-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.ance-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.ance-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics dl20 \
--encoder castorini/ance-msmarco-passage \
--output run.msmarco-v1-passage.ance-pytorch.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.ance-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.ance-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.ance-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics msmarco-passage-dev-subset \
--encoder castorini/ance-msmarco-passage \
--output run.msmarco-v1-passage.ance-pytorch.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.ance-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.ance-pytorch.dev.txt

[9] ANCE w/ Average PRF: Faiss flat, PyTorch
DL19: AP 0.4247, nDCG@10 0.6532, R@1K 0.7739
DL20: AP 0.4325, nDCG@10 0.6573, R@1K 0.7909
dev: RR@10 0.3075, R@1K 0.9490

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics dl19-passage \
--encoder castorini/ance-msmarco-passage \
--output run.msmarco-v1-passage.ance-avg-prf-pytorch.dl19.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.ance-avg-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.ance-avg-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.ance-avg-prf-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics dl20 \
--encoder castorini/ance-msmarco-passage \
--output run.msmarco-v1-passage.ance-avg-prf-pytorch.dl20.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.ance-avg-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.ance-avg-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.ance-avg-prf-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics msmarco-passage-dev-subset \
--encoder castorini/ance-msmarco-passage \
--output run.msmarco-v1-passage.ance-avg-prf-pytorch.dev.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.ance-avg-prf-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.ance-avg-prf-pytorch.dev.txt
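
The --prf-method avg --prf-depth 3 flags implement pseudo-relevance feedback in the embedding space: the query embedding is replaced by the average of itself and the embeddings of the top-3 initially retrieved passages, and search is run again. A hedged sketch of the update (not Pyserini's exact code):

import numpy as np

def average_prf(query_vec, feedback_vecs):
    # feedback_vecs: (prf_depth, d) embeddings of the top retrieved passages
    return np.mean(np.vstack([query_vec[None, :], feedback_vecs]), axis=0)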

[9] ANCE w/ Rocchio PRF: Faiss flat, PyTorch
DL19: AP 0.4211, nDCG@10 0.6539, R@1k 0.7825
DL20: AP 0.4315, nDCG@10 0.6471, R@1k 0.7957
dev: RR@10 0.3048, R@1k 0.9547

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics dl19-passage \
--encoder castorini/ance-msmarco-passage \
--output run.msmarco-v1-passage.ance-rocchio-prf-pytorch.dl19.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.ance-rocchio-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.ance-rocchio-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.ance-rocchio-prf-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics dl20 \
--encoder castorini/ance-msmarco-passage \
--output run.msmarco-v1-passage.ance-rocchio-prf-pytorch.dl20.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.ance-rocchio-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.ance-rocchio-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.ance-rocchio-prf-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.ance \
--topics msmarco-passage-dev-subset \
--encoder castorini/ance-msmarco-passage \
--output run.msmarco-v1-passage.ance-rocchio-prf-pytorch.dev.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.ance-rocchio-prf-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.ance-rocchio-prf-pytorch.dev.txt
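
Rocchio PRF instead takes a weighted combination of the original query vector and the centroid of the feedback passages, with weights set by --rocchio-alpha and --rocchio-beta. A sketch, under the assumption that --prf-depth and --rocchio-topk control how many top-ranked passages feed the centroid:

import numpy as np

def rocchio_prf(query_vec, feedback_vecs, alpha=0.4, beta=0.6):
    # new query = alpha * original query + beta * centroid of feedback passages
    return alpha * query_vec + beta * np.mean(feedback_vecs, axis=0)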

[10] SBERT: Faiss flat, PyTorch
DL19: AP 0.4060, nDCG@10 0.6930, R@1k 0.7872
DL20: AP 0.4124, nDCG@10 0.6344, R@1k 0.7937
dev: RR@10 0.3314, R@1k 0.9558

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.sbert \
--topics dl19-passage \
--encoder sentence-transformers/msmarco-distilbert-base-v3 \
--output run.msmarco-v1-passage.sbert-pytorch.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.sbert-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.sbert-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.sbert-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.sbert \
--topics dl20 \
--encoder sentence-transformers/msmarco-distilbert-base-v3 \
--output run.msmarco-v1-passage.sbert-pytorch.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.sbert-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.sbert-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.sbert-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.sbert \
--topics msmarco-passage-dev-subset \
--encoder sentence-transformers/msmarco-distilbert-base-v3 \
--output run.msmarco-v1-passage.sbert-pytorch.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.sbert-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.sbert-pytorch.dev.txt

[9] SBERT w/ Average PRF: Faiss flat, PyTorch
DL19: AP 0.4354, nDCG@10 0.7001, R@1k 0.7937
DL20: AP 0.4258, nDCG@10 0.6412, R@1k 0.8169
dev: RR@10 0.3035, R@1k 0.9446

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.sbert \
--topics dl19-passage \
--encoder sentence-transformers/msmarco-distilbert-base-v3 \
--output run.msmarco-v1-passage.sbert-avg-prf-pytorch.dl19.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.sbert-avg-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.sbert-avg-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.sbert-avg-prf-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.sbert \
--topics dl20 \
--encoder sentence-transformers/msmarco-distilbert-base-v3 \
--output run.msmarco-v1-passage.sbert-avg-prf-pytorch.dl20.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.sbert-avg-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.sbert-avg-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.sbert-avg-prf-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.sbert \
--topics msmarco-passage-dev-subset \
--encoder sentence-transformers/msmarco-distilbert-base-v3 \
--output run.msmarco-v1-passage.sbert-avg-prf-pytorch.dev.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.sbert-avg-prf-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.sbert-avg-prf-pytorch.dev.txt

[9] SBERT w/ Rocchio PRF: Faiss flat, PyTorch
DL19: AP 0.4371, nDCG@10 0.6952, R@1k 0.7941
DL20: AP 0.4342, nDCG@10 0.6559, R@1k 0.8226
dev: RR@10 0.2972, R@1k 0.9529

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.sbert \
--topics dl19-passage \
--encoder sentence-transformers/msmarco-distilbert-base-v3 \
--output run.msmarco-v1-passage.sbert-rocchio-prf-pytorch.dl19.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.sbert-rocchio-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.sbert-rocchio-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.sbert-rocchio-prf-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.sbert \
--topics dl20 \
--encoder sentence-transformers/msmarco-distilbert-base-v3 \
--output run.msmarco-v1-passage.sbert-rocchio-prf-pytorch.dl20.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.sbert-rocchio-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.sbert-rocchio-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.sbert-rocchio-prf-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.sbert \
--topics msmarco-passage-dev-subset \
--encoder sentence-transformers/msmarco-distilbert-base-v3 \
--output run.msmarco-v1-passage.sbert-rocchio-prf-pytorch.dev.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.sbert-rocchio-prf-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.sbert-rocchio-prf-pytorch.dev.txt

[4] DistilBERT KD: Faiss flat, cached queries
DL19: AP 0.4053, nDCG@10 0.6994, R@1k 0.7653
DL20: AP 0.4159, nDCG@10 0.6447, R@1k 0.7953
dev: RR@10 0.3251, R@1k 0.9553

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-margin-mse-t2 \
--topics dl19-passage --encoded-queries distilbert_kd-dl19-passage \
--output run.msmarco-v1-passage.distilbert-kd.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.distilbert-kd.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.distilbert-kd.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.distilbert-kd.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-margin-mse-t2 \
--topics dl20 --encoded-queries distilbert_kd-dl20 \
--output run.msmarco-v1-passage.distilbert-kd.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.distilbert-kd.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.distilbert-kd.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.distilbert-kd.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-margin-mse-t2 \
--topics msmarco-passage-dev-subset --encoded-queries distilbert_kd-msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.distilbert-kd.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd.dev.txt

[4] DistilBERT KD: Faiss flat, PyTorch
DL19: AP 0.4053, nDCG@10 0.6994, R@1k 0.7653
DL20: AP 0.4159, nDCG@10 0.6447, R@1k 0.7953
dev: RR@10 0.3251, R@1k 0.9553

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-margin-mse-t2 \
--topics dl19-passage \
--encoder sebastian-hofstaetter/distilbert-dot-margin_mse-T2-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-pytorch.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.distilbert-kd-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.distilbert-kd-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.distilbert-kd-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-margin-mse-t2 \
--topics dl20 \
--encoder sebastian-hofstaetter/distilbert-dot-margin_mse-T2-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-pytorch.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.distilbert-kd-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.distilbert-kd-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.distilbert-kd-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-margin-mse-t2 \
--topics msmarco-passage-dev-subset \
--encoder sebastian-hofstaetter/distilbert-dot-margin_mse-T2-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-pytorch.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd-pytorch.dev.txt

[5] DistilBERT KD TASB: Faiss flat, cached queries
DL19: AP 0.4590, nDCG@10 0.7210, R@1k 0.8406
DL20: AP 0.4698, nDCG@10 0.6854, R@1k 0.8727
dev: RR@10 0.3444, R@1k 0.9771

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics dl19-passage --encoded-queries distilbert_tas_b-dl19-passage \
--output run.msmarco-v1-passage.distilbert-kd-tasb.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics dl20 --encoded-queries distilbert_tas_b-dl20 \
--output run.msmarco-v1-passage.distilbert-kd-tasb.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics msmarco-passage-dev-subset --encoded-queries distilbert_tas_b-msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.distilbert-kd-tasb.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd-tasb.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd-tasb.dev.txt

[5] DistilBERT KD TASB: Faiss flat, PyTorch
DL19: AP 0.4590, nDCG@10 0.7210, R@1k 0.8406
DL20: AP 0.4698, nDCG@10 0.6854, R@1k 0.8727
dev: RR@10 0.3444, R@1k 0.9771

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics dl19-passage \
--encoder sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-tasb-pytorch.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics dl20 \
--encoder sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-tasb-pytorch.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics msmarco-passage-dev-subset \
--encoder sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-tasb-pytorch.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd-tasb-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd-tasb-pytorch.dev.txt

[9] DistilBERT KD TASB w/ Average PRF: Faiss flat, PyTorch
DL19: AP 0.4856, nDCG@10 0.7190, R@1k 0.8517
DL20: AP 0.4887, nDCG@10 0.7086, R@1k 0.9030
dev: RR@10 0.2910, R@1k 0.9613

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics dl19-passage \
--encoder sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-tasb-avg-prf-pytorch.dl19.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-avg-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-avg-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-avg-prf-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics dl20 \
--encoder sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-tasb-avg-prf-pytorch.dl20.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-avg-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-avg-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-avg-prf-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics msmarco-passage-dev-subset \
--encoder sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-tasb-avg-prf-pytorch.dev.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd-tasb-avg-prf-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd-tasb-avg-prf-pytorch.dev.txt

[9] DistilBERT KD TASB w/ Rocchio PRF: Faiss flat, PyTorch
DL19: AP 0.4974, nDCG@10 0.7231, R@1k 0.8775
DL20: AP 0.4879, nDCG@10 0.7083, R@1k 0.8926
dev: RR@10 0.2896, R@1k 0.9702

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics dl19-passage \
--encoder sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-tasb-rocchio-prf-pytorch.dl19.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-rocchio-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-rocchio-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-rocchio-prf-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics dl20 \
--encoder sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-tasb-rocchio-prf-pytorch.dl20.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-rocchio-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-rocchio-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.distilbert-kd-tasb-rocchio-prf-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics msmarco-passage-dev-subset \
--encoder sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco \
--output run.msmarco-v1-passage.distilbert-kd-tasb-rocchio-prf-pytorch.dev.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd-tasb-rocchio-prf-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.distilbert-kd-tasb-rocchio-prf-pytorch.dev.txt

[6] TCT_ColBERT-V2-HN+: Faiss flat, cached queries
DL19: AP 0.4469, nDCG@10 0.7204, R@1k 0.8261
DL20: AP 0.4754, nDCG@10 0.6882, R@1k 0.8429
dev: RR@10 0.3584, R@1k 0.9695

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics dl19-passage --encoded-queries tct_colbert-v2-hnp-dl19-passage \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics dl20 --encoded-queries tct_colbert-v2-hnp-dl20 \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics msmarco-passage-dev-subset --encoded-queries tct_colbert-v2-hnp-msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp.dev.txt

[6] TCT_ColBERT-V2-HN+: Faiss flat, PyTorch
DL19: AP 0.4469, nDCG@10 0.7204, R@1k 0.8261
DL20: AP 0.4754, nDCG@10 0.6882, R@1k 0.8429
dev: RR@10 0.3584, R@1k 0.9695

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics dl19-passage \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-pytorch.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics dl20 \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-pytorch.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics msmarco-passage-dev-subset \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-pytorch.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp-pytorch.dev.txt

[9] TCT_ColBERT-V2-HN+ w/ Average PRF: Faiss flat, PyTorch
DL19: AP 0.4879, nDCG@10 0.7312, R@1k 0.8586
DL20: AP 0.4811, nDCG@10 0.6836, R@1k 0.8579
dev: RR@10 0.3121, R@1k 0.9585

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics dl19-passage \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-avg-prf-pytorch.dl19.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-avg-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-avg-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-avg-prf-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics dl20 \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-avg-prf-pytorch.dl20.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-avg-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-avg-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-avg-prf-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics msmarco-passage-dev-subset \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-avg-prf-pytorch.dev.txt \
--prf-method avg --prf-depth 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp-avg-prf-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp-avg-prf-pytorch.dev.txt

[9] TCT_ColBERT-V2-HN+ w/ Rocchio PRF: Faiss flat, PyTorch
DL19: AP 0.4883, nDCG@10 0.7111, R@1k 0.8694
DL20: AP 0.4860, nDCG@10 0.6804, R@1k 0.8518
dev: RR@10 0.3125, R@1k 0.9659

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics dl19-passage \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-rocchio-prf-pytorch.dl19.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-rocchio-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-rocchio-prf-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-rocchio-prf-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics dl20 \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-rocchio-prf-pytorch.dl20.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-rocchio-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-rocchio-prf-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-rocchio-prf-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.tct_colbert-v2-hnp \
--topics msmarco-passage-dev-subset \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-rocchio-prf-pytorch.dev.txt \
--prf-method rocchio --prf-depth 5 --rocchio-alpha 0.4 --rocchio-beta 0.6 --rocchio-topk 5
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp-rocchio-prf-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp-rocchio-prf-pytorch.dev.txt

[6] Hybrid TCT_ColBERT-V2-HN+ and BM25: PyTorch
DL19: AP 0.4697, nDCG@10 0.7320, R@1k 0.8802
DL20: AP 0.4859, nDCG@10 0.7016, R@1k 0.8898
dev: RR@10 0.3683, R@1k 0.9707

Command to generate run on TREC 2019 queries:
python -m pyserini.search.hybrid \
dense --index msmarco-v1-passage.tct_colbert-v2-hnp \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
sparse --index msmarco-v1-passage \
fusion --alpha 0.06 \
run --threads 16 --batch-size 512 \
--topics dl19-passage \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25-pytorch.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.hybrid \
dense --index msmarco-v1-passage.tct_colbert-v2-hnp \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
sparse --index msmarco-v1-passage \
fusion --alpha 0.06 \
run --threads 16 --batch-size 512 \
--topics dl20 \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25-pytorch.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.hybrid \
dense --index msmarco-v1-passage.tct_colbert-v2-hnp \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
sparse --index msmarco-v1-passage \
fusion --alpha 0.06 \
run --threads 16 --batch-size 512 \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25-pytorch.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25-pytorch.dev.txt
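
The hybrid runs interpolate dense and sparse scores per document, with fusion --alpha weighting one side; the small value (0.06) reflects that BM25 scores live on a different scale than inner products. A hedged sketch of the per-document fusion step (the exact normalization, and which side alpha scales, are implementation details):

def fuse(dense_score, sparse_score, alpha=0.06):
    # combine scores for a document retrieved by both systems
    return dense_score + alpha * sparse_score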

[6] Hybrid TCT_ColBERT-V2-HN+ and BM25 doc2query: PyTorch
DL19: AP 0.4829, nDCG@10 0.7376, R@1k 0.8614
DL20: AP 0.5078, nDCG@10 0.7244, R@1k 0.8847
dev: RR@10 0.3731, R@1k 0.9759

Command to generate run on TREC 2019 queries:
python -m pyserini.search.hybrid \
dense --index msmarco-v1-passage.tct_colbert-v2-hnp \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
sparse --index msmarco-v1-passage.d2q-t5 \
fusion --alpha 0.1 \
run --threads 16 --batch-size 512 \
--topics dl19-passage \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25d2q-pytorch.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25d2q-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25d2q-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25d2q-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.hybrid \
dense --index msmarco-v1-passage.tct_colbert-v2-hnp \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
sparse --index msmarco-v1-passage.d2q-t5 \
fusion --alpha 0.1 \
run --threads 16 --batch-size 512 \
--topics dl20 \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25d2q-pytorch.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25d2q-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25d2q-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25d2q-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.hybrid \
dense --index msmarco-v1-passage.tct_colbert-v2-hnp \
--encoder castorini/tct_colbert-v2-hnp-msmarco \
sparse --index msmarco-v1-passage.d2q-t5 \
fusion --alpha 0.1 \
run --threads 16 --batch-size 512 \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25d2q-pytorch.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25d2q-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.tct_colbert-v2-hnp-bm25d2q-pytorch.dev.txt
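
The d2q-t5 sparse index is built over passages expanded with queries predicted by doc2query-T5, so BM25 can match query vocabulary the original passage lacks. A sketch of the expansion step using the public castorini/doc2query-t5-base-msmarco checkpoint and the standard transformers generation API (sampling settings here are illustrative):

from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained('castorini/doc2query-t5-base-msmarco')
model = T5ForConditionalGeneration.from_pretrained('castorini/doc2query-t5-base-msmarco')

passage = 'The Manhattan Project produced the first nuclear weapons during World War II.'
ids = tok(passage, return_tensors='pt').input_ids
out = model.generate(ids, max_length=64, do_sample=True, top_k=10, num_return_sequences=3)
expansions = [tok.decode(o, skip_special_tokens=True) for o in out]
# the passage text concatenated with these predicted queries is what gets indexed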

[7] SLIM: Lucene, PyTorch
DL19: AP 0.4509, nDCG@10 0.7010, R@1k 0.8241
DL20: AP 0.4419, nDCG@10 0.6403, R@1k 0.8543
dev: RR@10 0.3581, R@1k 0.9622

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.slimr \
--topics dl19-passage \
--encoder castorini/slimr-msmarco-passage \
--encoded-corpus scipy-sparse-vectors.msmarco-v1-passage-slimr \
--output run.msmarco-v1-passage.slimr.dl19.txt \
--output-format msmarco --hits 1000 --impact --min-idf 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.slimr.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.slimr.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.slimr.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.slimr \
--topics dl20 \
--encoder castorini/slimr-msmarco-passage \
--encoded-corpus scipy-sparse-vectors.msmarco-v1-passage-slimr \
--output run.msmarco-v1-passage.slimr.dl20.txt \
--output-format msmarco --hits 1000 --impact --min-idf 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.slimr.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.slimr.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.slimr.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.slimr \
--topics msmarco-passage-dev-subset \
--encoder castorini/slimr-msmarco-passage \
--encoded-corpus scipy-sparse-vectors.msmarco-v1-passage-slimr \
--output run.msmarco-v1-passage.slimr.dev.txt \
--output-format msmarco --hits 1000 --impact --min-idf 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.slimr.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.slimr.dev.txt
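
SLIM is searched with Lucene impact scoring (--impact): queries and passages are both encoded into sparse term-weight vectors, a document's score is the sum of query weight times document weight over shared terms, and --min-idf drops very common terms from the query side. A toy version of the scoring, for intuition only:

def impact_score(query_weights, doc_weights, idf, min_idf=3):
    # query_weights / doc_weights: term -> learned impact weight
    return sum(w * doc_weights.get(t, 0)
               for t, w in query_weights.items()
               if idf.get(t, 0) >= min_idf)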

[7] SLIM++: Lucene, PyTorch
DL19: AP 0.4687, nDCG@10 0.7140, R@1k 0.8415
DL20: AP 0.4906, nDCG@10 0.7021, R@1k 0.8551
dev: RR@10 0.4032, R@1k 0.9680

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.slimr-pp \
--topics dl19-passage \
--encoder castorini/slimr-pp-msmarco-passage \
--encoded-corpus scipy-sparse-vectors.msmarco-v1-passage-slimr-pp \
--output run.msmarco-v1-passage.slimr-pp.dl19.txt \
--output-format msmarco --hits 1000 --impact --min-idf 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.slimr-pp.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.slimr-pp.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.slimr-pp.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.slimr-pp \
--topics dl20 \
--encoder castorini/slimr-pp-msmarco-passage \
--encoded-corpus scipy-sparse-vectors.msmarco-v1-passage-slimr-pp \
--output run.msmarco-v1-passage.slimr-pp.dl20.txt \
--output-format msmarco --hits 1000 --impact --min-idf 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.slimr-pp.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.slimr-pp.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.slimr-pp.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.slimr-pp \
--topics msmarco-passage-dev-subset \
--encoder castorini/slimr-pp-msmarco-passage \
--encoded-corpus scipy-sparse-vectors.msmarco-v1-passage-slimr-pp \
--output run.msmarco-v1-passage.slimr-pp.dev.txt \
--output-format msmarco --hits 1000 --impact --min-idf 3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.slimr-pp.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.slimr-pp.dev.txt

[8] Aggretriever-DistilBERT: Faiss flat, PyTorch
DL19: AP 0.4301, nDCG@10 0.6816, R@1k 0.8023
DL20: AP 0.4329, nDCG@10 0.6726, R@1k 0.8351
dev: RR@10 0.3412, R@1k 0.9604

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.aggretriever-distilbert \
--topics dl19-passage \
--encoder castorini/aggretriever-distilbert \
--output run.msmarco-v1-passage.aggretriever-distilbert-pytorch.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.aggretriever-distilbert-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.aggretriever-distilbert-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.aggretriever-distilbert-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.aggretriever-distilbert \
--topics dl20 \
--encoder castorini/aggretriever-distilbert \
--output run.msmarco-v1-passage.aggretriever-distilbert-pytorch.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.aggretriever-distilbert-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.aggretriever-distilbert-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.aggretriever-distilbert-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.aggretriever-distilbert \
--topics msmarco-passage-dev-subset \
--encoder castorini/aggretriever-distilbert \
--output run.msmarco-v1-passage.aggretriever-distilbert-pytorch.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.aggretriever-distilbert-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.aggretriever-distilbert-pytorch.dev.txt

[8] Aggretriever-coCondenser: Faiss flat, PyTorch
DL19: AP 0.4350, nDCG@10 0.6837, R@1k 0.8078
DL20: AP 0.4710, nDCG@10 0.6972, R@1k 0.8555
dev: RR@10 0.3619, R@1k 0.9735

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.aggretriever-cocondenser \
--topics dl19-passage \
--encoder castorini/aggretriever-cocondenser \
--output run.msmarco-v1-passage.aggretriever-cocondenser-pytorch.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.aggretriever-cocondenser-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.aggretriever-cocondenser-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.aggretriever-cocondenser-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.aggretriever-cocondenser \
--topics dl20 \
--encoder castorini/aggretriever-cocondenser \
--output run.msmarco-v1-passage.aggretriever-cocondenser-pytorch.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.aggretriever-cocondenser-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.aggretriever-cocondenser-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.aggretriever-cocondenser-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.aggretriever-cocondenser \
--topics msmarco-passage-dev-subset \
--encoder castorini/aggretriever-cocondenser \
--output run.msmarco-v1-passage.aggretriever-cocondenser-pytorch.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.aggretriever-cocondenser-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.aggretriever-cocondenser-pytorch.dev.txt

[11] OpenAI ada2: Faiss flat, cached queries
DL19: AP 0.4788, nDCG@10 0.7035, R@1k 0.8629
DL20: AP 0.4771, nDCG@10 0.6759, R@1k 0.8705
dev: RR@10 0.3435, R@1k 0.9858

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.openai-ada2 \
--topics dl19-passage --encoded-queries openai-ada2-dl19-passage \
--output run.msmarco-v1-passage.openai-ada2.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.openai-ada2.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.openai-ada2.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.openai-ada2.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.openai-ada2 \
--topics dl20 --encoded-queries openai-ada2-dl20 \
--output run.msmarco-v1-passage.openai-ada2.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.openai-ada2.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.openai-ada2.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.openai-ada2.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.openai-ada2 \
--topics msmarco-passage-dev-subset --encoded-queries openai-ada2-msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.openai-ada2.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.openai-ada2.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.openai-ada2.dev.txt

[12] HyDE-OpenAI ada2: Faiss flat, cached queries
DL19: AP 0.5125, nDCG@10 0.7163, R@1k 0.9002
DL20: AP 0.4938, nDCG@10 0.6666, R@1k 0.8919
dev: n/a

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.openai-ada2 \
--topics dl19-passage --encoded-queries openai-ada2-dl19-passage-hyde \
--output run.msmarco-v1-passage.openai-ada2-hyde.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.openai-ada2-hyde.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.openai-ada2-hyde.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.openai-ada2-hyde.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.openai-ada2 \
--topics dl20 --encoded-queries openai-ada2-dl20-hyde \
--output run.msmarco-v1-passage.openai-ada2-hyde.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.openai-ada2-hyde.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.openai-ada2-hyde.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.openai-ada2-hyde.dl20.txt
Runs on dev queries are not available for this condition.
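
HyDE changes only the query representation: an LLM first writes a hypothetical passage answering the query, and the ada2 embedding of that passage (pre-computed here in the -hyde cached queries) becomes the search vector. In outline, with generate() and embed() standing in for the LLM call and the embedding endpoint:

def hyde_query_vector(query, generate, embed):
    hypothetical = generate(f'Write a passage that answers the question: {query}')
    return embed(hypothetical)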

OpenAI text-embedding-3-large: Faiss flat, cached queries
DL19: AP 0.5259, nDCG@10 0.7173, R@1k 0.8991
DL20: AP 0.5134, nDCG@10 0.7163, R@1k 0.8884
dev: RR@10 0.3342, R@1k 0.9885

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.openai-text-embedding-3-large \
--topics dl19-passage --encoded-queries openai-text-embedding-3-large-dl19-passage \
--output run.msmarco-v1-passage.openai-text-embedding-3-large.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.openai-text-embedding-3-large.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.openai-text-embedding-3-large.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.openai-text-embedding-3-large.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.openai-text-embedding-3-large \
--topics dl20 --encoded-queries openai-text-embedding-3-large-dl20 \
--output run.msmarco-v1-passage.openai-text-embedding-3-large.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.openai-text-embedding-3-large.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.openai-text-embedding-3-large.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.openai-text-embedding-3-large.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.openai-text-embedding-3-large \
--topics msmarco-passage-dev-subset --encoded-queries openai-text-embedding-3-large-msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.openai-text-embedding-3-large.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.openai-text-embedding-3-large.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.openai-text-embedding-3-large.dev.txt

[13] cosDPR-distil: Faiss flat, PyTorch
DL19: AP 0.4656, nDCG@10 0.7250, R@1k 0.8201
DL20: AP 0.4876, nDCG@10 0.7025, R@1k 0.8533
dev: RR@10 0.3896, R@1k 0.9796

Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.cosdpr-distil \
--topics dl19-passage \
--encoder castorini/cosdpr-distil \
--output run.msmarco-v1-passage.cosdpr-distil.faiss-flat.pytorch.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.cosdpr-distil.faiss-flat.pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.cosdpr-distil.faiss-flat.pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.cosdpr-distil.faiss-flat.pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.cosdpr-distil \
--topics dl20 \
--encoder castorini/cosdpr-distil \
--output run.msmarco-v1-passage.cosdpr-distil.faiss-flat.pytorch.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.cosdpr-distil.faiss-flat.pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.cosdpr-distil.faiss-flat.pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.cosdpr-distil.faiss-flat.pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.cosdpr-distil \
--topics msmarco-passage-dev-subset \
--encoder castorini/cosdpr-distil \
--output run.msmarco-v1-passage.cosdpr-distil.faiss-flat.pytorch.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.cosdpr-distil.faiss-flat.pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.cosdpr-distil.faiss-flat.pytorch.dev.txt

[13] cosDPR-distil: Lucene HNSW, ONNX
DL19: AP 0.4660, nDCG@10 0.7250, R@1k 0.8222
DL20: AP 0.4876, nDCG@10 0.7025, R@1k 0.8540
dev: RR@10 0.3887, R@1k 0.9765

Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.cosdpr-distil.hnsw \
--topics dl19-passage \
--onnx-encoder CosDprDistil \
--output run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw.onnx.dl19.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw.onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw.onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw.onnx.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.cosdpr-distil.hnsw \
--topics dl20 \
--onnx-encoder CosDprDistil \
--output run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw.onnx.dl20.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw.onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw.onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw.onnx.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.cosdpr-distil.hnsw \
--topics msmarco-passage-dev-subset \
--onnx-encoder CosDprDistil \
--output run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw.onnx.dev.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw.onnx.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw.onnx.dev.txt
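
Unlike the flat indexes above, the Lucene HNSW runs perform approximate search over a navigable small-world graph; --ef-search sets how many graph candidates are explored per query, trading latency for recall, which explains the slight metric differences versus Faiss flat. The equivalent knob in Faiss, for intuition:

import faiss

index = faiss.IndexHNSWFlat(768, 32)   # d=768; M=32 links per node (illustrative)
index.hnsw.efSearch = 1000             # analogue of --ef-search 1000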

[13] cosDPR-distil: Lucene quantized HNSW, ONNX
DL19: AP 0.4664 | nDCG@10 0.7247 | R@1k 0.8218
DL20: AP 0.4871 | nDCG@10 0.6996 | R@1k 0.8538
dev:  RR@10 0.3899 | R@1k 0.9764
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.cosdpr-distil.hnsw-int8 \
--topics dl19-passage \
--onnx-encoder CosDprDistil \
--output run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw-int8.onnx.dl19.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw-int8.onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw-int8.onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw-int8.onnx.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.cosdpr-distil.hnsw-int8 \
--topics dl20 \
--onnx-encoder CosDprDistil \
--output run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw-int8.onnx.dl20.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw-int8.onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw-int8.onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw-int8.onnx.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.cosdpr-distil.hnsw-int8 \
--topics msmarco-passage-dev-subset \
--onnx-encoder CosDprDistil \
--output run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw-int8.onnx.dev.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw-int8.onnx.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.cosdpr-distil.lucene-hnsw-int8.onnx.dev.txt
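The quantized condition stores vectors as int8 instead of float32, shrinking the index roughly 4x, and the scores above barely move relative to the unquantized HNSW run (e.g., 0.4664 vs. 0.4660 AP on DL19). Lucene's exact int8 codec is internal to its vector format, but the general idea of scalar quantization is easy to see (illustrative only, not Lucene's algorithm):

import numpy as np

rng = np.random.default_rng(0)
vectors = rng.standard_normal((1000, 768)).astype("float32")
query = rng.standard_normal(768).astype("float32")

# Map each vector onto int8 with a per-vector scale, then dequantize.
scale = np.abs(vectors).max(axis=1, keepdims=True) / 127.0
quantized = np.round(vectors / scale).astype("int8")
restored = quantized.astype("float32") * scale

exact = vectors @ query
approx = restored @ query
print("max relative error:", np.abs(exact - approx).max() / np.abs(exact).max())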

[14] BGE-base-en-v1.5: Faiss flat, PyTorch
DL19: AP 0.4485 | nDCG@10 0.7016 | R@1k 0.8427
DL20: AP 0.4628 | nDCG@10 0.6768 | R@1k 0.8547
dev:  RR@10 0.3583 | R@1k 0.9811
Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder BAAI/bge-base-en-v1.5 --l2-norm --query-prefix "Represent this sentence for searching relevant passages:" \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics dl19-passage \
--output run.msmarco-v1-passage.bge-base-en-v1.5.faiss-flat.pytorch.dl19.txt \
--hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.faiss-flat.pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.faiss-flat.pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.faiss-flat.pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder BAAI/bge-base-en-v1.5 --l2-norm --query-prefix "Represent this sentence for searching relevant passages:" \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics dl20 \
--output run.msmarco-v1-passage.bge-base-en-v1.5.faiss-flat.pytorch.dl20.txt \
--hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.faiss-flat.pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.faiss-flat.pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.faiss-flat.pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--encoder-class auto \
--encoder BAAI/bge-base-en-v1.5 --l2-norm --query-prefix "Represent this sentence for searching relevant passages:" \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.bge-base-en-v1.5.faiss-flat.pytorch.dev.txt \
--hits 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bge-base-en-v1.5.faiss-flat.pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bge-base-en-v1.5.faiss-flat.pytorch.dev.txt
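The --query-prefix and --l2-norm flags mirror the documented BGE recipe: prepend the instruction string to each query, take the CLS token embedding, and L2-normalize so that inner product equals cosine similarity. A rough Hugging Face equivalent (Pyserini's encoder may differ in small details, such as how the prefix is joined to the query):

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-base-en-v1.5")
model = AutoModel.from_pretrained("BAAI/bge-base-en-v1.5").eval()

prefix = "Represent this sentence for searching relevant passages:"
query = prefix + " why do hurricanes happen"       # space-joining is an assumption

with torch.no_grad():
    inputs = tokenizer(query, truncation=True, return_tensors="pt")
    cls = model(**inputs).last_hidden_state[:, 0]  # CLS pooling
    embedding = torch.nn.functional.normalize(cls, dim=-1)  # the --l2-norm step
print(embedding.shape)                             # torch.Size([1, 768])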

[14] BGE-base-en-v1.5: Lucene HNSW, ONNX
DL19: AP 0.4486 | nDCG@10 0.7016 | R@1k 0.8441
DL20: AP 0.4626 | nDCG@10 0.6768 | R@1k 0.8526
dev:  RR@10 0.3575 | R@1k 0.9788
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.bge-base-en-v1.5.hnsw \
--topics dl19-passage \
--onnx-encoder BgeBaseEn15 \
--output run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dl19.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.bge-base-en-v1.5.hnsw \
--topics dl20 \
--onnx-encoder BgeBaseEn15 \
--output run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dl20.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.bge-base-en-v1.5.hnsw \
--topics msmarco-passage-dev-subset \
--onnx-encoder BgeBaseEn15 \
--output run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dev.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dev.txt
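A note on the recurring trec_eval flags: -c averages over all judged topics (scoring absent topics as zero), -l 2 counts only judgments of 2 or higher as relevant on the DL collections, and -M 10 -m recip_rank truncates each ranking to depth 10 before computing reciprocal rank, i.e., MRR@10. The dev metric is simple enough to verify by hand (a sketch assuming a standard six-column run file and a local qrels.dev.txt holding the dev judgments):

from collections import defaultdict

qrels = defaultdict(set)
with open("qrels.dev.txt") as f:                   # assumed local filename
    for line in f:
        qid, _, docid, rel = line.split()
        if int(rel) > 0:
            qrels[qid].add(docid)

runs = defaultdict(list)
with open("run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dev.txt") as f:
    for line in f:
        qid, _, docid, rank, _, _ = line.split()
        runs[qid].append((int(rank), docid))

total = 0.0
for qid, relevant in qrels.items():
    top10 = [d for _, d in sorted(runs.get(qid, []))[:10]]   # the -M 10 cutoff
    for i, docid in enumerate(top10):
        if docid in relevant:
            total += 1.0 / (i + 1)
            break
print("MRR@10 =", total / len(qrels))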

[14] BGE-base-en-v1.5: Lucene quantized HNSW, ONNX
DL19: AP 0.4454 | nDCG@10 0.7017 | R@1k 0.8436
DL20: AP 0.4596 | nDCG@10 0.6767 | R@1k 0.8468
dev:  RR@10 0.3575 | R@1k 0.9772
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.bge-base-en-v1.5.hnsw-int8 \
--topics dl19-passage \
--onnx-encoder BgeBaseEn15 \
--output run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dl19.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.bge-base-en-v1.5.hnsw-int8 \
--topics dl20 \
--onnx-encoder BgeBaseEn15 \
--output run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dl20.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 512 --dense --hnsw \
--index msmarco-v1-passage.bge-base-en-v1.5.hnsw-int8 \
--topics msmarco-passage-dev-subset \
--onnx-encoder BgeBaseEn15 \
--output run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dev.txt \
--hits 1000 --ef-search 1000
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dev.txt
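Beyond the aggregate metrics, it can be worth checking directly how much quantization perturbs the rankings. A small sketch that measures per-query top-10 overlap between the quantized and unquantized dev runs generated above:

from collections import defaultdict

def top_k(path, k=10):
    runs = defaultdict(list)
    with open(path) as f:
        for line in f:
            qid, _, docid, rank, _, _ = line.split()
            runs[qid].append((int(rank), docid))
    return {q: {d for _, d in sorted(pairs)[:k]} for q, pairs in runs.items()}

full = top_k("run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw.onnx.dev.txt")
int8 = top_k("run.msmarco-v1-passage.bge-base-en-v1.5.lucene-hnsw-int8.onnx.dev.txt")

overlaps = [len(full[q] & int8[q]) / 10 for q in full if q in int8]
print("mean top-10 overlap:", sum(overlaps) / len(overlaps))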

Cohere Embed English v3.0: Faiss flat, cached queries
DL19: AP 0.4884 | nDCG@10 0.6956 | R@1k 0.8630
DL20: AP 0.5067 | nDCG@10 0.7245 | R@1k 0.8682
dev:  RR@10 0.3660 | R@1k 0.9785
Command to generate run on TREC 2019 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.cohere-embed-english-v3.0 \
--topics dl19-passage --encoded-queries cohere-embed-english-v3.0-dl19-passage \
--output run.msmarco-v1-passage.cohere-embed-english-v3.0.dl19.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl19-passage \
run.msmarco-v1-passage.cohere-embed-english-v3.0.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-passage \
run.msmarco-v1-passage.cohere-embed-english-v3.0.dl19.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl19-passage \
run.msmarco-v1-passage.cohere-embed-english-v3.0.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.cohere-embed-english-v3.0 \
--topics dl20 --encoded-queries cohere-embed-english-v3.0-dl20 \
--output run.msmarco-v1-passage.cohere-embed-english-v3.0.dl20.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -l 2 -m map dl20-passage \
run.msmarco-v1-passage.cohere-embed-english-v3.0.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-passage \
run.msmarco-v1-passage.cohere-embed-english-v3.0.dl20.txt
python -m pyserini.eval.trec_eval -c -l 2 -m recall.1000 dl20-passage \
run.msmarco-v1-passage.cohere-embed-english-v3.0.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.faiss \
--threads 16 --batch-size 512 \
--index msmarco-v1-passage.cohere-embed-english-v3.0 \
--topics msmarco-passage-dev-subset --encoded-queries cohere-embed-english-v3.0-msmarco-passage-dev-subset \
--output run.msmarco-v1-passage.cohere-embed-english-v3.0.dev.txt
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 10 -m recip_rank msmarco-passage-dev-subset \
run.msmarco-v1-passage.cohere-embed-english-v3.0.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-passage-dev-subset \
run.msmarco-v1-passage.cohere-embed-english-v3.0.dev.txt
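"Cached queries" means the query embeddings were computed ahead of time and are downloaded alongside the topics rather than encoded on the fly; Cohere Embed English v3.0 is served via API, so there is no local encoder to invoke. The same condition should be reproducible from the Python API along these lines, though the class and method names below follow Pyserini's documented dense-search usage and should be checked against the installed version:

from pyserini.search.faiss import FaissSearcher, QueryEncoder

# Names follow Pyserini's dense-retrieval docs; verify against your version.
encoder = QueryEncoder.load_encoded_queries(
    "cohere-embed-english-v3.0-msmarco-passage-dev-subset")
searcher = FaissSearcher.from_prebuilt_index(
    "msmarco-v1-passage.cohere-embed-english-v3.0", encoder)

hits = searcher.search("what is paula deen's brother", k=1000)
for i, hit in enumerate(hits[:3]):
    print(f"{i + 1:2} {hit.docid:8} {hit.score:.5f}")

With cached queries, search() succeeds only for queries whose embeddings are in the downloaded set; encoding arbitrary new queries would require access to the Cohere API.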
|