[1] (1a) BM25 doc (k1=0.9, b=0.4)
  TREC 2019: AP@100 0.2434 | nDCG@10 0.5176 | R@1K 0.6966
  TREC 2020: AP@100 0.3793 | nDCG@10 0.5286 | R@1K 0.8085
  dev:       RR@100 0.2299 | R@1K 0.8856
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-doc-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-doc-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-doc-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-doc-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-doc-default.dev.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-doc-default.dev.txt
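For reference, the same retrieval can be scripted through Pyserini's Python API instead of the CLI; a minimal single-query sketch (the query string is an arbitrary example, not one of the official topics):

from pyserini.search.lucene import LuceneSearcher

# The prebuilt index is downloaded and cached on first use.
searcher = LuceneSearcher.from_prebuilt_index('msmarco-v1-doc')
searcher.set_bm25(k1=0.9, b=0.4)

hits = searcher.search('what is the incubation period for measles', k=1000)
for i, hit in enumerate(hits[:10]):
    print(f'{i + 1:2} {hit.docid:15} {hit.score:.4f}')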

[1] (1b) BM25 doc seg (k1=0.9, b=0.4)
  TREC 2019: AP@100 0.2449 | nDCG@10 0.5302 | R@1K 0.6871
  TREC 2020: AP@100 0.3586 | nDCG@10 0.5281 | R@1K 0.7755
  dev:       RR@100 0.2684 | R@1K 0.9178
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-doc-segmented-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-doc-segmented-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-doc-segmented-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-doc-segmented-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-doc-segmented-default.dev.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-doc-segmented-default.dev.txt
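In the segmented conditions, --hits 10000 --max-passage-hits 1000 --max-passage retrieves 10000 segments and collapses them into a document ranking by keeping only each document's best-scoring segment (MaxP), truncated to 1000 documents. A minimal sketch of that aggregation, assuming the segmented index's docid convention of '<docid>#<segment>':

# MaxP aggregation: a document's score is the max over its segments' scores.
def maxp(segment_hits, k=1000):
    best = {}
    for segment_id, score in segment_hits:
        docid = segment_id.split('#')[0]
        if score > best.get(docid, float('-inf')):
            best[docid] = score
    return sorted(best.items(), key=lambda kv: -kv[1])[:k]

# e.g., maxp([('D1#0', 7.1), ('D1#3', 8.2), ('D2#1', 7.9)])
# -> [('D1', 8.2), ('D2', 7.9)]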

[1] (1c) BM25+RM3 doc (k1=0.9, b=0.4)
  TREC 2019: AP@100 0.2773 | nDCG@10 0.5174 | R@1K 0.7507
  TREC 2020: AP@100 0.4015 | nDCG@10 0.5254 | R@1K 0.8259
  dev:       RR@100 0.1618 | R@1K 0.8783
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rm3-doc-default.dl19.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rm3-doc-default.dl20.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rm3-doc-default.dev.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-doc-default.dev.txt
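The --rm3 flag enables RM3 pseudo-relevance feedback on top of BM25; it requires an index with stored document vectors (which is why the doc2query-T5 feedback conditions below switch to the *.d2q-t5-docvectors indexes). A sketch of the same setup in the Python API, with Pyserini's default feedback parameters spelled out:

from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher.from_prebuilt_index('msmarco-v1-doc')
searcher.set_bm25(k1=0.9, b=0.4)
# Defaults: 10 expansion terms from the top 10 feedback docs, interpolated
# with the original query at weight 0.5.
searcher.set_rm3(fb_terms=10, fb_docs=10, original_query_weight=0.5)
hits = searcher.search('what is the incubation period for measles', k=1000)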

[1] (1d) BM25+RM3 doc seg (k1=0.9, b=0.4)
  TREC 2019: AP@100 0.2892 | nDCG@10 0.5684 | R@1K 0.7368
  TREC 2020: AP@100 0.3792 | nDCG@10 0.5202 | R@1K 0.8023
  dev:       RR@100 0.2413 | R@1K 0.9351
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl19.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl20.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dev.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dev.txt

BM25+Rocchio doc (k1=0.9, b=0.4)
  TREC 2019: AP@100 0.2811 | nDCG@10 0.5256 | R@1K 0.7546
  TREC 2020: AP@100 0.4089 | nDCG@10 0.5192 | R@1K 0.8273
  dev:       RR@100 0.1624 | R@1K 0.8789
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rocchio-doc-default.dl19.txt \
--bm25 --rocchio --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rocchio-doc-default.dl20.txt \
--bm25 --rocchio --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rocchio-doc-default.dev.txt \
--bm25 --rocchio --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rocchio-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rocchio-doc-default.dev.txt
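Rocchio feedback (enabled with --rocchio, or searcher.set_rocchio() in the Python API) expands the query toward the centroid of the top-ranked feedback documents rather than building a relevance model as RM3 does. A conceptual sketch of the classic positive-feedback-only update on term-weight dicts, not the exact Anserini implementation:

def rocchio(query_vec, feedback_doc_vecs, alpha=1.0, beta=0.75):
    """q_new = alpha * q + beta * centroid(feedback docs)."""
    new_q = dict()
    for term, weight in query_vec.items():
        new_q[term] = new_q.get(term, 0.0) + alpha * weight
    for doc_vec in feedback_doc_vecs:
        for term, weight in doc_vec.items():
            new_q[term] = new_q.get(term, 0.0) + beta * weight / len(feedback_doc_vecs)
    return new_q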

BM25+Rocchio doc seg (k1=0.9, b=0.4)
  TREC 2019: AP@100 0.2889 | nDCG@10 0.5570 | R@1K 0.7423
  TREC 2020: AP@100 0.3830 | nDCG@10 0.5226 | R@1K 0.8102
  dev:       RR@100 0.2447 | R@1K 0.9351
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl19.txt \
--bm25 --rocchio --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl20.txt \
--bm25 --rocchio --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dev.txt \
--bm25 --rocchio --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dev.txt

BM25 doc (k1=4.46, b=0.82)
  TREC 2019: AP@100 0.2336 | nDCG@10 0.5233 | R@1K 0.6757
  TREC 2020: AP@100 0.3581 | nDCG@10 0.5061 | R@1K 0.7776
  dev:       RR@100 0.2767 | R@1K 0.9357
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-doc-tuned.dl19.txt \
--bm25 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-doc-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-doc-tuned.dl20.txt \
--bm25 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-doc-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-doc-tuned.dev.txt \
--bm25 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-doc-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-doc-tuned.dev.txt
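The tuned parameter settings in these conditions (e.g., k1=4.46, b=0.82 here) were selected by grid search on the dev queries. A hypothetical sweep along those lines, scripted around the same CLI commands (illustrative only, not the original tuning script; the grid values are arbitrary examples):

import itertools
import subprocess

for k1, b in itertools.product([0.9, 2.0, 3.0, 4.46], [0.4, 0.6, 0.8, 0.82]):
    run = f'run.dev.k1_{k1}.b_{b}.txt'
    subprocess.run(['python', '-m', 'pyserini.search.lucene',
                    '--threads', '16', '--batch-size', '128',
                    '--index', 'msmarco-v1-doc', '--topics', 'msmarco-doc-dev',
                    '--output', run, '--bm25', '--k1', str(k1), '--b', str(b)],
                   check=True)
    # Score each setting, e.g., by RR@100 on the dev queries.
    subprocess.run(['python', '-m', 'pyserini.eval.trec_eval', '-c',
                    '-M', '100', '-m', 'recip_rank', 'msmarco-doc-dev', run],
                   check=True)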

BM25 doc seg (k1=2.16, b=0.61)
  TREC 2019: AP@100 0.2398 | nDCG@10 0.5389 | R@1K 0.6565
  TREC 2020: AP@100 0.3458 | nDCG@10 0.5213 | R@1K 0.7725
  dev:       RR@100 0.2756 | R@1K 0.9311
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl19.txt \
--bm25 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl20.txt \
--bm25 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-doc-segmented-tuned.dev.txt \
--bm25 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-doc-segmented-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-doc-segmented-tuned.dev.txt

BM25+RM3 doc (k1=4.46, b=0.82)
  TREC 2019: AP@100 0.2638 | nDCG@10 0.5526 | R@1K 0.7188
  TREC 2020: AP@100 0.3610 | nDCG@10 0.5195 | R@1K 0.8180
  dev:       RR@100 0.2227 | R@1K 0.9303
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl19.txt \
--bm25 --rm3 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl20.txt \
--bm25 --rm3 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rm3-doc-tuned.dev.txt \
--bm25 --rm3 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-doc-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-doc-tuned.dev.txt

BM25+RM3 doc seg (k1=2.16, b=0.61)
  TREC 2019: AP@100 0.2655 | nDCG@10 0.5392 | R@1K 0.7037
  TREC 2020: AP@100 0.3471 | nDCG@10 0.5030 | R@1K 0.8056
  dev:       RR@100 0.2448 | R@1K 0.9359
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl19.txt \
--bm25 --rm3 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl20.txt \
--bm25 --rm3 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dev.txt \
--bm25 --rm3 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dev.txt

BM25+Rocchio doc (k1=4.46, b=0.82)
  TREC 2019: AP@100 0.2657 | nDCG@10 0.5584 | R@1K 0.7299
  TREC 2020: AP@100 0.3628 | nDCG@10 0.5199 | R@1K 0.8217
  dev:       RR@100 0.2242 | R@1K 0.9314
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl19.txt \
--bm25 --rocchio --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl20.txt \
--bm25 --rocchio --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dev.txt \
--bm25 --rocchio --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dev.txt

BM25+Rocchio doc seg (k1=2.16, b=0.61)
  TREC 2019: AP@100 0.2672 | nDCG@10 0.5421 | R@1K 0.7115
  TREC 2020: AP@100 0.3521 | nDCG@10 0.4997 | R@1K 0.8042
  dev:       RR@100 0.2475 | R@1K 0.9395
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl19.txt \
--bm25 --rocchio --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl20.txt \
--bm25 --rocchio --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dev.txt \
--bm25 --rocchio --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dev.txt

[1] (2a) BM25 w/ doc2query-T5 doc (k1=0.9, b=0.4)
  TREC 2019: AP@100 0.2700 | nDCG@10 0.5968 | R@1K 0.7190
  TREC 2020: AP@100 0.4230 | nDCG@10 0.5885 | R@1K 0.8403
  dev:       RR@100 0.2880 | R@1K 0.9259
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5 \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5 \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5 \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dev.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dev.txt
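The *.d2q-t5 indexes are built over documents expanded with queries predicted by a doc2query-T5 model before indexing. A sketch of generating such expansions with the released Hugging Face checkpoint castorini/doc2query-t5-base-msmarco (the passage-level model; the sampling settings and input passage are illustrative):

from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = 'castorini/doc2query-t5-base-msmarco'
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

passage = 'The presence of communication amid scientific minds was equally important to the success of the Manhattan Project.'
inputs = tokenizer(passage, return_tensors='pt', truncation=True, max_length=512)
# Top-k sampling yields diverse predicted queries; these get appended to the
# document text before indexing.
outputs = model.generate(inputs.input_ids, max_length=64, do_sample=True,
                         top_k=10, num_return_sequences=3)
for ids in outputs:
    print(tokenizer.decode(ids, skip_special_tokens=True))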

[1] (2b) BM25 w/ doc2query-T5 doc seg (k1=0.9, b=0.4)
  TREC 2019: AP@100 0.2798 | nDCG@10 0.6119 | R@1K 0.7165
  TREC 2020: AP@100 0.4150 | nDCG@10 0.5957 | R@1K 0.8046
  dev:       RR@100 0.3179 | R@1K 0.9490
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5 \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5 \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5 \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dev.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dev.txt

[1] (2c) BM25+RM3 w/ doc2query-T5 doc (k1=0.9, b=0.4)
  TREC 2019: AP@100 0.3045 | nDCG@10 0.5904 | R@1K 0.7737
  TREC 2020: AP@100 0.4230 | nDCG@10 0.5427 | R@1K 0.8631
  dev:       RR@100 0.1834 | R@1K 0.9126
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5-docvectors \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl19.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5-docvectors \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl20.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5-docvectors \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dev.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dev.txt

[1] (2d) BM25+RM3 w/ doc2query-T5 doc seg (k1=0.9, b=0.4)
  TREC 2019: AP@100 0.3030 | nDCG@10 0.6290 | R@1K 0.7483
  TREC 2020: AP@100 0.4271 | nDCG@10 0.5851 | R@1K 0.8266
  dev:       RR@100 0.2803 | R@1K 0.9551
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5-docvectors \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl19.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5-docvectors \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl20.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5-docvectors \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev.txt

BM25 w/ doc2query-T5 doc (k1=4.68, b=0.87)
  TREC 2019: AP@100 0.2620 | nDCG@10 0.5972 | R@1K 0.6867
  TREC 2020: AP@100 0.4099 | nDCG@10 0.5852 | R@1K 0.8105
  dev:       RR@100 0.3269 | R@1K 0.9553
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5 \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl19.txt \
--bm25 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5 \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl20.txt \
--bm25 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5 \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dev.txt \
--bm25 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dev.txt

BM25 w/ doc2query-T5 doc seg (k1=2.56, b=0.59)
  TREC 2019: AP@100 0.2658 | nDCG@10 0.6273 | R@1K 0.6707
  TREC 2020: AP@100 0.4047 | nDCG@10 0.5943 | R@1K 0.7968
  dev:       RR@100 0.3209 | R@1K 0.9530
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5 \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl19.txt \
--bm25 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5 \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl20.txt \
--bm25 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5 \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dev.txt \
--bm25 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dev.txt

BM25+RM3 w/ doc2query-T5 doc (k1=4.68, b=0.87)
  TREC 2019: AP@100 0.2813 | nDCG@10 0.6091 | R@1K 0.7184
  TREC 2020: AP@100 0.4100 | nDCG@10 0.5745 | R@1K 0.8238
  dev:       RR@100 0.2623 | R@1K 0.9522
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5-docvectors \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl19.txt \
--bm25 --rm3 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5-docvectors \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl20.txt \
--bm25 --rm3 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc.d2q-t5-docvectors \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dev.txt \
--bm25 --rm3 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dev.txt

BM25+RM3 w/ doc2query-T5 doc seg (k1=2.56, b=0.59)
  TREC 2019: AP@100 0.2892 | nDCG@10 0.6247 | R@1K 0.7069
  TREC 2020: AP@100 0.4016 | nDCG@10 0.5711 | R@1K 0.8156
  dev:       RR@100 0.2973 | R@1K 0.9563
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5-docvectors \
--topics dl19-doc \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl19.txt \
--bm25 --rm3 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5-docvectors \
--topics dl20 \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl20.txt \
--bm25 --rm3 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.d2q-t5-docvectors \
--topics msmarco-doc-dev \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dev.txt \
--bm25 --rm3 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dev.txt

[1] (3a) uniCOIL (noexp): cached queries
  TREC 2019: AP@100 0.2665 | nDCG@10 0.6349 | R@1K 0.6391
  TREC 2020: AP@100 0.3698 | nDCG@10 0.5893 | R@1K 0.7623
  dev:       RR@100 0.3409 | R@1K 0.9420
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil-noexp \
--topics dl19-doc-unicoil-noexp \
--output run.msmarco-v1-doc.unicoil-noexp.dl19.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.unicoil-noexp.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.unicoil-noexp.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.unicoil-noexp.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil-noexp \
--topics dl20-unicoil-noexp \
--output run.msmarco-v1-doc.unicoil-noexp.dl20.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.unicoil-noexp.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.unicoil-noexp.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.unicoil-noexp.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil-noexp \
--topics msmarco-doc-dev-unicoil-noexp \
--output run.msmarco-v1-doc.unicoil-noexp.dev.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.unicoil-noexp.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.unicoil-noexp.dev.txt
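The --impact flag replaces BM25 scoring with impact search: uniCOIL stores precomputed (quantized) term weights in the index, and a document's score is simply the dot product between the weighted query and the document over their shared terms. A toy sketch of that scoring function (the terms and weights are made up):

def impact_score(query_weights, doc_weights):
    # Dot product over the terms the query and document share.
    return sum(w * doc_weights[t]
               for t, w in query_weights.items() if t in doc_weights)

q = {'incubation': 64, 'period': 27, 'measles': 91}
d = {'measles': 118, 'incubation': 75, 'rash': 40}
print(impact_score(q, d))  # 64*75 + 91*118 = 15538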

uniCOIL (noexp): PyTorch
  TREC 2019: AP@100 0.2665 | nDCG@10 0.6349 | R@1K 0.6391
  TREC 2020: AP@100 0.3698 | nDCG@10 0.5893 | R@1K 0.7623
  dev:       RR@100 0.3409 | R@1K 0.9420
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil-noexp \
--topics dl19-doc \
--encoder castorini/unicoil-noexp-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-noexp-pytorch.dl19.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.unicoil-noexp-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.unicoil-noexp-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.unicoil-noexp-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil-noexp \
--topics dl20 \
--encoder castorini/unicoil-noexp-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-noexp-pytorch.dl20.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.unicoil-noexp-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.unicoil-noexp-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.unicoil-noexp-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil-noexp \
--topics msmarco-doc-dev \
--encoder castorini/unicoil-noexp-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-noexp-pytorch.dev.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.unicoil-noexp-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.unicoil-noexp-pytorch.dev.txt
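The PyTorch conditions encode queries on the fly with the uniCOIL checkpoint (--encoder) rather than reading cached query weights, which is why they take the plain dl19-doc/dl20/msmarco-doc-dev topics. A Python-API sketch of the same idea, assuming LuceneImpactSearcher's prebuilt-index constructor takes the encoder name as its second argument:

from pyserini.search.lucene import LuceneImpactSearcher

searcher = LuceneImpactSearcher.from_prebuilt_index(
    'msmarco-v1-doc-segmented.unicoil-noexp',   # impact index over segments
    'castorini/unicoil-noexp-msmarco-passage')  # query encoder (PyTorch)
hits = searcher.search('what is the incubation period for measles', k=10)
for i, hit in enumerate(hits):
    print(f'{i + 1:2} {hit.docid:20} {hit.score:.1f}')

Note that these hits are segments; the CLI's --max-passage aggregation (sketched earlier) is what turns them into a document ranking.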

[1] (3b) uniCOIL (w/ doc2query-T5): cached queries
  TREC 2019: AP@100 0.2789 | nDCG@10 0.6396 | R@1K 0.6652
  TREC 2020: AP@100 0.3882 | nDCG@10 0.6033 | R@1K 0.7869
  dev:       RR@100 0.3531 | R@1K 0.9546
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil \
--topics dl19-doc-unicoil \
--output run.msmarco-v1-doc.unicoil.dl19.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.unicoil.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.unicoil.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.unicoil.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil \
--topics dl20-unicoil \
--output run.msmarco-v1-doc.unicoil.dl20.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.unicoil.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.unicoil.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.unicoil.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil \
--topics msmarco-doc-dev-unicoil \
--output run.msmarco-v1-doc.unicoil.dev.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.unicoil.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.unicoil.dev.txt

uniCOIL (w/ doc2query-T5): PyTorch
  TREC 2019: AP@100 0.2789 | nDCG@10 0.6396 | R@1K 0.6652
  TREC 2020: AP@100 0.3882 | nDCG@10 0.6033 | R@1K 0.7869
  dev:       RR@100 0.3531 | R@1K 0.9546
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil \
--topics dl19-doc \
--encoder castorini/unicoil-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-pytorch.dl19.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc \
run.msmarco-v1-doc.unicoil-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc \
run.msmarco-v1-doc.unicoil-pytorch.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc \
run.msmarco-v1-doc.unicoil-pytorch.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil \
--topics dl20 \
--encoder castorini/unicoil-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-pytorch.dl20.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc \
run.msmarco-v1-doc.unicoil-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc \
run.msmarco-v1-doc.unicoil-pytorch.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc \
run.msmarco-v1-doc.unicoil-pytorch.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-doc-segmented.unicoil \
--topics msmarco-doc-dev \
--encoder castorini/unicoil-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-pytorch.dev.txt \
--impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev \
run.msmarco-v1-doc.unicoil-pytorch.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev \
run.msmarco-v1-doc.unicoil-pytorch.dev.txt