The two-click reproduction matrix below provides commands for reproducing the experimental results reported in the following paper. Numbered rows correspond to tables in the paper; additional (unnumbered) conditions are provided for comparison purposes.
Xueguang Ma, Ronak Pradeep, Rodrigo Nogueira, and Jimmy Lin. Document Expansions and Learned Sparse Lexical Representations for MS MARCO V1 and V2. Proceedings of the 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), July 2022.
Instructions for programmatic execution are shown at the bottom of this page (scroll down).
| | Condition | TREC 2021 AP | TREC 2021 nDCG@10 | TREC 2021 R@1K | TREC 2022 AP | TREC 2022 nDCG@10 | TREC 2022 R@1K | TREC 2023 AP | TREC 2023 nDCG@10 | TREC 2023 R@1K | dev RR@100 | dev R@1K | dev2 RR@100 | dev2 R@1K |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (1a) | BM25 doc | 0.2126 | 0.5116 | 0.6739 | 0.0801 | 0.2993 | 0.4107 | 0.1046 | 0.2946 | 0.5262 | 0.1572 | 0.8054 | 0.1659 | 0.8029 |
| (1b) | BM25 doc segmented | 0.2436 | 0.5776 | 0.6930 | 0.1036 | 0.3618 | 0.4664 | 0.1341 | 0.3405 | 0.5662 | 0.1896 | 0.8542 | 0.1930 | 0.8549 |
| (1c) | BM25+RM3 doc | 0.2452 | 0.5304 | 0.7341 | 0.0798 | 0.2536 | 0.4217 | 0.1174 | 0.2462 | 0.5232 | 0.0974 | 0.7699 | 0.1033 | 0.7736 |
| (1d) | BM25+RM3 doc segmented | 0.2936 | 0.6189 | 0.7678 | 0.1260 | 0.3834 | 0.5114 | 0.1652 | 0.3452 | 0.5755 | 0.1660 | 0.8608 | 0.1702 | 0.8639 |
| (2a) | BM25 w/ doc2query-T5 doc | 0.2387 | 0.5792 | 0.7066 | 0.0977 | 0.3539 | 0.4301 | 0.1273 | 0.3511 | 0.5549 | 0.2011 | 0.8614 | 0.2012 | 0.8568 |
| (2b) | BM25 w/ doc2query-T5 doc segmented | 0.2683 | 0.6289 | 0.7202 | 0.1203 | 0.3975 | 0.4984 | 0.1460 | 0.3612 | 0.5967 | 0.2226 | 0.8982 | 0.2234 | 0.8952 |
| (2c) | BM25+RM3 w/ doc2query-T5 doc | 0.2611 | 0.5375 | 0.7574 | 0.0904 | 0.2758 | 0.4263 | 0.1246 | 0.2681 | 0.5616 | 0.1141 | 0.8191 | 0.1170 | 0.8247 |
| (2d) | BM25+RM3 w/ doc2query-T5 doc segmented | 0.3191 | 0.6559 | 0.7948 | 0.1319 | 0.3912 | 0.5188 | 0.1699 | 0.3454 | 0.6006 | 0.1975 | 0.9002 | 0.1978 | 0.8972 |
| (3a) | uniCOIL (noexp): cached queries | 0.2587 | 0.6495 | 0.6787 | 0.1180 | 0.4165 | 0.4779 | 0.1413 | 0.3898 | 0.5462 | 0.2231 | 0.8987 | 0.2314 | 0.8995 |
| (3b) | uniCOIL (w/ doc2query-T5): cached queries | 0.2718 | 0.6783 | 0.7069 | 0.1400 | 0.4451 | 0.5235 | 0.1554 | 0.4149 | 0.5753 | 0.2419 | 0.9122 | 0.2445 | 0.9172 |
| | uniCOIL (noexp): PyTorch | 0.2587 | 0.6495 | 0.6787 | 0.1180 | 0.4165 | 0.4779 | 0.1413 | 0.3898 | 0.5462 | 0.2231 | 0.8987 | 0.2314 | 0.8995 |
| | uniCOIL (w/ doc2query-T5): PyTorch | 0.2718 | 0.6783 | 0.7069 | 0.1400 | 0.4451 | 0.5235 | 0.1554 | 0.4150 | 0.5753 | 0.2419 | 0.9122 | 0.2445 | 0.9172 |
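The runs in this table are generated with Pyserini's command-line search tools and scored with trec_eval. As a rough sketch only (the prebuilt index, topic, qrels, and run-file names below are assumptions, not taken from this page), a BM25 run on the dev queries might be generated and scored along these lines; the authoritative commands for each row come from the --display-commands output described below.

# Sketch only: index/topics/qrels names are assumed; generate the run with BM25 over the document index.
python -m pyserini.search.lucene \
  --index msmarco-v2-doc \
  --topics msmarco-v2-doc-dev \
  --output run.msmarco-v2-doc.bm25-default.dev.txt \
  --bm25 --hits 1000

# Score the run: RR@100 and R@1K, matching the dev columns above.
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank \
  msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 \
  msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-default.dev.txt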
All experimental runs shown in the above table can be executed programmatically using the instructions below. To list all the experimental conditions:
python -m pyserini.2cr.msmarco --collection v2-doc --list-conditions
These conditions correspond to the table rows above.
For all conditions, just show the commands in a "dry run":
python -m pyserini.2cr.msmarco --collection v2-doc --all --display-commands --dry-run
To actually run all the experimental conditions:
python -m pyserini.2cr.msmarco --collection v2-doc --all --display-commands
With the above command, run files will be placed in the current directory. Use the option --directory runs/ to place the runs in a sub-directory.
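For example, combining the options above, the following runs all conditions and places the run files under runs/:
python -m pyserini.2cr.msmarco --collection v2-doc --all --display-commands --directory runs/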
To show the commands for a specific condition:
python -m pyserini.2cr.msmarco --collection v2-doc --condition bm25-doc-default --display-commands --dry-run
This will generate exactly the commands for the specified condition (corresponding to a row in the table above).
To actually run a specific condition:
python -m pyserini.2cr.msmarco --collection v2-doc --condition bm25-doc-default --display-commands
Again, with the above command, run files will be placed in the current directory. Use the option --directory runs/ to place the runs in a sub-directory.
Finally, to generate this page:
python -m pyserini.2cr.msmarco --collection v2-doc --generate-report --output msmarco-v2-doc.html
The output file msmarco-v2-doc.html should be identical to this page.