MS MARCO V2 Passage

The two-click* reproduction matrix below provides commands for reproducing experimental results reported in a number of papers, denoted by the references in square brackets. Instructions for programmatic execution are shown at the bottom of this page (scroll down).

The matrix columns are Multi-Pass, First-Stage, Method, and Top-k, followed by nDCG@10 scores on the TREC 2021, TREC 2022, and TREC 2023 query sets. The commands below are grouped by method; a note on recomputing the nDCG@10 scores follows the last command block.

monoT5-3B

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/monot5-3b-msmarco-10k \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/monot5_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/monot5-3b-msmarco-10k \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/monot5_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/monot5-3b-msmarco-10k \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/monot5_template.yaml  \
  --context_size=4096 \
  --variable_passages

duoT5-3B

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/duot5-3b-msmarco-10k \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/duot5_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/duot5-3b-msmarco-10k \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/duot5_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/duot5-3b-msmarco-10k \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/duot5_template.yaml  \
  --context_size=4096 \
  --variable_passages

LiT5-Distill (large)

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/LiT5-Distill-large \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_fid_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/LiT5-Distill-large \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_fid_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/LiT5-Distill-large \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_fid_template.yaml  \
  --context_size=4096 \
  --variable_passages

RankVicuna (7B)

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/rank_vicuna_7b_v1 \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/rank_vicuna_7b_v1 \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/rank_vicuna_7b_v1 \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages

RankZephyr (7B)

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/rank_zephyr_7b_v1_full \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/rank_zephyr_7b_v1_full \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/rank_zephyr_7b_v1_full \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages

FirstMistral

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/first_mistral \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages --use_alpha --use_logits
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/first_mistral \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages --use_alpha --use_logits
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=castorini/first_mistral \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages --use_alpha --use_logits

Qwen2.5-7B-Instruct

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=Qwen/Qwen2.5-7B-Instruct \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=Qwen/Qwen2.5-7B-Instruct \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=Qwen/Qwen2.5-7B-Instruct \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages

Llama-3.1-8B-Instruct

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=meta-llama/Llama-3.1-8B-Instruct \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=meta-llama/Llama-3.1-8B-Instruct \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=meta-llama/Llama-3.1-8B-Instruct \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages

Gemini 2.0 Flash

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gemini-2.0-flash \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gemini-2.0-flash \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gemini-2.0-flash \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_zephyr_template.yaml  \
  --context_size=4096 \
  --variable_passages

RankGPT (gpt-4o-mini)

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gpt-4o-mini \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_gpt_template.yaml \
  --context_size=4096 \
  --variable_passages --use_azure_openai
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gpt-4o-mini \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_gpt_template.yaml \
  --context_size=4096 \
  --variable_passages --use_azure_openai
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gpt-4o-mini \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_gpt_template.yaml \
  --context_size=4096 \
  --variable_passages --use_azure_openai

APEER (gpt-4o-mini)

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gpt-4o-mini \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_gpt_apeer_template.yaml \
  --context_size=4096 \
  --variable_passages --use_azure_openai
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gpt-4o-mini \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_gpt_apeer_template.yaml \
  --context_size=4096 \
  --variable_passages --use_azure_openai
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gpt-4o-mini \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_gpt_apeer_template.yaml \
  --context_size=4096 \
  --variable_passages --use_azure_openai

LRL (gpt-4o-mini)

Command to generate and evaluate run on TREC 2021 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gpt-4o-mini \
  --top_k_candidates=100 --dataset=dl21 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_lrl_template.yaml \
  --context_size=4096 \
  --variable_passages --use_azure_openai
Command to generate and evaluate run on TREC 2022 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gpt-4o-mini \
  --top_k_candidates=100 --dataset=dl22 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_lrl_template.yaml \
  --context_size=4096 \
  --variable_passages --use_azure_openai
Command to generate and evaluate run on TREC 2023 queries:
python src/rank_llm/scripts/run_rank_llm.py  \
  --model_path=gpt-4o-mini \
  --top_k_candidates=100 --dataset=dl23 \
  --retrieval_method=bm25 \
  --prompt_template_path=src/rank_llm/rerank/prompt_templates/rank_lrl_template.yaml \
  --context_size=4096 \
  --variable_passages --use_azure_openai
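
Each command above both generates a run file and evaluates it. If you want to recompute the nDCG@10 scores yourself, one option is Pyserini's trec_eval wrapper; a minimal example for a TREC 2021 run, assuming Pyserini is installed, that it recognizes dl21-passage as the qrels identifier, and using a hypothetical run file name:

python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-passage runs/run.monot5-3b.dl21.txt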

Programmatic Execution

Create and activate a Conda environment:

conda create -n rz python=3.10
conda activate rz
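
The commands on this page are run from a source checkout of the rank_llm repository. Assuming that layout, a typical next step (not spelled out on this page) is to install the package and its dependencies into the new environment:

pip install -e .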

All experimental runs shown in the above table can be executed programmatically by following the instructions below. To list all the experimental conditions:

python -m src.rank_llm.2cr.msmarco --collection v2-passage --list-conditions

These conditions correspond to the table rows above.
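
These steps can also be driven from Python. The sketch below shells out to the same module and captures the condition list; parsing one condition name per line is an assumption about the output format, so adjust as needed:

import subprocess

# Ask the 2CR module for its experimental conditions and capture stdout.
result = subprocess.run(
    ["python", "-m", "src.rank_llm.2cr.msmarco",
     "--collection", "v2-passage", "--list-conditions"],
    capture_output=True, text=True, check=True,
)

# Assumption: the module prints one condition name per line.
conditions = [line.strip() for line in result.stdout.splitlines() if line.strip()]
print(conditions)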

For all conditions, just show the commands in a "dry run":

python -m src.rank_llm.2cr.msmarco --collection v2-passage --all --display-commands --dry-run

To actually run all the experimental conditions:

python -m src.rank_llm.2cr.msmarco --collection v2-passage --all --display-commands

With the above command, run files will be placed in the current directory. Use the option --directory runs/ to place the runs in a sub-directory.
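
For example, to run everything and collect all run files under runs/:

python -m src.rank_llm.2cr.msmarco --collection v2-passage --all --display-commands --directory runs/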

To show the commands for a specific condition:

python -m src.rank_llm.2cr.msmarco --collection v2-passage --condition lrl --display-commands --dry-run

This will generate exactly the commands for a specific condition above (corresponding to a row in the table).

To actually run a specific condition:

python -m src.rank_llm.2cr.msmarco --collection v2-passage --condition lrl --display-commands

Again, with the above command, run files will be placed in the current directory. Use the option --directory runs/ to place the runs in a sub-directory.
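
Putting the pieces together, the following sketch loops over every condition and materializes each one's run files under runs/; as in the earlier sketch, parsing one condition name per line is an assumption:

import subprocess

BASE = ["python", "-m", "src.rank_llm.2cr.msmarco", "--collection", "v2-passage"]

# Discover the experimental conditions (assumes one name per line of output).
listing = subprocess.run(BASE + ["--list-conditions"],
                         capture_output=True, text=True, check=True)
conditions = [line.strip() for line in listing.stdout.splitlines() if line.strip()]

# Execute each condition, placing the resulting run files under runs/.
for condition in conditions:
    subprocess.run(BASE + ["--condition", condition,
                           "--display-commands", "--directory", "runs/"],
                   check=True)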

Finally, to generate this page:

python -m src.rank_llm.2cr.msmarco --collection v2-passage --generate-report --output msmarco-v2-passage.html

The output file msmarco-v2-passage.html should be identical to this page.