Massive Legal Embedding Benchmark (MLEB)

The Massive Legal Embedding Benchmark (MLEB) by Isaacus is the largest, most diverse, and most comprehensive benchmark for legal text embedding models. It contains 10 datasets spanning multiple document types, jurisdictions, areas of law, and tasks. To do well on MLEB, embedding models must demonstrate both extensive legal domain knowledge and strong legal reasoning skills.

This repository contains the code used to evaluate embedding models on MLEB (available in the scripts directory), as well as the full results of evaluated models (available in the results directory).

If you're looking for the MLEB datasets themselves, they are distributed separately from this repository. You can also read our paper at https://arxiv.org/abs/2510.19365.

Setup

We recommend setting up a virtual environment for this project and installing necessary dependencies using uv like so:

git clone https://github.com/isaacus-dev/mleb.git
cd mleb
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv --python 3.12
source .venv/bin/activate
uv sync

Those commands download this repository, install uv, create a Python 3.12 virtual environment, activate it, and install all necessary dependencies.

Alternatively, you may manually install the necessary dependencies listed in our pyproject.toml file.

After installing the necessary dependencies, we recommend creating a .env file in the root directory of this repository to store your API keys for various embedding model providers. You can use the provided .env.example file as a template:

# Isaacus
ISAACUS_API_KEY=...

# OpenAI
OPENAI_API_KEY=...

# Google
GOOGLE_API_KEY=...

# Voyage AI
VOYAGE_API_KEY=...

Make sure to replace the ... with your actual API keys. You may omit any keys for providers you won't be using.
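
To sanity-check that your keys are being picked up before launching a full evaluation run, you can load the .env file yourself. The sketch below assumes the python-dotenv package; the repository's own scripts may load these variables differently.

import os
from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads the .env file in the current working directory

for key in ("ISAACUS_API_KEY", "OPENAI_API_KEY", "GOOGLE_API_KEY", "VOYAGE_API_KEY"):
    print(f"{key}: {'set' if os.getenv(key) else 'missing'}")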

Usage

To evaluate embedding models on MLEB, run the scripts/mleb.py script, like so:

python scripts/mleb.py

Inside the script, you can choose which models to evaluate by modifying the MODEL_IDS list near the top of the file. Model IDs correspond to the ids of models defined in the MODEL_CONFIGS list in scripts/models.py.
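
For example, restricting a run to a subset of models might look like the following. The IDs shown are purely illustrative; use the ids actually defined in MODEL_CONFIGS.

# Near the top of scripts/mleb.py. These IDs are hypothetical; the real
# values are whatever is defined in MODEL_CONFIGS in scripts/models.py.
MODEL_IDS = [
    "kanon-2-embedder",        # hypothetical Isaacus model ID
    "text-embedding-3-large",  # hypothetical OpenAI model ID
]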

New models can be registered by appending new MLEBEvaluationModelConfig instances (defined in scripts/structs.py) to the MODEL_CONFIGS list.
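
A new entry might look roughly like the sketch below. The field names are hypothetical; consult the MLEBEvaluationModelConfig definition in scripts/structs.py for the actual fields it accepts.

# In scripts/models.py. Field names are illustrative only; check
# MLEBEvaluationModelConfig in scripts/structs.py for the real signature.
MODEL_CONFIGS.append(
    MLEBEvaluationModelConfig(
        id="my-new-embedder",            # hypothetical model ID
        provider="openai",               # hypothetical provider name
        model="text-embedding-3-small",  # hypothetical upstream model name
    )
)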

Results are written in the mteb format to the results directory.

Running scripts/export.py packs all results into a single JSONL file at results/results.jsonl. That file is used to dynamically present the latest benchmark results on the MLEB website.
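
If you want to inspect the packed results yourself, the JSONL file can be read line by line with the standard library. This is a minimal sketch; the exact per-record schema follows the mteb results format.

import json

# Read the packed benchmark results produced by scripts/export.py.
with open("results/results.jsonl", encoding="utf-8") as f:
    records = [json.loads(line) for line in f if line.strip()]

print(f"Loaded {len(records)} result records.")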

License

This project is licensed under the MIT License.

Citation

@misc{butler2025massivelegalembeddingbenchmark,
      title={The Massive Legal Embedding Benchmark (MLEB)}, 
      author={Umar Butler and Abdur-Rahman Butler and Adrian Lucas Malec},
      year={2025},
      eprint={2510.19365},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.19365}, 
}
