Massive Legal Embedding Benchmark (MLEB)

The Massive Legal Embedding Benchmark (MLEB) by Isaacus is the largest, most diverse, and most comprehensive benchmark for legal text embedding models. It contains 10 datasets spanning multiple document types, jurisdictions, areas of law, and tasks. To do well on MLEB, embedding models must demonstrate both extensive legal domain knowledge and strong legal reasoning skills.

This repository contains the code used to evaluate embedding models on MLEB (available in the scripts directory), as well as the full results of evaluated models (available in the results directory).

If you're looking for the MLEB datasets themselves, they are distributed separately from this repository. You can also read our paper at https://arxiv.org/abs/2510.19365.

Setup

We recommend setting up a virtual environment for this project and installing necessary dependencies using uv like so:

git clone https://github.com/isaacus-dev/mleb.git
cd mleb
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv --python 3.12
source .venv/bin/activate
uv sync

Those commands download this repository, install uv, create a Python 3.12 virtual environment, activate it, and install all necessary dependencies.

Alternatively, you may manually install the necessary dependencies listed in our pyproject.toml file.

After installing the necessary dependencies, we recommend creating a .env file in the root directory of this repository to store your API keys for various embedding model providers. You can use the provided .env.example file as a template:

# Isaacus
ISAACUS_API_KEY=...

# OpenAI
OPENAI_API_KEY=...

# Google
GOOGLE_API_KEY=...

# Voyage AI
VOYAGE_API_KEY=...

Make sure to replace the ... with your actual API keys. You may omit any keys for providers you won't be using.
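
To sanity-check that your keys are being picked up before launching a full evaluation run, you can load the .env file yourself. The sketch below assumes the python-dotenv package; the repository's own scripts may load these variables differently.

import os
from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads the .env file in the current working directory

for key in ("ISAACUS_API_KEY", "OPENAI_API_KEY", "GOOGLE_API_KEY", "VOYAGE_API_KEY"):
    print(f"{key}: {'set' if os.getenv(key) else 'missing'}")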

Usage

To evaluate embedding models on MLEB, run the scripts/mleb.py script, like so:

python scripts/mleb.py

Inside the script, you can choose which models to evaluate by modifying the MODEL_IDS list near the top of the file. Model IDs correspond to the ids of models defined in the MODEL_CONFIGS list in scripts/models.py.
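
For example, restricting a run to a subset of models might look like the following. The IDs shown are purely illustrative; use the ids actually defined in MODEL_CONFIGS.

# Near the top of scripts/mleb.py. These IDs are hypothetical; the real
# values are whatever is defined in MODEL_CONFIGS in scripts/models.py.
MODEL_IDS = [
    "kanon-2-embedder",        # hypothetical Isaacus model ID
    "text-embedding-3-large",  # hypothetical OpenAI model ID
]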

New models can be registered by appending new MLEBEvaluationModelConfig instances (defined in scripts/structs.py) to the MODEL_CONFIGS list.
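
A new entry might look roughly like the sketch below. The field names are hypothetical; consult the MLEBEvaluationModelConfig definition in scripts/structs.py for the actual fields it accepts.

# In scripts/models.py. Field names are illustrative only; check
# MLEBEvaluationModelConfig in scripts/structs.py for the real signature.
MODEL_CONFIGS.append(
    MLEBEvaluationModelConfig(
        id="my-new-embedder",            # hypothetical model ID
        provider="openai",               # hypothetical provider name
        model="text-embedding-3-small",  # hypothetical upstream model name
    )
)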

Results are written in the mteb format to the results directory.

Running scripts/export.py packs all results into a single JSONL file at results/results.jsonl. That file is used to dynamically present the latest benchmark results on the MLEB website.
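
If you want to inspect the packed results yourself, the JSONL file can be read line by line with the standard library. This is a minimal sketch; the exact per-record schema follows the mteb results format.

import json

# Read the packed benchmark results produced by scripts/export.py.
with open("results/results.jsonl", encoding="utf-8") as f:
    records = [json.loads(line) for line in f if line.strip()]

print(f"Loaded {len(records)} result records.")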

License

This project is licensed under the MIT License.

Citation

@misc{butler2025massivelegalembeddingbenchmark,
      title={The Massive Legal Embedding Benchmark (MLEB)}, 
      author={Umar Butler and Abdur-Rahman Butler and Adrian Lucas Malec},
      year={2025},
      eprint={2510.19365},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.19365}, 
}
