# EvoSLD

EvoSLD uses evolutionary computation to discover and optimize scaling laws across machine-learning scenarios. It sits on top of the OpenEvolve framework and co-evolves both the functional form of scaling laws and their fitting algorithms.
Paper: *EvoSLD: Automated Neural Scaling Law Discovery With Large Language Models*
- Why EvoSLD?
- Features
- What's included (tasks)
- Requirements
- Install
- Quick Start
- Project Layout
- Running Tasks
- Evaluating a Discovered Program
- Add a New Scaling Law
- Configuration Guide
- Data Interface
- Tips for Search/Evolution
- Troubleshooting
- FAQ
- Cite
- Acknowledgments
## Why EvoSLD?

Scaling laws relate performance to factors like model size, dataset size, compute, and architecture. Hand-deriving such laws is time-consuming and often brittle. EvoSLD automates this by:
- Searching for symbolic forms of scaling laws (closed-form functions).
- Co-designing the corresponding fitting/optimization routine.
- Selecting candidates via evolutionary pressure on held-out data.
The result is a practical engine that can rediscover known laws and propose better ones, with explicit code you can inspect and re-use.
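To make this concrete, here is an illustrative sketch of the kind of artifact EvoSLD evolves: a closed-form scaling function paired with its own fitting routine. The functional form below is the well-known Chinchilla-style parameterization, used purely as an example of a known law; EvoSLD searches over such forms and fitting strategies rather than fixing either.

```python
# Illustrative only: one point in the search space EvoSLD explores.
# EvoSLD evolves both the functional form and the fitting code together.
from scipy.optimize import curve_fit

def scaling_law(X, E, A, alpha, B, beta):
    """Chinchilla-style form: loss = E + A / N^alpha + B / D^beta."""
    N, D = X[:, 0], X[:, 1]  # e.g. model parameters, training tokens
    return E + A / N**alpha + B / D**beta

def fit_scaling_law(X, y):
    """A simple least-squares fitting routine for the form above."""
    p0 = [1.0, 1.0, 0.3, 1.0, 0.3]  # rough initial guess for (E, A, alpha, B, beta)
    params, _ = curve_fit(scaling_law, X, y, p0=p0, maxfev=20000)
    return params  # fitted (E, A, alpha, B, beta)
```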
## Features

- End-to-end discovery: Evolves closed-form scaling functions and their bespoke optimizers.
- Multiple domains out of the box:
  - Data-Constrained: How training data affects loss.
  - Domain Mixture: Effect of mixing domains on performance.
  - Learning Rate: Scaling with learning rate & batch size.
  - Mixture of Experts (MoE): Behavior in MoE architectures.
  - Rectified (SFT): Laws for supervised fine-tuning.
  - Vocabulary: Impact of vocabulary size.
- Customizable: All stages (prompting, evolution, evaluation, data) are configurable.
- Checkpoints & reproducibility: Periodic snapshots + seeds for reliable runs.
## What's included (tasks)

| Task key | Config file | Data folder |
|---|---|---|
| `data_constrained_scaling_law` | `configs/data_constrained_scaling_law.yaml` | `data/data_constrained_scaling_law/` |
| `domain_mixture_scaling_law` | `configs/domain_mixture_scaling_law.yaml` | `data/domain_mixture_scaling_law/` |
| `lr_bsz_scaling_law` | `configs/lr_bsz_scaling_law.yaml` | `data/lr_bsz_scaling_law/` |
| `moe_scaling_law` | `configs/moe_scaling_law.yaml` | `data/moe_scaling_law/` |
| `sft_scaling_law` | `configs/sft_scaling_law.yaml` | `data/sft_scaling_law/` |
| `vocab_scaling_law` | `configs/vocab_scaling_law.yaml` | `data/vocab_scaling_law/` |
Add your own tasks in the same pattern; see Add a New Scaling Law.
## Requirements

- Python 3.13+
- The `uv` package manager (recommended)
- An OpenAI-compatible API key (set `OPENAI_API_KEY`)
- macOS/Linux/Windows

Note: `uv run` guarantees commands execute inside a synchronized project environment. If you prefer plain `pip`, you can adapt the commands accordingly.
## Install

```bash
# 1) Clone the repo
git clone <repository-url>
cd evosld

# 2) Install dependencies
uv sync

# 3) Provide your LLM API key
export OPENAI_API_KEY=your_key
# Optional: if using a non-default endpoint
# export OPENAI_BASE_URL=https://your.openai.compatible.endpoint/v1
```

On Windows (PowerShell):

```powershell
$env:OPENAI_API_KEY="your_key"
# $env:OPENAI_BASE_URL="https://your.openai.compatible.endpoint/v1"
```
Alternatively, with plain `pip`:

```bash
# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install dependencies
pip install -U pip
pip install -r requirements.txt  # Or use pyproject.toml

# Set your API key
export OPENAI_API_KEY=your_key
# export OPENAI_BASE_URL=https://your.openai.compatible.endpoint/v1
```
## Quick Start

Run a single discovery task (e.g., Data-Constrained):
```bash
EVAL_TASK_NAME="data_constrained_scaling_law" \
uv run openevolve-run.py \
  --config configs/data_constrained_scaling_law.yaml \
  init_program.py evaluator.py \
  --output results/data_constrained_scaling_law/run_1
```
Or run all tasks in batch:
```bash
# If the script is executable:
uv run scripts/run.sh
# Otherwise:
bash scripts/run.sh
```
## Project Layout

```
evosld/
├── configs/                  # YAML configs (one per scaling law)
│   ├── data_constrained_scaling_law.yaml
│   └── ...
├── data/                     # Data files & loaders
│   ├── {task_name}/          # One folder per task
│   │   ├── data.csv
│   │   └── {task_name}_loader.py
│   └── ...
├── data_loader.py            # Unified data loading interface
├── evaluator.py              # Unified evaluation system
├── init_program.py           # Initial scaling-law template
├── results/                  # Outputs & checkpoints
└── scripts/
    └── run.sh                # Batch execution helper
```
## Running Tasks

Run a single task:

```bash
export EVAL_TASK_NAME="data_constrained_scaling_law"

uv run python openevolve-run.py \
  --config configs/data_constrained_scaling_law.yaml \
  init_program.py evaluator.py \
  --output results/data_constrained_scaling_law/run_1
```
Run all tasks in batch:

```bash
bash scripts/run.sh
```
This will:
- Run each task 3 times with different random seeds.
- Write outputs to `results/{task_name}/run_{1,2,3}/`.
- Save intermediate checkpoints.
- Evaluate and save the best program from each run.
## Evaluating a Discovered Program

```bash
EVAL_TASK_NAME="data_constrained_scaling_law" \
uv run python evaluator.py \
  results/data_constrained_scaling_law/run_1/best/best_program.py
```
## Add a New Scaling Law

Create `configs/your_law_name.yaml` and customize the settings (see the full template in the original README). Key sections include `llm`, `prompt`, `database`, and `evaluator`.
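For orientation, a minimal config might look like the sketch below. It is illustrative only: the key names follow the knobs described in the Configuration Guide further down, but the exact nesting in the real template may differ, so start by copying an existing file from `configs/`.

```yaml
# Illustrative sketch; copy an existing config from configs/ for the
# authoritative structure. All values here are placeholders.
max_iterations: 200
random_seed: 42

llm:
  timeout: 120
  retries: 3

prompt:
  system_message: "Discover a closed-form scaling law for your_law_name."

database:
  population_size: 50
  exploration_ratio: 0.3
  exploitation_ratio: 0.6

evaluator:
  parallel_evaluations: 4
```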
Create a directory for your data and add a `data.csv` file:

```bash
mkdir -p data/your_law_name
```

Your `data.csv` should have columns for features and the target variable.
Add a Python script `data/your_law_name/your_law_name_loader.py` to load your data. It must contain a `load_data_for_task` function that returns a dictionary whose values are `(features, target)` pairs of NumPy arrays (see Data Interface below).
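For example, a minimal loader might look like the following sketch. The column names and the single `"all"` group are hypothetical placeholders, and the exact signature expected by `data_loader.py` may differ, so mirror one of the bundled loaders in `data/`.

```python
# data/your_law_name/your_law_name_loader.py -- a minimal sketch.
# Column names and the single "all" group are hypothetical placeholders.
import os

import pandas as pd

def load_data_for_task():
    """Return {group_name: (features, target)}; see Data Interface below."""
    csv_path = os.path.join(os.path.dirname(__file__), "data.csv")
    df = pd.read_csv(csv_path)
    X = df[["your_feature1", "your_feature2"]].to_numpy(dtype=float)  # shape (N, F)
    y = df["your_target"].to_numpy(dtype=float)                       # shape (N,)
    return {"all": (X, y)}
```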
Add your task to the `TASK_CONFIG` dictionary in `evaluator.py`:
```python
TASK_CONFIG = {
    # ... existing tasks ...
    "your_law_name": {
        "scaling_vars": ["your_feature1", "your_feature2"],
        "response_var": "your_target",
    },
}
```
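As a mental model, task resolution works roughly like the hypothetical snippet below; the real logic inside `evaluator.py` may differ, but it shows why `EVAL_TASK_NAME` must match a `TASK_CONFIG` key (see Troubleshooting).

```python
# Hypothetical sketch of task resolution; the actual code in evaluator.py
# may differ.
import os

from evaluator import TASK_CONFIG  # TASK_CONFIG lives in evaluator.py

task_name = os.environ["EVAL_TASK_NAME"]  # e.g. "your_law_name"
task = TASK_CONFIG[task_name]             # a KeyError here is the "Task Not Found" case
scaling_vars = task["scaling_vars"]       # feature column names
response_var = task["response_var"]       # target column name
```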
Add "your_law_name"
to the tasks
array in scripts/run.sh
to include it in batch runs.
## Configuration Guide

Key knobs to tune in your `.yaml` files:

- Search Budget: Increase `max_iterations` and `population_size` for more thorough exploration.
- Exploration vs. Exploitation: Adjust `exploration_ratio` and `exploitation_ratio`.
- Parallelism: Raise `parallel_evaluations` to speed things up.
- Reproducibility: Set a fixed `random_seed` for consistent results.
- API Resilience: Bump `llm.timeout` and `llm.retries` for flaky networks.
## Data Interface

- Your data loader should return a dictionary whose values are `(features, target)` tuples.
- Features should be a 2D `(N, F)` NumPy array, and the target should be a 1D NumPy array.
- The unified `data_loader.py` dispatches to your task-specific loader based on `EVAL_TASK_NAME`.
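While developing a new loader, a quick smoke test like the hypothetical snippet below can confirm this contract (the import path is an assumption; run it from the repo root and adjust to your task name):

```python
# Hypothetical smoke test for a new loader's output contract.
import importlib

# Assumed module path for the loader created in "Add a New Scaling Law".
loader = importlib.import_module("data.your_law_name.your_law_name_loader")

for group, (X, y) in loader.load_data_for_task().items():
    assert X.ndim == 2, f"{group}: features must be a 2D (N, F) array"
    assert y.ndim == 1, f"{group}: target must be a 1D array"
    assert X.shape[0] == y.shape[0], f"{group}: row counts must match"
    print(f"{group}: X{X.shape}, y{y.shape}")
```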
## Tips for Search/Evolution

- Start with a modest budget and inspect intermediate checkpoints in `results/`.
- If evolution stalls, try increasing `population_size` or the `exploration_ratio`.
- Consider grouping data into different regimes (e.g., compute-limited vs. data-limited) and evaluating on each subset for more nuanced insights; see the sketch after this list.
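As an example of the regime-grouping tip, the hypothetical sketch below splits the data-constrained task's rows into data-limited and compute-limited subsets before evaluating a candidate law on each. The column names and the 20-tokens-per-parameter cutoff are illustrative assumptions, not properties of the bundled dataset.

```python
# Hypothetical regime split; column names ("tokens", "params") and the
# Chinchilla-style 20 tokens/parameter cutoff are illustrative only.
import pandas as pd

df = pd.read_csv("data/data_constrained_scaling_law/data.csv")
data_limited = df["tokens"] < 20 * df["params"]
regimes = {
    "data_limited": df[data_limited],
    "compute_limited": df[~data_limited],
}
for name, subset in regimes.items():
    # Fit/evaluate a candidate scaling law on each subset separately.
    print(name, len(subset))
```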
## Troubleshooting

- Import Errors: Run `uv sync` to ensure your environment is up-to-date.
- Task Not Found: Check that `EVAL_TASK_NAME` matches a key in `TASK_CONFIG` in `evaluator.py`.
- API Timeouts: Increase `llm.timeout` and `llm.retries` in your config, or check your `OPENAI_BASE_URL`.
- Script Not Executable: Run `chmod +x scripts/run.sh` or execute it with `bash scripts/run.sh`.
## FAQ

**Do I have to use OpenAI?**

No. Any OpenAI-compatible endpoint works. Just set the `api_base` in your YAML config or the `OPENAI_BASE_URL` environment variable.

**Can I use `pip` instead of `uv`?**

Yes. Create a virtual environment, activate it, and install dependencies from `requirements.txt`. Then run the Python commands directly.

**Where are the results stored?**

Under `results/{task_name}/{run_id}/`. You'll find checkpoints, logs, and the final `best/best_program.py`.
## Cite

If you use EvoSLD in your academic work, please cite the paper:
```bibtex
@article{lin2025evosld,
  title   = {EvoSLD: Automated Neural Scaling Law Discovery With Large Language Models},
  author  = {Lin, Haowei and others},
  journal = {arXiv preprint arXiv:2507.21184},
  year    = {2025}
}
```
## Acknowledgments

This project is built on the excellent OpenEvolve evolutionary coding framework.