
Question regarding the effective batch size #112

@hank0316

Description

Hi @willccbb,

I have a question about how the effective batch size and the number of global training steps per epoch are computed in the GRPOTrainer.

My setup:

  • 6000 prompts for training
  • num_generations=8
  • gradient_accumulation_steps=8
  • per_device_train_batch_size=8
  • 2 GPUs for training

My current understanding is that the total number of global steps in one epoch can be calculated as:

total_global_steps = (#prompts) * num_generations / effective_batch_size

where:

effective_batch_size = per_device_train_batch_size * num_processes * gradient_accumulation_steps

Plugging in my settings:

total_global_steps = 6000 * 8 / (8 * 2 * 8) = 375
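For concreteness, here is the same arithmetic as a small Python sketch. It encodes my assumption about how batches are formed, not necessarily the trainer's actual internal logic:

```python
# All values come from my config; "effective_batch_size" below is my
# assumed definition, not necessarily what the trainer uses internally.
num_prompts = 6000
num_generations = 8
gradient_accumulation_steps = 8
per_device_train_batch_size = 8
num_processes = 2  # 2 GPUs

effective_batch_size = (
    per_device_train_batch_size * num_processes * gradient_accumulation_steps
)  # 8 * 2 * 8 = 128

total_global_steps = num_prompts * num_generations // effective_batch_size
print(total_global_steps)  # 375, but wandb reports 750 per epoch
```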

However, I noticed that the logs (e.g., on wandb) show that one epoch actually corresponds to 750 global steps in my case, which is double what I expected.

Could you clarify how the effective batch size and the total number of steps per epoch are computed? Am I misunderstanding how batches are constructed in the GRPOTrainer?

Thanks for your help!
