Running nova-lite full fine-tuning throws an exception #51

@ybalbert001

Describe the bug

│ in <module>:3 │
│ │
│ 1 from sagemaker.pytorch import PyTorch │
│ 2 │
│ ❱ 3 estimator = PyTorch( │
│ 4 │ image_uri=image_uri, │
│ 5 │ base_job_name=base_job_name, │
│ 6 │ role=role_arn, │
│ │
│ /opt/conda/lib/python3.12/site-packages/sagemaker/pytorch/estimator.py:368 in __init__ │
│ │
│ 365 │ │ │ │ logger.warning("Argument hyperparameters will be ignored with training r │
│ 366 │ │ │ if distribution is not None: │
│ 367 │ │ │ │ logger.warning("Argument distribution will be ignored with training_reci │
│ ❱ 368 │ │ │ args = self._setup_for_training_recipe( │
│ 369 │ │ │ │ training_recipe, recipe_overrides, source_dir, kwargs │
│ 370 │ │ │ ) │
│ 371 │ │ │ entry_point = args["entry_point"] │
│ │
│ /opt/conda/lib/python3.12/site-packages/sagemaker/pytorch/estimator.py:695 in │
│ _setup_for_training_recipe │
│ │
│ 692 │ │ │ device_type = "cpu" │
│ 693 │ │ │
│ 694 │ │ if "trainer" not in recipe: │
│ ❱ 695 │ │ │ raise ValueError("Supplied recipe does not contain required field trainer.") │
│ 696 │ │ if "instance_count" in kwargs and "num_nodes" in recipe["trainer"]: │
│ 697 │ │ │ logger.warning( │
│ 698 │ │ │ │ "Using instance_count argument to estimator to set number " │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Supplied recipe does not contain required field trainer.
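
For anyone hitting the same error: the traceback shows the SDK raising when `"trainer" not in recipe` is true for the loaded recipe (estimator.py, line 694). Below is a minimal sketch of that same membership check, which could be run on an already-loaded recipe before constructing the estimator. The helper name `missing_recipe_fields` and the sample dictionaries are hypothetical and are not part of the SageMaker SDK; the only field name taken from the traceback is `trainer`.

```python
def missing_recipe_fields(recipe, required=("trainer",)):
    """Return the required top-level keys that are absent from a loaded recipe.

    `recipe` is expected to be the dict obtained from parsing the recipe file.
    Anything that is not a dict is treated as missing every required key.
    """
    if not isinstance(recipe, dict):
        return list(required)
    return [key for key in required if key not in recipe]


# Hypothetical recipes, for illustration only (not real recipe contents).
recipe_without_trainer = {"run": {"name": "demo"}}
recipe_with_trainer = {"trainer": {"num_nodes": 1}}

for name, recipe in [
    ("recipe_without_trainer", recipe_without_trainer),
    ("recipe_with_trainer", recipe_with_trainer),
]:
    missing = missing_recipe_fields(recipe)
    if missing:
        print(f"{name}: missing top-level field(s): {', '.join(missing)}")
    else:
        print(f"{name}: all required top-level fields present")
```

If the check reports `trainer` as missing for the recipe used here, the recipe has a top-level layout that this installed SDK version does not accept, which matches the `ValueError` above.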

How to Reproduce?

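Follow the documentation for nova-lite full fine-tuning. The error is raised as soon as the PyTorch estimator is constructed, before any training job is submitted.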

Expected behavior

The PyTorch training job starts normally.

Screenshots, error messages or logs

Image

System information

A description of your system. Please provide:

  • Docker image you ran against:
  • Source code version you ran against:
  • Python version:
  • Hardware accelerator used:


Labels: bug
