Commit 48d0239

pintaoz-aws and pintaoz authored
Update HyperPodPytorchJob (#52)
* Add HyperpodPytorchJob
* update to class methods
* update to class methods
* Address feedback
* Fix bug
* Update HyperPodPytorchJob
* Fix dependency
* Add status
* Add list_pods and get_logs_from_pod
* Add error handling and metadata
* Add example notebook
* Fix bug

---------

Co-authored-by: pintaoz <[email protected]>
1 parent cebe4c2 commit 48d0239

6 files changed: +4489 -96 lines changed
sagemaker-hyperpod/examples/training_sdk_example.ipynb

Lines changed: 1235 additions & 0 deletions
Large diffs are not rendered by default.

sagemaker-hyperpod/src/sagemaker/hyperpod/inference/config/common.py

Lines changed: 3 additions & 3 deletions
@@ -1,15 +1,15 @@
-from typing import Dict, List, Optional, Self
+from typing import Dict, List, Optional
 from pydantic import Field, ConfigDict, BaseModel


 class Metadata(BaseModel):
     """Metadata class"""

-    name: Optional[str] = Field(
+    name: str = Field(
         description="Name must match the name of one entry in pod.spec.resourceClaims of the Pod where this field is used. It makes that resource available inside a container."
     )
     namespace: Optional[str] = Field(
-        default=None,
+        default="default",
         description="Name must match the name of one entry in pod.spec.resourceClaims of the Pod where this field is used. It makes that resource available inside a container.",
     )
     labels: Optional[Dict[str, str]] = Field(
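
Note on the hunk above: `name` changes from `Optional[str]` to `str`, and `namespace` now defaults to `"default"` instead of `None`. The sketch below illustrates the likely effect under pydantic v2 (which the `ConfigDict` import implies). The class is a trimmed-down stand-in: the short descriptions are placeholders, and the `labels` field is omitted because its definition is cut off in the diff.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Metadata(BaseModel):
    """Trimmed-down stand-in for the Metadata model in the diff (assumption)."""

    # No default and a non-Optional type: the field is required and rejects None.
    name: str = Field(description="placeholder description")
    # Omitting namespace now yields "default" rather than None.
    namespace: Optional[str] = Field(
        default="default",
        description="placeholder description",
    )


meta = Metadata(name="demo-job")
print(meta.namespace)

# Omitting name fails validation.
try:
    Metadata()
    missing_rejected = False
except ValidationError:
    missing_rejected = True

# Passing None for name also fails now that the type is plain str.
try:
    Metadata(name=None)
    none_rejected = False
except ValidationError:
    none_rejected = True

print(missing_rejected, none_rejected)
```

In pydantic v2 an `Optional[str]` field without a default is already required, so the practical difference for `name` is that an explicit `None` is no longer accepted.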

sagemaker-hyperpod/src/sagemaker/hyperpod/training/config/hyperpod_pytorch_job_config.py

Lines changed: 2 additions & 2 deletions
@@ -2957,8 +2957,8 @@ class RunPolicy(BaseModel):
     )


-class HyperPodPytorchJobSpec(BaseModel):
-    """HyperPodPytorchJobSpec defines the desired state of HyperPodPytorchJob"""
+class _HyperPodPytorchJob(BaseModel):
+    """Config defines the desired state of HyperPodPytorchJob"""

     model_config = ConfigDict(extra="forbid")

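
Note on the hunk above: the renamed class keeps `model_config = ConfigDict(extra="forbid")`, which makes pydantic v2 reject unknown keyword arguments instead of silently ignoring them. The following is a minimal sketch of that behavior. `JobConfig` and its `replicas` field are hypothetical and are not taken from `_HyperPodPytorchJob`.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class JobConfig(BaseModel):
    """Hypothetical model; the field below is not from the real config class."""

    model_config = ConfigDict(extra="forbid")

    replicas: int = 1


# A known field is accepted as usual.
ok = JobConfig(replicas=2)

# A misspelled (unknown) field raises instead of being dropped.
try:
    JobConfig(replicas=2, replica=3)
    rejected = False
    error_type = None
except ValidationError as exc:
    rejected = True
    error_type = exc.errors()[0]["type"]

print(ok.replicas, rejected, error_type)
```

With `extra="forbid"`, typos in field names surface at construction time, which is useful for a config class that mirrors a strict Kubernetes resource schema.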
