Commit 2ccd95a

Authored by Aditi2424, adishaa, maheshxb, jiayelamazon

Documentation (#154)

* Update telemetry status to be Integer for parity (#130)
* Release new version for Health Monitoring Agent (1.0.643.0_1.0.192.0) with minor improvements and bug fixes (#137)
* Release new version for Health Monitoring Agent (1.0.674.0_1.0.199.0) with minor improvements and bug fixes (#139)
* Documentation working setup
* Training and inference documentation changes
* Add more inference examples
* UI changes for documentation
* Change to tabbed view for CLI and SDK
* Change to tabbed view for the getting started page
* Clean up custom CSS
* Fix inference SDK create commands

Co-authored-by: adishaa <[email protected]>
Co-authored-by: maheshxb <[email protected]>
Co-authored-by: jiayelamazon <[email protected]>
1 parent c8bf891 commit 2ccd95a

File tree

3 files changed: +82 −52 lines


doc/conf.py

Lines changed: 12 additions & 0 deletions
```diff
@@ -1,3 +1,15 @@
+# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"). You
+# may not use this file except in compliance with the License. A copy of
+# the License is located at
+#
+#     http://aws.amazon.com/apache2.0/
+#
+# or in the "license" file accompanying this file. This file is
+# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
+# ANY KIND, either express or implied. See the License for the specific
+# language governing permissions and limitations under the License.
 """Sphinx configuration."""
 
 import datetime
```

doc/inference.md

Lines changed: 68 additions & 31 deletions
``````diff
@@ -35,18 +35,30 @@ hyp create hyp-jumpstart-endpoint \
 
 ````{tab-item} SDK
 ```python
-from sagemaker.hyperpod.inference import HyperPodJumpstartEndpoint
+from sagemaker.hyperpod.inference.config.hp_jumpstart_endpoint_config import Model, Server, SageMakerEndpoint, TlsConfig
+from sagemaker.hyperpod.inference.hp_jumpstart_endpoint import HPJumpStartEndpoint
 
-# Create a JumpStart endpoint
-endpoint = HyperPodJumpstartEndpoint(
-    endpoint_name="endpoint-jumpstart",
-    model_id="jumpstart-model-id",
-    instance_type="ml.g5.8xlarge",
-    tls_output_s3_uri="s3://sample-bucket"
+model = Model(
+    model_id="deepseek-llm-r1-distill-qwen-1-5b",
+    model_version="2.0.4"
+)
+
+server = Server(
+    instance_type="ml.g5.8xlarge"
 )
 
-# Deploy the endpoint
-endpoint.create()
+endpoint_name = SageMakerEndpoint(name="endpoint-jumpstart")
+
+tls_config = TlsConfig(tls_certificate_output_s3_uri="s3://sample-bucket")
+
+js_endpoint = HPJumpStartEndpoint(
+    model=model,
+    server=server,
+    sage_maker_endpoint=endpoint_name,
+    tls_config=tls_config
+)
+
+js_endpoint.create()
 ```
 ````
 `````
@@ -68,19 +80,51 @@ hyp create hyp-custom-endpoint \
 
 ````{tab-item} SDK
 ```python
-from sagemaker.hyperpod.inference import HyperPodCustomEndpoint
+from sagemaker.hyperpod.inference.config.hp_custom_endpoint_config import Model, Server, SageMakerEndpoint, TlsConfig, EnvironmentVariables
+from sagemaker.hyperpod.inference.hp_custom_endpoint import HPCustomEndpoint
+
+model = Model(
+    model_source_type="s3",
+    model_location="test-pytorch-job/model.tar.gz",
+    s3_bucket_name="my-bucket",
+    s3_region="us-east-2",
+    prefetch_enabled=True
+)
 
-# Create a custom endpoint
-endpoint = HyperPodCustomEndpoint(
-    endpoint_name="endpoint-custom",
-    model_uri="s3://my-bucket/model-artifacts",
-    image="123456789012.dkr.ecr.us-west-2.amazonaws.com/my-inference-image:latest",
+server = Server(
     instance_type="ml.g5.8xlarge",
-    tls_output_s3_uri="s3://sample-bucket"
+    image_uri="763104351884.dkr.ecr.us-east-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.4.0-tgi2.3.1-gpu-py311-cu124-ubuntu22.04-v2.0",
+    container_port=8080,
+    model_volume_mount_name="model-weights"
+)
+
+resources = {
+    "requests": {"cpu": "30000m", "nvidia.com/gpu": 1, "memory": "100Gi"},
+    "limits": {"nvidia.com/gpu": 1}
+}
+
+env = EnvironmentVariables(
+    HF_MODEL_ID="/opt/ml/model",
+    SAGEMAKER_PROGRAM="inference.py",
+    SAGEMAKER_SUBMIT_DIRECTORY="/opt/ml/model/code",
+    MODEL_CACHE_ROOT="/opt/ml/model",
+    SAGEMAKER_ENV="1"
+)
+
+endpoint_name = SageMakerEndpoint(name="endpoint-custom-pytorch")
+
+tls_config = TlsConfig(tls_certificate_output_s3_uri="s3://sample-bucket")
+
+custom_endpoint = HPCustomEndpoint(
+    model=model,
+    server=server,
+    resources=resources,
+    environment=env,
+    sage_maker_endpoint=endpoint_name,
+    tls_config=tls_config,
 )
 
-# Deploy the endpoint
-endpoint.create()
+custom_endpoint.create()
 ```
 ````
 `````
@@ -113,14 +157,15 @@ hyp list hyp-custom-endpoint
 
 ````{tab-item} SDK
 ```python
-from sagemaker.hyperpod.inference import HyperPodJumpstartEndpoint, HyperPodCustomEndpoint
+from sagemaker.hyperpod.inference.hp_jumpstart_endpoint import HPJumpStartEndpoint
+from sagemaker.hyperpod.inference.hp_custom_endpoint import HPCustomEndpoint
 
 # List JumpStart endpoints
-jumpstart_endpoints = HyperPodJumpstartEndpoint.list()
+jumpstart_endpoints = HPJumpStartEndpoint.list()
 print(jumpstart_endpoints)
 
 # List custom endpoints
-custom_endpoints = HyperPodCustomEndpoint.list()
+custom_endpoints = HPCustomEndpoint.list()
 print(custom_endpoints)
 ```
 ````
@@ -171,16 +216,8 @@ hyp invoke hyp-custom-endpoint \
 
 ````{tab-item} SDK
 ```python
-from sagemaker.hyperpod.inference import HyperPodCustomEndpoint
-
-# Load the endpoint
-endpoint = HyperPodCustomEndpoint.load(endpoint_name="endpoint-custom")
-
-# Invoke the endpoint
-response = endpoint.invoke(
-    payload={"inputs": "What is machine learning?"},
-    content_type="application/json"
-)
+data = '{"inputs":"What is the capital of USA?"}'
+response = endpoint.invoke(body=data).body.read()
 print(response)
 ```
 ````
``````

doc/installation.md

Lines changed: 2 additions & 21 deletions
````diff
@@ -1,7 +1,5 @@
 (installation)=
-
 # Installation
-
 This guide provides installation instructions for the SageMaker HyperPod CLI and SDK.
 
 ## System Requirements
@@ -18,9 +16,7 @@ This guide provides installation instructions for the SageMaker HyperPod CLI and
 - PyTorch (version ≥ 1.10)
 
 ### Supported Python Versions
-- 3.9
-- 3.10
-- 3.11
+- 3.9 and above
 
 ## Prerequisites
 
@@ -38,7 +34,7 @@ To enable this, install the **SageMaker Inference Operator**.
 
 ## Installation Options
 
-### Option 1: Install from PyPI
+### Install from PyPI
 
 You can install the SageMaker HyperPod CLI and SDK directly using `pip`:
 
@@ -53,18 +49,3 @@ To verify that the installation was successful, run:
 # Verify CLI installation
 hyp --help
 ```
-
-### Option 2: Install from Source
-
-Clone the GitHub repository and install the CLI from source:
-
-```bash
-# Clone the repository
-git clone https://github.com/aws/sagemaker-hyperpod-cli.git
-
-# Change to the repository directory
-cd sagemaker-hyperpod-cli
-
-# Install using pip
-pip install .
-```
````
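The diff tightens the supported versions to "3.9 and above". A quick pre-install check of that requirement (a sketch; assumes `python3` is on `PATH`):

```shell
# Verify the interpreter meets the documented minimum (Python 3.9+)
# before running pip install.
if python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 9) else 1)'; then
    echo "Python version OK"
else
    echo "Python 3.9 or newer is required" >&2
fi
```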
