This repository was archived by the owner on Oct 16, 2025. It is now read-only.

Commit 43d77b1

Merge pull request #34 from pamelafox/port-openai
Port to OpenAI vs Azure OpenAI, port infra
2 parents 8da5c5e + 2bbbe50 commit 43d77b1

68 files changed: +188286 additions, -243736 deletions

.env.sample

Lines changed: 2 additions & 3 deletions

@@ -1,9 +1,8 @@
 # API_HOST can be either azure, ollama, openai, or github:
 API_HOST=azure
 # Needed for Azure:
-AZURE_OPENAI_ENDPOINT=https://YOUR-AZURE-OPENAI-SERVICE-NAME.openai.azure.com
-AZURE_OPENAI_DEPLOYMENT=YOUR-AZURE-DEPLOYMENT-NAME
-AZURE_OPENAI_VERSION=2024-03-01-preview
+AZURE_OPENAI_ENDPOINT=https://YOUR-AZURE-OPENAI-SERVICE-NAME.openai.azure.com/openai/v1
+AZURE_OPENAI_CHAT_DEPLOYMENT=YOUR-AZURE-DEPLOYMENT-NAME
 # Needed for Ollama:
 OLLAMA_ENDPOINT=http://localhost:11434/v1
 OLLAMA_MODEL=llama3.1
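Every script in this commit reads these variables through python-dotenv and then branches on `API_HOST`, as the per-file diffs below show. A condensed sketch of that pattern, taken from the scripts further down (Ollama branch shown):

```python
# How the scripts consume .env: load it with python-dotenv, then branch on
# API_HOST to build the matching OpenAI client.
import os

import openai
from dotenv import load_dotenv

load_dotenv(override=True)
API_HOST = os.getenv("API_HOST", "github")

if API_HOST == "ollama":
    client = openai.OpenAI(base_url=os.environ["OLLAMA_ENDPOINT"], api_key="nokeyneeded")
    MODEL_NAME = os.environ["OLLAMA_MODEL"]
```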

.env.sample.azure

Lines changed: 2 additions & 3 deletions

@@ -1,5 +1,4 @@
 # See .env.sample for all options
 API_HOST=azure
-AZURE_OPENAI_ENDPOINT=https://YOUR-AZURE-OPENAI-SERVICE-NAME.openai.azure.com
-AZURE_OPENAI_DEPLOYMENT=YOUR-AZURE-DEPLOYMENT-NAME
-AZURE_OPENAI_VERSION=2024-03-01-preview
+AZURE_OPENAI_ENDPOINT=https://YOUR-AZURE-OPENAI-SERVICE-NAME.openai.azure.com/openai/v1
+AZURE_OPENAI_CHAT_DEPLOYMENT=YOUR-AZURE-DEPLOYMENT-NAME

README.md

Lines changed: 108 additions & 40 deletions

@@ -2,9 +2,25 @@
 
 This repository contains a collection of Python scripts that demonstrate how to use the OpenAI API to generate chat completions.
 
-## OpenAI package
-
-These scripts use the OpenAI package to demonstrate how to use the OpenAI API.
+* [Examples](#examples)
+* [OpenAI Chat Completions](#openai-chat-completions)
+* [Popular LLM libraries](#popular-llm-libraries)
+* [Function calling](#function-calling)
+* [Structured outputs](#structured-outputs)
+* [Retrieval-Augmented Generation (RAG)](#retrieval-augmented-generation-rag)
+* [Setting up the Python environment](#setting-up-the-python-environment)
+* [Configuring the OpenAI environment variables](#configuring-the-openai-environment-variables)
+* [Using GitHub Models](#using-github-models)
+* [Using Azure OpenAI models](#using-azure-openai-models)
+* [Using OpenAI.com models](#using-openaicom-models)
+* [Using Ollama models](#using-ollama-models)
+* [Resources](#resources)
+
+## Examples
+
+### OpenAI Chat Completions
+
+These scripts use the openai Python package to demonstrate how to use the OpenAI Chat Completions API.
 In increasing order of complexity, the scripts are:
 
 1. [`chat.py`](./chat.py): A simple script that demonstrates how to use the OpenAI API to generate chat completions.
@@ -17,15 +33,22 @@ Plus these scripts to demonstrate additional features:
 * [`chat_safety.py`](./chat_safety.py): The simple script with exception handling for Azure AI Content Safety filter errors.
 * [`chat_async.py`](./chat_async.py): Uses the async clients to make asynchronous calls, including an example of sending off multiple requests at once using `asyncio.gather`.
 
-## Popular LLM libraries
+### Function calling
+
+These scripts demonstrate using the Chat Completions API "tools" (a.k.a. function calling) feature, which lets the model decide when to call developer-defined functions and return structured arguments instead of (or before) a natural language answer.
+
+In all of these examples, a list of functions is declared in the `tools` parameter. The model may respond with `message.tool_calls` containing one or more tool calls. Each tool call includes the function `name` and a JSON string of `arguments` that match the declared schema. Your application is responsible for: (1) detecting tool calls, (2) executing the corresponding local / external logic, and (3) (optionally) sending the tool result back to the model for a final answer.
 
-These scripts use popular LLM libraries to demonstrate how to use the OpenAI API with them:
+Scripts (in increasing order of capability):
 
-* [`chat_langchain.py`](./chat_langchain.py): Uses the Langchain package to generate chat completions. [Learn more from Langchain docs](https://python.langchain.com/docs/get_started/quickstart)
-* [`chat_llamaindex.py`](./chat_llamaindex.py): Uses the LlamaIndex package to generate chat completions. [Learn more from LlamaIndex docs](https://docs.llamaindex.ai/en/stable/)
-* [`chat_pydanticai.py`](./chat_pydanticai.py): Uses the PydanticAI package to generate chat completions. [Learn more from PydanticAI docs](https://ai.pydantic.dev/)
+1. [`function_calling_basic.py`](./function_calling_basic.py): Declares a single `lookup_weather` function and prompts the model. It prints the tool call (if any) or falls back to the model's normal content. No actual function execution occurs.
+2. [`function_calling_call.py`](./function_calling_call.py): Executes the `lookup_weather` function if the model requests it by parsing the returned arguments JSON and calling the local Python function.
+3. [`function_calling_extended.py`](./function_calling_extended.py): Shows a full round-trip: after executing the function, it appends a `tool` role message containing the function result and asks the model again so it can incorporate real data into a final user-facing response.
+4. [`function_calling_multiple.py`](./function_calling_multiple.py): Exposes multiple functions (`lookup_weather`, `lookup_movies`) so you can see how the model chooses among them and how multiple tool calls could be returned.
 
-## Retrieval-Augmented Generation (RAG)
+You must use a model that supports function calling (such as the defaults `gpt-4o`, `gpt-4o-mini`, etc.). Some local or older models may not support the `tools` parameter.
+
+### Retrieval-Augmented Generation (RAG)
 
 These scripts demonstrate how to use the OpenAI API for Retrieval-Augmented Generation (RAG) tasks, where the model retrieves relevant information from a source and uses it to generate a response.
 
@@ -44,7 +67,7 @@ Then run the scripts (in order of increasing complexity):
 * [`rag_documents_flow.py`](./rag_pdfs.py): A RAG flow that retrieves matching results from the local JSON file created by `rag_documents_ingestion.py`.
 * [`rag_documents_hybrid.py`](./rag_documents_hybrid.py): A RAG flow that implements a hybrid retrieval with both vector and keyword search, merging with Reciprocal Rank Fusion (RRF), and semantic re-ranking with a cross-encoder model.
 
-## Structured outputs with OpenAI
+## Structured outputs
 
 These scripts demonstrate how to use the OpenAI API to generate structured responses using Pydantic data models:
 
@@ -54,7 +77,7 @@ These scripts demonstrate how to use the OpenAI API to generate structured respo
 * [`structured_outputs_function_calling.py`](./structured_outputs_function_calling.py): Demonstrates how to use functions defined with Pydantic for automatic function calling based on user queries.
 * [`structured_outputs_nested.py`](./structured_outputs_nested.py): Uses nested Pydantic models to handle more complex structured responses, such as events with participants having multiple attributes.
 
-## Setting up the environment
+## Setting up the Python environment
 
 If you open this up in a Dev Container or GitHub Codespaces, everything will be setup for you.
 If not, follow these steps:
@@ -70,61 +93,106 @@ python -m pip install -r requirements.txt
 ## Configuring the OpenAI environment variables
 
 These scripts can be run with Azure OpenAI account, OpenAI.com, local Ollama server, or GitHub models,
-depending on the environment variables you set.
+depending on the environment variables you set. All the scripts reference the environment variables from a `.env` file, and an example `.env.sample` file is provided. Host-specific instructions are below.
 
-1. Copy the `.env.sample` file to a new file called `.env`:
+## Using GitHub Models
 
-    ```bash
-    cp .env.sample .env
+If you open this repository in GitHub Codespaces, you can run the scripts for free using GitHub Models without any additional steps, as your `GITHUB_TOKEN` is already configured in the Codespaces environment.
+
+If you want to run the scripts locally, you need to set up the `GITHUB_TOKEN` environment variable with a GitHub [personal access token (PAT)](https://github.com/settings/tokens). You can create a PAT by following these steps:
+
+1. Go to your GitHub account settings.
+2. Click on "Developer settings" in the left sidebar.
+3. Click on "Personal access tokens" in the left sidebar.
+4. Click on "Tokens (classic)" or "Fine-grained tokens" depending on your preference.
+5. Click on "Generate new token".
+6. Give your token a name and select the scopes you want to grant. For this project, you don't need any specific scopes.
+7. Click on "Generate token".
+8. Copy the generated token.
+9. Set the `GITHUB_TOKEN` environment variable in your terminal or IDE:
+
+    ```shell
+    export GITHUB_TOKEN=your_personal_access_token
     ```
 
-2. For Azure OpenAI, create an Azure OpenAI gpt-3.5 or gpt-4 deployment (perhaps using [this template](https://github.com/Azure-Samples/azure-openai-keyless)), and customize the `.env` file with your Azure OpenAI endpoint and deployment id.
+10. Optionally, you can use a model other than "gpt-4o" by setting the `GITHUB_MODEL` environment variable. Use a model that supports function calling, such as: `gpt-4o`, `gpt-4o-mini`, `o3-mini`, `AI21-Jamba-1.5-Large`, `AI21-Jamba-1.5-Mini`, `Codestral-2501`, `Cohere-command-r`, `Ministral-3B`, `Mistral-Large-2411`, `Mistral-Nemo`, `Mistral-small`
 
-    ```bash
-    API_HOST=azure
-    AZURE_OPENAI_ENDPOINT=https://YOUR-AZURE-OPENAI-SERVICE-NAME.openai.azure.com
-    AZURE_OPENAI_DEPLOYMENT=YOUR-AZURE-DEPLOYMENT-NAME
-    AZURE_OPENAI_VERSION=2024-03-01-preview
+## Using Azure OpenAI models
+
+You can run all examples in this repository using GitHub Models. If you want to run the examples using models from Azure OpenAI instead, you need to provision the Azure AI resources, which will incur costs.
+
+This project includes infrastructure as code (IaC) to provision Azure OpenAI deployments of "gpt-4o" and "text-embedding-3-large". The IaC is defined in the `infra` directory and uses the Azure Developer CLI to provision the resources.
+
+1. Make sure the [Azure Developer CLI (azd)](https://aka.ms/install-azd) is installed.
+
+2. Login to Azure:
+
+    ```shell
+    azd auth login
+    ```
+
+    For GitHub Codespaces users, if the previous command fails, try:
+
+    ```shell
+    azd auth login --use-device-code
+    ```
+
+3. Provision the OpenAI account:
+
+    ```shell
+    azd provision
     ```
 
-    If you are not yet logged into the Azure account associated with that deployment, run this command to log in:
+    It will prompt you to provide an `azd` environment name (like "agents-demos"), select a subscription from your Azure account, and select a location. Then it will provision the resources in your account.
+
+4. Once the resources are provisioned, you should now see a local `.env` file with all the environment variables needed to run the scripts.
+5. To delete the resources, run:
 
     ```shell
-    az login
+    azd down
     ```
 
-3. For OpenAI.com, customize the `.env` file with your OpenAI API key and desired model name.
+
+## Using OpenAI.com models
+
+1. Create a `.env` file by copying the `.env.sample` file and updating it with your OpenAI API key and desired model name.
 
     ```bash
-    API_HOST=openai
-    OPENAI_KEY=YOUR-OPENAI-API-KEY
-    OPENAI_MODEL=gpt-3.5-turbo
+    cp .env.sample .env
    ```
 
-4. For Ollama, customize the `.env` file with your Ollama endpoint and model name (any model you've pulled).
+2. Update the `.env` file with your OpenAI API key and desired model name:
 
    ```bash
-    API_HOST=ollama
-    OLLAMA_ENDPOINT=http://localhost:11434/v1
-    OLLAMA_MODEL=llama2
+    API_HOST=openai
+    OPENAI_API_KEY=your_openai_api_key
+    OPENAI_MODEL=gpt-4o-mini
    ```
 
-    If you're running inside the Dev Container, replace `localhost` with `host.docker.internal`.
+## Using Ollama models
 
-5. For GitHub models, customize the `.env` file with your GitHub model name.
+1. Install [Ollama](https://ollama.com/) and follow the instructions to set it up on your local machine.
+2. Pull a model, for example:
+
+    ```shell
+    ollama pull llama3.1
+    ```
+
+3. Create a `.env` file by copying the `.env.sample` file and updating it with your Ollama endpoint and model name.
 
    ```bash
-    API_HOST=github
-    GITHUB_MODEL=gpt-4o
+    cp .env.sample .env
    ```
 
-    You'll need a `GITHUB_TOKEN` environment variable that stores a GitHub personal access token.
-    If you're running this inside a GitHub Codespace, the token will be automatically available.
-    If not, generate a new [personal access token](https://github.com/settings/tokens) and run this command to set the `GITHUB_TOKEN` environment variable:
+4. Update the `.env` file with your Ollama endpoint and model name (any model you've pulled):
 
-    ```shell
-    export GITHUB_TOKEN="<your-github-token-goes-here>"
+    ```bash
+    API_HOST=ollama
+    OLLAMA_ENDPOINT=http://localhost:11434/v1
+    OLLAMA_MODEL=llama3.1
+    ```
 
 ## Resources
 
+* [Upcoming October 2025 series: Python + AI](https://aka.ms/PythonAI/series)
 * [Video series: Learn Python + AI](https://techcommunity.microsoft.com/blog/EducatorDeveloperBlog/learn-python--ai-from-our-video-series/4400393)
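The function-calling flow the new README section describes can be condensed into a short sketch. This assumes a configured `client` and `MODEL_NAME` as set up in the scripts in this commit; `lookup_weather` here is a stand-in for the repo's actual helper:

```python
# Sketch of the tools round trip: declare a function schema, detect the
# model's tool call, execute it locally, and send the result back.
import json

def lookup_weather(city: str) -> str:
    return f"It is sunny in {city}."  # stand-in for a real lookup

tools = [
    {
        "type": "function",
        "function": {
            "name": "lookup_weather",
            "description": "Look up the weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]
response = client.chat.completions.create(model=MODEL_NAME, messages=messages, tools=tools)
message = response.choices[0].message

if message.tool_calls:  # (1) detect the tool call
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = lookup_weather(**args)  # (2) execute the local logic
    messages.append(message)
    messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    # (3) ask the model again so it can use the real data in its final answer
    followup = client.chat.completions.create(model=MODEL_NAME, messages=messages, tools=tools)
    print(followup.choices[0].message.content)
else:
    print(message.content)
```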

azure.yaml

Lines changed: 17 additions & 0 deletions

@@ -0,0 +1,17 @@
+# yaml-language-server: $schema=https://raw.githubusercontent.com/Azure/azure-dev/main/schemas/v1.0/azure.yaml.json
+
+name: python-ai-agent-frameworks-demos
+metadata:
+
+hooks:
+  postprovision:
+    windows:
+      shell: pwsh
+      run: ./infra/write_dot_env.ps1
+      interactive: false
+      continueOnError: false
+    posix:
+      shell: sh
+      run: ./infra/write_dot_env.sh
+      interactive: false
+      continueOnError: false
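The `postprovision` hooks write the provisioned endpoint and deployment names into a local `.env` file after `azd provision` runs. The real scripts are PowerShell and POSIX shell under `infra/`; purely as an illustration, a hypothetical Python equivalent might shell out to `azd env get-values`, which prints `KEY="value"` lines:

```python
# Hypothetical stand-in for infra/write_dot_env.sh: dump the azd environment
# values into a .env file after provisioning. Not part of the actual commit.
import subprocess

values = subprocess.run(
    ["azd", "env", "get-values"], capture_output=True, text=True, check=True
).stdout
with open(".env", "w") as env_file:
    env_file.write(values)
```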

chained_calls.py

Lines changed: 4 additions & 5 deletions

@@ -12,12 +12,11 @@
     token_provider = azure.identity.get_bearer_token_provider(
         azure.identity.DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
     )
-    client = openai.AzureOpenAI(
-        api_version=os.environ["AZURE_OPENAI_VERSION"],
-        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
-        azure_ad_token_provider=token_provider,
+    client = openai.OpenAI(
+        base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
+        api_key=token_provider,
     )
-    MODEL_NAME = os.environ["AZURE_OPENAI_DEPLOYMENT"]
+    MODEL_NAME = os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"]
 
 elif API_HOST == "ollama":
     client = openai.OpenAI(base_url=os.environ["OLLAMA_ENDPOINT"], api_key="nokeyneeded")
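This diff, repeated across the scripts below, is the core of the port: the Azure-specific `openai.AzureOpenAI` client and its `api_version` argument are replaced by the plain `openai.OpenAI` client pointed at the `/openai/v1` endpoint, with the Entra ID token provider passed as `api_key`. A standalone sketch of the keyless pattern, assuming an `openai` package version that accepts a callable token provider for `api_key`:

```python
# Keyless (Entra ID) auth against the OpenAI-compatible Azure v1 endpoint.
import os

import azure.identity
import openai

# Zero-argument callable that returns a fresh bearer token when invoked.
token_provider = azure.identity.get_bearer_token_provider(
    azure.identity.DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = openai.OpenAI(
    base_url=os.environ["AZURE_OPENAI_ENDPOINT"],  # https://...openai.azure.com/openai/v1
    api_key=token_provider,  # token provider instead of a static key
)

response = client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],
    messages=[{"role": "user", "content": "Say hello!"}],
)
print(response.choices[0].message.content)
```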

chat.py

Lines changed: 4 additions & 5 deletions

@@ -12,12 +12,11 @@
     token_provider = azure.identity.get_bearer_token_provider(
         azure.identity.DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
     )
-    client = openai.AzureOpenAI(
-        api_version=os.environ["AZURE_OPENAI_VERSION"],
-        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
-        azure_ad_token_provider=token_provider,
+    client = openai.OpenAI(
+        base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
+        api_key=token_provider,
     )
-    MODEL_NAME = os.environ["AZURE_OPENAI_DEPLOYMENT"]
+    MODEL_NAME = os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"]
 
 elif API_HOST == "ollama":
     client = openai.OpenAI(base_url=os.environ["OLLAMA_ENDPOINT"], api_key="nokeyneeded")

chat_async.py

Lines changed: 29 additions & 11 deletions

@@ -1,24 +1,25 @@
 import asyncio
 import os
 
-import azure.identity
+import azure.identity.aio
 import openai
 from dotenv import load_dotenv
 
 # Setup the OpenAI client to use either Azure, OpenAI.com, or Ollama API
 load_dotenv(override=True)
 API_HOST = os.getenv("API_HOST", "github")
 
+azure_credential = None  # Will hold the Azure credential so we can close it properly.
 if API_HOST == "azure":
-    token_provider = azure.identity.get_bearer_token_provider(
-        azure.identity.DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
+    azure_credential = azure.identity.aio.DefaultAzureCredential()
+    token_provider = azure.identity.aio.get_bearer_token_provider(
+        azure_credential, "https://cognitiveservices.azure.com/.default"
     )
-    client = openai.AsyncAzureOpenAI(
-        api_version=os.environ["AZURE_OPENAI_VERSION"],
-        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
-        azure_ad_token_provider=token_provider,
+    client = openai.AsyncOpenAI(
+        base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
+        api_key=token_provider,
     )
-    MODEL_NAME = os.environ["AZURE_OPENAI_DEPLOYMENT"]
+    MODEL_NAME = os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"]
 elif API_HOST == "ollama":
     client = openai.AsyncOpenAI(base_url=os.environ["OLLAMA_ENDPOINT"], api_key="nokeyneeded")
     MODEL_NAME = os.environ["OLLAMA_MODEL"]
@@ -52,11 +53,13 @@ async def generate_response(location):
     return response.choices[0].message.content
 
 
-async def single():
+async def single() -> None:
+    """Run a single request example and handle cleanup."""
     print(await generate_response("Tokyo"))
 
 
-async def multiple():
+async def multiple() -> None:
+    """Run multiple requests concurrently and handle cleanup."""
     answers = await asyncio.gather(
         generate_response("Tokyo"),
         generate_response("Berkeley"),
@@ -66,4 +69,19 @@ async def multiple():
         print(answer, "\n")
 
 
-asyncio.run(single())
+async def close_clients() -> None:
+    """Close the OpenAI async client and (if applicable) the Azure credential."""
+    await client.close()
+    if azure_credential is not None:
+        await azure_credential.close()
+
+
+async def main():
+    try:
+        await single()  # Change to await multiple() to run multiple requests concurrently
+    finally:
+        await close_clients()
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
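The pattern above, condensed: `asyncio.gather` fans out several completions at once, and a `finally` block guarantees the async client (and credential) get closed. A minimal sketch assuming the `client` and `MODEL_NAME` configured in the diff:

```python
# Fire off several completions concurrently with asyncio.gather, then clean up.
import asyncio

async def ask(question: str) -> str:
    response = await client.chat.completions.create(
        model=MODEL_NAME,
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

async def main() -> None:
    try:
        answers = await asyncio.gather(ask("Hi from Tokyo!"), ask("Hi from Berkeley!"))
        for answer in answers:
            print(answer)
    finally:
        await client.close()  # close the AsyncOpenAI client (and credential, if any)

asyncio.run(main())
```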

chat_history.py

Lines changed: 4 additions & 5 deletions

@@ -12,12 +12,11 @@
     token_provider = azure.identity.get_bearer_token_provider(
         azure.identity.DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
     )
-    client = openai.AzureOpenAI(
-        api_version=os.environ["AZURE_OPENAI_VERSION"],
-        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
-        azure_ad_token_provider=token_provider,
+    client = openai.OpenAI(
+        base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
+        api_key=token_provider,
     )
-    MODEL_NAME = os.environ["AZURE_OPENAI_DEPLOYMENT"]
+    MODEL_NAME = os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"]
 elif API_HOST == "ollama":
     client = openai.OpenAI(base_url=os.environ["OLLAMA_ENDPOINT"], api_key="nokeyneeded")
     MODEL_NAME = os.environ["OLLAMA_MODEL"]

chat_history_stream.py

Lines changed: 4 additions & 5 deletions

@@ -12,12 +12,11 @@
     token_provider = azure.identity.get_bearer_token_provider(
         azure.identity.DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
     )
-    client = openai.AzureOpenAI(
-        api_version=os.environ["AZURE_OPENAI_VERSION"],
-        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
-        azure_ad_token_provider=token_provider,
+    client = openai.OpenAI(
+        base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
+        api_key=token_provider,
    )
-    MODEL_NAME = os.environ["AZURE_OPENAI_DEPLOYMENT"]
+    MODEL_NAME = os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"]
 elif API_HOST == "ollama":
     client = openai.OpenAI(base_url=os.environ["OLLAMA_ENDPOINT"], api_key="nokeyneeded")
     MODEL_NAME = os.environ["OLLAMA_MODEL"]
