
@GeekRicardo

Description

This PR enhances the OpenAI-compatible API server by adding model-specific prompt formatting with full conversation history support. Different AI models require different prompt structures and special tokens to function correctly, especially when handling multi-turn conversations. This change ensures that prompts are properly formatted based on the model being used while preserving the entire conversation context.

Changes

Added model-specific formatting functions with context support:

  • formatLlama3() - Handles multi-turn conversations with <|begin_of_text|>, <|start_header_id|>, and <|eot_id|> tokens
  • formatLlama2() - Preserves conversation history using [INST], <<SYS>> tokens
  • formatMistral() - Maintains context across multiple user/assistant interactions
  • formatClaude() - Formats entire conversation history in Claude's conversational style
  • formatGrok() - Supports system instructions and conversation context
  • formatSimpleChat() - Fallback that preserves all messages for OpenAI/Gemini
  • formatDefault() - Generic format maintaining full message history
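As an illustration of what one of these formatters might look like, here is a sketch of formatLlama3 following the standard Llama 3 chat template (a hypothetical reconstruction, not the PR's actual code):

```javascript
// Sketch of a Llama 3 formatter: every message becomes a header-delimited
// turn, and the prompt ends with an open assistant header so the model
// generates the next reply. Assumes roles are 'system'/'user'/'assistant'.
function formatLlama3(messages) {
  let prompt = '<|begin_of_text|>';
  for (const m of messages) {
    prompt += `<|start_header_id|>${m.role}<|end_header_id|>\n\n${m.content}<|eot_id|>`;
  }
  // Cue the model to produce the next assistant turn.
  prompt += '<|start_header_id|>assistant<|end_header_id|>\n\n';
  return prompt;
}
```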

Main function:

  • formatPromptForModel() - Processes entire message array while preserving conversation flow
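The routing might look roughly like this (a sketch, not the PR's actual code; the per-family formatters are stubbed with one generic formatDefault so the block runs standalone, and the comments name the formatter each branch would call in the real implementation):

```javascript
// Generic fallback: role-prefixed lines plus a trailing assistant cue.
function formatDefault(messages) {
  return messages.map((m) => `${m.role}: ${m.content}`).join('\n') + '\nassistant:';
}

// Hypothetical dispatch by model-name substring.
function formatPromptForModel(messages, model) {
  const name = String(model || '').toLowerCase();
  if (/llama-?3/.test(name)) return formatDefault(messages);          // formatLlama3
  if (/llama/.test(name)) return formatDefault(messages);             // formatLlama2
  if (/mistral|codestral/.test(name)) return formatDefault(messages); // formatMistral
  if (/claude/.test(name)) return formatDefault(messages);            // formatClaude
  if (/grok/.test(name)) return formatDefault(messages);              // formatGrok
  if (/gpt|gemini/.test(name)) return formatDefault(messages);        // formatSimpleChat
  return formatDefault(messages);                                     // formatDefault
}
```

Note that more specific patterns (llama-3) must be checked before broader ones (llama) so a Llama 3 model is not routed to the Llama 2 formatter.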

Key Features

🔄 Full Conversation History Support

  • Processes entire message arrays (system, user, assistant messages)
  • Maintains proper conversation flow and context
  • Preserves message order and role information

🎯 Model-Specific Context Handling

Each model family has unique requirements for handling conversation history:

  • Llama models: Properly chains messages with end-of-turn tokens
  • Claude: Maintains conversational format with clear role prefixes
  • Mistral: Uses instruction tags to separate conversation turns
  • System messages: Handled according to each model's requirements
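For instance, the Llama 2 family folds the system message into the first instruction block, and closes each completed exchange with </s> before opening the next one. A sketch of that chaining (hypothetical; the PR's actual formatLlama2 may differ):

```javascript
// Sketch of a Llama 2 formatter: the system prompt is wrapped in <<SYS>>
// tags inside the FIRST [INST] block only; each finished user/assistant
// exchange is sealed with </s> before the next turn opens.
function formatLlama2(messages) {
  let system = '';
  const turns = [];
  for (const m of messages) {
    if (m.role === 'system') system = m.content;
    else turns.push(m);
  }
  let prompt = '';
  let current = '';
  for (const m of turns) {
    if (m.role === 'user') {
      const sys = prompt === '' && system
        ? `<<SYS>>\n${system}\n<</SYS>>\n\n`
        : '';
      current = `<s>[INST] ${sys}${m.content} [/INST]`;
    } else if (m.role === 'assistant') {
      prompt += `${current} ${m.content} </s>`;
      current = '';
    }
  }
  // The final (unanswered) user turn stays open for the model to complete.
  return prompt + current;
}
```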

Motivation

When using the OpenAI-compatible API endpoint with different models through Raycast AI, maintaining conversation context is crucial. Without proper formatting:

  • Models lose track of previous interactions
  • System instructions may be ignored or misplaced
  • Multi-turn conversations produce inconsistent results
  • Context windows are not utilized effectively

Testing

Tested with multi-turn conversations on:

  • Llama 2/3/3.1 with system prompts and conversation history
  • Mistral/Codestral with multiple user/assistant exchanges
  • Claude with system instructions and long conversations
  • OpenAI format with full message history
  • Generic fallback maintaining all context

Example

Before this change:

```javascript
// Only last message or simple concatenation
const prompt = messages[messages.length - 1].content;
```

After this change:

```javascript
// Full conversation with proper formatting
const messages = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'What is Python?' },
  { role: 'assistant', content: 'Python is a programming language...' },
  { role: 'user', content: 'Can you give me an example?' }
];

const prompt = formatPromptForModel(messages, 'claude');
// Returns properly formatted conversation with full context:
// You are a helpful assistant.
// 
// User: What is Python?
// 
// Assistant: Python is a programming language...
// 
// User: Can you give me an example?
// 
// Assistant:
```
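The Claude-style output above could be produced by a formatter along these lines (a sketch of what formatClaude might do, not the PR's actual code):

```javascript
// Sketch: system content is emitted bare, user/assistant turns get role
// prefixes, turns are separated by blank lines, and a trailing "Assistant:"
// cues the model to respond.
function formatClaude(messages) {
  const parts = [];
  for (const m of messages) {
    if (m.role === 'system') parts.push(m.content);
    else if (m.role === 'user') parts.push(`User: ${m.content}`);
    else if (m.role === 'assistant') parts.push(`Assistant: ${m.content}`);
  }
  parts.push('Assistant:');
  return parts.join('\n\n');
}
```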

Breaking Changes

None. This change is backwards compatible and only enhances the existing functionality.

Benefits

✅ Better context understanding: Models can reference previous messages
✅ Improved response quality: Full conversation context leads to more coherent responses
✅ System instruction support: Properly handles system messages for each model
✅ Multi-turn conversations: Enables complex, contextual interactions
✅ Model compatibility: Each model receives its optimal format

Related Issues

  • Improves model compatibility for OpenAI-compatible API usage
  • Addresses prompt formatting issues with conversation history
  • Enables proper context handling for multi-turn conversations

Checklist

  • Code follows the project's style guidelines
  • All comments are in English
  • Functions are properly documented with JSDoc
  • Handles full message arrays with conversation history
  • Preserves context across multiple conversation turns
  • No breaking changes introduced
  • Tested with multi-turn conversations on multiple model types

Disclosure

This code enhancement was developed with AI assistance. The PR description and documentation were also generated using AI to ensure comprehensive coverage of the changes and their implications.
