Labels
bug (Something isn't working)
Description
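I ran metacoder without having OpenAI configured, and the evaluation process got stuck in a failure loop: the DeepEval correctness metric keeps retrying requests to the OpenAI API, and each one fails with HTTP 429 (quota exceeded).
This condition should be detected so that the process can exit cleanly instead of retrying until it is aborted manually.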
(metacoder) PS C:\Users\CTParker\PycharmProjects\metacoder> uv run metacoder eval .\tests\input\literature_mcp_eval_config.yaml
🔬 Running evaluations from: tests\input\literature_mcp_eval_config.yaml
📊 Loaded dataset: pubmed tools evals
Models: claude-sonnet
Coders: goose, dummy (all available)
Cases: 1
Total evaluations: 2
🚀 Starting evaluations...
Progress: 1/1 - goose/claude-sonnet/PMID_28027860_Full_Text with servers: mcp-simple-pubmed
Running goose with claude-sonnet on case 'PMID_28027860_Full_Text'
📁 Preparing workdir: eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_mcp-simple-pubmed\claude-sonnet_goose_PMID_28027860_Full_Text
🔒 Obtaining lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_mcp-simple-pubmed\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🔧 Writing config object: .config/goose/config.yaml type=yaml
🔓 Releasing lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_mcp-simple-pubmed\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🔒 Obtaining lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_mcp-simple-pubmed\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🦆 Running command: goose run -t What is the first sentence of section 2 in PMID: 28027860?
🦆 Command took 25.10282039642334 seconds
🔓 Releasing lock for eval_workdir\claude-sonnet_goose_PMID_28027860_Full_Text_mcp-simple-pubmed\claude-sonnet_goose_PMID_28027860_Full_Text; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
Evaluating with CorrectnessMetric
✨ You're running DeepEval's latest Correctness [GEval] Metric! (using gpt-4.1, strict=False, async_mode=True)...
Evaluating 1 test case(s) in parallel ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0:00:00
🎯 Evaluating test case #0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0:00:00HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
Evaluating 1 test case(s) in parallel ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0:00:01
🎯 Evaluating test case #0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0:00:01HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
Evaluating 1 test case(s) in parallel ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0:00:02
🎯 Evaluating test case #0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0:00:02HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
OpenAI Error: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.ope
Aborted!
Evaluating 1 test case(s) in parallel ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0:00:05
🎯 Evaluating test case #0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0:00:05
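
One possible way to exit cleanly is sketched below. It is a minimal, self-contained illustration, not metacoder's or DeepEval's actual code: the names `FatalEvalError`, `preflight`, `is_fatal`, and `call_with_retries` are hypothetical, and `call` is assumed to be any function that returns a `(status_code, body)` tuple. The only things taken from outside the sketch are the `OPENAI_API_KEY` environment variable and the "exceeded your current quota" wording from the log above.

```python
import os
import time


class FatalEvalError(RuntimeError):
    """Hypothetical error type: retrying cannot fix this failure."""


def preflight(env=os.environ):
    """Fail fast, before any evaluation starts, if no API key is set."""
    if not env.get("OPENAI_API_KEY"):
        raise FatalEvalError("OPENAI_API_KEY is not set; cannot run the metric")


def is_fatal(status_code, message):
    """Return True for errors that a retry cannot fix."""
    if status_code in (401, 403):
        return True  # bad or missing credentials
    # A 429 caused by an exhausted quota is permanent for this run,
    # unlike a 429 caused by a short-lived rate limit.
    return status_code == 429 and "exceeded your current quota" in message.lower()


def call_with_retries(call, max_attempts=3, delay=1.0):
    """Run `call`, retrying transient failures a bounded number of times.

    `call` is assumed to return a (status_code, body) tuple.
    """
    status = None
    for attempt in range(1, max_attempts + 1):
        status, body = call()
        if status == 200:
            return body
        if is_fatal(status, body):
            raise FatalEvalError(f"non-retryable error {status}: {body}")
        if attempt < max_attempts:
            time.sleep(delay)
    raise FatalEvalError(
        f"gave up after {max_attempts} attempts (last status {status})"
    )
```

With something like this in place, the CLI entry point could catch `FatalEvalError`, print the message once, and exit with a non-zero status rather than looping on the same 429 response.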