Actions: willccbb/verifiers

Test

169 workflow runs

Fix eval saving failing on -n -1
Test #212: Pull request #255 synchronize by mikasenghaas
August 27, 2025 11:06 51s mika/fix/eval-all-examples
[DRAFT] Port LiveCodeBench Eval to verifiers
Test #211: Pull request #225 synchronize by IseLein
August 27, 2025 08:09 Action required IseLein:env/livecodebench1
detect when tool_calls is a list of JSON strings (#250)
Test #210: Commit 85ae8e4 pushed by willccbb
August 27, 2025 03:34 44s main
Add MedAgentBench Environment
Test #208: Pull request #249 opened by Pranavb333
August 26, 2025 20:36 Action required Pranavb333:add-med_agent_bench
Release version 0.1.3
Test #206: Commit 2106820 pushed by willccbb
August 26, 2025 11:55 50s main
fix saving dataset to HF, toolcall sanitizing (#246)
Test #205: Commit aef9f21 pushed by willccbb
August 26, 2025 08:14 46s main
August 26, 2025 07:23 44s
Add sampling_args flag to vf-eval (#240)
Test #201: Commit 8e38e7f pushed by willccbb
August 26, 2025 03:44 41s main
Allow unsetting max_tokens in eval script (#241)
Test #199: Commit c054ff9 pushed by willccbb
August 26, 2025 03:28 42s main
MMLU example working, tui fixes (#243)
Test #198: Commit fcc0267 pushed by willccbb
August 26, 2025 03:23 50s main
MMLU example working, tui fixes
Test #197: Pull request #243 synchronize by willccbb
August 26, 2025 03:20 44s will/multimodal-envs
MMLU example working, tui fixes
Test #196: Pull request #243 opened by willccbb
August 26, 2025 03:20 47s will/multimodal-envs
Fix for JudgeRubric not using async (#238)
Test #192: Commit b1fb34f pushed by willccbb
August 25, 2025 23:41 43s main
fix light mode for tui
Test #174: Commit 88a48dd pushed by willccbb
August 24, 2025 22:19 41s main
spacing
Test #173: Commit e82a4ca pushed by willccbb
August 24, 2025 22:07 42s main
hotfix for saving tool call results
Test #172: Commit 8f17fd8 pushed by willccbb
August 24, 2025 21:53 43s main
reorg funcs in env
Test #171: Commit 0e6098b pushed by willccbb
August 24, 2025 21:41 45s main
env section comments
Test #170: Commit f0d2ce5 pushed by willccbb
August 24, 2025 21:39 39s main