Actions: willccbb/verifiers

Style

225 workflow runs

Fix eval saving failing on -n -1
Style #270: Pull request #255 synchronize by mikasenghaas
August 27, 2025 11:06 11s mika/fix/eval-all-examples
Fix eval saving failing on -n -1
Style #269: Pull request #255 opened by mikasenghaas
August 27, 2025 11:04 14s mika/fix/eval-all-examples
[DRAFT] Port LiveCodeBench Eval to verifiers
Style #268: Pull request #225 synchronize by IseLein
August 27, 2025 08:09 Action required IseLein:env/livecodebench1
detect when tool_calls is a list of JSON strings (#250)
Style #267: Commit 85ae8e4 pushed by willccbb
August 27, 2025 03:34 18s main
Add MedAgentBench Environment
Style #265: Pull request #249 opened by Pranavb333
August 26, 2025 20:36 Action required Pranavb333:add-med_agent_bench
Release version 0.1.3
Style #263: Commit 2106820 pushed by willccbb
August 26, 2025 11:55 10s main
revert version
Style #262: Commit 93b8b72 pushed by willccbb
August 26, 2025 09:40 12s main
fix saving dataset to HF, toolcall sanitizing (#246)
Style #261: Commit aef9f21 pushed by willccbb
August 26, 2025 08:14 9s main
August 26, 2025 07:23 9s
Add sampling_args flag to vf-eval (#240)
Style #257: Commit 8e38e7f pushed by willccbb
August 26, 2025 03:44 9s main
Allow unsetting max_tokens in eval script (#241)
Style #255: Commit c054ff9 pushed by willccbb
August 26, 2025 03:28 12s main
MMLU example working, tui fixes (#243)
Style #254: Commit fcc0267 pushed by willccbb
August 26, 2025 03:23 9s main
MMLU example working, tui fixes
Style #253: Pull request #243 synchronize by willccbb
August 26, 2025 03:20 8s will/multimodal-envs
MMLU example working, tui fixes
Style #252: Pull request #243 opened by willccbb
August 26, 2025 03:20 12s will/multimodal-envs
Fix for JudgeRubric not using async (#238)
Style #248: Commit b1fb34f pushed by willccbb
August 25, 2025 23:41 14s main
fix light mode for tui
Style #230: Commit 88a48dd pushed by willccbb
August 24, 2025 22:19 12s main
spacing
Style #229: Commit e82a4ca pushed by willccbb
August 24, 2025 22:07 13s main
hotfix for saving tool call results
Style #228: Commit 8f17fd8 pushed by willccbb
August 24, 2025 21:53 9s main