Pull requests: willccbb/verifiers

Pull requests list

Add MedAgentBench Environment
#249 opened Aug 26, 2025 by Pranavb333 Draft
7 of 11 tasks
Fix: Stop tool execution on failure in ToolEnv
#236 opened Aug 25, 2025 by Mirza-Samad-Ahmed-Baig
7 of 10 tasks
[DRAFT] Port LiveCodeBench Eval to verifiers
#225 opened Aug 23, 2025 by IseLein Draft
14 tasks
Kubernetes cluster management and migration
#168 opened Jul 28, 2025 by willccbb Draft
11 of 14 tasks
[DRAFT] Port LiveCodeBench Eval to verifiers
#164 opened Jul 27, 2025 by willccbb Draft
14 tasks
[DRAFT] Port Tau2-Bench eval to verifiers
#163 opened Jul 27, 2025 by willccbb Draft
9 of 14 tasks
feat: Add support for multiple tool calls in a single message
#147 opened Jul 18, 2025 by PastaPastaPasta
9 of 14 tasks
Add token usage to vllm_server
#118 opened Jul 1, 2025 by tcapelle
Vllm server with HF parser
#97 opened Jun 21, 2025 by tcapelle
add multimodal support
#81 opened Jun 10, 2025 by nph4rd
Release workflow with dynamic versioning
#72 opened Jun 6, 2025 by mattmorgis
add: RAG tool (simple BM25)
#61 opened May 15, 2025 by hahuyhoang411