Bats Research
- 77 followers
- United States of America
- http://cs.brown.edu/people/sbach/
Repositories
- self-jailbreaking Public
Official code repository for "Self-Jailbreaking: Language Models Can Reason Themselves Out of Safety Alignment After Benign Reasoning Training"
- cot-monitor Public
Can We Predict Alignment Before Models Finish Thinking? Towards Monitoring Misaligned Reasoning Models
- labelmodels Public
Lightweight implementations of generative label models for weakly supervised machine learning
- bonito Public
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
- alfred Public
A system for prompted weak supervision. Alfred is a powerful tool that leverages large language models to accelerate data annotation.
- cross-lingual-detox Public
Code for "Preference Tuning For Toxicity Mitigation Generalizes Across Languages." Paper accepted at Findings of EMNLP 2024.