Pinned
- haizelabs/llama3-jailbreak (Public): A trivial programmatic Llama 3 jailbreak. Sorry Zuck!
- haizelabs/dspy-redteam (Public): Red-Teaming Language Models with DSPy
- haizelabs/verdict (Public): Inference-time scaling for LLMs-as-a-judge.
- haizelabs/j1-micro (Public): j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models.
- The-Naughtyformer (Public): The Naughtyformer: A Transformer Understands Offensive Humor (AAAI 2023)
- LLM-Watermarks (Public): Baselines for Identifying Watermarked Large Language Models (ICML AdvML 2023)