Notes from the Wired

Writing on "Alignment"

These are all the writings I have tagged with "Alignment". For writings on another topic, see the tags page.

2024

Dec. 08
Frontier Models are Capable of In-context Scheming
Feb. 05
Training language models to follow instructions with human feedback
Feb. 05
Constitutional AI: Harmlessness from AI Feedback
Feb. 02
Self-Instruct: Aligning Language Models with Self-Generated Instructions
Feb. 01
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Made with Hugo, website licensed under CC BY-NC-SA 4.0.
