Some conceptual alignment research projects | Published on August 25, 2022 10:51 PM GMT | 09/01/22 |
Survey advice | Published on August 24, 2022 3:10 AM GMT | 08/27/22 |
Toni Kurz and the Insanity of Climbing Mountains | Published on July 3, 2022 8:51 PM GMT | 08/23/22 |
Deliberate Grieving | Published on May 30, 2022 8:49 PM GMT | 08/19/22 |
Language models seem to be much better than humans at next-token prediction | Published on August 11, 2022 5:45 PM GMT | 08/16/22 |
Humans provide an untapped wealth of evidence about alignment | Published on July 14, 2022 2:31 AM GMT | 08/14/22 |
Changing the world through slack & hobbies | Published on July 21, 2022 6:11 PM GMT | 08/10/22 |
«Boundaries», Part 1: a key missing concept from utility theory | Published on July 26, 2022 11:03 PM GMT | 08/04/22 |
ITT-passing and civility are good; "charity" is bad; steelmanning is niche | Published on July 5, 2022 12:15 AM GMT | 07/26/22 |
What should you change in response to an "emergency"? And AI risk | Published on July 18, 2022 1:11 AM GMT | 07/23/22 |
On how various plans miss the hard bits of the alignment challenge | Published on July 12, 2022 2:49 AM GMT | 07/20/22 |
Humans are very reliable agents | Published on June 16, 2022 10:02 PM GMT | 07/14/22 |
Looking back on my alignment PhD | Published on July 1, 2022 3:19 AM GMT | 07/09/22 |
It’s Probably Not Lithium | Published on June 28, 2022 9:24 PM GMT | 07/05/22 |
What Are You Tracking In Your Head? | Published on June 28, 2022 7:30 PM GMT | 07/01/22 |
Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment | Published on June 21, 2022 11:55 PM GMT | 06/29/22 |
Nonprofit Boards are Weird | Published on June 23, 2022 2:40 PM GMT | 06/25/22 |
Where I agree and disagree with Eliezer | Published on June 19, 2022 7:15 PM GMT | 06/21/22 |
Six Dimensions of Operational Adequacy in AGI Projects | Published on May 30, 2022 5:00 PM GMT | 06/17/22 |
Moses and the Class Struggle | Published on April 1, 2022 11:55 AM GMT | 06/14/22 |