What is the problem with RLHF?

In progress

This is a draft that someone has submitted for feedback. It hasn't undergone our usual vetting and may not meet our standards or reflect our views.

Could AI alignment research be bad? How?