Thank you for the thoughtful questions, Renan.
1- The research agenda is still taking shape; a key goal over the next 3-4 months is to further refine its directions and priorities and to secure funding from the identified sources. That said, I envision a significant portion focusing on interpretability, particularly interpreting reward models learned via reinforcement learning. Additional areas will likely include safety verification techniques, aligning with much of Stuart Russell's work as well as the areas of expertise of Phil and David.
2- Regarding team composition, we expect at least two existing research fellows to be involved and several PhD students to be hired. Most members will have strong technical backgrounds and a solid grounding in the AI alignment literature. We aim to assemble a diverse team with complementary strengths to pursue impactful research directions.
Please let me know if you have any other questions! I'm excited by the potential here and value your perspective.