Societal Alignment Frameworks Can Improve LLM Alignment
Stańczak, Karolina, Nicholas Meade, Mehar Bhatia, et al. 2025. “Societal Alignment Frameworks Can Improve LLM Alignment.” arXiv:2503.00069. Preprint, arXiv, February 27. https://doi.org/10.48550/arXiv.2503.00069.
Notes
In-text annotations
"To better understand this misalignment, we frame LLM alignment within a principal-agent framework (Eisenhardt, 1989), a well-established paradigm in economic theory. As shown in Figure 1, in this framework, the LLM acts as the agent and the model developer (or user) serves as the principal." (Page 1)
"In this position piece, we advocate for leveraging insights from societal alignment frameworks to guide the development of LLM alignment within incomplete contracting environments." (Page 2)
"These contextual rules, while not directly influencing primary optimization objectives, are often followed due to tradition, or social norms. Despite their indirect nature, such rules can provide valuable signals about broader societal dynamics, thereby guiding the alignment of LLMs, as discussed by Hadfield-Menell et al. (2019) and Köster et al. (2020) within the broader context of AI alignment." (Page 5)