Towards Measuring and Modeling ‘Culture’ in LLMs: A Survey

Adilazuarda, Muhammad Farid, Sagnik Mukherjee, Pradhyumna Lavania, et al. 2024. “Towards Measuring and Modeling ‘Culture’ in LLMs: A Survey.” arXiv:2403.15412. Preprint, arXiv, September 4. https://doi.org/10.48550/arXiv.2403.15412.

Notes

In-text annotations

"The growing body of work that broadly aims at evaluating LLMs for their multi-cultural awareness and biases underscore an important problem - that the existing models are strongly biased towards Western, Anglocentric or American cultures (Johnson et al., 2022; Cieciuch and Schwartz, 2012; Dwivedi et al., 2023)." (Page 1)

"Hershcovich et al. (2022) in their study calls out three axes of interaction between language and culture that NLP research and language technology needs to consider: common ground, aboutness and objectives and values." (Page 2)

"In addition, we highlight limitations in the robustness of the probing methods used in the studies, which raises doubts about the reliability and generalizability of the findings. Whilst benchmarking is important and necessary, it is not sufficient, as the choices made in creating rigorous benchmarking datasets are unlikely to reveal the full extent of either LLMs cultural limitations or their full cultural representation. Not only is culture multi-faceted, but cultural representation is tied in closely with other related factors such as local language use and local terminology" (Page 3)

"We have not come across any study on culture that uses white-box approaches, and deem this to be an important gap in the area because these approaches are more interpretable and likely more robust than black-box methods." (Page 4)

"Definition of culture. While the multifaceted nature of culture makes a unified definition across studies virtually impossible, it is quite surprising that none of the studies explicitly acknowledge this and nor do they make any attempt to critically engage with the social science literature on culture. Thus, an obvious gap is lack of a framework for defining culture and contextualizing the studies, leading to a lack of a coherent research program. Our survey takes first step in this direction. We recommend that future studies in this area should explicitly call out the proxies of culture that their datasets represent and situate the study within the broader research agenda." (Page 8)