State Media Control Influences Large Language Models
Morning Keynote, 9:10 - 9:40 AM
Millions of people around the world query large language models for information. While several studies have compellingly documented the persuasive potential of these models, there is limited evidence of who or what influences the models themselves, leading to a flurry of concerns about which companies and governments build and regulate the models. We show through six studies that government control of the media across the world already influences the output of large language models (LLMs) via their training data. We use a cross-national audit to show that LLMs exhibit a stronger pro-government valence in the languages of countries with lower media freedom than those with higher media freedom. This result is correlational, so to triangulate the specific mechanism of how state media control can influence LLMs, we develop a multi-part case study on China’s media. We demonstrate that media scripted and coordinated by the Chinese state appears in large language model training datasets. To evaluate the plausible effect of this inclusion, we use an open-weight model to show that additional pretraining on Chinese state-coordinated media generates more positive answers to prompts about Chinese political institutions and leaders. We link this phenomenon to commercial models through two audit studies demonstrating that prompting models in Chinese generates more positive responses about China’s institutions and leaders than do the same queries in English. The combination of influence and persuasive potential across languages suggests the troubling conclusion that states and powerful institutions have increased strategic incentives to leverage media control in the hopes of shaping large language model output.
Pronouns: she/her
Portland, OR, USA
Hannah Waight is an Assistant Professor of Sociology at the University of Oregon and a former Postdoctoral Research Associate at the Center for Social Media and Politics (CSMaP), New York University. She received her PhD in 2022 from the Department of Sociology at Princeton University. She previously received a B.A. and M.A. in East Asian Studies, both from Harvard University. Hannah studies the politics of media and information and their implications for social organization. She has studied how the Chinese state intervenes in media organizations and coordinates the news and how, due to the structure of the information environment, those same propaganda objects travel beyond their origins in state-controlled media to other news environments and even machine learning training data. She is also interested in how individuals perceive these interventions by the state into media and information environments as well as issues of social perception more generally. She's worked on popular perceptions of inequality and has an ongoing project on how people in authoritarian regimes "see the state." Methodologically she uses a range of approaches, including computational methods/natural language processing, experimental methods, and qualitative primary document analysis.
