Surveillance Academia
Sabyasachi Das
Sabyasachi Das is an Associate Professor of Economics at Ahmedabad University. His research lies at the intersection of political economy, public economics, and institutional analysis, with a focus on how governance structures, electoral processes, and public policy shape inequality and resource distribution. He holds a PhD in Economics from Yale University.
Academics and researchers sometimes have an unhealthy relationship with data. We are always craving ever larger and more granular datasets, hoping they will uncover deeper and more interesting relationships between social and economic phenomena. The craving turns toxic when the urge to access larger datasets overrides other concerns about how data is generated. This is especially worrying as we move from active forms of data generation to passive ones, driven by the omnipresence of digital technologies in our lives and the surveillance apparatus built around them.
Traditionally, micro-data collection has been largely manual. Researchers either conduct surveys—which are time-consuming and labor-intensive—or access internal records of firms, government agencies, or other organizations. The latter typically requires digitizing and compiling physical records, again involving substantial labor. These modes of data generation, which I refer to as active processes, remain prevalent in research. They require active engagement with human subjects for the purpose of data generation. As a result, the act of collecting data about some human or organizational activity (such as fertility choices, consumption decisions, or firm production and revenues) is separate from the activity itself. This decoupling collapses when data collection becomes passive. Consider a telecom company collecting cellphone metadata from its customers. These data are generated automatically—passively—during routine cellphone use. Data collection no longer requires additional labor, but it does require additional capital: investments in data-collection technologies and storage infrastructure. This shift toward a surveillance-based, capital-intensive model of data production has several important implications for research.
When data production is active, usage purposes are generally better defined, and research practices typically include built-in protocols that require informed consent from participants and adherence to ethical norms during data collection (such as asking only ethically permissible questions) and storage (for example, maintaining anonymity). This is especially true for survey-based research. Even firm-level data is often collected by government agencies operating under legal and ethical constraints. For instance, India’s Annual Survey of Industries, which collects firm-level information on revenues, employment, and capital, is conducted by the Ministry of Statistics and Programme Implementation. In conducting the survey, the Ministry must adhere to the guidelines of the Collection of Statistics Act, first passed in 1953 and amended in 2008. Similarly, in the United States, Title 13 of the U.S. Code (passed in 1954) provides the legal framework for census data collection, including firm data. Nordic countries, known for maintaining detailed social registries for the entire population, have independent public institutions (such as the Norwegian Data Protection Authority) whose primary purpose is to ensure compliance with data protection laws.
This is not to say that survey or institutional data collection is devoid of ethical issues
Researchers are increasingly collaborating with private organizations that collect vast amounts of micro-data, such as cellphone records, purchase histories, and transportation activity on ride-sharing platforms. As a result, researchers’ access to such data depends heavily on their relationship with the organization that holds the data. In practice, researchers must offer a value proposition in exchange for access. Academic researchers can help firms with strategy and, more importantly, can influence policy in the firms’ favor. The terms of these collaborations are often implicit and private, making them difficult for outsiders to evaluate.
A 2018 Guardian investigation
Sometimes, the nature of the research question itself signals the underlying relationship. For example, in a paper titled “Personalized Pricing and Consumer Welfare” (2022)
There are similar examples of such “firm-sanctioned” studies. The research questions in these papers are all valid and intellectually interesting exercises. However, using firms’ surveillance data on their customers for research creates an externality for both the firm and its customers. The customers generated the data in the first place, but did not, and likely cannot, meaningfully consent to their data being used in this way.
Academics also increasingly collaborate with governments to study the effectiveness of using surveillance data—generated either within government systems or by private organizations—to improve policy implementation and governance outcomes. One prominent policy area, particularly in developing countries, is the targeting of welfare programs. Identifying poor households is a central challenge in implementing such programs, and governments traditionally rely on surveys for this purpose.
A growing body of research asks whether individual cellphone records can help governments identify poor households more effectively than surveys. The underlying hypothesis is that the cellphone usage patterns of poorer individuals differ systematically from those of others. If true, such methods could, in principle, be scaled up to reduce the inclusion and exclusion errors associated with survey-based targeting.
Many of these studies are conducted in difficult settings—Afghanistan
The Afghanistan study, for example, finds that cellphone-based targeting “is nearly as accurate as the commonly employed asset- and consumption-based methods.” The Bangladesh study concludes that while survey-based targeting (the Proxy Means Test) “is more costly than phone-based targeting, it is also more accurate.” As a result, the policy conclusions are nuanced: phone-based methods cannot replace traditional approaches but may at best complement them. As the Togo study notes,
our results do not imply that mobile-phone-based targeting should replace traditional approaches reliant on proxy means tests or community-based targeting. Rather, these methods provide a rapid and cost-effective supplement that may be most useful in crisis settings or in contexts where traditional data sources are incomplete or out of date.
Nevertheless, providing governments—especially those with limited capacity—with alternative digital technologies that can eschew investments in data-collection institutions carries risks. It may nudge them toward wider adoption of such tools than is appropriate, leading to underinvestment in (active) data collection mechanisms, which the research finds to be generally superior. Inappropriate or hasty adoption of these technologies in governance practices can indeed generate adverse outcomes for the beneficiaries, as discussed here
In some cases, the benefits of state surveillance are easier to demonstrate than the costs, shaping which research questions are asked—or asked first. In January 2026, the Journal of Development Economics, a leading journal in the field, published a paper examining the impact of China’s nationwide expansion of facial-recognition-enabled surveillance cameras on crime. Ominously titled “Keeping an Eye on the Villain,”
Surveillance has effectively shifted the responsibility for data generation from governments and researchers to human subjects themselves, with a wide scope of use. This has undoubtedly made some researchers’ lives easier and, intellectually and sometimes materially, richer. In the absence of clear guidelines governing the academic collaborations that enable passive data access, several ethical issues arise. Before passive data becomes the default benchmark for research, however, we should pause to reflect on its broader consequences for society and the economy at large.
Acknowledgement
The author thanks Reetika Khera, Ankur Sarin, Parikshit Ghosh and Ritwik Banerjee for valuable feedback and comments.