Problems of algorithmic bias are often framed in terms of a lack of
representative data or of formal fairness optimization constraints to be
applied to automated decision-making systems. However, these discussions
sidestep deeper issues with data used in AI, including problematic
categorizations and the extractive logics of crowdwork and data mining.
This talk will examine two interventions: first, reframing data as a
form of infrastructure and, as such, implicating politics and power in
the construction of datasets; and second, discussing the development of
a research program around the genealogy of datasets used in machine
learning and AI systems. These genealogies should be attentive to the
constellation of organizations and stakeholders involved in their
creation, the intent, values, and assumptions of their authors and
curators, and the adoption of datasets by subsequent researchers.
Bio:
Alex is a sociologist and senior research scientist on the Ethical AI
team at Google. Before that, Alex was an assistant professor at the
Institute of Communication, Culture, Information and Technology at the
University of Toronto.
Alex received a Ph.D. in sociology from the University of Wisconsin-Madison. Her dissertation developed the Machine-learning Protest Event Data System (MPEDS), a system that uses machine learning and natural language processing to create protest event data.
Her current research agenda is two-fold. One line of research centers
on the origins of the training data that form the informational
infrastructure of machine learning, artificial intelligence, and
algorithmic fairness frameworks. Another line of research (with Ellen Berrey)
seeks to understand the interplay between student protest and
university responses in the US and Canada. Alex’s past work has focused on
how new and social media have changed social movement mobilization and
political participation.
Alex is as much an educator as she is a researcher. She has taught workshops and courses
on computational methods for social scientists, social movements, and
the implications of information as infrastructure. She co-founded the
[now defunct] computational social science blog Bad Hessian.
As a second job, she plays women’s flat track roller derby with Bay Area Derby.