SIGMOD 2018: Special Session
A Technical Research Agenda in Data Ethics and Responsible Data Management
Moderator: Julia Stoyanovich (Drexel University)
Speakers:
- Krishna Gummadi (MPI Software Systems)
- Bill Howe (University of Washington)
- HV Jagadish (University of Michigan)
- Alexandra Meliou (University of Massachusetts Amherst)
Abstract
A movement towards fairness, accountability, and transparency (FAT) in algorithmic decision-making, and in data science more broadly, has recently begun. The database community has not been significantly involved in this movement, despite "owning" the models, languages, and systems that produce the input to the machine learning applications that are often the focus in data science. If training data are biased or contain errors, it stands to reason that the algorithmic result will also be unfair or erroneous. Similarly, transparency of the algorithm alone is usually insufficient to understand why certain results were obtained: one also needs to know the data used. In short, fairness, accountability, and transparency depend not just on the algorithm, but also on the data. What are the core data management issues to which the objectives of fairness, accountability, and transparency give rise? What role should the database community play in this movement? Will emphasis on these topics dilute our core competency in techniques and technologies for data, or can it reinforce our central role in technology stacks ranging from startups to the enterprise, and from local non-profits to the federal government? This special session features leading researchers who are doing exciting technical work in FAT. The goal of this session is to outline a technical research agenda in data management foundations and systems around data ethics and responsible data management.
Bios
Krishna Gummadi is a tenured faculty member and head of the Networked Systems research group at the
Max Planck Institute for Software Systems (MPI-SWS) in Germany. He also holds an honorary professorship
at the University of Saarland. Krishna's research interests are in the measurement, analysis, design, and
evaluation of complex Internet-scale systems. His current projects focus on understanding and building social
computing systems, tackling the challenges associated with (i) assessing the credibility of information shared
by anonymous online crowds, (ii) understanding and controlling privacy risks for users sharing data on online
forums, (iii) understanding, predicting and influencing human behaviors on social media sites, and (iv)
understanding biases and enhancing fairness and transparency in machine learning-based (data-driven)
decision making. Krishna's work has been widely cited and his papers have received numerous awards,
including SIGCOMM Test of Time, IW3C2 WWW Best Paper Honorable Mention, and Best Papers at NIPS
ML & Law Symposium, ACM COSN, ACM/Usenix SOUPS, AAAI ICWSM, Usenix OSDI, ACM SIGCOMM IMC, and SPIE
MMCN. He received an ERC Advanced Grant in 2017 for investigating "Foundations for Fair Social Computing".
Bill Howe is Associate Professor in the Information School, Adjunct Associate Professor in Computer
Science & Engineering, Senior Data Science Fellow and Founding Associate Director of the UW eScience
Institute, Director of the UW Urbanalytics Group, and Founding Chair of the UW Data Science Masters
Degree. He has received two Jim Gray Seed Grant awards from Microsoft Research for work on managing
scientific data, has had two papers selected for VLDB Journal's Best of Conference issues, and co-authored
what are currently the most-cited papers from both VLDB 2010 and ACM SIGMOD 2012. Howe developed
a first MOOC on data science that attracted over 200,000 students across two offerings, and founded UW's
Data Science for Social Good program. He has a Ph.D. in Computer Science from Portland State University
and a Bachelor's degree in Industrial & Systems Engineering from Georgia Tech.
HV Jagadish is the Bernard A Galler Collegiate Professor of Electrical Engineering and Computer Science,
and Distinguished Scientist at the Institute for Data Science, at the University of Michigan in Ann Arbor.
Prior to 1999, he was Head of the Database Research Department at AT&T Labs, Florham Park, NJ. He is
a fellow of the ACM (since 2003), fellow of AAAS (since 2018) and serves on the board of the Computing
Research Association (since 2009). He has been an Associate Editor for the ACM Transactions on Database
Systems (1992-1995), Program Chair of the ACM SIGMOD annual conference (1996), Program Chair of the
ISMB conference (2005), a trustee of the VLDB foundation (2004-2009), Founding Editor-in-Chief of the
Proceedings of the VLDB Endowment (2008-2014), and Program Chair of the VLDB Conference (2014).
Since 2016, he has been Editor of the Morgan & Claypool "Synthesis" Lecture Series on Data Management. Among
his many awards are the 2013 ACM SIGMOD Contributions Award and the 2008 David E Liddle Research Excellence Award
(at the University of Michigan). He has developed a popular MOOC on Data Science Ethics that is carried by Coursera and
EdX.
Alexandra Meliou is an Assistant Professor in the College of Information and Computer Sciences, at the
University of Massachusetts, Amherst. Prior to that, she was a Post-Doctoral Research Associate at the
University of Washington, working with Dan Suciu. Alexandra received her PhD degree from the Electrical
Engineering and Computer Sciences Department at the University of California, Berkeley. She has received
recognitions for research and teaching, including a CACM Research Highlight, an ACM SIGMOD Research
Highlight Award, an ACM SIGSOFT Distinguished Paper Award, an NSF CAREER Award, a Google Faculty
Research Award, and a Lilly Fellowship for Teaching Excellence. Her research focuses on data provenance,
causality, explanations, data quality, and algorithmic and data fairness.
Julia Stoyanovich (moderator) is an Assistant Professor of Computer Science at Drexel University, and
an affiliated faculty member at the Center for Information Technology Policy at Princeton. She is a recipient of an
NSF CAREER award and of an NSF/CRA CI Fellowship. Julia's research focuses on responsible data
management and analysis practices: on operationalizing fairness, diversity, transparency, and data
protection in all stages of the data acquisition and processing lifecycle. She established the Data, Responsibly
consortium, serves on the ACM task force to revise the Code of Ethics and Professional Conduct, and is active
in the New York City algorithmic transparency effort. In addition to data ethics, Julia works on management
and analysis of preference data, and on querying large evolving graphs. She holds M.S. and Ph.D. degrees
in Computer Science from Columbia University, and a B.S. in Computer Science and in Mathematics and
Statistics from the University of Massachusetts at Amherst.