SIGMOD 2018: Special Session

A Technical Research Agenda in Data Ethics and Responsible Data Management

Moderator: Julia Stoyanovich (Drexel University)

Speakers:

Krishna Gummadi (MPI Software Systems)

Bill Howe (University of Washington)

HV Jagadish (University of Michigan)

Alexandra Meliou (University of Massachusetts Amherst)

Abstract

A movement towards fairness, accountability, and transparency (FAT) in algorithmic decision-making, and in data science more broadly, has recently begun. The database community has not been significantly involved in this movement, despite "owning" the models, languages, and systems that produce the input to the machine learning applications that are often the focus in data science. If training data are biased, or have errors, it stands to reason that the algorithmic result will also be unfair or erroneous. Similarly, transparency of just the algorithm is usually insufficient to understand why certain results were obtained: one also needs to know the data used. In short, fairness, accountability, and transparency depend not just on the algorithm, but also on the data. What are the core data management issues to which the objectives of fairness, accountability, and transparency give rise? What role should the database community play in this movement? Will emphasis on these topics dilute our core competency in techniques and technologies for data, or can it reinforce our central role in technology stacks ranging from startups to the enterprise, and from local non-profits to the federal government? This special session features leading researchers who are doing exciting technical work in FAT. The goal of this session is to outline a technical research agenda in data management foundations and systems around data ethics and responsible data management.

Bio

Krishna Gummadi is a tenured faculty member and head of the Networked Systems research group at the Max Planck Institute for Software Systems (MPI-SWS) in Germany. He also holds an honorary professorship at the University of Saarland. Krishna's research interests are in the measurement, analysis, design, and evaluation of complex Internet-scale systems. His current projects focus on understanding and building social computing systems, tackling the challenges associated with (i) assessing the credibility of information shared by anonymous online crowds, (ii) understanding and controlling privacy risks for users sharing data on online forums, (iii) understanding, predicting and influencing human behaviors on social media sites, and (iv) understanding biases and enhancing fairness and transparency in machine learning-based (data-driven) decision making. Krishna's works have been widely cited and his papers have received numerous awards, including SIGCOMM Test of Time, IW3C2 WWW Best Paper Honorable Mention, and Best Papers at NIPS ML & Law Symposium, ACM COSN, ACM/Usenix SOUPS, AAAI ICWSM, Usenix OSDI, ACM SIGCOMM IMC, and SPIE MMCN. He received an ERC Advanced Grant in 2017 for investigating "Foundations for Fair Social Computing".

Bill Howe is Associate Professor in the Information School, Adjunct Associate Professor in Computer Science & Engineering, Senior Data Science Fellow and Founding Associate Director of the UW eScience Institute, Director of the UW Urbanalytics Group, and Founding Chair of the UW Data Science Masters Degree. He has received two Jim Gray Seed Grant awards from Microsoft Research for work on managing scientific data, has had two papers selected for VLDB Journal's Best of Conference issues, and co-authored what are currently the most-cited papers from both VLDB 2010 and ACM SIGMOD 2012. Howe developed a first MOOC on data science that attracted over 200,000 students across two offerings, and founded UW's Data Science for Social Good program. He has a Ph.D. in Computer Science from Portland State University and a Bachelor's degree in Industrial & Systems Engineering from Georgia Tech.

HV Jagadish is the Bernard A Galler Collegiate Professor of Electrical Engineering and Computer Science, and Distinguished Scientist at the Institute for Data Science, at the University of Michigan in Ann Arbor. Prior to 1999, he was Head of the Database Research Department at AT&T Labs, Florham Park, NJ. He is a fellow of the ACM (since 2003), a fellow of AAAS (since 2018), and serves on the board of the Computing Research Association (since 2009). He has been an Associate Editor for the ACM Transactions on Database Systems (1992-1995), Program Chair of the ACM SIGMOD annual conference (1996), Program Chair of the ISMB conference (2005), a trustee of the VLDB foundation (2004-2009), Founding Editor-in-Chief of the Proceedings of the VLDB Endowment (2008-2014), and Program Chair of the VLDB Conference (2014). Since 2016, he has been Editor of the Morgan & Claypool "Synthesis" Lecture Series on Data Management. Among his many awards are the 2013 ACM SIGMOD Contributions Award and the 2008 David E Liddle Research Excellence Award (at the University of Michigan). He has developed a popular MOOC on Data Science Ethics that is carried by Coursera and EdX.

Alexandra Meliou is an Assistant Professor in the College of Information and Computer Science, at the University of Massachusetts, Amherst. Prior to that, she was a Post-Doctoral Research Associate at the University of Washington, working with Dan Suciu. Alexandra received her PhD degree from the Electrical Engineering and Computer Sciences Department at the University of California, Berkeley. She has received recognitions for research and teaching, including a CACM Research Highlight, an ACM SIGMOD Research Highlight Award, an ACM SIGSOFT Distinguished Paper Award, an NSF CAREER Award, a Google Faculty Research Award, and a Lilly Fellowship for Teaching Excellence. Her research focuses on data provenance, causality, explanations, data quality, and algorithmic and data fairness.

Julia Stoyanovich (moderator) is an Assistant Professor of Computer Science at Drexel University, and an affiliated faculty member at the Center for Information Technology Policy at Princeton. She is a recipient of an NSF CAREER award and of an NSF/CRA CI Fellowship. Julia's research focuses on responsible data management and analysis practices: on operationalizing fairness, diversity, transparency, and data protection in all stages of the data acquisition and processing lifecycle. She established the Data, Responsibly consortium, serves on the ACM task force to revise the Code of Ethics and Professional Conduct, and is active in the New York City algorithmic transparency effort. In addition to data ethics, Julia works on management and analysis of preference data, and on querying large evolving graphs. She holds M.S. and Ph.D. degrees in Computer Science from Columbia University, and a B.S. in Computer Science and in Mathematics and Statistics from the University of Massachusetts at Amherst.
