Gems of PODS
Talk 1
Reflections on Schema Mappings, Data Exchange, and Metadata Management
Speaker: Phokion G. Kolaitis (UC Santa Cruz and IBM Almaden Research Center)
Abstract
A schema mapping is a high-level specification of the relationship between two database schemas. For the past fifteen years, schema mappings have played an essential role in the modeling and analysis of data exchange, data integration, and related data interoperability tasks. The aim of this talk is to critically reflect on the body of work carried out to date, describe some of the persisting challenges, and suggest directions for future work.
The first part of the talk will focus on schema-mapping languages, especially on the language of GLAV (global-and-local-as-view) mappings and its two main sublanguages, the language of GAV (global-as-view) mappings and the language of LAV (local-as-view) mappings. After highlighting the fundamental structural properties of these languages, we will discuss how structural properties can actually characterize schema-mapping languages. The second part of the talk will focus on metadata management by considering operators on schema mappings, such as the composition operator and the inverse operator. We will discuss why richer languages are needed to express these operators, and will illustrate some of their uses in schema-mapping evolution. The third and final part of the talk will focus on the derivation of schema mappings from semantic information. In particular, we will discuss a variety of approaches for deriving schema mappings from data examples, including casting the derivation of schema mappings as an optimization problem and as a learning problem.
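For readers unfamiliar with the three languages named in the abstract, the following sketch shows the usual logical form of a GLAV constraint and how GAV and LAV arise as restrictions of it. The relation names (Emp, DeptMgr, Reports, Works, Mgr) are hypothetical and chosen only for illustration; they do not come from the talk.

```latex
% GLAV constraint: \varphi is a conjunction of atoms over the source schema,
% \psi is a conjunction of atoms over the target schema.
\forall \mathbf{x}\, \bigl( \varphi(\mathbf{x}) \rightarrow \exists \mathbf{y}\, \psi(\mathbf{x}, \mathbf{y}) \bigr)

% GAV: the right-hand side is a single target atom with no existential quantifiers.
% (Hypothetical relations.)
\forall n\, \forall d\, \forall m\, \bigl( \mathrm{Emp}(n, d) \wedge \mathrm{DeptMgr}(d, m) \rightarrow \mathrm{Reports}(n, m) \bigr)

% LAV: the left-hand side is a single source atom.
% (Hypothetical relations.)
\forall n\, \forall d\, \bigl( \mathrm{Emp}(n, d) \rightarrow \exists m\, ( \mathrm{Works}(n, d) \wedge \mathrm{Mgr}(d, m) ) \bigr)
```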
Bio
Phokion G. Kolaitis is a Distinguished Professor of Computer Science at UC Santa Cruz and a Principal Research Staff Member at the IBM Almaden Research Center. His research interests include principles of database systems, logic in computer science, and computational complexity. Kolaitis is a Fellow of the American Association for the Advancement of Science (AAAS), a Fellow of the Association for Computing Machinery (ACM), a Foreign Member of the Finnish Academy of Science and Letters, a Foreign Member of Academia Europaea, and the recipient of a 1993 Guggenheim Fellowship. He has also received two IBM Research Division Outstanding Innovation Awards and an IBM Research Division Outstanding Technical Achievement Award, and he is a co-winner of both the 2008 and the 2014 ACM PODS Alberto O. Mendelzon Test-of-Time Award and of the 2013 International Conference on Database Theory Test-of-Time Award.
Talk 2
Worst-Case Optimal Join Algorithms: Techniques, Results, and Open Problems
Speaker: Hung Q. Ngo (RelationalAI)
Abstract
Worst-case optimal join algorithms are the class of join algorithms whose runtimes match the worst-case output size of a given join query. While the first provably worst-case optimal join algorithm was discovered relatively recently, the techniques and results surrounding these algorithms grow out of decades of research from a wide range of areas, intimately connecting graph theory, algorithms, information theory, constraint satisfaction, database theory, and geometric inequalities. These ideas are not just paperware: in addition to academic project implementations, two variations of such algorithms are the workhorse join algorithms of commercial database and data analytics engines.
This paper aims to be a brief introduction to the design and analysis of worst-case optimal join algorithms. We discuss the key techniques for proving runtime and output size bounds. We particularly focus on the fascinating connection between join algorithms and information-theoretic inequalities, and the idea of how one can turn a proof into an algorithm. Finally, we conclude with a representative list of fundamental open problems in this area.
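As a small illustration of the idea in the abstract, consider the triangle query Q(a, b, c) = R(a, b), S(b, c), T(a, c), whose output size is known to be at most sqrt(|R| * |S| * |T|). The sketch below is not taken from the talk or the paper; it is a minimal Python rendering of one well-known strategy for this query: scan R, and for each pair intersect the matching neighbor sets from S and T. The function names `index` and `triangle_join` are hypothetical, and the relations are assumed to be duplicate-free sets of pairs.

```python
def index(pairs):
    """Map each first component to the set of second components paired with it."""
    idx = {}
    for x, y in pairs:
        idx.setdefault(x, set()).add(y)
    return idx


def triangle_join(R, S, T):
    """Return all (a, b, c) with (a, b) in R, (b, c) in S, and (a, c) in T.

    Assumes R, S, T are duplicate-free collections of pairs.
    """
    s_bc = index(S)  # b -> {c : (b, c) in S}
    t_ac = index(T)  # a -> {c : (a, c) in T}
    empty = frozenset()
    out = []
    for a, b in R:
        # Intersect the candidate c-values from S and T. For two sets,
        # CPython iterates over the smaller operand, so this step costs
        # roughly min(|S[b]|, |T[a]|) rather than the product of the sizes.
        for c in s_bc.get(b, empty) & t_ac.get(a, empty):
            out.append((a, b, c))
    return out


R = {(1, 2), (1, 3)}
S = {(2, 3), (3, 4)}
T = {(1, 3), (1, 4)}
triangles = sorted(triangle_join(R, S, T))
```

The point of the intersection step is that no pairwise intermediate result (such as the full join of R and S) is ever materialized, which is where traditional binary join plans can exceed the output size bound.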
Bio
Hung Q. Ngo was a professor at the State University of New York at Buffalo from 2001 to 2015. Since 2015, he has worked at two startups building Datalog and data analytics engines: LogicBlox and RelationalAI. His current research and development interests include the design, analysis, and implementation of in-database computation algorithms. These algorithms cover typical logic and statistical query optimization. He received an NSF CAREER award, best paper awards at COCOON 2008, PODS 2012, and PODS 2016, and an ACM SIGMOD Research Highlight Award.
Gems of PODS Committee for 2018:
- Marcelo Arenas (Pontificia Universidad Católica de Chile) - Chair
- Tova Milo (Tel Aviv University)
- Dan Olteanu (Oxford University)