ISTC for Big Data principal investigators, researchers and their students presented work at North East Database Day 2017, held at MIT’s Stata Center in Cambridge, Mass., on January 27, 2017. Microsoft and Facebook sponsored the event.
The 9th Annual North East Database Day, which drew 225 registrants, had a decidedly forward-looking tone. It showcased work by some of the top university research programs and corporate/industrial research labs operating in the Northeast.
ISTC PIs, researchers and students presenting their views of a better-living-through-databases future included:
PhD student Joy Arulraj of Carnegie Mellon University, on “How to Build a Non-Volatile Memory Database Management System.” Today, the difference in performance between volatile memory (DRAM) and non-volatile storage devices (HDDs/SSDs) shapes the design of DBMSs. The key assumption has always been that the latter are much slower than the former, which affects every aspect of a DBMS’s runtime architecture. With the impending arrival of new non-volatile memory (NVM) storage that is almost as fast as DRAM and supports fine-grained reads and writes, the CMU researchers believe previous design choices are invalidated. The CMU team has already created a prototype NVM-aware DBMS: NVMRocks, an NVM-aware variant of the open-source RocksDB database. (Read their latest blog post to learn more.)
Jeremy Kepner of MIT Lincoln Laboratory, on “Enabling Scale-up, Scale-Out and Scale-Deep for Big Data” through MIT’s supercomputing operation, which he founded and directs. Dr. Kepner talked about how he and his team provide thousands of MIT researchers and scientists with the scalable supercomputing resources they need for their work. Their approach: mathematically rigorous, stable interfaces that shield users from the continual improvements in the underlying high-performance software and hardware. (Learn more in this recent interview with Dr. Kepner by insideHPC.)
Professor Carsten Binnig of Brown University, on “Revisiting Reuse in Main Memory Database Systems.” The Brown team has developed a novel, more efficient method for reusing intermediates in main memory database systems to speed up analytic query processing. The method focuses on hash tables, the internal data structure that main memory databases most commonly use for join and aggregation operations. Read more in the team’s paper.
At the post-program reception, ISTC for Big Data researchers and teams displayed posters describing their current research projects:
- “BigDAWG Polystore Release and Demonstration.” Kyle O’Brien, Vijay Gadepally, Jennie Duggan, Adam Dziedzic, Aaron Elmore, Samuel Madden, Timothy Mattson, Zuohao She, Michael Stonebraker. (Read the team’s paper for a look at what’s forthcoming in the release of BigDAWG code later this spring.)
- “Towards Sustainable Insights, or Why Polygamy is Bad for You.” Carsten Binnig, Lorenzo De Stefani, Tim Kraska, Eli Upfal, Emanuel Zgraggen, Zheguang Zhao. (Read the excellent summary of this paper on “The Morning Paper” blog.)
- “Data Ingestion for the Connected World.” John Meehan, Jiang Du, Cansu Aslantas, Nesime Tatbul, Stan Zdonik
- “Metronome: Real-time Data Management for Time Series Data.” Eric Metcalf, Cansu Aslantas, Stan Zdonik, John Meehan, Nesime Tatbul, Philipp Eichmann
- “Storage Approaches for Large Time Series Data.” Alex Galakatos, Andrew Crotty, Tim Kraska
- “Making the Case for Query-by-Voice with Echo Query.” Ying Su, Xiaocheng Wang, Carsten Binnig, Ugur Cetintemel, Nikhil Murgai
- “In-Database vs. External System Analytics on a Key-Value Store.” Dylan Hutchison, Bill Howe, Vijay Gadepally, Jeremy Kepner
- “SiliconDB: Rethinking Databases for Modern Heterogeneous, Co-Processor Environments.” Kayhan Dursun, Carsten Binnig, Ugur Cetintemel, Robert Petrocelli
- “Sculpin: Efficient Exploration of Multidimensional Datasets.” Leilani Battle
For short abstracts on these and other posters, go here.
Two keynote speakers helped set the pace and tone of NEDB Day 2017:
John Leonard of the MIT Department of Mechanical Engineering and MIT CSAIL kicked off NEDB Day 2017 with an informative and entertaining talk on “Self-Driving Vehicles, SLAM and Databases.” He acknowledged that, in spite of growing interest in the field of self-driving vehicles over the last several years, “fully autonomous driving remains extremely difficult, with a number of difficult open questions.” Most notable is how to get much better at planning for the unexpected, including the unpredictability of human behavior. Areas where database algorithms and technology might help with self-driving vehicles are mobile robotics and SLAM (Simultaneous Localization and Mapping), he said.
Professor David J. DeWitt of MIT CSAIL presented “Data Warehousing in the Cloud – The Death of Shared Nothing,” his analysis of why we’re entering the era of truly scalable data warehousing in the cloud.
Slides from NEDB Day 2017 keynotes and talks are available for download here.