ISTC for Big Data 2016 Research Highlights

In 2016, ISTC for Big Data principal investigators, researchers and their students continued to break down the barriers to data analytics at scale, with creative new approaches and infrastructure software. Developments are being integrated into BigDAWG, the next-generation polystore architecture …

Posted in Big Data Architecture, Data Management, ISTC for Big Data Blog, Polystores

Write-Behind Logging

By Joy Arulraj, Matthew Perron, and Andrew Pavlo, Carnegie Mellon University

In a collaboration between Carnegie Mellon University and Intel Labs, we explore the changes required in the logging and recovery algorithms of non-volatile memory database management systems (DBMSs). The results of this work …

Posted in Big Data Architecture, Data Management, DBMS, ISTC for Big Data Blog, Math and Algorithms, Storage

Polystore Databases to be Examined at IEEE, CIDR Conferences

Polystores, a more modern approach to sharing heterogeneous data that addresses Big Data’s volume, variety and velocity demands, will be the topic of discussion at two upcoming conferences: the first IEEE Workshop on Methods to Manage Heterogeneous Big Data and Polystore Databases, …

Posted in Big Data Applications, Big Data Architecture, Data Management, ISTC for Big Data Blog, Polystores, Tools for Big Data

PipeGen: A Data Pipe Generator for Hybrid Analytics

By Brandon Haynes, Alvin Cheung, and Magdalena Balazinska, University of Washington

As the number of big data management systems continues to grow, users increasingly seek to leverage multiple systems in the context of a single data analysis task. A critical challenge …

Posted in Big Data Applications, Big Data Architecture, Data Management, ISTC for Big Data Blog, Polystores

Analytic Monitoring for the Internet of Things

By Peter Bailis, Stanford Infolab, and Sam Madden, MIT CSAIL

An increasing proportion of data today is generated by automated processes, sensors and devices, collectively called the Internet of Things (IoT). Inexpensive hardware, widespread access to communication networks, and decreased …

Posted in Analytics, Big Data Applications, Big Data Architecture, Databases and Analytics, ISTC for Big Data Blog, Streaming Big Data

Genomics Data, Analytics and the Future of Climate Change

By Vijay Gadepally, MIT CSAIL, in collaboration with the Chisholm Laboratory at MIT

Meet Prochlorococcus marinus, a marine cyanobacterium that is intricately linked to the global carbon cycle, is widely present in seawater, and possibly holds secrets to future climate change. These …

Posted in Big Data Applications, Big Data Architecture, Data Management, Databases and Analytics, DBMS, Graph Computation, ISTC for Big Data Blog, Polystores, Streaming Big Data, Tools for Big Data, Visualizing Big Data

ModelDB: A System for Managing Machine Learning Models

By Manasi Vartak, Harihar Subramanyam, Wei-En Lee, Srinidhi Viswanathan, Saadiyah Husnoo, Sam Madden and Matei Zaharia, MIT CSAIL

Building real-world machine learning (ML) algorithms is an iterative process. A data scientist will build tens to hundreds of models before arriving …

Posted in Big Data Applications, Big Data Architecture, ISTC for Big Data Blog, Tools for Big Data

Larger-than-Memory Data Management on Modern Storage Hardware for In-Memory OLTP Database Systems

By Lin Ma, Carnegie Mellon University; Joy Arulraj, Carnegie Mellon University; Sam Zhao, Brown University; Andrew Pavlo, Carnegie Mellon University; Subramanya R. Dulloor, Intel Labs; Michael J. Giardino, Georgia Institute of Technology; Jeff Parkhurst, Jason L. Gardner, Kshitij Doshi, Intel Labs; and Col. Stanley Zdonik, Brown …

Posted in Big Data Architecture, Data Management, DBMS, ISTC for Big Data Blog, Storage

TicToc: Time Traveling Optimistic Concurrency Control

By Xiangyao Yu, MIT CSAIL; Andrew Pavlo, Carnegie Mellon; Daniel Sanchez and Srinivas Devadas, MIT CSAIL

The TicToc algorithm enables scalable and high-performing concurrency control for future multi- and many-core systems. Large-scale, highly parallel transaction processing systems can be built with TicToc. We …

Posted in Big Data Architecture, Data Management, DBMS, ISTC for Big Data Blog, Math and Algorithms

PolyPEG: A Proposal for Polystore Optimization

By Dylan Hutchison, Bill Howe, Dan Suciu, and Zachary Tatlock, University of Washington

There has been a “Cambrian explosion” of systems and languages for large-scale data analytics: Postgres and H-Store accept SQL queries; Datomic and Myria accept Datalog; SciDB accepts …

Posted in Big Data Applications, Big Data Architecture, ISTC for Big Data Blog, Polystores, Tools for Big Data