In-Database Analytics for Large Array Data

By Jack Dongarra, Piotr Luszczek and Thomas Herault of the University of Tennessee, Knoxville
Performing analytics inside a database is becoming increasingly important. In the context of SciDB, the data model involves arrays that are either fully populated (dense) or contain empty entries (sparse). Very …

Posted in Big Data Architecture, DBMS, High-Performance Computing, ISTC for Big Data Blog, Math and Algorithms

Write-Behind Logging

By Joy Arulraj, Matthew Perron, and Andrew Pavlo, Carnegie Mellon University
In a collaboration between Carnegie Mellon University and Intel Labs, we explore the changes required in the logging and recovery algorithms of non-volatile memory database management systems (DBMSs). The results of this work …

Posted in Big Data Architecture, Data Management, DBMS, ISTC for Big Data Blog, Math and Algorithms, Storage

TicToc: Time Traveling Optimistic Concurrency Control

By Xiangyao Yu, MIT CSAIL; Andrew Pavlo, Carnegie Mellon; Daniel Sanchez and Srinivas Devadas, MIT CSAIL
The TicToc algorithm enables scalable and high-performing concurrency control for future multi- and many-core systems. Large-scale, highly parallel transaction processing systems can be built with TicToc. We …

Posted in Big Data Architecture, Data Management, DBMS, ISTC for Big Data Blog, Math and Algorithms

Interface Sharing between Data Storage and Analytics

By Jack Dongarra, Piotr Luszczek and Thomas Herault of the University of Tennessee Innovative Computing Laboratory
It is trite to say that traditional RDBMSs optimize data movement by bringing the query close to the data, not the other way around. …

Posted in Analytics, Big Data Architecture, ISTC for Big Data Blog, Math and Algorithms, Storage

Winning at Big Data: What’s Math Got to Do with It? (A Lot)

Big Data describes a new era in the digital age in which the volume, velocity and variety of data created across a wide range of fields – from Internet search and social media to finance and healthcare to defense and …

Posted in Databases and Analytics, Graph Computation, ISTC for Big Data Blog, Math and Algorithms, Tools for Big Data

ISTC for Big Data Researchers Present Work at NEDB Day 2015

ISTC for Big Data principal investigators (PIs) and researchers presented a broad base of research work at New England Database Day 2015, which was sponsored by Microsoft and held at the Stata Center at MIT in Cambridge, Mass., on Friday, …

Posted in Big Data Applications, Big Data Architecture, Computer Architecture, Data Management, DBMS, Graph Computation, ISTC for Big Data Blog, Math and Algorithms, Query Engines, Streaming Big Data

ISTC for Big Data: 2015 and Beyond

By Sam Madden, ISTC for Big Data Co-Director
Happy New Year, ISTC for Big Data blog readers! It’s been a busy year here at the ISTC for Big Data. In this post, I call out a few of our highlights …

Posted in Analytics, Big Data Applications, Big Data Architecture, Data Management, Databases and Analytics, DBMS, Graph Computation, ISTC for Big Data Blog, Math and Algorithms, Streaming Big Data, Visualizing Big Data

Fast Data Analysis with SVD

By Jack Dongarra, University of Tennessee Knoxville and Innovative Computing Laboratory
The GenBase benchmark was developed as a collaboration with the Intel Parallel Computing Lab, the Broad Institute and Novartis, and the MIT Database Group. Among the many challenging tests that the benchmark includes is a computation of the Singular Value …

Posted in Analytics, Benchmarks, Big Data Architecture, High-Performance Computing, ISTC for Big Data Blog, Math and Algorithms

Approximate Analytics: Keeping Pace with Big Data using Parallel Locality Sensitive Hashing

By Narayanan Sundaram and Nadathur Satish, Intel Parallel Computing Lab
We tend to think of big data problems as those involving finding a needle in a haystack. However, many big data problems are instead about estimating the shape or …

Posted in Analytics, Big Data Architecture, Databases and Analytics, ISTC for Big Data Blog, Math and Algorithms, Tools for Big Data

Accelerated Linear Algebra on Big Data

By Jack Dongarra, University of Tennessee Knoxville and Innovative Computing Laboratory
Big Data often brings massive amounts of computation. For example, gene correlations may be analyzed with the Singular Value Decomposition, as is done in the GenBase benchmark. The SVD algorithm is a robust method that …

Posted in Big Data Architecture, Computer Architecture, ISTC for Big Data Blog, Math and Algorithms