Pushing the Boundaries of Visual Interactive Analytics

 

Vizdom, a new system for interactive analytics, running on an interactive whiteboard. (Photo courtesy of Brown University Data Management Research Group.)

Vizdom lets data analysts visually compose complex workflows of machine learning and statistics operators on an interactive whiteboard, using pen and touch. (Photo from “Vizdom: Interactive Analytics through Pen and Touch,” Proceedings of the VLDB Endowment 8(12), August 2015.)

As the volume, variety and velocity of data grow, data analysts struggle with asking and answering big questions of the data – even with the availability of increasingly sophisticated data visualization tools. It takes far too long for analysts to get from visualization to answers or discoveries. Think of it as the “World Wide Wait” but for complex analytics.

Through a number of research projects, the ISTC for Big Data is “imagineering” the possibilities of faster, more-interactive and more-accessible data visualization. Three project teams presented papers describing their work at the recent VLDB 2015 and IEEE VIS 2015 conferences.

Finding the “Interestingness” of Data Visualizations

SeeDB is a visualization recommendation engine that answers the question: “What’s the most ‘interesting’ or ‘useful’ visualization of my data?” A simple question, right? But according to the authors of the SeeDB research paper, “Data analysts often build visualizations as the first step in their analytical workflow. However, when working with high-dimensional datasets, identifying visualizations that show relevant or desired trends in data can be laborious.”  

The SeeDB engine facilitates fast visual analysis of an identified subset of data. The authors explain: “Given a subset of data to be discovered, SeeDB intelligently explores the space of visualizations, evaluates promising visualizations for trends, and recommends those it deems most ‘interesting’ or ‘useful.’”

In building SeeDB, the researchers created new approaches and optimizations that overcome two key challenges:

  • scale, or being able to evaluate a large number of candidate visualizations while responding within interactive time frames; and
  • utility, identifying an appropriate metric (in this case, deviation) for assessing the “interestingness” of visualizations.
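To make the deviation idea concrete, here is a minimal sketch rather than SeeDB's actual implementation. It assumes that a candidate visualization is a (dimension, measure) pair, and that "deviation" means the distance between the normalized aggregate for the analyst's subset and the same aggregate over the full dataset. The L1 distance, the function names and the sample rows are all illustrative assumptions.

```python
from collections import defaultdict


def aggregate(rows, dimension, measure):
    """Sum `measure` per value of `dimension`, normalized to a distribution."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {key: value / grand_total for key, value in totals.items()}


def deviation(target_dist, reference_dist):
    """L1 distance between two distributions (an assumed choice of metric)."""
    keys = set(target_dist) | set(reference_dist)
    return sum(
        abs(target_dist.get(k, 0.0) - reference_dist.get(k, 0.0)) for k in keys
    )


def recommend(rows, is_target, dimensions, measures, top_k=3):
    """Rank (dimension, measure) views by how far the subset deviates."""
    target_rows = [row for row in rows if is_target(row)]
    scored = []
    for dim in dimensions:
        for meas in measures:
            score = deviation(
                aggregate(target_rows, dim, meas), aggregate(rows, dim, meas)
            )
            scored.append((score, dim, meas))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical sales data: product A sells mostly in the east, while the
# store/web split is the same for product A as for the whole dataset.
example_rows = [
    {"product": "A", "region": "east", "channel": "store", "sales": 30.0},
    {"product": "A", "region": "east", "channel": "web", "sales": 30.0},
    {"product": "A", "region": "west", "channel": "store", "sales": 5.0},
    {"product": "A", "region": "west", "channel": "web", "sales": 5.0},
    {"product": "B", "region": "east", "channel": "store", "sales": 10.0},
    {"product": "B", "region": "east", "channel": "web", "sales": 10.0},
    {"product": "B", "region": "west", "channel": "store", "sales": 30.0},
    {"product": "B", "region": "west", "channel": "web", "sales": 30.0},
]

ranking = recommend(
    example_rows,
    is_target=lambda row: row["product"] == "A",
    dimensions=["region", "channel"],
    measures=["sales"],
)
for score, dim, meas in ranking:
    print(f"{meas} by {dim}: deviation {score:.3f}")
```

In this toy data, "sales by region" ranks first because product A's regional split differs from the overall split, while "sales by channel" scores zero. The scale challenge is visible even here: the number of candidate views grows with every dimension and measure, and this naive loop rescans the data for each one.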

They’ve implemented SeeDB as a middleware layer that can run on top of any DBMS. Experiments show that the framework delivers both high accuracy and “multiple orders of magnitude speedup on relational and column store databases,” providing recommendations at “interactive time scales.”  

The researchers’ implementation of SeeDB incorporates an end-to-end data querying and visualization environment that “allows analysts to manually generate their own visualizations (like Tableau or Spotfire) or get data-driven recommendations on demand, which can be further refined using the manual interface.”

SeeDB was developed by a team that spans MIT CSAIL, the University of Illinois (UIUC) and Google.

Using Pen and Touch to Speed Complex Analytics

ISTC researchers at Brown University believe that high-powered Big Data analytics should be accessible to everyone – not just those rare people with deep domain knowledge and expertise in machine learning and statistical inference. The researchers continue to break down barriers between data/computer scientists and the rest of us.

For example, machine learning and advanced statistics are important tools for exploring and drawing insights from large datasets, say the Brown researchers, but several people often need to get involved to steer the computation toward meaningful results. At VLDB 2015, the Brown research team presented Vizdom, a new system for interactive analytics using pen and touch.

The Vizdom research team envisions “a complete paradigm shift in how data scientists conduct exploratory analytics.” Instead of today’s labor-intensive cycle of back-and-forth iterations between data scientists and domain experts, the researchers thought: “Why not let them work together on an interactive whiteboard to generate an initial solution that could be further refined offline?”

In their paper (which won the Best Demo Award at VLDB 2015), the authors explained: “Vizdom’s frontend allows users to visually compose complex workflows of ML and statistics operators on an interactive whiteboard, and the backend leverages recent advances in workflow compilation techniques to run these computations at interactive speeds. Additionally, we are exploring approximation techniques for quickly visualizing partial results that incrementally refine over time.”
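The idea of partial results that refine over time can be illustrated with a generic progressive aggregate. The sketch below is not Vizdom's or Tupleware's method; it assumes only that the data can be read in randomly ordered chunks, so that each intermediate estimate could be drawn immediately and replaced as more rows arrive. The function name and the sample data are hypothetical.

```python
import random


def progressive_mean(values, chunk_size):
    """Yield (rows_seen, running_mean) after each chunk is processed."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = 0.0
    seen = 0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        total += sum(chunk)
        seen += len(chunk)
        # Each yielded value is a partial result a front end could render.
        yield seen, total / seen


# Hypothetical data: shuffle so that early chunks behave like random samples.
values = list(range(1, 101))
random.Random(0).shuffle(values)

estimates = list(progressive_mean(values, chunk_size=25))
for rows_seen, estimate in estimates:
    print(f"after {rows_seen} rows: mean is about {estimate:.2f}")
```

The early estimates are approximate, and the last one equals the exact mean of the full dataset. That is the trade a progressive interface makes: a rough answer right away, then a precise one once the computation completes.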

Their VLDB 2015 demo showed how Vizdom allowed users to interactively build complex analytics workflows using a real-life medical dataset, the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC II) dataset.

Vizdom runs on Tupleware, a new back-end analytical framework built for the typical user.

A Recommendation-Powered Visualization Browser

At IEEE VIS 2015 in late October, researchers from the University of Washington’s Interactive Data Lab and Tableau Research presented Voyager, an easier, more interactive way to do broad data exploration in the early stages of data analysis.

Writing in their paper, the authors explained the problem with current tools: “General visualization tools typically require manual specification of views: analysts must select data variables and then choose which transformations and visual encodings to apply. These decisions often involve both domain and visualization design expertise, and may impose a tedious specification process that impedes exploration.”

They describe Voyager as “a mixed-initiative system that supports faceted browsing of recommended charts chosen according to statistical and perceptual measures. Voyager contributes a visualization recommender system (Compass) to power a novel browsing interface that exchanges manual chart specification for interactive browsing of suggested views [a gallery of automatically-generated visualizations].”

In a study comparing Voyager to a manual visualization specification tool, the authors found that Voyager facilitated exploration of previously unseen data and led to increased data variable coverage.  

Voyager, a recommendation-powered visualization browser. (Courtesy of the Interactive Data Lab, University of Washington.)


Voyager, Compass, the Vega visualization composition system, and other system components described in the paper are freely available as open source software.

SeeDB, Vizdom, Tupleware and Vega are part of BigDAWG, an ambitious new data management architecture and reference implementation for Big Data applications being built by the ISTC for Big Data. The four technologies contribute to the exploratory analysis layer of BigDAWG, a polystore system that scales to the analytical needs of Big Data.

Quick Links:

Research paper: “SEEDB: Efficient Data-Driven Visualization Recommendations to Support Visual Analytics.” Manasi Vartak, Sajjadur Rahman, Samuel Madden, Aditya Parameswaran, Neoklis Polyzotis

Research paper: “Vizdom: Interactive Analytics through Pen and Touch.” Andrew Crotty, Alex Galakatos, Emanuel Zgraggen, Carsten Binnig, Tim Kraska

Research paper: “Voyager: Exploratory Analysis via Faceted Browsing of Visualization Recommendations.” Kanit Wongsuphasawat, Dominik Moritz, Anushka Anand, Jock Mackinlay, Bill Howe, Jeffrey Heer

Research paper: “A Demonstration of the BigDAWG Polystore System.” A. Elmore, J. Duggan, M. Stonebraker, M. Balazinska, U. Cetintemel, V. Gadepally, J. Heer, B. Howe, J. Kepner, T. Kraska, S. Madden, D. Maier, T. Mattson, S. Papadopoulos, J. Parkhurst, N. Tatbul, M. Vartak, S. Zdonik

Article: “The Case for Polystores.” Communications of the ACM, July 13, 2015.

 
