Higher-Level Tools for Interactive Data Visualization

By Jeffrey Heer, Arvind Satyanarayan, Kanit Wongsuphasawat & Dominik Moritz of the UW Interactive Data Lab

In popular conversation, discussions of data visualization often involve specialized interactive graphics – typically intended to convey a story – that were hand-crafted by skilled designers. Over our years working in visualization, a central goal of our group has been to build tools that help designers craft sophisticated graphics, leading us to create tools such as Prefuse, Protovis and D3.js.

In the grand scheme of things, however, finely honed hand-coded visualizations are the exception, not the rule. Instead, the vast majority of the world’s visualizations are produced using end-user software, most notably spreadsheet applications. While valuable, these tools are often built by well-intentioned engineers who lack familiarity with effective visualization design and with how to support the iterative process of data analysis. Better tools are needed to help everyone, not just skilled designers, create effective visualizations and better understand their data.

As part of our ISTC-supported research, we’ve been exploring new approaches for creating interactive tools that better leverage perception and support analysis. The overarching goal is to build up an ecosystem of usable and interoperable tools that can support use cases ranging from rapid exploration of large data sets to custom design work to communicate insights. This goal has led us to develop a stack of higher-level tools for interactive data visualization.

At the foundation of this stack is Vega, a declarative visualization grammar. Similar in spirit to how SQL provides a language for expressing database queries, Vega provides a language for describing visualizations. Vega specifications state the data transformations and visual encoding rules needed to express a rich space of visualizations. Building on lower-level tools such as D3, the Vega runtime parses specifications in a JSON format to produce interactive web-based graphics.
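To give a flavor of what such a specification looks like, the sketch below builds a small bar chart description as a Python dictionary and serializes it to JSON. The property names follow the general shape of Vega 2-era examples, but they are written here for illustration and may differ between Vega versions; the data values are invented.

```python
import json

# Illustrative, Vega-style bar chart specification.
# Assumption: property names mirror Vega 2-era examples and may not
# match other versions of the format. The data values are made up.
spec = {
    "width": 400,
    "height": 200,
    # Data: a named table of inline records.
    "data": [
        {
            "name": "table",
            "values": [
                {"category": "A", "amount": 28},
                {"category": "B", "amount": 55},
                {"category": "C", "amount": 43},
            ],
        }
    ],
    # Scales: map data values to visual ranges (pixels).
    "scales": [
        {
            "name": "x",
            "type": "ordinal",
            "range": "width",
            "domain": {"data": "table", "field": "category"},
        },
        {
            "name": "y",
            "type": "linear",
            "range": "height",
            "domain": {"data": "table", "field": "amount"},
        },
    ],
    "axes": [{"type": "x", "scale": "x"}, {"type": "y", "scale": "y"}],
    # Marks: one rectangle per record, positioned through the scales.
    "marks": [
        {
            "type": "rect",
            "from": {"data": "table"},
            "properties": {
                "enter": {
                    "x": {"scale": "x", "field": "category"},
                    "width": {"scale": "x", "band": True, "offset": -1},
                    "y": {"scale": "y", "field": "amount"},
                    "y2": {"scale": "y", "value": 0},
                },
                "update": {"fill": {"value": "steelblue"}},
            },
        }
    ],
}

# The runtime consumes JSON, so the dictionary is serialized as text.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The point of the sketch is the division of labor: the specification names the data, the scales, and the encoding rules, and says nothing about how the drawing is carried out.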

In recent work presented at the ACM UIST 2014 conference, we extended Vega to also support the declarative specification of interactive techniques. By treating user input (mouse movements, touch events, etc.) as a first-class streaming data source, we can formulate a unified model that augments a familiar grammar of graphics with a grammar of interaction. These extensions, along with more general support for streaming data, are now being realized in the Vega 2.0 system.
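The idea of treating input events as a data stream can be sketched conceptually as follows. This is not Vega's syntax or API; the `Event` class, `brush_signal`, and `selected` are hypothetical names used only to show how a stream of mouse events might be folded into a brush extent that then drives which points are selected.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple

Extent = Tuple[float, float]


@dataclass(frozen=True)
class Event:
    """A hypothetical input event: its kind and its x position."""
    kind: str  # "mousedown", "mousemove", or "mouseup"
    x: float


def brush_signal(events: Iterable[Event]) -> Iterator[Optional[Extent]]:
    """Fold an event stream into a stream of brush extents (lo, hi)."""
    start: Optional[float] = None
    extent: Optional[Extent] = None
    for ev in events:
        if ev.kind == "mousedown":
            # A press starts a new, empty brush at the cursor.
            start = ev.x
            extent = (ev.x, ev.x)
        elif ev.kind == "mousemove" and start is not None:
            # Dragging stretches the brush between start and cursor.
            extent = (min(start, ev.x), max(start, ev.x))
        elif ev.kind == "mouseup":
            # Releasing ends the drag but keeps the last extent.
            start = None
        yield extent


def selected(points: List[float], extent: Optional[Extent]) -> List[float]:
    """Return the points that fall inside the current brush extent."""
    if extent is None:
        return []
    lo, hi = extent
    return [p for p in points if lo <= p <= hi]


events = [
    Event("mousedown", 2.0),
    Event("mousemove", 5.0),
    Event("mousemove", 1.0),
    Event("mouseup", 1.0),
]
extents = list(brush_signal(events))
print(extents)  # [(2.0, 2.0), (2.0, 5.0), (1.0, 2.0), (1.0, 2.0)]
print(selected([0.5, 1.5, 3.0, 4.5], extents[-1]))  # [1.5]
```

Because the brush extent is just another derived value, it can feed visual encoding rules in the same way as a value computed from a data table.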

Vega is useful in its own right; for example, it is now used on Wikipedia to define visualizations directly within a wiki page. However, our primary motivation is for Vega to serve as a foundation for even higher-level tools. Vega provides a formal language (and file format) for representing and reasoning about visualizations.

Figure 1: William Playfair’s classic chart comparing the price of wheat and wages in England, recreated in the Lyra visualization design environment. (Courtesy of Jeffrey Heer, University of Washington, et al.)

Lyra is one example of a novel tool built on top of Vega (see Figure 1). Lyra is an interactive environment that enables custom visualization design without writing any code. Graphical marks can be bound to data fields using property drop zones; dynamically positioned using connectors; and directly moved, rotated, and resized using handles. Lyra is more expressive than interactive systems like Tableau, allowing designers to create custom visualizations comparable to hand-coded visualizations built with D3 or Processing. These visualizations, realized as Vega specifications, can then be easily published and reused on the Web.

Moving from communication graphics to exploratory analysis, we are now developing tools that actively guide users and suggest useful visualizations. Vega-lite is a simple domain-specific language for concise specifications of common statistical graphics, which are then compiled to full Vega specifications. Vega-lite provides a particularly convenient abstraction for enumerating and searching over a space of possible visualizations.
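A rough sketch of why a compact specification format makes enumeration convenient: when a chart is described by a handful of choices (a mark type and a field per axis), the space of candidates is simply the product of those choices. The dictionary layout, field names, and mark list below are hypothetical shorthand for illustration, not actual Vega-lite syntax.

```python
from itertools import permutations
from typing import Dict, Iterator, List

# Hypothetical data schema: field name -> data type.
FIELDS: Dict[str, str] = {
    "horsepower": "quantitative",
    "mpg": "quantitative",
    "origin": "nominal",
}

# Hypothetical set of mark types to consider.
MARKS: List[str] = ["point", "bar", "line"]


def enumerate_specs(fields: Dict[str, str], marks: List[str]) -> Iterator[dict]:
    """Yield one compact chart description per (x, y, mark) combination."""
    for x, y in permutations(fields, 2):  # ordered pairs of distinct fields
        for mark in marks:
            yield {
                "mark": mark,
                "x": {"field": x, "type": fields[x]},
                "y": {"field": y, "type": fields[y]},
            }


specs = list(enumerate_specs(FIELDS, MARKS))
print(len(specs))  # 18: 6 ordered field pairs x 3 mark types
print(specs[0])
```

Each candidate is small enough to generate and compare cheaply, and only the candidates worth showing would need to be compiled into full specifications.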

Figure 2: Voyager: a recommendation-powered visualization browser. The schema panel (left) lists data variables selectable by users. The main gallery (right) presents suggested visualizations of different variable subsets and data transformations. (Courtesy of Jeffrey Heer, University of Washington, et al.)

Our Voyager system (see Figure 2) uses Vega-lite to consider thousands of charts in parallel, rank them using both statistical and perceptual measures, and present top-ranked examples as recommended views. Rather than having users manually construct charts, Voyager presents a gallery of suggested visualizations. Users can interactively steer the recommendations, for example by selecting particular data variables of interest. In studies of early-stage data analysis, we found that this mode of exploration significantly increases data coverage compared to state-of-the-art tools for manual specification. Moreover, by leveraging the Vega tool stack, any visualization in Voyager can be exported for further editing, including design customization within Lyra.
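The overall loop of filtering candidates by the variables a user selects, scoring them, and showing the top-ranked few can be sketched as below. The scoring table is a made-up stand-in for illustration only; it is not Voyager's ranking model, and `MARK_SCORE`, `score`, and `recommend` are hypothetical names.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical effectiveness scores per (x type, y type) and mark.
# These numbers are invented and do not come from Voyager.
MARK_SCORE: Dict[Tuple[str, str], Dict[str, float]] = {
    ("quantitative", "quantitative"): {"point": 1.0, "line": 0.6, "bar": 0.3},
    ("nominal", "quantitative"): {"bar": 1.0, "point": 0.7, "line": 0.2},
}


def score(chart: dict) -> float:
    """Look up a toy score; unknown combinations score 0."""
    key = (chart["x_type"], chart["y_type"])
    return MARK_SCORE.get(key, {}).get(chart["mark"], 0.0)


def recommend(candidates: List[dict], selected: Set[str], k: int = 2) -> List[dict]:
    """Keep charts that use every selected variable, then take the top k."""
    relevant = [c for c in candidates if selected <= {c["x"], c["y"]}]
    return sorted(relevant, key=score, reverse=True)[:k]


candidates = [
    {"mark": "point", "x": "horsepower", "y": "mpg",
     "x_type": "quantitative", "y_type": "quantitative"},
    {"mark": "bar", "x": "horsepower", "y": "mpg",
     "x_type": "quantitative", "y_type": "quantitative"},
    {"mark": "bar", "x": "origin", "y": "mpg",
     "x_type": "nominal", "y_type": "quantitative"},
    {"mark": "line", "x": "origin", "y": "mpg",
     "x_type": "nominal", "y_type": "quantitative"},
]

# Steering: the user selects "origin", so only charts using it remain.
top = recommend(candidates, {"origin"}, k=1)
print(top[0]["mark"], top[0]["x"], top[0]["y"])  # bar origin mpg
```

Selecting a variable narrows the candidate set rather than building a chart directly, which is the sense in which the user steers the recommendations instead of specifying each view by hand.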

Going forward, many challenges remain. Active research questions include how to improve (and best evaluate) visualization recommender systems, and how to better leverage Vega’s declarative nature to automatically optimize processing and enable scalable visualization.

These tools are freely available as open-source projects at vega.github.io. We hope these systems will prove valuable to the visualization community, and we welcome your feedback and contributions.

Interested in learning more? See Jeff’s 10-minute visualization keynote at the O’Reilly Strata conference or see his longer OpenVis Conference keynote about the tools research we are conducting in the Big Data ISTC.

This entry was posted in ISTC for Big Data Blog, Visualizing Big Data.
