Automating Visualization Design with Perceptual Kernels

By Çağatay Demiralp and Jeffrey Heer

The question of how to visually encode data values is central to visualization design. Because viewers’ interpretations of data can shift across encodings, it is important to understand how choices of visual encoding variables such as color, shape, and size affect those interpretations. Quantifying these trade-offs can help automate the creation of visualizations that better reflect patterned structures (relations) in data, which is critical given the ever-increasing size and complexity of the data we would like to understand.

Perceptual kernels (distance matrices derived from aggregate perceptual similarity judgments) provide a useful operational model to this end, incorporating empirical perception data directly into visualization design in a reusable form.

Putting Numbers on Perceptual Differences

So, how can we derive a numerical value for the perceptual difference between two visual variables? In our recent ISTC-sponsored research, we conducted a set of crowdsourced experiments to measure perceived differences between visual encoding variables of shape, size, color, and combinations thereof. Figure 1 shows the six palettes used in our experiments. We used color and shape stimuli from palettes in Tableau.


Figure 1: Palettes of visual stimuli used in our experiments: shape, color, size, shape-color, shape-size, size-color.

Through these experiments, we elicited subjective measures of judged similarity, which we refer to as perceptual distances. In this context, a perceptual kernel is the distance matrix of aggregated pairwise perceptual distances. Figure 2 shows a perceptual kernel for a set of plotting symbols; distances are visualized using grayscale values, with darker cells indicating higher similarity. The prominent clusters suggest that users will perceive similarities among shapes that may or may not mirror encoded data values.


Figure 2: (Left) A crowd-estimated perceptual kernel for a shape palette. Darker entries indicate perceptually closer (similar) shapes. (Right) A two-dimensional projection of the palette shapes obtained via multidimensional scaling of the perceptual kernel. You can interactively explore all the perceptual kernels collected.
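
To make the definition of a perceptual kernel concrete, here is a minimal sketch of how pairwise similarity ratings could be averaged into a symmetric distance matrix. The input format (tuples of two palette indices and a 1-to-5 similarity rating), the function name build_kernel, and the linear rescaling to the range [0, 1] are all assumptions made for illustration; they are not the data layout or aggregation procedure used in the study.

```python
from collections import defaultdict


def build_kernel(judgments, n, max_rating=5):
    """Average pairwise similarity ratings into an n x n distance matrix.

    judgments: iterable of (i, j, rating) tuples, where rating is a
    similarity score on a 1..max_rating scale (hypothetical format).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, j, rating in judgments:
        key = (min(i, j), max(i, j))  # treat (i, j) and (j, i) as one pair
        # Convert similarity to a distance in [0, 1]: most similar -> 0.
        sums[key] += (max_rating - rating) / (max_rating - 1)
        counts[key] += 1

    kernel = [[0.0] * n for _ in range(n)]
    for (i, j), total in sums.items():
        kernel[i][j] = kernel[j][i] = total / counts[(i, j)]
    return kernel


# Five made-up ratings over a three-element palette.
judgments = [(0, 1, 5), (1, 0, 4), (0, 2, 1), (1, 2, 2), (2, 1, 1)]
kernel = build_kernel(judgments, n=3)
for row in kernel:
    print(row)
```

In this toy input, elements 0 and 1 are rated as highly similar and therefore end up with a small distance, while element 2 is far from both.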

What Are Perceptual Kernels Useful For?

Perceptual kernels can help automate visualization design by enabling the direct application of empirical perception data within visualization tools. Typically, automated design methods leverage an effectiveness ranking of visual encoding variables with respect to data types (nominal, ordinal, quantitative). Once a visual variable is chosen, however, these methods provide little guidance on how to best pair data values with visual elements, instead relying on default palettes for variables such as color and shape. Perceptual kernels provide a means for computing optimized assignments to visual variables whose perceived differences are congruent with underlying distances among data points.

Here are two examples of how perceptual kernels can be used:

Automatically Designing New Palettes

Given an estimated perceptual kernel, we can use it to revisit existing palettes. For example, we can create a new palette that maximizes perceptual distance, better promoting discrimination among visual elements.

Figure 3 shows both original and re-ordered palettes for shape, color, and size variables. It is instructive to compare the re-ordered shape palette with the two-dimensional projection of the shape kernel (Figure 2). For example, the first four elements in the re-ordered shape palette include representatives from each of the four clusters seen in the two-dimensional projection of the kernel in Figure 2.


Figure 3: Shape, color and size palettes: (Top) original palettes and (Bottom) palettes re-ordered to maximize perceptual discriminability according to the triplet-matching kernels.
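
One way such a re-ordering could be computed is a greedy farthest-point pass over the kernel: start with the two most distant elements, then repeatedly append the element whose smallest distance to those already chosen is largest. This post does not spell out the exact procedure, so the sketch below is an assumption rather than the method behind Figure 3; the function name reorder_palette and the toy kernel are hypothetical.

```python
def reorder_palette(kernel):
    """Greedy max-min re-ordering of palette indices (needs >= 2 elements)."""
    n = len(kernel)
    # Seed the ordering with the most distant pair in the kernel.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: kernel[pair[0]][pair[1]],
    )
    order = [first, second]
    remaining = set(range(n)) - {first, second}
    while remaining:
        # Pick the element farthest from its nearest already-chosen element.
        # Sorting makes tie-breaking deterministic (lowest index wins).
        nxt = max(
            sorted(remaining),
            key=lambda c: min(kernel[c][s] for s in order),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Toy kernel with two clusters: {0, 1} and {2, 3}.
toy_kernel = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.7],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.7, 0.2, 0.0],
]
order = reorder_palette(toy_kernel)
print(order)
```

With this toy kernel the first two elements of the new ordering come from different clusters, which mirrors the observation above that the leading elements of a re-ordered palette tend to represent distinct clusters.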

Visual Embedding: Perceptual Painting of Data Relation

Visual embedding has recently been proposed as a model for automatically generating and evaluating visualizations. It is based on the premise that good visualizations should convey structures or relations in data (e.g., similarities and distances) with corresponding, proportional perceptual relations. For example, if two data values are similar to each other in some sense (user-defined or otherwise), they should be encoded with perceptually similar values of visual variables, and vice versa. The underlying assumption, however, is that the degrees of perceptual affinity between and within visual encoding variables are available to us. This is exactly what perceptual kernels provide. Perceptual kernels can therefore guide visual embedding to choose encodings that preserve data-space distances in terms of kernel-defined perceptual distances.

In the following graph visualization (Figure 4), we use visual embedding to encode community clusters in a character co-occurrence graph derived from Victor Hugo’s novel Les Misérables. Automatically chosen node shapes and colors reflect strengths of interconnectivity between community clusters. For example, the high perceptual similarity between red right-triangle and orange left-triangle suggests that the two community clusters they encode are highly connected. Conversely, the low perceptual similarity between brown down-triangle and cyan asterisk indicates that the respective encoded communities are weakly interconnected.

Figure 4: Graph of character co-occurrences in Les Misérables, with node colors and shapes automatically chosen via visual embedding to reflect connection strengths between community clusters.
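
The idea of matching data-space distances to kernel-defined perceptual distances can be illustrated with a small exhaustive search. The sketch below assigns each data item a distinct palette element so that the sum of squared differences between data distances and perceptual distances is as small as possible. The squared-error objective, the brute-force search, the function name embed, and the toy matrices are all assumptions chosen for illustration; this is not the optimization used to produce Figure 4, and exhaustive search is only practical for very small palettes.

```python
from itertools import permutations


def embed(data_dist, kernel):
    """Assign each data item a distinct palette element by exhaustive search.

    Returns a tuple whose i-th entry is the palette index for data item i.
    Both inputs are symmetric distance matrices scaled to comparable ranges.
    """
    n_items = len(data_dist)

    def stress(assign):
        # Sum of squared mismatches between data and perceptual distances.
        return sum(
            (data_dist[i][j] - kernel[assign[i]][assign[j]]) ** 2
            for i in range(n_items)
            for j in range(i + 1, n_items)
        )

    return min(permutations(range(len(kernel)), n_items), key=stress)


# Three hypothetical communities: 0 and 1 are strongly connected (small
# distance), while 2 is weakly connected to both.
data_dist = [
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.7],
    [0.9, 0.7, 0.0],
]
# Toy four-element kernel with two perceptual clusters: {0, 1} and {2, 3}.
toy_kernel = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.7],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.7, 0.2, 0.0],
]
assignment = embed(data_dist, toy_kernel)
print(assignment)
```

Here the two strongly connected communities receive palette elements from the same perceptual cluster, and the weakly connected community receives an element from the other cluster.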

What Is the Best Way to Collect Perceptual Kernels?

There are alternative task designs for collecting judged similarity measurements. One of the goals of our current research is to understand the trade-offs among these alternatives. We therefore compared a variety of judgment types: Likert ratings among pairs, ordinal triplet comparisons, and manual spatial arrangement (you can try out the experiments for the shape palette by following the given links below).

We found that ordinal triplet matching judgments provided the most consistent results, while spatial arrangement performed poorly in most of the criteria we considered. Our analysis is relevant to the general problem of crowdsourcing similarity models, providing new evidence in support of triplet matching.
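
For readers unfamiliar with triplet judgments, the sketch below shows one simple way they could be turned into a kernel. Each judgment names a reference element, the option judged closer to it, and the option judged farther. The distance for a pair is then one minus the fraction of times that pair was picked as the closer match when it was shown together. This counting scheme, the tuple format, and the function name kernel_from_triplets are assumptions for illustration and may differ from the aggregation used in the study.

```python
def kernel_from_triplets(triplets, n):
    """Estimate an n x n distance matrix from triplet matching judgments.

    triplets: iterable of (ref, chosen, other), meaning `chosen` was judged
    more similar to `ref` than `other` was (hypothetical format).
    """
    wins = [[0] * n for _ in range(n)]   # times a pair was picked as closer
    shown = [[0] * n for _ in range(n)]  # times a pair was offered at all
    for ref, chosen, other in triplets:
        for cand in (chosen, other):
            shown[ref][cand] += 1
            shown[cand][ref] += 1
        wins[ref][chosen] += 1
        wins[chosen][ref] += 1

    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if shown[i][j] == 0:
                # Assumption: pairs never shown get a neutral distance.
                kernel[i][j] = 0.5
            else:
                kernel[i][j] = 1.0 - wins[i][j] / shown[i][j]
    return kernel


# Five made-up judgments over a three-element palette.
triplets = [(0, 1, 2), (0, 1, 2), (0, 2, 1), (1, 0, 2), (2, 1, 0)]
triplet_kernel = kernel_from_triplets(triplets, n=3)
for row in triplet_kernel:
    print(row)
```

Because each judgment only asks which of two options is closer, the task is ordinal, and absolute distances emerge only after aggregating many judgments.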

The poor performance of spatial arrangement has implications for existing visual analytics tools. Semantic interaction systems use spatial arrangement tasks to elicit domain expertise to drive modeling and layout. Our results suggest that this mode of interaction may engender significant variation among experts and provide insufficient expressiveness for high-dimensional relations. Such tools may benefit by incorporating alternative similarity judgment tasks.

Moving Forward

Integrating perceptual kernels into visualization design tools is an important next step. Towards this end, we have made our perceptual kernels and experiment source code publicly available at­kernels.  While we focused on specific shape, color, and size palettes in the current study, we plan to incorporate additional stimuli in each of these perceptual channels. In the meantime, perceptual kernels provide a useful operational model for incorporating empirical perception data directly into visualization design tools.

We will be presenting a paper on this work at the IEEE InfoVis 2014 conference in Paris, on Thursday, November 13, at 16:15.


Çağatay Demiralp is a postdoctoral scholar in computer science at Stanford University and member of Interactive Data Lab at the University of Washington.

Jeffrey Heer is an Associate Professor of Computer Science at the University of Washington.
