Recognizing the Limits of Visualization
Visualization (viz) is an incredibly hot topic in the business analytics/data science (DS) world right now. In every job description, you’ll see phrases such as “proficient with Tableau” or “telling visual stories with data”. While visualization has its place in the data science skills stack, it’s important to recognize its limitations: human sight is imperfect, visual similarity does not imply correlation, and the halo effect can distort our judgment.
Human Sight is Imperfect
It may seem obvious that human sight is imperfect, but this fact has very important implications for the use (and abuse) of visualization.
There are actually two different layers here, a physical one and a digital one. From a physical perspective, our nervous systems are wired to find patterns in disparate data. This is how we recognize that a banana is not a pineapple, or pick out a single leaf in a large pile of them. However, this tendency to look for patterns sometimes leads us to perceive them in places where they do not exist. The term for this is apophenia; a great example is our habit of seeing faces in everyday inanimate objects. If you’ve ever looked at an electrical outlet and seen a concerned face staring back at you, you’ve just experienced apophenia.
We even go to some lengths to induce this response in ourselves, as the Honda ad below points out:
It is not hard to imagine why this is relevant to the issue of creating visualizations. Bad viz provides the end user with a lump of data and no guidance for how to interpret it. In the absence of such guidance, it is the human tendency to start looking for patterns, and to take those patterns as truth. As a result, bad viz will leave the user with unfounded or even unrelated conclusions about the subject matter.
From a digital perspective, it is important to recognize that we experience visualizations in a digital environment, with limited resolution. It’s been argued that the human eye, while having a very high technical resolution, can only attend to a small amount of data at once. In essence, while the eye is a large-format digital sensor, the brain can only analyze and interpret a camera-phone-sensor-sized portion of the visual field at any moment.
In addition to this physical limitation of resolution, the screens on which we view viz are constrained by their own resolution. Even assuming a viz scales properly to the display (unfortunately, this alone is a big ask), a 4K monitor gives you at most an 8.3 megapixel image at 1:1 scaling (and many people run 4K monitors with some degree of upscaling, since native rendering makes fonts quite tiny).
The physical resolution of the monitor panel comes into play when scaling data. If you’re looking at a viz that is 1000 px tall, then each individual pixel is 1/1000 of the total height of the viz.
Let’s say you were looking at a viz of GDP growth in the United States from founding through 2015. You would see a line stretching from zero to $17.947 trillion over the course of the viz. If it’s a bad viz, it won’t use any method of scaling that line to account for such large variation (a logarithmic scale, for example… or even using per capita GDP instead of gross GDP). Each of those thousand pixels would be worth roughly $18 billion! Today, $18 billion between friends is a rounding error on the chart, but in 1776 it would have dwarfed the young nation’s entire annual production!
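The arithmetic above can be sketched in a few lines. This is a rough illustration, assuming the 1,000 px viz height and the 2015 nominal US GDP figure (~$17.947 trillion) mentioned above; the $1 billion lower bound for the log axis is an arbitrary choice for the example.

```python
import math

# Rough sketch: how much data each pixel "hides" in a 1000 px tall GDP viz.
GDP_2015 = 17.947e12   # dollars, 2015 nominal US GDP as cited above
VIZ_HEIGHT_PX = 1000

# On a linear axis, each pixel covers a fixed dollar amount --
# anything smaller than this is literally invisible.
dollars_per_pixel = GDP_2015 / VIZ_HEIGHT_PX
print(f"Linear scale: each pixel spans ${dollars_per_pixel:,.0f}")

# On a log10 axis from $1 billion up to 2015 GDP, each pixel instead
# covers a fixed *ratio*, so early-era values remain distinguishable.
decades = math.log10(GDP_2015) - math.log10(1e9)
ratio_per_pixel = 10 ** (decades / VIZ_HEIGHT_PX)
print(f"Log scale: each pixel multiplies the value by {ratio_per_pixel:.5f}")
```

The point of the log-scale line is that a pixel near the bottom of the chart represents a much smaller absolute dollar amount than a pixel near the top, which is exactly what a series spanning several orders of magnitude needs.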
If we don’t remember that the way we perceive trends is colored by our biology and the technology we use to view viz, it’s very easy to fall into the trap of…
Visual Similarity Does Not Imply Correlation
If this saying sounds familiar, that’s because it is a take on “correlation does not imply causation”, which has well-meaning origins but has quickly become a catch-all retort for skepticism about any trend we perceive (but I digress!).
When it comes to viz, it’s important to be critical of the construction of the viz, because bad viz can make unrelated things look related.
The WTF-viz Tumblr provides a great example of this. In this German language visualization (apparently about the demographic attributes of personality), all segments of the pie chart are the same size, even though their stated proportions are different! The labelling is also not incredibly clear (but again, I am digressing!).
A casual viewer who didn’t look at the numbers, or who gave this viz only a cursory glance, might conclude that all personality traits have the same share. As one commenter puts it: “When 1% is the same size as 40%, you missed the point of a pie chart.”
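The commenter’s point is just proportionality: a slice’s angle should track its share. A minimal sketch (the percentages here are hypothetical, chosen only to match the commenter’s 1%-vs-40% contrast):

```python
def slice_angle(share_pct):
    """Degrees of arc a share should occupy in a 360-degree pie chart."""
    return 360 * share_pct / 100

# An honest pie chart draws these slices 40x apart in size.
print(slice_angle(40))  # 40% share -> 144.0 degrees
print(slice_angle(1))   # 1% share  -> 3.6 degrees
# Drawing both slices the same size misstates the data by a factor of 40.
```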
This is a pretty basic example, but more sophisticated visual techniques also fall prey to this problem. One common example comes from our ability to generate a regression line for literally any combination of data. Lines of best fit, or “trendlines”, are typically generated by regressing the observed data against a linear, logarithmic, or other mathematical model. These techniques always produce a result, but that result is not always (arguably, rarely) meaningful or useful for the data being analyzed.
Setting aside the absolute mess that p-values represent in the determination of statistical significance, low coefficients of determination (r-squared) are by far the more common source of unintentional malfeasance in a viz.
The Halo Effect
Simply put, the halo effect is the tendency for an observer’s judgment of a subject’s specific qualities to be colored by their overall impression of that subject. This does not always lead to undesirable results; past experiences are often very informative of future ones. In the case of visualizations, however, the primary negative impact of the halo effect comes when the production value of the visualization exceeds the value of its content, or vice versa.
It is now the case that, using tools such as d3.js (and the numerous libraries built on top of it) or Tableau, nearly anybody can put together an attractive, interactive visualization of their own. The attractiveness of a visualization, however, is in no way correlated with its accuracy, value, or merit as a tool for learning. Conversely, it is likely that some very unattractive visualizations have great insights inside of them. Arguably, the latter is a larger problem than the former.
Even Google is guilty of this. In a recent blog post, they reviewed their experience supporting the rollout of Pokemon Go alongside Niantic’s team. When it was first released, Pokemon Go was often inaccessible due to excessive load on the servers, presumably because of its massive and unexpected popularity. All in all, a good problem to have. In that post, they show the viz below, comparing the expected, planned, and actual levels of traffic they had to scale to support.
The biggest problem with this viz is that neither the X nor the Y axis is labeled. Presumably, the Y axis represents the scale of traffic, and the X axis the time since some event. But we don’t know, because there are no labels! The point is that there is a very different story to tell depending on the units. If this span represents a single hour, then it was truly shortsighted of Niantic to underestimate the popularity of their game so badly. If, however, the data spans weeks or months, it is largely an embarrassment to Google that they were unable to scale their infrastructure in an appropriate and timely manner.