Titles #
Titles should make a clear, declarative claim about the data.
The data visualization should be the evidence supporting this claim.
On the page dealing with the point of data visualizations, I argued that when you make a data visualization to share with another person, you are attempting to show meaning in numbers. This means that you have a story to tell, or some claim you are making about the meaning in the data. So, the guidance for how to title data visualizations is simple: state this claim.
Your title should make a claim #
It is worth exploring what this means, because there are many examples of it not being followed, which leads people to find it strange to make a claim.
Do not simply state what the data is #
A very common thing to see is a title that simply states what is being plotted. In science you might see a title like “UV-vis spectrum of <insert molecule>.” In journalism you will find titles like “change in housing prices over time” on a plot like that below.
I think titles like this are advocated for under the misconception that the point of a data visualization is to present data without bias. My response to this is found on the page regarding the point of data visualizations. But, in short, your goal should be to show people what you see in the data. Since this is the goal, you should use every tool you can to do this. Thanks to convention, people will read the title first. If you are not leveraging the title to make your point, you are wasting a golden opportunity.
Of course, this still requires understanding how to best leverage the title.
Do not ask a question #
Another idea you will see is to pose a question in the title. In science, you might see something like “Does <insert molecule> absorb in the NIR?” There are a number of problems with asking a question. For one, a person may not reach the conclusion you want them to. So, in order to ensure that they answer the question the way you are hoping, you need to make the question so painfully obvious that people might be insulted. For instance, we might turn the title for the above plot into a question: “Are housing prices in the USA increasing?”
If you try something more subtle, like “Is the housing price in the USA increasing exponentially?”, you might have people reach different conclusions.
Additionally, there is always the temptation to ask a question that the data is not really able to answer, which is also a bit frustrating. For instance, in trying to connect with the reader, you might ask a question like “Is now the right time to buy a house?” But this plot simply cannot answer this question. Sure, there is currently a dip in the housing price (at least at the end of this chart), but many factors other than price go into buying a house.
Despite these problems, asking questions in titles is still pretty common. I think the idea is that it invites the reader to explore the data, and therefore engage with the data more. However, I think in practice, it is less effective than making a claim. Making a claim implicitly poses a much more engaging question: “Do I agree with the creator of this data visualization?”
Making a claim #
When you make a claim, it should be as short as you can make it, and I recommend using a declarative sentence structure: noun-verb-object. In science, this might be “<insert molecule> absorbs in the NIR.” In journalism, it might be “Housing prices in the USA have increased over time.”
The advantage of this format is that the main point of the visualization is clear. Additionally, it guides the reader on what aspects of the data to focus on. Lastly, it assigns the reader a task: decide if they agree with the claim. This invites much more critical analysis of the visualization.
There are a few guidelines for generating a well-formed claim. They are:
- Your claim should be falsifiable. That is, it should be stated in such a way that it is logically possible for the viewer to disagree. So, tautologies are not good titles.
- Your claim should not require support from outside the data visualization. That is, the viewer should not need to look for information outside of the data that is labeled by the title.
- The data visualization should provide sufficient information for the viewer to make a judgement about your data. Combined with the guideline above, this means that the data visualization is both complete and sufficient. This last point is important, and so, let us consider it in more detail next.
Your data visualization should provide evidence supporting this claim #
One way to interpret this instruction is that you should only make claims that the data supports. While this is a good rule to follow, it overlooks another powerful aspect of making a claim about the data. If you make a claim, it will provide you guidance on how to ensure that you are designing the data visualization to support this claim.
Consider that we want to use a slightly different title for the data visualization we have been considering. Perhaps we want to make the claim “Housing prices in the USA increase exponentially over time.”
Looking at the figure above, we don’t really have sufficient evidence to make that judgement, which suggests we need to change our data visualization. Perhaps we can add additional evidence, such as a fit of an exponential curve to the data, and a measure of the goodness of fit. Though there are many options, one that many readers are familiar with is R^2, and so we can use that.
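As a rough sketch of what that extra evidence could look like, the snippet below fits y = a * exp(b * x) by running a straight-line least-squares fit on ln(y), then reports R^2 against the original values. The numbers are made up for illustration and are not real housing data, and the helper names `fit_exponential` and `r_squared` are hypothetical. A log-linear fit is only one of several ways to do this; a nonlinear least-squares fit would weight the points differently.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) via a straight-line fit to ln(y). Requires y > 0."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (lg - mean_log) for x, lg in zip(xs, logs))
    b = sxl / sxx                         # slope of the line in log space
    a = math.exp(mean_log - b * mean_x)   # intercept, mapped back from log space
    return a, b


def r_squared(ys, predicted):
    """Coefficient of determination, computed on the original (not log) values."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up illustrative values (assumption): years since start, price index
years = [0, 5, 10, 15, 20, 25, 30]
prices = [100, 128, 165, 210, 272, 345, 446]

a, b = fit_exponential(years, prices)
predicted = [a * math.exp(b * x) for x in years]
r2 = r_squared(prices, predicted)
print(f"price ~ {a:.1f} * exp({b:.4f} * year), R^2 = {r2:.3f}")
```

The fitted curve can then be drawn over the data, with the R^2 value placed on the plot, so that the viewer has what they need to judge the word “exponentially” in the title.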
Other considerations for titles #
There are just a few other things that one might consider when making a title.
Alignment #
A standard practice is to place the title in the top left, which is all well and good. However, there is one problem with this, from the perspective of alignment. We normally expect text to be left-aligned (which it is), but the plot already contains a very strong vertical guide to align against: the y-axis line. Thus, I think it makes more sense to move the title so that it is aligned with the axis.
This sort of alignment creates a firmer connection between the title and the data visualization it labels.
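If you happen to be plotting with matplotlib (an assumption; the page does not name a plotting tool), one way to get this alignment is the `loc="left"` option of `Axes.set_title`, which anchors the title at the left edge of the axes, where the y-axis line sits. The data below is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up illustrative values (assumption), not real housing data
years = [0, 5, 10, 15, 20, 25, 30]
prices = [100, 128, 165, 210, 272, 345, 446]

fig, ax = plt.subplots()
ax.plot(years, prices)
ax.set_xlabel("Years since start")
ax.set_ylabel("Price index")

# loc="left" left-aligns the title with the left edge of the axes
# instead of centering it over the plot
title = ax.set_title(
    "Housing prices in the USA have increased over time",
    loc="left",
)
```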
Color #
Another place that you can draw a better connection between the title and the data is to color-code the words in the title to the data it refers to. For instance, “housing prices” refers to the data represented by the blue trace, while “exponentially” refers to the orange fit line. We can recolor the words in the title accordingly.
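Plotting libraries do not always offer a multi-colored title directly. As one possible sketch, assuming matplotlib, the title can be assembled from separately colored pieces using `TextArea`, `HPacker`, and `AnchoredOffsetbox` from `matplotlib.offsetbox`. The data, the fit parameters (100 and 0.05), and the specific colors are illustrative assumptions.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.offsetbox import AnchoredOffsetbox, HPacker, TextArea

# Made-up illustrative values (assumption), not real housing data
years = [0, 5, 10, 15, 20, 25, 30]
prices = [100, 128, 165, 210, 272, 345, 446]
fit = [100 * math.exp(0.05 * x) for x in years]  # hypothetical fit parameters

fig, ax = plt.subplots()
ax.plot(years, prices, color="tab:blue")    # the data trace
ax.plot(years, fit, color="tab:orange")     # the exponential fit line

# Each piece of the title carries the color of the trace it refers to
pieces = [
    ("Housing prices", "tab:blue"),
    ("in the USA increase", "black"),
    ("exponentially", "tab:orange"),
    ("over time", "black"),
]
words = [TextArea(text, textprops=dict(color=color, size=12)) for text, color in pieces]

# Pack the pieces side by side and anchor the row just above the
# top-left corner of the axes, in line with the y-axis
row = HPacker(children=words, align="baseline", pad=0, sep=4)
title_box = AnchoredOffsetbox(
    loc="lower left",
    child=row,
    pad=0,
    borderpad=0,
    frameon=False,
    bbox_to_anchor=(0, 1.02),
    bbox_transform=ax.transAxes,
)
ax.add_artist(title_box)
```

Because the pieces are laid out as separate text boxes, the spacing between them (`sep`) may need adjusting by eye for a given font size.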
Further reading (a.k.a. evidence for my claims) #
The basic format suggested here is inspired by a paradigm in scientific presentations called “Assertion-Evidence.” And there is an entire website devoted to it. It was developed, in part, by an engineering professor at Penn State, Michael Alley, who has also written multiple papers and a book on the topic. His work focuses on the assertion-evidence model in the context of presentations (which I highly recommend using), but I find that it also provides a great framework for thinking about titles for data visualizations.