# Are scatterplots too complex for lay folks?

Usually, I like to write about the solutions to problems I’ve had, but today I only have a problem to write about.

This is the second research job I’ve had outside of academia, and in both cases I’ve met with resistance when I’ve tried to display bivariate relations using a scatterplot. For example, a colleague came past my work computer yesterday while I had a scatterplot with a linear trend line on screen. She looked at the plot and blurted out, “Wow, that looks complicated!”

One option that I’ve tried in the past, where I’d normally use a scatterplot, is to discretize (bin) the x variable and show the average y value for each range of x. This can certainly help to show the direction of the relation, but it hides the number of data points that go into the averages. I like being able to see the diagonal, stick-like orientation of a dot cloud signalling a strong correlation, or the loose, circular orientation that signals no apparent correlation.
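For concreteness, the binning approach can be sketched in a few lines of Python. This is only an illustration: the function name `binned_means`, the choice of equal-width bins, and the data are all made up for the example. Returning the count alongside each average is one way to keep the information that a plain bar chart of averages hides.

```python
from statistics import mean


def binned_means(xs, ys, n_bins=4):
    """Split the x range into equal-width bins.

    Returns one (lower_edge, upper_edge, n_points, mean_y) tuple per bin.
    mean_y is None for a bin that contains no points.
    """
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("x has no spread, so it cannot be binned")
    width = (hi - lo) / n_bins
    groups = [[] for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        # Clamp the index so the largest x lands in the last bin.
        i = min(int((x - lo) / width), n_bins - 1)
        groups[i].append(y)
    return [
        (lo + i * width, lo + (i + 1) * width, len(g), mean(g) if g else None)
        for i, g in enumerate(groups)
    ]


# Hypothetical data: many points at low x, only one at high x.
xs = [1, 2, 2, 3, 3, 4, 5, 6, 9]
ys = [2, 3, 2, 4, 3, 5, 5, 7, 9]

for lower, upper, n, avg in binned_means(xs, ys, n_bins=4):
    print(f"x in [{lower:.0f}, {upper:.0f}]: n = {n}, mean y = {avg}")
```

Printing `n` next to each average makes it obvious when a bar rests on a single observation, which is exactly the detail that gets lost when only the averages are plotted.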

Do I have to say goodbye to the scatterplot now that I’m outside of academia? Will I ever be able to fruitfully use it again to communicate results to lay folks? Are the bar graph and line graph my only viable tools for data visualization now? Is there something I’m missing that could help people see scatterplots as helpful representations of bivariate relations? I’d appreciate answers to any of my questions.

## 11 thoughts on “Are scatterplots too complex for lay folks?”

1. I don’t have a good answer to this question, except to say that I feel your pain. Often what is beautiful and/or obvious to us is not so at all for other people. The more basic question is, does your audience need to be shown that two variables are correlated? I’ve relied on bar charts when explaining the general idea behind predictive modelling to lay persons, but otherwise I keep details of any particular analysis to myself. In general, a person’s level of interest will dictate the degree of brain power they are willing to invest in understanding what you show them.

• Thanks very much for your answer! The goal is to display the direction and slope of a correlation, but also to display the reliability of it at different ends of the x scale (one side having more data points than the other). I could make a bar plot as I mentioned in my post and type the numbers representing the subgroup sizes on/below/within the bars, but is that even going to be seen and understood?

2. Sounds like this could lead to some fun if we all have the time.
Hopefully your colleague was joking, but then I do agree with Kevin.

3. This is not a joke. I did a scatterplot with a few colors and a legend and my client, a technical person, was overwhelmed. That said, think about how to make it easier. Soft vertical grid lines at key x-values will help. Use open circles to show how points overlap. Also consider using a loess fit to show the relationship instead of the default linear regression line. Linear regression is a big problem because it doesn’t always appear to fit well. Others might recommend spline fits if you are interested in applying different fits.
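As a rough illustration of the loess suggestion above, here is a minimal sketch of a locally weighted linear fit in plain Python. It is a simplified stand-in, not a full loess implementation: it uses tricube weights and a degree-1 local fit, and it skips the robustness iterations that statistical packages usually offer. The function name `loess_point`, the `frac` default, and the data are assumptions made up for the example.

```python
def loess_point(x0, xs, ys, frac=0.5):
    """Estimate y at x0 from a weighted straight-line fit to nearby points.

    frac is the share of the data treated as "nearby".
    """
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbours to use
    dists = sorted(abs(x - x0) for x in xs)
    # Bandwidth: slightly more than the distance to the k-th nearest point,
    # so that point still gets a small positive weight.
    h = dists[k - 1] * 1.001 or 1e-12
    # Tricube weights: close to 1 near x0, falling to 0 at the bandwidth.
    w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


# Hypothetical curved data that a single straight line would fit poorly.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0.1, 0.9, 4.2, 8.8, 16.1, 24.7, 36.3, 48.9, 64.2, 80.8]

smoothed = [loess_point(x, xs, ys) for x in xs]
for x, s in zip(xs, smoothed):
    print(f"x = {x}: smoothed y = {s:.1f}")
```

Plotting `smoothed` against `xs` over the raw points gives a curve that follows the bend in the data. A smaller `frac` follows the points more closely, and a larger one gives a smoother curve.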

4. I’ve generally resisted even though it’s clear as day to me. I think context is one way to overcome this. If you’re showing this plot for the first time, you could find another everyday relationship – e.g. for US guys, I guess higher GPA scores would translate to higher salaries. This would illustrate how an expected outcome looks on this type of chart; then you can present your two variables.

Alternatively, overlay an annotated grid – e.g. if you were looking at S&P returns vs Treasury Bond returns, the area with negative x and negative y would be labelled “all assets perform badly”, etc.

Anyway, just my initial thoughts.

5. I think you should teach your clients about charts. Don’t just give them a chart. It is your job as a consultant to explain why you use this or that chart, what information it conveys, how it can be used, and how your client can make a decision using it.
In my company we use lots of statistical charts and created several custom complicated charts for reports. Initially our customers were scared of them, but now they love them, and request these types of charts for new reports.

• I agree wholeheartedly. I’ve been slowly introducing some “new” chart types in our organization, and with the proper explanation this has gone quite well, albeit slowly. In my experience, a new chart is not really an issue if it shows information that the organization really wants to have but did not have before.

6. Binning a continuous variable in the way you mention is a serious violation and can give misleading results.

7. I work in a big organisation and I consider it part of my role to educate people about ways of interpreting data and visualisation. I’ve been in post for 4 years now and so far I have weaned them off pie charts and am even looking at some 95% confidence interval plots (yes, I know, before everyone jumps down my throat, but I’m doing what I can).

A scatter plot is a very simple graph. You should definitely not give up. Maybe you could make a fake one that plots height versus weight, and then the regression line will be something people can imagine straight away?

Part of my issue with the way people use graphs is they expect to understand them in 0.1 seconds otherwise they just give up. But a good data analyst, even one who is trained in visualisation, can take 5 or even 10 seconds looking at a graph and getting some of the nuances. So I think a big message from me is always just to stick with it, and not immediately dismiss a graph because “I can’t do maths”.

8. Is there an error in the title? Did you mean “lazy” not “lay” people?

Your experience is all too common, but along with the other commenters I would reiterate the “don’t give up” mantra. It’s a long process to build acuity in your colleagues; some may never get it, but most can.