Maximizing the Effectiveness of Data Visualizations in Regulatory Submissions
Regulatory submissions for new drugs, biologics, or medical devices are replete with complex data that must be effectively communicated to the FDA/EMA review team. Data visualizations (e.g., statistical graphs, plots, charts) are a way to communicate quantitative information with greater efficiency, clarity, and precision than with text alone. However, all too often, these powerful tools are either underutilized or misapplied, leading to ambiguity or confusion about key aspects of the data.
There is a common misconception that data visualizations in regulatory submissions need to be basic and visually sparse to be perceived as credible. That’s not the case. While an effective data visualization should be easy to interpret and as simple as possible, that doesn’t mean it needs to be boring. In fact, incorporating elements of graphical style can often enhance interpretability. Below, I’ll provide a brief overview of best practices for creating effective data visualizations from the medical communications experts and statisticians at 3D Communications.
Four design aspects should be evaluated for every data visualization in the submission:
Integrity: Are the data being presented with transparency and without bias? Are the visualizations presenting data consistently throughout the submission?
Concept: Are the data telling a story? Is the meaningfulness of the data coming through?
Function: Is this the best type of visualization for the data? Will the visualization be useful to the reviewer? Is the visual telling the story efficiently, providing the necessary details without unnecessary clutter?
Form: Are the data being presented in a way that is visually attractive? Does the visualization pass the “glance test” (i.e., can the reader understand the message without having to rely on supporting text)?
The easiest way to see what makes an effective data visualization is by example. The left and right panels below illustrate the hypothetical results of an efficacy endpoint based on percentage of patients in each arm meeting a treatment responder definition.
Even though both charts plot identical data, the effectiveness of the two visualizations is clearly different. I’ll walk through the four key design aspects for these figures, informed by 3D Communications’ best practices.
- Both plots correctly show variability as 95% confidence intervals (CIs), reflecting the precision of the point estimate for each arm. Confidence intervals should be preferred over other measures of variability such as standard errors or standard deviations for hypothesis-based endpoints because CIs can be used to make statistical inferences. (Standard errors show precision, but not in a way that is directly interpretable, and standard deviations reflect the variability of the data but not the precision of a point estimate.)
- The y-axis on the plot in the left panel has an upper limit of 70%. While both graphs are technically showing the same numeric results, a credible graph of a percentage endpoint should show the full range from 0 to 100% so that the reader sees differences between arms in proportion to the whole scale.
- The figure on the left does not label the observed percentages or the sample size. Labels on graphs should be clear and thorough to overcome any ambiguity about the results. Provide relevant explanations on the graphic itself and label important events in the data when applicable.
- The graph on the right uses different shades of blue for the active arms to compare to the placebo arm in orange. Using color can add another dimension to the graphic that informs the viewer and supports the continuity of results throughout a document. For example, if blue is consistently used to denote the active arm and orange is consistently used to denote the placebo arm across all bar charts, Kaplan-Meier plots, and line plots, the reader will intuitively know which data correspond to which arm without having to reorient themselves for each figure. (A helpful hint: be sensitive to readers with color vision deficiency. Blue can be distinguished by most such readers; avoid red and green.)
- The p-values on the right figure draw attention to the key comparisons and show that the results for this endpoint are statistically significant for both primary comparisons.
- Well-labeled graphs don’t require legends. Don’t make your reader have to play what we at 3D often refer to as “legend tennis” where the reader has to go back and forth from the figure to the footnote to the text in order to understand the results. Label the treatment arms in a meaningful way with the study drug and the dose rather than letters that require a footnote.
- No one reads a book or an article at a 90-degree angle, so why would we want to read a graph that way? All words should run left to right, rather than vertically on the y-axis.
- Statistical packages like R and SAS can produce great graphics, but going beyond their “base” capabilities takes time and effort.
- The ideal figure size is typically the “Golden Rectangle,” which is approximately a 1 to 1.6 ratio of height to width of the figure. Figures stretched too far left-to-right visually diminish the impact of differences between treatment arms, particularly for line charts or Kaplan-Meier plots.
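To make several of these points concrete, here is a minimal sketch in Python of how a responder bar chart along these lines might be built. Everything specific in it is an assumption for illustration: the arm names, counts, and sample sizes are hypothetical, the Wilson score method is just one common way to compute a 95% CI for a proportion, the hex codes are one colorblind-friendly choice of orange and blues, and the helper names (`wilson_ci`, `golden_figsize`, `draw_responder_chart`) are invented here. The plotting function assumes matplotlib is installed. P-value annotations are left out because they depend on the analysis specified for the study.

```python
import math

# Hypothetical data for illustration only: (arm label, responders, sample size).
ARMS = [
    ("Placebo", 30, 100),
    ("Drug X 10 mg", 45, 100),
    ("Drug X 20 mg", 55, 100),
]

# Orange for placebo, two shades of blue for the active arms.
# These hex codes are an assumption, chosen to avoid red and green.
COLORS = ["#E69F00", "#56B4E9", "#0072B2"]


def wilson_ci(responders, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    p = responders / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def golden_figsize(width_in=8.0):
    """Return (width, height) in inches with width about 1.618 times height."""
    golden_ratio = (1 + math.sqrt(5)) / 2
    return width_in, width_in / golden_ratio


def draw_responder_chart(arms=ARMS, path="responders.png"):
    """Draw one bar per arm with 95% CI whiskers and direct labels."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=golden_figsize())
    for i, (label, responders, n) in enumerate(arms):
        pct = 100 * responders / n
        lo, hi = (100 * bound for bound in wilson_ci(responders, n))
        # Whisker lengths are distances from the bar top, never negative.
        err = [[max(0.0, pct - lo)], [max(0.0, hi - pct)]]
        ax.bar(i, pct, color=COLORS[i % len(COLORS)], yerr=err, capsize=6)
        # Label the observed percentage directly above the upper CI limit.
        ax.text(i, hi + 2, f"{pct:.0f}%", ha="center")

    # Label arms with drug, dose, and sample size so no legend is needed.
    ax.set_xticks(range(len(arms)))
    ax.set_xticklabels([f"{label}\n(N={n})" for label, _, n in arms])
    ax.set_ylim(0, 100)  # full percentage range
    # Horizontal y-axis label so all text reads left to right.
    ax.set_ylabel("Responders (%)", rotation=0, ha="right", va="center")
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)
    return path
```

Calling `draw_responder_chart()` writes the figure to `responders.png` in the working directory. The same color mapping and labeling helpers could then be reused for other figure types so that each arm looks the same throughout the submission.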
Similar to how we approach writing, best practices for data visualizations should not be rigidly applied. Preparing a regulatory submission requires input from internal experts in medicine, statistics, and regulatory affairs. While we take it for granted that the text of regulatory submissions will undergo thorough review by team members across disciplines, decisions on data visualizations are often left only to the statistical group. Garnering comments, questions, and suggestions on figures and graphs from the full sponsor team will help ensure that the data visualizations tell a compelling scientific story that will be effective for a wide variety of audiences.
ABOUT THE AUTHOR
Chris Miller, MS is a biostatistician who brings experience in the design, analysis, and interpretation of clinical trials to 3D clients. As a senior project manager, Chris leverages statistical expertise with excellent communications skills to integrate complex data with key messages.