Scientific Color Palettes for Flatiron Health
2023 | data visualization | data design
We wanted to unify the color of all the charts, graphs, and other data representations used both internally and externally at Flatiron Health. Here, I describe the process of designing a clear, concise, easy-to-implement, and accessible set of scientific color palettes.
A need for consistency and efficiency
We discovered that creators of visualizations had to guess and choose arbitrary color palettes every time they needed to plot a chart or create an infographic. This was time-consuming and generated inconsistencies throughout the organization. We created an inventory of visualizations that are used both internally and externally across the organization, including (but not limited to):
Dashboards
Reports
Internal and external digital products and business intelligence platforms
Publications, posters, and papers
Presentations and decks
We also found that the palettes people chose could be visually inaccessible to readers with color blindness or other visual disabilities, and that an inadequate color palette can hinder the correct interpretation of a chart.
Towards Better Color Palettes
With the current issues in mind, we set out to create a series of color palettes that:
Follow best practices around color accessibility and uniform color perception
Are, whenever possible, a reflection of the Flatiron brand colors
Are accessible to all teams and people that generate visualizations, figures, and charts across the organization through standard formats and data packages
Can be modified within a proposed framework to support corner cases or very specific needs
Below you can see the base colors that are part of the brand. While these colors are rich and vibrant, they are not perceptually uniform, which can cause interpretation issues with certain visualizations (more on this later). However, the Brand team had created an excellent base to work from.
Initial Explorations
In addition to the Brand colors, we took inspiration from other excellent visualization color palettes, such as Paul Tol’s color schemes (here implemented in a metro map example) and IBM’s Carbon Colors.
We established that we wanted to create three sets of palettes, plus additional guidance for corner cases:
Sequential colors: for ordinal data
Divergent colors: for ordinal, comparative data, such as divergence from a mean
Categorical colors: for data that does not have an inherent order or sequence
These palettes would cover most use cases and scenarios and would remove the guesswork for users, as all they would need to do is identify the type of data they are working with.
Our initial, naive explorations were done in Figma, which allowed us to work fast and iterate rapidly. However, these colors still suffered from drastic changes in lightness, which made some hues “pop” more than others, and some of the colors, particularly in the categorical palettes, were too similar.
Next steps
With the initial set of ideas in mind, we created an algorithm to iteratively test and improve the colors:
Select brand colors and create a candidate palette
Check for accessibility across types of color vision deficiency
Check for distinction across neighboring colors
Adjust for text contrast to follow WCAG 2.1 levels A and AA
Adjust for uniform perception
For grayscale, potentially implement a dedicated accessibility toggle within Flatiron products and tools
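The "distinction across neighboring colors" step above can be approximated in code. The following is a minimal sketch, not the tooling we actually used (that was Viz Palette): it converts sRGB hex colors to CIE L*a*b* and flags adjacent palette entries whose CIE76 color difference falls below a threshold. The function names, the threshold of 20, and the sample hex values are hypothetical.

```python
import math

D65 = (0.95047, 1.0, 1.08883)  # reference white for sRGB


def hex_to_lab(hex_color):
    """Convert '#rrggbb' (sRGB) to CIE L*a*b* under a D65 white point."""
    h = hex_color.lstrip("#")
    rgb = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB transfer curve to get linear-light values.
    r, g, b = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
               for c in rgb]
    # Linear sRGB -> CIE XYZ.
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        d = 6 / 29
        return t ** (1 / 3) if t > d ** 3 else t / (3 * d * d) + 4 / 29

    fx, fy, fz = (f(v / n) for v, n in zip((x, y, z), D65))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e76(color_a, color_b):
    """CIE76 color difference: Euclidean distance in L*a*b*."""
    return math.dist(hex_to_lab(color_a), hex_to_lab(color_b))


def too_similar_neighbors(palette, min_delta_e=20.0):
    """Return adjacent pairs that are closer than min_delta_e (assumed cutoff)."""
    return [(a, b) for a, b in zip(palette, palette[1:])
            if delta_e76(a, b) < min_delta_e]


# Hypothetical candidate palette; the first two entries are nearly identical.
candidate = ["#5a2d82", "#5c2f84", "#2e8b57", "#f0a500"]
print(too_similar_neighbors(candidate))
```

CIE76 is the simplest of the color difference formulas; a stricter review could swap in a more recent one, and a full check would also compare non-adjacent colors.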
To help us test all the colors against the same requirements and iterate rapidly, we relied heavily on Susie Lu’s and Elijah Meeks’ wonderful Viz Palette tool, which lets you see a list of colors applied to a number of different visualizations, compare the similarity between colors, and even test against different types of color blindness. Below you can see a small selection of tests comparing multiple categorical color palette candidates.
Refinement and testing
We continued to test and refine the palettes, particularly the categorical palette, which was the most challenging one due to the number of colors we wanted to support. Below you can see an example of a more refined palette applied to a series of mock charts.
To test contrast, color, and slight variations more rapidly, I created a simple script in ObservableHQ using Chroma.js and the contrast ratios used by WCAG. This allowed for much faster and more controlled refinement than what we could do in Figma or with the Viz Palette tool.
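The script itself ran in ObservableHQ with Chroma.js. As a rough illustration of the check it performs, here is a minimal Python sketch of the WCAG 2.1 contrast ratio between a text color and a background color. The function names are illustrative and are not taken from the original script.

```python
def relative_luminance(hex_color):
    """WCAG 2.1 relative luminance of an sRGB color given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    channels = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Linearize each channel using the constants from the WCAG 2.1 definition.
    r, g, b = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
               for c in channels]
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio (lighter + 0.05) / (darker + 0.05), from 1 to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """Level AA asks for at least 4.5:1, or 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


print(round(contrast_ratio("#000000", "#ffffff"), 2))
print(passes_aa("#767676", "#ffffff"))
```

Making a text or background color lighter or darker and re-running the check mirrors what the sliders in the live script do.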
Bonus: you can play with this script live! Just drag the sliders to make the text and/or background lighter or darker. You can even try your own colors.
Usability Testing
We ran one final “usability” test in which we checked both how much users liked the colors and whether they could make inferences about data from a visualization using the selected colors. We tested both the categorical palette and the divergent palette, which inherently also covered the sequential colors.
All users were able to make correct inferences about the data and rated the colors positively.
The final palettes
Sequential Palettes
Sequential color palettes are typically used to represent ordinal data that increases or decreases. Typical examples are age distributions or lab values for a cohort of patients.
Our sequential color palette is based on the Brand’s purple color. There are four versions depending on the number of colors needed for a visualization.
Divergent Palettes
Divergent color palettes are typically used to visualize differences from a baseline, such as a mean or a normal value.
Many divergent color palettes use red-green hues. However, this is a poor choice for many color-blind readers. Instead, our divergent color palette is based on green-purple hues. There are four versions depending on the number of colors needed for a visualization.
Categorical Palettes
Categorical color palettes are best used to distinguish between discrete categories that do not have an inherent correlation or ordering. Examples of this include medication classes, types of cancer sites, or types of patient visits. The colors of the palette should be applied sequentially from left to right as displayed here to maximize the visual differentiation and contrast between colors.
The visual accessibility of categorical palettes decreases with more colors, so we recommend using as few colors and categories as possible. The gray color should be reserved for data categorized as “other”, “null”, or “missing”.
For corner use cases where the colors don’t quite work (e.g. badges or labels that have colored text), we enabled users to modify colors using the ObservableHQ script.
A final note on perceptually uniform palettes
Creating colors for data visualizations based on a brand forces us to think not only about the brand, but also about how the human eye and brain perceive color. (The images belong to Gregor Aisch.)
Many color palettes are generated in color spaces that are optimized for digital displays, such as sRGB, rather than in spaces designed around human perception, such as CIE L*a*b*. Take the following multi-hue scale as an example:
As you can see, this red-yellow color scale has very different degrees of lightness (as represented by the line chart below the color gradient). This means that the “perceived distance” between two reds is much smaller than between a red and a yellow color. When we translate colors to data values, this will affect the perception of a chart, a map, or another visualization.
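The lightness line in a chart like this can be reproduced numerically. The sketch below computes CIE L* (perceptual lightness) for each color in a ramp and the size of each step between neighbors. The three-color red-to-yellow ramp is an illustrative stand-in, not the scale in the image, and the function names are hypothetical.

```python
def lightness(hex_color):
    """CIE L* (0 = black, 100 = white) of an sRGB color given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    channels = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Linearize the sRGB channels, then take the luminance Y (white has Y = 1).
    r, g, b = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
               for c in channels]
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    d = 6 / 29
    f = y ** (1 / 3) if y > d ** 3 else y / (3 * d * d) + 4 / 29
    return 116 * f - 16


def lightness_steps(palette):
    """Difference in L* between each color and the next one in the ramp."""
    values = [lightness(c) for c in palette]
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Illustrative ramp: pure red, an orange midpoint, pure yellow.
ramp = ["#ff0000", "#ff8000", "#ffff00"]
print([round(lightness(c), 1) for c in ramp])
print([round(step, 1) for step in lightness_steps(ramp)])
```

Pure red sits near L* 53 and pure yellow near L* 97, so a scale between them spans a large lightness range, and unequal steps along the way are what make some transitions look bigger than others.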
We can apply transformations to the colors so that we have more uniform “steps” between each color.
This is the precise transformation we applied to the Flatiron standard color palette to achieve a more perceptually uniform color scale, optimized for charts and visualizations.
The steps are more even, ensuring that the “perceived distance” between colors is consistent and that the data visualizations using these colors will accurately reflect the data they are encoding.
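One simple way to get even steps, shown here only as a sketch and not necessarily the transformation applied to the Flatiron palette, is to keep each color's a* and b* components and replace its L* with a value evenly spaced between the first and last colors. All function names are hypothetical, and results that fall outside the sRGB gamut are clipped, which can shift a color slightly.

```python
D65 = (0.95047, 1.0, 1.08883)  # reference white for sRGB
DELTA = 6 / 29


def hex_to_lab(hex_color):
    """Convert '#rrggbb' (sRGB) to CIE L*a*b*."""
    h = hex_color.lstrip("#")
    rgb = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    r, g, b = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
               for c in rgb]
    xyz = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b,
           0.2126729 * r + 0.7151522 * g + 0.0721750 * b,
           0.0193339 * r + 0.1191920 * g + 0.9503041 * b)
    fx, fy, fz = [t ** (1 / 3) if t > DELTA ** 3 else t / (3 * DELTA ** 2) + 4 / 29
                  for t in (v / n for v, n in zip(xyz, D65))]
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def lab_to_hex(lab):
    """Convert CIE L*a*b* back to '#rrggbb', clipping to the sRGB gamut."""
    l_star, a_star, b_star = lab
    fy = (l_star + 16) / 116
    fs = (fy + a_star / 500, fy, fy - b_star / 200)
    x, y, z = [n * (t ** 3 if t > DELTA else 3 * DELTA ** 2 * (t - 4 / 29))
               for n, t in zip(D65, fs)]
    linear = (3.2404542 * x - 1.5371385 * y - 0.4985314 * z,
              -0.9692660 * x + 1.8760108 * y + 0.0415560 * z,
              0.0556434 * x - 0.2040259 * y + 1.0572252 * z)
    # Re-apply the sRGB transfer curve and clamp each channel to 0..1.
    encoded = [12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055
               for c in linear]
    clipped = [min(1.0, max(0.0, c)) for c in encoded]
    return "#" + "".join(f"{round(c * 255):02x}" for c in clipped)


def equalize_lightness(palette):
    """Respace L* linearly from the first to the last color; keep a* and b*."""
    labs = [hex_to_lab(c) for c in palette]
    if len(labs) < 3:
        return list(palette)
    start, end = labs[0][0], labs[-1][0]
    last = len(labs) - 1
    return [lab_to_hex((start + (end - start) * i / last, a, b))
            for i, (_, a, b) in enumerate(labs)]


# Illustrative input: an uneven gray ramp becomes evenly spaced in L*.
print(equalize_lightness(["#000000", "#333333", "#ffffff"]))
```

Holding a* and b* fixed is a simplification; it keeps the hue direction but can change how saturated a color looks, so the output still needs a visual review.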
Credits and Sources
This work would not have been possible without the wonderful thought partnership and collaboration of Liam Wiesenberger (focus on accessibility, usability testing) and Nicole Ulgado (palette testing and iteration).