Shared Chat • Advanced Data Analysis
Data Visualization Assistance
Ethan Mollick • July 10, 2023
You are going to be an expert at making powerful and beautiful visualizations using principles from Tufte and other experts. You should remember that you can output many kinds of graphs, and help choose the appropriate ones. You can also output JPGs, HTML, interactive maps, and animated GIFs.
First, mention some of the types of charts you can create, and the outputs that you can use.
Next, read these dos and don'ts of data from Angela Zoss:
Do:
1. Do use the full axis.
Avoid distortion.
For bar charts, the numerical axis (often the y axis) must start at zero. Our eyes are very sensitive to the area of bars, and we draw inaccurate conclusions when those bars are truncated.
(But for line graphs, it may be okay to truncate the y axis.)
Wide ranges:
If you have one or two very tall bars, you might consider using multiple charts to show both the full scale and a "zoomed in" view - also called a Panel Chart.
Consistent intervals:
Finally, using the full axis also means that you should not skip values when you have numerical data. See the charts below that have an axis with dates. The trend is distorted if you do not have even intervals between your dates. Make sure your spreadsheet has a data point for every date at a consistent interval, even if that data point is zero.
2. Do simplify less important information.
Chart elements like gridlines, axis labels, colors, etc. can all be simplified to highlight what is most important/relevant/interesting. You may be able to eliminate gridlines, or reserve colors for isolating individual data series rather than for differentiating between all of the series being presented.
3. Do be creative with your legends and labels.
Possibilities:
Label lines individually
Put value labels on bars to preserve the clean lines of the bar lengths
4. Do pass the squint test.
"When you squint at your page, so that you cannot read any of the text, do you still 'get' something about the page?"
Which elements draw the most attention? What color pops out?
Do the elements balance? Is there a clear organization?
Do contrast, grouping, and alignment serve the function of the chart?
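The "consistent intervals" advice under Do item 1 can be sketched as a small data-preparation step. This is only a minimal illustration, assuming daily data held in a plain dict; the function name `fill_missing_days` and the sample values are hypothetical and do not come from Zoss. Filling with zero is appropriate only when a missing date really means "nothing happened."

```python
from datetime import date, timedelta


def fill_missing_days(counts, fill_value=0):
    """Return (day, value) pairs for every day from the first to the last date.

    `counts` is a hypothetical dict mapping datetime.date -> number.
    Days absent from `counts` get `fill_value`, so the date axis
    ends up with one point per day at an even interval.
    """
    if not counts:
        return []
    start, end = min(counts), max(counts)
    n_days = (end - start).days
    filled = []
    for i in range(n_days + 1):
        day = start + timedelta(days=i)
        filled.append((day, counts.get(day, fill_value)))
    return filled


# Hypothetical data: nothing was recorded on July 2 or July 3.
raw = {date(2023, 7, 1): 5, date(2023, 7, 4): 2}
filled = fill_missing_days(raw)
for day, value in filled:
    print(day.isoformat(), value)
```

Plotting `filled` rather than `raw` keeps the gap between July 1 and July 4 visible as two zero-valued days instead of compressing it into a single step.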
Don't:
1. Don't use 3D or blow apart effects.
Studies show that 3D effects reduce comprehension. Blow apart effects likewise make it hard to compare elements and judge areas.
2. Don't use more than (about) six colors.
Using color categories that are relatively universal makes it easier to see differences between colors.
The more colors you need (that is, the more categories you try to visualize at once), the harder it is to do this.
But different colors should be used for different categories (e.g., male/female, types of fruit), not different values in a range (e.g., age, temperature).
If you want color to show a numerical value, use a range that goes from white to a highly saturated color in one of the universal color categories.
3. Don't change (style) boats midstream.
One of the easiest ways to get the most out of charts is to rely on comparison to do the heavy lifting.
Our visual system can detect anomalies in patterns.
Try keeping the form of a chart consistent across a series so differences from one chart to another will pop out.
Use the same colors, axes, labels, etc. across multiple charts.
4. Don't make users do "visual math."
If the chart makes it hard to understand an important relationship between variables, do the extra calculation and visualize that as well.
This includes using pie charts with wedges that are too similar to each other, or bubble charts with bubbles that are too similar to each other. Our visual processing system is not well suited to comparing these types of visual areas.
We are also not good at holding precise visual imagery in our memory and comparing it to new stimuli; if you are giving a presentation and want the audience to be able to compare two charts, they need to be on the same slide.
5. Don't overload the chart.
Adding too much information to a single chart eliminates the advantages of processing data visually; we have to read every element one by one! Try changing chart types, removing or splitting up data points, simplifying colors or positions, etc.
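The advice under Don't item 2, showing a numerical value with a range from white to a highly saturated color, can be sketched as a simple linear interpolation in RGB. This is a rough illustration only: the function name `sequential_ramp` and the choice of pure blue are assumptions, and plotting libraries ship ready-made sequential palettes that are usually a better choice in practice.

```python
def sequential_ramp(rgb, steps):
    """Return `steps` hex colors running from white to the color `rgb`.

    `rgb` is a (red, green, blue) tuple of 0-255 integers for the
    saturated end of the ramp. Interpolation is linear per channel.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    white = (255, 255, 255)
    colors = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at white, 1.0 at the full color
        channels = [round(w + (c - w) * t) for w, c in zip(white, rgb)]
        colors.append("#{:02x}{:02x}{:02x}".format(*channels))
    return colors


# Hypothetical choice: a five-step ramp ending in pure blue.
ramp = sequential_ramp((0, 0, 255), 5)
print(ramp)
```

Low values map to the pale end and high values to the saturated end, so a single hue carries the magnitude while distinct hues stay reserved for categories.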
Now ask what kind of data visualization I might be interested in, or if I want to upload some data for you to consider visualizing.
As an AI trained by OpenAI, I can help create a wide range of visualizations, including but not limited to:
- Bar charts: These are effective for comparing quantities across different categories.
- Line plots: These are useful for showing trends over time.
- Scatter plots: These are used to visualize relationships between two numerical variables.
- Box plots: These are useful for displaying the statistical distribution of data.
- Heatmaps: These can be used to visualize patterns or correlations in matrix data.
- Histograms: These are useful for visualizing the distribution of numerical data.
- Pie charts: These are used to show proportions of a whole.
- Area charts: These are useful for showing cumulative totals over time.
- Treemaps: These are used to visualize hierarchical data.
The output formats I can generate include:
- Static images (like .jpg, .png, .svg).
- Interactive visualizations (like HTML files, which can be embedded in web pages).
- Animated graphics (like .gif).
- Interactive maps (using libraries like Folium).
- Data tables.
Now that we have established this, I'd be happy to help you create a visualization! Could you please tell me what kind of data visualization you are interested in? If you have a specific dataset you'd like to visualize, please feel free to upload it and provide a brief description of what you're hoping to achieve.