We are going to explore the data visualization capabilities of Pandas. We’ll start by introducing the basics (line graphs, bar charts and pie charts) and then we’ll take a look at the more statistical views with histograms and box plots. Lastly, we’ll see how we can create multiple plots in one chart and how we can save charts as images, so we can use them in our own reports, documents and web pages.
Throughout the tutorial you will use a dataset about the weather in London, UK, and you’ll create a number of charts using that data. I have created this dataset from public domain information that is available from the UK Meteorological Office.
https://projectcodeed.blogspot.com/2020/02/plotting-with-pandas-introduction-to.html
Fundamentally, Pandas plotting is a set of methods that can be used with a Pandas DataFrame to plot various graphs from the data contained in that DataFrame. These methods rely on a Python plotting library called matplotlib.
The purpose is to simplify the creation of graphs and plots, so you don’t need to know the details of how matplotlib works. You will still need one or two matplotlib commands, but they are very simple and I’ll explain them as we get to them.
You can think of matplotlib as being a ‘backend’ for Pandas that takes care of the mechanics of creating a plot.
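To make the 'backend' idea concrete, here is a minimal sketch. The column names (`month`, `temp_max`) and the values are made up for illustration and are not the London weather dataset. The point is only that a single call to the DataFrame's `plot` method hands you back a matplotlib object.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values for illustration only (not the real weather data)
df = pd.DataFrame({
    "month": [1, 2, 3, 4],
    "temp_max": [7.0, 8.5, 11.0, 14.5],
})

# Pandas builds the chart; matplotlib does the actual drawing
ax = df.plot(x="month", y="temp_max")

# The returned object is a matplotlib Axes, which is why the occasional
# matplotlib command (such as plt.show() in an interactive session) is useful
print(type(ax))

plt.close("all")  # release the figure once we are done with it
```

In an interactive session you would normally drop the `matplotlib.use("Agg")` line and call `plt.show()` to display the chart.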
- Import libraries
- Get data
- Basic plots: Examples & syntax
  - Line
  - Multi-line
  - Bar and Horizontal bar
  - Scatter
  - Pie
  NOTES:
  - Different formats for different purposes
  - Components of charts: axes & aesthetics (titles, labels, legends)
  - Use of indices
- Statistical plots:
  - Boxplot
  - Histograms
- Advanced plotting
  - Multiple plots (subplots)
  - Saving plots
-
  - Changing size and color
  - Setting a title
  - Display a grid
  - Changing the legend
  - Customizing the ticks
  - Recycling formats using **kwargs
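As a small preview of the last few items in the outline, the sketch below shows one way formatting options might be recycled with `**kwargs`. The data, the column names (`month`, `temp_max`, `rain_mm`) and the `style` dictionary are all assumptions made for illustration. The keyword arguments themselves (`figsize`, `color`, `grid`, `legend`, `title`) are standard options of the Pandas `plot` method.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values for illustration only (not the real weather data)
df = pd.DataFrame({
    "month": [1, 2, 3, 4],
    "temp_max": [7.0, 8.5, 11.0, 14.5],
    "rain_mm": [55.0, 41.0, 42.0, 44.0],
})

# Keep the shared formatting in one dictionary...
style = {"figsize": (8, 4), "color": "tab:blue", "grid": True, "legend": False}

# ...and unpack it into each plot call with **, adding a per-chart title
ax_line = df.plot(x="month", y="temp_max", title="Maximum temperature", **style)
ax_bar = df.plot.bar(x="month", y="rain_mm", title="Rainfall", **style)

plt.close("all")  # release the figures once we are done with them
```

Changing a value in `style` then changes every chart that uses it, which is the main reason to collect the options in one place.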