Prerequisites
To create a Matplotlib bar chart, we’ll need the following:
- Python installed on your machine
- Pip: package management system (it comes with Python)
- Jupyter Notebook: a browser-based editor for data analysis and visualization
- Pandas: a library to create data frames from data sets and prepare data for plotting
- NumPy: a library for multi-dimensional arrays
- Matplotlib: a plotting library
- Seaborn: a plotting library (we’ll only use part of its functionality to add a gray grid to the plot and get rid of borders)
You can download the latest version of Python for Windows on the official website.
To get the other tools, install the core scientific Python packages along with Seaborn. Type this in your terminal:
pip install numpy scipy matplotlib ipython jupyter pandas sympy nose seaborn
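Once the install finishes, it can help to confirm that the packages are visible to Python. The snippet below is a minimal sketch using only the standard library; the helper name installed_versions is made up for this example, and the package list is an assumption based on what this tutorial imports.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages this tutorial relies on (assumed list)
report = installed_versions(["numpy", "pandas", "matplotlib", "seaborn", "notebook"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Any package reported as "not installed" can be added with another pip install command.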
Getting Started
Create a folder that will contain your notebook (e.g. “matplotlib-bar-chart”). Then open Jupyter Notebook by typing these commands in your terminal (replace the path with your own):
cd C:\Users\Shark\Documents\code\matplotlib-bar-chart
py -m notebook
This will automatically open the Jupyter home page at http://localhost:8888/tree. Click on the “New” button in the top right corner, select the Python version installed on your machine, and a notebook will open in a new browser window.
In the first cell of the notebook, import all the necessary libraries:
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
import pandas as pd
import seaborn as sns
sns.set()
%matplotlib notebook
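The last line (%matplotlib notebook) is a Jupyter magic command that displays charts as interactive figures in the output area below a cell. If your Jupyter version doesn’t support it, %matplotlib inline renders static images instead.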
Data Preparation
Let’s create a bar plot that will show the top 10 movies with the highest revenue. You can download the data set from Kaggle. We’ll need the file named movies_metadata.csv. Put it in your “matplotlib-bar-chart” folder.
In the second cell of your Jupyter notebook, type this code to read the file and display the first 5 rows:
df = pd.read_csv('movies_metadata.csv')
df.head()
Next, create a data frame, sort and format values:
data = pd.DataFrame(df, columns=['revenue', 'title'])
data_sorted = data.sort_values(by='revenue', ascending=False)
data_sorted['revenue'] = data_sorted['revenue'] / 1000000
pd.options.display.float_format = '{:,.0f}'.format
data_sorted.set_index('title', inplace=True)
ranking = data_sorted.head(10)
ranking
The output will look like this:
We’ll use this slice of the data frame to create our chart.
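To see what each preparation step does without downloading the full data set, here is a small sketch on made-up rows. The titles and revenue figures are placeholders, not values from the Kaggle file, and the dropna() call is an extra safeguard added for this example.

```python
import pandas as pd

# Made-up rows standing in for movies_metadata.csv (not real figures)
sample = pd.DataFrame({
    "title": ["Film A", "Film B", "Film C", "Film D"],
    "revenue": [250_000_000.0, 1_200_000_000.0, None, 730_000_000.0],
})

top = (
    sample[["revenue", "title"]]
    .dropna(subset=["revenue"])                          # skip rows with no revenue figure
    .assign(revenue=lambda d: d["revenue"] / 1_000_000)  # USD -> USD million
    .sort_values(by="revenue", ascending=False)          # highest revenue first
    .set_index("title")                                  # titles become the row labels
    .head(3)                                             # keep the top 3 rows
)
print(top)
```

The chained form performs the same operations as the step-by-step code above; which style to use is a matter of taste.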
Plotting
We’ll create a horizontal descending bar chart in 7 steps. All the code snippets below should be placed inside one cell in your Jupyter Notebook.
Here’s the list of variables that will be used in our code. You can insert your values or names if you like.
# Variables
index = ranking.index
values = ranking['revenue']
plot_title = 'Top 10 movies by revenue, USD million'
title_size = 18
subtitle = 'Source: Kaggle / The Movies Dataset'
x_label = 'Revenue, USD million'
filename = 'barh-plot'
1. Create subplots and set a colormap
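First, sort the data in ascending order to create a descending bar chart. Matplotlib draws the first bar at the bottom, so the largest value ends up on top. Reassign the sorted result and refresh index and values, because they were read before the sort:
ranking = ranking.sort_values(by='revenue', ascending=True)
index = ranking.index
values = ranking['revenue']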
Next, draw a figure with a subplot. We’re using the viridis color scheme to create gradients later.
fig, ax = plt.subplots(figsize=(10,6), facecolor=(.94, .94, .94))
mpl.pyplot.viridis()
figsize=(10,6) sets the figure size to 10 × 6 inches, which is 1000 × 600 px at Matplotlib’s default 100 dpi.
facecolor sets the background color of the figure.
mpl.pyplot.viridis() sets the default colormap to viridis (a gradient from dark purple through blue and green to yellow). Similar colormaps include plasma, inferno, magma, and cividis. See more examples in Matplotlib’s official documentation.
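The size and color claims can be checked directly. The sketch below passes dpi=100 explicitly (an assumption, so the arithmetic doesn't depend on local settings) and samples both ends of the viridis colormap. The Agg backend line is only needed when running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend for running outside a notebook; omit inside Jupyter
import matplotlib.pyplot as plt

# dpi is passed explicitly so the pixel arithmetic does not depend on local settings
fig, ax = plt.subplots(figsize=(10, 6), dpi=100, facecolor=(.94, .94, .94))
width_px, height_px = fig.get_size_inches() * fig.dpi
print(f"{width_px:.0f} x {height_px:.0f} px")

cmap = plt.get_cmap("viridis")
low, high = cmap(0.0), cmap(1.0)  # RGBA tuples at both ends of the colormap
print("start (RGB):", [round(c, 2) for c in low[:3]])   # dark purple end
print("end (RGB):  ", [round(c, 2) for c in high[:3]])  # yellow end
```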
2. Create bars
bar = ax.barh(index, values)
plt.tight_layout()
ax.xaxis.set_major_formatter(mpl.ticker.StrMethodFormatter('{x:,.0f}'))
ax.bar() would create vertical bar plots, while ax.barh() would draw horizontal bar plots. We’re using Matplotlib barh, and our chart will have a horizontal layout.
plt.tight_layout() adjusts the subplot parameters so that the axes, tick labels, and titles fit inside the figure.
We’re also using set_major_formatter() to format ticks with commas (like 1,500 or 2,000).
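The format string passed to StrMethodFormatter is a regular Python format string in which the tick value is bound to the name x. A quick sketch with plain str.format shows what the ticks will look like; the tick values here are arbitrary examples.

```python
# The same format string that is passed to StrMethodFormatter
tick_format = '{x:,.0f}'

# Arbitrary example tick values
ticks = [0, 500, 1500, 2787.97]
labels = [tick_format.format(x=value) for value in ticks]
print(labels)  # ['0', '500', '1,500', '2,788']
```

The comma adds a thousands separator, and .0f rounds to zero decimal places.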
3. Set title, its font size, and position
title = plt.title(plot_title, pad=20, fontsize=title_size)
title.set_position([.33, 1])
plt.subplots_adjust(top=0.9, bottom=0.1)
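pad=20 sets the gap between the title and the top of the plot, in points, and .33 sets the title’s horizontal position (0 is the left edge of the plot, 1 is the right edge).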
subplots_adjust(top=0.9) allows us to keep the title from being cropped, and subplots_adjust(bottom=0.1) prevents the x-axis label from being cropped.
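To experiment with these settings in isolation, the sketch below applies them to an empty figure and reads the values back. It is a standalone illustration; the Agg backend line is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend for running outside a notebook; omit inside Jupyter
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 6))
title = plt.title("Top 10 movies by revenue, USD million", pad=20, fontsize=18)
title.set_position([.33, 1])              # x and y in axes coordinates
plt.subplots_adjust(top=0.9, bottom=0.1)  # margins as fractions of the figure height

print(title.get_position())
print(fig.subplotpars.top, fig.subplotpars.bottom)
```

Changing .33 to a smaller or larger value moves the title left or right.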
4. Create gradient background
Set grid z-order to 0 and bar z-order to 1. This will hide the grid behind bars.
ax.grid(zorder=0)
def gradientbars(bars):
    grad = np.atleast_2d(np.linspace(0,1,256))
    ax = bars[0].axes
    lim = ax.get_xlim()+ax.get_ylim()
    for bar in bars:
        bar.set_zorder(1)
        bar.set_facecolor('none')
        x,y = bar.get_xy()
        w, h = bar.get_width(), bar.get_height()
        ax.imshow(grad, extent=[x+w, x, y, y+h], aspect='auto', zorder=1)
    ax.axis(lim)
gradientbars(bar)
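The extent argument takes values in the following order: left, right, bottom, top. Here the left and right values are swapped (x+w comes before x), which flips the gradient horizontally. Use extent=[x, x+w, y, y+h] to reverse the order of colors in the gradient, if needed.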
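The gradient technique can be tried on a tiny chart first. This is a minimal sketch with two made-up bars: it builds the 1 × 256 gradient array and paints it over each transparent bar with imshow. The Agg backend line is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend for running outside a notebook; omit inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

# One row of 256 evenly spaced values from 0 to 1; the colormap turns it into a gradient
grad = np.atleast_2d(np.linspace(0, 1, 256))
print(grad.shape)  # (1, 256)

fig, ax = plt.subplots()
bars = ax.barh(["a", "b"], [3, 5])       # made-up values
lim = ax.get_xlim() + ax.get_ylim()      # remember the limits; imshow would change them
for bar in bars:
    bar.set_facecolor("none")            # make the bar transparent
    x, y = bar.get_xy()
    w, h = bar.get_width(), bar.get_height()
    # left > right flips the image, so the gradient runs in the opposite direction
    ax.imshow(grad, extent=[x + w, x, y, y + h], aspect="auto", zorder=1)
ax.axis(lim)                             # restore the original limits
print(len(ax.images), "gradient images drawn")
```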
5. Create bar labels/annotations
Since we’re creating a Python bar graph with labels, we need to define label values and label position.
rects = ax.patches
# Place a label for each bar
for rect in rects:
    # Get X and Y placement of label from rect
    x_value = rect.get_width()
    y_value = rect.get_y() + rect.get_height() / 2
    # Number of points between bar and label; change to your liking
    space = -30
    # Horizontal alignment for positive values
    ha = 'left'
    # If value of bar is negative: place label to the left of the bar
    if x_value < 0:
        # Invert space to place label to the left
        space *= -1
        # Horizontally align label to the right
        ha = 'right'
    # Use X value as label and format number
    label = '{:,.0f}'.format(x_value)
    # Create annotation
    plt.annotate(
        label,                       # Use `label` as label
        (x_value, y_value),          # Place label at bar end
        xytext=(space, 0),           # Horizontally shift label by `space`
        textcoords='offset points',  # Interpret `xytext` as offset in points
        va='center',                 # Vertically center label
        ha=ha,                       # Horizontally align label differently for positive and negative values
        color='white')               # Change label color to white
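The placement logic in the loop can be isolated into a small function to see what it produces for positive and negative bars. The function name label_spec is hypothetical and exists only for this sketch; the values passed in are arbitrary.

```python
def label_spec(x_value, space=-30):
    """Return (text, horizontal offset in points, horizontal alignment) for a bar label."""
    ha = "left"
    if x_value < 0:
        space *= -1   # flip the offset for negative bars
        ha = "right"
    return "{:,.0f}".format(x_value), space, ha


print(label_spec(2787.97))  # ('2,788', -30, 'left')
print(label_spec(-1500))    # ('-1,500', 30, 'right')
```

With space = -30 and left alignment, the label starts 30 points before the bar end, so it sits inside the bar. That is why the label color is set to white.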
6. Set a subtitle and labels, if needed
# Set subtitle
tfrom = ax.get_xaxis_transform()
ann = ax.annotate(subtitle, xy=(5, 1), xycoords=tfrom, bbox=dict(boxstyle='square,pad=1.3', fc='#f0f0f0', ec='none'))
# Set x-label
ax.set_xlabel(x_label, color='#525252')
7. Save the chart as a picture
plt.savefig(filename+'.png', facecolor=(.94, .94, .94))
In some older Matplotlib versions, savefig() fills the background with its own default color (white) instead of the figure’s facecolor, so it’s safest to pass facecolor again here.
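One way to avoid repeating the color tuple is to read it back from the figure. The sketch below saves a small chart with made-up values to an in-memory buffer rather than a file, so it leaves nothing on disk. The dpi value and the Agg backend line are assumptions for this example.

```python
import io

import matplotlib
matplotlib.use("Agg")  # off-screen backend for running outside a notebook; omit inside Jupyter
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 6), facecolor=(.94, .94, .94))
ax.barh(["Film A", "Film B"], [250, 1200])  # made-up values

buffer = io.BytesIO()
# Reuse the figure's own background color instead of typing the tuple again
fig.savefig(buffer, format="png", facecolor=fig.get_facecolor(), dpi=100)
png_bytes = buffer.getvalue()
print(len(png_bytes), "bytes written")
```

To write a file instead, pass a filename such as 'barh-plot.png' as the first argument.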
That’s it, your horizontal bar chart is ready. You can download the notebook on GitHub to get the full code.