Introduction to Data Visualization with Python

Derive insights from data using pandas .plot, Seaborn, and Matplotlib.

About this course

Data visualization is an essential part of data science and data analytics. Remember: “Pictures speak louder than words.”

Data visualization is a critical skill for data professionals and anyone who works with data. Our brains process visual information far faster than text. Data visualization lets us use graphical representations of data to communicate information effectively.

This course will help you learn the essential data visualization concepts and tools, with a focus on creating static, non-interactive figures and charts. You will learn data visualization techniques using the most popular and fundamental Python visualization tool, Matplotlib. This course will also focus on the convenient Matplotlib interface provided by pandas, and high-level statistical plotting techniques provided by Seaborn.

Sophia Yang will walk through a visualization project to illustrate the research and preparation work needed for a complete project. Through this process, you will learn how to approach and understand the patterns and relationships of your data through different plotting techniques – ultimately allowing you to apply the visualization skills you have learned to solve real-world problems.

What you'll learn—and how you can apply it

By the end of this course, you’ll understand how to:

  • Describe basic concepts of data visualization using Python
  • Explain your data through visualization
  • Visualize a pandas DataFrame using the pandas .plot() method
  • Use statistical plots and facet plots with Seaborn
  • Use Matplotlib’s object-oriented API to fine-tune and customize plots
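As a rough illustration of how the three tools named above fit together, here is a minimal sketch. The data, column names, and titles are invented for this example and are not taken from the course materials.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data, made up purely for illustration.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6],
    "score": [52, 55, 61, 70, 74, 83],
    "group": ["a", "b", "a", "b", "a", "b"],
})

# Matplotlib's object-oriented API: create a Figure with two Axes.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))

# pandas .plot() draws a line chart onto the Axes passed in via `ax`.
df.plot(x="hours", y="score", kind="line", ax=ax_left)

# Seaborn draws a scatter plot, colored by the `group` column, on the other Axes.
sns.scatterplot(data=df, x="hours", y="score", hue="group", ax=ax_right)

# Fine-tune each Axes through its own methods.
ax_left.set_title("pandas .plot()")
ax_left.set_ylabel("score")
ax_right.set_title("Seaborn scatterplot")
fig.tight_layout()
```

Both pandas and Seaborn draw onto ordinary Matplotlib Axes objects, which is why the same `ax_left` and `ax_right` handles can be customized afterwards. In a Jupyter notebook the figure would typically render below the cell.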

This training is for you because:

  • You’re a student wanting to learn about Python data visualization
  • You’re interested in learning how to effectively visualize information
  • You want to become a data analyst or a data scientist

Prerequisites

  • A basic understanding of Python, including lists, variables, and functions. The Introduction to Python Programming Learning Path covers these concepts. 
  • An understanding of Jupyter Notebook/JupyterLab
  • Basic knowledge of pandas is recommended, but not required
  • Basic knowledge of statistics is recommended, but not required

Setup

To follow along using your desktop IDE:

  1. Install or update to the latest version of Anaconda
  2. Launch your command line tool and configure your conda environment

     For macOS and Linux users: Search for and launch Terminal on your system

     For Windows users: Locate and launch Anaconda Prompt on your system

  3. (Optional but recommended) From the command line, run the following commands to create and activate a new environment

conda create --name NEW_ENV_NAME

conda activate NEW_ENV_NAME

  4. Install the required packages from the command line

conda install matplotlib pandas seaborn

  5. Launch JupyterLab from the command line

jupyter lab 
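One way to confirm that the setup steps above worked is to import the three packages in a notebook cell and print their versions. This is only a suggested check, not part of the official course instructions, and the `versions` variable name is an assumption for illustration.

```python
import importlib

# Try importing each package installed in the setup steps and record its version.
versions = {}
for name in ("matplotlib", "pandas", "seaborn"):
    module = importlib.import_module(name)  # raises ImportError if the package is missing
    versions[name] = module.__version__

for name, version in versions.items():
    print(f"{name}: {version}")
```

If any import fails, the active conda environment is probably not the one where the packages were installed.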

To open Anaconda Notebooks:

  1. Go to https://anaconda.cloud
  2. Click on 'Notebooks' from the top navigation menu
  3. Create an account or log in if you already have one

Facilitator Bio

Sophia Yang is a Senior Data Scientist at Anaconda, Inc., where she uses data science to facilitate decision making for various departments across the company.

She volunteers as a Project Incubator at NumFOCUS to help open source scientific projects grow. She is also the author of multiple open-source Python libraries, such as condastats, cranlogs, PyPowerUp, intake-stripe, and intake-salesforce.

She holds an M.S. in Statistics and a Ph.D. in Educational Psychology from The University of Texas at Austin.

Company affiliation: Anaconda, Inc.

https://sophiamyang.medium.com/ 

https://twitter.com/sophiamyang

https://www.linkedin.com/in/sophiamyang/

https://github.com/sophiamyang

https://www.youtube.com/SophiaYangDS 

Questions? Issues? Join our Community page to get help. 

Curriculum 01:32:32

  • How to use Anaconda notebooks 00:01:02
  • How to read a plot
  • How to read a plot + Exercises 00:08:59
  • How to derive basic insights from data
  • Load and explore data in pandas + Exercises 00:14:05
  • Chart types, relationships between variables, and time series + Exercises 00:11:20
  • How to understand data statistically through plotting
  • Basic Seaborn syntax + Exercises 00:11:58
  • Create statistical plots and visualize multiple relationships + Exercises 00:06:03
  • How to customize your plots
  • Combine Matplotlib, pandas, and Seaborn plots + Exercises 00:06:40
  • Add annotations and change styles + Exercises 00:07:03
  • Final project: Visualize UFO sighting data
  • Locate a dataset and develop questions 00:04:24
  • Data cleaning 00:09:26
  • Data exploration and visualization 00:10:32
  • Conclusion 00:01:00
  • End of Course Survey
