Cleaning and Preparing Data with Pandas

Manipulating data in the NCAA games data set.

About this course

Data cleaning is a critical step for any data science, machine learning, statistical, or analytics project. In this two-hour live online course, we'll cover the basics of pruning, cleaning, and formatting data through tasks like dataframe selection, filtering, outlier removal, coalescing blanks, and formatting data types. Afterwards, you will be prepared to handle more advanced areas in Pandas like data transformation, feature selection, and machine learning.
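As a rough illustration of the tasks named above, the following sketch walks through selection, coalescing blanks, filtering out incomplete rows, and outlier removal on a tiny invented table. The column names, the values, the "Unknown" placeholder, and the choice of the 1.5 × IQR rule for outliers are all assumptions made for this example; they are not taken from the course materials or the NCAA data set.

```python
import pandas as pd

# Hypothetical game records; names and values are invented for illustration.
games = pd.DataFrame({
    "team": ["A", "B", "C", "D", "E", "F"],
    "points": [70.0, 65.0, None, 72.0, 68.0, 500.0],
    "venue": ["Home", None, "Away", "Home", "Away", "Home"],
    "notes": ["", "", "", "", "", ""],
})

# Selection: keep only the columns of interest.
subset = games[["team", "points", "venue"]]

# Coalescing blanks: replace a missing venue with a placeholder value.
cleaned = subset.assign(venue=subset["venue"].fillna("Unknown"))

# Filtering: drop rows where the points value is missing.
cleaned = cleaned.dropna(subset=["points"])

# Outlier removal: keep rows within 1.5 * IQR of the quartiles
# (one common rule of thumb, assumed here).
q1 = cleaned["points"].quantile(0.25)
q3 = cleaned["points"].quantile(0.75)
iqr = q3 - q1
in_range = cleaned["points"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned = cleaned[in_range].reset_index(drop=True)

print(cleaned)
```

Whether a suspicious value should be dropped, capped, or kept depends on the data and the question being asked, so the IQR filter here is only one possible choice.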

What you'll learn—and how you can apply it

By the end of this hands-on course, you’ll understand:

  • What constitutes data cleaning and why it is necessary.
  • Techniques for dealing with missing values and outliers.
  • When data should be modified versus removed.

And you’ll be able to:

  • Take raw inputs and sanitize them for more sophisticated tasks.
  • Strategize how to handle outliers, missing values, and bad data. 
  • Cast grungy values into proper data types, including freeform text, dates, and times.
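Casting messy values into proper types might look something like the sketch below. The sample rows, the column names, and the assumed `YYYY-MM-DD` date format are hypothetical; the point is only that `pd.to_numeric` and `pd.to_datetime` with `errors="coerce"` turn unparseable entries into missing values instead of raising an error.

```python
import pandas as pd

# Hypothetical raw input where every column arrives as text.
raw = pd.DataFrame({
    "team": ["  duke ", "UNC", "kansas  "],
    "score": ["78", "n/a", "81"],
    "game_date": ["2023-03-18", "2023-03-19", "not recorded"],
})

typed = raw.assign(
    # Freeform text: trim stray whitespace and normalize case.
    team=raw["team"].str.strip().str.upper(),
    # Numbers: unparseable entries such as "n/a" become NaN.
    score=pd.to_numeric(raw["score"], errors="coerce"),
    # Dates: entries that do not match the assumed format become NaT.
    game_date=pd.to_datetime(
        raw["game_date"], format="%Y-%m-%d", errors="coerce"
    ),
)

print(typed.dtypes)
print(typed)
```

Coercion silently hides bad inputs, so it is usually worth counting the resulting missing values (for example with `typed.isna().sum()`) before deciding how to handle them.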

This training is for you because...

  • You’re a spreadsheet user looking for a better way to clean data.
  • You work with data science professionals seeking more usable data.
  • You want to become a data professional who can transform raw data into usable formats.

Prerequisites  

  • Basic Python proficiency (variables, loops, collections, operators, etc.)
  • Basic Pandas proficiency is recommended, but not required

Jupyter Notebooks / Setup 

Recommended preparation

Recommended follow-up

About the Instructor

Thomas Nield is the founder of Nield Consulting Group and Yawman Flight, as well as an instructor at the University of Southern California. He enjoys making technical content relatable and relevant to those unfamiliar with or intimidated by it. Thomas regularly teaches classes on data analysis, machine learning, mathematical optimization, and practical artificial intelligence. At USC he teaches AI System Safety, developing systematic approaches for identifying AI-related hazards in aviation and ground vehicles. He has authored three books, including Essential Math for Data Science (O’Reilly) and Getting Started with SQL (O'Reilly).

He is also the founder and inventor of Yawman Flight, a company developing universal handheld flight controls for flight simulation and unmanned aerial vehicles. You can find him at:

Nield Consulting Group

Yawman Flight

Twitter

LinkedIn

GitHub

YouTube

Cost: $49. An Anaconda Learning subscription is not required.

You may cancel your registration at any time before the course airs. Once the course is live, there are no refunds or credits. All registered users will receive a Zoom recording of the live course one day after the course airs.

Important info:

The tutorial will be conducted using Zoom Meetings. The name you used to register for the event must match the name you use when you log in to Zoom. If the names will differ, please email [email protected] to let us know.

All participants will have their microphones muted and cameras off upon entry to help minimize distractions during the live event. Support and Q&A will be conducted via the Chat function within Zoom.

Questions? Issues? Contact [email protected].

Curriculum

  • Cleaning and Preparing Data with Pandas (2 hours)
  • Get Started with Anaconda Notebooks
