Machine Learning with Regression in Python: With Ordinary Least Squares, Ridge, Decision Trees and Neural Networks

  • Michael Keith

In this video, you will learn regression techniques in Python using ordinary least squares, ridge, lasso, decision trees, and neural networks.

We start by exploring a census dataset that captures sales from a business in various counties across the United States. We briefly explore the dataset before moving on to model assumptions and feature engineering. We then implement a linear regression, which is a simple model that is easy to interpret, and move through progressively more complex models, ending with a neural network, to see which makes the best predictions on our dataset. To avoid overfitting, we split our dataset into training and test sets; to optimize predictions, we tune hyperparameters with k-fold cross-validation. We then take the model with the lowest error metrics on the test dataset and make predictions on a new dataset. Using these predictions, we recommend to the company’s shareholders, who want to expand the business, which counties to expand to next.
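The workflow described above (split the data, tune each model with k-fold cross-validation, then compare test error) can be sketched roughly as follows. This is a minimal illustration rather than the course's own notebook: the data is a synthetic stand-in generated with scikit-learn's make_regression, and the hyperparameter grids, the 80/20 split, and the choice of RMSE as the error metric are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the county sales data (assumption: not the course dataset).
X, y = make_regression(n_samples=300, n_features=8, noise=15.0, random_state=0)

# Hold out a test set so tuned models are scored on data they never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Candidate models, simple to complex, each with an illustrative grid.
candidates = {
    "ols": (LinearRegression(), {}),
    "ridge": (Ridge(), {"alpha": [0.1, 1.0, 10.0]}),
    "lasso": (Lasso(max_iter=10000), {"alpha": [0.1, 1.0, 10.0]}),
    "tree": (DecisionTreeRegressor(random_state=0), {"max_depth": [3, 5, 8]}),
    "mlp": (
        # Neural networks are sensitive to feature scale, so standardize first.
        make_pipeline(
            StandardScaler(),
            MLPRegressor(solver="lbfgs", max_iter=2000, random_state=0),
        ),
        {"mlpregressor__hidden_layer_sizes": [(16,), (32,)]},
    ),
}

test_rmse = {}
for name, (estimator, grid) in candidates.items():
    # 5-fold cross-validation on the training set picks the hyperparameters.
    search = GridSearchCV(
        estimator, grid, cv=5, scoring="neg_root_mean_squared_error"
    )
    search.fit(X_train, y_train)
    # GridSearchCV refits the best setting on all training data by default.
    predictions = search.predict(X_test)
    test_rmse[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))

# Keep the model with the lowest error on the held-out test set.
best_name = min(test_rmse, key=test_rmse.get)
print(test_rmse)
print("best model:", best_name)
```

Which model wins depends entirely on the data; on a real dataset the ranking may differ from what this synthetic example produces.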

This modeling process will be done in Python 3 in a Jupyter notebook, so it’s a good idea to have Anaconda installed on your computer so you can follow along. We will structure our notebook to be easy to read for others on our team who may want to expand on our analysis.

What You Will Learn

  • Explore a dataset with Pandas

  • Transform variables in a dataset to account for non-linearities and optimize predictions

  • Tune model hyperparameters and score model performance to determine the best model for a given dataset

  • Use statistical modeling to make recommendations to shareholders
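As a rough idea of what the first two items can look like in practice, the sketch below explores a small table with pandas, log-transforms skewed variables, and one-hot encodes a categorical column. The column names and values are invented for illustration and are not taken from the course dataset; the log transform is just one common way to handle a non-linear relationship.

```python
import numpy as np
import pandas as pd

# Hypothetical county-level data; names and values are invented for illustration.
df = pd.DataFrame(
    {
        "county": ["A", "B", "C", "D", "E", "F"],
        "region": ["South", "West", "South", "Midwest", "West", "Northeast"],
        "population": [12_000, 250_000, 48_000, 1_900_000, 75_000, 530_000],
        "median_income": [41_000, 58_000, 39_500, 72_000, 47_000, 66_000],
        "sales": [130.0, 2100.0, 420.0, 15500.0, 700.0, 4800.0],
    }
)

# Quick exploration: column types and summary statistics.
print(df.dtypes)
print(df.describe())

# Log-transform variables that span several orders of magnitude.
# log1p computes log(1 + x), which stays defined when a value is zero.
df["log_population"] = np.log1p(df["population"])
df["log_sales"] = np.log1p(df["sales"])

# Make everything numeric: one-hot encode the categorical column and drop
# the identifier plus the raw columns replaced by their log versions.
features = pd.get_dummies(
    df.drop(columns=["county", "population", "sales", "log_sales"]),
    columns=["region"],
    drop_first=True,  # drop one level to avoid perfectly collinear dummies
    dtype=float,
)
target = df["log_sales"]

print(features.head())
```

Predictions made on the log scale can be mapped back to the original units with np.expm1.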

Who This Video Is For

Software professionals with knowledge of Python basics and data scientists looking to apply data science to industry.

About The Author

Michael Keith

I am Michael Keith. I live in Orlando, FL, and work for Disney Parks and Resorts. I use data science to deliver real results to a successful company, including projects and forecasts that have provided key insights to senior executives. I also work part time as an instructor for Western Governors University, in the Master of Data Analytics and Bachelor of Computer Science programs. I have built many data products in Python and R. I am originally from Salt Lake City, Utah.

 

Supporting material

View source code at GitHub.

About this video

Author(s): Michael Keith
DOI: https://doi.org/10.1007/978-1-4842-6583-3
Online ISBN: 978-1-4842-6583-3
Total duration: 44 min
Publisher: Apress
Copyright information: © Michael Keith 2020

Video Transcript

[MUSIC PLAYING]

Hello, and welcome to the course entitled Machine Learning with Regression in Python. I’m Michael Keith and I will be taking you through this tutorial.

So the course is going to follow a logical flow. We are going to use Python, run out of a Jupyter Notebook. And we are primarily going to use the scikit-learn library, which is an extensive modeling library, but there are also other libraries that need to be installed. And if you have my code from my GitHub and you pip install everything in the requirements.txt, then that should be good enough to run everything.

First thing we are going to look at is preprocessing our data, making it all numerical and doing some transformations on it to optimize our predictions. We’re going to do data visualization throughout the application. We are going to look at different models, ranging from the simple to the complex. We’re going to start at simpler models and go to more complex models, starting with a linear regression and ending with a neural network. We’re going to use these models to make predictions and we’re going to use those predictions to make recommendations.

If you want to follow along, the code that I referenced earlier is at this GitHub link. You can download it and manipulate it and use it and study it and do whatever you want with it.