Introduction to Machine Learning: Supervised Learning

This course is part of the Machine Learning: Theory and Hands-on Practice with Python Specialization

Taught in English

Some content may not be translated

Instructor: Geena Kim

11,986 already enrolled


Course

Gain insight into a topic and learn the fundamentals

3.4

(45 reviews)

Intermediate level

Recommended experience

39 hours (approximately)

Flexible schedule

Learn at your own pace

Progress towards a degree


What you'll learn

Use modern machine learning tools and Python libraries.
Compare logistic regression’s strengths and weaknesses.
Explain how to deal with linearly inseparable data.
Explain what a decision tree is and how it splits nodes.

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

9 quizzes

This course is part of the Machine Learning: Theory and Hands-on Practice with Python Specialization

When you enroll in this course, you'll also be enrolled in this Specialization.


There are 6 modules in this course

In this course, you’ll learn a range of supervised machine learning algorithms and apply them to prediction tasks on different kinds of data. You’ll learn when to use which model and why, and how to improve model performance. We will cover linear and logistic regression, KNN, decision trees, ensemble methods such as Random Forest and Boosting, and kernel methods such as SVM.

Prior coding or scripting knowledge is required. We will use Python extensively throughout the course, so you will need a solid foundation in Python or enough experience with other programming languages to pick up Python quickly. We will learn how to use data science libraries such as NumPy, pandas, matplotlib, statsmodels, and sklearn. The course is designed for programmers who are beginning to work with those libraries; prior experience with them is helpful but not necessary. College-level math skills, including Calculus and Linear Algebra, are required. Our hope for this course is that the math will be understandable but not intimidating.

This course can be taken for academic credit as part of CU Boulder’s MS in Data Science or MS in Computer Science degrees offered on the Coursera platform. These fully accredited graduate degrees offer targeted courses, short 8-week sessions, and pay-as-you-go tuition. Admission is based on performance in three preliminary courses, not academic history. CU degrees on Coursera are ideal for recent graduates or working professionals.

Learn more:
MS in Data Science: https://www.coursera.org/degrees/master-of-science-data-science-boulder
MS in Computer Science: https://coursera.org/degrees/ms-computer-science-boulder

Module 1: Introduction to Machine Learning, Linear Regression

This week, we will build our supervised machine learning foundation. Data cleaning and exploratory data analysis (EDA) might not seem glamorous, but the process is vital for guiding your real-world data projects. Chances are that you have heard of linear regression before. With the buzz around machine learning, it may seem surprising that we are starting with such a standard statistical technique. In "How Not to Be Wrong: The Power of Mathematical Thinking", Jordan Ellenberg refers to linear regression as "the statistical technique that is to social science as the screwdriver is to home repair. It's the one tool you're pretty much going to use, whatever the task" (51). Linear regression is an excellent starting place for solving problems with a continuous outcome. Hopefully, this week will help you appreciate how much you can accomplish with a simple model like this.
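As a rough illustration of the topics this module names (simple linear regression, the least squares method, and R-squared), here is a minimal sketch in plain Python. The data points and the function names are made up for this example and are not taken from the course materials, which use libraries such as statsmodels and sklearn instead.

```python
# Toy data invented for illustration; not taken from the course.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.1, 7.9, 10.2]


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form least squares solution for one explanatory variable.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Fraction of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


intercept, slope = fit_simple_linear_regression(xs, ys)
r2 = r_squared(xs, ys, intercept, slope)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r2:.4f}")
```

An R-squared close to 1 means the line explains most of the variation in the outcome; it says nothing by itself about how the model will perform on unseen data.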

What's included

5 videos, 12 readings, 4 quizzes, 1 programming assignment, 1 peer review, 2 discussion prompts

5 videos (total 67 minutes)

Introduction (16 minutes)
Simple Linear Regression (11 minutes)
Least Squared Method (11 minutes)
Model Fitness and R-squared (8 minutes)
Coefficient Significance and Test Error (18 minutes)

12 readings (total 80 minutes)

Earn Academic Credit for your Work! (10 minutes)
Course Support (10 minutes)
Course Textbooks (10 minutes)
Things of Note for Programming Assignments (5 minutes)
Peer Review Guidelines and Expectations (10 minutes)
A Note About Peer Review Resubmissions (0 minutes)
Honor Code Expectations (5 minutes)
ISLR 3.1: Simple Linear Regression (5 minutes)
ISLR 3.1.1: Estimating the Coefficients (5 minutes)
ISLR 3.1.2: Assessing the Accuracy of the Coefficient Estimates (5 minutes)
ISLR 3.1.3: Assessing the Accuracy of the Model (5 minutes)
Module 1 Slides (10 minutes)

4 quizzes (total 60 minutes)

Peer Review Expectations (15 minutes)
Week 1 Quiz (30 minutes)
Programming Assignments Quiz (5 minutes)
Honor Code Expectations (10 minutes)

1 programming assignment (total 180 minutes)

Week 1: Data Cleaning and EDA (180 minutes)

1 peer review (total 30 minutes)

Week 1: Data Cleaning and EDA (30 minutes)

2 discussion prompts (total 20 minutes)

Introduce Yourself (10 minutes)
Peer Review Expectations (10 minutes)

Module 2: Multilinear Regression

This week we are building on last week's foundation and working with more complex linear regression models. After this week, you will be able to create linear models with several explanatory variables, including categorical ones. Mathematically and syntactically, multiple linear regression models are a natural extension of the simpler linear regression models we learned last week. One difference to keep in mind this week is that our data space now has three or more dimensions instead of two, which has implications for things like creating meaningful visualizations. It is essential to understand how to interpret coefficients. Machine learning involves strategically iterating and improving upon a model. In this week's lab and Peer Review, you will identify weaknesses in linear regression models and strategically improve on them. Hopefully, as you progress through this specialization, you will get better and better at this iterative process.
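To make the idea of several explanatory variables, one of them categorical, more concrete, here is a small sketch in plain Python. It encodes a two-level category as a 0/1 dummy column and solves the normal equations by Gaussian elimination. The data, the column meanings, and the helper names are all hypothetical; in the course you would more likely reach for statsmodels or sklearn.

```python
# Toy rows (hypothetical): a numeric feature, a two-level category, a response.
# The response was generated as y = 1 + 2 * x + 3 * [group == "b"].
rows = [
    (0.0, "a", 1.0),
    (1.0, "a", 3.0),
    (2.0, "b", 8.0),
    (3.0, "b", 10.0),
    (4.0, "a", 9.0),
    (5.0, "b", 14.0),
]

# Design matrix columns: intercept, numeric feature, dummy for group "b".
design = [[1.0, x, 1.0 if group == "b" else 0.0] for x, group, _ in rows]
response = [y for _, _, y in rows]


def normal_equations(design, y):
    """Build X^T X and X^T y for the least squares problem."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return xtx, xty


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


xtx, xty = normal_equations(design, response)
beta = solve_linear_system(xtx, xty)
print("intercept, slope for x, offset for group b:", [round(b, 6) for b in beta])
```

Reading the coefficients: the slope is the expected change in the response per unit of the numeric feature with the group held fixed, and the dummy coefficient is the expected difference between group "b" and the baseline group "a" at the same value of the numeric feature.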

What's included

4 videos, 5 readings, 1 quiz, 1 programming assignment, 1 peer review

4 videos (total 44 minutes)

Linear Regression with Higher-Order Terms: Polynomial Regression (12 minutes)
Bias-Variance Trade-Off (6 minutes)
Linear Regression with Multiple Features (10 minutes)
Feature Selection, Correlation, and Interaction (13 minutes)

5 readings (total 52 minutes)

ISLR 3.2: Multiple Linear Regression (2 minutes)
ISLR 3.3.2: Extensions of the Linear Model (10 minutes)
ISLR 2.2.2: The Bias-Variance Trade-Off (10 minutes)
ISLR 3.3.3: Potential Problems (20 minutes)
Module 2 Slides (10 minutes)

1 quiz (total 30 minutes)

Week 2 Quiz (30 minutes)

1 programming assignment (total 180 minutes)

Week 2: Multiple Linear Regression (180 minutes)

1 peer review (total 60 minutes)

Week 2: Multiple Linear Regression (60 minutes)

Module 3: Logistic Regression

Even though the name logistic regression might suggest otherwise, we will be shifting our attention from regression tasks to classification tasks this week. Logistic regression is a particular case of a generalized linear model. Like linear regression, logistic regression is a widely used statistical tool and one of the foundational tools for your data science toolkit. Classification tasks have many real-world applications, including in finance and biomedicine. In this week's lab, you will see how this classic algorithm helps you predict whether a biopsy slide from the famous Wisconsin Breast Cancer dataset shows a benign or malignant mass. We also advise starting this week on the final project that you will turn in during Week 7 of the course. Find a project dataset, start performing EDA, and define your problem. Use the project rubric as a guide, and don't be afraid to look at a few datasets until you find one well-suited to the project.
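To show how logistic regression turns a linear score into a class prediction, here is a minimal sketch that fits a one-feature model by gradient descent on the log-loss. The toy data, learning rate, and step count are assumptions chosen for illustration; this is not the lab's breast cancer data, and the course itself uses sklearn for this task.

```python
import math

# Toy, centered one-dimensional data invented for illustration.
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(x, y, learning_rate=0.5, n_steps=2000):
    """Fit weight and bias by batch gradient descent on the mean log-loss."""
    weight, bias = 0.0, 0.0
    n = len(x)
    for _ in range(n_steps):
        probs = [sigmoid(weight * xi + bias) for xi in x]
        # Gradient of the mean log-loss with respect to weight and bias.
        grad_w = sum((p - yi) * xi for p, yi, xi in zip(probs, y, x)) / n
        grad_b = sum(p - yi for p, yi in zip(probs, y)) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


def predict(x, weight, bias, threshold=0.5):
    """Classify as 1 when the predicted probability reaches the threshold."""
    return [1 if sigmoid(weight * xi + bias) >= threshold else 0 for xi in x]


weight, bias = fit_logistic_regression(xs, labels)
predictions = predict(xs, weight, bias)
print("weight:", round(weight, 3), "predictions:", predictions)
```

Because this toy data is perfectly separable, the weight keeps growing the longer gradient descent runs, which is one reason regularization is commonly paired with logistic regression.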

What's included

4 videos, 6 readings, 1 quiz, 1 programming assignment, 1 peer review

4 videos (total 63 minutes)

Logistic Regression Introduction (14 minutes)
Logistic Regression Optimization (19 minutes)
Performance Metrics in Classification (13 minutes)
Sklearn Library Usage and Examples (15 minutes)

6 readings (total 75 minutes)

ISLR 4.1 - 4.3.1: An Overview of Classification - Logistic Regression (10 minutes)
ISLR 4.3.2: Estimating the Regression Coefficients (5 minutes)
Confusion Matrix (10 minutes)
ISLR 6.2.1 - 6.2.3 and 5.1: Ridge Regression and Cross-Validation (30 minutes)
Logistic Regression (10 minutes)
Module 3 Slides (10 minutes)

1 quiz (total 30 minutes)

Week 3 Quiz (30 minutes)

1 programming assignment (total 180 minutes)

Week 3: Logistic Regression (180 minutes)

1 peer review (total 60 minutes)

Week 3: Logistic Regression (60 minutes)

Module 4: Non-parametric Models

This week we will learn about non-parametric models. k-Nearest Neighbors (KNN) makes sense on an intuitive level: it predicts from the labeled examples closest to a new point. Decision trees are a supervised learning model that can be used for either regression or classification tasks. In Module 2, we learned about the bias-variance tradeoff, and we've kept that tradeoff in mind as we've moved through the course. Highly flexible tree models have the benefit that they can capture complex, non-linear relationships. However, they are prone to overfitting. This week and next, we will explore strategies like pruning to avoid overfitting with tree-based models. In this week's lab, you will make a KNN classifier for the famous MNIST dataset and then build a spam classifier using a decision tree model. This week we will once again appreciate the power of simple, understandable models.

Keep going with your final project. Once you've finalized your dataset and EDA, start on the initial approach for your main supervised learning task. Review the course material, read research papers, and look at GitHub repositories and Medium articles to understand your topic and plan your approach.
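One way to picture how a classification tree splits a node is to score every candidate threshold on a feature by the weighted impurity of the two child nodes and keep the lowest. The sketch below does this for a single feature, using the Gini and entropy metrics named in this module's videos. The function names and the toy values are hypothetical; sklearn's tree implementation is more elaborate than this.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class proportions, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(values, labels, impurity=gini):
    """Return (threshold, weighted_impurity) of the best binary split on one feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Toy feature values and class labels invented for illustration.
feature = [1.0, 2.0, 3.0, 4.0]
classes = [0, 0, 1, 1]
print("Gini at the root:", gini(classes))
print("Entropy at the root:", entropy(classes))
print("Best split (threshold, weighted Gini):", best_split(feature, classes))
```

A full tree repeats this search over every feature and recurses into each child node. Left unchecked, that recursion is what makes trees prone to overfitting, which is where early stopping and pruning come in.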

What's included

5 videos, 6 readings, 1 quiz, 1 programming assignment, 1 peer review

5 videos (total 65 minutes)

Intro to Non-parametric and K-nearest Neighbors (16 minutes)
Decision Tree Intro, Decision Tree Regressor (11 minutes)
Decision Tree Classifier, Metrics (Gini and Entropy) (19 minutes)
Sklearn Usage, DT Hyperparameters and Early Stopping (9 minutes)
Minimal Cost-complexity Pruning (8 minutes)

6 readings (total 60 minutes)

ISLR: K-Nearest Neighbors (10 minutes)
ISLR 8.1.1: The Basics of Decision Trees-Regression Trees (10 minutes)
ISLR 8.1.2: Classification Trees (10 minutes)
Decision Tree Classifier (10 minutes)
ISLR: Tree Pruning (10 minutes)
Module 4 Slides (10 minutes)

1 quiz (total 30 minutes)

Week 4 Quiz (30 minutes)

1 programming assignment (total 180 minutes)

Week 4: Non-parametric Models (180 minutes)

1 peer review (total 60 minutes)

Week 4: Non-parametric Models (60 minutes)

Module 5: Ensemble Methods

Last week, we learned about tree models. Despite all of their benefits, tree models have some weaknesses that are difficult to overcome. This week we will learn about ensembling methods that counter tree models' tendency to overfit. In many machine learning competitions, the winner uses an ensemble approach, aggregating predictions from multiple tree models. This week you will start by learning about random forests and bagging, a technique that involves training the same algorithm on different samples drawn from the training data. Then you will learn about boosting, an ensemble method where models train sequentially. You will learn about two essential boosting algorithms: AdaBoost and Gradient Boosting.

This week, work on the main analysis of your final project. Iterate and improve on your models. Compare different models. Perform hyperparameter optimization. Sometimes this part of a machine learning project can feel tedious, but hopefully it will be rewarding to see your performance improve.
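The bagging idea described above can be sketched in a few lines: draw bootstrap samples (sampling with replacement), fit the same simple model on each, and combine the models by majority vote. The example below uses one-threshold decision stumps as the base model. It covers bagging only, not boosting, and the data, seed, and function names are assumptions made for this illustration rather than anything from the course labs.

```python
import random
from collections import Counter

# Toy one-dimensional data invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_stump(x, y):
    """Return the threshold t with the fewest training errors for 'predict 1 if x > t'."""
    unique = sorted(set(x))
    # Thresholds outside the data range allow a constant prediction.
    candidates = [unique[0] - 1.0, unique[-1] + 1.0]
    candidates += [(a + b) / 2 for a, b in zip(unique, unique[1:])]

    def errors(t):
        return sum((1 if xi > t else 0) != yi for xi, yi in zip(x, y))

    return min(candidates, key=errors)


def fit_bagged_stumps(x, y, n_models=25, seed=0):
    """Fit one stump per bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(x)
    thresholds = []
    for _ in range(n_models):
        # Bootstrap sample: n indices drawn with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([x[i] for i in idx], [y[i] for i in idx]))
    return thresholds


def predict_bagged(thresholds, value):
    """Majority vote over the individual stump predictions."""
    votes = Counter(1 if value > t else 0 for t in thresholds)
    return votes.most_common(1)[0][0]


thresholds = fit_bagged_stumps(xs, labels)
ensemble_predictions = [predict_bagged(thresholds, v) for v in xs]
print("distinct thresholds:", sorted(set(thresholds)))
print("ensemble predictions:", ensemble_predictions)
```

Each stump sees a slightly different sample, so the thresholds vary from model to model; voting averages out that variation. A random forest adds one more source of randomness by also restricting which features each split may consider.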

What's included

4 videos, 5 readings, 1 quiz, 1 programming assignment, 1 peer review

4 videos (total 42 minutes)

Ensemble Method Intro: Random Forest (8 minutes)
Boosting Introduction (9 minutes)
AdaBoost Algorithm (8 minutes)
Gradient Boosting (15 minutes)

5 readings (total 50 minutes)

ISLR 8.2.1, 8.2.2: Bagging and Random Forests (10 minutes)
ISLR 8.2.3: Boosting (10 minutes)
ESLII 10.1 - 10.4: Boosting Methods - Exponential Loss and AdaBoost (10 minutes)
ESLII 10.10, 10.11: Gradient Boosting (10 minutes)
Module 5 Slides (10 minutes)

1 quiz (total 30 minutes)

Week 5 Quiz (30 minutes)

1 programming assignment (total 180 minutes)

Week 5: Ensembles (180 minutes)

1 peer review (total 60 minutes)

Week 5: Ensembles (60 minutes)

Module 6: Kernel Method

This week we will explore another advanced topic: Support Vector Machines. Don't let the name intimidate you. We will work through this powerful supervised learning model step by step. Hopefully, you will build an intuitive understanding of essential concepts like the difference between hard and soft margins, the kernel trick, and hyperparameter tuning.

Next week, you will submit the three deliverables for your final project: the report, the video presentation, and a link to your GitHub repository. If you aim to finish iterating on your models, hyperparameter optimization, and so on this week, then next week you can polish your report, make sure your GitHub repository is ready for Peer Review, and give an excellent presentation of your work.
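The kernel trick mentioned above can be checked numerically on a small case: a degree-2 polynomial kernel computed directly on two-dimensional inputs gives the same number as an ordinary dot product in an explicit three-dimensional feature space. The sketch below verifies this and also defines an RBF kernel for comparison. The vectors, the gamma value, and the function names are assumptions for illustration, and this is not a full SVM solver.

```python
import math


def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: K(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


def poly2_feature_map(x):
    """Explicit feature map for 2-D inputs whose dot product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: similarity decays with squared distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Two arbitrary points chosen for illustration.
x = (1.0, 2.0)
z = (3.0, 4.0)

implicit = poly2_kernel(x, z)  # computed in the original 2-D space
explicit = dot(poly2_feature_map(x), poly2_feature_map(z))  # computed in 3-D

print("kernel value:", implicit)
print("explicit feature-space dot product:", explicit)
print("RBF similarity:", rbf_kernel(x, z))
```

The point of the trick is that an SVM only ever needs these pairwise kernel values, so it can separate data that is not linearly separable in the original space without constructing the higher-dimensional features.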

What's included

4 videos, 4 readings, 1 quiz, 1 programming assignment, 1 peer review

4 videos (total 59 minutes)

Support Vector Machine Introduction (16 minutes)
Support Vector Machine: Soft Margin Classifier (15 minutes)
Support Vector Machine: Kernel Trick (9 minutes)
Support Vector Machine: Performance (17 minutes)

4 readings (total 40 minutes)

ISLR 9.1: Maximal Margin Classifier (10 minutes)
ISLR 9.2: Support Vector Classifiers (10 minutes)
ISLR 9.3: Support Vector Machines (10 minutes)
Module 6 Slides (10 minutes)

1 quiz (total 30 minutes)

Week 6 Quiz (30 minutes)

1 programming assignment (total 180 minutes)

Week 6: SVM Lab (180 minutes)

1 peer review (total 120 minutes)

Week 6: SVM Lab (120 minutes)

Instructor

Instructor ratings

2.8 (23 ratings)

Geena Kim

University of Colorado Boulder

3 courses, 19,600 learners

Offered by

University of Colorado Boulder

Get a head start on your degree

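This course is part of the following degree programs offered by University of Colorado Boulder:

Master of Science in Data Science
Master of Science in Computer Science
Master of Engineering in Engineering Management
Master of Science in Electrical Engineering

If you are admitted and enroll, your coursework can count toward your degree learning and your progress can transfer with you.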

Learner reviews

3.4 (45 reviews)

5 stars: 37.77%
4 stars: 20%
3 stars: 8.88%
2 stars: 8.88%
1 star: 24.44%

Frequently asked questions

What is a cross-listed course?

A cross-listed course is offered under two or more CU Boulder degree programs on Coursera. For example, Dynamic Programming, Greedy Algorithms is offered as both CSCA 5414 for the MS-CS and DTSA 5503 for the MS-DS.

· You may not earn credit for more than one version of a cross-listed course.

· You can identify cross-listed courses by checking your program’s student handbook.

· Your transcript will be affected. Cross-listed courses are considered equivalent when evaluating graduation requirements. However, we encourage you to take your program's versions of cross-listed courses (when available) to ensure your CU transcript reflects the substantial amount of coursework you are completing directly in your home department. Any courses you complete from another program will appear on your CU transcript with that program’s course prefix (e.g., DTSA vs. CSCA).

· Programs may have different minimum grade requirements for admission and graduation. For example, the MS-DS requires a C or better on all courses for graduation (and a 3.0 pathway GPA for admission), whereas the MS-CS requires a B or better on all breadth courses and a C or better on all elective courses for graduation (and a B or better on each pathway course for admission). All programs require students to maintain a 3.0 cumulative GPA for admission and graduation.

Can I take cross-listed courses to fulfill my degree requirements?

Yes. Cross-listed courses are considered equivalent when evaluating graduation requirements. You can identify cross-listed courses by checking your program’s student handbook.

How do I upgrade and earn credit from CU Boulder?

You may upgrade and pay tuition during any open enrollment period to earn graduate-level CU Boulder credit for this course. Because this course is cross-listed in both the MS in Computer Science and the MS in Data Science programs, you will need to decide which program you would like to earn the credit from before you upgrade.

MS in Data Science (MS-DS) Credit: To upgrade to the for-credit data science (DTSA) version of this course, use the MS-DS enrollment form. See How It Works.

MS in Computer Science (MS-CS) Credit: To upgrade to the for-credit computer science (CSCA) version of this course, use the MS-CS enrollment form. See How It Works.

If you are unsure of which program is the best fit for you, review the MS-CS and MS-DS program websites, and then contact datascience@colorado.edu or mscscoursera-info@colorado.edu if you still have questions.

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.
The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.
