Back to Machine Learning: Classification

3,685 ratings

Case Studies: Analyzing Sentiment & Loan Default Prediction
In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information, ...). In our second case study for this course, loan default prediction, you will tackle financial data and predict when a loan is likely to be risky or safe for the bank. These tasks are examples of classification, one of the most widely used areas of machine learning, with a broad array of applications, including ad targeting, spam detection, medical diagnosis and image classification.
In this course, you will create classifiers that provide state-of-the-art performance on a variety of tasks. You will become familiar with the most successful techniques, which are most widely used in practice, including logistic regression, decision trees and boosting. In addition, you will be able to design and implement the underlying algorithms that can learn these models at scale, using stochastic gradient ascent. You will implement these techniques on real-world, large-scale machine learning tasks. You will also address significant tasks you will face in real-world applications of ML, including handling missing data and measuring precision and recall to evaluate a classifier. This course is hands-on, action-packed, and full of visualizations and illustrations of how these techniques will behave on real data. We've also included optional content in every module, covering advanced topics for those who want to go even deeper!
Learning Objectives: By the end of this course, you will be able to:
-Describe the input and output of a classification model.
-Tackle both binary and multiclass classification problems.
-Implement a logistic regression model for large-scale classification.
-Create a non-linear model using decision trees.
-Improve the performance of any model using boosting.
-Scale your methods with stochastic gradient ascent.
-Describe the underlying decision boundaries.
-Build a classification model to predict sentiment in a product review dataset.
-Analyze financial data to predict loan defaults.
-Use techniques for handling missing data.
-Evaluate your models using precision-recall metrics.
-Implement these techniques in Python (or in the language of your choice, though Python is highly recommended)....
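
As a rough illustration of the precision-recall objective in the list above, here is a minimal sketch in plain Python. The function name `precision_recall`, the +1/-1 label encoding, and the toy labels are assumptions made for this example, not material taken from the course.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class.

    Hypothetical helper for illustration; labels are assumed to be +1 / -1.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    # Guard against division by zero when nothing is predicted or labeled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, -1, -1, 1]
y_pred = [1, -1, 1, 1, -1, 1]
precision, recall = precision_recall(y_true, y_pred)
```

Precision asks how many of the predicted positives were correct, while recall asks how many of the actual positives were found.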

SM

Jun 14, 2020

A very deep and comprehensive course for learning some of the core fundamentals of Machine Learning. Can get a bit frustrating at times because of numerous assignments :P but a fun thing overall :)

By Alex H

•Feb 7, 2018

Relying on a non-open source library for all of the code examples vitiates the value of this course. It should use Pandas and sklearn.

By Lewis C L

•Jun 13, 2019

First, coursera is a ghost town. There is no activity on the forum. Real responses stopped a year ago. Most of the activity is from 3 years ago. This course is dead.

Two, this course seems to approach the topic as teaching inadequate ways to perform various tasks to show the inadequacies. You can learn from that; we will make mistakes or use approaches that are less than ideal. But, that should be a quick "don't do this," while moving on to better approaches.

Three, the professors seem to dismiss batch learning as a "dodgy" technique. If Hinton, Bengio, and other intellectual leaders of the field recommend it as the preferred technique, then it probably is.

Four, the professors emphasize log likelihood. Mathematically, minus the log likelihood is the same as cross-entropy cost. The latter is more robust and applicable to nearly every classification problem (except decision trees), and so is a more versatile formulation. As neither actually plays any role in the training algorithm except as guidance for the gradient and epsilon formulas and as a diagnostic, the more versatile and robust approach should be preferred.
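
The claim that minus the log likelihood matches the cross-entropy cost can be checked numerically for the binary case. The sketch below is only an illustration under stated assumptions: labels are encoded as +1/-1, `scores` stands in for the linear scores of a logistic model, and the function names and numbers are invented for this example.

```python
import math


def sigmoid(score):
    return 1.0 / (1.0 + math.exp(-score))


def log_likelihood(scores, labels):
    """Sum of log P(y_i | x_i) under a logistic model, labels in {+1, -1}."""
    total = 0.0
    for s, y in zip(scores, labels):
        p_pos = sigmoid(s)
        total += math.log(p_pos if y == 1 else 1.0 - p_pos)
    return total


def cross_entropy(scores, labels):
    """Binary cross-entropy summed over examples, targets mapped to {1, 0}."""
    total = 0.0
    for s, y in zip(scores, labels):
        target = 1.0 if y == 1 else 0.0
        p = sigmoid(s)
        total -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return total


# Hypothetical scores and labels, chosen only for the check.
scores = [2.0, -1.0, 0.5, -3.0]
labels = [1, -1, -1, 1]
ll = log_likelihood(scores, labels)
ce = cross_entropy(scores, labels)
# ll and ce differ only in sign (up to floating-point error).
```

Both functions compute the same per-example terms, so the two quantities have the same gradients up to sign.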

The professors seem very focused on decision trees. Despite the "apparent" intuitive appeal and computational tractability, the technique seems to be eclipsed by other methods. Worth teaching and occasionally using to be sure, but not for 3/4 of the course.

There are many mechanical problems that remain in the material. At least 6 errors in formulas or instructions remain. Most can be searched for on the forum to find some resolution, through a lot of noise. Since the last corrections were made 3 years ago, the UW or Coursera's lack of interest shows.

It was a bit unnecessary to use a huge dataset that resulted in a training matrix of over 10 billion cells. Sure, if you wanted to focus on methods for scaling--very valuable indeed--go for it. But, this led to unnecessarily long training times and data issues that were, at best, orthogonal to the overall purpose of highlighting classification techniques and encouraging good insights about how classification techniques work.

The best thing about the course was the willingness to allow various technologies to be used. The developers went to some lengths to make this possible. It was far more work to stray outside the velvet ropes of the Jupyter notebooks, but it was very rewarding.

Finally, the quizzes were dependent on numerical point answers that could often be matched only by using the same exact technology and somewhat sloppy approaches (no lowercase for word sentiment analysis, etc.). It does take some cleverness to think of questions that lead to the right answer if the concepts are implemented properly. It doesn't count when the answers rely precisely on anomalies.

I learned a lot, but only because I wrote my own code and was able to think more clearly about it, but that was somewhat of a side effect.

All in all, a disappointing somewhat out of date class.

By Saqib N S

•Oct 16, 2016

Hats off to the team who put the course together! Prof Guestrin is a great teacher. The course gave me in-depth knowledge regarding classification and the math and intuition behind it. It was fun!

By Ian F

•Jul 17, 2017

Good overview of classification. The python was easier in this section than previous sections (although maybe I'm just better at it by this point.) The topics were still as informative though!

By RAJKUMAR R V

•Oct 2, 2019

It will definitely help you understand most of the algorithms, from the basics to depth. Even if you are already aware of most of the classification topics covered elsewhere, this course will add a considerable amount of extra input that will help you understand and explore more things in machine learning.

By Christian J

•Jan 25, 2017

Very impressive course, I would recommend taking course 1 and 2 in this specialization first since they skip over some things in this course that they have explained thoroughly in those courses

By Jason M C

•Mar 29, 2016

This continues UWash's outstanding Machine Learning series of classes, and is just as impressive, if not more so, than the Regression class it follows. I'm super-excited for the next class!

By Feng G

•Jul 12, 2018

Very helpful. Many thanks.
Some suggestions:
1. Please add LDA to the module.
2. It would really help if you could provide more examples for pandas and scikit-learn users in the programming assignments, as you do in the regression module.

By Saransh A

•Oct 31, 2016

Well, this series just doesn't cease to amaze me! Another great course after the introductory and regression courses. Though I really missed detailed explanations of Random Forest and other ensemble methods. Also, SVM was not discussed, but there were many other topics which other courses and books easily skip. The programming assignments were fine, more focused on teaching the algorithms than trapping someone in the coding part. This series is the series for someone who really wants to get a hold of what machine learning really is. One thing which I really like about this course is that there are optional videos from time to time, where they discuss the mathematical aspects of the algorithms that they teach, which really quenches my thirst for mathematical rigour. Definitely continuing this specialisation forward.

By Sauvage F

•Mar 29, 2016

Excellent course, I'm very fond of Carlos's jokes mixed with the hard concepts ^^. Lectures are precise, concise and comprehensive. I really enjoyed diving into the depths of the algorithms' mechanics (like Emily did in the Regression Course). I also deeply appreciated the real-world examples in the lectures and the real-world datasets of the assignments.

Some may regret the absence of a few "classic" algorithms like SVM but Carlos definitely made his point about it in the forum and did not exclude the addition of an optional module about it.

I found some of the assignments less challenging than during the Regression Course, but maybe I'm just getting better at Machine-Learning and Python ^^.

Thanks again to Emily and Carlos for the brilliant work at this very promising specialization.

By uma m r m

•Aug 4, 2018

I could give five stars for this course, but I removed one star because the GraphLab API annoyed me a lot of times. The theory covered in this course is good. The programming assignments are well structured, but if APIs like pandas, NumPy and scikit-learn were used it would have made my life easier.

By Dilip K

•Dec 21, 2016

Excellent course that I have already recommended to a couple of people. Only annoying thing is the continued inconsistency between the Graphlab version and other versions (I use sframe with python - no graphlab) - some of the instructions are less than clear and needlessly waste time.

By Daisuke H

•May 18, 2016

I really love this Classification course as well as Regression course!! This course is covering both mathematical background and practical implementation very well. Assignments are moderately challenging and it was a very good exercise for me to have a good intuition about classification algorithms. I only used standard Python libraries such as numpy, scikit-learn, matplotlib and pandas, and there were no problems for me to complete all of the assignments without any use of IPython, SFrames, GraphLab Create at all. I would say thank you so much to Carlos and Emily to give me such a great course!!

P.S. This course would be perfect if it covered bootstrap and Random Forest in detail.

By Ridhwanul H

•Oct 16, 2017

As usual this was also a great course, except

⊃゜Д゜）⊃ decision trees ⊂（゜Д゜⊂

I am not saying anything presented is bad or incorrect, but I just don't feel familiar with this. It is one tough topic to understand. I think it would have been great if there were some videos and lectures where some programming examples were also given; this would have helped out a lot in the programming assignments.

Also, there is another thing that I think should have been addressed (at least in one of the courses, unless you did it in course 4, the last one, which I haven't done yet): vectorisation. Instead of looping through each weight, how it could be achieved at once through vectorisation.
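
For readers wondering what that vectorisation might look like, here is a minimal NumPy sketch comparing a per-weight loop with a single matrix product for the logistic regression log-likelihood gradient. The function names, the 0/1 `indicator` encoding, and the toy data are assumptions for illustration and do not come from the course notebooks.

```python
import numpy as np


def gradient_loop(X, indicator, probs):
    """Gradient of the log likelihood, computed one weight at a time."""
    errors = indicator - probs
    grad = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        # Derivative for weight j: sum over examples of error * feature j.
        grad[j] = np.dot(errors, X[:, j])
    return grad


def gradient_vectorized(X, indicator, probs):
    """Same gradient for all weights at once, as one matrix-vector product."""
    return X.T @ (indicator - probs)


# Hypothetical toy problem: 5 examples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
weights = np.array([0.1, -0.2, 0.3])
indicator = np.array([1.0, 0.0, 1.0, 1.0, 0.0])  # 1 where the label is positive
probs = 1.0 / (1.0 + np.exp(-X @ weights))       # predicted P(y = +1 | x)

grad_loop = gradient_loop(X, indicator, probs)
grad_vec = gradient_vectorized(X, indicator, probs)
```

Both functions return the same vector; the vectorised form simply lets NumPy do the loop over weights internally.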

By Gerard A

•May 18, 2020

So, there appear to be a lot of smarter people than me out there. Learnt some good Python basics, and the skeleton approach is quite OK, as doing it from scratch may be a bridge too far for people who studied maths at uni many years ago. Carlos is great, but I feel that 1) AdaBoost could have had an example to relate to (I looked on YouTube and it clicked then); 2) I miss a discussion of the Gini coefficient and of when to use which type of decision tree; 3) SVM and Bayesian methods are missing, meaning two courses instead of one are really required here; 4) there are no tutors, so how many are taking the course: few, and why? 5) dropping the original last 2 modules seems not a great idea.

By Apurva A

•Jun 14, 2016

This course is very nice and covers some very important concepts like decision trees, boosting, and online learning, apart from logistic regression. More importantly, everything here has been implemented from scratch, so understanding the code becomes very easy.

The lectures and slides were very intuitive. Carlos has explained everything very properly and even some of the very tough concepts have been explained in a proper manner from figures and graphs.

There are lots and lots of Python assignments to review what we have learned in the lectures.

Overall, it's a must-take course for all who want an insight into classification in ML.

By Edward F

•Jun 25, 2017

I took the 4 (formerly 6) courses that comprised this certification, so I'm going to provide the same review for all of them.

This course and the specialization are fantastic. The subject matter is very interesting, at least to me, and the professors are excellent, conveying what could be considered advanced material in a very down-to-Earth way. The tools they provide to examine the material are useful and they stretch you out just far enough.

My only regret/negative is that they were unable to complete the full syllabus promised for this specialization, which included recommender systems and deep learning. I hope they get to do that some day.

By Benoit P

•Dec 29, 2016

This whole specialization is an outstanding program: the instructors are entertaining, and they strike the right balance between theory and practice. Even though I consider myself quite literate in statistics and numerical optimization, I learned several new techniques that I was able to directly apply in various parts of my job. We really go in depth: while other classes I've taken limit themselves to an inventory of available techniques, in this specialization I get to implement key techniques from scratch. Highly, highly recommended.

FYI: the Python level required is really minimal, and the total time commitment is around 4 hours per week.

By Liang-Yao W

•Aug 11, 2017

The course walks through (and works through) the concepts of linear classifiers, logistic regression, decision trees, boosting, etc. For me it is a good introduction to these fundamental ideas, with depth but not so deep as to be distracting.

I personally became interested in knowing a bit more of the theoretical basis of tools or concepts like boosting or maximum likelihood. The course understandably doesn't go that much into math and theory, which leaves me a bit unsatisfied :P. But that is probably too much to ask of a short course, and I do think the course covers great material already.

By Paul C

•Aug 13, 2016

This Machine Learning class and the rest of the Machine Learning series from the University of Washington is the best material on the subject matter. What really sets this course and series apart is the case-based methodology as well as the in-depth technical subject matter. Specifically, the step-through coding of the algorithms provides key insight that is seriously missed in other classes, even in traditional academic settings. I highly encourage the authors and other Coursera publishers to continue to publish more educational material in the same framework.

By Sean S

•Mar 9, 2018

I am generally very happy with the style, pace, and content of this entire specialization. This course is no exception and exposed me to a lot of new concepts and helped me to improve my python programming skills. I am left wondering if the programming assignments were made easier over time given all of the hints and "checkpoints" for code that was already supplied. I understand this is not a programming course but I probably would have been okay with toiling away at the algorithms for a few more hours without the hints. But that's just me. Great course.

By Ferenc F P

•Jan 18, 2018

This is a very good course on classification. It starts with logistic regression (with and without regularization) and then makes a very good introduction to decision trees and boosting. It also has a very good explanation of stochastic gradient descent. The only drawback is that for some quizzes the result is different with scikit-learn than with GraphLab, while the quiz is prepared for GraphLab results. Thus, with scikit-learn one may fail some of them.

By Samuel d Z

•Jul 10, 2017

AWESOME!!! Very well structured. Concepts are explained in small and short videos which focus on one thing at a time. Unnecessary clutter is removed, and deep dives can now be done with this solid foundation. Also, the Python programming part teaches so much and, again, you are only asked to program the essentials, while the non-essentials or "special tricks" are done for you, so you can see and learn from them without having to search on the web. THANKS.

By Adrian L

•Sep 2, 2020

Really good, excellent approach on demonstrating logistic regression classification, decision trees, boosting, dealing with overfitting, missing data and different tools to improve results adapted to our challenges.

Recommended for those who are interested in getting into the algorithms and statistics behind the scenes of currently popular classification algorithms and applying them using either TuriCreate or scikit-learn (Python).

Thanks.

By Yifei L

•Mar 27, 2016

This is a very good course on classification, as were the previous two.

Good explanation of topics like logistic regression and stochastic gradient descent. The assignments are well designed.

However the decision tree part should introduce entropy and gini which are mainly used for choosing the splitting feature. Also the random forest is worth discussing.
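
For context on the entropy and Gini measures mentioned above, here is a small sketch of how either one can score a candidate split. It is an illustration only: the helper names, the "safe"/"risky" labels, and the two toy splits are hypothetical and are not taken from the course assignments.

```python
import math
from collections import Counter


def class_proportions(labels):
    counts = Counter(labels)
    return [count / len(labels) for count in counts.values()]


def entropy(labels):
    """Shannon entropy in bits; 0 for a pure node."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def gini(labels):
    """Gini impurity; 0 for a pure node."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def weighted_impurity(groups, impurity):
    """Size-weighted average impurity of the child nodes of a split."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * impurity(group) for group in groups)


# Hypothetical loan labels and two candidate splits of the same parent node.
parent = ["safe", "safe", "risky", "risky"]
split_a = [["safe", "safe"], ["risky", "risky"]]   # separates the classes
split_b = [["safe", "risky"], ["safe", "risky"]]   # separates nothing

# A split is preferred when it lowers the weighted impurity the most.
gain_a = entropy(parent) - weighted_impurity(split_a, entropy)
gain_b = entropy(parent) - weighted_impurity(split_b, entropy)
```

Here `split_a` yields the larger impurity reduction under both measures, so a tree learner using either criterion would choose it.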

Overall, this is a good course which contains a handful of knowledge.
