Analyzing product sentiment

In this module, we focused on classifiers, applying them to analyzing product sentiment, and understanding the types of errors a classifier makes. We also built an exciting IPython notebook for analyzing the sentiment of real product reviews.
In this assignment, we are going to explore this application further: training a sentiment analysis model using a set of key polarizing words, verifying the weights learned for each of these words, and comparing the results of this simpler classifier with those of the one using all of the words. These techniques will be a core component in your capstone project.
Follow the rest of the instructions on this page to complete your program. When you are done, instead of uploading your code, you will answer a series of quiz questions (see the quiz after this reading) to document your completion of this assignment. The instructions will indicate what data to collect for answering the quiz.

Learning outcomes

  • Execute sentiment analysis code with the IPython notebook
  • Load and transform real text data
  • Use the .apply() function to create new columns (features) for our model
  • Compare results of two models, one using all words and the other using a subset of the words
  • Compare learned models with majority class prediction
  • Examine the predictions of a sentiment model
  • Build a sentiment analysis model using a classifier
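The majority class comparison listed among the outcomes above can be sketched in a few lines. This is only an illustration, not the assignment's starter code: the function name majority_class_accuracy and the toy labels are hypothetical, and the sketch uses plain Python rather than the notebook's own tools. The idea is that a model is only useful if it beats a predictor that always outputs the most common label.

```python
from collections import Counter


def majority_class_accuracy(labels):
    """Return the most common label and the accuracy of always predicting it."""
    if not labels:
        raise ValueError("labels must be non-empty")
    most_common_label, count = Counter(labels).most_common(1)[0]
    return most_common_label, count / len(labels)


# Hypothetical toy labels (1 = positive, 0 = negative); not taken from the dataset.
toy_labels = [1, 1, 1, 0, 1, 0, 1, 1]
label, baseline = majority_class_accuracy(toy_labels)
print(label, baseline)
```

A learned classifier's accuracy on the same data can then be compared against this baseline value.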

Resources you will need

You will need to install the software tools or use the free Amazon EC2 machine. Instructions for both options are provided in the reading for Module 1.

Download the data and starter code

Before getting started, you will need to download the dataset and the starter IPython notebook that we used in the module.

  • Download the product review dataset here in SFrame format

What you will do

Now you are ready! We are going to do four tasks in this assignment. There are several results you need to gather along the way to enter into the quiz after this reading.
In the IPython notebook above, we used the word counts for all words in the reviews to train the sentiment classifier model.
Now, we are going to follow a similar path, but only use this subset of the words:

Often, ML practitioners will throw out words they consider "unimportant" before training their model. This procedure can often be helpful in terms of accuracy. Here, we are going to throw out all words except for the very few above. Using so few words in our model will hurt our accuracy, but help us interpret what our classifier is doing.
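As a rough sketch of what "throwing out all words except a few" means in practice, the snippet below counts the words in one review and then keeps only the counts for a chosen subset. It is an assumption-laden illustration in plain Python: SELECTED_WORDS holds two placeholder words that stand in for the assignment's actual list, and count_words and selected_word_counts are hypothetical helper names, not functions from the starter notebook. In the notebook, a function of this shape is the kind of thing one would pass to .apply() on a word-count column to create one new feature column per word.

```python
from collections import Counter

# Hypothetical placeholder words for illustration only;
# substitute the subset of words given in the assignment.
SELECTED_WORDS = ["great", "terrible"]


def count_words(text):
    """Count whitespace-separated, lowercased words in a review."""
    return Counter(text.lower().split())


def selected_word_counts(word_counts, selected_words):
    """Keep only the counts for the selected words (0 if a word is absent)."""
    return {word: word_counts.get(word, 0) for word in selected_words}


# Made-up review text, not from the dataset.
review = "Great stroller and a great price but terrible wheels"
all_counts = count_words(review)
features = selected_word_counts(all_counts, SELECTED_WORDS)
print(features)
```

Every word outside the subset is simply ignored, which is why the resulting model is easier to interpret: each learned weight corresponds to exactly one of the selected words.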
