Participants:
1. 余艾锶, 2. 程会林, 3. 黄莉婷, 4. 梁清源, 5. 曾伟, 6. 陈南浩
Completion check covers: blog posts (reading notes), answers to the exercises, code, and answers to the guiding questions
《Text Mining and Analytics》(12.13)
https://www.coursera.org/learn/text-mining
Week1:
Guiding Questions
Develop your answers to the following guiding questions while watching the video lectures throughout the week.
- What does a computer have to do in order to understand a natural language sentence?
- What is ambiguity?
- Why is natural language processing (NLP) difficult for computers?
- What is bag-of-words representation?
- Why is this word-based representation more robust than representations derived from syntactic and semantic analysis of text?
- What is a paradigmatic relation?
- What is a syntagmatic relation?
- What is the general idea for discovering paradigmatic relations from text?
- What is the general idea for discovering syntagmatic relations from text?
- Why do we want to do Term Frequency Transformation when computing similarity of context?
- How does BM25 Term Frequency transformation work?
- Why do we want to do Inverse Document Frequency (IDF) weighting when computing similarity of context? (A small sketch of both transformations follows this list.)
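For the last two questions, here is a minimal Python sketch (my own illustration, not code from the course; the names bm25_tf, idf and the parameter k are assumptions). The BM25-style TF transformation grows sublinearly with the raw count and is capped at k + 1, so one heavily repeated context word cannot dominate the similarity, while IDF gives rare words a higher weight than common ones:

import math

def bm25_tf(count, k=1.2):
    # BM25-style term frequency transformation: sublinear in the raw
    # count and upper-bounded by k + 1.
    return (k + 1) * count / (count + k)

def idf(doc_freq, num_docs):
    # Inverse document frequency: words that appear in many documents
    # get a low weight, rare words get a high weight.
    return math.log((num_docs + 1) / doc_freq)

# Raw counts 1, 5 and 100 are compressed into the narrow range (1, 2.2):
for c in (1, 5, 100):
    print(c, round(bm25_tf(c), 3))
print(round(idf(doc_freq=2, num_docs=1000), 3), round(idf(doc_freq=500, num_docs=1000), 3))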
Not yet completed:
Completed:
黄莉婷
http://blog.csdn.net/weixin_40962955/article/details/78828721
梁清源
http://blog.csdn.net/qq_33414271/article/details/78802272
http://www.jianshu.com/u/337e85e2a284
曾伟
http://www.jianshu.com/p/9e520d5ccdaa
程会林
http://blog.csdn.net/qq_35159009/article/details/78836340
余艾锶
http://blog.csdn.net/xy773545778/article/details/78829053
陈南浩
http://blog.csdn.net/DranGoo/article/details/78850788
Week2:
Guiding Questions
Develop your answers to the following guiding questions while watching the video lectures throughout the week.
- What is entropy? For what kind of random variables does the entropy function reach its minimum and maximum, respectively? 1
- What is conditional entropy? 2
- What is the relation between conditional entropy H(X|Y) and entropy H(X)? Which is larger? 3
- How can conditional entropy be used for discovering syntagmatic relations? 4
- What is mutual information I(X;Y)? How is it related to entropy H(X) and conditional entropy H(X|Y)? 5
- What’s the minimum value of I(X;Y)? Is it symmetric? 6
- For what kind of X and Y, does mutual information I(X;Y) reach its minimum? For a given X, for what Y does I(X;Y) reach its maximum? 1
- Why is mutual information sometimes more useful for discovering syntagmatic relations than conditional entropy? (See the first sketch after this list.)
- What is a topic? 2
- How can we define the task of topic mining and analysis computationally? What's the input? What's the output? 3
- How can we heuristically solve the problem of topic mining and analysis by treating a term as a topic? What are the main problems of such an approach? 4
- What are the benefits of representing a topic by a word distribution? 5
- What is a statistical language model? What is a unigram language model? How can we compute the probability of a sequence of words given a unigram language model? 6
- What is the Maximum Likelihood estimate of a unigram language model given a text article? 1 (See the second sketch after this list.)
- What is the basic idea of Bayesian estimation? What is a prior distribution? What is a posterior distribution? How are they related with each other? What is Bayes rule? 2
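Two small Python sketches for the questions above (my own illustrations, not course code; the toy numbers and function names are assumptions). The first computes H(X), H(X|Y) and I(X;Y) = H(X) - H(X|Y) from the joint distribution of two binary word-occurrence variables, which is how candidate syntagmatic relations are scored:

import math
from collections import Counter

def entropy(dist):
    # H(X) = -sum_x p(x) log2 p(x); 0 for a deterministic variable,
    # maximal when all outcomes are equally likely.
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def conditional_entropy(joint):
    # H(X|Y) = sum_y p(y) H(X | Y = y), from a joint {(x, y): p(x, y)}.
    py = Counter()
    for (x, y), p in joint.items():
        py[y] += p
    h = 0.0
    for y, p_y in py.items():
        cond = {x: p / p_y for (x, yy), p in joint.items() if yy == y}
        h += p_y * entropy(cond)
    return h

def mutual_information(joint):
    # I(X;Y) = H(X) - H(X|Y); symmetric and always >= 0.
    px = Counter()
    for (x, y), p in joint.items():
        px[x] += p
    return entropy(px) - conditional_entropy(joint)

# Toy joint distribution: X = "eats" occurs in a sentence, Y = "meat" occurs.
joint = {(1, 1): 0.20, (1, 0): 0.05, (0, 1): 0.05, (0, 0): 0.70}
print(round(mutual_information(joint), 4))  # > 0: they co-occur more than by chance

The second sketch is the maximum likelihood estimate of a unigram language model, p(w | d) = c(w, d) / |d|, and the probability of a word sequence as the product of the per-word probabilities:

from collections import Counter

def unigram_mle(doc_words):
    # MLE of a unigram language model: p(w | d) = count(w, d) / |d|.
    counts = Counter(doc_words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def sequence_prob(words, lm):
    # Under a unigram LM the words are independent, so the probability
    # of a sequence is the product of the per-word probabilities.
    p = 1.0
    for w in words:
        p *= lm.get(w, 0.0)
    return p

doc = ["text", "mining", "is", "text", "analysis"]
lm = unigram_mle(doc)                                  # p("text") = 2/5 = 0.4
print(round(sequence_prob(["text", "mining"], lm), 3)) # 0.4 * 0.2 = 0.08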
Not yet completed: 陈南浩
Completed:
梁清源
http://blog.csdn.net/qq_33414271/article/details/78871154
程会林
https://www.jianshu.com/p/61614d406b0f
黄莉婷
http://blog.csdn.net/weixin_40962955/article/details/78877103
余艾锶
http://blog.csdn.net/xy773545778/article/details/78848613
曾伟
http://blog.csdn.net/qq_39759159/article/details/78882651
Week3:
Guiding Questions
Develop your answers to the following guiding questions while watching the video lectures throughout the week.
- What is a mixture model? In general, how do you compute the probability of observing a particular word from a mixture model? What is the general form of the expression for this probability? 3
- What does the maximum likelihood estimate of the component word distributions of a mixture model behave like? In what sense do they “collaborate” and/or “compete”? 4
- Why can we use a fixed background word distribution to force a discovered topic word distribution to reduce its probability on the common (often non-content) words? 5
- What is the basic idea of the EM algorithm? What does the E-step typically do? What does the M-step typically do? In which of the two steps do we typically apply the Bayes rule? Does EM converge to a global maximum? 6 (A toy sketch follows this list.)
- What is PLSA? How many parameters does a PLSA model have? How is this number affected by the size of our data set to be mined? How can we adjust the standard PLSA to incorporate a prior on a topic word distribution? 1
- How is LDA different from PLSA? What is shared by the two models? 2
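A toy Python sketch of the EM idea for the simplest mixture discussed this week: one fixed background word distribution mixed, with weight lam, with one unknown topic word distribution (my own illustration under those assumptions, not course code). The E-step applies Bayes' rule to get P(z = topic | w) for every word; the M-step re-normalizes the expected counts into a new topic distribution. EM only guarantees convergence to a local maximum of the likelihood; note how the background absorbs the common word "the", so the topic keeps its probability mass for content words:

from collections import Counter

def em_topic_vs_background(doc_words, background, lam=0.5, iters=50):
    counts = Counter(doc_words)
    vocab = list(counts)
    topic = {w: 1.0 / len(vocab) for w in vocab}   # start from a uniform topic
    for _ in range(iters):
        # E-step: Bayes' rule gives P(z = topic | w) for every word.
        z = {}
        for w in vocab:
            p_topic = (1.0 - lam) * topic[w]
            p_bg = lam * background.get(w, 1e-12)
            z[w] = p_topic / (p_topic + p_bg)
        # M-step: re-estimate the topic distribution from expected counts.
        total = sum(counts[w] * z[w] for w in vocab)
        topic = {w: counts[w] * z[w] / total for w in vocab}
    return topic

# Toy document: the common word "the" should be explained by the background.
background = {"the": 0.5, "text": 0.25, "mining": 0.25}
doc = ["the", "the", "the", "text", "mining", "mining", "retrieval"]
for w, p in sorted(em_topic_vs_background(doc, background).items(), key=lambda x: -x[1]):
    print(w, round(p, 3))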
Not yet completed: 余艾锶
Completed:
程会林 (open question: why is the normalization in the formulas different?)
https://www.jianshu.com/p/bcef1ad7a530?utm_campaign=haruki&utm_content=note&utm_medium=reader_share&utm_source=qq
曾伟
http://www.cnblogs.com/Negan-ZW/p/8179076.html
梁清源
http://blog.csdn.net/qq_33414271/article/details/78938301
黄莉婷: the principles behind LDA
http://blog.csdn.net/weixin_40962955/article/details/78941383#t10
陈南浩
http://blog.csdn.net/DranGoo/article/details/78968749
Week4:
Guiding Questions
Develop your answers to the following guiding questions while watching the video lectures throughout the week.
- What is clustering? What are some applications of clustering in text mining and analysis? 3
- How can we use a mixture model to do document clustering? How many parameters are there in such a model? 4
- How is the mixture model for document clustering related to a topic model such as PLSA? In what way are they similar? Where are they different? 5
- How do we determine the cluster for each document after estimating all the parameters of a mixture model? 6
- How does hierarchical agglomerative clustering work? How do single-link, complete-link, and average-link work for computing group similarity? Which of these three ways of computing group similarity is least sensitive to outliers in the data? 1
- How do we evaluate clustering results? 2
- What is text categorization? What are some applications of text categorization? 3
- What does the training data for categorization look like?
- How does the Naïve Bayes classifier work? 4
- Why do we often use logarithm in the scoring function for Naïve Bayes? 5 (See the sketch after this list.)
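A small Python sketch of a multinomial Naïve Bayes classifier for the last two questions (my own illustration with made-up toy data, not course code). The score sums log probabilities instead of multiplying raw probabilities, because a product of many small numbers underflows floating point; since the logarithm is monotonic, the ranking of categories is unchanged:

import math
from collections import Counter

def train_nb(docs, labels, smoothing=1.0):
    # Multinomial Naïve Bayes with add-one (Laplace) smoothing.
    vocab = {w for d in docs for w in d}
    log_prior, log_cond = {}, {}
    for c in set(labels):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(w for d in class_docs for w in d)
        total = sum(counts.values()) + smoothing * len(vocab)
        log_cond[c] = {w: math.log((counts[w] + smoothing) / total) for w in vocab}
    return log_prior, log_cond, vocab

def score(doc, c, log_prior, log_cond, vocab):
    # Sum of logs instead of a product of probabilities: avoids underflow.
    return log_prior[c] + sum(log_cond[c][w] for w in doc if w in vocab)

docs = [["good", "match"], ["great", "game"], ["tax", "cut"], ["election", "vote"]]
labels = ["sports", "sports", "politics", "politics"]
lp, lc, V = train_nb(docs, labels)
test = ["great", "match", "today"]                 # "today" is out of vocabulary
best = max(set(labels), key=lambda c: score(test, c, lp, lc, V))
print(best)                                        # expected: "sports"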
Not yet completed: 余艾锶, 程会林, 黄莉婷, 梁清源, 曾伟, 陈南浩
Completed:
Week5:
Not yet completed: 余艾锶, 程会林, 黄莉婷, 梁清源, 曾伟, 陈南浩
Completed:
Week6:
Not yet completed: 余艾锶, 程会林, 黄莉婷, 梁清源, 曾伟, 陈南浩
Completed:
《Text Retrieval and Search Engines》(12.13)
https://www.coursera.org/learn/text-retrieval
Week1:
Not yet completed: 余艾锶, 程会林, 黄莉婷, 梁清源, 曾伟, 陈南浩
Completed:
Week2:
Not yet completed: 余艾锶, 程会林, 黄莉婷, 梁清源, 曾伟, 陈南浩
Completed:
Week3:
Not yet completed: 余艾锶, 程会林, 黄莉婷, 梁清源, 曾伟, 陈南浩
Completed:
Week4:
Not yet completed: 余艾锶, 程会林, 黄莉婷, 梁清源, 曾伟, 陈南浩
Completed:
Week5:
Not yet completed: 余艾锶, 程会林, 黄莉婷, 梁清源, 曾伟, 陈南浩
Completed:
Week6:
Not yet completed: 余艾锶, 程会林, 黄莉婷, 梁清源, 曾伟, 陈南浩
Completed: