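AI and machine learning are hugely popular right now, but I get the sense that many of us, in practice, settle for calling third-party libraries and tuning parameters (I'm mostly talking about myself here). We understand the basic principles of the algorithms, yet we haven't really grasped the mathematical thinking and philosophical ideas behind them (fancy, I know...), and that makes it hard to go a step further toward the level of a data scientist.
To build a stronger foundation, today I'm starting to read "Computer Age Statistical Inference: Algorithms, Evidence and Data Science" (CASI for short). I plan to reproduce the book's figures and implement its algorithms in Python, and to record my understanding here. I hope to explain the ideas in plain, accessible language; after all, you have only truly learned something once you can explain it clearly to someone else.
Below are the book's description and homepage.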
The twenty-first century has seen a breathtaking expansion of statistical methodology, both in scope and in influence. “Big data,” “data science,” and “machine learning” have become familiar terms in the news, as statistical methods are brought to bear upon the enormous data sets of modern science and commerce. How did we get here? And where are we going? This book takes us on a journey through the revolution in data analysis following the introduction of electronic computation in the 1950s. Beginning with classical inferential theories – Bayesian, frequentist, Fisherian – individual chapters take up a series of influential topics: survival analysis, logistic regression, empirical Bayes, the jackknife and bootstrap, random forests, neural networks, Markov chain Monte Carlo, inference after model selection, and dozens more. The book integrates methodology and algorithms with statistical inference, and ends with speculation on the future direction of statistics and data science.
https://web.stanford.edu/~hastie/CASI/