DS Interview Question--Missing Values

Q: During analysis, how do you treat missing values?

A: 

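First, we need to know the pattern of the missing data:

1. Missing completely at random (MCAR): the missingness is unrelated to any variable, observed or unobserved. This is the best situation and the easiest to handle.
2. Missing at random (MAR): the missingness depends only on observed variables, not on the missing values themselves.
3. Missing not at random (MNAR): the missingness depends on the unobserved values themselves, so the pattern affects the primary dependent variables.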
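A rough way to probe the pattern is to check how much is missing and whether the missingness lines up with another observed variable. The sketch below uses pandas on a made-up table; the column names `age` and `income` and all values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: "income" is missing for some rows.
df = pd.DataFrame({
    "age": [23, 35, 47, 52, 61, 29, 40, 58],
    "income": [30.0, 42.0, np.nan, 55.0, np.nan, 33.0, 48.0, np.nan],
})

# Share of missing values per column.
missing_share = df.isna().mean()

# Compare an observed variable between rows with and without income.
income_missing = df["income"].isna()
mean_age_missing = df.loc[income_missing, "age"].mean()
mean_age_observed = df.loc[~income_missing, "age"].mean()

print(missing_share)
print(mean_age_missing, mean_age_observed)
```

A large gap between the two means suggests the data are not MCAR. Observed data alone cannot separate MAR from MNAR, so that judgment usually needs domain knowledge.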

And then we can choose different methods to deal with missing values:

Deletion: If we have enough observations and the data are missing completely at random, we can delete the observations with missing values without introducing bias.
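As a minimal sketch of deletion, assuming the data sits in a pandas DataFrame (the columns and values here are invented), `dropna` covers both dropping every incomplete row and dropping rows only when a chosen column is missing:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with gaps in two columns.
df = pd.DataFrame({
    "age": [23, 35, np.nan, 52],
    "income": [30.0, np.nan, 41.0, 55.0],
    "city": ["A", "B", "A", "B"],
})

# Listwise deletion: drop every row that has at least one missing value.
complete_cases = df.dropna()

# Drop rows only when a specific column is missing.
has_income = df.dropna(subset=["income"])

print(len(df), len(complete_cases), len(has_income))
```

Restricting the check to the columns the analysis actually uses keeps more rows than dropping every incomplete row.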

Imputation: 1. Replace missing values with the mean, median, or mode, or with a default value; 2. Predict the missing values with a model (e.g. regression or KNN).
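Both kinds of imputation can be sketched with pandas and NumPy. The data, the column names, and the choice of a one-variable linear regression are assumptions for illustration; a KNN imputer would slot into the same place as the regression step.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: salary and team have gaps.
df = pd.DataFrame({
    "years_experience": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "salary": [30.0, 35.0, np.nan, 45.0, np.nan, 55.0],
    "team": ["a", "b", None, "b", "b", "a"],
})

# 1. Simple imputation: median for a numeric column, mode for a categorical one.
median_filled = df["salary"].fillna(df["salary"].median())
mode_filled = df["team"].fillna(df["team"].mode()[0])

# 2. Model-based imputation: fit a straight line on the complete rows,
#    then use its predictions only where salary is missing.
observed = df["salary"].notna()
slope, intercept = np.polyfit(
    df.loc[observed, "years_experience"].to_numpy(),
    df.loc[observed, "salary"].to_numpy(),
    deg=1,
)
predicted = slope * df["years_experience"] + intercept
regression_filled = df["salary"].fillna(predicted)

print(median_filled.tolist())
print(mode_filled.tolist())
print(regression_filled.round(2).tolist())
```

Filling with a single constant shrinks the variance of the column, while a model-based fill keeps the relationship with the predictor but understates the uncertainty of the filled values.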

Others: More complex methods such as multiple imputation (MI) and hot deck imputation.
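To give a feel for these two ideas, here is a simplified sketch. The helper `random_hot_deck` is a hypothetical function written for this note, not a library API: it fills each gap with a value drawn at random from the observed entries (the donors). Repeating the fill several times and averaging the resulting estimates imitates the "impute, analyze, pool" loop of multiple imputation, although a proper MI procedure also accounts for uncertainty in the imputation model.

```python
import numpy as np
import pandas as pd

def random_hot_deck(series, rng):
    """Fill missing entries with values drawn at random from the observed entries."""
    donors = series.dropna().to_numpy()
    filled = series.copy()
    missing = filled.isna()
    filled.loc[missing] = rng.choice(donors, size=int(missing.sum()), replace=True)
    return filled

rng = np.random.default_rng(0)
income = pd.Series([30.0, np.nan, 41.0, 55.0, np.nan, 38.0])  # made-up values

# Single hot deck imputation.
filled = random_hot_deck(income, rng)

# Simplified multiple-imputation loop: impute 5 times, estimate the mean
# on each completed dataset, then pool by averaging the estimates.
estimates = [random_hot_deck(income, rng).mean() for _ in range(5)]
pooled_mean = float(np.mean(estimates))

print(filled.tolist())
print(pooled_mean)
```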

Leave as-is: Some models, such as certain implementations of tree-based methods like random forest, can handle missing values on their own.


Interview questions are from DataAppLab (Wechat: Datalaus)

Jun.27th, 2017  Seattle
