【090】Don't put blind faith in big data | The era of blind faith in big data must end

Speaker: Cathy O'Neil

Key words: data, algorithms, discrimination

Abstract: Algorithms matter to everyone's life. Mathematician and data scientist Cathy O'Neil tells us: 1. Beware of letting algorithms become mathematical weapons that rule makers create to exploit others. 2. The data used to build an algorithm is often itself flawed, which makes the algorithm inaccurate and unfair; we should take measures against this.

@TED: Algorithms decide who gets a loan, who gets a job interview, who gets insurance and much more -- but they don't automatically make things fair, and they're often far from scientific. Mathematician and data scientist Cathy O'Neil coined a term for algorithms that are secret, important and harmful: "weapons of math destruction." Learn more about the hidden agendas behind these supposedly objective formulas and why we need to start building better ones.

Content:

Fact:

  • Algorithms are everywhere and are used to sort and separate the winners from the losers
  • Algorithms are opinions embedded in code
  • Algorithms are not always objective, true, or scientific

Question: What if the algorithms are wrong?

Two elements of an algorithm:

  • data: what happened in the past
  • a definition of success: the thing you're looking for and often hoping for
  • You train an algorithm by looking at past data and figuring out what is associated with success.
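The two elements above can be sketched as a toy trainer. This is a hypothetical illustration, not code from the talk: the records, the "stayed four years" success definition, and the threshold rule are all invented for the example.

```python
# Toy illustration of O'Neil's two ingredients: historical data plus a
# definition of success yields a rule that simply repeats the past.

# Historical hiring records: (years_experience, stayed_4_years)
past_hires = [
    (2, False), (5, True), (7, True), (1, False), (6, True), (3, False),
]

def train_threshold(data):
    """Learn a rule: approve anyone with at least as much experience as
    the least-experienced past 'success'."""
    successes = [years for years, stayed in data if stayed]
    return min(successes)

threshold = train_threshold(past_hires)

def predict(years_experience):
    # The trained "algorithm" just reproduces the pattern of past winners.
    return years_experience >= threshold
```

Whoever picks the data and the success definition has picked the rule; that is the sense in which algorithms are opinions embedded in code.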

How bias affects algorithms:

Examples:

  1. An algorithm used in Fox News's hiring process to screen candidates for likelihood of success would usually filter out women, because they do not look like the people who were successful there in the past.

  2. When police are sent only to minority neighborhoods to look for crime, the arrest data becomes heavily biased, and an algorithm trained on it to predict individual criminality will go wrong.
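The second example can be shown with a small simulation. The crime rates and patrol shares below are assumed numbers for illustration only: even when the true crime rate is identical in two neighborhoods, arrest data only records crime where police are looking.

```python
import random

random.seed(0)

# Assume (for illustration) an identical true crime rate in two
# neighborhoods.
TRUE_CRIME_RATE = {"A": 0.05, "B": 0.05}

# But patrols are sent only to neighborhood A.
PATROL_SHARE = {"A": 1.0, "B": 0.0}

def simulate_arrests(n_incidents=10_000):
    """Arrests are recorded only where police happen to be looking."""
    arrests = {"A": 0, "B": 0}
    for _ in range(n_incidents):
        hood = random.choice(["A", "B"])
        crime_occurs = random.random() < TRUE_CRIME_RATE[hood]
        police_present = random.random() < PATROL_SHARE[hood]
        if crime_occurs and police_present:
            arrests[hood] += 1
    return arrests

arrests = simulate_arrests()
# The arrest data now "shows" crime only in A, though the truth is 50/50.
# An algorithm trained on it would send even more patrols to A: a feedback loop.
```

This is the feedback loop the audit step below asks us to consider: biased data produces a biased model, which produces more biased data.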

The news organization ProPublica recently looked into one of those "recidivism risk" algorithms, as they're called, being used in Florida during sentencing by judges. Bernard, on the left, the black man, was scored a 10 out of 10. Dylan, on the right, 3 out of 10. 10 out of 10, high risk; 3 out of 10, low risk. They were both brought in for drug possession. They both had records, but Dylan had a felony and Bernard didn't. This matters, because the higher your score, the more likely you are to be given a longer sentence.

Solution: algorithmic audit

  • data integrity check
  • think about the definition of success
  • we have to consider accuracy
  • we have to consider the long-term effects of algorithms, the feedback loops
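The first audit step, the data integrity check, can start as simply as comparing outcome rates across groups in the training data. The records and field names below are hypothetical, a minimal sketch of the idea rather than a real audit tool.

```python
# Minimal sketch of one audit step: check whether the "success" label is
# distributed evenly across a protected attribute in the training data.
# Records and field names are hypothetical.
records = [
    {"group": "men",   "label_success": True},
    {"group": "men",   "label_success": True},
    {"group": "men",   "label_success": False},
    {"group": "women", "label_success": True},
    {"group": "women", "label_success": False},
    {"group": "women", "label_success": False},
]

def success_rate_by_group(rows):
    totals, wins = {}, {}
    for row in rows:
        g = row["group"]
        totals[g] = totals.get(g, 0) + 1
        wins[g] = wins.get(g, 0) + row["label_success"]
    return {g: wins[g] / totals[g] for g in totals}

rates = success_rate_by_group(records)
# A large gap between groups is a red flag that the definition of
# success encodes past bias rather than merit.
```

A real audit would go on to the other steps in the list: question the success definition itself, measure accuracy per group, and model the feedback loops.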

Suggestion:

  1. For data scientists: we should not be the arbiters of truth. We should be translators of the ethical discussions that happen in larger society.
  2. For non-data scientists: this is not a math test. This is a political fight. We need to demand accountability from our algorithmic overlords.

Hope: The era of blind faith in big data must end.


Link: TED

Come join the #1000 TED Study Plan# and share your best, most practical TED study notes in the "Journey of Exploring a Thousand TED Videos".
