Should internet firms pay for the data users currently give away?
互联网公司应该向用户支付数据使用费吗?
YOU have multiple jobs, whether you know it or not. Most begin first thing in the morning, when you pick up your phone and begin generating the data that make up Silicon Valley’s most important resource. That, at least, is how we ought to think about the role of data-creation in the economy, according to a fascinating new economics paper. We are all digital labourers, helping make possible the fortunes generated by firms like Google and Facebook, the authors argue. If the economy is to function properly in the future—and if a crisis of technological unemployment is to be avoided—we must take account of this, and change the relationship between big internet companies and their users.
无论你是否知道,其实你身兼数职。大部分开始于早晨你做的第一件事——拿起手机,开始产生数据,而这是硅谷最重要的资源。一篇新发表的有趣的经济学论文表示,至少我们应该这样来衡量创造数据对经济所起的作用。文章作者认为,我们都是数字化的劳工,帮助谷歌和脸书这样的大企业创造财富。如果未来经济要正常运转,并且避免技术性失业危机的发生,我们必须要考虑这一点,改变大型互联网企业和用户之间的关系。
Artificial intelligence (AI) is getting better all the time, and stands poised to transform a host of industries, say the authors (Imanol Arrieta Ibarra and Diego Jiménez Hernández, of Stanford University, Leonard Goff, of Columbia University, and Jaron Lanier and Glen Weyl, of Microsoft). But, in order to learn to drive a car or recognise a face, the algorithms that make clever machines tick must usually be trained on massive amounts of data. Internet firms gather these data from users every time they click on a Google search result, say, or issue a command to Alexa. They also hoover up valuable data from users through the use of tools like reCAPTCHA, which ask visitors to solve problems that are easy for humans but hard for AIs, such as deciphering text from books that machines are unable to parse. That does not just screen out malicious bots, but also helps digitise books. People “pay” for useful free services by providing firms with the data they crave.
文章作者们(斯坦福大学的伊马诺尔·阿列塔·伊巴拉和迭戈希·门尼斯·埃尔南德斯、哥伦比亚大学的伦纳德·勒戈夫以及微软公司的杰伦•拉尼尔和格伦•外尔)一致认为,人工智能(AI)技术日趋完善,势必将为多个行业带来变革。但是,为了学习驾驶汽车或是进行面部识别,计算机必须通过算法来进行训练,而训练中就要使用大量的数据。每当用户点击谷歌搜索结果或是在Alexa上下达指令的时候,互联网公司都会收集这些数据。互联网公司同样使用工具,如使用验证码,从用户获取有价值的数据,要求访问者解决对于人类来说轻而易举但对于AI来说却非常困难的问题,例如解析书上的文字,这对于机器来说是不可能的。这不仅能够筛选出恶意的网络机器人,还能帮助将图书转为数字形式。人们向互联网公司提供了他们渴望的数据,以此来换取的免费网络服务。
crave /kreɪv/ CET6+ TEM8 (craving, craved, craves) V-T If you crave something, you want to have it very much. 渴望得到
These data become part of the firms’ capital, and, as such, a fearsome source of competitive advantage. Would-be startups that might challenge internet giants cannot train their AIs without access to the data only those giants possess. Their best hope is often to be acquired by those very same titans, adding to the problem of uncompetitive markets.
这些数据成为企业资本的一部分,因此也是竞争优势的可怕来源。那些有可能挑战互联网巨头的初创公司,由于无法获得这些被垄断的数据,因而无法训练他们的人工智能。他们最大的希望就是被同业互联网巨头收购,加重了市场的垄断程度。
That, for now, AI’s contributions to productivity growth are small, the authors say, is partly because of the free-data model, which limits the quality of data gathered. Firms trying to develop useful applications for AI must hope that the data they have are sufficient, or come up with ways to coax users into providing them with better information at no cost. For example, they must **pester** random people, like those blur-deciphering visitors to websites, into labelling data, and hope that in their annoyance and haste they do not make mistakes.
文章作者认为,目前人工智能对于生产率增长的贡献还很小,部分原因在于免费数据的模式限制了数据的质量。如果互联网公司想要开发更有用的人工智能程序,他们一定希望能取得足够的数据,或者想方设法哄骗用户,免费获得更好的用户信息。例如,他们会随意骚扰随机用户——比如识别模糊验证码的网站访问者,进行数据标注,希望他们在不胜其扰的情况下也不会犯错误。
pester /ˈpɛstə/ CET6+ TEM8 (pestering, pestered, pesters) V-T If you say that someone is pestering you, you mean that they keep asking you to do something, or keep talking to you, and you find this annoying. 纠缠、烦扰
Even so, as AI improves, the amount of work made vulnerable to displacement by technology grows, and ever more of the value generated in the economy accrues to profitable firms rather than workers. As the authors point out, the share of GDP paid out to workers in wages and salaries—once thought to be relatively stable—has already been declining over the past few decades.
即便如此,随着人工智能技术的进步,越来越多的工作岗位将更容易被取代,而在经济发展中产生的价值则会越来越多地落到公司的口袋,而非惠及普通员工。正如作者所指出,在过去的几十年里,工人的工资和薪金占GDP的份额——一度被认为是相对稳定的,在逐年下降。
To tackle these problems, they have a radical proposal. Rather than being regarded as capital, data should be treated as labour—and, more specifically, regarded as the property of those who generate such information, unless they agree to provide it to firms in exchange for payment. In such a world, user data might be sold multiple times, to multiple firms, reducing the extent to which data sets serve as barriers to entry. Payments to users for their data would help spread the wealth generated by AI. Firms could also potentially generate better data by paying. Rather than guess what a person is up to as they wander around a shopping centre, for example, firms could ask individuals to share information on which shops were visited and which items were viewed, in exchange for payment. Perhaps most ambitiously, the authors muse that data labour could come to be seen as useful work, conferring the same sort of dignity as paid employment: a desirable side-effect in a possible future of mass automation.
为了解决这些问题,他们提出了一个非常激进的建议。 数据不应被视为一种资本,而应当作为一种劳动,更确切地说,应该作为创造这些信息的人的财产,除非他们同意向企业有偿提供这些信息。在这种情况下,用户的数据可能会被多次出售给不同的公司,同时也能够降低作为入门门槛的数据收集难度。对用户的数据付费有助于扩散人工智能创造的财富。企业也将有可能通过付费获得质量更好的数据。 例如,当一个人在商场闲逛时,商家可以不用去猜测这个人在干什么,而可以通过付费的方式要求个人分享诸如他逛了哪些商店,看了哪些商品之类的信息。而作者大胆地设想,在不久的将来,数据劳动也能够被当成是一种有价值的工作,并且能够获得与有薪工作同样的尊严,而这正是未来可能的大规模自动化带来的一种好的副作用。
The authors’ ideas need fleshing out; their paper, thought-provoking though it is, runs to only five pages. Parts of the envisioned scheme seem impractical. Would people really be interested in taking the time to describe their morning routine or office habits without a substantial monetary **inducement** (and would their data be valuable enough for firms to pay a substantial amount)? Might not such systems attract data mercenaries, spamming firms with useless junk data simply to make a quick buck?
从目前来看,作者的想法需要充实,他们的文章尽管发人深省,但是全文仅仅只有5页。部分设想看上去不太实际。在没有足够大金钱利益驱动下,人们真的会有兴趣去花时间记录自己的日常习惯或者是办公习惯吗(或是用户数据的价值是否足以使企业去支付大量的费用)?这样的体系下是不是会吸引大批唯利是图,企图通过垃圾邮件或者无用数据来赚快钱的公司?
inducement /ɪnˈdjuːsmənt/ TEM8 GMAT N: There is little inducement for them to work harder. 诱导,劝诱;诱因,动机
Nothing to use but your brains
什么都不用,只用大脑
Still, the paper contains essential insights which should frame discussion of data’s role in the economy. One concerns the imbalance of power in the market for data. That stems partly from concentration among big internet firms. But it is also because, though data may be extremely valuable in aggregate, an individual’s personal data typically are not. For one Facebook user to threaten to deprive Facebook of his data is no threat at all. So effective negotiation with internet firms might require collective action: and the formation, perhaps, of a “data-labour union”.
尽管如此,论文提出了数据在经济中的作用的基本见解。其中一点是目前数据市场权力的不平衡,这源于大部分的资源都集中在大型互联网公司手里。但是这也是因为数据只有在成规模的时候才有价值,单个的数据信息并没有那么有价值。比如说一个脸书的用户威胁他将从脸书上撤走自己的数据,而这根本无法构成威胁。因此,同互联网公司进行有效谈判可能也得采取抱团的方式,也许通过‘数据劳工工会’这种形式比较合适。
deprive /dɪˈpraɪv/ TEM8/CET/IELTS/TOEFL V-T: They were imprisoned and deprived of their basic rights. 剥夺,夺去,使丧失
This might have drawbacks. A union might demand too much in compensation for data, for example, impairing the development of useful AIs. It might make all user data freely available and extract compensation by demanding a share of firms’ profits; that would rule out the pay-for-data labour model the authors see as vital to improving data quality. Still, a data union holds potential as a way of solidifying worker power at a time when conventional unions struggle to remain relevant.
但是工会的模式可能也有问题。例如,它可能会对于数据要价太高从而影响那些非常实用的人工智能技术的发展。它可能让所有用户的数据可以自由获取,但用户通过从企业收益中获取分成从而获得补偿。如果这样,无疑会毁掉数据付费使用这一劳动模式,而这种模式正是作者希望用来提升数据质量至关重要的一环。尽管如此,当传统工会难以保持利益相关性之时,数据工会依然可能成为保护劳动者权利的一种方式。
Most important, the authors’ proposal puts front and centre the collective nature of value in an AI world. Each person becomes something like an oil well, pumping out the fuel that makes the digital economy run. Both fairness and efficiency demand that the distribution of income generated by that fuel should be shared more evenly, according to our contributions. The tricky part is working out how.
最重要的是,作者的建议将人工智能世界中价值的集体性质置于超前和中心的位置。 每个人都变成了一个“油井”,喷发出让数字经济运行的“石油”。公平和效率都要求根据贡献来更加均等地分配由我们生产的“石油”所产生的回报。理想很美好,但难的是如何将一切付诸行动。
References: [取经号](https://qujinghao.com/2018/01/27/5418/)