Research Questions
This page summarizes four projects led by Lingfei Wu under the topic of "The Science of Attention" since summer 2015 (we expect to have some preliminary results for publication by the end of 2015). The common goal of these studies is to answer the following questions:
What are the laws of attention dynamics?
Can we exploit the power of human attention to develop an "attention engine" and make machines as creative as human beings?
What are the limitations of the intelligence of man-machine systems?
These are big questions. We do not expect to address all of them in such a short period of time (half a year). But it is possible to find good angles that break these big questions into smaller ones that individual research projects can handle. These projects are designed to combine theoretical insights with practical consequences. To demonstrate the interdisciplinary nature of attention science, we have also selected projects across different knowledge domains.
Previous Findings
My previous studies, including collaborations with Jiang Zhang, Rob Ackland, and Marco Janssen, uncovered the following dynamics of attention:
- The limitation of attention flow on the Web [1];
- The accelerating growth of attention as a power-law function of the number of active users in online communities [2][3][4][6];
- The fast decay of attention over time (stretched exponential function) [8];
- The reproduction cycle of attention (attention loop and preferential return) [5][10];
- The decentralized tendency of attention allocation (reversed preferential attachment) [9].
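For reference, the two functional forms named in the list can be written generically as follows. The symbols (A for total attention, N for the number of active users, t for elapsed time, and the parameters c, \gamma, A_0, \tau, \beta) are notation assumed here for illustration only; the fitted values are reported in the cited papers, not on this page.

```latex
A(N) = c\,N^{\gamma}, \quad \gamma > 1
\qquad \text{(accelerating, super-linear power-law growth)}

A(t) = A_0 \exp\!\left[-\left(t/\tau\right)^{\beta}\right], \quad 0 < \beta < 1
\qquad \text{(stretched exponential decay)}
```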
Ongoing Projects
1. Location Based Advertising for Moving Citizens
Cheng-jun Wang and Lingfei Wu
The increasing power of search engines has made advertising more and more precise: computational advertising systems like Google AdSense collect and filter the attention of the most relevant users and sell it to companies. However, how to reach target customers precisely in the physical world remains an unsolved question in the advertising industry. In the current study we analyze the anonymized smartphone data of 10^5 Beijing residents over 30 days and correlate their physical movement with the flow of attention in cyberspace. In particular, we use a network renormalization technique to divide the city into local areas and use information-entropy-based measurements to identify the most relevant websites visited in these local areas. To demonstrate the applied value of our research, we create an interactive map that takes website names as input queries and lights up the most relevant areas as output. This tool is particularly useful for dot-com companies that are interested in buying outdoor advertising space.
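To make the entropy-based step above concrete, here is a minimal sketch in Python. It is not the measurement used in the study, which this page does not specify; it assumes a hypothetical input of (area, site, count) visit records, computes each website's share of visits per area, and uses Shannon entropy to tell evenly spread sites from locally concentrated ones. The names site_area_profile, entropy, and top_areas, and the example domains, are made up for illustration.

```python
import math
from collections import defaultdict

def site_area_profile(visits):
    """Turn (area, site, count) records into {site: {area: share of visits}}."""
    totals = defaultdict(float)
    counts = defaultdict(lambda: defaultdict(float))
    for area, site, count in visits:
        counts[site][area] += count
        totals[site] += count
    return {site: {a: c / totals[site] for a, c in areas.items()}
            for site, areas in counts.items()}

def entropy(shares):
    """Shannon entropy (in bits) of a {area: share} distribution."""
    return -sum(p * math.log2(p) for p in shares.values() if p > 0)

def top_areas(profile, site, k=3):
    """Areas where the queried site draws the largest share of its visits."""
    shares = profile[site]
    return sorted(shares, key=shares.get, reverse=True)[:k]

# Toy data (hypothetical): one evenly spread site, one concentrated site.
visits = [
    ("A", "news.example", 50), ("B", "news.example", 50),
    ("A", "shop.example", 90), ("B", "shop.example", 10),
]
profile = site_area_profile(visits)
```

In this toy example the evenly spread site has higher entropy than the concentrated one, and top_areas(profile, "shop.example", 1) picks the area that a map like the one described above would light up.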
2. Quantifying the Attention Cost of Collaborative Knowledge Production
Lingfei Wu and Zi-gang Huang
Online communities are becoming increasingly important as platforms for knowledge production. In these communities users seek and share professional skills, spreading knowledge along the hierarchy of expertise levels. To investigate how users collaborate with each other in knowledge production, we analyze StackExchange, one of the largest question-and-answer systems in the world. Our dataset includes the asking and answering activities of 2.7 million users over 5 years across 110 communities. We construct expertise networks that include all pairs of help-seeking interactions and measure the expertise level Li of the ith user based on their position in these networks. We calculate |Lj-Li| for all pairs of linked users i and j and suggest that this variable characterizes the attention cost users are willing to pay to help others, because it is reasonable to assume that communication between users separated by more expertise levels usually takes more effort. We find that the distribution of |Lj-Li| is symmetrical and unimodal, with a small mean and a small standard deviation. This means that users at all expertise levels tend to help those whose levels are slightly below their own. In other words, people do not go very far out of their "comfort zones" to help others. This observation explains the formation of hierarchy in large-scale knowledge production and provides an important guiding rule for building expert recommendation systems.
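The gap statistic described above can be sketched as follows. This is only an illustration under stated assumptions: the expertise levels are taken as given (how Li is derived from network position is not specified on this page), edges are hypothetical (asker, answerer) pairs, and level_gaps is a made-up helper name.

```python
from statistics import mean, pstdev

def level_gaps(edges, level):
    """Return |Lj - Li| for every linked pair (i, j) in the expertise network."""
    return [abs(level[j] - level[i]) for i, j in edges]

# Toy data (hypothetical): expertise level per user and help-seeking links.
level = {"u1": 1.0, "u2": 2.0, "u3": 2.5, "u4": 4.0}
edges = [("u1", "u2"), ("u2", "u3"), ("u3", "u4"), ("u1", "u3")]

gaps = level_gaps(edges, level)
gap_mean = mean(gaps)   # average attention cost per interaction
gap_std = pstdev(gaps)  # spread of the cost across interactions
```

A small gap_mean together with a small gap_std would correspond to the "comfort zone" pattern described above.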
3. Predicting the Birth and Death of Sciences
Yanbo Zhang and Lingfei Wu
Like the fashion industry, the sciences also experience the surge and decay of areas. From the perspective of attention dynamics, this is the natural consequence of the flow of scientists' collective attention across different knowledge domains. Unlike previous attempts that map the landscape of the sciences directly from the topology of knowledge networks (in which nodes are papers or journals and edges are citations or reader clickstreams), we use singular value decomposition (SVD) to project the analyzed citation networks into a low-dimensional Euclidean space. We find that the diffusion of attention in this space explains 1) the super-linear, power-law growth of the number of citations against the number of papers; and 2) the exponential decay of the citing probability over time. We also show that this low-dimensional Euclidean projection of the knowledge map allows us to predict the growth trend of different knowledge domains from their current status. To construct the citation networks we use 5 x 10^5 physics papers published in the past century and 1.3 x 10^5 computer science papers published in the past forty years.
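As a rough illustration of the projection step above, the sketch below embeds the nodes of a small citation matrix in a low-dimensional Euclidean space with a truncated SVD, using NumPy. It assumes a dense 0/1 adjacency matrix and a made-up helper name svd_embed; the preprocessing and dimension used in the actual study are not stated on this page, and a network of 10^5 papers would need a sparse solver instead.

```python
import numpy as np

def svd_embed(adjacency, dim=2):
    """Project each node (row) onto the top `dim` singular directions."""
    a = np.asarray(adjacency, dtype=float)
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    # Scaling the left singular vectors by the singular values gives
    # coordinates equal to the rows of A expressed in the right singular basis.
    return u[:, :dim] * s[:dim]

# Toy citation matrix (hypothetical): entry [i][j] = 1 if paper i cites paper j.
A = [
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
coords = svd_embed(A, dim=2)
```

Each row of coords is one paper's position. Keeping all dimensions preserves the Euclidean distances between rows of A exactly, while a small dim keeps only the dominant citation structure.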
4. Pattern Repeating and Breaking in Song Structure
Hua Xiao and Lingfei Wu
Emotional arousal and pleasure during music listening are widely observed, but the mechanism behind them has remained unexplained for many years. We propose that the emotional effect of a song is related to the information it carries. In particular, a song must balance pattern repeating and pattern breaking to bring pleasure: both no information and information overload are undesirable extremes, although the optimal position between them may vary across audiences. In the current study we use networks to represent the structure of songs. In these networks nodes are repeated patterns and edges are the time intervals between them. We analyze the hierarchy of these networks and propose a novel index N to quantify the novelty (symmetry-breaking) of songs. This metric is a fingerprint of an audience's personal taste across music genres and is thus an important feature to consider in personalized music recommendation.
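One possible reading of the network representation above is sketched below. It is an assumption-laden toy, not the construction used in the study: a song is taken to be a sequence of symbols, a "pattern" is a fixed-length n-gram that occurs more than once, and an edge links successive pattern occurrences with the time gap as its weight. The helper names are hypothetical, and the novelty index N is not computed because its definition is not given on this page.

```python
from collections import defaultdict

def repeated_patterns(seq, n):
    """Return {pattern: [start positions]} for n-grams that occur more than once."""
    positions = defaultdict(list)
    for i in range(len(seq) - n + 1):
        positions[tuple(seq[i:i + n])].append(i)
    return {p: starts for p, starts in positions.items() if len(starts) > 1}

def pattern_network(seq, n):
    """Link successive occurrences of repeated patterns; weight = time interval."""
    patterns = repeated_patterns(seq, n)
    events = sorted((s, p) for p, starts in patterns.items() for s in starts)
    return [(p1, p2, s2 - s1) for (s1, p1), (s2, p2) in zip(events, events[1:])]

# Toy "song" (hypothetical): each letter stands for one note or chord.
song = list("ABABCAB")
patterns = repeated_patterns(song, 2)
edges = pattern_network(song, 2)
```

Here the only repeated two-symbol pattern is ("A", "B"), so the network has a single node with self-loops whose weights are the intervals between its occurrences.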
Open Science is Possible
These projects are interesting in themselves, but their impact goes beyond the scientific problems they address. I use these projects to show that open science is not only possible but a much better way to do science. I got to know the collaborators on the projects listed above through the Swarm Agents Club (SAC; see the brief introduction below). These collaborative studies are zero-cost in terms of data, software, and human resources, and have been open to the entire scientific community since the ideas were first conceived.
Swarm Agents Club
Since 2008, we have been gathering a group of young Chinese scientists from different areas, including physics, computer science, math, and biology, to practice the idea of "open science", in particular scientific collaboration outside universities. SAC's efforts to promote interdisciplinary communication and collaboration over the past seven years have earned it a strong reputation in China.
Currently SAC is run by two groups of core members working together towards the same goal of promoting open science: 1) 11 SAC cores who are responsible for daily routine activities such as hosting study groups and seminars, and 2) 4 global scientific committee members who are responsible for formulating and recommending academic and planning goals and initiatives for the club.
In addition to regularly hosting seminars in a cafe in Beijing and maintaining an email list of 500+ professional members, we have produced one book on artificial intelligence, one app that uses deep learning to predict weather at minute-level and meter-level resolution and serves 10+ million users, and many high-impact peer-reviewed papers (e.g., in Scientific Reports and CVPR).
References
[1] Wu L. and R. Ackland (2014), How Web1.0 fails: The mismatch between hyperlinks and clickstreams, Social Network Analysis and Mining, 4: 202-206.
[2] Wu L. (2011), The accelerating growth of online tagging systems, European Physical Journal B, 83(2): 283-287.
[3] Wu L. and J. Zhang (2011), Accelerating growth and size-dependent distribution of human online activities, Physical Review E, 84 (2): 026113-026117.
[4] Wu L., J. Zhang, and M. Zhao (2014), The metabolism and growth of web forums, PLoS ONE, 9(8): e102646.
[5] Zhang J. and L. Wu (2013), Allometry and dissipation of ecological networks, PLoS ONE, 8(9): e72525.
[6] Zhang J., X. Li, X. Wang, W. Wang, and L. Wu (2015), Scaling behaviors in the growth of networked systems and their geometric origins, Scientific Reports, 5: 9767.
[7] Wu L. and J. Zhang (2013), The decentralized structure of collective attention on the Web, European Physical Journal B, 86(6): 266-277.
[8] Wang C., L. Wu, J. Zhang, and M. Janssen (2015), The Hidden Geometry of Attention Diffusion, Under Review.
[9] Wu L., J. Baggio, and M. Janssen (2015), The Dynamics of Collaborative Knowledge Production, Under Review.
[10] Wu L. and M. Janssen (2015), Attention as A Limited Resource in Digital Commons, Working Paper.