CE634 Assignment-2
Derive Community Structures from Taxi Flow Network

1. Introduction

In Assignment 1, you performed an exploratory data analysis to derive meaningful statistics from the taxi GPS dataset. Some of the questions in Assignment 1 allowed us to generate insights into the spatial or temporal distribution of travel demand in NYC. However, none of them required us to couple trip origins and destinations. In other words, the interactions among different locations in the study area remain unexplored.

In this assignment, you will perform a network analysis, namely community detection, to uncover the hidden structures in the flow network derived from the taxi GPS dataset. According to Wikipedia:

“In the study of complex networks, a network is said to have community structure if the nodes of the network can be easily grouped into (potentially overlapping) sets of nodes such that each set of nodes is densely connected internally. In the particular case of non-overlapping community finding, this implies that the network divides naturally into groups of nodes with dense connections internally and sparser connections between groups. But overlapping communities are also allowed. The more general definition is based on the principle that pairs of nodes are more likely to be connected if they are both members of the same community(ies), and less likely to be connected if they do not share communities. A related but different problem is community search, where the goal is to find a community that a certain vertex belongs to.”

Thus, community detection algorithms can be applied to our taxi GPS dataset to derive community structures such that the taxi flows (e.g., origin-destination trips) within communities are denser while inter-community flows are sparser. The results could generate insights into the spatial interactions among different locations in the city.

2. About Community Detection

2.1. Derive the taxi flow network from the GPS dataset

The taxi GPS dataset makes it possible to derive origin-destination (OD) trips, which contain rich information about location interactions in the Manhattan area. Usually, a community detection algorithm is performed over a network, with the nodes representing particular entities and the links (and their weights) denoting the interactions among these nodes.

In this assignment, you will first derive a flow network in which the nodes represent locations (or places) in the study area and the weight of the link between two nodes is the total number of taxi trips between the corresponding locations.

Instead of representing the nodes with road intersections, we use taxi zones to represent the places (i.e., nodes) in the flow network. Using taxi zones significantly reduces the number of nodes and edges in the network, which keeps the computation time of the community detection reasonable.

To accomplish this task, you are provided with another file:

− intersection_to_zone

This file maps each road intersection onto a particular taxi zone in Manhattan. Two columns, namely inter_id and zone_id, denote the id of the road intersection and the taxi zone, respectively.

What you need to do is analyze the original taxi GPS dataset, derive the OD trips at the level of road intersections, and transform the network onto taxi zones. To be clear, the final network used for community detection consists of 63 nodes that denote the taxi zones in Manhattan, with edge weights equal to the number of taxi trips between the zones.

2.2. Perform the community detection algorithm

There are many community detection algorithms (https://en.wikipedia.org/wiki/Community_structure), each with its own pros and cons. In this assignment, you are asked to apply the algorithm proposed by Blondel et al. (2008).
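The algorithm of Blondel et al. (2008) searches for a partition of the network with high modularity. As a minimal sketch of that quantity only (it is not the multilevel algorithm itself and not igraph's implementation), the snippet below computes the weighted modularity of a given partition. The function name `modularity`, the edge-list layout, and the toy zone labels and weights are assumptions made for illustration, and flows are treated as undirected.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Weighted modularity Q of a partition of an undirected graph.

    edges: iterable of (u, v, weight); each undirected edge listed once,
           no self-loops (an assumption of this sketch).
    membership: dict mapping node -> community label.

    Q = sum over communities c of [ W_c / m - (S_c / (2 m))^2 ]
    where m is the total edge weight, W_c the weight of edges inside c,
    and S_c the summed strength (weighted degree) of the nodes in c.
    """
    strength = defaultdict(float)   # weighted degree of each node
    internal = defaultdict(float)   # W_c per community
    total = 0.0                     # m
    for u, v, w in edges:
        strength[u] += w
        strength[v] += w
        total += w
        if membership[u] == membership[v]:
            internal[membership[u]] += w

    community_strength = defaultdict(float)  # S_c per community
    for node, k in strength.items():
        community_strength[membership[node]] += k

    return sum(
        internal[c] / total - (community_strength[c] / (2 * total)) ** 2
        for c in community_strength
    )


# Hypothetical toy network: two groups of zones with heavy internal flows
# (10 trips per pair) joined by one light link (1 trip).
toy_edges = [
    ("A", "B", 10.0), ("B", "C", 10.0), ("A", "C", 10.0),
    ("D", "E", 10.0), ("E", "F", 10.0), ("D", "F", 10.0),
    ("C", "D", 1.0),
]
two_groups = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
one_group = {zone: 0 for zone in two_groups}

q_two = modularity(toy_edges, two_groups)
q_one = modularity(toy_edges, one_group)
print(q_two, q_one)
```

In this toy case, splitting the zones into the two dense groups scores higher than placing all zones in one community, which is the kind of contrast a modularity-based method exploits.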
The implementation is described in detail in the article by Blondel et al. (2008).

However, to facilitate the analysis, you are encouraged to use existing libraries and APIs. In particular, igraph (http://igraph.org/), a collection of network analysis tools, allows you to run this algorithm from several programming languages such as Python, R, and C. The Python API for this algorithm can be found at http://igraph.org/python/doc/igraph.Graph-class.html#community_multilevel.

A few things for your attention:

• First, figure out how to install the package (http://igraph.org/python/);
• Then follow the tutorial and learn how to establish a graph, i.e., a network (http://igraph.org/python/doc/tutorial/tutorial.html);
• Finally, apply the so-called multilevel community detection algorithm proposed by Blondel et al.

3. Tasks

(1) Derive the flow network at the level of taxi zones.

(2) Understand the multilevel community detection algorithm and run it over the flow network. The output of your analysis should be a collection of clusters, or communities, with each community consisting of a list of taxi zones with frequent interactions.

(3) Form the flow network for each of the following time periods, and derive the corresponding communities:

• Taxi trips that occurred during 07:00 – 09:00 throughout the whole year
• Taxi trips that occurred during 16:00 – 18:00 throughout the whole year
• Taxi trips of each of the twelve months

4. What to Submit

− A Word document or PDF file with 1-2 paragraphs on your understanding of the multilevel community detection algorithm and its key concepts (e.g., modularity).
− The communities obtained in Task (3), i.e., results for 14 different scenarios.
− The submission due date is November 25th, 2019.

References

Blondel, V. D., Guillaume, J. L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.
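As a rough sketch of Tasks (1) and (3), the snippet below aggregates intersection-level OD trips into zone-level edge weights, optionally restricted to a pickup-hour window. It is only one possible approach under stated assumptions: the trip record layout `(pickup_time, origin_inter_id, dest_inter_id)`, the function name `zone_flows`, and the sample ids and timestamps are hypothetical, since the handout does not specify the dataset schema. The sketch also assumes that flows are undirected, that trips are assigned to a window by pickup time, and that intra-zone trips and unmapped intersections are skipped.

```python
from collections import Counter
from datetime import datetime


def zone_flows(trips, inter_to_zone, start_hour=None, end_hour=None):
    """Aggregate intersection-level OD trips into undirected zone-level weights.

    trips: iterable of (pickup_time, origin_inter_id, dest_inter_id)
           (hypothetical layout, not taken from the dataset).
    inter_to_zone: dict inter_id -> zone_id, e.g. read from intersection_to_zone.
    start_hour, end_hour: optional pickup-hour window, start inclusive,
           end exclusive (7 and 9 select 07:00 to 08:59).
    Returns a Counter mapping (zone_a, zone_b) with zone_a < zone_b to trip counts.
    """
    flows = Counter()
    for pickup_time, o_inter, d_inter in trips:
        if start_hour is not None and not (start_hour <= pickup_time.hour < end_hour):
            continue
        o_zone = inter_to_zone.get(o_inter)
        d_zone = inter_to_zone.get(d_inter)
        # Assumption: drop trips with unmapped intersections and intra-zone trips.
        if o_zone is None or d_zone is None or o_zone == d_zone:
            continue
        flows[tuple(sorted((o_zone, d_zone)))] += 1
    return flows


# Made-up sample data for illustration only.
inter_to_zone = {1: 10, 2: 10, 3: 20, 4: 30}
trips = [
    (datetime(2019, 1, 7, 7, 30), 1, 3),    # zone 10 -> 20, morning
    (datetime(2019, 1, 7, 8, 59), 3, 2),    # zone 20 -> 10, morning
    (datetime(2019, 1, 7, 9, 0), 1, 4),     # zone 10 -> 30, outside morning window
    (datetime(2019, 1, 7, 16, 15), 4, 3),   # zone 30 -> 20, evening
    (datetime(2019, 2, 3, 17, 45), 1, 2),   # intra-zone, skipped
    (datetime(2019, 2, 3, 12, 0), 9, 3),    # unmapped intersection, skipped
]

morning = zone_flows(trips, inter_to_zone, 7, 9)
evening = zone_flows(trips, inter_to_zone, 16, 18)
january = zone_flows([t for t in trips if t[0].month == 1], inter_to_zone)
print(morning, evening, january)
```

Each resulting `(zone, zone)` key with its count forms one weighted edge; these edges can then be loaded into igraph and passed to the community_multilevel method referenced in Section 2.2. Repeating the call for the two hour windows and for each of the twelve months gives the 14 scenarios.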