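Six Must-Read Papers in Information Retrieval

Another round of classic must-read paper recommendations is here. This time we recommend the six most important papers in information retrieval, based on statistics from 学术范. We hope you find them useful.

(Readers who have difficulty reading English can use the translation feature after opening each link.)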

1. Rayyan-a web and mobile app for systematic reviews.

Authors: Mourad Ouzzani / Hossam M. Hammady / Zbys Fedorowicz / Ahmed K. Elmagarmid

Abstract: Synthesis of multiple randomized controlled trials (RCTs) in a systematic review can summarize the effects of individual outcomes and provide numerical answers about the effectiveness of interventions. Filtering of searches is time consuming, and no single method fulfills the principal requirements of speed with accuracy. Automation of systematic reviews is driven by a necessity to expedite the availability of current best evidence for policy and clinical decision-making. We developed Rayyan (http://rayyan.qcri.org), a free web and mobile app, that helps expedite the initial screening of abstracts and titles using a process of semi-automation while incorporating a high level of usability. For the beta testing phase, we used two published Cochrane reviews in which included studies had been selected manually. Their searches, with 1030 records and 273 records, were uploaded to Rayyan. Different features of Rayyan were tested using these two reviews. We also conducted a survey of Rayyan’s users and collected feedback through a built-in feature. Pilot testing of Rayyan focused on usability, accuracy against manual methods, and the added value of the prediction feature. The “taster” review (273 records) allowed a quick overview of Rayyan for early comments on usability. The second review (1030 records) required several iterations to identify the previously identified 11 trials. The “suggestions” and “hints,” based on the “prediction model,” appeared as testing progressed beyond five included studies. Post rollout user experiences and a reflexive response by the developers enabled real-time modifications and improvements. The survey respondents reported 40% average time savings when using Rayyan compared to other tools, with 34% of the respondents reporting more than 50% time savings. In addition, around 75% of the respondents named screening and labeling studies, as well as collaborating on reviews, as the two most important features of Rayyan.
As of November 2016, Rayyan users exceed 2000 from over 60 countries conducting hundreds of reviews totaling more than 1.6M citations. Feedback from users, obtained mostly through the app web site and a recent survey, has highlighted the ease in exploration of searches, the time saved, and simplicity in sharing and comparing include-exclude decisions. The strongest features of the app, identified and reported in user feedback, were its ability to help in screening and collaboration as well as the time savings it affords to users. Rayyan is responsive and intuitive in use with significant potential to lighten the load of reviewers.

Full text: Rayyan-a web and mobile app for systematic reviews.

2. featureCounts: an efficient general-purpose program for assigning sequence reads to genomic features

Authors: Yang Liao / Gordon K. Smyth / Wei Shi

Abstract: MOTIVATION: Next-generation sequencing technologies generate millions of short sequence reads, which are usually aligned to a reference genome. In many applications, the key information required for downstream analysis is the number of reads mapping to each genomic feature, for example to each exon or each gene. The process of counting reads is called read summarization. Read summarization is required for a great variety of genomic analyses but has so far received relatively little attention in the literature. RESULTS: We present featureCounts, a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments. featureCounts implements highly efficient chromosome hashing and feature blocking techniques. It is considerably faster than existing methods (by an order of magnitude for gene-level summarization) and requires far less computer memory. It works with either single or paired-end reads and provides a wide range of options appropriate for different sequencing applications. AVAILABILITY AND IMPLEMENTATION: featureCounts is available under GNU General Public License as part of the Subread (http://subread.sourceforge.net) or Rsubread (http://www.bioconductor.org) software packages.

Full text: featureCounts: an efficient general-purpose program for assigning sequence reads to genomic features
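
To make the idea of read summarization in the abstract above more concrete, here is a minimal Python sketch that counts how many aligned reads overlap each gene. It is only an illustration, not featureCounts' actual implementation: the function names (`build_index`, `summarize_reads`), the tuple layouts, the inclusive coordinates, and the rule of skipping reads that overlap more than one gene are all assumptions made for this example. Grouping features by chromosome in a dictionary is loosely inspired by the "chromosome hashing" the abstract mentions; the "feature blocking" technique is not reproduced here.

```python
from collections import defaultdict


def build_index(features):
    """Group features by chromosome so each read only scans its own chromosome.

    `features` is a list of (chrom, start, end, gene_id) tuples with
    inclusive coordinates (an assumption of this sketch).
    """
    index = defaultdict(list)
    for chrom, start, end, gene_id in features:
        index[chrom].append((start, end, gene_id))
    return index


def summarize_reads(reads, features):
    """Count reads per gene.

    `reads` is a list of (chrom, start, end) tuples for aligned reads.
    A read is assigned to a gene if their intervals overlap. Reads that
    overlap zero genes or more than one gene are left unassigned
    (a simplifying assumption of this sketch).
    """
    index = build_index(features)
    counts = {gene_id: 0 for _, _, _, gene_id in features}
    for chrom, r_start, r_end in reads:
        # Collect every gene on this chromosome whose interval overlaps the read.
        hits = {
            gene_id
            for start, end, gene_id in index.get(chrom, [])
            if start <= r_end and r_start <= end
        }
        if len(hits) == 1:
            counts[hits.pop()] += 1
    return counts
```

Calling `summarize_reads(reads, features)` returns a dictionary that maps each gene identifier to its read count. The linear scan over a chromosome's features is kept for readability; a real tool would need a faster lookup structure for genome-scale data.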

3. NIH Image to ImageJ: 25 years of image analysis

Authors: Caroline A. Schneider / Wayne Rasband / Kevin W. Eliceiri

Abstract: For the past 25 years NIH Image and ImageJ software have been pioneers as open tools for the analysis of scientific images. We discuss the origins, challenges and solutions of these two programs, and how their history can serve to advise and inform other software projects.

Full text: NIH Image to ImageJ: 25 years of image analysis

4. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild

Authors: Khurram Soomro / Amir Roshan Zamir / Mubarak Shah

Abstract: We introduce UCF101, which is currently the largest dataset of human actions. It consists of 101 action classes, over 13k clips and 27 hours of video data. The database consists of realistic user-uploaded videos containing camera motion and cluttered background. Additionally, we provide baseline action recognition results on this new dataset using a standard bag-of-words approach with overall performance of 44.5%. To the best of our knowledge, UCF101 is currently the most challenging dataset of actions due to its large number of classes, large number of clips and also the unconstrained nature of such clips.

Full text: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild

5. The variant call format and VCFtools

Authors: Petr Danecek / Adam Auton / Gonçalo R. Abecasis / Cornelis A. Albers / Eric Banks / Mark A. DePristo / Robert E. Handsaker / Gerton Lunter / Gabor T. Marth / Stephen T. Sherry / Gilean McVean / Richard Durbin

Abstract: Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, yielding substantial gains in various benchmarks. Although increased model size and computational cost tend to translate to immediate quality gains for most tasks (as long as enough labeled data is provided for training), computational efficiency and low parameter count are still enabling factors for various use cases such as mobile vision and big-data scenarios. Here we are exploring ways to scale up networks in ways that aim at utilizing the added computation as efficiently as possible by suitably factorized convolutions and aggressive regularization. We benchmark our methods on the ILSVRC 2012 classification challenge validation set and demonstrate substantial gains over the state of the art: 21.2% top-1 and 5.6% top-5 error for single frame evaluation using a network with a computational cost of 5 billion multiply-adds per inference and using less than 25 million parameters. With an ensemble of 4 models and multi-crop evaluation, we report 3.5% top-5 error and 17.3% top-1 error on the validation set and 3.6% top-5 error on the official test set.

Full text: The variant call format and VCFtools

6. 2011 Compendium of Physical Activities: a second update of codes and MET values.

Authors: Barbara E. Ainsworth / William L. Haskell / Stephen D. Herrmann / Nathanael Meckes / David R. Bassett / Catrine Tudor-Locke / Jennifer L. Greer / Jesse W. Vezina / Melicia C. Whitt-Glover / Arthur S. Leon

Abstract: Purpose: The Compendium of Physical Activities was developed to enhance the comparability of results across studies using self-report physical activity (PA) and is used to quantify the energy cost of a wide variety of PA. We provide the second update of the Compendium, called the 2011 Compend

Full text: 2011 Compendium of Physical Activities: a second update of codes and MET values.

Hope this helps!
