2018-06-03 Daily Collection

  • 【Keras reference implementations of popular deep learning models】'Keras Applications - Reference implementations of popular deep learning models.' GitHub: [link]

  • 【Webcam (human) tracking with TensorFlow.js】《Webcam Tracking with Tensorflow.js - YouTube》by Siraj Raval [link] GitHub: [link] mirror: [link]

  • 【A visual guide to Bayesian thinking】《A visual guide to Bayesian thinking - YouTube》by Julia Galef [link] Miaopai video: 爱可可-爱生活

  • 【Building a question-answering system from scratch】《Building a Question-Answering System from Scratch》by Alvira Swalin Part 1: [link] pdf: [link]

  • 《Hyperbolic Neural Networks》O Ganea, G Bécigneul, T Hofmann [ETH Zürich] (2018) [link] view: [link] GitHub: [link]

  • 《EcoRNN: Fused LSTM RNN Implementation with Data Layout Optimization》B Zheng, A Nair, Q Wu, N Vijaykumar, G Pekhimenko [University of Toronto & CMU] (2018) [link] view: [link]

  • 《Virtuously Safe Reinforcement Learning》H Aslund, E M E Mhamdi, R Guerraoui, A Maurer [EPFL] (2018) [link] view: [link]

  • 《Embedding Syntax and Semantics of Prepositions via Tensor Decomposition》H Gong, S Bhat, P Viswanath [University of Illinois at Urbana-Champaign] (2018) [link] view: [link]

  • 《Estimating Carotid Pulse and Breathing Rate from Near-infrared Video of the Neck》W Chen, J Hernandez, R W. Picard [MIT] (2018) [link] view: [link]

  • 《Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data》W Hsu, J Glass [MIT] (2018) [link] view: [link]

  • 《Phrase Table as Recommendation Memory for Neural Machine Translation》Y Zhao, Y Wang, J Zhang, C Zong [CAS & University of Chinese Academy of Sciences] (2018) [link] view: [link]

  • 《Polyglot Semantic Role Labeling》P Mulcaire, S Swayamdipta, N Smith [University of Washington & CMU] (2018) [link] view: [link]

  • 《Rotation Equivariance and Invariance in Convolutional Neural Networks》B Chidester, M N. Do, J Ma [CMU & University of Illinois at Urbana-Champaign] (2018) [link] view: [link] GitHub: [link]

  • 《Supervised Policy Update》Q H Vuong, Y Zhang, K W. Ross [New York University] (2018) [link] view: [link]

  • 《How Important Is a Neuron?》K Dhamdhere, M Sundararajan, Q Yan [Google AI] (2018) [link] view: [link]

  • 《Depth and nonlinearity induce implicit exploration for RL》J Dauparas, R Tomioka, K Hofmann [University of Cambridge & Microsoft Research] (2018) [link] view: [link]

  • 《Pathology Segmentation using Distributional Differences to Images of Healthy Origin》S Andermatt, A Horváth, S Pezold, P Cattin [University of Basel] (2018) [link] view: [link]

  • 《A Unified Particle-Optimization Framework for Scalable Bayesian Sampling》C Chen, R Zhang, W Wang, B Li, L Chen [University at Buffalo & Duke University] (2018) [link] view: [link]

  • 《Multi-Resolution 3D Convolutional Neural Networks for Object Recognition》S Ghadai, X Lee, A Balu, S Sarkar, A Krishnamurthy [Iowa State University] (2018) [link] view: [link]

  • 《Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization》S Liu, B Kailkhura, P Chen, P Ting, S Chang, L Amini [IBM Research & Lawrence Livermore National Laboratory & University of Michigan] (2018) [link] view: [link]

  • 《Large Data and Zero Noise Limits of Graph-Based Semi-Supervised Learning Algorithms》M M. Dunlop, D Slepčev, A M. Stuart, M Thorpe [Caltech & CMU & University of Cambridge] (2018) [link] view: [link]

  • 《Pushing the bounds of dropout》G Melis, C Blundell, T Kočiský, K M Hermann, C Dyer, P Blunsom [DeepMind] (2018) [link] view: [link]

  • 《A Generalized Active Learning Approach for Unsupervised Anomaly Detection》T Pimentel, M Monteiro, J Viana, A Veloso, N Ziviani [Kunumi & UFMG] (2018) [link] view: [link]

  • 《Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces》A Coucke, A Saade, A Ball, T Bluche... [Snips] (2018) [link] view: [link]

  • 《Deployment of Customized Deep Learning based Video Analytics On Surveillance Cameras》P Dubal, R Mahadev, S Kothawade, K Dargan, R Iyer [AitoeLabs] (2018) [link] view: [link]

  • 《Deep Functional Dictionaries: Learning Consistent Semantic Structures on 3D Models from Functions》M Sung, H Su, R Yu, L Guibas [Stanford University & University of California San Diego] (2018) [link] view: [link]

  • 《A Unified Probabilistic Model for Learning Latent Factors and Their Connectivities from High-Dimensional Data》R P Monti, A Hyvärinen [University College London] (2018) [link] view: [link]

  • 《Meta-Gradient Reinforcement Learning》Z Xu, H v Hasselt, D Silver [DeepMind] (2018) [link] view: [link]

  • 《Unsupervised Alignment of Embeddings with Wasserstein Procrustes》E Grave, A Joulin, Q Berthet [Facebook AI Research & University of Cambridge] (2018) [link] view: [link]

  • 《Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting》Y Chen, M Bansal [UNC Chapel Hill] (2018) [link] view: [link] GitHub: [link]

  • arXiv Papers | Data Analytics & R

  • [1805.11643v1] High Dimensional Robust Sparse Regression
    We provide a novel -- and to the best of our knowledge, the first -- algorithm for high dimensional sparse regression with corruptions in explanatory and/or response variables. Our algorithm recovers the true sparse parameters in the presence of a constant fraction of arbitrary corruptions. Our main contribution is a robust variant of Iterative Hard Thresholding. Using this, we provide accurate estimators with sub-linear sample complexity. Our algorithm consists of a novel randomized outlier removal technique for robust sparse mean estimation that may be of interest in its own right: it is orderwise more efficient computationally than existing algorithms, and succeeds with high probability, thus making it suitable for general use in iterative algorithms. We demonstrate its effectiveness on large-scale sparse regression problems with arbitrary corruptions.
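
A minimal sketch of the robust iterative-hard-thresholding idea, assuming a naive largest-residual trimming rule in place of the paper's randomized outlier-removal procedure:

```python
import numpy as np

def trimmed_iht(X, y, k, n_iter=100, trim_frac=0.1):
    """Sparse regression by hard thresholding with naive residual trimming.
    A sketch only: the paper's outlier removal is randomized and comes with
    guarantees; dropping the largest residuals is just the simplest analogue."""
    n, d = X.shape
    lr = n / np.linalg.norm(X, ord=2) ** 2       # safe step for (1/2n)||Xw-y||^2
    w = np.zeros(d)
    for _ in range(n_iter):
        r = X @ w - y                            # per-sample residuals
        keep = np.argsort(np.abs(r))[: int(n * (1 - trim_frac))]
        grad = X[keep].T @ r[keep] / n           # gradient over retained samples
        w -= lr * grad
        w[np.argsort(np.abs(w))[:-k]] = 0.0      # hard threshold: keep top-k coords
    return w
```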

  • [1805.11653v1] LSTMs Exploit Linguistic Attributes of Data
    While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data. We investigate how the properties of natural language data affect an LSTM's ability to learn a nonlinguistic task: recalling elements from its input. We find that models trained on natural language data are able to recall tokens from much longer sequences than models trained on non-language sequential data. Furthermore, we show that the LSTM learns to solve the memorization task by explicitly using a subset of its neurons to count timesteps in the input. We hypothesize that the patterns and structure in natural language data enable LSTMs to learn by providing approximate ways of reducing loss, but understanding the effect of different training data on the learnability of LSTMs remains an open question.
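
A hypothetical sketch of the kind of recall probe described; the exact task format here is an assumption:

```python
import random

def recall_example(vocab, seq_len):
    """Generate one memorization example (assumed format): the model reads a
    token sequence followed by a query marker and a position, and must output
    the token that appeared at that position."""
    seq = [random.choice(vocab) for _ in range(seq_len)]
    query = random.randrange(seq_len)
    inputs = seq + ["<q>", f"<{query}>"]
    return inputs, seq[query]

# The paper's comparison varies the training data (natural language vs.
# non-language sequences) while keeping the recall task itself fixed.
```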

  • [1805.11686v1] Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition
    The design of a reward function often poses a major practical challenge to real-world applications of reinforcement learning. Approaches such as inverse reinforcement learning attempt to overcome this challenge, but require expert demonstrations, which can be difficult or expensive to obtain in practice. We propose variational inverse control with events (VICE), which generalizes inverse reinforcement learning methods to cases where full demonstrations are not needed, such as when only samples of desired goal states are available. Our method is grounded in an alternative perspective on control and reinforcement learning, where an agent's goal is to maximize the probability that one or more events will happen at some point in the future, rather than maximizing cumulative rewards. We demonstrate the effectiveness of our methods on continuous control tasks, with a focus on high-dimensional observations like images where rewards are hard or even impossible to specify.
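
A minimal sketch of the intuition behind event-based rewards, assuming a simple classifier stands in for the variational machinery; `make_event_reward` and its inputs are illustrative names, not the paper's interface:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def make_event_reward(goal_states, replay_states):
    """Fit a classifier on example goal states vs. ordinary visited states and
    use log p(event | s) as the reward signal. VICE derives such a reward
    inside a variational framework; this is only the rough idea."""
    X = np.vstack([goal_states, replay_states])
    y = np.concatenate([np.ones(len(goal_states)), np.zeros(len(replay_states))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    return lambda s: clf.predict_log_proba(np.atleast_2d(s))[0, 1]
```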

  • [1805.11706v1] Supervised Policy Update
    We propose a new sample-efficient methodology, called Supervised Policy Update (SPU), for deep reinforcement learning. Starting with data generated by the current policy, SPU optimizes over the proximal policy space to find a non-parameterized policy. It then solves a supervised regression problem to convert the non-parameterized policy to a parameterized policy, from which it draws new samples. There is significant flexibility in setting the labels in the supervised regression problem, with different settings corresponding to different underlying optimization problems. We develop a methodology for finding an optimal policy in the non-parameterized policy space, and show how Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) can be addressed by this methodology. In terms of sample efficiency, our experiments show SPU can outperform PPO for simulated robotic locomotion tasks.
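
A sketch of the two-step recipe for a discrete action space. The exponentiated-advantage target is one natural KL-regularized choice and is an assumption here, since the paper studies several label settings:

```python
import numpy as np

def spu_target(pi_old, advantages, eta=0.1):
    """Step 1: non-parameterized proximal policy, pi* ∝ pi_old * exp(A / eta).
    pi_old, advantages: (n_states, n_actions) arrays; eta controls how far the
    target may move from the current policy."""
    target = pi_old * np.exp(advantages / eta)
    return target / target.sum(axis=1, keepdims=True)

# Step 2 is plain supervised learning: minimize KL(target || pi_theta) on the
# sampled states to project the target back into the parameterized policy class.
```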

  • [1805.11724v1] Rethinking Knowledge Graph Propagation for Zero-Shot Learning
    The potential of graph convolutional neural networks for the task of zero-shot learning has been demonstrated recently. These models are highly sample efficient as related concepts in the graph structure share statistical strength allowing generalization to new classes when faced with a lack of data. However, knowledge from distant nodes can get diluted when propagating through intermediate nodes, because current approaches to zero-shot learning use graph propagation schemes that perform Laplacian smoothing at each layer. We show that extensive smoothing does not help the task of regressing classifier weights in zero-shot learning. In order to still incorporate information from distant nodes and utilize the graph structure, we propose an Attentive Dense Graph Propagation Module (ADGPM). ADGPM allows us to exploit the hierarchical graph structure of the knowledge graph through additional connections. These connections are added based on a node's relationship to its ancestors and descendants and an attention scheme is further used to weigh their contribution depending on the distance to the node. Finally, we illustrate that finetuning of the feature representation after training the ADGPM leads to considerable improvements. Our method achieves competitive results, outperforming previous zero-shot learning approaches.
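
A rough sketch of distance-aware dense propagation under stated assumptions; the `relatives` and `dist_logits` structures are illustrative, not the paper's interface:

```python
import numpy as np

def dense_propagate(h, relatives, dist_logits):
    """Each node aggregates features directly from all of its ancestors and
    descendants, with a softmax attention weight per graph distance, instead
    of smoothing repeatedly through intermediate nodes.
    h: (N, D) node features; relatives[i]: list of (node_id, distance) pairs;
    dist_logits: per-distance scores (learnable in the real model)."""
    out = h.copy()                                # isolated nodes keep their features
    for i, rel in relatives.items():
        if not rel:
            continue
        scores = np.array([dist_logits[d] for _, d in rel])
        att = np.exp(scores) / np.exp(scores).sum()   # attention over relatives
        out[i] = sum(a * h[j] for a, (j, _) in zip(att, rel))
    return out
```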

  • [1805.11730v1] Learn to Combine Modalities in Multimodal Deep Learning
    Combining complementary information from multiple modalities is intuitively appealing for improving the performance of learning-based approaches. However, it is challenging to fully leverage different modalities due to practical challenges such as varying levels of noise and conflicts between modalities. Existing methods do not adopt a joint approach to capturing synergies between the modalities while simultaneously filtering noise and resolving conflicts on a per sample basis. In this work we propose a novel deep neural network based technique that multiplicatively combines information from different source modalities. Thus the model training process automatically focuses on information from more reliable modalities while reducing emphasis on the less reliable modalities. Furthermore, we propose an extension that multiplicatively combines not only the single-source modalities, but a set of mixed source modalities to better capture cross-modal signal correlations. We demonstrate the effectiveness of our proposed technique by presenting empirical results on three multimodal classification tasks from different domains. The results show consistent accuracy improvements on all three tasks.
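
A minimal sketch of multiplicative fusion, assuming the combined score is a renormalized product of per-modality class probabilities:

```python
import torch

def multiplicative_fusion(logits_list):
    """Combine modalities by multiplying their class distributions: a modality
    that is uncertain on a given sample contributes a near-uniform factor, so
    the product automatically leans on the more reliable modalities.
    logits_list: list of (batch, classes) tensors, one per modality."""
    probs = [torch.softmax(z, dim=-1) for z in logits_list]
    fused = torch.stack(probs, dim=0).prod(dim=0)
    return fused / fused.sum(dim=-1, keepdim=True)   # renormalize to a distribution
```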

  • [1805.11761v1] Collaborative Learning for Deep Neural Networks
    We introduce collaborative learning in which multiple classifier heads of the same network are simultaneously trained on the same training data to improve generalization and robustness to label noise with no extra inference cost. It acquires the strengths from auxiliary training, multi-task learning and knowledge distillation. There are two important mechanisms involved in collaborative learning. First, the consensus of multiple views from different classifier heads on the same example provides supplementary information as well as regularization to each classifier, thereby improving generalization. Second, intermediate-level representation (ILR) sharing with backpropagation rescaling aggregates the gradient flows from all heads, which not only reduces training computational complexity, but also facilitates supervision to the shared layers. The empirical results on CIFAR and ImageNet datasets demonstrate that deep neural networks learned as a group in a collaborative way significantly reduce the generalization error and increase the robustness to label noise.
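
A minimal sketch of the consensus mechanism, assuming a distillation term toward the detached mean of all heads; the backpropagation-rescaling detail on shared layers is omitted:

```python
import torch
import torch.nn.functional as F

def collaborative_loss(head_logits, labels, alpha=0.5, T=2.0):
    """Each head pays the usual cross-entropy plus a KL term pulling it toward
    the soft consensus of all heads (temperature T, detached so the consensus
    acts as a teacher rather than collapsing the heads)."""
    consensus = torch.stack(head_logits).mean(dim=0).detach()
    total = 0.0
    for z in head_logits:
        ce = F.cross_entropy(z, labels)
        kl = F.kl_div(F.log_softmax(z / T, dim=-1),
                      F.softmax(consensus / T, dim=-1),
                      reduction='batchmean')
        total = total + (1 - alpha) * ce + alpha * (T * T) * kl
    return total / len(head_logits)
```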

  • [1805.11797v1] Grow and Prune Compact, Fast, and Accurate LSTMs
    Long short-term memory (LSTM) has been widely used for sequential data modeling. Researchers have increased LSTM depth by stacking LSTM cells to improve performance. This incurs model redundancy, increases run-time delay, and makes the LSTMs more prone to overfitting. To address these problems, we propose a hidden-layer LSTM (H-LSTM) that adds hidden layers to the LSTM's original one-level non-linear control gates. H-LSTM increases accuracy while employing fewer external stacked layers, thus reducing the number of parameters and run-time latency significantly. We employ grow-and-prune (GP) training to iteratively adjust the hidden layers through gradient-based growth and magnitude-based pruning of connections. This learns both the weights and the compact architecture of H-LSTM control gates. We have GP-trained H-LSTMs for image captioning and speech recognition applications. For the NeuralTalk architecture on the MSCOCO dataset, our three models reduce the number of parameters by 38.7x (floating-point operations (FLOPs) by 45.5x), run-time latency by 4.5x, and improve the CIDEr score by 2.6. For the DeepSpeech2 architecture on the AN4 dataset, our two models reduce the number of parameters by 19.4x (FLOPs by 23.5x), run-time latency by 15.7%, and the word error rate from 12.9% to 8.7%. Thus, GP-trained H-LSTMs can be seen to be compact, fast, and accurate.
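
A sketch of the pruning half of grow-and-prune training, i.e. standard magnitude pruning with a persistent mask; the gradient-based growth phase and the H-LSTM gate architecture are not shown:

```python
import torch

def magnitude_prune(weight, sparsity=0.9):
    """Zero the smallest-magnitude connections and return a mask so pruned
    weights can be held at zero during subsequent fine-tuning epochs."""
    k = max(1, int(weight.numel() * sparsity))
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = (weight.abs() > threshold).to(weight.dtype)
    return weight * mask, mask

# Usage sketch: after each pruning round, fine-tune with `weight.grad *= mask`
# (or re-apply the mask after optimizer steps) so pruned connections stay dead.
```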

  • [1805.11917v1] The Dynamics of Learning: A Random Matrix Approach
    Understanding the learning dynamics of neural networks is one of the key issues for the improvement of optimization algorithms as well as for the theoretical comprehension of why deep neural nets work so well today. In this paper, we introduce a random matrix-based framework to analyze the learning dynamics of a single-layer linear network on a binary classification problem, for data of simultaneously large dimension and size, trained by gradient descent. Our results provide rich insights into common questions in neural nets, such as overfitting, early stopping and the initialization of training, thereby opening the door for future studies of more elaborate structures and models appearing in today's neural networks.
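
For context, the classical gradient-descent dynamics that the random-matrix analysis refines decouple in the eigenbasis of the sample covariance (textbook background, not the paper's result):

```latex
L(w) = \tfrac{1}{2n}\lVert Xw - y\rVert^2, \qquad
w_{t+1} = w_t - \eta\,\nabla L(w_t)
```

With eigenpairs \((\lambda_i, v_i)\) of \(\Sigma = X^\top X / n\) and minimizer \(w^\star\):

```latex
v_i^\top (w_t - w^\star) = (1 - \eta\lambda_i)^t\, v_i^\top (w_0 - w^\star)
```

so directions with small \(\lambda_i\) are fit slowly, which is the usual lens on early stopping and overfitting that the paper sharpens for data of simultaneously large dimension and size.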

  • [1805.11648v1] Teaching Meaningful Explanations
    The adoption of machine learning in high-stakes applications such as healthcare and law has lagged in part because predictions are not accompanied by explanations comprehensible to the domain user, who often holds ultimate responsibility for decisions and outcomes. In this paper, we propose an approach to generate such explanations in which training data is augmented to include, in addition to features and labels, explanations elicited from domain users. A joint model is then learned to produce both labels and explanations from the input features. This simple idea ensures that explanations are tailored to the complexity expectations and domain knowledge of the consumer. Evaluation spans multiple modeling techniques on a simple game dataset, an image dataset, and a chemical odor dataset, showing that our approach is generalizable across domains and algorithms. Results demonstrate that meaningful explanations can be reliably taught to machine learning algorithms, and in some cases, improve modeling accuracy.
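
A hypothetical minimal rendering of the joint label-plus-explanation model; the architecture details below are assumptions, not the paper's specification:

```python
import torch
import torch.nn as nn

class LabelAndExplanationNet(nn.Module):
    """One shared encoder with two heads: a label head, and an explanation head
    trained on explanations elicited from domain users, so both predictions are
    produced jointly from the same features."""
    def __init__(self, d_in, n_labels, n_explanations, d_hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.label_head = nn.Linear(d_hidden, n_labels)
        self.expl_head = nn.Linear(d_hidden, n_explanations)

    def forward(self, x):
        h = self.encoder(x)
        return self.label_head(h), self.expl_head(h)

# Training (not shown) minimizes cross-entropy on labels and explanations jointly.
```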
