NoteSet | SSL

Explainer notes

Recent advances in self-supervised learning (Naiyan Wang)

https://zhuanlan.zhihu.com/p/30265894


Yann LeCun on self-supervised learning: can machines learn like humans? (110-page slides + video)

https://cloud.tencent.com/developer/article/1356966


What are some newer ideas in self-learning? (Xiaolong Wang)

https://www.zhihu.com/question/267563087/answer/327486390


ICML workshop

https://sites.google.com/view/self-supervised-icml2019


LeCun IJCAI18 slides / Zisserman ICML19 slides

my MacBook :)


Paper notes

Multi-task Self-Supervised Visual Learning (ICCV 2017)

https://blog.csdn.net/hibercraft/article/details/80150148

Learning image representations with multiple self-supervised tasks jointly

Introduction

Vision is one of the most promising domains for unsupervised learning. Unlabeled images and video are available in practically unlimited quantities, and the most prominent present image models—neural networks—are data starved, easily memorizing even random labels for large image collections. Yet unsupervised algorithms are still not very effective for training neural networks: they fail to adequately capture the visual semantics needed to solve real-world tasks like object detection or geometry estimation the way strongly-supervised methods do. For most vision problems, the current state-of-the-art approach begins by training a neural network on ImageNet or a similarly large dataset which has been hand-annotated.

How might we better train neural networks without manual labeling? Neural networks are generally trained via backpropagation on some objective function. Without labels, however, what objective function can measure how good the network is? Self-supervised learning answers this question by proposing various tasks for networks to solve, where performance is easy to measure, i.e., performance can be captured with an objective function like those seen in supervised learning. Ideally, these tasks will be difficult to solve without understanding some form of image semantics, yet any labels necessary to formulate the objective function can be obtained automatically. In the last few years, a considerable number of such tasks have been proposed [1, 2, 6, 7, 8, 17, 20, 21, 23, 25, 26, 27, 28, 29, 31, 39, 40, 42, 43, 46, 47], such as asking a neural network to colorize grayscale images, fill in image holes, solve jigsaw puzzles made from image patches, or predict movement in videos. Neural networks pre-trained with these tasks can be re-trained to perform well on standard vision tasks (e.g. image classification, object detection, geometry estimation) with less manually-labeled data than networks which are initialized randomly. However, they still perform worse in this setting than networks pre-trained on ImageNet.
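
As a concrete illustration, here is a minimal sketch of one such pretext task, colorization, in PyTorch. The architecture and sizes are toy stand-ins, not any paper's actual setup; the point is that the "label" is simply the original color image, so no annotation is needed.

```python
import torch
import torch.nn as nn

# Toy colorization pretext task: the network sees a grayscale version of
# each image and is trained to reproduce the color original, so the
# supervisory signal comes for free from the data itself.
net = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),              # predict the 3 RGB channels
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

images = torch.rand(8, 3, 32, 32)                # stand-in for unlabeled images
gray = images.mean(dim=1, keepdim=True)          # automatically derived input
pred = net(gray)
loss = nn.functional.mse_loss(pred, images)      # automatically derived label
opt.zero_grad(); loss.backward(); opt.step()
```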

Related Work

Self-supervised methods

Two categories: methods that use auxiliary information, and methods that learn from raw pixels.

video & image


TextTopicNet: Self-Supervised Learning of Visual Features Through Embedding Images on Semantic Text Spaces

https://blog.csdn.net/qq_26074263/article/details/81277630

Using text to supervise image features


Split-Brain Autoencoders

https://richzhang.github.io/splitbrainauto/

Supervision via prediction between channel groups: L↔ab, RGB↔depth
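
A minimal sketch of the split-brain idea (toy layers and a plain regression loss; the paper trains AlexNet-style branches with a classification loss over quantized color values): one branch predicts the ab channels from L, the other predicts L from ab.

```python
import torch
import torch.nn as nn

# Split-brain sketch: split the input channels, and train each half of the
# network to predict the channels the other half sees.
class SplitBrain(nn.Module):
    def __init__(self):
        super().__init__()
        self.l_to_ab = nn.Sequential(            # L -> ab branch
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 2, 3, padding=1),
        )
        self.ab_to_l = nn.Sequential(            # ab -> L branch
            nn.Conv2d(2, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 3, padding=1),
        )

    def forward(self, lab):                      # lab: (B, 3, H, W), Lab space
        l, ab = lab[:, :1], lab[:, 1:]
        return self.l_to_ab(l), self.ab_to_l(ab)

model = SplitBrain()
lab = torch.randn(4, 3, 64, 64)                  # stand-in for Lab images
pred_ab, pred_l = model(lab)
loss = (nn.functional.mse_loss(pred_ab, lab[:, 1:]) +
        nn.functional.mse_loss(pred_l, lab[:, :1]))
```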


Unsupervised Visual Representation Learning by Context Prediction

relative position

Abstract

This work explores the use of spatial context as a source of free and plentiful supervisory signal for training a rich visual representation. (self-supervision)

Introduction

1. Few labeled samples

Recently, new computer vision methods have leveraged large datasets of millions of labeled examples to learn rich, high-performance visual representations.

Yet efforts to scale these methods to truly Internet-scale datasets (i.e. hundreds of billions of images) are hampered by the sheer expense of the human annotation required.

A natural way to address this difficulty would be to employ unsupervised learning, which aims to use data without any annotation.

2. Motivation: context in the text domain

This converts an apparently unsupervised problem (finding a good similarity metric between words) into a “self-supervised” one: learning a function from a given word to the words surrounding it. 

Here the context prediction task is just a “pretext” to force the model to learn a good word embedding, which, in turn, has been shown to be useful in a number of real tasks, such as semantic word similarity.
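
A toy skip-gram-style sketch of this idea (toy corpus and sizes; real word2vec adds negative sampling and trains on far larger data): the "label" for each word is a neighboring word, read directly off the raw text.

```python
import torch
import torch.nn as nn

# Each training pair is (word, neighboring word), extracted for free from
# raw text; the embedding table is the representation being learned.
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = {w: i for i, w in enumerate(set(corpus))}
pairs = [(vocab[corpus[i]], vocab[corpus[j]])
         for i in range(len(corpus))
         for j in (i - 1, i + 1) if 0 <= j < len(corpus)]

emb = nn.Embedding(len(vocab), 16)               # word vectors being learned
out = nn.Linear(16, len(vocab))                  # predicts the context word
opt = torch.optim.SGD(list(emb.parameters()) + list(out.parameters()), lr=0.1)

for _ in range(100):
    w, c = zip(*pairs)
    logits = out(emb(torch.tensor(w)))
    loss = nn.functional.cross_entropy(logits, torch.tensor(c))
    opt.zero_grad(); loss.backward(); opt.step()
```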

3. Our paper

Our underlying hypothesis is that doing well on this task requires understanding scenes and objects, i.e. a good visual representation for this task will need to extract objects and their parts in order to reason about their relative spatial location. (the role of the pretext task)

“Objects,” after all, consist of multiple parts that can be detected independently of one another, and which occur in a specific spatial configuration (if there is no specific configuration of the parts, then it is “stuff” [1]).

We demonstrate that the resulting visual representation is good for both object detection, providing a significant boost on PASCAL VOC 2007 compared to learning from scratch, as well as for unsupervised object discovery / visual data mining. This means, surprisingly, that our representation generalizes across images, despite being trained using an objective function that operates on a single image at a time. That is, instance-level supervision appears to improve performance on category-level tasks.

Related Work

1. Generative models

Problem: Generative models have shown promising performance on smaller datasets such as handwritten digits [25, 24, 48, 30, 46], but none have proven effective for high-resolution natural images. (as of 2016)

2. Unsupervised learning

Problem: We believe that current reconstruction-based algorithms struggle with low-level phenomena, like stochastic textures, making it hard to even measure whether a model is generating well.

Text domain: context prediction

Various pretext tasks: However, such a task would be trivial, since discriminating low-level color statistics and lighting would be enough. To make the task harder and more high-level, in this paper, we instead classify between multiple possible configurations of patches sampled from the same image, which means they will share lighting and color statistics, as shown on Figure 2.

Another line of work in unsupervised learning from images aims to ...

Video

Our work

Avoiding trivial solutions

When designing a pretext task, care must be taken to ensure that the task forces the network to extract the desired information (high-level semantics, in our case), without taking “trivial” shortcuts. In our case, low-level cues like boundary patterns or textures continuing between patches could potentially serve as such a shortcut. Hence, for the relative prediction task, it was important to include a gap between patches.

However, even these precautions are not enough: we were surprised to find that, for some images, another trivial solution exists. We traced the problem to an unexpected culprit: chromatic aberration.
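
A minimal sketch of how such patch pairs might be sampled, with both shortcut defenses described above: a gap plus random jitter between patches, and crude color dropping against chromatic-aberration cues. Sizes are illustrative, not the paper's exact settings.

```python
import random
import numpy as np

# The image is assumed to be at least 3 * (patch + gap) + 2 * jitter pixels
# on each side. The label is which of the 8 neighbor positions was sampled.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
           (0, 1), (1, -1), (1, 0), (1, 1)]

def sample_patch_pair(img, patch=96, gap=48, jitter=7):
    h, w, _ = img.shape
    cell = patch + gap                           # spacing between grid cells
    y0 = random.randint(jitter, h - 3 * cell - jitter)
    x0 = random.randint(jitter, w - 3 * cell - jitter)
    label = random.randrange(8)
    dy, dx = OFFSETS[label]

    def crop(gy, gx):                            # grid cell -> jittered patch
        y = y0 + gy * cell + random.randint(-jitter, jitter)
        x = x0 + gx * cell + random.randint(-jitter, jitter)
        return img[y:y + patch, x:x + patch]

    return crop(1, 1), crop(1 + dy, 1 + dx), label

def drop_color(p):
    out = np.zeros_like(p)                       # keep a single random channel
    keep = random.randrange(3)                   # so the net cannot align
    out[..., keep] = p[..., keep]                # patches via aberration
    return out
```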



Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles

Abstract

By following the principles of self-supervision, we build a convolutional neural network (CNN) that can be trained to solve Jigsaw puzzles as a pretext task, which requires no manual labeling, and then later repurposed to solve object classification and detection.

We show that the CFN (context-free network) includes fewer parameters than AlexNet while preserving the same semantic learning capabilities.

Introduction

1. Few labeled samples in vision

However, as manually labeled data can be costly, unsupervised learning methods are gaining momentum.

2. Self-supervised learning

... have explored a novel paradigm for unsupervised learning called self-supervised learning. The main idea is to exploit different labelings that are freely available besides or within visual data, and to use them as intrinsic reward signals to learn general-purpose features.

The features obtained with these approaches have been successfully transferred to classification and detection tasks, and their performance is very encouraging when compared to features trained in a supervised manner.

We introduce a novel self-supervised task, the Jigsaw puzzle reassembly problem (see Fig. 1), which builds features that yield high performance when transferred to detection and classification tasks.

3. Our work

We argue that solving Jigsaw puzzles can be used to teach a system that an object is made of parts and what these parts are. The association of each separate puzzle tile to a precise object part might be ambiguous. However, when all the tiles are observed, the ambiguities might be eliminated more easily because the tile placement is mutually exclusive. This argument is supported by our experimental validation. Training a Jigsaw puzzle solver takes about 2.5 days compared to 4 weeks of [10]. Also, there is no need to handle chromatic aberration or to build robustness to pixelation. Moreover, the features are highly transferable to detection and classification and yield the highest performance to date for an unsupervised method.
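
A minimal sketch of the jigsaw task structure (toy encoder and only 64 random permutations; the paper's CFN is a siamese AlexNet-style network and uses a fixed set of roughly 1000 permutations chosen to be far apart in Hamming distance):

```python
import random
import torch
import torch.nn as nn

# Cut the image into a 3x3 grid, shuffle the tiles with a permutation from a
# fixed set, and train a shared tile encoder + head to predict which
# permutation was used.
PERMS = [tuple(random.sample(range(9), 9)) for _ in range(64)]

class JigsawNet(nn.Module):
    def __init__(self, n_perms=len(PERMS)):
        super().__init__()
        self.tile_enc = nn.Sequential(           # shared across the 9 tiles
            nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 16, 64),
        )
        self.head = nn.Linear(9 * 64, n_perms)

    def forward(self, tiles):                    # tiles: (B, 9, 3, h, w)
        feats = [self.tile_enc(tiles[:, i]) for i in range(9)]
        return self.head(torch.cat(feats, dim=1))

def make_puzzle(img, tile=30):                   # img: (3, 3*tile, 3*tile)
    cells = [img[:, r*tile:(r+1)*tile, c*tile:(c+1)*tile]
             for r in range(3) for c in range(3)]
    label = random.randrange(len(PERMS))
    shuffled = torch.stack([cells[i] for i in PERMS[label]])
    return shuffled, label

net = JigsawNet()
tiles, label = make_puzzle(torch.rand(3, 90, 90))
logits = net(tiles.unsqueeze(0))                 # -> (1, 64)
```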

Related Work

1. Representation learning

transfer learning / pre-training (framing it this way also seems fine; see the experiments later)

2. Unsupervised learning

Three categories: probabilistic, direct mapping (autoencoders), and manifold learning.

3. Self-supervised learning


pixelCNN

https://blog.csdn.net/Jasminexjf/article/details/82499513
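
PixelCNN models an image autoregressively, pixel by pixel; its core building block is the masked convolution, which hides each pixel's own value and everything below or to the right of it. A minimal sketch of that layer (toy sizes):

```python
import torch
import torch.nn as nn

# Masked convolution: mask type "A" (first layer) also hides the center
# pixel; type "B" (later layers) lets the center through.
class MaskedConv2d(nn.Conv2d):
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ("A", "B")
        kH, kW = self.kernel_size
        mask = torch.ones(kH, kW)
        mask[kH // 2, kW // 2 + (mask_type == "B"):] = 0  # right of center
        mask[kH // 2 + 1:, :] = 0                         # rows below center
        self.register_buffer("mask", mask)

    def forward(self, x):
        self.weight.data *= self.mask        # re-apply so weights stay masked
        return super().forward(x)

layer = MaskedConv2d("A", 1, 32, kernel_size=7, padding=3)
out = layer(torch.randn(1, 1, 28, 28))       # -> (1, 32, 28, 28)
```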



Weak supervision

Weak supervision is usually divided into three types:

1. Incomplete supervision: only a small subset of samples are labeled; addressed by active learning, semi-supervised learning, and transfer learning

2. Inexact supervision: labels are coarse-grained; addressed by multi-instance learning

3. Inaccurate supervision: labels are noisy

Letting machines see the big picture from small clues: weakly supervised visual semantic segmentation

https://blog.csdn.net/xwukefr2tnh4/article/details/80479335

Exploring weakly supervised learning in medical imaging

http://www.sohu.com/a/240831591_133098

Survey by Prof. Zhi-Hua Zhou (Nanjing University): weakly supervised learning

A brief introduction to weakly supervised learning
