The COCO dataset:

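Downloading the data from the official site
The download page on the official site sometimes fails to open, so the download links are given here directly. One set of data consists of a train archive, a val archive and an annotations archive.
COCO 2014 data: train2014.zip val2014.zip annotations_trainval2014.zip
COCO 2017 data: train2017.zip val2017.zip annotations_trainval2017.zip
Test data: test2014.zip test2015.zip test2017.zip
The author is working on object detection with Faster R-CNN, so the instances format is examined first here; the other annotation sets, such as the human keypoints set, are organized similarly.
Store the COCO dataset in the following folder layout: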

COCO/DIR/
       annotations/
             instances_train201?.json
             instances_val201?.json
       train201?/
             # image files that are mentioned in the corresponding json
       val201?/
             # image files that are mentioned in the corresponding json
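
The layout above can be sanity-checked with a few lines of Python. This is only a minimal sketch: the function name find_missing_images and its arguments are illustrative, and it assumes the annotation file is named instances_<split>.json as in the tree above.

```python
import json
from pathlib import Path


def find_missing_images(root, split):
    """Return file names listed in instances_<split>.json but absent from <root>/<split>/."""
    root = Path(root)
    ann_path = root / "annotations" / f"instances_{split}.json"
    with open(ann_path, encoding="utf-8") as f:
        data = json.load(f)
    image_dir = root / split
    # Every entry of data["images"] carries the file name of one image.
    return [
        img["file_name"]
        for img in data["images"]
        if not (image_dir / img["file_name"]).is_file()
    ]
```

For example, find_missing_images("COCO/DIR", "train2014") should return an empty list when every image mentioned in the json is present in train2014/.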

COCO format:
Overall structure of the data

{
    'info': info,
    'images': [image],
    'licenses': [license],
    'annotations': [annotation],
    'categories': [category] 
}

Taking the 2014 data as an example, read the contents of instances_train2014.json.

import json

# Load the 2014 training instance annotations (the file is closed automatically).
with open('instances_train2014.json') as f:
    data = json.load(f)
data.keys()
# dict_keys(['info', 'images', 'licenses', 'annotations', 'categories'])
data['info']
# {'description': 'COCO 2014 Dataset', 'url': 'http://cocodataset.org', 'version': '1.0', 'year': 2014, 'contributor': 'COCO Consortium', 'date_created': '2017/09/01'}
data['images'][0]
# {'license': 5, 'file_name': 'COCO_train2014_000000057870.jpg', 'coco_url': 'http://images.cocodataset.org/train2014/COCO_train2014_000000057870.jpg', 'height': 480, 'width': 640, 'date_captured': '2013-11-14 16:28:13', 'flickr_url': 'http://farm4.staticflickr.com/3153/2970773875_164f0c0b83_z.jpg', 'id': 57870}
data['licenses'][0]
# {'url': 'http://creativecommons.org/licenses/by-nc-sa/2.0/', 'id': 1, 'name': 'Attribution-NonCommercial-ShareAlike License'}
data['annotations'][0]
# {'segmentation': [[312.29, 562.89, 402.25, 511.49, 400.96, 425.38, 398.39, 372.69, 388.11, 332.85, 318.71, 325.14, 295.58, 305.86, 269.88, 314.86, 258.31, 337.99, 217.19, 321.29, 182.49, 343.13, 141.37, 348.27, 132.37, 358.55, 159.36, 377.83, 116.95, 421.53, 167.07, 499.92, 232.61, 560.32, 300.72, 571.89]], 'area': 54652.9556, 'iscrowd': 0, 'image_id': 480023, 'bbox': [116.95, 305.86, 285.3, 266.03], 'category_id': 58, 'id': 86}
data['categories'][0]
# {'supercategory': 'person', 'id': 1, 'name': 'person'}
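
For a detector such as Faster R-CNN it is usually convenient to group the annotations by image and to turn the boxes into corner coordinates. The sketch below assumes the standard COCO convention that bbox is [x, y, width, height] with (x, y) the top-left corner; the helper names index_coco and xywh_to_xyxy are illustrative, not part of any library.

```python
from collections import defaultdict


def index_coco(data):
    """Build lookup tables from a loaded COCO instances dict."""
    # category id -> category name, e.g. 1 -> 'person'
    id_to_name = {c["id"]: c["name"] for c in data["categories"]}
    # image id -> list of annotation dicts belonging to that image
    anns_by_image = defaultdict(list)
    for ann in data["annotations"]:
        anns_by_image[ann["image_id"]].append(ann)
    return id_to_name, anns_by_image


def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]
```

Applied to the first annotation printed above, xywh_to_xyxy gives approximately [116.95, 305.86, 402.25, 571.89], which agrees with the largest x and y coordinates in its segmentation polygon.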

Creating labels with labelme:

Installing labelme

pip install labelme

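Once it is installed, typing labelme in a terminal opens its graphical interface. Then click Open Dir on the left side of the window to load the folder that contains the image dataset.

Figure: the labelme interface

Next, choose Create Polygons on the left side of the window to mark a polygon, then click Save to store the annotation in the same directory as the image.

Then move on to the next image, annotate it and save again. When labeling is finished, each image should have exactly one corresponding json file.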
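
Whether every image really has its json file can be checked with a short script. This is a minimal sketch: the function name unlabeled_images and the set of image suffixes are assumptions, and it relies on the annotations being saved next to the images under the same base name, as described above.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if the dataset uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def unlabeled_images(image_dir):
    """Return the names of images that have no json file with the same base name."""
    image_dir = Path(image_dir)
    return sorted(
        p.name
        for p in image_dir.iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
        and not p.with_suffix(".json").is_file()
    )
```

An empty result means that the one-image-one-json condition holds for the folder.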

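Building a COCO dataset from labelme labels:

Downloading the labelme source code
One of the examples in the labelme source tree is needed to convert the json files.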

git clone https://github.com/wkentaro/labelme.git

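Go into labelme/examples/ and make a copy of the instance_segmentation folder, then enter the copy. Put the files generated with labelme into data_annotated, delete the data_dataset_coco folder, then open labels.txt and replace the categories in it with your own labels.
Run labelme2coco.py to produce a new data_dataset_coco folder.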

python labelme2coco.py data_annotated data_dataset_coco --labels labels.txt
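
A mismatch between the labels used while annotating and the entries of labels.txt is easy to overlook, so a quick check before running the command above may be worthwhile. The sketch below assumes only that each labelme json file stores its polygons under "shapes", each with a "label" field; the function names are illustrative.

```python
import json
from pathlib import Path


def labels_used(annotated_dir):
    """Collect every label string used in the labelme json files of a folder."""
    labels = set()
    for json_path in Path(annotated_dir).glob("*.json"):
        with open(json_path, encoding="utf-8") as f:
            shapes = json.load(f).get("shapes", [])
        labels.update(shape["label"] for shape in shapes)
    return labels


def labels_not_listed(annotated_dir, labels_file):
    """Return labels that appear in the annotations but not in the labels file."""
    with open(labels_file, encoding="utf-8") as f:
        listed = {line.strip() for line in f if line.strip()}
    return sorted(labels_used(annotated_dir) - listed)
```

Calling labels_not_listed("data_annotated", "labels.txt") should return an empty list when labels.txt covers every label that was drawn.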

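Reading the newly generated annotations file and comparing it with the annotations file of the original COCO dataset shows that the two are consistent in format.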
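
The comparison can be made a little more systematic. The following sketch checks a loaded annotations dict against the keys seen earlier in instances_train2014.json; the function name missing_coco_keys is illustrative, and only the first annotation is inspected.

```python
# Keys observed in instances_train2014.json earlier in this post.
COCO_TOP_LEVEL_KEYS = {"info", "images", "licenses", "annotations", "categories"}
COCO_ANNOTATION_KEYS = {
    "segmentation", "area", "iscrowd", "image_id", "bbox", "category_id", "id",
}


def missing_coco_keys(data):
    """Return (missing top-level keys, keys missing from the first annotation)."""
    missing_top = COCO_TOP_LEVEL_KEYS - set(data)
    anns = data.get("annotations", [])
    # With no annotations there is nothing to compare at the annotation level.
    missing_ann = COCO_ANNOTATION_KEYS - set(anns[0]) if anns else set()
    return missing_top, missing_ann
```

To use it, load the generated file with json.load and pass the result to missing_coco_keys; two empty sets indicate that no expected key is missing. The file is assumed here to be data_dataset_coco/annotations.json, which may differ between labelme versions, and extra keys in the generated file are not reported.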
