Common Deep Learning Datasets: Using COCO in Python with pycocotools.coco

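Introduction

The COCO API provides Matlab, Python, and Lua interfaces for loading, parsing, and visualizing the image annotation data. The COCO website also hosts related papers and tutorials.

Before using the API and demos that ship with COCO, first download the COCO images and annotation data (class labels, instance-level distinctions, pixel-level segmentation, etc.):

Download the image data into the coco/images/ folder
Download the annotation data into the coco/annotations/ folder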
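
The snippets later in this post read files from that layout. As a minimal sketch, the paths for one split could be derived as follows; the helper name `coco_paths` is hypothetical, and the per-split subfolder `images/<split>/` is an assumption taken from the paths used further down, not something the COCO API enforces.

```python
import os

def coco_paths(root, data_type):
    """Return (annotation_file, image_dir) for one split such as 'val2017'.

    Hypothetical helper: assumes images live in <root>/images/<split>/ and the
    instance annotations in <root>/annotations/instances_<split>.json.
    """
    ann_file = os.path.join(root, "annotations", f"instances_{data_type}.json")
    img_dir = os.path.join(root, "images", data_type)
    return ann_file, img_dir

ann_file, img_dir = coco_paths("coco", "val2017")
print(ann_file)
print(img_dir)
# Both checks turn True only once the data has actually been downloaded.
print(os.path.isfile(ann_file), os.path.isdir(img_dir))
```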

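Installing with pip

`pip3 install -U Cython`
`pip3 install -U pycocotools`

Usage

pycocotools has no separate documentation; or rather, its documentation is the comments in its source code. Part of those comments is shown below: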

# The following API functions are defined:
#  COCO       - COCO api class that loads COCO annotation file and prepare data structures.
#  decodeMask - Decode binary mask M encoded via run-length encoding.
#  encodeMask - Encode binary mask M using run-length encoding.
#  getAnnIds  - Get ann ids that satisfy given filter conditions.
#  getCatIds  - Get cat ids that satisfy given filter conditions.
#  getImgIds  - Get img ids that satisfy given filter conditions.
#  loadAnns   - Load anns with the specified ids.
#  loadCats   - Load cats with the specified ids.
#  loadImgs   - Load imgs with the specified ids.
#  annToMask  - Convert segmentation in an annotation to binary mask.
#  showAnns   - Display the specified annotations.
#  loadRes    - Load algorithm results and create API for accessing them.
#  download   - Download COCO images from mscoco.org server.
# Throughout the API "ann"=annotation, "cat"=category, and "img"=image.
# Help on each functions can be accessed by: "help COCO>function".

Importing the required packages

from pycocotools.coco import COCO

import matplotlib.pyplot as plt
import cv2

import os
import numpy as np
import random

Loading COCO

cocoRoot = "/media/gph/D(Data)/COCO/"
dataType = "val2017"

annFile = os.path.join(cocoRoot, f'annotations/instances_{dataType}.json')

# initialize COCO api for instance annotations
coco=COCO(annFile)

loading annotations into memory...
Done (t=0.32s)
creating index...
index created!
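
The "creating index" step is what makes the later lookups fast. The following is a rough sketch of the idea on a tiny made-up dataset, not the pycocotools implementation itself; `build_index` and every value in `dataset` are illustrative assumptions.

```python
from collections import defaultdict

# A tiny, made-up dataset in the COCO "instances" layout (all values illustrative).
dataset = {
    "images": [
        {"id": 1, "file_name": "a.jpg", "width": 640, "height": 480},
        {"id": 2, "file_name": "b.jpg", "width": 320, "height": 240},
    ],
    "categories": [
        {"id": 1, "name": "person", "supercategory": "person"},
        {"id": 18, "name": "dog", "supercategory": "animal"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 30, 40]},
        {"id": 11, "image_id": 1, "category_id": 18, "bbox": [50, 60, 70, 80]},
        {"id": 12, "image_id": 2, "category_id": 18, "bbox": [5, 5, 10, 10]},
    ],
}

def build_index(dataset):
    """Build id -> record dictionaries plus two reverse mappings."""
    imgs = {img["id"]: img for img in dataset["images"]}
    cats = {cat["id"]: cat for cat in dataset["categories"]}
    anns = {ann["id"]: ann for ann in dataset["annotations"]}
    img_to_anns = defaultdict(list)   # image id -> annotations on that image
    cat_to_imgs = defaultdict(list)   # category id -> image ids (one per annotation)
    for ann in dataset["annotations"]:
        img_to_anns[ann["image_id"]].append(ann)
        cat_to_imgs[ann["category_id"]].append(ann["image_id"])
    return imgs, cats, anns, img_to_anns, cat_to_imgs

imgs, cats, anns, img_to_anns, cat_to_imgs = build_index(dataset)
print(sorted(imgs))            # image ids
print(len(img_to_anns[1]))     # number of annotations on image 1
print(cat_to_imgs[18])         # images that contain category 18
```

With dictionaries like these, looking up a record by ID is a single dictionary access rather than a scan over the whole JSON file.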

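Getting image information by image_id: loadImgs

Purpose: look up image information by image_id
Parameter ids: a list of image_id values
Return value: a list of image records, each of which is a dictionary
- The dictionary has this format: {'id': 7, 'file_name': '28918710,be0001310a1c4.jpg', 'width': 710, 'height': 443}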

# image_id: an image ID or a list of IDs, e.g. one returned by coco.getImgIds()
img_list = coco.loadImgs(ids=image_id)
for img in img_list:
    print(img)
# {'id': 7, 'file_name': '28918710,be0001310a1c4.jpg', 'width': 710, 'height': 443}

Getting category IDs and names: getCatIds, loadCats

getCatIds

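Purpose: getCatIds returns the IDs of the categories that match a filter such as a category name. It also accepts supercategory names (supNms) and category IDs (catIds) as filters; see the official documentation for details.
Parameter: category names (catNms), e.g. 'person'
Returns: a list of category IDs, all of them integers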

# Pass the names as a list; the result is a list of IDs, so take the first element
catId = coco.getCatIds(catNms=['person'])[0]
print(f'ID of "person": {catId}')
# ID of "person": 1
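
To make the filtering behavior concrete, here is a pure-Python sketch that mimics the idea on a hand-written category list. `get_cat_ids` and the sample `categories` are hypothetical stand-ins for illustration, not the library code.

```python
def get_cat_ids(categories, cat_names=(), super_names=()):
    """Return the ids of categories that pass every non-empty filter.

    Pass the filters as lists or tuples: with a bare string, the `in` test
    below would turn into a substring check.
    """
    selected = categories
    if cat_names:
        selected = [c for c in selected if c["name"] in cat_names]
    if super_names:
        selected = [c for c in selected if c["supercategory"] in super_names]
    return [c["id"] for c in selected]

# Hand-written sample in the shape of COCO category records.
categories = [
    {"id": 1, "name": "person", "supercategory": "person"},
    {"id": 17, "name": "cat", "supercategory": "animal"},
    {"id": 18, "name": "dog", "supercategory": "animal"},
]

print(get_cat_ids(categories, cat_names=["person"]))
print(get_cat_ids(categories, super_names=["animal"]))
print(get_cat_ids(categories, cat_names=["dog"], super_names=["animal"]))
```

Because the result is always a list, the `[0]` in the snippet above is what turns a one-element list into a single ID.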

loadCats

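Purpose: loadCats returns the category record (including its name) for a category ID. It supports more than this; see the official documentation for details.
Parameter: category ID(s)
Returns: a list of category records, each of which is a dictionary
- {'supercategory': 'person', 'id': 1, 'name': 'person'}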

cats = coco.loadCats(1)
print(f'Category for ID 1: {cats}')
# Category for ID 1: [{'supercategory': 'person', 'id': 1, 'name': 'person'}]

Getting images that satisfy given conditions (intersection)

# Get all images that contain a person (category ID 1)
imgIds = coco.getImgIds(catIds=[1])
print(f'Number of images containing person: {len(imgIds)}')

Number of images containing person: 2693
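
The "intersection" in the heading means that when several category IDs are passed, only images containing all of them are returned. A small sketch of that semantics with plain sets follows; `get_img_ids` and the sample mapping are hypothetical, and the empty-filter behavior of this sketch is a simplification.

```python
def get_img_ids(cat_to_imgs, cat_ids):
    """Ids of images that contain every category in cat_ids (set intersection).

    Simplification: an empty cat_ids list yields [] in this sketch.
    """
    result = None
    for cat_id in cat_ids:
        ids = set(cat_to_imgs.get(cat_id, []))
        result = ids if result is None else result & ids
    return sorted(result or [])

# Made-up mapping from category id to the images that contain it.
cat_to_imgs = {1: [100, 101, 102], 18: [101, 102, 103], 3: [102, 200]}

print(get_img_ids(cat_to_imgs, [1]))
print(get_img_ids(cat_to_imgs, [1, 18]))
print(get_img_ids(cat_to_imgs, [1, 18, 3]))
```

Each additional category can only shrink the result, which is why a query for many categories at once often comes back empty.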

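Getting all images of one category

# Get all images that contain a dog
dogId = coco.getCatIds(catNms=['dog'])[0]
imgIds = coco.catToImgs[dogId]
print(f'Number of images containing dog: {len(imgIds)}; their IDs are:')
print(imgIds)

Number of images containing dog: 218; their IDs are: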
[289343, 61471, 472375, 520301, 579321, 494869, …]
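
One caveat worth checking: if `catToImgs` is filled with one entry per annotation (which is how the pycocotools source appears to build it), an image with several dogs shows up several times, so the length of the list counts annotations rather than distinct images. The sketch below reproduces that effect on made-up data and removes the duplicates with a set.

```python
from collections import defaultdict

# Made-up annotations: image 7 contains two dogs (category 18), image 9 one.
annotations = [
    {"image_id": 7, "category_id": 18},
    {"image_id": 7, "category_id": 18},
    {"image_id": 9, "category_id": 18},
]

# One list entry per annotation, so image 7 is recorded twice.
cat_to_imgs = defaultdict(list)
for ann in annotations:
    cat_to_imgs[ann["category_id"]].append(ann["image_id"])

raw_ids = cat_to_imgs[18]
unique_ids = sorted(set(raw_ids))   # distinct images only
print(len(raw_ids), len(unique_ids))
```

Applying `sorted(set(imgIds))` to the list above, or calling `coco.getImgIds(catIds=[dogId])` instead, would be one way to count distinct images.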

Displaying image information

Displaying the original image

imgId = imgIds[10]

imgInfo = coco.loadImgs(imgId)[0]
print(f'Information for image {imgId}:\n{imgInfo}')

imPath = os.path.join(cocoRoot, 'images', dataType, imgInfo['file_name'])
im = cv2.imread(imPath)
# OpenCV loads images as BGR; convert to RGB so matplotlib shows the right colors
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
plt.axis('off')
plt.imshow(im)
plt.show()

Information for image 329219:
{'license': 1, 'file_name': '000000329219.jpg', 'coco_url': 'http://images.cocodataset.org/val2017/000000329219.jpg',
'height': 427, 'width': 640, 'date_captured': '2013-11-14 19:21:56',
'flickr_url': 'http://farm9.staticflickr.com/8104/8505307842_465524a6a6_z.jpg',
'id': 329219}
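
A note on colors: `cv2.imread` returns pixels in BGR order, while `plt.imshow` interprets a three-channel array as RGB, so an image passed straight from one to the other is displayed with red and blue swapped. The sketch below shows the channel swap on a two-pixel stand-in array (made up for illustration); for a real image, `cv2.cvtColor(im, cv2.COLOR_BGR2RGB)` does the same job.

```python
import numpy as np

# A 1x2 stand-in "image" in BGR order: one pure-blue and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels (BGR -> RGB).
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())   # the blue pixel, now in RGB order
print(rgb[0, 1].tolist())   # the red pixel, now in RGB order
```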

Loading and displaying annotations

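plt.imshow(im)
plt.axis('off')

# Get the IDs of the annotations that belong to this image
annIds = coco.getAnnIds(imgIds=imgInfo['id'])
# Load the annotations first, so that len(anns) below is defined
anns = coco.loadAnns(annIds)
print(f'Image {imgInfo["id"]} has {len(anns)} annotations, with IDs:\n{annIds}')

coco.showAnns(anns)
plt.show()

Image 329219 has 21 annotations, with IDs: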
[8032, 192816, 693180, 1508387, 1510882, 1518236, 1527016, 1529043, 1882305, 1885153, 1885350, 1885410, 1886212, 1886466, 1887489, 1981518, 2106278, 2183575, 2183858, 2213662, 2213709]
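
Each loaded annotation is a dictionary. In the COCO instances format its `bbox` field is stored as [x, y, width, height], with (x, y) the top-left corner, whereas many drawing and IoU routines expect corner coordinates. A minimal conversion sketch follows; the helper `bbox_to_corners` and the sample annotation values are made up for illustration.

```python
def bbox_to_corners(bbox):
    """Convert a COCO-style [x, y, width, height] box to (x1, y1, x2, y2)."""
    x, y, w, h = bbox
    return x, y, x + w, y + h

# Made-up annotation carrying only the fields used here.
ann = {"id": 1, "image_id": 329219, "category_id": 18,
       "bbox": [10.0, 20.0, 30.0, 40.0], "iscrowd": 0}

x1, y1, x2, y2 = bbox_to_corners(ann["bbox"])
print(x1, y1, x2, y2)
```

The corner form can be passed to, for example, `cv2.rectangle` after rounding the values to integers.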

print(f'Mask for annotation {annIds[3]}:')
mask = coco.annToMask(anns[3])
plt.imshow(mask)
plt.axis('off')
plt.show()

Mask for annotation 1508387:
(-0.5, 639.5, 426.5, -0.5)
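
The API list earlier mentions run-length encoding (RLE), which COCO uses for some segmentations. As a rough illustration of what decoding involves, the sketch below expands an uncompressed RLE into a binary mask in pure Python. It assumes the usual COCO convention that the counts alternate between runs of 0s and 1s, starting with 0s, and that pixels are laid out column by column; `rle_decode` is a hypothetical helper, not the library function.

```python
def rle_decode(counts, height, width):
    """Decode an uncompressed RLE into a height x width list of 0/1 rows."""
    flat = []
    value = 0  # runs alternate between 0 and 1, starting with zeros
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("counts must cover every pixel exactly once")
    # Pixels are stored column by column, so pixel (r, c) sits at c * height + r.
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]

# Two zeros, three ones, one zero, laid out over a 2x3 mask.
mask = rle_decode([2, 3, 1], height=2, width=3)
for row in mask:
    print(row)
print(sum(map(sum, mask)))   # number of foreground pixels
```

In practice `coco.annToMask` handles this conversion (as well as polygon segmentations) and returns a NumPy array, so a helper like this is only useful for understanding the format.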

https://flepeng.github.io/ml-深度学习常见数据集COCO-Python使用-pycocotools-coco/

Author

Lepeng

Published

June 29, 2021
