How to download a specific part of the COCO dataset?
Problem description
I am developing an object detection model to detect ships using YOLO, and I want to use the COCO dataset. Is there a way to download only the images that contain ships, together with their annotations?
Solution
To download images from a specific category, you can use the COCO API. Here's a demo notebook going through this and other usages. The overall process is as follows:
- Install pycocotools
- Download one of the annotations JSON files from the COCO dataset
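The first step is usually just `pip install pycocotools`. The second step can be scripted; the block below is a minimal sketch of it using only the standard library. The archive URL and the member name `annotations/instances_train2014.json` are assumptions about the 2014 train/val release, and `download_file` and `extract_member` are hypothetical helper names, so check the COCO download page for the split you actually need.

```python
import urllib.request
import zipfile
from pathlib import Path

# Assumed URL of the 2014 train/val annotations archive; verify it on the
# COCO download page before relying on it.
ANNOTATIONS_URL = "http://images.cocodataset.org/annotations/annotations_trainval2014.zip"


def download_file(url, dest):
    """Download url to dest unless dest already exists; return dest as a Path."""
    dest = Path(dest)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract_member(zip_path, member, out_dir):
    """Extract a single file (e.g. one annotations JSON) from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return Path(archive.extract(member, path=out_dir))


# Hypothetical usage (the archive is large, so this is left commented out):
# archive = download_file(ANNOTATIONS_URL, "coco/annotations_trainval2014.zip")
# ann_file = extract_member(archive, "annotations/instances_train2014.json", "coco")
```

Extracting only the one JSON you need avoids unpacking the caption and keypoint files that ship in the same archive.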
Here's an example of how we could download a subset of the images containing a person and save it to a local folder:
from pycocotools.coco import COCO
import requests
# instantiate COCO specifying the annotations json path
coco = COCO('...path_to_annotations/instances_train2014.json')
# Specify a list of category names of interest
catIds = coco.getCatIds(catNms=['person'])
# Get the corresponding image ids and images using loadImgs
imgIds = coco.getImgIds(catIds=catIds)
images = coco.loadImgs(imgIds)
This returns a list of dictionaries with basic information on the images and their URLs. We can now use requests to GET the images and write them into a local folder:
# Save the images into a local folder (the folder must already exist)
for im in images:
    img_data = requests.get(im['coco_url']).content
    with open('...path_saved_ims/coco_person/' + im['file_name'], 'wb') as handler:
        handler.write(img_data)
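The loop above assumes the output folder exists and that every request succeeds. As a rough sketch of a more defensive variant, the block below creates the folder, skips files that are already on disk, and collects failures instead of stopping. It swaps `requests` for the standard library's `urllib.request` to stay self-contained, and `save_images` is a hypothetical helper name, not part of the COCO API.

```python
import urllib.request
from pathlib import Path


def save_images(images, out_dir, timeout=30):
    """Download each image's 'coco_url' into out_dir; return the failed file names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for im in images:
        target = out_dir / im["file_name"]
        if target.exists():
            continue  # already downloaded on an earlier run
        try:
            with urllib.request.urlopen(im["coco_url"], timeout=timeout) as resp:
                data = resp.read()
        except OSError:  # URLError and HTTPError are both subclasses of OSError
            failed.append(im["file_name"])
            continue
        target.write_bytes(data)
    return failed
```

Because existing files are skipped, the function can simply be called again with the same list to retry whatever failed.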
Note that this will save all images from the specified category, so you might want to slice the images list to the first n.
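For the ship use case in the question, the relevant COCO category is named `boat`. The block below is an illustrative sketch, not the pycocotools implementation: it works directly on the dictionary loaded from an annotations JSON, keeps only the first `limit` matching images, and also returns the matching annotations so they can be saved alongside the images. `select_category` and the tiny `tiny` dataset are hypothetical names made up for the example.

```python
def select_category(dataset, category_name, limit=None):
    """Return (images, annotations) for images with at least one object of category_name."""
    cat_ids = {c["id"] for c in dataset["categories"] if c["name"] == category_name}
    anns = [a for a in dataset["annotations"] if a["category_id"] in cat_ids]
    img_ids = sorted({a["image_id"] for a in anns})
    if limit is not None:
        img_ids = img_ids[:limit]  # keep only the first n matching images
    keep = set(img_ids)
    images = [im for im in dataset["images"] if im["id"] in keep]
    anns = [a for a in anns if a["image_id"] in keep]
    return images, anns


# Tiny stand-in for the dictionary that json.load() would return for a real
# annotations file.
tiny = {
    "categories": [{"id": 1, "name": "person"}, {"id": 9, "name": "boat"}],
    "images": [
        {"id": 10, "file_name": "a.jpg"},
        {"id": 11, "file_name": "b.jpg"},
        {"id": 12, "file_name": "c.jpg"},
    ],
    "annotations": [
        {"id": 100, "image_id": 10, "category_id": 9},
        {"id": 101, "image_id": 11, "category_id": 1},
        {"id": 102, "image_id": 12, "category_id": 9},
        {"id": 103, "image_id": 10, "category_id": 1},
    ],
}

boat_images, boat_anns = select_category(tiny, "boat", limit=1)
```

Only the `boat` annotations are returned, so objects of other categories that appear on the same images are dropped, which is usually what a single-class detector needs.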