Batch requests with the Google Cloud Storage Python client

2022-01-25 00:00:00 python google-cloud-storage

Question

I can't find any examples of how to use the batch functionality of the Python Google Cloud Storage client. I see that it exists here.

I'd love a concrete example. Let's say I want to delete a bunch of blobs with a given prefix. I'd start by getting the list of blobs as follows:

from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket('my_bucket_name')
blobs_to_delete = bucket.list_blobs(prefix="my/prefix/here")

# how do I delete the blobs in blobs_to_delete in a single batch?

# bonus: if I have more than 100 blobs to delete, handle the limitation
#        that a batch can only handle 100 operations


Answer

TL;DR - Just send all the requests within the batch() context manager (available in the google-cloud-python library).

Try this example:

from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket('my_bucket_name')
# Accumulate the iterated results in a list before opening the
# batch context manager, so listing happens outside the batch
blobs_to_delete = list(bucket.list_blobs(prefix="my/prefix/here"))

# Use the batch context manager to delete all the blobs
with storage_client.batch():
    for blob in blobs_to_delete:
        blob.delete()

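You only need to worry about the limit of 100 items per batch if you're using the REST APIs directly. As far as I know, the batch() context manager takes care of this restriction and issues multiple batch requests if needed, but this may depend on the library version, so verify it against the version you have installed before deleting more than 100 blobs.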
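
If you would rather not depend on the client splitting a large batch for you (some library versions may reject an oversized batch instead), a minimal sketch is to chunk the blobs yourself and open one batch() context per chunk. The helper names chunked and delete_in_batches, and the batch_size default of 100, are assumptions made for illustration and are not part of the google-cloud-storage library; the client argument is assumed to be a storage.Client like the one created above.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def delete_in_batches(client, blobs, batch_size=100):
    """Delete `blobs`, opening one client.batch() context per chunk.

    Hypothetical helper, not a library function. `client` is assumed to be
    a google.cloud.storage.Client and `blobs` an iterable of Blob objects,
    for example the result of bucket.list_blobs(prefix=...).
    Returns the number of delete calls that were queued.
    """
    queued = 0
    for chunk in chunked(blobs, batch_size):
        # Each chunk was fully fetched by chunked() before the batch opens,
        # so only the delete calls are deferred into this batch.
        with client.batch():
            for blob in chunk:
                blob.delete()
        queued += len(chunk)
    return queued


# Intended usage (assumes the storage_client and bucket from the example above):
#   blobs = bucket.list_blobs(prefix="my/prefix/here")
#   delete_in_batches(storage_client, blobs)
```

Because each chunk is materialized before its batch is opened, the listing requests are never mixed into the batch, which matches the reason the example above builds a list first.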
