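Google Cloud Functions: copy all data from a source bucket to another bucket using Python

Problem description

I want to use Google Cloud Functions to copy data from one bucket to another. At the moment I can only copy a single file to the destination, but I want to copy all files, folders, and subfolders to my destination bucket.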


from google.cloud import storage


def copy_blob(
    bucket_name="loggingforproject",
    blob_name="assestnbfile.json",
    destination_bucket_name="test-assest",
    destination_blob_name="logs",
):
    """Copies a blob from one bucket to another with a new name."""
    # The defaults above are used as-is; reassigning them inside the body
    # would silently override whatever the caller passes in.
    storage_client = storage.Client()

    source_bucket = storage_client.bucket(bucket_name)
    source_blob = source_bucket.blob(blob_name)
    destination_bucket = storage_client.bucket(destination_bucket_name)

    # Copy the single source blob into the destination bucket under the new name
    blob_copy = source_bucket.copy_blob(
        source_blob, destination_bucket, destination_blob_name
    )

    print(
        "Blob {} in bucket {} copied to blob {} in bucket {}.".format(
            source_blob.name,
            source_bucket.name,
            blob_copy.name,
            destination_bucket.name,
        )
    )

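Solution

Using gsutil cp is a good option. However, if you prefer to copy the files with a Cloud Function, that is possible too.

Currently, your function copies only a single file. To copy the entire contents of a bucket, you need to iterate over the files in it.

Here is a code sample that I wrote and tested for an HTTP Cloud Function; you can use it as a reference: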

main.py

from google.cloud import storage


def copy_bucket_files(request):
    """
    Copies the files from a specified bucket into the selected one.
    """

    # Check if the bucket's name was specified in the request
    bucket_name = request.args.get('bucket')
    if not bucket_name:
        # 400 tells the caller that the request itself was incomplete
        return "The bucket name was not provided. Please try again.", 400

    try:
        # Initiate Cloud Storage client
        storage_client = storage.Client()
        # Define the origin bucket
        origin = storage_client.bucket(bucket_name)
        # Define the destination bucket
        destination = storage_client.bucket('<my-test-bucket>')

        # List the blobs in the bucket whose files you want to copy
        blobs = storage_client.list_blobs(bucket_name)

        # Copy each blob, keeping its original name in the destination
        for blob in blobs:
            origin.copy_blob(blob, destination)

        return "Done!"

    except Exception as exc:
        # Log the cause instead of swallowing it with a bare except
        print("Copy failed: {}".format(exc))
        return "Failed!", 500

requirements.txt

google-cloud-storage==1.22.0
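
The loop in main.py can also be factored into a small helper, which makes it easier to try out without deploying. The following is only a sketch under a few assumptions: the name copy_all_blobs and the optional prefix argument are hypothetical and not part of the original answer, and the client argument is expected to behave like google.cloud.storage.Client (bucket, list_blobs) with buckets that offer copy_blob. Because Cloud Storage has no real folders and a path such as logs/x/c.json is simply part of the object name, copying each object under its own name should preserve the folder and subfolder layout.

```python
def copy_all_blobs(client, source_name, destination_name, prefix=None):
    """Copy every object (optionally only those under `prefix`) between buckets.

    `client` is assumed to behave like google.cloud.storage.Client.
    Returns the list of object names that were copied.
    """
    source = client.bucket(source_name)
    destination = client.bucket(destination_name)

    copied = []
    # "Folders" are only name prefixes, so reusing blob.name as the new name
    # keeps the same directory-like structure in the destination bucket.
    for blob in client.list_blobs(source_name, prefix=prefix):
        source.copy_blob(blob, destination, new_name=blob.name)
        copied.append(blob.name)
    return copied
```

Inside the Cloud Function it could be called as copy_all_blobs(storage.Client(), bucket_name, '<my-test-bucket>'), and passing prefix="logs/" would restrict the copy to a single folder.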

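How to call the function:

You can call it through the URL provided for triggering the function, by appending /?bucket=<name-of-the-bucket-to-copy> to that URL (write the name without the <>):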

https://<function-region>-<project-name>.cloudfunctions.net/<function-name>/?bucket=<bucket-name>
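
The same request can be sent from a script instead of a browser. This is a minimal sketch using only the Python standard library; the helper names build_trigger_url and call_function are hypothetical, and it assumes the function accepts unauthenticated requests (if it does not, an identity token would have to be sent in an Authorization header, which is not shown here).

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_trigger_url(base_url, bucket_name):
    """Append the ?bucket=<name> query parameter to the trigger URL."""
    # urlencode escapes any characters that are not safe in a query string
    return base_url.rstrip("/") + "/?" + urlencode({"bucket": bucket_name})


def call_function(base_url, bucket_name, timeout=60):
    """Send a GET request to the function and return the response body as text."""
    url = build_trigger_url(base_url, bucket_name)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Here base_url stands for the trigger URL of the deployed function without the query string, for example the part of the URL above that ends in <function-name>.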
