Dynamically rename an Azure blob if it has already been uploaded

2022-03-04 00:00:00 python azure azure-blob-storage

Problem description

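I have a set of files (not saved locally) that need to be uploaded to Azure Blob storage and updated daily.

(1) A certain number of files share the same name (with different content) and should be saved as individual blobs.
(2) The updated set of files should overwrite the corresponding blobs from the previous day.

Is there a way to check whether a blob already exists and dynamically rename it by appending a number (a timestamp cannot be appended because of (2))?

I am using the following function to upload all my files: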

import logging
import os

def azure_upload_file(block_blob_service, container, local_file_path, local_file_name):
    logger = logging.getLogger('data')

    isExist = block_blob_service.exists(container, local_file_name)

    # Split "abcd.zip" into ("abcd", ".zip")
    blobname, blobext = os.path.splitext(local_file_name)

    if isExist:
        # '#' is the placeholder I do not know how to fill in
        blob_file_name = '{}_{}{}'.format(blobname, '#', blobext)
    else:
        blob_file_name = local_file_name
    full_path_to_file = os.path.join(local_file_path, local_file_name)

    block_blob_service.create_blob_from_path(container, blob_file_name, full_path_to_file)
    blob_url = block_blob_service.make_blob_url(container, blob_file_name)

    logger.info('Uploaded file {} to azure blob storage'.format(blob_file_name))
    os.unlink(full_path_to_file)

    return blob_url

Example:

Date: 2019-11-19 (initial upload)

filename.ext -> blob
1. abcd.zip -> abcd.zip
2. abcd.zip -> abcd(1).zip
3. abcd.zip -> abcd(2).zip
4. defg.csv -> defg.csv

And so on.

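I just want to fill in the "#" in my code intelligently, so that whenever I have an updated set of files I already know which blob each file should overwrite.

That is, if I have a new set of files on 2019-11-20:

Example:

Date: 2019-11-20 (second upload)

new filename.ext -> blob
1. abcd.zip -> abcd.zip
2. abcd.zip -> abcd(1).zip
3. abcd.zip -> abcd(2).zip
4. defg.csv -> defg.csv

And so on.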

I have already looked at similar posts:
1. Azure blob upload rename if blob name exist
2. Faster Azure blob name search with python?

Neither of them solves my problem. Is there an efficient yet simple way to achieve this?


Solution

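You can use the exists method to check whether the blob already exists, and then decide whether the file name needs to change.

Below is my test code, which works for me.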

import os
from azure.storage.blob import BlockBlobService  # legacy azure-storage-blob SDK

# accountName and accountKey are your storage account credentials.
block_blob_service = BlockBlobService(account_name=accountName, account_key=accountKey,
                                      socket_timeout=10000)

container_name = "test"
local_path = "./data"
local_file_name = "quickstart.txt"

# The local file keeps its name; only the blob name changes on a collision.
upload_file_path = os.path.join(local_path, local_file_name)
blob_name = local_file_name

if block_blob_service.exists(container_name, blob_name):
    blob_name = blob_name.replace('.txt', '1.txt')

print("\nUploading to Azure Storage as blob:\n\t" + blob_name)
# Upload the local file, using blob_name as the blob name.
block_blob_service.create_blob_from_path(container_name, blob_name, upload_file_path)
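
The hard-coded `.replace('.txt', '1.txt')` only handles one extension. As a minimal sketch, a small helper (the name `numbered_name` is hypothetical, not part of the Azure SDK) can build the `abcd(1).zip` style names from the question for any extension:

```python
import os


def numbered_name(filename, n):
    """Insert "(n)" before the extension; n == 0 leaves the name unchanged."""
    if n == 0:
        return filename
    stem, ext = os.path.splitext(filename)
    return "{}({}){}".format(stem, n, ext)


print(numbered_name("abcd.zip", 1))  # abcd(1).zip
```

The result can be passed as the blob name while the local path keeps the original file name.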

Update:

container_name = "test"
local_path = "./data"
local_file_name = "quickstart.txt"

upload_file_path = os.path.join(local_path, local_file_name)
# splitext keeps names containing several dots intact, unlike split('.')
stem, ext = os.path.splitext(local_file_name)

# Try quickstart.txt, quickstart(1).txt, quickstart(2).txt, ... until a free name is found.
name = local_file_name
i = 1
while block_blob_service.exists(container_name, name):
    name = '{}({}){}'.format(stem, i, ext)
    i += 1

print("\nUploading to Azure Storage as blob:\n\t" + name)
block_blob_service.create_blob_from_path(container_name, name, upload_file_path)
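
One caveat with probing via exists: on the second day every name from the first day is already taken, so the loop would produce new blobs such as abcd(3).zip instead of overwriting, which conflicts with requirement (2). A possible alternative, sketched below under the assumption that each day's batch arrives in the same order, is to number duplicates within the batch itself and ignore what is in the container. The function name `assign_batch_names` is hypothetical.

```python
import os
from collections import defaultdict


def assign_batch_names(filenames):
    """Map each file in one batch to a blob name, numbering repeats in order.

    Assumption: the batch order is stable from day to day, so the n-th
    "abcd.zip" always maps to the same blob name.
    """
    seen = defaultdict(int)  # how many times each file name has appeared so far
    blob_names = []
    for filename in filenames:
        n = seen[filename]
        seen[filename] += 1
        if n == 0:
            blob_names.append(filename)
        else:
            stem, ext = os.path.splitext(filename)
            blob_names.append("{}({}){}".format(stem, n, ext))
    return blob_names


batch = ["abcd.zip", "abcd.zip", "abcd.zip", "defg.csv"]
print(assign_batch_names(batch))
# ['abcd.zip', 'abcd(1).zip', 'abcd(2).zip', 'defg.csv']
```

Because the names depend only on the batch, uploading each file to its assigned name with create_blob_from_path on the following day replaces the previous day's blob rather than creating a new one.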
