Using Python to download files containing a given string from an FTP server

2022-01-09 00:00:00 python ftp ftplib

Problem description

I'm trying to download a large number of files that all share a common string (DEM) from an FTP server. These files are nested inside multiple directories, for example Adair/DEM* and Adams/DEM*.

The FTP server is located at ftp://ftp.igsb.uiowa.edu/gis_library/counties/ and requires no username or password. I'd like to go through each county directory and download the files whose names contain the string DEM.

I've read many questions here on Stack Overflow and the Python documentation, but I cannot figure out how to use ftplib.FTP() to get into the site without a username and password (none is required), nor how to grep or use glob.glob inside ftplib or urllib.

Thanks in advance for your help.


Solution

OK, this seems to work. There may be issues if the script tries to download a directory, or tries to list a plain file as if it were a directory. Exception handling may come in handy to catch such entries and skip them.
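A minimal sketch of that exception handling, under stated assumptions: the helper name try_download is hypothetical, and it assumes the server answers a RETR on a directory with a permanent 5xx reply (commonly 550), which ftplib raises as ftplib.error_perm. Other servers may behave differently.

```python
import ftplib
import os


def try_download(ftp, remote_path, local_path):
    """Download remote_path to local_path; return False if the server refuses.

    `ftp` is any object with a retrbinary(cmd, callback) method,
    such as a logged-in ftplib.FTP instance.
    """
    try:
        with open(local_path, "wb") as fh:
            ftp.retrbinary("RETR " + remote_path, fh.write)
        return True
    except ftplib.error_perm as exc:
        # Permanent 5xx reply, e.g. when remote_path is a directory (assumed)
        print("skipping", remote_path, "-", exc)
        os.remove(local_path)  # drop the empty or partial local file
        return False
```

In the download loop, the open/retrbinary pair could be replaced by a call such as try_download(fc, remote_path, local_filename), so that one bad entry does not stop the whole run.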

glob.glob cannot work because the files are on a remote filesystem, but you can use fnmatch to match the names returned by the server.
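To illustrate the matching on its own, here is a small sketch that needs no network access. The file names are made up, standing in for what a directory listing might return, and select_matching is a hypothetical helper. It uses fnmatch.fnmatchcase, the case-sensitive variant, so the result does not depend on the local operating system.

```python
import fnmatch

# Hypothetical names, standing in for what a remote directory listing might return
remote_names = ["DEM_3m_I_01.zip", "roads_01.zip", "Adair/DEM_30m.zip", "readme.txt"]


def select_matching(names, pattern="*DEM*"):
    """Return the names that match a shell-style pattern, case-sensitively."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]


dem_names = select_matching(remote_names)
print(dem_names)
```

Note that in fnmatch the * wildcard also matches the / character, so a name that includes a directory prefix still matches *DEM*.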

Here's the code: it downloads all files matching *DEM* into the TEMP directory, organized by county directory. Calling fc.login() with no arguments logs in as the anonymous user, so no username or password is needed.

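import fnmatch
import ftplib
import os
import posixpath
import sys
import tempfile

# Local download root: %TEMP% if set (Windows), otherwise the system temp directory
output_root = os.getenv("TEMP") or tempfile.gettempdir()

fc = ftplib.FTP("ftp.igsb.uiowa.edu")
fc.login()  # no arguments: anonymous login
fc.cwd("/gis_library/counties")

root_dirs = fc.nlst()
for county in root_dirs:
    sys.stderr.write(county + " ...\n")
    dir_files = fc.nlst(county)
    local_dir = os.path.join(output_root, county)
    if not os.path.exists(local_dir):
        os.mkdir(local_dir)

    for name in dir_files:
        # Some servers return "dir/name" from NLST, so keep only the file name
        name = posixpath.basename(name)
        if fnmatch.fnmatch(name, "*DEM*"):   # cannot use glob.glob on a remote server
            sys.stderr.write("downloading " + county + "/" + name + " ...\n")
            local_filename = os.path.join(local_dir, name)
            with open(local_filename, 'wb') as fh:
                fc.retrbinary('RETR ' + county + "/" + name, fh.write)

fc.close()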
