How to sync only the changed files from a remote directory using pysftp?

Question

I am using the pysftp library's get_r function (https://pysftp.readthedocs.io/en/release_0.2.9/pysftp.html#pysftp.Connection.get_r) to get a local copy of a directory structure from an SFTP server.

Is that the correct approach when the contents of the remote directory have changed and I want to fetch only the files that changed since the last time the script was run?

The script should be able to sync the remote directory recursively and mirror its state: new files and changes to existing files should be fetched, and a parameter should control whether outdated local files (those no longer present on the remote server) are removed.

My current approach is here.

Example usage:

from sftp_sync import sync_dir

sync_dir('/remote/path/', '/local/path/')


Solution

Use pysftp.Connection.listdir_attr to get a file listing with attributes (including each file's modification timestamp).

Then iterate over the listing and compare each entry against the corresponding local file.

import os
import pysftp
import stat

remote_path = "/remote/path"
local_path = "/local/path"

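with pysftp.Connection('example.com', username='user', password='pass') as sftp:
    sftp.cwd(remote_path)
    for f in sftp.listdir_attr():
        # Only regular entries are handled here; subdirectories are skipped.
        if not stat.S_ISDIR(f.st_mode):
            print("Checking %s..." % f.filename)
            local_file_path = os.path.join(local_path, f.filename)
            # Download when the file is missing locally or the remote copy is newer.
            if ((not os.path.isfile(local_file_path)) or
                    (f.st_mtime > os.path.getmtime(local_file_path))):
                print("Downloading %s..." % f.filename)
                # preserve_mtime=True copies the remote timestamp to the local file,
                # so the next comparison does not depend on the two clocks agreeing.
                sftp.get(f.filename, local_file_path, preserve_mtime=True)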
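
The loop above handles a single directory. The question also asks for recursion and for optional removal of local files that no longer exist remotely. The following is a minimal sketch of one way to extend it, not a definitive implementation. The name sync_dir is borrowed from the question's usage example, but the signature here (an explicit sftp argument and a delete flag) is an assumption. It also assumes the entries returned by listdir_attr are regular files or directories, and it does not handle the case where a name is a file on one side and a directory on the other.

```python
import os
import posixpath
import shutil
import stat


def sync_dir(sftp, remote_dir, local_dir, delete=False):
    """Mirror remote_dir into local_dir and return the remote paths downloaded.

    sftp is expected to behave like pysftp.Connection: only
    listdir_attr(path) and get(remote, local, preserve_mtime=...) are used.
    """
    os.makedirs(local_dir, exist_ok=True)
    downloaded = []
    remote_names = set()

    for entry in sftp.listdir_attr(remote_dir):
        remote_names.add(entry.filename)
        # Remote paths always use forward slashes, whatever the local OS is.
        remote_item = posixpath.join(remote_dir, entry.filename)
        local_item = os.path.join(local_dir, entry.filename)

        if stat.S_ISDIR(entry.st_mode):
            # Recurse into subdirectories.
            downloaded += sync_dir(sftp, remote_item, local_item, delete)
        elif (not os.path.isfile(local_item)
              or entry.st_mtime > os.path.getmtime(local_item)):
            # Missing locally, or the remote copy is newer.
            sftp.get(remote_item, local_item, preserve_mtime=True)
            downloaded.append(remote_item)

    if delete:
        # Remove local entries that have no counterpart on the server.
        for name in os.listdir(local_dir):
            if name not in remote_names:
                stale = os.path.join(local_dir, name)
                if os.path.isdir(stale) and not os.path.islink(stale):
                    shutil.rmtree(stale)
                else:
                    os.remove(stale)

    return downloaded
```

With a real connection it could be called as sync_dir(sftp, '/remote/path', '/local/path', delete=True) inside the with pysftp.Connection(...) block. Because delete=True removes local files and directories, it is worth trying it against a scratch directory first.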
