Automating the Python package release process

Problem description

I've just started an open source Python project that I hope might be popular one day. At the moment, to release a new version I have to do a few things:

  1. Test all the things.
  2. Edit mypackage.VERSION variable, which setup.py imports from __init__
  3. Build packages and wheels with python setup.py sdist bdist_wheel
  4. Write a changelog entry to CHANGELOG file
  5. Commit my changes, echo some of that changelog
  6. Tag that commit as a release, copy that changelog entry over again.
  7. Drag in my built files so people can download them from the release
  8. Use Twine to push the packages up onto PyPI
  9. Test again on my staging server via PyPI.

If I had to sum up everything I hate about my project in nine bullet points, I think we'd be looking at a very similar list. What stings is that, apart from making up a new version number and writing the commit/changelog message, all of this is painfully dull.

Can I automate any of these tasks in such a way that I might be able to, for example, let GitHub CI do everything just from my commits?

I already have a decade of Python experience, and a bit of CI, but I'm very new to packaging Python and actively interacting with PyPI. I suspect I'm not the only person driven crazy by the manual repetition here, so I'm looking for tools (or services) that can make this process easier.


Solution

The following is my own opinionated take on your list. There is a certain range of automation you can achieve, and I'll try to provide a reasonable starting point, and then some hints on how you can go further from there.

Adopting this part should already get rid of most of the annoying manual work, and you can automate away more and more as the need arises. If you're not comfortable maintaining a good amount of CI code, you should start here.

Things you'll need are a CI (as you already noted) and a package manager. One thing you won't get around is pushing your changes and a new tag with git, so parts of steps 5 and 6 remain manual.

I'll use poetry to keep things concise and because I like it[1], but there are also other options. This will take care of steps 2, 3, 7, 8, and the unlisted step 10, "update my dependencies and test them for compatibility", which is incredibly annoying as soon as it turns out to be a problem.

The bad news when using poetry is that you'll need to move all packaging configuration into a new file, pyproject.toml. The good news is that you no longer need a separate setup.py, setup.cfg, MANIFEST.in, or requirements.txt, since pyproject.toml is the standardized configuration file for packaging and other tools, and poetry also has a walkthrough on how to port over all the relevant info.

Once the setup is ready, the new deployment workflow would be:

$ poetry update           # update dependencies, may be skipped
$ poetry version patch    # bump version (other rules: minor, major)
Bumping version from 1.1.2 to 1.1.3
# finalize git stuff, e.g. add -u, commit -m 'v1.1.3', tag v1.1.3, push
$ poetry publish --build  # build and publish to PyPI
Building my_django_lib (1.1.3)
 - Building sdist
 - Built my_django_lib-1.1.3.tar.gz

 - Building wheel
 - Built my_django_lib-1.1.3-py3-none-any.whl

Publishing my_django_lib (1.1.3) to PyPI
 - Uploading my_django_lib-1.1.3-py3-none-any.whl 100%
 - Uploading my_django_lib-1.1.3.tar.gz 100%
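
What the version bump does to the version string can be sketched in a few lines of Python. This is only an illustration, not poetry's actual implementation: the function name `bump` is hypothetical, and the sketch assumes plain MAJOR.MINOR.PATCH versions without pre-release suffixes.

```python
def bump(version: str, rule: str = "patch") -> str:
    """Return the next version for a plain MAJOR.MINOR.PATCH string.

    Hypothetical helper for illustration; pre-release or build suffixes
    are not handled and will raise a ValueError.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if rule == "major":
        return f"{major + 1}.0.0"
    if rule == "minor":
        return f"{major}.{minor + 1}.0"
    if rule == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump rule: {rule!r}")


print(bump("1.1.2"))           # 1.1.3
print(bump("1.1.2", "minor"))  # 1.2.0
```

Bumping the minor or major part resets the lower parts to zero, which is why the rule matters more than the number itself.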

This should already be a lot shorter than what you're currently doing. If you always execute the exact same git commands, are not afraid to automate a push, and take good care of your .gitignore file, feel free to add something like this function to your ~/.bashrc and call it instead:

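git_cord () {
  # read the version from pyproject.toml (needs GNU grep for -P)
  local version
  version=$(grep -Po '(?<=^version = ")[^"]+' pyproject.toml) || return 1
  git add -u
  git commit -m "v${version}"
  git tag "v${version}"
  # push the commit on the current branch first, then the new tag
  git push -u origin HEAD
  git push origin "v${version}"
}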
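
If you'd rather keep this helper in Python than in your shell profile, the same idea might look roughly like the sketch below. The names `read_version` and `release_commands` are hypothetical, the regex assumes a single-line `version = "..."` entry as poetry writes it, and the tag is assumed to carry a `v` prefix as in the workflow above. The commands are only assembled and printed here, not executed.

```python
import re
from typing import List


def read_version(pyproject_text: str) -> str:
    """Extract the first `version = "..."` entry from pyproject.toml text."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version field found")
    return match.group(1)


def release_commands(version: str, remote: str = "origin") -> List[List[str]]:
    """Build the git commands for a release as argument lists."""
    tag = f"v{version}"
    return [
        ["git", "add", "-u"],
        ["git", "commit", "-m", tag],
        ["git", "tag", tag],
        # push the current branch and the new tag in one go
        ["git", "push", remote, "HEAD", tag],
    ]


sample = '[tool.poetry]\nname = "my_django_lib"\nversion = "1.1.3"\n'
for command in release_commands(read_version(sample)):
    print(" ".join(command))
```

To actually run the commands you could pass each list to `subprocess.run(command, check=True)`, which stops at the first failing step.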

Getting started with gitlab-CI

The CI can in principle handle everything surrounding the deployment process, including version bumping and publishing. But the former requires that your CI can push to your repo (which has annoying side effects) and the latter that it can publish to PyPI (which is risky, and makes debugging the CI a pain). I think it's not unusual to prefer to do those two steps by hand, so this minimal approach will only handle steps 1 and 9. More extensive testing and build jobs can be included afterwards.

The correct setup of a CI depends on which one you plan to use. The list for github is long, so I'll instead focus on gitlab's built-in CI. It's free, has very little magic (which makes it comparatively portable), and the binaries for the CI runners are open, free, and actually documented, so you can debug your CI locally or start and connect new runners if the free ones don't cut it for you.

Here is a small .gitlab-ci.yml that you can put into your project root in order to run the tests. Every single job in the pipeline (skipping setup and install commands) should also be executable in your dev environment; keeping it that way makes for a better maintainer experience.

image: python:3.7-alpine

stages:
  - build
  - test

packaging:
  stage: build
  script:
    - pip install poetry
    - poetry build
  artifacts:
    paths: 
      - dist

pytest:
  stage: test
  script:
    - pip install dist/*.whl
    - pip install pytest
    - pytest

Setting up the build and test stages like this handles steps 1 and 9 in one swoop, while also running the test suite against the installed package instead of your source files. It will only work properly if you have a src-layout in your project, though, which makes local sources unimportable from the project root. Some info on why that would be a good idea here and here.
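
One way to convince yourself that the tests really exercise the installed wheel is a small guard like the following sketch. The helper name `is_inside` is hypothetical, and the paths in the demo are made up; in a real test you would pass the `__file__` of your imported package and the path of your src directory.

```python
from pathlib import Path


def is_inside(module_file: str, directory: str) -> bool:
    """Return True if module_file lives somewhere below directory."""
    module_path = Path(module_file).resolve()
    root = Path(directory).resolve()
    return root in module_path.parents


# Made-up example paths: a module imported from the checkout versus one
# imported from an installed wheel.
print(is_inside("/repo/src/my_django_lib/__init__.py", "/repo/src"))
print(is_inside("/venv/lib/site-packages/my_django_lib/__init__.py", "/repo/src"))
```

A test along the lines of `assert not is_inside(my_django_lib.__file__, "src")` would then fail as soon as the suite silently picks up the source tree instead of the installed package.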

Poetry can create a src-layout template you can move your code into with poetry new my_django_lib --src.

While there are tools out there that automatically create a changelog from commit messages, keeping a good changelog is one of those things that benefit greatly from being cared for by hand. So, my advice is no automation for step 4.

One way to think about it is that the manual CHANGELOG file contains information that is relevant to your users, and should only feature information like new features, important bugfixes, and deprecations.

More fine-grained information that might be important for contributors or plugin writers would be located in MRs, commit messages, or issue discussions, and should not make it into the CHANGELOG. You can try to collect it somehow, but navigating such an AUTOLOG is probably about as cumbersome as sifting through the primary sources I just mentioned.

So in short, the changelog-related parts of steps 5 and 6 can be skipped.

Adding CD doesn't change too much, except that you don't have to release by hand any more. You can still release with poetry in case the CI is down, buggy, or you don't want to wait for the pipeline to release a hotfix.

This would alter the workflow in the following way:

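  • everyday work
    • write code (can't avoid this one yet)
    • document progress in commit messages and/or MRs (I prefer MRs, even for my own changes, and squash all commits on merge)
    • push to gitlab / merge MRs
  • on release
    • create a tag, run poetry version and maybe poetry update
    • write release notes in the CHANGELOG
    • push to gitlab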

If you supply the secrets PYPI_USER and PYPI_PASSWORD, this addition to the .gitlab-ci.yml above takes care of the upload:

    stages:
      - build
      - test
      - release
    
    [...]  # packaging and pytest unchanged
    
    upload:
      stage: release
      only:
        - tags
        # Or alternatively "- /^v\d+\.\d+\.\d+/" if you also use non-release
        # tags, the regex only matches tags that look like this: "v1.12.0"
      script:
        - pip install poetry
        # uploads the files in dist/ that the packaging job stored as artifacts
        - poetry publish -u "${PYPI_USER}" -p "${PYPI_PASSWORD}"
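
Which tags the alternative pattern from the comment would let through can be checked quickly in Python. This is a sketch under assumptions: the pattern is taken to be `^v\d+\.\d+\.\d+`, and Python's `re` module only stands in for gitlab's own regex engine, which should behave the same for a pattern this simple.

```python
import re

# Assumed release-tag pattern, as suggested in the job's comment.
RELEASE_TAG = re.compile(r"^v\d+\.\d+\.\d+")


def is_release_tag(tag: str) -> bool:
    """Return True if the tag starts like a release tag, e.g. v1.12.0."""
    return RELEASE_TAG.match(tag) is not None


for tag in ["v1.12.0", "v1.1.3", "1.1.3", "nightly", "v2.0"]:
    print(tag, is_release_tag(tag))
```

The pattern is not anchored at the end, so a tag such as v1.2.3-rc1 would trigger the upload as well; append a `$` if only plain releases should be published.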
    

Some useful links:

  • .gitlab-ci.yml documentation
  • list of predefined variables, this is where most of gitlab CI's obscurities lie
  • the long version of my .gitlab-ci.yml template, with additional stages that may or may not be useful to you. It expects a src layout of your code.
    • lint: type checking, coverage, and code style
    • security: checking your own code and your dependencies for vulnerabilities
    • release.docs: public gitlab pages section where docs are served that are created automatically based on your docstrings
    • The build stage creates a wheelhouse from the poetry.lock file that can be used for installing dependencies later in favor of PyPI. This is a little faster, saves network bandwidth, and asserts the use of specific versions if you want to debug, but might be overkill and requires the use of a poetry pre-release.

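[1] Among other things, poetry also 1) handles the virtualenv for you, 2) creates a hashed lockfile in case you need reproducible builds, and 3) makes contributing easier, since you only have to run "poetry install" after cloning a repo and are ready to go.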
