How does Poetry handle binary dependencies? (especially numpy)

2022-01-10 00:00:00 python numpy conda python-poetry anaconda

Question

Until now I have been using conda for virtual environments and dependency management. However, some things do not work as expected when transferring my environment.yml file from my development machine to the production server. Now I would like to look into alternatives. Poetry seems nice, especially because

poetry also maintains a lock file, and it has a benefit over pipenv because it keeps track of which packages are subdependencies. (https://realpython.com/effective-python-environment/#poetry)

which might improve stability quite a bit. However, I am working on science-heavy projects (matrices, data science, machine learning), so in practice I need the scipy stack (e.g. numpy, pandas, scikit-learn).

Python became too slow for some pure computational workloads so numpy and scipy were born. [...] They are written in C and just wrapped as a python library.

Compiling such libraries brings a set of challenges since they (more or less) have to be compiled on your machine for maximum performance and proper linking with libraries like glibc.

Conda was introduced as an all-in-one solution to manage python environments for the scientific community.

[...] Instead of using a fragile process of compiling libraries on your machine, libraries are precompiled and just downloaded when you request them. Unfortunately, the solution comes with a caveat - conda does not use PyPI, the most popular index of python packages.

(https://modelpredict.com/python-dependency-management-tools#fnref:conda-compile-challenges)

As far as I know, this doesn't even do Conda justice, because it does quite a bit of optimization to get the most out of my CPU/GPU/architecture for numpy. (https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/#Myth-#6:-Now-that-pip-uses-wheels,-conda-is-no-longer-necessary)

https://numpy.org/install/ itself advises using conda, but also says that one can install via pip (and Poetry uses PyPI):

For users who know, from personal preference or reading about the main differences between conda and pip below, they prefer a pip/PyPI-based solution, we recommend:

[...] Use Poetry as the most well-maintained tool that provides a dependency resolver and environment management capabilities in a similar fashion as conda does.

I would like to get the stability of the poetry setup and the speed of the conda setup.

How does Poetry handle binary dependencies? Does it also take my hardware into account, as conda does?

If Poetry does not deliver in this regard, can I combine it with conda?


Answer

numpy provides several wheel files for different operating systems, CPU architectures, and Python versions. Wheel packages are precompiled, so the target system doesn't have to compile the package.
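
To see which of those variants applies to a given machine, it can help to look at the facts a wheel's tags encode. The following is a minimal sketch using only the standard library; the helper name describe_platform is made up for illustration, and the python_tag it builds is a simplification of the real tag rules.

```python
import platform
import sys
import sysconfig


def describe_platform() -> dict:
    """Collect the interpreter and platform facts that wheel tags are based on.

    Hypothetical helper for illustration; real installers derive a much
    longer, ordered list of compatible tags.
    """
    major, minor = sys.version_info[:2]
    # CPython is abbreviated "cp" in wheel tags; other implementations are
    # left unabbreviated here for simplicity.
    impl = "cp" if sys.implementation.name == "cpython" else sys.implementation.name
    return {
        "python_tag": f"{impl}{major}{minor}",    # e.g. "cp310"
        "platform": sysconfig.get_platform(),     # e.g. "linux-x86_64"
        "machine": platform.machine(),            # e.g. "x86_64" or "arm64"
    }


if __name__ == "__main__":
    for key, value in describe_platform().items():
        print(f"{key}: {value}")
```

Running this on the development machine and on the production server shows whether both would be offered the same kind of wheel.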

Poetry is able to choose the right wheel for you, depending on your system.
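
The idea behind that selection can be sketched roughly as follows. This is a simplified illustration, not Poetry's actual code: the function names wheel_tags and pick_wheel are hypothetical, the candidate filenames are only examples in the style of numpy's published wheels, and the supported-tag list is written by hand (real tooling can generate it, for example with packaging.tags.sys_tags() from the packaging library).

```python
from typing import Iterable, Optional, Sequence, Set, Tuple

Tag = Tuple[str, str, str]  # (python tag, abi tag, platform tag)


def wheel_tags(filename: str) -> Set[Tag]:
    """Return every tag triple a wheel filename claims to support.

    Wheel names follow name-version[-build]-python-abi-platform.whl, so the
    last three dash-separated fields are the tags. Each field may hold
    several dot-separated values (a "compressed tag set").
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel filename: {filename}")
    py, abi, plat = parts[-3:]
    return {
        (p, a, pl)
        for p in py.split(".")
        for a in abi.split(".")
        for pl in plat.split(".")
    }


def pick_wheel(filenames: Iterable[str], supported: Sequence[Tag]) -> Optional[str]:
    """Pick the wheel matching the most preferred supported tag, if any."""
    candidates = list(filenames)
    for tag in supported:  # ordered from most to least preferred
        for name in candidates:
            if tag in wheel_tags(name):
                return name
    return None  # no compatible wheel among the candidates


# Example filenames, assumed for illustration.
candidates = [
    "numpy-1.22.0-cp310-cp310-win_amd64.whl",
    "numpy-1.22.0-cp310-cp310-macosx_11_0_arm64.whl",
    "numpy-1.22.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
]

# Hand-written tag list for a 64-bit Linux machine running CPython 3.10.
linux_cp310 = [
    ("cp310", "cp310", "manylinux2014_x86_64"),
    ("cp310", "cp310", "linux_x86_64"),
]

if __name__ == "__main__":
    print(pick_wheel(candidates, linux_cp310))
```

Note that the tags describe the operating system, CPU architecture, and Python version only. Nothing in this scheme distinguishes between, say, two x86_64 machines with different CPU feature sets.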

That said, I would recommend using Poetry as long as you only need Python packages that are available on PyPI. As soon as you need other, non-Python tools, stick to conda. (Disclaimer: I'm one of the maintainers of Poetry.)

Also related: https://github.com/python-poetry/poetry/issues/1904
