埃拉托色尼筛网速度比较:Python vs Julia

2022-01-23 00:00:00 python performance julia

问题描述

所以我有一个用 Python 和 Julia 编写的 Eratosthenes 函数的小筛子,我正在比较运行时间.

So I have a little sieve of Eratosthenes function written in both Python and Julia, and I'm comparing run times.

这是 Python 代码:

Here is the Python code:

import time

def get_primes(n):
    numbers = set(range(n, 1, -1))
    primes = []
    while numbers:
        p = numbers.pop()
        primes.append(p)
        numbers.difference_update(set(range(p*2, n+1, p)))
    return primes

start = time.time()
get_primes(10000)
print time.time() - start

这里是 Julia 代码:

And here is the Julia code:

function get_primes(n)
        numbers = [2:n]
        primes = Int[]
        while numbers != []
                p = numbers[1]
                push!(primes, p)
                numbers = setdiff(numbers, [p*i for i=1:int(n/p)])
        end
        return primes
end

@time get_primes(10000);

Python 代码的运行时间约为 0.005 秒,而 Julia 代码的运行时间约为 0.5 秒,这意味着 Python 的运行速度大约快了 100 倍.这可能有一个完全合乎逻辑的原因,但我真的不知道我在做什么.

The Python code runs in about .005 seconds, while the Julia code takes about .5 seconds, so that means Python runs about 100x times faster. There's probably a perfectly logical reason for this, but I really have no idea what I'm doing.


解决方案

主要区别在于,在 Python 中,您分配一个集合,number,然后就地修改它,而在Julia,您在循环的每次迭代中分配一个新数组.另一个区别是您在 Python 中使用了一个集合,而在 Julia 中使用了一个数组(Python 称之为列表").让我们更改 Julia 代码以消除这两个差异:

The main difference is that in Python you're allocating a single set, number, and then modifying it in place, while in Julia, you're allocating a new array on every iteration of the loop. Another difference is that you're using a set in Python and an array in Julia (what Python calls a "list"). Let's change the Julia code to eliminate these two differences:

function get_primes(n)
    numbers = IntSet(2:n)
    primes = Int[]
    while !isempty(numbers)
        p = shift!(numbers)
        push!(primes, p)
        setdiff!(numbers, p:p:n)
    end
    return primes
end

通过此更改,在我的系统上,Julia 代码比 Python 代码快 10 倍(0.0005 对 0.0048 秒).请注意,我还使用了一个范围作为 setdiff! 的第二个参数——它比使用推导式构造数组更简洁和更快(1.5 倍).

With this change, on my system, the Julia code is 10x faster than the Python code (0.0005 vs. 0.0048 seconds). Note that I also used a range as the second argument of the setdiff! – it's both terser and faster (1.5x) than constructing an array with a comprehension.

在 Julia 中写这个更惯用的风格是使用一个布尔数组,打开和关闭它们:

A somewhat more idiomatic style of writing this in Julia would be to use an array of booleans, turning them on and off:

function eratosthenes(n)
    primes = fill(true,n)
    primes[1] = false
    for p = 2:n
        primes[p] || continue
        for i = 2:div(n,p)
            primes[p*i] = false
        end
    end
    find(primes)
end

最后的 find 返回向量非零(即真)的索引.在我的机器上,这比其他 Julia 版本快 5 倍(0.0001 秒),比 Python 版本快 45 倍.

The find at the end returns the indices where a vector is non-zero (i.e. true). On my machine, this is 5x faster (0.0001 seconds) than the other Julia version and 45x faster than the Python version.

相关文章