Initializing multiple NumPy arrays (multiple assignment), like MATLAB deal()

Question

I was unable to find anything describing how to do this, which leads me to believe I'm not doing this in the proper idiomatic Python way. Advice on the 'proper' Python way to do this would also be appreciated.

I have a bunch of variables for a datalogger I'm writing (arbitrary logging length, with a known maximum length). In MATLAB, I would initialize them all as 1-D arrays of zeros of length n, with n bigger than the number of entries I would ever see, assign each individual element with variable(measurement_no) = data_point in the logging loop, and trim off the extraneous zeros when the measurement was over. The initialization would look like this:

[dData gData cTotalEnergy cResFinal etc] = deal(zeros(n,1));

Is there a way to do this in Python/NumPy so that I don't have to put each variable on its own line:

dData = np.zeros(n)
gData = np.zeros(n)
etc.

I would also prefer not to just make one big matrix, because keeping track of which column is which variable is unpleasant. Perhaps the solution is to make the (length x numvars) matrix and assign the column slices out to individual variables?

Assume I'm going to have a lot of vectors of the same length by the time this is over; e.g., my post-processing takes each log file, calculates a bunch of separate metrics (>50), stores them, and repeats until all the logs are processed. Then I generate histograms, means/maxes/sigmas/etc. for all the various metrics I computed. Since initializing 50+ vectors is clearly not easy in Python, what's the best way (cleanest code and decent performance) of doing this?


Answer

If you're really motivated to do this in a one-liner, you could create an (n_vars, ...) array of zeros, then unpack it along the first dimension:

import numpy as np

a, b, c = np.zeros((3, 5))   # each name gets one row of length 5
print(a is b)
# False
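
As a rough sketch of how the unpacked rows might behave in a logging loop like the one in the question: each name is a separate view onto one row of the same zero-filled block, so writing to one does not touch the other. The names n, d_data, g_data and the sample values are made up for illustration.

```python
import numpy as np

n = 5  # hypothetical maximum log length
d_data, g_data = np.zeros((2, n))  # unpack along the first axis

# Pretend only three measurements arrived before logging stopped.
samples = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]
count = 0
for d_value, g_value in samples:
    d_data[count] = d_value
    g_data[count] = g_value
    count += 1

# Trim the unused trailing zeros.
d_data = d_data[:count]
g_data = g_data[:count]

print(d_data)  # [1. 2. 3.]
print(g_data)  # [10. 20. 30.]
```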

Another option is to use a list comprehension or a generator expression:

a, b, c = [np.zeros(5) for _ in range(3)]   # list comprehension
d, e, f = (np.zeros(5) for _ in range(3))   # generator expression
print(a is b, d is e)
# False False

Be careful, though! You might think that using the * operator on a list or tuple containing your call to np.zeros() would achieve the same thing, but it doesn't:

h, i, j = (np.zeros(5),) * 3
print(h is i)
# True

This is because the expression inside the tuple is evaluated first. np.zeros(5) is therefore called only once, and each element of the repeated tuple ends up being a reference to the same array. This is the same reason why you can't just use a = b = c = np.zeros(5).
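
To make the consequence concrete, here is a small sketch (array length 3 chosen arbitrarily) showing that a write through one aliased name is visible through the others, while arrays built by a generator expression stay independent:

```python
import numpy as np

h, i, j = (np.zeros(3),) * 3   # one array, three names
h[0] = 1.0
print(i)                       # [1. 0. 0.]  (i changed too)

a, b, c = (np.zeros(3) for _ in range(3))  # three separate arrays
a[0] = 1.0
print(b)                       # [0. 0. 0.]  (b is unaffected)
```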

Unless you really need to assign a large number of zero-filled array variables and you care deeply about making your code compact (!), I would recommend initialising them on separate lines for readability.
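
For the 50+ metrics case mentioned in the question, one alternative that is not part of the answer above is to keep the arrays in a dict keyed by name rather than in separate variables. This is only a sketch: the metric names are borrowed from the question's MATLAB line, and n and the stored value are placeholders.

```python
import numpy as np

n = 4  # placeholder maximum length
names = ["dData", "gData", "cTotalEnergy", "cResFinal"]

# The dict comprehension calls np.zeros once per name,
# so every entry is its own independent array.
log = {name: np.zeros(n) for name in names}

log["dData"][0] = 3.5          # write one data point
print(log["dData"][0], log["gData"][0])
```

Looking arrays up by name avoids tracking column indices in one big matrix, at the cost of writing log["dData"] instead of dData.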
