Does assignment with advanced indexing copy array data?

2022-01-20 00:00:00 python numpy copy

Problem description

I am slowly trying to understand the difference between views and copies in numpy, as well as mutable vs. immutable types.

If I access part of an array with 'advanced indexing', it is supposed to return a copy. This seems to be true:

In [1]: import numpy as np
In [2]: a = np.zeros((3,3))
In [3]: b = np.array(np.identity(3), dtype=bool)
In [4]: c = a[b]
In [5]: c[:] = 9
In [6]: a
Out[6]:
array([[ 0., 0., 0.],
       [ 0., 0., 0.],
       [ 0., 0., 0.]])

Since c is just a copy, it does not share data with a, and changing it does not mutate a. However, this is what confuses me:

In [7]: a[b] = 1
In [8]: a
Out[8]:
array([[ 1., 0., 0.],
       [ 0., 1., 0.],
       [ 0., 0., 1.]])

So it seems that even when I use advanced indexing, assignment still treats the thing on the left as a view. Clearly the a in line 2 is the same object/data as the a in line 6, since mutating c has no effect on it.

So my question: is the a in line 8 the same object/data as before (not counting the diagonal, of course), or is it a copy? In other words, was a's data copied to a new a, or was its data mutated in place?

For example, is it like this:

x = [1,2,3]
x += [4]

or like this:

y = (1,2,3)
y += (4,)

I don't know how to check for this, because in either case a.flags.owndata is True. Please feel free to elaborate or answer a different question if I'm thinking about this in a confusing way.

Solution

When you do c = a[b], a.__getitem__ is called with b as its only argument, and whatever it returns is assigned to c.
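A minimal sketch of the read side, assuming a NumPy version that provides np.shares_memory. It reuses a and b from the question; the other names are made up for illustration. It suggests that the bracket syntax and an explicit __getitem__ call give the same result, and that the result of a boolean-mask index does not share memory with a, unlike a basic slice:

```python
import numpy as np

a = np.zeros((3, 3))
b = np.identity(3, dtype=bool)      # boolean mask selecting the diagonal

c = a[b]                            # sugar for a.__getitem__(b)
c_explicit = a.__getitem__(b)       # the same call written out

same_result = np.array_equal(c, c_explicit)
shares = np.shares_memory(a, c)     # False: advanced indexing made a copy

c[:] = 9                            # mutate the copy only
a_unchanged = not a.any()           # a is still all zeros

# For contrast, basic slicing returns a view onto a's buffer.
v = a[0:2]
view_shares = np.shares_memory(a, v)

print(same_result, shares, a_unchanged, view_shares)
```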

When you do a[b] = c, a.__setitem__ is called with b and c as arguments, and whatever it returns is silently discarded.
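One way to check that the assignment mutates a in place is to compare the object identity and the address of the data buffer before and after, and to watch a view taken beforehand. This is only a sketch: the variable names are illustrative, and it reads the buffer address from __array_interface__:

```python
import numpy as np

a = np.zeros((3, 3))
b = np.identity(3, dtype=bool)

view = a[:]                                       # basic slice: a view onto a's buffer
id_before = id(a)
addr_before = a.__array_interface__["data"][0]    # address of a's data buffer

result = a.__setitem__(b, 1)                      # what `a[b] = 1` calls

addr_after = a.__array_interface__["data"][0]

print(result)                        # None: there is nothing useful to keep
print(id(a) == id_before)            # still the same Python object
print(addr_after == addr_before)     # still the same underlying buffer
print(view)                          # the earlier view sees the new diagonal
```

In terms of the list/tuple analogy from the question, this behaves like x += [4] on a list (mutation in place) rather than y += (4,) on a tuple (rebinding to a new object).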

So despite sharing the a[b] syntax, the two expressions do different things. You could subclass ndarray, override these two methods, and have them behave differently. By default in numpy, the former returns a copy (if b is an array), while the latter modifies a in place.
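To make the dispatch visible, here is a small sketch of such a subclass. LoggingArray and its calls list are hypothetical names invented for this example, not part of numpy. Each method records that it was called and then defers to the default ndarray behavior:

```python
import numpy as np

class LoggingArray(np.ndarray):
    """Hypothetical ndarray subclass that records which hook is invoked."""

    calls = []  # class-level log, for illustration only

    def __getitem__(self, index):
        LoggingArray.calls.append("getitem")
        return super().__getitem__(index)

    def __setitem__(self, index, value):
        LoggingArray.calls.append("setitem")
        super().__setitem__(index, value)

a = np.zeros((3, 3)).view(LoggingArray)   # reinterpret the array as the subclass
b = np.identity(3, dtype=bool)

c = a[b]      # read: routed to __getitem__
a[b] = 1      # write: routed to __setitem__

print(LoggingArray.calls)
```

Because c was produced before the assignment and is a copy, it keeps its zeros while the diagonal of a becomes 1.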
