Does assignment with advanced indexing copy array data?
Question
I am slowly trying to understand the difference between views and copies in numpy, as well as mutable vs. immutable types.
If I access part of an array with 'advanced indexing', it is supposed to return a copy. This seems to be true:
In [1]: import numpy as np
In [2]: a = np.zeros((3,3))
In [3]: b = np.array(np.identity(3), dtype=bool)
In [4]: c = a[b]
In [5]: c[:] = 9
In [6]: a
Out[6]:
array([[ 0., 0., 0.],
       [ 0., 0., 0.],
       [ 0., 0., 0.]])
Since c is just a copy, it does not share data with a, and changing it does not mutate a. However, this is what confuses me:
In [7]: a[b] = 1
In [8]: a
Out[8]:
array([[ 1., 0., 0.],
       [ 0., 1., 0.],
       [ 0., 0., 1.]])
So it seems that even when I use advanced indexing, assignment still treats the thing on the left as a view. Clearly the a in line 2 is the same object/data as the a in line 6, since mutating c has no effect on it.
So my question: is the a in line 8 the same object/data as before (not counting the diagonal, of course), or is it a copy? In other words, was a's data copied to the new a, or was its data mutated in place?
For example, is it like this:
x = [1,2,3]
x += [4]
or like this:
y = (1,2,3)
y += (4,)
I don't know how to check for this because in either case a.flags.owndata is True. Please feel free to elaborate or answer a different question if I'm thinking about this in a confusing way.
Answer
When you do c = a[b], a.__getitem__ is called with b as its only argument, and whatever gets returned is assigned to c.
When you do a[b] = c, a.__setitem__ is called with b and c as arguments, and whatever gets returned is silently discarded.
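To make the dispatch concrete, here is a minimal sketch in plain Python. The Recorder class is hypothetical and exists only for illustration; it is not part of numpy. It simply logs which special method each form of the a[b] syntax triggers.

```python
class Recorder:
    """Hypothetical container that records which special method is invoked."""

    def __init__(self):
        self.calls = []

    def __getitem__(self, key):
        # Runs for reads such as `c = r[key]`.
        self.calls.append(("__getitem__", key))
        return "value from __getitem__"

    def __setitem__(self, key, value):
        # Runs for assignments such as `r[key] = value`.
        self.calls.append(("__setitem__", key, value))
        return "ignored"  # Python discards whatever is returned here


r = Recorder()
c = r["b"]   # read: dispatches to r.__getitem__("b")
r["b"] = 1   # assignment: dispatches to r.__setitem__("b", 1)
print(r.calls)
print(c)
```

The same bracket syntax reaches two unrelated methods, so nothing forces the read path and the write path to agree on copy versus in-place behavior.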
So despite sharing the a[b] syntax, the two expressions do different things. You could subclass ndarray, override these two methods, and have them behave differently. By default in numpy, the former returns a copy (if b is an array), but the latter modifies a in place.
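One possible way to check this yourself, since a.flags.owndata cannot tell the two cases apart: keep a second reference to the array and compare the address of its data buffer before and after the assignment. The names alias, addr_before and addr_after are made up for this sketch; np.shares_memory and the __array_interface__ attribute are existing numpy features. The last few lines replay the list/tuple analogy from the question.

```python
import numpy as np

a = np.zeros((3, 3))
b = np.identity(3, dtype=bool)

alias = a                                        # second name for the same array object
addr_before = a.__array_interface__["data"][0]   # address of a's data buffer

c = a[b]        # boolean-mask read: a new array with its own data
c[:] = 9        # changes c only
print(np.shares_memory(a, c))   # False: c is a copy

a[b] = 1        # boolean-mask assignment: writes into a's existing buffer
addr_after = a.__array_interface__["data"][0]

print(alias is a)                 # still the same object
print(addr_before == addr_after)  # still the same buffer
print(alias)                      # the diagonal change is visible through alias

# The analogy from the question: list += mutates, tuple += rebinds the name.
x = [1, 2, 3]
x_old = x
x += [4]
y = (1, 2, 3)
y_old = y
y += (4,)
print(x is x_old, y is y_old)
```

Under these checks, a[b] = 1 behaves like the list case: the existing object is mutated in place and no new array is created.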