Decomposition of the essential matrix: validating the four possible solutions for R and T
I want to do some Structure-from-Motion using OpenCV. So far I have the fundamental matrix and the essential matrix. Given the essential matrix, I am computing its SVD to get R and T.
My problem is that I have 2 possible solutions for R and 2 possible solutions for T, which gives 4 candidate solutions for the overall pose, and only one of them is correct. How can I find the correct solution?
Here is my code:
    private void calculateRT(Mat E, Mat R, Mat T) {
        Mat w = new Mat();
        Mat u = new Mat();
        Mat vt = new Mat();

        // Force the singular values of E to (1, 1, 0), as expected for an essential matrix
        Mat diag = new Mat(3, 3, CvType.CV_64FC1);
        double[] diagVal = {1, 0, 0, 0, 1, 0, 0, 0, 0};
        diag.put(0, 0, diagVal);
        Mat tmp = new Mat();
        Mat newE = new Mat(3, 3, CvType.CV_64FC1);
        Core.SVDecomp(E, w, u, vt, Core.DECOMP_SVD);
        // gemm computes alpha*src1*src2 + beta*src3, so u * diag * vt takes two calls
        Core.gemm(u, diag, 1, new Mat(), 0, tmp);
        Core.gemm(tmp, vt, 1, new Mat(), 0, newE);
        Core.SVDecomp(newE, w, u, vt, Core.DECOMP_SVD);
        publishProgress("U: " + u.dump());
        publishProgress("W: " + w.dump());
        publishProgress("vt:" + vt.dump());

        double[] W_Values = {0, -1, 0, 1, 0, 0, 0, 0, 1};
        Mat W = new Mat(new Size(3, 3), CvType.CV_64FC1);
        W.put(0, 0, W_Values);
        // Transpose of W (the original array was missing a comma between 0 and -1)
        double[] Wt_values = {0, 1, 0, -1, 0, 0, 0, 0, 1};
        Mat Wt = new Mat(new Size(3, 3), CvType.CV_64FC1);
        Wt.put(0, 0, Wt_values);

        Mat R1 = new Mat();
        Mat R2 = new Mat();
        // R1 = u * W * vt, R2 = u * Wt * vt (2 possible solutions for R)
        Core.gemm(u, W, 1, new Mat(), 0, tmp);
        Core.gemm(tmp, vt, 1, new Mat(), 0, R1);
        Core.gemm(u, Wt, 1, new Mat(), 0, tmp);
        Core.gemm(tmp, vt, 1, new Mat(), 0, R2);
        publishProgress("R1: " + R1.dump());
        publishProgress("R2: " + R2.dump());

        // +- T (2 possible solutions for T)
        Mat T1 = new Mat();
        Mat T2 = new Mat();
        // T1 = third column of u
        u.col(2).copyTo(T1);
        publishProgress("T1: " + T1.dump());
        Core.multiply(T1, new Scalar(-1.0, -1.0, -1.0), T2);
        // TODO Here I have to find the correct combination for R1 R2 and T1 T2
    }
Recommended answer
There is a theoretical ambiguity when reconstructing the relative Euclidean pose of two cameras from their essential matrix. This ambiguity is linked to the fact that, given a 2D point in an image, the classic pinhole camera model cannot tell whether the corresponding 3D point is in front of the camera or behind it. To remove this ambiguity, you need to know one point correspondence between the images: since the two 2D points are assumed to be the projections of a single 3D point lying in front of both cameras (it is visible in both images), this correspondence lets you choose the right R and T.
For that purpose, one method is explained in § 6.1.4 (p. 47) of the following PhD thesis: "Geometry, constraints and computation of the trifocal tensor", by C. Ressl. The outline of this method is given below. I'll denote the two corresponding 2D points by x1 and x2, the two camera matrices by K1 and K2, and the essential matrix by E12.
i. Compute the SVD of the essential matrix, E12 = U * S * V'. If det(U) < 0, set U = -U. If det(V) < 0, set V = -V.
ii. Define W = [0,-1,0; 1,0,0; 0,0,1], R2 = U * W * V' and T2 = third column of U.
iii. Define M = [ R2'*T2 ]x, X1 = M * inv(K1) * x1 and X2 = M * R2' * inv(K2) * x2.
iv. If X1(3) * X2(3) < 0, set R2 = U * W' * V' and recompute M and X1.
v. If X1(3) < 0, set T2 = -T2.
vi. Define P1_E = K1 * [ I | 0 ] and P2_E = K2 * [ R2 | T2 ].
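For comparison with the outline above, here is a minimal sketch of a different but widely used selection rule, the positive-depth (cheirality) test. It is not the method from the thesis: it builds the same four candidates from U and V', then keeps the one for which the triangulated point has positive depth in both cameras. The sketch assumes that U and V' come from an SVD computed elsewhere, that d1 = inv(K1) * x1 and d2 = inv(K2) * x2 are already computed, and that the second camera is K2 * [ R | T ] as in step vi. The names `PoseSelector`, `Pose`, `depths` and `select` are hypothetical, and plain arrays stand in for OpenCV `Mat` objects.

```java
/** Sketch: pick (R, T) among the four candidates by requiring positive depth in both views. */
final class PoseSelector {
    static final double[][] W = {{0, -1, 0}, {1, 0, 0}, {0, 0, 1}};

    /** Hypothetical holder for the selected pose. */
    static final class Pose {
        final double[][] r;
        final double[] t;
        Pose(double[][] r, double[] t) { this.r = r; this.t = t; }
    }

    static double[][] mul(double[][] a, double[][] b) {
        double[][] c = new double[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                for (int k = 0; k < 3; k++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    static double[] mul(double[][] a, double[] v) {
        double[] out = new double[3];
        for (int i = 0; i < 3; i++)
            for (int k = 0; k < 3; k++)
                out[i] += a[i][k] * v[k];
        return out;
    }

    static double[][] transpose(double[][] a) {
        double[][] t = new double[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                t[j][i] = a[i][j];
        return t;
    }

    static double[][] negate(double[][] a) {
        double[][] n = new double[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                n[i][j] = -a[i][j];
        return n;
    }

    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    static double det(double[][] m) {
        return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
             - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
             + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
    }

    /** Least-squares depths {lambda1, lambda2} solving lambda2 * d2 = lambda1 * R * d1 + T. */
    static double[] depths(double[][] r, double[] t, double[] d1, double[] d2) {
        double[] a = mul(r, d1);
        double aa = dot(a, a), bb = dot(d2, d2), ab = dot(a, d2);
        double at = dot(a, t), bt = dot(d2, t);
        double den = aa * bb - ab * ab; // zero only when the two rays are parallel
        return new double[] {(ab * bt - bb * at) / den, (aa * bt - ab * at) / den};
    }

    /** Returns the first candidate with positive depth in both cameras, or null if none passes. */
    static Pose select(double[][] u, double[][] vt, double[] d1, double[] d2) {
        if (det(u) < 0) u = negate(u);   // step i: make U and V' proper rotations
        if (det(vt) < 0) vt = negate(vt);
        double[][][] rotations = {mul(mul(u, W), vt), mul(mul(u, transpose(W)), vt)};
        double[] u3 = {u[0][2], u[1][2], u[2][2]}; // third column of U
        for (double[][] r : rotations) {
            for (double sign : new double[] {1, -1}) {
                double[] t = {sign * u3[0], sign * u3[1], sign * u3[2]};
                double[] lambda = depths(r, t, d1, d2);
                if (lambda[0] > 0 && lambda[1] > 0) return new Pose(r, t);
            }
        }
        return null; // noisy or degenerate correspondence
    }
}
```

With real data a single correspondence can be noisy, so a more robust variant would run the same test over several correspondences and keep the candidate that passes most often.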
The notation ' denotes the transpose, and the notation [.]x used in step iii corresponds to the skew-symmetric operator. Applying the skew-symmetric operator to a 3x1 vector e = [e_1; e_2; e_3] results in the following (see the Wikipedia article on the cross product):
[e]x = [0,-e_3,e_2; e_3,0,-e_1; -e_2,e_1,0]
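As a small illustration of this operator, the sketch below builds [e]x from a 3-vector using plain Java arrays. The class and method names (`Skew`, `skew`, `apply`) are hypothetical and only serve the example.

```java
/** Sketch of the skew-symmetric (cross-product) operator on plain arrays. */
final class Skew {
    /** Returns [e]x, the 3x3 matrix for which [e]x * v equals the cross product e x v. */
    static double[][] skew(double[] e) {
        return new double[][] {
            {0, -e[2], e[1]},
            {e[2], 0, -e[0]},
            {-e[1], e[0], 0}
        };
    }

    /** Multiplies a 3x3 matrix by a 3-vector. */
    static double[] apply(double[][] m, double[] v) {
        double[] out = new double[3];
        for (int i = 0; i < 3; i++)
            for (int k = 0; k < 3; k++)
                out[i] += m[i][k] * v[k];
        return out;
    }
}
```

For example, `Skew.apply(Skew.skew(e), v)` with e = (1, 2, 3) and v = (4, 5, 6) gives (-3, 6, -3), which is the cross product e x v.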
Finally, note that the norm of T2 will always be 1, since it is one of the columns of an orthogonal matrix. This means that you won't be able to recover the true distance between the two cameras. To do that, you need to know the true distance between two points in the scene and use it to calculate the true distance between the cameras.
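The scale recovery described here can be sketched as follows, assuming two scene points have already been reconstructed with the unit-norm T2 and that their real-world distance is known. The class and method names (`BaselineScale`, `scaleFactor`, `scaled`) are hypothetical, and plain arrays are used instead of OpenCV types.

```java
/** Sketch: recover metric scale from one known scene distance. */
final class BaselineScale {
    static double distance(double[] p, double[] q) {
        double dx = p[0] - q[0], dy = p[1] - q[1], dz = p[2] - q[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    /**
     * p and q are two scene points reconstructed with the unit-norm translation;
     * trueDistance is their known real-world distance.
     */
    static double scaleFactor(double[] p, double[] q, double trueDistance) {
        return trueDistance / distance(p, q);
    }

    /** Scales a vector, e.g. the unit-norm T2, by the recovered factor. */
    static double[] scaled(double[] v, double s) {
        return new double[] {s * v[0], s * v[1], s * v[2]};
    }
}
```

The norm of the scaled T2 is then the distance between the cameras in the same unit as `trueDistance`, and the reconstructed points can be scaled by the same factor.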