Converting 2D image coordinates to 3D world coordinates with z = 0
- OpenCV => 3.2
- Operating System / Platform => Windows 64-bit
- Compiler => Visual Studio 2015
I am currently working on my project, which involves vehicle detection and tracking, and estimating and optimizing a cuboid around the vehicle. For that, I have accomplished detection and tracking of vehicles, and I need to find the 3-D world coordinates of the image points of the edges of the bounding boxes of the vehicles, then estimate the world coordinates of the edges of the cuboid, and then project it back to the image to display it.
So, I am new to computer vision and OpenCV, but to my knowledge, I just need 4 points on the image, need to know the world coordinates of those 4 points, and can use solvePnP in OpenCV to get the rotation and translation vectors (I already have the camera matrix and distortion coefficients). Then, I need to use Rodrigues to transform the rotation vector into a rotation matrix and concatenate it with the translation vector to get my extrinsic matrix, and then multiply the camera matrix by the extrinsic matrix to get my projection matrix. Since my z coordinate is zero, I can drop the third column of the projection matrix, which gives the homography matrix mapping 3D world points on the z = 0 plane to 2D image points. I then find the inverse of this homography matrix, which maps 2D image points back to 3D world points. After that, I multiply an image point [x, y, 1]^T by the inverse homography matrix to get [wX, wY, w]^T, and divide the entire vector by the scalar w to get [X, Y, 1], which gives me the X and Y values of the world coordinates.
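(For reference, dropping the third column works because a world point on the z = 0 plane projects as s [x, y, 1]^T = K [r1 | r2 | r3 | t] [X, Y, 0, 1]^T = K [r1 | r2 | t] [X, Y, 1]^T; the third column of the projection matrix only multiplies z = 0 and drops out, leaving the plane homography H = K [r1 | r2 | t].)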
My code is as follows:
#include "opencv2/opencv.hpp"
#include <stdio.h>
#include <iostream>
#include <sstream>
#include <math.h>
#include <conio.h>
using namespace cv;
using namespace std;
Mat cameraMatrix, distCoeffs, rotationVector, rotationMatrix,
translationVector,extrinsicMatrix, projectionMatrix, homographyMatrix,
inverseHomographyMatrix;
Point point;
vector<Point2d> image_points;
vector<Point3d> world_points;
int main()
{
    // Load the intrinsic calibration from file
    FileStorage fs1("intrinsics.yml", FileStorage::READ);
    fs1["camera_matrix"] >> cameraMatrix;
    cout << "Camera Matrix: " << cameraMatrix << endl << endl;
    fs1["distortion_coefficients"] >> distCoeffs;
    cout << "Distortion Coefficients: " << distCoeffs << endl << endl;

    // Hard-coded image coordinates of the four known world points
    image_points.push_back(Point2d(275, 204));
    image_points.push_back(Point2d(331, 204));
    image_points.push_back(Point2d(331, 308));
    image_points.push_back(Point2d(275, 308));
    cout << "Image Points: " << image_points << endl << endl;

    // Corresponding world coordinates on the z = 0 plane (first point is the origin)
    world_points.push_back(Point3d(0.0, 0.0, 0.0));
    world_points.push_back(Point3d(1.775, 0.0, 0.0));
    world_points.push_back(Point3d(1.775, 4.620, 0.0));
    world_points.push_back(Point3d(0.0, 4.620, 0.0));
    cout << "World Points: " << world_points << endl << endl;

    // Recover the camera pose from the 2D-3D correspondences
    solvePnP(world_points, image_points, cameraMatrix, distCoeffs, rotationVector, translationVector);
    cout << "Rotation Vector: " << endl << rotationVector << endl << endl;
    cout << "Translation Vector: " << endl << translationVector << endl << endl;
    Rodrigues(rotationVector, rotationMatrix);
    cout << "Rotation Matrix: " << endl << rotationMatrix << endl << endl;

    // Build the extrinsic matrix [R | t] and the projection matrix P = K [R | t]
    hconcat(rotationMatrix, translationVector, extrinsicMatrix);
    cout << "Extrinsic Matrix: " << endl << extrinsicMatrix << endl << endl;
    projectionMatrix = cameraMatrix * extrinsicMatrix;
    cout << "Projection Matrix: " << endl << projectionMatrix << endl << endl;

    // Drop the third column of P to get the homography for the z = 0 plane
    double p11 = projectionMatrix.at<double>(0, 0),
        p12 = projectionMatrix.at<double>(0, 1),
        p14 = projectionMatrix.at<double>(0, 3),
        p21 = projectionMatrix.at<double>(1, 0),
        p22 = projectionMatrix.at<double>(1, 1),
        p24 = projectionMatrix.at<double>(1, 3),
        p31 = projectionMatrix.at<double>(2, 0),
        p32 = projectionMatrix.at<double>(2, 1),
        p34 = projectionMatrix.at<double>(2, 3);
    homographyMatrix = (Mat_<double>(3, 3) << p11, p12, p14, p21, p22, p24, p31, p32, p34);
    cout << "Homography Matrix: " << endl << homographyMatrix << endl << endl;
    inverseHomographyMatrix = homographyMatrix.inv();
    cout << "Inverse Homography Matrix: " << endl << inverseHomographyMatrix << endl << endl;

    // Map the first image point back to world coordinates
    Mat point2D = (Mat_<double>(3, 1) << image_points[0].x, image_points[0].y, 1);
    cout << "First Image Point" << point2D << endl << endl;
    Mat point3Dw = inverseHomographyMatrix * point2D;
    cout << "Point 3D-W : " << point3Dw << endl << endl;
    double w = point3Dw.at<double>(2, 0);
    cout << "W: " << w << endl << endl;
    Mat matPoint3D;
    divide(w, point3Dw, matPoint3D);
    cout << "Point 3D: " << matPoint3D << endl << endl;
    _getch();
    return 0;
}
I have got the image coordinates of the four known world points and hard-coded them for simplification. image_points
contains the image coordinates of the four points and world_points
contains their world coordinates. I am treating the first world point as the origin (0, 0, 0) of the world axes and using known distances to calculate the coordinates of the other three points. Now, after calculating the inverse homography matrix, I multiplied it by [image_points[0].x, image_points[0].y, 1]^T, which corresponds to the world coordinate (0, 0, 0). Then I divide the result by the third component w to get [X, Y, 1]. But after printing out the values of X and Y, it turns out they are not 0 and 0 respectively. What am I doing wrong?
The output of my code is like this:
Camera Matrix: [517.0036881709533, 0, 320;
0, 517.0036881709533, 212;
0, 0, 1]
Distortion Coefficients: [0.1128663679798094;
-1.487790079922432;
0;
0;
2.300571896761067]
Image Points: [275, 204;
331, 204;
331, 308;
275, 308]
World Points: [0, 0, 0;
1.775, 0, 0;
1.775, 4.62, 0;
0, 4.62, 0]
Rotation Vector:
[0.661476468596541;
-0.02794460022559267;
0.01206996342819649]
Translation Vector:
[-1.394495345140898;
-0.2454153722672731;
15.47126945512652]
Rotation Matrix:
[0.9995533907649279, -0.02011656447351923, -0.02209848058392758;
0.002297501163799448, 0.7890323093017149, -0.6143474069013439;
0.02979497438726573, 0.6140222623910194, 0.7887261380159]
Extrinsic Matrix:
[0.9995533907649279, -0.02011656447351923, -0.02209848058392758, -1.394495345140898;
0.002297501163799448, 0.7890323093017149, -0.6143474069013439, -0.2454153722672731;
0.02979497438726573, 0.6140222623910194, 0.7887261380159, 15.47126945512652]
Projection Matrix:
[526.3071813531748, 186.086785938988, 240.9673682002232, 4229.846989065414;
7.504351145361707, 538.1053336219271, -150.4099339268854, 3153.028471890794;
0.02979497438726573, 0.6140222623910194, 0.7887261380159, 15.47126945512652]
Homography Matrix:
[526.3071813531748, 186.086785938988, 4229.846989065414;
7.504351145361707, 538.1053336219271, 3153.028471890794;
0.02979497438726573, 0.6140222623910194, 15.47126945512652]
Inverse Homography Matrix:
[0.001930136511648154, -8.512427241879318e-05, -0.5103513244724983;
-6.693679705844383e-06, 0.00242178892313387, -0.4917279870709287;
-3.451449134581896e-06, -9.595179260534558e-05, 0.08513443835773901]
First Image Point[275;
204;
1]
Point 3D-W : [0.003070864657310213;
0.0004761913292736786;
0.06461112415423849]
W: 0.0646111
Point 3D: [21.04004290792539;
135.683117651025;
1]
Answer
Your reasoning is sound, but you are making a mistake in the last division... or am I missing something?
Your result before the division by W is:
Point 3D-W :
[0.003070864657310213;
0.0004761913292736786;
0.06461112415423849]
Now we need to normalize this by dividing all the coordinates by W (the 3rd element of the array), as you described in your question. So:
Point 3D-W Normalized =
[0.003070864657310213 / 0.06461112415423849;
0.0004761913292736786 / 0.06461112415423849;
0.06461112415423849 / 0.06461112415423849]
Result:
Point 3D-W Normalized =
[0.047528420183179314;
0.007370113668614144;
1.0]
Which is damn close to [0,0].
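For what it's worth, the likely culprit in the posted code is the call divide(w, point3Dw, matPoint3D): in OpenCV, divide(scale, src, dst) computes scale / src element-wise, so this produced w / [wX, wY, w] instead of [wX, wY, w] / w. That is consistent with the printed output: the third component still comes out as 1 (w / w), and 0.0646111 / 0.00307086 ≈ 21.04. A minimal sketch of the intended normalization, reusing the variable names from the question:

Mat point3D = point3Dw / w;               // element-wise [wX, wY, w] / w = [X, Y, 1]
cout << "Point 3D: " << point3D << endl;  // approx. [0.0475; 0.00737; 1]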