OpenGL - Mouse coordinates to Space coordinates

2021-12-19 00:00:00 opengl graphics c++ glfw glut

My goal is to place a sphere right at where the mouse is pointing (with Z-coord as 0).

I saw this question but I didn't yet understand the MVP matrices concept, so I researched a bit, and now I have two questions:

How do I create a view matrix from camera settings such as the look-at point, the eye position and the up vector?

I also read this tutorial about several camera types and this one for webgl.

I still can't put it all together, and I also don't know how to get the projection matrix.

What steps should I do to implement all of this?

Solution

In a rendering, each mesh of the scene usually is transformed by the model matrix, the view matrix and the projection matrix.

  • Projection matrix:
    The projection matrix describes the mapping from 3D points of a scene to 2D points of the viewport. It transforms from view space to clip space, and the clip space coordinates are transformed to normalized device coordinates (NDC) in the range (-1, -1, -1) to (1, 1, 1) by dividing by the w component of the clip coordinates.

  • View matrix:
    The view matrix describes the direction and position from which the scene is looked at. The view matrix transforms from world space to view (eye) space. In view space, the X-axis points to the right, the Y-axis up and the Z-axis out of the view, towards the viewer (note that in a right-handed system the Z-axis is the cross product of the X-axis and the Y-axis).

  • Model matrix:
    The model matrix defines the location, orientation and relative size of a mesh in the scene. The model matrix transforms the vertex positions of the mesh to world space.

The model matrix looks like this:

( X-axis.x, X-axis.y, X-axis.z, 0 )
( Y-axis.x, Y-axis.y, Y-axis.z, 0 )
( Z-axis.x, Z-axis.y, Z-axis.z, 0 )
( trans.x,  trans.y,  trans.z,  1 ) 
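As a minimal sketch of this layout, the snippet below builds a model matrix from three axes and a translation and applies it to a point. The names MakeModel and Transform are hypothetical helpers introduced only for illustration. The sketch assumes the OpenGL column-major convention, so each line of the matrix shown above is one column in memory.

```cpp
#include <array>

using TVec3  = std::array< float, 3 >;
using TVec4  = std::array< float, 4 >;
using TMat44 = std::array< TVec4, 4 >;

// Hypothetical helper: builds a model matrix from three axis vectors and a
// translation. Each TVec4 is one column of the matrix (column-major layout).
TMat44 MakeModel( const TVec3 &x_axis, const TVec3 &y_axis, const TVec3 &z_axis, const TVec3 &trans )
{
    return TMat44{
        TVec4{ x_axis[0], x_axis[1], x_axis[2], 0.0f },
        TVec4{ y_axis[0], y_axis[1], y_axis[2], 0.0f },
        TVec4{ z_axis[0], z_axis[1], z_axis[2], 0.0f },
        TVec4{ trans[0],  trans[1],  trans[2],  1.0f }
    };
}

// Hypothetical helper: result = m * v, with m indexed as m[column][row].
TVec4 Transform( const TMat44 &m, const TVec4 &v )
{
    TVec4 r{ 0.0f, 0.0f, 0.0f, 0.0f };
    for ( int col = 0; col < 4; ++col )
        for ( int row = 0; row < 4; ++row )
            r[row] += m[col][row] * v[col];
    return r;
}
```

With unit axes and a translation of (2, 3, 0), the point (1, 1, 0, 1) should end up at (3, 4, 0, 1), which is how a sphere would be moved to a computed world position.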


View

In view space the X-axis points to the right, the Y-axis up and the Z-axis out of the view, towards the viewer (note that in a right-handed system the Z-axis is the cross product of the X-axis and the Y-axis).

The code below builds a matrix that encapsulates the steps needed to look at the scene:

  • Converting world coordinates into view (eye) coordinates.
  • Rotation, to look in the direction of the view.
  • Movement to the eye position.

The following code does the same as gluLookAt or glm::lookAt:

#include <array>
#include <cmath>

using TVec3  = std::array< float, 3 >;
using TVec4  = std::array< float, 4 >;
using TMat44 = std::array< TVec4, 4 >;

TVec3 Cross( TVec3 a, TVec3 b ) { return { a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0] }; }
float Dot( TVec3 a, TVec3 b ) { return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]; }
void Normalize( TVec3 & v )
{
    float len = std::sqrt( v[0] * v[0] + v[1] * v[1] + v[2] * v[2] );
    v[0] /= len; v[1] /= len; v[2] /= len;
}

// Returns the view matrix. Each TVec4 is one column (column-major layout).
// Written as a free function here so that the snippet compiles on its own.
TMat44 LookAt( const TVec3 &pos, const TVec3 &target, const TVec3 &up )
{ 
    // mz points from the target to the eye, the camera looks along -mz
    TVec3 mz = { pos[0] - target[0], pos[1] - target[1], pos[2] - target[2] };
    Normalize( mz );
    TVec3 my = { up[0], up[1], up[2] };
    TVec3 mx = Cross( my, mz );   // camera right
    Normalize( mx );
    my = Cross( mz, mx );         // camera up, orthogonal to mx and mz

    // Rotation is the transposed camera basis; the translation moves the eye
    // to the origin, so all three components are negated dot products.
    TMat44 v{
        TVec4{ mx[0], my[0], mz[0], 0.0f },
        TVec4{ mx[1], my[1], mz[1], 0.0f },
        TVec4{ mx[2], my[2], mz[2], 0.0f },
        TVec4{ -Dot(mx, pos), -Dot(my, pos), -Dot(mz, pos), 1.0f }
    };

    return v;
}
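Going from a mouse position back to world space needs the opposite direction, from view space to world space. The sketch below is one way to get it without a general matrix inverse: the inverse of a look-at matrix has the camera axes and the eye position as its columns. CameraToWorld, Normalized and Transform are hypothetical names used only for this illustration, and the column-major m[column][row] indexing is an assumption carried over from the code above.

```cpp
#include <array>
#include <cmath>

using TVec3  = std::array< float, 3 >;
using TVec4  = std::array< float, 4 >;
using TMat44 = std::array< TVec4, 4 >;

TVec3 Cross( const TVec3 &a, const TVec3 &b )
{
    return { a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0] };
}

TVec3 Normalized( const TVec3 &v )
{
    float len = std::sqrt( v[0]*v[0] + v[1]*v[1] + v[2]*v[2] );
    return { v[0] / len, v[1] / len, v[2] / len };
}

// Hypothetical helper: inverse of the look-at view matrix. Its columns are
// the camera axes (right, up, backward) and the eye position in world space.
TMat44 CameraToWorld( const TVec3 &pos, const TVec3 &target, const TVec3 &up )
{
    TVec3 mz = Normalized( { pos[0] - target[0], pos[1] - target[1], pos[2] - target[2] } );
    TVec3 mx = Normalized( Cross( up, mz ) );
    TVec3 my = Cross( mz, mx );
    return TMat44{
        TVec4{ mx[0],  mx[1],  mx[2],  0.0f },
        TVec4{ my[0],  my[1],  my[2],  0.0f },
        TVec4{ mz[0],  mz[1],  mz[2],  0.0f },
        TVec4{ pos[0], pos[1], pos[2], 1.0f }
    };
}

// Hypothetical helper: result = m * v, with m indexed as m[column][row].
TVec4 Transform( const TMat44 &m, const TVec4 &v )
{
    TVec4 r{ 0.0f, 0.0f, 0.0f, 0.0f };
    for ( int col = 0; col < 4; ++col )
        for ( int row = 0; row < 4; ++row )
            r[row] += m[col][row] * v[col];
    return r;
}
```

For a camera at (0, 0, 5) looking at the origin, the view space point (0, 0, -5, 1), which lies five units in front of the camera, should map back to the world origin.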


Projection

The projection matrix describes the mapping from 3D points of a scene to 2D points of the viewport. It transforms from eye space to clip space, and the clip space coordinates are transformed to normalized device coordinates (NDC) by dividing by the w component of the clip coordinates. The NDC are in the range (-1,-1,-1) to (1,1,1).
All geometry outside this range is clipped.

Objects between the near plane and the far plane of the camera frustum are mapped to the depth range (-1, 1) of the NDC.
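For the mouse picking use case, the first step is to express the cursor position in NDC. The following is a small sketch under two assumptions: the viewport covers the whole window, and the mouse position is reported with the origin in the top-left corner and Y pointing down, in the same units as the window size (as GLFW and GLUT report it). MouseToNdc is a hypothetical name.

```cpp
#include <array>

using TVec2 = std::array< float, 2 >;

// Hypothetical helper: maps a mouse position (origin top-left, Y down)
// to NDC x/y in the range [-1, 1] with Y pointing up.
TVec2 MouseToNdc( float mouse_x, float mouse_y, float width, float height )
{
    float ndc_x = 2.0f * mouse_x / width - 1.0f;
    float ndc_y = 1.0f - 2.0f * mouse_y / height;   // flip Y
    return TVec2{ ndc_x, ndc_y };
}
```

In an 800 x 600 window, the center (400, 300) maps to (0, 0) and the top-left corner (0, 0) maps to (-1, 1).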


Orthographic Projection

In an orthographic projection, the coordinates in eye space are linearly mapped to normalized device coordinates.

Orthographic Projection Matrix:

r = right, l = left, b = bottom, t = top, n = near, f = far 

2/(r-l)         0               0               0
0               2/(t-b)         0               0
0               0               -2/(f-n)        0
-(r+l)/(r-l)    -(t+b)/(t-b)    -(f+n)/(f-n)    1
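The matrix above can be written down directly in code. This is a sketch that assumes the same column-major layout as the other snippets (each line of the matrix above is one column); Ortho and Transform are hypothetical names, and the result is meant to correspond to what glOrtho sets up.

```cpp
#include <array>

using TVec4  = std::array< float, 4 >;
using TMat44 = std::array< TVec4, 4 >;

// Hypothetical helper: orthographic projection matrix, one TVec4 per column.
TMat44 Ortho( float l, float r, float b, float t, float n, float f )
{
    return TMat44{
        TVec4{ 2.0f / (r - l),     0.0f,               0.0f,               0.0f },
        TVec4{ 0.0f,               2.0f / (t - b),     0.0f,               0.0f },
        TVec4{ 0.0f,               0.0f,              -2.0f / (f - n),     0.0f },
        TVec4{ -(r + l) / (r - l), -(t + b) / (t - b), -(f + n) / (f - n), 1.0f }
    };
}

// Hypothetical helper: result = m * v, with m indexed as m[column][row].
TVec4 Transform( const TMat44 &m, const TVec4 &v )
{
    TVec4 r{ 0.0f, 0.0f, 0.0f, 0.0f };
    for ( int col = 0; col < 4; ++col )
        for ( int row = 0; row < 4; ++row )
            r[row] += m[col][row] * v[col];
    return r;
}
```

With l = -2, r = 2, b = -1, t = 1, n = 1, f = 3, the top-right corner of the near plane, (2, 1, -1) in eye space, should map to (1, 1, -1) in NDC. Since w stays 1, no perspective divide is needed.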


Perspective Projection

In a perspective projection, the projection matrix describes the mapping from 3D points in the world, as they are seen from a pinhole camera, to 2D points of the viewport.
The eye space coordinates in the camera frustum (a truncated pyramid) are mapped to a cube (the normalized device coordinates).

Perspective Projection Matrix:

r = right, l = left, b = bottom, t = top, n = near, f = far

2*n/(r-l)      0              0                0
0              2*n/(t-b)      0                0
(r+l)/(r-l)    (t+b)/(t-b)    -(f+n)/(f-n)    -1    
0              0              -2*f*n/(f-n)     0

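where, for a symmetric frustum (l = -r, b = -t) with viewport width w, height h and vertical field of view fov_y: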

a = w / h
ta = tan( fov_y / 2 );

2 * n / (r-l) = 1 / (ta * a)
2 * n / (t-b) = 1 / ta

If the projection is symmetric, i.e. the line of sight passes through the center of the viewport and the field of view is not displaced, the matrix can be simplified:

1/(ta*a)  0     0              0
0         1/ta  0              0
0         0    -(f+n)/(f-n)   -1    
0         0    -2*f*n/(f-n)    0
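The simplification can be checked numerically. The sketch below derives the near plane bounds of a symmetric frustum from the field of view; TFrustum and SymmetricBounds are hypothetical names, and fov_y is assumed to be in radians here.

```cpp
#include <cmath>

struct TFrustum { float l, r, b, t; };

// Hypothetical helper: near plane bounds of a symmetric frustum from the
// vertical field of view (radians), aspect ratio a = w / h and near distance n.
TFrustum SymmetricBounds( float fov_y, float a, float n )
{
    float ta = std::tan( fov_y / 2.0f );
    float t  = n * ta;   // top:   half height of the near plane
    float r  = t * a;    // right: half width of the near plane
    return TFrustum{ -r, r, -t, t };
}
```

Because l = -r and b = -t, the terms (r+l)/(r-l) and (t+b)/(t-b) become 0, and 2*n/(r-l) reduces to n/r = 1/(ta*a), which gives the simplified matrix.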


The following function will calculate the same projection matrix as gluPerspective does:

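#include <array>
#include <cmath>

const float cPI = 3.14159265f;
float ToRad( float deg ) { return deg * cPI / 180.0f; }

using TVec4  = std::array< float, 4 >;
using TMat44 = std::array< TVec4, 4 >;

// fov_y: vertical field of view in degrees, aspect: width / height,
// z_near, z_far: distances to the near and far plane (both > 0).
// Each TVec4 is one column of the matrix (column-major layout).
TMat44 Perspective( float fov_y, float aspect, float z_near, float z_far )
{
    float fn  = z_far + z_near;
    float f_n = z_far - z_near;
    float r   = aspect;
    float t   = 1.0f / std::tan( ToRad( fov_y ) / 2.0f );

    return TMat44{ 
        TVec4{ t / r, 0.0f,  0.0f,                       0.0f },
        TVec4{ 0.0f,  t,     0.0f,                       0.0f },
        TVec4{ 0.0f,  0.0f, -fn / f_n,                  -1.0f },
        TVec4{ 0.0f,  0.0f, -2.0f*z_far*z_near / f_n,    0.0f }
    };
}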


3 Solutions to recover view space position in perspective projection

1. With field of view and aspect ratio

Since the projection matrix is defined by the field of view and the aspect ratio, the view space position can be recovered from the field of view and the aspect ratio, provided that the projection is a symmetric perspective projection and that the normalized device coordinates, the depth and the near and far planes are known.

Recover the Z distance in view space:

z_ndc = 2.0 * depth - 1.0;
z_eye = 2.0 * n * f / (f + n - z_ndc * (f - n));

Recover the view space position by the XY normalized device coordinates:

ndc_x, ndc_y = XY normalized device coordinates in the range (-1, -1) to (1, 1); tanFov = tan( fov_y / 2 ):

viewPos.x = z_eye * ndc_x * aspect * tanFov;
viewPos.y = z_eye * ndc_y * tanFov;
viewPos.z = -z_eye; 
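Put together as a C++ function, the formulas above could look like the following sketch. ViewPosFromFov is a hypothetical name, fov_y is assumed to be in radians, and depth is assumed to be a depth buffer value in [0, 1].

```cpp
#include <array>
#include <cmath>

using TVec3 = std::array< float, 3 >;

// Hypothetical helper: recovers the view space position from NDC x/y and a
// depth value, for a symmetric perspective projection.
TVec3 ViewPosFromFov( float ndc_x, float ndc_y, float depth,
                      float fov_y, float aspect, float n, float f )
{
    float tan_fov = std::tan( fov_y / 2.0f );
    float z_ndc   = 2.0f * depth - 1.0f;
    float z_eye   = 2.0f * n * f / ( f + n - z_ndc * ( f - n ) );
    return TVec3{ z_eye * ndc_x * aspect * tan_fov,
                  z_eye * ndc_y * tan_fov,
                  -z_eye };
}
```

For a 90 degree field of view, aspect 2, n = 1 and f = 3, the input ndc = (0.5, 0.5) with depth 0.75 should give approximately (2, 1, -2).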


2. With the projection matrix

The projection parameters, defined by the field of view and the aspect ratio, are stored in the projection matrix. Therefore the view space position can be recovered from the values of the projection matrix of a symmetric perspective projection.

Note the relation between projection matrix, field of view and aspect ratio:

prjMat[0][0] = 2*n/(r-l) = 1.0 / (tanFov * aspect);
prjMat[1][1] = 2*n/(t-b) = 1.0 / tanFov;

prjMat[2][2] = -(f+n)/(f-n);
prjMat[3][2] = -2*f*n/(f-n);

Recover the Z distance in view space:

A     = prjMat[2][2];
B     = prjMat[3][2];
z_ndc = 2.0 * depth - 1.0;
z_eye = B / (A + z_ndc);

Recover the view space position by the XY normalized device coordinates:

viewPos.x = z_eye * ndc_x / prjMat[0][0];
viewPos.y = z_eye * ndc_y / prjMat[1][1];
viewPos.z = -z_eye; 


3. With the inverse projection matrix

The view space position can also be recovered with the inverse projection matrix.

mat4 inversePrjMat = inverse( prjMat );
vec4 viewPosH      = inversePrjMat * vec4( ndc_x, ndc_y, 2.0*depth - 1.0, 1.0 );
vec3 viewPos       = viewPosH.xyz / viewPosH.w;
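For the goal stated in the question (placing an object where the mouse points on the plane Z = 0), no depth value is needed, because the plane itself fixes the distance. One possible approach, sketched below, is to build a ray from the eye through the mouse position and intersect it with the plane. MouseOnPlaneZ0 and Normalized are hypothetical names. The sketch assumes a symmetric perspective projection, fov_y in radians, the camera axes built as in LookAt, and mouse coordinates already converted to NDC.

```cpp
#include <array>
#include <cmath>

using TVec3 = std::array< float, 3 >;

TVec3 Cross( const TVec3 &a, const TVec3 &b )
{
    return { a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0] };
}

TVec3 Normalized( const TVec3 &v )
{
    float len = std::sqrt( v[0]*v[0] + v[1]*v[1] + v[2]*v[2] );
    return { v[0] / len, v[1] / len, v[2] / len };
}

// Hypothetical helper: intersects the mouse ray with the world plane Z = 0.
// Returns false if the ray is parallel to the plane or points away from it.
bool MouseOnPlaneZ0( const TVec3 &pos, const TVec3 &target, const TVec3 &up,
                     float fov_y, float aspect, float ndc_x, float ndc_y, TVec3 &hit )
{
    // Camera axes in world space, built as in LookAt
    TVec3 mz = Normalized( { pos[0] - target[0], pos[1] - target[1], pos[2] - target[2] } );
    TVec3 mx = Normalized( Cross( up, mz ) );
    TVec3 my = Cross( mz, mx );

    // Ray direction in view space for z_eye = 1 (formulas of solution 1)
    float tan_fov = std::tan( fov_y / 2.0f );
    float vx = ndc_x * aspect * tan_fov;
    float vy = ndc_y * tan_fov;
    float vz = -1.0f;

    // Rotate the direction into world space
    TVec3 dir{ 0.0f, 0.0f, 0.0f };
    for ( int i = 0; i < 3; ++i )
        dir[i] = mx[i] * vx + my[i] * vy + mz[i] * vz;

    if ( std::fabs( dir[2] ) < 1e-6f )
        return false;                    // ray parallel to the plane
    float t = -pos[2] / dir[2];          // solve pos.z + t * dir.z = 0
    if ( t < 0.0f )
        return false;                    // plane is behind the camera
    hit = TVec3{ pos[0] + t * dir[0], pos[1] + t * dir[1], 0.0f };
    return true;
}
```

The resulting hit point can be used as the translation of the sphere's model matrix. For a camera at (0, 0, 5) looking at the origin with a 90 degree field of view and aspect 1, ndc = (0.5, 0) should hit the plane at approximately (2.5, 0, 0).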


See further:

  • How to render depth linearly in modern OpenGL with gl_FragCoord.z in fragment shader?
  • Transform the modelMatrix
  • Perspective projection and view matrix: Both depth buffer and triangle face orientation are reversed in OpenGL
  • How to compute the size of the rectangle that is visible to the camera at a given coordinate?
  • How to recover view space position given view space depth value and ndc xy
  • Is it possble get which surface of cube will be click in OpenGL?
