Is my perspective math correct?

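I have a homework assignment in which I must compute and plot some points using a perspective transformation, but I am not sure my results are correct, because the 3D plot in camera coordinates looks very different from the 2D plot in image coordinates. Can you help me understand what is wrong?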

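This is what was given: the camera is located at $_WT^C = [-1, 1, 5]^T$, specified in world coordinates (in meters). The camera coordinate system is rotated by $\theta = 160^\circ$ around the Y axis of the world reference, so its rotation matrix is $^wR_c = \begin{bmatrix}\cos(\theta) & 0 & \sin(\theta)\\ 0 & 1 & 0 \\ -\sin(\theta) & 0 & \cos(\theta)\end{bmatrix}$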
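As a quick numerical check of the rotation matrix above, here is a minimal Python sketch. Python is used for illustration only, and `rotation_y` is a hypothetical helper name, not something from the assignment.

```python
import math

def rotation_y(theta_deg):
    """Return the 3x3 rotation matrix about the Y axis for an angle in degrees."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

R = rotation_y(160.0)
for row in R:
    # Print each row with four decimals for easy comparison
    print(["%.4f" % v for v in row])
```

With $\theta = 160^\circ$ the cosine is about -0.9397 and the sine about 0.3420, so these are the values to compare against when checking a hand calculation.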

The camera parameters are: $f = 16\,mm$, $s_x = s_y = 0.01\,mm/px$, $o_x = 320\,px$, $o_y = 240\,px$

Sample points (in world coordinates):

$^WP_1 = [1, 1, 0.5]^T$

$^WP_2 = [1, 1.5, 0.5]^T$

$^WP_3 = [1.5, 1.5, 0.5]^T$

$^WP_4 = [1.5, 1, 0.5]^T$

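I must compute and plot the points in both the camera coordinate system and the image coordinate system, so I wrote the following code in Octave: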

%camera intrinsic parameters
f = 16
Sx = 0.01
Sy = 0.01
Ox = 320
Oy = 240

%given points, in world coordinate
wP1 = transpose([1, 1, 0.5])
wP2 = transpose([1, 1.5, 0.5])
wP3 = transpose([1.5, 1.5, 0.5])
wP4 = transpose([1.5, 1, 0.5])

% camera translation matrix
wTc = transpose([-1, 1, 5])

% rotation angle converted to rad
theta = 160 / 180 * pi

%camera rotation matrix
wRc = transpose([cos(theta), 0, sin(theta); 0, 1, 0; -sin(theta), 0, cos(theta)])

%transform the points to homogeneous coordinates
wP1h = [wP1; 1]
wP2h = [wP2; 1]
wP3h = [wP3; 1]
wP4h = [wP4; 1]

%separate each line of the rotation matrix
R1 = transpose(wRc(1 , :))
R2 = transpose(wRc(2 , :))
R3 = transpose(wRc(3 , :))

%generate the extrinsic parameters matrix
Mext = [wRc, [-transpose(R1) * wTc; -transpose(R2) * wTc; -transpose(R3) * wTc]]

%intrinsic parameters matrix
Mint = [-f/Sx, 0, Ox; 0, -f/Sy, Oy; 0, 0, 1]

% calculate coordinates in camera coordinates
cP1 = wRc * (wP1 - wTc)
cP2 = wRc * (wP2 - wTc)
cP3 = wRc * (wP3 - wTc)
cP4 = wRc * (wP4 - wTc)

% put coordinates in a list for plotting

x = [cP1(1), cP2(1), cP3(1), cP4(1), cP1(1)]
y = [cP1(2), cP2(2), cP3(2), cP4(2), cP1(2)]
z = [cP1(3), cP2(3), cP3(3), cP4(3), cP1(3)]

%plot the points in 3D using camera coordinates
plot3(x, y, z, "o-r")

pause()

% calculate the points in image coordinates
iP1 = Mint * (Mext * wP1h)
iP2 = Mint * (Mext * wP2h)
iP3 = Mint * (Mext * wP3h)
iP4 = Mint * (Mext * wP4h)

%generate a list of points for plotting
x = [iP1(1) / iP1(3), iP2(1) / iP2(3), iP3(1) / iP3(3), iP4(1) / iP4(3), iP1(1) / iP1(3)]
y = [iP1(2) / iP1(3), iP2(2) / iP2(3), iP3(2) / iP3(3), iP4(2) / iP4(3), iP1(2) / iP1(3)]

plot(x, y, "o-r")

pause()

These are the plots I got from the script. I expected them to look somewhat similar, but they do not.

Plot in camera coordinates

Plot in image coordinates

3d mathematics perspective

— 维特尔

+1 for showing that a homework question can be a high-quality question. :)

— 马丁·恩德

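As pointed out on meta, this question deserves a good answer. I don't have one myself, but I would be happy to give some of my reputation to someone who does.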

— trichoplax

@trichoplax The problem is that it is done in MATLAB.

— joojaa

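@joojaa Ah, good point. If no MATLAB expert steps in during the bounty period, I will consider learning Octave to see whether it is close enough to find a solution.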

— trichoplax

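It is not clear to me what the first image means. The second one is from the camera's point of view, and after a back-of-the-envelope estimate I think it looks correct.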

— Julien Guertault

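Identifying the axes in both of your figures and adding the camera position to the first figure would help you understand what is going on.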

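In what follows, $x$, $y$ and $z$ are the world axes, and I assume the camera initially looks along $[0, 0, 1]$ with its up vector along $[0, 1, 0]$.

Since the world coordinates are in meters, the intrinsic parameters should be expressed in meters as well: a focal length of $0.016$ and a pixel size of $S_x = S_y = 0.00001$.

All the points lie in the plane $z=0.5$. The camera's optical axis crosses that plane at an $x$ offset of $|\tan(160°)| \cdot (5 - 0.5) = 1.64...$ from the camera. Since the camera is at $x=-1$, the axis lands at $x \approx 0.64$. The camera also has the same $y$ coordinate as two of the points, and since the $y$ coordinate is not changed by the rotation, those points should still end up with the same $y$ coordinate after the transform, that is, on the center row of the image.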
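The estimate above can be reproduced in a few lines. This is only a sketch under the stated assumption that the optical axis tilts toward $+x$ as it travels from the camera down to the plane of the points; the variable names are illustrative.

```python
import math

cam_x, cam_z = -1.0, 5.0   # camera position (x and z) in meters
plane_z = 0.5              # all sample points lie in this plane
theta = math.radians(160.0)

# Horizontal distance covered by the optical axis while going from
# z = 5 to z = 0.5. Assumption: the axis tilts toward +x.
offset = abs(math.tan(theta)) * (cam_z - plane_z)
x_hit = cam_x + offset

print(round(offset, 2), round(x_hit, 2))  # 1.64 0.64
```

The sample points sit at $x = 1$ and $x = 1.5$, both beyond $x \approx 0.64$, so they fall on one side of the optical axis rather than straddling it.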

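A good way to check your answer is to use an existing 3D modeler such as Blender. Be careful with Blender's coordinate system; for example, its default camera vector is [0, 0, -1]. In the render, the focal length was set to another value to make the spheres more visible. We can see that the two bottom points lie on the middle row of the image and that the points are slightly to the right of the image center.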

I implemented your homework in Python:

import numpy as np

from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import axes3d, Axes3D


# Parameters
f_m = 0.016  # focal length in meters (16 mm)
f_px = f_m / 0.00001  # focal length in pixels (pixel size 0.01 mm = 0.00001 m)
t_cam = np.array([[-1., 1., 5.]]).T
t_cam_homogeneous = np.vstack((t_cam, np.array([[0]])))
theta = 160. * np.pi / 180.
ox = 320
oy = 240
# 3x3 rotation about the Y axis; points are stored as homogeneous columns
rot_cam = np.array([[np.cos(theta), 0, np.sin(theta)],
                    [0, 1, 0],
                    [-np.sin(theta), 0, np.cos(theta)]])
points = np.array([[1, 1, 0.5, 1],
                   [1, 1.5, 0.5, 1],
                   [1.5, 1.5, 0.5, 1],
                   [1.5, 1, 0.5, 1]]).T

# Compute projection matrix using intrinsics and extrinsics
intrinsics = np.array([[f_px, 0, ox],
                       [0, f_px, oy],
                       [0, 0, 1]])
extrinsics = np.hstack((rot_cam, rot_cam.dot(-t_cam)))

rot_cam2 = np.identity(4); rot_cam2[:3,:3] = rot_cam
camera_coordinates = rot_cam2.dot(points - t_cam_homogeneous)
camera_coordinates = camera_coordinates[:3,:] / camera_coordinates[3,:]

# Perform the projection
projected_points = intrinsics.dot(camera_coordinates)
projected_points = projected_points[:2,:] / projected_points[2,:]
projected_points[0,:] = -projected_points[0,:] # Inverted x-axis because camera is pointing toward [0, 0, 1]

fig = plt.figure()
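ax = fig.add_subplot(111, projection="3d")  # Axes3D(fig) no longer attaches itself to the figure in recent Matplotlib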
ax.scatter(points[0,:], points[1,:], points[2,:], label="Points")
ax.scatter(t_cam[0], t_cam[1], t_cam[2], c="red", label="Camera")
ax.set_xlabel("X axis"); ax.set_ylabel("Y axis"); ax.set_zlabel("Z axis")
plt.title("World coordinates")
plt.legend()
plt.savefig('world_coordinates.png', dpi=300, bbox_inches="tight")

fig = plt.figure()
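ax = fig.add_subplot(111, projection="3d")  # Axes3D(fig) no longer attaches itself to the figure in recent Matplotlib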
ax.scatter(camera_coordinates[0,:], camera_coordinates[1,:], camera_coordinates[2,:], label="Points")
ax.scatter(0, 0, 0, c="red", label="Camera")
ax.set_xlabel("X axis"); ax.set_ylabel("Y axis"); ax.set_zlabel("Z axis")
plt.title("Camera coordinates")
plt.legend()
plt.savefig('camera_coordinates.png', dpi=300, bbox_inches="tight")

plt.figure()
plt.scatter(projected_points[0,:], projected_points[1,:])
plt.xlabel("X axis"); plt.ylabel("Y axis")
plt.title("Image coordinates")
plt.savefig('image_coordinates.png', dpi=300, bbox_inches="tight")

plt.show()

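This gives me the following figures, respectively: world coordinates, camera coordinates, camera coordinates rotated to roughly match the camera orientation (note that here the camera vector points toward the figure's viewpoint; it does not "enter" the figure), and image coordinates.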

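So we can see that the vertical coordinate of the bottom points is correctly on the middle row (240) and that the points are on the right side of the image (horizontal value > 320).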
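The middle-row claim can be checked without any plotting. The sketch below re-derives only the vertical image coordinate in plain Python, using the same camera transform R * (p - t) as the script above; `to_camera` is a hypothetical helper name introduced for this illustration.

```python
import math

theta = math.radians(160.0)
c, s = math.cos(theta), math.sin(theta)
R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]  # rotation about Y
t = [-1.0, 1.0, 5.0]                               # camera position (m)
f_px, oy = 0.016 / 0.00001, 240.0                  # focal length (px), principal row

points = [[1.0, 1.0, 0.5], [1.0, 1.5, 0.5], [1.5, 1.5, 0.5], [1.5, 1.0, 0.5]]

def to_camera(p):
    """Camera coordinates R * (p - t)."""
    d = [p[i] - t[i] for i in range(3)]
    return [sum(R[r][k] * d[k] for k in range(3)) for r in range(3)]

rows = []
for p in points:
    x, y, z = to_camera(p)
    rows.append(f_px * y / z + oy)  # vertical image coordinate

print([round(v, 1) for v in rows])
```

The first and last points share the camera's $y$ coordinate, so their camera-space $y$ is zero and they land exactly on row 240; the other two land on a different row.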

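I believe one mistake you made is that you found negative X values and therefore negated the focal terms in the intrinsic matrix (-f/Sx and -f/Sy) to compensate. The issue here is that we assume the camera initially points toward $[0, 0, 1]$ (otherwise the 160° rotation would not point it toward the points). Looking in that direction, the $x$ axis increases as you move to the left, so it is this axis that should be inverted.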
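To see what negating both focal terms does, here is a small sketch with a generic pinhole projection. The function `project` and the test point `p_cam` are hypothetical and chosen only for illustration; they are not taken from the question.

```python
def project(point_cam, fx, fy, ox, oy):
    """Pinhole projection of a camera-space point to pixel coordinates."""
    x, y, z = point_cam
    return (fx * x / z + ox, fy * y / z + oy)

ox, oy = 320.0, 240.0
f_px = 1600.0              # 16 mm / 0.01 mm per pixel
p_cam = (0.2, 0.1, 4.0)    # hypothetical point in front of the camera

u_pos, v_pos = project(p_cam, f_px, f_px, ox, oy)
u_neg, v_neg = project(p_cam, -f_px, -f_px, ox, oy)

# Negating both focal terms reflects the pixel through the principal point:
# the offsets from (ox, oy) change sign on both axes.
print((u_pos - ox, v_pos - oy), (u_neg - ox, v_neg - oy))
```

Both offsets change sign, so the image is mirrored horizontally and vertically at once, whereas only the horizontal axis needs to be flipped here.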

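Our two results look similar to me, except that you assumed a camera up vector of $[0, -1, 0]$ (in effect both axes are mirrored, because you negated both focal terms) and you computed in mm rather than meters.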
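On the units point, a short check may help. The sketch below assumes only the values given in the question and compares the focal length in pixels when the focal length and pixel size are both given in millimeters and when both are given in meters.

```python
import math

f_px_from_mm = 16.0 / 0.01       # focal length and pixel size both in millimeters
f_px_from_m = 0.016 / 0.00001    # the same two quantities in meters

# The ratio is unit-free as long as both quantities use the same unit.
print(f_px_from_mm, f_px_from_m)
print(math.isclose(f_px_from_mm, f_px_from_m))
```

So the focal length in pixels is the same either way; what matters is that the focal length and the pixel size are never mixed across units, and that any metric quantity combined with the world coordinates is also in meters.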

— 索拉武