Feature tracking using optical flow
I found a similar question in the forum, but the answer there does not address my question.
If I do feature detection (goodFeaturesToTrack) only once, on the first image, and then use optical flow (calcOpticalFlowPyrLK) to track those features, the problem is that only the features detected in the first image can be tracked. Once those features move out of the image, there are no features left to track.
If I run feature detection on every new image, the tracking is not stable, because a feature detected in one frame may not be detected in the next.
I am using optical flow for 3D reconstruction, so I am not interested in which particular features are tracked; I only care that the features in the field of view can be tracked stably. To summarize, my question is: how can I use optical flow to keep tracking old features while adding new features that enter the field of view and dropping old features that leave it?
Answer
Several approaches are possible. A good method goes like this:
1. in frame 1, detect N features; this is keyframe m = 1
2. in frame k, track the features by optical flow
3. in frame k, if the number of successfully tracked features drops below N/2:
   - this frame is keyframe m+1
   - compute the homography or the fundamental matrix describing the motion between keyframes m and m+1
   - detect N new features and discard the old ones
4. k := k+1, go to step 2
With this method you essentially estimate the camera motion between the last two keyframes.
Since you didn't mention which approach you use for 3D reconstruction, I assume either H or F is computed first to estimate the motion. To estimate them accurately, the baseline between the keyframes should be as wide as possible. In general, the best strategy is to take the rough motion model of the camera into account: a hand-held camera calls for a different strategy than a camera fixed on top of a car or a robot. I can provide a minimal working example in Python if that helps, let me know.