HMOO Reading Notes: Optical Flow & Tracking of Feature Points in Computer Vision

These are my notes on Section 4.3, "Matching point features", of the book An Invitation to 3-D Vision.

Brightness Consistency Constraint

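Consider a scene at times $t$ and $t + \Delta t$. Because the object is the same, the pixel values it projects onto the image should also be the same (brightness constancy), that is:

$I(x(t), t) = I(x(t+\Delta t), t+\Delta t)$

If $\Delta t$ is small, we can approximate $x(t + \Delta t) = x(t) + \mathbf{u} \Delta t$. Substituting this into the equation above and taking the first-order Taylor expansion gives:

$I(x(t), t) = I(x(t), t) + \nabla I(x(t),t)^T\mathbf{u}\,\Delta t + I_t(x(t),t)\,\Delta t$

Here $\nabla I(x(t),t)$ is the gradient of $I$ in the $x$ and $y$ directions, and $I_t(x(t), t)$ is the derivative of $I$ with respect to time (in practice approximated by subtracting the image at time $t$ from the image at time $t + \Delta t$, with $\Delta t$ taken as one frame). Cancelling $I(x(t), t)$ on both sides and dividing by $\Delta t$ gives the Brightness Consistency Constraint:

$\nabla I(x(t),t)^T\mathbf{u} + I_t(x(t),t) = 0$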
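As a quick sanity check on the constraint, here is a minimal sketch in plain Python. The synthetic pattern `intensity`, its velocity `TRUE_U`, and the step size `h` are all assumptions made up for illustration; they do not come from the book. The pattern translates rigidly, so $\nabla I^T\mathbf{u} + I_t$ should vanish for the true velocity and not for a wrong one.

```python
import math

# Hypothetical velocity of the synthetic pattern, in pixels per frame.
TRUE_U = (0.5, -0.25)

def intensity(x, y, t):
    """Synthetic brightness pattern that translates with velocity TRUE_U."""
    return (math.sin(0.3 * (x - TRUE_U[0] * t))
            + math.cos(0.2 * (y - TRUE_U[1] * t)))

def residual(x, y, t, u, h=1e-3):
    """Evaluate grad(I)^T u + I_t using central finite differences."""
    ix = (intensity(x + h, y, t) - intensity(x - h, y, t)) / (2 * h)
    iy = (intensity(x, y + h, t) - intensity(x, y - h, t)) / (2 * h)
    it = (intensity(x, y, t + h) - intensity(x, y, t - h)) / (2 * h)
    return ix * u[0] + iy * u[1] + it

# The residual is (numerically) zero for the true velocity...
print(abs(residual(10.0, 7.0, 0.0, TRUE_U)))
# ...and clearly nonzero for a wrong guess such as u = (0, 0).
print(abs(residual(10.0, 7.0, 0.0, (0.0, 0.0))))
```

Note that a single pixel gives only one equation, so this check can confirm a candidate $\mathbf{u}$ but cannot determine it.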

The difference between optical flow and feature tracking

The book puts it this way: "The only difference is where the vector $\mathbf{u}(x,t)$ is computed: in optical flow it is computed at a fixed location in the image, whereas in feature tracking it is computed at the point $x(t)$."
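To make the distinction concrete, here is a small sketch under made-up assumptions: `velocity` is a hypothetical image velocity field (a rotation about the origin with angular speed `OMEGA`), not anything from the book. Optical flow keeps sampling $\mathbf{u}$ at the same pixel, while feature tracking samples $\mathbf{u}$ at the moving point and then advances that point.

```python
OMEGA = 0.1  # hypothetical angular speed, radians per frame

def velocity(x, y):
    """Hypothetical image velocity field: rotation about the origin."""
    return (-OMEGA * y, OMEGA * x)

def optical_flow_samples(location, n_frames):
    """Optical flow: evaluate u at the same fixed image location each frame."""
    return [velocity(*location) for _ in range(n_frames)]

def track_feature(start, n_frames, dt=1.0):
    """Feature tracking: evaluate u at the moving point x(t), then advance it."""
    x, y = start
    path = [(x, y)]
    for _ in range(n_frames):
        ux, uy = velocity(x, y)
        x, y = x + ux * dt, y + uy * dt
        path.append((x, y))
    return path

flow = optical_flow_samples((1.0, 0.0), 3)   # same vector every frame
path = track_feature((1.0, 0.0), 3)          # the evaluation point moves
print(flow)
print(path)
```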

Computing $\mathbf{u}$

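The constraint above is a single equation in two unknowns, $(\mathbf{u}_x, \mathbf{u}_y)$. We therefore gather the neighboring points in a local window $W(x)$ and assume that $\mathbf{u}$ is constant over that window, which lets us write the following loss function:

$L(\mathbf{u}) = \sum_{W(x)}[\nabla I(x(t),t)^T\mathbf{u} + I_t(x(t),t)]^2$

Setting the derivative to zero:

$\nabla L(\mathbf{u}) = 2 \sum_{W(x)} \nabla I(\nabla I^T\mathbf{u} + I_t) = 2 \sum_{W(x)} ( \begin{bmatrix} I_x^2 & I_x I_y\\ I_x I_y & I_y^2 \end{bmatrix} \mathbf{u} + \begin{bmatrix} I_xI_t\\ I_yI_t \end{bmatrix} ) = 0$

This can be written in matrix form as $G \mathbf{u} + \mathbf{b} = 0$, where:

$G = \sum_{W(x)} \begin{bmatrix} I_x^2 & I_x I_y\\ I_x I_y & I_y^2 \end{bmatrix}, \quad \mathbf{b} = \sum_{W(x)} \begin{bmatrix} I_xI_t\\ I_yI_t \end{bmatrix}$

The solution of this equation is:

$\mathbf{u} = -G^{-1}\mathbf{b}$

This solution depends on one thing: whether the matrix $G$ is invertible. When the pixel values in the local window are all nearly the same ($I_x = I_y = 0$, the blank wall problem), or when the gradient has only one direction throughout the window (for example $I_x=0$ or $I_y = 0$, the aperture problem), $G$ has no inverse. So another precondition for computing $\mathbf{u}$ is that the local window contains enough texture.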
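The closed-form solution can be sketched in a few lines of plain Python. This is only an illustration: the function name `solve_flow`, the tolerance `eps`, and the sample gradient values are my own assumptions, and the per-pixel derivatives $(I_x, I_y, I_t)$ are taken as given rather than computed from real images.

```python
def solve_flow(samples, eps=1e-9):
    """Solve G u + b = 0 for one window.

    samples: iterable of (Ix, Iy, It), one tuple per pixel in W(x).
    Returns (ux, uy), or None when G is (near-)singular, which is the
    blank wall / aperture situation.
    """
    gxx = gxy = gyy = bx = by = 0.0
    for ix, iy, it in samples:
        gxx += ix * ix
        gxy += ix * iy
        gyy += iy * iy
        bx += ix * it
        by += iy * it
    det = gxx * gyy - gxy * gxy
    if abs(det) < eps:  # hypothetical tolerance for "not invertible"
        return None
    # u = -G^{-1} b, using the closed-form inverse of a 2x2 matrix.
    ux = -(gyy * bx - gxy * by) / det
    uy = -(-gxy * bx + gxx * by) / det
    return ux, uy

# Textured window: gradients in several directions, with It generated
# from a made-up ground-truth velocity u = (1, -2) via It = -(Ix*ux + Iy*uy).
textured = [(1.0, 0.0, -1.0), (0.0, 1.0, 2.0), (1.0, 1.0, 1.0), (2.0, -1.0, -4.0)]
print(solve_flow(textured))   # recovers approximately (1.0, -2.0)

# Aperture problem: every gradient points along x, so G has rank 1.
edge_only = [(1.0, 0.0, -1.0), (2.0, 0.0, -2.0), (3.0, 0.0, -3.0)]
print(solve_flow(edge_only))  # None

# Blank wall: no gradient at all, so G is the zero matrix.
blank = [(0.0, 0.0, 0.0)] * 4
print(solve_flow(blank))      # None
```

In practice a determinant test like this is sensitive to image scale; thresholding the smallest eigenvalue of $G$ is the more common way to decide whether a window has enough texture.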

SSD & NCC criterion

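The idea of SSD (sum of squared differences) is to find the displacement $\Delta \mathbf{x}=(dx, dy)$ that minimizes this loss function:

$E_t(dx, dy) = \sum_{W(x,y)} [I(x+dx, y+dy, t+dt) - I(x,y,t)]^2$

For small displacements this is equivalent to the earlier computation of $\mathbf{u}$ (linearizing the term inside the square leads back to the same least-squares problem), but when it is minimized by searching over candidate displacements it does not require the derivatives (gradients) of $I(x,y,t)$. The drawback of SSD is that it cannot handle scaling and shifts of the pixel values, so NCC (normalized cross-correlation) can be used to address this:

$NCC(h) = \frac{\sum_{W(x)}(I_1(x)-\overline{I}_1)(I_2(h(x))-\overline{I}_2)}{\sqrt{\sum_{W(x)}(I_1(x)-\overline{I}_1)^2 \sum_{W(x)}(I_2(h(x))-\overline{I}_2)^2}}$

Here $h$ is the transformation applied to the window, and $\overline{I}$ is the mean pixel value of $I$ over the local window $W(x)$.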
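Both criteria are easy to try out on toy data. The sketch below is plain Python with made-up pixel values; `ssd` and `ncc` operate on windows flattened into lists, and the 1-D search over `dx` is a simplification of the 2-D search over $(dx, dy)$, with $h$ taken to be a pure translation.

```python
import math

def ssd(w1, w2):
    """Sum of squared differences between two equally sized windows."""
    return sum((a - b) ** 2 for a, b in zip(w1, w2))

def ncc(w1, w2):
    """Normalized cross-correlation between two equally sized windows."""
    m1 = sum(w1) / len(w1)
    m2 = sum(w2) / len(w2)
    num = sum((a - m1) * (b - m2) for a, b in zip(w1, w2))
    den = math.sqrt(sum((a - m1) ** 2 for a in w1)
                    * sum((b - m2) ** 2 for b in w2))
    return num / den if den > 0 else 0.0  # flat windows: correlation undefined

# SSD search: the second row is the first one shifted right by 2 pixels.
row_t = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0]
row_t1 = [0, 0, 0, 0, 1, 5, 9, 5, 1, 0]
window = range(2, 7)
best_dx = min(
    range(-2, 4),
    key=lambda dx: ssd([row_t[i] for i in window],
                       [row_t1[i + dx] for i in window]),
)
print(best_dx)  # 2

# Scaling and shifting the intensities hurts SSD but not NCC.
patch = [10.0, 20.0, 30.0, 40.0, 25.0, 15.0]
brighter = [2.0 * p + 5.0 for p in patch]
print(ssd(patch, brighter))  # large, although the pattern is the same
print(ncc(patch, brighter))  # approximately 1.0
```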

Corner & Edge detector

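Basic corner detector: compute the eigenvalues of the matrix $G$ defined above; a point is a feature point when the smallest eigenvalue is larger than a threshold.
Harris corner detector: instead of the smallest eigenvalue, compute $C(G) = \det(G) + k \times \mathrm{trace}^2(G) = (1+2k) \sigma_1 \sigma_2 + k(\sigma_1^2 + \sigma_2^2)$, where $\sigma_1, \sigma_2$ are the eigenvalues of $G$ and $k$ is a scalar.
Canny edge detector: first apply a Gaussian blur to the image to remove noise, then compute the gradient $\nabla I = [I_x, I_y]^T$, take the norm of this gradient vector, and use a threshold to decide whether each pixel is an edge pixel.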
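The two corner scores above can be compared on toy windows. This is a sketch only: the gradient samples and the value `K = 0.04` are assumptions chosen for illustration, and `corner_response` follows the formula exactly as written in these notes.

```python
import math

K = 0.04  # hypothetical value for the scalar k

def structure_tensor(grads):
    """Accumulate G over a window; grads is an iterable of (Ix, Iy)."""
    gxx = sum(ix * ix for ix, _ in grads)
    gxy = sum(ix * iy for ix, iy in grads)
    gyy = sum(iy * iy for _, iy in grads)
    return gxx, gxy, gyy

def eigenvalues(gxx, gxy, gyy):
    """Eigenvalues (largest first) of the symmetric 2x2 matrix G."""
    tr = gxx + gyy
    det = gxx * gyy - gxy * gxy
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return (tr + disc) / 2.0, (tr - disc) / 2.0

def corner_response(gxx, gxy, gyy, k=K):
    """C(G) = det(G) + k * trace(G)^2, as written in the notes above."""
    tr = gxx + gyy
    det = gxx * gyy - gxy * gxy
    return det + k * tr * tr

corner = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.0, 1.0)]  # two directions
edge = [(1.0, 0.0)] * 4                                    # one direction

for name, grads in (("corner", corner), ("edge", edge)):
    g = structure_tensor(grads)
    s1, s2 = eigenvalues(*g)
    print(name, "min eigenvalue:", s2, "C(G):", corner_response(*g))
```

The smallest eigenvalue is zero for the edge window and positive for the corner window, so a threshold on it rejects the edge. The actual thresholds depend on the image scale and are left out here.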

Wednesday, September 1, 2021