TOC
前提提要
最近讀了 Pose Estimation 相關的論文,發現一些 Bottom Up 的方法 1 2 會直接生成各個 Keypoints 會哪,中間不經過 RPN 等方法。而生成的 Confidence Map 的 Ground Truth 是使用高斯分佈來指示位置。但我翻了一下文檔,numpy
似乎沒有提供生成二維高斯分佈的函式,只提供從高斯分佈取樣的函式,於是我模彷了 skimage.draw
的 API,寫了一個函式。
實作原理
二維高斯分佈就是兩個一維高斯分佈取外積。於是我分別對 row 與 col 各生成一個高斯分佈,函數 domain 為 $$ [-3sigma, +3sigma] $$,因為是整數域,共 $$ 6sigma + 1 $$ 個值。然後將這兩個分佈使用 np.outer
即為所求。
程式碼
Hint!
此函式不支援 double 型態的 mu 與 sigma!
1def gaussian2d(mu, sigma, shape=None):
2 """Generate 2d gaussian distribution coordinates and values.
3
4 Parameters
5 --------------
6 mu: tuple of int
7 Coordinates of center, (mu_r, mu_c)
8 sigma: tuple of int
9 Intensity of the distribution, (sigma_r, sigma_c)
10 shape: tuple of int, optional
11 Image shape which is used to determine the maximum extent
12 pixel coordinates, (r, c)
13
14 Returns
15 --------------
16 rr, cc: (N,) ndarray of int
17 Indices of pixels that belong to the distribution
18 gaussian: (N, ) ndarray of float
19 Values of corresponding position. Ranges from 0.0 to 1.0.
20
21 .. warning::
22
23 This function does NOT support mu, sigma as double.
24 """
25 mu_r, mu_c = mu
26 sigma_r, sigma_c = sigma
27
28 R, C = 6 * sigma_r + 1, 6 * sigma_c + 1
29 r = np.arange(-3 * sigma_r, +3 * sigma_r + 1) + mu_r
30 c = np.arange(-3 * sigma_c, +3 * sigma_c + 1) + mu_c
31 if shape:
32 r = np.clip(r, 0, shape[0])
33 c = np.clip(c, 0, shape[1])
34
35 coef_r = 1 / (sigma_r * np.sqrt(2 * np.pi))
36 coef_c = 1 / (sigma_c * np.sqrt(2 * np.pi))
37 exp_r = -1 / (2 * (sigma_r**2)) * ((r - mu_r)**2)
38 exp_c = -1 / (2 * (sigma_c**2)) * ((c - mu_c)**2)
39
40 gr = coef_r * np.exp(exp_r)
41 gc = coef_c * np.exp(exp_c)
42 g = np.outer(gr, gc)
43
44 r = r.reshape(R, 1)
45 c = c.reshape(1, C)
46 rr = np.broadcast_to(r, (R, C))
47 cc = np.broadcast_to(c, (R, C))
48 return rr.ravel(), cc.ravel(), g.ravel()
範例
import numpy as np
from PIL import Image
img = np.zeros((100, 100), dtype=np.float32)
rr, cc, g = gaussian2d([50, 50], [4, 4], shape=img.shape)
img[rr, cc] = g / g.max()
rr, cc, g = gaussian2d([20, 40], [5, 2], shape=img.shape)
img[rr, cc] = g / g.max()
rr, cc, g = gaussian2d([0, 0], [1, 3], shape=img.shape)
img[rr, cc] = g / g.max()
# Save Image
img = np.uint8(img * 255)
Image.fromarray(img).save('./out.jpg')

其中要注意的是函式的值可能太小(例如 sigma=1
時,函式值最大為 0.5),可以考慮將之 normalize,如程式碼所示。