HarmonyHu 多思不如养志,多言不如守静,多才不如蓄德

SSD网络

2020-06-17
AI

PriorBox

SSD中定义PriorBox,预选框。

layer {
  name: "conv13_mbox_priorbox"
  type: "PriorBox"
  bottom: "conv13"
  bottom: "data"
  top: "conv13_mbox_priorbox"
  prior_box_param {
    min_size: 105.0
    max_size: 150.0
    aspect_ratio: 2.0
    aspect_ratio: 3.0
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset: 0.5
  }
  • 默认正方形预选框: \(最小边长:min\_size \\ 最大边长:\sqrt{min\_size \times max\_size}\)

  • 长方形预选框,每个aspect_ratio会生成1个长方形框;如果flip为true,则再生成一个转置的长方形框: \(\begin{bmatrix} \sqrt{aspect\_ratio} \times min\_size,\frac{1}{\sqrt{aspect\_ratio}}\times min\_size \end{bmatrix} \\ \begin{bmatrix} \frac{1}{\sqrt{aspect\_ratio}}\times min\_size,\sqrt{aspect\_ratio} \times min\_size \end{bmatrix}, if \quad flip = ture\)

  • 根据上面的计算,预选框数量为: \(num\_priors = num(min\_size) * num(aspect\_ratio) * (1 + flip) + num(max\_size) + num(min\_size)\)

  • 根据预选框从每个像素点记录框的坐标

  • min_sizemax_size由以下公式确定: \(S_k= S_{min}+ \frac{S_{max} - S_{min}}{m - 1} \times (k - 1), \quad k \in [1,m]\) SSD网络中m为6

DetectionOutput

SSD网络最后一层,定义如下:

layer {
  name: "detection_out"
  type: "DetectionOutput"
  bottom: "mbox_loc"
  bottom: "mbox_conf_flatten"
  bottom: "mbox_priorbox"
  top: "detection_out"
  include {
    phase: TEST
  }
  detection_output_param {
    num_classes: 21
    share_location: true
    background_label_id: 0
    nms_param {
      nms_threshold: 0.45
      top_k: 100
    }
    code_type: CENTER_SIZE
    keep_top_k: 100
    confidence_threshold: 0.25
  }
}

输入

  • mbox_priorbox,是各个priorbox层输出的预选框,进行concat生成
  • mbox_loc,预选框的偏移量
  • mbox_conf_flatten,各个框在类别上的得分。

输出

shape为[N, 1, x, 7],x是保留框的个数,7对应[id, label, core, xmin, ymin, xmax, ymax]


Similar Posts

上一篇 反卷积(DeConv)

下一篇 Macbook操作

Content