SPPNet to Fast-RCNN

Apr 9, 2021

Fast-RCNN is a modified version of RCNN, and most of its changes are best understood in comparison with SPPNet.

SPPNet used 3 to 4 pyramid levels (0, 1, 2, 3) of pooling, whereas Fast-RCNN uses only a single level: a 7 by 7 (7x7) grid of max pooling, which they named ROI pooling.
They discovered experimentally that softmax and SVM give almost the same classification results. That is why they got rid of the SVM and used softmax for both classification and fine-tuning.
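To make the single-level 7x7 ROI pooling above more concrete, here is a minimal sketch in plain Python. It assumes a single-channel feature map stored as a list of rows and an ROI already given in feature-map coordinates; the function name `roi_max_pool` and the bin-splitting rule (floor for the start, ceiling for the end of each bin) are illustrative choices, not the paper's exact implementation.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, out_size=7):
    """Max-pool one ROI into a fixed out_size x out_size grid.

    feature_map: list of rows (H x W), single channel (assumption).
    roi: (x1, y1, x2, y2), inclusive, in feature-map coordinates (assumption).
    """
    x1, y1, x2, y2 = roi
    roi_h = y2 - y1 + 1
    roi_w = x2 - x1 + 1
    pooled = []
    for i in range(out_size):
        # Rows covered by bin i; floor/ceil keeps every bin non-empty.
        r0 = y1 + (i * roi_h) // out_size
        r1 = y1 + ceil_div((i + 1) * roi_h, out_size)
        row = []
        for j in range(out_size):
            c0 = x1 + (j * roi_w) // out_size
            c1 = x1 + ceil_div((j + 1) * roi_w, out_size)
            # Max over all feature-map cells that fall into this bin.
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Toy 14x14 feature map: the value at (r, c) is r * 14 + c.
fmap = [[r * 14 + c for c in range(14)] for r in range(14)]
pooled = roi_max_pool(fmap, (0, 0, 13, 13))
print(len(pooled), len(pooled[0]))  # 7 7
```

Whatever the size of the ROI, the output is always 7x7, which is what lets ROIs of different shapes feed the same fully connected layers.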

The output of the SPP/ROI layer is fed to fully connected layers before being passed to the softmax FC and BBox regression FC layers. means these are
For bounding box regression they used smooth L1 loss rather than L2 loss. The smooth L1 loss is a combination of L1 and L2 loss: it behaves like L2 for small errors and like L1 for large ones.
They add up the classification loss and the BBox regression loss and train the network on the combined loss. Experimentally, they found that accuracy improves compared to training on the separate losses.
They got rid of the image pyramid (5 different scales of an image as input) and used only a single scale per image. The reason is that the image pyramid gives only about a 0.5 percent accuracy improvement over a single scale, but makes the network 4x slower.
In terms of accuracy, Fast-RCNN is not much of an improvement (i.e. the accuracy of RCNN is 66.0%, whereas Fast-RCNN reaches 66.9%), but the primary purpose of introducing Fast-RCNN was speed. SPPNet is 22 to 24 times faster than RCNN, whereas Fast-RCNN is 146 times faster than RCNN.
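The smooth L1 loss and the summed classification plus BBox regression loss mentioned above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' code: the function names, the weight `lam`, and the convention that label 0 is background (and therefore has no box term) are all assumptions made for the example.

```python
import math


def smooth_l1(x):
    """Smooth L1 on one residual: quadratic (L2-like) near zero, linear (L1-like) elsewhere."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def softmax_cross_entropy(logits, label):
    """Negative log-probability of the true class under a softmax."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def multitask_loss(logits, label, box_pred, box_target, lam=1.0):
    """Classification loss plus weighted box regression loss for one ROI."""
    cls_loss = softmax_cross_entropy(logits, label)
    if label == 0:
        # Assumption: class 0 is background, which has no ground-truth box.
        return cls_loss
    reg_loss = sum(smooth_l1(p - t) for p, t in zip(box_pred, box_target))
    return cls_loss + lam * reg_loss


# One ROI with two classes and a 4-value box offset.
loss = multitask_loss([0.0, 0.0], 1, [0.5, 2.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0])
print(loss)
```

Because the linear part of smooth L1 grows more slowly than a squared error, a few badly predicted boxes do not dominate the summed loss the way they would with L2.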

Written by Hidayat35