* toc
{:toc}

Introduction

As object recognition research has progressed, larger datasets, more powerful models, and better techniques for preventing overfitting have been studied. Handling larger datasets such as LabelMe and ImageNet required a model with a large learning capacity, so the authors built a model that achieved the best results on the ImageNet dataset used in the ILSVRC (ImageNet Large Scale Visual Recognition Challenge) 2012 competition.


# AlexNet

๋…ผ๋ฌธ์˜ ์ฒซ ๋ฒˆ์งธ ์ €์ž๊ฐ€ Alex Khrizevsky์ด๊ธฐ ๋•Œ๋ฌธ์— ์ €์ž์˜ ์ด๋ฆ„์„ ๋”ฐ์„œ AlexNet์ด๋ผ ๋ถ€๋ฅธ๋‹ค.

The Architecture

1. ReLUs

๋‰ด๋Ÿฐ์˜ Activation ํ•จ์ˆ˜์˜ ๊ธฐ๋ณธ์ ์ธ ๋ฐฉ๋ฒ•์€ tanh(x)์ด๋‹ค. ํ•˜์ง€๋งŒ AlexNet์€ tanh ๋Œ€์‹  ์†๋„๊ฐ€ 5~6๋ฐฐ ์ •๋„ ๋” ๋น ๋ฅธ ReLUs(Rectified Linear Units)๋ฅผ ์‚ฌ์šฉํ–ˆ๋‹ค. ReLU๋ฅผ ์‚ฌ์šฉํ–ˆ์„ ๋•Œ ํ•™์Šต, ์˜ˆ์ธก์˜ ์†๋„๊ฐ€ ์ฆ๊ฐ€ํ•˜๊ณ  ์ •ํ™•๋„๋Š” ์œ ์ง€ํ•  ์ˆ˜ ์žˆ์—ˆ๋‹ค.

2. Local Response Normalization

LRN (Local Response Normalization) borrows the concept of lateral inhibition from neurophysiology.

'Lateral inhibition is the tendency of a neuron, when the neurons in a region are interconnected, to inhibit its neighbors through its own axon or through an interneuron that mediates between itself and the neighboring neurons.'
The definition alone is hard to follow.

In the figure above, gray squares seem to appear in between the four black squares. In reality, the figure contains only uniform black squares separated by white lines, and no gray squares exist. The strong stimulus from the black regions tends to suppress the transmission of the comparatively weak stimulus from the white regions.
LRN implements this principle of lateral inhibition. It normalizes each neuron's activation using the activations of its neighbors, which in the paper are the adjacent kernel maps at the same spatial position. If a neuron is strongly activated compared with its neighbors, normalization makes it stand out even more. If many of its neighbors are also strongly activated, however, the values all become smaller after normalization.
The paper sets the LRN hyperparameters to $k=2, n=5, \alpha=0.0001, \beta=0.75$.

3. Overlapping Pooling

CNN์—์„œ pooling layer๋Š” convolution์„ ํ†ตํ•ด ์–ป์€ ํŠน์„ฑ๋งต์„ ์••์ถ•, ์š”์•ฝํ•˜๋Š” ์—ญํ• ์„ ํ•œ๋‹ค. ์ „ํ†ต์ ์œผ๋กœ ์‚ฌ์šฉ๋˜์—ˆ๋˜ pooling์€ ๊ฒน์ณ์„œ ์ง„ํ–‰๋˜์ง€ ์•Š์•˜๋‹ค. ์ฆ‰, size์™€ stride๊ฐ€ ๊ฐ™๋‹ค. ํ•˜์ง€๋งŒ, AlexNet์—์„œ๋Š” size๋ณด๋‹ค stride๋ฅผ ๋” ํฌ๊ฒŒ ์ ์šฉํ•ด overlapping polling์„ ์‚ฌ์šฉํ–ˆ๋‹ค. ๊ฐ’์„ ์‚ฌ์šฉํ–ˆ๋‹ค.

LeNet-5 used average pooling in its pooling layers, whereas AlexNet used max pooling. According to the paper, overlapping pooling reduced the top-1 and top-5 error rates by 0.4% and 0.3%, respectively, compared with non-overlapping pooling.
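
The difference between the two schemes is easiest to see in one dimension. The sketch below is a simplified illustration, not AlexNet's actual 2D pooling: it pools a made-up row of values with a non-overlapping window (size 2, stride 2) and with an overlapping window (size 3, stride 2). The helper name and the sample row are assumptions.

```python
def max_pool_1d(values, size, stride):
    """Slide a window of `size` over `values` in steps of `stride`, keeping each maximum."""
    return [
        max(values[start:start + size])
        for start in range(0, len(values) - size + 1, stride)
    ]


row = [1, 5, 2, 3, 8, 4, 7]

# size == stride: neighbouring windows share no elements.
non_overlapping = max_pool_1d(row, size=2, stride=2)

# size > stride: neighbouring windows share one element.
overlapping = max_pool_1d(row, size=3, stride=2)

print(non_overlapping)  # [5, 3, 8]
print(overlapping)      # [5, 8, 8]
```

With overlap, the value 8 falls inside two adjacent windows, so both of them report it.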


Overall Architecture

[Figure: AlexNet architecture]

AlexNet์€ 5๊ฐœ์˜ convolution layer, 3๊ฐœ์˜ fully-connected layer ์ด 8๊ฐœ์˜ layer๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค. ๋งˆ์ง€๋ง‰ fully-connected layer์˜ output์€ ImageNet์˜ class๊ฐ€ 1000๊ฐœ์ด๋ฏ€๋กœ 1000๊ฐœ์˜ softmax๋กœ ๋‚˜ํƒ€๋‚ธ๋‹ค.

AlexNet์€ 2๊ฐœ์˜ GPU๋ฅผ ์‚ฌ์šฉํ•ด ๋ณ‘๋ ฌ์ฒ˜๋ฆฌํ•˜๊ธฐ ๋•Œ๋ฌธ์— ๋‘ ๊ฐˆ๋ž˜๋กœ ์—ฐ๊ฒฐ์ด ๋œ๋‹ค. ๋‘ ๋ฒˆ์งธ, ๋„ค ๋ฒˆ์งธ, ๋‹ค์„ฏ ๋ฒˆ์งธ convolution layer๋“ค์€ ๊ฐ™์€ GPU์— ์žˆ๋Š” ์ด์ „ layer๋งŒ ์—ฐ๊ฒฐ๋˜์–ด ์žˆ๊ณ  ์„ธ ๋ฒˆ์งธ convolution layer๋Š” 2๊ฐœ์˜ GPU์— ์žˆ๋Š” ์ด์ „ layer ๋ชจ๋‘ ์—ฐ๊ฒฐ๋˜์–ด ์žˆ๋‹ค. ์ถ”๊ฐ€๋กœ, ์‚ฌ์ง„์—๋Š” input size๊ฐ€ 224๋กœ ๋‚˜ํƒ€๋‚˜ ์žˆ์ง€๋งŒ ์‹ค์ œ ๊ณ„์‚ฐ์ด ์„ฑ๋ฆฝ๋˜๋ ค๋ฉด 227x227์˜ ํ˜•ํƒœ๊ฐ€ ๋œ๋‹ค.

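| Layer | Type | Channels | Output Size | Kernel Size | Stride | Padding | Activation Function |
|-------|------|----------|-------------|-------------|--------|---------|---------------------|
| Input | Input | 3 (RGB) | 227x227 | - | - | - | - |
| C1 | Conv | 96 | 55x55 | 11x11 | 4 | 0 | ReLU + LRN |
| P1 | MaxPooling | 96 | 27x27 | 3x3 | 2 | - | - |
| C2 | Conv | 256 | 27x27 | 5x5 | 1 | 2 | ReLU + LRN |
| P2 | MaxPooling | 256 | 13x13 | 3x3 | 2 | - | - |
| C3 | Conv | 384 | 13x13 | 3x3 | 1 | 1 | ReLU |
| C4 | Conv | 384 | 13x13 | 3x3 | 1 | 1 | ReLU |
| C5 | Conv | 256 | 13x13 | 3x3 | 1 | 1 | ReLU |
| P3 | MaxPooling | 256 | 6x6 | 3x3 | 2 | - | - |
| FC1 | Fully Connected | - | 4096 | - | - | - | ReLU |
| FC2 | Fully Connected | - | 4096 | - | - | - | ReLU |
| FC3 | Fully Connected | - | 1000 | - | - | - | Softmax |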
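
The spatial sizes of the convolution and pooling layers can be traced with the formula floor((input - kernel + 2 * padding) / stride) + 1. The sketch below walks through them in order. It assumes no padding wherever the table shows "-", and stride 1 with padding 2 for C2, which is what a 13x13 output after P2 requires. `output_size`, `LAYERS`, and `trace_shapes` are illustrative names.

```python
def output_size(input_size, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (input_size - kernel + 2 * padding) // stride + 1


# (name, output channels, kernel, stride, padding); pooling keeps the channel count.
LAYERS = [
    ("C1", 96, 11, 4, 0),
    ("P1", 96, 3, 2, 0),
    ("C2", 256, 5, 1, 2),  # assumed stride 1, padding 2
    ("P2", 256, 3, 2, 0),
    ("C3", 384, 3, 1, 1),
    ("C4", 384, 3, 1, 1),
    ("C5", 256, 3, 1, 1),
    ("P3", 256, 3, 2, 0),
]


def trace_shapes(input_size=227):
    """Return (name, channels, spatial size) for every layer, starting from the input."""
    shapes = []
    size = input_size
    for name, channels, kernel, stride, padding in LAYERS:
        size = output_size(size, kernel, stride, padding)
        shapes.append((name, channels, size))
    return shapes


shapes = trace_shapes()
for name, channels, size in shapes:
    print(f"{name}: {channels} x {size} x {size}")

# Number of values fed into the first fully-connected layer.
_, last_channels, last_size = shapes[-1]
flattened = last_channels * last_size * last_size
print("FC1 input:", flattened)
```

The final 6x6x256 volume flattens to 9216 values, which is the input to FC1.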

References

[1] AlexNet paper (main text)