Building a Custom Dataset with Sequence
This works much like subclassing Dataset to build a custom dataset in PyTorch.
In TensorFlow 2.x, a custom dataset loader can be written in a similar way.
It is built on tensorflow.keras.utils.Sequence.
Defining the __init__ function
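import cv2
import numpy as np
import sklearn.utils
from tensorflow.keras.utils import Sequence

# BATCH_SIZE and IMAGE_SIZE are module-level constants assumed to be defined elsewhere.
class CustomDataset(Sequence):
    def __init__(self, img, labels, batch_size=BATCH_SIZE, augmentor=None, shuffle=False):
        self.img = img
        self.labels = labels
        self.batch_size = batch_size  # use the argument, not the global constant
        self.augmentor = augmentor
        self.shuffle = shuffle
        if self.shuffle:
            # Shuffle once up front so the first epoch is also randomized.
            self.on_epoch_end()
- img: the paths to the image files on disk. Passing a numpy array of pixel values instead is not supported and results in a NotImplementedError.
- labels: the labels of the images.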
Defining the __len__ function
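    def __len__(self):
        # Number of batches per epoch, rounded up so a partial final batch is included.
        # Counted from self.img because self.labels is None for a test set.
        return int(np.ceil(len(self.img) / self.batch_size))
- This is the number of steps run in one epoch.
- For example, with 60000 samples in total and a batch_size of 600, each epoch runs 100 steps.
- np.ceil rounds the result up: with a batch_size of 599, 60000 / 599 is about 100.17, so 101 steps are needed rather than 100 to cover the final partial batch.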
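The arithmetic above can be checked with a small standalone sketch. The helper names `steps_per_epoch` and `last_batch_size` are made up for this illustration and are not part of Keras; `math.ceil` stands in for `np.ceil` so the snippet needs no third-party packages.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of batches needed to visit every sample once."""
    return math.ceil(num_samples / batch_size)


def last_batch_size(num_samples: int, batch_size: int) -> int:
    """Size of the final batch, which may be smaller than batch_size."""
    remainder = num_samples % batch_size
    return remainder if remainder else batch_size


print(steps_per_epoch(60000, 600))   # 100
print(steps_per_epoch(60000, 599))   # 101
print(last_batch_size(60000, 599))   # 100
```

With a batch_size of 599, the first 100 steps consume 59900 samples and the 101st step picks up the remaining 100.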
Defining the __getitem__ function
    def __getitem__(self, index):
        # Slice out the file paths (and labels, if any) that belong to this batch.
        start, stop = index * self.batch_size, (index + 1) * self.batch_size
        img_batch = self.img[start:stop]
        label_batch = None
        if self.labels is not None:
            label_batch = self.labels[start:stop]
        # Output array for the decoded images, kept separate from the array of paths.
        image_batch = np.zeros((img_batch.shape[0], IMAGE_SIZE, IMAGE_SIZE, 3))
        for image_index in range(img_batch.shape[0]):
            # cv2.imread returns BGR, so convert to RGB before resizing.
            image = cv2.cvtColor(cv2.imread(img_batch[image_index]), cv2.COLOR_BGR2RGB)
            image = cv2.resize(image, (IMAGE_SIZE, IMAGE_SIZE))
            if self.augmentor is not None:
                image = self.augmentor(image=image)['image']
            image_batch[image_index] = image
        # A test set has no labels, so return only the images in that case.
        if label_batch is None:
            return image_batch
        return image_batch, label_batch
- This function fetches batch_size items from the data according to index.
- A test set has no labels, so the labels are handled separately.
- The values sliced from self.img are file paths, so cv2 is used to read each one into a numpy array and resize it.
- If an augmentor is given, it is applied to each image before the image is stored in the output batch.
- The image batch and label_batch are returned, so each iteration yields one batch.
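The index-to-batch logic described above can be tried without TensorFlow or OpenCV. The following is only a sketch under stated assumptions: `PathBatcher` and its `load` argument are hypothetical names, and `load` is a placeholder for the real read, resize, and augment steps.

```python
import math


class PathBatcher:
    """Hypothetical stand-in for the batching logic of CustomDataset."""

    def __init__(self, paths, labels=None, batch_size=2, load=str.upper):
        self.paths = list(paths)
        self.labels = None if labels is None else list(labels)
        self.batch_size = batch_size
        # Placeholder for cv2.imread + resize + augmentation.
        self.load = load

    def __len__(self):
        # Round up so the partial final batch is counted.
        return math.ceil(len(self.paths) / self.batch_size)

    def __getitem__(self, index):
        start = index * self.batch_size
        stop = start + self.batch_size
        items = [self.load(path) for path in self.paths[start:stop]]
        if self.labels is None:
            # Test set: no labels to return.
            return items
        return items, self.labels[start:stop]


batcher = PathBatcher(
    ["a.png", "b.png", "c.png", "d.png", "e.png"],
    labels=[0, 1, 0, 1, 1],
    batch_size=2,
)
for step in range(len(batcher)):
    print(step, batcher[step])
```

Five items with a batch size of 2 give three batches, and the last one holds a single item because slicing past the end of a list simply stops at the end.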
Defining the on_epoch_end function
    def on_epoch_end(self):
        # Shuffle paths and labels together so each image keeps its own label.
        if self.shuffle:
            self.img, self.labels = sklearn.utils.shuffle(self.img, self.labels)
- The on_epoch_end function is optional.
- It is used to shuffle the data, which is done here with sklearn.utils.shuffle.
- sklearn.utils.shuffle(): shuffles the arrays passed to it in the same order, so each image stays paired with its label.
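To see what "in the same order" means, here is a small sketch of a paired shuffle written with only the standard library. It is not how sklearn implements shuffle; `shuffle_together` is a hypothetical helper that mimics the visible behaviour by applying one shared permutation to both lists.

```python
import random


def shuffle_together(paths, labels, seed=None):
    """Shuffle two equal-length lists with a single shared permutation."""
    if len(paths) != len(labels):
        raise ValueError("paths and labels must have the same length")
    order = list(range(len(paths)))
    random.Random(seed).shuffle(order)
    return [paths[i] for i in order], [labels[i] for i in order]


paths = ["cat_0.png", "dog_0.png", "cat_1.png", "dog_1.png"]
labels = ["cat", "dog", "cat", "dog"]
shuffled_paths, shuffled_labels = shuffle_together(paths, labels, seed=0)

# Every path still lines up with its own label after shuffling.
for path, label in zip(shuffled_paths, shuffled_labels):
    print(path, label)
```

Shuffling the two lists independently would break the pairing between an image and its label, which is why both are passed to a single call.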