* toc
{:toc}

Pipeline Batch training pattern

λ¨Έμ‹ λŸ¬λ‹, λ”₯λŸ¬λ‹ ν•™μŠ΅μ„ μ§„ν–‰ν•˜λŠ”λ° κ°€μž₯ 기본이 λ˜λŠ” νŒ¨ν„΄μ΄λ‹€. λ¨Έμ‹ λŸ¬λ‹, λ”₯λŸ¬λ‹μ—μ„œμ˜ ν•™μŠ΅μ€ λ‹€μŒκ³Ό 같이 λΆ„λ₯˜ν•  수 μžˆλ‹€.

  1. Data collection
  2. Data preprocessing
  3. Model training
  4. Model evaluation
  5. Building the model into the prediction server
  6. Recording the model, server, and evaluation

λ‹€μŒκ³Ό 같이 λ¨Έμ‹ λŸ¬λ‹μ—μ„œμ˜ ν•™μŠ΅μ€ μ—¬λŸ¬ κ³Όμ •μœΌλ‘œ λΆ„ν• λœλ‹€. 이 각 ν”„λ‘œμ„ΈμŠ€λ“€μ„ 순차적으둜 μ‹€ν–‰ν•˜μ—¬ ν•™μŠ΅μ΄ 이루어지도둝 λ§Œλ“œλŠ” νŒ¨ν„΄μ΄ pipeline batch training pattern 이닀. ν”„λ‘œμ„ΈμŠ€λ“€μ΄ λΆ„ν• λ˜μ–΄ μžˆμ–΄, ν•™μŠ΅ λ„μ€‘μ˜ κ²½κ³Όλ₯Ό κΈ°λ‘ν•˜κ³  μž¬μ‚¬μš©μ΄λ‚˜ 뢀뢄적인 μˆ˜μ •μ„ κ°„νŽΈν•˜κ²Œ ν•˜κ±°λ‚˜, 병렬 처리λ₯Ό 톡해 μ„±λŠ₯을 ν–₯μƒμ‹œν‚¬ 수 μžˆλ‹€. λ˜ν•œ, 각각의 ν”„λ‘œμ„ΈμŠ€μ—μ„œ μ‚¬μš©λ˜λŠ” 라이브러리λ₯Ό 선택해 μ‚¬μš©ν•  수 있고 νŒŒμ΄ν”„λΌμΈμ„ λ§Œλ“€λ©΄μ„œ ν•™μŠ΅ 및 좔둠에 λŒ€ν•œ μžλ™ν™”λ₯Ό μ§„ν–‰ν•  수 μžˆλ‹€.

When to use the pipeline batch training pattern (advantages)

  • νŒŒμ΄ν”„λΌμΈμ˜ μžμ›μ„ λΆ„ν• ν•΄ ν”„λ‘œμ„ΈμŠ€λ§ˆλ‹€ 라이브러리λ₯Ό μ„ μ •ν•˜λŠ” 경우
  • ν”„λ‘œμ„ΈμŠ€λ₯Ό λ‹€λ₯Έ μš©λ„λ‘œ ν•¨κ»˜ μ‚¬μš©ν•˜λ €λŠ” 경우
  • ν”„λ‘œμ„ΈμŠ€λ§ˆλ‹€ λ°μ΄ν„°μ˜ μƒνƒœ, μ§„ν–‰ λ‘œκ·Έλ“€μ„ κΈ°λ‘ν•˜κ³  싢은 경우
  • ν”„λ‘œμ„ΈμŠ€ μ‹€ν—˜μ„ κ°œλ³„μ μœΌλ‘œ 닀루고 싢은 경우

Drawbacks

  • κ°œλ³„ μž‘μ—…μ„ μ§„ν–‰ν•˜λ©΄μ„œ 독립성을 κ°–μΆ”λ‚˜ 확인해야 ν•  쑰건이 λŠ˜μ–΄λ‚˜ μ½”λ“œ 관리가 λ³΅μž‘ν•΄μ§„λ‹€.
  • μžμ› 선택에 λŒ€ν•΄ κ³ λ € 사항이 λŠ˜μ–΄λ‚œλ‹€.

μ„œλΉ„μŠ€μ˜ 전체적인 그림은 μ•„λž˜μ™€ κ°™κ³ , κ·Έ 쀑 ν•™μŠ΅ νŒŒμ΄ν”„λΌμΈμ—μ„œ Pipeline Batch training pattern 이 μ‚¬μš©λ˜μ—ˆλ‹€.

[Figure: overall service architecture, including the training pipeline]

Source: https://medium.com/kubwa/ml-design-pattern-%EB%AA%A8%EB%8D%B8-%EC%83%9D%EC%84%B1-2-%ED%8C%8C%EC%9D%B4%ED%94%84%EB%9D%BC%EC%9D%B8-%EB%B0%B0%EC%B9%98-%ED%95%99%EC%8A%B5-%ED%8C%A8%ED%84%B4-a0163f9a5698

  • The overall code structure can be understood from the PyTorch tutorial; the Fashion-MNIST dataset is used.
  1. Load the Fashion-MNIST dataset to collect the training data.
import torch
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor
import matplotlib.pyplot as plt
 
 
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)
 
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

Unlike the Fashion-MNIST case above, when training on a dataset built for a specific task, create a Custom Dataset to use for training.

import os
import pandas as pd
from torchvision.io import read_image
 
class CustomImageDataset(Dataset):
    def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):
        self.img_labels = pd.read_csv(annotations_file)
        self.img_dir = img_dir
        self.transform = transform
        self.target_transform = target_transform
 
    def __len__(self):
        return len(self.img_labels)
 
    def __getitem__(self, idx):
        img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
        image = read_image(img_path)
        label = self.img_labels.iloc[idx, 1]
        if self.transform:
            image = self.transform(image)
        if self.target_transform:
            label = self.target_transform(label)
        return image, label
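
For reference, a hypothetical instantiation of the class above; the csv path and image directory are placeholders, and the annotation file is assumed to have the image filename in its first column and the label in its second.

# Hypothetical usage of CustomImageDataset (paths are placeholders)
custom_ds = CustomImageDataset(annotations_file="data/labels.csv", img_dir="data/images")
image, label = custom_ds[0]  # returns the decoded image tensor and its label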
  2. In the Fashion-MNIST loading step above, the transform is applied as the data is loaded, but in general, after loading the training data you preprocess it, for example with data augmentation and normalization, so that it is suitable for training.
from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader
from torchvision import transforms
 
# Split the annotation dataframe (image path + label per row) into train / validation
# sets, preserving the label distribution with stratify.
imageNet_df = pd.read_csv(annotations_file)  # annotations_file: csv of image paths and labels
train_df, val_df = train_test_split(imageNet_df,
                                    test_size=0.15,
                                    random_state=42,
                                    stratify=imageNet_df['label'])
 
# The mean/std below are the ImageNet statistics; replace them with values for your data.
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize(
                                    mean=[0.485, 0.456, 0.406],
                                    std=[0.229, 0.224, 0.225])
                                ])
 
# ImageNetDataset is a task-specific dataset class in the spirit of CustomImageDataset
# above, taking a label dataframe and the image root directory (root_dir).
train_ds = ImageNetDataset(train_df, root_dir, transform=transform)
val_ds = ImageNetDataset(val_df, root_dir, transform=transform)
dataset = {'train': train_ds, 'val': val_ds}
 
train_loader = DataLoader(train_ds, batch_size=config['batch_size'], shuffle=True)
val_loader = DataLoader(val_ds, batch_size=config['batch_size'], shuffle=False)  # no shuffle for validation
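
The transform above only converts to tensors and normalizes; if the data augmentation mentioned in step 2 is needed, a separate training-time transform can be composed for the train split while validation keeps the deterministic transform. The specific augmentations below are illustrative.

# Illustrative training-time augmentation for the train split only
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomResizedCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
 
train_ds = ImageNetDataset(train_df, root_dir, transform=train_transform)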
  1. ν•™μŠ΅μ„ μ§„ν–‰ν•  λͺ¨λΈμ„ λ§Œλ“€κ³  ν•™μŠ΅κ³Ό 평가λ₯Ό μ§„ν–‰ν•œλ‹€.
from torch import nn
 
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )
 
    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits
 
 
def train_loop(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    # Set the model to training mode - important for batch normalization and dropout layers
    # Unnecessary in this situation but added for best practices
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        # Compute prediction and loss
        pred = model(X)
        loss = loss_fn(pred, y)
 
        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
 
        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
 
def test_loop(dataloader, model, loss_fn):
    # Set the model to evaluation mode - important for batch normalization and dropout layers
    # Unnecessary in this situation but added for best practices
    model.eval()
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, correct = 0, 0
 
    # Evaluating the model with torch.no_grad() ensures that no gradients are computed during test mode
    # also serves to reduce unnecessary gradient computations and memory usage for tensors with requires_grad=True
    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
 
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
 
# learning_rate, the model instance, and the dataloaders must exist before training;
# the values below are illustrative
learning_rate = 1e-3
model = NeuralNetwork()
train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=False)
 
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
 
epochs = 10
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train_loop(train_dataloader, model, loss_fn, optimizer)
    test_loop(test_dataloader, model, loss_fn)
print("Done!")

The later steps, such as running prediction and recording the model, server, and evaluation results, depend on the environment. The pipeline batch training pattern proceeds through the steps above, and its structure varies with each service.
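
As one possible shape of the final recording step (the file paths and the record fields are assumptions, not prescribed by the pattern), the trained weights and a summary of the run can be written out as pipeline artifacts so the serving step and later experiments can reference them.

# Illustrative recording stage: persist the trained weights and a run record.
# Paths and record fields are assumptions.
import json
import os
import time
 
os.makedirs("artifacts", exist_ok=True)
torch.save(model.state_dict(), "artifacts/model.pth")
 
record = {
    "model_file": "artifacts/model.pth",
    "epochs": epochs,
    "learning_rate": learning_rate,
    "finished_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
}
with open("artifacts/training_record.json", "w") as f:
    json.dump(record, f, indent=2)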

References

  1. PyTorch tutorial: https://pytorch.org/tutorials/