[17-3] Classifying Kaggle Data with a CNN


또르르's 개발 Story


또르르21 2021. 2. 17. 09:49

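While taking Boostcamp AI Tech, I had a unit on "Deep Learning Basics".

There was a lot to organize from "Deep Learning Basics", and luckily the following week was the Lunar New Year holiday, which gave me a free week.

While deciding what to do with that week, my peer-session teammates and I agreed to train on a Kaggle dataset and submit the results. We picked First Steps With Julia, a competition that had already ended, and submitted our predictions there.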

https://www.kaggle.com/c/street-view-getting-started-with-julia

The data are signboard images, and the characters in them are letters (uppercase and lowercase) and digits.

The DataSet and CNN architecture are built on the posts below.

Building the DataSet

dororo21.tistory.com/27

[13-2] Handling a DataSet in Colab (dog DataSet)


CNN model

dororo21.tistory.com/26

[13-1] CNN using PyTorch


1️⃣ Setup

Import the required modules.

import numpy as np

import matplotlib.pyplot as plt

import os

import shutil

import glob

import torch

import torch.nn as nn

import torch.optim as optim

import torch.nn.functional as F

%matplotlib inline

%config InlineBackend.figure_format='retina'

print ("PyTorch version:[%s]."%(torch.__version__))

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

print ("device:[%s]."%(device))

2️⃣ Fetching the Kaggle data

There are many ways to bring Kaggle data into Colab; one is to install the kaggle module and download straight to the Colab disk.

!pip install kaggle

To use the Kaggle API in Colab, go to Kaggle > Account, click Create New API Token, and upload the resulting json file.

from google.colab import files

files.upload()

The uploaded kaggle.json then has to be placed in the ~/.kaggle folder for the API to work.

!mkdir -p ~/.kaggle

!cp kaggle.json ~/.kaggle/


!chmod 600 ~/.kaggle/kaggle.json

Download the Kaggle data.

!kaggle competitions download -c street-view-getting-started-with-julia
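The three shell commands above (create ~/.kaggle, copy kaggle.json into it, restrict it to mode 600) can also be done from Python. This is only a minimal sketch: install_kaggle_token is a hypothetical helper name, not part of the kaggle package, and the demo runs against a temporary directory with a dummy token instead of the real home folder.

```python
import os
import shutil
import stat
import tempfile


def install_kaggle_token(token_path, home_dir):
    """Copy kaggle.json into <home_dir>/.kaggle and make it owner-read/write only."""
    kaggle_dir = os.path.join(home_dir, ".kaggle")
    os.makedirs(kaggle_dir, exist_ok=True)        # same as: mkdir -p ~/.kaggle
    dest = shutil.copy(token_path, kaggle_dir)    # same as: cp kaggle.json ~/.kaggle/
    os.chmod(dest, stat.S_IRUSR | stat.S_IWUSR)   # same as: chmod 600
    return dest


# Demo against a throwaway directory (a dummy token, not real credentials).
demo_home = tempfile.mkdtemp()
demo_token = os.path.join(demo_home, "kaggle.json")
with open(demo_token, "w") as f:
    f.write('{"username": "demo", "key": "demo"}')

installed = install_kaggle_token(demo_token, demo_home)
mode = stat.S_IMODE(os.stat(installed).st_mode)
print(installed, oct(mode))
```

In Colab the real call would pass the uploaded kaggle.json and os.path.expanduser("~") as home_dir.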

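3️⃣ Data preprocessing

1) Splitting the data into folders

The data is split into folders so that datasets.ImageFolder can be used.

datasets.ImageFolder expects a hierarchical layout and uses each subfolder's name as the label.

So the train data, which arrives as one flat folder, is sorted into per-class folders to label it.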

# unzip

import string


TEST_PATH = "images/test"

TRAIN_PATH = "images/train"

UPLOAD_PATH = "images/upload"		# the upload folder holds the test data to be submitted to Kaggle


if os.path.exists(TEST_PATH):		# if the folder already exists, delete it

  shutil.rmtree(TEST_PATH)
  
if os.path.exists(TRAIN_PATH):

  shutil.rmtree(TRAIN_PATH)
  
if os.path.exists(UPLOAD_PATH):

  shutil.rmtree(UPLOAD_PATH)
  

for name in list(string.ascii_letters):		# create one folder per uppercase and lowercase letter

  os.makedirs(os.path.join(TRAIN_PATH, name))
  
  os.makedirs(os.path.join(TEST_PATH, name))
  

for name in range(0,10):					# create folders 0 through 9

  name = str(name)
  
  os.makedirs(os.path.join(TRAIN_PATH, name))
  
  os.makedirs(os.path.join(TEST_PATH, name))
  

os.makedirs(UPLOAD_PATH)


!unzip -uq "/content/train.zip" -d "/content/images"		# extract the archives

!unzip -uq "/content/test.zip" -d "/content/images/upload"


print("Done")

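Using pandas, look up each image's label by its ID in the csv, as a preliminary step before sorting the files into per-class folders.

At this point, dataset collects each file's destination path and its label.

# preprocess (split) the train data folder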


import pandas as pd

trainLabels = pd.read_csv("./trainLabels.csv")  # load the csv


dataset = []

for filepath in glob.iglob('images/train/*.Bmp'):  # every .Bmp file in images/train

    filepath_list = filepath.split("/")
    
    number, exten = filepath_list[2].split(".")
    
    trgt_dir = trainLabels[trainLabels["ID"]==int(number)]["Class"].to_string(index=False).strip()
    
    exten = exten.replace("B","b")
    
    full_name = number + "." + exten
    
    shutil.move(filepath, os.path.join(filepath_list[0], filepath_list[1], full_name))
    
    filepath_new = os.path.join(filepath_list[0], filepath_list[1], trgt_dir, full_name)
    
    dataset.append([filepath_new, trgt_dir])
    
dataset = np.array(dataset)



>>> print(dataset)

[['images/train/I/3790.bmp' 'I']
 ['images/train/g/2848.bmp' 'g']
 ['images/train/E/1352.bmp' 'E']
 ...
 ['images/train/N/1320.bmp' 'N']
 ['images/train/S/1898.bmp' 'S']
 ['images/train/N/33.bmp' 'N']]
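The loop above filters the whole DataFrame once per file. An alternative is to build an ID-to-class dictionary once and look each file up in it. The following is a rough sketch with the standard csv module: the ID and Class column names come from the code above, the three rows are taken from the printed output, and destination_for is a hypothetical helper name.

```python
import csv
import io
import os

# Three rows in the same ID,Class layout as trainLabels.csv (values from the output above).
sample_csv = "ID,Class\n3790,I\n2848,g\n1352,E\n"

# Build the ID -> class lookup once instead of filtering per file.
id_to_class = {int(row["ID"]): row["Class"] for row in csv.DictReader(io.StringIO(sample_csv))}


def destination_for(filepath, labels, root="images/train"):
    """Return (new_path, label) for a file such as 'images/train/3790.Bmp'."""
    stem, ext = os.path.splitext(os.path.basename(filepath))
    label = labels[int(stem)]
    return f"{root}/{label}/{stem}{ext.lower()}", label


print(destination_for("images/train/3790.Bmp", id_to_class))
```

Using os.path.basename and os.path.splitext also avoids depending on the position of each path component after split("/").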

uploadset = []

for filepath in glob.iglob(f'images/upload/test/*.Bmp', recursive=True):

    filepath_new = filepath.replace("B","b")
    
    shutil.move(filepath, filepath_new)
    
    uploadset.append([filepath_new, filepath.split("/")[3].split(".")[0]])
    
uploadset = np.array(uploadset)


>>> print(uploadset)

[['images/upload/test/10329.bmp' '10329']
 ['images/upload/test/9002.bmp' '9002']
 ['images/upload/test/11757.bmp' '11757']
 ...
 ['images/upload/test/9155.bmp' '9155']
 ['images/upload/test/10950.bmp' '10950']
 ['images/upload/test/6324.bmp' '6324']]

To check that evaluation works, part of the train data is held out, splitting it into train data and test data.

This uses sklearn.model_selection.

from sklearn.model_selection import train_test_split    # split into training data and test data


train_image, test_image, train_target, test_target = train_test_split(dataset[:,0], dataset[:,1], stratify=dataset[:,1])


for filepath, target_dir in zip(train_image.tolist(), train_target.tolist()):

    filepath_lst = filepath.split("/")
    
    source_path = os.path.join(filepath_lst[0], filepath_lst[1], filepath_lst[3])
    
    shutil.copy(source_path, filepath) 
    
    
for filepath, target_dir in zip(test_image.tolist(), test_target.tolist()):

    filepath_lst = filepath.split("/")
    
    source_path = os.path.join(filepath_lst[0], filepath_lst[1], filepath_lst[3])
    
    target_path = os.path.join(filepath_lst[0], "test", filepath_lst[2], filepath_lst[3])
    
    shutil.copy(source_path, target_path)

test_image looks like this.

>>> test_image

array(['images/train/c/4427.bmp', 'images/train/C/226.bmp',
       'images/train/N/3245.bmp', ..., 'images/train/Z/4516.bmp',
       'images/train/C/6247.bmp', 'images/train/C/35.bmp'], dtype='<U23')
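train_test_split is called above without test_size, so scikit-learn's default of 0.25 applies. The arithmetic below is a small sketch of the resulting split sizes. It assumes the test count is rounded up, which agrees with the 4712 train and 1571 test images that ImageFolder reports later in this post.

```python
import math

n_samples = 4712 + 1571   # train + test image counts reported by ImageFolder in this post
test_size = 0.25          # scikit-learn's default when neither test_size nor train_size is given

n_test = math.ceil(test_size * n_samples)   # assumed rounding: test count rounded up
n_train = n_samples - n_test
print(n_samples, n_train, n_test)
```

Passing test_size explicitly (and random_state, for a reproducible split) would make these numbers visible in the call itself.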

2) Data transforms

torchvision's transforms can resize, crop, normalize, convert to grayscale, flip horizontally, and more. To use them, put the transforms you want inside a Compose.

transforms.Compose([ ... ])

transforms.ToTensor() : converts to a Tensor

transforms.ToTensor()

transforms.Resize() : resizes the data

transforms.Resize([28,28])  # resize

transforms.Normalize() : normalization

transforms.Normalize([0.5, 0.5, 0.5],[0.5, 0.5, 0.5])

# meaning: [R_mean, G_mean, B_mean], [R_std, G_std, B_std]

transforms.Grayscale() : converts the data to grayscale

transforms.Grayscale(num_output_channels=1)

transforms.CenterCrop(n) : crops an n-sized region around the center

transforms.CenterCrop(28)

transforms.RandomHorizontalFlip() : randomly flips the image left-to-right (probability 0.5 by default)

transforms.RandomHorizontalFlip()

So we define the transforms for train, test, and upload.

from torchvision import datasets, transforms, models


train_transforms = transforms.Compose([transforms.Resize([28,28]), # resize

                                       transforms.ToTensor(),
                                       
                                       transforms.Normalize([0.5, 0.5, 0.5],
                                       
                                                            [0.5, 0.5, 0.5])
                                                            
                                       ])
                                       
test_transforms = transforms.Compose([transforms.Resize([28,28]),

                                      transforms.CenterCrop(28),
                                      
                                      transforms.ToTensor(),
                                      
                                      transforms.Normalize([0.5, 0.5, 0.5],
                                      
                                                           [0.5, 0.5, 0.5])])
                                                           

upload_transforms = transforms.Compose([transforms.Resize([28,28]), # resize

                                        transforms.ToTensor(),
                                        
                                        transforms.Normalize([0.5, 0.5, 0.5],
                                        
                                                             [0.5, 0.5, 0.5])
                                                             
                                       ])
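To make the effect of Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]) concrete, here is a plain-Python sketch of the per-channel formula (value - mean) / std. It does not call torchvision; normalize is a hypothetical helper written only for this illustration.

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel formula applied by Normalize: (value - mean) / std."""
    return (value - mean) / std


# ToTensor() scales 8-bit pixels to [0, 1]; the endpoints and a mid-gray then map to:
for pixel in (0, 128, 255):
    scaled = pixel / 255
    print(pixel, round(normalize(scaled), 3))
```

With mean and std both 0.5, the [0, 1] range produced by ToTensor() is mapped to [-1, 1].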

3) Applying ImageFolder

datasets.ImageFolder expects a hierarchical layout and uses each subfolder's name as the label.

DATASET_PATH = "images"

# datasets.ImageFolder automatically picks up the images and labels inside the folders (and applies the transforms above to the bmp files)

train_data = datasets.ImageFolder(DATASET_PATH + '/train', transform=train_transforms)  

test_data = datasets.ImageFolder(DATASET_PATH + '/test', transform=test_transforms)

upload_data = datasets.ImageFolder(DATASET_PATH + '/upload', transform=upload_transforms)

train_data has the following data and transform.

>>> train_data



Dataset ImageFolder
    Number of datapoints: 4712
    Root location: images/train
    StandardTransform
Transform: Compose(
               Resize(size=[28, 28], interpolation=PIL.Image.BILINEAR)
               ToTensor()
               Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
           )

test_data has the following data and transform.

>>> test_data



Dataset ImageFolder
    Number of datapoints: 1571
    Root location: images/test
    StandardTransform
Transform: Compose(
               Resize(size=[28, 28], interpolation=PIL.Image.BILINEAR)
               CenterCrop(size=(28, 28))
               ToTensor()
               Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
           )

The class_to_idx attribute of train_data or test_data shows which index each label was assigned.

>>> train_data.class_to_idx


{'0': 0,
 '1': 1,
 '2': 2,
 '3': 3,
 '4': 4,
 '5': 5,
 '6': 6,
 '7': 7,
 '8': 8,
 '9': 9,
 'A': 10,
 'B': 11,
 'C': 12,
 ...
 't': 55,
 'u': 56,
 'v': 57,
 'w': 58,
 'x': 59,
 'y': 60,
 'z': 61}
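The index order above (digits, then uppercase, then lowercase) follows from sorting the folder names as strings, which is how ImageFolder assigns indices. The sketch below reproduces that mapping in plain Python, without touching the file system; it is an illustration of the ordering, not torchvision's own code.

```python
import string

# The same 62 folder names created earlier: letters plus digits.
class_names = list(string.ascii_letters) + [str(d) for d in range(10)]

# Sorting by code point puts digits first, then uppercase, then lowercase.
class_to_idx = {name: idx for idx, name in enumerate(sorted(class_names))}

print(class_to_idx["0"], class_to_idx["A"], class_to_idx["a"], class_to_idx["z"])
```

Because the mapping depends only on the folder names, train_data and test_data get identical indices as long as both folders contain the same 62 class directories.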

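4) Using DataLoader

DataLoader is the PyTorch utility that wraps a dataset and iterates over it in batches. With it, the data can be shuffled and fed to training batch_size samples at a time. Batches are loaded lazily as you iterate, so memory use stays low.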

BATCH_SIZE = 512

train_iter = torch.utils.data.DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True)

test_iter = torch.utils.data.DataLoader(test_data, batch_size=BATCH_SIZE, shuffle=True)

# upload_iter is not shuffled, because the predictions must be written to the csv in order

upload_iter = torch.utils.data.DataLoader(upload_data, batch_size=BATCH_SIZE, shuffle=False)
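With BATCH_SIZE = 512 and the 4712 training images reported above, each epoch contains only a handful of optimizer updates. The arithmetic below is a small sketch of that count; it relies on DataLoader's default of keeping the last partial batch (drop_last=False), and the 200 epochs are the setting used later in this post.

```python
import math

n_train = 4712      # number of training images reported by ImageFolder above
batch_size = 512    # BATCH_SIZE in this post
epochs = 200        # EPOCHS used in the training section of this post

batches_per_epoch = math.ceil(n_train / batch_size)   # the last partial batch is kept by default
last_batch_size = n_train - (batches_per_epoch - 1) * batch_size
total_updates = batches_per_epoch * epochs

print(batches_per_epoch, last_batch_size, total_updates)
```

So 10 epochs with this batch size amount to about 100 weight updates, which may be one reason accuracy is still low at epoch 10.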

Fetching one batch from train_iter gives the following shape and labels.

data_iter = iter(train_iter)


images, labels = next(data_iter)


>>> print(images.shape)

torch.Size([512, 3, 28, 28])


>>> print(labels)

tensor([25, 23, 47, 21, 27, 22, 55, 45, 31, 52, 24, 28, 43, 39, 36, 10,  4, 35,
        19, 21, 16, 23, 18, 17, 35, 31, 60, 10, 57, 28, 10, 17,  3, 18, 38, 49,
        58, 54, 14, 14, 37, 28, 16, 12, 18,  4, 14, 17, 49, 29, 10, 17, 36, 19,
        16, 56, 24, 25, 12, 56, 28, 12, 10, 55, 39,  0, 10, 41, 53, 10, 38, 23,
        10, 28, 21, 50, 21, 28, 29, 52,  4, 14, 16, 19, 24, 24, 14, 16, 17, 10,
        34, 61, 32,  7, 40, 53, 18, 13, 28, 18, 59, 10, 13, 17, 40, 40, 18, 40,
        14, 12, 23, 18, 16, 24, 10, 54,  1, 18, 24, 18, 10, 25,  7, 39, 15, 18,
        24, 24, 36, 18, 54, 10, 36, 14, 21, 22, 27, 50, 16,  3, 53, 53, 44, 10,
        27, 23, 30, 29, 55, 13, 14, 17, 23, 28, 10, 10, 23, 29, 24, 24, 17, 23,
        28, 23, 55, 13, 60, 21, 18, 44, 25, 59, 11, 42, 44,  4, 28, 28, 18, 30,
        11, 14, 23, 24, 40,  9, 25,  6, 20, 56, 21, 14, 23, 22, 33, 22, 49, 31,
        14, 11, 24, 47, 18, 30, 47, 27, 18, 47, 45, 22, 48, 27,  0, 28, 18, 29,
        12, 39,  5, 36, 50, 49, 24, 10, 43, 25, 26, 14, 13, 18, 54, 18, 41, 32,
        15, 31, 49, 14,  2, 25, 14, 14, 18, 27, 18, 40, 27, 18, 36, 10, 17, 18,
        18, 49, 10, 55, 49, 53, 41, 44, 52, 28, 14, 54, 10, 13, 23, 24, 23, 29,
        12, 21, 27, 28, 53, 23, 13, 10, 40, 26, 48, 12, 12, 17, 40, 23, 38, 29,
        49, 38, 20, 41, 49, 18, 10, 36, 29, 30, 50, 23, 20, 23, 32,  0, 11, 16,
        38, 44, 21, 27, 40, 25,  4, 32, 27, 10,  7, 24, 21, 21, 10, 34, 17, 26,
        23, 27, 27, 13, 10, 16, 17, 10, 29, 55, 44, 50, 14, 18, 14,  3, 24, 40,
        34, 13, 14,  1, 14, 24, 23, 27, 28, 49, 23, 10, 18, 36, 25, 29, 19, 21,
         6, 53, 20, 29, 21, 55, 36, 17, 43,  5, 25, 23, 10, 32, 10, 14, 21, 14,
        27, 34, 23, 14, 14, 16, 40, 13, 28, 28, 19, 10, 28, 10, 40, 56, 29, 22,
        40, 20, 27, 10, 29, 14, 52, 44, 24, 47,  1, 27, 26, 24, 23, 29, 16, 18,
        11, 43,  3, 10, 14, 10, 22, 12, 23,  3, 10, 22, 22, 20, 53,  8, 28, 29,
        48, 55, 23, 40, 42, 28, 18, 18, 40,  5,  3, 37,  6, 58, 34, 11, 51, 22,
        29, 15, 34, 23, 31, 16, 44, 14, 34, 14, 23, 12, 14, 10, 28, 25, 29, 23,
        18, 58, 33, 24, 11, 27, 18, 39, 14, 50, 47, 18, 28, 30, 18, 14, 39,  2,
         1, 44, 14, 16, 25, 14, 13, 53, 10, 29, 23, 23, 18, 14, 53, 12, 27, 47,
        25, 44, 27, 10, 28,  3, 46,  0])

5) Checking the images

matplotlib can be used to look at the images.

import matplotlib.pyplot as plt

import matplotlib.image as mpimg


def process(filename: str=None) -> None:

    image = mpimg.imread(filename)

    plt.figure()
    
    plt.imshow(image)



idx = np.random.choice(len(train_image), 10)


images = train_image[:][idx]

target = train_target[:][idx]


for file, trgt in zip(images, target):
    
    process(file)

4️⃣ Model

1) Define Model

The model is a CNN.

class ConvolutionalNeuralNetworkClass(nn.Module):

    """

        Convolutional Neural Network (CNN) Class
        
    """
    
    def __init__(self,name='cnn',xdim=[1,28,28],
    
                 ksize=3,cdims=[32,64],hdims=[1024,128],ydim=10,
                 
                 USE_BATCHNORM=False):
                 
        super(ConvolutionalNeuralNetworkClass,self).__init__()
        
        self.name = name
        
        self.xdim = xdim
        
        self.ksize = ksize
        
        self.cdims = cdims
        
        self.hdims = hdims
        
        self.ydim = ydim
        
        self.USE_BATCHNORM = USE_BATCHNORM
        

        # Convolutional layers
        
        self.layers = []
        
        prev_cdim = self.xdim[0]
        
        for cdim in self.cdims: # for each hidden layer
        
            self.layers.append(
            
                nn.Conv2d(
                
                    in_channels = prev_cdim,    # previous channel dimensions
                    
                    out_channels= cdim,         # target (output) channel dimensions
                    
                    kernel_size = self.ksize,   # kernel size
                    
                    stride=(1,1),
                    
                    padding=self.ksize//2
                    
                )) # convolution

            if self.USE_BATCHNORM:              # whether to use batch norm
            
                self.layers.append(nn.BatchNorm2d(cdim)) # batch-norm
                
            self.layers.append(nn.ReLU(True))  # activation
            
            self.layers.append(nn.MaxPool2d(kernel_size=(2,2), stride=(2,2))) # max-pooling
            
            # kernel_size=(2,2), stride=(2,2) => height and width are halved at each block
             
            self.layers.append(nn.Dropout2d(p=0.5))  # dropout
            
            prev_cdim = cdim
            

        # Dense layers
        
        self.layers.append(nn.Flatten())    # flatten to one vector per sample
        
        prev_hdim = prev_cdim*(self.xdim[1]//(2**len(self.cdims)))*(self.xdim[2]//(2**len(self.cdims)))
        
        for hdim in self.hdims:
        
            self.layers.append(nn.Linear(prev_hdim, hdim, bias=True))
            
            self.layers.append(nn.ReLU(True))  # activation
            
            prev_hdim = hdim
            
        # Final layer (without activation)
        
        # map the last hidden output to ydim class scores
        
        self.layers.append(nn.Linear(prev_hdim,self.ydim,bias=True))
        

        # Concatenate all layers 
        
        # self.layers is a list, so its layers are stacked one by one

        self.net = nn.Sequential()    # net is defined as a Sequential
        
        for l_idx,layer in enumerate(self.layers):
        
            layer_name = "%s_%02d"%(type(layer).__name__.lower(),l_idx)
            
            self.net.add_module(layer_name,layer)   # register each layer in net; add_module lets us give the layer a name
            
        self.init_param() # initialize parameters
        
        
    def init_param(self):
      
        for m in self.modules():
        
            if isinstance(m,nn.Conv2d): # init conv
            
                nn.init.kaiming_normal_(m.weight)
                
                nn.init.zeros_(m.bias)
                
            elif isinstance(m,nn.BatchNorm2d): # init BN
            
                nn.init.constant_(m.weight,1)
                
                nn.init.constant_(m.bias,0)
                
            elif isinstance(m,nn.Linear): # init dense
            
                nn.init.kaiming_normal_(m.weight)
                
                nn.init.zeros_(m.bias)
                
            
    def forward(self,x):
    
        return self.net(x)

For this CNN, xdim is [3, 28, 28] (3 RGB channels, 28 x 28 pixels), cdims is [32, 64, 128] (three convolutions: 32 channels => 64 channels => 128 channels), and hdims is [3028, 256], ending at ydim=62 (52 uppercase/lowercase letters + 10 digits).

The loss is CrossEntropyLoss and the optimizer is Adam.

C = ConvolutionalNeuralNetworkClass(

    name='cnn',xdim=[3,28,28],ksize=3,cdims=[32,64, 128],
    
    hdims=[3028, 256],ydim=62).to(device) 
    
loss = nn.CrossEntropyLoss()

optm = optim.Adam(C.parameters(),lr=1e-3)
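The input size of the first Linear layer comes from the prev_hdim formula in the class: every conv block ends in a 2x2 max-pool, so height and width are integer-halved once per entry in cdims. Below is a small sketch of that calculation for the configuration above; flattened_size is a hypothetical helper that only mirrors the formula.

```python
def flattened_size(xdim, cdims):
    """Input size of the first Linear layer, mirroring the prev_hdim formula in the class."""
    _channels, height, width = xdim
    n_pools = len(cdims)                       # one 2x2 max-pool per conv block
    return cdims[-1] * (height // 2 ** n_pools) * (width // 2 ** n_pools)


# Spatial size after 0, 1, 2 and 3 pooling steps (integer division).
sizes = [28 // 2 ** k for k in range(4)]
print(sizes, flattened_size([3, 28, 28], [32, 64, 128]))
```

The result, 128 channels x 3 x 3 = 1152, is the second dimension of the first dense layer's weight matrix.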

2) Checking the parameters

The total number of parameters is 4,375,890.

np.set_printoptions(precision=3)

n_param = 0

for p_idx,(param_name,param) in enumerate(C.named_parameters()):

    if param.requires_grad:
    
        param_numpy = param.detach().cpu().numpy() # to numpy array 
        
        n_param += len(param_numpy.reshape(-1))
        
        print ("[%d] name:[%s] shape:[%s]."%(p_idx,param_name,param_numpy.shape))
        
print ("Total number of parameters:[%s]."%(format(n_param,',d')))

[0] name:[net.conv2d_00.weight] shape:[(32, 3, 3, 3)].
[1] name:[net.conv2d_00.bias] shape:[(32,)].
[2] name:[net.conv2d_04.weight] shape:[(64, 32, 3, 3)].
[3] name:[net.conv2d_04.bias] shape:[(64,)].
[4] name:[net.conv2d_08.weight] shape:[(128, 64, 3, 3)].
[5] name:[net.conv2d_08.bias] shape:[(128,)].
[6] name:[net.linear_13.weight] shape:[(3028, 1152)].
[7] name:[net.linear_13.bias] shape:[(3028,)].
[8] name:[net.linear_15.weight] shape:[(256, 3028)].
[9] name:[net.linear_15.bias] shape:[(256,)].
[10] name:[net.linear_17.weight] shape:[(62, 256)].
[11] name:[net.linear_17.bias] shape:[(62,)].
Total number of parameters:[4,375,890].
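As a cross-check, the total can be recomputed from the printed shapes alone: each tensor contributes the product of its dimensions. The sketch below does that with the shapes copied from the output above and also shows how much of the total sits in the first dense layer.

```python
from math import prod

# Shapes copied from the printed parameter list above.
shapes = [
    (32, 3, 3, 3), (32,),
    (64, 32, 3, 3), (64,),
    (128, 64, 3, 3), (128,),
    (3028, 1152), (3028,),
    (256, 3028), (256,),
    (62, 256), (62,),
]

counts = [prod(shape) for shape in shapes]
total = sum(counts)
first_dense_share = (counts[6] + counts[7]) / total   # linear_13 weight + bias

print(format(total, ",d"))
print(f"first dense layer share: {first_dense_share:.1%}")
```

Most of the parameters belong to the first dense layer, so its input size (1152) and hdims[0] (3028) dominate the model size.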

5️⃣ Evaluation function

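The most important point in the evaluation function is to switch the model to eval() mode. The loop is also wrapped in torch.no_grad(), so no gradients are tracked during evaluation.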

def func_eval(model,data_iter,device):

    with torch.no_grad():
    
        n_total,n_correct = 0,0
        
        model.eval() # evaluation mode: dropout is disabled and BN uses its learned statistics
        
        for batch_in,batch_out in data_iter:
        
            y_trgt = batch_out.to(device)
            
            model_pred = model.forward(batch_in.view(-1,3,28,28).to(device))
            
            _,y_pred = torch.max(model_pred.data,1)
            
            n_correct += (y_pred==y_trgt).sum().item()
            
            n_total += batch_in.size(0)
            
        val_accr = (n_correct/n_total)
        
        model.train() # back to train mode 
        
    return val_accr
    
print ("Done")
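func_eval sums correct predictions and sample counts over all batches and divides once at the end, so a smaller final batch is weighted correctly. The following plain-Python sketch mirrors that bookkeeping without PyTorch; argmax and accuracy are hypothetical helpers and the scores are made up.

```python
def argmax(scores):
    """Index of the largest score in one row (the role of the indices from torch.max(..., 1))."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy(batches):
    """batches: iterable of (score_rows, target_indices), mirroring func_eval's loop."""
    n_total, n_correct = 0, 0
    for score_rows, targets in batches:
        predictions = [argmax(row) for row in score_rows]
        n_correct += sum(p == t for p, t in zip(predictions, targets))
        n_total += len(targets)
    return n_correct / n_total


# Two toy batches of different sizes with 3 classes (made-up scores).
toy_batches = [
    ([[0.1, 2.0, 0.3], [1.5, 0.2, 0.1], [0.0, 0.1, 0.9]], [1, 0, 0]),
    ([[0.2, 0.1, 3.0]], [2]),
]
print(accuracy(toy_batches))
```

Averaging per-batch accuracies instead would give (2/3 + 1) / 2, which overweights the single-sample batch.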

6️⃣ Making CSV function

To upload to Kaggle, the predictions have to be saved as a CSV file.

The model's forward outputs are collected in a pandas Series and written out as CSV.

from pandas import Series, DataFrame


def make_csv(model,data_iter,device):

    with torch.no_grad():
    
        
        model.eval() # evaluation mode: dropout is disabled and BN uses its learned statistics
        

        prev = 0
        
        total_bmp_dict = {}
        
        idx2class = dict([(value, key) for key, value in train_iter.dataset.class_to_idx.items()])
        
        
        for batch_in, batch_out in data_iter:
        
            model_pred = model.forward(batch_in.view(-1,3,28,28).to(device))
            
            _,y_pred = torch.max(model_pred.data,1)
            
            bmp_name = [ x[0].split("/")[3].split(".")[0] for x in data_iter.dataset.samples[prev:prev+batch_in.shape[0]]]
            
            bmp_idx = [idx2class[pred] for pred in y_pred.tolist()]
            
            bmp_dict = dict([(key, value) for key,value in zip(bmp_name, bmp_idx)])
            
            total_bmp_dict.update(bmp_dict)
            
            prev += batch_in.shape[0]
            

        bmp_series = Series(total_bmp_dict)
        
        bmp_series.to_csv('submission.csv')
        
        print("submission.csv is generated")
        

        model.train() # back to train mode 
        
print ("Done")
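make_csv writes the Series without naming its columns. If an explicit header is wanted, a sketch with the standard csv module could look like the following. The ID and Class header is an assumption that mirrors the columns of trainLabels.csv, write_submission is a hypothetical helper, and the predictions are made up.

```python
import csv
import io


def write_submission(predictions, out):
    """Write {image_id: predicted_class} as ID,Class rows sorted by numeric ID."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ID", "Class"])           # assumed header, mirroring trainLabels.csv
    for image_id in sorted(predictions, key=int):
        writer.writerow([image_id, predictions[image_id]])


# Made-up predictions for three of the test IDs printed earlier.
toy_predictions = {"10329": "A", "9002": "b", "6324": "7"}
buffer = io.StringIO()
write_submission(toy_predictions, buffer)
print(buffer.getvalue())
```

Sorting with key=int keeps 9002 ahead of 10329, which a plain string sort would not.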

7️⃣ Train

1) Initial Evaluation

After initializing the model parameters, measure accuracy on the train data and the test data.

The model is untrained at this point, so train_accr = 0.014 and test_accr = 0.010, close to the chance level of 1/62 ≈ 0.016.

C.init_param() # initialize parameters

train_accr = func_eval(C,train_iter,device)

test_accr = func_eval(C,test_iter,device)


>>> print ("train_accr:[%.3f] test_accr:[%.3f]."%(train_accr,test_accr))

train_accr:[0.014] test_accr:[0.010].

2) Train

Train the model for 200 epochs.

print ("Start training.")

C.init_param() # initialize parameters

C.train() # to train mode 

EPOCHS,print_every = 200,10

for epoch in range(EPOCHS):

    loss_val_sum = 0
    
    for batch_in,batch_out in train_iter:
    
        # Forward path
        
        y_pred = C.forward(batch_in.view(-1,3,28,28).to(device))
        
        loss_out = loss(y_pred,batch_out.to(device))
        
        # Update
        
        optm.zero_grad()      # reset gradient 
        
        loss_out.backward()      # compute the gradients

        optm.step()      # the optimizer applies the update to each layer

        loss_val_sum += loss_out.item()      # .item() accumulates a plain float instead of a tensor
        
    loss_val_avg = loss_val_sum/len(train_iter)
    

    if epoch==(EPOCHS-1) :
    
        make_csv(C,upload_iter,device)
        
    
    # Print     
    
    if ((epoch%print_every)==0 or epoch==(EPOCHS-1)) :
    
        train_accr = func_eval(C,train_iter,device)
        
        test_accr = func_eval(C,test_iter,device)
        
        print ("epoch:[%d] loss:[%.3f] train_accr:[%.3f] test_accr:[%.3f]."%
        
               (epoch,loss_val_avg,train_accr,test_accr))
               
print ("Done")

At epoch 199 the run reaches loss:[0.115] train_accr:[0.998] test_accr:[0.963].

Start training.
epoch:[0] loss:[5.770] train_accr:[0.050] test_accr:[0.052].
epoch:[10] loss:[3.790] train_accr:[0.097] test_accr:[0.094].
epoch:[20] loss:[3.003] train_accr:[0.383] test_accr:[0.381].
epoch:[30] loss:[1.850] train_accr:[0.664] test_accr:[0.657].
epoch:[40] loss:[1.317] train_accr:[0.769] test_accr:[0.767].
epoch:[50] loss:[1.008] train_accr:[0.840] test_accr:[0.830].
epoch:[60] loss:[0.770] train_accr:[0.890] test_accr:[0.874].
epoch:[70] loss:[0.620] train_accr:[0.925] test_accr:[0.906].
epoch:[80] loss:[0.475] train_accr:[0.950] test_accr:[0.926].
epoch:[90] loss:[0.378] train_accr:[0.958] test_accr:[0.928].
epoch:[100] loss:[0.309] train_accr:[0.971] test_accr:[0.941].
epoch:[110] loss:[0.257] train_accr:[0.983] test_accr:[0.951].
epoch:[120] loss:[0.233] train_accr:[0.989] test_accr:[0.954].
epoch:[130] loss:[0.208] train_accr:[0.992] test_accr:[0.957].
epoch:[140] loss:[0.171] train_accr:[0.992] test_accr:[0.959].
epoch:[150] loss:[0.173] train_accr:[0.994] test_accr:[0.960].
epoch:[160] loss:[0.150] train_accr:[0.995] test_accr:[0.959].
epoch:[170] loss:[0.117] train_accr:[0.995] test_accr:[0.961].
epoch:[180] loss:[0.125] train_accr:[0.996] test_accr:[0.961].
epoch:[190] loss:[0.123] train_accr:[0.998] test_accr:[0.960].
epoch:[199] loss:[0.115] train_accr:[0.998] test_accr:[0.963].
submission.csv is generated.
Done
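A useful reference point for this log is a model that assigns equal probability to all 62 classes. The sketch below computes its cross-entropy and accuracy; it is a baseline calculation only, not something measured in this post.

```python
import math

n_classes = 62
uniform_loss = -math.log(1 / n_classes)   # cross-entropy when every class gets probability 1/62
chance_accuracy = 1 / n_classes

print(round(uniform_loss, 3), round(chance_accuracy, 3))
```

The epoch 0 loss of 5.770 is above this baseline of about 4.127 and the epoch 10 loss of 3.790 is only slightly below it, which fits the low accuracy in the first epochs.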

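8️⃣ Results and discussion

1) Kaggle result

I uploaded the finished CSV to Kaggle; the Private Score was 74%.

2) Discussion and next steps

Having seen MNIST reach loss:[0.037] train_accr:[0.997] test_accr:[0.992] in just 10 epochs, I first tested with only about 10 epochs. As the log line below shows, accuracy at epoch 10 is only about 9%.

Accuracy kept coming out at 0.07 to 0.09, so I suspected the model or its parameters and kept rerunning, but that did not solve it. After I talked it over in the peer session, my teammates said to keep training until overfitting sets in, so I ran 200 epochs and train_accr climbed to 0.998.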

epoch:[10] loss:[3.790] train_accr:[0.097] test_accr:[0.094].

epoch:[199] loss:[0.115] train_accr:[0.998] test_accr:[0.963].

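The parameters are xdim=[3,28,28], ksize=3, cdims=[32, 64, 128], hdims=[3028, 256], ydim=62. xdim and ydim are fixed by the data; the reasons for the other values are as follows.

ksize=3: as I understand it, 3 is the default kernel size and gives good performance while minimizing information loss. I need to look into this point again.

cdims=[32, 64, 128]: three convolution layers, with the channel count growing 32 => 64 => 128.

In the peer session we discussed the performance gap between 2 and 3 layers. In my runs, the 3-layer CNN's accuracy jumped sharply around epochs 20-30, while the 2-layer one climbed more slowly.

So in the end the 3-layer model had far fewer parameters than the 2-layer one (with the same hdims, the extra pooling step shrinks the flattened input to the first dense layer), and it reached a good accuracy faster.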

epoch:[20] loss:[3.418] train_accr:[0.284] test_accr:[0.263].

epoch:[30] loss:[2.419] train_accr:[0.543] test_accr:[0.481].
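The parameter comparison between 2 and 3 conv layers can be checked by counting weights directly from the architecture. The sketch below does this; count_parameters is a hypothetical helper, and since this post does not give the exact 2-layer configuration, cdims=[32, 64] with the same hdims is an assumption.

```python
def count_parameters(xdim, cdims, hdims, ydim, ksize=3):
    """Count weights and biases of the conv/dense stack defined in this post (no batch norm)."""
    total = 0
    prev_c = xdim[0]
    for c in cdims:                                   # Conv2d: c * prev_c * k * k weights + c biases
        total += c * prev_c * ksize * ksize + c
        prev_c = c
    shrink = 2 ** len(cdims)                          # one 2x2 max-pool per conv block
    prev_h = prev_c * (xdim[1] // shrink) * (xdim[2] // shrink)
    for h in list(hdims) + [ydim]:                    # Linear: h * prev_h weights + h biases
        total += h * prev_h + h
        prev_h = h
    return total


three_layer = count_parameters([3, 28, 28], [32, 64, 128], [3028, 256], 62)
two_layer = count_parameters([3, 28, 28], [32, 64], [3028, 256], 62)   # assumed 2-layer variant
print(format(three_layer, ",d"), format(two_layer, ",d"))
```

Under that assumption the 2-layer variant flattens to 64 x 7 x 7 = 3136 features instead of 1152, so its first dense layer alone is much larger than in the 3-layer model.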

A Private Score of 74% is a bit disappointing. Data augmentation through transforms would probably improve it a little. When I applied crop and horizontal flip to test_iter as below, the test accuracy came out around 0.725, close to the accuracy of the Kaggle upload. I plan to write up transforms in more detail later.

transforms.CenterCrop(28)

transforms.RandomHorizontalFlip()

Start training.
epoch:[0] loss:[6.458] train_accr:[0.069] test_accr:[0.062].
epoch:[10] loss:[3.795] train_accr:[0.073] test_accr:[0.074].
epoch:[20] loss:[3.418] train_accr:[0.284] test_accr:[0.263].
epoch:[30] loss:[2.419] train_accr:[0.543] test_accr:[0.481].
epoch:[40] loss:[1.726] train_accr:[0.684] test_accr:[0.603].
epoch:[50] loss:[1.336] train_accr:[0.772] test_accr:[0.659].
epoch:[60] loss:[1.083] train_accr:[0.830] test_accr:[0.682].
epoch:[70] loss:[0.877] train_accr:[0.878] test_accr:[0.703].
epoch:[80] loss:[0.726] train_accr:[0.911] test_accr:[0.707].
epoch:[90] loss:[0.546] train_accr:[0.930] test_accr:[0.710].
epoch:[100] loss:[0.488] train_accr:[0.953] test_accr:[0.722].
epoch:[110] loss:[0.382] train_accr:[0.959] test_accr:[0.728].
epoch:[120] loss:[0.309] train_accr:[0.982] test_accr:[0.726].
epoch:[130] loss:[0.284] train_accr:[0.977] test_accr:[0.731].
epoch:[140] loss:[0.227] train_accr:[0.987] test_accr:[0.725].
epoch:[150] loss:[0.241] train_accr:[0.986] test_accr:[0.731].
epoch:[160] loss:[0.175] train_accr:[0.989] test_accr:[0.719].
epoch:[170] loss:[0.165] train_accr:[0.990] test_accr:[0.726].
epoch:[180] loss:[0.165] train_accr:[0.991] test_accr:[0.722].
epoch:[190] loss:[0.139] train_accr:[0.994] test_accr:[0.725].
submission.csv is generated
epoch:[199] loss:[0.127] train_accr:[0.997] test_accr:[0.725].
Done

I split the data into test and train sets; training on all of it would probably raise performance further.
