AI Study (MLP)

First came the perceptron, but it had a shortcoming: it could not solve the XOR problem.

https://humphryscomputing.com/Notes/Neural/single.neural.html

The MLP (Multi-Layer Perceptron) solved it.
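To get a feel for why a single perceptron fails on XOR, here is a small sketch that brute-forces a step-activation perceptron over a coarse grid of weights. The grid range, the step activation, and the helper names (step_perceptron, count_solutions) are my own assumptions for illustration. A grid search is not a proof, but it shows that AND has plenty of solutions while XOR has none.

```python
import itertools
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = {
    "AND": np.array([0, 0, 0, 1]),
    "XOR": np.array([0, 1, 1, 0]),
}

def step_perceptron(X, w1, w2, b):
    # Classic threshold unit: output 1 when the weighted sum is positive.
    return ((X[:, 0] * w1 + X[:, 1] * w2 + b) > 0).astype(int)

def count_solutions(y, grid):
    # Count the (w1, w2, b) triples on the grid that reproduce y exactly.
    return sum(
        np.array_equal(step_perceptron(X, w1, w2, b), y)
        for w1, w2, b in itertools.product(grid, repeat=3)
    )

grid = np.linspace(-2.0, 2.0, 9)  # assumed search range: -2, -1.5, ..., 2
solutions = {name: count_solutions(y, grid) for name, y in targets.items()}
print(solutions)
```

XOR would need b <= 0, w1 + b > 0, w2 + b > 0 and w1 + w2 + b <= 0 all at once, and those four conditions contradict each other, so no choice of weights can work.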

Today I'll implement this MLP in numpy and train it to solve the XOR problem,
then implement it more easily in pytorch and solve the same XOR problem.

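My plan is to stack two perceptrons and train them, so first I need to implement a perceptron.

What is a perceptron again? Time to google it..

You can find an image like this one. This is a perceptron.

It multiplies the inputs by the weights, adds the bias to that sum, and feeds the result into an activation function to get the output.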
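As a quick worked example of that description, here is a single perceptron evaluated by hand. The input, weight, and bias values are made up for illustration, and sigmoid is assumed as the activation function.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 0.0])    # inputs x1, x2
w = np.array([0.5, -0.3])   # one weight per input (illustrative values)
b = 0.1                     # a single bias

weighted_sum = np.dot(x, w) + b   # 1.0*0.5 + 0.0*(-0.3) + 0.1 = 0.6
output = sigmoid(weighted_sum)    # squashed into the range (0, 1)
print(weighted_sum, output)
```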

First, the parameters it needs: one bias, and as many weights as there are inputs.

The functions it needs: an activation function, forward (prediction), and backward (training).

Now let's implement it.

import numpy as np

# XOR data
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
outputs = np.array([[0], [1], [1], [0]])

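First import numpy, then write down the XOR data used for training.

Suppose the perceptron's input size is input_size and its output size is output_size.

The part that can be confusing: the input size is n in the diagram above. x1, x2, x3, ... are packed into a single vector, and the length of that vector is the input size.

Likewise, each node has a single arrow coming out of it, so the output size is the number of nodes in one layer.. (Am I the only one who found this confusing?)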

# Perceptron (one object represents a whole layer)
class Perceptron:
    def __init__(self, input_size, output_size, learning_rate=0.05):
        self.input_size = input_size
        self.output_size = output_size
        self.weights = np.random.randn(input_size, output_size)
        self.bias = np.random.randn(1, output_size)
        self.learning_rate = learning_rate

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, x):
        # x must already be a sigmoid output: sigma'(z) = sigma(z) * (1 - sigma(z))
        return x * (1 - x)

    def forward(self, inputs):
        # Cache the inputs and the prediction; backward needs both.
        self.inputs = inputs
        self.linear_output = inputs @ self.weights + self.bias
        self.y_pred = self.sigmoid(self.linear_output)
        return self.y_pred

    # out layer backward
    def out_layer_backward(self, y_true):
        # Gradient of the squared error with respect to the prediction
        dLdy = -2 * (y_true - self.y_pred) / self.output_size
        dLds = dLdy * self.sigmoid_derivative(self.y_pred)
        dLdw = self.inputs.T @ dLds
        # Gradient handed to the previous layer, computed before the weights change
        r = dLds @ self.weights.T
        self.weights -= dLdw * self.learning_rate
        self.bias -= np.mean(dLds, axis=0) * self.learning_rate
        return r

    # hidden layer backward
    def hidden_layer_backward(self, grad):
        # grad is the value returned by the next layer's backward
        db = grad * self.sigmoid_derivative(self.y_pred)
        dw = self.inputs.T @ db
        r = db
        self.weights -= dw * self.learning_rate
        self.bias -= np.mean(db, axis=0) * self.learning_rate
        return r

Say the input vector is 1 x input_size and the output vector is 1 x output_size. Input times weights has to come out at the output vector's size, and (1 x input_size) @ (input_size x output_size) = (1 x output_size),
so the weights have size input_size x output_size.
The bias is not multiplied by anything, only added, so it has the same size as the output vector: 1 x output_size.
The constructor also takes the learning rate used for training.
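The shape reasoning can be checked directly in numpy. This is only a sketch: the sizes (2 inputs, 3 outputs, a batch of 4) and the seeded random generator are arbitrary choices of mine, not values from the code in this post.

```python
import numpy as np

input_size, output_size, batch = 2, 3, 4   # illustrative sizes

rng = np.random.default_rng(0)
weights = rng.standard_normal((input_size, output_size))
bias = rng.standard_normal((1, output_size))

single = rng.standard_normal((1, input_size))     # one sample
many = rng.standard_normal((batch, input_size))   # a batch of samples

out_single = single @ weights + bias   # (1, 2) @ (2, 3) + (1, 3)
out_many = many @ weights + bias       # the bias row broadcasts across all 4 rows

print(out_single.shape, out_many.shape)
```

The same weights and bias work for one sample or a whole batch, because only the first (row) dimension changes.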

The sigmoid and its derivative I just googled and copy-pasted..
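One detail worth checking about the copy-pasted derivative: x * (1 - x) is only the sigmoid derivative when x is already a sigmoid output, which is how the class uses it (it passes y_pred). The sketch below compares it against a finite-difference estimate. The test points and eps are arbitrary choices of mine.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(s):
    # Expects s = sigmoid(x), i.e. the activation, not the raw input x.
    return s * (1 - s)

x = np.linspace(-3.0, 3.0, 7)
eps = 1e-6

# Central finite difference as an independent estimate of the derivative.
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
analytic = sigmoid_derivative(sigmoid(x))
wrong = sigmoid_derivative(x)   # passing the raw input gives a different function

print(np.max(np.abs(numeric - analytic)))
```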

forward implements the perceptron's operation directly in code:
it takes the input, multiplies it by the weights, and adds the bias.
Passing that through the sigmoid gives the output.

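out_layer_backward computes the gradients with the chain rule so the weights can be adjusted.
Each gradient is then multiplied by the learning rate and subtracted from its parameter.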

hidden_layer_backward needs the gradient that the later layer already computed, so it receives that gradient as an argument and continues the chain rule from there.
It then applies the same learning-rate update.
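Written out, the chain rule that the two backward functions follow looks roughly like this. The symbols are my own notation, not from the code: x is the input, W_1, b_1 and W_2, b_2 are the hidden and output layer parameters, h is the hidden output, m is output_size, and \odot is elementwise multiplication.

```latex
\begin{aligned}
s_1 &= x W_1 + b_1, \qquad h = \sigma(s_1), \\
s_2 &= h W_2 + b_2, \qquad \hat{y} = \sigma(s_2), \\
L &= \frac{1}{m} \sum_{j=1}^{m} (y_j - \hat{y}_j)^2, \\[4pt]
\frac{\partial L}{\partial \hat{y}} &= -\frac{2}{m} (y - \hat{y}), \\
\frac{\partial L}{\partial s_2} &= \frac{\partial L}{\partial \hat{y}} \odot \hat{y} \odot (1 - \hat{y}), \\
\frac{\partial L}{\partial W_2} &= h^{\top} \frac{\partial L}{\partial s_2}, \qquad
\frac{\partial L}{\partial h} = \frac{\partial L}{\partial s_2} W_2^{\top}, \\
\frac{\partial L}{\partial s_1} &= \frac{\partial L}{\partial h} \odot h \odot (1 - h), \\
\frac{\partial L}{\partial W_1} &= x^{\top} \frac{\partial L}{\partial s_1}.
\end{aligned}
```

In the code these correspond to dLdy, dLds, dLdw and r in out_layer_backward, and to db and dw in hidden_layer_backward. With a batch of n rows, the products h^T(...) and x^T(...) sum the per-sample contributions.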

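About the bias update: the bias has size 1 x output_size, but when training on a batch the gradient has size n (batch size) x output_size, so np.mean reduces it over the batch axis to a single row of output_size values before the bias is updated.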
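Here is a small sketch of that reduction in isolation. The gradient values are made up, and the 0.05 is just the default learning rate from the class above.

```python
import numpy as np

batch, output_size = 4, 1
bias = np.zeros((1, output_size))
dLds = np.array([[0.2], [-0.1], [0.4], [0.3]])   # illustrative per-sample gradients

bias_grad = np.mean(dLds, axis=0)                     # average over the batch axis
bias_grad_2d = np.mean(dLds, axis=0, keepdims=True)   # same values, kept 2-D

print(dLds.shape, bias_grad.shape, bias_grad_2d.shape)

bias -= bias_grad * 0.05   # shape (1,) broadcasts into the (1, 1) bias
print(bias)
```

Note that in the class above the weight gradient (inputs.T @ dLds) sums over the batch while the bias gradient is averaged, so the two updates are scaled differently. Training still worked here, but it is something to keep in mind when tuning the learning rate.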

# hidden layer
hidden_layer = Perceptron(2, 2)

# out layer
out_layer = Perceptron(2, 1)

epochs = 20000

# train
for epoch in range(epochs):
    out_layer.forward(hidden_layer.forward(inputs))
    hidden_layer.hidden_layer_backward(out_layer.out_layer_backward(outputs))

# XOR test
for i in range(4):
    print(out_layer.forward(hidden_layer.forward(inputs[i])))

The hidden_layer and out_layer objects I created look like this.

One hidden_layer with 2 inputs and 2 outputs
One out_layer with 2 inputs and 1 output
It's one object per layer, not one object per circle (that's why the object names end in _layer)!

Now run the test.

[[0.02998189]] [[0.97216824]] [[0.96583395]] [[0.02649352]]

It came out reasonably well: close to 0 for [0,0] and [1,1], and close to 1 for [0,1] and [1,0].
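To turn those raw outputs into 0/1 answers, one option is to threshold them. The sketch below uses the four values printed above. The 0.5 threshold is my assumption, and since the weights are randomly initialized, another run would give different numbers.

```python
import numpy as np

# Outputs printed by the NumPy run above, in the order [0,0], [0,1], [1,0], [1,1].
y_pred = np.array([0.02998189, 0.97216824, 0.96583395, 0.02649352])
y_true = np.array([0, 1, 1, 0])

labels = (y_pred > 0.5).astype(int)     # assumed decision threshold of 0.5
mse = np.mean((y_true - y_pred) ** 2)   # mean squared error of the raw outputs

print(labels, mse)
```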

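What I think are the key points of an MLP
1. Understanding what input size and output size mean
- The inputs are all scalars, but they are packed into one vector, and its length is the input size. The output size is the number of nodes, i.e. the number of circles in the diagram (within one layer).
2. Understanding and applying the chain rule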

import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np

# XOR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y = np.array([[0], [1], [1], [0]], dtype=np.float32)

# Convert the NumPy arrays to PyTorch tensors
X = torch.tensor(X)
y = torch.tensor(y)

# Define the neural network model
class XORModel(nn.Module):
    def __init__(self):
        super(XORModel, self).__init__()
        self.hidden = nn.Linear(2, 2)
        self.output = nn.Linear(2, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.sigmoid(self.hidden(x))
        x = self.sigmoid(self.output(x))
        return x

# Initialize the model
model = XORModel()

# Define the loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Training
num_epochs = 20000
for epoch in range(num_epochs):
    # Forward pass
    outputs = model(X)
    loss = criterion(outputs, y)

    # Backward pass and optimizer step
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Check the results
with torch.no_grad():
    predicted = model(X)
    print(predicted)

tensor([[0.0545], [0.9498], [0.9381], [0.0479]])
The MLP implemented in PyTorch

Create the model: Linear computes the weighted sum, and the activation function is created as well.

In forward, the weighted sum is passed through the activation function.
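For comparison with the NumPy version, here is roughly what nn.Linear followed by nn.Sigmoid computes, written in plain numpy. The weight values are random stand-ins of mine, not the trained parameters. nn.Linear stores its weight as (out_features, in_features) and computes x @ W.T + b, which is the transpose of the (input_size, output_size) layout used in the Perceptron class.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([[0.0, 1.0], [1.0, 1.0]])   # two samples, two features

# PyTorch layout for nn.Linear(2, 2): weight is (out_features, in_features).
W = rng.standard_normal((2, 2))
b = rng.standard_normal(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

linear_out = x @ W.T + b       # the weighted sum that nn.Linear computes
hidden = sigmoid(linear_out)   # what nn.Sigmoid then applies elementwise

print(linear_out.shape, hidden.shape)
```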

nn.Module defines __call__, and that is where forward gets called,

so the forward pass is run by passing the input straight to the model object.
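The mechanism can be sketched in plain Python. TinyModule and Doubler below are hypothetical stand-ins that I made up; this is not PyTorch's actual implementation, which also runs registered hooks around forward.

```python
class TinyModule:
    """Hypothetical, heavily simplified stand-in for nn.Module."""

    def __call__(self, *args, **kwargs):
        # The real nn.Module.__call__ also runs registered hooks; omitted here.
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError


class Doubler(TinyModule):
    def forward(self, x):
        return 2 * x


model = Doubler()
result = model(21)   # calling the object dispatches to forward
print(result == model.forward(21))
```

This is why the training loop writes model(X) rather than model.forward(X). Calling the object is also the recommended style in PyTorch, since it goes through the hook machinery.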

Full code

https://github.com/rdt3784/STUDY_AI/blob/main/MLP.ipynb


Eody
