
Chapter 9. Text Classification with Recurrent Neural Networks (5)


9-4 Text Classification with an LSTM Recurrent Neural Network

- LSTM

Designed to capture relationships between words that are many time steps apart.

Devised by Hochreiter and Schmidhuber in 1997, the LSTM overcomes the vanishing gradient problem and can successfully model long sequences.

(Figure: structure of the LSTM cell)
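
To make the cell structure concrete, here is a minimal NumPy sketch of a single LSTM time step (not the book's code; the names and the i/f/g/o gate layout are illustrative):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def lstm_cell_step(x_t, h_prev, c_prev, W, U, b):
    # One LSTM step. W: (input_dim, 4*units), U: (units, 4*units), b: (4*units,)
    units = h_prev.shape[0]
    z = x_t @ W + h_prev @ U + b          # all four gates computed in one product
    i = sigmoid(z[:units])                # input gate: how much new info to write
    f = sigmoid(z[units:2 * units])       # forget gate: how much old state to keep
    g = np.tanh(z[2 * units:3 * units])   # candidate cell state
    o = sigmoid(z[3 * units:])            # output gate: how much state to expose
    c_t = f * c_prev + i * g              # cell state: the long-term memory path
    h_t = o * np.tanh(c_t)                # hidden state: the cell's output
    return h_t, c_t

Because the cell state c_t is updated additively rather than repeatedly squashed through an activation, gradients can survive across many time steps, which is what mitigates the vanishing gradient problem.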

- Building an LSTM recurrent neural network with TensorFlow

1. Build the LSTM recurrent neural network

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

model_lstm = Sequential()

model_lstm.add(Embedding(1000, 32))             # 1,000-word vocabulary, 32-dim embeddings
model_lstm.add(LSTM(8))                         # 8 LSTM units
model_lstm.add(Dense(1, activation='sigmoid'))  # binary positive/negative output

model_lstm.summary()


"""
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_1 (Embedding)     (None, None, 32)          32000     
                                                                 
 lstm (LSTM)                 (None, 8)                 1312      
                                                                 
 dense_2 (Dense)             (None, 1)                 9         
                                                                 
=================================================================
Total params: 33,321
Trainable params: 33,321
Non-trainable params: 0
_________________________________________________________________
"""


2. Train the model

model_lstm.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# x_train_seq / x_val_seq are the padded sequences prepared in the previous sections
history = model_lstm.fit(x_train_seq, y_train, epochs=10, batch_size=32,
                         validation_data=(x_val_seq, y_val))

"""
Epoch 1/10
625/625 [==============================] - 27s 39ms/step - loss: 0.4462 - accuracy: 0.8015 - val_loss: 0.3765 - val_accuracy: 0.8376
Epoch 2/10
625/625 [==============================] - 24s 38ms/step - loss: 0.3355 - accuracy: 0.8598 - val_loss: 0.3745 - val_accuracy: 0.8422
Epoch 3/10
625/625 [==============================] - 23s 37ms/step - loss: 0.3124 - accuracy: 0.8705 - val_loss: 0.3617 - val_accuracy: 0.8422
Epoch 4/10
625/625 [==============================] - 22s 34ms/step - loss: 0.2948 - accuracy: 0.8770 - val_loss: 0.3690 - val_accuracy: 0.8352
Epoch 5/10
625/625 [==============================] - 23s 36ms/step - loss: 0.2787 - accuracy: 0.8859 - val_loss: 0.3740 - val_accuracy: 0.8394
Epoch 6/10
625/625 [==============================] - 23s 36ms/step - loss: 0.2720 - accuracy: 0.8860 - val_loss: 0.4034 - val_accuracy: 0.8324
Epoch 7/10
625/625 [==============================] - 23s 37ms/step - loss: 0.2581 - accuracy: 0.8921 - val_loss: 0.3881 - val_accuracy: 0.8324
Epoch 8/10
625/625 [==============================] - 24s 38ms/step - loss: 0.2456 - accuracy: 0.8982 - val_loss: 0.4137 - val_accuracy: 0.8256
Epoch 9/10
625/625 [==============================] - 22s 35ms/step - loss: 0.2371 - accuracy: 0.9015 - val_loss: 0.4231 - val_accuracy: 0.8342
Epoch 10/10
625/625 [==============================] - 23s 37ms/step - loss: 0.2259 - accuracy: 0.9083 - val_loss: 0.4448 - val_accuracy: 0.8306
"""


3. Plot the loss and accuracy curves

import matplotlib.pyplot as plt

# Training vs. validation loss
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.legend()
plt.show()

# Training vs. validation accuracy
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='val accuracy')
plt.legend()
plt.show()

4. Evaluate accuracy on the validation set

loss, accuracy = model_lstm.evaluate(x_val_seq, y_val, verbose=0)
print(accuracy)


## Output: 0.8306000232696533
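
As a quick sanity check, the trained model can also score individual reviews; a minimal sketch, reusing the first few padded validation sequences:

# Sigmoid outputs near 1 mean "positive review"; threshold at 0.5 for a label
probs = model_lstm.predict(x_val_seq[:3])
print((probs > 0.5).astype(int).ravel())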

※ These are my study notes, based on the contents of <Do it! 딥러닝 입문>.
