RNN: Document Classification with Deep Learning (6)

10.4 Improving Performance with LSTM, Bi-LSTM, and GRU

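  • LSTM: adds a cell state that carries long-term memory, which lets the network learn long-range dependencies
  • GRU (Gated Recurrent Unit): a simplified variant of LSTM that needs less computation and trains faster while still performing well
  • Bi-LSTM: reads the sequence in both directions so that the influence of the backward (right-to-left) context is modeled as well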
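
To make the "less computation" claim a bit more concrete, the sketch below counts trainable parameters by hand. It assumes the standard Keras layouts: four weight blocks per LSTM, and three per GRU with the `reset_after=True` default, which uses two bias vectors per block. The helper names (`lstm_params`, `gru_params`, `bidirectional`) are made up for this illustration. With a 64-dimensional embedding and 64 units, the Bi-LSTM count matches the 66,048 reported by `model.summary()` later in this section.

```python
def lstm_params(input_dim, units):
    # 4 blocks (input, forget, output gates + cell candidate), each with
    # input weights, recurrent weights, and one bias vector
    return 4 * (input_dim * units + units * units + units)


def gru_params(input_dim, units, reset_after=True):
    # 3 blocks (update, reset gates + candidate); with reset_after=True
    # (assumed Keras default) each block carries two bias vectors
    biases = 2 * units if reset_after else units
    return 3 * (input_dim * units + units * units + biases)


def bidirectional(params):
    # a bidirectional wrapper holds one forward and one backward copy
    return 2 * params


emb_dim, units = 64, 64
print('LSTM    :', lstm_params(emb_dim, units))
print('Bi-LSTM :', bidirectional(lstm_params(emb_dim, units)))
print('GRU     :', gru_params(emb_dim, units))
print('Bi-GRU  :', bidirectional(gru_params(emb_dim, units)))
```

Under these assumptions a GRU layer has roughly three quarters of the parameters of an LSTM layer of the same width, which is one reason it tends to run faster.
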
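
import numpy as np
from tensorflow.keras import optimizers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

# max_words, X_train, y_train, X_test, y_test and plot_results() are assumed
# to be defined in the earlier parts of this series.
# Embedding -> bidirectional LSTM -> dense classifier with a sigmoid output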

model = Sequential([
    Embedding(max_words, 64),
    Bidirectional(LSTM(64)),
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid')
])
model.summary()

model.compile(optimizer=optimizers.Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['acc'])

history = model.fit(X_train, y_train,
                    epochs=8,
                    verbose=0,
                    validation_split=0.2)

plot_results(history, 'acc')

# Evaluate the performance of the trained model on the test set
score = model.evaluate(X_test, y_test)
print(f'Test accuracy: {score[1]:.3f}')


"""
Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_4 (Embedding)     (None, None, 64)          640000    
                                                                 
 bidirectional_2 (Bidirectio  (None, 128)              66048     
 nal)                                                            
                                                                 
 dense_7 (Dense)             (None, 64)                8256      
                                                                 
 dense_8 (Dense)             (None, 1)                 65        
                                                                 
=================================================================
Total params: 714,369
Trainable params: 714,369
Non-trainable params: 0
_________________________________________________________________
"""

y_pred = np.round(model.predict(X_test[:10]))
for pred, y_t in zip(y_pred, y_test[:10]):
    print(f'predicted value: {pred[0]}, true value: {y_t}, so the prediction is {pred[0] == y_t}')
    
"""
1/1 [==============================] - 3s 3s/step
predicted value: 0.0, true value: 0, so the prediction is True
predicted value: 1.0, true value: 1, so the prediction is True
predicted value: 1.0, true value: 1, so the prediction is True
predicted value: 0.0, true value: 0, so the prediction is True
predicted value: 0.0, true value: 1, so the prediction is False
predicted value: 0.0, true value: 1, so the prediction is False
predicted value: 0.0, true value: 0, so the prediction is True
predicted value: 1.0, true value: 0, so the prediction is False
predicted value: 1.0, true value: 0, so the prediction is False
predicted value: 0.0, true value: 0, so the prediction is True
"""
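
As a quick sanity check on the ten predictions printed above, the sketch below recounts them by hand. The two lists are copied from that output; the variable names are chosen only for this illustration.

```python
# predicted and true labels, copied from the printed output above
preds = [0, 1, 1, 0, 0, 0, 0, 1, 1, 0]
trues = [0, 1, 1, 0, 1, 1, 0, 0, 0, 0]

pairs = list(zip(preds, trues))
correct = sum(p == t for p, t in pairs)
false_pos = sum(p == 1 and t == 0 for p, t in pairs)  # predicted 1, actually 0
false_neg = sum(p == 0 and t == 1 for p, t in pairs)  # predicted 0, actually 1

accuracy = correct / len(pairs)
print(f'{correct}/{len(pairs)} correct, accuracy = {accuracy:.2f}')
print(f'false positives: {false_pos}, false negatives: {false_neg}')
```

Six of the ten samples are classified correctly, with the errors split evenly between false positives and false negatives. Ten samples are far too few to judge the model, so the test accuracy from `model.evaluate` remains the number to rely on.
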

※ These notes summarize what I studied from the book <파이썬 텍스트 마이닝 완벽 가이드>.
