This project focuses on building a sentiment classification model using the IMDB movie review dataset.
- Processed and cleaned 75,000 movie reviews to ensure high-quality input for training.
- Fine-tuned a Small BERT model with custom encoder layers and an optimized classification head.
- Improved generalization with the AdamW optimizer, learning-rate tuning, and weight decay (see the optimizer sketch below).
- Achieved 72% training accuracy and 75% validation accuracy after fine-tuning.
The project showcases the power of transfer learning in NLP using BERT and TensorFlow.
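The optimizer setup described above might look roughly like the following. This is a minimal sketch; the specific learning-rate and weight-decay values are illustrative assumptions, not the project's actual hyperparameters.

```python
import tensorflow as tf

# Requires TF >= 2.11 (older releases expose AdamW under
# tf.keras.optimizers.experimental). The values below are illustrative,
# not the project's actual hyperparameters.
optimizer = tf.keras.optimizers.AdamW(
    learning_rate=3e-5,   # a typical BERT fine-tuning learning rate
    weight_decay=0.01,    # decoupled weight decay to improve generalization
)
```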
This project demonstrates fine-tuning a BERT model on the Stanford IMDB movie review dataset for binary sentiment classification (positive/negative). The model is built using TensorFlow, TensorFlow Hub, and TensorFlow Text.
- Downloads and preprocesses the IMDB dataset
- Fine-tunes a Small BERT model (`bert_en_uncased_L-4_H-512_A-8`) from TensorFlow Hub
- Splits data into training, validation, and test sets
- Uses `EarlyStopping` and `ModelCheckpoint` callbacks (sketched after this list)
- Saves the model in `.h5`, `.keras`, and TensorFlow `SavedModel` formats
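A minimal sketch of the callback and saving setup listed above. The monitored metric, patience, and file paths are illustrative assumptions rather than values taken from the project.

```python
import tensorflow as tf

# Monitored metric, patience, and paths are illustrative assumptions.
callbacks = [
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=3, restore_best_weights=True),
    tf.keras.callbacks.ModelCheckpoint(
        "checkpoints/best_model.keras", monitor="val_loss", save_best_only=True),
]

# Typical usage, assuming `model`, `train_ds`, and `val_ds` are defined as in
# the other sketches:
# model.fit(train_ds, validation_data=val_ds, epochs=10, callbacks=callbacks)

# Saving in the three formats listed above:
# model.save("sentiment_model.h5")                          # legacy HDF5
# model.save("sentiment_model.keras")                       # native Keras format
# tf.saved_model.save(model, "sentiment_model_savedmodel")  # TensorFlow SavedModel
```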
- TensorFlow
- TensorFlow Hub
- TensorFlow Text
- Keras
- Training: `pos`, `neg`, and `unsup` (unlabeled)
- Test: `pos`, `neg`
- Total files: ~100,000 (see the loading sketch below)
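A minimal sketch of how this directory layout is typically loaded and split with Keras utilities. The extraction path, batch size, split ratio, and seed are assumptions; the `unsup` folder is removed because it contains unlabeled reviews.

```python
import shutil
import tensorflow as tf

# Assumes the archive has been extracted to ./aclImdb (the standard layout of
# the Stanford IMDB dataset). Batch size, split ratio, and seed are illustrative.
dataset_dir = "aclImdb"

# The `unsup` folder holds unlabeled reviews, so it is removed before class
# labels are inferred from the directory structure.
shutil.rmtree(f"{dataset_dir}/train/unsup", ignore_errors=True)

train_ds = tf.keras.utils.text_dataset_from_directory(
    f"{dataset_dir}/train", batch_size=32,
    validation_split=0.2, subset="training", seed=42)
val_ds = tf.keras.utils.text_dataset_from_directory(
    f"{dataset_dir}/train", batch_size=32,
    validation_split=0.2, subset="validation", seed=42)
test_ds = tf.keras.utils.text_dataset_from_directory(
    f"{dataset_dir}/test", batch_size=32)
```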
- BERT encoder (trainable)
- Dense(64) + Dropout(0.3)
- Final sigmoid output for binary classification (see the model-building sketch below)
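A minimal sketch of this architecture using `hub.KerasLayer`. The TF Hub handles are the standard ones for this Small BERT variant but may differ from the versions used in the project, and the ReLU activation on the Dense(64) layer and the compile settings are assumptions rather than the project's exact configuration.

```python
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # registers the ops needed by the preprocessing model

# Standard TF Hub handles for this Small BERT variant (exact versions used in
# the project may differ).
PREPROCESS_HANDLE = "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"
ENCODER_HANDLE = "https://tfhub.dev/tensorflow/small_bert/bert_en_uncased_L-4_H-512_A-8/2"

def build_model():
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name="review_text")
    preprocessed = hub.KerasLayer(PREPROCESS_HANDLE, name="preprocessing")(text_input)
    encoder_outputs = hub.KerasLayer(ENCODER_HANDLE, trainable=True,
                                     name="bert_encoder")(preprocessed)
    x = encoder_outputs["pooled_output"]                  # [CLS] sentence embedding
    x = tf.keras.layers.Dense(64, activation="relu")(x)   # activation is an assumption
    x = tf.keras.layers.Dropout(0.3)(x)
    output = tf.keras.layers.Dense(1, activation="sigmoid", name="sentiment")(x)
    return tf.keras.Model(text_input, output)

model = build_model()
model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=3e-5, weight_decay=0.01),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```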
```bash
pip install tf-models-official tensorflow tensorflow_hub tensorflow_text
```
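After installation, a quick sanity check that the three libraries import cleanly; importing `tensorflow_text` also registers the custom ops the BERT preprocessing model needs.

```python
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # needed so the BERT preprocessing ops are registered

print("TensorFlow:", tf.__version__)
print("TensorFlow Hub:", hub.__version__)
```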