SimpleRNN - the most basic recurrent layer, in which each timestep's output is fed back as part of the next timestep's input.
LSTM (Long Short-Term Memory) - uses input, forget and output gates to address the vanishing gradient problem in standard RNNs.
GRU (Gated Recurrent Unit) - a simplified variant of the LSTM with fewer parameters, using only update and reset gates.
Bidirectional - a wrapper that can be applied to any RNN layer to process sequences in both directions, as shown in the sketch below.
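For reference, a minimal sketch of how each of these layers is instantiated in tf.keras (the 64 units are an illustrative choice, not prescribed above):

import tensorflow as tf

# each recurrent layer consumes a (batch, timesteps, features) input
simple_rnn = tf.keras.layers.SimpleRNN(64)  # plain recurrence, tanh activation by default
lstm = tf.keras.layers.LSTM(64)             # input/forget/output gates plus a cell state
gru = tf.keras.layers.GRU(64)               # update/reset gates, no separate cell state
bi_lstm = tf.keras.layers.Bidirectional(    # runs the wrapped layer forwards and backwards
    tf.keras.layers.LSTM(64))               # and concatenates the two outputs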
The tanh (hyperbolic tangent) activation function squashes its input into the range (-1, 1). It is commonly used in recurrent neural networks like LSTMs and GRUs, for example in computing candidate states.
The sigmoid activation function transforms input values into the range (0, 1). It is commonly used for the gates inside LSTM and GRU cells and as the output activation in binary classification problems.
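A quick numeric comparison of the two activations (a minimal sketch; the sample inputs are arbitrary):

import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])
print(tf.math.tanh(x).numpy())     # outputs in (-1, 1); tanh(0.0) == 0.0
print(tf.math.sigmoid(x).numpy())  # outputs in (0, 1);  sigmoid(0.0) == 0.5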
The IMDB dataset contains 50,000 movie reviews from IMDB, labeled by sentiment (positive/negative) and split evenly into 25,000 reviews for training and 25,000 for testing.
Reviews have been preprocessed, and each review is encoded as a list of word indexes (integers).
For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer “3” encodes the 3rd most frequent word in the data.
This allows for quick filtering operations such as: “only consider the top 10,000 most common words, but eliminate the top 20 most common words”.
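Both halves of that filter map directly onto arguments of the Keras loader. A minimal sketch (the num_words and skip_top values simply mirror the sentence above):

import tensorflow as tf

# keep the 10,000 most frequent words, but drop the 20 most frequent;
# dropped words are replaced by the out-of-vocabulary index (2 by default)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(
    num_words=10000,
    skip_top=20)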
import tensorflow as tf


def main():
    # load the IMDB dataset (movie reviews with sentiment labels);
    # it contains 50,000 reviews split evenly between training and testing
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=10000)

    # pad/truncate every review to the same length so they can be batched
    max_length = 256
    x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_length)
    x_test = tf.keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_length)

    # build a simple model for text classification
    model = tf.keras.Sequential([
        # embedding layer - converts integer word indices to dense vectors
        tf.keras.layers.Embedding(input_dim=10000, output_dim=128),
        # bidirectional LSTM - reads each review forwards and backwards and
        # returns a single 128-dimensional vector, so no Flatten is needed
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        # output layer for binary classification (positive/negative sentiment)
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

    # compile the model; explicit metric objects are more portable across
    # Keras versions than the "precision"/"recall" string aliases
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=[
            "accuracy",
            tf.keras.metrics.Precision(name="precision"),
            tf.keras.metrics.Recall(name="recall"),
        ],
    )

    # train the model, holding out 20% of the training data for validation
    model.fit(
        x_train, y_train,
        epochs=5,
        batch_size=128,
        validation_split=0.2,
    )

    # evaluate the model on the held-out test data
    print("********** model evaluation **********")
    model.evaluate(x_test, y_test)


if __name__ == "__main__":
    main()
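Once trained, the same model can score individual reviews. A minimal inference sketch (assumes it runs after model.fit() inside main(); the 0.5 threshold is the conventional cut-off for a sigmoid output):

# predict sentiment probabilities for the first five test reviews
probs = model.predict(x_test[:5])
for p in probs:
    print("positive" if p[0] > 0.5 else "negative", float(p[0]))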