
Keras Model For Siamese Network Not Learning And Always Predicting The Same Output

I am trying to train a Siamese neural network using Keras, with the goal of identifying whether 2 images belong to the same class or not. My data is shuffled and has an equal number of positive…

Solution 1:

Mentioning the resolution to this issue in this answer (even though it is present in the comments section), for the benefit of the community.

Since the model works fine with other standard datasets, the solution is to use more data. The model is not learning because it has too little data for training.

Solution 2:

The model works fine with more data, as mentioned in the comments and in the answer by TensorFlow Support. Tweaking the model a little also works: changing the number of filters in the 2nd and 3rd convolutional layers from 256 to 64 reduces the number of trainable parameters by a large amount, and the model then starts learning.
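For illustration, here is a minimal sketch of a base network with the reduced filter counts. The original architecture is not shown in the question, so the kernel sizes, input_shape and layer order here are assumptions, not the asker's exact model:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Activation, Flatten

# hypothetical base network; only the 64-filter tweak is taken from the answer
seq = Sequential()
seq.add(Conv2D(64, (5, 5), input_shape=(1, 64, 64), data_format="channels_first"))
seq.add(Activation('relu'))
seq.add(MaxPooling2D(pool_size=(2, 2)))
# 2nd and 3rd convolutional layers: 64 filters instead of 256
seq.add(Conv2D(64, (3, 3)))
seq.add(Activation('relu'))
seq.add(MaxPooling2D(pool_size=(2, 2)))
seq.add(Conv2D(64, (3, 3)))
seq.add(Activation('relu'))
seq.add(Flatten())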

Solution 3:

I want to mention a few things here that may be useful to others:

1) Data stratification / random sampling

When you use validation_split, Keras uses the last x percent of the data as validation data. This means that if the data is ordered by class, e.g. because pairs or triplets are created in sequence, the validation data will only come from the classes (or the class) contained in that last x percent. In this case, the validation set will be of no use. It is therefore essential to shuffle the input data so that the validation set contains random samples from each class (see the sketch after the quoted docs below).

The docs for validation_split say:

Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling
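A minimal sketch of such a shuffle, assuming the pairs and labels live in NumPy arrays named img_1, img_2 and y (the same names used in the fit() call further down):

import numpy as np

# shuffle pairs and labels in unison, so that the validation_split slice
# (taken from the *end* of the arrays) contains samples from all classes
idx = np.random.permutation(len(y))
img_1, img_2, y = img_1[idx], img_2[idx], y[idx]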

2) Choice of optimizer

In model.compile(), choosing optimizer='sgd' may not be the best approach, since SGD can get stuck in local minima, etc. Adam (see docs) seems to be a good choice to start with, since it...

[...] combines the advantages of [...] AdaGrad to deal with sparse gradients, and the ability of RMSProp to deal with non-stationary objectives.

according to Kingma and Ba (2014, page 10).

from keras.optimizers import Adam
...
# contrastive_loss is the custom loss defined elsewhere in the question
model.compile(loss=contrastive_loss, optimizer=Adam(lr=0.0001))

3) Early stopping / learning rate

Using early stopping and adjusting the learning rate during training may also be highly useful for achieving good results. The model can then train until there is no further improvement and stop automatically.

from keras.callbacks import EarlyStopping
from keras.callbacks import ReduceLROnPlateau
...
early_stopping = EarlyStopping(monitor='val_loss', patience=50, mode='auto', restore_best_weights=True)
reduce_on_plateau = ReduceLROnPlateau(monitor="val_loss", factor=0.8, patience=15, cooldown=5, verbose=0)
...
hist = model.fit([img_1, img_2], y, 
            validation_split=.2, 
            batch_size=128, 
            verbose=1, 
            epochs=9999,
            callbacks=[early_stopping, reduce_on_plateau])

4) Kernel initialization

Kernel initialization (with a small standard deviation) may be helpful as well.

from keras.models import Sequential
from keras.layers import Conv2D, Activation, MaxPooling2D, Dropout
from keras.initializers import TruncatedNormal

seq = Sequential()
# Layer 1: truncated normal initializer with a small standard deviation
seq.add(Conv2D(8, (5, 5), input_shape=input_shape,
    kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.01, seed=None),
    data_format="channels_first"))
seq.add(Activation('relu'))
seq.add(MaxPooling2D(pool_size=(2, 2)))
seq.add(Dropout(0.1))

5) Overfitting

I noticed that instead of using dropout to fight overfitting, adding some noise can be rather helpful. In this case, simply add a GaussianNoise layer at the top (input end) of the network.
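A minimal sketch, assuming a sequential base network like the one above (the stddev value of 0.1 is just a starting point to tune):

from keras.models import Sequential
from keras.layers import GaussianNoise

seq = Sequential()
# Gaussian noise directly after the input; only active during training
seq.add(GaussianNoise(0.1, input_shape=input_shape))
# ... rest of the base network as before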
