Neural Network Becomes Unruly With Large Layers
This is a higher-level question about the performance of a neural network. The issue I'm having is that with larger numbers of neurons per layer, the network has frequent rounds of
Solution 1:
My guess is, that your network is oscillating heavily across a jagged error surface. Trying a lower error rate might help. But first of all, there are a few things you can do to better understand what your network is doing:
- plot the output error over training epochs. This will show you when in the training process things go wrong.
- have a graphical representation (an image) of your weight matrices and of your outputs. Makes it much easier to spot irregularities.
A major problem with ANN training is saturation of the sigmoid function. Towards the asymptotes of both the logistic function and tanh, the derivative is close to 0, numerically it probably even is zero. As a result, the network will only learn very slowly or not at all. This problem occurs when the input for the sigmoid are too big, here's what you can do about it:
- initialize your weights proportional to number of inputs a neuron receives. Standard literature suggests to draw them from a distribution with mean = 0 and standard deviation 1/sqrt(m), where m is the number of input connections.
- scale your teachers so that they lie where the network can learn the most; that is, where the activation function is the steepest: the maximum of the first derivative. For tanh you can alternatively scale the function to f(x) = 1.7159 * tanh(2/3 * x) and keep the teachers at [-1, 1]. However, don't forget to adjust the derivative to f'(x) = 2/3 * 1.7159 * (1 - tanh^2 (2/3 * x)
Let me know if you need additional clarification.
Post a Comment for "Neural Network Becomes Unruly With Large Layers"