Basic neural network, problem in training
Started by simshon, Sep 25 2011 10:42 PM
12 replies to this topic
#1
Posted 25 September 2011 - 10:42 PM
After reading some articles about neural networks (back-propagation), I tried to write a simple neural network myself.
I decided on an XOR network. My problem is with training. If I train on only one example, say 1,1,0 (input1, input2, targetOutput), then after roughly 500 training iterations the network answers 0.05. But if I train on more than one example (say 2 different ones, or all 4 possibilities), the output converges to 0.5 :( I searched Google for my mistake with no results :S I'll give as much detail as I can to help find what's wrong:
- I've tried networks with 2,2,1 and 2,4,1 (input layer, hidden layer, output layer).
- The input to every neuron (before the activation) is computed by:
double input = 0.0;
for (int n = 0; n < layers[i].Count; n++)
input += layers[i][n].Output * weights[n];
where 'i' is the current layer and 'weights' holds all the weights from the previous layer.
- The error of the last layer (output layer) is defined by:
value*(1-value)*(targetvalue-value);
where 'value' is the neuron's output and 'targetvalue' is the target output for the current neuron.
- The error for the other neurons is defined by:
foreach neural in the nextlayer
sum+=neural.value*currentneural.weights[neural];
myerror=myoutput*(1-myoutput)*sum;
- All the weights in the network are adapted by this formula (for the weight from neural -> neural2):
weight+=LearnRate*neural.myvalue*neural2.error;
where LearnRate is the network's learning rate (set to 0.25 in my network).
- The bias weight for each neuron is updated by:
bias+=LearnRate*neural.myerror*neural.Bias;
Bias is a constant value of 1.
That's pretty much all I can detail. As I said, the output converges to 0.5 with different training examples :(
I've uploaded my project here:
http://www.multiupload.com/G68E57N4BM .
I'm really stuck here. I don't know where my mistake is after checking the code again and again :(
Thank you very much for your help :)
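For comparison with the formulas listed in the post, here is a minimal sketch of textbook per-example backpropagation for a 2-2-1 XOR network. It is not the uploaded project: every name in it (XorSketch, OutputDelta, HiddenDelta, wH, wO), the random seed and the epoch count are assumptions made for illustration. The one point worth checking against the post is the hidden-layer error: in the standard derivation the sum runs over the next layer's error terms multiplied by the connecting weights, not over the next layer's output values. The sketch also cycles through all four examples in every epoch and feeds the raw inputs in without a sigmoid.

```csharp
using System;

// Hypothetical sketch of a 2-2-1 XOR network; none of these names come from the uploaded project.
public static class XorSketch
{
    public static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // Output-layer error term, same formula as in the post.
    public static double OutputDelta(double output, double target) =>
        output * (1 - output) * (target - output);

    // Hidden-layer error term: the sum runs over the next layer's error terms
    // (deltas) times the connecting weights, not over the next layer's outputs.
    public static double HiddenDelta(double output, double[] nextDeltas, double[] weightsToNext)
    {
        double sum = 0.0;
        for (int k = 0; k < nextDeltas.Length; k++)
            sum += nextDeltas[k] * weightsToNext[k];
        return output * (1 - output) * sum;
    }

    public static void Main()
    {
        const int hidden = 2;
        const double learnRate = 0.25;   // value from the post
        var rng = new Random(1);         // arbitrary seed (assumption)

        // wH[j] = { bias weight, weight from input 0, weight from input 1 }
        var wH = new double[hidden][];
        for (int j = 0; j < hidden; j++)
            wH[j] = new[] { rng.NextDouble() - 0.5, rng.NextDouble() - 0.5, rng.NextDouble() - 0.5 };
        // wO = { bias weight, weight from hidden 0, weight from hidden 1 }
        var wO = new double[hidden + 1];
        for (int j = 0; j <= hidden; j++)
            wO[j] = rng.NextDouble() - 0.5;

        double[][] inputs = { new[] { 0.0, 0.0 }, new[] { 0.0, 1.0 }, new[] { 1.0, 0.0 }, new[] { 1.0, 1.0 } };
        double[] targets = { 0.0, 1.0, 1.0, 0.0 };
        var h = new double[hidden];
        var dH = new double[hidden];

        for (int epoch = 0; epoch <= 20000; epoch++)
        {
            double squaredError = 0.0;
            for (int s = 0; s < inputs.Length; s++)
            {
                double[] x = inputs[s];   // raw inputs, no sigmoid applied to them

                // Forward pass.
                double net = wO[0];
                for (int j = 0; j < hidden; j++)
                {
                    h[j] = Sigmoid(wH[j][0] + wH[j][1] * x[0] + wH[j][2] * x[1]);
                    net += wO[j + 1] * h[j];
                }
                double o = Sigmoid(net);
                squaredError += (targets[s] - o) * (targets[s] - o);

                // Backward pass: compute every delta before changing any weight.
                double dO = OutputDelta(o, targets[s]);
                for (int j = 0; j < hidden; j++)
                    dH[j] = HiddenDelta(h[j], new[] { dO }, new[] { wO[j + 1] });

                // Weight updates: weight += learnRate * (source output) * (destination delta).
                wO[0] += learnRate * dO;
                for (int j = 0; j < hidden; j++)
                {
                    wO[j + 1] += learnRate * h[j] * dO;
                    wH[j][0] += learnRate * dH[j];
                    wH[j][1] += learnRate * x[0] * dH[j];
                    wH[j][2] += learnRate * x[1] * dH[j];
                }
            }
            if (epoch % 5000 == 0)
                Console.WriteLine("epoch " + epoch + ": summed squared error = " + squaredError);
        }
    }
}
```

A network this small can still settle in a poor local minimum for some starting weights, so if the printed error stalls it is worth trying a different seed or more hidden neurons before concluding the formulas are wrong.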
#2
Posted 26 September 2011 - 02:46 PM
This is definitely bad:
//Calc OutputFor InputLayer
for (int i = 0; i < layers[0].Count; i++)
    layers[0][i].Output = sigmoid(layers[0][i].Input);
Never transform the input values before feeding them into the first neurons. While it may work for small inputs around 0, your whole network loses a great deal of adaptivity. There is also unnecessary redundancy here:
//Add Bias
layers[i][j].Input += layers[i][j].Bias * weights[i][j, 0];
//layers[i][j].Input += layers[i][j].Bias;
Remove layers[i][j].Bias altogether and use only the trainable weights[i][j, 0]. Also, the "+=" instead of "=" scares me. Do you have nonzero values left over from a previous run that get reused? Clearer code would be
layers[i][j].Input = weights[i][j, 0];
As for the learning algorithm, backpropagation is not the best method for setting neuron weights. There are more powerful algorithms, but there is no best approach for all cases. Try playing with the learning constants, start from different sets of weights, and then choose the best.
Sorry my broken english!
#3
Posted 26 September 2011 - 04:52 PM
1) Every neuron has an input, an output and a bias (which is 1).
The input is the sum over all the neurons of the previous layer (their output * the weight to this neuron).
The output of the neuron is then activation(input).
Where is the problem here?
2) About the bias: as I understand it, the network needs a bias in case all the inputs are 0, because then the network won't train without a non-zero value (which is the bias).
3) About the '+=': before each training pass I set the input and output of all the neurons to 0.0.
Thanks very much for your help :)
#4
Posted 26 September 2011 - 05:29 PM
For every neuron the following must hold:
O = F(sum(I[i]*w[i])+b),
but for the first layer you have
O = F(I+b).
Effectively you preprocess the input signal with a sigmoid and reduce its dynamic range as a result. You cannot compensate for the lack of sensitivity with learning.
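To put a number on the dynamic-range point, here is a small worked example. It assumes F is the logistic sigmoid, no bias is added, and the XOR inputs are exactly 0 and 1.

```latex
F(x) = \frac{1}{1 + e^{-x}}, \qquad
F(0) = \frac{1}{1 + 1} = 0.5, \qquad
F(1) = \frac{1}{1 + e^{-1}} \approx 0.731
```

So inputs that were 1 apart end up about 0.231 apart (0.5 versus 0.731) once they pass through the sigmoid on the input layer.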
#5
Posted 26 September 2011 - 08:02 PM
}:+()___ [Smile] said:
For every neuron the following must hold:
O = F(sum(I[i]*w[i])+b),
but for the first layer you have
O = F(I+b).
Effectively you preprocess the input signal with a sigmoid and reduce its dynamic range as a result. You cannot compensate for the lack of sensitivity with learning.
But doesn't the output need to be like this:
o=sigmoid(sum(I[i]*w[i])+b*bw)
where bw is the bias weight?
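The two ways of writing it agree once b is fixed at 1: the b in O = F(sum(I[i]*w[i])+b) then plays the role of b*bw here, and the bias weight is the only trainable part. A minimal sketch of that neuron output, with hypothetical names (NeuronSketch, Output, biasWeight) that are not taken from the project:

```csharp
using System;

// Hypothetical helper; names are illustrative and not from the uploaded project.
public static class NeuronSketch
{
    public static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // o = sigmoid(sum(I[i]*w[i]) + b*bw) with the bias input b fixed at 1,
    // so only the bias weight bw needs to be stored and trained.
    public static double Output(double[] inputs, double[] weights, double biasWeight)
    {
        const double bias = 1.0;
        double net = bias * biasWeight;
        for (int i = 0; i < inputs.Length; i++)
            net += inputs[i] * weights[i];
        return Sigmoid(net);
    }

    public static void Main()
    {
        // With all inputs at 0 the ordinary weights contribute nothing,
        // so the output is decided by the bias weight alone.
        Console.WriteLine(Output(new[] { 0.0, 0.0 }, new[] { 0.3, -0.7 }, 0.0));
        Console.WriteLine(Output(new[] { 0.0, 0.0 }, new[] { 0.3, -0.7 }, 2.0));
    }
}
```

This is also why the bias matters for the 0,0 example in XOR: without it, the net input of every first hidden neuron would be stuck at 0 for that example.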
#6
Posted 26 September 2011 - 09:14 PM
simshon said:
I don't understand your last sentence :S
Well, simply replace
SetInputs(input);
//Calc OutputFor InputLayer
for (int i = 0; i < layers[0].Count; i++)
    layers[0][i].Output = sigmoid(layers[0][i].Input);
by
for (int i = 0; i < tempinput.Count; i++)
    layers[0][i].Output = layers[0][i].Input = tempinput[i];
That can improve learning ability in certain cases.
#7
Posted 26 September 2011 - 09:25 PM
I've already tried not using the activation function on the input layer, but the training process is still wrong somehow :S
#8
Posted 26 September 2011 - 09:42 PM
Hmm...why do you use O*(1-O)*E for error correction? This may be wrong, try E/(O*(1-O)) or simply E.
#9
Posted 26 September 2011 - 11:55 PM
I just followed the formulas from a few tutorials, such as:
http://www.codeproje...recipes/BP.aspx.
What is E in your E/(O*(1-O))?
#10
Posted 27 September 2011 - 01:12 PM
By E I mean (target - output), or sum(weight[i]*error[i]), or the delta in the tutorial.
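O*(1-O)*E looks odd to me because it slows learning on the tails of the sigmoid, where sensitivity is already low. It is, however, what gradient descent on the squared error gives: [sigmoid(x)]' = sigmoid(x)*(1-sigmoid(x)), so the derivative O*(1-O) multiplies E rather than dividing it. Using plain E at the output layer corresponds to a cross-entropy error and is worth trying. E/(O*(1-O)) is not mathematically justified and blows up when O is near 0 or 1.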
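For reference on where the O*(1-O) factor comes from, here is the usual chain-rule derivation as a sketch. It assumes a squared-error cost, which the thread never states explicitly. T (target), J (cost), net (the weighted sum before the activation), \eta (the LearnRate) and \delta (the per-neuron error term) are notation introduced here; O, I, w and b are used as in the earlier replies.

```latex
\begin{aligned}
F(x) &= \frac{1}{1 + e^{-x}}, &
F'(x) &= \frac{e^{-x}}{(1 + e^{-x})^{2}} = F(x)\,\bigl(1 - F(x)\bigr) \\
net &= \sum_i I_i w_i + b, &
O &= F(net), \qquad J = \tfrac{1}{2}\,(T - O)^{2} \\
\frac{\partial J}{\partial w_i}
  &= -(T - O)\, F'(net)\, I_i
   = -(T - O)\, O\,(1 - O)\, I_i \\
\Delta w_i &= -\eta\, \frac{\partial J}{\partial w_i}
   = \eta\, \underbrace{O\,(1 - O)\,(T - O)}_{\delta_{\text{out}}}\, I_i \\
\delta_j &= O_j\,(1 - O_j) \sum_k \delta_k\, w_{jk}
   \qquad \text{(hidden neuron } j \text{, next-layer neurons } k\text{)}
\end{aligned}
```

Under that assumption the sigmoid derivative is positive and multiplies E = (T - O); it never divides it. The last line also shows that a hidden neuron's error term is built from the next layer's error terms \delta_k, not from the next layer's outputs.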
#11
Posted 27 September 2011 - 09:33 PM
Still the output converges to 0.5 :S
#12
Posted 27 September 2011 - 10:38 PM
Maybe you don't have enough neurons?
#13
Posted 29 September 2011 - 10:14 PM
I've tried with more layers, and with more than 2 and even 4 neurons in each hidden layer.