### #1karligula

Valued Member

• Members
• 180 posts

Posted 26 January 2007 - 04:57 PM

Hi everyone...

Suppose you've got a network trained to recognise faces. It's got an input layer of neurons, one for each pixel in the input image. A few hidden layers. And an output layer of 100 neurons. So that if you give the input layer a random image and 95 of the output layer neurons fire, it means the network is 95% sure that image contains a face. Right?

Well... is it possible to run the network backwards? So if you reverse the direction of processing, set all 100 of the output (now input) neurons to fire, you would get the input (now output) layer neurons producing an image of a face?

Is that how our visual imaginations work? All the networks that normally recognise images are put into reverse and start producing images instead?

Could this be how deja vu works... there's a momentary glitch in your brain and the network runs in both directions at once, so even as you see something you remember it?

Sorry if this is a naive description, or if it's something utterly stupid and I've completely misunderstood how these things work... but it feels like a good idea and I wanted to share!

### #2GroundKeeper

Valued Member

• Members
• 110 posts

Posted 26 January 2007 - 06:05 PM

First off, running the network backwards does not work in general. That is not strictly true in theory, since you could use invertible activation functions and somehow make the whole network invertible, but when a neuron outputs the summation of its inputs there are infinitely many solutions going backwards (a given output could have been generated by infinitely many sets of input values).
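A minimal sketch of the summation point, in Python. The weights and inputs are made-up values chosen only for illustration; the helper name `neuron_sum` is hypothetical.

```python
def neuron_sum(inputs, weights, bias=0.0):
    """Weighted sum of the inputs: the pre-activation value of one neuron."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

# Hypothetical weights, picked only for illustration.
weights = [0.5, 0.5]

# Three different inputs that all produce the same summed value.
candidates = [[1.0, 1.0], [2.0, 0.0], [-3.0, 5.0]]
sums = [neuron_sum(x, weights) for x in candidates]
print(sums)  # every entry is 1.0
```

Given only the value 1.0, there is no way to tell which of these inputs (or which of the infinitely many others on the same line) produced it.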

To discuss how imagination works I would have to reason from knowing exactly how it works (and I would guess no one does). The first thing that strikes me is the complexity of constructing a system that solves an equation in the manner we are talking about. Assuming evolution has played its role, the mechanics that govern the brain are the product of randomness (in the mathematical sense of a stochastic process), which somewhat motivates Occam's razor: the simplest solution is the more probable one. And in fact there are a number of ANNs that are not backpropagation networks but rather matrices of nodes. One example is the Hopfield network, but there are many more that deal with association. In those models you can somewhat understand how memory could work, and from that, association. How it generally works is that you start in some input state (a retinal image, say) and then update the states, with weights most commonly set by some form of Hebb's rule, until you finally end up in an end state. The end state then represents the association of one state to another.
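To make the association idea a bit more concrete, here is a rough sketch of a tiny Hopfield-style network in plain Python. It assumes +1/-1 units, a single stored pattern, and one-unit-at-a-time updates; the function names `train_hebbian` and `recall` are hypothetical, and real models vary in the details.

```python
def train_hebbian(patterns):
    """Build a weight matrix from +1/-1 patterns using Hebb's rule."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w

def recall(w, state, max_sweeps=10):
    """Update units one at a time until the state stops changing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            total = sum(w[i][j] * state[j] for j in range(len(state)))
            new = 1 if total >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state

stored = [1, -1, 1, -1, 1, -1, 1, -1]   # the "memory"
w = train_hebbian([stored])

noisy = list(stored)
noisy[0] = -noisy[0]                    # corrupt one unit of the input state
recalled = recall(w, noisy)
print(recalled == stored)
```

The starting state is a damaged version of the memory, and the end state the network settles into is the stored pattern, which is the "association" described above.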

If you go to the biological side of the matter, you will find that the models used in computer science and mathematics are very much isolated systems, and in fact the problem with actual biological systems is that you can't isolate a subsystem like that. There are parallel processes and several other mechanisms that make it very difficult to single out one system.

Sorry for the rant but wanted to share my perspective of the matter.

(Disclaimer)
Sorry for my English!
(End disclaimer)

### #3Reedbeta

DevMaster Staff

• 5311 posts
• LocationSanta Clara, CA

Posted 26 January 2007 - 06:16 PM

karligula said:

Suppose you've got a network trained to recognise faces. It's got an input layer of neurons, one for each pixel in the input image. A few hidden layers. And an output layer of 100 neurons. So that if you give the input layer a random image and 95 of the output layer neurons fire, it means the network is 95% sure that image contains a face. Right?

That's not normally how a face-recognizing neural network would be set up. Neurons aren't binary "fire" or "not fire"; their output values are arbitrary real numbers. You would typically have only a single output value for your network, and train it with data containing a "1" output for faces and "0" for non-faces. Then when using the network you'd take that single output as a measure of face-ness, or confidence that the image is a face. If the output value was 0.85, for instance, you'd say the network was 85% confident the image was a face, or that the image was 85% facelike and 15% non-facelike.

So you see that setting the output to 1 and using the network 'backwards' would be unlikely to produce anything reasonable, because the summations involved in the NN equations (as GroundKeeper mentioned) have infinitely many solutions.
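As a rough illustration of the single-output setup, here is a sketch of a forward pass in plain Python. Everything in it is an assumption made for the example: a four-"pixel" image, one hidden layer, sigmoid activations, and arbitrary untrained weights. A real face network would be far larger and would have learned its weights from data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(pixels, hidden_weights, output_weights):
    """One hidden layer, one output: returns a single score in (0, 1)."""
    hidden = [sigmoid(sum(w * p for w, p in zip(row, pixels)))
              for row in hidden_weights]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))

# Hypothetical 4-pixel "image" and arbitrary weights, for illustration only.
pixels = [0.2, 0.9, 0.4, 0.7]
hidden_weights = [[0.5, -0.3, 0.8, 0.1],
                  [-0.6, 0.4, 0.2, 0.9]]
output_weights = [1.2, -0.7]

score = forward(pixels, hidden_weights, output_weights)
print(f"face-ness: {score:.2f}")
```

Four input values are squeezed down to one number, so going the other way means recovering four unknowns from a single value.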
reedbeta.com - developer blog, OpenGL demos, and other projects

### #4TheNut

Senior Member

• Moderators
• 1701 posts
• LocationCyberspace

Posted 26 January 2007 - 06:35 PM

http://www.nutty.ca - Being a nut has its advantages.

### #5P4P4B34R

New Member

• Members
• 1 posts

Posted 28 January 2007 - 12:05 PM

Hi Folks,

There are three kinds of problems. I am not sure of which one you are talking about.

1) Face detection problem: Given an image, detect and extract the faces in the image. In other words, we divide the image into a few regions based on intensity and classify each region as a face region or a non-face region.

2) Face verification problem: Given a face image and name of the person the face image belongs to, this problem verifies if the face image under consideration belongs to the particular person. This is where the single output ANN could be used where it gives a confidence of how close the face image is to the person.

3) Face recognition problem: Given a set of face images, find the names of the persons whom the faces belong to. This is where multiple outputs are available at the output layer.

Neural networks do not work backwards because they approximate a mapping y = f(x) with non-linear functions, where y is the output and x is the input. If f is a non-linear function, finding a unique x for a given y is not always possible. Imagine a parabola y = ax^2 with a > 0. For any y > 0 you find two values of x. A neural network combines many such functions, so a large number of input images may be possible for a particular output.
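The parabola case can be checked in a few lines of Python. This is just a sketch; the coefficient a = 2 and the helper name `preimages` are chosen for the example.

```python
import math

def f(x, a=2.0):
    return a * x ** 2

def preimages(y, a=2.0):
    """All real x with a * x^2 == y."""
    ratio = y / a
    if ratio < 0:
        return []          # no real solution
    if ratio == 0:
        return [0.0]       # exactly one solution
    root = math.sqrt(ratio)
    return [-root, root]   # two solutions

print(preimages(8.0))  # [-2.0, 2.0]
```

Both -2.0 and 2.0 map to 8.0, so knowing the output alone does not tell you which input was used. Chaining several such functions together multiplies the number of candidate inputs.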

### #6karligula

Valued Member

• Members
• 180 posts

Posted 29 January 2007 - 09:18 AM

Hmmmmmm I was thinking about it over the weekend and I guess that's right... if the network is reducing information from a high density input to a relatively low density output, then running it backwards probably won't work...

Although I have made a realisation about how ingenious neural networks actually are... I've always wondered what they're actually doing, it's quite mystifying how you put a face (or some other data) in at one end and a recognition or not comes out the other. And I was wondering where memory is stored in neurons... our electronic computers use transistors organised one way to do processing, and transistors organised another way to store data, there's a clear distinction. Yet a neural net manages to both store data AND do processing?

So then I thought that what you're really doing when training a neural network is tuning the network to represent the average of all the inputs... that's what memory really is. Then when you give it another input, it returns a value representing how close that input is to the average. So that way the network is both 'remembering' the average and also comparing the input with that average.

Many places I've read that networks are pattern recognition machines... really I've never understood how that works. It doesn't tell me how the network actually recognises the pattern. But if you think about it in terms of comparing to an average, that makes a lot more sense to me. I remember reading about some research where they averaged a load of male and female faces and came up with the 'perfect' male and female faces. Then our perception of how attractive a person is is determined by how closely their face matches the perfect face. So I reckon that's what a network is doing.

I'm ranting I know, just got all these thoughts in my head, need to get them out!

### #7Reedbeta

DevMaster Staff

• 5311 posts
• LocationSanta Clara, CA

Posted 29 January 2007 - 06:31 PM

The thing is, karligula, networks can be trained on multiple patterns. For instance, neural networks can be used for recognizing digits in handwritten addresses on envelopes. The network recognizes each of the 10 possible digits as a pattern.

The network doesn't really learn the average of a set of inputs, rather it learns a function from inputs to outputs. It's basically a big nonlinear optimization problem. The network is a big mathematical function with a lot of parameters (the weights/biases of the neurons), and the training algorithms try to find the best combination of parameters to approximate the function you want. The thing is that the "function you want" isn't actually known, rather you have a set of training data, which consist of known outputs of the function for a certain set of inputs. The cool thing about NNs is that they are able to generalize from the training data and actually pin down a good approximation of the function you want even if you yourself don't know how to express that function mathematically (like the face-recognizing function).
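Here is a deliberately tiny sketch of that idea in Python: a "network" of one sigmoid neuron whose three parameters are tuned by gradient descent so that its outputs match a set of training examples. The choice of the logical AND function as the target, the learning rate, and the number of epochs are all assumptions made to keep the example small.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(params, x):
    w1, w2, b = params
    return sigmoid(w1 * x[0] + w2 * x[1] + b)

def train(data, epochs=2000, lr=1.0):
    """Batch gradient descent on cross-entropy loss for a single neuron."""
    params = [0.0, 0.0, 0.0]  # two weights and a bias
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for x, target in data:
            err = predict(params, x) - target
            grad[0] += err * x[0]
            grad[1] += err * x[1]
            grad[2] += err
        params = [p - lr * g / len(data) for p, g in zip(params, grad)]
    return params

# Training data: known outputs for a set of inputs (here, logical AND).
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
params = train(data)
outputs = [round(predict(params, x)) for x, _ in data]
print(outputs)  # [0, 0, 0, 1]
```

The training loop never sees a formula for AND, only example inputs and outputs. The "memory" ends up spread across the three numbers in `params`, and the same numbers do the processing.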

### #8karligula

Valued Member

• Members
• 180 posts

Posted 30 January 2007 - 09:12 AM

Thanks Reedbeta, that's the first decent explanation I've yet read of what these networks are actually doing!

So, to recognise multiple digits, does the network have multiple outputs? And if you had a big enough network, could you train it to recognise anything? And once you HAVE trained the network, you could theoretically extract the function that it represents?

### #9GroundKeeper

Valued Member

• Members
• 110 posts

Posted 30 January 2007 - 10:58 AM

No, the meaning of the network is not that easy to interpret as anything other than the network configuration. You could of course analyse the network output and from that work out what function it might represent. You could also deduce it manually from the network weights, but that is nothing I would recommend, since in most cases the trained function is not complete and will be a numerical approximation.

To understand what a NN does you must somehow understand the fundamental mathematical principles it is based upon. I would recommend the book by Simon Haykin, Neural Networks: A Comprehensive Foundation (2nd Edition), which covers what you are asking. But the mathematical background required for that book is somewhat demanding.

### #10meico

New Member

• Members
• 7 posts

Posted 30 January 2007 - 11:43 AM

karligula said:

So, to recognise multiple digits, does the network have multiple outputs?

Generally, and that would be easiest, but you don't _have_ to do it that way (for example you could do thresholding on a single output).
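A quick sketch of the two options in Python. The function names and the equal-width binning scheme for the single-output case are assumptions made for illustration, not a standard recipe.

```python
def decode_multi(outputs):
    """One output per digit: pick the index of the strongest output."""
    return max(range(len(outputs)), key=lambda i: outputs[i])

def decode_single(value, n_classes=10):
    """Single output: split the range 0..1 into equal bins, one per digit."""
    clamped = min(max(value, 0.0), 1.0)
    return min(int(clamped * n_classes), n_classes - 1)

# Hypothetical outputs from a 10-output digit network.
scores = [0.01, 0.02, 0.05, 0.80, 0.03, 0.01, 0.02, 0.03, 0.02, 0.01]
print(decode_multi(scores))   # 3
print(decode_single(0.37))    # 3
```

The multi-output version is usually easier to train, since the single-output version forces the network to treat neighbouring digits as "close" to each other on one scale even when they look nothing alike.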

karligula said:

And if you had a big enough network, could you train it to recognise anything?

Theoretically, but totally impractical. Humans have REALLY big, intricately architected neural networks that have been fed for a lifetime with a nearly continuous stream of data using an incredibly advanced training algorithm, and even we can't always recognise faces (if you just turn photos upside down it becomes difficult for us to recognise even photos of friends). :)

karligula said:

And once you HAVE trained the network, you could theoretically extract the function that it represents?

GroundKeeper's reply is spot on, but I would also add a few comments. Our ability to extract symbolic functions from even the simplest of neural networks is pretty bad, though it can sometimes be done, and there are also a couple of problems with the very idea of converting networks to symbolic functions...

When we can represent a system with a function instead of a network, we generally go straight to using the function. So, for simple things where a network or a function would both do, we usually already try to use the function.

For the types of problems where we do use neural networks, the weights and architecture of the network are often themselves the most compact notation for the function we can come up with, because our symbolic notation just isn't well suited to representing the kinds of functions that neural networks are well suited for.
