Thursday, May 13, 2010

Visual Object Recognition

White object: shadows. Image by kevindooley via Flickr
MIT researcher James DiCarlo is cracking the crux of systems neuroscience: object recognition. "Once we know how the brain transforms the pixels on our retina into pictures in our mind, we can start to understand how those representations form the basis for higher cognitive tasks such as memory, decisions, and long-term planning." This is an interesting problem in other domains too, such as philosophy, which is also interested in perception and in how we form mental models. How you form these mental models starts to determine how you relate to the universe, not only in a physical sense but also in creating abstract concepts like history, truth and beauty... but anyway, back to neuroscience.

Although I am displaying a map of the brain showing localized areas of visual function, please do not think of the brain as modular. Visual processing occurs across a gradient and not in a particular spot.

"Previous studies showed that signals from the eyes travel to a succession of specialized brain regions that progressively and rapidly assemble the dots of light from the retina into lines, corners, shapes, and ultimately into complex objects. Object recognition is thought to happen in a region the size of a postage stamp, called the inferior temporal cortex."

The hard problem in object recognition is how to detect an object in the real world, i.e. an object in shadows, or rotated, or partially hidden.

The brain is able to see beyond this variability and to recognize the constancy of the underlying object. How does the brain do it? One idea suggests that the brain has a built-in computational system that automatically generalizes knowledge about objects under a variety of viewing circumstances. An alternate theory, which DiCarlo finds more plausible, holds that the brain learns to solve this problem through its vast experience in the natural world.

DiCarlo is also interested in how the brain handles objects in a crowded visual field. In other words, why it's sometimes hard to find the can opener in the kitchen drawer. I have this problem. It can really make you look like a nitwit. In office work, it can make it hard to work with spreadsheets, which are crowded visual fields (really crowded). I used to do economic forecasting with rows and rows of numbers. No wonder I'd check and recheck my figures. I would come home wiped out. In vision therapy, I am doing the type of puzzle where you find the hidden words in a field of letters.
