{"pk":31764,"title":"A Model of Visual Perception and Recognition Based on Separated Representation of \"What\" and \"Where\" Object Features","subtitle":null,"abstract":"In the processes of visual perception and recognition, human eyes actively select essential information by way of successive fixations at the most informative points of the image. So, perception and recognition are not only results of neural computations, but are also behavioral processes. A behavioral program defining a scanpath of the image is formed at the stage of learning (object memorizing) and consists of sequential motor actions, which are shifts of attention from one point of fixation to another, and sensory signals expected to arrive in response to each shift of attention.\nIn the modern view of the problem, invariant object recognition is provided by the following: (i) separated processing of \"what\" (object features) and \"where\" (spatial features) information at high levels of the visual system; (ii) mechanisms of visual attention using \"where\" information; (iii) representation of \"what\" information in an object-based frame of reference (OFR).\nHowever, most recent models of vision based on OFR have demonstrated the ability of invariant recognition of only simple objects like letters or binary objects without background, i.e. objects to which a frame of reference is easily attached. In contrast, we use not OFR, but a feature-based frame of reference (FFR), connected with the basic feature (edge) at the fixation point. 
This has provided our model with the ability for invariant representation of complex objects in gray-level images, but demands realization of the behavioral aspects of vision described above.\nThe developed model contains a neural network subsystem of low-level vision, which extracts a set of primary features (edges) in each fixation, and a high-level subsystem consisting of \"what\" (Sensory Memory) and \"where\" (Motor Memory) modules. The resolution of primary feature extraction decreases with distance from the point of fixation. FFR provides both the invariant representation of object features in Sensory Memory and shifts of attention in Motor Memory. Object recognition consists in successive recall (from Motor Memory) and execution of shifts of attention and successive verification of the expected sets of features (stored in Sensory Memory). The model shows the ability of recognition of complex objects (such as faces) in gray-level images invariant with respect to shift, rotation, and scale.","language":"eng","license":{"name":"","short_name":"","text":null,"url":""},"keywords":[],"section":"Submitted Presentations","is_remote":true,"remote_url":"https://escholarship.org/uc/item/0nb5s6tt","frozenauthors":[{"first_name":"Ilya","middle_name":"A.","last_name":"Rybak","name_suffix":"","institution":"University of Pennsylvania","department":""}],"date_submitted":null,"date_accepted":null,"date_published":"1993-01-01T18:00:00Z","render_galley":null,"galleys":[{"label":"PDF","type":"pdf","path":"https://journalpub.escholarship.org/cognitivesciencesociety/article/31764/galley/22832/download/"}]}