“Sensing: from Minds to Machines” is a four-day event organized as two back-to-back workshops. The detailed program for both workshops appears below. Click a title for the talk abstract. Note that all academic lectures and poster sessions will be held in the W.A. Minkoff Senate Hall and Court.
The Future of Sensing for Robotics workshop, May 29-May 30
|Date||Time||Session and Talk|
|May 29||Workshop opening|
|May 29||09:00-10:00||Registration and refreshments|
|May 29||10:00-10:10||Greetings, Ohad Ben-Shahar, Computer Science Department, BGU|
|May 29||10:10-10:20||Greetings, Dan Blumberg, VP and Dean for R&D, BGU|
|May 29||Hearing Session, Chair: Boaz Rafaely|
|May 29||10:20-11:05||Steven van de Par, Oldenburg University
The role of spatial hearing in speech intelligibility
Spatial hearing leads to a benefit in speech intelligibility in noisy environments. Various factors have been proposed to contribute to this benefit. When the speaker of interest is closer to one ear than the noise sources, this ear will have a better “signal-to-noise ratio”; this is termed the better-ear effect and does not strictly depend on spatial processing. It is known that signal components are better detectable when they contain interaural cues that differ from those of the masking noise; this is termed binaural unmasking. In addition, spatial cues may help to resolve which components of the total acoustical signal belong to the speech source so that these are selectively processed; this is termed auditory stream segregation. In this presentation, experiments will be discussed that aim to clarify to what degree binaural unmasking and auditory stream segregation contribute to the spatial speech intelligibility improvement. In addition, a possible method to incorporate the principle of auditory stream segregation into a machine listening algorithm will be discussed.
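As an illustrative aside, the better-ear comparison described in the abstract can be sketched in a few lines: compute the signal-to-noise ratio at each ear from average power and pick the ear with the higher value. The function name and toy signals below are ours, not the speaker's.

```python
import numpy as np

def better_ear_snr(speech_left, noise_left, speech_right, noise_right):
    """Return the per-ear SNRs (in dB) and the label of the 'better ear'.

    Inputs are time-domain signals (1-D arrays) of speech and noise as
    received at each ear; SNR is computed from average power.
    """
    def snr_db(speech, noise):
        return 10.0 * np.log10(np.mean(speech ** 2) / np.mean(noise ** 2))

    snr_l = snr_db(speech_left, noise_left)
    snr_r = snr_db(speech_right, noise_right)
    return snr_l, snr_r, ("left" if snr_l >= snr_r else "right")

# Toy example: the speech source is closer to the left ear, so the left
# ear receives a stronger copy of the speech for the same noise level.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)
noise = rng.standard_normal(16000)
snr_l, snr_r, ear = better_ear_snr(1.0 * speech, noise, 0.5 * speech, noise)
```

Halving the speech amplitude at the far ear costs about 6 dB of SNR, so the selection favours the near (left) ear.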
|May 29||11:05-11:50||Symeon Delikaris-Manias, Aalto University
Perceptually motivated spatial sound processing
Spatial audio processing has gained considerable attention during the last decades, and especially in recent years, due to emerging applications such as virtual reality. The main approaches in the field are linear and parametric. Linear techniques commonly utilize either the microphone signals directly or a mathematical framework that aims to analyze and reconstruct the sound field, or parts of it, from a recording. Versatile solutions, such as ambisonics, have been suggested, in which the microphone array signals are encoded into an intermediate set of signals that can then be decoded to an arbitrary loudspeaker setup or headphones. Such solutions depend on a mathematical framework whose aim is to reconstruct the sound field, but this is not always applicable in practice due to hardware limitations. In contrast to such techniques, parametric techniques are adaptive, perceptually motivated spatial sound processing techniques that employ a parametric model describing the properties of the sound field and can process spatial sound with high perceptual accuracy compared to linear techniques. This talk consists of an overview of the research and facilities of the Spatial Sound group of Aalto University, Finland, and examples of perceptually motivated spatial sound techniques in sound reproduction and beamforming applications.
|May 29||11:50-12:35||Israel Nelken, Hebrew University
Mechanisms subserving auditory scene analysis in the auditory system
Auditory scene analysis is arguably the most important role of the core auditory system. As a computational task, it requires integration of information across frequency and over relatively long time periods, as well as information gathered by other sensory systems and probably feedback from higher brain areas. In my talk, I will highlight two such mechanisms. First, I will discuss the integrated vestibular-auditory representation that makes it possible to discriminate between self-motion and motion of the sound source, and is available as early as the cochlear nucleus, the first central station of the auditory pathway. Second, I will discuss the representation of rare (and presumably surprising) events, which is shaped over a number of auditory stations, starting at least as early as the inferior colliculus, and that is sensitive to the fine statistical structure of the stimulation sequence observed over tens of seconds. Thus, mechanisms that subserve auditory scene analysis occur early in the auditory system and shape its activity at all levels.
|May 29||12:35-13:20||Dorothea Kolossa, Ruhr-Universität Bochum
Listening strategies: Bayesian and active approaches to environmental robustness
Human beings still clearly outperform machines in recognizing speech or environmental sounds in difficult acoustic environments. In this talk, two potential reasons for this remaining skills gap are addressed: On the one hand, humans are highly effective at integrating multiple sources of uncertain information, and mounting evidence points to this integration being practically optimal in a Bayesian sense. Yet, the two central tasks of signal enhancement and of speech or sound recognition are performed almost in isolation in many systems, with only estimates of mean values being exchanged between them. The first part of this talk describes concepts for enhancing the interface of these two systems, considering a range of appropriate probabilistic representations. Examples will illustrate how such broader, probabilistic interfaces between signal processing and speech or pattern recognition can help to achieve better performance in real-world conditions, to more closely approximate the Bayesian ideal of using all sources of information in accordance with their respective degree of reliability. On the other hand, in contrast to machine listening, humans usually listen actively, i.e. they move their head or body for best recognition results as necessary: In difficult acoustic conditions, they not only turn their head for better-ear listening, but they also move to the spot that affords best recognition quality, e.g. moving up to a speaker they are interested in attending. Therefore, the second part of the talk will focus on current work in active machine listening, presenting strategies to endow machines with such capabilities and showing performance of passive versus active approaches in binaural machine listening.
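The "Bayesian ideal of using all sources of information in accordance with their respective degree of reliability" has a compact closed form for Gaussian estimates: each source is weighted by its precision (inverse variance). A minimal sketch (our own illustration, not the speaker's system):

```python
def fuse_gaussian(mu1, var1, mu2, var2):
    """Optimally combine two independent Gaussian estimates of the same
    quantity; each source is weighted by its precision (1/variance).
    The fused variance is always smaller than either input variance."""
    w1, w2 = 1.0 / var1, 1.0 / var2
    mu = (w1 * mu1 + w2 * mu2) / (w1 + w2)
    var = 1.0 / (w1 + w2)
    return mu, var

# The reliable source (variance 1.0) dominates the less reliable one
# (variance 4.0): the fused mean lands much closer to 10 than to 20.
mu, var = fuse_gaussian(10.0, 1.0, 20.0, 4.0)  # mu = 12.0, var = 0.8
```

This is exactly the kind of exchange that is lost when an enhancement front-end passes only a point estimate (the mean) to the recognizer: without `var`, the recognizer cannot down-weight unreliable evidence.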
|May 29||13:20-14:30||Lunch break (Senate Gallery)|
|May 29||Touch Session, Chair: David Golomb|
|May 29||14:30-15:15||Hossam Haick, Technion
Self-healable and unpixelated electronic skin for multifunctional sensing applications
Electronic skin (e-skin) is a pixelated flexible sensing array, which senses external and environmental stimuli in a manner similar to human skin. E-skin has been produced from diverse technologies, such as semiconducting organics, nanowires, carbon nanotubes, and nanofibres. Although promising results have been achieved with these technologies, multi-pixel integration, complicated wiring, applied voltage, and analysis remain challenges to overcome. For example, a 10×10 pixelated e-skin requires 200-300 wiring devices and 100 electrical measurement devices, thus increasing the energy consumption and the e-skin cost. Here, we report, for the first time, on a flexible substrate with two parallel gold nanoparticle (GNP) strips with anti-parallel sensitivity gradients for an unpixelated skin strip that reduces the readout to only two resistance measurements, acquired through three terminals. The e-skin strip exhibits highly sensitive prediction of both the applied load and its position along the sensing strip, and is sensitive to various environmental stimuli, such as temperature, humidity and volatile organic compounds. Additionally, the e-skin strip exhibits self-healing properties under normal (room temperature and atmospheric pressure) or harsh (low and high temperature) conditions, and maintains its excellent performance after several cycles of cutting and healing. These properties raise expectations that such e-skin strips might one day become self-administered, thus increasing their reliability in a number of applications, such as durable transparent touch-screens, self-healing e-skin, and cost- and effort-efficient large-scale or hard-to-reach robotics and instruments.
|May 29||15:15-16:00||Tony Prescott, The University of Sheffield
The integration of sensing, attention and memory in mammals and robots
We are guided in where we explore next by what we already know and by what further information we wish to obtain. Exploratory sensing is therefore deeply constrained by memory. In Sheffield, we are investigating the neural underpinnings of attention, decision making, and spatial memory in relation to mammalian active touch. We employ a range of approaches including (i) ethological studies of rodent active vibrissal (whisker) sensing, (ii) systems-level computational neuroscience modelling, and (iii) biomimetic robotics. The goal of this research is to develop a model of the control architecture of the brain—integrating sensing, attention and memory—that generates exploratory behaviour in mammal-like robots similar to that seen in behaving animals. This talk will report on the current status of this project, outlining our current hypotheses concerning brain architecture, and demonstrating life-like behaviour in a number of robot platforms.
|May 29||16:00-16:15||Coffee break, (Senate Gallery)|
|May 29||16:15-17:00||Ehud Ahissar, Weizmann Institute
Active sensing in closed loops
Empirical data collected from a variety of mammalian perceiving systems suggest that (i) perception is an active process and (ii) a perceiving system interacts with its environment via motor-sensory-motor (MSM) closed loops. These data suggest that biological perception is not based on an open-loop computational tour de force but rather on a closed-loop convergence process. We hypothesize that convergence occurs within motor-object-sensory-motor loops, in which objects are temporarily ‘grasped’. I will present the hypothesis of closed-loop perception and discuss its validity for artificial perception.
|May 29||17:00-17:45||Martin Pearson, University of the West of England
Whisker-based tactile sensing for robotics
The mammalian whisker sensory system is both an intriguing sensorimotor control loop to study as a model of active touch in biology, and the inspiration for a useful tactile sensory modality in robotics. Whisker-like robotic tactile sensors have been used over the past 30 years in a diverse range of applications such as proximity sensors for legged robots, biomimetic arrays for tactile surface reconstruction, and flow sensors for underwater robots. The Bristol Robotics Laboratory has been involved in biomimetic robotic whisker research for over 10 years, working closely with both animal behavioural scientists and neuroscientists to further our understanding of the biology of whisker-based touch and to exploit its principles for robotic application. In this time, numerous whiskered robotic platforms have been built and used in experiments to test hypotheses regarding the role of individual whisker motion, array morphology and head placement in improving perceptual acuity. More recently these principles have been applied to the robotic SLAM problem, with a subsequent improvement in robot pose estimation and spatial mapping being reported. This talk will briefly cover the history of whisker-based tactile sensing for robots before focusing on the importance of a close coupling between whisker sensing and whisker movement to maximise both the quality and quantity of sensory information derived. The state of the art in whisker-based sensing research will be reported and directions for future research discussed.
|Date||Time||Session and Talk|
|May 30||09:00-10:00||Registration and refreshments|
|May 30||Vision Session, Chair: Tammy Riklin Raviv|
|May 30||10:00-10:45||Adrian Stern, Ben-Gurion University of the Negev
Optical compressive sensing: opportunities and challenges
The theory of compressive sensing (CS) has opened up new opportunities in the field of imaging and optical sensing. It offers a framework to reduce the sensing effort, which can be useful in various robot and vision tasks. However, its implementation in these fields is often not straightforward, since the principles of compressive imaging design may differ drastically from the principles used for conventional imaging. Analytical tools developed for conventional imaging may not be optimal for compressive imaging. Nor are the conventional imaging components. In this talk we overview the opportunities opened by compressive sensing and discuss the main challenges that might arise in the design of compressive imaging systems. Our discussion will be illustrated with success stories of compressive 2D and 3D imaging, motion detection and spectral imaging.
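The computational core of compressive sensing is recovering a sparse signal from far fewer measurements than its ambient dimension. As an illustrative aside (not the speaker's method), one classic recovery algorithm is orthogonal matching pursuit:

```python
import numpy as np

def omp(A, y, k):
    """Recover a k-sparse x from y = A @ x by orthogonal matching
    pursuit: greedily pick the column of A most correlated with the
    residual, then re-fit by least squares on the chosen support."""
    residual = y.copy()
    support = []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

# 40 random measurements of a 128-dimensional, 3-sparse signal:
# far fewer samples than unknowns, yet recovery is (typically) exact.
rng = np.random.default_rng(1)
n, m, k = 40, 128, 3
A = rng.standard_normal((n, m)) / np.sqrt(n)
x_true = np.zeros(m)
x_true[[5, 50, 100]] = [1.0, -2.0, 1.5]
x_hat = omp(A, A @ x_true, k)
```

In compressive imaging the rows of `A` correspond to physical sensing patterns (e.g., coded apertures), which is exactly where the optical design constraints mentioned in the abstract enter.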
|May 30||10:45-11:30||Ilan Shimshoni, Haifa University
Vision-based Robot Homing and Localization
Vision sensors can be very useful for helping robots perform their tasks. They are cheap and accurate. They can be used for navigation tasks such as homing, localization, visual odometry, and obstacle avoidance. In addition, they can be used for building maps of the environment (SLAM). Cameras, however, are not simple sensors. Various algorithms have to be developed and applied to the images in order to glean from them the required information. In my talk I will describe the general theory that is the basis of vision-based navigation algorithms and the basic problem of image matching. I will then describe several algorithms that we developed for robot navigation and localization. I will describe our method for robot homing, and localization algorithms for autonomous cars and satellites.
|May 30||11:30-12:15||Joseph Francos, Ben-Gurion University of the Negev
Geometry and radiometry invariant matched manifold detection and tracking
We present a novel framework for detection, tracking and recognition of deformable objects undergoing geometric and radiometric transformations. Assuming the geometric deformations an object undergoes belong to some finite dimensional family, it has been shown that the universal manifold embedding (UME) provides a set of nonlinear operators that universally maps each of the different manifolds, where each manifold is generated by the set of all possible appearances of a single object, into a distinct linear subspace. We generalize this framework to the case where the observed object undergoes both an affine geometric transformation and a monotonic radiometric transformation. Applying to each of the observations an operator that makes it invariant to monotonic amplitude transformations, but is geometry-covariant with the affine transformation, the set of all possible observations on that object is mapped by the UME into a distinct linear subspace – invariant with respect to both the geometric and radiometric transformations. This invariant representation of the object is the basis of a matched manifold detection and tracking framework for objects that undergo complex geometric and radiometric deformations: The observed surface is tessellated into a set of tiles such that the deformation of each one is well approximated by an affine geometric transformation and a monotonic transformation of the measured intensities. Since each tile is mapped by the radiometry-invariant UME to a distinct linear subspace, the detection and tracking problems are solved by evaluating distances between linear subspaces.
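The final step the abstract describes, "evaluating distances between linear subspaces", is typically done via principal angles. A minimal sketch of that computation (our illustration, independent of the UME construction itself):

```python
import numpy as np

def subspace_distance(U, V):
    """Distance between the column spaces of U and V.

    After orthonormalizing both bases, the singular values of
    Qu.T @ Qv are the cosines of the principal angles between the two
    subspaces; we return the Euclidean norm of those angles."""
    Qu, _ = np.linalg.qr(U)
    Qv, _ = np.linalg.qr(V)
    s = np.clip(np.linalg.svd(Qu.T @ Qv, compute_uv=False), -1.0, 1.0)
    return np.linalg.norm(np.arccos(s))

rng = np.random.default_rng(2)
U = rng.standard_normal((10, 3))
V = rng.standard_normal((10, 3))
d_same = subspace_distance(U, 2.0 * U)  # same column space: distance ~ 0
d_diff = subspace_distance(U, V)        # independent subspaces: distance > 0
```

A matched-manifold detector compares an observation's subspace against a bank of stored object subspaces and declares the closest one the match.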
|May 30||12:15-13:00||Brian Scassellati, Yale University
Sensing for human-robot interaction
Human-Robot Interaction places a unique set of context-sensitive constraints on perception. This talk will cover some of the unique conditions and methodologies that have been employed to identify interaction parameters, including attention and motivational state, that are essential for enabling intuitive human-machine interaction. We will draw examples from both collaborative manufacturing and robot tutoring.
|May 30||13:00-14:30||Lunch break and poster session (Senate Gallery)|
|May 30||6th Sense (BCI) Session, Chair: Ilana Nisky|
|May 30||14:30-15:15||Miriam Zacksenhouse, Technion
Brain-computer interfaces - Overview, challenges and errors
Brain-computer interfaces (BCIs) provide direct communication between the brain and the external world, with promising applications for restoring movement and communication capabilities in disabled people and enhancing and extending the performance of healthy people. Non-invasive BCIs, which are based on electroencephalography (EEG), are especially promising since they are relatively easy to use (compared with invasive BCIs). Indeed, commercial products are already available, including a hands-free BCI speller and BCIs for gaming. However, given the low signal-to-noise ratio and poor spatial resolution of EEG, non-invasive BCIs are prone to errors, and their extension to movement control presents significant challenges in signal processing, feature extraction, pattern analysis and machine learning. The first part of the talk will review different methods employed in EEG-based BCIs, and then focus on two of the main challenges in this area: single-trial analysis and errors. The second part of the talk will focus on error-related potentials – EEG activity evoked in response to errors – including their interpretation within theories of error processing in the brain, and their potential detection for automatic error correction.
|May 30||15:15-16:00||Eilon Vaadia, Hebrew University
Volitional Control of Neuronal activity by Brain Machine Interface
Since Hippocrates (460 BC) mankind has known that the brain is the origin of the self, the source of joys and sorrows. The intriguing features of the brain are not in its ‘perfect’ performance in controlling arm movements or “sensing” a scene. Like many, I admire more the miraculous creativity of the brain, which learns and memorizes events in the external world and creates internal models of reality. Brain-machine interface (BMI) provides a unique setting for exploring a fundamental question: how does the brain learn to generate these internal models? Our laboratory uses BMI to test the hypothesis that the brain continuously uses two aspects of learning: (1) mapping instructions to actions and (2) predicting the results of the planned actions. The lecture will discuss this “reductionist” way to study how neuronal populations in the brain generate an internal model of the world. We have found consistent changes of neuronal activity in motor cortex that may explain how the brains of monkeys learn to interact with a machine. Further, we have succeeded in inducing new patterns of electrical activity and associating them with BMI tasks. The results teach us central features of the system principles of learning in the brain, and of the role of populations of cells in the generation of internal models of reality. Methodologically, we provide an adaptive device that may become extremely useful in a wide range of clinical applications, far beyond the typical use of BMI for motor prostheses. For example, the same principle can be used to regulate synchronization and patterns of activity when they malfunction in psychiatric conditions (schizophrenia, depression and others) and in neurological disorders such as epilepsy, Parkinson’s and Alzheimer’s disease.
|May 30||16:00-16:45||Jan Peters, TU Darmstadt
Machine learning of motor skills for robots: From simple skills to table tennis and manipulation
Autonomous robots that can assist humans in situations of daily life have been a long-standing vision of robotics, artificial intelligence, and cognitive sciences. A first step towards this goal is to create robots that can learn tasks triggered by environmental context or higher-level instruction. However, learning techniques have yet to live up to this promise, as only few methods manage to scale to high-dimensional manipulator or humanoid robots. In this talk, we investigate a general framework suitable for learning motor skills in robotics which is based on the principles behind many analytical robotics approaches. It involves generating a representation of motor skills by parameterized motor primitive policies acting as building blocks of movement generation, and a learned task execution module that transforms these movements into motor commands. We discuss learning on three different levels of abstraction, i.e., learning for accurate control is needed to execute, learning of motor primitives is needed to acquire simple movements, and learning of the task-dependent “hyperparameters” of these motor primitives allows learning complex tasks. We discuss task-appropriate learning approaches for imitation learning, model learning and reinforcement learning for robots with many degrees of freedom. Empirical evaluations on several robot systems illustrate the effectiveness and applicability to learning control on an anthropomorphic robot arm. These robot motor skills range from toy examples (e.g., paddling a ball, ball-in-a-cup) to playing robot table tennis against a human being and manipulation of various objects.
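A widely used instance of the "parameterized motor primitive policies" mentioned in the abstract is the dynamic movement primitive (DMP): a critically damped spring pulling the state toward a goal, shaped by a learnable forcing term. The sketch below is a standard one-dimensional textbook formulation, with gains and basis-function parameters chosen by us for illustration.

```python
import numpy as np

def dmp_rollout(y0, goal, weights, centers, widths, tau=1.0, dt=0.01, steps=1000):
    """Roll out a 1-D discrete dynamic movement primitive.

    A spring-damper system pulls y toward the goal; a forcing term
    (weighted Gaussian basis functions of a decaying phase variable x)
    shapes the transient. With zero weights the rollout is a smooth,
    non-overshooting reach from y0 to the goal."""
    alpha, beta, alpha_x = 25.0, 25.0 / 4.0, 3.0  # critically damped gains
    y, yd, x = y0, 0.0, 1.0
    traj = []
    for _ in range(steps):
        psi = np.exp(-widths * (x - centers) ** 2)
        forcing = (psi @ weights) / (psi.sum() + 1e-10) * x * (goal - y0)
        ydd = alpha * (beta * (goal - y) - yd) + forcing
        yd += ydd * dt / tau
        y += yd * dt / tau
        x += -alpha_x * x * dt / tau  # phase decays, silencing the forcing term
        traj.append(y)
    return np.array(traj)

centers = np.linspace(0.0, 1.0, 10)
widths = np.full(10, 50.0)
traj = dmp_rollout(y0=0.0, goal=1.0, weights=np.zeros(10),
                   centers=centers, widths=widths)
```

Imitation learning fits `weights` to a demonstrated trajectory; reinforcement learning then perturbs those same weights, which is what makes the primitive a convenient building block for the "hyperparameter" learning the abstract describes.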
Vision: from Minds to Machines workshop, May 31-June 01
|Date||Time||Session and Talk|
|May 31||Workshop opening|
|May 31||09:00-10:00||Registration and refreshments|
|May 31||10:00-10:10||Greetings, Ohad Ben-Shahar, Computer Science Department, BGU|
|May 31||10:10-10:20||Greetings, Dr. Andrey Broisman, Ministry of Science, Technology and Space|
|May 31||Early Vision Session, Chair: Galia Avidan|
|May 31||10:20-11:05||Hamutal Slovin, Bar Ilan University
Encoding local stimulus attributes and higher visual functions in the visual cortex of behaving monkeys
One of the main tasks of the visual system is to combine edges and surfaces of individual objects into a perceptual group, and thus create a representation of visual scenes in which multiple objects are segregated from the background. In this talk we will review our recent findings on cortical processing of edges and surfaces, as well as cortical mechanisms underlying the segregation of one or a few objects from the background in the primary visual cortex (V1) of behaving monkeys. We used voltage-sensitive dye imaging to measure neuronal responses in V1 and found that cortical responses of figure-ground segregation are divergent, composed of figure enhancement and background suppression. Further investigation of a more realistic natural scene with multiple objects suggests that separate objects are labeled by different response amplitudes. Finally, we studied the effects of fixational saccades and microsaccades on figure-ground modulation, and found that perceptual processing in V1 shows spatial discontinuity but higher-order stability over time.
|May 31||11:05-11:50||Uri Polat, Tel-Aviv University
Using neuroscience technology to improve visual functions by enhancing brain performance
In our dynamic contemporary world, the processing of visual information must be completed within a limited time, either before saccadic eye movements occur or before objects change their location. Often, the visual input to the brain is degraded by the optics of the eye with age, creating a bottleneck for the processing of visual information and, consequently, producing multiple negative effects on reading, visual acuity (VA), contrast sensitivity, reaction time, stereo acuity, and processing speed. We used the GlassesOff mobile application on iDevices to improve the vision of young people, Air Force combat pilots, and participants with aging eyes. Training (15 minutes/session, 3 times/week) covered a wide range of stimuli and tasks. All tested functions were significantly improved, enabling glasses-free reading, without changes in the optics of the eye. Moreover, by improving processing speed, the processing time of 6/6 targets (VA) was decreased by 80 msec (39%). Therefore, training results in a more efficient retrieval of information, compensating for the blurred input and enhancing vision to above-normal levels.
|May 31||11:50-12:35||Hedva Spitzer, Tel-Aviv University
Neuronal mechanism for compensation of Longitudinal Chromatic Aberration-derived algorithm
The human visual system faces many challenges, among them the need to overcome imperfections of its optics, which degrade the retinal image. One of the most dominant limitations is longitudinal chromatic aberration (LCA), which causes short wavelengths (blue light) to be focused in front of the retina, thus blurring the retinal chromatic image. In this study we ask the intriguing question of how, despite the imperfections of the ocular optics, the perceived visual appearance is still that of a sharp and clear chromatic image. We propose a plausible neural mechanism and computational model that are supported by known physiological and psychophysical evidence. The model suggests that the visual system overcomes the LCA through two main known properties of the S-channel: a) omitting the contribution of the S-channel from the luminance pathway (utilizing only the L and M channels); b) having large and coextensive receptive fields. The model is based on the small bistratified cells, which have large and coextensive receptive field regions. We show how integrating these basic principles provides significant compensation for LCA. This has been demonstrated by computational simulations of our model on real images. We further suggest that the neuronal compensation mechanism can shed light on a prominent visual phenomenon of large color shifts (Monnier & Shevell, 2003), which is still regarded as an enigmatic effect.
|May 31||12:35-13:20||Shimon Edelman, Cornell University
To understand vision, we must study real behavior, evolution, and the brain
The standard answer to the central questions of vision — "What does it mean to see?" — has been, as Marr (following Aristotle) phrased it, "To know what is where by looking." However, within the proper evolutionary and behavioral context of biological vision, none of the traditional "high-level" visual tasks, such as object recognition (let alone the posited "early vision" capabilities supposedly supporting those, such as "edge detection") make much sense. In particular, the ability to form stimulus-response associations (which is what "object recognition" amounts to and also the only thing that popular vision architectures such as deep networks do well) is only a small part of what it takes to support realistic animal (or robot) behavior in the wild. In my talk, I shall outline a program for vision research that avoids the stimulus-response fallacy and looks for inspiration in computational ethology, evolutionary theory, and neuroscience.
|May 31||13:20-14:30||Lunch break and Poster Session (Senate Gallery)|
|May 31||Mid-Level Vision Session, Chair: Tzi Ganel|
|May 31||14:30-15:15||Pieter Roelfsema, Netherlands Institute for Neuroscience
Decision making and working memory in early visual cortex
Most theories hold that early visual cortex is responsible for the local analysis of simple features while cognitive processes take place in higher areas of the parietal and frontal cortex. However, these theories are not undisputed, because there are findings that implicate early visual cortex in visual cognition in tasks where subjects reason about what they see. I will discuss the contribution of early visual cortex to hierarchical decision-making and working memory. We used a hierarchical decision-making task to examine how monkeys solve a decision tree with stochastic sensory evidence at multiple branching points. We found a first parallel phase of decision making in areas V1 and V4 in which multiple decisions were considered at the same time. This was followed by an integration phase in which the optimal overall strategy crystallized as the result of interactions between the local decisions. In the working memory task, we examined how visual information is maintained in the different layers of V1. When the monkeys memorized a stimulus, we found a profile of top-down inputs in the superficial layers and layer 5, causing an increase in the firing rates in feedback-recipient layers. A visual mask erased the V1 memory activity, but it then reappeared at a later point in time. These results provide new insights into the role of early visual cortex in the implementation of complex mental programs.
|May 31||15:15-16:00||Yafa Yeshurun, Haifa University
The reciprocal relations between attention and perceptual organization
Prior research suggests that the relations between perceptual organization and attention are crucial for successful comprehension of the visual display, yet the nature of these relations is not well understood. Here, I will focus on the reciprocal influences of attention and perceptual organization, demonstrating effects of attention on perceptual organization, and vice versa. In the first part of the talk I will describe effects of spatial attention on perceptual organization. Specifically, I will suggest that the allocation of attention to a specific location results in integration of information over longer periods of time but smaller spatial regions. The former involves effects of attention on feature-fusion, and the latter involves effects of attention on spatial crowding. In the second part of the talk I will describe effects of perceptual organization on spatial attention. In this part I will suggest that the mere organization of some elements into a perceptual group attracts attention automatically to the group's location, even when the perceptual group is irrelevant to the task and unpredictive of the task-relevant target. Moreover, I will show that the effectiveness of this spontaneous attentional capturing depends on the goodness of the emerging object and on the number and types of grouping principles (e.g., collinearity, closure, symmetry) mediating the organization.
|May 31||16:00-16:15||Coffee break, (Senate Gallery)|
|May 31||16:15-17:00||Zygmunt Pizlo, Purdue University
The role of Symmetry in 3D vision: Psychophysics and computational modeling
Almost all animals are mirror-symmetrical. This includes birds, fish, cats, elephants, humans, horses, pigs and insects. Animal bodies are symmetrical because of the way the animals move. Plants are symmetrical because of the way they grow. Man-made objects are symmetrical because of the function they serve. Completely asymmetrical objects are dysfunctional. The ubiquity of symmetry in our natural environment makes symmetry a perfect candidate for serving as a particularly effective prior in visual perception. Is symmetry such a prior? How much does the computational recovery of 3D shape benefit from symmetry? A lot. Symmetry actually provides the basis for the accurate recovery of 3D shapes from 2D images, arguably the most difficult problem in vision. Does our visual system use symmetry as its most important prior? Is the recovered 3D percept veridical? Does the 3D percept deteriorate when the object’s symmetry is removed? Is there another way to produce veridical 3D percepts? These questions and answers will be discussed within a context provided by the results obtained in psychophysical experiments and by computational modeling. We will also ask whether human subjects update their symmetry prior through repeated experiences with visual stimuli. Should they? Can a symmetry prior, active with a single eye, be rendered ineffective when the observer views 3D shapes with both eyes? Psychophysical results on monocular and binocular 3D shape recovery will be presented and then compared with a Bayesian model in which the visual data are combined with a number of priors. In this model, the priors supplement, rather than conflict with, the visual information contained in the visual stimulus. I will conclude this talk by showing how the symmetry inherent in 3D objects can be used to solve the "Figure-Ground Organization" problem, namely detecting 3D objects in a 3D scene and detecting them in a 2D retinal or camera image.
|May 31||17:00-17:45||John Tsotsos, York University
Vision is necessarily a dynamically tunable process
We claim that a full theory of vision cannot be both biologically predictive and attention-free, and cannot be both hierarchical and strictly feedforward. This is in stark contrast to the dominant view of machine processing of visual data. We question the manner in which biological inspiration/realism is used in that view and argue that human vision is a more complex, active and dynamic process. As a result, an important characteristic of mid-level visual representations and processes is that they can be modulated by task needs. We replace the almost ubiquitous hierarchical pipeline with a lattice of pyramid representations, tunable via attentional processes, in order to address the generality of human vision, including eye movements.
|Date||Time||Session and Talk|
|June 1||08:30-09:00||Registration and refreshments (Note the earlier time today)|
|June 1||High-Level Vision Session, Chair:Ohad Ben-Shahar|
|June 1||09:00-09:45||Rafi Malach, Weizmann Institute
Neuronal dynamics underlying human visual perception
A fundamental question in human visual research concerns the neuronal mechanisms that underlie the changes from one percept to another. Such changes are often highly non-linear, showing an abrupt transition around a perceptual threshold. In my talk I will review recent work from our group that argues that an ignition-like process of neuronal activity, likely driven by local recurrent loops, underlies such content changes. We find evidence for such ignitions under different conditions such as brief stimulations, spontaneous blinks and natural viewing conditions. Together, these results point to local recurrent activity as a critical factor underlying human perceptual awareness.
|June 1||09:45-10:30||Nurit Gronau, Open University
The necessity of visual attention to object and scene categorization: the role of category-type, contextual environment and task-relevance
The extent to which real-world scenes and objects are categorized and understood when appearing outside the main focus of attention has been the subject of continuous debate over the last decade. Most studies investigating this issue have used experimental paradigms in which processing a scene’s “gist” was an explicit part of the task requirements, or alternatively, scenes were irrelevant to task demands yet their identity was explicitly reported by participants. While the first type of study may have overestimated unattended scene processing, as attention was explicitly allocated to the scenes’ location, the latter type may have underestimated scene processing due to its reliance on subjective response criteria and on working memory capacity limits. The present research examined scene and object processing using an indirect, online behavioral measure (i.e., a ‘categorical identity effect’) which allows the assessment of the influence of irrelevant distractor images on behavior. I will present data from several series of studies in which we compared the effects of scene and object categorization when images were fully attended versus when they were unattended and served as task-irrelevant distractors. Overall, our research demonstrates that categorical information is not registered when stimuli are presented at an unattended location, unless it is relevant to task demands and to response-selection processes. In my talk I will present findings concerning the categorization and identification of a variety of visual categories, and I will ask whether certain stimuli (e.g., faces) benefit from enhanced processing even when presented outside the main focus of attention.
|June 1||10:30-11:15||Galit Yovel, Tel-Aviv University
Are faces important for face recognition?
Faces convey rich information that is critical for intact social functioning. Most research on face recognition to date has primarily used images of unfamiliar static faces. However, the face recognition system evolved to recognize familiar people that we often see in motion, rather than to discriminate among static images of unfamiliar faces. How do we recognize familiar people? What is the role of body motion in recognition of the whole person? In my talk, I will present studies that address these questions. Overall, I will make the claim that the field of face recognition should shift toward the study of the whole, dynamic, familiar person.
|June 1||11:15-12:00||Shimon Ullman, Weizmann Institute
Atoms of Recognition
The human visual system makes highly effective use of limited information: it can recognize not only whole objects, but also severely reduced sub-configurations, in terms of size or resolution. The ability to recognize such minimal images is crucial for the interpretation of complex scenes, but it is also challenging because recognition in this case depends on the effective use of all the available information. Our human and computer vision studies show that humans and existing models are very different in their ability to interpret minimal images. I will describe the studies and discuss the implications for the representations used for recognition, the brain mechanisms involved, and algorithms for the interpretation of complex scenes.
|June 1||12:00-12:10||Closing, Yael Edan, ABC Robotics Initiative, BGU|
|June 1||12:10-14:00||Lunch (Senate Gallery)|
|June 1||14:00-16:00||Campus and labs tour|