Complexity International, Volume 11
A novel sound visualization process in virtual 3D space: the human auditory perception analysis by ecological psychology approach
S. Nomura¹, T. Shiose¹, H. Kawakami¹, O. Katai¹, and K. Yamanaka²

¹ Department of Systems Science, Graduate School of Informatics, Kyoto University, 606-8501 Kyoto, Japan
Email: shigueo@sys.i.kyoto-u.ac.jp, {shiose, kawakami, katai}@i.kyoto-u.ac.jp

² Faculty of Electrical Engineering, Federal University of Uberlândia, 38400-902 Uberlândia, Brazil
Email: keiji@ufu.br

Abstract. In this paper we propose sound visualization, a process that provides listeners, considered as perceptual systems, with a way to learn to perceive character-related sounds in a virtual 3D acoustic space. A character-related sound represents the most appropriate sound event to be perceived (read) even by the visually impaired. The test data (characters) used in our experiments were segmented from degraded images of real license plate photos. Multilayer Perceptron networks and a sound space processor were used to transform these characters into perceptible auditory textures (non-speech audio) called character-related sounds. The perception accuracy for each character-related sound was assessed by ten listeners in several experiments comprising perceptual training and testing sessions. The term sound visualization describes the perception of character-related sounds through our auditory system and the analysis of that perception by an ecological psychology approach. The proposed sound visualization was useful for people who have difficulty acquiring the perceptual skills to read character-related sounds. The results show that relatively simple and short periods of perceptual training are enough to enhance listening abilities for classifying sounds. The results also indicate that perception depends on psychoacoustic learning mechanisms rather than on simple spatial localization of acoustic signals. Furthermore, the work encourages us to conclude that sound visualization is a viable alternative for the visually impaired who use computers as communication media.

1. Introduction
Such human sensations as taste, smell, heat, and touch are not suitable channels for data presentation because our perception of them is not quantitative. Sound, however, is a useful medium for data presentation (Yeung 1980).

The combination of visual and auditory imagery (Kendall 1991) offers a way of presenting and communicating complex events that emulates the richness of daily experience. These events include what happens to objects in general, and what animate objects cause to happen. Sound events arise from action, that is, from the transfer of energy to a sound object in everyday life. We learn to recognize the occurrence of sound events and to relate them to physical events, even in childhood. Through a lifetime of experience we learn to classify and identify heterogeneous sound events. Some classification of sound events tends to be categorical, and simple categorical distinctions can be exploited in auditory presentations to communicate important distinctions in the data.

An example of such auditory presentation for communicating data is scientific visualization (Kendall 1991). Traditionally, scientific visualization has relied on computer graphics and visual data representation, but recent advances in multimedia and virtual reality have opened up other ways to interpret data. The primary goal of scientific visualization is to provide scientists with improved representations of complex phenomena. Many researchers (Buxton 1990; Kendall 1991) have suggested that sounds can play a more important role in the study of these complex phenomena through the use of auditory data representation and visualization by ear. Yeung (1980), for example, explored the use of sound as an alternative to the graphic presentation of data because several problems are associated with visual presentations. First, adequate standards are generally not available for visual displays. Second, the resolution of the representation is generally poor. Third, there are problems with the actual orthogonality of the axes in visual representations. Finally, the problem remains of scaling the measurements in each dimension.

In this paper, we propose sound visualization as a novel process to generate character-related sounds and train our ability to classify sound events by auditory perception in a virtual 3D acoustic space. First, sound visualization means the perception process of characters using our auditory system, analysed by an ecological psychology approach in which listeners are considered as perceptual systems. After reviewing related papers, we present some advantages of using our ears, reasons for adopting an ecological psychology approach to auditory perception, and the benefits of using non-speech audio. We then propose a road to achieving the sound visualization process. Second, a character-related sound represents the event to be perceived and classified by the visually impaired (listeners) experiencing the processed sounds in a virtual 3D space. We describe the virtual 3D acoustic space and the proposed process for generating experimental data of character-related sounds. Finally, we present the experimental results and discuss auditory event perception from an ecological psychology approach.

2. Evidence: advantages of using our ears

Green (1976) has suggested that the ear must overcome inertia to achieve its maximum possible sensitivity: friction must be overcome because the ear is a mechanical system. According to Handel (1989), the absolute sensitivity of the ear may not compare to that of the eye, but it is worthwhile to compare the relative speeds with which the two senses attain their maximum sensitivity: the eye takes more than 30 minutes in darkness to attain its maximum sensitivity, while the ear reaches its maximum in 0.1 second. Corey and Hudspeth (1979) investigated the sensitivity of the ear from a neuroscience view, concluding that the bundle of stereocilia works like a light switch. When the bundle is prodded in one direction (from the shortest cilia to the tallest) it turns the cell on; when the bundle moves in the opposite direction, it turns the cell off. Based on data from thousands of experiments in which they wiggled the bundle back and forth, they calculated that hair cells are so sensitive that deflecting the tip of a bundle by the width of an atom is enough to make the cell respond. This infinitesimal movement, which might be caused by a very quiet sound at the threshold of hearing, is equivalent to displacing the top of the Eiffel Tower by only half an inch. Other investigators have also concluded that the hair cells' response is amazingly rapid.

Another component of sensitivity is our ability to distinguish changes in sound. Extensive experimentation using pure sounds has measured the changes in intensity or in frequency required before subjects can judge two sounds as being different. Our hearing is almost unbelievably complex. For instance, we can discriminate accurately between highly similar sounds even from birth: the slight difference between the sounds "pah" and "bah" is noticed by newborn infants (Eimas 1975).

We have highly developed skills in what might be called everyday listening (Buxton 1990), skills that are heavily relied upon in everyday tasks such as driving and crossing the street. Everyday listening is the experience of listening to events rather than sounds (Gaver 1993). We are concerned with listening to the things going on around us, hearing which things are important to avoid and which might offer possibilities for action. This kind of experience seems qualitatively different from listening to music (the perceptual dimensions of the sound itself) and is not well understood by traditional approaches to audition. Unfortunately, our everyday listening skills have been virtually ignored in interaction with computers and in traditional methods of analysing complex data. This waste of the audio channel should be avoided.

Several listening-based studies from an ecological perspective (Vanderveer 1979; Warren and Verbrugge 1984; Heine and Guski 1991; Shaw et al. 1991; Gaver 1993) suggest that an ecological approach to audition could be fruitful. However, a comprehensive account of everyday listening has yet to emerge. It is necessary to develop an account of ecologically relevant perceptual entities, and of the dimensions and features of events that we actually obtain through listening; that is, to ask "What do we hear?" (Gaver 1993). In this work, we studied adequate dimensions and features of events (character-related sounds) to improve perceptual information based on ecological accounts. We concentrated on such everyday listening in our previous works (Shiose et al. 2004; Nomura et al. 2004b). This work analyses the perceptual dimensions and attributes of the sound-producing event (the character-related sound) and its environment with respect to human auditory perception by taking an ecological psychology approach.

3. Ecological approach to auditory event perception

Basically, ecological psychology studies human-environment interrelationships and human perception in rich environments (Gibson 1979). According to an ecological approach, perception is usually of complex events and entities in the everyday world. It is direct, unmediated by inference or memory. Elemental stimuli for perception do not necessarily correspond to primitive physical dimensions but may instead be specified by complex invariants of supposedly primitive features. Thus, complex perceptions rely on seemingly complex stimuli (or perceptual information), not on the integration of sensations. Since there is rich and varied information in the world, our descriptions are no longer limited to primitive physical dimensions, because exploration of the world over time becomes an important component of perception. Studies of perception should uncover the ecologically relevant dimensions of this perception and their invariant perceptual information from an ecological account.

Sound provides information about the interaction of materials at a location in an environment. For instance, we can hear an approaching automobile, its size, and its speed. These are the phenomena of concern to an ecological approach to perception, and we have found a great advantage in adopting such an approach. Our previous work (Shiose et al. 2004), which concerned hearing an approaching automobile, is useful in understanding what we hear and thus the scope of an ecological approach to audition. In that experiment, a source event (an automobile) caused sound waves, some of which radiated directly to an observation point while others were modified by the environment before being reflected to the listener. The information of the experience of everyday listening is captured by the auditory system. In other words, a given sound provides information about an interaction of materials at a location in an environment. Changes in loudness caused by changes in distance from a source may also provide information about time-to-contact, in a fashion analogous to changes in visual texture (Shaw et al. 1991). A system for training the ability to pick up information about this time-to-contact was a goal of that work.

However, traditional psychologists have little to say about such information because they study only part of the continuum from source to experience. Typical research focuses on the sound itself, analysing it in terms of such properties as amplitude and perceived loudness, or frequency and perceived pitch. Such research misses the higher-level structures that are information about events. Taking an ecological approach, on the other hand, implies analyses of the mechanical physics of source events, the acoustics describing the propagation of sound through an environment, and the properties of the auditory system that enable us to pick up such information. The results of such analyses will be a characterization of acoustic information about sources, environments, and locations which can be empirically verified.

In what follows, we focus on analysing auditory perception when individuals are requested to classify character-related sounds. Obviously, this is only part of the story that an ecological approach to audition should tell, but it is an intriguing part that is usually neglected. Indeed, knowledge about what we hear of the world, that is, about the attributes of events (characters) specified by sounds, is still very limited.

4. Benefits of non-speech audio

Our ears provide us with a means to extract information from non-speech audio (Buxton 1990) which cannot be, or is not, displayed visually. For example, knocking on objects tells us a great deal about the materials from which they were made, leading to important observations about the quality of sound compared to sight. But while objects may be visible, their associated sounds emerge only as the result of specific actions. The latent information embedded in an object's sounds is a potent resource that can be exploited when transferred to computational objects. For example, in the SonicFinder (Gaver 1989), the size of files, disks, etc. is encoded in how high (small) or low (large) the resulting sound is, and the amount of reverberation associated with a sound provides cues about how full (dry) or empty (reverberant) the object (such as a disk) is; a minimal sketch of this kind of mapping follows the list below. The use of non-speech audio at the user interface has become increasingly popular due to its potential benefits (Brewster et al. 1993). Non-speech audio:

- increases the information communicated to the user;
- reduces the amount of information received through the visual channel;
- improves task performance by sharing information across different sensory modalities;
- offers more chances to identify the data;
- can be heard from 360 degrees without the need to concentrate on an output device, providing greater flexibility;
- is good at capturing a user's attention while he is performing another task.
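As an illustration of the SonicFinder idea above, the following Python sketch maps a file size onto a pitch, larger files giving lower notes. The size window, pitch range, and logarithmic mapping are our assumptions for the example; Gaver (1989) does not publish this exact formula.

```python
import math

def size_to_pitch_hz(size_bytes, f_low=220.0, f_high=880.0,
                     s_min=1e3, s_max=1e9):
    """Map a file size onto a frequency: small files high, large files low.

    The 220-880 Hz range and the kB-GB size window are illustrative
    assumptions, not values taken from the SonicFinder.
    """
    # Position of the size within the window, on a log scale, clamped to [0, 1].
    t = (math.log10(size_bytes) - math.log10(s_min)) \
        / (math.log10(s_max) - math.log10(s_min))
    t = min(max(t, 0.0), 1.0)
    # Interpolate pitch geometrically, inverted so big files sound low.
    return f_high * (f_low / f_high) ** t

print(size_to_pitch_hz(2_000))        # a small file maps near the high end
print(size_to_pitch_hz(500_000_000))  # a big file maps near the low end
```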

The following dimensions of sound affect how we perceive the information presented to our ears: pitch, rhythm, loudness, duration, spatial location, timbre, and attack. Humans are able to perceive changes in these dimensions with various degrees of effectiveness (Wise 1993). Brewster et al. (1993) emphasized that musical instrument timbres should be used instead of the synthetic timbres created by the sine, triangular, and square waveforms suggested by Blattner et al. (1989). So, in accordance with Brewster et al. (1993), we used a conventional violin to produce musical timbres as the original sounds for the experiments, thereby taking advantage of non-speech audio.

5. Road to sound visualization process

The ability to localize a sound in natural space is present in nearly all animals that possess a hearing mechanism (Erulkar 1972). Accurate sound localization (Fujiki et al. 2002) is ecologically important for most animal species, since it is fundamental for survival, communication, and learning about sight-sound correspondences. Despite the practical significance of this ability, however, our knowledge about the development of sound localization skills in humans is fairly limited. Sound localization ability has been exploited in our work as a road to reach a sound visualization process by listeners.

The term localization refers to judgments of the direction and distance of a sound source. If headphones are worn, the sound image can be located inside the head; lateralization describes the apparent location of the sound source within the head. Headphones allow precise control of interaural differences and eliminate effects related to room echoes. Thus lateralization may be regarded as a laboratory version of localization which provides an efficient means of studying the perception of sound direction.

In terms of locomotion, Handel (1989) has investigated the various perceptual consequences of the environment that all contribute to auditory perception: room size, reverberation time, reflection of sound waves, and source or listener movement. All of these cues modify the sound at the listener's ears from the sound originally produced (here, by the violin). The sound wave pattern changes in irregular ways even when walking around a simple room. In spite of such irregularities, sighted and nonsighted people can develop remarkable abilities to use reflected or reverberated sound patterns to move around and locate their position in rooms, corridors, and even outdoors. Individuals can also locate moving and nonmoving sound-producing objects as well as non-sound-producing obstacles, and can often describe the size, shape, and texture of these objects from the pattern of the reflected and reverberated sounds. This is a sophisticated, learned skill, and it demonstrates the potential of this information for understanding the external world. On this potential information we have based a method to provide listeners with the capacity for sound visualization: distinguishing character-related sounds in a virtual 3D space.

Experimentally, the road to sound visualization must occur in two sessions. First, a listener is trained until he can capture the necessary cues to perceive character-related sounds. In the next session, the subject should learn to capture perceptual information from the heard character-related sound in order to classify it during the test.

6. Virtual 3D space and character-related sounds

In our daily interaction with the world, when we hear such different qualities of sound as pitch and timbre, we can also determine the direction of the sound. Nowadays, the world can be simulated by sophisticated equipment that generates sounds we perceive as coming from different points in a virtual 3D space. Localization of sound depends on the way sound waves from the same source differ from each other as they reach the left and right ears. The head, torso, shoulders, and outer ears modify the sound arriving at the ears. This modification can be described by a complex response function, the Head-Related Transfer Function (HRTF) (Møller et al. 1995). Theoretically, HRTFs contain all the information about the sound source's location (its direction and distance from the listener) and can be used to generate binaural cues (interaural time differences, ITDs, and interaural intensity differences, IIDs) and monaural cues (caused by the observer's own head and pinnae, which spectrally colour the sounds) (Blauert 1997). If properly measured and implemented, HRTFs can generate a virtual acoustic 3D space.

Our virtual acoustic 3D space is based on a three-dimensional sound space processor (a Roland RSS-10) that adds the appropriate cues to original sounds produced on a violin in order to generate character-related sounds. Figure 1(a) shows a schematic overview of the sound generation system that produces character-related sounds using virtual acoustic space, similar to previous research (Shiose et al. 2004). Each character-related sound reaches the ears through headphones (Fig. 1(b)), and the corresponding cues depend on the quality (shape) of each character from an original degraded image.

Figure 1. Sound generation system: (a) a schematic overview; (b) front view photo.
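The RSS-10 applies these binaural cues in hardware. Purely as an illustration of the principle, the following Python sketch renders the simplest of them, the ITD, using Woodworth's spherical-head approximation; the head radius, the formula, and the omission of IID and pinna filtering are textbook assumptions, not details of the paper's equipment.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s at roughly room temperature
HEAD_RADIUS = 0.0875     # m, an average human head (assumed value)

def interaural_time_difference(azimuth_rad):
    """Woodworth's spherical-head approximation of the ITD, in seconds.

    azimuth_rad: source azimuth in radians, 0 = straight ahead,
    positive toward the right ear (valid for azimuths up to 90 degrees).
    """
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (azimuth_rad + np.sin(azimuth_rad))

def apply_itd(mono, fs, azimuth_rad):
    """Render a mono signal as a stereo pair whose only cue is the ITD."""
    delay_samples = int(round(abs(interaural_time_difference(azimuth_rad)) * fs))
    delayed = np.concatenate([np.zeros(delay_samples), mono])
    prompt = np.concatenate([mono, np.zeros(delay_samples)])
    # Positive azimuth: the sound arrives at the right ear first.
    left, right = (delayed, prompt) if azimuth_rad > 0 else (prompt, delayed)
    return np.stack([left, right], axis=1)

# Example: a 440 Hz tone placed 45 degrees to the listener's right.
fs = 44100
t = np.arange(fs) / fs
stereo = apply_itd(np.sin(2 * np.pi * 440 * t), fs, np.deg2rad(45))
```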


7. Proposed Process: Experimental Procedure

The experimental process in this work is quite similar to previous work (Nomura et al. 2004b) and consists of the following steps:

Step 1: Extraction of feature vectors from original images. The experimental data were obtained through an image pre-processing system (Nomura et al. 2002, 2004a) followed by character segmentation and feature vector extraction (Nomura and Yamanaka 2002). The extracted feature vectors (20 × 15) are 300-dimensional.

Step 2: Dimensionality reduction by an MLP trained with the back-propagation algorithm (Haykin 1999). In this step, the 300-dimensional feature vectors are reduced to (x, y, z) data points represented in a 3D Cartesian coordinate system. Figure 2 presents a sample of feature vectors and their corresponding 3D vectors (x, y, z coordinates) after dimensionality reduction; a minimal sketch of such a reduction follows these steps.

Step 3: Character-related sound generation. In the experiments, each note (produced on a conventional violin) was played by an AR-3000 audio recorder, and the corresponding sound effect (including cues) with movement, reverberation, or reflection was created by an RSS-10 sound space processor, according to the sound generation process presented in Fig. 1 of the previous section.
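The paper specifies only the 300-dimensional input, the 3-dimensional output, and back-propagation training toward the class targets of Table 1, so the Python sketch below is a minimal stand-in for Step 2: a one-hidden-layer network fitted by plain batch gradient descent. The hidden size, learning rate, epoch count, and the random stand-in feature vectors are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Class-specific 3D targets from Table 1 (V1, V2, V3 for characters 0, 1, 2).
TARGETS = np.array([
    [ 1.5773, -0.4226, 0.5773],   # V1 -> character "0"
    [-0.4226,  1.5773, 0.5773],   # V2 -> character "1"
    [-1.1547, -1.1547, 0.5773],   # V3 -> character "2"
])

def init_mlp(n_in=300, n_hidden=20, n_out=3):
    # Hidden-layer size and initialization scale are assumptions.
    return {"W1": rng.normal(0, 0.1, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
            "W2": rng.normal(0, 0.1, (n_hidden, n_out)), "b2": np.zeros(n_out)}

def forward(p, X):
    H = np.tanh(X @ p["W1"] + p["b1"])       # hidden activations
    return H, H @ p["W2"] + p["b2"]          # linear (x, y, z) output

def train(p, X, Y, lr=0.05, epochs=3000):
    """Plain batch back-propagation minimizing mean squared error."""
    for _ in range(epochs):
        H, out = forward(p, X)
        err = (out - Y) / len(X)             # dE/d(out), averaged over batch
        dH = (err @ p["W2"].T) * (1 - H**2)  # back-propagated through tanh
        p["W2"] -= lr * (H.T @ err); p["b2"] -= lr * err.sum(0)
        p["W1"] -= lr * (X.T @ dH);  p["b1"] -= lr * dH.sum(0)
    return p

# Stand-in data: 30 flattened 20 x 15 binary character images and their labels.
X = rng.integers(0, 2, (30, 300)).astype(float)
labels = rng.integers(0, 3, 30)
params = train(init_mlp(), X, TARGETS[labels])
_, coords3d = forward(params, X)             # the reduced (x, y, z) points
```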

Figure 2. Sample of feature vectors and corresponding 3D vectors used in the experiments.

The two sessions of each experiment (the road to sound visualization) proceeded as follows, using two data sets. The Cartesian coordinates of these data sets were multiplied by 10 to represent the distance in meters between the subject's head and the sound location. Spatial geometry determined the movement direction of each character-related sound corresponding to the character to be perceived. Figure 3(a) presents the 2D movement directions for the training vectors, with the source data in Table 1. The geometric characteristics of the sound movement direction are as follows (a sketch of the computation appears after this list):

- The listener's head position is (0, 0, 0), facing the 0y direction.
- The forward position of each sound movement is calculated as a 2 m offset ahead of the subject.
- The angle of direction is calculated from the vector that links the starting point position with the head position.
- The arrows in Fig. 3 indicate the movement direction of each character-related sound.
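Under these conventions, the start point and approach direction of each sound can be computed directly from a target vector. The sketch below is our reading of the geometry; in particular, treating the point 2 m ahead of the listener as the end of the movement is an assumption.

```python
import numpy as np

def sound_trajectory(target_3d):
    """Start point, end point, and azimuth of a character-related sound.

    target_3d: an (x, y, z) vector from Table 1 or from the MLP output.
    Coordinates are multiplied by 10 to give the start position in meters,
    with the listener's head at the origin facing +y.
    """
    start = 10.0 * np.asarray(target_3d, dtype=float)
    head = np.zeros(3)
    end = np.array([0.0, 2.0, 0.0])   # 2 m ahead of the subject (assumed endpoint)
    direction = head - start          # vector linking the start point to the head
    direction /= np.linalg.norm(direction)
    # Azimuth measured from the facing direction (+y) toward +x, in degrees.
    azimuth = np.degrees(np.arctan2(direction[0], direction[1]))
    return start, end, azimuth

# Example with target vector V1 (character "0") from Table 1:
start, end, azimuth = sound_trajectory([1.5773, -0.4226, 0.5773])
print(start, round(azimuth, 1))   # start ~ (15.8, -4.2, 5.8) m
```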


Table 1. Target vectors for training the ANN model.

Vector    x-Coordinate    y-Coordinate    z-Coordinate
V1            1.5773         -0.4226          0.5773
V2           -0.4226          1.5773          0.5773
V3           -1.1547         -1.1547          0.5773

7.1 Training session


The data presented in Table 1 are used as input data to train the subjects to acquire the necessary skill to classify character-related sounds. The subject can also select the most appropriate musical note from the scale presented in Table 2. Sound effects such as reverberation and reflection were not included during the training session.

Figure 3. Sound movement direction for training characters: (a) in a 2D coordinate system (fixed z-coordinate); (b) in a 3D coordinate system (variable z-coordinate).

Table 2. Scale of musical notes.

7.2 Testing session


Character-related sounds corresponding to the 30 vectors obtained by dimensionality reduction in Step 2 were used as input data. The subject's task was to hear a character-related sound and associate it with one of three characters (0, 1, or 2). The subject decides how to classify each character-related sound based on the cues perceived for capturing semantic information.

8. Experimental Results

Three experiments were carried out to analyse auditory event perception (the road to sound visualization) by listeners requested to hear sounds in the virtual 3D acoustic space. The data set (30 test vectors) obtained in Step 2 of the experimental procedure was used in the following experiments. A group of 10 male and 3 female subjects, ranging in age from 20 to 60, participated in the experiments. Here, we present the results for 10 of these listeners.

8.1 Experiment 1: without varying the z-coordinate


In this experiment, each character-related sound moved with a fixed z-coordinate, that is, at a constant height, like an aircraft cruising. Seven listeners from the group of subjects participated in this experiment. The graph in Fig. 4 shows the perception rate of each subject after classifying 30 character-related sounds.

Figure 4. Results of experiment 1.
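The paper does not define "perception rate" explicitly; we read it as the percentage of the 30 test sounds classified correctly, which is consistent with the rates quoted in the discussion (20% = 6/30, 83% is roughly 25/30). A trivial sketch of that reading:

```python
def perception_rate(responses, truth):
    """Percentage of character-related sounds classified correctly.

    responses, truth: sequences of perceived / actual characters
    ('0', '1', or '2') for the 30 test sounds. This formula is our
    assumption about how the reported rates were computed.
    """
    correct = sum(r == t for r, t in zip(responses, truth))
    return 100.0 * correct / len(truth)

# A subject who gets 25 of the 30 sounds right scores about 83.3%.
```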

8.2 Experiment 2: varying z-coordinate


This experiment aimed to verify the influence of sound height variation (a variable z-coordinate) on the perception rate of listeners. We selected subjects 3, 4, and 5 from experiment 1, who classified 30 character-related sounds moving under variable z-coordinates. In other words, the sound movement condition was quite similar to an aircraft taking off from an airport (Fig. 3(b)). Figure 5 presents the detailed results for each perceived character by subject.

Figure 5. Detailed results for each perceived character by subject in experiment 2.

8.3 Experiment 3: including reverberation and reflection


8.3.1 With reverberation effect

Reverberation was set to its maximum level (0 dB attenuation), and the reverberation time was defined as 0.2 s. The virtual room size was set to 4 m for each wall. Figure 6 presents the results after including reverberation in the character-related sound movement, as perceived by subjects 8, 9, and 11.
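For a rough feel for what these settings mean, the sketch below approximates a 0.2 s reverberation in a 4 m room with a single feedback comb filter whose gain is chosen to give a 60 dB decay in the stated time. This is a toy model of our own; the RSS-10's actual reverberation algorithm is not documented in the paper.

```python
import numpy as np

def comb_reverb(signal, fs, rt60=0.2, room_side=4.0, c=343.0):
    """Toy room reverb: one feedback comb filter tuned to the experiment.

    The delay models a round trip between opposite 4 m walls; the feedback
    gain g is set so the echo train decays by 60 dB in rt60 seconds.
    """
    delay = int(fs * 2 * room_side / c)      # round-trip delay in samples
    g = 10 ** (-3.0 * delay / (fs * rt60))   # so g**(rt60*fs/delay) == 1e-3
    out = signal.astype(float).copy()
    for n in range(delay, len(out)):
        out[n] += g * out[n - delay]         # y[n] = x[n] + g*y[n-delay]
    return out

# Example: apply the reverb to one second of noise at 44.1 kHz.
fs = 44100
wet = comb_reverb(np.random.default_rng(0).normal(size=fs), fs)
```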


Figure 6. Results after including the reverberation effect.

8.3.2 With reflection effect

Basically, reflection is caused by the floor and was set to the maximum level (0 dB attenuation). The height between the floor and the listener's head was defined as 1.2 m. The influence of sound reflection on the perception rates of subjects 8, 9, and 11 can be verified in Fig. 7.
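The extra path travelled by the floor echo, and hence its delay relative to the direct sound, follows from mirroring the source in the floor plane 1.2 m below the head. The image-source construction sketched below is standard acoustics rather than a detail given in the paper.

```python
import numpy as np

def floor_reflection_delay(source, head=(0.0, 0.0, 0.0),
                           head_height=1.2, c=343.0):
    """Extra path length and delay of the floor-reflected ray.

    Image-source method: mirror the source in the floor plane, which lies
    head_height (1.2 m in the experiment) below the listener's head, and
    compare the direct and reflected path lengths.
    """
    src = np.asarray(source, dtype=float)
    ear = np.asarray(head, dtype=float)
    floor_z = ear[2] - head_height
    image = src.copy()
    image[2] = 2 * floor_z - src[2]          # mirror the source across the floor
    direct = np.linalg.norm(src - ear)
    reflected = np.linalg.norm(image - ear)
    extra = reflected - direct               # extra meters travelled by the echo
    return extra, extra / c                  # and the corresponding delay in s

# Example: a source 10 m ahead at head height arrives ~0.28 m / ~0.8 ms late.
extra_m, delay_s = floor_reflection_delay((0.0, 10.0, 0.0))
```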

Figure 7. Results after including the reflection effect.

8.3.3 With reverberation and reflection effects

In this case, we included both acoustic effects (reverberation and reflection) at their maximum levels. The perception results for subjects 8, 9, and 11 are presented in Fig. 8.

Figure 8. Results after including the reverberation and reflection effects.

9. Discussion

In experiment 2, we can verify an increase in the perception rate due to the z-coordinate variation of the sound movement. This increase was significant (from 20% to 83%) for subject 5 in Fig. 5. In the case of subject 8, we verify in Fig. 8 that the maximum perception rate (100% for all characters) was reached when both effects (reverberation and reflection) were included as additional cues in the environment of experiment 3. Subject 8 told us in an interview that the cues needed to accurately perceive character-related sounds were easily captured at the start of the sound movement with variable z-coordinates; when the z-coordinate was fixed, he said that he needed more time to perceive and classify the test data.

The results for perceiving "1" and "2" in Figs. 6, 7, and 8 clearly show that if sound elements come from the same location (side) in space, they tend to be grouped together. This grouping process is based on our perceptual ability to localize sounds in our 3D world; the cue can be used reliably, but it can also easily be violated. We have based our sound visualization, developed to provide a learning process for listeners considered as perceptual systems, on the following considerations about perception and the senses.

First, Gibson (1966) showed that senses considered as special senses cannot be reconciled with senses considered as perceptual systems. A system has organs, whereas a sense has receptors. A system can orient, explore, investigate, adjust, optimize, resonate, extract, and achieve an equilibrium. Receptors can only passively receive stimuli, whereas in a perceptual system the input-output loop can be assumed to actively obtain information. In other words, when senses are considered as perceptual systems, the term "sense" means to detect something, which is more accurate than to have a sensation. Special sense inputs constitute a repertory of innate sensations, but the achievements of a perceptual system are susceptible to maturation and learning. Sensations can be organized, fused, supplemented, or selected, but no new sensations can be learned. The information that is picked up, on the other hand, becomes more and more subtle, elaborate, and precise with practice: one can keep on learning to perceive as long as life goes on.

Second, Gibson (1966) also conjectured that perceptual systems develop perceptual skills analogously to the way in which behavioural systems develop performative skills. The channels of sense are not subject to modification by learning, but perceptual systems are amenable to learning. The alternative is to assume that the sensations triggered by sound are merely incidental, that information is available to a perceptual system, and that the qualities of the world, in relation to the needs of the listener considered as a perceptual system, are experienced directly. In future work, we expect to obtain further results illustrating that a subject listens more carefully and perceptively after practicing sound visualization.

10. Conclusion
We have verified in these experiments that some people naturally have senses accurate enough to quickly capture the necessary cues to perceive and classify character-related sounds in movement. The proposed sound visualization process in virtual 3D acoustic space has worked well to support people who have difficulty acquiring the perceptual skills to discriminate character-related sounds. The perception results in our experiments encourage us to conclude that the proposed process can provide perceptual learning with the senses considered as perceptual systems. This is a crucial consideration because it allows the analysis of human auditory perception by an ecological psychology approach.

We have also concluded that reflection and reverberation, included as additional cues during sound movement in the 3D acoustic space, can improve listeners' perception rates. In other words, the human perception of movement in auditory space depends on a number of cues. Results were better when the subject only perceived, instead of trying to recognize, character-related sounds: the perception rate was high when the time for perceiving characters was short, that is, when the subject did not apply a recognition process. This conclusion opens a new road for analysing perception results under an ecological psychology approach.

On the other hand, since this research topic is still in its infancy, it is too soon to conclude that a system based on the sound visualization process is easier to use than conventional reading machines. More research is necessary to quantify how sensitive a listener (considered as a perceptual system) is in accurately perceiving a character-related sound with cues such as movement, distance, direction, reflection, or reverberation associated with the character to be read. However, the experiments have shown that the acquisition of sound visualization ability using a virtual 3D acoustic space is possible, and that it can become a great alternative for the visually impaired who use computers as communication media. The visually impaired would have a chance to recover the freedom to read texts without depending on the support of others (experts), thanks to a self skill transfer process. Also, systems based on the sound visualization process might avoid such problems of conventional reading machines as inaccurate character recognition, the necessity of conversion from written language into spoken language, and the necessity of synthesizing connected speech from a symbolic description. In the future, we aim to develop a system that provides the visually impaired with a sound visualization facility so that they can read texts by themselves.

References
Blattner, M. M., Sumikawa, D. A., and Greenberg, R. M. (1989). Earcons and icons: their structure and common design principles. Human-Computer Interaction, 4(1), 11–44.
Blauert, J. (1997). Spatial hearing: the psychophysics of human sound localization. Revised edition, The MIT Press, Cambridge, Massachusetts.
Bregman, A. S. (1990). Auditory scene analysis: the perceptual organization of sound. The MIT Press, Cambridge, Massachusetts.
Brewster, S. A., Wright, P. C., and Edwards, A. D. N. (1993). An evaluation of earcons for use in auditory human-computer interfaces. Proceedings of ACM/IFIP INTERCHI'93 (Amsterdam), ACM Press, Addison-Wesley, 222–227.
Buxton, W. (1990). Using our ears: an introduction to the use of nonspeech audio cues. In E. Farrell (Ed.), Extracting meaning from complex data: processing, display, interaction. Proceedings of the SPIE, Vol. 1259, 124–127.
Corey, D. P., and Hudspeth, A. J. (1979). Ionic basis of the receptor potential in a vertebrate hair cell. Nature, 281, 675–677.
Eimas, P. (1975). Auditory and phonetic coding of the cues for speech: discrimination of the r-l distinction by young infants. Perception and Psychophysics, 18, 341–347.
Erulkar, S. D. (1972). Comparative aspects of spatial localization of sound. Physiological Reviews, 52, 237–260.
Fujiki, N., Riederer, K. A. J., Jousmäki, V., Mäkelä, J. P., and Hari, R. (2002). Human cortical representation of virtual auditory space: differences between sound azimuth and elevation. European Journal of Neuroscience, 16, 2207–2213.
Gaver, W. W. (1989). The SonicFinder: an interface that uses auditory icons. Human-Computer Interaction, 4(1), 67–94.
Gaver, W. W. (1993). What in the world do we hear? An ecological approach to auditory event perception. Ecological Psychology, 5, 1–29.
Gibson, J. J. (1966). The senses considered as perceptual systems. Houghton Mifflin, Boston.
Gibson, J. J. (1979). The ecological approach to visual perception. Houghton Mifflin, Boston.
Green, D. M. (1976). An introduction to hearing. Erlbaum, Hillsdale, New Jersey.
Handel, S. (1989). Listening: an introduction to the perception of auditory events. The MIT Press, Cambridge, Massachusetts.
Haykin, S. (1999). Neural networks: a comprehensive foundation. 2nd ed., Prentice-Hall, New Jersey.
Heine, W. D., and Guski, R. (1991). Listening: the perception of auditory events? An essay review of Listening: an introduction to the perception of auditory events by Stephen Handel. Ecological Psychology, 3, 263–275.
Irvine, D. R. F. (1986). The auditory brainstem: a review of the structure and function of auditory brainstem processing mechanisms. Vol. 7 (ed. Ottoson, D.), Springer, Berlin.
Kendall, G. (1991). Visualization by ear: auditory imagery for scientific visualization and virtual reality. Computer Music Journal, 15(4), 70–73.
Middlebrooks, J. C., and Green, D. M. (1991). Sound localization by human listeners. Annual Review of Psychology, 42, 135–159.
Møller, H., Sørensen, M. F., Hammershøi, D., and Jensen, C. B. (1995). Head-related transfer functions of human subjects. Journal of the Audio Engineering Society, 43(5), 300–321.
Nomura, S., and Yamanaka, K. (2002). New adaptive approach based on mathematical morphology applied to character segmentation and code extraction from number plate images. Proc. of the 6th World Multi-Conference on Systemics, Cybernetics and Informatics, Florida, USA, Vol. IX.
Nomura, S., Yamanaka, K., and Katai, O. (2002b). New adaptive methods applied to printed word image binarization. Proc. of the 4th IASTED International Conf. on Signal and Image Processing, Hawaii, USA, 288–293.
Nomura, S., Yamanaka, K., Katai, O., and Kawakami, H. (2004a). A new method for degraded color image binarization based on adaptive lightning on grayscale versions. IEICE Trans. on Information and Systems, E87-D(4), 1012–1020.
Nomura, S., Yamanaka, K., Katai, O., Kawakami, H., and Shiose, T. (2004b). Towards a novel sound visualization via virtual 3D acoustic environmental media. Proc. of the International Workshop on Intelligent Media Technology for Communicative Intelligence, Warsaw, Poland, 121–124.
Shaw, B. K., McGowan, R. S., and Turvey, M. T. (1991). An acoustic variable specifying time-to-contact. Ecological Psychology, 3, 253–261.
Shiose, T., Ito, K., and Mamada, K. (2004). The development of a virtual 3D acoustic environment for training perception of crossability. Proc. of the 9th International Conference on Computers Helping People with Special Needs, Paris, France.
Strybel, T. Z., Manligas, C., and Perrott, D. R. (1989). Auditory apparent motion under binaural and monaural listening conditions. Perception and Psychophysics, 45, 371–377.
Vanderveer, N. J. (1979). Ecological acoustics: human perception of environmental sounds. Doctoral thesis, Dissertation Abstracts International, 40/09B, 4543 (University Microfilms No. 8004002).
Warren, W. H., and Verbrugge, R. R. (1984). Auditory perception of breaking and bouncing events: a case study in ecological acoustics. Journal of Experimental Psychology: Human Perception and Performance, 10(5), 704–712.
Wise, G. B. (1993). Beyond multimedia: an architecture for true multimodal interfaces. Research proposal, Rensselaer Polytechnic Institute, New York.
Yeung, E. S. (1980). Pattern recognition by audio representation of multivariate analytical data. Analytical Chemistry, 52, 1120–1123.
