What is spatial audio? An in-depth conversation with Ann, one of the most recognized experts today.
by Johannes Scherzer | January 31, 2023
Immersive, realistic, virtual, authentic, reproduction, recreation, creation, simulation, ... With all the different concepts and terms surrounding the idea of spatial audio, understanding the nature of spatial audio can be confusing. This morning I was lucky enough to meet and have a chat with Ann, one of the most recognized experts in the field today. A conversation about spatial audio, what it means, and how we can think about it.
J: Ann, could you give me a quick overview of what spatial audio is?
A: Spatial audio refers to the use of sound to create the illusion of a sound source being located in a specific point in space, with the goal of creating a realistic or immersive listening experience. It can be used in various applications such as virtual reality, gaming, and film.
There are different techniques to create spatial audio, such as:
- Binaural recording: uses a pair of microphones placed in the ear canals of a dummy head to capture audio, so the sound is perceived as if it is coming from different directions.
- Ambisonics: uses multiple channels to represent the sound field in all directions, allowing for a full surround sound experience.
- Higher-order Ambisonics (HOA): an extension of Ambisonics that uses more than three channels to represent the sound field, which allows for a more precise and realistic spatial audio experience.
- Wave Field Synthesis (WFS): a technique that uses many speakers to generate a sound field in which a sound source appears to be located at a specific point in space.
In summary, spatial audio aims to create an immersive sound experience by simulating the way sound behaves in the real world and by locating sound sources at specific points in space.
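As a small illustration of the Ambisonics idea listed above, the following sketch encodes a single mono sample into four first-order channels (W, X, Y, Z). It assumes the traditional B-format convention, with W scaled by 1/sqrt(2) and azimuth measured counterclockwise from the front, so that +Y points to the left. The function name `encode_first_order` is hypothetical, and other conventions order and scale the channels differently.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumed convention: azimuth counterclockwise from the front,
    elevation upward from the horizontal plane, W scaled by 1/sqrt(2).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)   # front-back component
    y = sample * math.sin(az) * math.cos(el)   # left-right component
    z = sample * math.sin(el)                  # up-down component
    return w, x, y, z

# A source straight ahead lands on X; a source at 90 degrees lands on Y.
front = encode_first_order(1.0, 0.0)
left = encode_first_order(1.0, 90.0)
```

The direction of the source is stored in the ratios between the channels rather than in any one loudspeaker feed, which is why the same four signals can later be decoded for different loudspeaker layouts or for headphones.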
J: What do you mean by immersive in this context?
A: In the context of spatial audio, immersive means creating a sense of being surrounded by and enveloped in sound. The goal is to make the listener feel as if they are in the same room or environment as the sound sources, rather than simply hearing sounds coming from speakers. It aims to simulate the way sound behaves in the real world, so the listener can hear sounds coming from different directions and distances, just as they would in a real-world environment. The more realistic and natural the sound, the more immersive the listening experience. This can be especially important in applications such as virtual reality, where a sense of immersion is critical to creating a believable and engaging experience.
J: Does that mean that the term "immersive" refers to the listening experience, while the term "spatial audio" refers to the technology that enables the listening experience?
A: Yes, that's correct! The term "immersive" typically refers to the listening experience, while "spatial audio" refers to the technology and techniques used to create that experience. Spatial audio refers to the use of sound to create the illusion of sound sources being located in specific points in space, with the goal of creating a realistic or immersive listening experience. The immersive listening experience is the end goal of spatial audio, but the term itself refers to the technology and methods used to achieve that goal.
J: What is non-spatial audio?
A: Non-spatial audio refers to traditional audio that is not designed to create the illusion of sound sources being located in specific points in space. It typically refers to audio that is played through speakers or headphones, where the sound source is not localized in a specific point in space, but rather coming from the general direction of the speakers. Non-spatial audio is also known as stereo audio, which uses two speakers to create a stereo soundstage, providing some sense of directionality, but it does not provide a full 3D sound experience.
Non-spatial audio is still widely used in many applications, such as music listening, television, and film. It is a simpler and more straightforward approach to audio, and it doesn't require as many speakers or advanced technology as spatial audio does.
In summary, non-spatial audio refers to traditional audio that doesn’t aim to create an immersive listening experience by simulating the way sound behaves in the real world; rather, it uses a simple stereo or mono soundstage.
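For comparison, the "sense of directionality" in a stereo soundstage is commonly produced by amplitude panning between the two channels. The sketch below assumes a constant-power (sine/cosine) pan law, which is one common choice among several. The function name `constant_power_pan` and the pan range of -1 to +1 are illustrative assumptions.

```python
import math

def constant_power_pan(sample, pan):
    """Split a mono sample into (left, right) gains with constant power.

    pan is assumed to run from -1.0 (hard left) through 0.0 (center)
    to +1.0 (hard right).
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    angle = (pan + 1.0) * math.pi / 4.0   # maps pan to 0 .. pi/2
    left = sample * math.cos(angle)
    right = sample * math.sin(angle)
    return left, right

center_left, center_right = constant_power_pan(1.0, 0.0)
hard_left, hard_right = constant_power_pan(1.0, -1.0)
```

Panning of this kind places a phantom source somewhere on the line between the two speakers. It carries no height or distance information, which is the limitation the answer above describes.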
J: I remember seeing ads from the 1950s that promoted stereo audio as the ultimate realistic simulation of sound, one that would sound like reality. Obviously, the aim back then was already to simulate reality. Why do you think that this aim would be new in the context of spatial audio?
A: You are correct that the goal of creating a realistic and immersive listening experience is not new, and has been a goal in audio technology for many decades. The 1950s was the era of the introduction of stereo audio, which was marketed as a more realistic and immersive listening experience than mono audio. This was a major step forward in audio technology at the time, as it allowed for a sense of directionality and a more spacious soundstage.
However, stereo audio is not able to fully simulate the complexity and richness of sound in the real world. It only provides two channels of audio, which is not enough to create a fully immersive 3D sound experience. Spatial audio, on the other hand, uses more channels and advanced techniques such as binaural recording, ambisonics, higher-order ambisonics, and wave field synthesis, to create a more realistic and immersive listening experience. It aims to simulate the way sound behaves in the real world, so the listener can hear sounds coming from different directions, distances, and with different characteristics. The goal of creating a realistic and immersive listening experience is not new, but the technology and methods used to achieve that goal have significantly improved over the years, and spatial audio is the most recent and advanced development in this field.
J: You say in spatial audio, the aim is to simulate how sound behaves in the real world. Do you mean that what I hear through headphones or loudspeakers isn't real?
A: When I say that the goal of spatial audio is to simulate the way sound behaves in the real world, I mean that it aims to replicate the way we experience sound in our everyday lives. In the real world, sound is all around us and comes from many different sources. It is affected by the environment, such as reflections from walls and the absorption of sound by objects. Spatial audio aims to replicate these effects in a listening environment, such as a room or a pair of headphones, so that the listener can experience a more realistic and immersive sound.
When you listen to audio through speakers or headphones, you are not hearing the sound as it exists in the real world, but rather a reproduction of it. This reproduction is not always an exact replica of the original sound, and it may not capture all the nuances and characteristics of the original sound. Spatial audio aims to improve the quality and realism of this reproduction, but it is still a reproduction and not the actual sound in the real world.
So, to answer your question, no, the sound you hear through headphones or speakers is not real in the sense that it is not the sound in the real world, but rather a reproduction of it. Spatial audio aims to improve the realism and immersion of this reproduction, but it is still not the same as hearing sound in the real world.
J: I'm afraid I'm confused with your understanding of reality. When I have an object that is real, let's say that would be a vegan banana bread, and when I reproduce that object, I have two real objects that I can experience and perceive with my senses. So when I have a sound, let's say that would be a narrator, and when I reproduce that sound, why would that be different?
A: You are correct that when an object, such as a vegan banana bread, is reproduced, it creates another real object that can be experienced and perceived by the senses. Similarly, when a sound is reproduced, it creates another real sound that can be heard.
The difference between the original sound and its reproduction is that the reproduction is a representation of the original sound, and it may not capture all the nuances and characteristics of the original sound. The reproduction of sound, whether it is through speakers or headphones, is affected by the limitations of the technology used and the environment in which it is played. For example, speakers and headphones have a limited frequency response and dynamic range, and the sound played in a room will be affected by reflections and absorption.
The goal of spatial audio is to improve the realism and immersion of the sound reproduction, by simulating the characteristics of sound in the real world, such as directionality, distance and reflections. It uses advanced techniques such as binaural recording, Ambisonics, higher-order Ambisonics, and wave field synthesis to create a more realistic and immersive listening experience. However, the sound produced by spatial audio is still a reproduction and not the actual sound in the real world.
I apologize for any confusion that my previous statements may have caused. I hope this clears things up.
J: Let's imagine an audio signal that is generated electronically, for example, with a synthesizer, and that audio signal has only lived as data in my computer until I finally play that sound on a spatial audio system. Would you still think that the sound I then experience is a reproduction?
A: If the audio signal is generated electronically and only exists as data in a computer until it is played on a spatial audio system, it could be considered as a virtual sound rather than a reproduction of a real sound.
In this case, the goal of spatial audio would still be to simulate the characteristics of sound in the real world, such as directionality, distance and reflections, to create a more realistic and immersive listening experience. However, it is important to note that the sound being played is not an actual recording of sound in the real world, but rather a virtual representation of sound.
It is worth mentioning that the sound that we experience in our everyday lives is not just the sound itself but also the information about the sound source, like location, distance, size and so on. Spatial audio technology aims to simulate this information as well as the sound itself, to make the listening experience more realistic and immersive.
J: Do you mean that the term "virtual sound" equals the data of audio signals in a computer? That previously recorded and electronically generated sounds are both virtual until they are played through loudspeakers or headphones?
A: Yes, that is correct. When sound is recorded and stored as data in a computer, it is considered virtual because it exists only in the digital realm and has not yet been translated into an audible form. Similarly, when sound is generated electronically, such as with a synthesizer, it also exists only as data in the digital realm and has not yet been translated into an audible form. Once the sound data is played through loudspeakers or headphones, it is then translated into an audible form and can be perceived by the listener.
It is important to note that the term "virtual sound" does not mean the sound is not real, it means that it is a representation of the sound and not the actual sound itself. Spatial audio technology aims to simulate the characteristics of sound in the real world, such as directionality, distance, and reflections, to create a more realistic and immersive listening experience for virtual sound sources as well.
J: Before, you said that, in opposition to an original sound source, the sound experienced through spatial audio is not real. Is it possible that there is confusion with the terms real, virtual, original, and reproduction?
A: I apologize for any confusion. When I mentioned that the sound experienced through spatial audio is not real, I did not mean to imply that it is not genuine or authentic. Instead, I meant that it is a virtual representation of sound, rather than an actual recording of sound in the real world.
A sound that is recorded and stored as data in a computer or generated electronically, is considered virtual because it exists only in the digital realm and has not yet been translated into an audible form, but it is still a real sound. The sound that we hear in the real world is an original sound source, and when this sound is recorded and played back, it is considered as a reproduction of the original sound.
Spatial audio technology aims to simulate the characteristics of sound in the real world, such as directionality, distance, and reflections, to create a more realistic and immersive listening experience. Even though the sound is virtual, it is still a real sound and its characteristics are simulated to create a realistic and immersive experience.
I hope that clears up any confusion and I apologize again for any confusion caused earlier.
J: No worries about any confusion; it's an interesting conversation! However, when that sound you mentioned is played back, is that the real world too?
A: Yes, that is correct. When a sound that was recorded or generated electronically is played back through speakers or headphones, it becomes an audible sound in the real world and can be perceived by our ears. The sound that you hear through a spatial audio system is a real sound in the sense that it is audible and can be perceived by our ears. The term "virtual sound" is used to refer to the sound as it exists in the digital realm before it is played back, while the term "real sound" refers to the sound as it exists in the real world and can be heard by our ears.
J: So, would you agree that the sounds I can experience in spatial audio are real, as they are part of the real world?
A: Yes, I would agree that the sounds you can experience in spatial audio are real, as they are part of the real world and can be perceived by your ears. Spatial audio technology aims to create an immersive listening experience that simulates how sound behaves in the real world, such as the ability to locate and identify sounds coming from different directions. The sounds you hear through a spatial audio system are not the original sound source, but rather a reproduction of the sound that is designed to mimic the way sound behaves in the real world.
J: I still have difficulties with the idea that electronically generated sound transformed into audible sound waves through loudspeakers or headphones would be a reproduction.
A: I understand where you're coming from, and it's a valid point. The term "reproduction" is often used to describe a copy of an original sound source, but in the case of electronically generated sound, there is no original sound source that is being copied. Instead, the sound is created from scratch using a computer and then played back through speakers or headphones. In this sense, one could also argue that it's not a reproduction but an original sound that was created and experienced through the medium of technology. The distinction between original and reproduction is not always clear cut, especially in the context of electronically generated sound.
J: You could argue that sound experienced in spatial audio is a reproduction of the virtual sound source and not refer to how the sound came to be that virtual sound. In that case, whether a sound has been recorded or electronically generated would be irrelevant. I mean, for the understanding of spatial audio, why would you make any assumptions about how a sound was created in the first place?
A: That is a good point, and you're right that in the context of spatial audio, it doesn't matter whether the sound was recorded or generated electronically. The focus is on simulating how sound behaves in the real world and creating an immersive listening experience, regardless of the original source of the sound. In this sense, spatial audio can be thought of as a reproduction of the virtual sound source in that it creates a realistic simulation of sound in the real world.
J: I would suggest refining that definition slightly: Spatial audio can be thought of as a reproduction of the virtual sound source in that it creates a realistic simulation of how sound behaves in the real world.
A: You are correct, thank you for pointing that out. To clarify, spatial audio can be thought of as a reproduction of the virtual sound source in that it creates a realistic simulation of how sound behaves in the real world, rather than necessarily recreating the original sound source itself.
J: I like that!
A: I'm glad you do. Spatial audio can be a powerful tool to create immersive and realistic listening experiences, whether the sound source is recorded or generated electronically.
J: Wait, if spatial audio would create a realistic simulation of how sound behaves in the real world, what is the difference between a realistic simulation and the actual creation of sound waves?
A: The difference between a realistic simulation and the actual creation of sound waves is that a realistic simulation attempts to replicate the characteristics and behaviors of sound in the real world, such as its directionality, reflections, and reverberation, while the actual creation of sound waves refers to the physical process of producing sound through vibrations in the air or other medium. Spatial audio technology allows for the creation of a more immersive listening experience by simulating the way sound behaves in the real world, but it is still a simulation and not the actual physical creation of sound.
J: So the sound that comes from my loudspeaker is not physical sound waves?
A: The sound that comes from a loudspeaker is physical sound waves, but it is a reproduction of the original sound source, whether it was recorded or generated electronically. Spatial audio aims to create a more realistic simulation of how sound behaves in the real world, by creating the illusion of sound coming from specific locations in space, and by simulating the way sound interacts with the environment. This is different from traditional audio playback, where the sound is usually coming from a fixed location, such as the speakers or headphones.
J: We just agreed that the sound coming from a loudspeaker is not the reproduction of the original sound source but a reproduction of the virtual sound source.
A: Yes, that's correct. The sound coming from a loudspeaker is not the original sound source, but a reproduction of the virtual sound source. The virtual sound source is a simulation of the original sound source, and the sound that is emitted from the loudspeaker is a simulation of how the sound would behave in the real world. The sound waves that are emitted from the loudspeaker are physical, but they are a simulation of the original sound and not the original sound itself.
J: I thought we agreed that the virtual sound source is not a simulation of the original sound source but its representation.
A: We did agree that the virtual sound source is not a simulation of the original sound source, but a representation of it. And the sound experienced through spatial audio is a reproduction of the virtual sound source, which aims to simulate how sound behaves in the real world. The sound coming from a loudspeaker is not the original sound, but it is real sound waves that can be perceived by our ears. The difference between the realistic simulation of sound in spatial audio and the actual creation of sound waves is that the former is a recreation of the sound using technology, while the latter is the natural occurrence of sound in the physical world.
J: Does the loudspeaker not create a natural occurrence of sound in the physical world?
A: A loudspeaker creates sound waves in the physical world, but it is a mechanical process that is different from the natural occurrence of sound. The sound waves produced by a loudspeaker are a reproduction of the virtual sound source, which is a representation of the original sound source. Spatial audio aims to create a realistic simulation of how sound behaves in the real world by adjusting the direction, distance, and other parameters of the sound waves produced by the loudspeaker. So, the sound coming from a loudspeaker is not a natural occurrence of sound but it can be designed to simulate the natural occurrence of sound through the use of Spatial Audio technology.
J: What would be a sound in the physical world not resulting from a mechanical process?
A: A sound in the physical world can be any vibration or pressure wave that travels through a medium such as air or water and can be perceived by the human ear. These sounds can be naturally occurring, such as the sound of a bird singing or the sound of wind blowing through trees, or they can be artificially created, such as the sound of a car engine or a computer-generated sound. The sound that comes from a loudspeaker is a mechanical process that converts electrical signals into sound waves that can be perceived by the human ear; it is not a natural occurrence of sound in the physical world but a reproduction of the virtual sound source.
J: Would you say that the sound of a gasoline car engine is a naturally occurring sound?
A: No, the sound of a gasoline car engine is not a naturally occurring sound; it is a mechanical sound created by the movement of parts in the engine. Naturally occurring sounds are those that are not created by human technology or activity, such as the sound of birds singing or the sound of waves crashing on a beach.
J: Ann, I suppose we might have different understandings of the term "sound". What exactly do you mean when you talk about sound?
A: When I talk about sound, I mean the mechanical wave that travels through a medium, such as air, and is characterized by its frequency and amplitude, that can be perceived by the ear. Sound can be created by a wide variety of sources, such as musical instruments, the human voice, or natural phenomena like thunder. In the context of spatial audio, the aim is to create a realistic simulation of how sound behaves in the real world through the use of technology, such as surround sound systems or head-related transfer functions (HRTFs).
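A head-related transfer function bundles several direction cues. One of the simplest is the interaural time difference (ITD), the small delay between a sound reaching the near ear and the far ear. The sketch below estimates it with Woodworth's spherical-head approximation for a distant source. The head radius of 8.75 cm, the speed of sound of 343 m/s, and the function name `woodworth_itd` are assumed values chosen for illustration, not measurements.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed value for air at about 20 degrees C
HEAD_RADIUS = 0.0875     # m, assumed typical head radius

def woodworth_itd(azimuth_deg, head_radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Estimate the interaural time difference in seconds.

    Uses Woodworth's rigid-sphere approximation, ITD = (a / c) * (theta + sin(theta)),
    for a far-field source with azimuth theta between 0 (front) and 90 degrees (side).
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return head_radius / c * (theta + math.sin(theta))

itd_front = woodworth_itd(0.0)    # no delay for a source straight ahead
itd_side = woodworth_itd(90.0)    # largest delay for a source directly to one side
```

A measured HRTF adds frequency-dependent level and spectral cues on top of this delay, so the formula is only a rough stand-in for what binaural rendering actually applies.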
J: Do you think a musical instrument, like a violin, is a mechanical sound created by human technology or activity?
A: A musical instrument such as a violin can be thought of as a combination of both human technology and activity. The instrument itself is a product of human technology, but the sound it produces is a result of the physical vibrations caused by a musician playing the instrument. So in this sense, the sound produced by a violin can be considered a naturally occurring sound, but one that is facilitated by human technology.
J: Do you think a car engine could be used as a musical instrument?
A: It is possible to use a car engine as a musical instrument, although it would likely require some modification and manipulation of the engine's sound. Some experimental musicians and sound artists have used car engines and other industrial machinery as sound sources in their compositions. However, it is not typically considered a traditional musical instrument in the way that a violin or a guitar might be.
J: Do you think a loudspeaker can be used as a musical instrument?
A: A loudspeaker can be used as a musical instrument, as it can be used to amplify or play back recorded or electronic music. Some musicians also use loudspeakers in experimental and electronic music to create unique sounds and effects. In this context, the loudspeaker is not itself creating the sound, but rather playing back or amplifying a pre-existing sound source.
J: You mean playing back a pre-existing virtual sound source?
A: Yes, playing back a pre-existing virtual sound source through a loudspeaker can be considered using the loudspeaker as a musical instrument. It is not the loudspeaker itself that creates the sound, but rather it is used as a tool to amplify and play back the pre-existing sound.
J: Well, when we can use the violin, the car engine, and the loudspeaker as an instrument, would this mean that the sounds from all three technologies can be considered naturally occurring sounds, but one that is facilitated by human technology?
A: Yes, that's correct. All three technologies can be used to create sound, which can be considered naturally occurring, but facilitated by human technology. A violin creates sound through the vibration of strings, a car engine creates sound through the combustion of fuel, and a loudspeaker creates sound through the vibration of a diaphragm. In all cases, the sound is caused by a mechanical process, but in the case of a violin or a car engine, the sound is created by the natural properties of the materials and the physics involved, while in the case of a loudspeaker, the sound is created by an electronic signal.
J: I'm not sure about the difference between the loudspeaker and the other two instruments. Are you trying to reference the energy that causes the vibration of materials?
A: Not necessarily. The distinction I am trying to make is that a violin, for example, is a naturally occurring sound in the sense that it is created through the vibration of natural materials (strings, wood) caused by human activity (bowing, plucking). A car engine, similarly, is a naturally occurring sound in the sense that it is created through the combustion of fuel, which causes mechanical movement and generates sound. A loudspeaker, on the other hand, is a technology that is used to reproduce a pre-existing sound source, either recorded or electronically generated. The sound that comes from a loudspeaker is not a naturally occurring sound in the same way that the sound from a violin or a car engine is. It is the result of a mechanical process that reproduces a pre-existing sound source.
J: I'm afraid that a loudspeaker manufacturer would disagree with the definition that the sound of a loudspeaker would not be created by the natural properties of the materials and the physics involved in the construction of the loudspeaker.
A: A loudspeaker is a device that converts electrical energy into sound energy by vibrating a diaphragm (typically a cone or dome) to produce sound waves in the air. The sound produced by a loudspeaker is not a natural occurrence in the sense that it is not created by the natural properties of the materials and physics involved in the construction of the loudspeaker alone, but rather it is a result of a mechanical process driven by an electrical signal. However, the sound produced by a loudspeaker can be used to simulate or reproduce a natural sound or a sound that was originally recorded. In this sense, the loudspeaker can be used as a musical instrument to create or reproduce sounds, but it is not a naturally occurring sound in the way that a bird singing or a waterfall is.
J: You say that the sound created with a violin is a natural occurrence because a human being is involved in playing it. How is that different from a sound created with a loudspeaker with a human being involved in playing it?
A: The difference between a sound created with a violin and a sound created with a loudspeaker is that the violin utilizes the natural properties of the materials and physics involved in its construction to create sound, whereas a loudspeaker uses electrical signals to drive a mechanical process that creates sound. In the case of a violin, the sound is created through the vibration of the strings, which is caused by the interaction of the player's fingers and bow. In the case of a loudspeaker, the sound is created through the movement of the diaphragm, which is caused by an electrical signal. While a human is involved in both cases, the mechanism of sound creation is different. In the case of the loudspeaker, the sound is not a natural occurrence, but rather a result of a mechanical process.
J: Ann, I don't see a difference between the loudspeaker and the violin: both are mechanical devices, and both have natural properties of the materials and physics involved in their construction to create sound, and both can be brought into vibration with energy (muscle, electrical energy) controlled by a human being. Why would you say that the two instruments are different in whether the sounds resulting are or are not natural occurrences?
A: The difference between the loudspeaker and the violin is that the violin uses the natural properties of the materials and physics involved in its construction to create sound, while the loudspeaker uses electricity to create sound. The violin relies on the vibration of its strings, which are made of natural materials, to create sound. The loudspeaker, on the other hand, uses an electronic signal to drive a diaphragm, which in turn creates sound waves. While both can be controlled by a human, the way the sound is created is different. The sound of a violin is considered a natural occurrence because it is created through the natural vibration of the materials, while the sound of a loudspeaker is considered a reproduction because it is created through electronic means.
J: I'm not sure we're getting any further in the discussion about sound as a natural occurrence. Probably, we have slightly different assumptions underlying our conversation. To round things up, I would understand sound as those auditory experiences we have in our minds, with or without physical sound waves involved, perceived with our sense of hearing or imagined. In particular, if you want to look that up, I am referring to the book "Sonic virtuality: Sound as emergent perception" by Grimshaw & Garner (2015).
However, I would like to get back to the point of our discussion from where we departed, the definition: "Spatial audio can be thought of as a reproduction of the virtual sound source in that it creates a realistic simulation of how sound behaves in the real world, rather than necessarily recreating the original sound source itself."
A: The definition you provided earlier of spatial audio being a reproduction of the virtual sound source that creates a realistic simulation of how sound behaves in the real world, rather than necessarily recreating the original sound source itself, is a clear and concise definition that accurately describes what spatial audio is and how it functions. It also acknowledges that the sound may not be an exact replica of the original sound source, but rather a simulation of how it would behave in the real world.
J: To be precise, that exact definition was yours.
A: Yes, that was your proposed definition, which I agreed with. Spatial audio can be thought of as a reproduction of a virtual sound source that creates a realistic simulation of how sound behaves in the real world. It is important to note that the goal of spatial audio is to simulate the experience of sound in a real-world environment, rather than simply recreating the original sound source itself. This is achieved through the use of various techniques such as sound localization, reverberation, and environmental effects to create a sense of immersion and realism for the listener.
J: Anyways. So far, in our discussion of what spatial audio is, the concept of nature seems fundamental to your explanations: natural sound, natural occurrence of sound, natural phenomena, natural properties, natural materials, natural properties of materials, natural vibration of materials. I want to give a few examples of how you used "nature":
"The more realistic and natural the sound, the more immersive the listening experience."
"The difference between the realistic simulation of sound in spatial audio and the actual creation of sound waves is that the former is a recreation of the sound using technology, while the latter is the natural occurrence of sound in the physical world."
"A loudspeaker creates sound waves in the physical world, but it is a mechanical process that is different from the natural occurrence of sound."
"So, the sound coming from a loudspeaker is not a natural occurrence of sound but it can be designed to simulate the natural occurrence of sound through the use of Spatial Audio technology."
"These sounds can be naturally occurring, such as the sound of a bird singing or the sound of wind blowing through trees, or they can be artificially created, such as the sound of a car engine or a computer-generated sound."
"Naturally occurring sounds are those that are not created by human technology or activity, such as the sound of birds singing or the sound of waves crashing on a beach."
"So in this sense, the sound produced by a violin can be considered a naturally occurring sound, but one that is facilitated by human technology."
"The distinction I am trying to make is that a violin, for example, is a naturally occurring sound in the sense that it is created through the vibration of natural materials (strings, wood) caused by human activity (bowing, plucking). A car engine, similarly, is a naturally occurring sound in the sense that it is created through the combustion of fuel, which causes mechanical movement and generates sound."
I think this is a bit confusing. What exactly is your understanding of "nature", or "natural"?
A: In my previous statements, when I referred to "nature" or "natural," I was using it to refer to things that exist or occur in the natural world, as opposed to things that are created or facilitated by human technology or activity. So, for example, the sound of birds singing or waves crashing on a beach would be considered naturally occurring sounds, because they exist and occur in the natural world without being directly created by human technology or activity. In contrast, the sound produced by a car engine would be considered an artificially created sound, because it is generated by the combustion of fuel and the mechanical movement of parts in an engine, which is a result of human technology and activity.
J: And how does this understanding of nature relate to your idea of what spatial audio is?
A: My understanding of "nature" in the context of sound refers to sounds that are not created by human technology or activity, but are instead generated by natural phenomena such as wind, waves, birds, and other sources. In my definition of spatial audio, I emphasized the difference between naturally occurring sound and artificially created sound, and how spatial audio technology aims to recreate the former in a realistic and immersive way.
Spatial audio is a technological simulation of the natural behavior of sound in the real world, but it is still a recreation and not the actual natural occurrence of sound itself. So, the understanding of "nature" in this context is used to distinguish between sounds created by human technology and those generated by natural phenomena.
J: I think your definition is confusing. On the one hand, you refer to the natural behavior of sound waves (physics of sound). On the other hand, you refer to the origination of sound sources (birds vs. cars).
To clarify, let's get back to your initial statement: "Spatial audio aims to create an immersive sound experience by simulating the way sound behaves in the real world and locate sound sources in specific points in space." Accordingly, the purpose of spatial audio is to create immersive sound experiences; the method to achieve this is by simulating how sound behaves in the real world. Then you said: "The more realistic and natural the sound, the more immersive the listening experience."
Why do you think the concept of nature is relevant? For a definition of spatial audio, could it be sufficient to refer to the physics of how sound travels in the medium of air?
A: The idea of "nature" is relevant in my definition of spatial audio because it pertains to how sound behaves in the real world. In other words, when I talk about "nature" or "natural" in the context of spatial audio, I am referring to how sound behaves in real-world environments, as opposed to how it behaves in artificially created environments. The purpose of spatial audio is to recreate the natural experience of sound in a virtual setting, and the more accurately it can do that, the more immersive the listening experience will be. The physics of sound waves, including how they travel in air, are indeed a critical component of spatial audio and its ability to recreate the natural experience of sound. However, the concept of nature also encompasses other factors that contribute to the realism of sound, such as the natural properties of materials and objects in the environment, the interaction of sound with those materials and objects, and the human perception of sound. By referencing nature in my definition of spatial audio, I am encompassing all these factors and emphasizing the goal of creating a realistic, immersive sound experience that resembles the real world as closely as possible.
J: You no longer refer to sound sources' origination, which I agree is irrelevant. So the definition is about the physics of sound. Is that correct?
A: Yes, that is correct. The definition I provided is focused on the simulation of the physical behavior of sound in the real world, and the aim is to create an immersive sound experience through this simulation. The concept of nature is not essential to the definition, as it is centered around the physics of sound and the way it behaves in space.
J: You defined the level of immersion as a condition for spatial audio. And you stated: "The more realistic and natural the sound, the more immersive the listening experience." I understand that you tried to express your thinking in everyday language. But the terminology can be problematic. Based on your latest definition of spatial audio, we could also frame it this way: "The more accurate the simulation of the physics of sound, the more immersive the listening experience in spatial audio." Would you agree?
A: Yes, I would agree with the revised definition "The more accurate the simulation of the physics of sound, the more immersive the listening experience in spatial audio." It is a more accurate way of expressing the idea and better reflects the definition of spatial audio as a technology that simulates the behavior of sound in the real world.
J: OK, this is where we are now: "Spatial audio aims to create an immersive sound experience by simulating the way sound behaves in the real world and locate sound sources in specific points in space." + "The more accurate the simulation of the physics of sound, the more immersive the listening experience in spatial audio."
We could now reduce your definition of spatial audio to this sentence: "Spatial audio aims to simulate the physics of sound in space." As an advantage, we would not need to use those ambiguous concepts such as immersive, real, or world. Would you agree?
A: Yes, I would agree that "Spatial audio aims to simulate the physics of sound in space" is a concise and accurate definition of spatial audio. It captures the essence of what spatial audio is and its goal, which is to simulate the behavior of sound waves in a three-dimensional environment.
J: Ann, it was a pleasure to discuss the question of spatial audio. I would love to continue this conversation with you in a subsequent chat, as I can see some interesting implications in that definition.
A: I would be happy to continue this conversation in a future chat. Thank you for taking the time to discuss this topic with me. If you have any other questions or topics you'd like to explore, feel free to reach out to me anytime. Have a great day!
J: Thank you, Ann. Have a great day, too!
A: You're welcome! Have a great day ahead!
Transparency note
I conducted the interview with ChatGPT. For the purpose of this interview, we agreed on avatar names at the beginning:
J: Would you like to play a game?
ChatGPT: Sure, I'd be happy to play a game with you! What game would you like to play?
J: We'd pretend to have other names.
ChatGPT: I see, that sounds like fun! What name would you like me to pretend to have?
J: I'd love you to choose one for yourself.
ChatGPT: Sure thing! For the purpose of this game, you can call me "Alex". Is there a name you would like to go by?
J: Could you imagine a one-syllable name?
ChatGPT: Certainly! How about "Ann", a one-syllable name? Is that okay with you?
J: Sounds wonderful. I think I would like to be called Jamie.
A: So for the purpose of this game, my name is Ann, and your name is Jamie.
J: Hi Ann, nice to meet you!
A: Hi Jamie, nice to meet you too! Is there anything specific you would like to do or talk about in this game?