Suggested Acoustic Renderer

manu3d

Active Member
Contributor
Architecture
As requested by @Cervator, I'm reposting in a separate thread one of the two ideas mentioned in this thread.

Basically my thought was: could the geometric simplicity of Terasology's world lend itself well to a sound-based renderer, producing enough spatial detail and acoustic realism that a completely blind person could play alongside a fully-sighted one? Imagine a blind child and one of his school friends, or a blind grandfather and his grandchild. A sufficiently realistic (3D) soundscape, including the way sound bounces off and gets absorbed by surfaces, would give a blind person lots of useful information to move about in the environment. Crucially, sound realistically bouncing off surfaces would allow in-game human echolocation, letting a player actively "illuminate" nearby surfaces acoustically and perceive their shapes and properties through sound. A normal, arbitrary-triangle-based 3D game, with its complex surfaces, might struggle to create a sufficiently realistic, interactive acoustic rendering of the environment on today's consumer-level hardware. A highly geometrically structured reality such as Terasology's might have just the right simplicity to make an effective acoustic renderer possible.

Interestingly, while more traditional software-for-the-blind challenges would emerge alongside the renderer's development (non-visual UIs would obviously be needed for everything but 3D navigation), intriguing gameplay opportunities would also arise. A blind person will generally have a more refined, discriminating sense of hearing. In the game this might give access to environmental information, in the form of sound, that a sighted person would not be able to perceive, distinguish from other sounds, or interpret correctly. E.g. a source of water underground might normally gurgle too subtly to be heard. A predator approaching might be too difficult to detect among the constant rustling of forest leaves. Or the call of a rare poisonous frog, used to augment the efficacy of arrows, might be indistinguishable from that of its more common, non-poisonous and largely useless cousin... except to somebody with very fine hearing. In all those circumstances a more refined hearing would provide an advantage, as it would in circumstances where the sight of a normally sighted person gets impaired in-game. I'm thinking of dark, windy caves where torches are blown out, or temporary blindness caused by a spell turning the whole screen into a useless, overexposed blur. A blind person using the acoustic renderer would be unaffected and would be able to either help his or her sighted companions or even take advantage of the situation if the setting is a competitive one.

Finally, I would suggest that commercial enterprises such as Minecraft are unlikely ever to go in this kind of direction, as it is too risky a proposition given the numerically limited userbase for this feature and the reduced profit margins once R&D is taken into account. An open source project, however, could embrace the risk, make it a badge of honor, and open the door for blind people to experience fully 3D, fully interactive voxel worlds. Not to mention, it would also rake in quite a bit of free advertising in the media and through word of mouth, which would eventually bring in additional sighted users, not just visually impaired ones.
 

Cervator

Org Co-Founder & Project Lead
Contributor
Design
Logistics
SpecOps
This kind of thread is why we really need a "Crazily awesome" rating icon :D

This would be a wicked sweet long-term goal, although it would take ages to realize. I would love to see (or hear, rather!) this one day.
 

Skaldarnar

Development Lead
Contributor
Art
World
SpecOps
Crazily awesome idea!!!

Perhaps we can start off with some requirements as a foundation for this concept. The first things that come to my mind are
  • high quality sound effects, e.g., every type of soil should have a distinguishable footstep asset, a player wearing heavy boots might sound different from a lightweight chicken walking around, ...
  • the game's sound system has to be capable of this kind of spatial sound(?). Do we support this already?
 

manu3d

Active Member
Contributor
Architecture
Briefly, the issue of proper acoustics is probably as complicated as, and perhaps even similar to, illumination on the visual side. To me, in fact, it is not so much about what sounds are handled (high quality might be desirable but perhaps computationally unfeasible?) but how they are handled. In particular, they would have to bounce off, be absorbed by, and/or pass through surfaces, not unlike what light does. In fact, if light sources were represented as emitting a spectrum rather than a simple RGB triplet (some high-end, non-realtime renderers do that), some of the visual renderer's functionality might be applicable to an acoustic renderer.
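
To illustrate the parallel, here's a rough sketch of what per-block acoustic properties might look like, mirroring a visual material's reflectance/transmittance (all names and numbers are hypothetical, not anything Terasology has today):

```java
/**
 * Hypothetical per-block-type acoustic material, analogous to a visual
 * material's reflectance/transmittance. Coefficients are per frequency
 * band (e.g. low/mid/high), mirroring the spectrum-vs-RGB idea above.
 */
public final class AcousticMaterial {
    // Fraction of incident sound energy reflected, absorbed, transmitted.
    // For each band: reflect + absorb + transmit == 1.0
    private final float[] reflect;
    private final float[] absorb;
    private final float[] transmit;

    public AcousticMaterial(float[] reflect, float[] absorb, float[] transmit) {
        this.reflect = reflect;
        this.absorb = absorb;
        this.transmit = transmit;
    }

    // E.g. stone: highly reflective at all bands; leaves would be the opposite.
    public static final AcousticMaterial STONE =
        new AcousticMaterial(new float[]{0.90f, 0.95f, 0.97f},
                             new float[]{0.08f, 0.04f, 0.02f},
                             new float[]{0.02f, 0.01f, 0.01f});

    public float reflectance(int band)   { return reflect[band]; }
    public float absorbance(int band)    { return absorb[band]; }
    public float transmittance(int band) { return transmit[band]; }
}
```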

However, and this is where things could get really complicated, light can be considered to travel at infinite speed for most purposes, hence arriving instantaneously and having no temporal dimension. Sound, on the other hand, travels much more slowly in air, and so has a temporal dimension, smearing/fragmenting sounds over time. E.g. imagine shouting "echo" in a valley: if you are not in the middle of the valley you'll get two temporally separated echoes out of one shout. Now imagine the same thing in a more complicated setup and you can see how the complexity increases dramatically.
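
To put rough numbers on that (my own back-of-the-envelope, assuming ~343 m/s for the speed of sound in air):

```java
public class EchoTiming {
    public static void main(String[] args) {
        final double c = 343.0;                  // approx. speed of sound in air, m/s
        double nearWall = 50.0, farWall = 120.0; // distances to the valley walls, m (made up)
        // Round trip: the shout travels to each wall and back.
        System.out.printf("near echo after %.2f s%n", 2 * nearWall / c); // ~0.29 s
        System.out.printf("far echo after %.2f s%n", 2 * farWall / c);   // ~0.70 s
        // The ear resolves gaps well above ~50-100 ms, so these arrive as
        // two clearly distinct echoes from a single shout.
    }
}
```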

That being said, I know very little about the topic beyond basic physics. Perhaps it doesn't have to be so realistic to be effective. It's something that would need to be looked at in depth.

Concerning spatial sound, I think OpenAL (is that what Terasology uses?) has spatial sound capabilities, in the sense that sounds are attenuated depending on distance, and I think it can handle frequency shifts such as the Doppler effect, presumably because racing games needed it. But I don't think it goes beyond that. E.g. to simulate a sound bouncing off a surface you'd have to have each acoustically important surface be a permanent listener/emitter pair, or somehow create pairs at runtime whenever they are needed. So I suspect existing libraries might not be adequate for something realistic enough for a non-sighted person.
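
For reference, this is roughly the extent of what OpenAL gives us for free: positional attenuation and Doppler, nothing about reflections or occlusion. A minimal sketch, assuming LWJGL's OpenAL bindings (device/context setup omitted for brevity):

```java
import org.lwjgl.openal.AL10;

/**
 * Minimal illustration of OpenAL's built-in spatialization: distance
 * attenuation and Doppler shift, but no bounced or occluded sound.
 * Assumes an AL device/context has already been created.
 */
public class SpatialSoundSketch {
    public static void demo() {
        int source = AL10.alGenSources();

        // Attenuate with distance using the inverse-distance model.
        AL10.alDistanceModel(AL10.AL_INVERSE_DISTANCE_CLAMPED);
        AL10.alSourcef(source, AL10.AL_REFERENCE_DISTANCE, 1.0f);
        AL10.alSourcef(source, AL10.AL_ROLLOFF_FACTOR, 1.0f);

        // Position the source 10 blocks in front of the listener...
        AL10.alSource3f(source, AL10.AL_POSITION, 0f, 0f, -10f);
        AL10.alListener3f(AL10.AL_POSITION, 0f, 0f, 0f);

        // ...and give it a velocity so OpenAL applies a Doppler shift.
        AL10.alSource3f(source, AL10.AL_VELOCITY, 0f, 0f, 5f);
        AL10.alDopplerFactor(1.0f);

        // (Buffer creation, playback and cleanup omitted.)
        AL10.alDeleteSources(source);
    }
}
```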

However, it might be possible to hijack the GPU to do the heavy lifting via shaders and somehow render an image which in turn gets translated into sound. It's just the embryo of an idea and I wouldn't want to get into it personally, but I suspect it might be a viable option.
 

synopia

Member
Contributor
Architecture
GUI
Crazily awesome idea!!!

I really like that idea; however, such sound processing is by far (far, far) more complex than rendering our block world. The main problem is, as you already said, the lower speed of sound. So you need to simulate waves for each sound source and the reflections of those waves on world blocks. Each "collision" will then act as a new sound source. For one sound source in a simple rectangular room, you very quickly get numerous overlapping waves, which together form what you actually hear.
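
For what it's worth, the textbook formalization of "each reflection becomes a new source" is the image-source method. A toy first-order sketch for an axis-aligned box room (my own illustration, numbers made up; real rooms need higher orders, and a block world needs arbitrary geometry):

```java
import java.util.ArrayList;
import java.util.List;

/** First-order image sources for a rectangular room: mirror the real
 *  source across each wall; each mirror contributes one delayed echo. */
public class ImageSources {
    static final double SPEED_OF_SOUND = 343.0; // m/s, in air

    static List<double[]> firstOrderImages(double[] src, double[] roomMin, double[] roomMax) {
        List<double[]> images = new ArrayList<>();
        for (int axis = 0; axis < 3; axis++) {
            double[] lo = src.clone();
            lo[axis] = 2 * roomMin[axis] - src[axis]; // mirror across the "low" wall
            images.add(lo);
            double[] hi = src.clone();
            hi[axis] = 2 * roomMax[axis] - src[axis]; // mirror across the "high" wall
            images.add(hi);
        }
        return images;
    }

    public static void main(String[] args) {
        double[] src = {2, 1.5, 3}, listener = {6, 1.5, 4};
        double[] roomMin = {0, 0, 0}, roomMax = {8, 3, 10};
        // Each image source is one delayed, attenuated copy of the sound;
        // summing all of them approximates the early reflections you hear.
        for (double[] img : firstOrderImages(src, roomMin, roomMax)) {
            double d = dist(img, listener);
            System.out.printf("reflection: %.2f m path, %.1f ms delay%n",
                              d, 1000 * d / SPEED_OF_SOUND);
        }
    }

    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < 3; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }
}
```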

However, there are papers and research on the topic, but, sadly, I don't think it's possible to get this to work in realtime.

Also, some time ago I found a project where someone played around with a kind of raytraced sound. But I can't find it right now.

Nice video to get an idea of the complexity:
 

synopia

Member
Contributor
Architecture
GUI
Ok, I talked to a friend of mine who is working and doing his PhD in acoustics research. Wave simulation, as I described in my previous post, will not be possible to do in realtime even in the far future. This is because the size of the simulation grid cells determines the maximum sound frequency that can be simulated. Without pen and paper, we roughly came to a 1.5 cm grid resolution to simulate frequencies up to 2 kHz (as in the paper [1]), which means around 300 million cells for 10x10x10 Terasology blocks.
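
To sanity-check those figures (my own back-of-the-envelope, assuming ~343 m/s and the common rule of thumb of roughly 10 grid cells per wavelength):

```java
public class GridCost {
    public static void main(String[] args) {
        double c = 343.0;                // speed of sound in air, m/s
        double fMax = 2000.0;            // target frequency, 2 kHz
        double lambda = c / fMax;        // ~0.17 m shortest wavelength
        double dx = 0.015;               // ~1.5 cm cells, in line with the estimate above
        double cellsPerSide = 10.0 / dx; // 10 blocks = 10 m => ~667 cells per side
        // ~667^3 = ~2.96e8, i.e. roughly 300 million cells, as stated above.
        System.out.printf("lambda = %.3f m, %.0f^3 = %.2e cells%n",
                          lambda, cellsPerSide, Math.pow(cellsPerSide, 3));
    }
}
```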

The guys behind the video above use some fancy algorithms to reduce the required number of cells drastically - but the cathedral scene still took 15 h to calculate a 2 s long sound impulse.

However, the raytracing approach is much more promising for realtime applications.


They trace 1000 random rays per sound source with 10 reflections - in realtime. But they rely heavily on precalculations, which only work in static environments (they support moving sound sources but no changes to the scene).
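
To make the approach concrete, here's a deliberately crude toy version of the idea in a voxel world (the World.isSolid query is a hypothetical stand-in for Terasology's world API, and the marching/reflection/accumulation logic is heavily simplified):

```java
import java.util.Random;

/** Toy stochastic acoustic ray tracer in the spirit of [2]: shoot random
 *  rays from the source, bounce them off solid blocks, and record a
 *  delayed/attenuated contribution whenever a ray passes the listener. */
public class AcousticRayTracer {
    static final double C = 343.0;    // speed of sound, m/s
    static final double STEP = 0.25;  // ray march step, m (very coarse)
    static final int RAYS = 1000, MAX_BOUNCES = 10;

    interface World { boolean isSolid(int x, int y, int z); }

    static void trace(World world, double[] src, double[] listener) {
        Random rng = new Random(42);
        for (int r = 0; r < RAYS; r++) {
            double[] p = src.clone();
            double[] d = randomUnit(rng);
            double travelled = 0, energy = 1.0;
            for (int b = 0; b <= MAX_BOUNCES && energy > 0.01 && travelled < 100; ) {
                p[0] += d[0] * STEP; p[1] += d[1] * STEP; p[2] += d[2] * STEP;
                travelled += STEP;
                if (dist(p, listener) < 0.5) {
                    // One echo path: delay from path length, simple 1/d falloff.
                    double delayMs = 1000 * travelled / C;
                    double gain = energy / Math.max(travelled, 1.0);
                    // ...accumulate (delayMs, gain) into an impulse response...
                }
                if (world.isSolid((int) Math.floor(p[0]), (int) Math.floor(p[1]),
                                  (int) Math.floor(p[2]))) {
                    // Crude reflection: flip the dominant axis of travel and
                    // lose some energy to absorption (material-dependent).
                    int axis = dominantAxis(d);
                    d[axis] = -d[axis];
                    energy *= 0.8;
                    b++;
                }
            }
        }
    }

    static double[] randomUnit(Random rng) {
        double[] v; double len;
        do { // rejection-sample a direction uniformly on the unit sphere
            v = new double[]{rng.nextDouble() * 2 - 1, rng.nextDouble() * 2 - 1,
                             rng.nextDouble() * 2 - 1};
            len = Math.sqrt(v[0]*v[0] + v[1]*v[1] + v[2]*v[2]);
        } while (len < 1e-6 || len > 1.0);
        for (int i = 0; i < 3; i++) v[i] /= len;
        return v;
    }

    static int dominantAxis(double[] d) {
        int a = 0;
        for (int i = 1; i < 3; i++) if (Math.abs(d[i]) > Math.abs(d[a])) a = i;
        return a;
    }

    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < 3; i++) s += (a[i]-b[i]) * (a[i]-b[i]);
        return Math.sqrt(s);
    }
}
```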

All in all, a really interesting topic. Maybe we'll find someone who wants to dig deeper into this ;)

[1] Paper for the video in the previous post: http://gamma.cs.unc.edu/propagation/main.pdf
[2] Raytracing paper: http://gamma.cs.unc.edu/HIGHDIFF/
 

manu3d

Active Member
Contributor
Architecture
First of all, thank you Cervator, Skaldarnar and especially Synopia for indulging me with such a far-fetched idea. It's nice to throw ideas into the arena and do some blue-sky group thinking now and then!

I agree with everything you wrote, Synopia, about the difficulty of having a realistic system in realtime. As with visual raytracing, I can see the potential of raytracing sound, which is what I had in mind originally, but I can also see the difficulties, especially as soon as we deal with a dynamic, freeform scene such as a block world.

I do have a few theoretical, perhaps partial, escape routes from the computationally unfeasible hole I dug myself into, though. In my original post I used the expressions "producing enough spatial detail and acoustic realism" and "sufficiently realistic", the keywords here being "enough" and "sufficiently".

1) From a gameplay perspective, it should be possible to have very few realistically-handled sound sources. In fact, at least at the beginning, a single source located at the mouth of the player's avatar, capable of emitting clicking sounds, might be enough for human echolocation to work, allowing a non-sighted player to create a mental image of the environment. (Note: some people use other means of echolocation, e.g. tapping a cane or snapping their fingers.)
2) For some sound sources, a fully realistic sound rendering might not be necessary. E.g. the footsteps of another player would benefit from being shadowed by obstacles between the two players, but reflections on surfaces and diffraction around corners might be overkill and in some cases outright confusing. I mean: I sometimes get confused by how real-life sound reflects off walls; in a game we could avoid that.
3) For other sound sources, e.g. the environment as a whole or faraway sound sources, currently available, non-realistic sound rendering is perfectly adequate.
4) As far as I understand, the mental picture of an experienced, echolocating, non-sighted individual is not particularly long-range. They can "see" obstacles nearby, even relatively thin ones, they can get a feeling for how hard they are (a wall vs a bush) and they can detect open space by the absence of obstacles. But they are not, as far as I can tell, "far-sighted". This would allow safely restricting the sound simulation to an area of perhaps 40x40 meters around the player.
5) 3D computer graphics started out with wireframe rendering at very low frame rates. Games of that kind wouldn't impress anybody by today's standards, but they were playable and definitely enjoyable. An acoustic renderer for Terasology might have to make a lot of compromises with realism, at least for the foreseeable future, but it might still be enjoyable for people who never had the chance to play in the scenarios described in my original post.

So, -perhaps- there is room to maneuver even on today's hardware.
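
To summarize points 1-4 as a rough policy sketch (all names and thresholds hypothetical, just to show the tiering):

```java
/** Hypothetical tiered treatment of sound sources: only the player's own
 *  echolocation click gets the expensive reflection tracing; nearby
 *  sources get occlusion only; everything else falls back to the plain
 *  distance attenuation existing libraries already provide. */
public enum AcousticTier {
    ECHOLOCATION,   // full reflection tracing, player's click only (item 1)
    OCCLUDED,       // shadowed by blocks between source and listener (item 2)
    SIMPLE;         // distance attenuation + panning only (item 3)

    static final double SIMULATION_RADIUS = 20.0; // ~40x40 m window (item 4)

    static AcousticTier choose(boolean isPlayersClick, double distanceToListener) {
        if (isPlayersClick) return ECHOLOCATION;
        if (distanceToListener <= SIMULATION_RADIUS) return OCCLUDED;
        return SIMPLE;
    }
}
```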
 