Sounds in Interfaces
Mar 9, 2026
Sound is not just about playing audio tracks. A screen cannot talk, but its content can.
Larger text is more visible; that much is obvious. The deeper layer is that size reads as volume.
The Legend of Zelda illustrates this. If you mute the sound, you can still tell how loudly an NPC is talking from the size of the subtitle.
People can hear your interface with their eyes. They can also see your interface with their ears.
A light beep from your iPhone tells you there is a Slack notification.
A long stretch of music tells you there is a phone call.
That's why sound is hard: you are not just adding audio, you are making an experience.
Silence as Sound
Most software is silent. Silence has become the default, and for good reason — most interface sounds were never worth hearing in the first place.
But silence is not the same as absence. The best interfaces use sound the way a good room uses light: sparingly, with intention, and only where it earns its place.
Path vs Places
As discussed above, silence is a choice when it is made carefully. So when should you add sound?
People would hate it if VSCode played a chime on every tab switch. People would feel something was off if the Switch system menu had no sounds. Both are about immersive experience. Neither is a game. Where does the difference in volume come from? It comes from whether the software is a destination or a vehicle.
A destination is a place where attention is the product. Sound deepens attention by adding a sensory channel. More senses engaged = more presence = more "I am here." The Switch wants you here. Instagram wants you here. A game wants you here. Sound is the cheapest way to thicken the here.
A vehicle is a place where attention is passing through. Anything that thickens the vehicle thickens the wrong layer. You don't want to feel the car; you want to feel the road. VSCode adding a tab-switch chime is like a car adding a chime every time you turn the steering wheel — it pulls your attention back to the vehicle when your attention is supposed to be ahead of it.
What sound is doing
Turn on a Switch and listen. The console wakes with a soft click-pop. The home menu greets you with a low chime that hangs in the air for a half-second longer than you expect. Move the cursor between game tiles and each one answers with a small wooden tap — different pitch, same family. Open a game and the tile expands with a satisfying whoosh. Close it and the sound reverses, like a door easing shut.
None of these sounds are decorative. Each one confirms a state change that you initiated. The cursor moved. The tile opened. The console heard you.
Most video games have sounds like this.
What makes the Switch's sound design remarkable is not that there is a lot of it — there is — but that it never feels like a lot. The sounds are quiet. They sit below speech volume. They are tuned to the same warm, slightly muffled palette, so that no single sound stands out as louder or sharper than the others. You stop noticing them as discrete events and start hearing them as the texture of the system.
Close your eyes, and listen. Can you tell what you are doing?
[VIDEO: Nintendo Switch home menu navigation]
Texture
Texture sounds don't inform — they make the thing feel real.
When you slide the iOS volume bar and hear that tiny tick, it's not telling you anything. The visual already showed you. The number changed. The bar moved. The tick is there because physical things make sound when they move, and the absence of sound makes the digital thing feel ghostly. The tick is borrowed physics. It says: this is a real object, not a picture of one.
Other texture sounds work the same way:
- The camera shutter sound (the photo already happened, and you can see it)
- The trash can crumple in macOS (the file is already gone)
- The keyboard click on iOS (the letter already appeared)
- The "whoosh" when you send a message (the message was already sent)
Boundary-crossing
We use five senses to feel the world.
Reward
This is the trick behind Duolingo.
When you complete a purchase with Apple Wallet, you hear a chime.
The Rule
Seriously? There is no single rule for whether to add sound in a given place.
What matters is that each sound is the right kind of sound for what just happened. Different moments, different logics, all running at once.
This is what makes interface sound hard. A piece of software isn't one thing. It's a place you spend time in, a tool you pass through, a messenger that delivers consequences from elsewhere, and a guardian against your own inattention. Each of those roles wants a different audio treatment. Most software handles this badly by picking one rule — usually silence — and applying it everywhere. The software that handles it well runs the rules in parallel, quietly, and trusts that no one will notice the seams.
Notes
- If you want to learn more about making interfaces, I curate examples at detail.design.