Android audio development in 2026: the engineer’s guide
A working engineer’s guide to building serious audio on Android in 2026: AAudio, Oboe, JUCE, latency budgets, hearing-device routing, AudioWorklet equivalents, and the parts of the platform that finally just work.
Building audio on Android used to mean designing around the platform. In 2026 it mostly means using the platform, and knowing which corners still need defensive work. This is the working engineer’s guide to where Android audio stands today, what to reach for first, and where the surprises still live.
It’s the third of AudioLab.tools’s pillar pieces, following What is audio AI? and AI voice infrastructure in 2026. The shape: layers from low to high, what works, what bites, what to build on.
The stack, ground up
Your code
↓
AAudio / Oboe (recommended) OR OpenSL ES (legacy, not recommended)
↓
Audio HAL / Audio Server
↓
Audio drivers + DSP firmware
↓
Speaker / headphones / Bluetooth / hearing device
Above AAudio sit higher-level frameworks: JUCE, Superpowered, Tarsos, Oboe-direct, and increasingly capable WebAudio-via-WebView for hybrid apps. Destination selection (speaker, BT, USB, hearing aid via ASHA/LE Audio) is decided by Android’s audio policy layer inside the audio server; the HAL then applies the chosen route.
The single biggest message of this guide: use AAudio (via Oboe), not OpenSL ES. OpenSL ES still works but is deprecated and gives you nothing AAudio doesn’t already provide.
AAudio — what it is, what it gives you
AAudio is Google’s native low-latency audio API. Shipped in Android 8.0, matured significantly in 12+. In 2026 it’s the only realtime audio API you should consider for new Android work.
What you get
- Exclusive mode — a direct path to the audio HAL with the lowest possible latency. Round-trip latency on flagship devices is now 3–8ms.
- Shared mode — the default. Higher latency (15–40ms) but works everywhere and doesn’t require exclusive access.
- Performance mode hints — LOW_LATENCY, NONE, POWER_SAVING. The platform selects buffer sizes accordingly.
- Sample-rate negotiation — you can request a rate; the platform may give you a different one. Always check.
- Audio attributes — usage, content type, source. Drives routing, ducking, and accessibility behaviour.
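Sample-rate negotiation matters in practice because every DSP coefficient derived from the rate has to use the granted value, not the requested one. A minimal sketch, assuming a hypothetical `SineOscillator` helper (not part of AAudio or Oboe) that is configured with whatever rate the opened stream reports:

```cpp
#include <cmath>
#include <cstdint>

constexpr double kTwoPi = 6.283185307179586;

// Hypothetical helper, not part of AAudio or Oboe.
// Configure it with the sample rate the opened stream actually reports.
class SineOscillator {
public:
    void configure(double frequencyHz, int32_t grantedSampleRate) {
        phaseIncrement_ = kTwoPi * frequencyHz / static_cast<double>(grantedSampleRate);
    }

    // Writes numFrames mono samples. No allocation, no locking.
    void render(float *out, int32_t numFrames) {
        for (int32_t i = 0; i < numFrames; ++i) {
            out[i] = static_cast<float>(std::sin(phase_));
            phase_ += phaseIncrement_;
            if (phase_ >= kTwoPi) phase_ -= kTwoPi;  // keep the phase bounded
        }
    }

    double phaseIncrement() const { return phaseIncrement_; }

private:
    double phase_ = 0.0;
    double phaseIncrement_ = 0.0;
};
```

If the increment were computed from a requested 48000 Hz while the stream actually runs at 44100 Hz, a 440 Hz tone would play at roughly 404 Hz, which is why the granted rate is the one to read.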
What still bites
- Round-trip latency varies wildly across devices. Pixel and Galaxy flagships are excellent; some mid-tier devices are still in the 30–80ms range.
- Performance mode isn’t guaranteed. Asking for LOW_LATENCY is a hint, not a promise. Always verify the buffer size you got matches what you asked for.
- Bluetooth routing is still slower than wired. LE Audio narrowed the gap meaningfully; pre-LE-Audio devices still pay 60–120ms of BT latency.
- Doze mode and battery savers will kill or pause your audio thread. Test with battery saver enabled. Always.
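The "always verify" advice can be made concrete by comparing what was requested with what the stream reports after opening. A sketch under stated assumptions: `StreamConfig`, `bufferLatencyMs`, and `gotWhatWeAskedFor` are hypothetical names, and in a real app the granted values would be read from the opened stream.

```cpp
#include <cstdint>

// Hypothetical plain struct; fill the granted copy from the opened stream.
struct StreamConfig {
    int32_t sampleRate;        // frames per second
    int32_t bufferSizeFrames;  // frames buffered between the app and the device
};

// Latency contributed by the buffer alone, in milliseconds.
// This is a lower bound: HAL, DSP, and transport delays come on top.
inline double bufferLatencyMs(const StreamConfig &c) {
    return 1000.0 * c.bufferSizeFrames / c.sampleRate;
}

// True when the platform granted the requested rate and a buffer
// no larger than the one requested.
inline bool gotWhatWeAskedFor(const StreamConfig &requested,
                              const StreamConfig &granted) {
    return granted.sampleRate == requested.sampleRate &&
           granted.bufferSizeFrames <= requested.bufferSizeFrames;
}
```

For example, a request for 192 frames at 48 kHz is a 4 ms buffer; being granted 960 frames instead means 20 ms before any other delay is counted.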
Oboe — the right wrapper
Use Oboe. It’s the Google-blessed C++ wrapper around AAudio (with OpenSL ES fallback for very old devices, which you can ignore in 2026).
#include <oboe/Oboe.h>
#include <algorithm>
#include <memory>

class AudioCallback : public oboe::AudioStreamCallback {
public:
    oboe::DataCallbackResult onAudioReady(
            oboe::AudioStream *stream,
            void *audioData,
            int32_t numFrames) override {
        auto *out = static_cast<float *>(audioData);
        // Fill buffer (silence here). No allocations, no locks, no JNI calls in this thread.
        std::fill_n(out, numFrames * stream->getChannelCount(), 0.0f);
        return oboe::DataCallbackResult::Continue;
    }
};

AudioCallback callback;                      // must outlive the stream
std::shared_ptr<oboe::AudioStream> stream;

oboe::Result startAudio() {
    oboe::AudioStreamBuilder builder;
    builder.setDirection(oboe::Direction::Output)
        ->setPerformanceMode(oboe::PerformanceMode::LowLatency)
        ->setSharingMode(oboe::SharingMode::Exclusive)
        ->setFormat(oboe::AudioFormat::Float)
        ->setChannelCount(oboe::ChannelCount::Stereo)
        ->setCallback(&callback);
    oboe::Result result = builder.openStream(stream);
    if (result != oboe::Result::OK) return result;  // do not touch the stream on failure
    return stream->requestStart();
}
A few rules that aren’t in the documentation as plainly as they should be:
The audio thread does only the audio thread’s job
No allocations. No mutex locks. No JNI calls into Java. No logging in the hot path. No file I/O. Reading shared state requires lock-free queues. Writing shared state requires the same on the other side. The audio thread is real-time priority; anything that blocks it causes glitches.
Hot reload of parameters via lock-free queues
The pattern that scales: a triple-buffered or ring-buffered parameter struct that the UI thread writes to and the audio thread reads from. When the user moves a slider, the UI thread writes the new value; the audio thread picks it up at the next callback. No locks, no allocations.
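One way to sketch the ring-buffered variant of this pattern is a single-producer, single-consumer queue built on `std::atomic`. The names `ParamUpdate` and `SpscQueue` are hypothetical, and the sketch assumes exactly one writer (the UI thread) and one reader (the audio thread).

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical parameter message; the fields are illustrative.
struct ParamUpdate {
    int   paramId;
    float value;
};

// Single-producer, single-consumer ring buffer.
// Exactly one thread may call push() and exactly one may call pop().
// One slot is kept empty, so Capacity - 1 items fit.
template <typename T, std::size_t Capacity>
class SpscQueue {
public:
    // UI thread. Returns false when the queue is full.
    bool push(const T &item) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) return false;
        slots_[head] = item;
        head_.store(next, std::memory_order_release);  // publish the slot
        return true;
    }

    // Audio thread. Returns false when the queue is empty.
    // Never blocks and never allocates.
    bool pop(T &item) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;
        item = slots_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to write
    std::atomic<std::size_t> tail_{0};  // next slot to read
};
```

In use, the slider handler calls `push` and the audio callback drains the queue with `pop` at the top of each block. A full queue means the update is dropped rather than waited on, so the UI side decides whether to retry.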
Recovery is your responsibility
The audio thread can be stopped, paused, or restarted by the OS at any time. Bluetooth handoff, screen off, headphone disconnect, doze mode, exclusive-mode preemption — all of these terminate or interrupt your stream. Your code needs to detect these and restart cleanly.
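oboe::DataCallbackResult onAudioReady(...) {
    // ...
    return shouldContinue ? oboe::DataCallbackResult::Continue
                          : oboe::DataCallbackResult::Stop;
}
// Oboe calls this after the stream has been stopped and closed because of an error.
void onErrorAfterClose(oboe::AudioStream *stream, oboe::Result error) override {
    if (error == oboe::Result::ErrorDisconnected) {
        // Re-open and resume: your reconnect logic lives here.
    }
}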
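The reconnect logic itself can be kept independent of the audio API. A minimal sketch, assuming a hypothetical `reopenWithRetry` helper and a caller-supplied `tryOpen` function that opens and starts a fresh stream; the doubling back-off is an illustrative choice, not something Oboe prescribes.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper. `tryOpen` should open and start a fresh stream and
// return true on success. Because this sleeps between attempts, run it on a
// worker thread, never inside the audio data callback.
// Returns the number of attempts used, or 0 if every attempt failed.
inline int reopenWithRetry(const std::function<bool()> &tryOpen,
                           int maxAttempts,
                           std::chrono::milliseconds initialDelay) {
    auto delay = initialDelay;
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (tryOpen()) return attempt;
        if (attempt < maxAttempts) {
            std::this_thread::sleep_for(delay);
            delay *= 2;  // back off: the new route may still be settling
        }
    }
    return 0;  // gave up; surface the failure to the UI
}
```

A return value of 0 is the point at which the app should tell the user that audio has stopped instead of retrying silently.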
JUCE — when cross-platform matters
For an app that needs to ship on Android and iOS and macOS and Windows, JUCE is the path of least resistance. The DSP code is the same; JUCE handles the platform-specific audio bring-up.
In 2026, JUCE on Android targets Oboe under the hood. The performance is good, the cross-platform abstraction holds for most use cases, and the audio plugin authoring story (VST/AU/AAX) is mature.
When not to use JUCE on Android:
- You only target Android.
- You want minimum binary size (JUCE adds several MB).
- You need to integrate deeply with Android-specific APIs (MIDI 2.0 on Android, USB audio class drivers, custom audio HAL features). JUCE’s abstractions are leaky here.
For a focused Android-only audio app, Oboe + your own minimal C++ harness is faster, smaller, and more flexible.
Latency budgets
Real numbers from flagship hardware in 2026:
| Path | Round-trip latency |
|---|---|
| Wired headphones + LowLatency + Exclusive | 3–8 ms |
| Wired headphones + LowLatency + Shared | 15–25 ms |
| Bluetooth A2DP (legacy) | 60–250 ms |
| Bluetooth LE Audio | 10–30 ms |
| USB-C wired DAC | 8–18 ms |
| Built-in speaker | 30–60 ms |
| AVB / Dante via USB | 5–15 ms |
These are flagship numbers; mid-tier hardware adds 10–20ms across the board. Test on your actual target device matrix.
The realtime audio app rule: latency only matters where the user notices it. A guitar tuner needs under 15 ms. A drum pad needs under 30 ms. A music player can accept 150 ms. A meditation app accepts 500 ms. Design for your actual budget, not the lowest theoretical number.
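Designing for the actual budget is simple arithmetic once the table is written down as data. A sketch with hypothetical names (`PathLatency`, `kFlagshipPaths`, `fitsBudget`); the numbers are copied from the flagship table above, and the check is deliberately conservative because it uses the worst case of each range.

```cpp
#include <array>

// Flagship round-trip ranges from the table above, in milliseconds.
struct PathLatency {
    const char *path;
    int minMs;
    int maxMs;
};

constexpr std::array<PathLatency, 7> kFlagshipPaths{{
    {"Wired headphones + LowLatency + Exclusive", 3, 8},
    {"Wired headphones + LowLatency + Shared", 15, 25},
    {"Bluetooth A2DP (legacy)", 60, 250},
    {"Bluetooth LE Audio", 10, 30},
    {"USB-C wired DAC", 8, 18},
    {"Built-in speaker", 30, 60},
    {"AVB / Dante via USB", 5, 15},
}};

// Conservative check: the worst case of the path, plus an optional
// device margin (for example 20 ms for mid-tier hardware), must fit.
constexpr bool fitsBudget(const PathLatency &p, int budgetMs,
                          int deviceMarginMs = 0) {
    return p.maxMs + deviceMarginMs <= budgetMs;
}
```

With these numbers, the 30 ms drum-pad budget is met by LE Audio on a flagship in the worst case, but not once a 20 ms mid-tier margin is added.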
Bluetooth audio in 2026
The Bluetooth audio story has changed meaningfully since the LE Audio rollout in 2024.
Classic Bluetooth (A2DP)
Still in use everywhere. SBC codec is the baseline; AAC is supported by most devices; aptX HD and LDAC are higher quality. Latency is the perennial problem — A2DP has no real low-latency mode.
LE Audio (Bluetooth 5.2+)
The 2024–2026 story. LC3 codec, low latency (~10–30ms), broadcast support (Auracast), direct streaming to hearing aids. Adoption is mainstream on flagships; rolling out across mid-tier.
For new development, design assuming LE Audio is available but falling back gracefully to A2DP.
Auracast
Public broadcast audio over LE. A single source streams to many devices — including hearing aids. The infrastructure is rolling out in airports, conference centres, and assisted-listening venues. Early days, but a meaningful capability for accessibility-conscious applications.
ASHA (Audio Streaming for Hearing Aids)
The proprietary-style profile Google built for direct hearing-aid streaming. Still in use for hearing aids that pre-date LE Audio. Should be assumed alongside LE Audio in any hearing-support app for the next few years.
See Hearing accessibility on Android for the deeper hearing-device routing dive.
The Audio Framework in Java/Kotlin
Above the native layer, Android’s Java/Kotlin AudioManager + MediaRouter give you the routing surface. The important APIs:
val attrs = AudioAttributes.Builder()
.setUsage(AudioAttributes.USAGE_VOICE_COMMUNICATION)
.setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
.build()
val routerInfo = MediaRouter.getInstance(context).selectedRoute
// routerInfo.deviceType — speaker, headset, BT, hearing aid
USAGE drives routing decisions and ducking behaviour. Setting it correctly is the difference between your audio being routed to the user’s hearing aid via ASHA and being routed to the phone speaker. Get this right.
Common values include USAGE_MEDIA, USAGE_VOICE_COMMUNICATION, USAGE_ASSISTANCE_ACCESSIBILITY, USAGE_NOTIFICATION_RINGTONE, and USAGE_ALARM. Use the most specific one that matches your audio.
Microphone, capture, and effects
For audio capture:
- Use AAudioStream with Direction::Input and the same performance mode hints as output.
- For higher-level needs (record to file with built-in effects), MediaRecorder + AudioEffect is the Java path. Less flexible but covers the common cases.
- AcousticEchoCanceler, NoiseSuppressor, AutomaticGainControl exist as system effects. Quality varies by device; for serious work, don’t rely on them and use a third-party library.
For real-time effects on captured audio:
- Capture → AAudio input → your DSP → AAudio output
- Round-trip latency targets: under 20 ms is acceptable for monitoring, under 10 ms for tracking.
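The "your DSP" stage in the chain above is just a function from an input buffer to an output buffer that obeys the audio-thread rules. A minimal sketch, assuming mono float buffers of equal length and a hypothetical `GainStage` type; the per-block linear ramp is an illustrative choice to avoid clicks when the gain changes.

```cpp
#include <cstdint>

// Hypothetical DSP stage: applies a gain, ramping linearly from the
// current value to the target across each processed block.
class GainStage {
public:
    // Call on the audio thread, e.g. after draining a parameter queue.
    void setTargetGain(float gain) { target_ = gain; }

    // in and out may point to the same buffer. No allocation, no locking.
    void process(const float *in, float *out, int32_t numFrames) {
        if (numFrames <= 0) return;
        const float step = (target_ - current_) / static_cast<float>(numFrames);
        for (int32_t i = 0; i < numFrames; ++i) {
            current_ += step;
            out[i] = in[i] * current_;
        }
        current_ = target_;  // discard accumulated rounding error
    }

private:
    float current_ = 1.0f;
    float target_ = 1.0f;
};
```

The stage adds no latency of its own, so the monitoring and tracking targets above are set entirely by the input and output streams around it.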
MIDI 2.0
Android added MIDI 2.0 support in Android 13. By 2026 it’s the recommended path for new MIDI work. JUCE supports it; the platform APIs are stable.
For most audio apps, MIDI 1.0 is fine. For modern hardware controllers, USB-MIDI 2.0 is the right choice.
Power, doze, and background audio
Real-time audio + background = trickle of footguns:
- Foreground service with FOREGROUND_SERVICE_TYPE_MEDIA_PLAYBACK keeps the audio thread alive.
- Without it, the audio thread will be paused or killed in background.
- Doze mode and battery saver are aggressive. Test with both on.
- App standby buckets — your app can drop to a restricted state if not used for days. Background audio degrades in those states.
For background music playback, MediaSession + ExoPlayer is the right stack. For background DSP work, foreground service + AAudio.
The thing nobody surfaces
The pattern I keep flagging in this series: the most useful piece of UX for any Android audio app — and especially one used by accessibility users — is showing the user where their audio is going.
Streaming via LE Audio → [Device name]
Or: Streaming via wired → headphones
Or: Streaming via speaker
The MediaRouter API gives you this info. Showing it costs you 20 lines of code. Almost no consumer apps do. This is one of the easier wins available to an Android audio developer in 2026.
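The formatting half of that feature is small enough to show in full. A sketch assuming a hypothetical `RouteKind` enum and `routeLabel` function; on Android the route kind and device name would come from the routing APIs, which this example does not call.

```cpp
#include <string>

// Hypothetical route model, filled in from the platform's routing APIs.
enum class RouteKind { Speaker, Wired, BluetoothClassic, LeAudio, HearingAid };

// Builds the one-line status string shown to the user.
inline std::string routeLabel(RouteKind kind, const std::string &deviceName) {
    switch (kind) {
        case RouteKind::Speaker:
            return "Streaming via speaker";
        case RouteKind::Wired:
            return "Streaming via wired → " + deviceName;
        case RouteKind::BluetoothClassic:
            return "Streaming via Bluetooth → " + deviceName;
        case RouteKind::LeAudio:
            return "Streaming via LE Audio → " + deviceName;
        case RouteKind::HearingAid:
            return "Streaming via hearing aid → " + deviceName;
    }
    return "Streaming via unknown route";
}
```

Calling `routeLabel(RouteKind::Wired, "headphones")` produces the second example string above; the remaining work is re-running it whenever the selected route changes.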
What we built
HearLab Companion is the accessibility-first Android companion app. It uses AAudio for low-latency captioning audio, MediaRouter for routing visibility, and the standard accessibility APIs for caption rendering. Non-medical by design — see Designing hearing support without medicalising it for the framing.
The roadmap covers:
- ASHA + LE Audio routing visualisation in the companion app
- Background environment-audio logging with battery-friendly duty cycling
- Audiologist-facing dashboard layer (B2B2C)
Where Android audio is going
Three predictions for 2027:
- LE Audio replaces A2DP in mainstream consumer use. Already underway; will be near-complete on flagships by mid-2027.
- Auracast broadcast audio becomes a standard accessibility surface at public venues. The infrastructure is ahead of the applications.
- The mid-tier latency gap closes meaningfully. Mid-tier Android devices catching up to flagships on round-trip latency is a 12–18 month story.
Related
- Realtime DSP on Android — what AAudio gets right — the original AAudio overview
- Hearing accessibility on Android — sister pillar piece
- WebAudio vs native — where the line actually is in 2026
- Hearing-aid routing through Android Audio Framework
Get involved
We’re hiring opinions, not roles. If you work in Android audio at any level — game audio, music apps, accessibility, broadcast, conferencing — and you think this guide is wrong, get in touch. We update the document as the platform changes.