After extensive research into Apple’s poorly documented audio APIs and Objective-C classes, here are some helpful links.

Using RemoteIO audio unit

Decibel metering from an iPhone audio unit

Analyse Audio with RemoteIO

iPhone Core Audio tutorial

  1. http://timbolstad.com/2010/03/16/core-audio-getting-started-pt1/
  2. http://timbolstad.com/2010/03/16/core-audio-getting-started-pt2/
  3. http://timbolstad.com/2010/03/16/core-audio-getting-started-pt3/

It’s hard. Jens Alfke put it thusly:

“Easy” and “CoreAudio” can’t be used in the same sentence. :P CoreAudio is very powerful, very complex, and under-documented. Be prepared for a steep learning curve, APIs with millions of tiny little pieces, and puzzling things out from sample code rather than reading high-level documentation.

  • Media is hard because you’re dealing with issues of hardware I/O, real-time, threading, performance, and a pretty dense body of theory, all at the same time. Webapps are trite by comparison.

  • On the iPhone, Core Audio has three levels of opt-in for playback and recording, given your needs, listed here in increasing order of complexity/difficulty:

    1. AVAudioPlayer – File-based playback of DRM-free audio in Apple-supported codecs. Cocoa classes, called from Obj-C. iPhone 3.0 adds AVAudioRecorder (wasn’t sure if this was NDA, but it’s on the WWDC marketing page). A minimal playback sketch appears after this list.
    2. Audio Queues – C-based API for buffered recording and playback of audio. Since you supply the samples, it would work for a net radio player, as well as for your own formats and/or DRM/encryption schemes (decrypt in memory before handing off to the queue). There is inherent latency due to the use of buffers; a condensed playback sketch appears after this list.
    3. Audio Units – Low-level C-based API. Very low latency, as little as 29 milliseconds. Mixing, effects, near-direct access to input and output hardware.
  • Other important Core Audio APIs not directly tied to playback and recording: Audio Session Services, for communicating your app’s audio needs to the system, defining interaction with things like the background iPod player and the ring/silent switch, and getting audio H/W metadata; Audio File Services for reading and writing files; Audio File Stream Services for dealing with audio data in a network stream; Audio Conversion Services for converting between PCM and compressed formats; and Extended Audio File Services, which combines the file and conversion services (e.g., given PCM, write out to a compressed AAC file). A short audio-session sketch appears after this list.

  • Setting a property on an audio unit requires declaring the “scope” that the property applies to. Input scope is audio coming into the AU, output scope is audio going out of the unit, and global scope is for properties that affect the whole unit. So, if you set the stream format property on an AU’s input scope, you’re describing what you will supply to the AU. A RemoteIO configuration sketch showing scopes and buses in use appears after this list.
  • Make the RemoteIO unit your friend. This is the AU that talks to both input and output hardware. Its use of buses is atypical and potentially confusing. Enjoy the ASCII art:


                                 -------------------------
                                 | i                   o |
    -- BUS 1 -- from mic --> | n    REMOTE I/O     u | -- BUS 1 -- to app -->
                                 | p       AUDIO       t |
    -- BUS 0 -- from app --> | u       UNIT        p | -- BUS 0 -- to speaker -->
                                 | t                   u |
                                 |                     t |
                                 -------------------------

    Ergo, the stream properties for this unit are


                     Bus 0                            Bus 1
    Input Scope:     Set ASBD to indicate what        Get ASBD to inspect audio
                     you’re providing for play-out    format being received from H/W

    Output Scope:    Get ASBD to inspect audio        Set ASBD to indicate what
                     format being sent to H/W         format you want your units
                                                      to receive
  • That said, setting up the callbacks for providing samples to, or getting them from, a unit takes global scope, as their purpose is implicit in the property names: kAudioOutputUnitProperty_SetInputCallback and kAudioUnitProperty_SetRenderCallback. A callback-registration sketch appears after this list.
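To make the options above concrete, here are a few illustrative sketches (error handling is omitted throughout, and any file names or helper functions are hypothetical). First, AVAudioPlayer needs little more than a file URL. A minimal Objective-C sketch, assuming a hypothetical loop.m4a bundled with the app:

    #import <AVFoundation/AVFoundation.h>

    // "loop.m4a" is a hypothetical resource; any Apple-supported,
    // DRM-free file bundled with the app will do.
    NSString *path = [[NSBundle mainBundle] pathForResource:@"loop" ofType:@"m4a"];
    NSURL *url = [NSURL fileURLWithPath:path];

    NSError *error = nil;
    AVAudioPlayer *player = [[AVAudioPlayer alloc] initWithContentsOfURL:url
                                                                   error:&error];
    if (player) {
        [player prepareToPlay];   // preload buffers so playback starts promptly
        [player play];
    } else {
        NSLog(@"Could not create player: %@", error);
    }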
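The Audio Queue flow for playback is: describe your format, create a queue with a callback that refills buffers, prime a few buffers, and start the queue. A condensed sketch, assuming 16-bit stereo PCM; the fillBuffer() stub is hypothetical and just writes silence where a real player would copy its decoded (or decrypted) samples:

    #include <AudioToolbox/AudioToolbox.h>
    #include <string.h>

    // Hypothetical stub: a real app would copy its next samples here.
    static void fillBuffer(AudioQueueBufferRef buffer)
    {
        memset(buffer->mAudioData, 0, buffer->mAudioDataBytesCapacity);
        buffer->mAudioDataByteSize = buffer->mAudioDataBytesCapacity;
    }

    // Called by the queue whenever a buffer has been played and needs refilling.
    static void MyOutputCallback(void *inUserData, AudioQueueRef inAQ,
                                 AudioQueueBufferRef inBuffer)
    {
        fillBuffer(inBuffer);
        AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL);
    }

    void StartPlaybackQueue(void)
    {
        // 16-bit signed stereo PCM at 44.1 kHz (an assumed format for illustration).
        AudioStreamBasicDescription asbd = {0};
        asbd.mSampleRate       = 44100.0;
        asbd.mFormatID         = kAudioFormatLinearPCM;
        asbd.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
        asbd.mChannelsPerFrame = 2;
        asbd.mBitsPerChannel   = 16;
        asbd.mBytesPerFrame    = 4;   // 2 channels * 2 bytes
        asbd.mFramesPerPacket  = 1;
        asbd.mBytesPerPacket   = 4;

        AudioQueueRef queue;
        AudioQueueNewOutput(&asbd, MyOutputCallback, NULL, NULL, NULL, 0, &queue);

        // Prime a few buffers before starting; this buffering is the source of
        // the inherent latency mentioned above.
        for (int i = 0; i < 3; i++) {
            AudioQueueBufferRef buffer;
            AudioQueueAllocateBuffer(queue, 0x4000, &buffer);
            MyOutputCallback(NULL, queue, buffer);
        }
        AudioQueueStart(queue, NULL);
    }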
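Audio Session Services, in the C API of that era, boils down to initializing the session, declaring a category, and activating it. A sketch assuming the play-and-record category; a real app would also supply an interruption listener and watch for route changes:

    #include <AudioToolbox/AudioToolbox.h>

    void ConfigureAudioSession(void)
    {
        // NULL interruption listener for brevity; a real app would pass one
        // so it can react when a phone call or alarm takes the hardware away.
        AudioSessionInitialize(NULL, NULL, NULL, NULL);

        // Tell the system what the app needs: simultaneous input and output.
        UInt32 category = kAudioSessionCategory_PlayAndRecord;
        AudioSessionSetProperty(kAudioSessionProperty_AudioCategory,
                                sizeof(category), &category);

        // Example of reading audio H/W metadata: the current hardware sample rate.
        Float64 hwSampleRate = 0;
        UInt32 size = sizeof(hwSampleRate);
        AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareSampleRate,
                                &size, &hwSampleRate);

        AudioSessionSetActive(true);
    }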
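To make the scope/bus table concrete, here is a RemoteIO configuration sketch: enable input on bus 1 (the mic side), then set the stream format on bus 0’s input scope (what you will supply for play-out) and on bus 1’s output scope (the format you want to receive from the hardware side). The 16-bit mono PCM format and the function name are assumptions for illustration only:

    #include <AudioToolbox/AudioToolbox.h>
    #include <AudioUnit/AudioUnit.h>

    AudioUnit CreateRemoteIOUnit(void)
    {
        // Find and instantiate the RemoteIO unit.
        AudioComponentDescription desc = {0};
        desc.componentType         = kAudioUnitType_Output;
        desc.componentSubType      = kAudioUnitSubType_RemoteIO;
        desc.componentManufacturer = kAudioUnitManufacturer_Apple;

        AudioComponent comp = AudioComponentFindNext(NULL, &desc);
        AudioUnit remoteIOUnit;
        AudioComponentInstanceNew(comp, &remoteIOUnit);

        // Bus 1 carries input from the mic and is disabled by default;
        // bus 0 (to the speaker) is enabled by default.
        UInt32 one = 1;
        AudioUnitSetProperty(remoteIOUnit, kAudioOutputUnitProperty_EnableIO,
                             kAudioUnitScope_Input, 1, &one, sizeof(one));

        // An assumed format for illustration: 16-bit signed mono PCM at 44.1 kHz.
        AudioStreamBasicDescription asbd = {0};
        asbd.mSampleRate       = 44100.0;
        asbd.mFormatID         = kAudioFormatLinearPCM;
        asbd.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
        asbd.mChannelsPerFrame = 1;
        asbd.mBitsPerChannel   = 16;
        asbd.mBytesPerFrame    = 2;
        asbd.mFramesPerPacket  = 1;
        asbd.mBytesPerPacket   = 2;

        // Input scope, bus 0: "this is what I will provide for play-out".
        AudioUnitSetProperty(remoteIOUnit, kAudioUnitProperty_StreamFormat,
                             kAudioUnitScope_Input, 0, &asbd, sizeof(asbd));

        // Output scope, bus 1: "this is the format I want to receive from H/W".
        AudioUnitSetProperty(remoteIOUnit, kAudioUnitProperty_StreamFormat,
                             kAudioUnitScope_Output, 1, &asbd, sizeof(asbd));

        AudioUnitInitialize(remoteIOUnit);
        return remoteIOUnit;
    }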
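Finally, registering the two callbacks named in the last bullet, both set with global scope as described. A sketch assuming the remoteIOUnit created above; the render callback feeds bus 0 (play-out) with silence for illustration, and the input callback fires when mic samples are available on bus 1, where a real app would pull them with AudioUnitRender():

    #include <AudioUnit/AudioUnit.h>
    #include <string.h>

    // Called when the unit needs samples to play: fill ioData with
    // inNumberFrames frames in the format declared on bus 0's input scope.
    static OSStatus MyRenderCallback(void *inRefCon,
                                     AudioUnitRenderActionFlags *ioActionFlags,
                                     const AudioTimeStamp *inTimeStamp,
                                     UInt32 inBusNumber,
                                     UInt32 inNumberFrames,
                                     AudioBufferList *ioData)
    {
        // Silence for illustration; a real app writes its samples here.
        for (UInt32 i = 0; i < ioData->mNumberBuffers; i++) {
            memset(ioData->mBuffers[i].mData, 0, ioData->mBuffers[i].mDataByteSize);
        }
        return noErr;
    }

    // Called when mic samples are available on bus 1. ioData is NULL here;
    // allocate an AudioBufferList and call AudioUnitRender() with bus 1 to
    // fetch the samples.
    static OSStatus MyInputCallback(void *inRefCon,
                                    AudioUnitRenderActionFlags *ioActionFlags,
                                    const AudioTimeStamp *inTimeStamp,
                                    UInt32 inBusNumber,
                                    UInt32 inNumberFrames,
                                    AudioBufferList *ioData)
    {
        return noErr;
    }

    void AttachCallbacks(AudioUnit remoteIOUnit)
    {
        AURenderCallbackStruct render = { MyRenderCallback, NULL };
        AudioUnitSetProperty(remoteIOUnit, kAudioUnitProperty_SetRenderCallback,
                             kAudioUnitScope_Global, 0, &render, sizeof(render));

        AURenderCallbackStruct input = { MyInputCallback, NULL };
        AudioUnitSetProperty(remoteIOUnit, kAudioOutputUnitProperty_SetInputCallback,
                             kAudioUnitScope_Global, 1, &input, sizeof(input));
    }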