
iPhone Audio Programming Tips

After extensive research into Apple’s poorly documented audio programming Objective-C classes and APIs, here are some helpful links.

Using RemoteIO audio unit

Decibel metering from an iPhone audio unit

Analyse Audio with RemoteIO

iPhone Core Audio tutorial

  1. http://timbolstad.com/2010/03/16/core-audio-getting-started-pt1/
  2. http://timbolstad.com/2010/03/16/core-audio-getting-started-pt2/
  3. http://timbolstad.com/2010/03/16/core-audio-getting-started-pt3/

It’s hard. Jens Alfke put it thusly:

“Easy” and “CoreAudio” can’t be used in the same sentence. :P CoreAudio is very powerful, very complex, and under-documented. Be prepared for a steep learning curve, APIs with millions of tiny little pieces, and puzzling things out from sample code rather than reading high-level documentation.

  • Media is hard because you’re dealing with issues of hardware I/O, real-time, threading, performance, and a pretty dense body of theory, all at the same time. Webapps are trite by comparison.

  • On the iPhone, Core Audio has three levels of opt-in for playback and recording, given your needs, listed here in increasing order of complexity/difficulty:

    1. AVAudioPlayer – File-based playback of DRM-free audio in Apple-supported codecs. Cocoa classes, called with Obj-C. iPhone 3.0 adds AVAudioRecorder (wasn’t sure if this was NDA, but it’s on the WWDC marketing page).
    2. Audio Queues – C-based API for buffered recording and playback of audio. Since you supply the samples, would work for a net radio player, and for your own formats and/or DRM/encryption schemes (decrypt in memory before handing off to the queue). Inherent latency due to the use of buffers.
    3. Audio Units – Low-level C-based API. Very low latency, as little as 29 milliseconds. Mixing, effects, near-direct access to input and output hardware.
  • Other important Core Audio APIs not directly tied to playback and recording:
    - Audio Session Services – for communicating your app’s audio needs to the system and defining interaction with things like the background iPod player and the ring/silent switch, as well as getting audio H/W metadata
    - Audio File Services – for reading/writing files
    - Audio File Stream Services – for dealing with audio data in a network stream
    - Audio Conversion Services – for converting between PCM and compressed formats
    - Extended Audio File Services – combines file and conversion services (e.g., given PCM, write out to a compressed AAC file)

  • Setting a property on an audio unit requires declaring the “scope” that the property applies to. Input scope is audio coming into the AU, output is going out of the unit, and global is for properties that affect the whole unit. So, if you set the stream format property on an AU’s input scope, you’re describing what you will supply to the AU.
  • Make the RemoteIO unit your friend. This is the AU that talks to both input and output hardware. Its use of buses is atypical and potentially confusing. Enjoy the ASCII art:


                                 -------------------------
                                 | i                   o |
    -- BUS 1 -- from mic --> | n    REMOTE I/O     u | -- BUS 1 -- to app -->
                                 | p      AUDIO        t |
    -- BUS 0 -- from app --> | u      UNIT         p | -- BUS 0 -- to speaker -->
                                 | t                   u |
                                 |                     t |
                                 -------------------------

    Ergo, the stream properties for this unit are


                     Bus 0                                     Bus 1
    Input Scope:     Set ASBD to indicate what you’re          Get ASBD to inspect audio format
                     providing for play-out                    being received from H/W
    Output Scope:    Get ASBD to inspect audio format          Set ASBD to indicate what format
                     being sent to H/W                         you want your units to receive
  • That said, setting up the callbacks for providing samples to or getting them from a unit takes global scope, as their purpose is implicit in the property names: kAudioOutputUnitProperty_SetInputCallback and kAudioUnitProperty_SetRenderCallback.


OpenGL push/pop transformation matrix

For those familiar with 3D object manipulation, OpenGL uses a “transformation matrix stack” that lets you apply changes to certain parts of your OpenGL object rendering without impacting the rest of the rendering.  It operates much like a software stack, but it only works for translation, rotation, and scaling.  Too bad that iPhone OpenGL ES does not support push/pop of OpenGL attributes (to control line widths, etc.)

glPushMatrix();

// apply your glTranslatef(), glRotatef(), glScalef() here
// note: the matrix stack does not save the current color — if you
// change it, restore it manually, e.g. glColor4f(1, 1, 1, 1)

glPopMatrix();

OpenGL ES quirks on iPhone

It’s great that iPhone supports OpenGL, but there are a few things that are wanting (at least for things we wanted to do).

In this case, it was drawing vector art on the iPhone. We wanted to draw lines of varying widths, but iPhone does not support glPushAttrib and glPopAttrib.

Normally, if you want to change the state of a lot of different environment variables, such as GL_LIGHTING, glPolygonMode(), glLineWidth(), and the like, you would use the following code:

// save the attribute groups you are about to change
glPushAttrib(GL_ENABLE_BIT | GL_LINE_BIT | GL_POLYGON_BIT);

// ... change lighting, line width, polygon mode, etc. ...

// restore the saved state
glPopAttrib();

Below is from http://www.bradleymacomber.com/coderef/OpenGLES/ on some of the differences in iPhone OpenGL ES:

OpenGL ES Limitations (on iPhone)

The subset of OpenGL for mobile devices is missing a lot of the typical functions. The exact details may come as a surprise. The Khronos site lacks any documentation explaining this. (Presumably this is an excuse for them to sell me a book.) So I am writing down the limitations as I find ’em. Most often the convention seems to be to eliminate functionality that is a convenient re-presentation of more fundamental low-level functionality.

No GLU library.
Some handy functions such as gluPerspective() and gluLookAt() will have to be replaced with manual calculations.
No immediate-mode rendering.
This means there are no glBegin() and glEnd() functions. Instead you must use vertex arrays or vertex buffers. This is no surprise since games shouldn’t be using immediate mode anyway.
Simplified vertex arrays.
The glInterleavedArrays() function is unavailable; each array must be specified separately, although stride can still be used. There are no glDrawRangeElements(), glMultiDrawElements(), nor glMultiDrawArrays() functions. Instead use glDrawArrays() or glDrawElements(). There is also no glArrayElement() function, which makes sense since it requires glBegin() and glEnd().
No quads.
All the usual geometric primitives are supported except for GL_QUADS, GL_QUAD_STRIP, and GL_POLYGON. Of course, these are provided for convenience and are almost always easily replaced by triangles.
Smaller datatypes.
Many functions accept only smaller datatypes such as GL_SHORT instead of GL_INT, or GL_FLOAT instead of GL_DOUBLE. Presumably this is to save space on a device with limited memory and a screen small enough that a lack of fine detail can go unnoticed.
GLfixed.
This new low-level datatype is introduced to replace a variety of datatypes normally present. For example, there is no glColor4ub() function. Presumably this helps support devices which do not have an FPU.
No glPushAttrib nor glPopAttrib (nor glPushClientAttrib).
Well, this is annoying. I guess an iPhone application is supposed to be simple enough that we can keep track of all states manually, eh?
No GL_LINE_SMOOTH.
Enabling it has no effect.
No display lists.
Instead use vertex arrays or vertex buffers.
No bitmap functions.
Functions such as glBitmap() and glRasterPos*() do not exist. This means you cannot render simple bitmap fonts. Instead, textured quads must be rendered. Of course, you could always render vector fonts. Don’t let me stop you.
Texture borders not supported.
Probably not a big deal.

Guess that means I have to manually track some of these changes and revert them when needed.  This is not always easy to do when other parts of your program can access OpenGL without your knowledge.


Copyright 2009-2010 ZeroInverse.com