Audio Session:系统与应用程序的中介

Overview

Apple通过audio sessions管理app, app与其他app, app与外部音频硬件间的行为.使用audio session可以向系统传达你将如何使用音频.audio session充当着app与系统间的中介.这样我们无需了解硬件相关却可以操控硬件行为.

1.Audio session

配置audio session类别与模式去告诉系统在app中你想怎么使用音频
激活audio session使配置的类别与模式可以工作
添加通知,响应重要的audio session通知,例如音频中断与硬件线路改变
配置音频采样率,声道数等信息

1.配置Audio Session

1.1. Audio Session管理Audio

audio session是应用程序与系统间的中介,用于配置音频行为,APP启动时,会自动获得一个audio session的单例对象,配置并且激活它以让音频按照期望开始工作.

1.2. Categories代表Audio作用

audio session category代表音频的主要行为.通过设置类别, 可以指明app是否使用的当前的输入或输出音频设备,以及当别的app中正在播放音频进入我们app时他们的音频是强制停止还是与我们的音频一起播放等等.

AVFoundation中定义了很多audio session categories, 你可以根据需要自定义音频行为,很多类别支持播放,录制,录制与播放同时进行,当系统了解了你定义的音频规则,它将提供给你合适的路径去访问硬件资源.系统也将确保别的app中的音频以适合你应用的方式运行.

一些categories可以根据Mode进一步定制,该模式用于专门指定类别的行为,例如当使用视频录制模式时,系统可能会选择一个不同于默认内置麦克风的麦克风,系统还可以针对录制调整麦克风的信号强度.

1.3. 中断处理

如果audio意外中断,系统会将aduio session置为停用状态,音频也会因此立即停止.当一个别的app的audio session被激活并且它的类别未设置与系统类别或你应用程序类别混合时,中断就会发生.你的应用程序在收到中断通知后应该保存当时的状态,以及更新用户界面等相关操作.通过注册AVAudioSessionInterruptionNotification可以观察中断的开始与结束点.

1.4. 音频线路改变

当用户做出连接,断开音频输入,输出设备时,(如:插拔耳机)音频线路发生变化,通过注册AVAudioSessionRouteChangeNotification可以在音频线路发生变化时做出相应处理.

1.5. Audio Sessions控制设备配置

App不能直接控制设备的硬件,但是audio session提供了一些接口去获取或设置一些高级的音频设置,如采样率,声道数等等.

1.6. Audio Sessions保护用户隐私

App如果想使用音频录制功能必须请求用户授权,否则无法使用.

2. 激活Audio Session

在设置了audio session的category, options, mode后,我们可以激活它以启动音频.

2.1. 系统如何解决音频竞争

随着app的启动,内置的一些服务(短信,音乐,浏览器,电话等)也将在后台运行.前面的这些内置服务都可能产生音频,如有电话打来,有短信提示等等…

2.2. 激活,停用Audio Session

虽然AVFoundation中播放与录制可以自动激活你的audio session, 但你可以手动激活并且测试是否激活成功.

系统会停用你的audio session当有电话打进来,闹钟响了,或是日历提醒等消息介入.当处理完这些介入的消息后,系统允许我们手动重新激活audio sesseion.

let session = AVAudioSession.sharedInstance()
do {
    // 1) Configure your audio session category, options, and mode
    // 2) Activate your audio session to enable your custom configuration
    try session.setActive(true)
} catch let error as NSError {
    print("Unable to activate audio session:  \(error.localizedDescription)")
}

如果我们使用AVFoundation对象(AVPlayer, AVAudioRecorder等),系统负责在中断结束时重新激活audio session.然而,如果你注册了通知去重新激活audio session,你可以验证是否激活成功并且更新用户界面.

确保在后台运行的VoIP应用程序的音频会话仅在应用程序处理呼叫时才处于激活状态。在后台，若未收到呼叫,VoIP应用程序的音频会话不应该是激活的。
确保使用录制类别的应用程序的音频会话仅在录制时处于激活状态。在录制开始和停止之前，请确保您的会话处于未激活状态，以允许播放其他声音，例如系统声音。
如果应用程序支持后台音频播放或录制，但在应用程序未主动使用音频（或准备使用音频）时，在进入后台时停用其音频会话。这样做允许系统释放音频资源，以便其他进程可以使用它们。

2.3. 检查别的Audio是否正在播放

当你的app被激活前,当前设备可能正在播放别的声音,如果你的app是一个游戏的app,知道别的声音来源显得十分重要,因为许多游戏允许同时播放别的音乐以增强用户体验.

在app进入前台前,我们可以通过applicationDidBecomeActive:代理方法在其中使用secondaryAudioShouldBeSilencedHint属性来确定音频是否正在播放.当别的app正在播放的audio session为不可混音配置时,该值为true. app可以使用此属性消除次要音频.

func setupNotifications() {
    NotificationCenter.default.addObserver(self,
                                           selector: #selector(handleSecondaryAudio),
                                           name: .AVAudioSessionSilenceSecondaryAudioHint,
                                           object: AVAudioSession.sharedInstance())
}
 
func handleSecondaryAudio(notification: Notification) {
    // Determine hint type
    guard let userInfo = notification.userInfo,
        let typeValue = userInfo[AVAudioSessionSilenceSecondaryAudioHintTypeKey] as? UInt,
        let type = AVAudioSessionSilenceSecondaryAudioHintType(rawValue: typeValue) else {
            return
    }
 
    if type == .begin {
        // Other app audio started playing - mute secondary audio
    } else {
        // Other app audio stopped playing - restart secondary audio
    }
}

3. 响应中断

在app中断后可以通过代码做出响应.音频中断将会导致audio session停用,同时应用程序中音频立即终止.当一个来自其他app的竞争的audio session被激活且这个audio session类别不支持与你的app进行混音时,中断发生.注册通知后我们可以在得知音频中断后做出相应处理.

App会因为中断被暂停,当用户接到电话时,闹钟,或其他系统事件被触发时,当中断结束后,App会继续运行,但是需要我们手动重新激活audio session.

3.1. 中断的生命周期

下图简单展示了当收到facetime后app的audio session与系统的audio session间激活与未激活状态变化.
2.interruput_lifecycle

3.2. 中断处理方法

通过注册监听中断的通知可以在中断来的时候进行处理.处理中断取决于你当前正在执行的操作:播放,录制,音频格式转换,读取音频数据包等等.一般而言,我们应尽量避免中断并且做到中断后尽快恢复.

中断前

保存状态与上下文
更新用户界面

中断后

恢复状态与上下文
更新用户界面
重新激活audio session.

Audio technology	How interruptions work
AVFoundation framework	系统在中断时会自动暂停录制与播放,当中断结束后重新激活audio session,恢复录制与播放
Audio Queue Services, I/O audio unit	系统会发出中断通知,开发者可以保存播放与录制状态并且在中断结束后重新激活audio session
System Sound Services	使用系统声音服务在中断来临时保持静音,如果中断结束,声音自动播放.

3.3. 处理Siri

当处理Siri时,与其他中断不同,我们在中断期间需要对Siri进行监听,如在中断期间,用户要求Siri去暂停开发者app中的音频播放,当app收到中断结束的通知时,不应该自动恢复播放.同时,用户界面需要跟Siri要求的保持一致.

3.4. 监听中断

func registerForNotifications() {
    NotificationCenter.default.addObserver(self,
                                           selector: #selector(handleInterruption),
                                           name: .AVAudioSessionInterruption,
                                           object: AVAudioSession.sharedInstance())
}
 
func handleInterruption(_ notification: Notification) {
    // Handle interruption
}

func handleInterruption(_ notification: Notification) {
    guard let info = notification.userInfo,
        let typeValue = info[AVAudioSessionInterruptionTypeKey] as? UInt,
        let type = AVAudioSessionInterruptionType(rawValue: typeValue) else {
            return
    }
    if type == .began {
        // Interruption began, take appropriate actions (save state, update user interface)
    }
    else if type == .ended {
        guard let optionsValue =
            userInfo[AVAudioSessionInterruptionOptionKey] as? UInt else {
                return
        }
        let options = AVAudioSessionInterruptionOptions(rawValue: optionsValue)
        if options.contains(.shouldResume) {
            // Interruption Ended - playback should resume
        }
    }
}

注意: 无法确保在开始中断后一定有一个结束中断,所以,如果没有结束中断,我们在app重新播放音频时需要总是检查aduio session是否被激活.

3.5. 响应媒体服务器重置操作

media server通过一个共享服务器进程提供了音频和其他多媒体功能.尽管很少见,但是如果在你的app正在运行时收到一条重置命令,可以通过注册AVAudioSessionMediaServicesWereResetNotification通知监听media server是否重置.收到通知后需要做如下操作.

销毁音频对象并且创建新的音频对象(如:players,recorders,converters,audio queues)
重置所有audio状态,包括AVAudioSession全部属性
在合适时机重新激活AVAudioSession对象.

如果开发者的应用程序中需要重置功能,如设置中有重置选项,可以使用这个方法轻松重置.

4. 线路改变

audio hardware route指定的设备音频硬件线路发生改变.当用户插拔耳机,系统会自动改变硬件的线路.开发者可以注册AVAudioSessionRouteChangeNotification通知在线路变化时作出相应调整.

3.audio_route_change

如上图,系统在app启动时会确定一套音频线路,而后程序运行期间会继续监听当前活跃的音频线路,在录制期间,用户可能插拔耳机,系统会发送一份改变线路的通知告诉开发者同时音频停止,开发者可以通过代码决定是否重新激活.

播放与录制稍有不同,播放时如果用户拔掉耳机,默认暂停音频,如果插上耳机,默认继续播放.

4.1. 监听Audio线路变化

原因

插拔耳机
连接,断开蓝牙耳机
插拔USB音频设备

func setupNotifications() {
    NotificationCenter.default.addObserver(self,
                                           selector: #selector(handleRouteChange),
                                           name: .AVAudioSessionRouteChange,
                                           object: AVAudioSession.sharedInstance())
}
 
func handleRouteChange(_ notification: Notification) {
 
}

userInfo中提供了关于线路改变的详细信息.可以查询改变原因通过字典中的AVAudioSessionRouteChangeReason,如当新的设备接入时,原因为AVAudioSessionRouteChangeReason,移除时为AVAudioSessionRouteChangeReasonOldDeviceUnavailable

func handleRouteChange(_ notification: Notification) {
    guard let userInfo = notification.userInfo,
        let reasonValue = userInfo[AVAudioSessionRouteChangeReasonKey] as? UInt,
        let reason = AVAudioSessionRouteChangeReason(rawValue:reasonValue) else {
            return
    }
    switch reason {
    case .newDeviceAvailable:
        // Handle new device available.
    case .oldDeviceUnavailable:
        // Handle old device removed.
    default: ()
    }
}

当有音频硬件插入时,你可以查询audio session的currentRoute属性去确定当前音频输出的位置.它将返回一个AVAudioSessionRouteDescription对象包含audio session全部的输入输出信息.当一个音频硬件被移除时,我们也可以从该对象中查询上一个线路.在以上两种情况中,我们都可以查询outputs属性,通过返回的AVAudioSessionPortDescription对象提供了音频输出的全部信息.

func handleRouteChange(notification: NSNotification) {
    guard let userInfo = notification.userInfo,
        let reasonValue = userInfo[AVAudioSessionRouteChangeReasonKey] as? UInt,
        let reason = AVAudioSessionRouteChangeReason(rawValue:reasonValue) else {
            return
    }
    switch reason {
    case .newDeviceAvailable:
        let session = AVAudioSession.sharedInstance()
        for output in session.currentRoute.outputs where output.portType == AVAudioSessionPortHeadphones {
            headphonesConnected = true
        }
    case .oldDeviceUnavailable:
        if let previousRoute =
            userInfo[AVAudioSessionRouteChangePreviousRouteKey] as? AVAudioSessionRouteDescription {
            for output in previousRoute.outputs where output.portType == AVAudioSessionPortHeadphones {
                headphonesConnected = false
            }
        }
    default: ()
    }
}

5. 配置设备硬件

使用audio session属性,可以在运行时优化硬件音频行为.这样可以让代码适配运行设备的特性.这样做同样适用于用户对音频硬件作出的更改.

5.1. 配置初始音频参数

使用audio session指定音频设备的设置,如采样率, I/O缓冲区时间.

Setting	Preferred sample rate	Preferred I/O buffer duration
High value	Example: 48 kHz, + High audio quality, – Large file or buffer size	Example: 500 mS, + Less-frequent file access, – Longer latency
Low value	Example: 8 kHz, + Small file or buffer size, – Low audio quality	Example: 5 mS,+ Low latency, – Frequent file access

Note: 默认音频输入输出缓冲时间(I/O buffer duration)为大多数应用提供足够的相应时间,如44.1kHz音频大概为20ms响应一次,你可以设置更低的延迟但相应数据量每次过来的也会降低,根据自己的需求进行选择.

5.2. 设置

在激活audio session前必须完成设置内容.如果你正在运行audio session, 先停用它,然后改变设置重新激活.

let session = AVAudioSession.sharedInstance()
 
// Configure category and mode
do {
    try session.setCategory(AVAudioSessionCategoryRecord, mode: AVAudioSessionModeDefault)
} catch let error as NSError {
    print("Unable to set category:  \(error.localizedDescription)")
}
 
// Set preferred sample rate
do {
    try session.setPreferredSampleRate(44_100)
} catch let error as NSError {
    print("Unable to set preferred sample rate:  \(error.localizedDescription)")
}
 
// Set preferred I/O buffer duration
do {
    try session.setPreferredIOBufferDuration(0.005)
} catch let error as NSError {
    print("Unable to set preferred I/O buffer duration:  \(error.localizedDescription)")
}
 
// Activate the audio session
do {
    try session.setActive(true)
} catch let error as NSError {
    print("Unable to activate session. \(error.localizedDescription)")
}
 
// Query the audio session's ioBufferDuration and sampleRate properties
// to determine if the preferred values were set
print("Audio Session ioBufferDuration: \(session.ioBufferDuration), sampleRate: \(session.sampleRate)")

5.3. 选择,配置麦克风

一个设备可能有多个麦克风(内置,外接),iOS会根据当前使用的audio session mode自动选择一个.mode指定了输入数字信号处理(DSP)和可能的线路.输入线路针对每种模式的用例进行了优化,设置mode还可能影响正在使用的音频线路.

开发者可以手动选择麦克风,甚至可以设置polar pattern如果硬件支持.

在使用任何音频设备之前，请为您的应用设置音频会话类别和模式，然后激活音频会话。

设置Preferred Input

为了找到当前设备连接的音频输入设备,可以使用audio session的availableInputs属性,该属性返回一个AVAudioSessionPortDescription对象的数组,描述当前可用输入设备端口,端口用portType进行标识.可以使用setPreferredInput:error:设置可用的音频输入设备.

设置Preferred Data Source

部分端口如内置麦克风,USB等支持数据源(data source),应用程序可以通过查询端口的dataSources属性发现可用的数据源.对于内置麦克风，返回的数据源描述对象代表每个单独的麦克风。不同的设备为内置麦克风返回不同的值。例如，iPhone 4和iPhone 4S有两个麦克风：底部和顶部。 iPhone 5有三个麦克风：底部，前部和后部。

可以通过数据源描述的location属性（上，下）和orientation属性（前，后等）的组合来识别各个内置麦克风。应用程序可以使用AVAudioSessionPortDescription对象的setPreferredDataSource：error：方法设置首选数据源。

设置 Preferred Polar Pattern

某些iOS设备支持为某些内置麦克风配置麦克风极性模式。麦克风的极性模式定义了其对声音相对于声源方向的灵敏度。使用supportedPolarPatterns属性返回数据源是否支持此模式,此属性返回数据源支持的极坐标模式数组（如心形或全向），或者在没有可选模式时返回nil。如果数据源具有许多支持的极坐标模式，则可以使用数据源描述的setPreferredPolarPattern：error：方法设置首选极坐标模式。

选择特定麦克风并且设置polar pattern.

// Preferred Mic = Front, Preferred Polar Pattern = Cardioid
let preferredMicOrientation = AVAudioSessionOrientationFront
let preferredPolarPattern = AVAudioSessionPolarPatternCardioid
 
// Retrieve your configured and activated audio session
let session = AVAudioSession.sharedInstance()
 
// Get available inputs
guard let inputs = session.availableInputs else { return }
 
// Find built-in mic
guard let builtInMic = inputs.first(where: {
    $0.portType == AVAudioSessionPortBuiltInMic
}) else { return }
 
// Find the data source at the specified orientation
guard let dataSource = builtInMic.dataSources?.first (where: {
    $0.orientation == preferredMicOrientation
}) else { return }
 
// Set data source's polar pattern
do {
    try dataSource.setPreferredPolarPattern(preferredPolarPattern)
} catch let error as NSError {
    print("Unable to preferred polar pattern: \(error.localizedDescription)")
}
 
// Set the data source as the input's preferred data source
do {
    try builtInMic.setPreferredDataSource(dataSource)
} catch let error as NSError {
    print("Unable to preferred dataSource: \(error.localizedDescription)")
}
 
// Set the built-in mic as the preferred input
// This call will be a no-op if already selected
do {
    try session.setPreferredInput(builtInMic)
} catch let error as NSError {
    print("Unable to preferred input: \(error.localizedDescription)")
}
 
// Print Active Configuration
session.currentRoute.inputs.forEach { portDesc in
    print("Port: \(portDesc.portType)")
    if let ds = portDesc.selectedDataSource {
        print("Name: \(ds.dataSourceName)")
        print("Polar Pattern: \(ds.selectedPolarPattern ?? "[none]")")
    }
}
Running this code on an iPhone 6s produces the following console output:

Port: MicrophoneBuiltIn
Name: Front
Polar Pattern: Cardioid

5.4. 模拟器运行

可以在模拟器或设备上运行您的应用。但是，Simulator不会模拟不同进程或音频线路更改中的音频会话之间的大多数交互。在Simulator中运行应用程序时，您不能：

调用中断
模拟插入或拔出耳机
更改静音开关的设置
模拟屏幕锁定
测试音频混合行为 - 即播放音频以及来自其他应用（例如音乐应用）的音频

#if arch(i386) || arch(x86_64)
    // Execute subset of code that works in the Simulator
#else
    // Execute device-only code as well as the other code
#endif

保护用户隐私

为了保护用户隐私，应用必须在录制音频之前询问并获得用户的许可。如果用户未授予许可，则仅记录静音。当您使用支持录制的类别并且应用程序尝试使用输入线路时，系统会自动提示用户获得权限。

您可以使用requestRecordPermission：方法手动请求权限，而不是等待系统提示用户提供记录权限。使用此方法可以让您的应用获得权限，而不会中断应用的自然流动，从而获得更好的用户体验。

AVAudioSession.sharedInstance().requestRecordPermission { granted in
    if granted {
        // User granted access. Present recording interface.
    } else {
        // Present message to user indicating that recording
        // can't be performed until they change their preference
        // under Settings -> Privacy -> Microphone
    }
}

从iOS 10开始，所有访问任何设备麦克风的应用都必须静态声明其意图。为此，应用程序现在必须在其Info.plist文件中包含NSMicrophoneUsageDescription键，并为此密钥提供目的字符串。当系统提示用户允许访问时，此字符串将显示为警报的一部分。如果应用程序尝试访问任何设备的麦克风而没有此键和值，则应用程序将终止。