Recognize spoken words in recorded or live audio using Speech.

Posts under Speech tag

50 Posts

Post

Replies

Boosts

Views

Activity

AVSpeechSynthesisVoice.speechVoices() - different behavior on Mac (Designed for iPhone) and iOS and MANY errors checking .audioFileSettings properties.
We recently started working on getting an iOS app to work on Macs with Apple Silicon as a "Designed for iPhone" app and are having issues with speech synthesis. Specifically, voices retuned by AVSpeechSynthesisVoice.speechVoices() do not all work on the Mac. When we build an utterance and attempt to speak, the synthesizer falls back on a default voice and says some very odd text about voice parameters (that is not in the utterance speech text) before it does say the intended speech. Here is some sample code to setup the utterance and speak: func speak(_ text: String, _ settings: AppSettings) { let utterance = AVSpeechUtterance(string: text) if let voice = AVSpeechSynthesisVoice(identifier: settings.selectedVoiceIdentifier) { utterance.voice = voice print("speak: voice assigned \(voice.audioFileSettings)") } else { print("speak: voice error") } utterance.rate = settings.speechRate utterance.pitchMultiplier = settings.speechPitch do { let audioSession = AVAudioSession.sharedInstance() try audioSession.setCategory(.playback, mode: .default, options: .duckOthers) try audioSession.setActive(true, options: .notifyOthersOnDeactivation) self.synthesizer.speak(utterance) return } catch let error { print("speak: Error setting up AVAudioSession: \(error.localizedDescription)") } } When running the app on the Mac, this is the kind of error we get with "com.apple.eloquence.en-US.Rocko" as the selectedVoiceIdentifier: speak: voice assgined [:] 2023-05-29 18:00:14.245513-0700 A.I.[9244:240554] [aqme] AQMEIO_HAL.cpp:742 kAudioDevicePropertyMute returned err 2003332927 2023-05-29 18:00:14.410477-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.412837-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.413774-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.414661-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.415544-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.416384-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.416804-0700 A.I.[9244:240554] [AXTTSCommon] Audio Unit failed to start after 5 attempts. 2023-05-29 18:00:14.416974-0700 A.I.[9244:240554] [AXTTSCommon] VoiceProvider: Could not start synthesis for request SSML Length: 140, Voice: [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null), converted from tts request [TTSSpeechRequest 0x600002c29590] <speak><voice name="com.apple.eloquence.en-US.Rocko">How much wood would a woodchuck chuck if a wood chuck could chuck wood?</voice></speak> language: en-US footprint: premium rate: 0.500000 pitch: 1.000000 volume: 1.000000 2023-05-29 18:00:14.428421-0700 A.I.[9244:240360] [VOTSpeech] Failed to speak request with error: Error Domain=TTSErrorDomain Code=-4010 "(null)". Attempting to speak again with fallback identifier: com.apple.voice.compact.en-US.Samantha When we run AVSpeechSynthesisVoice.speechVoices(), the "com.apple.eloquence.en-US.Rocko" is absolutely in the list but fails to speak properly. Notice that the line: print("speak: voice assigned \(voice.audioFileSettings)") Shows: speak: voice assigned [:] The .audioFileSettings being empty seems to be a common factor for the voices that do not work properly on the Mac. For voices that do work, we see this kind of output and values in the .audioFileSettings: speak: voice assigned ["AVFormatIDKey": 1819304813, "AVLinearPCMBitDepthKey": 16, "AVLinearPCMIsBigEndianKey": 0, "AVLinearPCMIsFloatKey": 0, "AVSampleRateKey": 22050, "AVLinearPCMIsNonInterleaved": 0, "AVNumberOfChannelsKey": 1] So we added a function to check the .audioFileSettings for each voice returned by AVSpeechSynthesisVoice.speechVoices(): //The voices are set in init(): var voices = AVSpeechSynthesisVoice.speechVoices() ... func checkVoices() { DispatchQueue.global().async { [weak self] in guard let self = self else { return } let checkedVoices = self.voices.map { ($0.0, $0.0.audioFileSettings.count) } DispatchQueue.main.async { self.voices = checkedVoices } } } That looks simple enough, and does work to identify which voices have no data in their .audioFileSettings. But we have to run it asynchronously because on a real iPhone device, it takes more than 9 seconds and produces a tremendous amount of error spew to the console. 2023-06-02 10:56:59.805910-0700 A.I.[17186:910118] [catalog] Query for com.apple.MobileAsset.VoiceServices.VoiceResources failed: 2 2023-06-02 10:56:59.971435-0700 A.I.[17186:910118] [catalog] Query for com.apple.MobileAsset.VoiceServices.VoiceResources failed: 2 2023-06-02 10:57:00.122976-0700 A.I.[17186:910118] [catalog] Query for com.apple.MobileAsset.VoiceServices.VoiceResources failed: 2 2023-06-02 10:57:00.144430-0700 A.I.[17186:910116] [AXTTSCommon] MauiVocalizer: 11006 (Can't compile rule): regularExpression=\Oviedo(?=, (\x1b\\pause=\d+\\)?Florida)\b, message=unrecognized character follows \, characterPosition=1 2023-06-02 10:57:00.147993-0700 A.I.[17186:910116] [AXTTSCommon] MauiVocalizer: 16038 (Resource load failed): component=ttt/re, uri=, contentType=application/x-vocalizer-rettt+text, lhError=88602000 2023-06-02 10:57:00.148036-0700 A.I.[17186:910116] [AXTTSCommon] Error loading rules: 2147483648 ... This goes on and on and on ... There must be a better way?
7
1
3.6k
1w
iOS 26.4 — How to return from main app to host app after a keyboard-extension dictation round-trip, without private APIs?
I'm building a custom keyboard extension that offers voice dictation. Because keyboard extensions are constrained (memory cap ~30–48 MB, restricted audio session access), I delegate recording to my container app: User in a host app (e.g., Safari) taps the mic in my keyboard extension. The keyboard calls extensionContext.open(URL("myapp://dictation")) to launch the container app. The container app records audio via AVAudioEngine + SFSpeechRecognizer, writes the final transcript to the App Group, and signals completion via a Darwin notification. 4. The user is expected to be returned to the original host app (Safari) automatically so they can keep typing. The problem (step 4): On iOS 26.4 I can no longer identify which app was the host. Every previously-known path returns nil for the keyboard extension's host: parent.value(forKey: "_hostBundleID") → returns the literal string parent.value(forKey: "_hostApplicationBundleIdentifier") → returns NSNull xpc_connection_copy_bundle_id on the underlying XPC connection (via PKService.defaultService.personalities[…]) → returns NULL NSXPCConnection.processBundleIdentifier on extensionContext._extensionHostProxy._connection → returns nil proc_pidpath(hostPID, …) → EPERM from the keyboard sandbox LSApplicationWorkspace.frontmostApplication → selector unavailable from the extension RBSProcessHandle.handleForIdentifier:error: → returns an RBSServiceErrorDomain error Without the host's bundle ID, the container app has no way to call LSApplicationWorkspace.openApplicationWithBundleID: (the technique that worked on iOS 25 and earlier). UIApplication.suspend() correctly sends the container to background, but iOS treats us as a "fresh launch" — it returns the user to the Home Screen instead of Safari, because the container app was launched by an extension, not directly by Safari. KeyboardKit's maintainer reached the same conclusion (issue #1014) and shipped 10.4 without the feature. My questions: Is there a public, App-Store-safe API in iOS 26+ for a custom keyboard extension to identify its host application, or for the container app (launched via the extension's openURL) to identify which app initially hosted the extension that opened it? UIOpenURLContext.options.sourceApplication reports the extension's own container, not the actual host. 2. Is there a public mechanism for "return to source app" when the container app was launched by an extension's openURL? Equivalent to the ← Source affordance iOS shows for normal inter-app openURL, but triggered programmatically by the launched app. 3. Some popular keyboards (e.g., 微信输入法 / WeChat Keyboard) still appear to round-trip through their container app on iOS 26.4 and return the user to the original host — including the iOS ← WeChat back affordance in the host's status bar afterward. What's the recommended approach to achieve this? If it requires a specific scene-activation flow, NSUserActivity pattern, or extension-context configuration, please point at the relevant docs. 4. If there is no public path today, is FB22247647 (or a related radar) the right place to track this? Should developers in this position migrate to in-extension audio capture (which has its own significant constraints in keyboard extensions)? I'd much rather not rely on private APIs. Concrete guidance — or even an acknowledgment of which direction Apple intends — would help thousands of custom-keyboard developers who currently have a degraded voice-input experience on iOS 26.4+. Tested on iPhone 12 Pro Max running iOS 26.4.2 (build 23E261), Xcode 26.x, Swift 5. Thanks!
0
0
225
2w
iOS feasibility question: user-initiated wake-word detection during active session
Hi all, Technical architecture question for those experienced with iOS background audio / microphone constraints. I’m exploring an app concept where: the user explicitly starts a temporary active session during that session, on-device wake-word / keyword detection runs locally no audio is stored or transmitted during passive monitoring monitoring stops when the user ends the session The intended UX is that the user may then lock the phone or place it away while the active session remains in progress. Question: Is there any App Store-compliant architecture that would allow local keyword / wake-word detection to continue while the device is locked or the app is backgrounded during that active session? Or would iOS lifecycle / background execution rules make this infeasible for custom wake-word detection? Interested in practical experience around: AVAudioSession background audio modes on-device speech processing App Review acceptability Thanks in advance.
0
0
361
2w
How to use the SpeechDetector Module
I am trying to use SpeechDetector Module in Speech framework along with SpeechTranscriber. and it is giving me an error Cannot convert value of type 'SpeechDetector' to expected element type 'Array.ArrayLiteralElement' (aka 'any SpeechModule') Below is how I am using it let speechDetector = Speech.SpeechDetector() let transcriber = SpeechTranscriber(locale: Locale.current, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange]) speechAnalyzer = try SpeechAnalyzer(modules: [transcriber,speechDetector])
5
2
708
Apr ’26
SpeechAnalyzer speech to text wwdc sample app
I am using the sample app from: https://developer.apple.com/videos/play/wwdc2025/277/?time=763 I installed this on an Iphone 15 Pro with iOS 26 beta 1. I was able to get good transcription with it. The app did crash sometimes when transcribing and I was going to post here with the details. I then installed iOS beta 2 and uninstalled the sample app. Now every time I try to run the sample app on the 15 Pro I get this message: SpeechAnalyzer: Input loop ending with error: Error Domain=SFSpeechErrorDomain Code=10 "Cannot use modules with unallocated locales [en_US (fixed en_US)]" UserInfo={NSLocalizedDescription=Cannot use modules with unallocated locales [en_US (fixed en_US)]} I can't continue our our work towards using SpeechAnalyzer now with this error. I have set breakpoints on all the catch handlers and it doesn't catch this error. My phone region is "United States"
22
9
3k
Apr ’26
SpeechAnalyzer error "asset not found after attempted download" for certain languages
I am trying to use the new SpeechAnalyzer framework in my Mac app, and am running into an issue for some languages. When I call AssetInstallationRequest.downloadAndInstall() for some languages, it throws an error: Error Domain=SFSpeechErrorDomain Code=1 "transcription.ar asset not found after attempted download." The ".ar" appears to be the language code, which in this case was Arabic. When I call AssetInventory.status(forModules:) before attempting the download, it is giving me a status of "downloading" (perhaps from an earlier attempt?). If this language was completely unsupported, I would expect it to return a status of "unsupported", so I'm not sure what's going on here. For other languages (Polish, for example) SpeechTranscriber.supportedLocale(equivalentTo:) is returning nil, so that seems like a clearly unsupported language. But I can't tell if the languages I'm trying, like Arabic, are supported and something is going wrong, or if this error represents something I can work around. Here's the relevant section of code. The error is thrown from downloadAndInstall(), so I never even get as far as setting up the SpeechAnalyzer itself. private func setUpAnalyzer() async throws { guard let sourceLanguage else { throw Error.languageNotSpecified } guard let locale = await SpeechTranscriber.supportedLocale(equivalentTo: Locale(identifier: sourceLanguage.rawValue)) else { throw Error.unsupportedLanguage } let transcriber = SpeechTranscriber(locale: locale, preset: .progressiveTranscription) self.transcriber = transcriber let reservedLocales = await AssetInventory.reservedLocales if !reservedLocales.contains(locale) && reservedLocales.count == AssetInventory.maximumReservedLocales { if let oldest = reservedLocales.last { await AssetInventory.release(reservedLocale: oldest) } } do { let status = await AssetInventory.status(forModules: [transcriber]) print("status: \(status)") if let installationRequest = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) { try await installationRequest.downloadAndInstall() } } ...
9
0
1.3k
Apr ’26
SpeechAnalyzer > AnalysisContext lack of documentation
I'm using the new SpeechAnalyzer framework to detect certain commands and want to improve accuracy by giving context. Seems like AnalysisContext is the solution for this, but couldn't find any usage example. So I want to make sure that I'm doing it right or not. let context = AnalysisContext() context.contextualStrings = [ AnalysisContext.ContextualStringsTag("commands"): [ "set speed level", "set jump level", "increase speed", "decrease speed", ... ], AnalysisContext.ContextualStringsTag("vocabulary"): [ "speed", "jump", ... ] ] try await analyzer.setContext(context) With this implementation, it still gives outputs like "Set some speed level", "It's speed level", etc. Also, is it possible to make it expect number after those commands, in order to eliminate results like "set some speed level to" (instead of two).
2
0
727
Apr ’26
CarPlay: Voice Conversational Entitlement Details
With the Voice Conversational Entitlement, can a CarPlay app establish a turn-based audio interface that operates in two modes: Speaking mode: Audio Session configured for playback Buffered audio Listening mode: Switch Audio Session to .record or .playAndRecord Activate SFSpeechRecognizer And continue toggling back and forth. The app should listen for responses to questions or other audio cues, and assuming those answers are correct (based on analysis of results from SFSpeechRecognizer), continue this pattern of mode 1 and 2 alternating. This appears to be a valid use of this entitlement. Does this also require the Audio App Entitlement, or is the Voice Conversational Entitlement sufficient? Are there other obstacles to this type of app that I'm not seeing? Or perhaps this is technically possible, but unlikely to pass app store review?
0
0
334
Apr ’26
AXSpeech Thread Crash SEGV_ACCERR
Hi everyone, I've encountered a rare and strange crash in my app that I can't consistently reproduce. The crash seems to occur deep within Apple's internal frameworks, and I can't pinpoint which line of my own code is causing it. Here's the crash stack trace: #44 AXSpeech SIGSEGV SEGV_ACCERR 0 CoreFoundation ___CFCheckCFInfoPACSignature + 4 1 CoreFoundation _CFRunLoopSourceSignal + 28 2 Foundation _performQueueDequeue + 492 3 Foundation ___NSThreadPerformPerform + 88 4 CoreFoundation ___CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28 5 CoreFoundation ___CFRunLoopDoSource0 + 176 6 CoreFoundation ___CFRunLoopDoSources0 + 340 7 CoreFoundation ___CFRunLoopRun + 828 8 CoreFoundation _CFRunLoopRunSpecific + 608 9 Foundation -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212 10 TextToSpeech _TTSCFAttributedStringCreateStringByBracketingAttributeWithString + 776 11 Foundation ___NSThread__start__ + 732 12 libsystem_pthread.dylib __pthread_start + 136 Sometimes, instead of line 10 referencing _TTSCFAttributedStringCreateStringByBracketingAttributeWithString, it shows: 10 TextToSpeech LogWarning(char const*, ...) + 7288 Has anyone experienced a similar issue or know what might be triggering this crash? Any guidance on how to investigate or resolve this would be greatly appreciated. Thank you!
7
0
2.6k
Mar ’26
SpeechTranscriber/SpeechAnalyzer being relatively slow compared to FoundationModel and TTS
So, I've been wondering how fast a an offline STT -> ML Prompt -> TTS roundtrip would be. Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS. E.g. InteractionStatistics: - listeningStarted: 21:24:23 4480 2423 - timeTillFirstAboveNoiseFloor: 01.794 - timeTillLastNoiseAboveFloor: 02.383 - timeTillFirstSpeechDetected: 02.399 - timeTillTranscriptFinalized: 04.510 - timeTillFirstMLModelResponse: 04.938 - timeTillMLModelResponse: 05.379 - timeTillTTSStarted: 04.962 - timeTillTTSFinished: 11.016 - speechLength: 06.054 - timeToResponse: 02.578 - transcript: This is a test. - mlModelResponse: Sure! I'm ready to help with your test. What do you need help with? Here, between my audio input ending and the Text-2-Speech starting top play (using AVSpeechUtterance) the total response time was 2.5s. Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly). I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now? I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
3
0
876
Mar ’26
Strange crash in iOS AudioToolboxCore when using AVSpeechSynthesizer in iOS 16
I'm getting Crashlytics crashes from some my users, deep in the Apple code: Crashed: AXSpeech EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x00000007ec54b360 0 libobjc.A.dylib 0x3c9c objc_retain_x8 + 16 1 AudioToolboxCore 0x99580 auoop::RenderPipeUser::~RenderPipeUser() + 112 2 AudioToolboxCore 0xe6090 -[AUAudioUnit_XPC internalDeallocateRenderResources] + 92 3 AVFAudio 0x90a0 AUInterfaceBaseV3::Uninitialize() + 60 4 AVFAudio 0x4cbe0 AVAudioEngineGraph::PerformCommand(AUGraphNodeBaseV3&, AVAudioEngineGraph::ENodeCommand, void*, unsigned int) const + 768 5 AVFAudio 0x56b0c AVAudioEngineGraph::_Uninitialize(NSError**) + 132 6 AVFAudio 0x7834 AVAudioEngineImpl::Stop(NSError**) + 388 7 AVFAudio 0x636c -[AVAudioEngine dealloc] + 52 8 TextToSpeech 0x30674 _TTSNameForVoiceInformation + 20864 9 libobjc.A.dylib 0x20a4 object_cxxDestructFromClass(objc_object*, objc_class*) + 116 10 libobjc.A.dylib 0x6e00 objc_destructInstance + 80 11 libobjc.A.dylib 0x104fc _objc_rootDealloc + 80 12 TextToSpeech 0x2d2f4 _TTSNameForVoiceInformation + 7680 13 TextToSpeech 0x496c TTSVocalizerCopyURLForFallbackResource + 8540 14 TextToSpeech 0x26094 TTSSpeechUnitTestingMode + 5548 15 libAXSpeechManager.dylib 0x108b0 -[AXSpeechManager .cxx_destruct] + 192 16 libobjc.A.dylib 0x20a4 object_cxxDestructFromClass(objc_object*, objc_class*) + 116 17 libobjc.A.dylib 0x6e00 objc_destructInstance + 80 18 libobjc.A.dylib 0x104fc _objc_rootDealloc + 80 19 libAXSpeechManager.dylib 0x5298 -[AXSpeechManager dealloc] + 268 20 Foundation 0x3b8a4 __NSThreadPerformPerform + 272 21 CoreFoundation 0xd3208 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28 22 CoreFoundation 0xdf864 __CFRunLoopDoSource0 + 176 23 CoreFoundation 0x646c8 __CFRunLoopDoSources0 + 244 24 CoreFoundation 0x7a1c4 __CFRunLoopRun + 828 25 CoreFoundation 0x7f4dc CFRunLoopRunSpecific + 612 26 Foundation 0x420c4 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212 27 libAXSpeechManager.dylib 0x13390 -[AXSpeechThread main] + 552 28 Foundation 0x5b634 __NSThread__start__ + 716 29 libsystem_pthread.dylib 0x16b8 _pthread_start + 148 30 libsystem_pthread.dylib 0xb88 thread_start + 8 It's most likely related to my use of AVSpeechSynthesizer. I do change some of the utterance fields, including the voice that's being used (which is set to a value from speechVoices()). UtilAudioIos_tts = AVSpeechSynthesizer() let utterance = AVSpeechUtterance utterance.voice = AVSpeechSynthesisVoice(identifier: voice.voiceCode) utterance.volume = volume utterance.pitchMultiplier = pitch utterance.rate = rate UtilAudioIos_tts!.speak(utterance) By coincidence or not, the following sometimes appears in the device log: 2023-05-30 20:35:29.948078+0100 <appname>[466:12882] [catalog] Unable to list voice folder and also, sometimes: 2023-05-30 20:37:35.345933+0100 <appname>[466:13298] [catalog] Query for com.apple.MobileAsset.VoiceServices.VoiceResources failed: 2 2023-05-30 20:37:35.360854+0100 rehearserfree[466:13433] [AXTTSCommon] MauiVocalizer: 11006 (Can't compile rule): regularExpression=\Oviedo(?=, (\x1b\\pause=\d+\\)?Florida)\b, message=unrecognized character follows \, characterPosition=1 2023-05-30 20:37:35.363163+0100 <appname>[466:13433] [AXTTSCommon] MauiVocalizer: 16038 (Resource load failed): component=ttt/re, uri=, contentType=application/x-vocalizer-rettt+text, lhError=88602000 2023-05-30 20:37:35.363182+0100 <appname>[466:13433] [AXTTSCommon] Error loading rules: 2147483648 All of these crashes have been on the various versions of iOS 16. Edit: I can't reproduce the crash myself - it's just some (not all) app users. The log entries above appear locally on my device (with no crash) but I can't see the logs of the users who have the crashes. Any idea what this might be caused by, or how to go about tracking the problem down?
6
0
2.7k
Mar ’26
SpeechTranscriber not supported
I've tried SpeechTranscriber with a lot of my devices (from iPhone 12 series ~ iPhone 17 series) without issues. However, SpeechTranscriber.isAvailable value is false for my iPhone 11 Pro. https://developer.apple.com/documentation/speech/speechtranscriber/isavailable I'am curious why the iPhone 11 Pro device is not supported. Are all iPhone 11 series not supported intentionally? Or is there any problem with my specific device? I've also checked the supportedLocales, and the value is an empty array. https://developer.apple.com/documentation/speech/speechtranscriber/supportedlocales
5
0
971
Mar ’26
Video Audio + Speech To Text
Hello, I am wondering if it is possible to have audio from my AirPods be sent to my speech to text service and at the same time have the built in mic audio input be sent to recording a video? I ask because I want my users to be able to say "CAPTURE" and I start recording a video (with audio from the built in mic) and then when the user says "STOP" I stop the recording.
2
0
1.5k
Mar ’26
AVAudioEngine fails to start during FaceTime call (error 2003329396)
Is it possible to perform speech-to-text using AVAudioEngine to capture microphone input while being on a FaceTime call at the same time? I tried implementing this, but whenever I attempt to start the  AVAudioEngine  while a FaceTime call is active, I get the following error: “The operation couldn’t be completed. (OSStatus error 2003329396)” I assume this might be due to microphone resource restrictions during FaceTime, but I’d like to confirm whether this limitation is at the system level or if there’s any possible workaround or entitlement that allows concurrent microphone access. Has anyone encountered this issue or found a solution?
2
1
1.2k
Mar ’26
Building Real-Time Voice Input on macOS 26 with SpeechAnalyzer + ScreenCaptureKit
We built an open-source macOS menu bar app that turns speech into text and pastes it into the active app — using SpeechAnalyzer for on-device transcription, ScreenCaptureKit + Vision for screen-aware context, and FluidAudio for speaker diarization in meeting mode. Here's what we learned shipping it on macOS 26. GitHub: github.com/Marvinngg/ambient-voice Architecture The app has two modes: hotkey dictation (press to talk, release to inject) and meeting recording (continuous transcription with a floating panel). Dictation Mode Audio capture uses AVCaptureSession (more on why below). The captured audio feeds into SpeechAnalyzer via an AsyncStream: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults, .alternativeTranscriptions], attributeOptions: [.audioTimeRange, .transcriptionConfidence] ) let analyzer = SpeechAnalyzer(modules: [transcriber]) let (inputSequence, inputBuilder) = AsyncStream.makeStream() try await analyzer.start(inputSequence: inputSequence) While recording, we capture a screenshot of the focused window using ScreenCaptureKit, run Vision OCR (VNRecognizeTextRequest), extract keywords, and inject them into SpeechAnalyzer as contextual bias: let context = AnalysisContext() context.contextualStrings[.general] = ocrKeywords try await analyzer.setContext(context) This improves accuracy for technical terms and proper nouns visible on screen. If your screen shows "SpeechAnalyzer", saying it out loud is more likely to be transcribed correctly. After transcription, an optional L2 step sends the text through a local LLM (ollama) for spoken-to-written cleanup, then CGEvent simulates Cmd+V to paste into the active app. Meeting Mode Meeting mode forks the same audio stream to two consumers: SpeechAnalyzer — real-time streaming transcription, displayed in a floating NSPanel FluidAudio buffer — accumulates 16kHz Float32 mono samples for batch speaker diarization after recording stops When the user ends the meeting, FluidAudio's performCompleteDiarization() runs on the accumulated audio. We align transcription segments with speaker segments using audioTimeRange overlap matching — each transcription segment gets assigned the speaker ID with the most time overlap. Results export to Markdown. Pitfalls We Hit on macOS 26 1. AVAudioEngine installTap doesn't fire with Bluetooth devices We started with AVAudioEngine.inputNode.installTap() for audio capture. It worked fine with built-in mics but the tap callback never fired with Bluetooth devices (tested with vivo TWS 4 Hi-Fi). Fix: switched to AVCaptureSession. The delegate callback captureOutput(_:didOutput:from:) fires reliably regardless of audio device. The tradeoff is you get CMSampleBuffer instead of AVAudioPCMBuffer, so you need a conversion step. 2. NSEvent addGlobalMonitorForEvents crashes Our global hotkey listener used NSEvent.addGlobalMonitorForEvents. On macOS 26, this crashes with a Bus error inside GlobalObserverHandler — appears to be a Swift actor runtime issue. Fix: switched to CGEventTap. Works reliably, but the callback runs on a CFRunLoop context, which Swift doesn't recognize as MainActor. 3. CGEventTap callbacks aren't on MainActor If your CGEventTap callback touches any @MainActor state, you'll get concurrency violations. The callback runs on whatever thread owns the CFRunLoop. Fix: bridge with DispatchQueue.main.async {} inside the tap callback before touching any MainActor state. 4. CGPreflightScreenCaptureAccess doesn't request permission We used CGPreflightScreenCaptureAccess() as a guard before calling ScreenCaptureKit. If it returned false, we'd bail out. The problem: this function only checks — it never triggers macOS to add your app to the Screen Recording permission list. Chicken-and-egg: you can't get permission because you never ask for it. Fix: call CGRequestScreenCaptureAccess() at app startup. This adds your app to System Settings → Screen Recording. Then let ScreenCaptureKit calls proceed without the preflight guard — SCShareableContent will also trigger the permission prompt on first use. 5. Ad-hoc signing breaks TCC permissions on every rebuild During development, codesign --sign - (ad-hoc) generates a different code directory hash on every build. macOS TCC tracks permissions by this hash, so every rebuild = new app identity = all permissions reset. Fix: sign with a stable certificate. If you have an Apple Development certificate, use that. The TeamIdentifier stays constant across rebuilds, so TCC permissions persist. We also discovered that launching via open WE.app (LaunchServices) instead of directly executing the binary is required — otherwise macOS attributes TCC permissions to Terminal, not your app. Benchmarks We ran end-to-end benchmarks on public datasets (Mac Mini M4 16GB, macOS 26): Transcription (SpeechAnalyzer, AliMeeting Chinese): • Near-field CER 34% (excluding outliers ~25%) • Far-field CER 40% (single channel, no beamforming, >30% overlap) • Processing speed 74-89x real-time Speaker diarization (FluidAudio offline): • AMI English 16 meetings: avg DER 23.2% (collar=0.25s, ignoreOverlap=True) • AliMeeting Chinese 8 meetings: DER 48.5% (including overlap regions) • Memory: RSS ~500MB, peak 730-930MB Full evaluation methodology, scripts, and raw results are in the repo. Open Source The project is MIT licensed: github.com/Marvinngg/ambient-voice It includes the macOS client (Swift 6.2, SPM), server-side distillation/training scripts (Python), and a complete evaluation framework with reproducible benchmarks. Feedback and contributions welcome.
0
0
640
Mar ’26
Building Real-Time Voice Input on macOS 26 with SpeechAnalyzer + ScreenCaptureKit
We built an open-source macOS menu bar app that turns speech into text and pastes it into the active app — using SpeechAnalyzer for on-device transcription, ScreenCaptureKit + Vision for screen-aware context, and FluidAudio for speaker diarization in meeting mode. Here's what we learned shipping it on macOS 26. GitHub: github.com/Marvinngg/ambient-voice Architecture The app has two modes: hotkey dictation (press to talk, release to inject) and meeting recording (continuous transcription with a floating panel). Dictation Mode Audio capture uses AVCaptureSession (more on why below). The captured audio feeds into SpeechAnalyzer via an AsyncStream: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults, .alternativeTranscriptions], attributeOptions: [.audioTimeRange, .transcriptionConfidence] ) let analyzer = SpeechAnalyzer(modules: [transcriber]) let (inputSequence, inputBuilder) = AsyncStream.makeStream() try await analyzer.start(inputSequence: inputSequence) While recording, we capture a screenshot of the focused window using ScreenCaptureKit, run Vision OCR (VNRecognizeTextRequest), extract keywords, and inject them into SpeechAnalyzer as contextual bias: let context = AnalysisContext() context.contextualStrings[.general] = ocrKeywords try await analyzer.setContext(context) This improves accuracy for technical terms and proper nouns visible on screen. If your screen shows "SpeechAnalyzer", saying it out loud is more likely to be transcribed correctly. After transcription, an optional L2 step sends the text through a local LLM (ollama) for spoken-to-written cleanup, then CGEvent simulates Cmd+V to paste into the active app. Meeting Mode Meeting mode forks the same audio stream to two consumers: SpeechAnalyzer — real-time streaming transcription, displayed in a floating NSPanel FluidAudio buffer — accumulates 16kHz Float32 mono samples for batch speaker diarization after recording stops When the user ends the meeting, FluidAudio's performCompleteDiarization() runs on the accumulated audio. We align transcription segments with speaker segments using audioTimeRange overlap matching — each transcription segment gets assigned the speaker ID with the most time overlap. Results export to Markdown. Pitfalls We Hit on macOS 26 1. AVAudioEngine installTap doesn't fire with Bluetooth devices We started with AVAudioEngine.inputNode.installTap() for audio capture. It worked fine with built-in mics but the tap callback never fired with Bluetooth devices (tested with vivo TWS 4 Hi-Fi). Fix: switched to AVCaptureSession. The delegate callback captureOutput(_:didOutput:from:) fires reliably regardless of audio device. The tradeoff is you get CMSampleBuffer instead of AVAudioPCMBuffer, so you need a conversion step. 2. NSEvent addGlobalMonitorForEvents crashes Our global hotkey listener used NSEvent.addGlobalMonitorForEvents. On macOS 26, this crashes with a Bus error inside GlobalObserverHandler — appears to be a Swift actor runtime issue. Fix: switched to CGEventTap. Works reliably, but the callback runs on a CFRunLoop context, which Swift doesn't recognize as MainActor. 3. CGEventTap callbacks aren't on MainActor If your CGEventTap callback touches any @MainActor state, you'll get concurrency violations. The callback runs on whatever thread owns the CFRunLoop. Fix: bridge with DispatchQueue.main.async {} inside the tap callback before touching any MainActor state. 4. CGPreflightScreenCaptureAccess doesn't request permission We used CGPreflightScreenCaptureAccess() as a guard before calling ScreenCaptureKit. If it returned false, we'd bail out. The problem: this function only checks — it never triggers macOS to add your app to the Screen Recording permission list. Chicken-and-egg: you can't get permission because you never ask for it. Fix: call CGRequestScreenCaptureAccess() at app startup. This adds your app to System Settings → Screen Recording. Then let ScreenCaptureKit calls proceed without the preflight guard — SCShareableContent will also trigger the permission prompt on first use. 5. Ad-hoc signing breaks TCC permissions on every rebuild During development, codesign --sign - (ad-hoc) generates a different code directory hash on every build. macOS TCC tracks permissions by this hash, so every rebuild = new app identity = all permissions reset. Fix: sign with a stable certificate. If you have an Apple Development certificate, use that. The TeamIdentifier stays constant across rebuilds, so TCC permissions persist. We also discovered that launching via open WE.app (LaunchServices) instead of directly executing the binary is required — otherwise macOS attributes TCC permissions to Terminal, not your app. Benchmarks We ran end-to-end benchmarks on public datasets (Mac Mini M4 16GB, macOS 26): Transcription (SpeechAnalyzer, AliMeeting Chinese): • Near-field CER 34% (excluding outliers ~25%) • Far-field CER 40% (single channel, no beamforming, >30% overlap) • Processing speed 74-89x real-time Speaker diarization (FluidAudio offline): • AMI English 16 meetings: avg DER 23.2% (collar=0.25s, ignoreOverlap=True) • AliMeeting Chinese 8 meetings: DER 48.5% (including overlap regions) • Memory: RSS ~500MB, peak 730-930MB Full evaluation methodology, scripts, and raw results are in the repo. Open Source The project is MIT licensed: github.com/Marvinngg/ambient-voice It includes the macOS client (Swift 6.2, SPM), server-side distillation/training scripts (Python), and a complete evaluation framework with reproducible benchmarks. Feedback and contributions welcome.
0
0
753
Mar ’26
SpeechAnalyzer.start(inputSequence:) fails with _GenericObjCError nilError, while the same WAV succeeds with start(inputAudioFile:)
I'm trying to use the new Speech framework for streaming transcription on macOS 26.3, and I can reproduce a failure with SpeechAnalyzer.start(inputSequence:). What is working: SpeechAnalyzer + SpeechTranscriber offline path using start(inputAudioFile:finishAfterFile:) same Spanish WAV file transcribes successfully and returns a coherent final result What is not working: SpeechAnalyzer + SpeechTranscriber stream path using start(inputSequence:) same WAV, replayed as AnalyzerInput(buffer:bufferStartTime:) fails once replay starts with: _GenericObjCError domain=Foundation._GenericObjCError code=0 detail=nilError I also tried: DictationTranscriber instead of SpeechTranscriber no realtime pacing during replay Both still fail in stream mode with the same error. So this does not currently look like a ScreenCaptureKit issue or a Python integration issue. I reduced it to a pure Swift CLI repro. Environment: macOS 26.3 (25D122) Xcode 26.3 Swift 6.2.4 Apple Silicon Mac Has anyone here gotten SpeechAnalyzer.start(inputSequence:) working reliably on macOS 26.x? If so, I'd be interested in any workaround or any detail that differs from the obvious setup: prepareToAnalyze(in:) bestAvailableAudioFormat(...) AnalyzerInput(buffer:bufferStartTime:) replaying a known-good WAV in chunks I already filed Feedback Assistant: FB22149971
1
0
545
Mar ’26
Crashed: AXSpeech EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x000056f023efbeb0
Application is getting Crashed: AXSpeech EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x000056f023efbeb0 Crashed: AXSpeech 0 libobjc.A.dylib 0x4820 objc_msgSend + 32 1 libsystem_trace.dylib 0x6c34 _os_log_fmt_flatten_object + 116 2 libsystem_trace.dylib 0x5344 _os_log_impl_flatten_and_send + 1884 3 libsystem_trace.dylib 0x4bd0 _os_log + 152 4 libsystem_trace.dylib 0x9c48 _os_log_error_impl + 24 5 TextToSpeech 0xd0a8c _pcre2_xclass_8 6 TextToSpeech 0x3bc04 TTSSpeechUnitTestingMode 7 TextToSpeech 0x3f128 TTSSpeechUnitTestingMode 8 AXCoreUtilities 0xad38 -[NSArray(AXExtras) ax_flatMappedArrayUsingBlock:] + 204 9 TextToSpeech 0x3eb18 TTSSpeechUnitTestingMode 10 TextToSpeech 0x3c948 TTSSpeechUnitTestingMode 11 TextToSpeech 0x48824 AXAVSpeechSynthesisVoiceFromTTSSpeechVoice 12 TextToSpeech 0x49804 AXAVSpeechSynthesisVoiceFromTTSSpeechVoice 13 Foundation 0xf6064 __NSThreadPerformPerform + 264 14 CoreFoundation 0x37acc CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION + 28 15 CoreFoundation 0x36d48 __CFRunLoopDoSource0 + 176 16 CoreFoundation 0x354fc __CFRunLoopDoSources0 + 244 17 CoreFoundation 0x34238 __CFRunLoopRun + 828 18 CoreFoundation 0x33e18 CFRunLoopRunSpecific + 608 19 Foundation 0x2d4cc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212 20 TextToSpeech 0x24b88 TTSCFAttributedStringCreateStringByBracketingAttributeWithString 21 Foundation 0xb3154 NSThread__start + 732 com.livingMedia.AajTakiPhone_issue_3ceba855a8ad2d1af83655803dc13f70_crash_session_9081fa41ced440ae9a57c22cb432f312_DNE_0_v2_stacktrace.txt 22 libsystem_pthread.dylib 0x24d4 _pthread_start + 136 23 libsystem_pthread.dylib 0x1a10 thread_start + 8
4
1
1.8k
Mar ’26
AXSpeech Crash
I have a very terrible crash problem in my App when I use AVSpeechSynthesizer and I can't repetition it.Here is my code, It's a singleton- (void)stopSpeech { if ([self.synthesizer isPaused]) { return; } if ([self.synthesizer isSpeaking]) { BOOL isSpeech = [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate]; if (!isSpeech) { [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryWord]; } } self.stopBlock ? self.stopBlock() : nil; } -(AVSpeechSynthesizer *)synthesizer { if (!_synthesizer) { _synthesizer = [[AVSpeechSynthesizer alloc] init]; _synthesizer.delegate = self; } return _synthesizer; }When the user leaves the page, I call the stopSpeech method。Then I got a lot of crash messagesHere is a crash log:# Crashlytics - plaintext stacktrace downloaded by liweican at Mon, 13 May 2019 03:03:24 GMT # URL: https://fabric.io/youdao-dict/ios/apps/com.youdao.udictionary/issues/5a904ed88cb3c2fa63ad7ed3?time=last-thirty-days/sessions/b1747d91bafc4680ab0ca8e3a702c52c_DNE_0_v2 # Organization: zzz # Platform: ios # Application: U-Dictionary # Version: 3.0.5.4 # Bundle Identifier: com.youdao.UDictionary # Issue ID: 5a904ed88cb3c2fa63ad7ed3 # Session ID: b1747d91bafc4680ab0ca8e3a702c52c_DNE_0_v2 # Date: 2019-05-13T02:27:00Z # OS Version: 12.2.0 (16E227) # Device: iPhone 8 Plus # RAM Free: 17% # Disk Free: 64.6% #19. Crashed: AXSpeech 0 libsystem_pthread.dylib 0x19c15e5b8 pthread_mutex_lock$VARIANT$armv81 + 102 1 CoreFoundation 0x19c4cf84c CFRunLoopSourceSignal + 68 2 Foundation 0x19cfc7280 performQueueDequeue + 464 3 Foundation 0x19cfc680c __NSThreadPerformPerform + 136 4 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24 5 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88 6 CoreFoundation 0x19c4d1b74 __CFRunLoopDoSources0 + 256 7 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004 8 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 9 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300 10 libAXSpeechManager.dylib 0x1ac16c94c -[AXSpeechThread main] + 264 11 Foundation 0x19cfc66e4 __NSThread__start__ + 984 12 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 13 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 14 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 -- #0. com.apple.main-thread 0 libsystem_malloc.dylib 0x19c11ce24 small_free_list_remove_ptr_no_clear + 768 1 libsystem_malloc.dylib 0x19c11f094 small_malloc_from_free_list + 296 2 libsystem_malloc.dylib 0x19c11f094 small_malloc_from_free_list + 296 3 libsystem_malloc.dylib 0x19c11d63c small_malloc_should_clear + 224 4 libsystem_malloc.dylib 0x19c11adcc szone_malloc_should_clear + 132 5 libsystem_malloc.dylib 0x19c123c18 malloc_zone_malloc + 156 6 CoreFoundation 0x19c569ab4 __CFBasicHashRehash + 300 7 CoreFoundation 0x19c56b430 __CFBasicHashAddValue + 96 8 CoreFoundation 0x19c56ab9c CFBasicHashAddValue + 2160 9 CoreFoundation 0x19c49f3bc CFDictionaryAddValue + 260 10 CoreFoundation 0x19c572ee8 __54-[CFPrefsSource mergeIntoDictionary:sourceDictionary:]_block_invoke + 28 11 CoreFoundation 0x19c49f0b4 __CFDictionaryApplyFunction_block_invoke + 24 12 CoreFoundation 0x19c568b7c CFBasicHashApply + 116 13 CoreFoundation 0x19c49f090 CFDictionaryApplyFunction + 168 14 CoreFoundation 0x19c42f504 -[CFPrefsSource mergeIntoDictionary:sourceDictionary:] + 136 15 CoreFoundation 0x19c4bcd38 -[CFPrefsSearchListSource alreadylocked_getDictionary:] + 644 16 CoreFoundation 0x19c42e71c -[CFPrefsSearchListSource alreadylocked_copyValueForKey:] + 152 17 CoreFoundation 0x19c42e660 -[CFPrefsSource copyValueForKey:] + 60 18 CoreFoundation 0x19c579e88 __76-[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:]_block_invoke + 40 19 CoreFoundation 0x19c4bdff4 __108-[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:]_block_invoke + 272 20 CoreFoundation 0x19c4bda38 normalizeQuintuplet + 340 21 CoreFoundation 0x19c42c634 -[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:] + 108 22 CoreFoundation 0x19c42cec0 -[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:] + 148 23 CoreFoundation 0x19c57c2d0 _CFPreferencesCopyAppValueWithContainerAndConfiguration + 124 24 TextInput 0x1a450e550 -[TIPreferencesController valueForPreferenceKey:] + 460 25 UIKitCore 0x1c87c71f8 -[UIKeyboardPreferencesController handBias] + 36 26 UIKitCore 0x1c887275c -[UIKeyboardLayoutStar showKeyboardWithInputTraits:screenTraits:splitTraits:] + 320 27 UIKitCore 0x1c88f4240 -[UIKeyboardImpl finishLayoutChangeWithArguments:] + 492 28 UIKitCore 0x1c88f47c8 -[UIKeyboardImpl updateLayout] + 1208 29 UIKitCore 0x1c88eaad0 -[UIKeyboardImpl updateLayoutIfNecessary] + 448 30 UIKitCore 0x1c88eab9c -[UIKeyboardImpl setFrame:] + 140 31 UIKitCore 0x1c88d5d60 -[UIKeyboard activate] + 652 32 UIKitCore 0x1c894c90c -[UIKeyboardAutomatic activate] + 128 33 UIKitCore 0x1c88d5158 -[UIKeyboard setFrame:] + 296 34 UIKitCore 0x1c88d81b0 -[UIKeyboard _didChangeKeyplaneWithContext:] + 228 35 UIKitCore 0x1c88f4aa0 -[UIKeyboardImpl didMoveToSuperview] + 136 36 UIKitCore 0x1c8f2ad84 __45-[UIView(Hierarchy) _postMovedFromSuperview:]_block_invoke + 888 37 UIKitCore 0x1c8f2a970 -[UIView(Hierarchy) _postMovedFromSuperview:] + 760 38 UIKitCore 0x1c8f39ddc -[UIView(Internal) _addSubview:positioned:relativeTo:] + 1740 39 UIKitCore 0x1c88d5d84 -[UIKeyboard activate] + 688 40 UIKitCore 0x1c894c90c -[UIKeyboardAutomatic activate] + 128 41 UIKitCore 0x1c893b3a4 -[UIPeripheralHost(UIKitInternal) _reloadInputViewsForResponder:] + 1332 42 UIKitCore 0x1c8ae66d8 -[UIResponder(UIResponderInputViewAdditions) reloadInputViews] + 80 43 UIKitCore 0x1c8ae23bc -[UIResponder becomeFirstResponder] + 804 44 UIKitCore 0x1c8f2a560 -[UIView(Hierarchy) becomeFirstResponder] + 156 45 UIKitCore 0x1c8d93e84 -[UITextField becomeFirstResponder] + 244 46 UIKitCore 0x1c8d578dc -[UITextInteractionAssistant(UITextInteractionAssistant_Internal) setFirstResponderIfNecessary] + 192 47 UIKitCore 0x1c8d45d8c -[UITextSelectionInteraction oneFingerTap:] + 3136 48 UIKitCore 0x1c86e0bcc -[UIGestureRecognizerTarget _sendActionWithGestureRecognizer:] + 64 49 UIKitCore 0x1c86e8dd4 _UIGestureRecognizerSendTargetActions + 124 50 UIKitCore 0x1c86e6778 _UIGestureRecognizerSendActions + 316 51 UIKitCore 0x1c86e5ca4 -[UIGestureRecognizer _updateGestureWithEvent:buttonEvent:] + 760 52 UIKitCore 0x1c86d9d80 _UIGestureEnvironmentUpdate + 2180 53 UIKitCore 0x1c86d94b0 -[UIGestureEnvironment _deliverEvent:toGestureRecognizers:usingBlock:] + 384 54 UIKitCore 0x1c86d9290 -[UIGestureEnvironment _updateForEvent:window:] + 204 55 UIKitCore 0x1c8af14a8 -[UIWindow sendEvent:] + 3112 56 UIKitCore 0x1c8ad1534 -[UIApplication sendEvent:] + 340 57 UIKitCore 0x1c8b977c0 __dispatchPreprocessedEventFromEventQueue + 1768 58 UIKitCore 0x1c8b99eec __handleEventQueueInternal + 4828 59 UIKitCore 0x1c8b9311c __handleHIDEventFetcherDrain + 152 60 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24 61 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88 62 CoreFoundation 0x19c4d1b24 __CFRunLoopDoSources0 + 176 63 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004 64 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 65 GraphicsServices 0x19e6cc79c GSEventRunModal + 104 66 UIKitCore 0x1c8ab7b68 UIApplicationMain + 212 67 UDictionary 0x10517e138 main (main.m:17) 68 libdyld.dylib 0x19bf928e0 start + 4 #1. Thread 0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8 1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340 2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4 #2. com.apple.uikit.eventfetch-thread 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300 6 Foundation 0x19ce99e5c -[NSRunLoop(NSRunLoop) runUntilDate:] + 96 7 UIKitCore 0x1c8b9d540 -[UIEventFetcher threadMain] + 136 8 Foundation 0x19cfc66e4 __NSThread__start__ + 984 9 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 10 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 11 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #3. JavaScriptCore bmalloc scavenger 0 libsystem_kernel.dylib 0x19c0ddee4 __psynch_cvwait + 8 1 libsystem_pthread.dylib 0x19c15d4a4 _pthread_cond_wait$VARIANT$armv81 + 628 2 libc++.1.dylib 0x19b6b5090 std::__1::condition_variable::wait(std::__1::unique_lock&lt;std::__1::mutex&gt;&amp;) + 24 3 JavaScriptCore 0x1a36a2238 void std::__1::condition_variable_any::wait&lt;std::__1::unique_lock&lt;bmalloc::Mutex&gt; &gt;(std::__1::unique_lock&lt;bmalloc::Mutex&gt;&amp;) + 108 4 JavaScriptCore 0x1a36a622c bmalloc::Scavenger::threadRunLoop() + 176 5 JavaScriptCore 0x1a36a59a4 bmalloc::Scavenger::Scavenger(std::__1::lock_guard&lt;bmalloc::Mutex&gt;&amp;) + 10 6 JavaScriptCore 0x1a36a73e4 std::__1::__thread_specific_ptr&lt;std::__1::__thread_struct&gt;::set_pointer(std::__1::__thread_struct*) + 38 7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #4. WebThread 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 WebCore 0x1a5126480 RunWebThread(void*) + 600 6 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 7 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 8 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #5. com.twitter.crashlytics.ios.MachExceptionServer 0 UDictionary 0x1058a5564 CLSProcessRecordAllThreads (CLSProcess.c:376) 1 UDictionary 0x1058a594c CLSProcessRecordAllThreads (CLSProcess.c:407) 2 UDictionary 0x1058952dc CLSHandler (CLSHandler.m:26) 3 UDictionary 0x1058906cc CLSMachExceptionServer (CLSMachException.c:446) 4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #6. com.apple.NSURLConnectionLoader 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 CFNetwork 0x19cae574c -[__CoreSchedulingSetRunnable runForever] + 216 6 Foundation 0x19cfc66e4 __NSThread__start__ + 984 7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #7. AVAudioSession Notify Thread 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 AVFAudio 0x1a238a378 GenericRunLoopThread::Entry(void*) + 156 6 AVFAudio 0x1a23b4c60 CAPThread::Entry(CAPThread*) + 88 7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #8. WebCore: LocalStorage 0 libsystem_kernel.dylib 0x19c0ddee4 __psynch_cvwait + 8 1 libsystem_pthread.dylib 0x19c15d4a4 _pthread_cond_wait$VARIANT$armv81 + 628 2 JavaScriptCore 0x1a3668ce4 ***::ThreadCondition::timedWait(***::Mutex&amp;, ***::WallTime) + 80 3 JavaScriptCore 0x1a364f96c ***::ParkingLot::parkConditionallyImpl(void const*, ***::ScopedLambda&lt;bool ()&gt; const&amp;, ***::ScopedLambda&lt;void ()&gt; const&amp;, ***::TimeWithDynamicClockType const&amp;) + 2004 4 WebKitLegacy 0x1a67b6ea8 bool ***::Condition::waitUntil&lt;***::Lock&gt;(***::Lock&amp;, ***::TimeWithDynamicClockType const&amp;) + 184 5 WebKitLegacy 0x1a67b9ba4 std::__1::unique_ptr&lt;***::Function&lt;void ()&gt;, std::__1::default_delete&lt;***::Function&lt;void ()&gt; &gt; &gt; ***::MessageQueue&lt;***::Function&lt;void ()&gt; &gt;::waitForMessageFilteredWithTimeout&lt;***::MessageQueue&lt;***::Function&lt;void ()&gt; &gt;::waitForMessage()::'lambda'(***::Function&lt;void ()&gt; const&amp;)&gt;(***::MessageQueueWaitResult&amp;, ***::MessageQueue&lt;***::Function&lt;void ()&gt; &gt;::waitForMessage()::'lambda'(***::Function&lt;void ()&gt; const&amp;)&amp;&amp;, ***::WallTime) + 156 6 WebKitLegacy 0x1a67b91c0 WebCore::StorageThread::threadEntryPoint() + 68 7 JavaScriptCore 0x1a3666f88 ***::Thread::entryPoint(***::Thread::NewThreadContext*) + 260 8 JavaScriptCore 0x1a3668494 ***::wtfThreadEntryPoint(void*) + 12 9 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 10 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 11 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #9. com.apple.CoreMotion.MotionThread 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 CoreFoundation 0x19c4cd0b0 CFRunLoopRun + 80 6 CoreMotion 0x1a1df0240 (Missing) 7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #10. Thread 0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8 1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340 2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4 #11. Thread 0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8 1 libsystem_pthread.dylib 0x19c1611f8 _pthread_wqthread + 532 2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4 #12. com.apple.CFStream.LegacyThread 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 CoreFoundation 0x19c4e5094 _legacyStreamRunLoop_workThread + 260 6 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 7 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 8 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #13. Thread 0 libsystem_pthread.dylib 0x19c163cd0 start_wqthread + 190 #14. Thread 0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8 1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340 2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4 #15. Thread 0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8 1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340 2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4 #16. Thread 0 libsystem_kernel.dylib 0x19c0d3148 semaphore_timedwait_trap + 8 1 libdispatch.dylib 0x19bf50a4c _dispatch_sema4_timedwait$VARIANT$armv81 + 64 2 libdispatch.dylib 0x19bf513a8 _dispatch_semaphore_wait_slow + 72 3 libdispatch.dylib 0x19bf647c8 _dispatch_worker_thread + 344 4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #17. Thread 0 libsystem_kernel.dylib 0x19c0d3148 semaphore_timedwait_trap + 8 1 libdispatch.dylib 0x19bf50a4c _dispatch_sema4_timedwait$VARIANT$armv81 + 64 2 libdispatch.dylib 0x19bf513a8 _dispatch_semaphore_wait_slow + 72 3 libdispatch.dylib 0x19bf647c8 _dispatch_worker_thread + 344 4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #18. Thread 0 libsystem_kernel.dylib 0x19c0d3148 semaphore_timedwait_trap + 8 1 libdispatch.dylib 0x19bf50a4c _dispatch_sema4_timedwait$VARIANT$armv81 + 64 2 libdispatch.dylib 0x19bf513a8 _dispatch_semaphore_wait_slow + 72 3 libdispatch.dylib 0x19bf647c8 _dispatch_worker_thread + 344 4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #19. Crashed: AXSpeech 0 libsystem_pthread.dylib 0x19c15e5b8 pthread_mutex_lock$VARIANT$armv81 + 102 1 CoreFoundation 0x19c4cf84c CFRunLoopSourceSignal + 68 2 Foundation 0x19cfc7280 performQueueDequeue + 464 3 Foundation 0x19cfc680c __NSThreadPerformPerform + 136 4 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24 5 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88 6 CoreFoundation 0x19c4d1b74 __CFRunLoopDoSources0 + 256 7 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004 8 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 9 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300 10 libAXSpeechManager.dylib 0x1ac16c94c -[AXSpeechThread main] + 264 11 Foundation 0x19cfc66e4 __NSThread__start__ + 984 12 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 13 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 14 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #20. AXSpeech 0 (Missing) 0x1071ba524 (Missing) 1 (Missing) 0x1071b3e7c (Missing) 2 (Missing) 0x10718fba4 (Missing) 3 (Missing) 0x107184bc8 (Missing) 4 libdyld.dylib 0x19bf95908 dlopen + 176 5 CoreFoundation 0x19c5483e8 _CFBundleDlfcnLoadBundle + 140 6 CoreFoundation 0x19c486918 _CFBundleLoadExecutableAndReturnError + 352 7 Foundation 0x19ced5734 -[NSBundle loadAndReturnError:] + 428 8 TextToSpeech 0x1abfff800 TTSSpeechUnitTestingMode + 1020 9 libdispatch.dylib 0x19bf817d4 _dispatch_client_callout + 16 10 libdispatch.dylib 0x19bf52040 _dispatch_once_callout + 28 11 TextToSpeech 0x1abfff478 TTSSpeechUnitTestingMode + 116 12 libobjc.A.dylib 0x19b7173cc CALLING_SOME_+initialize_METHOD + 24 13 libobjc.A.dylib 0x19b71cee0 initializeNonMetaClass + 296 14 libobjc.A.dylib 0x19b71e640 initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt&lt;false&gt;&amp;, bool) + 260 15 libobjc.A.dylib 0x19b7265a4 lookUpImpOrForward + 244 16 libobjc.A.dylib 0x19b733858 _objc_msgSend_uncached + 56 17 libAXSpeechManager.dylib 0x1ac167324 -[AXSpeechManager _initialize] + 68 18 Foundation 0x19cfc68d4 __NSThreadPerformPerform + 336 19 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24 20 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88 21 CoreFoundation 0x19c4d1b74 __CFRunLoopDoSources0 + 256 22 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004 23 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 24 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300 25 libAXSpeechManager.dylib 0x1ac16c94c -[AXSpeechThread main] + 264 26 Foundation 0x19cfc66e4 __NSThread__start__ + 984 27 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 28 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 29 libsystem_pthread.dylib 0x19c163cdc thread_start + 4I change my code like this, It still has the same problem- (void)stopSpeech { if (self.synthesizer != nil &amp;&amp; [self.synthesizer isPaused]) { return; } // if ([self.synthesizer isSpeaking]) { // BOOL isSpeech = [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate]; // if (!isSpeech) { // [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryWord]; // } // } if (self.synthesizer != nil) { [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate]; // if (!isSpeech) { // [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryWord]; // } self.stopBlock ? self.stopBlock() : nil; } }
1
1
2.7k
Mar ’26
AVSpeechSynthesisVoice.speechVoices() - different behavior on Mac (Designed for iPhone) and iOS and MANY errors checking .audioFileSettings properties.
We recently started working on getting an iOS app to work on Macs with Apple Silicon as a "Designed for iPhone" app and are having issues with speech synthesis. Specifically, voices retuned by AVSpeechSynthesisVoice.speechVoices() do not all work on the Mac. When we build an utterance and attempt to speak, the synthesizer falls back on a default voice and says some very odd text about voice parameters (that is not in the utterance speech text) before it does say the intended speech. Here is some sample code to setup the utterance and speak: func speak(_ text: String, _ settings: AppSettings) { let utterance = AVSpeechUtterance(string: text) if let voice = AVSpeechSynthesisVoice(identifier: settings.selectedVoiceIdentifier) { utterance.voice = voice print("speak: voice assigned \(voice.audioFileSettings)") } else { print("speak: voice error") } utterance.rate = settings.speechRate utterance.pitchMultiplier = settings.speechPitch do { let audioSession = AVAudioSession.sharedInstance() try audioSession.setCategory(.playback, mode: .default, options: .duckOthers) try audioSession.setActive(true, options: .notifyOthersOnDeactivation) self.synthesizer.speak(utterance) return } catch let error { print("speak: Error setting up AVAudioSession: \(error.localizedDescription)") } } When running the app on the Mac, this is the kind of error we get with "com.apple.eloquence.en-US.Rocko" as the selectedVoiceIdentifier: speak: voice assgined [:] 2023-05-29 18:00:14.245513-0700 A.I.[9244:240554] [aqme] AQMEIO_HAL.cpp:742 kAudioDevicePropertyMute returned err 2003332927 2023-05-29 18:00:14.410477-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.412837-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.413774-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.414661-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.415544-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.416384-0700 A.I.[9244:240554] Could not retrieve voice [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null) 2023-05-29 18:00:14.416804-0700 A.I.[9244:240554] [AXTTSCommon] Audio Unit failed to start after 5 attempts. 2023-05-29 18:00:14.416974-0700 A.I.[9244:240554] [AXTTSCommon] VoiceProvider: Could not start synthesis for request SSML Length: 140, Voice: [AVSpeechSynthesisProviderVoice 0x6000033794f0] Name: Rocko, Identifier: com.apple.eloquence.en-US.Rocko, Supported Languages ( "en-US" ), Age: 0, Gender: 0, Size: 0, Version: (null), converted from tts request [TTSSpeechRequest 0x600002c29590] <speak><voice name="com.apple.eloquence.en-US.Rocko">How much wood would a woodchuck chuck if a wood chuck could chuck wood?</voice></speak> language: en-US footprint: premium rate: 0.500000 pitch: 1.000000 volume: 1.000000 2023-05-29 18:00:14.428421-0700 A.I.[9244:240360] [VOTSpeech] Failed to speak request with error: Error Domain=TTSErrorDomain Code=-4010 "(null)". Attempting to speak again with fallback identifier: com.apple.voice.compact.en-US.Samantha When we run AVSpeechSynthesisVoice.speechVoices(), the "com.apple.eloquence.en-US.Rocko" is absolutely in the list but fails to speak properly. Notice that the line: print("speak: voice assigned \(voice.audioFileSettings)") Shows: speak: voice assigned [:] The .audioFileSettings being empty seems to be a common factor for the voices that do not work properly on the Mac. For voices that do work, we see this kind of output and values in the .audioFileSettings: speak: voice assigned ["AVFormatIDKey": 1819304813, "AVLinearPCMBitDepthKey": 16, "AVLinearPCMIsBigEndianKey": 0, "AVLinearPCMIsFloatKey": 0, "AVSampleRateKey": 22050, "AVLinearPCMIsNonInterleaved": 0, "AVNumberOfChannelsKey": 1] So we added a function to check the .audioFileSettings for each voice returned by AVSpeechSynthesisVoice.speechVoices(): //The voices are set in init(): var voices = AVSpeechSynthesisVoice.speechVoices() ... func checkVoices() { DispatchQueue.global().async { [weak self] in guard let self = self else { return } let checkedVoices = self.voices.map { ($0.0, $0.0.audioFileSettings.count) } DispatchQueue.main.async { self.voices = checkedVoices } } } That looks simple enough, and does work to identify which voices have no data in their .audioFileSettings. But we have to run it asynchronously because on a real iPhone device, it takes more than 9 seconds and produces a tremendous amount of error spew to the console. 2023-06-02 10:56:59.805910-0700 A.I.[17186:910118] [catalog] Query for com.apple.MobileAsset.VoiceServices.VoiceResources failed: 2 2023-06-02 10:56:59.971435-0700 A.I.[17186:910118] [catalog] Query for com.apple.MobileAsset.VoiceServices.VoiceResources failed: 2 2023-06-02 10:57:00.122976-0700 A.I.[17186:910118] [catalog] Query for com.apple.MobileAsset.VoiceServices.VoiceResources failed: 2 2023-06-02 10:57:00.144430-0700 A.I.[17186:910116] [AXTTSCommon] MauiVocalizer: 11006 (Can't compile rule): regularExpression=\Oviedo(?=, (\x1b\\pause=\d+\\)?Florida)\b, message=unrecognized character follows \, characterPosition=1 2023-06-02 10:57:00.147993-0700 A.I.[17186:910116] [AXTTSCommon] MauiVocalizer: 16038 (Resource load failed): component=ttt/re, uri=, contentType=application/x-vocalizer-rettt+text, lhError=88602000 2023-06-02 10:57:00.148036-0700 A.I.[17186:910116] [AXTTSCommon] Error loading rules: 2147483648 ... This goes on and on and on ... There must be a better way?
Replies
7
Boosts
1
Views
3.6k
Activity
1w
iOS 26.4 — How to return from main app to host app after a keyboard-extension dictation round-trip, without private APIs?
I'm building a custom keyboard extension that offers voice dictation. Because keyboard extensions are constrained (memory cap ~30–48 MB, restricted audio session access), I delegate recording to my container app: User in a host app (e.g., Safari) taps the mic in my keyboard extension. The keyboard calls extensionContext.open(URL("myapp://dictation")) to launch the container app. The container app records audio via AVAudioEngine + SFSpeechRecognizer, writes the final transcript to the App Group, and signals completion via a Darwin notification. 4. The user is expected to be returned to the original host app (Safari) automatically so they can keep typing. The problem (step 4): On iOS 26.4 I can no longer identify which app was the host. Every previously-known path returns nil for the keyboard extension's host: parent.value(forKey: "_hostBundleID") → returns the literal string parent.value(forKey: "_hostApplicationBundleIdentifier") → returns NSNull xpc_connection_copy_bundle_id on the underlying XPC connection (via PKService.defaultService.personalities[…]) → returns NULL NSXPCConnection.processBundleIdentifier on extensionContext._extensionHostProxy._connection → returns nil proc_pidpath(hostPID, …) → EPERM from the keyboard sandbox LSApplicationWorkspace.frontmostApplication → selector unavailable from the extension RBSProcessHandle.handleForIdentifier:error: → returns an RBSServiceErrorDomain error Without the host's bundle ID, the container app has no way to call LSApplicationWorkspace.openApplicationWithBundleID: (the technique that worked on iOS 25 and earlier). UIApplication.suspend() correctly sends the container to background, but iOS treats us as a "fresh launch" — it returns the user to the Home Screen instead of Safari, because the container app was launched by an extension, not directly by Safari. KeyboardKit's maintainer reached the same conclusion (issue #1014) and shipped 10.4 without the feature. My questions: Is there a public, App-Store-safe API in iOS 26+ for a custom keyboard extension to identify its host application, or for the container app (launched via the extension's openURL) to identify which app initially hosted the extension that opened it? UIOpenURLContext.options.sourceApplication reports the extension's own container, not the actual host. 2. Is there a public mechanism for "return to source app" when the container app was launched by an extension's openURL? Equivalent to the ← Source affordance iOS shows for normal inter-app openURL, but triggered programmatically by the launched app. 3. Some popular keyboards (e.g., 微信输入法 / WeChat Keyboard) still appear to round-trip through their container app on iOS 26.4 and return the user to the original host — including the iOS ← WeChat back affordance in the host's status bar afterward. What's the recommended approach to achieve this? If it requires a specific scene-activation flow, NSUserActivity pattern, or extension-context configuration, please point at the relevant docs. 4. If there is no public path today, is FB22247647 (or a related radar) the right place to track this? Should developers in this position migrate to in-extension audio capture (which has its own significant constraints in keyboard extensions)? I'd much rather not rely on private APIs. Concrete guidance — or even an acknowledgment of which direction Apple intends — would help thousands of custom-keyboard developers who currently have a degraded voice-input experience on iOS 26.4+. Tested on iPhone 12 Pro Max running iOS 26.4.2 (build 23E261), Xcode 26.x, Swift 5. Thanks!
Replies
0
Boosts
0
Views
225
Activity
2w
iOS feasibility question: user-initiated wake-word detection during active session
Hi all, Technical architecture question for those experienced with iOS background audio / microphone constraints. I’m exploring an app concept where: the user explicitly starts a temporary active session during that session, on-device wake-word / keyword detection runs locally no audio is stored or transmitted during passive monitoring monitoring stops when the user ends the session The intended UX is that the user may then lock the phone or place it away while the active session remains in progress. Question: Is there any App Store-compliant architecture that would allow local keyword / wake-word detection to continue while the device is locked or the app is backgrounded during that active session? Or would iOS lifecycle / background execution rules make this infeasible for custom wake-word detection? Interested in practical experience around: AVAudioSession background audio modes on-device speech processing App Review acceptability Thanks in advance.
Replies
0
Boosts
0
Views
361
Activity
2w
I have the same, iOS 26.3.0
open FB22712056
Replies
1
Boosts
0
Views
248
Activity
May ’26
How to use the SpeechDetector Module
I am trying to use SpeechDetector Module in Speech framework along with SpeechTranscriber. and it is giving me an error Cannot convert value of type 'SpeechDetector' to expected element type 'Array.ArrayLiteralElement' (aka 'any SpeechModule') Below is how I am using it let speechDetector = Speech.SpeechDetector() let transcriber = SpeechTranscriber(locale: Locale.current, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange]) speechAnalyzer = try SpeechAnalyzer(modules: [transcriber,speechDetector])
Replies
5
Boosts
2
Views
708
Activity
Apr ’26
SpeechAnalyzer speech to text wwdc sample app
I am using the sample app from: https://developer.apple.com/videos/play/wwdc2025/277/?time=763 I installed this on an Iphone 15 Pro with iOS 26 beta 1. I was able to get good transcription with it. The app did crash sometimes when transcribing and I was going to post here with the details. I then installed iOS beta 2 and uninstalled the sample app. Now every time I try to run the sample app on the 15 Pro I get this message: SpeechAnalyzer: Input loop ending with error: Error Domain=SFSpeechErrorDomain Code=10 "Cannot use modules with unallocated locales [en_US (fixed en_US)]" UserInfo={NSLocalizedDescription=Cannot use modules with unallocated locales [en_US (fixed en_US)]} I can't continue our our work towards using SpeechAnalyzer now with this error. I have set breakpoints on all the catch handlers and it doesn't catch this error. My phone region is "United States"
Replies
22
Boosts
9
Views
3k
Activity
Apr ’26
SpeechAnalyzer error "asset not found after attempted download" for certain languages
I am trying to use the new SpeechAnalyzer framework in my Mac app, and am running into an issue for some languages. When I call AssetInstallationRequest.downloadAndInstall() for some languages, it throws an error: Error Domain=SFSpeechErrorDomain Code=1 "transcription.ar asset not found after attempted download." The ".ar" appears to be the language code, which in this case was Arabic. When I call AssetInventory.status(forModules:) before attempting the download, it is giving me a status of "downloading" (perhaps from an earlier attempt?). If this language was completely unsupported, I would expect it to return a status of "unsupported", so I'm not sure what's going on here. For other languages (Polish, for example) SpeechTranscriber.supportedLocale(equivalentTo:) is returning nil, so that seems like a clearly unsupported language. But I can't tell if the languages I'm trying, like Arabic, are supported and something is going wrong, or if this error represents something I can work around. Here's the relevant section of code. The error is thrown from downloadAndInstall(), so I never even get as far as setting up the SpeechAnalyzer itself. private func setUpAnalyzer() async throws { guard let sourceLanguage else { throw Error.languageNotSpecified } guard let locale = await SpeechTranscriber.supportedLocale(equivalentTo: Locale(identifier: sourceLanguage.rawValue)) else { throw Error.unsupportedLanguage } let transcriber = SpeechTranscriber(locale: locale, preset: .progressiveTranscription) self.transcriber = transcriber let reservedLocales = await AssetInventory.reservedLocales if !reservedLocales.contains(locale) && reservedLocales.count == AssetInventory.maximumReservedLocales { if let oldest = reservedLocales.last { await AssetInventory.release(reservedLocale: oldest) } } do { let status = await AssetInventory.status(forModules: [transcriber]) print("status: \(status)") if let installationRequest = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) { try await installationRequest.downloadAndInstall() } } ...
Replies
9
Boosts
0
Views
1.3k
Activity
Apr ’26
SpeechAnalyzer > AnalysisContext lack of documentation
I'm using the new SpeechAnalyzer framework to detect certain commands and want to improve accuracy by giving context. Seems like AnalysisContext is the solution for this, but couldn't find any usage example. So I want to make sure that I'm doing it right or not. let context = AnalysisContext() context.contextualStrings = [ AnalysisContext.ContextualStringsTag("commands"): [ "set speed level", "set jump level", "increase speed", "decrease speed", ... ], AnalysisContext.ContextualStringsTag("vocabulary"): [ "speed", "jump", ... ] ] try await analyzer.setContext(context) With this implementation, it still gives outputs like "Set some speed level", "It's speed level", etc. Also, is it possible to make it expect number after those commands, in order to eliminate results like "set some speed level to" (instead of two).
Replies
2
Boosts
0
Views
727
Activity
Apr ’26
CarPlay: Voice Conversational Entitlement Details
With the Voice Conversational Entitlement, can a CarPlay app establish a turn-based audio interface that operates in two modes: Speaking mode: Audio Session configured for playback Buffered audio Listening mode: Switch Audio Session to .record or .playAndRecord Activate SFSpeechRecognizer And continue toggling back and forth. The app should listen for responses to questions or other audio cues, and assuming those answers are correct (based on analysis of results from SFSpeechRecognizer), continue this pattern of mode 1 and 2 alternating. This appears to be a valid use of this entitlement. Does this also require the Audio App Entitlement, or is the Voice Conversational Entitlement sufficient? Are there other obstacles to this type of app that I'm not seeing? Or perhaps this is technically possible, but unlikely to pass app store review?
Replies
0
Boosts
0
Views
334
Activity
Apr ’26
AXSpeech Thread Crash SEGV_ACCERR
Hi everyone, I've encountered a rare and strange crash in my app that I can't consistently reproduce. The crash seems to occur deep within Apple's internal frameworks, and I can't pinpoint which line of my own code is causing it. Here's the crash stack trace: #44 AXSpeech SIGSEGV SEGV_ACCERR 0 CoreFoundation ___CFCheckCFInfoPACSignature + 4 1 CoreFoundation _CFRunLoopSourceSignal + 28 2 Foundation _performQueueDequeue + 492 3 Foundation ___NSThreadPerformPerform + 88 4 CoreFoundation ___CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28 5 CoreFoundation ___CFRunLoopDoSource0 + 176 6 CoreFoundation ___CFRunLoopDoSources0 + 340 7 CoreFoundation ___CFRunLoopRun + 828 8 CoreFoundation _CFRunLoopRunSpecific + 608 9 Foundation -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212 10 TextToSpeech _TTSCFAttributedStringCreateStringByBracketingAttributeWithString + 776 11 Foundation ___NSThread__start__ + 732 12 libsystem_pthread.dylib __pthread_start + 136 Sometimes, instead of line 10 referencing _TTSCFAttributedStringCreateStringByBracketingAttributeWithString, it shows: 10 TextToSpeech LogWarning(char const*, ...) + 7288 Has anyone experienced a similar issue or know what might be triggering this crash? Any guidance on how to investigate or resolve this would be greatly appreciated. Thank you!
Replies
7
Boosts
0
Views
2.6k
Activity
Mar ’26
SpeechTranscriber/SpeechAnalyzer being relatively slow compared to FoundationModel and TTS
So, I've been wondering how fast a an offline STT -> ML Prompt -> TTS roundtrip would be. Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS. E.g. InteractionStatistics: - listeningStarted: 21:24:23 4480 2423 - timeTillFirstAboveNoiseFloor: 01.794 - timeTillLastNoiseAboveFloor: 02.383 - timeTillFirstSpeechDetected: 02.399 - timeTillTranscriptFinalized: 04.510 - timeTillFirstMLModelResponse: 04.938 - timeTillMLModelResponse: 05.379 - timeTillTTSStarted: 04.962 - timeTillTTSFinished: 11.016 - speechLength: 06.054 - timeToResponse: 02.578 - transcript: This is a test. - mlModelResponse: Sure! I'm ready to help with your test. What do you need help with? Here, between my audio input ending and the Text-2-Speech starting top play (using AVSpeechUtterance) the total response time was 2.5s. Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly). I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now? I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
Replies
3
Boosts
0
Views
876
Activity
Mar ’26
Strange crash in iOS AudioToolboxCore when using AVSpeechSynthesizer in iOS 16
I'm getting Crashlytics crashes from some my users, deep in the Apple code: Crashed: AXSpeech EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x00000007ec54b360 0 libobjc.A.dylib 0x3c9c objc_retain_x8 + 16 1 AudioToolboxCore 0x99580 auoop::RenderPipeUser::~RenderPipeUser() + 112 2 AudioToolboxCore 0xe6090 -[AUAudioUnit_XPC internalDeallocateRenderResources] + 92 3 AVFAudio 0x90a0 AUInterfaceBaseV3::Uninitialize() + 60 4 AVFAudio 0x4cbe0 AVAudioEngineGraph::PerformCommand(AUGraphNodeBaseV3&, AVAudioEngineGraph::ENodeCommand, void*, unsigned int) const + 768 5 AVFAudio 0x56b0c AVAudioEngineGraph::_Uninitialize(NSError**) + 132 6 AVFAudio 0x7834 AVAudioEngineImpl::Stop(NSError**) + 388 7 AVFAudio 0x636c -[AVAudioEngine dealloc] + 52 8 TextToSpeech 0x30674 _TTSNameForVoiceInformation + 20864 9 libobjc.A.dylib 0x20a4 object_cxxDestructFromClass(objc_object*, objc_class*) + 116 10 libobjc.A.dylib 0x6e00 objc_destructInstance + 80 11 libobjc.A.dylib 0x104fc _objc_rootDealloc + 80 12 TextToSpeech 0x2d2f4 _TTSNameForVoiceInformation + 7680 13 TextToSpeech 0x496c TTSVocalizerCopyURLForFallbackResource + 8540 14 TextToSpeech 0x26094 TTSSpeechUnitTestingMode + 5548 15 libAXSpeechManager.dylib 0x108b0 -[AXSpeechManager .cxx_destruct] + 192 16 libobjc.A.dylib 0x20a4 object_cxxDestructFromClass(objc_object*, objc_class*) + 116 17 libobjc.A.dylib 0x6e00 objc_destructInstance + 80 18 libobjc.A.dylib 0x104fc _objc_rootDealloc + 80 19 libAXSpeechManager.dylib 0x5298 -[AXSpeechManager dealloc] + 268 20 Foundation 0x3b8a4 __NSThreadPerformPerform + 272 21 CoreFoundation 0xd3208 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28 22 CoreFoundation 0xdf864 __CFRunLoopDoSource0 + 176 23 CoreFoundation 0x646c8 __CFRunLoopDoSources0 + 244 24 CoreFoundation 0x7a1c4 __CFRunLoopRun + 828 25 CoreFoundation 0x7f4dc CFRunLoopRunSpecific + 612 26 Foundation 0x420c4 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212 27 libAXSpeechManager.dylib 0x13390 -[AXSpeechThread main] + 552 28 Foundation 0x5b634 __NSThread__start__ + 716 29 libsystem_pthread.dylib 0x16b8 _pthread_start + 148 30 libsystem_pthread.dylib 0xb88 thread_start + 8 It's most likely related to my use of AVSpeechSynthesizer. I do change some of the utterance fields, including the voice that's being used (which is set to a value from speechVoices()). UtilAudioIos_tts = AVSpeechSynthesizer() let utterance = AVSpeechUtterance utterance.voice = AVSpeechSynthesisVoice(identifier: voice.voiceCode) utterance.volume = volume utterance.pitchMultiplier = pitch utterance.rate = rate UtilAudioIos_tts!.speak(utterance) By coincidence or not, the following sometimes appears in the device log: 2023-05-30 20:35:29.948078+0100 <appname>[466:12882] [catalog] Unable to list voice folder and also, sometimes: 2023-05-30 20:37:35.345933+0100 <appname>[466:13298] [catalog] Query for com.apple.MobileAsset.VoiceServices.VoiceResources failed: 2 2023-05-30 20:37:35.360854+0100 rehearserfree[466:13433] [AXTTSCommon] MauiVocalizer: 11006 (Can't compile rule): regularExpression=\Oviedo(?=, (\x1b\\pause=\d+\\)?Florida)\b, message=unrecognized character follows \, characterPosition=1 2023-05-30 20:37:35.363163+0100 <appname>[466:13433] [AXTTSCommon] MauiVocalizer: 16038 (Resource load failed): component=ttt/re, uri=, contentType=application/x-vocalizer-rettt+text, lhError=88602000 2023-05-30 20:37:35.363182+0100 <appname>[466:13433] [AXTTSCommon] Error loading rules: 2147483648 All of these crashes have been on the various versions of iOS 16. Edit: I can't reproduce the crash myself - it's just some (not all) app users. The log entries above appear locally on my device (with no crash) but I can't see the logs of the users who have the crashes. Any idea what this might be caused by, or how to go about tracking the problem down?
Replies
6
Boosts
0
Views
2.7k
Activity
Mar ’26
SpeechTranscriber not supported
I've tried SpeechTranscriber with a lot of my devices (from iPhone 12 series ~ iPhone 17 series) without issues. However, SpeechTranscriber.isAvailable value is false for my iPhone 11 Pro. https://developer.apple.com/documentation/speech/speechtranscriber/isavailable I'am curious why the iPhone 11 Pro device is not supported. Are all iPhone 11 series not supported intentionally? Or is there any problem with my specific device? I've also checked the supportedLocales, and the value is an empty array. https://developer.apple.com/documentation/speech/speechtranscriber/supportedlocales
Replies
5
Boosts
0
Views
971
Activity
Mar ’26
Video Audio + Speech To Text
Hello, I am wondering if it is possible to have audio from my AirPods be sent to my speech to text service and at the same time have the built in mic audio input be sent to recording a video? I ask because I want my users to be able to say "CAPTURE" and I start recording a video (with audio from the built in mic) and then when the user says "STOP" I stop the recording.
Replies
2
Boosts
0
Views
1.5k
Activity
Mar ’26
AVAudioEngine fails to start during FaceTime call (error 2003329396)
Is it possible to perform speech-to-text using AVAudioEngine to capture microphone input while being on a FaceTime call at the same time? I tried implementing this, but whenever I attempt to start the  AVAudioEngine  while a FaceTime call is active, I get the following error: “The operation couldn’t be completed. (OSStatus error 2003329396)” I assume this might be due to microphone resource restrictions during FaceTime, but I’d like to confirm whether this limitation is at the system level or if there’s any possible workaround or entitlement that allows concurrent microphone access. Has anyone encountered this issue or found a solution?
Replies
2
Boosts
1
Views
1.2k
Activity
Mar ’26
Building Real-Time Voice Input on macOS 26 with SpeechAnalyzer + ScreenCaptureKit
We built an open-source macOS menu bar app that turns speech into text and pastes it into the active app — using SpeechAnalyzer for on-device transcription, ScreenCaptureKit + Vision for screen-aware context, and FluidAudio for speaker diarization in meeting mode. Here's what we learned shipping it on macOS 26. GitHub: github.com/Marvinngg/ambient-voice Architecture The app has two modes: hotkey dictation (press to talk, release to inject) and meeting recording (continuous transcription with a floating panel). Dictation Mode Audio capture uses AVCaptureSession (more on why below). The captured audio feeds into SpeechAnalyzer via an AsyncStream: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults, .alternativeTranscriptions], attributeOptions: [.audioTimeRange, .transcriptionConfidence] ) let analyzer = SpeechAnalyzer(modules: [transcriber]) let (inputSequence, inputBuilder) = AsyncStream.makeStream() try await analyzer.start(inputSequence: inputSequence) While recording, we capture a screenshot of the focused window using ScreenCaptureKit, run Vision OCR (VNRecognizeTextRequest), extract keywords, and inject them into SpeechAnalyzer as contextual bias: let context = AnalysisContext() context.contextualStrings[.general] = ocrKeywords try await analyzer.setContext(context) This improves accuracy for technical terms and proper nouns visible on screen. If your screen shows "SpeechAnalyzer", saying it out loud is more likely to be transcribed correctly. After transcription, an optional L2 step sends the text through a local LLM (ollama) for spoken-to-written cleanup, then CGEvent simulates Cmd+V to paste into the active app. Meeting Mode Meeting mode forks the same audio stream to two consumers: SpeechAnalyzer — real-time streaming transcription, displayed in a floating NSPanel FluidAudio buffer — accumulates 16kHz Float32 mono samples for batch speaker diarization after recording stops When the user ends the meeting, FluidAudio's performCompleteDiarization() runs on the accumulated audio. We align transcription segments with speaker segments using audioTimeRange overlap matching — each transcription segment gets assigned the speaker ID with the most time overlap. Results export to Markdown. Pitfalls We Hit on macOS 26 1. AVAudioEngine installTap doesn't fire with Bluetooth devices We started with AVAudioEngine.inputNode.installTap() for audio capture. It worked fine with built-in mics but the tap callback never fired with Bluetooth devices (tested with vivo TWS 4 Hi-Fi). Fix: switched to AVCaptureSession. The delegate callback captureOutput(_:didOutput:from:) fires reliably regardless of audio device. The tradeoff is you get CMSampleBuffer instead of AVAudioPCMBuffer, so you need a conversion step. 2. NSEvent addGlobalMonitorForEvents crashes Our global hotkey listener used NSEvent.addGlobalMonitorForEvents. On macOS 26, this crashes with a Bus error inside GlobalObserverHandler — appears to be a Swift actor runtime issue. Fix: switched to CGEventTap. Works reliably, but the callback runs on a CFRunLoop context, which Swift doesn't recognize as MainActor. 3. CGEventTap callbacks aren't on MainActor If your CGEventTap callback touches any @MainActor state, you'll get concurrency violations. The callback runs on whatever thread owns the CFRunLoop. Fix: bridge with DispatchQueue.main.async {} inside the tap callback before touching any MainActor state. 4. CGPreflightScreenCaptureAccess doesn't request permission We used CGPreflightScreenCaptureAccess() as a guard before calling ScreenCaptureKit. If it returned false, we'd bail out. The problem: this function only checks — it never triggers macOS to add your app to the Screen Recording permission list. Chicken-and-egg: you can't get permission because you never ask for it. Fix: call CGRequestScreenCaptureAccess() at app startup. This adds your app to System Settings → Screen Recording. Then let ScreenCaptureKit calls proceed without the preflight guard — SCShareableContent will also trigger the permission prompt on first use. 5. Ad-hoc signing breaks TCC permissions on every rebuild During development, codesign --sign - (ad-hoc) generates a different code directory hash on every build. macOS TCC tracks permissions by this hash, so every rebuild = new app identity = all permissions reset. Fix: sign with a stable certificate. If you have an Apple Development certificate, use that. The TeamIdentifier stays constant across rebuilds, so TCC permissions persist. We also discovered that launching via open WE.app (LaunchServices) instead of directly executing the binary is required — otherwise macOS attributes TCC permissions to Terminal, not your app. Benchmarks We ran end-to-end benchmarks on public datasets (Mac Mini M4 16GB, macOS 26): Transcription (SpeechAnalyzer, AliMeeting Chinese): • Near-field CER 34% (excluding outliers ~25%) • Far-field CER 40% (single channel, no beamforming, >30% overlap) • Processing speed 74-89x real-time Speaker diarization (FluidAudio offline): • AMI English 16 meetings: avg DER 23.2% (collar=0.25s, ignoreOverlap=True) • AliMeeting Chinese 8 meetings: DER 48.5% (including overlap regions) • Memory: RSS ~500MB, peak 730-930MB Full evaluation methodology, scripts, and raw results are in the repo. Open Source The project is MIT licensed: github.com/Marvinngg/ambient-voice It includes the macOS client (Swift 6.2, SPM), server-side distillation/training scripts (Python), and a complete evaluation framework with reproducible benchmarks. Feedback and contributions welcome.
Replies
0
Boosts
0
Views
640
Activity
Mar ’26
Building Real-Time Voice Input on macOS 26 with SpeechAnalyzer + ScreenCaptureKit
We built an open-source macOS menu bar app that turns speech into text and pastes it into the active app — using SpeechAnalyzer for on-device transcription, ScreenCaptureKit + Vision for screen-aware context, and FluidAudio for speaker diarization in meeting mode. Here's what we learned shipping it on macOS 26. GitHub: github.com/Marvinngg/ambient-voice Architecture The app has two modes: hotkey dictation (press to talk, release to inject) and meeting recording (continuous transcription with a floating panel). Dictation Mode Audio capture uses AVCaptureSession (more on why below). The captured audio feeds into SpeechAnalyzer via an AsyncStream: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults, .alternativeTranscriptions], attributeOptions: [.audioTimeRange, .transcriptionConfidence] ) let analyzer = SpeechAnalyzer(modules: [transcriber]) let (inputSequence, inputBuilder) = AsyncStream.makeStream() try await analyzer.start(inputSequence: inputSequence) While recording, we capture a screenshot of the focused window using ScreenCaptureKit, run Vision OCR (VNRecognizeTextRequest), extract keywords, and inject them into SpeechAnalyzer as contextual bias: let context = AnalysisContext() context.contextualStrings[.general] = ocrKeywords try await analyzer.setContext(context) This improves accuracy for technical terms and proper nouns visible on screen. If your screen shows "SpeechAnalyzer", saying it out loud is more likely to be transcribed correctly. After transcription, an optional L2 step sends the text through a local LLM (ollama) for spoken-to-written cleanup, then CGEvent simulates Cmd+V to paste into the active app. Meeting Mode Meeting mode forks the same audio stream to two consumers: SpeechAnalyzer — real-time streaming transcription, displayed in a floating NSPanel FluidAudio buffer — accumulates 16kHz Float32 mono samples for batch speaker diarization after recording stops When the user ends the meeting, FluidAudio's performCompleteDiarization() runs on the accumulated audio. We align transcription segments with speaker segments using audioTimeRange overlap matching — each transcription segment gets assigned the speaker ID with the most time overlap. Results export to Markdown. Pitfalls We Hit on macOS 26 1. AVAudioEngine installTap doesn't fire with Bluetooth devices We started with AVAudioEngine.inputNode.installTap() for audio capture. It worked fine with built-in mics but the tap callback never fired with Bluetooth devices (tested with vivo TWS 4 Hi-Fi). Fix: switched to AVCaptureSession. The delegate callback captureOutput(_:didOutput:from:) fires reliably regardless of audio device. The tradeoff is you get CMSampleBuffer instead of AVAudioPCMBuffer, so you need a conversion step. 2. NSEvent addGlobalMonitorForEvents crashes Our global hotkey listener used NSEvent.addGlobalMonitorForEvents. On macOS 26, this crashes with a Bus error inside GlobalObserverHandler — appears to be a Swift actor runtime issue. Fix: switched to CGEventTap. Works reliably, but the callback runs on a CFRunLoop context, which Swift doesn't recognize as MainActor. 3. CGEventTap callbacks aren't on MainActor If your CGEventTap callback touches any @MainActor state, you'll get concurrency violations. The callback runs on whatever thread owns the CFRunLoop. Fix: bridge with DispatchQueue.main.async {} inside the tap callback before touching any MainActor state. 4. CGPreflightScreenCaptureAccess doesn't request permission We used CGPreflightScreenCaptureAccess() as a guard before calling ScreenCaptureKit. If it returned false, we'd bail out. The problem: this function only checks — it never triggers macOS to add your app to the Screen Recording permission list. Chicken-and-egg: you can't get permission because you never ask for it. Fix: call CGRequestScreenCaptureAccess() at app startup. This adds your app to System Settings → Screen Recording. Then let ScreenCaptureKit calls proceed without the preflight guard — SCShareableContent will also trigger the permission prompt on first use. 5. Ad-hoc signing breaks TCC permissions on every rebuild During development, codesign --sign - (ad-hoc) generates a different code directory hash on every build. macOS TCC tracks permissions by this hash, so every rebuild = new app identity = all permissions reset. Fix: sign with a stable certificate. If you have an Apple Development certificate, use that. The TeamIdentifier stays constant across rebuilds, so TCC permissions persist. We also discovered that launching via open WE.app (LaunchServices) instead of directly executing the binary is required — otherwise macOS attributes TCC permissions to Terminal, not your app. Benchmarks We ran end-to-end benchmarks on public datasets (Mac Mini M4 16GB, macOS 26): Transcription (SpeechAnalyzer, AliMeeting Chinese): • Near-field CER 34% (excluding outliers ~25%) • Far-field CER 40% (single channel, no beamforming, >30% overlap) • Processing speed 74-89x real-time Speaker diarization (FluidAudio offline): • AMI English 16 meetings: avg DER 23.2% (collar=0.25s, ignoreOverlap=True) • AliMeeting Chinese 8 meetings: DER 48.5% (including overlap regions) • Memory: RSS ~500MB, peak 730-930MB Full evaluation methodology, scripts, and raw results are in the repo. Open Source The project is MIT licensed: github.com/Marvinngg/ambient-voice It includes the macOS client (Swift 6.2, SPM), server-side distillation/training scripts (Python), and a complete evaluation framework with reproducible benchmarks. Feedback and contributions welcome.
Replies
0
Boosts
0
Views
753
Activity
Mar ’26
SpeechAnalyzer.start(inputSequence:) fails with _GenericObjCError nilError, while the same WAV succeeds with start(inputAudioFile:)
I'm trying to use the new Speech framework for streaming transcription on macOS 26.3, and I can reproduce a failure with SpeechAnalyzer.start(inputSequence:). What is working: SpeechAnalyzer + SpeechTranscriber offline path using start(inputAudioFile:finishAfterFile:) same Spanish WAV file transcribes successfully and returns a coherent final result What is not working: SpeechAnalyzer + SpeechTranscriber stream path using start(inputSequence:) same WAV, replayed as AnalyzerInput(buffer:bufferStartTime:) fails once replay starts with: _GenericObjCError domain=Foundation._GenericObjCError code=0 detail=nilError I also tried: DictationTranscriber instead of SpeechTranscriber no realtime pacing during replay Both still fail in stream mode with the same error. So this does not currently look like a ScreenCaptureKit issue or a Python integration issue. I reduced it to a pure Swift CLI repro. Environment: macOS 26.3 (25D122) Xcode 26.3 Swift 6.2.4 Apple Silicon Mac Has anyone here gotten SpeechAnalyzer.start(inputSequence:) working reliably on macOS 26.x? If so, I'd be interested in any workaround or any detail that differs from the obvious setup: prepareToAnalyze(in:) bestAvailableAudioFormat(...) AnalyzerInput(buffer:bufferStartTime:) replaying a known-good WAV in chunks I already filed Feedback Assistant: FB22149971
Replies
1
Boosts
0
Views
545
Activity
Mar ’26
Crashed: AXSpeech EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x000056f023efbeb0
Application is getting Crashed: AXSpeech EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x000056f023efbeb0 Crashed: AXSpeech 0 libobjc.A.dylib 0x4820 objc_msgSend + 32 1 libsystem_trace.dylib 0x6c34 _os_log_fmt_flatten_object + 116 2 libsystem_trace.dylib 0x5344 _os_log_impl_flatten_and_send + 1884 3 libsystem_trace.dylib 0x4bd0 _os_log + 152 4 libsystem_trace.dylib 0x9c48 _os_log_error_impl + 24 5 TextToSpeech 0xd0a8c _pcre2_xclass_8 6 TextToSpeech 0x3bc04 TTSSpeechUnitTestingMode 7 TextToSpeech 0x3f128 TTSSpeechUnitTestingMode 8 AXCoreUtilities 0xad38 -[NSArray(AXExtras) ax_flatMappedArrayUsingBlock:] + 204 9 TextToSpeech 0x3eb18 TTSSpeechUnitTestingMode 10 TextToSpeech 0x3c948 TTSSpeechUnitTestingMode 11 TextToSpeech 0x48824 AXAVSpeechSynthesisVoiceFromTTSSpeechVoice 12 TextToSpeech 0x49804 AXAVSpeechSynthesisVoiceFromTTSSpeechVoice 13 Foundation 0xf6064 __NSThreadPerformPerform + 264 14 CoreFoundation 0x37acc CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION + 28 15 CoreFoundation 0x36d48 __CFRunLoopDoSource0 + 176 16 CoreFoundation 0x354fc __CFRunLoopDoSources0 + 244 17 CoreFoundation 0x34238 __CFRunLoopRun + 828 18 CoreFoundation 0x33e18 CFRunLoopRunSpecific + 608 19 Foundation 0x2d4cc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212 20 TextToSpeech 0x24b88 TTSCFAttributedStringCreateStringByBracketingAttributeWithString 21 Foundation 0xb3154 NSThread__start + 732 com.livingMedia.AajTakiPhone_issue_3ceba855a8ad2d1af83655803dc13f70_crash_session_9081fa41ced440ae9a57c22cb432f312_DNE_0_v2_stacktrace.txt 22 libsystem_pthread.dylib 0x24d4 _pthread_start + 136 23 libsystem_pthread.dylib 0x1a10 thread_start + 8
Replies
4
Boosts
1
Views
1.8k
Activity
Mar ’26
AXSpeech Crash
I have a very terrible crash problem in my App when I use AVSpeechSynthesizer and I can't repetition it.Here is my code, It's a singleton- (void)stopSpeech { if ([self.synthesizer isPaused]) { return; } if ([self.synthesizer isSpeaking]) { BOOL isSpeech = [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate]; if (!isSpeech) { [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryWord]; } } self.stopBlock ? self.stopBlock() : nil; } -(AVSpeechSynthesizer *)synthesizer { if (!_synthesizer) { _synthesizer = [[AVSpeechSynthesizer alloc] init]; _synthesizer.delegate = self; } return _synthesizer; }When the user leaves the page, I call the stopSpeech method。Then I got a lot of crash messagesHere is a crash log:# Crashlytics - plaintext stacktrace downloaded by liweican at Mon, 13 May 2019 03:03:24 GMT # URL: https://fabric.io/youdao-dict/ios/apps/com.youdao.udictionary/issues/5a904ed88cb3c2fa63ad7ed3?time=last-thirty-days/sessions/b1747d91bafc4680ab0ca8e3a702c52c_DNE_0_v2 # Organization: zzz # Platform: ios # Application: U-Dictionary # Version: 3.0.5.4 # Bundle Identifier: com.youdao.UDictionary # Issue ID: 5a904ed88cb3c2fa63ad7ed3 # Session ID: b1747d91bafc4680ab0ca8e3a702c52c_DNE_0_v2 # Date: 2019-05-13T02:27:00Z # OS Version: 12.2.0 (16E227) # Device: iPhone 8 Plus # RAM Free: 17% # Disk Free: 64.6% #19. Crashed: AXSpeech 0 libsystem_pthread.dylib 0x19c15e5b8 pthread_mutex_lock$VARIANT$armv81 + 102 1 CoreFoundation 0x19c4cf84c CFRunLoopSourceSignal + 68 2 Foundation 0x19cfc7280 performQueueDequeue + 464 3 Foundation 0x19cfc680c __NSThreadPerformPerform + 136 4 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24 5 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88 6 CoreFoundation 0x19c4d1b74 __CFRunLoopDoSources0 + 256 7 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004 8 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 9 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300 10 libAXSpeechManager.dylib 0x1ac16c94c -[AXSpeechThread main] + 264 11 Foundation 0x19cfc66e4 __NSThread__start__ + 984 12 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 13 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 14 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 -- #0. com.apple.main-thread 0 libsystem_malloc.dylib 0x19c11ce24 small_free_list_remove_ptr_no_clear + 768 1 libsystem_malloc.dylib 0x19c11f094 small_malloc_from_free_list + 296 2 libsystem_malloc.dylib 0x19c11f094 small_malloc_from_free_list + 296 3 libsystem_malloc.dylib 0x19c11d63c small_malloc_should_clear + 224 4 libsystem_malloc.dylib 0x19c11adcc szone_malloc_should_clear + 132 5 libsystem_malloc.dylib 0x19c123c18 malloc_zone_malloc + 156 6 CoreFoundation 0x19c569ab4 __CFBasicHashRehash + 300 7 CoreFoundation 0x19c56b430 __CFBasicHashAddValue + 96 8 CoreFoundation 0x19c56ab9c CFBasicHashAddValue + 2160 9 CoreFoundation 0x19c49f3bc CFDictionaryAddValue + 260 10 CoreFoundation 0x19c572ee8 __54-[CFPrefsSource mergeIntoDictionary:sourceDictionary:]_block_invoke + 28 11 CoreFoundation 0x19c49f0b4 __CFDictionaryApplyFunction_block_invoke + 24 12 CoreFoundation 0x19c568b7c CFBasicHashApply + 116 13 CoreFoundation 0x19c49f090 CFDictionaryApplyFunction + 168 14 CoreFoundation 0x19c42f504 -[CFPrefsSource mergeIntoDictionary:sourceDictionary:] + 136 15 CoreFoundation 0x19c4bcd38 -[CFPrefsSearchListSource alreadylocked_getDictionary:] + 644 16 CoreFoundation 0x19c42e71c -[CFPrefsSearchListSource alreadylocked_copyValueForKey:] + 152 17 CoreFoundation 0x19c42e660 -[CFPrefsSource copyValueForKey:] + 60 18 CoreFoundation 0x19c579e88 __76-[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:]_block_invoke + 40 19 CoreFoundation 0x19c4bdff4 __108-[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:]_block_invoke + 272 20 CoreFoundation 0x19c4bda38 normalizeQuintuplet + 340 21 CoreFoundation 0x19c42c634 -[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:] + 108 22 CoreFoundation 0x19c42cec0 -[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:] + 148 23 CoreFoundation 0x19c57c2d0 _CFPreferencesCopyAppValueWithContainerAndConfiguration + 124 24 TextInput 0x1a450e550 -[TIPreferencesController valueForPreferenceKey:] + 460 25 UIKitCore 0x1c87c71f8 -[UIKeyboardPreferencesController handBias] + 36 26 UIKitCore 0x1c887275c -[UIKeyboardLayoutStar showKeyboardWithInputTraits:screenTraits:splitTraits:] + 320 27 UIKitCore 0x1c88f4240 -[UIKeyboardImpl finishLayoutChangeWithArguments:] + 492 28 UIKitCore 0x1c88f47c8 -[UIKeyboardImpl updateLayout] + 1208 29 UIKitCore 0x1c88eaad0 -[UIKeyboardImpl updateLayoutIfNecessary] + 448 30 UIKitCore 0x1c88eab9c -[UIKeyboardImpl setFrame:] + 140 31 UIKitCore 0x1c88d5d60 -[UIKeyboard activate] + 652 32 UIKitCore 0x1c894c90c -[UIKeyboardAutomatic activate] + 128 33 UIKitCore 0x1c88d5158 -[UIKeyboard setFrame:] + 296 34 UIKitCore 0x1c88d81b0 -[UIKeyboard _didChangeKeyplaneWithContext:] + 228 35 UIKitCore 0x1c88f4aa0 -[UIKeyboardImpl didMoveToSuperview] + 136 36 UIKitCore 0x1c8f2ad84 __45-[UIView(Hierarchy) _postMovedFromSuperview:]_block_invoke + 888 37 UIKitCore 0x1c8f2a970 -[UIView(Hierarchy) _postMovedFromSuperview:] + 760 38 UIKitCore 0x1c8f39ddc -[UIView(Internal) _addSubview:positioned:relativeTo:] + 1740 39 UIKitCore 0x1c88d5d84 -[UIKeyboard activate] + 688 40 UIKitCore 0x1c894c90c -[UIKeyboardAutomatic activate] + 128 41 UIKitCore 0x1c893b3a4 -[UIPeripheralHost(UIKitInternal) _reloadInputViewsForResponder:] + 1332 42 UIKitCore 0x1c8ae66d8 -[UIResponder(UIResponderInputViewAdditions) reloadInputViews] + 80 43 UIKitCore 0x1c8ae23bc -[UIResponder becomeFirstResponder] + 804 44 UIKitCore 0x1c8f2a560 -[UIView(Hierarchy) becomeFirstResponder] + 156 45 UIKitCore 0x1c8d93e84 -[UITextField becomeFirstResponder] + 244 46 UIKitCore 0x1c8d578dc -[UITextInteractionAssistant(UITextInteractionAssistant_Internal) setFirstResponderIfNecessary] + 192 47 UIKitCore 0x1c8d45d8c -[UITextSelectionInteraction oneFingerTap:] + 3136 48 UIKitCore 0x1c86e0bcc -[UIGestureRecognizerTarget _sendActionWithGestureRecognizer:] + 64 49 UIKitCore 0x1c86e8dd4 _UIGestureRecognizerSendTargetActions + 124 50 UIKitCore 0x1c86e6778 _UIGestureRecognizerSendActions + 316 51 UIKitCore 0x1c86e5ca4 -[UIGestureRecognizer _updateGestureWithEvent:buttonEvent:] + 760 52 UIKitCore 0x1c86d9d80 _UIGestureEnvironmentUpdate + 2180 53 UIKitCore 0x1c86d94b0 -[UIGestureEnvironment _deliverEvent:toGestureRecognizers:usingBlock:] + 384 54 UIKitCore 0x1c86d9290 -[UIGestureEnvironment _updateForEvent:window:] + 204 55 UIKitCore 0x1c8af14a8 -[UIWindow sendEvent:] + 3112 56 UIKitCore 0x1c8ad1534 -[UIApplication sendEvent:] + 340 57 UIKitCore 0x1c8b977c0 __dispatchPreprocessedEventFromEventQueue + 1768 58 UIKitCore 0x1c8b99eec __handleEventQueueInternal + 4828 59 UIKitCore 0x1c8b9311c __handleHIDEventFetcherDrain + 152 60 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24 61 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88 62 CoreFoundation 0x19c4d1b24 __CFRunLoopDoSources0 + 176 63 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004 64 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 65 GraphicsServices 0x19e6cc79c GSEventRunModal + 104 66 UIKitCore 0x1c8ab7b68 UIApplicationMain + 212 67 UDictionary 0x10517e138 main (main.m:17) 68 libdyld.dylib 0x19bf928e0 start + 4 #1. Thread 0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8 1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340 2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4 #2. com.apple.uikit.eventfetch-thread 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300 6 Foundation 0x19ce99e5c -[NSRunLoop(NSRunLoop) runUntilDate:] + 96 7 UIKitCore 0x1c8b9d540 -[UIEventFetcher threadMain] + 136 8 Foundation 0x19cfc66e4 __NSThread__start__ + 984 9 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 10 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 11 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #3. JavaScriptCore bmalloc scavenger 0 libsystem_kernel.dylib 0x19c0ddee4 __psynch_cvwait + 8 1 libsystem_pthread.dylib 0x19c15d4a4 _pthread_cond_wait$VARIANT$armv81 + 628 2 libc++.1.dylib 0x19b6b5090 std::__1::condition_variable::wait(std::__1::unique_lock&lt;std::__1::mutex&gt;&amp;) + 24 3 JavaScriptCore 0x1a36a2238 void std::__1::condition_variable_any::wait&lt;std::__1::unique_lock&lt;bmalloc::Mutex&gt; &gt;(std::__1::unique_lock&lt;bmalloc::Mutex&gt;&amp;) + 108 4 JavaScriptCore 0x1a36a622c bmalloc::Scavenger::threadRunLoop() + 176 5 JavaScriptCore 0x1a36a59a4 bmalloc::Scavenger::Scavenger(std::__1::lock_guard&lt;bmalloc::Mutex&gt;&amp;) + 10 6 JavaScriptCore 0x1a36a73e4 std::__1::__thread_specific_ptr&lt;std::__1::__thread_struct&gt;::set_pointer(std::__1::__thread_struct*) + 38 7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #4. WebThread 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 WebCore 0x1a5126480 RunWebThread(void*) + 600 6 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 7 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 8 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #5. com.twitter.crashlytics.ios.MachExceptionServer 0 UDictionary 0x1058a5564 CLSProcessRecordAllThreads (CLSProcess.c:376) 1 UDictionary 0x1058a594c CLSProcessRecordAllThreads (CLSProcess.c:407) 2 UDictionary 0x1058952dc CLSHandler (CLSHandler.m:26) 3 UDictionary 0x1058906cc CLSMachExceptionServer (CLSMachException.c:446) 4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #6. com.apple.NSURLConnectionLoader 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 CFNetwork 0x19cae574c -[__CoreSchedulingSetRunnable runForever] + 216 6 Foundation 0x19cfc66e4 __NSThread__start__ + 984 7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #7. AVAudioSession Notify Thread 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 AVFAudio 0x1a238a378 GenericRunLoopThread::Entry(void*) + 156 6 AVFAudio 0x1a23b4c60 CAPThread::Entry(CAPThread*) + 88 7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #8. WebCore: LocalStorage 0 libsystem_kernel.dylib 0x19c0ddee4 __psynch_cvwait + 8 1 libsystem_pthread.dylib 0x19c15d4a4 _pthread_cond_wait$VARIANT$armv81 + 628 2 JavaScriptCore 0x1a3668ce4 ***::ThreadCondition::timedWait(***::Mutex&amp;, ***::WallTime) + 80 3 JavaScriptCore 0x1a364f96c ***::ParkingLot::parkConditionallyImpl(void const*, ***::ScopedLambda&lt;bool ()&gt; const&amp;, ***::ScopedLambda&lt;void ()&gt; const&amp;, ***::TimeWithDynamicClockType const&amp;) + 2004 4 WebKitLegacy 0x1a67b6ea8 bool ***::Condition::waitUntil&lt;***::Lock&gt;(***::Lock&amp;, ***::TimeWithDynamicClockType const&amp;) + 184 5 WebKitLegacy 0x1a67b9ba4 std::__1::unique_ptr&lt;***::Function&lt;void ()&gt;, std::__1::default_delete&lt;***::Function&lt;void ()&gt; &gt; &gt; ***::MessageQueue&lt;***::Function&lt;void ()&gt; &gt;::waitForMessageFilteredWithTimeout&lt;***::MessageQueue&lt;***::Function&lt;void ()&gt; &gt;::waitForMessage()::'lambda'(***::Function&lt;void ()&gt; const&amp;)&gt;(***::MessageQueueWaitResult&amp;, ***::MessageQueue&lt;***::Function&lt;void ()&gt; &gt;::waitForMessage()::'lambda'(***::Function&lt;void ()&gt; const&amp;)&amp;&amp;, ***::WallTime) + 156 6 WebKitLegacy 0x1a67b91c0 WebCore::StorageThread::threadEntryPoint() + 68 7 JavaScriptCore 0x1a3666f88 ***::Thread::entryPoint(***::Thread::NewThreadContext*) + 260 8 JavaScriptCore 0x1a3668494 ***::wtfThreadEntryPoint(void*) + 12 9 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 10 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 11 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #9. com.apple.CoreMotion.MotionThread 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 CoreFoundation 0x19c4cd0b0 CFRunLoopRun + 80 6 CoreMotion 0x1a1df0240 (Missing) 7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #10. Thread 0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8 1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340 2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4 #11. Thread 0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8 1 libsystem_pthread.dylib 0x19c1611f8 _pthread_wqthread + 532 2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4 #12. com.apple.CFStream.LegacyThread 0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8 1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72 2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236 3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360 4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 5 CoreFoundation 0x19c4e5094 _legacyStreamRunLoop_workThread + 260 6 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 7 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 8 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #13. Thread 0 libsystem_pthread.dylib 0x19c163cd0 start_wqthread + 190 #14. Thread 0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8 1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340 2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4 #15. Thread 0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8 1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340 2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4 #16. Thread 0 libsystem_kernel.dylib 0x19c0d3148 semaphore_timedwait_trap + 8 1 libdispatch.dylib 0x19bf50a4c _dispatch_sema4_timedwait$VARIANT$armv81 + 64 2 libdispatch.dylib 0x19bf513a8 _dispatch_semaphore_wait_slow + 72 3 libdispatch.dylib 0x19bf647c8 _dispatch_worker_thread + 344 4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #17. Thread 0 libsystem_kernel.dylib 0x19c0d3148 semaphore_timedwait_trap + 8 1 libdispatch.dylib 0x19bf50a4c _dispatch_sema4_timedwait$VARIANT$armv81 + 64 2 libdispatch.dylib 0x19bf513a8 _dispatch_semaphore_wait_slow + 72 3 libdispatch.dylib 0x19bf647c8 _dispatch_worker_thread + 344 4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #18. Thread 0 libsystem_kernel.dylib 0x19c0d3148 semaphore_timedwait_trap + 8 1 libdispatch.dylib 0x19bf50a4c _dispatch_sema4_timedwait$VARIANT$armv81 + 64 2 libdispatch.dylib 0x19bf513a8 _dispatch_semaphore_wait_slow + 72 3 libdispatch.dylib 0x19bf647c8 _dispatch_worker_thread + 344 4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #19. Crashed: AXSpeech 0 libsystem_pthread.dylib 0x19c15e5b8 pthread_mutex_lock$VARIANT$armv81 + 102 1 CoreFoundation 0x19c4cf84c CFRunLoopSourceSignal + 68 2 Foundation 0x19cfc7280 performQueueDequeue + 464 3 Foundation 0x19cfc680c __NSThreadPerformPerform + 136 4 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24 5 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88 6 CoreFoundation 0x19c4d1b74 __CFRunLoopDoSources0 + 256 7 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004 8 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 9 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300 10 libAXSpeechManager.dylib 0x1ac16c94c -[AXSpeechThread main] + 264 11 Foundation 0x19cfc66e4 __NSThread__start__ + 984 12 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 13 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 14 libsystem_pthread.dylib 0x19c163cdc thread_start + 4 #20. AXSpeech 0 (Missing) 0x1071ba524 (Missing) 1 (Missing) 0x1071b3e7c (Missing) 2 (Missing) 0x10718fba4 (Missing) 3 (Missing) 0x107184bc8 (Missing) 4 libdyld.dylib 0x19bf95908 dlopen + 176 5 CoreFoundation 0x19c5483e8 _CFBundleDlfcnLoadBundle + 140 6 CoreFoundation 0x19c486918 _CFBundleLoadExecutableAndReturnError + 352 7 Foundation 0x19ced5734 -[NSBundle loadAndReturnError:] + 428 8 TextToSpeech 0x1abfff800 TTSSpeechUnitTestingMode + 1020 9 libdispatch.dylib 0x19bf817d4 _dispatch_client_callout + 16 10 libdispatch.dylib 0x19bf52040 _dispatch_once_callout + 28 11 TextToSpeech 0x1abfff478 TTSSpeechUnitTestingMode + 116 12 libobjc.A.dylib 0x19b7173cc CALLING_SOME_+initialize_METHOD + 24 13 libobjc.A.dylib 0x19b71cee0 initializeNonMetaClass + 296 14 libobjc.A.dylib 0x19b71e640 initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt&lt;false&gt;&amp;, bool) + 260 15 libobjc.A.dylib 0x19b7265a4 lookUpImpOrForward + 244 16 libobjc.A.dylib 0x19b733858 _objc_msgSend_uncached + 56 17 libAXSpeechManager.dylib 0x1ac167324 -[AXSpeechManager _initialize] + 68 18 Foundation 0x19cfc68d4 __NSThreadPerformPerform + 336 19 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24 20 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88 21 CoreFoundation 0x19c4d1b74 __CFRunLoopDoSources0 + 256 22 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004 23 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436 24 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300 25 libAXSpeechManager.dylib 0x1ac16c94c -[AXSpeechThread main] + 264 26 Foundation 0x19cfc66e4 __NSThread__start__ + 984 27 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128 28 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44 29 libsystem_pthread.dylib 0x19c163cdc thread_start + 4I change my code like this, It still has the same problem- (void)stopSpeech { if (self.synthesizer != nil &amp;&amp; [self.synthesizer isPaused]) { return; } // if ([self.synthesizer isSpeaking]) { // BOOL isSpeech = [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate]; // if (!isSpeech) { // [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryWord]; // } // } if (self.synthesizer != nil) { [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate]; // if (!isSpeech) { // [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryWord]; // } self.stopBlock ? self.stopBlock() : nil; } }
Replies
1
Boosts
1
Views
2.7k
Activity
Mar ’26