Crash with “suspend resume partial function”

Question

We have been getting crash reports that include the phrase “suspend resume partial function” and that seem to come from our use of Swift Concurrency. An example stack trace is:

Crashed: com.apple.root.user-initiated-qos.cooperative
EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x0000000cf855c2e0
0  libobjc.A.dylib                0x4174 objc_release + 16
1  libobjc.A.dylib                0x4174 objc_release_x0 + 16
2  libswiftCore.dylib             0x3ad2c8 swift_arrayDestroy + 124
3  libswiftCore.dylib             0x98f38 _ContiguousArrayStorage.__deallocating_deinit + 96
4  libswiftCore.dylib             0x3bd628 _swift_release_dealloc + 56
5  libswiftCore.dylib             0x3be44c bool swift::RefCounts<swift::RefCountBitsT<(swift::RefCountInlinedness)1> >::doDecrementSlow<(swift::PerformDeinit)1>(swift::RefCountBitsT<(swift::RefCountInlinedness)1>, unsigned int) + 132
6  MyAppName                      0x32e08c (3) suspend resume partial function for ProductUpdater.updateProducts() + 138 (Products.swift:138)
7  libswift_Concurrency.dylib     0x41948 swift::runJobInEstablishedExecutorContext(swift::Job*) + 416
8  libswift_Concurrency.dylib     0x42868 swift_job_runImpl(swift::Job*, swift::ExecutorRef) + 72
9  libdispatch.dylib              0x15944 _dispatch_root_queue_drain + 396
10 libdispatch.dylib              0x16158 _dispatch_worker_thread2 + 164
11 libsystem_pthread.dylib        0xda0 _pthread_wqthread + 228
12 libsystem_pthread.dylib        0xb7c start_wqthread + 8

Crashlytics seems to think that call 6 is the problem here, and it labels the crash as “(3) suspend resume partial function for ProductUpdater.updateProducts()”.

What does the phrase “suspend resume partial function” mean, and what does it indicate the program was doing that caused a problem? Here is a very rough sketch of the relevant part of the code:

struct Product: Codable, Equatable {
    var id: String
    var name: String
    var price: Int
}

struct ProductWrapper: Codable, Equatable {
    var metadata: String
    var products: [Product]

    static let none = ProductWrapper(metadata: "none", products: [])
}

class ProductUpdater {
    var userIsSignedIn: Bool = false
    var products: ProductWrapper = .none

    init() {
        Task {
            await updateProducts()
        }
    }

    private func updateProducts() async {
        // If there's no user signed in, we can't fetch the products.
        guard userIsSignedIn else {
            products = .none
            return
        }

        do {
            // This next line was line 138 in the original code:
            products = try await NetworkService.fetchProducts()
        } catch {
            print("Could not fetch products: \(error)")
        }
    }
}

Following the vague advice in this comment on a GitHub issue, I tried wrapping the assignment of products in a seemingly pointless continuation,

    private func updateProducts() async {
        // If there's no user signed in, we can't fetch the products.
        guard userIsSignedIn else {
            products = .none
            return
        }

        do {
            let updatedProducts = try await NetworkService.fetchProducts()
            await withCheckedContinuation { continuation in
                products = updatedProducts
                continuation.resume()
            }
        } catch {
            print("Could not fetch products: \(error)")
        }
    }

but this doesn’t seem to have solved the issue.

We use Xcode Cloud to produce our builds, and it is using Xcode 14.3.1 (Swift 5.8.1).

Answer 1

Score: 1

“suspend resume partial function” means that a Task was resumed after an await. The compiler splits an async function at its suspension points, and a partial function is the portion between two await calls (or between the beginning or end of the function and the nearest await).
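As a rough illustration of that splitting, the sketch below marks the pieces of a small async function. The names fetchValue and compute are hypothetical and not part of the original code, and the numbering in the comments is only illustrative; the indices the compiler actually assigns to partial functions may differ.

```swift
// Hypothetical stand-in for an asynchronous call (assumption, not from the question).
func fetchValue() async -> Int {
    await Task.yield()   // a potential suspension point
    return 42
}

func compute() async -> Int {
    // First piece: runs from the function entry up to the first await.
    let a = 1
    let b = await fetchValue()
    // Second piece: the task resumes here after the first await.
    let c = await fetchValue()
    // Third piece: the task resumes here after the second await
    // and runs to the end of the function.
    return a + b + c
}
```

A frame labeled "suspend resume partial function for compute()" in a crash log would then point at one of the resumed pieces, not at the function entry.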

This crash is due to an over-release on an array. I would expect that you're accessing products (or mutating any part of ProductWrapper) on multiple threads.

I don't understand how this code compiles, though. init returns without initializing products, which is invalid. Maybe you're sliding past a compiler bug. You need to make sure that all properties are initialized before returning.

You're also modifying products via a Task on a random thread, which isn't valid.

I believe you want the following changes:

//  products needs an initial value
var products: ProductWrapper = ProductWrapper(metadata: "", products: [])

You probably want to mark ProductUpdater as @MainActor:

@MainActor
class ProductUpdater {

This will ensure that your reads and modifications of products are serialized on the main actor rather than racing across threads (it will push the Task in init onto the main actor as well). If you want the work done off the main thread, you should almost certainly make it a regular actor rather than a class.
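A minimal sketch of the main-actor variant, under some assumptions: NetworkService here is a hypothetical stub standing in for the real network layer, userIsSignedIn is passed through an initializer added only for the example, and the Task in init from the question is left out so the example stays deterministic.

```swift
struct Product: Codable, Equatable, Sendable {
    var id: String
    var name: String
    var price: Int
}

struct ProductWrapper: Codable, Equatable, Sendable {
    var metadata: String
    var products: [Product]

    static let none = ProductWrapper(metadata: "none", products: [])
}

// Hypothetical stub (assumption): the real service would perform a network request.
enum NetworkService {
    static func fetchProducts() async throws -> ProductWrapper {
        ProductWrapper(metadata: "stub",
                       products: [Product(id: "1", name: "Sample", price: 100)])
    }
}

@MainActor
final class ProductUpdater {
    let userIsSignedIn: Bool
    // All reads and writes of this property happen on the main actor.
    private(set) var products: ProductWrapper = .none

    init(userIsSignedIn: Bool) {
        self.userIsSignedIn = userIsSignedIn
    }

    func updateProducts() async {
        // If there's no user signed in, we can't fetch the products.
        guard userIsSignedIn else {
            products = .none
            return
        }
        do {
            // The fetch may run elsewhere; the assignment runs on the main actor.
            products = try await NetworkService.fetchProducts()
        } catch {
            print("Could not fetch products: \(error)")
        }
    }
}
```

Code outside the main actor then has to write await updater.products, so the compiler flags any unsynchronized access instead of letting it race.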

If that's still a problem, you'll need to add some kind of locking around products so that it isn't accessed from multiple threads at once. I encourage turning on the Thread Sanitizer in your scheme (under Diagnostics) to see whether there are additional places where you have this problem.
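If locking turns out to be necessary, one possible shape is a small lock-protected container. This is only a sketch: LockedBox is a hypothetical helper, not something from the original code, and it uses NSLock from Foundation.

```swift
import Foundation

// Hypothetical helper (assumption): guards a value with a lock so that
// reads and writes from different threads cannot overlap.
final class LockedBox<Value>: @unchecked Sendable {
    private let lock = NSLock()
    private var value: Value

    init(_ value: Value) {
        self.value = value
    }

    // Returns a copy of the current value while holding the lock.
    func get() -> Value {
        lock.lock()
        defer { lock.unlock() }
        return value
    }

    // Performs a read-modify-write as one critical section.
    func mutate(_ body: (inout Value) -> Void) {
        lock.lock()
        defer { lock.unlock() }
        body(&value)
    }
}
```

A property such as products could then be stored as a LockedBox and updated with mutate { $0 = newValue }, so a reader never sees the array while it is being replaced.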

huangapple
  • Posted on June 29, 2023 at 04:48:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76576612.html