This repository has been archived by the owner on Jul 1, 2023. It is now read-only.

On macOS, simple models can trigger a segfault within X10 #993

Open
BradLarson opened this issue Jun 8, 2020 · 0 comments

Some simple image classification models can trigger a segfault when using the XLA device specifically on macOS. For now, we're explicitly having them use the eager-mode device instead until this can be fixed.
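The workaround described above amounts to choosing the execution backend per platform at compile time. The following is a minimal sketch of that pattern, not the code actually used in the models: the `Backend` enum and `preferredBackend(requestXLA:)` function are hypothetical names introduced here for illustration. In a Swift for TensorFlow program the two cases would correspond, by assumption, to `Device.defaultXLA` and `Device.defaultTFEager`.

```swift
// Hypothetical stand-in for the two execution backends discussed in this issue.
// With Swift for TensorFlow, these would presumably map to
// Device.defaultXLA and Device.defaultTFEager (an assumption, not shown here).
enum Backend: String {
    case xla
    case eager
}

// Returns the backend a model should run on.
// On macOS the XLA request is ignored to sidestep the crash in this issue.
func preferredBackend(requestXLA: Bool) -> Backend {
    #if os(macOS)
    // Workaround: always fall back to eager mode on macOS.
    return .eager
    #else
    // Other platforms honor the caller's request.
    return requestXLA ? .xla : .eager
    #endif
}

let backend = preferredBackend(requestXLA: true)
print("Using \(backend.rawValue) backend")
```

Keeping the platform check in one function means the workaround can be removed in a single place once the underlying crash is fixed.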

The crash produces a backtrace like the following:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000104be39bb libx10.dylib`xla::XrtComputationClient::XrtData::GetOpaqueHandle() + 11
    frame #1: 0x0000000102d892ce libx10.dylib`swift_xla::XLATensor::RunPostOrder(std::__1::vector<swift_xla::XLATensor, std::__1::allocator<swift_xla::XLATensor> > const&, absl::Span<unsigned long const>) + 718
    frame #2: 0x0000000102d86605 libx10.dylib`swift_xla::XLATensor::SyncTensorsGraphInternal(std::__1::vector<swift_xla::XLATensor, std::__1::allocator<swift_xla::XLATensor> >*, absl::Span<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const>, swift_xla::XLATensor::SyncTensorsConfig const&) + 181
    frame #3: 0x0000000102d87b3a libx10.dylib`swift_xla::XLATensor::SyncTensorsGraph(std::__1::vector<swift_xla::XLATensor, std::__1::allocator<swift_xla::XLATensor> >*, absl::Span<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const>, bool, bool) + 122
    frame #4: 0x0000000102d8ad46 libx10.dylib`swift_xla::XLATensor::SyncLiveTensorsGraph(swift_xla::Device const*, absl::Span<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const>, bool) + 102
    frame #5: 0x0000000102c75366 libx10.dylib`XLATensor_LazyTensorBarrier + 118
    frame #6: 0x000000010256e22a libswiftTensorFlow.dylib`closure #1 (inout __C.DeviceList) -> () in TensorFlow.LazyTensorBarrier(on: Swift.Optional<TensorFlow.Device>, devices: Swift.Array<TensorFlow.Device>, wait: Swift.Bool) -> () + 314
    frame #7: 0x000000010256d72c libswiftTensorFlow.dylib`reabstraction thunk helper from @callee_guaranteed (@inout __C.DeviceList) -> () to @escaping @callee_guaranteed (@inout __C.DeviceList) -> (@out ()) + 12
    frame #8: 0x000000010256e2a1 libswiftTensorFlow.dylib`reabstraction thunk helper from @callee_guaranteed (@inout __C.DeviceList) -> () to @escaping @callee_guaranteed (@inout __C.DeviceList) -> (@out ())partial apply forwarder with unmangled suffix ".2" + 17
    frame #9: 0x000000010256caa1 libswiftTensorFlow.dylib`closure #2 (Swift.UnsafeBufferPointer<__C.CDevice>) -> τ_1_0 in Swift.Array<τ_0_0 where τ_0_0 == TensorFlow.Device>.withDeviceList<τ_0_0>((inout __C.DeviceList) -> τ_1_0) -> τ_1_0 + 177
    frame #10: 0x000000010256ef0f libswiftTensorFlow.dylib`partial apply forwarder for closure #2 (Swift.UnsafeBufferPointer<__C.CDevice>) -> τ_1_0 in Swift.Array<τ_0_0 where τ_0_0 == TensorFlow.Device>.withDeviceList<τ_0_0>((inout __C.DeviceList) -> τ_1_0) -> τ_1_0 + 47
    frame #11: 0x0000000101323776 libswiftCore.dylib`Swift._ArrayBuffer.withUnsafeBufferPointer<τ_0_0>((Swift.UnsafeBufferPointer<τ_0_0>) throws -> τ_1_0) throws -> τ_1_0 + 230
    frame #12: 0x00000001013343a9 libswiftCore.dylib`Swift.Array.withUnsafeBufferPointer<τ_0_0>((Swift.UnsafeBufferPointer<τ_0_0>) throws -> τ_1_0) throws -> τ_1_0 + 9
    frame #13: 0x000000010256c7d6 libswiftTensorFlow.dylib`Swift.Array<τ_0_0 where τ_0_0 == TensorFlow.Device>.withDeviceList<τ_0_0>((inout __C.DeviceList) -> τ_1_0) -> τ_1_0 + 422
    frame #14: 0x000000010256e0da libswiftTensorFlow.dylib`TensorFlow.LazyTensorBarrier(on: Swift.Optional<TensorFlow.Device>, devices: Swift.Array<TensorFlow.Device>, wait: Swift.Bool) -> () + 138
    frame #15: 0x000000010054c480 LeNet-MNIST`main at main.swift:88:9 [opt]
    frame #16: 0x00007fff704657fd libdyld.dylib`start + 1
    frame #17: 0x00007fff704657fd libdyld.dylib`start + 1
@shabalind shabalind transferred this issue from tensorflow/swift-models Jun 10, 2020