
Follow up question about interfacing with the client from the service #5

Open
stevebegin opened this issue Feb 14, 2024 · 3 comments

@stevebegin

Thanks so much for your previous answer and example code, it helped me a ton and I now have a good prototype for the service I'm aiming for.

However, I have run into a major issue.

Here's the setup. My service needs to send some data to the client really fast. I do this using the sendOnewayMessage method on the XPCConnection created from an XPCEndpoint. The message that is sent is a struct conforming to Codable with a single property of type NSBitmapImageRep.

public struct Frame: Codable {
  public let bitmap: NSBitmapImageRep

  public enum CodingKeys: String, CodingKey {
    case bitmap
  }

  public init(bitmap: NSBitmapImageRep) {
    self.bitmap = bitmap
  }

  public init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let data = try container.decode(Data.self, forKey: CodingKeys.bitmap)
    guard let bitmap = NSBitmapImageRep(data: data) else {
      throw FrameGraberError.decodingFailed
    }

    self.bitmap = bitmap
  }

  public func encode(to encoder: Encoder) throws {
    var container = encoder.container(keyedBy: CodingKeys.self)

    guard let data = bitmap.tiffRepresentation else {
      throw FrameGraberError.encodingFailed
    }

    try container.encode(data, forKey: CodingKeys.bitmap)
  }
}

In a unit test, I used a 2000 × 2000 8-bit bitmap (4 MB) to measure the performance of these two blocks of code:

This runs with a Time: 0.006 sec

let data = bitmap.tiffRepresentation!
let reBitmap = NSBitmapImageRep(data: data)

And this runs with a Time: 0.467 sec

let encoded = try! XPCEncoder().encode(frame)
let decoded = try! XPCDecoder().decode(type: Frame.self, from: encoded)

In the end, I'm not sure what my question is, but maybe you could enlighten me on what is going on in both the Decoder and the Encoder and why they are this slow.

@CharlesJS
Owner

XPCEncoder and XPCDecoder mostly just wrap the libxpc functions for encapsulating data to send over the wire. The numbers you posted do seem somewhat concerning. How complex is your Frame type? Do you have a test case that reproduces this? Looking through a sample may provide some insight; if the time taken is coming mostly from Apple's XPC functions, there may not be a lot we can do about it, but if it turns out it's coming from somewhere in this package's code, who knows, there may be something we can optimize.

I'm also curious how long it takes if you encode the output from bitmap.tiffRepresentation directly. Is it still slow?

@stevebegin
Author

Hi Charles,
The Frame type is just a wrapper around NSBitmapImageRep to provide Codable conformance and so it is really simple.

public struct Frame: Codable {
  public let bitmap: NSBitmapImageRep

  public enum CodingKeys: String, CodingKey {
    case bitmap
  }

  public init(bitmap: NSBitmapImageRep) {
    self.bitmap = bitmap
  }

  public init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let data = try container.decode(Data.self, forKey: CodingKeys.bitmap)
    guard let bitmap = NSBitmapImageRep(data: data) else {
      throw FrameGraberError.decodingFailed
    }

    self.bitmap = bitmap
  }

  public func encode(to encoder: Encoder) throws {
    var container = encoder.container(keyedBy: CodingKeys.self)

    guard let data = bitmap.tiffRepresentation else {
      throw FrameGraberError.encodingFailed
    }

    try container.encode(data, forKey: CodingKeys.bitmap)
  }
}

I tried encoding/decoding the data from bitmap.tiffRepresentation directly and it runs in 0.462 sec on my machine. Here's the test case:

    func testPerformanceXPCEncoderDecoder() throws {
        
        let bitmap = NSBitmapImageRep(
            bitmapDataPlanes: nil,
            pixelsWide: 2000,
            pixelsHigh: 2000,
            bitsPerSample: 8,
            samplesPerPixel: 1,
            hasAlpha: false,
            isPlanar: false,
            colorSpaceName: .deviceWhite,
            bytesPerRow: 0,
            bitsPerPixel: 0
        )!
                
        let encoder = XPCEncoder()
        let decoder = XPCDecoder()
        
        let data = bitmap.tiffRepresentation
//        let reBitmap = NSBitmapImageRep(data: data!)
        
        self.measure {
            let encoded = try! encoder.encode(data)
            let decoded = try! decoder.decode(type: Data.self, from: encoded)
        }
    }

Thanks for taking the time to look into this. I would really prefer to keep using your package over vanilla XPC, but this bottleneck is significant enough that it's impossible right now.

@CharlesJS
Owner

CharlesJS commented Feb 17, 2024

Hi Steve,

The trouble with using self.measure for benchmarking is that it can only give you a rough idea of relative performance. Between the overhead imposed by Xcode, lldb, and the testing framework, and the fact that Xcode's tests build in debug mode by default (Swift is particularly sensitive to release-mode optimizations), it doesn't really show you how your code will perform in practice.

I get similar results to you with self.measure. Rolling my own measure function, and building as a command line tool, behaves differently:

import SwiftyXPC
import Cocoa

func measure(_ name: String, closure: () -> Void) {
    let startTime = Date()
    closure()
    print("\(name) took: \(Date().timeIntervalSince(startTime))")
}

let bitmap = NSBitmapImageRep(
    bitmapDataPlanes: nil,
    pixelsWide: 2000,
    pixelsHigh: 2000,
    bitsPerSample: 8,
    samplesPerPixel: 1,
    hasAlpha: false,
    isPlanar: false,
    colorSpaceName: .deviceWhite,
    bytesPerRow: 0,
    bitsPerPixel: 0
)!

let encoder = XPCEncoder()
let decoder = XPCDecoder()

let data = bitmap.tiffRepresentation

measure("encoding and decoding") {
    let encoded = try! encoder.encode(data)
    let decoded = try! decoder.decode(type: Data.self, from: encoded)
    withExtendedLifetime(decoded) {} // to silence some warnings
}

Running this in both debug and release mode, I get:

$ swift build; .build/debug/perftest             
Building for debugging...
Build complete! (0.20s)
encoding and decoding took: 0.4505230188369751

$ swift build -c release; .build/release/perftest
Building for production...
Build complete! (0.20s)
encoding and decoding took: 0.0896080732345581

You can see the difference that release mode makes.

We can break it down by encode and decode:

var encoded: xpc_object_t!

measure("encoding") {
    encoded = try! encoder.encode(data)
}

measure("decoding") {
    let decoded = try! decoder.decode(type: Data.self, from: encoded)
    withExtendedLifetime(decoded) {}
}

This gets us:

$ swift build -c release; .build/release/perftest
Building for production...
Build complete! (0.08s)
encoding took: 0.03677201271057129
decoding took: 0.053681015968322754

That's still a bit slower than vanilla XPC. Profiling reveals the reason: since Codable has no inherent concept of a data-blob type (probably because its most common use case is encoding to JSON), it sees that Data is a collection of UInt8 and encodes it as an array of numbers by cycling through the collection, which in your case means looping through about 4 million single-byte appends.
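To illustrate the cost involved (this is just a hypothetical illustration of per-element versus bulk buffer construction, not SwiftyXPC's actual code path):

```swift
import Foundation

// Hypothetical illustration, not SwiftyXPC's code: the cost of building
// a 4 MB buffer one element at a time, the way a generic Codable path
// sees Data, versus copying it in one shot.
let blob = Data(count: 4_000_000)

func time(_ name: String, _ body: () -> Void) {
    let start = Date()
    body()
    print("\(name): \(Date().timeIntervalSince(start))s")
}

time("per-byte loop") {
    var out: [UInt8] = []
    for byte in blob { out.append(byte) }  // one append per byte
    withExtendedLifetime(out) {}
}

time("bulk copy") {
    let out = [UInt8](blob)  // single bulk-copy initialization
    withExtendedLifetime(out) {}
}
```

On my understanding, the real encoder also pays per-element Codable dispatch on top of the raw append cost, so the gap in practice is even larger than this sketch suggests.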

Fortunately, UnkeyedEncodingContainer does have some sequence-based encoding methods; unfortunately, UnkeyedDecodingContainer doesn't have that, and even if it did, it'd be up to Data's implementation to call it. We can of course initialize a Data by hand, but since Data is part of Foundation, it would require linking the library against Foundation to do it.
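For reference, the sequence-based encoding looks like this; the example uses JSONEncoder purely to show the API shape, not anything SwiftyXPC-specific:

```swift
import Foundation

// Sketch of UnkeyedEncodingContainer's sequence-based encoding: one bulk
// encode(contentsOf:) call instead of a per-element loop. JSONEncoder is
// used here only to demonstrate the API shape.
struct Blob: Encodable {
    let bytes: [UInt8]

    func encode(to encoder: Encoder) throws {
        var container = encoder.unkeyedContainer()
        try container.encode(contentsOf: bytes)  // bulk, not byte-by-byte
    }
}

let json = try! JSONEncoder().encode(Blob(bytes: [1, 2, 3]))
print(String(data: json, encoding: .utf8)!)  // prints [1,2,3]
```

There is no mirror-image bulk method on the decoding side, which is exactly the asymmetry described above.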

For the time being, I've pushed up a fix using the new runtime parametrized protocols in Swift 5.7 (taking advantage of the fact that RangeReplaceableCollection has an initializer that takes a sequence). Unfortunately, these are only available in macOS 13.0 and higher, so you'll still get the old behavior on older macOS versions for now, but if you're on 13.0 or better, performance seems pretty good:

$ swift build -c release; .build/release/perftest
Building for production...
Build complete! (0.22s)
encoding took: 0.0009380578994750977
decoding took: 0.0003399848937988281
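In case it's useful, the shape of that fix is roughly this (a sketch under my assumptions, not the actual committed code): cast the requested type to a Swift 5.7 parameterized existential and use RangeReplaceableCollection's sequence initializer to build it from the raw bytes in one shot.

```swift
// Rough sketch, not the committed code: detect byte-collection types via
// a parameterized existential and build them from the bytes in one shot.
func decodeBytes<T>(_ type: T.Type, from bytes: [UInt8]) -> T? {
    guard #available(macOS 13.0, *),
          let collectionType = type as? any RangeReplaceableCollection<UInt8>.Type
    else {
        return nil  // older runtimes fall back to the per-element path
    }
    // RangeReplaceableCollection's init(_:) takes any Sequence of Element,
    // so this is a single bulk initialization rather than a loop.
    return (collectionType.init(bytes) as! T)
}
```

The #available guard reflects the macOS 13.0 runtime requirement for casting to parameterized protocol types mentioned above.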

Let me know what you think.
