adjoe Engineers’ Blog

How to Debug iOS Crashes with MetricKit

Software crashes are the bane of users and developers alike. They disrupt workflows, cause frustration, and reflect poorly on the product and the development team. At their worst, crashes result in significant data loss and instant user dissatisfaction.

As engineers, it’s our responsibility to minimize these occurrences and respond effectively when they do happen. A key part of this is gathering comprehensive crash reports, ideally enriched with as many insights into the root cause as possible. This enables us to build metrics about our code’s reliability and debug nasty crashes quickly.

At adjoe, engineering practices prioritize zero fatal crashes and robust systems that our users and clients can rely on. We implemented MetricKit crash reporting in our new product from day one, allowing us to identify and resolve potential issues promptly.

In an existing product (adjoe Ads), we successfully employed MetricKit to uncover and swiftly rectify a critical crash. Without this approach, pinpointing the root cause of the issue would have been challenging and time-consuming. By leveraging MetricKit together with Sentry, we were able to deliver a seamless and dependable product to our users.

We’re excited to share our findings with the community in the hope that it will help other developers improve their monitoring practices and reduce the incidence of fatal crashes in iOS products.

To help you get started, we open-sourced a sample project that implements the entire backend part explained below.

Sentry SDK: The First Approach

A natural choice for many is Sentry and its comprehensive SDKs available for many platforms and programming languages. However, when developing a third-party SDK, you can’t simply use Sentry’s iOS SDK for crash reporting. Here’s why:

  • Sentry’s iOS SDK uses a single global instance
  • It can’t be initialized twice with different configurations
  • This feature isn’t planned for the future
  • Using it in your SDK would conflict with apps that want to use Sentry themselves

One of the key axioms of SDK development is keeping the maintenance burden for its users to a minimum. So, in this blog post we’re going to look at the peculiarities of Apple’s iOS platform, specifically from the standpoint of developing third-party SDKs and how we can still diagnose crashes.

Developing a third-party SDK is distinctly different from developing an actual app that’s hosted on Apple’s App Store. A lot of Apple’s convenience functionality is not available to third-party SDK developers. An SDK also must make sure not to get in the way of the apps which are supposed to integrate it.

How Do You Even Do Crash Reporting?

So, we could just build our own crash reporter… Except Apple doesn’t exactly encourage you to do this. An Apple engineer gives many valid (technical and legal) reasons as to why that is and also offers some alternatives:

  • App Store apps can interactively access Apple crash reports through the Xcode organizer.
  • TestFlight reports crashes as they happen.

But none of these options are available to third-party SDK developers who are not hooked up to the App Store. Luckily, another option is given.

MetricKit to the Rescue

The MetricKit framework was introduced with iOS 13 in 2019. With iOS 14, released in 2020, it gained the capability to receive crash reports of your app inside the code. This crucially extends to third-party SDKs, allowing them to receive crash reports from integrating apps.

Subscribing to crash reports in Swift is simple enough:

import MetricKit

class AppMetrics: NSObject, MXMetricManagerSubscriber {
   func receiveReports() {
       let shared = MXMetricManager.shared
       shared.add(self)

       // Immediately receive crash reports generated since
       // the last allocation of the shared manager instance
       didReceive(shared.pastDiagnosticPayloads)
   }

   func pauseReports() {
       let shared = MXMetricManager.shared
       shared.remove(self)
   }

   // Receive daily metrics.
   func didReceive(_ payloads: [MXMetricPayload]) {
       // Process metrics.
   }

   // Receive diagnostics immediately when available.
   func didReceive(_ payloads: [MXDiagnosticPayload]) {
       payloads.forEach { payload in
           payload.crashDiagnostics?.forEach { crashDiagnostic in
               // Process crashDiagnostic.
           }
       }
   }
}

Each crashDiagnostic we receive in the innermost closure is an instance of the type MXCrashDiagnostic. We’re most interested in the object’s callStackTree property that gives us an instance of MXCallStackTree. It exports only a single method, jsonRepresentation(), which returns the call stack tree as JSON-encoded data.

If you’re already using the Sentry iOS SDK in your app, it offers a configuration to receive MetricKit diagnostics without implementing everything we’ll cover next. 

What is Going On Inside a Call Stack Tree?

An entire crash diagnostic is too large and comprehensive to gain any meaningful insights from. So let’s break it down a bit and only look at the relevant parts. Let’s start with a broad overview:

{
 "version" : "1.0.0",
 "callStackTree" : {
   "callStacks" : [
     {
       "threadAttributed" : true,
       "callStackRootFrames" : [ ... ]
     },
     {
       "threadAttributed" : false,
       "callStackRootFrames" : [ ... ]
     },
     {
       "threadAttributed" : false,
       "callStackRootFrames" : [ ... ]
     }
     ...
   ],
   "callStackPerThread" : true
 },
 "diagnosticMetaData" : {
   "platformArchitecture" : "arm64e",
   "exceptionType" : 6,
   "appBuildVersion" : "1",
   "isTestFlightApp" : false,
   "osVersion" : "iPhone OS 18.2 (22C152)",
   "bundleIdentifier" : "io.adjoe.MonetizeTestApp",
   "deviceType" : "iPhone15,4",
   "exceptionCode" : 1,
   "signal" : 5,
   "regionFormat" : "DE",
   "appVersion" : "1.0",
   "pid" : 1374,
   "lowPowerModeEnabled" : false
 }
}

In the broadest sense, a crash diagnostic consists only of the call stack tree, some metadata about the device on which the crash happened, and some metadata about the type of crash itself.

Pay special attention to the callStackPerThread attribute. This flag indicates the nesting of call stacks. In this blog post we’ll only consider MXCrashDiagnostic instances, for which callStackPerThread is always true. In this case, each call stack belongs entirely to one individual thread of the crashed process. If you’re also interested in MXCPUExceptionDiagnostic, for which this flag is always false, you’ll have to unnest the call stacks first. Refer to the Sentry iOS SDK implementation.

With that, let’s look at a redacted call stack:

{
 "threadAttributed": false,
 "callStackRootFrames": [
   {
     "binaryName": "MonetizeSDK",
     "binaryUUID": "ABDFD978-3F87-3B2E-924F-BAE53870A930",
     "address": 4381080288,
     "offsetIntoBinaryTextSegment": 309984,
     "sampleCount": 1,
     "subFrames": [
       {
         "binaryName": "MonetizeTestApp.debug.dylib",
         "binaryUUID": "73481986-EDBB-3D93-B0EC-816674256E3E",
         "address": 4378579800,
         "offsetIntoBinaryTextSegment": 70488,
         "sampleCount": 1,
         "subFrames": [
           {
             "binaryName": "MonetizeTestApp.debug.dylib",
             "binaryUUID": "73481986-EDBB-3D93-B0EC-816674256E3E",
             "address": 4378600324,
             "offsetIntoBinaryTextSegment": 91012,
             "sampleCount": 1,
             "subFrames": [ ... ]
           }
         ]
       }
     ]
   }
 ]
}

Note the ordering: The most recent call frame appears at the top of the JSON document, with the oldest call frame nested deepest inside subFrames.

threadAttributed states if this thread was found to be causing the crash. We’ve found cases where no thread was attributed to the crash – in that case, we attribute the first thread in the call stack tree. 
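The fallback described above could be sketched in Go roughly as follows. This is a minimal illustration, not the code from our sample project: the function name crashedThreadIndex is hypothetical, and MXCallStack is reduced to the one field the sketch needs.

```go
package main

// MXCallStack mirrors only the field of a MetricKit call stack that this
// sketch needs; the frames are left out to keep it short.
type MXCallStack struct {
	ThreadAttributed bool `json:"threadAttributed"`
}

// crashedThreadIndex (hypothetical helper) returns the index of the first
// call stack that MetricKit attributed to the crash. If no call stack is
// attributed, it falls back to the first thread (index 0). It returns -1
// when there are no call stacks at all.
func crashedThreadIndex(stacks []MXCallStack) int {
	if len(stacks) == 0 {
		return -1
	}
	for i, stack := range stacks {
		if stack.ThreadAttributed {
			return i
		}
	}
	// No thread was attributed: assume the first thread crashed.
	return 0
}
```

Returning an index instead of the call stack itself keeps the position of the thread available, which is convenient when a thread ID has to be assigned later on.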

A short overview of the remaining attributes:

  • binaryName: The name of the app or framework this frame belongs to
  • binaryUUID: The build UUID of the associated app or framework
  • address: The memory address of the stack frame
  • offsetIntoBinaryTextSegment: The offset of the stack frame into the text segment of the associated app or framework

For our purposes, we can ignore the sampleCount for now.

Symbolication: Finally Making Sense of It All

As you can see, function names are missing from the call stack because this information is stripped from compiled apps. To translate addresses back to function names, we need to perform symbolication. In Xcode, crashes are automatically symbolicated for us. But as SDK developers we do not have this luxury.

First, build the debug symbol file (*.dSYM) for your SDK. These are built for release builds by default but are typically omitted in debug builds. Check your Xcode project’s build settings for the DEBUG_INFORMATION_FORMAT setting and ensure the builds you want to be able to symbolicate are set to DWARF with dSYM File.

To upload these files to Sentry, use the handy Sentry CLI tool. Make sure the Sentry CLI is authenticated against your Sentry instance. Finally, if you’re building your SDK directly from within Xcode, run the following command to upload the dSYM file to Sentry:

$ sentry-cli debug-files upload \
   --include-sources \
   --org <sentry-org> \
   --project <sentry-project-slug> \
   ~/Library/Developer/Xcode/DerivedData/<your-sdk-name>

If you’re building with a build script to a different destination, substitute the path accordingly.

Refer to Sentry’s documentation for other ways to upload the dSYM files which are more suitable for an automated CI/CD pipeline.

In case you’re interested in manually symbolicating your call stack tree, refer to these resources. If you’re not on a Mac or BSD OS and as such cannot use atos, try your luck with llvm-symbolizer.

Keeping Things Private

One question that might arise is: “What happens to stack frames for which we don’t have dSYM files, like those from apps integrating our SDK?” The answer is as simple as it is unsatisfying: Nothing. They will appear unsymbolicated in Sentry’s stack trace with only the binary addresses displayed.

If privacy is a concern for you, as it is for us, this is beneficial. Since you’ll receive crash reports even when your SDK isn’t part of the call stack, you can choose to ignore those reports. Your customers can be reassured that you will not be gaining any additional insights about their app’s inner workings even if you collect their crash reports.

Off to the Backend: Sending Events to Sentry

We’re all set to send the crash report to Sentry now. Unfortunately, Sentry won’t be able to process the raw JSON document of the MXCrashDiagnostic – some preprocessing is required.

Due to the way our infrastructure is set up at adjoe, the Sentry server instance is only available from our private network. So we need some kind of proxy server between the clients and our Sentry server instance anyway. Let’s build a simpler version of this proxy to demonstrate processing and sending the crash reports to Sentry.

At adjoe, we use Go for most of our backend code. So this is what we’ll be using here.

Parse the Request

The data model required to parse the MXCrashDiagnostic request is simple enough in Go:

type IOSMXCrashDiagnosticRequest struct {
   CallStackTree      MXCallStackTree    `json:"callStackTree"`
   DiagnosticMetaData DiagnosticMetaData `json:"diagnosticMetaData"`
}

type MXCallStackTree struct {
   CallStacks         []MXCallStack `json:"callStacks"`
   CallStackPerThread bool          `json:"callStackPerThread"`
}

type MXCallStack struct {
   CallStackRootFrames []MXCallStackFrame `json:"callStackRootFrames"`
   ThreadAttributed    bool               `json:"threadAttributed"`
}

type MXCallStackFrame struct {
   BinaryUUID                  string             `json:"binaryUUID"`
   BinaryName                  string             `json:"binaryName"`
   SubFrames                   []MXCallStackFrame `json:"subFrames"`
   Address                     int64              `json:"address"`
   OffsetIntoBinaryTextSegment int64              `json:"offsetIntoBinaryTextSegment"`
}

type DiagnosticMetaData struct {
   ExceptionType int `json:"exceptionType"`
   ExceptionCode int `json:"exceptionCode"`
   Signal        int `json:"signal"`
}
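To illustrate how such a request might be decoded, here is a minimal sketch using Go’s standard encoding/json package. The types are repeated from above so that the block stands on its own; the function parseCrashDiagnostic and the two error values are hypothetical names, and the validation rules are assumptions on our part rather than something MetricKit mandates.

```go
package main

import (
	"encoding/json"
	"errors"
)

type IOSMXCrashDiagnosticRequest struct {
	CallStackTree      MXCallStackTree    `json:"callStackTree"`
	DiagnosticMetaData DiagnosticMetaData `json:"diagnosticMetaData"`
}

type MXCallStackTree struct {
	CallStacks         []MXCallStack `json:"callStacks"`
	CallStackPerThread bool          `json:"callStackPerThread"`
}

type MXCallStack struct {
	CallStackRootFrames []MXCallStackFrame `json:"callStackRootFrames"`
	ThreadAttributed    bool               `json:"threadAttributed"`
}

type MXCallStackFrame struct {
	BinaryUUID                  string             `json:"binaryUUID"`
	BinaryName                  string             `json:"binaryName"`
	SubFrames                   []MXCallStackFrame `json:"subFrames"`
	Address                     int64              `json:"address"`
	OffsetIntoBinaryTextSegment int64              `json:"offsetIntoBinaryTextSegment"`
}

type DiagnosticMetaData struct {
	ExceptionType int `json:"exceptionType"`
	ExceptionCode int `json:"exceptionCode"`
	Signal        int `json:"signal"`
}

// Hypothetical error values for payloads this sketch does not handle.
var (
	errNotPerThread = errors.New("call stacks are not per thread")
	errNoCallStacks = errors.New("crash diagnostic contains no call stacks")
)

// parseCrashDiagnostic (hypothetical helper) decodes the raw JSON of a crash
// diagnostic and rejects payloads that the later processing steps cannot use.
func parseCrashDiagnostic(body []byte) (*IOSMXCrashDiagnosticRequest, error) {
	var req IOSMXCrashDiagnosticRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, err
	}
	// Only per-thread call stacks are considered in this post.
	if !req.CallStackTree.CallStackPerThread {
		return nil, errNotPerThread
	}
	if len(req.CallStackTree.CallStacks) == 0 {
		return nil, errNoCallStacks
	}
	return &req, nil
}
```

Fields of the JSON document that have no counterpart in the structs, such as sampleCount, are silently ignored by json.Unmarshal, so the data model can stay as small as shown.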

Sentry Envelopes

Unfortunately, the official Go Sentry SDK is missing some key data types to handle the Sentry requests for us. There is an open issue on GitHub to address this. For now, we have to handle Sentry requests ourselves.

Let us first explore Sentry’s data model – Envelopes – a bit. It is actually very similar to JSON with specific restrictions and uses a proprietary content-type application/x-sentry-envelope.

An Envelope consists of:

  • a header that encodes metadata of the entire event
  • an arbitrary number of items, each consisting of:
    • another header for metadata of the item
    • a payload which encodes the contents of the item

Envelope Header

The envelope’s header simply contains a randomly chosen UUID as event ID, authentication info and a timestamp.

Event Header

Each event’s header simply encodes the type of the following item (for our purposes, this will always be event) and its content-type (always application/json). Additionally, it optionally encodes the length in bytes of the following payload. We have found that it is a good practice to always set this value. Omitting it can sometimes lead to lost events.

Payloads

We will only be looking at a specific group of item types in this blog post, which all share the same headers: Events. Later, we will explore different types of payloads and what information they encode.

Each part of an Envelope is a valid JSON document that may not include whitespace or line breaks between attributes. All parts of the Envelope (i.e. the JSON documents) must be separated from each other by a line break. Sentry’s documentation provides this formal grammar for Envelopes:

Envelope = Headers { "\n" Item } [ "\n" ] ;
Item = Headers "\n" Payload ;
Payload = { * } ;

(Note that curly braces {} denote an arbitrary number of repetitions of their enclosed content. Square brackets [] denote an optional part.)

What may not be immediately apparent from this, but is a vital detail for our implementation, is that a payload can contain another payload.

A very simple example of such an Envelope (adapted from the Sentry docs) would be:

{"event_id":"9ec79c33ec9942ab8353589fcb2e04dc","dsn":"https://e12d836b15bb49d7bbf99e64295d995b:@sentry.io/42","sent_at":"2024-12-20T13:28:46Z"}
{"type":"event","length":41,"content_type":"application/json"}
{"message":"hello world","level":"error"}
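An envelope like the one above could be assembled in Go along these lines. This is a minimal sketch for a single event item: the type and function names (envelopeHeader, itemHeader, buildEnvelope) are hypothetical, and the payload is assumed to be compact JSON already.

```go
package main

import (
	"bytes"
	"encoding/json"
)

// envelopeHeader holds the metadata of the entire envelope.
type envelopeHeader struct {
	EventID string `json:"event_id"`
	DSN     string `json:"dsn"`
	SentAt  string `json:"sent_at"`
}

// itemHeader describes the payload that follows it.
type itemHeader struct {
	Type        string `json:"type"`
	Length      int    `json:"length"`
	ContentType string `json:"content_type"`
}

// buildEnvelope (hypothetical helper) joins the envelope header, one item
// header and the payload with line breaks. The payload must already be
// compact JSON; its length in bytes is always written to the item header.
func buildEnvelope(header envelopeHeader, payload []byte) ([]byte, error) {
	head, err := json.Marshal(header)
	if err != nil {
		return nil, err
	}
	item, err := json.Marshal(itemHeader{
		Type:        "event",
		Length:      len(payload),
		ContentType: "application/json",
	})
	if err != nil {
		return nil, err
	}
	return bytes.Join([][]byte{head, item, payload}, []byte("\n")), nil
}
```

Since json.Marshal emits compact JSON without line breaks, the two headers satisfy the formatting restrictions without further work. Called with the header values and the payload from the example, the function reproduces the three lines shown above.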

For our intents and purposes, we will need the exception, threads, and debug meta payloads of an event.

Here’s the final Go data model that encodes the crash information in just the way Sentry would expect it:

type SentryEvent struct {
   Platform string `json:"platform"`
   Level    string `json:"level"`

   Exception Exception `json:"exception"`
   Threads   Threads   `json:"threads"`
   DebugMeta DebugMeta `json:"debug_meta"`

   Timestamp int `json:"timestamp"`
}

type DebugMeta struct {
   Images []Image `json:"images"`
}

type Image struct {
   DebugID   string `json:"debug_id"`
   Type      string `json:"type"`
   ImageAddr Hex    `json:"image_addr"`
}

type Stacktrace struct {
   Frames []Frame `json:"frames"`
}

type Frame struct {
   Package         string `json:"package"`
   ImageAddr       Hex    `json:"image_addr"`
   InstructionAddr Hex    `json:"instruction_addr"`
   InApp           bool   `json:"in_app"`
}

type Exception struct {
   Values []Values `json:"values"`
}

type Values struct {
   Type       string     `json:"type"`
   Value      string     `json:"value"`
   Stacktrace Stacktrace `json:"stacktrace"`
   Mechanism  Mechanism  `json:"mechanism"`
   ThreadID   int        `json:"thread_id"`
}

type Threads struct {
   Values []ThreadValue `json:"values"`
}

type ThreadValue struct {
   Stacktrace Stacktrace `json:"stacktrace"`
   ID         int        `json:"id"`
   Crashed    bool       `json:"crashed"`
}

type Mechanism struct {
   Type    string `json:"type"`
   Meta    Meta   `json:"meta"`
   Handled bool   `json:"handled"`
}

type Meta struct {
   Signal        Signal        `json:"signal"`
   MachException MachException `json:"mach_exception"`
}

type Signal struct {
   Number int `json:"number"`
}

type MachException struct {
   Code      int   `json:"code"`
   SubCode   int64 `json:"subcode"`
   Exception int   `json:"exception"`
}

The Hex data type is a utility type which encodes integers into Sentry’s expected hexadecimal representation – left-padded with zeros to 16 characters:

type Hex int64

func (h Hex) String() string {
   return fmt.Sprintf("0x%016x", int64(h))
}

func (h Hex) MarshalJSON() ([]byte, error) {
   return json.Marshal(h.String())
}

Note that the above data model is abridged and omits a lot of additional information that you may add to the payloads. Consult Sentry’s documentation for all possible options.

In order to transform the MXCrashDiagnostic into a SentryEvent, we need to unnest the recursive structure of the MXCallStackFrame into a flat array (remember the general structure of the call stacks from the call stack tree section). Let’s introduce an intermediate representation to decouple this interim stage from the request and the event.

type SentryCrashStackTree struct {
   Threads       []Thread
   CrashedThread Thread
}

type Thread struct {
   StackFrames []StackFrame
   ID          int
}

type StackFrame struct {
   Binary Binary

   IOSAddress         int64
   SentryImageAddress int64

   InApp bool
}

type Binary struct {
   UUID string
   Name string
}

Now we can start unnesting the call stack tree into an array. A recursive function makes this especially easy: Sentry expects the frames of a stack trace ordered from oldest to most recent, so the most deeply nested (oldest) frame has to become the first element of the array.

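// unnestCallStack flattens the nested frames of one thread into a slice
// ordered from the oldest frame to the most recent one.
func unnestCallStack(callStack []MXCallStackFrame) []StackFrame {
   if len(callStack) == 0 {
       return nil
   }

   // Only the first frame of each level is followed; per-thread call
   // stacks are assumed to be linear.
   frame := callStack[0]

   // Recurse first so that deeper-nested (older) frames end up in front.
   ret := unnestCallStack(frame.SubFrames)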
   ret = append(ret,
       StackFrame{
           Binary: Binary{
               UUID: frame.BinaryUUID,
               Name: frame.BinaryName,
           },
           IOSAddress:         frame.Address,
           SentryImageAddress: frame.Address - frame.OffsetIntoBinaryTextSegment,
           InApp:              strings.Contains(frame.BinaryName, "MonetizeSDK"),
       },
   )

   return ret
}

I want to highlight the most important line in the above snippet:

SentryImageAddress: frame.Address - frame.OffsetIntoBinaryTextSegment,

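Subtracting the frame’s offset into the binary’s text segment from its absolute memory address yields the address at which the binary image was loaded. Sentry needs this image address to map each instruction address to the right position in the matching dSYM file and symbolicate the call stack tree correctly.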
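As a worked example, the following sketch applies this subtraction to the frames of the redacted call stack shown earlier. The helper names imageAddr and formatAddr are hypothetical; formatAddr uses the same format string as the Hex type.

```go
package main

import "fmt"

// imageAddr (hypothetical helper) returns the load address of a binary
// image, given the absolute address of one of its frames and that frame's
// offset into the binary's text segment.
func imageAddr(address, offsetIntoBinaryTextSegment int64) int64 {
	return address - offsetIntoBinaryTextSegment
}

// formatAddr (hypothetical helper) renders an address as a zero-padded
// hexadecimal string, using the same format as the Hex type.
func formatAddr(addr int64) string {
	return fmt.Sprintf("0x%016x", addr)
}
```

For the MonetizeSDK frame, imageAddr(4381080288, 309984) gives 4380770304, which formatAddr renders as 0x00000001051d4000. Both MonetizeTestApp.debug.dylib frames (4378579800 - 70488 and 4378600324 - 91012) resolve to the same image address, 0x0000000104fac000, which is what we would expect for two frames of the same binary.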

With this, we can then build a set of all binary frameworks and apps used in the call stack tree. Sentry will use this set to try to find matching uploaded dSYM files:

func (t *SentryCrashStackTree) Images() []Image {
   images := make(map[Image]struct{})
   for _, thread := range t.Threads {
       for _, frame := range thread.StackFrames {
           image := Image{
               DebugID:   frame.Binary.UUID,
               Type:      "macho",
               ImageAddr: Hex(frame.SentryImageAddress),
           }
           images[image] = struct{}{}
       }
   }

   ret := make([]Image, 0, len(images))
   for image := range images {
       ret = append(ret, image)
   }

   return ret
}

For brevity, we omit the boilerplate code which stitches together the remaining pieces. Refer to our example project if you want to see the code in its entirety. Mind you, it only demonstrates the data processing and is not production ready. We omitted handling most of the edge cases and some specific optimizations to make it more easily digestible for the reader. Sentry also allows you to set many, many additional fields and metadata for crashes. Refer to the Sentry documentation for additional event payload types.

Ultimately, this is what our formatted request body to Sentry’s envelope endpoint https://<dsn-token>@<domain>/api/<project-id>/envelope looks like:

{
   "event_id": "facac8b7e05642d081753180cd7c349d",
   "dsn": "https://<dsn-token>@<domain>/<project-id>",
   "sent_at": "2024-12-20T13:28:46Z"
}\n
{
   "type": "event",
   "content_type": "application/json",
   "length": 1417
}\n
{
   "platform": "cocoa",
   "level": "fatal",
   "exception": {
       "values": [
           {
               "type": "MXCrashDiagnostic",
               "value": "MetricKit \u003e MXDiagnostic \u003e Crash in SDK",
               "stacktrace": {
                   "frames": [
                       {
                           "package": "MonetizeTestApp.debug.dylib",
                           "image_addr": "0x0000000100530000",
                           "instruction_addr": "0x000000010053cd30",
                           "in_app": false
                       },
                       {
                           "package": "MonetizeTestApp.debug.dylib",
                           "image_addr": "0x0000000100530000",
                           "instruction_addr": "0x00000001005478fc",
                           "in_app": false
                       },
                       {
                           "package": "MonetizeSDK",
                           "image_addr": "0x0000000101024000",
                           "instruction_addr": "0x0000000101096cc0",
                           "in_app": true
                       }
                   ]
               },
               "mechanism": {
                   "type": "MXCrashDiagnostic",
                   "meta": {
                       "signal": {
                           "number": 5
                       },
                       "mach_exception": {
                           "code": 1,
                           "subcode": 4300459312,
                           "exception": 6
                       }
                   },
                   "handled": false
               },
               "thread_id": 0
           }
       ]
   },
   "threads": {
       "values": [
           {
               "stacktrace": {
                   "frames": [
                       {
                           "package": "MonetizeTestApp.debug.dylib",
                           "image_addr": "0x0000000100530000",
                           "instruction_addr": "0x000000010053cd30",
                           "in_app": false
                       },
                       {
                           "package": "MonetizeTestApp.debug.dylib",
                           "image_addr": "0x0000000100530000",
                           "instruction_addr": "0x00000001005478fc",
                           "in_app": false
                       },
                       {
                           "package": "MonetizeSDK",
                           "image_addr": "0x0000000101024000",
                           "instruction_addr": "0x0000000101096cc0",
                           "in_app": true
                       }
                   ]
               },
               "id": 0,
               "crashed": true
           }
       ]
   },
   "debug_meta": {
       "images": [
           {
               "debug_id": "63A3C628-8302-3205-A999-3E6EEE3FE082",
               "type": "macho",
               "image_addr": "0x0000000100530000"
           },
           {
               "debug_id": "F02747BD-F96F-3985-9A9A-DFE1E839C047",
               "type": "macho",
               "image_addr": "0x0000000101024000"
           }
       ]
   },
   "timestamp": 1734701326
}

Remember that the actual request body must remove all whitespace and line breaks not explicitly marked with \n.
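Sending the finished envelope could then look roughly like the following sketch, which only prepares the HTTP request. The function name newEnvelopeRequest is hypothetical, and the endpoint is passed in as a parameter because it depends on your own Sentry setup.

```go
package main

import (
	"bytes"
	"net/http"
)

// newEnvelopeRequest (hypothetical helper) prepares a POST request that
// carries a finished envelope. endpoint is a placeholder for the envelope
// URL of your own Sentry instance.
func newEnvelopeRequest(endpoint string, envelope []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(envelope))
	if err != nil {
		return nil, err
	}
	// Envelopes use their own content type instead of application/json.
	req.Header.Set("Content-Type", "application/x-sentry-envelope")
	return req, nil
}
```

The returned request can be handed to http.DefaultClient.Do or to a client with a timeout configured. Checking the response status code is left out here but is advisable before treating the event as delivered.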

When everything is successful, the crash event should now be available on your Sentry instance.

The Complete MetricKit + Sentry Workflow

When implemented correctly, this approach creates a robust crash reporting system for your iOS SDK that doesn’t interfere with app developers’ own crash reporting solutions. Here’s a summary of the workflow:

  • Subscribe to MetricKit crash diagnostics in your SDK
  • Process the raw diagnostic data into Sentry’s expected format
  • Upload your dSYM files to Sentry for symbolication
  • Send the formatted crash reports to Sentry via its envelope endpoint
  • Analyze the resulting crash reports in your Sentry dashboard

Now this approach is inseparable from our commitment to zero fatal crashes. At the same time, it respects the needs and privacy of our clients’ applications. 

We hope you find it equally useful in your own iOS SDK development journey, and if you want to learn even more, keep exploring our Tech Blog.
