How RevenueCat handles errors in Google Play’s Billing Library  

Lessons on Billing Library error handling from RevenueCat's engineering team

Cesar de la Vega

Google Play’s Billing Library is a library provided by Google that facilitates communication between Android apps and the Google Play Store. This communication channel is essential for querying product information, purchasing in-app products, and managing subscriptions directly within the app.

RevenueCat offers a solution that encapsulates the Billing Library within its broader suite of tools. By wrapping the BillingClient, RevenueCat simplifies the integration process, allowing developers to manage in-app purchases in an easier and more efficient way. 

RevenueCat enhances the functionality of the BillingClient by interfacing it with its backend systems. This integration provides developers with advanced features such as detailed analytics, subscription management, and cross-platform support. As a result, apps can offer a more personalized and user-friendly purchasing experience, while developers gain deeper insights into user behavior and revenue patterns.

The purpose of this blog post is to share our journey in addressing some of the nuanced challenges we encountered with the Google Play Billing Library: our initial approach to error handling, how we resolved the issues we ran into over the years, and our strategies for handling specific error codes and connection disruptions. By sharing our experiences and the solutions we’ve implemented, we aim to provide guidance to developers facing similar challenges.

How the BillingClient works

When an app initiates a successful connection with the BillingClient, the app can start interacting using a series of functions, like launchBillingFlow to start purchases, queryProductDetailsAsync to get product details, or queryPurchasesAsync to get current active purchases for a user.

Once the connection attempt completes, the BillingClient invokes a callback (onBillingSetupFinished) that indicates whether the connection was successful or there was an issue. Because the connection can drop at any moment, there is also a second callback (onBillingServiceDisconnected) that the billing client calls when it has disconnected.

Why Google’s suggested approach to handling errors wasn’t working

Before we landed on our current implementation, our handling of the errors we received from the BillingClient was very basic. We reconnected using an exponential backoff and checked whether the client was connected before interacting with it (and, if not, we reconnected). We did not try to reconnect when a disconnection happened, only when we tried to interact with the service and found it disconnected.

At the time we implemented the BillingClient wrapper, the documentation was scarce and there was no clear guidance on how to handle errors and disconnections. Google provides a sample implementation that we’ve used in the past as a guide on how to correctly implement the Billing Library. But as you can see, it’s not a complete example.

We knew we needed to tackle better error handling, but what we didn’t know was that our backoff retries were completely broken. We got a report that indicated the BillingClient was retrying indefinitely.


We had added this retry-with-backoff mechanism because we had received some reports of ANRs, and Google’s suggested solution was to retry the connection with a backoff. That’s what they do in their example, but it clearly doesn’t work.

We got more reports of the same issue, and after tons of debugging and help from customers, we realized that a way to replicate it was to change the language in the device while the service was connected. After doing that, the service disconnects and won’t ever reconnect until restarting the device or clearing Google Play caches.

Another issue we had at the same time was one that only occurred on Samsung devices, which were failing with a +999 intent connections issue. We have never been able to reproduce this issue, but we think it’s probably related to the infinite reconnections.

We knew we needed to do something about this ASAP, since we kept receiving reports. It’s pretty easy for a developer to run into this while testing localization for their app. We had been thinking for a while of adding a retry mechanism for specific errors, since some of the errors are recoverable.

How RevenueCat now handles errors (and details of our implementation)

We realized Google had updated their documentation on errors and released an extensive guide on how to handle them.

Based on the information in the guide, we defined different types of error handling mechanisms:

  • Simple: Try the request again without any sophisticated handling.
  • Exponential backoff: Delay the next attempt, with increasing intervals.
  • Reconnect and retry: Attempt to re-establish a connection and then retry the request.
  • Propagate error: Send the error up to be handled by the developer or inform the customer.
  • Call and return error: Make a different function call and handle the error based on its response.

Here’s a summary of the guide:

Retryable errors

Error: Retry mechanism

  • NETWORK_ERROR: Exponential backoff or simple retry (when users are in session).
  • SERVICE_TIMEOUT (not returned in BC6): Exponential backoff or simple retry (when users are in session). In BC6 SERVICE_UNAVAILABLE is returned.
  • SERVICE_DISCONNECTED: Exponential backoff or simple retry (when users are in session). Re-establish connection using BillingClient.startConnection.
  • SERVICE_UNAVAILABLE: Exponential backoff (when users not in session). For users in session, error out.
  • BILLING_UNAVAILABLE: Manual retry after user addresses the issue.
  • ERROR: Exponential backoff or simple retry (when users are in session).
  • ITEM_ALREADY_OWNED: Simple retry logic after calling BillingClient.queryPurchasesAsync().
  • ITEM_NOT_OWNED: Simple retry logic after calling BillingClient.queryPurchasesAsync().

Non-retryable errors

Error: Retry mechanism

  • FEATURE_NOT_SUPPORTED: Check feature support using BillingClient.isFeatureSupported().
  • USER_CANCELED: (none listed)
  • ITEM_UNAVAILABLE: Refresh product details using BillingClient.queryProductDetailsAsync() and check product eligibility configuration.
  • DEVELOPER_ERROR: Ensure correct use of Play Billing Library calls and check debug message.
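
The summary above can be condensed into a small lookup. The following is a minimal Kotlin sketch of one way to read the guide, not RevenueCat's actual code: the BillingError and Handling enums and the handlingFor function are hypothetical names, and the enum entries only mirror the names of the response codes rather than using the library's own constants. Mapping the two ownership errors to "call and return error" is our interpretation of the guide.

```kotlin
// Hypothetical stand-ins that mirror the names of the Billing Library response codes.
enum class BillingError {
    NETWORK_ERROR, SERVICE_TIMEOUT, SERVICE_DISCONNECTED, SERVICE_UNAVAILABLE,
    BILLING_UNAVAILABLE, ERROR, ITEM_ALREADY_OWNED, ITEM_NOT_OWNED,
    FEATURE_NOT_SUPPORTED, USER_CANCELED, ITEM_UNAVAILABLE, DEVELOPER_ERROR
}

// The five handling mechanisms listed earlier in this post.
enum class Handling {
    SIMPLE_RETRY, EXPONENTIAL_BACKOFF, RECONNECT_AND_RETRY, PROPAGATE_ERROR, CALL_AND_RETURN_ERROR
}

// Picks a handling mechanism for an error, depending on whether the user is in session.
fun handlingFor(error: BillingError, userInSession: Boolean): Handling = when (error) {
    BillingError.NETWORK_ERROR,
    BillingError.SERVICE_TIMEOUT,
    BillingError.ERROR ->
        if (userInSession) Handling.SIMPLE_RETRY else Handling.EXPONENTIAL_BACKOFF

    BillingError.SERVICE_DISCONNECTED -> Handling.RECONNECT_AND_RETRY

    // In session the guide says to error out; otherwise back off.
    BillingError.SERVICE_UNAVAILABLE ->
        if (userInSession) Handling.PROPAGATE_ERROR else Handling.EXPONENTIAL_BACKOFF

    // Query purchases first, then decide what to return.
    BillingError.ITEM_ALREADY_OWNED,
    BillingError.ITEM_NOT_OWNED -> Handling.CALL_AND_RETURN_ERROR

    BillingError.BILLING_UNAVAILABLE,
    BillingError.FEATURE_NOT_SUPPORTED,
    BillingError.USER_CANCELED,
    BillingError.ITEM_UNAVAILABLE,
    BillingError.DEVELOPER_ERROR -> Handling.PROPAGATE_ERROR
}
```

Because the when expression is exhaustive over the enum, adding a new error name forces a compile-time decision about how to handle it.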

These are some details on the final implementation of our error handling:

BILLING_UNAVAILABLE issue

We ended up retrying connectivity if the BillingClient is not connected when trying to interact with it. We don’t reconnect when we receive a disconnection callback. The reason behind this is that we found some inconsistencies under certain circumstances where the BillingClient would keep calling the disconnected callback constantly, even if we didn’t try to start the connection again. We were trying to fix this issue and realized the results returned by Google were super inconsistent.

If you change the language, the BillingClient can’t connect again, most likely due to an issue with the caches. When the device is in this weird state, when calling startConnection, the onBillingSetupFinished gets called multiple times with either a BILLING_UNAVAILABLE error or an OK. I’ve seen it being called once per thread in the BillingClient thread pool (nine times IIRC). It will also call onBillingServiceDisconnected. The multiple calls to these listeners are occurring with just one single call to startConnection.

According to the errors documentation, we should be retrying to reconnect when onBillingServiceDisconnected gets called, with a maximum number of attempts. The main problem is that when there’s a retry mechanism in your code, this strange behavior makes it very easy to end up in a reconnection retry loop. BillingClient will call onBillingSetupFinished with an OK, which makes the code think there’s a result, then instantly call onBillingServiceDisconnected. Sometimes it would also send BILLING_UNAVAILABLE as an error code, several times.

We ended up removing the retries from onBillingServiceDisconnected because those were problematic in this weird scenario. We also removed the retries when we get a BILLING_UNAVAILABLE on onBillingSetupFinished.

If Google fixed the issue and called onBillingSetupFinished with BILLING_UNAVAILABLE just once, it would be easier to decide not to retry. Apart from that, the service is not getting disconnected in this scenario: it’s already disconnected, so it shouldn’t be calling onBillingServiceDisconnected.

To improve the error message in that case, we literally compare the message we get from Google to a static string. The message says “Google Play In-app Billing API version is less than 3”, which isn’t exactly helpful given that version 3 dates back to 2012.

Also, calling endConnection doesn’t prevent further onBillingServiceDisconnected calls. The only way to stop the service from calling that callback is quitting the app.
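
The approach described in this section, connecting on demand and ignoring the disconnection callback, can be sketched roughly as follows. This is a simplified illustration rather than RevenueCat's actual code: BillingConnection and OnDemandConnector are hypothetical names, and the interface is a stand-in for the real BillingClient so the example runs without the Android library.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical minimal view of a billing connection; not the real BillingClient API.
interface BillingConnection {
    val isReady: Boolean
    fun startConnection(onSetupFinished: (success: Boolean) -> Unit)
}

class OnDemandConnector(private val connection: BillingConnection) {
    // Exposed only so the behavior is easy to observe in this sketch.
    var connectionAttempts = 0
        private set

    // Runs the request right away if connected; otherwise makes one connection attempt first.
    fun withConnection(request: (connected: Boolean) -> Unit) {
        if (connection.isReady) {
            request(true)
            return
        }
        connectionAttempts++
        val handled = AtomicBoolean(false)
        connection.startConnection { success ->
            // The setup callback may fire several times for one call; honor only the first.
            if (!handled.getAndSet(true)) request(success)
        }
    }

    // Deliberately does not reconnect here: the next withConnection call will try again.
    fun onBillingServiceDisconnected() {
        // no-op
    }
}
```

With this shape, a burst of setup or disconnection callbacks cannot trigger a reconnection loop, because a new connection attempt only ever starts when the app actually needs the service.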

Retry with backoff

We added a retry with backoff only for NETWORK_ERROR and ERROR responses received when acknowledging or consuming purchases, and only when the acknowledgement or consumption is not triggered by a restore or a purchase, both of which involve user interaction. When getting the current active purchases at app start, we retry only with backoff.

We also retry with backoff if the app is in background and we get a SERVICE_UNAVAILABLE error.

The backoff is capped at 15 minutes; we never wait longer than that. Most app sessions don’t last that long anyway.
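
As an illustration, a capped exponential backoff could be computed like this. It is a minimal sketch: only the 15-minute cap comes from this post, while the one-second initial delay, the doubling factor, and the names are assumptions made for the example.

```kotlin
// The 15-minute cap comes from the post.
val MAX_BACKOFF_MILLIS = 15 * 60 * 1000L
// Assumption for this example; the post does not state the initial delay.
val INITIAL_BACKOFF_MILLIS = 1_000L

// Returns the delay before retry number `attempt` (0-based), doubling each time up to the cap.
fun backoffDelayMillis(attempt: Int): Long {
    require(attempt >= 0) { "attempt must not be negative" }
    // Clamp the shift so the multiplication cannot overflow for large attempt counts.
    val shift = attempt.coerceAtMost(20)
    return (INITIAL_BACKOFF_MILLIS shl shift).coerceAtMost(MAX_BACKOFF_MILLIS)
}
```

In practice you would probably add some random jitter to each delay so that many clients do not retry at the same instant.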

Simple retries

We do simple retries (a maximum of three) for NETWORK_ERROR and ERROR when getting product details, launching the billing flow, and restoring. We do the same for consumption and acknowledgement when they are triggered by a purchase or a restore.
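
A bounded simple retry might look like the sketch below. The ResponseCode enum and withSimpleRetries function are hypothetical names used for illustration, and we assume here that "a maximum of three" means up to three retries after the first attempt.

```kotlin
// Hypothetical subset of response codes, enough for this example.
enum class ResponseCode { OK, NETWORK_ERROR, ERROR, DEVELOPER_ERROR }

// Calls `call` once, then retries immediately while the result is retryable,
// up to maxRetries additional times. Returns the last result.
fun withSimpleRetries(maxRetries: Int = 3, call: () -> ResponseCode): ResponseCode {
    var code = call()
    var retries = 0
    while ((code == ResponseCode.NETWORK_ERROR || code == ResponseCode.ERROR) && retries < maxRetries) {
        retries++
        code = call()
    }
    return code
}
```

Non-retryable codes such as DEVELOPER_ERROR fall straight through and are returned after the first attempt.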

Non-retryable use cases

For BILLING_UNAVAILABLE, FEATURE_NOT_SUPPORTED, USER_CANCELED, DEVELOPER_ERROR, and ITEM_UNAVAILABLE, we simply don’t retry and just send an error to the client.

Caching issues

It’s possible to get ITEM_ALREADY_OWNED when launching a billing flow, or ITEM_NOT_OWNED when consuming or acknowledging a purchase. In these cases, the recommendation is to call queryPurchasesAsync again to refetch the purchases and confirm whether the user owns the item.

We have not implemented this yet.

Ensuring one response

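The BillingClient sometimes responds more than once to a single call. For example, you might call queryProductDetailsAsync and get two OK responses. We guard against this with an AtomicBoolean so that only the first response reaches the listener: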

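import java.util.concurrent.atomic.AtomicBoolean

// Tracks whether this particular call has already produced a response.
val hasResponded = AtomicBoolean(false)
billingClient.queryProductDetailsAsync(params) { billingResult, productDetailsList ->
    // getAndSet returns the previous value, so only the first response gets through.
    if (hasResponded.getAndSet(true)) {
        logError() // duplicate response: log it and ignore it
        return@queryProductDetailsAsync
    }
    listener.onProductDetailsResponse(billingResult, productDetailsList)
}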

+999 intent connections issue

We haven’t been able to reproduce this issue and hope the retry mechanisms we implemented fix it. Just in case, we catch the IllegalStateException when calling startConnection and return a nicer error message to the client. The error message states: “This has been reported to occur on Samsung devices on unknown circumstances.”
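
The idea can be sketched as follows. This is an illustration under assumptions, not RevenueCat's actual code: startConnectionSafely is a hypothetical helper, the start parameter stands in for the real call to startConnection, and everything in the returned message other than the sentence quoted above is made up for the example.

```kotlin
// `start` is a hypothetical stand-in for the code that calls billingClient.startConnection(listener).
// Returns null on success, or a friendlier error message if the call throws.
fun startConnectionSafely(start: () -> Unit): String? =
    try {
        start()
        null
    } catch (e: IllegalStateException) {
        // Only the last sentence comes from the post; the prefix is illustrative wording.
        "IllegalStateException when connecting to the billing service: ${e.message}. " +
            "This has been reported to occur on Samsung devices on unknown circumstances."
    }
```

Returning a message instead of letting the exception propagate keeps the failure on the same error path as every other billing error.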

Conclusion (and error-response reference)

Finally, we worked out what to do for each of the functions we use. The following list shows the different functions, the errors the BillingClient can return for them, and how you should react to each. We hope these recommendations prove useful should you encounter the same errors!

Alternatively, you can just use RevenueCat and we’ll handle all of the errors for you 😄

  • For NETWORK_ERROR or ERROR, do a simple retry if the user is in session, or retry with exponential backoff if the user is not in session. In more detail: calls to queryProductDetailsAsync, launchBillingFlow, and queryPurchaseHistoryAsync should use a simple retry. Calls to queryPurchasesAsync and showInAppMessages should retry with exponential backoff because they are initiated without user interaction. Calls to consumeAsync and acknowledgePurchase should use a simple retry if they are performed after a purchase or a restore; otherwise they should retry with backoff.
  • For billing service connection issues (SERVICE_DISCONNECTED), the general advice is to reconnect and retry the request.
  • Billing issues (BILLING_UNAVAILABLE, ITEM_UNAVAILABLE), unsupported features (FEATURE_NOT_SUPPORTED), user cancellation (USER_CANCELED), and developer errors (DEVELOPER_ERROR) should propagate the error.
  • When the item is already owned (ITEM_ALREADY_OWNED), you should query the purchase again and handle accordingly. This can only happen when starting a product purchase.
  • If an item is not owned (ITEM_NOT_OWNED) but a function is called as if it were, we recommend checking for the purchase before returning an error. This error can occur when starting, consuming or acknowledging a purchase.
  • If the service is unavailable (SERVICE_UNAVAILABLE), back off if the user is not in an active session, or return an error if they are. We determine if the user is in an active session by checking if the app is in background or not.
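
The per-function recommendations for NETWORK_ERROR and ERROR, plus the SERVICE_UNAVAILABLE rule, can be written down as two small functions. This is a hedged sketch: the Operation, RetryKind, and UnavailableAction enums and both function names are hypothetical, chosen only to mirror the list above.

```kotlin
// Hypothetical names for the BillingClient calls discussed in the list above.
enum class Operation {
    QUERY_PRODUCT_DETAILS, LAUNCH_BILLING_FLOW, QUERY_PURCHASE_HISTORY,
    QUERY_PURCHASES, SHOW_IN_APP_MESSAGES, CONSUME, ACKNOWLEDGE
}

enum class RetryKind { SIMPLE, BACKOFF }

// Which retry to use for NETWORK_ERROR or ERROR, depending on the call that failed.
fun retryKindFor(operation: Operation, afterPurchaseOrRestore: Boolean = false): RetryKind =
    when (operation) {
        // User-initiated calls: retry right away.
        Operation.QUERY_PRODUCT_DETAILS,
        Operation.LAUNCH_BILLING_FLOW,
        Operation.QUERY_PURCHASE_HISTORY -> RetryKind.SIMPLE

        // Calls made without user interaction: back off.
        Operation.QUERY_PURCHASES,
        Operation.SHOW_IN_APP_MESSAGES -> RetryKind.BACKOFF

        // Depends on whether a purchase or restore triggered the call.
        Operation.CONSUME,
        Operation.ACKNOWLEDGE ->
            if (afterPurchaseOrRestore) RetryKind.SIMPLE else RetryKind.BACKOFF
    }

enum class UnavailableAction { BACKOFF, RETURN_ERROR }

// SERVICE_UNAVAILABLE: back off in the background, return an error in the foreground.
fun onServiceUnavailable(appInBackground: Boolean): UnavailableAction =
    if (appInBackground) UnavailableAction.BACKOFF else UnavailableAction.RETURN_ERROR
```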
