7 Meta ad testing frameworks for subscription apps
Balancing speed and accuracy with creative testing

“Don’t wait for statistical significance, it’s going to limit how fast you can move.”
My heart dropped when Cedric said this so casually, as we discussed common testing mistakes in Meta ad experiments.
I tried to collect myself as an interviewer while he continued, “The top advertisers are launching hundreds of ads, so it’s worth giving up on significance and valuing the learnings overall, rather than trying to create a perfectly fair test setup. Even if you reupload an ad, [Meta] recognizes what’s different and what’s not.”
It felt like the opposite of everything I’d learned and loved about data: statistical significance gives you confidence; the confidence to stand behind an experiment and trust its results. But that isn’t where Cedric Yarish, co-founder of AdManage.ai, a fast-growing ad management platform, places his trust. He trusts Meta and the signals it provides.
This isn’t just gut feeling; it’s backed by Cedric’s experience managing multi-million-dollar ad budgets at Photoroom, along with a wide range of high-growth subscription apps like Speechify and Clearminds. He’s tested seven different ad experimentation structures, and his clear takeaway? Trust Meta over statistical significance (even if it breaks my data-driven heart).
As he explained, even if an ad has only spent $2, it’s already gained a few hundred to a thousand impressions, and with all the additional signals Meta has access to, it’s usually right.
If that left you reeling, you’re not alone. Our hour-long conversation was a deep dive into:
- What to consider when testing on iOS
- All seven methods for experimenting on Meta
- The advantages and disadvantages of each approach
- His favorite method and why
- Common pitfalls in ad experimentation
- How to know if an ad has real potential
Everyone approaches experimenting on Meta differently. I’ve seen it firsthand, nosing through 50+ ad accounts over the past ten years—one marketer swears by one ad per ad set, while another is all-in on what used to be called dynamic ads, and is now known as flexible ads.
It’s time to figure out what actually works for you—especially if speed of testing is your priority.
Still need convincing that Cedric knows his stuff? Here’s how many ads are currently being tested through his platform:

AdManage.ai stats on ad testing
Woah.
He has visibility into over 68,000 ads at any given time. Together with his brother, Raphael Yarish, Cedric has combined nearly two decades of performance marketing experience to build AdManage: a tool designed to test hundreds of ads in seconds.

Before we dive in, one quick caveat: when testing subscription ads, consider the limitations on campaign and ad set numbers for iOS. If you’re already familiar with this, feel free to skip ahead to the frameworks.
iOS testing limitations
When it comes to testing on iOS 14+, there are a few important limitations to keep in mind. While Android and Web give you much more freedom to test at scale, iOS limits the number of campaigns and ad sets available per app ID.
You’re limited to:
- 18 active campaigns
- 5 active ad sets per campaign
- 50 ads per ad set
Now, you can run more campaigns for earlier iOS versions (13.7 or earlier) or test first on Android or Web. Advantage+ campaigns also come with fewer restrictions, so they can be a smart workaround if your testing needs are more complex.
For smaller advertisers, these limits usually aren’t an issue; you’ll likely have just a few campaigns running in early phases. But for larger advertisers, especially those testing across multiple countries and languages, it can get tricky fast.
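To make that concrete, here’s a quick back-of-the-envelope sketch; the country list and tests-per-country count are made-up examples, not anyone’s real setup:

```python
# Rough sketch: how quickly the iOS 14+ campaign cap gets eaten up.
# The country list and tests-per-country figure are hypothetical examples.
MAX_IOS_CAMPAIGNS = 18

countries = ["US", "UK", "DE", "FR", "JP", "BR"]
concurrent_tests_per_country = 3  # e.g. hooks, formats, audiences

campaigns_needed = len(countries) * concurrent_tests_per_country
print(f"Campaigns needed: {campaigns_needed} / {MAX_IOS_CAMPAIGNS} allowed")
# Campaigns needed: 18 / 18 allowed -- the entire cap, with no room left to scale winners.
```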
That’s why factoring these limitations into your framework decisions is crucial. Since most apps generate the majority of their revenue from iOS users, ideally, you want to test on iOS where possible. But if you’re prioritizing volume or precision in your testing, Android or Web might be better starting points.
Another reason this matters: the second you switch the destination link from Android or Web to iOS, you lose all the likes and comments. And social proof—likes, shares, and comments—does matter. Ads with strong social proof consistently perform better.
In e-commerce, it’s easier to preserve social proof using post IDs, because the product link doesn’t change. But for mobile apps, switching between iOS, Android, or Web resets that engagement history. So, if possible, test on the channel you plan to scale, so those likes and comments stay intact.
Now, let’s get into the experimentation frameworks. If you’ve been running ads for a while, some of these will definitely sound familiar. Buckle up, we have a lot of frameworks to get through today.
Testing Framework 1 – Advantage+ campaign with 5 ads

“It’s the golden age of apps,” Cedric Yarish says excitedly. “It’s super easy to launch an app.”
He’s not wrong: everyone left, right, and center seems to be launching an app (usually AI-powered). But just because it’s easy to launch doesn’t mean it’s easy to grow. Many of the new app developers Cedric meets have never run a Facebook ad before—and often have little to no marketing experience.
For those just getting started, Framework 1 is usually the best. It’s simple, accessible, and requires minimal knowledge of ad setup.
How it works:
- Create an Advantage+ campaign with “Trial Started” as the conversion goal
- Use the built-in Advantage+ audience (Meta will only allow one ad set here)
- Upload five ads to test
- At the end of the week, keep what’s working and rotate out the rest
Your campaign structure will look something like this:
- Campaign
  - Advantage+ Audience
    - Ad 1
    - Ad 2
    - Ad 3
    - Ad 4
    - Ad 5
Even if Meta doesn’t spend on one or two of the ads, no stress—you pause them and test something new. Typically, Meta will allocate ~80% of spend to the top three ads, and the rest can be used to cycle in new creatives.
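If you want to make that weekly review routine, here’s a minimal sketch of the “keep what’s working, rotate out the rest” step; the spend and trial numbers are hypothetical, and in practice you’d pull them from Ads Manager or your MMP:

```python
# Minimal sketch of the weekly Framework 1 review: rank the five ads by
# cost per trial, keep the top performers, and flag the rest for replacement.
# All figures are hypothetical placeholders.
weekly_stats = {
    "ad_1": {"spend": 120.0, "trials": 8},
    "ad_2": {"spend": 95.0,  "trials": 3},
    "ad_3": {"spend": 60.0,  "trials": 5},
    "ad_4": {"spend": 15.0,  "trials": 0},
    "ad_5": {"spend": 10.0,  "trials": 1},
}

KEEP_TOP_N = 3  # Meta tends to concentrate most spend on the top few anyway

def cost_per_trial(stats):
    # Ads with zero trials get an infinite cost per trial so they sort last.
    return stats["spend"] / stats["trials"] if stats["trials"] else float("inf")

ranked = sorted(weekly_stats, key=lambda ad: cost_per_trial(weekly_stats[ad]))
keep = ranked[:KEEP_TOP_N]
rotate_out = ranked[KEEP_TOP_N:]

print("Keep running:", keep)
print("Pause and replace with new creatives:", rotate_out)
```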
Cedric usually recommends that founders start with a hook + product demo format for creatives. Think low-fi, user-generated-style videos—simple, scrappy, and effective. Then just run five ads and iterate based on results. If your cost per trial is too high, it might also signal a need for product tweaks (in my experience, this is more common than people expect).
This framework is all about building initial learnings and insights before moving into a more complex setup.
Now, the disadvantage of this approach is that you’re optimizing for trials rather than paid subscribers as your key event (assuming your app offers a trial). But at this early stage, you need data, and you might not have the technical support to track subscriptions properly yet. Cedric recommends using cost per trial as your main metric. If that’s not working, fall back to cost per install, and then use tools like RevenueCat to double-check user quality. It’s not ideal, but it works when you’re operating with a smaller ad budget.
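As a rough illustration of why trial quality still matters here, this small sketch turns cost per trial into an implied cost per paying subscriber; the spend, trial count, and trial-to-paid rate are assumed examples you’d replace with your own RevenueCat or backend data:

```python
# Hypothetical example: translating cost per trial into cost per paying subscriber.
ad_spend = 1_000.0         # assumed weekly spend
trials_started = 80        # assumed trials from that spend
trial_to_paid_rate = 0.30  # assumed conversion rate -- check your own subscription data

cost_per_trial = ad_spend / trials_started
paying_subscribers = trials_started * trial_to_paid_rate
cost_per_paying_subscriber = ad_spend / paying_subscribers

print(f"Cost per trial: ${cost_per_trial:.2f}")                                   # $12.50
print(f"Implied cost per paying subscriber: ${cost_per_paying_subscriber:.2f}")   # $41.67
```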
Another drawback is that you’re probably only testing 2–3 ads per week. That’s why this framework is better suited to smaller budgets, e.g., under $5,000/month.
If this framework feels too basic for you, Option 2 might be a better fit—especially for brands ready to scale faster.
Testing Framework 2 – One ad per ad set

If you’re a classic Type A like me, you’ll love this framework—it’s the ultimate control setup with zero faith in Meta’s optimization. The structure is simple and gives you full visibility:
- 1 Campaign with budgets set at the ad set level (ABO)
  - Ad set 1
    - Ad 1
  - Ad set 2
    - Ad 2
  - Ad set 3
    - Ad 3
  - Ad set 4
    - Ad 4
  - Ad set 5
    - Ad 5
Apparently, it’s not just us Type A folks who love this setup. According to Cedric, people with affiliate marketing backgrounds often lean toward this framework too. Makes sense—you’re properly testing each ad and guaranteeing it gets a minimum spend, which gives you far more clarity on performance.
I’ve found this is great for newer brands (another reason I like this framework, given that I’m mainly working with startups). Assets cost more to create at the start as you don’t have economies of scale, so you want to be confident you’ve tested them properly. Also, when using ads to test more fundamental areas—like messaging or target audience—you don’t want to make a decision after just $2 of spend.
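If you do take this more controlled route, it helps to decide up front how much each ad set should spend before you judge it. The thresholds in this sketch are illustrative assumptions, not Cedric’s numbers:

```python
# Sketch of a simple "is this ad ready to be judged?" gate for Framework 2.
# Thresholds are illustrative assumptions; tune them to your CPA and budget.
MIN_SPEND = 50.0         # don't judge an ad on $2 of spend
MIN_IMPRESSIONS = 2_000  # enough delivery to see the creative across placements

def ready_to_judge(spend: float, impressions: int) -> bool:
    """Return True once an ad set has had enough delivery to compare fairly."""
    return spend >= MIN_SPEND and impressions >= MIN_IMPRESSIONS

print(ready_to_judge(spend=2.0, impressions=900))     # False -- keep it running
print(ready_to_judge(spend=75.0, impressions=4_500))  # True  -- compare its CPA now
```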
However, it is slow (which explains why Cedric is less of a fan of it) and harder to set up. So while it can work really well for smaller brands, it gets a bit tiresome when you want to be testing hundreds of ads.
The limitations of iOS we discussed at the beginning also make this technique tricky. Cedric tends to see brands use this approach on Android and Web first, where it’s easier to manage, and then scale up their budget once they have a winning creative or message.
Testing Framework 3 – Meta A/B testing feature

Meta has its own built-in A/B testing platform. Cedric has used this approach the least (mainly because he’s spoilt with big budgets… I’m totally not jealous here). But don’t worry; I’ve used it plenty.
With the previous setup, there’s another catch we haven’t discussed: what if someone sees ads 1 and 2 and then converts through ad 3? That could mess up the already challenging attribution for apps.
The A/B testing feature of Meta solves this by keeping the audience clean. I typically take the setup from Framework 2 and run it through an experiment, which provides a more reliable and accurate setup.

When using this method, I usually follow the same approach as Framework 2, but with just 2-3 ad sets to stay intentional about what we’re testing.
- 1 Campaign with budgets set at the ad set level (ABO)
  - Ad set 1
    - Ad 1
  - Ad set 2
    - Ad 2
  - Ad set 3
    - Ad 3
However, you can also test campaigns against each other:

It works well for startups that want to be confident in their tests and reduce the risk of needing to retest. You can create a more controlled test for different ad creatives and test audiences as well. This is where the Campaign level comes in handy—especially if you want to compare an Advantage+ audience (which limits you to one ad set per campaign) against a broad, interest, or lookalike audience.
However, it’s time-consuming and doesn’t scale well. Managing more than 3-4 different experiments at the same time can get overwhelming. I guess that’s the other reason Cedric has stayed away from it: it’s a slow approach.
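Since this is the framework where I do care about confidence, here’s a minimal sketch of how I’d sanity-check two ad sets’ trial rates outside Ads Manager with a two-proportion z-test. The impression and trial counts are made up, and this is my own check rather than anything Cedric recommends or Meta reports in this form:

```python
# Minimal sketch: two-proportion z-test comparing trial rates of two ad sets.
# Counts are hypothetical; this is a sanity check on top of Meta's own reporting.
from math import sqrt, erf

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for conversion counts over impressions."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal approximation
    return z, p_value

# Ad set 1: 60 trials from 20,000 impressions; ad set 2: 40 trials from 20,000.
z, p = two_proportion_z_test(60, 20_000, 40, 20_000)
print(f"z = {z:.2f}, p = {p:.3f}")  # p < 0.05 suggests a real difference
```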
Testing Framework 4 – New campaign per experiment

As we move to Frameworks 4–7, the iOS limitations start to become a bigger challenge. These frameworks are more commonly used by larger brands.
When Cedric was leading user acquisition at Photoroom, they were spending over $1 million a month. At that scale, using Frameworks 1–3 doesn’t work. You can’t rely on just one campaign to test, or carefully place each ad in a separate ad set.
One way to tackle this is by creating a new campaign for each experiment. Because of these limitations, this framework isn’t very popular for subscription apps, but it’s commonly used in e-commerce. So, it’s worth mentioning here.
Many people like the control campaigns give you when splitting out tests. However, it can get messy. Overall, this isn’t a framework we’d recommend unless you’re focused on Android or Web, and you don’t mind the added complexity (Type A friends, I’m looking at you—this is a hard pass).
Let’s swiftly sweep that mess under the rug and move on to a framework that better balances control with testing capacity.
Testing Framework 5 – Each ad set has 5 ads
This is one of the most common frameworks Cedric sees being used for testing. It works as follows: each ad set contains five ads, all centered around a clear theme and format that’s being tested. For example:
- 1 Campaign with budgets set at the ad set level (ABO)
  - Ad set 1 – Testing Carousel ads
    - Ad 1
    - Ad 2
    - Ad 3
    - Ad 4
    - Ad 5
  - Ad set 2 – Testing Video ads
    - Ad 1
    - Ad 2
    - Ad 3
    - Ad 4
    - Ad 5
  - Ad set 3 – Testing Image ads
    - Ad 1
    - Ad 2
    - Ad 3
    - Ad 4
    - Ad 5
Visually, this looks similar to Framework 1, but with multiple ad sets (so it’s not an Advantage+ campaign).
Cedric has found that by keeping the ads in each ad set closely related, you can ensure a fairer test. Otherwise, one format might unfairly dominate. Ah, control and volume, now we’re talking. Not only does this make testing a bit fairer, but it also allows you to increase your volume. You’re still working with a maximum of 25 ads per campaign on iOS, but with 18 campaigns, that’s still 450 ads. And if you increase that limit or add more ads to winning campaigns (remember, the actual limit is 50 ads per ad set), you can go even higher.
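As a small sketch of what that grouping looks like before upload (the creative names and formats below are placeholders):

```python
# Sketch: group creatives by format, then chunk each group into ad sets of 5,
# mirroring the Framework 5 structure. Names and formats are placeholders.
from collections import defaultdict

creatives = [
    ("hook_a_video", "video"), ("hook_b_video", "video"), ("demo_video", "video"),
    ("carousel_features", "carousel"), ("carousel_reviews", "carousel"),
    ("static_quote", "image"), ("static_before_after", "image"),
]

ADS_PER_AD_SET = 5

grouped = defaultdict(list)
for name, fmt in creatives:
    grouped[fmt].append(name)

ad_sets = []
for fmt, names in grouped.items():
    # Split each format into chunks of at most 5 ads.
    for i in range(0, len(names), ADS_PER_AD_SET):
        ad_sets.append({"theme": fmt, "ads": names[i:i + ADS_PER_AD_SET]})

for n, ad_set in enumerate(ad_sets, start=1):
    print(f"Ad set {n} ({ad_set['theme']}): {ad_set['ads']}")
```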
Other downsides? It’s still not the fastest framework to work with.
Testing Framework 6 – Each ad set has 50 ads

That was not a typo… 50 ads per ad set that you are testing. Toto, I have a feeling we’re not in Kansas anymore.
The structure, then, looks like this:
- 1 Campaign with budgets set at the ad set level (ABO)
  - Ad set 1
    - Ads 1–50
  - Ad set 2
    - Ads 51–100
  - Ad set 3
    - Ads 101–150
  - Ad set 4
    - Ads 151–200
  - Ad set 5
    - Ads 201–250
Guess whose favorite framework this is? Cedric, of course! Usually, when you’re working with lots of ads, you’re optimizing for speed, but doing all of this manually takes a lot of time. As I mentioned earlier, Cedric likes to hand the reins over to Meta and let it run the show.
This framework is all about efficiency, moving fast, and ensuring you learn quickly. Using something like AdManage.ai or bulk upload is key. Mainly, big spenders use this approach because their cost per creative is low due to economies of scale, and testing creative nuances, like different headlines, hooks, backgrounds, etc., is crucial for them.
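If you’re not using a dedicated tool, the Marketing API is the usual way to bulk-create ads. Here’s a minimal sketch using Meta’s facebook_business Python SDK; the access token, account ID, ad set ID, and creative IDs are all placeholders, and you’d still build the creatives beforehand:

```python
# Minimal sketch: bulk-creating ads into one ad set with Meta's facebook_business
# Python SDK. The access token, account ID, ad set ID, and creative IDs are
# placeholders -- adapt them to your own account before running.
from facebook_business.api import FacebookAdsApi
from facebook_business.adobjects.adaccount import AdAccount

FacebookAdsApi.init(access_token="<ACCESS_TOKEN>")
account = AdAccount("act_<AD_ACCOUNT_ID>")

ad_set_id = "<AD_SET_ID>"  # the ad set that will hold up to 50 ads
creative_ids = ["<CREATIVE_ID_1>", "<CREATIVE_ID_2>"]  # ...up to 50 pre-built creatives

for i, creative_id in enumerate(creative_ids, start=1):
    account.create_ad(params={
        "name": f"Bulk test ad {i:02d}",
        "adset_id": ad_set_id,
        "creative": {"creative_id": creative_id},
        "status": "PAUSED",  # create paused, review, then switch on
    })
```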
However, as you can imagine, some ads will get zero spend. This isn’t an issue if you trust Meta’s optimization.
I couldn’t believe Cedric fully hands it over to Meta, so I asked him: did he really never intervene? For me, it’s like saying you’ve worked at a place with a candy jar for two years and never taken a piece. Possible, but very unlikely.
He admitted to occasionally giving in and taking back control—ha, I knew it! This wasn’t based on a specific stat, but if an ad he had high hopes for, or one with a lot of data behind it, didn’t get any spend, he’d separate it out. Sometimes he was right; other times, he was wrong. This perfectly highlights what you need to consider with a speed approach: you might miss a winner here and there, but the speed of testing usually makes up for it.
Now, let’s move on to one more framework. It’s a bit of an odd one out compared to the rest, but it has its uses.
Testing Framework 7 – Flexible Ads

Given Cedric’s wealth of Meta knowledge, I was hoping he could finally answer a question I’ve been wondering: why did Meta rebrand Dynamic Ads to Flexible Ads? Unfortunately, he had no idea either. His best guess was that maybe internal surveys showed “dynamic” was misleading. Oh, Meta—you strange, strange platform.
For those not familiar with Flexible Ads, you set it up at the ad level:

You can upload up to ten media assets (images and videos). It’s a very fast way to set up ad experiments directly in Meta, and it also allows Meta to choose which one to show your audience based on what it believes will work best.
This is great for testing around a certain theme or experimenting with smaller nuances in ads once you’ve already found your winners, but want to optimize them further. While the setup is fast, it does take longer to analyze because you need to dive into ad breakdowns to see which media performed the best.
One downside is that you’ll potentially lose social proof when scaling up a winner, as you’ll likely want to set it up as a new ad (you should be able to keep the social proof if you continue to run the flex ad).
So there you have it—seven frameworks. Now, let’s wrap up with some final advice from Cedric on experimenting successfully.
What are the most common testing mistakes made by subscription apps?
We’ve already discussed Cedric’s disdain for waiting on statistical significance, which he considers the biggest mistake. He strongly encourages brands, especially bigger ones, to choose a framework that prioritizes speed over precision.
The second mistake revolves around attribution: specifically, not checking blended ROAS. Meta tends to underreport on iOS, so it’s essential to calculate your own internal metrics and set up a Mobile Measurement Partner (MMP), rather than just trusting Meta’s numbers at face value.
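The blended check itself is simple arithmetic. In the sketch below, all figures are hypothetical, and the revenue number would come from your own backend or MMP rather than Meta:

```python
# Sketch: blended ROAS vs. Meta-reported ROAS. All figures are hypothetical;
# revenue should come from your own backend/MMP, not from Meta's attribution.
total_ad_spend = 50_000.0          # Meta plus any other paid channels
meta_reported_revenue = 60_000.0   # what Meta claims it drove (often understated on iOS)
actual_new_revenue = 90_000.0      # new-customer revenue from your own data

meta_roas = meta_reported_revenue / total_ad_spend
blended_roas = actual_new_revenue / total_ad_spend

print(f"Meta-reported ROAS: {meta_roas:.2f}")   # 1.20
print(f"Blended ROAS:       {blended_roas:.2f}")  # 1.80 -- the gap is the underreporting
```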
Cedric also sees far too many brands still relying on view-through attribution instead of using 1-day or 7-day click attribution, which he believes offers much more accurate insights.

In other words, turning off engaged-view and view-through attribution is key. In Cedric’s experience, leaving them on leads to ads cannibalizing other channels. While he’s seen a few cases where those settings add value, most of the time the extra attributed conversions are meaningless.
With ads, you really have to be conscious of whether they’re incremental and if they’re actually helping. There’s so much waste in broad ad channels, and the more you can do to control for this, the better. Cedric tends to use 1-day click attribution, but if the consideration period is longer, he’s used 7-day click attribution instead.
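If you want to see how much the attribution window changes your numbers, the Insights API can return both windows side by side. Here’s a sketch with Meta’s facebook_business Python SDK; the access token and account ID are placeholders, and the action types you care about (e.g., trials) depend on how your events are named:

```python
# Sketch: pulling results under 1-day-click and 7-day-click attribution side by side
# from the Marketing API Insights endpoint. Token and account ID are placeholders.
from facebook_business.api import FacebookAdsApi
from facebook_business.adobjects.adaccount import AdAccount

FacebookAdsApi.init(access_token="<ACCESS_TOKEN>")
account = AdAccount("act_<AD_ACCOUNT_ID>")

insights = account.get_insights(
    fields=["ad_name", "actions"],
    params={
        "level": "ad",
        "date_preset": "last_7d",
        "action_attribution_windows": ["1d_click", "7d_click"],
    },
)

for row in insights:
    data = row.export_all_data()  # plain dict view of the insights row
    for action in data.get("actions", []):
        # Each action carries a value per requested attribution window.
        print(data["ad_name"], action["action_type"],
              "1d_click:", action.get("1d_click"),
              "7d_click:", action.get("7d_click"))
```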

This is where an MMP is key: it de-duplicates the data, so always benchmark against its numbers.
In the first framework, we talked about using trials initially to get volume, but Cedric sees many brands failing to improve this setup as they grow. In his experience, around 70% of free trial users will unsubscribe, which means this signal to Meta is 70% diluted.
Ideally, you should be getting more sophisticated with your measurement. You can’t always rely on purchases, especially with longer trial periods, but you can look at factors like the type of subscription (annual vs. monthly) or even the type of user. This is exactly what Cedric did at Photoroom. They used data from the onboarding flow to identify whether the user was a business or an individual creator. They then associated a higher value with business users and sent that information back to Meta. If you want to trust Meta, you need to feed it good data.
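The exact wiring depends on your MMP or Conversions API setup, but the core of it is just a value map. Here’s a hedged sketch of the idea; the user types and dollar values are illustrative, not Photoroom’s actual numbers, and the actual send to Meta isn’t shown:

```python
# Sketch: assign a higher conversion value to higher-quality users before sending
# the event back to Meta via your MMP or the Conversions API (sending not shown).
# User types and values are illustrative, not Photoroom's real figures.
EVENT_VALUES = {
    "business": 40.0,            # assumed higher lifetime value
    "individual_creator": 12.0,  # assumed lower lifetime value
}

def trial_event_value(user_type: str) -> float:
    """Value to attach to the 'trial started' event for this user."""
    return EVENT_VALUES.get(user_type, 12.0)  # default to the lower value

# e.g. a user self-identifies as a business during onboarding:
print(trial_event_value("business"))  # 40.0 -- Meta optimizes toward these users
```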
How can you know if an ad has potential?
Cedric is quite liberal when it comes to choosing which ads to scale. He usually runs an evergreen campaign with 50 ads. Then, every week, he deletes the bottom five ads from the evergreen campaign and uploads the top five from the testing campaigns in their place.
A key reason for this is team morale. It keeps the team motivated to see that there are always at least a few winners. He even adds the creator’s name to the ad so the team can see which content is performing well and which isn’t, and so creators can see their names in the winning campaigns.
If the ads are performing similarly, this approach has very little downside. In fact, they’re likely to improve with scale as the extra data and social proof accumulate over time.
However, I pushed him to get more specific: How do you determine whether an ad is a true winner versus just a short-term success? For Cedric, CPA benchmarks are very accurate and more important than other signals. If the average CPA in the evergreen campaign is $5, and your test ad comes in at $4, that’s a good sign. A CPA of $8 might still work, but anything $10 or above should be left out. It’s all about relative performance compared to the evergreen campaign, especially since the evergreen campaign’s CPA is nearly always lower.
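Turned into a rule of thumb, that benchmark looks something like this sketch; the cutoffs mirror Cedric’s $5/$8/$10 example, and you’d scale them to your own evergreen CPA:

```python
# Sketch of the "is this a true winner?" check: compare a test ad's CPA against
# the evergreen campaign's average. Cutoffs mirror the $5 / $8 / $10 example above.
def classify_test_ad(test_cpa: float, evergreen_cpa: float) -> str:
    if test_cpa <= evergreen_cpa:
        return "winner: promote to evergreen"
    if test_cpa <= evergreen_cpa * 1.6:   # e.g. $8 against a $5 benchmark
        return "borderline: might still work"
    return "leave it out"                  # e.g. $10+ against a $5 benchmark

evergreen_cpa = 5.0
for cpa in (4.0, 8.0, 10.0):
    print(f"${cpa:.0f} CPA -> {classify_test_ad(cpa, evergreen_cpa)}")
```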
Choosing the right framework for your app
At the end of the day, picking the right framework comes down to several key considerations:
- Time: How much time do you have to set up and analyze campaigns?
- Expertise in Meta: If you haven’t used Meta much before, you’ll probably prefer a simpler framework
- Trust in the algorithm: Are you comfortable accepting that if Meta doesn’t spend on an ad, there’s likely a reason for it?
- Scale/Budget: Frameworks 1-3 are better suited for smaller-scale ad accounts, while frameworks 4-7 are more suitable for larger accounts
- Amount of creatives you have: If you don’t have a lot of creative assets, you’ll want to ensure you’re testing each creative more thoroughly
- Amount of data: Testing 50 ads in an ad set requires a significant amount of data for Meta to properly understand what’s working and what isn’t
Keep these factors in mind as you weigh the seven frameworks and the advantages and disadvantages of each one. There’s no concrete right answer, just the right answer for you right now.
| Framework | How It Works | When to Use | Advantages | Disadvantages | Helpful Tips |
|---|---|---|---|---|---|
| 1. Advantage+ Campaign with 5 Ads | Single Advantage+ campaign with 1 ad set and 5 ads. Kill underperformers weekly and replace them. | Early-stage apps or teams with low ad knowledge or small budgets (<$5K/month). | Simple to set up. Great for low-fi UGC. Requires minimal media buying knowledge. | Limited testing volume (2-3 ads/week). Optimizes for trials, not paid subscribers. Less control over distribution. | Use RevenueCat to validate user quality. Prioritize hook + product demo creatives. |
| 2. One Ad per Ad Set (ABO) | Each ad gets its own ad set, with its own budget (ABO). | Startups that want control or are testing fundamentals (e.g., messaging). Useful for Android/Web. | Guarantees spend per ad. Clear attribution. Ideal when assets are costly. | Slower to execute. Harder to scale. iOS limitations make it tricky. | Great for small brands wanting to ensure accurate testing. |
| 3. Meta A/B Testing Feature | Run split tests using Meta’s experiment tool. Keeps audiences separate. | Startups that need confidence in results. When clean attribution matters. | Fair, isolated tests. Test campaigns, ad sets, or creatives. | Slower. Cumbersome at scale. Max 3-4 tests manageable. | Combine with Framework 2 for structured A/B testing. |
| 4. New Campaign per Experiment | Set up a new campaign for each test to avoid overlap. | Large budgets, ecommerce-heavy or Android/Web-focused apps. | Maximum isolation and control. | Messy and hard to manage. Not ideal for iOS or subscriptions. | Use only if other frameworks aren’t viable due to audience overlap or platform constraints. |
| 5. Each Ad Set Has 5 Ads | Multiple ad sets, each testing a format/theme with 5 ads. | Brands ready to scale creative testing while maintaining structure. | Balanced control and volume. Up to 450 ads across campaigns. | Slower execution. Still capped (25 ads per campaign on iOS). | Keep each ad set’s theme tight to ensure fair comparisons. |
| 6. Each Ad Set Has 50 Ads | Upload 50 ads per ad set. Let Meta optimize. Use tools like AdManage.ai for scale. | Bigger spenders testing at speed with creative nuance. | Fast, efficient learning. Huge creative volume. | Little control. Risk of Meta skipping promising ads. Manual setup is time-consuming. | Use automation tools. Occasionally separate key creatives if they get zero spend. |
| 7. Flexible Ads (Formerly Dynamic) | Upload up to 10 assets (images/videos) in a single flexible ad. Meta mixes and matches. | Quick creative optimization. Ideal after finding early winners. | Fast to set up. Good for optimizing winners further. | Harder to analyze. Loses social proof when scaling. | Use for micro-testing variations (e.g., hooks, backgrounds). Pull winners out to scale them manually. |
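And if it helps to turn those considerations into a starting point, here’s a rough, deliberately opinionated sketch; the budget cutoffs and mappings are my shorthand for the table above, not a formula from Cedric:

```python
# Rough sketch: map the considerations above to a starting framework.
# Cutoffs and mappings are shorthand for the table above, not a formula from Cedric.
def suggest_framework(monthly_budget: float, trusts_meta: bool,
                      creatives_per_week: int, wants_clean_tests: bool) -> str:
    if monthly_budget < 5_000:
        return "Framework 1: Advantage+ campaign with 5 ads"
    if wants_clean_tests and creatives_per_week <= 5:
        return "Framework 3: Meta A/B testing feature"
    if not trusts_meta:
        return "Framework 2: One ad per ad set (ABO)"
    if creatives_per_week >= 50:
        return "Framework 6: Each ad set has 50 ads"
    return "Framework 5: Each ad set has 5 ads"

print(suggest_framework(3_000, trusts_meta=False, creatives_per_week=3, wants_clean_tests=True))
# Framework 1 -- small budgets keep it simple
print(suggest_framework(100_000, trusts_meta=True, creatives_per_week=80, wants_clean_tests=False))
# Framework 6 -- big budget, lots of creative, trust the algorithm
```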
While I may not have gotten the definitive answer I wanted on which framework is best after chatting with Cedric, I’m 100% fine with it. The vague answer most people give, “It depends,” has become a very clear and structured approach for evaluating and experimenting with frameworks to find the best one for you. The Type A within me, still struggling to escape, will simply have to accept that answer and pour her frustrations elsewhere… *cue internal screaming*.