A Quick Guide to Data-Based Display Advertising

Here’s a fun fact: In any given web session, your audience is more likely to have a heart attack than click on your banner ad.

Okay, so maybe that’s not so much a fun fact as it is a scary fact, but either way the fact is that most display ads have truly awful clickthrough rates.

The question is, why? Are your ads just unclickably ugly? Do you have the wrong CTAs? Is it just plain banner blindness?

While all of these elements could be contributing to a low clickthrough rate, often display ads struggle because they are simply showing up in front of the wrong audience.

Take this ad, for example:

grill-like-an-expert

Decent ad…terrifyingly bad placement.

However, with a little extra effort, your results can be better than this (admittedly, it’s not a very high bar). It’ll take a bit of work and research, but it will get your ads in front of more of the right people.

To improve the performance of your display ads, though, we’ll need to take a look at your analytics data. So, pull up your analytics platform and let’s get started!

Improving Your Topic/Interest Targeting

Topic/interest targeting is a very easy way to get your ads displayed on a ton of websites. However, if you don’t pick your topics carefully, your ads can easily end up in front of an irrelevant audience.

Imagine you’re marketing for a company like Slack, Salesforce or Moz. You’d probably choose a topic that your target audience is likely to be interested in, like “Business & Industrial > Advertising & Marketing > Marketing.”

Seems like a reasonable topic, right?

The only problem is, what your display network defines as a “Business & Industrial > Advertising & Marketing > Marketing” website may not be what you expect.

For example, www.getjar.com is one of the websites under this topic:

marketing-topic-targeting

Now, I have nothing against GetJar, but if you’re marketing customer relationship management (CRM) software, your ideal audience probably doesn’t consist of people who want to hack Instagram accounts…

I’ll admit that I’ve been researching CRMs, so there’s a chance that Salesforce’s ad showed up because they were using an in-market audience to target people in a CRM purchasing pattern. However, even if this is the case, if I was on GetJar because I wanted to play “Toca Life Town perfect,” I’m not very likely to click on an ad for “The Complete Guide to Lead Nurturing.”

Sure, maybe I might click on an ad for “The Complete Guide to ‘Toca Life Town’ Nurturing”, but “The Complete Guide to Lead Nurturing”? Not so much.

Do you see the problem here? While topic/interest targeting can get you on a lot of supposedly relevant sites, the actual relevance of those sites to your company or ad is often very low.

The good news is, there’s an easy fix for problems like this.

Since most people use the Google Display Network, I’ll use the GDN for my examples, though the same principles apply to most display advertising platforms.

To begin, you need to figure out where exactly your ads are displaying. You can do this by opening AdWords, picking a reasonable time frame (like 6-12 weeks) and then selecting the campaign you’re interested in and clicking Display Network.

From there, click “Placements” and you can see where your ads are being displayed:

display-network-placements

Just so you know, if you’ve already chosen certain specific sites that you want your ads displayed on, there will be a “Managed” status on those placements. Websites chosen by Google (from your targeting setting) have their status set as “Automatic”.

Once you’re here, sort by impressions or cost. Check the conversion rates of the traffic from the top websites on the list. If you see a poor conversion rate, take a look at that website to see if it’s one that’s applicable to your business. If it doesn’t match your advertising goals, click on “Edit” and then exclude that site.

On the other hand, if you find a site with a great clickthrough or conversion rate, click on “+ Targeting” and add it to your list of managed placements. Yes, I know that sorting through and cleaning up your placements can take some extra time and effort, but it will greatly improve the effectiveness of your display ads.
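If you export your placements report, the triage described above can be sketched in a few lines of Python. This is only an illustration: the site names, numbers and thresholds below are made up, and the `Placement` fields are hypothetical stand-ins for whatever columns your export contains.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    # Hypothetical fields; rename to match your own report export.
    site: str
    clicks: int
    conversions: int
    cost: float


def triage(placements, min_clicks=100, poor_cvr=0.005, great_cvr=0.03):
    """Sort placements by cost (highest first) and label each one.

    The thresholds are illustrative assumptions, not recommendations.
    """
    decisions = []
    for p in sorted(placements, key=lambda p: p.cost, reverse=True):
        if p.clicks < min_clicks:
            label = "wait"  # too little data to judge yet
        else:
            cvr = p.conversions / p.clicks
            if cvr <= poor_cvr:
                label = "review for exclusion"
            elif cvr >= great_cvr:
                label = "add as managed placement"
            else:
                label = "keep"
        decisions.append((p.site, label))
    return decisions


# Made-up example rows.
rows = [
    Placement("example-apps.test", clicks=400, conversions=0, cost=310.0),
    Placement("example-crm-blog.test", clicks=250, conversions=12, cost=180.0),
    Placement("example-news.test", clicks=30, conversions=1, cost=25.0),
]

for site, label in triage(rows):
    print(f"{site}: {label}")
```

The “wait” label is there on purpose: a placement with only a handful of clicks hasn’t told you anything yet, so it’s worth a manual look at the site before you exclude it.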

Fixing Up Contextual Targeting

Contextual targeting takes a very different approach to display ad targeting. Instead of using topics or interests to identify the websites you want your ads on, with contextual targeting you create keyword lists that Google uses to identify relevant sites.

Of course, whether or not this approach is a good idea for your company depends wholly on your marketing goals and the keywords you choose.

In a lot of ways, picking the right keywords for contextual targeting is a lot like picking the right keywords for paid search advertising: you need to know what keywords and phrases your target audience searches for online. So, if you’re running a paid search campaign, some of your best data can be found in your search terms report.

You’ll be able to find this information by using a relevant time frame (of about 6-12 weeks) and clicking on the tab labeled “Keywords.” From there, click on “Search Terms”:

search-terms-report

Sort this list by conversions, clicks or impressions to see which search queries bring in the best traffic.

Generally, most accounts will have groups of 5-10 search terms or keywords that are thematically similar and drive great results. Identify those search terms and build your contextual targeting keyword lists around those themes.

It will take Google about a month or two to find the best sites for you. Once that’s up and running, you can look through your placements list and see which sites are most worthwhile (like we did in the last section). Or, if you’re already running contextually targeted ads, you may want to look at your current placements and search terms list to make sure that you are currently targeting the right keywords in your display campaigns.

Again, this approach isn’t quite as easy as simply writing down all the possibly relevant keywords you can think of and using that list to target your display ads, but the results will be much better.

Refining Demographic Targeting

Demographic targeting is one of the simplest display ad targeting methods to use, but it’s also one of the least effectively employed ways to improve your display advertising’s performance.

The majority of advertisers forget that they have demographic data they can use to help measure the performance of their ads. You can tell you have a problem with your targeting if your target audience is men over 50 but all of your ads are showing up for women under 30.

Sadly, this sort of problem is far more common than you might think.

To see if your demographic targeting makes sense, open up Google Analytics and select “Audience.” From there, expand the Demographics submenu and choose “Overview.”

Finally, change your segment to “Converters” and you’ll see a screen like this:

google-analytics-demographics-960

Looking at the results in this GIF, you can clearly see that this client shouldn’t be targeting display ads to people who are older than 44 or younger than 25. There are no real conversions coming from that audience.

By the way, this approach only works if you’re tracking conversions, so if you don’t have conversion tracking set up…change that now. However, even in this situation, you can still get some fantastic insight into your target demographic from the AdWords Report Editor (assuming that you get a lot of high-quality traffic through paid search):

report-editor-demographics

You can see here that this client gets about 3x more clicks from men than they get from women. With that in mind, would it make sense to run a campaign that targets primarily women? Probably not.

Once you know the demographics of your audience, it’s time to compare those demographics to the demographics of your display advertising audience. Just open AdWords and click your chosen campaign, then choose “Display Network”. From there, choose the Demographics tab:

display-network-demographics

You can find insights on the gender, age and even the parental status of the people who convert from, click on or even just see your ads, courtesy of Google.

In this case, we’re looking at display ad demographic data for the same client as we saw in the other two screenshots. Comparing that data, it looks like their display ads are being shown to 3.4x more men than women, which matches the click demographics of the search ads they have set. That’s probably a good thing.

However, a significant portion of their impressions come from people over 44 and under 25. That’s a potential problem since that demographic doesn’t seem to convert on the client’s site. Limiting this client’s demographic targeting to people between 25 and 44 could significantly improve their ad performance.
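The comparison above boils down to lining up two sets of shares: where your impressions go and where your conversions come from. Here is a minimal Python sketch of that check. All of the counts and both thresholds are invented for illustration and are not the client’s real numbers.

```python
def share(counts):
    """Convert raw counts per bracket into fractions of the total."""
    total = sum(counts.values())
    return {bracket: n / total for bracket, n in counts.items()}


def mismatched_brackets(impressions, conversions,
                        min_impression_share=0.10,
                        max_conversion_share=0.02):
    """Return brackets that get many impressions but almost no conversions.

    Both thresholds are illustrative assumptions.
    """
    imp = share(impressions)
    conv = share(conversions)
    return [
        bracket for bracket in impressions
        if imp[bracket] >= min_impression_share
        and conv.get(bracket, 0.0) <= max_conversion_share
    ]


# Hypothetical numbers, standing in for the display and analytics reports.
display_impressions = {
    "18-24": 30_000, "25-34": 40_000, "35-44": 35_000,
    "45-54": 25_000, "55-64": 20_000, "65+": 5_000,
}
site_conversions = {
    "18-24": 1, "25-34": 60, "35-44": 38,
    "45-54": 1, "55-64": 0, "65+": 0,
}

print(mismatched_brackets(display_impressions, site_conversions))
```

Any bracket the function returns is a candidate for exclusion in your demographic targeting, though it’s worth checking that the conversion data covers enough traffic to be trusted.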

Unlike the last two tactics, this approach is very easy to implement, so take a look at your demographic data and figure out where your best value comes from!

Conclusion

With the right targeting, you can get your display ads in front of the right audience. That can be great for building brand awareness and driving relevant clicks and conversions.

Or, your ads could be showing up in situations like this:

aflac-advertising-fail

However, by using your analytics data, you can put together a good advertising strategy that can get your display ads to the right audience and boost the odds that your display ads will get clicked.

You’ve heard my two cents, now I want to hear yours.

How do you improve display ad performance? Have you ever seen any heart attack-inducing display ad fails like these?

About the Author: Jacob Baadsgaard is the CEO and fearless leader of Disruptive Advertising, an online marketing agency dedicated to using PPC advertising and website optimization to drive sales. His face is as big as his heart and he loves to help businesses achieve their online potential. Connect with him on LinkedIn or Twitter.

Original Source File

NADIR TATI at ModaLisboa Spring Summer 2017 by Fashion Channel

Source Article

Study finds that obese women have more sex.

How much sex a person has is the result of many factors… but are there any that seem more important? To find out, these researchers collected data from 254 women in their thirties, asking them about their personal and physical lives. It turns out that over 40% of the women sampled had sex at least twice a week, and that obese women were more likely to have sex at least three times a week. I guess they really don’t call it the “dirty thirties” for nothing!

Factors Related to Coital Freq

MAYORAL Spring Summer 2017 | CPM Kids Moscow by Fashion Channel

SOFIYA MALICHENKO Spring Summer 2017 | Sangue Novo Lisboa FW by Fashion Channel

Does Effect Size Matter for fMRI?

fMRI researchers should care about (and report) the size of the effects that they study, according to a new Neuroimage paper from NIMH researchers Gang Chen and colleagues. It’s called Is the statistic value all we should care about in neuroimaging?. The authors include Robert W. Cox, creator of the popular fMRI analysis software AFNI.

Chen et al. explain the purpose of their paper:
Here we address an important issue that has been embedded within the neuroimaging community for a long tim

Why Your Unique Value Proposition Isn’t as Important as You Think It Is (and What Matters More)

A hot prospect has demoed your software or product, and now you’ve got Sales talking with the decision-makers about an enterprise solution that will be your biggest yet. You find out they’ve narrowed down their decision to you and two of your competitors. This should be a slam dunk—you just spent the last three months doing market research and sharpening your Unique Value Proposition (UVP), and you know your team now clearly communicates the unique value you provide.

But the sales process drags out weeks and months. . . and the prospect is asking for discounts and extra customization at no additional charge. You’re crunching the numbers, trying to figure out how to keep the deal alive and asking yourself why you’re stuck competing on price again. And then you find out the prospect chose a competitor.

What went wrong? Your UVP was strong. Your sales team was at the top of their game. What happened?

As it turns out, UVP isn’t as important as we think it is. CEB research surveyed 3,000 B2B buyers across 36 brands and 7 industries and revealed that only 14% of buyers perceive enough meaningful difference between brands’ business value to be willing to pay extra for that difference. Unless you’re selling something truly revolutionary—solving a problem that has not yet been solved in any way, shape, or form—your UVP is pretty much the same as your best competitors’ UVPs. Although there are subtle differences, your prospects are saying they’re not willing to pay for them. So you end up competing on price.

difference-between-supplies-enough-to-pay

Only 14% of buyers saw enough difference between suppliers to be willing to pay a higher price for it. (Image Source)

What’s the solution? It’s not that UVP doesn’t matter at all. B2B buyers demand ROI—you have to deliver at least as much business value as your competitors do, in order to get into the consideration set. So all the work you put into developing your UVP isn’t wasted.

Personal Value Beats Business Value

But while nearly all B2B companies focus on business value and treat B2B buying as a rational decision process, the reality is that people are making these buying decisions—people who have emotions and who are concerned about things like getting a promotion, being respected by their peers, and not making mistakes. They fear risk. They want admiration. They are driven by the desire to be successful.

According to CEB’s research, over 90% of the B2B buyers surveyed would either put off the purchase indefinitely or would buy from the lowest-price supplier in their consideration set. If you’re going to consistently win deals profitably, you need to address personal value at least as much as you address business value.

buyers-who-see-personal-value-versus-those-that-dont

Buyers were much more likely to purchase from the supplier that demonstrated personal value. (Image Source)

There are two sides to personal value—a positive and a negative. If you tackle both in your marketing and sales materials, you’ll build a strong case that will motivate buyers. Let’s look at each of these in detail.

Address Personal Benefits

The positive side of the personal value coin is personal benefits—how your product or service benefits your prospects personally. While every individual will have his or her own goals and desires, you’ll want to identify two or three that are shared by most of your prospects so you can focus on these in your marketing. (If you break out different market segments or personas and market separately to each, you have the freedom to get more specific with the personal benefits you highlight.)

To identify the personal benefits that will resonate with your prospects, you’ll need to do a bit of research. The easiest way to learn this info is to set up brief phone interviews with current clients or prospects who fit your ideal client profile. Here are a few questions you can ask that will give you insight.

  • What is important to you as a [title or role]?
  • What are you currently working toward? (A promotion? A role change? You’re looking for what motivates them.)
  • What are your one-year goals?
  • Where do you see yourself in two years?

Once you’ve completed your interviews, look over the words and phrases that your interviewees used to describe what matters to them. What words and phrases were used the most? These are the ones that you’ll want to incorporate into your messaging to ensure prospects fully understand and instinctively react to what you’re saying.

Address Personal Risk

The negative side of the personal value coin is personal risk. Fear is one of the strongest forces that prevent people from taking action—even action they logically know they need to take. If you want prospects to move forward in the buying journey, you’re going to have to address their fears.

Nearly every B2B buyer, no matter what his or her job role, has the following fears.

  • Potential loss of time. Would-be buyers are busy and almost always have more on their to-do lists than they can possibly get done. They worry that implementing your solution will take up too much of their valuable time.
  • Potential loss of respect. To get the deal agreed upon, buyers have to champion your solution to their teams. They worry that if your solution doesn’t deliver as promised, or if it’s a nightmare to implement, they’ll lose the support of coworkers and superiors.
  • Potential loss of job. If the performance of your product or service is bad enough and causes a large loss of money or potential revenue, a buyer could lose his or her job over the purchase. This is a fear that can easily and completely derail a purchase.

If you want to close the deal, you’ll need to address each of these fears in your bottom-of-the-funnel marketing content or sales materials.

Personal value is a powerful driver of purchase decisions.

It’s important to note that “showing” is more effective than “telling” prospects that they don’t need to worry about these potential hazards. Besides the fact that it would be weird, no one would believe you if you simply stated, “And there’s no reason to fear losing your job if you buy from us. You won’t!”

Use testimonials and case studies to demonstrate the results you’ve achieved for other companies similar to theirs. Point out how quickly or easily the implementation went and the specific ROI you delivered. Social proof (especially if you’ve got testimonials or case studies from companies well-known in their industry) will alleviate their fears better than anything else.

Dig into the Pain of Non-action

The best way to overcome that last bit of doubt remaining after you’ve addressed potential fears is to dig into the pain that will result from not moving forward with the purchase.

Find out what the buyer will lose if he or she puts off the decision, and quantify it. How much revenue is he or she sacrificing? How much time is he or she wasting?

Then compare the loss resulting from inaction to any remaining potential risk. You need to show the buyer that the reward greatly outweighs any potential risk. This is the final kick-in-the-pants that buyers need to make the purchase.

The best time to point out the pain of non-action is in your proposal. After you’ve clearly communicated business benefits and personal benefits, and after you’ve assuaged their fears, make sure they feel how much the status quo hurts—and how that pain will just continue to get worse the longer they stay there.

Never Forget You’re Selling to People

The companies that win will be the companies that thoroughly understand their prospects and clearly communicate personal value as well as business value. Never lose sight of the fact that, even as a B2B company, you’re selling to people. Show off that shiny UVP, but don’t stop the conversation at business value. And you’ll find that price is no longer holding you back from those highly-coveted enterprise deals.

About the Author: Laura MacPherson is a freelance writer who integrates persuasion psychology and research into copywriting and content for B2B companies. Follow her (or connect) on LinkedIn for an unlimited supply of marketing tips and tricks.

MARIA AKIREIKINA Spring Summer 2017 | CPM Moscow by Fashion Channel

With a Whiff, Mice Can Transmit Pain to Each Other

What hurts one mouse, hurts every mouse.

That’s the conclusion of a new study examining the social transfer of pain in mice. When one group of mice was exposed to a painful stimulus, a completely unaffected group displayed the same kind of heightened sensitivity as the first. Given that mice are mammals like us, the effect could also exist in humans, and the finding could inform future pain research.
Testing for Pain
In their study, researchers from the Oregon Health and Science University work

The 7 Epiphanies Needed to Intuitively Grasp Statistical Significance

There is only one danger more deadly to an online marketer than ignorance, and that danger is misplaced confidence.

Whenever a marketer omits regular statistical significance testing, they risk infecting their campaigns with dubious conclusions that may later mislead them. But because these conclusions were based on “facts” that the marketer “empirically” observed with their own two eyes, there is scant possibility that these erroneous ideas will ever be revisited, much less questioned.

The continued esteem given to these questionable conclusions causes otherwise sane marketers to irrationally believe their sterile photos to be superior, their shoddy headlines to be superb, and their so-so branding to be sublime.

Statistical significance testing is the cure to this woe. There are math courses aplenty that describe this field, but the world has no need for another. Today, I propose something different: a crash course in the necessary intuition. The goal of this piece, then, is to instill in you a series of “a-ha” moments that’ll make statistical significance click all while sending warm, fuzzy rushes of understanding into your mind.

Epiphany #1: Large sample sizes dilute eccentricity

Imagine you see a book rated an “average” of 5 stars on Amazon. If this average was based on the review of only a single reader, you would hardly think this book better than another which was rated lower (say 4.2 stars), albeit on the back of hundreds of reviews. Common sense informs you that the book with one rating might have been reviewed by a reader who, for purely idiosyncratic reasons, happened to adore it. But you, as a cautious potential purchaser, cannot tell whether that single review was more reflective of the reviewer rather than the book. Without further information, you cannot be confident of the book’s quality.

As you can see with this Amazon book review example, small sample sizes give eccentricity a chance to express itself. It’s for this exact same reason that an advertising campaign report containing only five clicks makes for an unreliable source of truth. Here, it’s possible that those five advert-clickers were oddly passionate fans of your product who happened to see your advert at the right time. The excellent results your advertising campaign seemed to enjoy may, in reality, have just been a fluke.
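To watch eccentricity get diluted, here is a small simulation sketch in Python. It assumes a made-up “true” conversion rate of 10% and repeatedly measures it from reports containing either 5 clicks or 2,000 clicks. The rate, the report sizes and the random seed are all arbitrary choices for illustration.

```python
import random
import statistics


def observed_rates(true_rate, sample_size, repeats, rng):
    """Simulate `repeats` reports and return the rate seen in each one."""
    rates = []
    for _ in range(repeats):
        hits = sum(rng.random() < true_rate for _ in range(sample_size))
        rates.append(hits / sample_size)
    return rates


rng = random.Random(42)  # fixed seed so the run is repeatable
small = observed_rates(0.10, sample_size=5, repeats=200, rng=rng)
large = observed_rates(0.10, sample_size=2_000, repeats=200, rng=rng)

print("5-click reports range from", min(small), "to", max(small))
print("2,000-click reports range from", min(large), "to", max(large))
print("spread (std dev):",
      round(statistics.pstdev(small), 4), "vs",
      round(statistics.pstdev(large), 4))
```

The underlying rate is identical in both cases. Only the size of the report changes, and with it how far a single report can stray from the truth.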

The intuitive idea we have just seen has not gone unnoticed by mathematicians. They have indeed packaged it up with a delightfully self-explanatory name: “The Law of Large Numbers”.

Epiphany #2: P-values are trade-offs between certainty and experiment length

Imagine a gambler who bets his house on a coin being rigged. We would pity his stupidity if he bet after seeing a coin land “heads” only three times in a row. But no one would doubt his mental acuity if the coin had instead landed “heads” a million times in a row. Intuitively, and indeed mathematically, this is a sound bet.

But notice that the gambler is still betting. He can never be fully, totally, and absolutely certain that the coin is rigged. Even after seeing a million “heads” in a row, there is still an infinitesimal yet nevertheless existent chance that a fair coin could have given the result of “a million heads in a row”. But, practically speaking, this is exceedingly unlikely, so the gambler shouldn’t let such a tiny shard of uncertainty deter him from making a fundamentally sound bet.

Now we have two extremes: flipping a coin three times, after which it is too early to make a confident bet; and flipping a coin a million times, after which it is exceedingly secure to bet. But what if our gambling man has a family wedding to attend that afternoon? He still wants to bet on the coin in confidence, but he doesn’t want to wait around until it has been flipped “heads” a million times in a row. Translated into the business context, what if we, as advertisers, don’t want to continue our Puppies vs Kitten Photo A/B test for 10 years before deciding which photo was better for sales? By waiting that long, we would have wasted 10 years in showing a proportion of our customers a photo that was comparatively ineffective at effecting sales. Had we figured out which photo was better earlier on, we could have had our best foot forward for a much longer period and earned higher profits all that while.

The crux of the matter is this: There is a trade-off between certainty and experiment length. We can see this intuitively by considering how our feelings of confidence would develop after an increasingly long series of coin flips. After 1 flip of “heads”, none of us would suspect the coin of being rigged. After 5 flips, we’d start seriously entertaining the thought. After 10 flips, most of us would strongly suspect it, but perhaps not enough to bet the house on it. After 100 flips, little doubt could remain in our minds, and we’d feel confident about making a serious wager. After 1,000 flips, we’d be screaming at the top of our lungs for the bookie to take our money.
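The growing confidence described above has simple arithmetic behind it: a fair coin lands “heads” n times in a row with probability 0.5 raised to the power n. A quick Python sketch (the function name is just for illustration):

```python
def chance_all_heads(flips):
    """Probability that a fair coin lands heads on every one of `flips` tosses."""
    return 0.5 ** flips


for flips in (1, 5, 10, 100):
    odds = 1 / chance_all_heads(flips)
    print(f"{flips:>3} heads in a row from a fair coin: 1 in {odds:,.0f}")
```

Each extra “heads” halves the chance that a fair coin is responsible, which is why suspicion builds so quickly.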

As we have seen, the more consecutive “heads” flips we witnessed landing, the more certain we’d feel about the coin being rigged. But given that we are not immortal and that we will never reach 100% certainty with anything in our lives, we all must choose a pragmatic point where our uncertainty reaches a tolerably low level, a point where we put our hands up and say, “I’ve seen enough—let’s do this thing”.

This trade-off point is quantified by statisticians with a figure they dub the p-value. Very roughly speaking, the p-value corresponds to the chance you have of being wrong about your conclusion. The p-value threshold you choose can thus be thought of as a preference, one that represents your desired trade-off between certainty and experiment length. Typically, marketers set their p-value threshold to .05, which corresponds to having a 1 in 20 chance of being wrong. If you are risk averse about making mistakes, you could set your threshold to .01, which would mean you have only a 1 in 100 chance of being wrong (but your experiment would take much longer to attain this heightened level of certainty).

jelly-beans-statisical-significance(Image Source)

Perhaps no industry is as wedded to the use of p-values as the pharmaceutical industry. As Ben Goldacre points out in his chilling book, Bad Pharma, there is terrible potential for the pharmaceutical industry to hoodwink doctors and patients with p-values. For example, a p-value of .05 means that one trial in 20 will incorrectly show a drug to be effective, even though, in actuality, that drug is no better than placebo. A dodgy pharmaceutical company could theoretically perform 20 trials of such a drug, bury the 19 trials showing it to be rubbish, and then proudly publish the one and only study that “proves” the drug works.

For the same probabilistic reasons, the online marketer who trawls through their Google AdWords/Facebook Ads/Google Analytics reports looking for patterns runs a big risk of detecting trends and tendencies which don’t really exist. Every time said marketer filters their data one way or the other, they are essentially running an experiment. By sheer force of random chance, there will inevitably be anomalies, anomalies which the marketer will then falsely attribute to an underlying pattern. But these anomalies are often no more special than seeing a coin land “heads” five times in a row in one of 100 different experiments in which you flipped a fair coin five times (a fair coin does that about once in every 32 attempts).
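To put a number on that risk, here is a short Python sketch. It assumes every comparison is independent and that there is no real effect to find anywhere, which is a simplification: slices of the same report usually overlap.

```python
def chance_of_false_alarm(comparisons, alpha=0.05):
    """Chance that at least one of several no-effect comparisons
    looks "significant" at the given threshold, assuming independence."""
    return 1 - (1 - alpha) ** comparisons


for comparisons in (1, 5, 20):
    chance = chance_of_false_alarm(comparisons)
    print(f"{comparisons:>2} comparisons: {chance:.0%} chance of at least one fluke")
```

Under those assumptions, twenty looks at the data leave you more likely than not to find something that isn’t there.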

Epiphany #3: Small differences in conversion rates are near impossible to detect. Large ones, trivial.

Imagine we observed the following advertising results:

advertising-results-conversions

Upon eyeballing the data, we see that the goat variant tripled its equestrian competitor’s conversion rate. What’s more, we see that there was a large number of impressions (1,000) in each arm of the experiment. Is this enough to satisfy the aforementioned “Law of Large Numbers” and give us the certainty we need? Surely these data mean that the “Miniature Goat” is the better photo in a statistically significant way?

Not quite. Without going too deep into the math, these results fail to reach statistical significance (where p=.05). If we concluded that the goat was the better photo, we would have a 1 in 6 chance of being wrong. Our failure to reach statistical significance despite the large number of impressions shows us that impressions alone are insufficient in our quest for statistically significant results. This might surprise you. After all, if you saw a coin land “heads” 1,000 times in a row, you’d feel damn confident that it was rigged. The math of statistical significance supports this feeling—your chances of being wrong in calling this coin rigged would be about 1 in 1,000,000,000,000,000,000,000,000,000,000… (etc.)
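For the curious, here is one way to arrive at a figure close to “1 in 6”. The screenshot’s numbers are not reproduced in the text, so the counts below (1 conversion for the pony and 3 for the goat, on 1,000 impressions each) are inferred from “tripled” and from the .2% difference mentioned later in this section. The test used here, a one-sided pooled two-proportion z-test, is an assumption about the method and not something the article states.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_p_value(conv_a, n_a, conv_b, n_b):
    """One-sided pooled two-proportion z-test: is variant B's rate higher than A's?"""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_error
    return 1 - NormalDist().cdf(z)


# Inferred counts: pony = 1 of 1,000, goat = 3 of 1,000.
p = one_sided_p_value(1, 1000, 3, 1000)
print(f"p = {p:.3f}, or roughly 1 in {1 / p:.0f}")
```

With counts this small the normal approximation is rough, and an exact test would be more appropriate in practice, but the conclusion is the same either way: nowhere near .05.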

So why is it that the coin was statistically significant after 1,000 flips but the advert wasn’t after 1,000 impressions? What explains this difference?

Before answering this question, I’d like to bring up a scary example that you’ve probably already encountered in the news: Does the use of a mobile phone increase the risk of malignant brain tumors? This is a fiendishly difficult question for researchers to answer, because the incidence of brain tumors in the general population is (mercifully) tiny to start off with (about 7 in 100,000). This low base incidence means that experimenters need to include absolutely epic numbers of people in order to detect even a modestly increased cancer risk (e.g., to detect that mobile phones double the tumor incidence to 14 cases per 100,000).

Suppose that we are brain cancer researchers. If our experiment only sampled 100 or even 1,000 people, then both the mobile-phone-using and the non-mobile-phone-using groups would probably contain 0 incidences of brain tumors. Given the tiny base rate, these sample sizes are both too small to give us even a modicum of information. Now suppose that we sampled 15,000 mobile phone users and 15,000 non-users (good luck finding those).
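A few lines of Python show why small samples are hopeless here. The sketch uses the base rate quoted above (about 7 in 100,000) and treats every person as an independent draw, which is a simplifying assumption.

```python
BASE_RATE = 7 / 100_000  # incidence quoted in the text


def chance_of_zero_cases(sample_size, rate=BASE_RATE):
    """Probability that a group of this size contains no cases at all."""
    return (1 - rate) ** sample_size


for size in (100, 1_000, 15_000):
    expected = size * BASE_RATE
    print(f"{size:>6} people: {expected:.3f} expected cases, "
          f"{chance_of_zero_cases(size):.0%} chance of seeing none")
```

Even at 15,000 people per group you expect only about one case, so a single extra case can swing the comparison.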

At the end of this experiment, we might count two cases of malignant brain cancer in the mobile-phone-using group and one case in the non-mobile-using group. A simpleton’s reading of these results would conclude that the incidence of cancer (or the “morbid conversion rate”) with mobile phone users is double that of non-mobile-phone users. But you and I know better, because intuitively this feels like too rash a conclusion—after all, it’s not that difficult to imagine that the additional tumor victim in the mobile-phone-using group turned up there merely by random chance. (And indeed, the math backs this up: this result is not statistically significant at p=.05; we’d have to increase the sample size a whopping 8 times before we could detect this difference.)
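A figure of about 8 can be reproduced with a back-of-the-envelope calculation, sketched below in Python. It assumes a one-sided pooled two-proportion z-test at the .05 level and assumes the observed rates stay exactly the same as the sample grows. Neither assumption is stated in the text, so treat this as one plausible route to the number and not as the author’s method.

```python
from math import sqrt
from statistics import NormalDist


def z_for_difference(cases_a, cases_b, group_size):
    """z-score for the gap between two equally sized groups (pooled test)."""
    pooled = (cases_a + cases_b) / (2 * group_size)
    std_error = sqrt(pooled * (1 - pooled) * 2 / group_size)
    return (cases_b - cases_a) / group_size / std_error


z_observed = z_for_difference(1, 2, 15_000)
z_required = NormalDist().inv_cdf(0.95)  # one-sided .05 threshold (assumption)

# With the rates held fixed, z grows with the square root of the sample size.
scale_factor = (z_required / z_observed) ** 2
print(f"z = {z_observed:.2f}; the sample would need to grow about {scale_factor:.1f}x")
```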

Let’s return to our coin-flipping example. Here we only considered two outcomes—that the coin was either fair (50% of the time it lands “heads”) or fully biased to “heads” (100% of the time it lands “heads”). Phrasing the same possibilities in terms of conversion rates (where “heads” counts as a conversion), the fair coin has a 50% conversion rate, whereas the biased coin has a 100% conversion rate. The absolute difference between these two conversion rates is 50% (100% – 50% = 50%). That’s stonking huge! For comparison’s sake, the (reported) difference between the miniature pony and miniature goat photo variants (from the example at the start of this section) was only .2%, and the suspected increase in cancer risk for mobile phone users was .01%.

Now we get to the point: It is easier to detect large differences in conversion rates. They display statistical significance “early” (i.e., after fewer flips or fewer impressions, or in studies relying on smaller sample sizes). To see why, imagine an alternative experiment where we tested a fair coin against one ever so slightly biased to “heads” (e.g., one that lands “heads” 51% of the time). This would require many, many coin flips before we would notice the slight tendency towards heads. After 100 flips we would expect to see 50 “heads” with a fair coin and 51 “heads” with the rigged one, but that extra “heads” could easily happen by random chance alone. We’d need about 15,000 flips to detect this difference in conversion rates with statistical significance. By contrast, imagine detecting the difference between a coin biased 0% to “heads” (i.e., always lands “tails”) and one biased 100% to “heads” (in other words, imagine detecting a 100% difference in conversion rates). After 10 coin flips we would notice that the results would be either ALL heads or ALL tails. Would there really be much point in continuing to flip 90 more times? No, there would not.
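One way to arrive at a figure in the neighborhood of 15,000 flips is the textbook sample-size approximation for testing a single proportion against a known baseline. The sketch below is an illustration under stated assumptions, not necessarily the calculation behind the number quoted above: a one-sided test at p=.05, an 80% chance of catching the bias if it exists (the "power"), and a normal approximation. The function name `flips_needed` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def flips_needed(fair_rate, biased_rate, alpha=0.05, power=0.80):
    """Approximate flips needed to tell a coin with `biased_rate` apart
    from one with `fair_rate` (one-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # significance threshold
    z_power = NormalDist().inv_cdf(power)      # chance of catching a real bias
    spread = (z_alpha * sqrt(fair_rate * (1 - fair_rate))
              + z_power * sqrt(biased_rate * (1 - biased_rate)))
    return ceil((spread / (biased_rate - fair_rate)) ** 2)


slight_bias = flips_needed(0.50, 0.51)  # 1-point difference
clear_bias = flips_needed(0.50, 0.60)   # 10-point difference

print("flips to detect 50% vs 51%:", slight_bias)
print("flips to detect 50% vs 60%:", clear_bias)
```

Because the difference sits squared in the denominator, making the bias ten times larger cuts the required flips by roughly a factor of a hundred.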

This brings us to our next point, which is really just a corollary of the above: Small differences in conversion rates are near impossible to detect. The easiest way to understand this point is to consider what happens when we compare the results of two experimental variants with identical conversion rates: After a thousand, a million, or even a trillion impressions, you still won’t be able to detect a difference in conversion rates, for the simple reason that there is none!

Bradd Libby, of Search Engine Land, calculated the rough number of impressions necessary in each arm of an experiment to reach statistical significance. He then reran this calculation for various different click-through rate (CTR) differences, showing that the smaller the expected conversion rate difference, the harder it is to detect.

[Table image: impressions needed in each arm of an experiment to reach statistical significance, by CTR difference]

Notice how in the final row an infinite number of impressions are needed; as we said above, we will never detect a difference, because there is none to detect. The consequence of all this is that it’s not worth your time, as a marketer, to pursue tiny expected gains; instead, you’d be better off going for a big win that you have a chance of actually noticing.
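A rough version of this kind of calculation can be sketched with the standard two-proportion sample-size approximation. To be clear, this is not Bradd Libby's method, and its output should not be expected to match his table: the two-sided p=.05 threshold, the 80% power, the example CTRs, and the function name are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def impressions_per_arm(ctr_a, ctr_b, alpha=0.05, power=0.80):
    """Approximate impressions needed in EACH arm to detect the gap
    between two click-through rates (two-sided test)."""
    if ctr_a == ctr_b:
        return math.inf  # no difference exists, so none can ever be detected
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = ctr_a * (1 - ctr_a) + ctr_b * (1 - ctr_b)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (ctr_a - ctr_b) ** 2)


baseline = 0.010  # hypothetical 1% CTR for the incumbent ad
for challenger in (0.020, 0.015, 0.011, 0.010):
    needed = impressions_per_arm(baseline, challenger)
    print(f"{baseline:.1%} vs {challenger:.1%}: {needed} impressions per arm")
```

The last line of output is the code's equivalent of the table's final row: identical CTRs require infinitely many impressions.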

Epiphany #4: You destroy a test’s validity by pulling the plug before its preordained test-duration has passed

Anyone wedded to statistical rigor ought to think twice about shutting down an experiment after perceiving what appears to be initial promise or looming disaster.

Medical researchers, with heartstrings tugged by moral compassion, wish that every cancer sufferer in a trial could receive what’s shaping up to be the better cure, notwithstanding that the supposed superiority of this cure has yet to be established with anything approaching statistical significance. But this sort of rash compassion can have terrible consequences, as happened in the history of cancer treatment. For far too long, surgeons subjected women to a horrifically painful and disfiguring procedure known as the ‘radical mastectomy’. Hoping to remove all traces of cancer, doctors removed the chest wall and all axillary lymph nodes, along with the cancer-carrying breast; it later transpired that removing all this extra tissue brought no benefit whatsoever.

Generally speaking, we should not prematurely act upon the results of our tests. The earlier stages of an experiment are unstable. During this time, results may drift in and out of statistical significance. For all you know, two more impressions could cause a previous designation of “statistically significant” to be whisked out from under your feet. Moreover, statistical trends can completely switch direction during their run-up to stability. If you peep at results early instead of waiting until an experiment runs its course, you might leave with a conclusion completely at odds with reality.

For this reason, it’s best practice not to peek at an experiment until it has run its course—this being defined in terms of a predetermined number of impressions or a preordained length of time (e.g., after 10,000 impressions or two weeks). It is crucial that these goalposts be established before starting your experiment. If you accidentally happen to view your results before these points have been passed, resist the urge to act upon what you see or even to designate these premature observations as “facts” in your own mind.
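If you would like to see how much damage peeking can do, the simulation below is one way to sketch it. It runs many make-believe experiments in which both ad variants share the same true conversion rate, so every "significant" result is a false alarm. Everything here is an assumption chosen for illustration: the 10% conversion rate, the 20 peeks of 100 impressions each, the two-sided z-test, and the function names.

```python
import random
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(conv_a, conv_b, n):
    """Pooled two-proportion z-test for two arms of n impressions each."""
    pooled = (conv_a + conv_b) / (2 * n)
    std_err = sqrt(pooled * (1 - pooled) * 2 / n)
    if std_err == 0:
        return 1.0  # no conversions (or all conversions): nothing to compare
    z = (conv_a - conv_b) / n / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_peeking(experiments=300, looks=20, batch=100,
                     rate=0.10, alpha=0.05, seed=1):
    """Return (false-alarm rate when acting on ANY significant peek,
    false-alarm rate when judging only at the planned end)."""
    rng = random.Random(seed)
    peeking_alarms = final_alarms = 0
    for _ in range(experiments):
        conv_a = conv_b = n = 0
        alarm_at_any_peek = False
        for _ in range(looks):
            conv_a += sum(rng.random() < rate for _ in range(batch))
            conv_b += sum(rng.random() < rate for _ in range(batch))
            n += batch
            p = two_sided_p_value(conv_a, conv_b, n)
            alarm_at_any_peek = alarm_at_any_peek or p < alpha
        peeking_alarms += alarm_at_any_peek
        final_alarms += p < alpha  # p from the last look only
    return peeking_alarms / experiments, final_alarms / experiments


peek_rate, fixed_rate = simulate_peeking()
print(f"false alarms when peeking:        {peek_rate:.1%}")
print(f"false alarms at the planned end:  {fixed_rate:.1%}")
```

Since the final look is also one of the peeks, the peeking strategy can never raise fewer false alarms than the wait-until-the-end strategy; the only question is how many more.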

Epiphany #5: “Relative” improvement matters, not “absolute” improvement

Look at the following table of data:

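[Table image: conversion data for two photo variations (60s hippy photo: 14% conversion rate; 80s rocker photo: 30% conversion rate)]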

After applying a statistical significance test, we would see that the 80s rocker photo outperforms the 60s hippy photo in a statistically significant way. (The numerical details aren’t relevant for my point so I’ve left them out.) But we need to be careful about what business benefit these results imply, lest we misinterpret our findings.

Our first instinct upon seeing the above data would be to interpret it as proving that the 80s rocker photo converted at a 16% higher rate than the 60s hippy photo, where 16% is the difference by subtraction between the two conversion rates (30% – 14% = 16%).

But calculating the conversion rate difference as an absolute change (rather than a relative change) would lead us to understate the magnitude of the improvement. In fact, if your business achieved the above results, a switch from the incumbent 60s hippy pic to the new 80s rocker pic would cause you to more than double your number of conversions, and, all things being equal, you would, as a result, also more than double your revenue. (Specifically, you would have a 114% improvement, which I calculated by dividing the improvement in conversion rates, 16 percentage points, by the old conversion rate, 14%.) Because relative changes in conversion rates are what matter most to our businesses, we should convert absolute changes to relative ones, then seek out the optimizations that provide the greatest improvements in these impactful terms.
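The arithmetic is simple enough to sketch in a few lines. The function names are made up for illustration; the 14% and 30% rates are the ones from the example above.

```python
def absolute_change(old_rate, new_rate):
    """Difference by subtraction, in percentage points."""
    return new_rate - old_rate


def relative_change(old_rate, new_rate):
    """Improvement as a fraction of the old rate."""
    return (new_rate - old_rate) / old_rate


hippy_rate, rocker_rate = 0.14, 0.30

print(f"absolute change: {absolute_change(hippy_rate, rocker_rate):.0%} (percentage points)")
print(f"relative change: {relative_change(hippy_rate, rocker_rate):.0%}")
```

The relative figure is the one that scales with revenue: the same number of impressions now yields 30 conversions for every 14 you had before.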

Epiphany #6: “Statistically insignificant” does not imply that the opposite result is true

What exactly does it mean when some result is statistically insignificant? The example below has a p-value of approximately .15 for the claim that the Mini Goat photo is superior, making such a conclusion statistically insignificant.

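[Table image: conversion data for the Miniature Pony (.1% conversion rate) and Miniature Goat (.3% conversion rate) photo variations]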

Does the lack of statistical significance imply that there is a full reversal of what we have observed? In other words, does the statistical insignificance mean that the “Miniature Pony” variant is, despite its lower recorded conversion rate, actually better at converting than the “Miniature Goat” variant?

No, it does not—not in any way.

All that the failure to find statistical significance says here is that we cannot be confident that the goat variant is better than the pony one. In fact, our best guess is still that the goat is better. The p-value of .15 means that, if the two photos truly converted equally well, a lead at least as large as the goat’s would turn up only about 15% of the time. (It is tempting to read 1 minus the p-value, .85, as an 85% chance that the claim is true; that is a common shorthand, but it is not strictly what a p-value measures.) The issue is that this evidence falls short of the threshold we chose in advance (p=.05), that is, short of the minimum level of certainty we wanted to have.

One way to intuitively understand this idea is to think of any recorded conversion rate as having its own margin of error. The pony variant was recorded as having a .1% conversion rate in our experiment, but its confidence interval might be (using made-up figures for clarity) .06% above or below this recorded rate (i.e., the true conversion rate value would be between .04% and .16%). Similarly, the confidence interval of the goat variant might be .15% above or below the recorded .3% (i.e., the true value would be between .15% and .45%). Given these margins of error, there exists the possibility that the pony’s true conversion rate would be at the high end (.16%) of its margin of error, whereas the goat’s true conversion rate would lie at its low end (.15%). This would cause a reversal in our conclusions, with the pony outperforming the goat. But in order for this reversal to happen, we would have had to take the most extreme possible values for our margins of error, and in opposite directions to boot. In reality, these extreme values would be fairly unlikely to turn up, which is why we say that it’s more likely that the goat photo is better.
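Here is a rough sketch of how such margins of error can be computed, using the Wilson score interval (one of several standard formulas, chosen here because it behaves sensibly for very low conversion rates). The counts are hypothetical, 1 and 3 conversions out of 1,000 impressions each, so the resulting ranges will not match the made-up figures in the paragraph above.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(conversions, impressions, confidence=0.95):
    """Wilson score interval for a conversion rate."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    rate = conversions / impressions
    denom = 1 + z ** 2 / impressions
    center = (rate + z ** 2 / (2 * impressions)) / denom
    half_width = z * sqrt(rate * (1 - rate) / impressions
                          + z ** 2 / (4 * impressions ** 2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts consistent with the .1% and .3% recorded rates.
pony_low, pony_high = wilson_interval(1, 1000)
goat_low, goat_high = wilson_interval(3, 1000)

print(f"pony: {pony_low:.3%} to {pony_high:.3%}")
print(f"goat: {goat_low:.3%} to {goat_high:.3%}")

# Overlap means a reversal (pony truly better) cannot be ruled out.
intervals_overlap = pony_high > goat_low
print("intervals overlap?", intervals_overlap)
```

Overlapping intervals are the picture behind "not statistically significant": the goat is still the better bet, but the data cannot yet rule out the pony.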

Epiphany #7: Any tests that are run consecutively rather than in parallel will give bogus results

Statistical significance requires that our samples (observations) be randomized such that they fairly represent the underlying reality. Imagine walking into a Republican convention and polling the attendees about who they will vote for in the next US presidential election. Nearly everyone in attendance is going to say “the Republican candidate”. But it’s self-evident that the views of the people in that convention are hardly reflective of America as a whole. More abstractly, you could say that your sample doesn’t reflect the overall group you are studying. The way around this conundrum is randomization in choosing your sample. In our example above, the experimenter should have polled a much broader section of American society (e.g., by questioning people on the street or by polling people listed in the telephone directory). This would cause the idiosyncrasies in voting patterns to even out.

If you ever catch yourself comparing the results of two advertising campaigns that ran one after the other (e.g., on consecutive days/weeks/months), stop right now. This is a really, really bad idea, one that will drain every last ounce of statistical validity from your analyses. This is because your experiments are no longer sampling randomly. Following this experimental procedure is the logical equivalent of extrapolating America’s political preferences after only asking attendees of a Republican convention.

To see why, imagine you are a gift card retailer who observed that 4,000% as many people bought Christmas cards the week before Christmas compared to the week after. You would be a fool if you concluded that the dramatic difference in conversion rates between these two periods was because the dog photo you advertised with during the week preceding Christmas was 40 times better at converting than the cat photo used the following week. The real reason for the staggering difference is that people only buy Christmas cards before Christmas.

Put more generally, commercial markets contain periodic variation—ranging in granularity from full-blown seasonality to specific weekday or time of day shopping preferences. These periodic forces can sometimes fully account for observed differences in conversion rates between two consecutively run advertising campaigns, as happened with the Christmas card example above. The most reliable way to insulate against such contamination is to run your test variants at the same time as one another, as opposed to consecutively. This is the only way to ensure a fair fight and generate the data necessary to answer the question ‘which advert variant is superior?’ As far as implementation details go, you can stick your various variants into an A/B testing framework. This will randomly display your different ads, and once the experiment ends you simply tally up the results.
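The simulation below sketches why parallel, randomized display matters. It is a toy model, not a real A/B testing framework: the dog and cat ads are given exactly the same pulling power, and the only thing that changes is the week (a hypothetical 4% purchase rate before Christmas, 0.1% after). All names and numbers are assumptions for illustration.

```python
import random


def run_campaign(design, impressions_per_week=5000,
                 weekly_rates=(0.04, 0.001), seed=7):
    """Simulate two equally good ads over two weeks with different demand.
    design='consecutive': dog ad in week 1, cat ad in week 2.
    design='parallel':    each impression gets a randomly chosen ad."""
    rng = random.Random(seed)
    shown = {"dog": 0, "cat": 0}
    sold = {"dog": 0, "cat": 0}
    for week, purchase_rate in enumerate(weekly_rates):
        for _ in range(impressions_per_week):
            if design == "consecutive":
                ad = "dog" if week == 0 else "cat"
            else:
                ad = rng.choice(("dog", "cat"))
            shown[ad] += 1
            # The ad shown has no effect on the outcome; only the week does.
            sold[ad] += rng.random() < purchase_rate
    return {ad: sold[ad] / shown[ad] for ad in shown}


consecutive = run_campaign("consecutive")
parallel = run_campaign("parallel")

consecutive_gap = abs(consecutive["dog"] - consecutive["cat"])
parallel_gap = abs(parallel["dog"] - parallel["cat"])

print(f"consecutive: dog {consecutive['dog']:.2%}, cat {consecutive['cat']:.2%}")
print(f"parallel:    dog {parallel['dog']:.2%}, cat {parallel['cat']:.2%}")
```

Run consecutively, the dog ad appears to trounce the cat ad even though the two are identical by construction; run in parallel, the seasonal swing hits both ads equally and the phantom gap all but disappears.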

Perhaps you are thinking, “My market isn’t affected by seasonality, so none of this applies to me”. I strongly doubt that you are immune to seasonality, but for argument’s sake let’s assume your conviction is correct. In this case, I would still argue that you have a blind spot in that you are underestimating the temporally varying effect of competition. There is no way for you to predict whether your competitors will switch on adverts for a massive sale during one week only to turn them off during the next, thereby skewing the hell out of your results. The only way to protect yourself against this (and other) time-dependent contaminants is to run your variants in parallel.

Conclusion

Having been enlightened by the seven big epiphanies for understanding statistical significance, you should now be better equipped to roll up your sleeves and dig into statistical significance testing from a place of comfortable understanding. Your days of opening up Google AdWords reports and trawling for results are over; instead, you will methodically set up parallel experiments, let them run their course, choose your desired trade-off between certainty and experimental time, allow for adequate sample sizes given your expected conversion rate differences, and calculate business impact in terms of relative revenue differences. You will no longer be fooled by randomness.

About the Author: Jack Kinsella, author of Entreprenerd: Marketing for Programmers.
