
Wednesday, December 24, 2014

Split testing website marketing messages with Minitab's Chi-Square % Defective comparison tests

Six Sigma techniques have expanded beyond traditional manufacturing into healthcare, the movie industry, government, and now online marketing.

Split testing (also called A/B testing) is one of the more popular techniques being used today by website developers, programmers and any company with an online presence.

Split testing is a comparison test used to determine what works best on a website: visitors are shown different images, colors, phrases, layouts, etc., and the site tracks which variations get the most "interest." Interest can be measured by time spent on the website, the percentage of clicks on a link or ad, website load time, and even cursor scrolling patterns.

For example, a website might want to determine what gets the most clicks, a button that says "Free Sign Up" (option A), or "Register" (option B).


After enough visitors arrive on the site (the sample size is adequate), the click rates are compared to determine which option did the best. The click rate is the number of visitors who clicked on the button divided by the total number of visitors who saw that option, expressed as a percentage.

In the image above, you can see that 78% of the visitors clicked on the button when it said "Free Sign Up," but only 34% of the visitors clicked on "Register."

The click rate that is higher would be selected ("Free Sign Up"), and the other option is dropped (or modified again to run another A/B test).

Essentially, split (A/B) testing is a simplified hypothesis test or design of experiments (DOE).

In our example, assuming I had more than 100 site visitors, it is pretty obvious that the difference in the percentage is most likely statistically significant (not due to random chance). However, there may be situations where the percentages will be much closer together. In addition, what if we want to run an A/B/C test (3 options instead of only 2)?

Since the primary metric in split testing is a click rate (proportion), analyzing the split test cannot be done with an Analysis of Variance (ANOVA). Therefore, you either need to run a 2-sample proportions test on each comparison (A vs B, B vs C, and A vs C), or you can try Minitab's Chi-Square % Defective analysis.
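To make the pairwise approach concrete, here is a minimal pooled two-proportion z-test sketched in plain Python. The visitor counts of 100 per option are assumed for illustration (they match the 78% vs. 34% example above but are not from an actual test):

```python
import math

def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Two-sided, pooled two-sample proportions z-test.
    Returns the z statistic and the two-sided p-value."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * P(Z > |z|) = erfc(|z|/sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 78 of 100 visitors clicked "Free Sign Up",
# 34 of 100 clicked "Register"
z, p = two_proportion_z_test(78, 100, 34, 100)
print(f"z = {z:.2f}, p = {p:.3g}")
if p < 0.05:
    print("Statistically significant difference in click rate")
```

Keep in mind that running all three pairwise tests (A vs B, B vs C, A vs C) inflates the overall chance of a false positive, which is one reason the single chi-square comparison is convenient.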

We recently performed a split test on our website, using 3 different messages at the top of the page. The top bar in green is a service we used called Hello Bar, which is free to use (for one website only).



The three options we used for our message, along with the background color:

  • A: "Check out all the FREE downloads and information" (Blue)
  • B: "5S and Control Chart SPC templates for FREE" (Gold)
  • C: "FREE Lean and Six Sigma Excel Templates" (Green)

After a month of displaying these 3 different messages to our site visitors (done automatically by Hello Bar), we reviewed the data.


  • A: "Check out all the FREE downloads and information" (Blue) = 0.5% click rate
  • B: "5S and Control Chart SPC templates for FREE" (Gold) = 0.8% click rate
  • C: "FREE Lean and Six Sigma Excel Templates" (Green) = 1.6% click rate
From the data, it seems that the last option performed the best. If I were a betting person, I would have predicted the 2nd option (B), but that's why we collect data to make decisions! In addition, as Six Sigma practitioners, we must also ask whether these differences are statistically significant, or whether they could change with more samples.

To run this analysis in Minitab (version 17), we set up our data in the following format:

Using Minitab's Assistant function (highly recommended when unsure what tests to perform), select Assistant --> Hypothesis Tests...


Next, we decide that we want to "Compare two or more samples" (since we are running an A/B/C test). Under that section, since we have proportions (click rate) instead of measurement data (like website speed or time spent on the site), we select "Chi-square % Defective."


By the way, if you were running only an A/B test, you would select "Compare two samples with each other" and then "2-sample % Defective."

On the next screen, we tell Minitab which worksheet columns correspond to our options. Under the X column, select "Style" (or whatever field identifies your options).


One thing to note: in this analysis, the "clicks" are treated as "defects," even though clicks are actually a good thing. Technically, we could have assigned the "non-clicks" as defects instead. It won't matter either way, as long as you keep track of what you define as the "defect."

There are 4 pages that get generated, but the most useful is the Summary Report.


If you start in the upper left, the p-value is 0.031, which is less than 0.05, so we conclude that at least one option is statistically different from the others. The upper right section tells us that the statistical difference occurs between A and C. It also tells us that option B is not statistically better than option A, nor statistically worse than option C. Therefore, we should definitely drop option A from our Hello Bar message, but we need more data to conclude whether option C is statistically better than option B.
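For readers without Minitab, a rough version of the same overall comparison can be sketched in Python. The impression counts below are hypothetical (1,000 visitors per message, chosen to match the reported click rates; the actual counts weren't published), and the closed-form p-value works only because an A/B/C test has 2 degrees of freedom:

```python
import math

def chi_square_proportions(clicks, impressions):
    """Pearson chi-square test that k click rates are equal.
    With k = 3 groups (df = 2), the chi-square survival function
    has the closed form exp(-chi2 / 2)."""
    total_clicks = sum(clicks)
    total_n = sum(impressions)
    overall_rate = total_clicks / total_n
    chi2 = 0.0
    for c, n in zip(clicks, impressions):
        # Contributions from both cells of each group: clicks and non-clicks
        for observed, expected in ((c, n * overall_rate),
                                   (n - c, n * (1 - overall_rate))):
            chi2 += (observed - expected) ** 2 / expected
    p_value = math.exp(-chi2 / 2)  # exact only for df = 2 (three groups)
    return chi2, p_value

# Hypothetical counts matching the reported 0.5%, 0.8%, 1.6% click rates
chi2, p = chi_square_proportions([5, 8, 16], [1000, 1000, 1000])
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

For a different number of groups, you would replace the closed-form line with a library call such as `scipy.stats.chi2_contingency`.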

Does that mean the message itself makes the difference? Not necessarily. If you were paying attention, each message also had a different background color, so we have confounded our results with color. Maybe site visitors are responding to the color, not the words. Sounds like more split testing is needed!

Split testing is a great tool that has brought Six Sigma analysis into the internet marketing world. I plan to use the term "split testing" or "A/B testing" instead of DOE or experimentation when talking to more tech-savvy audiences.

Have you tried split or A/B testing on a website? Explain what you did in the comments below...

Tuesday, October 1, 2013

Zero defects does not mean the problem is solved

One of the common mistakes made by companies is the assumption that the lack of defects means the problem has gone away.

Especially in low-volume companies, when a particular problem does not recur in a small sample, it is easy to claim victory and move on to a new problem. However, without an adequate sample size, that can be a mistake.

Determining adequate sample size depends on two factors: how confident you want to be, and how big the problem was prior to the improvement.

Let's assume a defect was occurring in a process approximately 20% of the time. The team comes up with a solution and implements it. 10 more units are produced from the process with the "new solution" and there are zero defects. Success! Actually, not so fast...

How likely were we to get zero defects, if nothing was improved? It turns out, the odds are pretty good. We can calculate that exactly using Minitab.

Go to Calc --> Probability Distributions --> Binomial (Pass/Fail data) 



Let's say the failure rate is 20%: enter 0.2 for Event probability, 10 for Number of trials, and an Input constant of zero (the number of failures).




Binomial with n = 10 and p = 0.2
x  P( X <= x )
0     0.107374


This output means there is only about a 10.7% chance of seeing zero failures in 10 samples if the failure rate is still 20%. Ideally, we would like that chance to be less than 5%.

If the failure rate is higher at 50%, then the chance of seeing zero failures would be much lower (since it's pretty likely to have a failure show up under normal conditions).


Binomial with n = 10 and p = 0.5 
x  P( X <= x ) 

0    0.0009766 

If the failure rate is only 10%, then the chance of seeing zero failures would be higher (since a failure is less likely to show up under normal conditions).

Binomial with n = 10 and p = 0.1 
x  P( X <= x ) 

0     0.348678 

About 35% is too high a risk to conclude that the problem went away. If you get a probability less than 5% (such as with a failure rate of 50%), then you can conclude the problem likely has gone away. If it is greater than 5% (such as with a failure rate of 10% or 20%), then we don't have enough samples to "claim victory," and we would need to collect more. You can keep increasing the number of trials in Minitab until the probability of zero failures drops below 5% (less than 0.05).

In this example, with 10 samples and zero failures observed, the original failure rate would need to have been at least about 26% (0.74^10 ≈ 0.049, whereas 0.75^10 ≈ 0.056) in order to statistically say that the problem has been resolved. If your previous failure rate was lower than that (say 15%), then you will need more samples (trials) before you can feel confident the problem has been resolved.


Minitab has another method for figuring out the correct sample size and confidence. We will cover that in a later discussion (or email us if you need help).



Don't have Minitab? The calculations are easy manually for zero defects...
  • Probability of zero defects with 10% failure rate = (0.9 * 0.9 * 0.9 * 0.9 * 0.9 * 0.9 * 0.9 * 0.9 * 0.9 * 0.9) = 0.3486

  • Probability of zero defects with 20% failure rate = (0.8 * 0.8 * 0.8 * 0.8 * 0.8 * 0.8 * 0.8 * 0.8 * 0.8 * 0.8) = 0.1073
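The same zero-defect arithmetic, plus the search for an adequate sample size, can be sketched in a few lines of Python (the function names here are mine, not Minitab's):

```python
import math

def p_zero_failures(fail_rate, n):
    """Probability of zero failures in n trials: binomial P(X = 0) = (1 - p)^n."""
    return (1 - fail_rate) ** n

def min_trials_for_confidence(fail_rate, alpha=0.05):
    """Smallest n such that zero failures would be rarer than alpha
    if the process were still failing at fail_rate."""
    return math.ceil(math.log(alpha) / math.log(1 - fail_rate))

print(p_zero_failures(0.2, 10))        # ~0.107, matching the Minitab output above
print(min_trials_for_confidence(0.2))  # 14 defect-free trials needed at a 20% rate
```

So at a prior 20% failure rate, 10 defect-free units are not enough; you would need 14 in a row before the zero-defect result becomes convincing at the 5% level.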

Next time you are reviewing your data, make sure you have the statistical confidence to say that the problem has gone away, so you don't get embarrassed later when it returns.

Wednesday, December 12, 2012

Online statistics training from Minitab has great animation and examples

Minitab statistical software is one of the most powerful and popular programs around, especially for those in the Six Sigma realm.

They recently released a suite of online training courses called Minitab Quality Trainer.



The training is very good, better in many ways than what an instructor can do in a classroom: the narration is streamlined, and the animation reinforces the concepts being discussed. Online training can reduce training time by 50% or more compared to traditional classroom training, and Minitab has a lot of experience teaching these concepts, so it's no surprise. You can view a sample of Quality Trainer on their website.


There are nine chapters, covering the following topics:


Chapter 1: Descriptive Statistics and Graphical Analysis (60 mins)
Chapter 2: Statistical Inference (60 mins)
Chapter 3: Hypothesis Tests and Confidence Intervals (180 mins)
Chapter 4: Control Charts (75 mins)
Chapter 5: Process Capability (100 mins)
Chapter 6: Analysis of Variance (ANOVA) (60 mins)
Chapter 7: Correlation and Regression (60 mins)
Chapter 8: Measurement Systems Analysis (150 mins)
Chapter 9: Design of Experiments (120 mins)



One of the issues with the training is that Minitab doesn't provide estimated times for each chapter. We like to solve problems, so we took the time to break down the chapters and sections to give you and your employees a better idea of how to schedule and plan for the training courses (see the times above). Completing all nine chapters takes about 15 hours, with some chapters as short as 50 minutes and others as long as 3 hours. For a detailed breakdown of each chapter section, download the estimated times here (.XLSX file).

The cost is also very reasonable: an individual can get access to the training for only $30 per month (or $300 per year). If you have a network version of Minitab at your company, you can also add unlimited access to Quality Trainer for only a couple hundred dollars per licensed user.



To learn more about Minitab Quality Trainer, visit their website at http://www.minitab.com/en-US/products/quality-trainer/default.aspx