Saturday, February 27, 2016

How to use candy to teach or learn Six Sigma

During our Yellow Belt (beginner Six Sigma) class, we have found lots of success teaching statistical concepts using candy.

It is a fun way to make statistics less “scary” for those who don’t have a strong math background.

We start out the class by opening up the bags, and counting out the number of pieces in each bag. Each student separates the pieces into colors, and creates a Pareto chart of their bag (see image below).
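For readers who want to check their hand-drawn chart, here is a minimal Python sketch of the Pareto tally. The colors and counts are hypothetical, and `pareto_table` is just an illustrative helper name, not something we use in class.

```python
from collections import Counter

def pareto_table(pieces):
    """Return (color, count, cumulative_percent) rows, largest count first."""
    counts = Counter(pieces)
    total = sum(counts.values())
    rows = []
    running = 0
    for color, count in counts.most_common():
        running += count
        rows.append((color, count, round(100 * running / total, 1)))
    return rows

# Hypothetical bag contents, not data from an actual class
bag = ["red"] * 12 + ["green"] * 9 + ["yellow"] * 6 + ["orange"] * 3

for color, count, cumulative in pareto_table(bag):
    print(f"{color:<8}{count:>3}{cumulative:>7}%")
```

The bars of the Pareto chart follow the count column, and the cumulative percent column gives the line drawn over the bars.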

Next, we inspect the pieces to see how many have defects, so we can calculate a sigma level based on defect rates.
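As a rough sketch of that calculation: one common convention converts the yield (the fraction of pieces without defects) to a Z value from the normal distribution, then adds a 1.5 sigma shift. The shift, the one-opportunity-per-piece assumption, and the counts are all assumptions for illustration; your class materials may use a lookup table instead.

```python
from statistics import NormalDist

def sigma_level(defects, units, shift=1.5):
    """Approximate sigma level from a defect count.

    Assumes one defect opportunity per unit and the conventional
    1.5 sigma shift (both are assumptions, not from the class).
    """
    if not 0 < defects < units:
        raise ValueError("defects must be greater than 0 and less than units")
    yield_fraction = 1 - defects / units
    return NormalDist().inv_cdf(yield_fraction) + shift

# Hypothetical bag: 3 defective pieces out of 55
print(round(sigma_level(3, 55), 2))
```

With this convention, 3.4 defects per million opportunities works out to roughly 6 sigma, which is where the name comes from.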

Next, we collect the total number of pieces by bag from each student, and calculate a mean (average), median and estimate variation using the standard deviation for the number of pieces per bag. Yes, we calculate it all by hand, not using Excel or Minitab.
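If you want to check your hand calculations afterward, a short Python sketch like this one follows the same steps. The piece counts are hypothetical, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from math import sqrt
from statistics import median, stdev

# Hypothetical pieces-per-bag counts from five students
pieces_per_bag = [54, 56, 55, 57, 53]

n = len(pieces_per_bag)
average = sum(pieces_per_bag) / n

# Same steps as the hand calculation: squared deviations from the mean,
# summed, divided by n - 1, then the square root.
squared_deviations = sum((x - average) ** 2 for x in pieces_per_bag)
by_hand_stdev = sqrt(squared_deviations / (n - 1))

print("mean:", average)
print("median:", median(pieces_per_bag))
print("standard deviation:", round(by_hand_stdev, 3))

# The library function should agree with the by-hand steps
print("library check:", round(stdev(pieces_per_bag), 3))
```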

In the next exercise, we create a histogram of the pieces, to look at the distribution of the class data (it comes out looking like a normal distribution, no surprise there).

We calculate capability indices (Pp and Ppk) from the data, using specification limits that might represent wasted money for the company, or customer dissatisfaction from insufficient pieces in a bag.
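The usual formulas are Pp = (USL - LSL) / (6s) and Ppk = min(USL - mean, mean - LSL) / (3s), where s is the overall sample standard deviation. A minimal sketch, with hypothetical counts and hypothetical specification limits:

```python
from statistics import mean, stdev

def pp_ppk(data, lsl, usl):
    """Return (Pp, Ppk) using the overall sample standard deviation."""
    center = mean(data)
    s = stdev(data)
    pp = (usl - lsl) / (6 * s)
    ppk = min(usl - center, center - lsl) / (3 * s)
    return pp, ppk

# Hypothetical pieces-per-bag counts and specification limits
pieces_per_bag = [54, 56, 55, 57, 53]
lower_limit = 50   # fewer pieces than this disappoints the customer
upper_limit = 62   # more pieces than this wastes money

pp, ppk = pp_ppk(pieces_per_bag, lower_limit, upper_limit)
print(f"Pp = {pp:.2f}, Ppk = {ppk:.2f}")
```

Ppk comes out lower than Pp whenever the mean is off center between the limits, which is a useful talking point with the class.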

During some of our classes, we conduct a gage R&R study by measuring the size of candy pieces, to see how well we can get consistent measurements. Other times, we opt to use an attribute gage R&R study, and check to see how well people can determine the difference between different types of soda or brands of bottled water.

Finally, we calculate control limits for an Individuals chart, and plot the results to look for “out of control” conditions (based on Nelson Rules, an updated version of the Western Electric Rules).
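For an Individuals chart, the standard limits are the mean plus or minus 2.66 times the average moving range (2.66 is 3 divided by the d2 constant of 1.128 for a moving range of two points). Here is a minimal sketch with hypothetical data. It only checks for points beyond the control limits, not the full set of Nelson Rules.

```python
from statistics import mean

def individuals_limits(data):
    """Return (lcl, center, ucl) for an Individuals chart."""
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    center = mean(data)
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread

def beyond_limits(data):
    """Return the positions (0-based) of points outside the control limits."""
    lcl, _, ucl = individuals_limits(data)
    return [i for i, x in enumerate(data) if x < lcl or x > ucl]

# Hypothetical pieces-per-bag counts, in the order they were collected
pieces_per_bag = [54, 56, 55, 57, 53]

lcl, center, ucl = individuals_limits(pieces_per_bag)
print(f"LCL = {lcl:.3f}, center = {center:.3f}, UCL = {ucl:.3f}")
print("out of control points:", beyond_limits(pieces_per_bag))
```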

If you'd like to see the class live and in-person, we have a Six Sigma Yellow Belt training class taking place in Portland >>> 

If Six Sigma is new to you, this is the best class to get started. If you are already trained or certified in Six Sigma, you should also attend, so you can see how to take this fun class back to your company to teach your employees or clients.

If you know someone who might be interested in this course, please forward this webpage to them. 

Saturday, August 22, 2015

Which lean event type should I do first, Value Stream Mapping or Kaizen Burst?

What type of lean event should we do first, Value Stream Mapping (VSM) or Kaizen Burst?

Before we can answer this question, let’s first define these two popular event types:

Value Stream Mapping (VSM) – A multiple day event (2-5 days) focused on mapping the process from customer order to customer delivery. The goal is to complete a current state and future state map, then identify the 8 forms of waste that are keeping the process from achieving the future state. Requires multi-disciplined participation from all stakeholders who impact, or are impacted by, the process. The deliverable is a list of projects, actions and events that need to be completed in the next 6-12 months, with names and due dates.

Kaizen Burst – Also called rapid process improvement (RPI). A 3-5 day event focused on making dramatic improvements to a specific part of a process by the end of the event (focus is on implementation, not planning). Not every action will be completed, but the goal is to have 80% of the actions completed during the event, and the remaining 20% completed within 30 days of the end of the event. The event should allow time to make adjustments in case the changes do not work. The goal is to make quick progress without pushing the waste to another department or area.

If you’re just getting started with your process improvement in a process area or department, I would highly recommend the kaizen burst event. The goal is to scope the intent and purpose of the event, then do a considerable amount of work up front (usually takes 1-2 weeks of preparation). During the event, the first couple days are focused on getting everyone familiar with the preparation, and to go and physically observe the current process, so everyone is aware of the wastes and issues. The rest of the days are focused on implementation. It’s an action-driven event. When you get done with the event, you have something to show for all that work and time away from your regular job.

Typical Kaizen Burst Workshop format, from Lean Enterprise Institute

When you pull people away from their job for 3 to 5 days, and you are just starting with process improvement (Lean and Six Sigma), you want to have something to show for all that time.

One of the problems with VSM events is that they end with a list of action items. Now in order to see results from the event, you are asking those same people to spend more time after the event to complete these tasks. Remember, they have already taken time out of their work week. They are already behind with their regular work, and will need time to catch up. Even though there is excitement after the event, it’s not enough to carry into the following week. It requires a lot of micro-managing and project management and “hounding” people. In addition, usually only a handful of people get assigned tasks, and it is usually quite a few actions.

Don’t get me wrong, I think VSM events are excellent, and great for getting multiple departments and groups to understand the entire process, and get on the same page. However, it is really just a good planning activity, and the event alone does not result in any immediate improvement (other than networking and knowledge of the value stream). In fact, it can take months before the actions result in any improvement.

Example of Value Stream Map

But for those getting started, or those that are needing major improvements, I would highly recommend the kaizen burst event.

Both events require upfront planning, but the kaizen burst requires more than the VSM, since there may be major improvements taking place in a short amount of time. Getting the right people involved is essential to success. That is why the kaizen burst event needs leadership authorization to empower the team at the start. In order to make these improvements, the team must be allowed to try them out without a lengthy approval process. Leaders must either assign people they trust to the team, or provide guidelines and rules that the team must stay within (such as budget, procedures, etc.).

Finally, a kaizen burst usually involves the people doing the value-added work, more often than VSM events, which seem to be too heavy on management. You want to engage those people as much as possible early on in a Lean Six Sigma journey, since they are the ones who need to understand the concepts, and see the immediate results applied to their work.

After the process area has matured, or a couple kaizen bursts have been performed, then a VSM event might be needed, when their primary problems are coming from outside their area (outside their control). The idea is to “clean up your own backyard” before you bring outsiders into the process.

In summary, if the process area does not have a mature lean system (poor visuals, employees not trained in lean, hard to see process flow, lots of inventory, poor labels, lots of waste), then kaizen burst is the place to start.

What do you think? Do you agree? Share your comments below...

Friday, May 15, 2015

Employee engagement that will make you jealous!

One of the most difficult things for business owners and managers is to engage their employees in continuous improvements every day. Even the most advanced companies have struggled with this.
The problem is that companies focus too heavily on the tool side of process improvements. It’s actually more important how these tools are applied.

The secret is to focus on your employees and make them as successful as possible. When your employees are happy, they want to do a better job for the company. When they do a better job, your customers notice and are happier. Happy customers buy more stuff from you. Most companies don’t make this connection.

Rather than dictate improvements you would like your employees to make, you need to ask those who do the job day in and day out for their ideas, and help them implement them as quickly as possible.

Sadly, only a handful of companies have been able to break through and accomplish this.
One of them is FastCap, a small manufacturer of woodworking products and tools. The owner is Paul Akers, who has transformed his company into a team of highly motivated employees who drive their own efficiency into their daily work.

Paul learned the culture of improvement from experts in the Toyota Production System. He wanted to share what he learned with all his employees, so he came up with a simple and easy approach that everyone in his company could understand and participate in, called “2 Second Lean”. Employees are encouraged to reduce time in their work by 2 seconds. That’s it! Who in your company would not be able to do that?

Luckily for all of us, Paul has documented and recorded his lean journey at FastCap on his YouTube channel and in his book. If you want to be motivated, inspired and amazed at their success, check out his videos at

If you want help with the technical tools, check out our website at

If you want help engaging your employees, it starts with you! We would suggest watching at least 5 of Paul’s videos, and sharing them with your management team.

Saturday, March 28, 2015

Combining DMAIC and lean events to maximize process improvements

The DMAIC approach for process improvement is the foundation for Six Sigma, and I have grown to appreciate it more each time I use it. However, DMAIC projects can take a while to complete. Lean events are great approaches to make dramatic improvements, but who is tracking the long term results after the event is complete?

There are two powerful ways to combine DMAIC with lean events:

1) Use lean events to move quickly through the DMAIC phases

2) Use DMAIC framework to manage lean activity in a work area

1) Use lean events to move quickly through the DMAIC phases

You can reduce the completion time for your project by using the lean event format to help you quickly move through the different DMAIC phases. The great thing about lean events is that you get the right people in the room, focused on a specific outcome, and you have a set timeline to get it done. This creates a strong sense of urgency that some Six Sigma projects seem to lack.

For example, you could have a lean event to build the project charter, perform an FMEA, gather detailed data for the measure phase, develop control charts, or conduct a pilot study or DOE. The idea is to get everyone together for a common task and get it done, rather than drag it out over one hour meetings every week.

You also don't need a full week for each lean event. However, you probably need more than one hour, so schedule half-day or full-day sessions with your team (we suggest at least 2 per month), so you can make lots of progress all at once, and not wait for action items to be completed. It also can be frustrating when you just start making progress, and you hit the end of your hour long meeting.

2) Use DMAIC framework to manage lean activity in a work area

You can also use the DMAIC structure to help with your lean events. During a traditional kaizen event (week long improvement workshop), the DMAIC framework is already being used, even if you don't realize it. When you are doing the prep work, you are conducting the Define and Measure phase. During the actual event, you are conducting Analyze, Improve and Control. However, sometimes the improvements and control systems are not as strong as they are during a Six Sigma project, due to time constraints. In addition, the long term tracking of metrics, to ensure that the event truly achieved the results, is often lacking after an event.

The DMAIC framework will also help you realize if you need more events to complete the improvements in order to achieve the goals for the work area, and allow you to fully capture any cost savings or metric improvements. Perhaps the lean event made great strides, but the inventory is still too high, or they have not been able to consistently achieve their takt time. Maybe the last remaining action item is one of the most crucial items, one that will make a dramatic improvement to the flow. DMAIC will keep the effort moving forward until the results are achieved.

It's important to combine DMAIC and lean events in your improvement plan. We don't want to have an event, make improvements, then walk away and not verify the team achieved their long term goals. Likewise, we don't want Six Sigma projects that take forever to complete.

Wednesday, December 24, 2014

Split testing website marketing messages with Minitab's Chi-Square % Defective comparison tests

Six Sigma techniques have expanded beyond traditional manufacturing companies, into healthcare, the movie industry, government and now online marketing.

Split testing (also called A/B testing) is one of the more popular techniques being used today by website developers, programmers and any company with an online presence.

Split testing describes a comparison test that is used to determine what works best on their website, by showing their site visitors different images, colors, phrases, layouts, etc and tracking which ones get the most "interest". Interest can be measured by time spent on a website, percent of clicks on a link or ad, website load time, and even tracking of cursor scrolling patterns.

For example, a website might want to determine what gets the most clicks, a button that says "Free Sign Up" (option A), or "Register" (option B).

After enough visitors arrive on the site (sample size is adequate), a comparison of the click rate is reviewed to determine which one did the best. The click rate is defined as the number of site visitors who clicked on the button divided by the total number of visitors who saw that option, expressed as a percentage.

In the image above, you can see that 78% of the visitors clicked on the button when it said "Free Sign Up," but only 34% of the visitors clicked on "Register."

The click rate that is higher would be selected ("Free Sign Up"), and the other option is dropped (or modified again to run another A/B test).

Essentially, split (A/B) testing is a simplified hypothesis test or design of experiments (DOE).

In our example, assuming I had more than 100 site visitors, it is pretty obvious that the difference in the percentage is most likely statistically significant (not due to random chance). However, there may be situations where the percentages will be much closer together. In addition, what if we want to run an A/B/C test (3 options instead of only 2)?

Since the primary metric in split testing is a click rate (proportion), analyzing the split test cannot be done with an Analysis of Variance (ANOVA). Therefore, you either need to run a 2-sample proportions test on each comparison (A vs B, B vs C, and A vs C), or you can try Minitab's Chi-Square % Defective analysis.
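For readers without Minitab, here is a minimal Python sketch of a Pearson chi-square test comparing three click rates. We are assuming it is comparable to the Chi-Square % Defective analysis, not claiming it reproduces Minitab's exact output. The visitor and click counts are hypothetical, and the function name is made up for illustration. The p-value shortcut `exp(-statistic / 2)` is only valid for 2 degrees of freedom, which is what a three-option test has.

```python
import math

def chi_square_three_options(clicks, visitors):
    """Pearson chi-square test for equal click rates across 3 options.

    Returns (statistic, p_value). The p-value formula is exact only
    for 2 degrees of freedom, so exactly three options are required.
    """
    if len(clicks) != 3 or len(visitors) != 3:
        raise ValueError("this sketch handles exactly three options")
    overall_rate = sum(clicks) / sum(visitors)
    statistic = 0.0
    for clicked, seen in zip(clicks, visitors):
        expected_clicks = seen * overall_rate
        expected_non_clicks = seen * (1 - overall_rate)
        statistic += (clicked - expected_clicks) ** 2 / expected_clicks
        statistic += ((seen - clicked) - expected_non_clicks) ** 2 / expected_non_clicks
    p_value = math.exp(-statistic / 2)
    return statistic, p_value

# Hypothetical counts for options A, B and C (not real site traffic)
clicks = [5, 8, 16]
visitors = [1000, 1000, 1000]

statistic, p_value = chi_square_three_options(clicks, visitors)
print(f"chi-square = {statistic:.2f}, p-value = {p_value:.3f}")
```

A p-value below 0.05 suggests at least one option has a different click rate, but it does not say which one. For that you still need the pairwise comparisons.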

We recently performed a split test on our website, using 3 different messages at the top of our website. The top bar in green is a service we used called Hello Bar, which is free for use (one website only).

The three options we used for our message, along with the background color:

  • A: "Check out all the FREE downloads and information" (Blue)
  • B: "5S and Control Chart SPC templates for FREE" (Gold)
  • C: "FREE Lean and Six Sigma Excel Templates" (Green)

After a month of displaying these 3 different messages to our site visitors (done automatically by Hello Bar), we reviewed the data.

  • A: "Check out all the FREE downloads and information" (Blue) = 0.5% click rate
  • B: "5S and Control Chart SPC templates for FREE" (Gold) = 0.8% click rate
  • C: "FREE Lean and Six Sigma Excel Templates" (Green) = 1.6% click rate

From the data, it seems that the last option performed the best. If I were a betting person, I would have predicted the 2nd option (B), but that's why we collect data to make decisions! In addition, as Six Sigma practitioners, we must also ask if these differences are statistically significant, or could they change with more samples?

To run this analysis in Minitab (version 17), we set up our data in the following format:

Using Minitab's Assistant function (highly recommended when unsure what tests to perform), select Assistant --> Hypothesis Tests...

Next, we decide that we want to "Compare two or more samples" (since we are running an A/B/C test). Under that section, since we have proportions (click rate) instead of measurement data (like website speed, time spent on website, etc), we select "Chi-square % Defective"

By the way, if you were running only an A/B test, then you would select "Compare two samples with each other" then select "2-sample % Defective"

On the next screen, we tell Minitab what our options should be assigned to in the worksheet. Under the X column, select "Style" (or whatever field you are using to identify your options).

One thing to note: in the analysis, the "clicks" are considered "defects", but in fact these are good things. Technically, we could have assigned the "non-clicks" as defects. It won't matter either way, as long as you keep track of what you define as the "defect".

There are 4 pages that get generated, but the most useful is the Summary Report.

If you start in the upper left, the p-value is 0.031, which is less than 0.05, so we conclude that at least one option is statistically different than one of the other options. The upper right section tells us that the statistical difference occurs between A and C. What it also tells us is that option B is not statistically better than option A, nor is it statistically worse than option C. Therefore, we should definitely drop option A from our Hello Bar message, but we need more data to conclude whether option C is statistically better than option B.

Does that mean the message seems to make a difference? Maybe, because if you were paying attention, we also had different colors for each message, so we have confounded our results with color. Maybe site visitors are not clicking as often due to the color, not the words. Sounds like more split testing is needed!

Split testing is a great tool that has brought Six Sigma analysis into the internet marketing world. I plan to use the term "split testing" or "A/B testing" instead of DOE or experimentation when talking to more tech-savvy audiences.

Have you tried split or A/B testing on a website? Explain what you did in the comments below...

Saturday, September 27, 2014

Don't waste time on a full Gage R&R study until the data says you are ready

We support a lot of small volume manufacturing facilities (delivering low rate, high complexity products), which presents different challenges for implementing Six Sigma.

When you only produce one product per day or per week, it can be difficult to gather a good statistical sample for any analysis. In addition, even if the parts were available, the measurements can be complex with numerous data points, so the time to collect the data for each sample can take from 15 minutes, up to 8 hours or longer!

I'd like to share a best practice we've discovered with Gage Repeatability and Reproducibility (R&R) studies that can help all businesses save time and money with smaller sample sizes, not just those in the low volume production businesses.

I'll assume you have some knowledge of a Gage R&R study. If not, check out this page >>>

Let's assume you need to perform a Gage R&R on a new piece of test equipment.

We recommend a study that will require at least 30 total samples in the experiment. This allows us to gather a significant number of experimental runs to understand what is happening. You may require more, but I would suggest starting with 30, and evaluating the results before adding more runs/samples.

If we have 2 technicians running the equipment (only one piece of equipment), then typically we would take 10 parts, 2 technicians and 3 repeat measurements. That is a standard Gage R&R setup. That would be 10 x 2 x 3 = 60 total samples, which exceeds our 30 sample minimum.

What if each sample takes 2 hours to complete? Our original study will take at least 120 hours. Do you think your company would let you tie up the equipment for that long, and prevent 10 parts from being shipped? Highly doubtful in a low volume environment.

Instead, we should try to reduce the size of our study, and use another combination of parts, technicians and repeats to get closer to 30 samples. We could select one of the following options:

  • 5 parts x 2 technicians x 3 repeats = 30 samples
  • 8 parts x 2 technicians x 2 repeats = 32 samples
  • 6 parts x 3 technicians x 2 repeats = 30 samples
  • Or any other combination you can think of...

Which one is best? It depends on your situation. If you have a lot of uniqueness in your parts, I would select more parts for your study. If you think technicians may be driving variation, you should try to find a 3rd technician to include. If you suspect repeatability issues, then more repeat measurements may be preferred. This is where the expertise of the technicians, managers, engineers and experts can assist.

Let's select the option with 5 parts, 2 technicians and 3 repeats. Even though we have reduced the study down to 30 samples, it will still require 60 hours of testing to complete the Gage R&R, and we will be holding up 5 parts during that time. That's not what your production team will want to hear.

What can we do?

We suggest you conduct a partial Gage R&R, and evaluate those results before completing the full Gage R&R.

A partial Gage R&R would be a much smaller version of our full study. Instead of 5 parts, we should start with 3, and instead of 3 repeats, use only 2 repeats. We still want at least 2 parts, 2 repeats and 2 technicians as a minimum, so we get some estimates for repeatability and reproducibility.

This would create a Gage R&R study of 12 samples (3 parts x 2 technicians x 2 repeats). Now we have reduced the test time to 24 hours, and are only holding up 3 parts. Compared to the alternatives, that should sound pretty good to your production team.

How can we do this and still properly evaluate the equipment? 

The trick is that we might find enough variation in the partial Gage R&R results that we should stop the study, and go work the issues, before we complete the full Gage R&R. No sense wasting time gathering more data showing the same problem!

However, if the partial Gage R&R shows favorable results (% of Tolerance and study variation below acceptable levels), then you will need to complete the full Gage R&R in order to ensure those results hold up with more parts, more technicians and/or more repeats.

The savings will come into effect only if there are problems with the measurement system. If there are no problems, then there will be no savings, as the full Gage R&R will still need to be completed. The nice thing is that you don't need to start over from scratch, you would just expand the partial study until it matched the setup of the full study. For our example, you would simply continue the study with the additional 2 parts, and add one additional repeat run to the study.

Let's see how this works with an example, using sample data from Minitab.

After running the full Gage R&R (30 samples), we come up with the following results (charts generated using the MSA Assistant feature in Minitab version 17):

The results show an unacceptable Gage R&R, as the % of Tolerance is above 30% (calculated at 59.7%), and the % of study variation is showing 63.1%. Most of the variation is coming from reproducibility (technician) at 56%.

The question we want to know is whether our partial Gage R&R study would have detected the same problems with the measurement system.

When we condense the data set down to 12 samples, and re-run the analysis, here are the results.

The results come out very similar!

% of Tolerance (Full) = 59.7%
% of Tolerance (Partial) = 64.5%

% of Study Variation (Full) = 63.1%
% of Study Variation (Partial) = 65.0%

% of Tolerance for Reproducibility (Full) =  56.0%
% of Tolerance for Reproducibility (Partial) =  63.5%

Since the measurement system contains reproducibility issues, we benefited by not running the full Gage R&R study. Now we need to investigate and resolve the technician issues, and conduct another Gage R&R when we feel those issues are resolved.

On a side note, do not run a Gage R&R unless you think it will be successful. If you know there are calibration problems, mismatched equipment, outdated software, worn out parts, or differences in the techniques used, then resolve those issues first. Otherwise, you'll have to re-run the study later, after those improvements are made. The initial study would be a complete waste if it only told you things you already suspected would be a problem. There are enough unknown variables inside a measurement system that you should deal with the obvious and known variables first.

After we make improvements to our measurement system, we would still need to run a partial Gage R&R the 2nd time, but again, do not complete the full Gage R&R until the partial Gage R&R shows results that are acceptable. Your 2nd study may find additional problems, or prove that the improvements were not effective, so we should stop and fix those right away.

Once you get acceptable results from the partial Gage R&R, only then should you continue to the full Gage R&R.

Bottom line, do not misread this and conclude that you only need to run a partial Gage R&R. A full Gage R&R is still needed to ensure all the variation in the additional parts and repeat measurements has been uncovered. But until the measurement system issues are fully resolved, don't spend time on a full study. Wait until the data tells you that you are ready.

This concept can be applied to capability studies as well. Ideally, we would like to have 100-300 samples from a stable process produced from all of our sources of variation in the process, before we calculate an accurate number for capability indices (Cpk and Ppk). That may be easy to do in high volume production environments, but that is very difficult in low volume industries. Even getting a statistically valid sample of 30 is near impossible.

What we suggest is to gather the first 5 samples, and calculate capability. If the small samples show a problem (mean near the limits, large variation compared to limits), that might be enough information to dig into the problem. If the results are good (mean near the target, variation small compared to limits), then you will need to wait until at least 30 samples are generated before drawing any long term conclusions from the Cpk/Ppk values.

Remember, 30 samples is ideal, but 5 samples is better than only one sample, which is better than none at all!

In summary, in order to save time and money conducting a Gage R&R study, we suggest you follow these three recommendations:

1) Set up your full Gage R&R study to run only 30 samples - then decide if more are needed
2) Do not run a Gage R&R if you suspect it will not pass - address known issues first
3) Run a partial Gage R&R first, then if the results are acceptable, complete the full study, otherwise go address the identified issues

If you'd like to learn more about Gage R&R studies, check out our training class >>>

Has anyone had experience trying this approach? We would love to hear from you! Add your comments below...

Saturday, September 20, 2014

What does golfing have to do with Six Sigma?

Here at BPI, we are always looking for ways to make Lean and Six Sigma concepts easier to understand for those getting started.

We will be creating a series of videos with examples that might resonate with you on basic terms and concepts. Feel free to use these videos in your training class, or send them to your attendees to help reinforce concepts you've already taught or coached.

The following video puts Six Sigma in terms of golfing, where the distance of the tee shot is being measured, compared to the limits of the golf hole.

The process of teeing off can be evaluated in Six Sigma terms based on how likely the shot will land between two sand traps.

We also explain how to connect these concepts back to the real world, such as in business with a forecasting budget process, with limits of +/- 10%.

Are there good examples you've heard, that you'd like us to capture in animation format? Let us know in the comments below...