Finding & Managing Bottlenecks in Process Plants

December 9, 2009 Peter King

One of the primary objectives of Lean is to achieve smooth, continuous flow of material through the process. Bottlenecks within your process inhibit flow, cause inventory to build up, and prevent throughput from matching customer demand (Takt). To make progress toward Lean goals, bottleneck resources must be identified, managed, and improved. From a purely financial point of view, bottleneck management can reduce inventory, which reduces operating cost, and can increase throughput, which increases revenue and profitability.

Bottlenecks in Process Plants

Bottlenecks tend to have different causes and more severe implications in the process industries. In parts manufacture and assembly, people tend to be the rate-limiting factor in many steps, so managing bottlenecks is often a matter of managing people – by appropriate staffing and task leveling. In process plants, throughput in most manufacturing steps is limited by equipment capability, not by labor. In cereal manufacture, throughput can be limited by baking times or by flake extrusion rates; in paper making, by the lineal speed capability of the forming, bonding, or calendering machinery; and in paint making, by the time required to complete the chemical reaction in resin production. With equipment rather than operating labor being the bottleneck, throughput limitations can’t be resolved by bringing in additional labor or by scheduling overtime. And since many process plants already run around the clock on a 24/7 schedule, scheduling extra shifts is not the answer. Further, since equipment tends to be very expensive and relatively inflexible, replacing or upgrading equipment is not often a viable option. Managing the bottleneck is a matter of optimizing the performance of the bottleneck resource itself, protecting the bottleneck from upstream and downstream problems, and optimizing bottleneck scheduling.

Since OEE (1) factors all have variability, what is not a bottleneck at some point may become one at another. For example, a 90% yield value shown on a Value Stream Map (VSM) is an average and doesn’t mean that you lose 10% every day.  That process step may run at 99% yield on a good day and 80% on a bad day.  The step may have plenty of capacity on the 99% day, but can be a bottleneck on the 80% day.  Similarly, variability in equipment reliability on a day-by-day basis can cause non-bottlenecks to become bottlenecks for periods of time.

(1)     OEE and UPtime are measures of equipment performance, of the relationship between nameplate capacity and effective capacity.  They are defined in detail in (1).
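The yield example above can be made concrete with a short calculation. This is a minimal sketch, not the author's method: it assumes utilization is simply demand divided by effective capacity, and that effective capacity is nameplate rate multiplied by yield. The demand and nameplate figures are hypothetical; only the 99%, 90%, and 80% yields come from the text.

```python
def utilization(demand_rate, nameplate_rate, yield_fraction):
    """Demand divided by the good output the step can deliver (assumed model)."""
    effective_rate = nameplate_rate * yield_fraction
    return demand_rate / effective_rate


DEMAND = 85.0      # hypothetical Takt rate, units per hour
NAMEPLATE = 100.0  # hypothetical nameplate rate, units per hour

for label, y in [("good day", 0.99), ("average", 0.90), ("bad day", 0.80)]:
    u = utilization(DEMAND, NAMEPLATE, y)
    status = "bottleneck" if u > 1.0 else "has capacity"
    print(f"{label}: yield {y:.0%}, utilization {u:.0%} ({status})")
```

With these assumed numbers the step has spare capacity on the good and average days but exceeds 100% utilization on the bad day, which is the pattern the paragraph describes.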

Some of the most significant characteristics of bottlenecks found in process plants are:

1.      The root cause is generally in equipment capacity and performance, not labor staffing

2.      Root causes include yield losses and reliability problems as well as inherent capacity

3.      Non-bottlenecks can become bottlenecks due to variability in OEE factors

4.      Plants often run around the clock, so additional shifts are not a feasible solution

5.      Bottlenecks can move with product mix

6.      Bottlenecks may not be obvious – the resulting inventory and other waste is frequently hidden from sight

Capacity Constraint Resources – CCRs

It is important to recognize that throughput can be limited by factors directly related to a piece of equipment and its performance, either its inherent processing rate or losses due to downtime, poor yield, and/or rate limitations.  But throughput can also be limited by the manner in which a piece of equipment is scheduled and how well its flow is synchronized with upstream and downstream process steps.  Synchronous Manufacturing (2) explains this distinction, referring to the former as Bottleneck Resources and the latter as Capacity Constraint Resources (CCRs):

Capacity Constraint Resource – Any resource which, if not properly scheduled and managed, is likely to cause the actual flow of product through the plant to deviate from the planned product flow.

As an example of a CCR, consider the case of a cereal plant which manufactures two families of cereal, one formed into thick shapes like stars and circles, and one formed into relatively flat flakes of various shapes.  The plant can be divided into three major areas, shape manufacturing, flake manufacturing, and packaging, which includes bagging, boxing, cartoning, and palletizing.

According to the data boxes on a more detailed map, packaging has a utilization of only 75%, even though it takes the full output of both cereal production areas. In practice, however, the storage silos often became full and forced a production line to go down. Analysis revealed that although the packaging area appeared to have excess capacity, it was being scheduled with no coordination or synchronization with either production area, and that lack of synchronization created the constraint.

Most of the discussion on moving bottlenecks, hidden bottlenecks, root causes of bottlenecks, and managing and improving bottlenecks applies to CCRs as well as to bottlenecks.

Moving Bottlenecks

Identifying and managing bottlenecks can be difficult in process plants because the bottleneck may be at a different process step for one material than for another; the bottleneck may move as the process cycles through the various products being made. As an example, consider a sheet forming machine and a bonding (heat treating) machine from a paper making process, Case 1 below. As can be seen from the data boxes, each has effective capacity greater than the Takt requirement, so neither is a bottleneck. However, the values shown in the data boxes represent averages taken across the full product line-up at the typical mix. When forming a sheet product with very high basis weight, the forming machine must run at a much slower lineal speed, so for that product the capacity will be less than Takt and the machine becomes a bottleneck, as depicted in Case 2. With other products, forming may have excess capacity while bonding becomes the bottleneck. For products that must be bonded at higher temperature, the line speed must be slower to allow the sheet to be in contact with the heated bonding roll long enough for complete heat transfer from the roll to the sheet. When making products requiring high bonding temperatures, bonding becomes the bottleneck, as illustrated in Case 3.

Case 1 – Forming and Bonding utilizations based on average effective capacity

Case 2 –  Product which causes forming to be a bottleneck

Case 3 – Product which causes Bonding to be a bottleneck

In a process spinning synthetic fibers, the threadline winding machine may be the bottleneck when producing very fine fibers at high winding speed, while the metering pump feeding the extrusion die may be the bottleneck with thicker, bulkier fibers.

In a salad dressing bottling line, the bottle filler will usually be the bottleneck operation when filling the larger bottles, but the label applicator can become the bottleneck when packaging into smaller bottles.  The carton erection and filling operation may have significant excess capacity when packaging in large cartons for the “big box” discount retailers, but it can become the bottleneck when filling very small cartons for convenience stores.

The fact that the bottleneck may move during the production cycle must be recognized so that appropriate bottleneck management strategies can be used with all process steps that can be bottlenecks.
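The moving-bottleneck idea in this section can be sketched as a per-product comparison of each step's effective capacity against Takt. All numbers below are hypothetical (the data boxes for Cases 1 to 3 are not reproduced here), and the function name is illustrative only.

```python
def bottlenecks_for(capacities, takt):
    """Return the steps whose effective capacity falls below Takt."""
    return [step for step, capacity in capacities.items() if capacity < takt]


TAKT = 100.0  # hypothetical demand rate, same units as the capacities

# Hypothetical effective capacities by product, loosely mirroring Cases 1-3.
products = {
    "average mix": {"forming": 120.0, "bonding": 115.0},
    "high basis weight": {"forming": 85.0, "bonding": 115.0},
    "high bonding temperature": {"forming": 120.0, "bonding": 90.0},
}

for product, capacities in products.items():
    found = bottlenecks_for(capacities, TAKT)
    print(f"{product}: {', '.join(found) if found else 'no bottleneck'}")
```

Running the same check for every product in the mix, rather than once on the averages, is what exposes a step that is a bottleneck only part of the time.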

Covert Bottlenecks

In order to manage bottlenecks appropriately, it is necessary first to identify where they exist in the process.  One way to find bottlenecks is to look for locations where inventory tends to build up.  But keep in mind that inventory buildup can be for reasons other than bottlenecks; for example, storage of materials not being produced at this point in the production cycle. Nonetheless, a large inventory at a point in the process is a clue that the next step might be a bottleneck resource or a CCR.

In many process plants, however, the in-process inventory is not visible; it is stored in a location somewhat removed from the main process flow, so it doesn’t give obvious clues to possible bottlenecks. In sheet goods processes, for example, large rolls are generally stored in a rack system out of the main process flow. These systems may store WIP rolls from several points in the process, and may not have specific slot areas dedicated to specific WIP points, so a visual inspection of current rack contents will not give a view of the amount of material awaiting any specific process step. Similarly, portable stainless steel tanks used to store resins in a multi-step paint making operation may be intermixed within the storage area, so a large number of portable tanks in storage will not indicate which process step is causing the holdup. Plastic pellets and cereal flakes are often stored in large silos within the process, again masking any visual indication of large inventory buildup. Electronic storage area management systems can generally provide reports of storage area contents, but it may take some manual sorting and regrouping to understand current storage by WIP location, so diagnosing possible bottlenecks may take some effort.

An accurate Value Stream Map (VSM) will clearly define any static bottlenecks.  Large inventory values will give clues to possible bottlenecks, but the real give-away will be steps with utilization values near or over 100%.  Bottlenecks which move with product type may be harder to see, even on a very well constructed VSM.

Computer simulation models of the manufacturing process will help with both the covert inventory issue and the moving bottleneck problem.  A discrete event simulation model, using a tool like ProModel or FlexSim, can identify which steps are bottlenecks or near-bottlenecks and how the situation changes with each specific product in the overall mix.

Root Causes of Bottlenecks

Once it is clearly understood which process steps are bottlenecks or near-bottlenecks, the next step is to diagnose the root cause – to understand why that step is a bottleneck.  The most common reasons in many process plants will include:

1.      Inherent equipment capacity limitations

2.      Long changeovers

3.      Mechanical reliability problems

4.      Yield losses

5.      Inappropriate scheduling, i.e. CCRs

If an apparent bottleneck is not a CCR but a true bottleneck, the root cause can be diagnosed from the data box for that process step. In this data box for a resin reactor, a utilization of 110% shows that it is clearly a bottleneck. Regardless of how well it is scheduled, it lacks the inherent capacity to keep up with Takt. A closer examination of the data box shows that yield and reliability are both very good, at 97% and 98%, respectively. The culprit, the primary contributor to the 74% UPtime, is the long changeover time. Each changeover takes 40 minutes and is currently performed 12 times during each 36-hour production cycle, a total of 8 hours, so 22% of total capacity is being lost in changeovers. The key to opening up this bottleneck is to reduce changeover time, using Single Minute Exchange of Dies (SMED) techniques, for example. Of course, the bottleneck could also be resolved by doing changeovers less frequently (for example, by running longer campaigns), but that would cause a large increase in inventory and is a very expensive way to open the bottleneck up.
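The resin reactor arithmetic can be checked with a few lines of code. This sketch assumes that UPtime is the product of yield, reliability, and the fraction of time not spent in changeovers; that model is an assumption made here for illustration, although it does reproduce the figures quoted for the reactor. The function names are hypothetical.

```python
def changeover_fraction(changeovers_per_cycle, minutes_per_changeover, cycle_hours):
    """Fraction of the production cycle consumed by changeovers."""
    return changeovers_per_cycle * minutes_per_changeover / (cycle_hours * 60.0)


def uptime(yield_fraction, reliability, changeover_loss):
    """Assumed model: yield x reliability x share of time not in changeover."""
    return yield_fraction * reliability * (1.0 - changeover_loss)


# Figures quoted for the resin reactor: 12 changeovers of 40 minutes
# in a 36-hour cycle, 97% yield, 98% reliability.
loss = changeover_fraction(12, 40, 36)
reactor_uptime = uptime(0.97, 0.98, loss)

print(f"capacity lost to changeovers: {loss:.0%}")
print(f"UPtime: {reactor_uptime:.0%}")
```

Under that assumption the script prints 22% and 74%, matching the text. Changing the changeover duration in the first call gives a quick view of how much UPtime a SMED effort might recover.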

The manufacture of synthetic fibers used in blends with cotton to make t-shirts and sweatshirts usually concludes with a process step where the fibers are cut to relatively short lengths (1/2 in. to 1 ½ in.) and then baled in 500 pound bales similar to bales of cotton.  In fact, it is packaged in this form to enable the fabric converter to process it on cotton equipment and thus simplify the blending operation.

This is the data box for a cutter/baler used in this type of process. With 117% utilization, it is clearly a bottleneck. Again, changeover times are a contributor: with 14 changeovers of 45 minutes each in every 4-day (96-hour) cycle, 11% of total capacity is lost. However, that is not the primary culprit; even if changeovers could be completely eliminated, the baler would still be a bottleneck. The data box shows the mechanical/electrical reliability of the baler to be 70%, so machine failures cost the operation about 29 hours (30% of 96) in every 96-hour cycle, compared to the 11 hours lost to changeovers.

If the mechanical reliability could be increased to 90%, the total UPtime percentage would rise to 78%.  Utilization would drop from 117% to 90%, and thus the bottleneck would be resolved. However, achieving 90% reliability on this type of complex electro-mechanical equipment is quite challenging; eliminating the bottleneck condition will likely require a combination of reliability improvements and changeover time reduction.
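The cutter/baler comparison can be sketched the same way. The code below assumes that utilization varies inversely with reliability when yield and changeover losses are unchanged; that scaling rule is a simplification introduced here, not something stated in the data box, and the function names are hypothetical.

```python
CYCLE_HOURS = 96.0  # the 4-day production cycle


def hours_lost_to_changeovers(count, minutes_each):
    """Total changeover time per cycle, in hours."""
    return count * minutes_each / 60.0


def hours_lost_to_failures(reliability, cycle_hours):
    """Downtime per cycle implied by a reliability fraction."""
    return (1.0 - reliability) * cycle_hours


def rescaled_utilization(utilization, old_reliability, new_reliability):
    """Assumed rule: utilization scales inversely with reliability."""
    return utilization * old_reliability / new_reliability


changeover_hours = hours_lost_to_changeovers(14, 45)
failure_hours = hours_lost_to_failures(0.70, CYCLE_HOURS)
new_utilization = rescaled_utilization(1.17, 0.70, 0.90)

print(f"changeovers: {changeover_hours:.1f} h per cycle")
print(f"failures:    {failure_hours:.1f} h per cycle")
print(f"utilization at 90% reliability: {new_utilization:.0%}")
```

The changeover and failure hours come out at 10.5 and 28.8, and the rescaled utilization at about 91%, which is in line with the roughly 90% quoted above given the rounding in the source figures.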

A very unscientific estimate of the relative frequency of causes of bottlenecks and CCRs in process plants is shown here.  This chart was prepared from a small sampling of VSMs from a variety of process industry manufacturing lines, by examining the data box of the bottleneck, near-bottleneck, or CCR and estimating the primary cause.  When more than one reason was a contributor to the constraint, partial weight was given to each.

Bottleneck Management – Theory of Constraints

Once the bottleneck has been identified and the root cause diagnosed, it’s time to open it up – to make whatever changes can be identified to resolve the bottleneck. At the same time, it is important to make sure that throughput at the bottleneck is not suffering unnecessarily from problems upstream or downstream of the bottleneck.

It may require inventory to accomplish this, but adding that waste is often the most reasonable compromise when compared to the alternative of not being able to make Takt. Consider, as an example, the process shown above. The bottleneck is obviously Step B, with a utilization of 104% compared to 50% for each of A and C. However, if the process is configured exactly as shown, Step B will not be able to process 48 gallons per minute (GPM); rather, it will be limited by downtime and losses at A and C, so its real throughput will be only 39 GPM (48 GPM x 90% x 90%).
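The 39 GPM figure follows from multiplying the bottleneck rate by the fraction of time each neighboring step is running. A minimal sketch, assuming the two 90% factors are independent availabilities for Steps A and C and that Step B stops whenever either neighbor is down (the function name is illustrative):

```python
def unbuffered_throughput(bottleneck_rate, neighbor_availabilities):
    """Bottleneck rate derated by each unbuffered neighbor's availability."""
    rate = bottleneck_rate
    for availability in neighbor_availabilities:
        rate *= availability
    return rate


step_b_rate = 48.0  # GPM, effective rate of the bottleneck
actual = unbuffered_throughput(step_b_rate, [0.90, 0.90])

print(f"Step B without buffers: {actual:.1f} GPM")
```

This gives 38.9 GPM, which the text rounds to 39 GPM, so almost a fifth of the bottleneck's capacity is lost to problems that are not its own.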

If buffer inventories can be located as shown here, Step B will be able to achieve its effective rate of 48 GPM. The bottleneck hasn’t been relieved, but at least it is not further constrained by upstream and downstream outages. The contents of the first inventory should be equal to the throughput of the bottleneck multiplied by the longest expected outage of Step A. If Step A can fail and be down for 2 hours at a time, then this inventory should be about 5,800 gallons (48 GPM x 120 minutes = 5,760 gallons) so that Step B can continue to be fed at its rate of 48 GPM for the entire 120-minute outage. The storage capacity of the second inventory should likewise be determined by the bottleneck rate and the longest outage of Step C. However, this inventory location should be kept empty or nearly so; its function is to provide a place to store bottleneck output while Step C is down so that outages at Step C will not shut Step B down.
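The buffer sizing rule above is a single multiplication, sketched here for both buffers. The 2-hour outage at Step A is from the text; the 90-minute outage used for Step C is a hypothetical value chosen only to illustrate the second calculation.

```python
def buffer_gallons(bottleneck_gpm, longest_outage_minutes):
    """Volume needed to cover the longest expected outage at the bottleneck rate."""
    return bottleneck_gpm * longest_outage_minutes


BOTTLENECK_GPM = 48

# Upstream buffer: kept full so Step B can keep running while Step A is down.
upstream_contents = buffer_gallons(BOTTLENECK_GPM, 120)

# Downstream buffer: kept empty so Step B has somewhere to send output
# while Step C is down. The 90-minute outage is a hypothetical figure.
downstream_capacity = buffer_gallons(BOTTLENECK_GPM, 90)

print(f"upstream buffer contents:   {upstream_contents} gallons")
print(f"downstream buffer capacity: {downstream_capacity} gallons")
```

The upstream result is 5,760 gallons, which the text rounds to 5,800. Note the asymmetry: the same formula sizes both buffers, but the first is a target inventory level and the second is empty space.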

The strategies for managing and optimizing bottlenecks described by Eli Goldratt in his two landmark works, The Goal and Theory of Constraints (3, 4) have become the standard for dealing with them.  The process laid out by Goldratt can be summarized as follows:

1. Identify the bottleneck.

2. Exploit the bottleneck – make sure that the bottleneck is running at maximum capacity, and not wasting time on non-critical tasks.

   a. Use SMED techniques to reduce changeover times to the minimum possible. Focus not only on mechanical tasks but also on cleaning, and on getting to specified conditions and properties quickly after the changeover.

   b. If the bottleneck is also a CCR, make sure that scheduling processes are coordinated and synchronized to eliminate that portion of the limitation.

3. Subordinate everything else to the bottleneck – all upstream and downstream processes should operate in a way that maximizes bottleneck throughput as described above.

4. Elevate the bottleneck capacity – try to increase the capacity of the bottleneck.

   a. Diagnose yield losses; consider improved process control techniques, both electronic controls (5) and Statistical Process Control (SPC). Six Sigma can also be particularly effective in these situations.

   b. If the bottleneck is due to capacity limitations inherent in equipment design, structured brainstorming workshops with mechanical and/or chemical experts can often point to cost-effective remedies.

   c. If equipment reliability is the core issue, implement TPM.

5. Once the bottleneck is broken, find the next bottleneck and repeat.

Variability – Why Does It Matter?

So what if the bottleneck can come and go with variation in OEE factors?  So what if the bottleneck can move around depending on which product is being produced?  Doesn’t it all even out over time?

Well, maybe it does.  But if you’ve ever been trapped on a freeway at rush hour, that may give you a feel for what can happen.  Traffic on a particular stretch of interstate may be very moderate when averaged over a 24 hour period, but that’s very little comfort to someone who must use it during a peak period.  Similarly, a process step with reasonable average utilization may not be able to keep up with demand when yield and/or reliability are at the lower part of their normal range.  And because the entire process tends to run at the rate of the slowest asset, unless you’re buffering in accordance with Theory of Constraints, you can never catch up.

Widening the Bottleneck – Lurking Bottlenecks

Once the most obvious bottleneck has been opened up and that process step can produce to the Takt requirement, it should not be taken as a given that the entire process can now make Takt. There may be other steps that were bottlenecks all along but were masked by the most restrictive one. Of course, a complete and accurate VSM would have shown that, but on some process lines the primary bottleneck restricts flow in a way that makes it difficult to measure effective throughput at adjacent steps. In other cases, area managers may make assumptions about bottleneck location without the benefit of a good VSM.

Consider the recent experience of the plant manager of a facility that makes salad dressings, mayonnaise, and bottled sauces.  On one particular salad dressing bottling line, market demand had increased Takt from the previous 300 bottles per minute (BPM) to 400 BPM.  The plant manager’s intuition, reinforced by some data, told him that the bottle filling step was literally the bottleneck (this is one case where “bottleneck” is more than a figure of speech).  So he challenged his technical organization to increase bottle filling to 400 BPM.  After some analysis and preliminary design, they informed the manager that with a redesign of the filling nozzles, the filling operation could indeed be elevated to 400 BPM; however, the line speed would increase to only 320 BPM.  What he hadn’t seen, and what no one else had seen until the VSM below had been developed, was that there were 3 other near-bottleneck steps which would become bottlenecks at the new Takt of 400 BPM.  The homogenizer step in the kitchen area would reach its limit at 320 BPM, the label applicator at 360 BPM, and the case packer at 400 BPM.  While it would be economically feasible to replace nozzles on the bottle filling machine, the total cost of eliminating all bottlenecks was prohibitive, so the decision was made to build a new line instead of upgrading the current one.

Current State VSM of the salad dressing bottling line – showing the desired Takt

Had the technical team not decided to map the entire process, and had the lurking bottlenecks been overlooked, the plant manager would likely have spent the money on the new nozzles and then realized that the line was still 20% short of meeting Takt.
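The salad dressing line illustrates why every step needs to be checked against the new Takt before money is spent on one of them. The sketch below treats the line rate as the minimum of the step capacities. The homogenizer, label applicator, and case packer limits are the ones quoted above; the filler's current 300 BPM capacity is an assumption inferred from the previous Takt, and the function names are illustrative.

```python
def line_rate(capacities):
    """Return the slowest step and its rate; the line can run no faster."""
    step = min(capacities, key=capacities.get)
    return step, capacities[step]


def lurking_bottlenecks(capacities, takt, upgraded_step):
    """Steps other than the upgraded one with no headroom at the new Takt."""
    return [s for s, c in capacities.items() if s != upgraded_step and c <= takt]


TAKT = 400  # bottles per minute

current = {
    "homogenizer": 320,
    "bottle filler": 300,  # assumed current limit, inferred from the old Takt
    "label applicator": 360,
    "case packer": 400,
}
upgraded = {**current, "bottle filler": 400}  # after the nozzle redesign

step, rate = line_rate(upgraded)
shortfall = 1 - rate / TAKT
lurking = lurking_bottlenecks(upgraded, TAKT, "bottle filler")

print(f"line limited to {rate} BPM by the {step}")
print(f"shortfall against Takt: {shortfall:.0%}")
print(f"steps with no headroom: {', '.join(lurking)}")
```

With the filler upgraded, the homogenizer caps the line at 320 BPM, 20% short of Takt, and all three remaining steps show up as having no headroom at 400 BPM.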


Identifying bottlenecks and potential bottlenecks is important in any operation, but often more so in process manufacturing operations. Many of these lines run around the clock at or near full capacity; there is no extra time available to create additional capacity. This means that improving performance of the bottleneck step, if there is one (and there frequently is), is critical to meeting customer Takt.

Process bottlenecks are just as often due to reliability problems and equipment downtime as they are to inherent capacity limitations, so TPM and OEE improvement programs are even more important in process plants.

Theory of constraints provides an effective process to manage bottlenecks and ensure they aren’t further constrained by upstream or downstream outages.

If a bottleneck exists within a process, managing and opening it up is usually a far less costly way to increase throughput than a capital project. Thus the topic deserves significant attention and focus in capacity-limited process operations. Good bottleneck management can reduce inventory and, therefore, cost; it can significantly increase throughput and, therefore, revenue.


King is founder and president of Lean Dynamics LLC (www.LeanDynamics.US). He is the author of Lean for the Process Industries – Dealing With Complexity (Productivity Press, 2009) from which this article was excerpted. He is a member of AME and APICS, and is currently President of the Process Industries Division of The Institute of Industrial Engineers.

