
Troubleshooting Processes

[Flow chart] Define Problem -> Does Problem Exist? (YES) -> Get Information ->
Analyse Faults -> Problem Isolated? (YES) -> Make Repairs -> Final Test OK? (YES)
-> Done! Each decision box also has a NO branch.
Common Troubleshooting Myths

• “You need to be an expert on the machine or system you’re troubleshooting”
This is a very destructive myth because it is expensive to wait for an “expert” to
become available, or to spend money hiring one. If you know enough about
the machine to know what tests to conduct, you can use a troubleshooting process
to narrow the problem down to root cause. Often just having the system
documentation or service manual gives you enough expertise.

• “Troubleshooting is machine dependent”
Systems and machines can vary, but the troubleshooting process is common to all!

• “Great troubleshooters are born, not taught”
BS! Troubleshooting is a set of procedures, priorities, mental tools and attitudes
that anyone can learn.

• “Either you can troubleshoot or you can’t”
Wrong! Just like any ability, there is a range, but anyone can improve their skills.
Substitute the words “drive a car” for “troubleshoot” and see how silly it sounds.

• “I can troubleshoot - I do it every day”
Yes, but how well? See previous myth.

• “Troubleshooting is for technical people”
Nothing is farther from the truth. Substitute the words “problem solving” for
“troubleshooting” and see how silly that sounds.

• “Troubleshooting isn’t as important as other skills”
Tell that to your boss when a critical machine is down and all work stops.
EXPERT PERFORMANCE
(What can experts do that I can't?)

Generalizations About the Nature of Expertise

1. Expertise is acquired in stages.


• the level of competence gained is only as great as is necessary to
carry out desired activities or to solve desired problems
• growth of competence lessens as experts settle into their working
situations
2. Expertise is very subject-specific.
• a good mechanic is not necessarily a good electrician
• expertise does not necessarily transfer to other domains
3. Experts develop the ability to perceive large, meaningful patterns.
• takes on the character of “intuition”
• does not reflect superior perceptual abilities; rather reflects a better
organization of their knowledge
4. Experts are fast and accurate. There are two ways to explain their speed:
• as a result of many hours of practice, they can perform automatically
• by recognizing patterns they arrive at solutions without an exhaustive
search
5. Experts have superior memory organization and strategies.
• experts do not have larger memories; they have automatized their
skills which frees up their mental resources for greater storage
6. Experts take a great deal of time analyzing problems before taking action.
• experts try to understand a problem before acting while novices jump
right in and attempt a solution
• experts build a mental picture of the problem
7. Expert knowledge is procedural and causal in nature.
• experts are good at relating events in cause and effect sequences that
lead to problem solutions
8. Experts have strong self-monitoring skills.
• experts are aware of their own mistakes; they know when they don't
understand; they know when they need to check their solutions
NOVICE TROUBLESHOOTING PERFORMANCE
(What types of problems do troubleshooters have?)

Knowledge Deficiencies

1. Ineffective troubleshooters have incomplete system knowledge. They don't
understand how components work or how they interact with other
components within the system.
2. Ineffective troubleshooters don't know what actions or tests can be used to
collect information or to manipulate equipment.
3. Ineffective troubleshooters are greatly affected by working memory
limitations, which prevent them from remembering the symptoms and the
results of tests they have already performed.
4. Ineffective troubleshooters have limited understanding of troubleshooting
techniques.

Skill Deficiencies
1. Ineffective troubleshooters have limited skills to choose from.
2. Ineffective troubleshooters have difficulty performing skills correctly.
3. Ineffective troubleshooters have difficulty tracing schematics and wiring
diagrams.

Performance Problems

1. Ineffective troubleshooters act without thinking and have preconceived
notions. They often feel that they must be active so that they look like they
know what they are doing.
2. Ineffective troubleshooters focus on only part of the problem. They act like
they are wearing blinders.
3. Ineffective troubleshooters fail to use all the available information - even
about what IS working.
4. Ineffective troubleshooters do what they know how to do instead of what the
problem requires. They rely on favorite strategies and repeat ineffective
strategies rather than attempt a new one.
5. Ineffective troubleshooters do the wrong thing even when the symptoms
suggest another approach.
6. Ineffective troubleshooters work on the right problem with the wrong tools.
TYPICAL TROUBLESHOOTING PERFORMANCE
(How did you fix that thing?)

What is technical troubleshooting?


Technical troubleshooting is a task that involves the detection, diagnosis, and repair
of faulty equipment.

How do troubleshooters typically perform?


The process of technical troubleshooting is divided into two main components:

1. Generating “best guesses” as to what the problem might be.


2. Testing out each “guess” until the fault is found.

The troubleshooting process is shown as a flow chart in the Technical
Troubleshooting Model below.
First, the troubleshooter collects information that is used to identify one or more
“hypotheses” (best guesses as to what the problem might be). This phase of
troubleshooting is called Problem Definition. During this phase, the troubleshooter
collects and interprets information from many sources. This information helps the
troubleshooter better understand the problem and results in a “mental model” of the
problem. It is the quality of this “mental model” that is one of the keys to becoming
an expert troubleshooter. Following the representation of the problem, the
troubleshooter develops a “problem space.” The problem space is all of the areas
within the system that could potentially contain the fault(s).

Following the definition of the problem and the creation of the problem space, the
troubleshooter begins the second phase of the troubleshooting process called
Problem Space Evaluation. During this phase, the troubleshooter checks out the
“best guesses” that have been developed to determine which one identifies the fault.
If the fault is identified, the troubleshooter can then repair the problem in the
equipment. However, if all of the hypotheses are evaluated and the fault is still not
located, the troubleshooter then returns to the Problem Definition Phase to collect
more information and to generate additional plausible hypotheses.
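
The two phases described above can be sketched as a loop. This is only an illustration: the function names, the lamp example, and the limit on the number of rounds are assumptions made for the sketch and are not part of the model itself.

```python
from typing import Callable, Iterable, Optional


def troubleshoot(
    collect_information: Callable[[int], set[str]],
    generate_hypotheses: Callable[[set[str]], Iterable[str]],
    is_fault: Callable[[str], bool],
    max_rounds: int = 3,
) -> Optional[str]:
    """Alternate between Problem Definition and Problem Space Evaluation."""
    known: set[str] = set()
    tested: set[str] = set()
    for round_number in range(max_rounds):
        # Problem Definition: gather information and build the problem space.
        known |= collect_information(round_number)
        problem_space = [h for h in generate_hypotheses(known) if h not in tested]
        # Problem Space Evaluation: check each best guess in turn.
        for hypothesis in problem_space:
            tested.add(hypothesis)
            if is_fault(hypothesis):
                return hypothesis
        # No guess identified the fault: go back and collect more information.
    return None


# Hypothetical example: a lamp that will not light.
clues = [{"lamp is dark"}, {"lamp is dark", "other outlets on the circuit are dead"}]


def collect(round_number: int) -> set[str]:
    return clues[min(round_number, len(clues) - 1)]


def guesses(known: set[str]) -> list[str]:
    found = ["bulb", "switch"]
    if "other outlets on the circuit are dead" in known:
        found.append("circuit breaker")
    return found


fault = troubleshoot(collect, guesses, lambda h: h == "circuit breaker")
print(fault)
```

In this example the first round tests the bulb and the switch without success, so the loop returns to Problem Definition, picks up the extra clue, and only then adds the circuit breaker to the problem space.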
TECHNICAL TROUBLESHOOTING MODEL

[Flow chart]

PROBLEM DEFINITION PHASE
START -> ACQUIRE INFORMATION -> INTERPRET INFORMATION -> CAN A FAULT BE
IDENTIFIED? (NO / YES) -> GENERATE POTENTIAL FAULT LIST (PROBLEM SPACE)

PROBLEM SPACE EVALUATION PHASE
ACQUIRE INFORMATION -> INTERPRET INFORMATION -> CAN EVALUATION BE MADE?
(NO / YES) -> IS HYPOTHESIS CORRECT? (NO / YES) -> END
TROUBLESHOOTING PROCESS

Any troubleshooting process includes five important areas. These include the
various forms of knowledge that troubleshooters need, the priorities that guide
troubleshooting, the factors that influence decision making, common troubleshooting
strategies, and a systematic procedure.

Knowledge of the Equipment

1. General Knowledge
• basic reading and mathematics skills
• environmental constraints (time, weather, etc)
2. Technical Knowledge
• friction, pressure, Ohm's law, Pascal's law
• common test equipment
3. System - Specific Information
• physical (What is it? What does it look like? Where is it located?)
• functional (What does it do?)
• behavioral (How does it work and relate to other components?)
4. Unit - Specific Information
• symptoms / complaints
• maintenance records

Priorities for Troubleshooting

• Are you only trying to isolate the fault?


• Do you want a long-term, permanent solution and repair?
• Is it critical that the repair be made quickly?
• Is cost effectiveness a priority?

Factors that Influence Troubleshooting Decisions

• Anticipated length of down time.


• Anticipated operation losses (product and labor).
• Availability of technical and engineering support.
• Availability of spare parts.
• Ease of testability. Is it easily accessible? How complex is the test? Is the test
dangerous?
• Timing of failure. When time is limited, technicians rely on brute-force
methods, for example putting a coin in a fuse box. We call this “putting
Band-Aids” on the problem.
Common Troubleshooting Strategies

• Trial and Error - very common
• Exhaustive Search - test all possibilities - requires little expertise but is only
feasible if the set of possible faults is small (TV tubes)
• Topographic Search - mental model or schematics are used to guide search;
like following a map
• Split-half Technique - choose each test so that it eliminates the greatest number
of possibilities, ideally about half of those remaining. If testing is time
consuming or expensive, try to eliminate as many possible causes as possible
with each test.
• Functional Search - observing the function of a system and developing
hypotheses (i.e., What would happen if?) This strategy relies on a “Mental
Model” of the system and requires the technician to create a “Problem
Space.” The mental model can be simulated mentally and then compared to
a normal functioning system. This strategy requires more system knowledge
and is more mentally difficult than the general search methods, but it is much
more accurate.
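
As an illustration of the split-half technique, here is a minimal sketch. It assumes a simple serial chain with exactly one fault and a test that can check the signal at the output of any stage; the stage names and the function `split_half` are made up for the example.

```python
from typing import Callable, Sequence


def split_half(
    stages: Sequence[str], signal_ok_after: Callable[[int], bool]
) -> tuple[str, int]:
    """Find the faulty stage in a serial chain, assuming exactly one fault.

    signal_ok_after(i) is a test that reports whether the signal is still
    good at the output of stages[i].
    """
    low, high = 0, len(stages) - 1   # the fault lies somewhere in stages[low..high]
    tests = 0
    while low < high:
        mid = (low + high) // 2
        tests += 1
        if signal_ok_after(mid):
            low = mid + 1            # everything up to mid works; fault is downstream
        else:
            high = mid               # fault is at mid or upstream of it
    return stages[low], tests


# Hypothetical chain of eight stages with the fault in the control valve.
stages = ["tank", "filter", "pump", "relief valve",
          "control valve", "hose", "cylinder", "rod seal"]
faulty = stages.index("control valve")
found, tests = split_half(stages, lambda i: i < faulty)
print(found, tests)
```

With eight stages the fault is isolated in three tests, where an exhaustive search from one end could need up to seven.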
A General Troubleshooting Procedure
While troubleshooting seldom occurs in a straightforward fashion, a
systematic approach will lead to better results. The following
troubleshooting procedure is used by many experts.

GENERAL TROUBLESHOOTING PROCEDURE

Isolate Problem (Collect as much information as possible)

1. Customer Complaint
• What happened?
• What was it doing when the problem occurred?
• Was everything else working all right?
2. Operating Conditions
• What is the geography? (rocky; sandy; high altitude; etc.)
• What were the weather conditions? (extreme cold; extreme heat; high
humidity; etc.)
• Was an experienced operator using the machine at the time the
problem occurred?
3. Machine History
• What preventive maintenance has been completed?
• What repairs have been made in the past?
4. Duplicate the Problem
• Operate the device yourself to check the accuracy of the information
you have been given.
• Have operator duplicate the situation that caused the problem.
5. Collect Additional Information as Needed
• Sensory Checks such as looking, listening, touching, smelling.
• Technical Tests such as operational adjustments, standard operating
procedures, and technical procedures.
• Job Aids such as manuals, bulletins, and schematic diagrams.
• Technical Support such as suppliers, manufacturers, and experts.

Identify Possible Faults

1. Identify as many possible causes of the problem as you can.


2. If the problem does not have a set of clear possible causes, narrow the
problem to a sub-system and then try to identify causes.
3. Collect more information if possible faults are difficult to identify.
Check-Out Possible Faults

1. Collect additional information to check out the possible faults.
2. Vary one thing at a time to test the possible causes when you are not familiar
with the system or device.
3. If testing is time consuming or expensive, try to eliminate as many possible
causes as possible with each test. (use split-half strategy)
4. Do not assume that new parts always work.
5. Reduce the number of possible causes with a systematic approach.
6. Always return the system to its original configuration after replacing a part or
making tests.
7. Take notes about test results, parts replaced, and adjustments made.
8. Always observe safety precautions/rules.
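
Items 2, 6 and 7 above (vary one thing at a time, return to the original configuration, take notes) can be sketched as follows. The configuration dictionary, the part names, and the function `vary_one_at_a_time` are hypothetical and exist only for this illustration.

```python
from typing import Callable


def vary_one_at_a_time(
    baseline: dict[str, str],
    alternatives: dict[str, str],
    symptom_present: Callable[[dict[str, str]], bool],
) -> tuple[list[str], list[str]]:
    """Change one item per test, note the result, and keep the baseline intact."""
    suspects: list[str] = []
    notes: list[str] = []
    for name, new_value in alternatives.items():
        trial = dict(baseline)      # every test starts from the original configuration
        trial[name] = new_value     # vary exactly one thing
        present = symptom_present(trial)
        result = "present" if present else "gone"
        notes.append(f"{name}: {baseline[name]} -> {new_value}, symptom {result}")
        if not present:
            suspects.append(name)   # changing this item made the symptom go away
    return suspects, notes


# Hypothetical circuit in which the installed relay is the real fault.
baseline = {"fuse": "installed", "relay": "installed", "lamp": "installed"}
alternatives = {name: "tested spare" for name in baseline}
suspects, notes = vary_one_at_a_time(
    baseline, alternatives, lambda cfg: cfg["relay"] == "installed"
)
print(suspects)
for line in notes:
    print(line)
```

The spares are labeled "tested" on purpose: as item 4 says, a new part should not be assumed to work.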

Repair Fault

1. Perform necessary procedures to remove the fault from the system.


2. Take notes about test results, parts replaced, and adjustments made.
3. Always observe safety precautions/rules.

Re-Check Solution

1. Check every solution you reach with some kind of test.


2. If the fault remains after checking the solution (or a new one
appears), go through the troubleshooting procedure again.
3. Complete required Fault Analysis paperwork.
7 TROUBLESHOOTING STEPS

Step 1 - Understand Complaint

Step 2 - Confirm Problem Exists

Step 3 - Gather Information

Step 4 - Develop Failure Theory

Step 5 - Test Theories

Step 6 - Make Indicated Repairs

Step 7 - Retest - Confirm Corrections


TROUBLESHOOTING

FOUR STAGES IN TROUBLESHOOTING PROCESS

1. Discover what others know about the problem.


a. When did the problem occur?
b. How was the machine used?
c. Was everything else working alright?
d. What repairs have been made in the past?

2. Discover what the machine can tell you about the problem.

3. Think logically about the problem. Identify as many possible causes of the problem
as you can.
a. How should the system work?
b. How does the system work?

4. Make measurements. Let each measurement be an additional piece of information
with which to reason about the problem again, until you know the root cause of the
problem.

ARE THERE ANY SECOND TESTS?

5. If the information gained by the measurement is helpful, but not conclusive, ask
yourself if there is a second test you can perform to prove that you have discovered
the cause of the problem.

LOOK FOR THE ROOT CAUSE OF THE PROBLEM

6. Be cautious. Ask yourself if the tests you have performed point to the root cause of
the problem.
Introduction to Troubleshooting

Troubleshooting

Troubleshooting is an organized, logical method of identifying problems and
solving them. This is a critical skill because your effectiveness and efficiency
in repairing Caterpillar hydraulic systems depend upon your ability to quickly
and correctly determine the cause of problems.

Determine a Problem Exists

The first step in the troubleshooting process is to make sure a problem really
does exist. Inexperience with a machine's characteristics and improper
operation are sometimes mistaken for problems. Ask questions of and listen to
operators, mechanics, and others familiar with the machine you are
troubleshooting.

Find out when the problem started.

• Was it sudden or gradual?


• Is the problem continual or sporadic?
• Does it occur at all speeds, gears, loads, and
temperatures?

Find out when the machine was last serviced.

• By whom?
• What was done?

Ask more than one person and see if you receive the
same information. If it is available, check the service
record of the machine. And remember that a friendly,
tactful attitude will usually gain you more and better
information.
State Problem in Writing

The second troubleshooting step is to state the problem by writing it down in
simple terms.

This will help you more clearly understand the exact nature of the problem. Be
sure that you do not put a “solution” in the statement. It is too early to make a
diagnosis, and guessing at a solution at this stage will only waste time.

Inspect Machine

The third troubleshooting step is to visually inspect the machine.

Check for obvious damage, such as leaks, loose bolts, and cracks. Check all fluid
levels.

If possible, watch the machine in actual operation. Try to observe the problem as
it happens. Do not guess at anything that can be visually inspected. Check it out
yourself.

List All Possible Causes

Step four is to list all possible causes of the problem.

Be sure to include all of them, from the simplest to the most unlikely and
difficult.

Use the service manual and system schematics to make sure you have considered
all possibilities.

Run Tests & Record Data

Step five is to run tests and record data.

The information you gathered in steps one through four, plus the machine's
service manual, will help you determine which tests need to be performed.

Be sure to note the test specifications and procedures listed in the service
manual. You may need to check cycle times, pressures, temperatures, and other
machine characteristics.

Remember to record all appropriate data from the tests.
Eliminate and Isolate

Now, use all the data you have collected so far to do step six: eliminate and
isolate.

Use the things listed below to eliminate everything that cannot create the
problem.

• The list of possible causes you made in step four
• The input from the operator and others
• Your inspection and test results

Fix the Problem

Step seven is to fix the problem.

If there is more than one possible cause remaining on your list, start with the
simplest and easiest to fix.

Once you have repaired the problem, rerun the appropriate tests. Make sure you
have really fixed the problem. If possible, try to observe the machine in
operation. Wait for the machine to reach operating temperature, and then watch
it work under normal conditions. If the repair fails, let it happen while you are
there.

Analyze the Failure

Even after the machine is repaired, there is one final troubleshooting step:
analyze the failure.

Why did the problem occur in the first place? This procedure may be simple, if
the problem is straightforward; or it may require more sophisticated failure
analysis methods.

Include this analysis in your service report, along with all pertinent
information about the machine.
The Scientific Method

1. Observe and describe a situation (What)
2. Form a theory to explain the symptoms (Why)
3. Use the theory (Hypothesis) to predict results (How)
4. Verify theory through testing
The Scientific Method
Expanded

1. Curious Observation
2. Is There a Problem ?
3. Goals and Planning
4. Search, Explore, and Gather Evidence
5. Generate Creative and Logical Alternative
Solutions
6. Evaluate the Evidence
7. Make the Educated “Guess” (Hypothesis)
8. Test the Solution (Hypothesis)
9. Challenge the Solution (Hypothesis)
10. Reach a Conclusion
The 8 Steps of Applied Failure Analysis
Step 1 State the Problem
Step 2 Get Organized to gather facts
Step 3 Observe and record facts
• What do I see? - Facts: surface textures, background facts
Step 4 Think logically with the facts
• What does it mean? (change facts into events) - Events: something happened at
a certain point in time
• Where do I go next? (More facts)
[Time line diagram: New (Cause) -> sequence of events (Time line) -> Broke (Result)]
Step 5 Determine the most probable ROOT CAUSE
a. What happened first
b. How it happened - brainstorm a list, then investigate and eliminate
c. Who is responsible - brainstorm a list, then investigate and eliminate
Root cause statement = what happened first + how it happened + who is responsible
Ask the double check question
Prepare a report using the first 5 steps as an outline
Step 6 Communicate with the responsible party
Step 7 Make the repairs
Step 8 Follow up to assure the problem is solved

[Diagram: facts from Failed Parts and facts from Application, Operation, and
Maintenance are each identified, interpreted, and followed. The facts become
events on a time line running from Root Cause to Result.]


The 10 Step Universal Troubleshooting Process

1. Get the Attitude


2. Get a complete and accurate symptom
description
3. Make a damage control plan
4. Reproduce the symptom
5. Do the appropriate general maintenance
6. Narrow it down to root cause
7. Repair or replace the defective
component
8. Test
9. Take pride in the solution
10. Prevent future occurrences of the
problem
Step 1 Get The Attitude
In troubleshooting, as in any other human endeavor, you must have the right
attitude to succeed. You CAN solve it. It's not magic -- there's always an
explanation. Don't try to fix it, just try to narrow it down. Don't panic. Don't get
mad. Be patient and don't skip steps. When you get in a bind, just ask yourself
"how can I narrow it down one more time?". Practice teamwork.

The best way to get and maintain the attitude is to remember that it is a
mathematical certainty that you will solve any reproducible problem in a system
for which you have knowledge or system documentation. Above all, remember
that your troubleshooting power comes from your troubleshooting process.

Step 2 Get the Symptom Description


The symptom description must be as complete and accurate as possible. The
more detailed the description, the less work you will need to do. A good symptom
description minimizes the risk of “fixing the wrong problem”. Here is some
common information to describe the symptoms:

Equipment Questions

• Age?
• Maintenance history?
• History of prior problems?

General Symptom Questions

• Why do you think there is a problem?
• Any error messages, fault codes, gauge readings?
• Is the symptom intermittent or reproducible?

Reproducibility Questions

• Is there a procedure to CONSISTENTLY reproduce the symptom? If the
answer is YES, the problem is reproducible. If the answer is NO, the problem is
intermittent.

If Reproducible

• Describe the procedure to produce the symptom.
• Is there any way to make the symptom go away?

If Intermittent

• How often does it seem to happen?
• What seems to make it more frequent?
• What seems to make it less frequent?
• Is there anything that seems to make it go away?

Other Related Symptoms

• Any other oddities evident?
• Any other symptoms present even in other systems?

Occurrence Questions

• When did the problem start?
• Did anything else happen about that time?
• Were any changes made in the equipment or its operation?
Step 3 Make a Damage Control Plan
Before you begin, consider whether anything you might do could actually make the
problem worse. Take safety precautions. The equipment may not be working
properly and some hidden dangers may exist.

Precautions to prevent injury to people

• Wear proper clothing
• Protect Yourself!
• Make sure you are not creating a fire or other hazard

Precautions to prevent damage to equipment

• Will any test potentially cause damage?
• Will reproducing the problem potentially cause additional damage?
• Make sure proper procedures are available for disassembly and assembly.

Step 4 Reproduce the Symptom


You can’t fix what you can’t see! If you can’t reproduce the problem, you can’t
narrow down possibilities and find a solution. If you can’t reproduce the problem,
you can’t be sure if any fix actually is a correction or is creating another situation.
Step 5 Do the Appropriate Maintenance
We have all felt stupid after spending hours narrowing down a problem to
something that would have been corrected with simple general maintenance. The
trick is to determine what is appropriate maintenance. An action is appropriate
general maintenance if:

• It addresses a likely cause of the problem, is easy to do, and is a maintenance item.
• It addresses a possible cause of an intermittent problem, and is not difficult.
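
The two rules above can be written as a small check. This is only a sketch: the `MaintenanceAction` fields and the example actions are assumptions chosen to mirror the wording of the rules.

```python
from dataclasses import dataclass


@dataclass
class MaintenanceAction:
    name: str
    likely_cause: bool       # addresses a likely cause of the symptom
    possible_cause: bool     # addresses at least a possible cause
    easy: bool               # quick and not difficult to do
    maintenance_item: bool   # part of normal general maintenance


def is_appropriate(action: MaintenanceAction, intermittent: bool) -> bool:
    """Apply the two rules for appropriate general maintenance."""
    rule_1 = action.likely_cause and action.easy and action.maintenance_item
    rule_2 = intermittent and action.possible_cause and action.easy
    return rule_1 or rule_2


# Hypothetical actions for a machine with a sluggish hydraulic function.
top_up_oil = MaintenanceAction("top up hydraulic oil", True, True, True, True)
clean_connector = MaintenanceAction("clean a connector", False, True, True, False)
rebuild_pump = MaintenanceAction("rebuild the pump", True, True, False, False)

print(is_appropriate(top_up_oil, intermittent=False))
print(is_appropriate(clean_connector, intermittent=True))
print(is_appropriate(rebuild_pump, intermittent=False))
```

Topping up the oil passes the first rule, cleaning the connector passes only the second rule (and only for an intermittent), and the pump rebuild passes neither because it is not easy.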

Step 6 Narrow It Down


This is the most complex step. There are many different analysis techniques that
can be used to narrow the problem back to root cause. Regardless of the method
used, the goal is to eliminate possibilities. The method that achieves this most
quickly and accurately is usually the best.

Intermittent problems complicate the solution process, as it may not be possible to
eliminate potential problem areas unless the problem is present during testing.

There are some techniques to improve the chances of finding a solution to a problem
that is not always reproducible. Knowledge of the system may allow for educated
“guesses”. While this is little more than “trial and error”, the more
knowledgeable the troubleshooter is about the system, the more likely they will
find a cause.

Triple Tradeoff: Ease vs. Likelihood vs. Systematic Approach


A significant tradeoff is the fact that troubleshooting tests may be time consuming,
inconclusive, and sometimes risky. A test that eliminates the greatest number of
possibilities is often the hardest, both to design and conduct. Sometimes it is OK
to test easy and likely problems first to quickly eliminate possibilities.
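
One way to picture the ease-versus-likelihood side of this tradeoff is to rank candidate tests by estimated chance of finding the fault per minute spent. The scoring rule, the test names, and the numbers below are assumptions made for illustration; they are not part of the process itself and do not replace systematic narrowing.

```python
from dataclasses import dataclass


@dataclass
class CandidateTest:
    name: str
    minutes: float       # ease: how long the test takes
    likelihood: float    # estimated chance this test exposes the fault (0..1)


def rank_tests(tests: list[CandidateTest]) -> list[CandidateTest]:
    """Order tests by estimated chance of finding the fault per minute spent."""
    return sorted(tests, key=lambda t: t.likelihood / t.minutes, reverse=True)


# Hypothetical estimates for three possible tests.
candidates = [
    CandidateTest("swap the hose", minutes=30, likelihood=0.3),
    CandidateTest("pressure test at pump outlet", minutes=45, likelihood=0.5),
    CandidateTest("check fluid level", minutes=2, likelihood=0.2),
]
ranked = [t.name for t in rank_tests(candidates)]
print(ranked)
```

The quick fluid-level check comes first even though it is the least likely cause, because it costs almost nothing to rule out.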

Letting the Problem Out of the Box


Most analysis techniques continually try to force the problem into smaller areas
until the only thing left is the root cause. The worst thing that can happen is that
the problem escapes the box. A test that should have eliminated a possibility may
have been incorrect. When that happens, tests become inconclusive. The true
cause may escape detection. The troubleshooter may think they have eliminated a
possibility when that area actually contains the solution. Failure to find a solution
leads to confusion and wasted time. The troubleshooter begins to doubt
themselves and the process. Do Not Skip Steps.
Step 7 Repair or Replace Component
The easy part. Be sure the repair is done correctly and no new problems are
created due to workmanship.

Step 8 Test
Final testing is the best way to know if a problem is fixed. If the symptom you
obtained in Step 2 and reproduced in Step 4 is now gone, and no new problems
have occurred, the end user will probably be happy. Most horror stories occur
when final testing was inadequate or non-existent. When testing ask four quality
questions:

• Did the symptom go away?
• Did the right symptom go away?
• Did I fix the right cause?
• Did I create any new problems?
Some testing may be done on an incomplete system, but it is important to do a
final test when everything is completely reassembled.

Testing may not confirm the correction of an intermittent problem. If you had
difficulty reproducing it in the first place, checking to see if it is now gone is equally
difficult. If final testing is inconclusive with an intermittent problem, the final
testing may be done by the end user, if the user understands they are doing the
testing, and they consent to using the product without confirming that the problem
has been corrected.

Step 9 Take Pride


The other steps fixed the problem. Troubleshooting can be an intense process and
must be done unemotionally. Now is the time to take pride in finding the solution
and share your success with others. Review what you did, and evaluate what you
could have done better or differently, and pat yourself on the back for any brilliant
ideas.
Step 10 Prevent Future Occurrence
Document the symptom and solution
You may have the same problem again in the future.

Tell others
Let other people benefit from your experience. They may give you some insights
into solutions they have found. In some cases, manufacturers should be notified.
There may be a problem developing that they should be aware of.

Give the customer user instructions


Often the customer or user may have done something to create or aggravate a
problem. Educating them (tactfully) about how to properly use and care for the
equipment will increase their satisfaction and reduce the likelihood of future
problems.
Common Elements of All
Troubleshooting Processes
Define problem
Duplicate problem
Compile available information
System information
Limitations
Analysis
Systematic approach
Goal is to isolate problem or fault
by ruling out possibilities
Repair
Confirm solution
Troubleshooting Questions

1. What are the symptoms?
a. What is working normally?
b. What is NOT working normally?
2. What are possible causes for each symptom?
a. For each cause, what symptoms should be present?
b. For each cause, what symptoms should NOT be present?
3. What tests/observations can you make to verify the presence of a cause?
a. What should the results of the test/observation be if
i. The system is working normally?
ii. The expected problem is present?
4. What should be fixed to correct the cause?
a. Why?
b. Will this affect anything else?
5. When repairs are made, is the problem corrected?
6. Do any problems still exist?
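
Questions 1 and 2 can be combined into a simple elimination check: a cause stays on the list only if none of the symptoms it should produce have been confirmed absent, and none of the symptoms it should not produce have been observed. The sketch below assumes a hypothetical hydraulic example; the cause names and symptom lists are illustrative, not service data.

```python
def consistent_causes(
    observed: set[str],
    confirmed_absent: set[str],
    causes: dict[str, dict[str, set[str]]],
) -> list[str]:
    """Keep only the causes that agree with what is and is not working."""
    keep = []
    for cause, expect in causes.items():
        # A symptom this cause should produce has been checked and is not there.
        contradicted = not expect["present"].isdisjoint(confirmed_absent)
        # A symptom this cause should NOT produce has been observed.
        excluded = not expect["absent"].isdisjoint(observed)
        if not contradicted and not excluded:
            keep.append(cause)
    return keep


# Hypothetical cause table.
causes = {
    "low oil level": {"present": {"slow lift", "pump noise"}, "absent": set()},
    "worn cylinder seal": {"present": {"slow lift", "load drifts down"},
                           "absent": {"pump noise"}},
    "stuck relief valve": {"present": {"slow lift", "slow steering"}, "absent": set()},
}
observed = {"slow lift", "pump noise"}     # what is NOT working normally
confirmed_absent = {"slow steering"}       # what is working normally
remaining = consistent_causes(observed, confirmed_absent, causes)
print(remaining)
```

Note that information about what IS working does as much of the eliminating here as the symptoms themselves.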
The Importance of Quality Control
The quality of the solution depends on the quality put in the process. Getting an
accurate and complete symptom description, and reproducing the symptom assures
that you fix the problem the customer wanted fixed. A good plan assures you
won’t make anything worse. A good analysis process will reduce costs, and
ensure that the real root cause is found. Proper repair prevents further problems.

The quality of tests done during the analysis process is critical to ensure that the
results are accurate. A poor test can cause the real problem to remain hidden, and
the tests to be inconclusive. A good test can be duplicated and double checked.
An unnecessary test is just as bad as a poor quality test, since the information does
not help eliminate possibilities.

Final testing is like inspection in a factory. Small defects that escape detection
earlier in the process are caught here, by showing that the original symptoms that
were reproduced are now eliminated. Preventing a later occurrence of the problem
is a key to good service.

Don’t Skip steps


Since all troubleshooting techniques depend on the idea that testing and analysis
will eliminate possibilities and lead us to the root cause, skipping steps can lead us
to disaster. The true cause may escape detection. The troubleshooter may think
they have eliminated a possibility when that area actually contains the solution.
Failure to find a solution leads to confusion and wasted time. The troubleshooter
begins to doubt themselves and the process. Do Not Skip Steps.
Common Troubleshooting Pitfalls
• No understanding of system operation
You can’t fix what isn’t broke! How can you tell what is
wrong if you don’t know what is correct operation?

• No Plan
If you don’t have a start and a finish in mind, how do you
find the path?

• Skipping steps
What have you missed? Is it important?

• Jumping from one analysis tool to another
If you keep changing direction, will you find your goal?

• PANIC
What ARE you doing?

• No “Quality Control”
If a test, a tool, or procedure is inaccurate, how can you
trust the results?

• Too many tests, too much information
Sometimes additional information may be confusing. Do
you really need that information?
Troubleshooting Flow Chart
[Humorous flow chart with YES/NO branches. Decision boxes: DID YOU MESS WITH IT?,
DOES ANYONE KNOW?, ARE YOU IN TROUBLE?, CAN YOU BLAME ANYONE ELSE? Outcome boxes:
DON'T MESS WITH IT, YOU POOR FOOL, HIDE IT, THROW AWAY THE EVIDENCE.]
Intermittents & Reproducibles

Definitions:
These are two kinds of symptoms, and they're opposite and mutually exclusive. Here are the
definitions:

A Reproducible is: a symptom which can be consistently reproduced using a known
procedure.

An Intermittent is: a symptom for which there is no known procedure to
consistently reproduce it.

Notice the following about these definitions:

An intermittent can be reproduced. However, the troubleshooter cannot cause its reproduction
because there is no known procedure to consistently reproduce it. The best the troubleshooter can
do is create an environment to increase the odds of the symptom occurring, and wait. When the
symptom occurs, it reproduces itself.

An intermittent can become a reproducible. This happens when the troubleshooter finds a
procedure to consistently reproduce the symptom. In other words, these terms are from the frame
of reference of the troubleshooter, not the physical world. In the physical world everything is
reproducible if viewed in enough (molecular, atomic, etc.) detail.

“The picture goes blank within an hour of turning it on” is not a reproducible
symptom. The word “within” means sometimes it's an hour, sometimes 45 minutes, etc.
The exact time is governed by chance. It’s probable that some time it will take
more than an hour to occur, maybe much more.

Why reproducibles can always be solved:


As per the definition, the troubleshooter can reproduce the symptom at will. If the troubleshooter
performs a test that stops the known procedure from reproducing the symptom, he or she has
clearly ruled out part of the search area. After a number of such tests, the technician will have
narrowed the cause to a single component, which can then be repaired or replaced. There are two
requirements for mathematical certainty of solution in reproducibles: 1) the troubleshooter has
sufficient knowledge of the system to devise tests that will narrow the search area and sufficient
knowledge to interpret those tests correctly (often possession of technical documentation is
enough), and 2) the troubleshooter uses a procedure for these tests to guarantee that he or she
doesn't "go around in circles". Given these requirements, a reproducible symptom will always be
traced to its root cause.
Why intermittents are so hard to troubleshoot:
With intermittents, there's no mathematical certainty of solution. Indeed, many intermittents are
never solved. Here's why. Since reproduction of the symptom isn't in the troubleshooter's hands,
there's no way of knowing whether a symptom went away because of a test the troubleshooter
performed, or because of random chance. Since no conclusive test can rule out part of the search
area, the underlying cause can't be traced. Instead, the troubleshooter uses a combination of
general maintenance, statistical analysis, intuition, and trial and error. These four tools often lead
to a solution, but sometimes don't. If the troubleshooter can't reproduce the symptom at all, all he
or she is left with is general maintenance and guesswork. In this case, the probability of solution
is low.

Intermittent busting strategies:


Ignore it: If the problem causes no danger to people or property, and if the symptom is so rare it
isn't an inconvenience, this is the best policy. Note that the hardest problems to fix are those that
happen least frequently.

General maintenance: Since intermittents are so tough to troubleshoot, general maintenance
starts looking a lot easier. Cleaning every connector in an electronic system might seem like too much
work for a reproducible, but compared to the hassle of troubleshooting an intermittent it's
downright easy. It's a useful policy to “General Maintenance” an intermittent, then either test it or
give it back to the user/customer to test. Be sure the customer or user is informed of what you
did, and what he or she can expect.

Change the environment: Turn the intermittent against itself. Sometimes you can change the
environment that the system operates in. Using a heat gun to heat a component that only gives
trouble intermittently may create the symptoms and narrow the problem down physically. If that doesn't
work, wiggle things looking for bad connections, or move things around and see what happens.
By turning the intermittent against itself you may actually have an easier time with an intermittent
than with a reproducible.

Convert the intermittent into a reproducible: If you can isolate a portion of a system and then
thoroughly check and exercise just a small part of the system, it may be possible to force the
system to reproduce the symptoms. Always try to find a procedure to consistently reproduce the
symptom.

Statistical analysis: We all use a human, subjective style of statistical analysis when dealing
with intermittents. "It seems to happen more when...", "It seems to happen less when I...", "It
seems to happen about once every..." are examples. The real breakthrough will come when a
diagnostic machine is able to exercise the system in several ways, record the instances of the
symptom, correlate them to the exercises, and statistically evaluate the correlation. When
something is three or more standard deviations outside the norm, bingo, you've got your
reproduction procedure.
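
The correlation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log of exercises, the symptom counts, the 2% baseline rate, and the use of a normal approximation to the binomial are all hypothetical and do not come from any real diagnostic machine.

```python
import math

def z_score(symptoms, trials, baseline_rate):
    """How many standard deviations the observed symptom count sits above
    the count expected at the baseline rate (normal approximation)."""
    expected = trials * baseline_rate
    std_dev = math.sqrt(trials * baseline_rate * (1 - baseline_rate))
    return (symptoms - expected) / std_dev

def flag_exercises(log, baseline_rate, threshold=3.0):
    """Return the exercises whose symptom count is at least `threshold`
    standard deviations above what the baseline rate predicts."""
    return [name for name, (symptoms, trials) in log.items()
            if z_score(symptoms, trials, baseline_rate) >= threshold]

# Hypothetical log: exercise -> (times symptom appeared, times exercise was run)
log = {
    "idle": (2, 100),
    "vibration": (3, 100),
    "heat soak": (15, 100),
}
BASELINE_RATE = 0.02  # assumed: symptom shows up in 2% of ordinary runs

suspects = flag_exercises(log, BASELINE_RATE)
print(suspects)  # ['heat soak']
```

With these made-up numbers only the heat soak exercise clears the three standard deviation threshold, so it becomes the candidate reproduction procedure to try next. With small counts the normal approximation is rough, so treat a flag as a lead to test, not as proof.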
Summary:
Intermittents and reproducibles are opposites. A reproducible can be consistently reproduced by a
known procedure. This is not true of an intermittent. It is a mathematical certainty that a reproducible
will be traced to its root cause by a person using a systematic approach and having sufficient
knowledge of the system to devise and interpret conclusive troubleshooting tests. This is not true
of an intermittent. When confronted with an intermittent, use one or more of these approaches:
Ignore it, General maintenance, Change the environment, Convert the intermittent into a
reproducible, Statistical analysis.
Operating As Designed
“Phantom Faults”

This condition refers to instances where a system operating as designed is perceived to
be unsatisfactory or undesirable. In general this is due to:

1. A lack of understanding by the customer.

2. A conflict between customer expectations and system design intent.

3. System performance that is unacceptable to the customer.

WHAT TO DO:

The technician can verify that a system is operating as designed by:

! Reviewing the Published Service Information functional / diagnostic checks.

! Examining bulletins and other service information for supplementary
information.

! Comparing system performance to a like system.

A. If the condition is due to a customer misunderstanding or a conflict
with customer expectations, the technician should explain the system
operation to the customer.

B. If the concern is due to a case of unsatisfactory system performance, the
technician should call Technical Assistance or the Dealer for the latest
information.
Analysis Tools

[Figure: Divide and Conquer diagram in which successive tests (Test 1 through Test 5) each eliminate part of the search area.]

Ask others !!

Find out what is already known about the problem. Check service
manuals, bulletins, magazine articles, or any other printed reference. Ask
an expert. Before you do, ask yourself an important question. Do I trust
the expert and what are their qualifications? If the problem has been
experienced before, someone, somewhere may already have a solution.
Why reinvent the wheel?

“Shotgun” Approach

Unfortunately, this is a common method. Some technicians don’t have a
plan, so they just try anything. This is sometimes better referred to as
“trial and error”, usually error! A sure sign that this is the process being
used is evidence of “throwing parts” at the problem. Some technicians,
and others, seem to think the only solution is to keep changing parts until
the problem goes away. This can be very expensive and is no guarantee
of success. The problem may go away eventually - after everything is
changed! This is a very poor technique.

Linear Analysis

This is a slightly more systematic process than trial and error. It consists
of testing every component in a system in sequence. This is sometimes
referred to as “exhaustive search”, and it can be exhausting! It is like
taking the first road from your house and then trying every possible route until
the desired destination is reached. Start at the beginning, and check, test, and
adjust everything along the path. This is a time-consuming process and
should eventually get results, but only if the correct system is being
checked. This process is sometimes used as a last resort.
Topographical

Best described as following a map, schematic, flow chart, or troubleshooting
tree. The problem is that this may be someone else’s logical process. If
you don’t know the logic behind the process, and the process fails, you
won’t know why, or where to go next. However, if the process has been
well developed and tested, this can be a quick way to check out a system if
you have limited knowledge. Be aware, if this process fails, you will have
to try a different approach.

Split in Half - Interval Halving - Divide and Conquer

This is one of the best methods to quickly isolate a problem. This
technique can be used on almost any system. The concept is to devise
tests that eliminate 1/2 of the possibilities in each step. The first step is to
use a test to eliminate 1/2 of the total system. The next step takes the part
that does not test as ok, and uses a test to test 1/2 of that part. Each
successive step keeps splitting the remaining section in half. It does
require some expertise to devise realistic and effective tests, but the
reward is great. The main limitation of this technique is the accuracy of the
test, but if the tests are valid, the problem will be isolated quickly. Many
expert troubleshooters use this process whenever possible because of the
speed and accuracy.
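
The halving idea can be sketched in Python for the simplest case: a chain of stages where a single fault breaks everything downstream of it. The stage names and their order, and the `passes_up_to` test function, are hypothetical stand-ins for whatever real measurement is available (for example, checking for voltage at the output of a stage).

```python
def isolate_fault(components, passes_up_to):
    """Find the faulty stage in a series chain by interval halving.

    components   -- ordered list of stage names, signal flowing left to right
    passes_up_to -- hypothetical test; passes_up_to(i) is True when the
                    output of stage i still tests ok
    Assumes exactly one fault, and that it spoils every stage after it.
    Returns (faulty stage name, number of tests used).
    """
    low, high = 0, len(components) - 1
    tests = 0
    while low < high:
        mid = (low + high) // 2
        tests += 1
        if passes_up_to(mid):
            low = mid + 1   # stages up to mid are ok; fault is downstream
        else:
            high = mid      # fault is at mid or somewhere upstream
    return components[low], tests

# Hypothetical starter circuit chain; the fault is placed at the solenoid.
chain = ["battery", "switch", "relay", "solenoid", "starter"]
fault_index = chain.index("solenoid")

result = isolate_fault(chain, lambda i: i < fault_index)
print(result)  # ('solenoid', 2)
```

Each test discards roughly half of what is left, so the number of tests grows with the logarithm of the number of stages, while a stage-by-stage check grows with the number of stages itself.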

Probabilities

Some diagnostic procedures are based on probabilities. This is a common
method used to troubleshoot computer hardware. After a system has been
in use for some time and a history of symptoms and solutions has been
established, it is relatively easy to generate a list of probabilities. Some
failure will be the most probable, followed by the next most likely, and so
on. This technique will test the most likely problem first to eliminate it, then
the next, and down through the list. The concept is that if 50% of the
problems are caused by one failure, one test can either confirm or
eliminate that as the cause. This works well only if the list is based on a
good, accurate history. When an experienced technician says “I have
seen this before and it is usually caused by ....” , they are using
probabilities. The more experience a person has on a subject, the more
likely they can identify the correct probabilities, but there are no
guarantees. This is sometimes a very good method to use first, if only a
few failures are known to cause most of the known problems. You may still
need to go back to a technique like halving if the problem remains elusive.
At least, the common problems can be eliminated quickly.
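
A short Python sketch of why testing in order of likelihood pays off. The failure history below is invented for illustration, and the sketch assumes that each test conclusively confirms or eliminates exactly one fault and that the real fault is somewhere on the list.

```python
def order_by_likelihood(history):
    """Sort fault names from most to least frequently recorded."""
    return sorted(history, key=history.get, reverse=True)

def expected_tests(history, order):
    """Average number of tests needed to reach the real fault when the
    faults are checked one at a time in the given order."""
    total = sum(history.values())
    return sum((position + 1) * history[fault] / total
               for position, fault in enumerate(order))

# Hypothetical history: fault -> number of past repairs it accounted for
history = {
    "failed relay": 15,
    "dead battery": 50,
    "worn starter": 5,
    "corroded terminal": 30,
}

best_order = order_by_likelihood(history)
worst_order = list(reversed(best_order))

print(best_order)
print(expected_tests(history, best_order))
print(expected_tests(history, worst_order))
```

With this made-up history, checking the most likely fault first averages about 1.75 tests per job, against about 3.25 when the list is worked in the opposite order. The saving depends entirely on how accurate the history is.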

Pattern Recognition

This is a process that is sometimes used by the best troubleshooters. It
requires considerable system knowledge and the ability to “visualize” the
system in use. After the symptoms are confirmed, the troubleshooter asks
a series of “what ifs”. Is there a pattern to the problem? What problems
can co-exist? If this fails, what will happen? Sometimes this is referred to
as “free association” or “thinking out of the box”. The major pitfall of this
method is that it is easy to stray from the reality of the problem. The user
must be very disciplined mentally or the process may deteriorate into trial
and error, or guessing. The pattern may not be very clear, so it must be
tested carefully to verify that the explanation is possible and realistic.

Pyramid

This is similar to halving. Start by checking the major system. Then
check each subsystem and eliminate the ones that check ok. Within the subsystem that
remains, check all component groups to eliminate the ones that work.
Continue the process until one component or area is isolated. This is
a common structure for flow charts.
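
The pyramid descent can be sketched in Python as a walk down a tree of systems, subsystems, and components. The machine layout and the `checks_ok` function are hypothetical, and the sketch assumes a single fault, so it simply follows the first part that fails its check at each level.

```python
def pyramid_search(node, checks_ok):
    """Descend from the whole system to the smallest failing part.

    node      -- dict with a "name" and an optional list of "children"
    checks_ok -- hypothetical test; checks_ok(name) is True if that
                 subsystem or component checks ok
    Returns the list of names followed from the top down to the fault.
    """
    path = [node["name"]]
    while True:
        failing = [child for child in node.get("children", [])
                   if not checks_ok(child["name"])]
        if not failing:
            return path          # nothing below fails: this is the isolated area
        node = failing[0]        # single-fault assumption: follow the first failure
        path.append(node["name"])

# Hypothetical machine layout
machine = {
    "name": "machine",
    "children": [
        {"name": "hydraulic"},
        {"name": "electrical", "children": [
            {"name": "lighting"},
            {"name": "starting circuit", "children": [
                {"name": "battery"},
                {"name": "solenoid"},
            ]},
        ]},
    ],
}

# Pretend these parts fail their checks
bad = {"electrical", "starting circuit", "solenoid"}

path = pyramid_search(machine, lambda name: name not in bad)
print(path)  # ['machine', 'electrical', 'starting circuit', 'solenoid']
```

Unlike halving, the split at each level follows the way the machine is organized, not an even division of the remaining possibilities.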

Spatial

This is somewhat similar to linear, except that in this process, the
troubleshooter checks everything that could affect one system /
component, then moves on to the next system / component and does the
same. This technique is often used when there are very few
components or after another method has been used to reduce the
possibilities.

Eliminate the obvious and easy first

There are many times when the best analysis tool may be difficult to use or
time consuming. By testing the obvious and easy first, it is possible to
eliminate some possibilities very quickly. You may get lucky! This is
usually a good analysis tool as it is so easy to do.
Only checking what the problem solver already understands

There are times when a troubleshooter is unsure of which way to go or
what to do next. They usually fall back to only making tests that they are
comfortable with, on systems they understand. Unfortunately, this may not
be anywhere near the real root cause, so it is unlikely to get any good
result. Some people want to avoid learning new concepts and just
assume a new system is the same as an older system.

Accidental solution

This happens! There are times when a solution is found by accident. The
problem solver is looking in one area or at a particular problem, when
evidence is found pointing in a completely different direction. Most
experts accept that this will happen once in a while and keep an eye out for
the unexpected. They are also aware that a good analysis tool would have
taken them there eventually.
Divide and Conquer Example

[Figure: Divide and Conquer example in which successive tests (Test 1 through Test 5) each eliminate part of the search area.]
Classifying Tests
Direct Tests

A direct test is one that verifies the actual performance of a
system or component. Examples include:

1. Load testing a battery.
2. Checking cylinder-to-piston clearance with a micrometer.
3. Visually checking bearings.
4. Flow testing a pump.

The result of a direct test will accurately show the performance or
condition of the system or component. These tests check just the
component and usually do not show how the component affects other
areas or systems.

Indirect Tests

An indirect test is usually conducted when a test of a single
component or system is very difficult or time consuming. An indirect
test may also be done when it is desired to eliminate a large number
of possibilities with one test. Since most systems contain a number of
components, if the main system is working ok, then the components
of the system should be ok. In some cases, an indirect test will be
done to determine how a system affects something else. The idea is
that if the system being tested is ok, then anything that affects the
system is also ok. Examples include:

1. Starter cranks ok and the lights are bright - is battery ok?
2. Blowby test ok - are cylinders ok?
3. Oil pressure ok - are bearings ok?
4. Cycle times ok - is pump flow ok?

An indirect test may require the user to have significant system
knowledge to select the correct test and evaluate the results. An indirect test
may not be conclusive, but will usually give a very good idea about
the general condition of a component or system.

Confirmation tests

A confirmation test is a second test of a system which is done to verify
that the first test result is correct. A confirmation test may also be the
same test done a second time to verify that a test is valid and
repeatable. A confirmation test is often done if the first test was an
indirect test and the results were inconclusive. Examples include:

1. Flow testing pump if most but not all cycle times ok.
2. Load testing a battery if starter cranks ok, but lights
dim.
3. Visual inspection of bearings and clearances if oil
pressure is within normal range but at the low end.
4. Disassembling an engine and measuring clearances if
blowby is very high but power output and oil
consumption are within normal range.

A confirmation test is sometimes necessary to eliminate possibilities
and provide a conclusive, measurable result.

Redundant Tests

A redundant test is a second test of the same system that is done even if
the first test is conclusive. It can also be the first test conducted a
number of times with the same results. A redundant test is usually done if
a technician is unsure of themselves or of the test. Testing a
component of a system when the system has been shown to be
working correctly can be considered a redundant test. Examples
include:

1. Checking pump flow if case drain is excessively high.
2. Load testing a battery that won’t take a charge.
3. Checking blowby on an engine with high oil
consumption, very low power, and high hours.
4. Pulling the cylinder head for inspection even if output on
the dyno is good and blowby is good.

There is a fine line between a confirmation test and a redundant test.
The difference is that the redundant test doesn’t give any new
information. When judging whether to conduct additional tests, the
question should always be asked “will this test tell me something I
don’t already know?”

Irrelevant Test

This is the worst type of test. Sometimes a test is made of a
component that is not even in the system with the problem. Conducting a
test which has no purpose or no measurable results is an irrelevant
test. This is usually the result of the person making the test having
little or no knowledge of the system or what the test is designed to
check. Examples include:

1. Load testing a battery when the complaint is low
power from the engine.
2. Pulling the head of an engine when the complaint is poor
hydraulic performance.
3. Checking voltage drop in a light relay when the
problem is in the starting circuit.

The results of an irrelevant test will tell nothing and may actually
confuse the search for an answer.
[Figure: engine starting system diagram. Blocks: Switch, Battery, Relay, Coil, Fuel Pump, Solenoid, Points, Starter, Plugs, Fuel Tank, Air Cleaner, Ignition, Carb, Air/Fuel, Unit Starts.]

The user complains that the engine won't start.
List the first five things you would check.

1. __________________________________

2. __________________________________

3. __________________________________

4. __________________________________

5. __________________________________
Starter Circuit Test Points

Select Tests to run and test order

1. Test point _____ to test point ______

2. Test point _____ to test point ______

3. Test point _____ to test point ______

4. Test point _____ to test point ______

5. Test point _____ to test point ______

Technique used ? __________________________________


Troubleshooter “War Story” Exercise
Equipment : __________________________________________________________
System With Problem : _________________________________________________
Customer Complaint : _________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Observed Symptoms : _________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Operating Conditions : _________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Machine History : ______________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Potential Faults:

1. 5.
2. 6.
3. 7.
4. 8.

Methods Used To Isolate : ______________________________________________


_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Actual Fault : _________________________________________________________
Procedures Component _________ Component _________

Sensory Checks

Look

Listen

Smell

Touch

Taste

Technical Checks

Test1 _____

Test2 _____

Test3 _____

Job Aids

Service Manual

Schematics

Other Service Lit

Technical Support

Consult Expert
Notes
Prepared by
Cleveland Brothers Equip. Co., Inc.
Training Department
