Scripting and Programming C173: Unit 10 – Troubleshooting: Hypotheses and tests

10.1 Troubleshooting: Hypotheses and tests

Introduction to troubleshooting

Mechanical and electronic systems sometimes have problems. Ex: A lamp may suddenly turn off, a smartphone won’t charge, or a car won’t start.

In everyday life, users encountering a problem must find and fix the underlying cause. However, many users don’t follow a systematic process for finding causes, and instead try seemingly random actions. A user acting this way gets no closer to finding the cause.

Troubleshooting process

Troubleshooting is a systematic process for finding and fixing a problem’s cause in a (typically mechanical or electronic) system. Without a systematic process, the cause may never be found or may take longer to find. A common troubleshooting process is:

  1. Create a hypothesis. A hypothesis states a possible cause of a problem.
  2. Run a test. A test is a procedure whose result validates or invalidates a hypothesis.

Those steps are repeated until a test’s result validates a hypothesis.
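The loop above can be sketched in code. The following is a minimal illustration, not a definitive implementation: the `lamp` dictionary, the three hypotheses, and the `troubleshoot` function are all hypothetical names invented for this example. Each hypothesis is paired with a test that returns True when the result validates the hypothesis.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical model of a lamp that won't light (illustrative values only).
lamp = {"plugged_in": True, "outlet_has_power": True, "bulb_ok": False}

# Each entry pairs a hypothesis (a possible cause) with a test.
# The test returns True if its result validates the hypothesis,
# and False if its result invalidates the hypothesis.
hypotheses: List[Tuple[str, Callable[[], bool]]] = [
    ("Lamp is unplugged", lambda: not lamp["plugged_in"]),
    ("Outlet has no power", lambda: not lamp["outlet_has_power"]),
    ("Bulb is broken", lambda: not lamp["bulb_ok"]),
]

def troubleshoot(candidates: List[Tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Repeat 'create a hypothesis, run a test' until a test validates one."""
    for hypothesis, test in candidates:
        if test():
            return hypothesis  # validated: the cause is found
    return None  # every hypothesis was invalidated; new hypotheses are needed

cause = troubleshoot(hypotheses)
print(cause)
```

A result of None means the user must create more hypotheses, not that the system has no problem.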

Troubleshooting and programming

Nearly everyone who teaches computer programming laments that students have weak debugging skills (troubleshooting is called debugging in programming, and causes are called bugs). Most beginning programming students don’t follow a systematic process. Instead, students often:

  • Make random changes hoping to improve how a program runs, akin to randomly shaking a lamp that won’t light.
  • Ask teachers for help, before even trying to create hypotheses.
  • Don’t know how to conduct tests.

The result is that students become frustrated.

After decades of teaching beginning programmers, this material’s authors came to realize that weak debugging skills had a more fundamental cause: weak troubleshooting skills in general. If a person does not follow a systematic process when facing a problem with an everyday device like a smartphone, TV, or car, the person almost surely cannot be a good debugger.

Thus, this material’s approach is to first teach the basics of troubleshooting, using everyday devices as examples. The “hypothesis / test” loop must be mastered first, including the basic logic of troubleshooting, approaches to creating hypotheses, and hierarchical hypotheses. Once mastered, the process can be adapted for programming.

Troubleshooting is important for nearly everybody, but especially for programmers. Computer programs often have thousands of statements. One tiny mistake, even one wrong letter among tens of thousands, can cause problems when a program runs. Thus, programmers must follow a systematic process to find bugs. Following a systematic process can also reduce frustration.

Also, programmers should realize that debugging is not a nuisance but an important part of a programmer’s job, just as troubleshooting is a big part of a doctor’s job. A doctor seeing a patient who is ill doesn’t think “What a nuisance”; finding the cause of the illness is the work itself.

When faced with bugs, a programmer may take solace in knowing that every time they systematically debug a program, they become a better programmer.

10.2 Logic of troubleshooting

Hypotheses

When troubleshooting a problem, a hypothesis should be a statement of a possible cause, stated so as to be either true or false. Ex: If the problem is a lamp doesn’t light, then “The bulb is broken” is a possible cause and is either true or false. Ex: “The bulb should be replaced” is not a possible cause, but rather a solution.

Testing a hypothesis

When troubleshooting, a test is a procedure typically with two possible results that either validate or invalidate a hypothesis. Ex: To test the hypothesis “the bulb is broken”, a test is to try the bulb in a known-working lamp; the bulb not lighting validates the hypothesis, while the bulb lighting invalidates the hypothesis.

Asymmetric tests

Some tests are symmetric: A result of Yes validates, and a No invalidates, a hypothesis (or vice versa). Ex: For hypothesis “Lamp wire unplugged”, observing the wire unplugged validates, while observing the wire plugged-in invalidates. Other tests are asymmetric: A result of Yes validates, but a No does not invalidate (or vice versa). Ex: For hypothesis “Bulb broken”, shaking the bulb and hearing rattling validates (the bulb’s filament is broken), but not hearing rattling does NOT invalidate (the bulb might be broken in another way).

The user must think carefully about whether a test is asymmetric, and avoid inappropriately validating or invalidating a hypothesis based on a test result.

Asymmetric tests are often used to try to quickly validate a hypothesis. If a lamp doesn’t light, shaking the bulb is an easy test to quickly validate a “Bulb broken” hypothesis, before trying other more time-consuming tests like putting the bulb into a known-working lamp.
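The difference between symmetric and asymmetric tests can be made concrete by giving a test three possible outcomes rather than two. The sketch below is only an illustration: the `Result` type and both function names are assumptions made for this example, and the two tests are the ones described above for the hypothesis “Bulb broken”.

```python
from enum import Enum

class Result(Enum):
    VALIDATED = "validated"
    INVALIDATED = "invalidated"
    INCONCLUSIVE = "inconclusive"

def known_working_lamp_test(bulb_lights: bool) -> Result:
    """Symmetric test: either outcome settles the 'Bulb broken' hypothesis."""
    return Result.INVALIDATED if bulb_lights else Result.VALIDATED

def shake_test(bulb_rattles: bool) -> Result:
    """Asymmetric test: rattling validates, but silence proves nothing."""
    if bulb_rattles:
        return Result.VALIDATED   # the filament is broken
    return Result.INCONCLUSIVE    # the bulb might be broken in another way

print(shake_test(False).value)
print(known_working_lamp_test(False).value)
```

Returning INCONCLUSIVE instead of INVALIDATED is the point: treating the quiet bulb as a working bulb would be an inappropriate invalidation.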

More on tests and solutions

  • After a test validates, the solution may be obvious. Once a cause is found, a user tries to solve the problem, and the solution often follows directly from the cause. Ex: If a test visually inspects a lamp’s wire, which is found unplugged, the solution is to plug in the wire. If a test checks a smartphone’s volume setting, which is found turned off, the solution is to turn the volume on.
  • Some tests solve the problem too. Sometimes the test itself solves the problem. Ex: A test for hypothesis “Bulb broken” is “Insert a new bulb”. If the lamp lights, the hypothesis is validated, and the problem is solved.
  • Some hypotheses can’t be directly tested, so a solution is tried. Sometimes a direct test of a hypothesis isn’t possible, but a solution can be tried. If the problem is solved, the hypothesis is indirectly validated. Ex: Problem: A user can’t hear audio on Skype. Hypothesis: Skype’s software has been running a long time and is in a bad state. Test: No simple test exists, since users can’t examine the software. However, a solution would be to restart Skype, which starts in a good state. If that solves the problem, the hypothesis is indirectly validated, though one can’t be sure the hypothesized cause was the actual cause.
  • Some tests address multiple hypotheses. Sometimes a single test may invalidate multiple hypotheses. Ex: A problem is the grass turned brown. Hypotheses include “Water is off”, “Sprinkler system is off”, and “Sprinkler system has large underground leak”. One test, “Check if grass is wet”, can invalidate all three hypotheses; if wet, the water must be on, the sprinkler system must be on, and a large leak must not exist (assuming the user knows rain hasn’t occurred lately).
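The last item, one test invalidating several hypotheses, can be sketched as removing a group of candidates from a set in one step. This is a rough illustration under stated assumptions: the function and variable names are invented here, no rain has occurred lately, and the fourth hypothesis, “Grass has a disease”, is a hypothetical addition included only to show that unrelated candidates survive the test.

```python
# Hypotheses that wet grass rules out (assuming no recent rain).
RULED_OUT_BY_WET_GRASS = {
    "Water is off",
    "Sprinkler system is off",
    "Sprinkler system has large underground leak",
}

def check_if_grass_is_wet(grass_is_wet: bool, candidates: set) -> set:
    """Return the hypotheses still standing after the test."""
    if grass_is_wet:
        # One result invalidates all three sprinkler-related hypotheses.
        return candidates - RULED_OUT_BY_WET_GRASS
    # Dry grass does not say which of the hypotheses is the cause,
    # so every candidate remains.
    return set(candidates)

# "Grass has a disease" is a hypothetical extra candidate, not from the text.
candidates = RULED_OUT_BY_WET_GRASS | {"Grass has a disease"}
remaining = check_if_grass_is_wet(True, candidates)
print(remaining)
```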

10.3 Creating hypotheses

Creating hypotheses

When faced with a problem, a user typically creates multiple hypotheses of causes. Usually, a user initially creates just a few hypotheses, for the most-likely and easily-testable causes.

10.6 Knowledge

Knowledge yields hypotheses and tests

When a system has a problem, each hypothesis and test comes from knowledge of how the system’s parts work. Ex: If a lamp doesn’t light, hypothesis “Not plugged in” and test “Check if wire is firmly plugged in” come from knowledge that a lamp requires electricity from a wire firmly plugged into an outlet.

People often don’t care to know how things work, like a car, a house’s electricity or plumbing, or a smartphone, seeing themselves just as users. But problems occur often, and when one does, having such knowledge helps in troubleshooting.

10.10 Hierarchical hypotheses

Hierarchy

Hierarchy means an object can be decomposed into sub-objects. Ex: The U.S.A. can be decomposed into states, each state into counties, and each county into cities. In troubleshooting, a hypothesis may be decomposed into more precise sub-hypotheses.

Commonly, a user strives to find where in a large item or system a problem’s cause exists. Binary search divides an item into two halves, runs a test to decide in which half something lies, and repeats the binary search on that half. Ex: A user’s hypothesis “File has a bad character” may be decomposed into “Bad character is in first half” and “Bad character is in second half”. Upon determining which half, the user repeats the process on that half.
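The “bad character” example can be sketched as a short function. This is a minimal sketch under assumptions: the names `find_bad_line` and `has_unprintable` are invented for illustration, a “bad character” is taken to mean an unprintable one, and the hypothesis “File has a bad character” is assumed to be already validated for the whole file.

```python
from typing import Callable, List

def find_bad_line(lines: List[str],
                  chunk_is_bad: Callable[[List[str]], bool]) -> int:
    """Return the index of a line holding the cause, by repeated halving.

    Assumes at least one bad line exists somewhere in `lines`.
    """
    lo, hi = 0, len(lines)  # the cause lies somewhere in lines[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if chunk_is_bad(lines[lo:mid]):  # sub-hypothesis: "in first half"
            hi = mid
        else:                            # otherwise: "in second half"
            lo = mid
    return lo

def has_unprintable(chunk: List[str]) -> bool:
    """The test: does any line in the chunk contain an unprintable character?"""
    return any(not ch.isprintable() for line in chunk for ch in line)

# A 200-line file with one bad character hidden on line index 136.
lines = ["ordinary text"] * 200
lines[136] = "ordinary\x00text"
index = find_bad_line(lines, has_unprintable)
print(index)
```

Only one half is tested per step; if the first half passes the test, the cause must be in the second half, because the whole range is already known to contain it.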

Halving decreases size quickly. Ex: If a file has 200 lines, each halving yields 100, 50, 25, 13, 7, 4, 2, 1 (rounding fractions up), so only 8 halvings arrive at 1 line, where the cause must exist.
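The arithmetic for the 200-line file can be checked with a few lines of code. The function name `halving_sizes` is an assumption for this sketch; it rounds fractions up, as in the example.

```python
from typing import List

def halving_sizes(n: int) -> List[int]:
    """List the size remaining after each halving, rounding fractions up."""
    sizes = []
    while n > 1:
        n = (n + 1) // 2  # integer halving, rounded up
        sizes.append(n)
    return sizes

sizes = halving_sizes(200)
print(sizes)       # [100, 50, 25, 13, 7, 4, 2, 1]
print(len(sizes))  # 8
```

Doubling the file to 400 lines adds just one more halving, which is why binary search stays practical even for large systems.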

