Part 8: Basic Statistical Analysis of Random Errors

Professional Surveyor Magazine - April 1998

Random error was introduced and defined in Part 2 of this series ("Mistakes and Errors," Professional Surveyor, April 1997, pp. 19-22). The nature of these errors was more completely explained in Part 3 ("Dealing with Errors," Professional Surveyor, May/June 1997, pp. 66-68), where I stated that random errors follow statistical behavioral laws such as the laws of probability and compensation. In Part 6 ("Level of Certainty," Professional Surveyor, November/December 1997, pp. 30-32), I mentioned that statisticians and researchers use controlled experiments to determine repeatability and probability of certain outcomes or occurrence of certain selected events. In that article standard deviation was introduced and mentioned as a tool for testing precision. Level of certainty was explained, in part, by stating that the standard deviation is at the 68.3 percent level of certainty. With this as review, I will further explain this statistical tool and how to use it to test precision by using repetitions of a measurement.

Definition of Standard Deviation

The standard deviation is the value on the X axis of the probability curve that occurs at the points of inflection of the curve. To those who wish to apply this tool to practical matters in surveying, it is an estimate of the precision in the variable being tested, or the expected random error in this variable. The standard deviation is sometimes called "standard deviation of a single value," meaning that it is the expected precision of one measurement using the procedure. To arrive at a reliable value for such a single observation, a test must be done, using several repetitions of a measurement, simulating the field conditions to be used later.

The late Sol A. Bauer, a former surveyor from Cleveland, Ohio, and ACSM President in 1949, performed extensive testing of his field procedures over a 15-year period starting in the early 1930s. His results were published in Surveying and Mapping, Volume VII, No. 3 (1947). In a later article in Volume XI, No. 2 (1951), Mr. Bauer discussed the need for more study of errors by the Property Surveys Division of ACSM. His pioneering studies set a good example and illustrate what is possible, especially considering that he didn't have the advantage of the computer for data reduction. Sol Bauer is a rare exception.

Lack of Interest in Error Testing

Overall interest in the investigation of errors seems to be relatively low among surveying professionals. It has probably been limited to analyses by federal government surveyors for high order control surveying, by engineers employed by instrument manufacturers and by a handful of people who have taken an academic interest in this important aspect of surveying. It was my experience when I was a college professor that research money was not available for such testing. Such limited testing has left the profession with incomplete and often inaccurate data on the precision of even the most common surveying methods. It is probable that most surveyors simply use assumptions, "defaults," closing errors, "least counts" or "professional judgment" to estimate errors, if they are estimated or confronted at all.

A surveyor does not have a very good understanding of, or control over, errors unless they are investigated using some sort of controlled experiment. Assumptions, or manufacturers' statements as to precision, can be considerably far from reality, and the geometry and atmospheric conditions of the survey affect the errors much more than many realize. This has been proven to me many times, in practice, in my own research and while teaching seminars on measurement analysis in which group opinions ("judgments") were often solicited, then compared with results of statistical testing and geometric analyses.

Error estimates that are close to reality are the only thing that makes least squares adjustments valid and justifiable, because the weights for adjustments are proportional to the inverse of the squares of the random errors. Furthermore, when propagating errors to determine error estimates for any indirect measurement, or to estimate positional errors, having even one poor estimate among the several errors used in the propagation causes the results to be "garbage." I shudder when I think of the thousands of people using sophisticated software for analysis of errors who have not carefully considered the numerical estimates of the errors being plugged into those computers (or used as defaults), and who then consider the results, with their error ellipses and data printouts, as being worth any more than the print paper itself.
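As a minimal sketch of the weighting rule just stated, the following Python fragment converts a set of estimated random errors (standard deviations) into relative weights. The function name and the three sample error values are hypothetical, chosen only for illustration; they do not come from any test described in this series.

```python
def relative_weights(sigmas):
    """Return weights proportional to 1 / sigma**2, scaled so the largest is 1."""
    raw = [1.0 / (s ** 2) for s in sigmas]
    largest = max(raw)
    return [w / largest for w in raw]


# Hypothetical error estimates, in seconds of arc, for three angles.
estimated_errors = [1.0, 2.0, 4.0]
weights = relative_weights(estimated_errors)
print(weights)
```

Doubling an error estimate cuts the weight to one quarter, which suggests why a single poor estimate can distort an adjustment.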

Almost any measuring procedure can be tested for precision. Often individual error sources can be isolated and tested for their contribution to the total error occurring in the procedure. This breakdown has several advantages, which will be cited in the last section herein.

Basic Measurement Error Testing

Precision tests should be made to isolate random error sources and eliminate or compensate for systematic errors that might affect the data. For example, horizontal angle errors due to horizontal axis tilt caused by bubble misleveling are far different over flat terrain compared with hilly terrain. Long distances have the effect of reducing directional errors compared with short distances, when considering target or instrument centering errors as the source. Most such effects can be quantified by using geometric analysis. Ideally, errors should be investigated under several types of conditions. Even changing the observer can change precision. Any experiment must be carefully devised to control the error sources and know what the results mean.
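The effect of sight distance on a centering error can be sketched with simple geometry. The Python fragment below assumes the worst case, in which the centering offset lies perpendicular to the line of sight, and the 2 mm offset and the two sight lengths are hypothetical values used only for illustration.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi  # about 206,265


def direction_error_arcsec(centering_error, distance):
    """Angle subtended by a centering offset perpendicular to the line of sight.

    Both arguments must be in the same length unit.
    """
    return math.atan2(centering_error, distance) * ARCSEC_PER_RADIAN


# Hypothetical case: a 2 mm centering error seen over a short and a long sight.
for distance_m in (50.0, 500.0):
    error = direction_error_arcsec(0.002, distance_m)
    print(f"{distance_m:6.0f} m sight: {error:.2f} seconds")
```

Under these assumptions the directional error falls in proportion to the sight length, so a tenfold longer sight gives roughly one tenth of the angular effect.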

If I am considering precision of a horizontal angle, I might start by breaking down the angle into the variables that contribute to the total random error. I count at least 15 such variables, including 1) reading, 2) pointing, 3) target centering, 4) instrument centering, 5) bubble centering, 6) bubble sensitivity, 7) distance to backsight, 8) distance to foresight, 9) vertical angle to backsight, 10) vertical angle to foresight, 11) measuring program (repetition versus directional method), 12) number of repetitions, 13) sighting conditions (sun, haze, heat waves, ground stability and so forth), 14) criteria used for rejecting any readings and 15) size of the horizontal angle itself. Note that all three error sources are represented here—1) instrumental, 2) personal and 3) natural.

How many readers may have thought that the most significant, or perhaps the only, error source was the reading precision of the instrument? Of the 15 variables above, reading precision would probably rank among the bottom three in importance for most situations, yet it is most often used to denote the precision of the angles. How these 15 variables could be tested might be the subject of a graduate thesis. How they combine into a final angular error is determined by error propagation, which will be addressed in the next article in this series. Let us now look at the reading error.

Making A Statistical Test for Precision

Data gathering. A 1-second optical theodolite was tested by carefully focusing the eyepiece, rotating the mirror to gain optimum lighting of the circle, then rotating the micrometer knob to align the coincidence marks and estimating each reading to 0.1". The micrometer knob was then moved slightly to "destroy" the reading, and the coincidence marks realigned to get a second independent reading. This procedure was repeated until 25 readings were recorded. The readings, just as they were taken, are recorded in the first column of Table 1.

Each reading is assumed to contain only my personal, random error in aligning the coincidence marks, combined with a less important error of making an estimate to the tenth of a second on the scale. These two errors define the "reading error." Although personal in nature, being a measure of the care and skill of the observer, it is affected by the quality and cleanliness of the instrument optics, the lighting, the micrometer focus and possibly changing environmental factors. Thus, I have introduced several more variables that could be further varied and tested. Space limitations do not permit discussion of how these other variables might be isolated. It will suffice to say that they will cause slightly different results between identical tests, even with the same instrument and observer. The differences are much less significant, however, than the difference between the results of a valid test and "least counts" or numbers based on incomplete information or assumptions.

Calculation of standard deviation. Standard deviation is calculated from the equation:

$$ s = \sqrt{\frac{\sum v_i^2}{n - 1}} \qquad (1) $$

where $s$ = standard deviation, $v_i$ = the "ith" (or any) residual and $n$ = the number of observations in the set. The $\sum$ symbol means "sum of." A "residual" is the difference between an observation and the arithmetic mean, expressed as:

$$ v_i = x_i - \bar{x} \qquad (2) $$

where $x_i$ = the value of the particular observation, $\bar{x}$ = the arithmetic mean and

$$ \bar{x} = \frac{\sum x_i}{n} \qquad (3) $$

The solution of the standard deviation is next illustrated, with intermediate steps shown in Table 1 and final calculations shown below the table.

Table 1. Theodolite Micrometer Readings

Calculation of Standard Deviation of Theodolite Readings (readings and residuals in seconds of arc)

  i    Reading      v_i      v_i^2
  1     29.0       0.064     0.004
  2     28.2      -0.736     0.542
  3     28.3      -0.636     0.404
  4     27.8      -1.136     1.290
  5     30.1       1.164     1.355
  6     27.4      -1.536     2.359
  7     29.0       0.064     0.004
  8     27.8      -1.136     1.290
  9     29.2       0.264     0.070
 10     30.1       1.164     1.355
 11     29.2       0.264     0.070
 12     28.9      -0.036     0.001
 13     28.5      -0.436     0.190
 14     28.7      -0.236     0.056
 15     28.2      -0.736     0.542
 16     28.5      -0.436     0.190
 17     29.8       0.864     0.746
 18     29.6       0.664     0.441
 19     29.1       0.164     0.027
 20     29.7       0.764     0.584
 21     27.5      -1.436     2.062
 22     28.2      -0.736     0.542
 23     31.0       2.064     4.260
 24     28.8      -0.136     0.018
 25     30.8       1.864     3.474

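From Table 1, the readings sum to 723.4, giving a mean of 723.4/25 = 28.936 sec., and the squared residuals sum to 21.876, so that

$$ s = \sqrt{\frac{21.876}{25 - 1}} = 0.955 \text{ sec.} \qquad (5) $$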
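The same calculation can be sketched in a few lines of Python. The variable names are illustrative only; the data are the 25 readings of Table 1, and the steps follow Equations 1 through 3.

```python
import math
import statistics

# The 25 micrometer readings (seconds of arc) from Table 1.
readings = [
    29.0, 28.2, 28.3, 27.8, 30.1, 27.4, 29.0, 27.8, 29.2, 30.1,
    29.2, 28.9, 28.5, 28.7, 28.2, 28.5, 29.8, 29.6, 29.1, 29.7,
    27.5, 28.2, 31.0, 28.8, 30.8,
]

n = len(readings)
mean = sum(readings) / n                     # Equation 3
residuals = [x - mean for x in readings]     # Equation 2
sum_sq = sum(v ** 2 for v in residuals)      # sum of squared residuals
s = math.sqrt(sum_sq / (n - 1))              # Equation 1

print(f"mean = {mean:.3f} sec, sum of v^2 = {sum_sq:.3f}, s = {s:.3f} sec")

# The standard library computes the same sample standard deviation directly.
assert math.isclose(s, statistics.stdev(readings))
```

Because the script carries the residuals at full precision, its sum of squares is slightly larger than the total of the rounded squares in Table 1, but the standard deviation agrees to three decimals.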

The surveyor does not need to do the calculations longhand, as above. Most scientific calculators, computer spreadsheets and statistical software have routines requiring only data input and a few keystrokes to execute them. The complete solution is shown here to build an understanding of the equation, which is essential to appreciating its usefulness. Some observations in this regard are as follows.

Explanation of the Results

Note that if the sample were twice as large, both the numerator and denominator in the equation for s would also be essentially doubled. Thus, s is relatively unaffected by sample size, assuming that "n" is large enough. We could have done 50, 100 or more repetitions, and if all were done with the same skill and care, we could expect s to be essentially the same as calculated above. This is valuable insight, as it teaches that we can assume the error to be approximately the same in the future, given similar procedures (instrument, observer, conditions and so forth). My tests for this instrument have always yielded s between about 0.7 and 0.9 seconds. The above result was actually a little high, which indicates that I am probably getting out of practice taking readings.

Of course, s does not stabilize until a certain minimum number of repetitions are made. That number is generally around 15 or 20 when there are measurable deviations. If deviations are not easily detected, the number of observations must be much higher. Paraphrasing B. Austin Barry in Engineering Measurements, crudeness of method hides discrepancies, whereas refinement of method amplifies discrepancies. A crude method would be difficult to quantify as to precision using the standard deviation test.
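One way to watch how the estimate behaves as readings accumulate is to recompute s after each new reading. The following Python sketch does this for the 25 readings of Table 1; the checkpoints printed (5, 10, 15, 20 and 25 readings) are an arbitrary choice, and a single test set like this one can only suggest the trend, not prove it.

```python
import statistics

# The 25 micrometer readings (seconds of arc) from Table 1, in the order taken.
readings = [
    29.0, 28.2, 28.3, 27.8, 30.1, 27.4, 29.0, 27.8, 29.2, 30.1,
    29.2, 28.9, 28.5, 28.7, 28.2, 28.5, 29.8, 29.6, 29.1, 29.7,
    27.5, 28.2, 31.0, 28.8, 30.8,
]

# Sample standard deviation of the first k readings, for k = 2 through 25.
running_s = {
    k: statistics.stdev(readings[:k]) for k in range(2, len(readings) + 1)
}

for k in (5, 10, 15, 20, 25):
    print(f"first {k:2d} readings: s = {running_s[k]:.2f} sec")
```

Comparing the printed values shows how much the estimate still moves from one checkpoint to the next, which is the practical meaning of asking whether s has stabilized.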

A study of the equation also reveals that given any particular size of sample (denominator constant), a larger standard deviation results when the numerator is larger. This obviously happens when the residuals, or deviations from the mean, are larger. Larger deviations mean higher random errors or less agreement among duplicate readings, which, by definition, is lower precision. All of this illustrates how and why the standard deviation is a precision index.

Relating Error to Level of Certainty

Note that Equation 2 for a residual looks a lot like the equation for systematic error, except that the definition of error has true value in it rather than mean. The significance of this is that the standard deviation is an estimate of precision, not accuracy, because we are investigating the scatter of readings among themselves, each being compared with the mean and not the unknown true or exact value.

Relating all of the above to level of certainty (the third dimension to a measurement), recall that the standard deviation is at the 68.3 percent certainty level. From my test, I can declare that I can make one reading (note that I did not say measure one angle!) with this instrument and be 68.3 percent confident that it is precise to ±0.955 seconds. Recall also from Part 6 that if I want more certainty, I must widen the range (for example, the 2s error is about 95 percent certain). Some commonly accepted multipliers were given in Part 6. A more complete table is provided in my Surveying Measurements and their Analysis text and other books on the subject. Readers are invited to test the theory using the sample test. For example, 17 of the 25 readings will fall within ± s with respect to the mean. That is 68 percent, as the theory predicts. Multiplying s by 1.645 yields 1.57. Adding and subtracting this from 28.936 yields a range of 27.37 to 30.50. There are two values outside of this range (30.8 and 31.0), leaving 92 percent within the range. This is as close to the theoretical 90 percent as is possible in this set of 25 readings. The results follow the theory, as expected.
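Readers who want to repeat this check can do so with a short script. The Python sketch below counts how many of the 25 readings of Table 1 fall within the mean plus or minus a chosen multiple of s; the helper name count_within is hypothetical, and the multipliers 1.0 and 1.645 are the ones used in the text above.

```python
import statistics

# The 25 micrometer readings (seconds of arc) from Table 1.
readings = [
    29.0, 28.2, 28.3, 27.8, 30.1, 27.4, 29.0, 27.8, 29.2, 30.1,
    29.2, 28.9, 28.5, 28.7, 28.2, 28.5, 29.8, 29.6, 29.1, 29.7,
    27.5, 28.2, 31.0, 28.8, 30.8,
]

mean = statistics.mean(readings)
s = statistics.stdev(readings)


def count_within(multiplier):
    """Count readings lying within mean +/- multiplier * s."""
    half_width = multiplier * s
    return sum(1 for x in readings if abs(x - mean) <= half_width)


for multiplier, theory in ((1.0, 68.3), (1.645, 90.0)):
    inside = count_within(multiplier)
    share = 100.0 * inside / len(readings)
    print(f"+/-{multiplier}s: {inside} of {len(readings)} readings "
          f"({share:.0f}% observed, {theory}% theoretical)")
```

With only 25 readings, each reading represents 4 percent of the sample, so the observed shares can match the theoretical levels only to within that step.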

Much more could be said here about the data, the equations and the analyses of the testing procedures, but space does not permit doing so. For example, had we plotted a histogram (bar graph) of this data, or the normal probability curve, or even constructed a simple frequency distribution table, we would have seen the symmetrical shape of the data. This shape would reveal the attributes of random errors mentioned in earlier parts in this series: an error of any size being as likely positive as negative, small errors being more likely than large errors and so forth. Readers are invited to study the text mentioned above for more thorough explanations and visual study of the graphics associated with statistical analysis of data.
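A simple frequency distribution of the kind just described can be sketched as follows. The class width of 0.5 second is an arbitrary assumption made for this illustration, not a choice taken from the original test, and the row of asterisks stands in for a plotted histogram.

```python
from collections import Counter

# The 25 micrometer readings (seconds of arc) from Table 1.
readings = [
    29.0, 28.2, 28.3, 27.8, 30.1, 27.4, 29.0, 27.8, 29.2, 30.1,
    29.2, 28.9, 28.5, 28.7, 28.2, 28.5, 29.8, 29.6, 29.1, 29.7,
    27.5, 28.2, 31.0, 28.8, 30.8,
]

BIN_WIDTH_TENTHS = 5  # each class spans 0.5 second (an arbitrary choice)

# Work in integer tenths of a second to avoid floating-point edge effects.
tenths = [round(x * 10) for x in readings]
counts = Counter(t // BIN_WIDTH_TENTHS for t in tenths)

for bin_index in range(min(counts), max(counts) + 1):
    low = bin_index * BIN_WIDTH_TENTHS / 10
    high = low + (BIN_WIDTH_TENTHS - 1) / 10
    bar = "*" * counts[bin_index]
    print(f"{low:4.1f} to {high:4.1f}  {counts[bin_index]:2d}  {bar}")
```

The class counts rise toward the middle of the range and fall away on both sides, which is the roughly symmetrical shape the paragraph above describes.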

In this test, we isolated one error source in measuring horizontal angles with a particular theodolite. Because of the repeatability of the results, we can rely on the standard deviation discovered to predict reading precision in future observations. This is helpful in several ways. One is simply to be able to cite a close estimate of range of error, with a level of certainty attached. If we can perform controlled tests to isolate other error sources (a "systems analysis"), we can observe the relative magnitudes of the various contributing sources. This gives us control over errors. For example, with such information we can take measures to reduce the contributions from some sources, have data for making wise decisions on modifying procedures to achieve specified standards with minimum effort and expense and so forth. Another related use of such knowledge is for derivation of specifications and devising theoretically sound measurement standards. As mentioned earlier, we also need close estimates of the random errors to perform weighted least squares adjustments. The first step to analysis of anything is knowledge. If we do not know our errors, we cannot control, manage or even discuss them wisely.

Error Testing Is Complex

Most of the applications are not done with individual variable test results viewed in isolation. All known error sources must be investigated individually, then combined mathematically, using error propagation. If we consider the errors stemming from the other 14 variables listed earlier, we see that we have an even more complex problem if we want to analyze the precision of just one angle measurement at one point. The mathematical procedures of error propagation for combining them into a final estimate of the error will be the subject of the next article in this series.


About the Author

  • Dr. Ben Buckner, LS, PE, CP
    Ben Buckner is an educator, author and seminar presenter with Surveyors' Educational Seminars and was a contributing author for the magazine.
