A naive first-year will think "Well, I can sort of interpolate in between these two tic marks, so my error is half a volt."
Ahem. "Everybody set the apparatus up the same way, didn't they? Look at those measurements. What's their standard deviation? Have we learned anything from this little exercise?"
Oops. It turns out that inconspicuous changes in the procedure can mean noticeable changes in the measurement--more than the difference between a couple of tic marks on the meter. If you make enough measurements with the same apparatus, you can often figure out the true value to better than the "tic-mark" resolution.
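A quick simulation shows how that works. This is only a sketch: the "true" value, the noise level, and the one-volt tic spacing are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
true_value = 3.37   # volts -- an invented "true" reading
tick = 1.0          # the meter face only has tic marks every whole volt

# Small, unnoticed differences in setup jitter each reading a bit; the meter
# then forces every reading onto the nearest tic mark.
readings = np.round((true_value + rng.normal(0.0, 0.6, size=200)) / tick) * tick

print(f"a single reading: {readings[0]:.0f} V  (half a tic at best)")
print(f"mean of 200 readings: {readings.mean():.2f} V")  # lands near 3.37
```

Because the setup-to-setup jitter is comparable to the tic spacing, the rounding averages out and the mean recovers more digits than any single reading could.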
If you don't know the error, you don't know the measurement.
As you make calculations with that measurement, you have to carry the error along, where it joins with other lovely errors. ("The speed of sound depends on temperature? I didn't take the temperature, but it must have been about 22C. Plus or minus 2. And my speed of sound has an error estimate now.")
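As a concrete sketch of how that plus-or-minus gets carried along: the linear approximation for the speed of sound in air, c ≈ 331.3 + 0.606·T m/s, is a standard textbook formula rather than something from the post, and the ±2 °C is the guess above.

```python
T, dT = 22.0, 2.0              # guessed temperature (deg C) and its uncertainty

def speed_of_sound(temp_c):
    return 331.3 + 0.606 * temp_c   # m/s, linear approximation for air

c = speed_of_sound(T)
dc = 0.606 * dT                 # propagate: dc = |dc/dT| * dT for a linear model
print(f"speed of sound: {c:.1f} ± {dc:.1f} m/s")   # ~344.6 ± 1.2 m/s
```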
So far, so painful. That little plus-or-minus at the end is acting like a katamari ball.
It gets worse. Your calculation, whatever it may be, is based on a model, and that model has some limits. For example, electric currents in a circuit seem like straightforward things to calculate, and even AC circuits aren't too bad. But each wire can be an antenna, sending and receiving. At low frequencies, the effect of that is too small to worry about, but at higher frequencies not all the energy is going through the wires. If you use the simple model rather than the hairy radiowave-included calculation, your result will have some model-dependent error--systematic error.
Most people keep that "systematic-error" separate from the "statistical error." The final answer looks something like
3.14 ± 0.05 (stat) ± 0.025 (syst) radians/second
It can take as long to figure out what the uncertainty on a measurement is as it does to make the measurement in the first place.
This doesn't show up very often in popular science reporting. It needs to.
That single measurement may be exactly what you want and need, but very often a distribution tells you more.
I took the very first computer-programming course our university offered: they didn't even have a textbook ready. It was FORTRAN, of course, via punched cards into an IBM 370. Both the business and engineering schools decided to require it.
The course grade average was somewhere around 80, so you'd think the course and grading were well-designed. Except--if you looked at the grade distribution, there were two "bell" curves--one centered down in the 60's and the other flattened out in the 90's. One group was ready for the course--it was perhaps even too easy--and the other was missing some training and found it hard to keep up.
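Here is a minimal sketch of the shape described, with invented numbers: one hump in the 60's, one in the 90's, and a class average that still looks perfectly respectable.

```python
import numpy as np

rng = np.random.default_rng(0)
# Invented numbers, chosen only to mimic the two groups described above.
unprepared = rng.normal(63, 8, size=120)
prepared = np.clip(rng.normal(93, 5, size=120), 0, 100)
grades = np.concatenate([unprepared, prepared])

print(f"course average: {grades.mean():.0f}")        # ~78 -- looks fine
counts, edges = np.histogram(grades, bins=np.arange(40, 105, 5))
for left, n in zip(edges, counts):
    print(f"{left:5.0f} | " + "#" * int(n))          # two humps, not one bell
```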
For another example, I remember a school meeting in which the principal was comparing test averages among the area schools and taking great pride in a few tenths percent difference in average score. I knew roughly what the distributions looked like--pretty much the same everywhere in the area. Having a handful of students at a school with learning problems could change the average. It could easily be just the luck of the draw; there was no way to deduce how well the teachers were doing. Using averages hid that.
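For a back-of-the-envelope feel for that (all numbers invented): a couple of students having a bad year can move a small school's average by far more than the few tenths being celebrated.

```python
scores = [78] * 28            # 28 students near the area-wide average
strugglers = [40, 45]         # two students having a very rough time

plain = sum(scores) / len(scores)
shifted = sum(scores + strugglers) / (len(scores) + len(strugglers))
print(f"{plain:.1f} vs {shifted:.1f}")   # 78.0 vs ~75.6 -- a 2+ point swing
```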
For something like the mass of the Sun, show the error estimate. For something like the recovery time for COVID, please show the distributions. There will be more than one. Distributions for different age ranges, distributions for different comorbidities--ideally the n-tuples would be available so we could examine them ourselves and look at recovery times for 30-40yo male smokers. (The statistics peter out when you put too many requirements on the search.) But anything would be a good start.
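If the n-tuples were published (they generally aren't), the slicing would be a few lines. Everything here is hypothetical: the file name, the column names, even the assumption that such a per-patient table exists.

```python
import pandas as pd

# Hypothetical file with one row per patient; invented column names.
df = pd.read_csv("covid_recovery.csv")   # age, sex, smoker, recovery_days

subset = df[df.age.between(30, 40) & (df.sex == "M") & df.smoker]
print(len(subset), "patients match")         # often uncomfortably few
print(subset.recovery_days.describe())       # the distribution for just them
```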
"Mathematicians are a species of Frenchman: if you say something to them they translate into their own language and presto! it is something entirely different." Goethe
Goethe was unfair. It's easy to be fuzzy with ordinary language--poets love to be able to say two or three things at once. But "How often are foxes rabid?" is very different from "How often is a rabid creature a fox?"
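A toy calculation makes the difference concrete. The counts are invented purely to show that the two answers need not agree.

```python
foxes, rabid_foxes = 10_000, 50     # invented counts for one area
rabid_animals = 80                  # rabid creatures of every species, same area

p_rabid_given_fox = rabid_foxes / foxes          # "How often are foxes rabid?"
p_fox_given_rabid = rabid_foxes / rabid_animals  # "How often is a rabid creature a fox?"
print(f"{p_rabid_given_fox:.1%} vs {p_fox_given_rabid:.1%}")   # 0.5% vs 62.5%
```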
More topically, consider a possible cure X for COVID. "Am I 90% sure X is a cure for COVID?" and "Am I 90% sure that X is NOT a cure for COVID?" sound like the same question, but they aren't. It is perfectly possible for the answer to both to be No. The key to understanding why is that uncertainty I mentioned at the start. I made it explicit in these questions, but news reports very rarely do, and headlines never--they trumpet "Vitamin D no use on COVID!" without any qualification. Maybe it is and they just can't prove it yet.
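One way to see how both answers can be No: with a small (invented) trial, the uncertainty band is wide enough to contain both "no effect" and "a real benefit," so neither 90% claim can be made. A rough sketch using a normal-approximation interval:

```python
import math

# Invented trial numbers, deliberately small so the interval comes out wide.
treated_n, treated_recovered = 20, 13
control_n, control_recovered = 20, 9

p1, p2 = treated_recovered / treated_n, control_recovered / control_n
diff = p1 - p2
se = math.sqrt(p1 * (1 - p1) / treated_n + p2 * (1 - p2) / control_n)
low, high = diff - 1.645 * se, diff + 1.645 * se    # rough 90% interval

print(f"estimated effect {diff:+.2f}, 90% CI ({low:+.2f}, {high:+.2f})")
# The interval straddles zero AND reaches a sizable benefit: we cannot be 90%
# sure X helps, and we cannot be 90% sure it doesn't. Both answers are No.
```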
So:
- What is the uncertainty?
- Is there a distribution, and what does it look like?
- What exactly is the question being answered?
2 comments:
With small high schools, such as the one my sons went to, with fewer than 30 in the graduating class, even a single outlier can be visible in the mean of the SAT scores. As those scores are pretty tight from school to school to begin with, a jump or drop of five points can move it up or down in the rankings quite a ways.
William and Mary's Physics 101 would have two bell curves every year, which bothered them not at all. They wanted to know who was who right out of the gate.
Including measurement range or confidence will increase accuracy and understanding but decrease readership. Guess which will win?
One of the possibilities we get with web publishing, that wasn't there with print, is the option to provide optional material. Of course if the reporter doesn't know what he has, the core story will still be munged up, and the auxiliary material will be incomplete or inappropriate.