Friday, October 29, 2021

To err is human--to really screw things up requires a computer

Baseball pitchers threw curve balls long before scientists understood exactly how they worked. Knowing how to do it isn't knowing how it works. Skill isn't the same as understanding.

Of course you'd rather have a pro pitcher throwing curve balls than a fluid dynamics specialist. Often all you need is the skill.

Engineering lives at the boundary. Rules of thumb are excellent things, but they have limits, and understanding the forces behind the rules of thumb can save your dam.

Neural networks and AI are all the rage. Colleagues use Boosted Decision Trees and other tools. If you have many variables and need to pick out the interesting events from the rest, such tools can be far easier to deal with than nonlinear equations in umpteen dimensions. That gives us rules of thumb. To be fair, sometimes they're very good and sometimes they're lousy, but they're generally better than nothing.
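To make that concrete, here's a minimal sketch of the sort of thing I mean, using scikit-learn's gradient-boosted trees on synthetic data (the dataset and settings are made up purely for illustration):

```python
# Toy example: pick out "interesting" events from the rest with a
# boosted decision tree, rather than fitting nonlinear equations
# in umpteen dimensions. The data here is synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# 20 variables, two classes: "interesting" vs. "the rest"
X, y = make_classification(n_samples=5000, n_features=20,
                           n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

bdt = GradientBoostingClassifier(n_estimators=200, max_depth=3)
bdt.fit(X_train, y_train)

# The score is the rule of thumb: sometimes very good, sometimes
# lousy, but generally better than nothing.
print("test accuracy:", bdt.score(X_test, y_test))
```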

But what do they mean? Reverse engineering AI conclusions seems to be very hard. "3 from column A and 16 from column B and .... 17 from column BZ"--what does it mean? Can you tell what's going on?

Remember that face recognition algorithm (Google's, I think) that classified black faces as gorillas? It's easy to guess what went wrong, but preventing that sort of oversight is hard. In retrospect, they should have trained their system on more (maybe even an equal number of) black faces, but how do you know in advance which sorts of distinctions the AI will find? Maybe it will pick up freckles as significant, and classify redheads as dalmatians.

People are aware of the problem. There's active research on how to translate/interpret neural net weights, but it looks to this outsider as though that's always going to be problem-specific. One method won't work for everything.
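One example from that research is permutation importance: scramble one input at a time and see how much the model's score drops. A sketch, reusing the toy bdt model from above. Note what it does and doesn't do: a big drop tells you the model leaned on that variable, not why.

```python
# Permutation importance: shuffle each input column and measure how
# much the test score degrades. It ranks variables, but it doesn't
# say what the combination of them *means*.
from sklearn.inspection import permutation_importance

result = permutation_importance(bdt, X_test, y_test,
                                n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1][:5]:
    print(f"feature {i}: importance {result.importances_mean[i]:.3f}")
```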

There's also research on how to embed problem-domain knowledge into the neural net system--once again, customization.
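One flavor of that work folds the domain knowledge into the training loss itself. A toy sketch in PyTorch; the "physics" constraint here (that predictions shouldn't go negative) is invented purely for illustration:

```python
import torch
import torch.nn as nn

# Toy network and synthetic data
net = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))
x = torch.randn(64, 3)
y = torch.rand(64, 1)  # targets happen to be non-negative

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
mse = nn.MSELoss()

for step in range(100):
    opt.zero_grad()
    out = net(x)
    # Ordinary data-fitting loss, plus a penalty encoding the
    # (made-up) domain rule that predictions must be non-negative.
    loss = mse(out, y) + torch.relu(-out).mean()
    loss.backward()
    opt.step()
```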

Weather forecasting is so complex that trying to solve the problem from first principles is hopeless. We understand the components and forces (mostly), but it's too big a problem. Rules of thumb are probably all we need to give us the week's weather--most of the time. Did Jimmy the Groundhog see his shadow? But no rule of thumb will warn you that there's about to be another Carrington Event.

Some things are just too complicated to model and solve exactly (e.g. economics, Leontief input-output work notwithstanding). You can bet your company on rules of thumb--people won't die. Who'd be crazy enough to bet the whole economy--without understanding it? Don't answer that; rhetorical question.

On the other hand, one application was trying to figure out which arrestees posed a greater risk of fleeing or re-offending, and which could be released without bail. IIRC the result was racially skewed. Was that the result of a poorly chosen training set, or was it detecting something real? Do you dare use the algorithm without understanding the answer it gave? IIRC the upshot was that the study was rejected without anyone trying to understand the result.
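At minimum you can compare the model's error rates across groups before trusting it. A sketch of that check; the arrays are hypothetical stand-ins for predictions, outcomes, and group labels:

```python
import numpy as np

# Hypothetical data: model predictions, actual outcomes, group labels.
pred  = np.array([1, 0, 1, 1, 0, 1, 0, 0])   # 1 = flagged high-risk
truth = np.array([1, 0, 0, 1, 0, 0, 0, 1])   # 1 = actually re-offended
group = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

# False-positive rate per group: flagged high-risk, didn't re-offend.
# A skew here doesn't by itself tell you whether the training set or
# the world is at fault--which is exactly the hard question.
for g in ("A", "B"):
    mask = (group == g) & (truth == 0)
    print(f"group {g}: false-positive rate = {pred[mask].mean():.2f}")
```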

Robbert Dijkgraaf is wrong. Science won't be benefiting from "the new alchemy." Technology may benefit, if we can make sure we understand the limits of the algorithms. Which is hard to do without understanding.

2 comments:

  1. From the early days: "Our new computer can make a million mistakes a second".

  2. And Gary Paulsen remembered in Some Birds Don't Fly that when the higher-ups tried computerized scheduling back in the early '60s, a computer assigned one tech to shuttle back and forth between two locations; unfortunately, it scheduled no transit time for the necessary twelve-mile drives.
