Hi Elizabeth, I'm a programmer from Italy. I've just read your perspective on the advantages of black boxes at http://science.sciencemag.org/content/364/6435/26?rss=1. While I admit it has some merits (which I'd like to discuss with you more extensively later), as a programmer I can see that your perspective falls into a couple of common pitfalls. Indeed, you ask:
What good is knowing the answer when it is unclear why it is the answer?
The results of your informal survey of your colleagues show that none of them is an experienced programmer. Otherwise, someone would have explained to you that "Why?" is a question for humans; of computers we ask "How?". We don't care about their intentions, their beliefs, or their feelings, because computers... well... computers compute! ;-)

The first question an experienced software engineer would ask Deep Thought is: "Where is your test suite?" Before asking _how_ (not "why") a piece of software computed the output from the input, we need to know how to distinguish a correct computation from a broken one. That's because we know well something most people are trained to ignore: REAL software has BUGS. As an imaginary system, Deep Thought can pretend to be fully bug-free, something only AI researchers might dream of in their wet dreams. But the problem is that it's just a dream: real-world software has bugs. Since we know this, we don't ask "Why is this the answer?" but "How can I know whether the computation performed is correct?". To answer this simple question we have developed basically two techniques:

1. verify that the output produced for every possible input matches the expected one;
2. prove that each step of the computation is correct.

Sometimes you can use only one of these techniques, but if you can sell software that can be checked by neither of them, you didn't get the software wrong, you got your career right! Software is knowledge, and you have basically managed to sell a sort of fake news that nobody can falsify! Your skills are priceless! :-D

Anyway, guess what? With AI black boxes, you can check neither. For the vast majority of the input domain you don't know the expected output, AND you cannot verify that each step of the computation is correct, because at least one step is a black box.
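A toy illustration of both techniques (my own sketch, nothing from your essay), using integer square root as the software under scrutiny:

```python
def isqrt(n: int) -> int:
    """Integer square root by Newton's method."""
    if n < 0:
        raise ValueError("n must be non-negative")
    x = n
    while x * x > n:
        x = (x + n // x) // 2
        # Technique 2: a step invariant. Every Newton step keeps
        # (x + 1)**2 > n, so when the loop exits with x*x <= n,
        # x is exactly floor(sqrt(n)).
        assert (x + 1) ** 2 > n, "a step of the computation is wrong"
    return x

# Technique 1: compare the output with the expected one over the
# input domain (here, a finite slice of it).
for n in range(10_000):
    r = isqrt(n)
    assert r * r <= n < (r + 1) ** 2, f"bug found at input {n}"
```

With a black box we can do neither: for most inputs we have no specification like `r * r <= n < (r + 1) ** 2` to test against, and there is no invariant to assert between the steps.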
To be fair, you might argue that we know how such black boxes are calibrated (sorry, I refuse to use anthropomorphic terms like "learning" and "training" for something totally unrelated to the common experience behind those words), but that's not a great objection: we know how we got the black box, yet what we can see of the computation it performs is meaningless.

Should we throw such black boxes out of the window? Not yet. These statistical tools can be useful in some contexts, and you cite a couple of them. Unfortunately, your analysis is severely biased toward the thesis you want to sustain. As a first example, you say that black boxes are fine when the cost of a wrong answer is low relative to the value of a correct answer. Let me reformulate that in proper technical terms: black boxes can be useful when the cost of their bugs is low. In other words, if the cost of their bugs is high, talking about their usefulness is pointless.

So when can we discuss the usefulness of a piece of software (AI or not)? For example, whenever the software does not mess with humans! If the input data does not come from humans and the output produced does not affect human lives, our rights, or our freedom, we can safely execute a somewhat brittle program if we like. Feel free to use AI for your study of materials! That's a fine use (as long as you don't blindly trust its output to build a bridge!). But what about targeted advertising? The "canonical example" messes with people! Even if we don't care about how it can be used to change the result of an election, we have to consider the danger of collecting the large amount of personal data it needs as input. Such a collection is part of an ad system that would not exist without it, so how those data are collected, legally shared, and illegally leaked is part of the issue of the AI application itself.
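To make this concrete, here is a deliberately tiny sketch of mine (pure Python, not modeled on any real system): a three-hidden-unit network calibrated on XOR by gradient descent. We know exactly how the weights were obtained, and every step of the forward pass is plain arithmetic, yet reading the final numbers in `W1` and `W2` tells us nothing about why a given output comes out.

```python
import math
import random

random.seed(42)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# A 2-input, 3-hidden, 1-output network, to be calibrated on XOR.
H = 3
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def forward(x1, x2):
    h = [sigmoid(W1[j][0] * x1 + W1[j][1] * x2 + b1[j]) for j in range(H)]
    return h, sigmoid(sum(W2[j] * h[j] for j in range(H)) + b2)

def total_error():
    return sum((forward(x1, x2)[1] - y) ** 2 for (x1, x2), y in data)

err_before = total_error()
lr = 0.5
for _ in range(20_000):
    for (x1, x2), y in data:
        h, out = forward(x1, x2)
        d_out = (out - y) * out * (1 - out)
        for j in range(H):
            d_h = d_out * W2[j] * h[j] * (1 - h[j])
            W2[j] -= lr * d_out * h[j]
            W1[j][0] -= lr * d_h * x1
            W1[j][1] -= lr * d_h * x2
            b1[j] -= lr * d_h
        b2 -= lr * d_out
err_after = total_error()

# We know exactly HOW we got these numbers, but reading them tells us
# nothing about WHY the network answers as it does:
print("W1 =", W1)
print("W2 =", W2, "b2 =", b2)
```

The calibration procedure is fully transparent; the resulting computation still is not. That asymmetry is the whole problem.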
The "general atmosphere of suspicion" you see in Europe (well addressed by the GDPR) is not about the technology itself but about the people who sell it as a solution to everything, without the basic intellectual honesty of highlighting the serious dangers it poses. Since Facebook's collaboration with Cambridge Analytica, we cannot put "targeted advertising" among the safe uses of AI anymore.

But don't worry, it's not your fault; it's just your bias kicking in. When you write "A black box can and should be used when it produces the best results", you make such a wide variety of assumptions that an email cannot refute them all. Let's just note what you say later: "self-driving vehicles eventually will be safer than those piloted by humans; they will produce the best results with respect to traffic injuries and fatalities". This is an act of Faith in Machines! (And in the priests who ask for their favor and interpret their will. ;-D) I'm all for freedom of religion! But when you write for a scientific journal, you should probably keep your litany out of it, or it will sound like advertising in disguise! Indeed, the paper you refer to explicitly states that:
These findings demonstrate that developers of this technology and third-party testers cannot simply drive their way to safety. Instead, they will need to develop innovative methods of demonstrating safety and reliability. And yet, the possibility remains that it will not be possible to establish with certainty the safety of autonomous vehicles.
What could I add? :-D You mitigate this position a little by writing: "As with any decision-making system, the black box must be used with knowledge, judgment, and responsibility." But any experienced developer would object that no software MUST BE USED just because it has been written! Indeed, we need even more knowledge, judgment, and responsibility to decide WHETHER we want to use a piece of software at all.

And this is the crucial issue of today's AI: only people able to build an AI system from scratch can understand its limits. Since most people cannot do this, how can they vote for people who understand the issues? And if politicians do not understand the matter, how can they avoid being manipulated by business? In short: they can't. And the killing of Elaine Herzberg shows this very well. Again, it's not a problem with the technology, but with people. Once everybody is able to code and calibrate their own CNN, we will be able to reduce the evident "mistrust of inexplicable systems", because everybody will be able to understand their limitations. As of today, though, caution is paramount. Meanwhile, instead of trying to delegate ethics to machines, we should teach it to businesses and developers.

The last two sections of your essay are very interesting.
Although AI thought processes can be limited, biased, or outright wrong, they are also different from human thought processes in ways that can reveal new connections and approaches.
AIs don't think. They are software executed on computers, and computers compute. This way of talking influences what people understand about the field, and we should really be a little more intellectually honest and replace this language with more technical and descriptive terms.
in a groundbreaking medical imaging study, scientists trained a deep learning system to diagnose diabetic retinopathy—a diabetes complication that affects the eyes—from retinal images. [...] the system could accurately identify a number of other characteristics that are not normally assessed with retinal images, including cardiological risk factors, age, and gender (14).
Think about it: this is exactly why you can never rule out unethical biases in a data set! That's why we should NEVER apply these tools to humans! Yet, I totally agree with what follows in your text:
No one had previously noticed gender-based differences in human retinas, so the black box observation inspired researchers to investigate how and why male and female retinas differ. Pursuing those questions took them away from the black box in favor of interpretable artificial and human intelligence.
IMHO, this is the only appropriate use of these wonderful tools: as STATISTICAL tools to gain new insights, to "see" new things, and to develop new models of the reality around us. In other words, "AI" black boxes are appropriate as stepping stones toward a better understanding of the world. Always remember that, just like a telescope, a black box shows us something we couldn't see before while hiding something else from our sight at the same time.

Giacomo
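P.S. In case it helps, here is a minimal sketch (entirely mine, with made-up synthetic data) of what "black box as stepping stone" can look like in practice: a nearest-neighbour black box plus permutation importance flags which feature deserves investigation, and from there on we switch to interpretable analysis.

```python
import random

random.seed(0)

# Synthetic "retina" dataset, purely hypothetical: feature 0 secretly
# carries a signal about the label, feature 1 is pure noise.
N = 200
labels = [random.randint(0, 1) for _ in range(N)]
X = [[lab * 2.0 + random.gauss(0, 0.3), random.gauss(0, 1.0)]
     for lab in labels]

def loo_accuracy(rows, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour 'black box'."""
    hits = 0
    for i in range(len(rows)):
        best, best_d = None, float("inf")
        for j in range(len(rows)):
            if i == j:
                continue
            d = ((rows[i][0] - rows[j][0]) ** 2
                 + (rows[i][1] - rows[j][1]) ** 2)
            if d < best_d:
                best_d, best = d, labels[j]
        hits += best == labels[i]
    return hits / len(rows)

base = loo_accuracy(X, labels)

def permuted_drop(col):
    """Accuracy drop when one feature column is shuffled."""
    shuffled = [row[:] for row in X]
    perm = [row[col] for row in X]
    random.shuffle(perm)
    for row, v in zip(shuffled, perm):
        row[col] = v
    return base - loo_accuracy(shuffled, labels)

drop0, drop1 = permuted_drop(0), permuted_drop(1)
# The black box itself explains nothing, but the importance scores
# point to feature 0 as the thing worth investigating with
# interpretable, classical statistics.
print(f"base accuracy {base:.2f}, drops: {drop0:.2f} vs {drop1:.2f}")
```

This is just statistics with extra steps, which is exactly the point: the black box's only job is to tell us where to look next.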