Excavating AI - The Politics of Images in Machine Learning Training Sets - nexa - server-nexa.polito.it

Oct. 10, 2019

You open up a database of pictures used to train artificial
intelligence systems. At first, things seem straightforward. You’re
met with thousands of images: apples and oranges, birds, dogs, horses,
mountains, clouds, houses, and street signs. But as you probe further
into the dataset, people begin to appear: cheerleaders, scuba divers,
welders, Boy Scouts, fire walkers, and flower girls. Things get
strange: A photograph of a woman smiling in a bikini is labeled a
“slattern, slut, slovenly woman, trollop.” A young man drinking beer
is categorized as an “alcoholic, alky, dipsomaniac, boozer, lush,
soaker, souse.” A child wearing sunglasses is classified as a
“failure, loser, non-starter, unsuccessful person.” You’re looking at
the “person” category in a dataset called ImageNet, one of the most
widely used training sets for machine learning.

Something is wrong with this picture.

Where did these images come from? Why were the people in the photos
labeled this way? What sorts of politics are at work when pictures are
paired with labels, and what are the implications when they are used
to train technical systems?

In short, how did we get here?
[...]

ImageNet quickly became a critical asset for computer-vision research.
It became the basis for an annual competition where labs around the
world would try to outperform each other by pitting their algorithms
against the training set, and seeing which one could most accurately
label a subset of images. In 2012, a team from the University of
Toronto used a Convolutional Neural Network to handily win the top
prize, bringing new attention to this technique. That moment is widely
considered a turning point in the development of contemporary AI.[12]
The final year of the ImageNet competition was 2017, and accuracy in
classifying objects in the limited subset had risen from 71.8% to
97.3%. That subset did not include the “Person” category, for reasons
that will soon become obvious.
____

Continues at https://www.excavating.ai/

Giacomo Tesio