No, there's no danger. What is interesting, rather, is to observe how the very same generative techniques produce satisfactory results (at least in some respects) when applied to natural language, but utterly random ones (29% accuracy is next to nothing) when applied to code. The point is that for code, semantics is not an 'optional' :-) G. On Mon, 26 Jul 2021 at 09:10, maurizio lana <maurizio.lana@uniupo.it> wrote:
In this field, can bad money drive out good? That is, could the code written by AI systems, crappy as it is, supplant the good or excellent code produced by capable analysts and programmers? And if that is possible, can anything be done to prevent it?
In the sense that if it is already so difficult [euphemism] for AI systems produced by human beings to operate in an ethically and culturally satisfactory way, one has to wonder how crappy software produced by imperfect [euphemism] AI systems could possibly operate. Maurizio
On 25/07/21 12:00, nexa-request@server-nexa.polito.it wrote:
Date: Sun, 25 Jul 2021 09:52:35 +0000 From: Alberto Cammozzo <ac+nexa@zeromx.net> To: Nexa <nexa@server-nexa.polito.it> Subject: [nexa] Is GitHub Copilot a blessing, or a curse? · fast.ai Message-ID: <E381DC52-434E-41A8-B178-D071057565EB@zeromx.net> Content-Type: text/plain; charset="utf-8"
<https://www.fast.ai/2021/07/19/copilot>
[...] According to OpenAI’s paper, Codex only gives the correct answer 29% of the time. And, as we’ve seen, the code it writes is generally poorly refactored and fails to take full advantage of existing solutions (even when they’re in Python’s standard library).
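The "fails to take full advantage of existing solutions" point is easy to illustrate with a toy contrast (these snippets are illustrative, not actual Copilot output): hand-rolled frequency counting of the kind such models often propose, next to the one-liner Python's standard library already provides.

```python
from collections import Counter

words = ["a", "b", "a", "c", "b", "a"]

# Verbose, hand-rolled version of the kind a model often suggests
counts = {}
for w in words:
    if w in counts:
        counts[w] += 1
    else:
        counts[w] = 1

# Idiomatic version: the standard library already solves this
counts_stdlib = Counter(words)

assert counts == dict(counts_stdlib)
```

Both produce the same result; the second is shorter, clearer, and battle-tested, which is exactly the kind of existing solution the post says generated code tends to ignore.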
Copilot has read GitHub’s entire public code archive, consisting of tens of millions of repositories, including code from many of the world’s best programmers. Given this, why does Copilot write such crappy code?
The reason lies in how language models work. They show how, on average, most people write. They don't have any sense of what's correct or what's good. Most code on GitHub is (by software standards) pretty old, and (by definition) written by average programmers. Copilot spits out its best guess as to what those programmers might write if they were writing the same file that you are. OpenAI discuss this in their Codex paper:
“As with other large language models trained on a next-token prediction objective, Codex will generate code that is as similar as possible to its training distribution. One consequence of this is that such models may do things that are unhelpful for the user”
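A minimal sketch of what "next-token prediction" means, using a toy bigram model built from a tiny corpus (purely illustrative; Codex is vastly larger, but the training objective is the same: emit whatever the training distribution makes most likely).

```python
from collections import Counter, defaultdict

# A tiny "training corpus" of code tokens
corpus = "for i in range ( n ) : total += i".split()

# Count bigram frequencies: which token tends to follow which
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_token(prev):
    """Greedy prediction: emit the most frequent continuation seen in training."""
    return bigrams[prev].most_common(1)[0][0]

# The model simply reproduces what is statistically likely -
# there is no notion of whether the continuation is *correct*
print(next_token("in"))  # -> 'range'
```

Nothing in this objective rewards correctness, only similarity to the training data, which is the consequence the Codex paper is pointing at.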
One important way that Copilot is worse than those average programmers is that it doesn’t even try to compile the code or check that it works or consider whether it actually does what the docs say it should do. Also, Codex was not trained on code created in the last year or two, so it’s entirely missing recent versions, libraries, and language features. For instance, prompting it to create fastai code results only in proposals that use the v1 API, rather than v2, which was released around a year ago.
Complaining about the quality of the code written by Copilot feels a bit like coming across a talking dog, and complaining about its diction. The fact that it’s talking at all is impressive enough!
Let’s be clear: The fact that Copilot (and Codex) writes reasonable-looking code is an amazing achievement. From a machine learning and language synthesis research point of view, it’s a big step forward.
But we also need to be clear that reasonable-looking code that doesn't work, doesn't check edge cases, uses obsolete methods, is verbose, and creates technical debt can be a big problem.
------------------------------------------------------------------------
the knowledge gap between rich and poor is widening witten, bainbridge, nichols, how to build a digital library
------------------------------------------------------------------------ Maurizio Lana - 347 7370925
_______________________________________________ nexa mailing list nexa@server-nexa.polito.it https://server-nexa.polito.it/cgi-bin/mailman/listinfo/nexa