The spreadsheet strikes again? Excel may have caused loss of 16,000 Covid tests in England
Good evening, Nexians,

Covid-19 is in all likelihood the health crisis on which more resources are being spent than ever before in human history, SO if what I am reading is true I am starting to feel seriously desperate.

Today, when I overheard a TV news report mention "technical errors" that had supposedly delayed the count of infections in the UK, I got too curious; I searched and found the official press release, which reads:

https://www.gov.uk/government/news/phe-statement-on-delayed-reporting-of-cov...

--8<---------------cut here---------------start------------->8---
[...]

The technical issue was caused by the fact that some files containing positive test results exceeded the maximum file size that takes these data files and loads then into central systems. A rapid mitigation has been put in place that splits large files and a full end to end review of all systems has also been instigated to mitigate the risk of this happening again. There are already a number of automated and manual checks that happen throughout.

[...]

The delayed reporting are all positive cases identified via Pillar 2 testing between 24 September and 1 October.
--8<---------------cut here---------------end--------------->8---

When I read «exceeded the maximum file size» I started to smell a rat. I thought I was being FAR too cynical in assuming the worst, and I told myself «come on, they must have adequate systems»…

Then a dear friend and colleague pointed me to the article I quote below, with this comment of his:

--8<---------------cut here---------------start------------->8---
from the way the article is written it could be a sour "my cousin told me" kind of story, but we are still within the realm of the plausible, knowing Excel and its typical uses...
--8<---------------cut here---------------end--------------->8---

I did a bit of searching and apparently not, it is probably not a tall tale from his cousin: the official national database (which, if I understand correctly, is also the one used to notify infected people) is an Excel spreadsheet.

I wish it were an April Fools' joke delayed by the lockdown, but it really does seem to be true.

Indeed, there are by now plenty of "Error" horror stories caused by the reckless use of spreadsheets, and now it *seems* we have a new one:

https://www.theguardian.com/politics/2020/oct/05/how-excel-may-have-caused-l...

--8<---------------cut here---------------start------------->8---
A million-row limit on Microsoft’s Excel spreadsheet software may have led to Public Health England misplacing nearly 16,000 Covid test results, it is understood.

The data error, which led to 15,841 positive tests being left off the official daily figures, means than 50,000 potentially infectious people may have been missed by contact tracers and not told to self-isolate.

PHE was responsible for collating the test results from public and private labs, and publishing the daily updates on case count and tests performed.

But the rapid development of the testing programme has meant that much of the work is still done manually, with individual labs sending PHE spreadsheets containing their results. Although the system has improved from the early days of the pandemic, when some of the work was performed with phone calls, pens and paper, it is still far from automated.

In this case, the Guardian understands, one lab had sent its daily test report to PHE in the form of a CSV file – the simplest possible database format, just a list of values separated by commas. That report was then loaded into Microsoft Excel, and the new tests at the bottom were added to the main database.
But while CSV files can be any size, Microsoft Excel files can only be 1,048,576 rows long – or, in older versions which PHE may have still been using, a mere 65,536. When a CSV file longer than that is opened, the bottom rows get cut off and are no longer displayed. That means that, once the lab had performed more than a million tests, it was only a matter of time before its reports failed to be read by PHE.

Microsoft’s spreadsheet software is one of the world’s most popular business tools, but it is regularly implicated in errors which can be costly, or even dangerous, because of the ease with which it can be used in situations it was not designed for.

[...]
--8<---------------cut here---------------end--------------->8---

This Gizmodo article gives a few more details:

https://gizmodo.com/excel-error-believed-to-have-caused-uk-to-lose-15-841-c-...

Best regards, Giovanni.

--
Giovanni Biscuolo
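As a rough illustration of the row limit described in the Guardian excerpt, and of the "splits large files" mitigation mentioned in the PHE statement, here is a minimal Python sketch. It is not PHE's actual pipeline, whose details are not given in these sources: the function names and the choice to repeat the header in every chunk are assumptions made purely for illustration. The two row limits are the ones quoted in the article.

```python
import csv
import io
from itertools import islice
from typing import List

# Row limits as quoted in the Guardian article.
XLSX_MAX_ROWS = 1_048_576
XLS_MAX_ROWS = 65_536


def rows_silently_dropped(total_rows: int, limit: int) -> int:
    """Number of rows that fall past the sheet limit and would be cut off."""
    return max(0, total_rows - limit)


def split_csv(text: str, limit: int) -> List[str]:
    """Split CSV text into chunks of at most `limit` rows each.

    Hypothetical helper: every chunk repeats the header row, so each chunk
    is a valid CSV on its own and fits in a sheet of `limit` rows.
    """
    if limit < 2:
        raise ValueError("limit must leave room for the header and one data row")
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return []
    chunks = []
    while True:
        # Take the next batch of data rows, leaving one row for the header.
        batch = list(islice(reader, limit - 1))
        if not batch:
            break
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(batch)
        chunks.append(out.getvalue())
    return chunks
```

For example, a report of 1,100,000 rows opened in a sheet limited to 1,048,576 rows would lose `rows_silently_dropped(1_100_000, XLSX_MAX_ROWS)` rows without any error being raised, which is exactly why an explicit check before loading is preferable to trusting what the sheet displays.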
On 05/10/2020 19:17, Giovanni Biscuolo wrote:
> Microsoft’s spreadsheet software is one of the world’s most popular business tools, but it is regularly implicated in errors which can be costly, or even dangerous, because of the ease with which it can be used in situations it was not designed for.
That's why we should teach Informatics to All (https://www.informaticsforall.org/) and not just "Digital Skills"...

Sorry, I couldn't help myself...

Ciao, Enrico

-- EN
=====================================================================
Prof. Enrico Nardelli
Dipartimento di Matematica - Universita' di Roma "Tor Vergata"
Via della Ricerca Scientifica snc - 00133 Roma
tel: +39 06 7259.4204
fax: +39 06 7259.4699
mobile: +39 335 590.2331
e-mail: nardelli@mat.uniroma2.it
home page: http://www.mat.uniroma2.it/~nardelli
blog: http://www.ilfattoquotidiano.it/blog/enardelli/
      http://link-and-think.blogspot.it/
=====================================================================
Enrico Nardelli <nardelli@mat.uniroma2.it> writes:

> On 05/10/2020 19:17, Giovanni Biscuolo wrote:
>> Microsoft’s spreadsheet software is one of the world’s most popular business tools, but it is regularly implicated in errors which can be costly, or even dangerous, because of the ease with which it can be used in situations it was not designed for.
>
> That's why we should teach Informatics to All (https://www.informaticsforall.org/) and not just "Digital Skills"...
Informatics is NOT for All if software lacks any of the following user freedoms:

(0) to run the program
(1) to study and change the program in source code form
(2) to redistribute exact copies, and
(3) to distribute modified versions.

The Rome declaration https://www.informaticsforall.org/rome-declaration/ completely misses this point.
> Sorry, I couldn't help myself...

Sorry, but I couldn't help myself either :-)

[...]

Ciao, Giovanni.

--
Giovanni Biscuolo
Good morning, a follow-up.

Giovanni Biscuolo <giovanni@biscuolo.net> writes:

[...]

> Indeed, there are by now plenty of "Error" horror stories caused by the reckless use of spreadsheets, and now it *seems* we have a new one:
>
> https://www.theguardian.com/politics/2020/oct/05/how-excel-may-have-caused-l...
[...]

The news also appeared on "The Spreadsheet News Network": «UK Government loses data because of Excel mistake.»

https://yewtu.be/watch?v=zUp8pkoeMss

It seems, *seems*, that all the clues point to a painful vendor lock-in to an old version of MS Excel, *perhaps* because of macros that are no longer supported in newer versions.

The case has rightly found its place among the "Horror Stories" curated by the European Spreadsheet Risks Interest Group (EUSpRig), with identifier POB2001:

http://www.eusprig.org/horror-stories.htm

This article by Nicole Kobie, dated 13 October 2020: «Meet the Excel warriors saving the world from spreadsheet disaster»

https://www.wired.co.uk/article/spreadsheet-excel-errors

recounts the heroic mission of EUSpRig; the crux of the matter is:

--8<---------------cut here---------------start------------->8---
[...]

The process of examining each spreadsheet is unique. There are software and tools to look for inconsistent formulas or problems with the structure, but a human touch is still required, says Simon Thorne, a lecturer in computing at Cardiff Metropolitan University and a EUSpRig member, because logical problems can’t be picked up by such tools.

[...]

“The logic is flawed in some way, and they [errors] are hard to spot because you have to be a domain expert to understand that it’s the wrong choice in a scenario.”

To audit a complex spreadsheet, Miric uses software to go line-by-line to spot errors, as well as ones that could crop up from continued use. One basic test is to change the inputs and see if the outputs react as expected, perhaps putting in extremely high figures or random letters. In short, that means this work comes down to spending entire days reading spreadsheets.

[...]

That’s a common theme for this work: even the people who make a spreadsheet can’t always explain what’s happening in it.
--8<---------------cut here---------------end--------------->8---

The *simple* and "trivial" reality is that whoever "puts logic" into a spreadsheet is a DEVELOPER using an IDE (Integrated Development Environment, 99.9% graphical) to program... unfortunately often without knowing it.

https://en.wikipedia.org/wiki/End-user_development

--8<---------------cut here---------------start------------->8---
The most popular EUD tool is the spreadsheet.[4][5] Due to their unrestricted nature, spreadsheets allow relatively un-sophisticated computer users to write programs that represent complex data models, while shielding them from the need to learn lower-level programming languages.[6] Because of their common use in business, spreadsheet skills are among the most beneficial skills for a graduate employee to have, and are therefore the most commonly sought after[7] In the United States of America alone, there are an estimated 13 million end-user developers programming with spreadsheets[8]
--8<---------------cut here---------------end--------------->8---

(the rest of the article is worth reading, in particular "Cost-benefit modeling" but above all "Criticism".)

Very interesting, in this regard, is the paper «End User Computing: The Dark Matter (and Dark Energy) of Corporate IT»

https://www.researchgate.net/publication/239765100_End_User_Computing_The_Da...

--8<---------------cut here---------------start------------->8---
[...]

We saw in the previous section that corporations on U.S. stock exchanges are required to maintain strong controls over all material aspects of financial reporting. We also saw that the use of spreadsheets of material importance is very common in financial reporting. There is little control over these spreadsheets, so it is difficult to see how these corporations are in compliance with Sarbanes–Oxley.

[...]

There are many other indications that spreadsheet applications are extremely important.
For example, Croll (2005) studied spreadsheet use in the City of London (the financial district of London). He concluded that “the City of London runs on spreadsheets.” He also noted that in an environment that routinely handles transactions worth hundreds of millions to billions of dollars, “nothing of importance happens without passing through a spreadsheet.” Hinh, Lewicki, and Wilkinson (2009) at the Jet Propulsion Laboratory discussed the importance of spreadsheets at JPL in a paper titled, “How Spreadsheets Get Us to Mars and Beyond.”

[...]

In fact, a lack of serious testing has been reported in every study that has looked at testing in spreadsheet development. Studies have shown that companies rarely test their spreadsheets extensively [...] Furthermore, when spreadsheet users do what they call testing, what they do is typically a pale imitation of what programmers do when they test.

[...]

In addition, end user applications raise several other concerns. One is privacy. Too many application files contain personally identifiable information (PII) that companies have an obligation to protect. The leakage of even a single spreadsheet or data file containing such information can be a disaster for a firm.

[...]

Given that end user computing is many years old, how can these conclusions be true? The simple answer is that we have not looked at end user computing seriously in the past.

[...]

Spreadsheet error rates at the module level are about the same as statement error rates in 3GL programming, so spreadsheet development tools per se are not the problem. The problem appears to be poor application development, especially an almost complete lack of comprehensive testing.
--8<---------------cut here---------------end--------------->8---

...the study looks *only* at "Corporate IT"; what about scientific research?
But is there really no way for "end users" (I dissociate myself from this label) to program using a language (perhaps a DSL) *and* an IDE that do NOT completely hide the very fact that one is programming, with everything that follows in terms of techniques and tools for testing and debugging?!?

...not for everything, but at least for the important things?

In my opinion "the spreadsheet" is THE problem :-D

Regards, Giovanni.

P.S.: the mere fact that "Excel" is used as a synonym for "spreadsheet", as if nothing else existed, speaks volumes about the maturity of the field, EUSpRig included.

--
Giovanni Biscuolo
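To make the point about testing a little more concrete, here is a minimal sketch of what the "basic test" described in the Wired excerpt (change the inputs, try extremely high figures or random letters) looks like once the logic lives in an ordinary programming language. It assumes Python and its standard `unittest` module; `positivity_rate` is a hypothetical formula invented for this example and is not taken from any of the sources quoted in this thread.

```python
import unittest


def positivity_rate(positives, tests):
    """Share of positive results, the kind of formula often typed into a cell.

    Hypothetical example: unlike a typical cell formula, it validates inputs.
    """
    for name, value in (("positives", positives), ("tests", tests)):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an integer, got {value!r}")
        if value < 0:
            raise ValueError(f"{name} must not be negative")
    if tests == 0:
        raise ValueError("tests must be greater than zero")
    if positives > tests:
        raise ValueError("positives cannot exceed tests")
    return positives / tests


class PositivityRateTest(unittest.TestCase):
    def test_expected_value(self):
        self.assertAlmostEqual(positivity_rate(25, 100), 0.25)

    def test_output_reacts_to_input_change(self):
        # Changing the input must move the output in the expected direction.
        self.assertGreater(positivity_rate(50, 100), positivity_rate(25, 100))

    def test_extremely_high_figures(self):
        big = 10**12
        self.assertAlmostEqual(positivity_rate(big, 4 * big), 0.25)

    def test_random_letters_are_rejected(self):
        with self.assertRaises(TypeError):
            positivity_rate("abc", 100)

    def test_inconsistent_inputs_are_rejected(self):
        with self.assertRaises(ValueError):
            positivity_rate(101, 100)
```

Saved to a file, the checks can be run with `python -m unittest`. The design choice worth noting is that bad input raises an error instead of quietly producing a number, and that the tests stay next to the logic and can be rerun after every change.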