OpenSAFELY - A new way to study medical records
Roberto Dolci <rob.dolci@aizoon.us> writes:
And also "Looking without looking" https://www.economist.com/node/21786103?frsc=dg%7Ce because access to the data needed for epidemiological studies is a hot topic
Very interesting, thanks!

--8<---------------cut here---------------start------------->8---
Instead, its coders wrote software which let them perform their analysis within that data centre. Even then, Dr Goldacre’s crew were not given free rein to poke around inside TPP’s systems. Instead they wrote a series of programs which let them interrogate the patient records through a secure connection. A log was also kept of all queries that the group ran on the records - thus the watchers were themselves watched.
--8<---------------cut here---------------end--------------->8---

It does seem to be true; from https://opensafely.org/:

--8<---------------cut here---------------start------------->8---
Security

All data that carries any privacy risk (even a theoretical risk, and even when pseudonymised) remains within the secure data centre of the electronic health record vendor, where it already resides. This also means that all activity is logged for independent review.

All processing takes place in the same secure data centre, where the patients’ electronic records were already stored. The only information to ever leave the data centre is summary tables (with low numbers suppressed) from statistical models. Within the data centre, all pseudonymised data is stored in a tiered system of increasingly less disclosive data stores tailored to each analysis.

All underlying software and research code is open to review for security profiling, scientific evaluation, and to re-use as open source tools improving science across the community.

Overall this approach is therefore highly secure, and supports high quality science: in contrast to working on intermittent “data extracts”, our approach also ensures that the statistical models run across up-to-date records, which is vital during a global health emergency. Further details on security and governance are given below.

[...]

The UK, with the NHS, is the only country on the planet with the scale of data needed to deliver these analyses.
--8<---------------cut here---------------end--------------->8---

But, but... the code is here and I had a quick look:
https://github.com/ebmdatalab/opensafely-risk-factors-research

License... nowhere to be found (as currently published, _everything_ is "All rights reserved" [1]); for now the README says:

--8<---------------cut here---------------start------------->8---
All the software is Open Source; however, it's recently come to our attention the code may include some commercial IP, so we have temporarily removed it from Github until we address that. We expect the code to be published again in during the week commencing 18 May.
--8<---------------cut here---------------end--------------->8---

The research is implemented as Jupyter Notebooks [2]; the requirements for running the notebooks are MS SQL Server (probably used as an intermediate, compartmentalised database) and Stata.

In other words, the code may well be open source (once the issue mentioned in the README and quoted above is resolved), but the build system, the one needed to "compile" the code, is not free at all.

Replacing the MS SQL Server requirement is probably trivial, but all the Stata code would have to be rewritten in R; I am no Stata expert, but it seems to me that not even https://github.com/lbraglia/RStata solves the problem.

In short, it does not strike me as the best example of Open Source :-) ...better than nothing, but since these are hardly newcomers I would have expected a little more effort.

Let's say that at least the protocol adopted by the project confirms that "it can be done!".

Thanks, Giovanni.

[1] writing "it is open source" in the README does not count as a license

[2] https://en.wikipedia.org/wiki/Project_Jupyter#Jupyter_Notebook

-- 
Giovanni Biscuolo
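
P.S. To make the protocol quoted above a little more concrete, here is a minimal sketch in Python of its two key ideas: every query is logged, and only summary tables with low numbers suppressed are returned. This is not OpenSAFELY's code. The function name summary_table, the audit_log list, the record layout and the threshold of 5 are all assumptions made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical threshold: counts below this are never disclosed.
SUPPRESSION_THRESHOLD = 5

# Hypothetical in-memory audit trail; a real system would use durable,
# independently reviewable storage.
audit_log = []


def summary_table(analyst, records, field):
    """Return counts per value of `field`, with low numbers suppressed.

    The query is appended to the audit log before any data is touched,
    so the log also covers queries that later fail.
    """
    audit_log.append({
        "when": datetime.now(timezone.utc).isoformat(),
        "analyst": analyst,
        "query": f"count by {field}",
    })
    counts = Counter(record[field] for record in records)
    # Only aggregated counts are returned; small cells are masked.
    return {
        value: (n if n >= SUPPRESSION_THRESHOLD else f"<{SUPPRESSION_THRESHOLD}")
        for value, n in counts.items()
    }


if __name__ == "__main__":
    # Made-up records standing in for pseudonymised patient data.
    records = [{"age_band": "60-69"}] * 7 + [{"age_band": "80+"}] * 2
    print(summary_table("analyst-1", records, "age_band"))
    print(audit_log)
```

The point of the design is that the caller never sees individual rows: the row-level records stay where they are, and the only things that come back are the masked table and an entry in the log.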