Research opportunity in a project on Open Science and data infrastructures
I would like to bring to your attention this research opportunity at the intersection of Open Science and data infrastructures. It is a category A “professionalizzante” research fellowship (assegno di ricerca) at the Department of Electronics and Telecommunications (Dipartimento di Elettronica e Telecomunicazioni) of Politecnico di Torino: https://careers.polito.it/default.aspx?id=410/2024-AR

Content and aims of the research program

Universities hold a large amount of data about their own research processes, which should be used to measure performance and to devise strategies for progressing towards the university's goals. In this respect, institutional data sovereignty is instrumental. The Research Program aims at building a proof-of-concept institutional data lake consistent with the FAIR principles (Findability, Accessibility, Interoperability, Reusability), fed by the university's multiple databases, which store information on several research dimensions (projects, proposals, contracts, patents, publications and data, staff registry, etc.). The goal is to enable the creation of indicators and tools for research assessment and self-assessment which, in line with the CoARA commitments, shall be multi-dimensional, inclusive, transparent and reproducible.

Required activity

The research fellow will be responsible for the technical activities of developing and validating a proof-of-concept institutional FAIR data lake built upon the integration of internal interoperable databases mapping all the relevant research activities and personnel information. The data lake will collect the types of data that today feed the university's data warehouse, with ad-hoc standardized metadata for the different types of information. It will implement preservation of historical data, controlled and secure protocols as well as anonymization levels tailored to the different stakeholders (e.g. governance, individual researcher), interoperability with internal and external databases, and accessibility according to the “as open as possible, as closed as necessary” philosophy; data will be equipped with appropriate licenses for reuse. The activity will be carried out in collaboration with three of the University's study centres dedicated to open science, gender, and strategies, and with the development area of the information system department.

The detailed workplan of the overall project includes:

- [M1-M4] Analysis of the architecture of the current data warehouse and data lake design. Milestone-T1: Technical specification for the data lake [M4].
- [M5-M9] Implementation and testing phase on a subset of data to validate functionality and usability. The subset of data will be identified in two parallel tasks devoted to 1) developing tools to monitor open science practices; 2) studying gender bias issues in internal research assessment exercises. Deliverable-T1.1: Proof-of-concept FAIR data lake available [M9].
- [M11-M12] Design of the complete FAIR data lake with a full set of research information data as identified by the other project tasks. Deliverable-T1.2: Design report and implementation plan [M12].

Qualifications and publications topics

- Data handling and modelling contexts (ETL/ELT, Data Lakehouse, DWH, Semantic Model).
- Data lake and data warehouse technologies.
- Programming languages such as SQL and Python.
- Data visualisation and analysis tools such as Tableau.
- Data security practices and regulatory compliance.

Interview topics

- Data lake and data warehouse solutions consistent with FAIR principles (findable, accessible, interoperable, reusable).
- Creation and management of FAIR data and research data infrastructure.
- Programming languages such as SQL and Python.
- Data analytics and data visualization.
- Data security and basic knowledge of relevant legislation.

For more information: Prof. Federica Cappelluti https://www.polito.it/personale?p=federica.cappelluti
Antonio Vetro'