EOSC: Open Science at the European level


At the end of May, the European Commission and the CNRS are to discuss the future of the European Open Science Cloud (EOSC) which provides scientists from all disciplines with a catalogue of shared services that work in favour of Open Science. The EOSC initiative came into operation in 2021 and this is an opportunity to take stock of its progress.

The European Open Science Cloud (EOSC), an idea put forward by the European Commission in 2016, has two objectives: to turn Open Science into a habit for researchers and to structure an internet of so-called 'FAIR' (findable, accessible, interoperable and reusable) data and services for European and even global research. As such, the EOSC initiative represents a true cultural step change. EOSC entered its implementation phase a little over two years ago after an initial design phase and is already envisaging its future in the European framework programme that will follow Horizon Europe, which ends in 2027. Two key events for the EOSC will be held this week: the general assembly of the EOSC association on May 22nd and 23rd (see boxed text) and the European Commission's visit to the CNRS to discuss the subject on May 26th.

"Along with feedback on CNRS scientists' use of the EOSC, the CNRS's role in the strategic development of the initiative and in its governance will be core subjects for discussion," states Suzanne Dumouchel, head of international cooperation at the CNRS's Open Research Data Department (DDOR) and a member of the EOSC association's board of directors. "The European Commission's visit on May 26th is a strong sign of its interest in our organisation," she explains.

Governance of the EOSC

The European Open Science Cloud's governance brings together the European Commission, the member countries on the steering committee - including France, represented by the National Institute for Research in Computer Science and Control (Inria) - and the research community, which the EOSC association itself represents. This association was created in December 2020 and is currently made up of over 190 stakeholders from the EOSC ecosystem. Sylvie Rousset, the Director of the DDOR, represents the CNRS and in 2020 Suzanne Dumouchel was elected to the association's board of directors for a three-year mandate, working alongside seven other European directors.

The initiative aims to help science and innovation advance by providing all scientists working in European public and private institutions with access to all available data supported by the appropriate related infrastructure and services. Another objective is to enable the decentralised usage, storage, sharing and interoperability of European research data, all of which needs to be tailored to effectively respond to the requirements of each research community.

"EOSC is a core support element for the circulation, dissemination and adoption of knowledge in the European Research Area (ERA) while also supporting innovation. It aims to enhance the quality of scientific results by pooling costs and efforts," explains Alain Mermet, director of the CNRS's Brussels Office. The Council of the European Union has therefore placed “Open Science, including through the EOSC” first on the list of 20 actions set out in the ERA's policy agenda for 2022-2024 announced in November 2021. Twenty-five member countries, three associated countries and nine other stakeholders have committed to contributing to the proper implementation of this action.

The CNRS is a member of the EOSC association and contributes to its Multi-Annual Roadmap. As a major European multidisciplinary research institution, the CNRS is driving the development of Open Science and interacts with many of the EOSC's active users and contributors. "The CNRS thus intends to play an important role in the structuring and governance of the EOSC infrastructure in the future," explains Alain Schuhl, the CNRS Deputy CEO for Science. Many CNRS Institutes are also stakeholders taking part in the development of services made available in the framework of the EOSC.

A multilingual discovery platform for HSS resources

The CNRS Institute for Humanities and Social Sciences (INSHS) is a leading beneficiary of the EOSC's work on the interoperability of services, the creation of standards and the alignment of the vocabularies used by different disciplines. The GoTriple discovery and reuse platform for HSS resources enables researchers to discover data and publications, researcher profiles and research projects in 11 European languages. The GoTriple platform derives from the TRIPLE project coordinated by the Huma-Num research infrastructure and is now one of the major services provided by the European research infrastructure OPERAS, whose French hub is OpenEdition. GoTriple facilitates and promotes collaboration in the field of the humanities and social sciences for scientific, societal and industrial applications. "This discovery platform gives HSS researchers a powerful multilingual tool to develop and promote their research," sums up Suzanne Dumouchel.

The CNRS's National Institute of Nuclear and Particle Physics (IN2P3) has long been working on the issue of the analysis of large masses of data. This experience has enabled the IN2P3 to play a decisive role in the development of data storage, management and processing services that are of core importance to the EOSC's functioning, particularly through its collaboration with the European Grid Infrastructure (EGI). The EOSC was designed as a group of federated infrastructures and several French infrastructures under CNRS supervisory authority, such as France Grilles, CC-IN2P3, OpenEdition, Data Terra and the Centre for Direct Scientific Communication [1], have contributed to structuring the initiative by developing services and resources made available in the framework of the EOSC.

Currently over 1200 CNRS resources such as databases, research results and so forth are in the EOSC catalogue. The interoperable services developed cover the entire life cycle of data, from exploration to storage and including analysis, publication, visualisation and reuse. An infrastructure and IT systems are also needed to host these services and resources. The CNRS is contributing to designing the required architecture, facilitating interoperability between services and defining standards for metadata and so on.

Finally, the organisation is working in close collaboration with other higher education and research institutions that are members or observers of the EOSC association to identify the requirements of French research communities as regards computing infrastructure, data management or visualisation. The objective of this work is to promote the creation of a 'French EOSC' to "make research more visible and innovative".

  • 1. The France Grilles infrastructure is a group of machines hosting software services for processing scientific data. The IN2P3 Computing Centre (CC-IN2P3) designs and operates a set of services, particularly a mass storage system and resources for processing large masses of data. The OpenEdition portal is made up of four electronic humanities and social sciences publication platforms. The Data Terra e-infrastructure is a global facility for accessing and processing data, products and services dedicated to Earth observation. The Centre for Direct Scientific Communication (CCSD) provides services for the archiving, dissemination and promotion of scientific publications and data, such as the HAL open archive and its associated platforms.

Galaxy-E: the EOSC contributes to the ecological science community

"Using EOSC infrastructures means we don't have to roll out and manage our own infrastructure so we can focus on developing FAIR services and resources," explains Yvan Le Bras, scientific and technical infrastructure manager at the French National Natural History Museum's National Biodiversity Data Centre which set up the Galaxy-E platform. This platform is dedicated to reproducible computer analysis for the ecological science community achieved through shared data, tools and work processes. Galaxy-E is based on the Galaxy open source platform created for the analysis, management and visualisation of FAIR data. Sharing data is therefore "necessary to provide different stakeholders working on the study and conservation of biodiversity with trusted biodiversity indicators".

The CNRS is the leading beneficiary of European Research Council (ERC) grants, the recipients of which are encouraged to open up their data, particularly through the EOSC. This of course means the CNRS is "a major contributor to EOSC", as Suzanne Dumouchel sums it up, while also emphasising the "savings in time and resources" made possible by the "joint and collaborative" development of services at the European level.

The EOSC has an operating budget of one billion euros for the 2021-2027 period. Half of this is funded by the European Commission and the other half derives from in-kind contributions from the association's partner institutions. The General Assembly in Brussels on May 22nd and 23rd will give the EOSC community the opportunity to reflect on and discuss the post-2027 governance of the initiative. Should the EU Member States be involved? What role should be played by national institutions and European research infrastructures? What is the best economic model for services? The European Commission also wishes to consult the CNRS regarding its position on all these questions.

Physicists commit to the EOSC to prepare their own Open Science

The international "European science cluster of astronomy & particle physics ESFRI research infrastructures" (ESCAPE) led by the CNRS is a major collaborative response from the leading astrophysics and particle physics research infrastructures to the challenges of Open Science in Europe. Particle physicists have played a pioneering role in managing large volumes of data and adapting open software services for the analysis, visualisation and management of statistical data. Astrophysicists set the standards for data publication via the Virtual Observatory which enables analysis tools and databases from large instruments to be shared. CNRS researchers from the IN2P3 and the National Institute for Earth Sciences and Astronomy (INSU) contribute to ESCAPE in Europe and are collaborating on the definition of the EOSC's architecture. The new ESCAPE facilities will greatly extend the scientific community's capacity to answer questions about the structure and evolution of the Universe and its constituent objects. To achieve this, "we are rolling out a distributed federated infrastructure prototype called a data lake which will optimise data archiving and processing along with a virtual open work environment made up of a series of software services for querying, sharing and combining data," explains Giovanni Lamanna, the director of ESCAPE.