Sharing life science data in Europe

24.02.2022

Worldwide, the life sciences generate an unprecedented amount of scientific data. The main aim of ELIXIR Europe is to help coordinate a sustainable e-infrastructure for managing and sharing life science data in Europe.


Collaboration and data sharing

In a life science research project, the data is often generated from biological samples that a research group collects and prepares. A Core Facility operating advanced instruments then assists in generating the actual measurements from the samples. Many levels of expertise are needed at different steps along the way, and the processes often involve lab technicians, research staff from students to professors, and collaborators in data analysis and modelling.

Details of each staff member's contribution can influence the outcome of the data generation and the later analysis and interpretation, so capturing and communicating the right metadata efficiently is an essential task.

The Norwegian e-infrastructure for Life Science (NeLS) is an online e-infrastructure developed to facilitate the organising, sharing and maintenance of scientific data in a project by all the staff, with their different expertise, who contribute to the process.

NeLS follows the data from its birth as raw data, through gradual improvement by the project team into a valuable data asset, until it is shared in a FAIR manner in an applicable public repository such as the ELIXIR Deposition Databases. To provide the large-capacity structured data storage that research projects require, NeLS has been designed and implemented in collaboration with Sigma2 to use the NIRD Storage platform as the primary storage layer.

NeLS and NIRD = StoreBioinfo

The architecture of NeLS can be described as having two storage layers for two different purposes. The NeLS web portal and microservice-based middleware operated by ELIXIR Norway are seamlessly integrated with matching microservices running on the NIRD Service Platform, which coordinate data transfers between the two layers.

Illustration of the NeLS architecture.
Two storage layers for two different purposes.

The StoreBioinfo layer: This layer is based directly on the NIRD Storage system and provides structured large-capacity storage with clearly defined datasets of a given data type. Data is organised according to how far it has been processed, annotated and interpreted. To use this functionality, a project needs to apply for a StoreBioinfo storage quota. The application is processed by the ELIXIR Norway helpdesk, which manages a larger block quota allocated to ELIXIR Norway by the Sigma2 Resource Allocation Committee.

The NeLS storage layer: This layer is designed and used as a work area for hot data and temporary results, where data is actively shared and transferred to other systems, and where newly derived data is returned. Through a continuous process in the project team, selected curated data is transferred from NeLS Storage to the StoreBioinfo layer at NIRD, gradually improving the authoritative datasets maintained there for the project.
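As a rough illustration of the two-layer idea described above, the following Python sketch models a work area and a structured store, and promotes curated items from one to the other. It is not the NeLS implementation or its API: the names `Stage`, `DataItem`, `Project` and `promote_curated`, and the four processing stages, are assumptions made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical processing stages; the real NeLS categories may differ."""
    RAW = 0
    PROCESSED = 1
    ANNOTATED = 2
    INTERPRETED = 3


@dataclass
class DataItem:
    name: str
    data_type: str
    stage: Stage
    curated: bool = False  # set by the project team once an item is ready


@dataclass
class Project:
    # Work area for hot data and temporary results (stands in for the NeLS storage layer).
    work_area: list[DataItem] = field(default_factory=list)
    # Datasets keyed by (data type, stage) (stands in for the StoreBioinfo layer).
    store: dict[tuple[str, Stage], list[DataItem]] = field(default_factory=dict)

    def promote_curated(self) -> int:
        """Copy curated work-area items into the structured store.

        Returns the number of items newly added to the store.
        """
        moved = 0
        for item in self.work_area:
            if not item.curated:
                continue  # temporary or unreviewed data stays in the work area
            dataset = self.store.setdefault((item.data_type, item.stage), [])
            if item not in dataset:
                dataset.append(item)
                moved += 1
        return moved


project = Project()
project.work_area.append(DataItem("run1.fastq", "sequencing", Stage.RAW, curated=True))
project.work_area.append(DataItem("scratch.tmp", "sequencing", Stage.PROCESSED))
moved = project.promote_curated()
```

In this sketch only the curated item reaches the store, filed under its data type and processing stage, while the temporary file remains in the work area. Calling `promote_curated()` again adds nothing new, which mirrors the idea of a continuous, repeatable curation process.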

Fruitful collaboration going forward

In Norway, Sigma2 and ELIXIR Norway have collaborated continuously and fruitfully for more than 10 years on data storage, computational resources and data management plans. This has allowed ELIXIR Norway to offer services adapted to the needs of the life science research community.

In the coming year, ELIXIR Norway and Sigma2 will work together on several new NeLS features, including improved integration of Data Management Plans, usage reporting routines adapted to match the new Sigma2 User Contribution Model, and a brand new version of the NeLS web portal.