Huge amounts of data provide unique archaeogenomic insights

23.02.2022

The term ancient DNA (aDNA) refers to DNA derived from plants and animals that have been dead for a prolonged period of time, typically for more than 100 years. Inference from such ancient specimens can provide unique insights into how organisms have responded to the onset of human exploitation, domestication and climate change.

Illustration of a 4000-year-old Atlantic bluefin tuna vertebra.

The illustration above shows an approximately 4000-year-old Atlantic bluefin tuna vertebra from southern Norway. Excellent preservation conditions make it possible to retrieve DNA of sufficient quality and quantity for whole-genome reconstruction of ancient tuna specimens from this location. Researchers from the University of Oslo and the Museum of Cultural History, Oslo, collaborate to learn more about how tuna have been exploited through history. Photo by: Emma Falkeid Eriksen.

Revolutionising the field of ancient DNA

Recent advances in DNA sequencing technologies and associated downstream analyses have revolutionised the field of ancient DNA. However, these advances go hand in hand with exponential growth in the amount of data that needs to be analysed, stored and shared with collaborators.

Managing terabytes of data on the National e-infrastructure

"It is without a doubt that without this infrastructure our work would not be able to efficiently progress."
Bastiaan Star, Associate Professor, Centre for Ecological and Evolutionary Synthesis

In the Archaeogenomics group, we regularly work with terabytes of data. This work would simply not have been possible without the computational infrastructure provided by Sigma2. We use both the storage capacity provided by NIRD and the high-performance computational capacity provided by Saga. Moreover, Saga makes it straightforward to check which programs, and which versions of those programs, were used for a particular analysis. Such information is essential for the reproducibility of our results, a fundamental requirement for science.

Finally, the fast connection between Saga and a range of servers around the world makes it possible to quickly download large amounts of publicly available sequence data for comparative purposes. This connection also allows us to quickly upload our own data when we need to share it.