Gene Transcriptional Signatures in Healthy Ageing and Related Pathologies

Period: 2020-2022
Acronym: GeT-SHARP
Budget: 431,900 RON
Project director: Robi Tacutu

The project aims to analyze and compare age-related transcriptomic signatures in various tissues, in both healthy and pathological individuals, in order to identify shared or unique signatures that drive aging or age-related diseases.


Aging is considered a major risk factor for the development of late-onset pathologies such as atherosclerosis, cancer, type 2 diabetes and neurodegenerative diseases. This suggests that age-related changes in gene expression should, to some extent, resemble the changes in gene expression observed in these diseases. While the presence of common differentially expressed genes is essential, it is not sufficient evidence to assume a common molecular basis for aging and age-related diseases (ARDs). The important point is whether these genes display similar expression profiles as a whole. Up until now, however, this question has not been fully addressed. In this study, we propose to analyze the age-related signatures of different human tissues (brain, muscle, lungs, kidney, and skin) and to compare them in order to gain insight into whether the signatures are tissue-specific or whether there is a “common” aging signature across tissues. Additionally, we aim to search for similar-to-aging gene expression profiles among genetic screens of age-related pathologies. The pathological conditions identified as having signatures similar to the age-related transcriptional profiles of various tissues and cell types could then be used to build a dual aging-diseases centric model. This model could be extremely useful both at a theoretical level, for a better understanding of the mechanisms behind aging and diseases, and at an applied level, for early-stage detection of late-onset diseases and related pathological conditions.

Project objectives:

The overall goal of the proposed research is to integratively study aging and age-related diseases and their common links at a genetic/molecular level. More specifically, the objectives are: 1) Building a joint model of age-related gene expression changes in different human tissues; 2) Investigating the particular genetic signatures and molecular pathways shared between aging and various age-related conditions with profiles similar to healthy aging; 3) Constructing an integrative model that describes the changes in gene expression in aging and in age-related diseases.

Estimated outcomes:

Upon completion of the project, the following outcomes should be achieved: 1) the group will have a list of aging signatures and a ranked list of datasets based on their gene expression similarity to aging; 2) the group will build a gene network model aimed at explaining how the molecular components that determine the similarity to aging interact with one another; 3) the group will have a graph model of physiological and pathological transitions occurring with age, which could support hypotheses and inferences about the aging process.

Robi Tacutu, Ph.D.

Group leader, CS II

A thorough, analytical and curiosity-driven scientist, with a multidisciplinary background in biology and computer science, Robi received his BSc in computer science from the University Politehnica Bucharest, and his MSc in biochemistry and molecular biology from the University of Bucharest. He has a long-term commitment and interest in the field of biogerontology (since 2005) and received his PhD in 2013 from the Ben-Gurion University of the Negev, in the lab of Prof. Vadim Fraifeld. His thesis was on an ageing-related topic: studying the relationships between aging and age-related diseases with the use of bioinformatics and network biology approaches, and ...

Dmitri Toren, Ph.D.

Assistant researcher

Dmitri is a bioinformatics researcher at the Institute of Biochemistry, focusing on curation for aging-related data, analysis of large screen datasets, dataset annotation and network-based analyses. He also holds a Ph.D. d...

Gabriela Bunu, Mrs.

PhD student, Bioinformatics research assistant

Gabriela is a bioinformatics researcher at the Institute of Biochemistry, focussing on curation for aging-related data, analysis of large screen datasets and network-based analyses. She has a multidisciplinary background, ...

As a result of the project implementation (2020-2022), all proposed activities (A1-A10) were carried out successfully and all deliverables (D1-D7) were obtained.
Briefly, a series of publicly available transcriptomics datasets were collected, evaluated (to account for different experimental conditions, different tissues, etc.), and selected if relevant to the study of ageing or of age-related diseases (such as Alzheimer's disease, Parkinson's disease, Type 2 Diabetes, Atherosclerosis, Osteoporosis, and Sarcopenia). All selected datasets were then bioinformatically processed, including data normalization (using the same normalization method for all studies) and computation of differential expression for each gene.
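
The processing step can be illustrated with a minimal sketch. The normalization method is not named here, so quantile normalization is used purely as an assumed stand-in, and the log2 fold change with a pseudocount is likewise an illustrative choice for per-gene differential expression. The function names are hypothetical.

```python
import math
from statistics import mean


def quantile_normalize(samples):
    """Force every sample onto the same value distribution.

    samples: list of equal-length lists, one per sample, genes in the same order.
    Assumption: ties are broken by gene order rather than averaged.
    """
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean across samples at each rank.
    rank_means = [mean(s[i] for s in sorted_samples) for i in range(n_genes)]
    normalized = []
    for s in samples:
        order = sorted(range(n_genes), key=lambda i: s[i])
        out = [0.0] * n_genes
        for rank, gene_idx in enumerate(order):
            out[gene_idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def log2_fold_change(old_values, young_values, pseudocount=1.0):
    """Per-gene log2 ratio of mean expression, old versus young (illustrative)."""
    return math.log2((mean(old_values) + pseudocount) /
                     (mean(young_values) + pseudocount))
```

Applying one normalization routine to every dataset, whatever method is actually chosen, is what keeps the fold changes comparable across studies.
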
For each of the available tissues (whole blood, bone marrow, brain, eye, leukocytes, lymphocytes, liver, muscle, skin, stem cells), a common "transcriptional signature" was derived - in other words, a pattern of core transcriptional changes found in that tissue. Analysis of the per-tissue signatures showed that age-related changes differ substantially across tissues, but also that a small number of molecular components work in tandem across multiple tissues, which could constitute a more universal basis for ageing and age-related diseases.
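
One simple way to picture a per-tissue signature is vote counting across datasets. The sketch below is an assumption for illustration only: how the signatures were actually derived is not described here, and the function name and the `min_fraction` cutoff are hypothetical.

```python
from collections import Counter


def consensus_signature(datasets, min_fraction=0.75):
    """Return (up_genes, down_genes) that change in the same direction
    in at least `min_fraction` of the datasets.

    datasets: list of dicts mapping gene -> log2 fold change (old vs young).
    """
    up, down = Counter(), Counter()
    for fold_changes in datasets:
        for gene, lfc in fold_changes.items():
            if lfc > 0:
                up[gene] += 1
            elif lfc < 0:
                down[gene] += 1
    needed = min_fraction * len(datasets)
    sig_up = {g for g, n in up.items() if n >= needed}
    sig_down = {g for g, n in down.items() if n >= needed}
    return sig_up, sig_down
```

Genes that flip direction between studies, or that are measured in too few of them, drop out of the signature under this scheme.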

Next, to be able to compare profiles of gene expression changes, a script was developed that calculates a similarity score, based on a "Gene Set Enrichment Analysis" type algorithm. This algorithm was further used for two types of searches in the GEO database: 1) an agnostic search, without prior knowledge of what a relevant dataset might be, and 2) a targeted search to directly analyze the molecular links between aging and age-related pathologies.
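
A GSEA-type similarity score of this kind can be sketched as follows. This is a minimal, unweighted running-sum version written under stated assumptions, not the project's script: the function names are hypothetical, and combining the up- and down-gene enrichment scores by halving their difference is one common convention rather than a confirmed detail.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    ranked_genes: genes of the query dataset, ordered from most up-regulated
    to most down-regulated. Positive scores mean the set clusters near the top.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the largest deviation from zero
    return best


def similarity_score(ranked_genes, up_genes, down_genes):
    """Score in [-1, 1]: +1 if the query mirrors the signature, -1 if it reverses it."""
    es_up = enrichment_score(ranked_genes, up_genes)
    es_down = enrichment_score(ranked_genes, down_genes)
    return (es_up - es_down) / 2.0
```

A dataset whose top-ranked genes are the signature's up-regulated genes, and whose bottom-ranked genes are its down-regulated genes, scores close to +1.
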
These similarity scores allowed for the construction of a similarity matrix, which was then represented as an undirected graph/network. This network model, which includes both physiological transitions in ageing and changes caused by age-related pathologies, allowed us to select several subsets of core transitions, which then served as the input for meta-analyses. Within the project, two subsets were chosen: first, a subset selected manually as central in the initial similarity network, and then a subset representing the largest cluster (relevant in particular to brain aging and neurodegeneration). The resulting signatures, that is, the changes common to the comparisons/transitions in each dataset subset, were functionally and topologically analyzed. Subsequently, the signatures were used to identify drugs or other chemical compounds that could reverse the changes. Interestingly, the proposed drugs included a number of compounds already known to have a role in aging or related processes (from either in vitro studies or animal models). This result can be considered a partial validation of the potential of the methodology developed in the project.
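
The step from similarity matrix to network and clusters can be sketched as below. The edge threshold, the symmetrization of the scores, and the use of connected components as "clusters" are all assumptions made for illustration, since the clustering method is not specified here. The function names are hypothetical.

```python
from collections import deque


def build_graph(names, sim, threshold=0.5):
    """Undirected graph: connect two datasets if their mean pairwise score
    reaches `threshold` (assumed cutoff; scores may be asymmetric)."""
    adj = {n: set() for n in names}
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            score = (sim[i][j] + sim[j][i]) / 2.0
            if score >= threshold:
                adj[names[i]].add(names[j])
                adj[names[j]].add(names[i])
    return adj


def connected_components(adj):
    """Return the components of the graph, largest first (breadth-first search)."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)
```

The first element returned by `connected_components` would then play the role of the "largest cluster" passed on to meta-analysis.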

Overall, the activities in the TE "GeT-SHARP" project have led to the development of several tools and of a bioinformatics model that can support both fundamental research, in the study of aging and of ageing-associated pathologies, and applied research, by making it possible to predict modulatory drugs or genetic therapies. As such, we aim to disseminate these results through the publication of one or more scientific articles. This dissemination will include both a description of the methodology and the results obtained with the help of the created model. One such article is currently in preparation.

Summary of results obtained in 2020:

In the 2020 phase, a series of previously used, aging-related datasets of interest were re-evaluated, and some of them were selected as relevant to the current project. This subset was expanded through a semi-automated search process, followed by manual curation and annotation. As a result, the list of datasets to be analyzed in this project currently includes 78 transcriptomic datasets. Several of these were already pre-processed in 2020 using the same protocol (normalization with the same methodology).

These steps have been performed using software tools that were already developed in our group and needed only to be adapted/updated to the needs of the current project. 

At the end of this phase, all the activities in the project are on schedule according to the proposal's Gantt chart.

Summary of results obtained in 2021:

In this phase (2021), the collection and annotation of selected transcriptomics datasets from databases and scientific research, for the study of aging and age-related diseases, was carried out. Aging-related data was analyzed across multiple studies and different experimental designs (different tissues, different experimental/clinical conditions, etc.), and a common transcriptional signature was derived from them.

This signature was then used to agnostically search the GEO database for other datasets (not necessarily aging-related) in order to identify similar datasets of potential interest to the field. To validate this core aging signature, we also ran a more targeted search among datasets associated with age-related diseases. These datasets were collected and manually annotated for Alzheimer's disease, Parkinson's disease, Type 2 Diabetes, Atherosclerosis, Osteoporosis and Sarcopenia. A similarity algorithm was designed and implemented, and, based on the similarity results, a network graph was constructed to navigate the identified dataset similarities. These activities (the meta-analysis of aging/age-related pathologies data and the analysis of the similarity transition graph) will continue in the next phase (2022) as well.

Currently, all the activities in the project are on track and in line with the proposal's initial Gantt chart.

Summary of results obtained in 2022:

In this phase (2022), the previously constructed network model, based on a matrix of similarities between transcriptomic datasets, was extended. Namely, the matrix/graph of transcriptomics comparisons/transitions, representing physiological ageing changes as well as pathological changes from age-related diseases, was clustered into several groups of tightly related studies.

The main cluster, which seems to be relevant for brain ageing and neurodegenerative diseases, was then selected and further analyzed. The transcriptional changes that form the signature of this cluster were meta-analyzed, and the results were evaluated for functional enrichment and topological characteristics.

Furthermore, the up- and down-regulated genes were then used in conjunction with the ConnectivityMap database to predict drugs or other chemical compounds that could potentially reverse the core changes in the cluster. Interestingly, some of the identified drugs have been previously associated with ageing-related processes, which supports the idea that this model could also serve as a tool for prediction of novel geroprotectors or other therapeutics, relevant to both ageing and age-related diseases.
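
The idea of ranking compounds by how well they reverse a signature can be sketched with a simple set-overlap score. This is an illustrative assumption only: it is not the ConnectivityMap scoring scheme or its API, and the function names and compound identifiers are hypothetical.

```python
def reversal_score(sig_up, sig_down, drug_up, drug_down):
    """Fraction of signature genes a compound pushes in the opposite direction,
    minus the fraction it pushes in the same direction (range -1 to 1)."""
    total = len(sig_up) + len(sig_down)
    if total == 0:
        return 0.0
    opposed = len(sig_up & drug_down) + len(sig_down & drug_up)
    aligned = len(sig_up & drug_up) + len(sig_down & drug_down)
    return (opposed - aligned) / total


def rank_compounds(sig_up, sig_down, compound_profiles):
    """compound_profiles: dict name -> (up_genes, down_genes).
    Returns (name, score) pairs, strongest predicted reversers first."""
    scores = {
        name: reversal_score(sig_up, sig_down, up, down)
        for name, (up, down) in compound_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Compounds at the top of such a ranking are candidates for reversing the cluster's core changes. Those at the bottom mimic the changes instead.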