Open data sharing efforts can advance research by making existing data available for new analyses and by fostering collaborations across institutions and countries. Such efforts are especially useful for complex diseases such as Alzheimer’s disease (AD), a progressive neurodegenerative disorder that affects memory and cognition in the elderly. Research into the causes and mechanisms of AD, with the goal of preventing, accurately diagnosing, and treating this debilitating illness, requires a worldwide collaborative effort. The Global Alzheimer’s Association Interactive Network (GAAIN, www.gaain.org) connects AD researchers around the world through a data sharing platform and has been helping researchers discover existing datasets and form new collaborations since 2015.
Currently, more than 50 research groups, known as GAAIN Data Partners, from 17 countries contribute data on more than 500,000 subjects and more than 30,000 attributes. GAAIN features four clinical trials (testing new drugs or regimens), nine cross-sectional studies (comparing different populations at one time point), 38 longitudinal studies (following one population over time), and two multi-project repositories.
In the GAAIN Interrogator (www.gaaindata.org), users can visually analyze trends in existing AD and dementia datasets. This is particularly useful for meta-analyses, which are secondary analyses that review and combine the data of primary studies to reach broad generalizations about a phenomenon. Variables in the Interrogator include demographic, diagnostic, socioeconomic, cognitive, genetic, imaging, and biomarker data. Users can get started by choosing one or more datasets to analyze, defining pooled binary or categorical variables, and building cohorts based on those variables. After identifying new trends and datasets of interest in the Interrogator, users can apply for data access from the Data Partners to collaborate on a publication.
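The idea of pooling a variable across datasets and then building a cohort can be sketched in a few lines of Python. This is only an illustration and not how the Interrogator is implemented: the two datasets, their field names, and their coding schemes are all hypothetical.

```python
# Hypothetical records from two datasets that code sex differently.
dataset_a = [{"id": "a1", "sex": "F", "age": 72}, {"id": "a2", "sex": "M", "age": 66}]
dataset_b = [{"id": "b1", "gender": 2, "age": 81}, {"id": "b2", "gender": 1, "age": 59}]

# Hypothetical coding schemes mapped onto one shared vocabulary.
SEX_CODES_A = {"F": "female", "M": "male"}
SEX_CODES_B = {2: "female", 1: "male"}


def harmonize(records, source_field, codes, dataset_name):
    """Map one dataset's coding onto the pooled variable `sex`."""
    return [
        {
            "id": record["id"],
            "dataset": dataset_name,
            "sex": codes[record[source_field]],
            "age": record["age"],
        }
        for record in records
    ]


# Pool both datasets into one list that shares the same variables.
pooled = harmonize(dataset_a, "sex", SEX_CODES_A, "A") + harmonize(
    dataset_b, "gender", SEX_CODES_B, "B"
)

# Build a cohort from the pooled variables: women aged 65 and over.
cohort = [r for r in pooled if r["sex"] == "female" and r["age"] >= 65]
cohort_ids = [r["id"] for r in cohort]
print(cohort_ids)
```

Once every dataset uses the same values for a variable, a single cohort definition can be applied to all of them at once.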
Risk factors for Alzheimer’s disease
One example of GAAIN usage is exploring some of the risk factors for Alzheimer’s disease.
The APOE4 version of the apolipoprotein E (APOE) protein is a major genetic risk factor for developing AD. APOE3 is the most common version and does not affect the risk of developing AD, while APOE2 is the least common version and may have some protective effects. Individuals have two alleles, or copies, of each gene, resulting in an APOE genotype that can be any combination of the three versions. In addition, those with two alleles of APOE4 are at higher risk than those with one allele.
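The possible genotypes can be listed with a short Python sketch. It is not part of GAAIN; the allele labels are just the three versions named above, and the variable names are illustrative.

```python
from itertools import combinations_with_replacement

# The three APOE versions named in the text (labels are illustrative).
ALLELES = ["APOE2", "APOE3", "APOE4"]

# Each person carries two alleles and their order does not matter,
# so genotypes are unordered pairs drawn with replacement.
genotypes = list(combinations_with_replacement(ALLELES, 2))

for first, second in genotypes:
    # Count how many copies of the risk allele this genotype carries.
    apoe4_copies = (first, second).count("APOE4")
    print(f"{first}/{second}: {apoe4_copies} APOE4 allele(s)")
```

Three versions give six possible genotypes, ranging from two copies of APOE2 to two copies of APOE4.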
Using logistic regression in the GAAIN Interrogator to analyze the effects of APOE genotype on disease outcome in the combined data of 50,000 subjects collected by five Data Partners, we see that the odds ratio for having one allele of APOE4 is 3 (orange) and that for having two alleles of APOE4 is 10 (green). An odds ratio compares the odds of an outcome between two groups: a value above 1 indicates a higher risk, and the larger the value, the higher the risk. In addition, when comparing those with one or two alleles of APOE2 with those with two alleles of APOE3 in a group of subjects without the APOE4 allele, we see that the odds ratio is around 0.5 (blue). Odds ratios below 1 indicate a protective effect.
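To make the odds ratio concrete, here is a minimal Python sketch that computes one from a two-by-two table of counts. The counts are invented for illustration and are not GAAIN data, and the function names are hypothetical. For a logistic regression with a single yes/no predictor, the exponentiated coefficient equals this two-by-two odds ratio; the Interrogator's own model may differ, for example by adjusting for other variables.

```python
import math


def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds of the outcome in the exposed group divided by the odds in the unexposed group."""
    odds_exposed = exposed_cases / exposed_controls
    odds_unexposed = unexposed_cases / unexposed_controls
    return odds_exposed / odds_unexposed


def approx_95_interval(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Approximate 95% confidence interval based on the standard error of log(OR)."""
    log_or = math.log(
        odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls)
    )
    standard_error = math.sqrt(
        1 / exposed_cases
        + 1 / exposed_controls
        + 1 / unexposed_cases
        + 1 / unexposed_controls
    )
    return math.exp(log_or - 1.96 * standard_error), math.exp(log_or + 1.96 * standard_error)


# Hypothetical counts, invented for illustration only (not GAAIN data).
carriers_with_ad, carriers_without_ad = 300, 200      # one APOE4 allele
reference_with_ad, reference_without_ad = 200, 400    # comparison group

or_value = odds_ratio(
    carriers_with_ad, carriers_without_ad, reference_with_ad, reference_without_ad
)
low, high = approx_95_interval(
    carriers_with_ad, carriers_without_ad, reference_with_ad, reference_without_ad
)
print(f"odds ratio = {or_value:.1f}, approximate 95% interval = ({low:.2f}, {high:.2f})")
```

With these made-up counts, the odds of AD are 1.5 among carriers and 0.5 in the comparison group, giving an odds ratio of 3. Swapping the two groups would give the reciprocal, about 0.33, which is how a protective effect shows up as a value below 1.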
Hypertension, or high blood pressure, is another risk factor for developing AD. Using the GAAIN Interrogator linear regression analysis, we see that hypertension has a weak positive correlation with cognitive impairment in both males (orange) and females (blue) among 8,000 subjects combined from three Data Partners, suggesting that high blood pressure is associated with cognitive impairment in this group of subjects. Here, cognitive impairment is defined by Mini-Mental State Examination (MMSE) scores below 24. The MMSE is a 30-point cognitive questionnaire that is used alongside other clinical tools to diagnose AD, other dementias, and cognitive impairment.
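The definition of cognitive impairment used here, and what a weak positive correlation looks like, can be sketched in Python. The subjects below are invented for illustration and are not GAAIN data. The sketch computes a plain Pearson correlation rather than the Interrogator's regression; for a single predictor, the sign of the correlation matches the sign of the regression slope.

```python
import math

MMSE_CUTOFF = 24  # scores below this value count as impaired, as defined in the text


def is_impaired(mmse_score):
    """Return 1 if the MMSE score falls below the cutoff, otherwise 0."""
    return 1 if mmse_score < MMSE_CUTOFF else 0


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical subjects: (has_hypertension, mmse_score). Not real data.
subjects = [(1, 22), (1, 27), (1, 23), (1, 29), (0, 28), (0, 30), (0, 21), (0, 26)]

hypertension = [flag for flag, _ in subjects]
impaired = [is_impaired(score) for _, score in subjects]

r = pearson_r(hypertension, impaired)
print(f"correlation between hypertension and impairment: {r:.2f}")
```

A correlation describes an association only; on its own it does not show that one variable causes the other.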
Alzheimer’s disease meta-analyses
As shown above, the GAAIN Interrogator can be used to analyze multiple datasets to answer specific research questions related to Alzheimer’s disease and dementia.
Researchers who have used GAAIN for meta-analyses were able to combine existing data to answer questions that could not be answered by analyzing one dataset alone. For example, in a meta-analysis using datasets discovered in GAAIN, researchers found that women with the APOE3/4 genotype had a higher risk of developing AD than men between 65 and 75 years of age, but that the two sexes had similar risks overall between 55 and 85 years of age.
Open data sharing
Recently, funding agencies have been pushing for data sharing to become the norm rather than an exception. At the end of 2019, the National Institutes of Health (NIH) in the United States drafted new policies requiring all NIH-funded researchers to share their data upon study completion. Open data sharing would accelerate research by allowing scientists to access existing data and form new collaborations. That was the exact motivation for the Alzheimer’s Association to sponsor GAAIN, developed by the Laboratory of Neuro Imaging at the University of Southern California.
Written by Cally Xiao. Illustrated by Sumana Shrestha.
Edited by Joel Frohlich, Sean Noah and Desislava Nesheva.
What are your thoughts on open data sharing efforts? What research questions related to Alzheimer’s disease would you like to explore in existing datasets?
References
Alzheimer’s Association. (n.d.). Global Alzheimer’s Association Interactive Network. Retrieved from: https://www.alz.org/research/for_researchers/partnerships/gaain
Conrado, D. J., Karlsson, M. O., Romero, K., Sarr, C., & Wilkins, J. J. (2017). Open innovation: Towards sharing of data, models and workflows. European Journal of Pharmaceutical Sciences, 109, S65-71. https://dx.doi.org/10.1016/j.ejps.2017.06.035
Kaiser, J. (2019, November 11). Why NIH is beefing up its data sharing rules after 16 years. Retrieved from: https://www.sciencemag.org/news/2019/11/why-nih-beefing-its-data-sharing-rules-after-16-years
Neu, S. C., Pa, J., Kukull, W., Beekly, D., Kuzma, A., Gangadharan, P., Wang, L.-S., Romero, K., Arneric, S. P., Redolfi, A., Orlandi, D., Frisoni, G. B., Au, R., Devine, S., Auerbach, S., Espinosa, A., Boada, M., Ruiz, A., Johnson, S. C., … Toga, A. W. (2017). Apolipoprotein E genotype and sex risk factors for Alzheimer disease: A meta-analysis. JAMA Neurology, 74, 1178-1189. https://doi.org/10.1001/jamaneurol.2017.2188
Neu, S. C., Crawford, K. L., & Toga, A. W. (2016). Sharing data in the Global Alzheimer’s Association Interactive Network. NeuroImage, 124, 1168-1174. https://doi.org/10.1016/j.neuroimage.2015.05.082
Silva, M. V. F., Loures, C. M. G., Alves, L. C. V., de Souza, L. C., Borges, K. B. G., & Carvalho, M. D. G. (2019). Alzheimer’s disease: Risk factors and potentially protective measures. Journal of Biomedical Science, 26, 33. https://doi.org/10.1186/s12929-019-0524-y
Toga, A. W., Neu, S. C., Bhatt, P., Crawford, K. L., & Ashish, N. (2016). The Global Alzheimer’s Association Interactive Network. Alzheimer’s & Dementia, 12, 49-54. https://doi.org/10.1016/j.jalz.2015.06.1896