
Overview of Electronic Health Record data

As in many fields, the use of “big data” has become more common in psychiatric research, alongside major technological and computational advances in recent decades. Electronic health records (EHRs) can be considered one form of big data used for trauma-related research. These data can come from individual hospitals or healthcare clinics, or from larger, single-payer healthcare systems like those used in Nordic countries and within the Veterans Health Administration (VHA) in the United States. In this blog post, we discuss the use of these data to study post-traumatic psychopathology, providing three applied examples that were presented as part of our symposium at the 2023 ISTSS Annual Meeting.

Population-based, nationwide registries in Nordic countries follow individuals from birth or immigration, and gather data on health-related and administrative characteristics throughout their lifetimes. Given the single-payer healthcare systems in these countries, essentially all encounters with the healthcare system are included, providing a rich source of data. Similarly, the VHA in the United States acts like a single-payer healthcare system for US military veterans, and thus holds medical record data on that population.

EHRs in general were initially adopted to improve patient care and efficiency in health care, but have also become a research resource. Available data can include diagnoses, clinical summaries, lab test results, and medications. Strengths of using these types of data sources compared to “traditional” survey-based research or recruitment include efficiency and lower cost (since the data already exist), a lower chance of selection bias, relatively consistent and accurate classification of diseases, the ability to study rare events or diseases given large sample sizes, and incentives for large-scale, global collaboration and data sharing. These data sources are also usually longitudinal, providing important information on the temporality of health-related events. Challenges or weaknesses of using these data include changing disease codes over time, difficulty capturing social or economic determinants of health, a high likelihood of findings that are statistically significant because of large sample sizes but not necessarily clinically meaningful, and logistical challenges of acquiring and accessing these data.
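To make the point about statistical versus clinical significance concrete, here is a minimal sketch in Python. It assumes a simple two-sample z-test with a known common standard deviation, and every number in it (a 0.5-point difference on a symptom scale with a standard deviation of 10, and the two sample sizes) is hypothetical and chosen only for illustration.

```python
import math


def two_sample_z_test(mean_diff, sd, n_per_group):
    """Return (z, two-sided p-value) for a difference in means between
    two equal-sized groups, assuming a known common standard deviation."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical effect: 0.5 points on a scale whose SD is 10.
MEAN_DIFF = 0.5
SD = 10.0

# The standardized effect size is the same regardless of sample size.
cohens_d = MEAN_DIFF / SD

# Same effect, survey-sized sample versus registry-sized sample.
small_z, small_p = two_sample_z_test(MEAN_DIFF, SD, n_per_group=200)
large_z, large_p = two_sample_z_test(MEAN_DIFF, SD, n_per_group=1_000_000)

print(f"Cohen's d: {cohens_d:.2f}")
print(f"n = 200 per group:       z = {small_z:.2f}, p = {small_p:.3f}")
print(f"n = 1,000,000 per group: z = {large_z:.2f}, p = {large_p:.3g}")
```

The effect size does not change between the two runs; only the p-value does. This is one reason registry-based studies tend to emphasize effect sizes and confidence intervals rather than p-values alone.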

Below are three applied examples describing this type of registry-based research.

Applied example 1:

Traumatic experiences are common, yet only a minority of people who have traumatic experiences develop mental health problems, like posttraumatic stress disorder (PTSD) or depression, as a result of these experiences. Because most people who experience trauma recover, there is great interest in understanding why only some people have worse mental health outcomes. The goal of our study was to use national data from the population of Denmark to learn what pre-trauma characteristics might predict who will have severe mental health consequences following trauma (defined as three or more new mental health diagnoses within 5 years of a traumatic event). All data were obtained from regularly maintained Danish national health care and social registries that capture all diagnostic and treatment data for the entire population, in addition to demographic and social data. Our sample included 20,361 people who experienced severe mental health consequences following trauma according to our definition, and they were compared to 81,444 people from the original population who did not meet the case definition. Variables examined from before the traumatic experience included hundreds of demographic and social variables, psychiatric and physical health diagnoses, and prescriptions. We used a form of exploratory analysis called machine learning to determine which of the pre-trauma variables were most important to understanding severe post-trauma mental health outcomes. In the full sample, we found that mental health diagnoses prior to trauma were the most important predictors of severe mental health outcomes following trauma.

We then ran a second analysis among the participants with no mental health diagnoses prior to trauma. We found that among these people, demographic and social variables (e.g., marital status), trauma type, medications used to treat psychiatric symptomatology, anti-inflammatory medications, and gastrointestinal distress were all associated with severe mental health outcomes following trauma. These results confirmed well-known predictors of worse mental health outcomes following trauma (e.g., trauma type). The inclusion of medications used to treat psychiatric symptomatology in our results among people without mental health diagnoses could point to the importance of mental health symptoms that are not severe enough to meet diagnostic criteria, or could reflect use of psychiatric medications for other purposes (e.g., sleep disturbance) in the absence of additional mental health symptoms. Importantly, these results also add to the literature that suggests a meaningful connection between posttraumatic mental health problems and inflammation and gastrointestinal distress, shedding additional light on the physical health underpinnings that may increase risk for mental health problems following trauma.
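For readers curious how a case definition like the one in this example (three or more new mental health diagnoses within 5 years of a traumatic event) might be applied to registry-style records, here is a rough sketch in Python. It is not the study's actual code. The record layout, the function names, and the choices to count distinct diagnosis codes and to treat "new" as "not recorded on or before the trauma date" are all assumptions made for illustration, and the ICD-10-style codes and dates are invented.

```python
from datetime import date

FOLLOW_UP_YEARS = 5      # from the case definition described above
MIN_NEW_DIAGNOSES = 3    # from the case definition described above


def add_years(d, years):
    """Shift a date forward by whole years (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def meets_case_definition(trauma_date, diagnoses):
    """diagnoses: list of (diagnosis_code, date_recorded) tuples.

    Assumption: a diagnosis is 'new' if its code was never recorded on or
    before the trauma date, and distinct codes are counted once each.
    """
    window_end = add_years(trauma_date, FOLLOW_UP_YEARS)
    prior_codes = {code for code, d in diagnoses if d <= trauma_date}
    new_codes = {
        code
        for code, d in diagnoses
        if trauma_date < d <= window_end and code not in prior_codes
    }
    return len(new_codes) >= MIN_NEW_DIAGNOSES


# Invented records for two hypothetical people with the same trauma date.
trauma = date(2000, 1, 1)
person_a = [
    ("F32", date(1998, 5, 1)),    # recorded before trauma, so not new
    ("F32", date(2001, 3, 1)),
    ("F43.1", date(2000, 6, 1)),
    ("F41.1", date(2002, 1, 1)),
    ("F10", date(2006, 1, 1)),    # outside the 5-year window
]
person_b = [
    ("F43.1", date(2000, 6, 1)),
    ("F32", date(2001, 3, 1)),
    ("F41.1", date(2004, 12, 31)),
]

is_case_a = meets_case_definition(trauma, person_a)
is_case_b = meets_case_definition(trauma, person_b)
print(is_case_a, is_case_b)
```

Person A has only two qualifying new diagnoses and would not meet the definition under these assumptions, while person B has three and would.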

Applied example 2:

It is well known that PTSD runs in families. Family studies generally rely on intact families and thus are unable to separate out influences that are genetically versus environmentally transmitted. Another type of family design, using twins, has found that PTSD is influenced by both genetic and environmental factors, but these studies can only examine these sources of influence within the same generation. We were interested in extending this work to determine if PTSD is transmitted from parents to offspring. In other words, we wanted to know about cross-generational transmission. We also wanted to test to what degree the resemblance between parents and offspring is due to genes versus rearing effects. We used a population register-based design covering the entire country of Sweden. Using medical registries, we looked at PTSD diagnoses in parents and offspring (n = 2,194,171, born 1960-1992) from six types of families (intact, not lived with biological father or mother, stepfather or stepmother, and adoptive). Three sources of parent-offspring resemblance were calculated: genes plus rearing, genes only, and rearing only. We found that PTSD was transmitted from parents to offspring through rearing and genes, nearly equally. Additional analyses suggest that shared traumatic event exposure likely plays an important role in transmission; however, rearing effects remained substantial, even when controlling for potential shared exposures.
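The logic behind this design can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the study's method: the labels, the function name, and the two yes/no flags are our own shorthand. The idea is that a biological parent who also raised the child can pass on both genes and rearing, a biological parent who did not live with the child can pass on genes only, and a stepparent or adoptive parent can pass on rearing only.

```python
def resemblance_source(is_biological_parent, reared_offspring):
    """Which source of parent-offspring resemblance a relationship can inform."""
    if is_biological_parent and reared_offspring:
        return "genes plus rearing"
    if is_biological_parent:
        return "genes only"
    if reared_offspring:
        return "rearing only"
    return None  # neither genetic nor rearing link


# Hypothetical labels for the six relationship types described above,
# each mapped to (is_biological_parent, reared_offspring).
RELATIONSHIP_TYPES = {
    "parent in intact family": (True, True),
    "not-lived-with biological father": (True, False),
    "not-lived-with biological mother": (True, False),
    "stepfather": (False, True),
    "stepmother": (False, True),
    "adoptive parent": (False, True),
}

sources = {
    name: resemblance_source(*flags)
    for name, flags in RELATIONSHIP_TYPES.items()
}

for name, source in sources.items():
    print(f"{name}: {source}")
```

Comparing parent-offspring resemblance across these groups is what allows genetic and rearing contributions to be separated in a way that studies of intact families alone cannot.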

Applied example 3:

PTSD is associated with a variety of physical health problems, such as heart disease and type 2 diabetes (T2D). Yet, studying the link between PTSD and physical health is challenging because it often requires information on many people over many years, which is a costly and labor-intensive endeavor. With EHRs now in wide use, data collected as part of regular medical care can help us better understand the physical health consequences of PTSD. We described how data from the VHA EHRs could be used to study PTSD and T2D. Our study plans to use EHR data from over 6 million Veterans without diabetes. We will follow these Veterans over time to see who develops T2D and whether PTSD is an important factor in determining who does and does not develop this condition. Preliminary results suggest that PTSD may increase the likelihood of T2D in Veterans, but stay tuned for additional findings. In sum, use of EHR data from sources such as the VHA may yield new and important insights about PTSD and physical health. EHR data have several advantages over other types of data sources: they contain comprehensive information on mental and physical health outcomes, include large numbers of people who are followed over time, and offer a cost- and time-efficient way to collect this information.
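As a rough illustration of what "following people over time" looks like in an analysis, the Python sketch below computes incidence rates of T2D per 1,000 person-years for people with and without PTSD, and the ratio between them. This is one common way to summarize cohort data, not necessarily the approach our study will take. The records are entirely made up, do not represent VHA data or study results, and the field layout and function names are hypothetical.

```python
def incidence_rate(new_cases, person_years, per=1000):
    """New cases per `per` person-years of follow-up."""
    return new_cases * per / person_years


def summarize(records, has_ptsd):
    """Total new T2D cases and person-years for one exposure group."""
    group = [r for r in records if r[0] == has_ptsd]
    new_cases = sum(1 for r in group if r[2])
    person_years = sum(r[1] for r in group)
    return new_cases, person_years


# Made-up toy records: (has_ptsd, years_followed, developed_t2d).
# Everyone is assumed to be free of diabetes at the start of follow-up.
records = [
    (True, 4.0, True),
    (True, 6.0, False),
    (True, 10.0, True),
    (False, 10.0, False),
    (False, 10.0, True),
    (False, 10.0, False),
    (False, 10.0, False),
]

ptsd_cases, ptsd_py = summarize(records, has_ptsd=True)
no_ptsd_cases, no_ptsd_py = summarize(records, has_ptsd=False)

rate_ptsd = incidence_rate(ptsd_cases, ptsd_py)
rate_no_ptsd = incidence_rate(no_ptsd_cases, no_ptsd_py)
rate_ratio = rate_ptsd / rate_no_ptsd

print(f"PTSD:    {rate_ptsd:.1f} per 1,000 person-years")
print(f"No PTSD: {rate_no_ptsd:.1f} per 1,000 person-years")
print(f"Rate ratio: {rate_ratio:.1f}")
```

A real analysis would also need to account for differences between the groups (for example, age or other health conditions), which this toy example ignores.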

Future Directions

In addition to the above examples, other uses and future directions of this type of research include linkage to biorepository data and to other data sources, including those that may contain socioeconomic data. Overall, using registry or EHR data is an innovative, promising, and increasingly common approach to studying trauma and its effects in populations.

Discussion Questions 

1. What are some strengths and weaknesses of applying EHR data or national registries to study PTSD and related conditions?

2. Why might we find different results across studies using different types of data (e.g., EHR vs. volunteer-based surveys)?

3. What are some future directions in using EHR data for psychiatric research?

4. What are other types of “big data” that could be leveraged for psychiatric research?

About the Authors

Laura Sampson, PhD is an Assistant Professor in the Program in Public Health at Stony Brook University. She studies the mental and physical health effects of stress and trauma. Dr. Sampson can be contacted at Laura.Sampson@Stonybrookmedicine.edu and followed on social media: @LauraSampson611.

Ananda B. Amstadter, PhD is a Professor of Psychiatry, Psychology, & Human and Molecular Genetics at Virginia Commonwealth University School of Medicine. She is a Past President of ISTSS, and her research focuses on risk and resiliency factors for traumatic stress related conditions. Dr. Amstadter can be followed on social media: @DrAmstadter.

Kelsey N. Serier, PhD is a Post-doctoral Fellow at the Women’s Health Sciences Division of the National Center for PTSD. She studies the relationships between PTSD and physical health, as well as disordered eating, among United States Veterans.

Jaimie L. Gradus, DSc is a Professor of Epidemiology at Boston University School of Public Health. Her research focuses on the epidemiology of trauma and trauma-related disorders, especially suicide outcomes. Dr. Gradus can be followed on social media: @JaimieGradus.

