The MITRE Corporation
Over 1 million synthetic patient records
The Synthea Generated Synthetic Data in FHIR dataset hosts over 1 million synthetic patient records generated using Synthea.
This public dataset is hosted in Google BigQuery and is covered by BigQuery's free tier of 1 TB of processing per month. Each user receives 1 TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
This public dataset is also available in Google Cloud Storage and is free to use. The URL for the GCS bucket is gs://gcp-public-data--synthea-fhir-data-1m-patients. Use this quick start guide
Please cite Synthea™ as:
Jason Walonoski, Mark Kramer, Joseph Nichols, Andre Quina, Chris Moesel, Dylan Hall, Carlton Duffett, Kudakwashe Dube, Thomas Gallagher, Scott McLachlan, Synthea: An approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record, Journal of the American Medical Informatics Association, Volume 25, Issue 3, March 2018, Pages 230–238, https://doi.org/10.1093/jamia/ocx079
Try the sample queries below in the BigQuery UI.
What was the occurrence of each condition by year for the last 3 years?
This query counts all condition occurrences in the Condition table and summarizes them for each of the last three years. Run this query.
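The aggregation described above can be illustrated outside BigQuery. The following is a minimal Python sketch on hypothetical in-memory rows, not the dataset's actual query or schema: the `(condition, onset date)` tuple layout, the function name `condition_counts_by_year`, and the choice to count the current year as one of the three years are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date


def condition_counts_by_year(records, current_year, years=3):
    """Count condition occurrences per (year, condition) for the last `years` years.

    `records` is an iterable of (condition_name, onset_date) tuples.
    Assumption: the window includes `current_year` itself.
    """
    first_year = current_year - years + 1
    counts = Counter()
    for condition, onset in records:
        if first_year <= onset.year <= current_year:
            counts[(onset.year, condition)] += 1
    return counts


# Hypothetical sample rows, not taken from the real dataset.
sample = [
    ("Hypertension", date(2019, 3, 1)),
    ("Hypertension", date(2019, 7, 9)),
    ("Diabetes", date(2018, 1, 15)),
    ("Viral sinusitis", date(2015, 5, 2)),  # outside the 3-year window
]

counts = condition_counts_by_year(sample, current_year=2019)
for (year, condition), n in sorted(counts.items()):
    print(year, condition, n)
```

In BigQuery the same grouping would be expressed in SQL against the Condition table; the sketch only shows the shape of the result, one count per condition per year.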
Which patients have a hypertension or diabetes diagnosis, or both, and have received more than 7 medications?
Query patients from the Condition table and aggregate them by the required conditions. Join the results with the Patient and MedicationRequest tables. Count the total number of medication requests per patient and keep only patients with more than 7. The result is 272372 patients in total, which includes 213162 patients with Hypertension, 23472 with Diabetes and 35738 with both diagnoses. Run this query.
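As a rough illustration of the steps above, here is a Python sketch that classifies patients on hypothetical in-memory data. It does not reproduce the dataset's query or schema: the record layouts, the exact condition labels "Hypertension" and "Diabetes", and the function name `classify_patients` are assumptions, and real FHIR Condition resources identify diagnoses by coded terms rather than plain labels.

```python
from collections import Counter, defaultdict

# Assumed plain-text labels; real records would be matched on coded terms.
TARGETS = {"Hypertension": "hypertension", "Diabetes": "diabetes"}


def classify_patients(conditions, medication_requests, threshold=7):
    """Group patients with more than `threshold` medication requests by diagnosis.

    conditions: iterable of (patient_id, condition_label)
    medication_requests: iterable of patient_id, one entry per request
    """
    requests_per_patient = Counter(medication_requests)
    diagnoses = defaultdict(set)
    for patient_id, label in conditions:
        if label in TARGETS:
            diagnoses[patient_id].add(TARGETS[label])

    groups = Counter()
    for patient_id, found in diagnoses.items():
        if requests_per_patient[patient_id] <= threshold:
            continue  # 7 or fewer requests: excluded
        if len(found) == 2:
            groups["both"] += 1
        else:
            groups[next(iter(found))] += 1
    return groups


# Hypothetical sample data.
conditions = [
    ("p1", "Hypertension"),
    ("p2", "Diabetes"),
    ("p3", "Hypertension"), ("p3", "Diabetes"),
    ("p4", "Hypertension"),      # only 7 requests, excluded
    ("p5", "Viral sinusitis"),   # neither diagnosis, excluded
]
medication_requests = ["p1"] * 8 + ["p2"] * 8 + ["p3"] * 9 + ["p4"] * 7 + ["p5"] * 10

groups = classify_patients(conditions, medication_requests)
print(dict(groups))

# The three groups reported in the text add up to the reported total.
reported = {"hypertension": 213162, "diabetes": 23472, "both": 35738}
print(sum(reported.values()))
```

The three groups are mutually exclusive, which is why the reported counts (213162, 23472 and 35738) sum to the reported total of 272372.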
Data hosted within Synthea Generated Synthetic Data in FHIR has been generated by Synthea™.