Cloudwave: Cloud Computing for Clinical "BIG Data"
Cloudwave is a Web-based platform that combines an intuitive signal analysis interface with a Hadoop-based data processing module operating on clinical data stored in a “private cloud”. Cloudwave has been developed as part of the multi-center Prevention and Risk Identification of SUDEP Mortality (PRISM) project, funded by the National Institute of Neurological Disorders and Stroke (NINDS).
EEG data represent postsynaptic potentials from large groups of neurons; electrodes (either scalp or intracranial) record the voltage differences between brain regions. Patients are usually admitted to Epilepsy Monitoring Units (EMUs) to record multi-channel signal data, including EEG, heart rate, blood oxygen levels, and electrocardiogram (EKG), over a period of 5 days. These comprehensive multi-channel patient recordings (on the order of hundreds) generate very large datasets. For example, a 24-hour recording for a patient represents 8000 screen images, each containing signals from 100 channels. About 5-10 GB of data is generated for a single patient, and an EMU usually admits about 100-150 patients a year, which creates significant data management challenges similar to those of other “Big Data” applications in terms of efficient storage, visualization, and analysis. The increasing need for multi-center clinical research studies exacerbates this challenge by introducing the need to share, interoperate, and integrate signal data in real time.
About 964 patients have been processed in the CWRU-UH EMU since January 2011 after the start of the PRISM project, and about 116 of these patients have consented to participate in the PRISM project (figure 1A). An average of 321 MB of electrophysiological data is generated from recordings of a single patient per day, and about 1.6 GB of data over a typical 5-day admission period in the EMU. This has resulted in 9.5 TB of total signal data collected in the CWRU-UH EMU and about 4 TB of data collected from patients recruited for the PRISM project since 2011. The rate of data collection in the EMU is increasing every year; for example, the volume of data at the end of 2012 was 6 TB, but 9.5 TB of data had already been collected by May 2013 (figure 1B illustrates the growth in total data collected from all patients in the EMU and patients recruited for the PRISM project). Hence, there is an acute need to define efficient algorithms and develop an effective informatics platform to manage this electrophysiological big data.
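The per-admission and growth figures quoted above can be checked with a short back-of-the-envelope calculation. The sketch below is only illustrative: the function names are hypothetical, decimal units (1 GB = 1000 MB) are assumed, and the interval from the end of 2012 to May 2013 is taken as roughly five months.

```python
# Back-of-the-envelope check of the data volumes quoted in the text.
# Assumptions (not from the source): decimal units, ~5 months between
# the end of 2012 and May 2013, and the function names themselves.

MB_PER_PATIENT_DAY = 321   # average per patient per day, from the text
ADMISSION_DAYS = 5         # typical EMU admission, from the text


def admission_volume_gb(mb_per_day=MB_PER_PATIENT_DAY, days=ADMISSION_DAYS):
    """Data generated by one patient over one admission, in GB."""
    return mb_per_day * days / 1000


def growth_tb_per_month(start_tb, end_tb, months):
    """Average monthly growth of the archive between two measurements."""
    return (end_tb - start_tb) / months


per_admission_gb = admission_volume_gb()            # about 1.6 GB
monthly_growth_tb = growth_tb_per_month(6.0, 9.5, 5)

print(f"Per admission: {per_admission_gb:.2f} GB")
print(f"Growth, end of 2012 to May 2013: {monthly_growth_tb:.2f} TB/month")
```

Under these assumptions, 321 MB per day over five days comes to about 1.6 GB, which matches the figure in the text.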
Growth of Electrophysiological Signal Data
The PRISM project aims to recruit about 1200 patients from four participating EMUs at the Case Western Reserve University (CWRU) University Hospitals-Case Medical Center (UH-CMC), Ronald Reagan University of California Los Angeles (UCLA) Medical Center (RRUMC-Los Angeles), the National Hospital for Neurology and Neurosurgery (NHNN, London, UK), and Northwestern Memorial Hospital (NMH Chicago). Hence, the primary informatics challenge in the PRISM project is to give clinical researchers real-time access to patient data from different institutions in a secure collaborative environment. Cloudwave is part of this informatics infrastructure, with a specific focus on enabling researchers to seamlessly search, query, and visualize signal data annotated with clinical events for patient cohort identification.
Cloudwave is a high-performance signal analysis platform that integrates an intuitive Web interface (for use by clinicians and research staff members) with a Hadoop-based computation engine for distributed processing of large electrophysiological signal datasets. The figure below illustrates the high-level system architecture of Cloudwave.
Cloudwave Components and Workflow
The Cloudwave platform was developed using four design principles:
- Fast access to individual signals stored in a multimodal EEG EDF store;
- Ability to partition signals into meaningful “segments” based on seizure related events for easier quantitative analysis;
- Use of EDF file header record and seizure events as metadata for faster identification of appropriate patient cohorts; and
- Polygraph visualization of multiple signals on a single page for visual analysis by a human reader.
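The second design principle, partitioning a signal into event-based segments, can be illustrated with a minimal sketch. This is not Cloudwave's implementation: the `Event` and `Segment` types, the `segment_by_events` function, and the toy signal are all hypothetical, and events are assumed to carry start and end times in seconds relative to the start of the recording.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Event:
    """A clinical annotation (hypothetical structure)."""
    label: str
    start_s: float  # seconds from the start of the recording
    end_s: float


@dataclass
class Segment:
    """The slice of one signal covered by one event."""
    label: str
    start_s: float
    samples: List[float]


def segment_by_events(samples: Sequence[float],
                      sampling_rate_hz: float,
                      events: Sequence[Event]) -> List[Segment]:
    """Cut one signal into segments, one per annotated event."""
    segments = []
    for ev in events:
        # Convert event times to sample indices, clamped to the signal.
        first = max(0, int(ev.start_s * sampling_rate_hz))
        last = min(len(samples), int(ev.end_s * sampling_rate_hz))
        segments.append(Segment(ev.label, ev.start_s, list(samples[first:last])))
    return segments


# Toy example: 10 seconds of a single channel sampled at 2 Hz.
signal = [float(i) for i in range(20)]
events = [Event("seizure onset", 2.0, 4.0),
          Event("EEG suppression", 6.0, 9.5)]
segments = segment_by_events(signal, 2.0, events)
for seg in segments:
    print(seg.label, len(seg.samples))
```

Keeping the event label with each segment means a later analysis step can select segments by event type without rescanning the full recording.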
A set of use cases was defined and systematically documented to identify appropriate features to be implemented in Cloudwave, such as the ability to:
- Search for seizure event information in signal data, including the time of event occurrence or the time duration between start and end of an event. For example, occurrence of ‘Sign-of-Four’ lateralizing sign event, time duration between “onset of jittery phase and end of jittery phase”;
- List patients with Cardiac Arrhythmia who also have irregular heartbeat rates. This can be further classified as Bradycardia or Tachycardia, with markings on the EKG signal when the heartbeat rate is below or above a threshold, such as 60 beats per minute (BPM);
- Measure Heart Rate Variability (HRV) for selected patients, which is a physiological phenomenon of variation in the time interval between heartbeats;
- List patients with respiratory arrhythmia with markings on the respiration signal where the rate of respiration is above or below a threshold (e.g. 5/minute); and
- Provide EEG suppression information for patients, including time duration of “EEG Suppression” in patients after a seizure occurrence and time duration of “EEG suppression to return to baseline” event.
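The heart rate use cases above can be sketched as follows. This is a minimal illustration, not Cloudwave's method. It assumes that the intervals between successive heartbeats (RR intervals, in seconds) have already been extracted from the EKG signal. The 60 BPM lower threshold is the example given in the list, the 100 BPM upper threshold is an assumed conventional value, and SDNN (the standard deviation of RR intervals) is used as one common HRV measure, since the text does not say which measure is intended. All function names are hypothetical.

```python
from statistics import stdev
from typing import List, Sequence

BRADYCARDIA_BPM = 60.0    # example threshold from the use-case list
TACHYCARDIA_BPM = 100.0   # assumed upper threshold, not from the source


def heart_rates_bpm(rr_intervals_s: Sequence[float]) -> List[float]:
    """Instantaneous heart rate for each beat-to-beat (RR) interval."""
    return [60.0 / rr for rr in rr_intervals_s]


def classify_rate(bpm: float,
                  low: float = BRADYCARDIA_BPM,
                  high: float = TACHYCARDIA_BPM) -> str:
    """Label a heart rate relative to the two thresholds."""
    if bpm < low:
        return "bradycardia"
    if bpm > high:
        return "tachycardia"
    return "normal"


def sdnn_ms(rr_intervals_s: Sequence[float]) -> float:
    """SDNN: sample standard deviation of RR intervals, in milliseconds."""
    return stdev([rr * 1000.0 for rr in rr_intervals_s])


# Toy RR intervals in seconds (hypothetical values).
rr = [1.2, 1.0, 0.8, 0.5]
labels = [classify_rate(bpm) for bpm in heart_rates_bpm(rr)]
print(labels)
print(f"SDNN: {sdnn_ms(rr):.1f} ms")
```

The labels mark where on the EKG signal the rate crosses a threshold. The same thresholding pattern applies to the respiration use case with a different signal and threshold.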
- Sahoo SS, Jayapandian C, Garg G, Kaffashi F, Chung S, Bozorgi A, Chen CH, Loparo K, Lhatoo SD, Zhang GQ. Heartbeats in the Cloud: Distributed Analysis of Electrophysiological “Big Data” using Cloud Computing for Epilepsy Clinical Research. Journal of the American Medical Informatics Association (JAMIA), special issue on Big Data in Healthcare and Biomedical Research, 2013. http://www.ncbi.nlm.nih.gov/pubmed/24326538?dopt=Abstract
- Jayapandian CP, Chen CH, Bozorgi A, Lhatoo SD, Zhang GQ, Sahoo SS. Cloudwave: Distributed Processing of “Big Data” from Electrophysiological Recordings for Epilepsy Clinical Research Using Hadoop. American Medical Informatics Association (AMIA) Annual Symposium, 2013. pp. 691-700
- Jayapandian CP, Chen CH, Bozorgi A, Lhatoo SD, Zhang GQ, Sahoo SS. Electrophysiological Signal Analysis and Visualization using Cloudwave for Epilepsy Clinical Research. The 14th World Congress on Medical and Health Informatics (MedInfo), 2013. http://www.ncbi.nlm.nih.gov/pubmed/23920671
Electrophysiological Signal Analysis and Visualization using Cloudwave for Epilepsy Clinical Research - presented at the 14th World Congress on Medical and Health Informatics, MedInfo 2013