

Cloudwave: Cloud Computing for Clinical "Big Data"

Cloudwave is a Web-based platform that combines an intuitive signal analysis interface with a Hadoop-based data processing module operating on clinical data stored in a “private cloud”. Cloudwave has been developed as part of the multi-center Prevention and Risk Identification of SUDEP Mortality (PRISM) project, funded by the National Institute of Neurological Disorders and Stroke (NINDS).

EEG data represent postsynaptic potentials from large groups of neurons; electrodes (either scalp or intracranial) record the voltage differences between different brain regions. Patients are usually admitted to Epilepsy Monitoring Units (EMUs) to record multi-channel signal data, including EEG, heart rate, blood oxygen levels, and electrocardiogram (EKG), over a period of 5 days. These comprehensive multi-channel patient recordings (on the order of hundreds of channels) generate very large datasets. For example, a 24-hour recording for a patient represents 8000 screen images, each containing signals from 100 channels. About 5-10 GB of data is generated for a single patient, and an EMU usually admits about 100-150 patients a year, which creates significant data management challenges similar to those of other “Big Data” applications in terms of efficient storage, visualization, and analysis. The increasing need for multi-center clinical research studies exacerbates this challenge by introducing the need to share, interoperate, and integrate signal data in real time.

About 964 patients have been processed in the CWRU-UH EMU since January 2011, after the start of the PRISM project, and about 116 of these patients have consented to participate in the PRISM project (figure 1A). An average of 321 MB of electrophysiological data is generated from recordings of a single patient per day, which amounts to about 1.6 GB of data over a typical 5-day admission period in the EMU. This has resulted in 9.5 TB of total signal data collected in the CWRU-UH EMU and about 4 TB of data collected from patients recruited for the PRISM project since 2011. The rate of data collection in the EMU is increasing every year. For example, the volume of data at the end of 2012 was 6 TB, but 9.5 TB of data had already been collected by May 2013 (figure 1B illustrates the growth in total data collected from all patients in the EMU and from patients recruited for the PRISM project). Hence, there is an acute need to define efficient algorithms and develop an effective informatics platform to manage this electrophysiological big data.

Growth of Electrophysiological Signal Data
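
The per-admission volume quoted above follows directly from the daily average. The short check below uses only the numbers given in the text; the use of decimal units (1 GB = 1000 MB) is an assumption, since the source does not say which convention it follows.

```python
# Back-of-the-envelope check of the data volumes quoted in the text.
# Assumption: decimal units (1 GB = 1000 MB); the source does not specify.
MB_PER_PATIENT_DAY = 321   # average daily volume per patient (from the text)
ADMISSION_DAYS = 5         # typical EMU admission length (from the text)


def admission_volume_gb(mb_per_day: float, days: int) -> float:
    """Return the data volume of one admission in GB."""
    return mb_per_day * days / 1000


per_admission_gb = admission_volume_gb(MB_PER_PATIENT_DAY, ADMISSION_DAYS)
print(f"{per_admission_gb:.3f} GB per admission")  # 1.605, i.e. about 1.6 GB

# Growth between the two snapshots mentioned in the text.
growth_tb = 9.5 - 6.0  # end of 2012 -> May 2013
print(f"{growth_tb:.1f} TB added between the two snapshots")
```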

The PRISM project aims to recruit about 1200 patients from four participating EMUs: the Case Western Reserve University (CWRU) University Hospitals-Case Medical Center (UH-CMC), the Ronald Reagan University of California Los Angeles (UCLA) Medical Center (RRUMC-Los Angeles), the National Hospital for Neurology and Neurosurgery (NHNN, London, UK), and Northwestern Memorial Hospital (NMH Chicago). Hence, the primary informatics challenge in the PRISM project is to give clinical researchers real-time access to patient data from different institutions in a secure collaborative environment. Cloudwave is part of this informatics infrastructure, with a specific focus on enabling researchers to seamlessly search, query, and visualize signal data annotated with clinical events for patient cohort identification.


Cloudwave is a high-performance integrated signal analysis platform that couples an intuitive Web interface (for use by clinicians and research staff members) with a Hadoop-based computation engine for distributed processing of large electrophysiological signal datasets. The figure below illustrates the high-level system architecture of Cloudwave.

Cloudwave Components and Workflow
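
To make the idea of distributed signal processing more concrete, the sketch below imitates the map and reduce steps of a MapReduce job in plain Python. It is not Cloudwave's Hadoop code: the function names, the channel labels, and the choice of per-channel summary statistics are all hypothetical and serve only to show how independent chunks of a recording can be summarized separately and then merged.

```python
from collections import defaultdict
from functools import reduce

# Illustrative stand-in for a MapReduce job; all names here are hypothetical.


def map_chunk(channel, samples):
    """Map step: summarize one non-empty chunk as (channel, (count, total, lo, hi))."""
    return channel, (len(samples), sum(samples), min(samples), max(samples))


def combine(a, b):
    """Reduce step: merge two partial summaries of the same channel."""
    return a[0] + b[0], a[1] + b[1], min(a[2], b[2]), max(a[3], b[3])


def summarize(chunks):
    """Group mapped records by channel (the shuffle), then reduce each group."""
    grouped = defaultdict(list)
    for channel, samples in chunks:
        key, value = map_chunk(channel, samples)
        grouped[key].append(value)
    return {ch: reduce(combine, parts) for ch, parts in grouped.items()}


# Synthetic chunks; in a real cluster each chunk could live on a different node.
chunks = [
    ("EKG", [1.0, 2.0, 3.0]),
    ("EEG-C3", [-4.0, 4.0]),
    ("EKG", [5.0]),
]
stats = summarize(chunks)
for channel, (count, total, lo, hi) in sorted(stats.items()):
    print(channel, "n =", count, "mean =", total / count, "range =", (lo, hi))
```

Because `combine` is associative, partial summaries can be merged in any order, which is the property that lets this kind of computation be spread across many machines.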

The Cloudwave platform was implemented using an agile software engineering approach, with close and frequent interactions with users for rapid prototype development and feedback. Cloudwave uses the Model View Controller (MVC) architecture with the Ruby on Rails technology stack and an open source JavaScript charting library.

System Overview

The Cloudwave platform was developed using four design principles:

  1. Fast access to individual signals stored in a multimodal EEG EDF store;
  2. Ability to partition signals into meaningful “segments” based on seizure related events for easier quantitative analysis;
  3. Use of EDF file header record and seizure events as metadata for faster identification of appropriate patient cohorts; and
  4. Polygraph visualization of multiple signals on a single page for visual analysis by a human reader.
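
As an illustration of the third principle, the sketch below reads the fixed 256-byte portion of an EDF header, which is the kind of metadata that can be indexed without touching the signal data itself. The field layout follows the published EDF specification; the function names and the synthetic header values are hypothetical, and this is not Cloudwave's own parser.

```python
# Fixed-length part of an EDF header: (field name, width in bytes), 256 bytes total.
EDF_FIXED_FIELDS = [
    ("version", 8),
    ("patient_id", 80),
    ("recording_id", 80),
    ("start_date", 8),         # dd.mm.yy
    ("start_time", 8),         # hh.mm.ss
    ("header_bytes", 8),       # total header size, including per-signal fields
    ("reserved", 44),
    ("num_data_records", 8),
    ("record_duration_s", 8),  # duration of one data record, in seconds
    ("num_signals", 4),
]


def parse_edf_fixed_header(raw: bytes) -> dict:
    """Parse the first 256 bytes of an EDF file into a dict of metadata."""
    if len(raw) < 256:
        raise ValueError("an EDF fixed header needs at least 256 bytes")
    fields, offset = {}, 0
    for name, width in EDF_FIXED_FIELDS:
        fields[name] = raw[offset:offset + width].decode("ascii").strip()
        offset += width
    # Numeric fields are stored as space-padded ASCII text.
    for name in ("header_bytes", "num_data_records", "num_signals"):
        fields[name] = int(fields[name])
    fields["record_duration_s"] = float(fields["record_duration_s"])
    return fields


def make_example_header() -> bytes:
    """Build a synthetic header for demonstration (values are made up)."""
    values = ["0", "X X X X", "Startdate X X X X", "01.01.13", "00.00.00",
              "768", "", "10", "30", "2"]
    return b"".join(
        value.ljust(width).encode("ascii")
        for value, (_, width) in zip(values, EDF_FIXED_FIELDS)
    )


header = parse_edf_fixed_header(make_example_header())
total_seconds = header["num_data_records"] * header["record_duration_s"]
print(header["num_signals"], "signals,", total_seconds, "seconds of recording")
```

In practice the bytes would come from `open(path, "rb").read(256)`; the synthetic header only keeps the example self-contained.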

A set of use cases was defined and systematically documented to identify appropriate features to be implemented in Cloudwave, such as the ability to:

  1. Search for seizure event information in signal data, including the time of event occurrence or the time duration between start and end of an event. For example, occurrence of ‘Sign-of-Four’ lateralizing sign event, time duration between “onset of jittery phase and end of jittery phase”;
  2. List patients with Cardiac Arrhythmia who also have irregular heartbeat rates. This can be further classified as Bradycardia or Tachycardia, with markings on EKG signal when heartbeat rate is below or above a threshold, such as 60 beats per minute (BPM);
  3. Measure Heart Rate Variability (HRV) for selected patients, which is a physiological phenomenon of variation in the time interval between heartbeats;
  4. List patients with respiratory arrhythmia with markings on the respiration signal where the rate of respiration is above or below a threshold (e.g. 5/minute); and
  5. Provide EEG suppression information for patients, including time duration of “EEG Suppression” in patients after a seizure occurrence and time duration of “EEG suppression to return to baseline” event.
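
The heart-rate use cases above can be sketched in a few lines, assuming the intervals between successive heartbeats (RR intervals, in seconds) have already been extracted from the EKG signal. The 60 BPM lower threshold comes from the text; the 100 BPM upper threshold, the function names, and the choice of SDNN and RMSSD as HRV measures are assumptions made for illustration, not a description of Cloudwave's algorithms.

```python
from math import sqrt
from statistics import stdev

BRADYCARDIA_BPM = 60    # lower threshold mentioned in the text
TACHYCARDIA_BPM = 100   # assumption: a commonly used upper threshold, not from the text


def mark_beats(rr_intervals_s, low=BRADYCARDIA_BPM, high=TACHYCARDIA_BPM):
    """Label each RR interval by its instantaneous heart rate."""
    marks = []
    for rr in rr_intervals_s:
        bpm = 60.0 / rr
        if bpm < low:
            marks.append("bradycardia")
        elif bpm > high:
            marks.append("tachycardia")
        else:
            marks.append("normal")
    return marks


def sdnn_ms(rr_intervals_s):
    """HRV as the sample standard deviation of RR intervals, in milliseconds."""
    return stdev(rr_intervals_s) * 1000


def rmssd_ms(rr_intervals_s):
    """HRV as the root mean square of successive RR differences, in milliseconds."""
    diffs = [b - a for a, b in zip(rr_intervals_s, rr_intervals_s[1:])]
    return sqrt(sum(d * d for d in diffs) / len(diffs)) * 1000


# Synthetic RR intervals in seconds (made-up values for demonstration).
rr = [1.2, 1.0, 0.8, 0.5]
print(mark_beats(rr))
print(f"SDNN = {sdnn_ms(rr):.1f} ms, RMSSD = {rmssd_ms(rr):.1f} ms")
```

Both HRV functions need at least two intervals. The same thresholding pattern would apply to the respiration-rate use case with a different signal and threshold.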


  1. Sahoo SS, Jayapandian C, Garg G, Kaffashi F, Chung S, Bozorgi A, Chen CH, Loparo K, Lhatoo SD, Zhang GQ. Heartbeats in the Cloud: Distributed Analysis of Electrophysiological “Big Data” using Cloud Computing for Epilepsy Clinical Research. Journal of the American Medical Informatics Association (JAMIA), special issue on Big Data in Healthcare and Biomedical Research, 2013. http://www.ncbi.nlm.nih.gov/pubmed/24326538?dopt=Abstract
  2. Jayapandian CP, Chen CH, Bozorgi A, Lhatoo SD, Zhang GQ, Sahoo SS. Cloudwave: Distributed Processing of “Big Data” from Electrophysiological Recordings for Epilepsy Clinical Research Using Hadoop. American Medical Informatics Association (AMIA) Annual Symposium, 2013. pp. 691-700
  3. Jayapandian CP, Chen CH, Bozorgi A, Lhatoo SD, Zhang GQ, Sahoo SS. Electrophysiological Signal Analysis and Visualization using Cloudwave for Epilepsy Clinical Research. The 14th World Congress on Medical and Health Informatics (MedInfo), 2013. http://www.ncbi.nlm.nih.gov/pubmed/23920671


    Electrophysiological Signal Analysis and Visualization using Cloudwave for Epilepsy Clinical Research - presented at the 14th World Congress on Medical and Health Informatics, MedInfo 2013