Cyberheart-Diagnostics Software Package for Automated Electrocardiogram Analysis Based on Machine Learning Techniques

Moskalenko V.A., Nikolskiy A.V., Zolotykh N.Yu., Kozlov A.A., Kosonogov K.A., Kalyakulina A.I., Yusipov I.I., Levanov V.M.

Key words: electrocardiogram; automated analysis; ECG database; machine learning techniques.

The aim of the study was to develop the Cyberheart-Diagnostics software module, an automated electrocardiogram analysis system being part of the Cyberheart software and hardware complex, and to select machine learning techniques for testing the system based on the comparative analysis of their capabilities.

Materials and Methods. The software package was developed using machine learning techniques trained on a large sample of labeled data, i.e. an ECG database with known diagnostic conclusions: support-vector machines, decision trees, artificial neural networks, linear and quadratic discriminant analysis, the random subspace method, AdaBoost, random forest, and logistic regression (McCulloch–Pitts neuron model). For comparative analysis and evaluation of the results, the Cyberheart-Diagnostics software was tested on open international ECG databases (Arrhythmia Data Set, PhysioNet PTBDB, PhysioNet Competition 2017) as well as on our own database comprising 1652 records of standard 12-lead resting ECG. The ECG records were interpreted by expert physicians, who then formed structured medical conclusions that were taken as the reference.

Results. In different classes of attributes, the diagnostic accuracy of the Cyberheart-Diagnostics software appeared to be 83.8 to 94.5% as compared to the conclusions of expert doctors — 66.3 to 95.1%. Thus, the developed software is comparable with the world analogues in quality of electrocardiogram analysis.

Introduction

Advances in automated data processing and healthcare information technology, along with the use of portable medical equipment, open up new opportunities for improving early diagnosis of circulatory diseases and remote patient monitoring [1, 2]. Electrocardiography (ECG) remains the most common method of instrumental diagnostics [3]. One of the challenging directions in this field is the development of an intelligent electrocardiograph, i.e. an automated system that decodes electrocardiogram signals and issues a medical report as close as possible to that of a doctor [2, 4]. Machine learning techniques trained on representative ECG samples are used to achieve this goal.

Mobile software products based on automated ECG analysis are now appearing in many countries. At the same time, the degree of confidence in automated ECG diagnosis without human intervention has remained a controversial issue for several decades. Therefore, an algorithm that provides the necessary accuracy of the diagnostic conclusion is likely to be the crucial factor in choosing a particular system.

Researchers at the National Research Lobachevsky State University of Nizhny Novgorod have created the Cyberheart software and hardware complex for collection, storage, and automated analysis of heterogeneous medical data. Cyberheart consists of medical probes that monitor the performance of the cardiovascular system and a software module, Cyberheart-Diagnostics. The purpose of this module is the automated analysis of ECG recordings of various durations and the generation of a pre-hospital diagnostic conclusion.

To improve diagnostic accuracy, the authors have developed and tested an algorithm that includes training and testing of the module using open international databases and structured medical conclusions.

The aim of the study was to develop the Cyberheart-Diagnostics software module, an automated electrocardiogram analysis system being part of the Cyberheart software and hardware complex, and to select machine learning techniques for testing this system based on the comparative analysis of their capabilities.

Materials and Methods

The authors have developed a basic version of ECG analysis software, Cyberheart-Diagnostics, which allows subsequent training. The software interface is written in C#, while the mathematical methods are implemented in Python using the matrix computing library NumPy (www.numpy.org), the wavelet analysis library PyWavelets (github.com/PyWavelets), and the machine learning library Scikit-learn (scikit-learn.org/stable).

The authors have created their own database (Cardiobase) comprising 1652 records of a standard 12-lead resting ECG in digital EDF format. ECG was obtained in adult patients aged 17–80 years, including 743 males (45%) and 909 females (55%). ECG was recorded from patients seeking medical care of cardiologists, arrhythmologists, cardiac surgeons in out-patient departments of Nizhny Novgorod and from those who were hospitalized in cardiology departments and cardiovascular centers.

All 1652 ECG records were interpreted by expert physicians (cardiologists and physicians of functional diagnostics), who then formed structured medical conclusions. Subsequently, the ECG records were analyzed automatically using key point detection (KPD), segmented and automatically described in the form of a pre-hospital report with the software developed by the authors according to the classical criteria for ECG analysis [5].

The KPD-based analysis included the following stages:

ECG signal pre-processing: filtering (noise suppression) and detection of the baseline (isoline) [1, 3];

key point detection: locating the onset, peak, and end of the QRS complex and of the P and T waves, and determining their morphology (Figure 1);

automated report generation.
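The three stages can be sketched in a few lines of Python. This is a minimal illustration and not the authors' wavelet-based KPD algorithm [5]: the moving-median baseline estimate, the threshold-based R-peak detector, the synthetic signal, and all function names and parameter values are assumptions chosen for brevity.

```python
import math
from statistics import median

def remove_baseline(signal, window=51):
    """Subtract a moving-median estimate of the isoline (assumed method)."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(signal[i] - median(signal[lo:hi]))
    return out

def detect_r_peaks(signal, threshold, refractory):
    """Return indices of local maxima above `threshold`, at least `refractory` samples apart."""
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] > threshold:
            if not peaks or i - peaks[-1] >= refractory:
                peaks.append(i)
    return peaks

def make_report(peaks, fs):
    """Summarize the detected beats as a small dictionary."""
    rr = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    mean_rr = sum(rr) / len(rr)
    return {"beats": len(peaks), "mean_rr_s": mean_rr, "heart_rate_bpm": 60.0 / mean_rr}

# Synthetic test signal: one narrow spike per second on a slow baseline drift.
fs = 100  # sampling rate in Hz (assumed)
ecg = [0.5 * math.sin(2 * math.pi * 0.1 * t / fs) for t in range(10 * fs)]
for beat in range(10):
    ecg[beat * fs + 50] += 1.0

clean = remove_baseline(ecg)                                    # stage 1: pre-processing
peaks = detect_r_peaks(clean, threshold=0.5, refractory=fs // 5)  # stage 2: key points
report = make_report(peaks, fs)                                 # stage 3: report
```

A real pipeline would also locate the onsets and ends of the P, QRS, and T waves; only the R peak is detected here to keep the sketch short.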

Figure 1. PQRST complex segmentation

These data served as the basis for calculating standard signal characteristics (see, for example, [3]). Standard functions were used to calculate numerical characteristics (average duration and height of ECG signal complexes, their standard deviations, etc.). As a result, 38 attributes were obtained to describe each ECG lead.
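As an illustration of how such numerical characteristics can be derived from detected key points, the sketch below computes the mean and standard deviation of QRS duration, RR interval, and R-wave amplitude for one lead. The key-point layout, the sampling rate, and all values are invented for the example, and only a small subset of attributes is shown rather than the full set of 38.

```python
from statistics import mean, pstdev

# Hypothetical key points for one lead: QRS onset, R peak, and QRS end in samples,
# plus the R-peak amplitude in millivolts. Values are invented for illustration.
fs = 500  # sampling rate in Hz (assumed)
beats = [
    {"qrs_on": 100, "r_peak": 120, "qrs_off": 150, "r_amp_mv": 1.10},
    {"qrs_on": 520, "r_peak": 541, "qrs_off": 568, "r_amp_mv": 1.05},
    {"qrs_on": 930, "r_peak": 950, "qrs_off": 982, "r_amp_mv": 1.15},
]

def lead_attributes(beats, fs):
    """Return a few summary attributes of one lead (a subset, not the full set of 38)."""
    qrs_ms = [1000.0 * (b["qrs_off"] - b["qrs_on"]) / fs for b in beats]
    rr_ms = [1000.0 * (b2["r_peak"] - b1["r_peak"]) / fs
             for b1, b2 in zip(beats, beats[1:])]
    amps = [b["r_amp_mv"] for b in beats]
    return {
        "qrs_mean_ms": mean(qrs_ms), "qrs_sd_ms": pstdev(qrs_ms),
        "rr_mean_ms": mean(rr_ms), "rr_sd_ms": pstdev(rr_ms),
        "r_amp_mean_mv": mean(amps), "r_amp_sd_mv": pstdev(amps),
    }

attrs = lead_attributes(beats, fs)
```

Repeating this for every lead and concatenating the results gives one fixed-length attribute vector per ECG, which is the form the classifiers expect.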

The Cyberheart-Diagnostics software was trained using 1242 out of 1652 ECGs available in the Cardiobase and formalized medical conclusions. The most common machine learning techniques were used for this purpose: the support-vector machine, decision tree, linear and quadratic discriminant analysis [6], the random subspace method [7], AdaBoost [8], random forest [9], logistic regression (McCulloch–Pitts neuron model) with attribute preprocessing through the Batch Normalization layer [10].

In these techniques, the algorithms are automatically built using a large sample of labeled data, i.e. ECG database with known diagnostic conclusions, while the algorithm of establishing a diagnosis (decisive function) is not programmed explicitly. The model is “set up” according to the training sample data, a set of characteristics and attributes of the patient’s ECG with a known diagnostic conclusion.
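To make the idea of a learned decision function concrete, the sketch below fits a logistic regression (a single neuron with a sigmoid output) by gradient descent on invented two-attribute data. It is an illustration only, not the authors' implementation: the data, function names, learning rate, and number of epochs are assumptions, and plain standardization stands in for the Batch Normalization layer [10].

```python
import math

def standardize(rows):
    """Scale each attribute to zero mean and unit variance (stand-in for normalization)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(w, b, x):
    """The decision function: class 1 if the estimated probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Invented toy attributes: [mean RR interval (ms), RR standard deviation (ms)].
# Label 1 marks an irregular rhythm, label 0 a regular one.
raw = [[850, 20], [900, 25], [800, 30], [870, 15],
       [700, 120], [650, 150], [720, 110], [680, 140]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

X = standardize(raw)
w, b = fit_logistic(X, labels)
predictions = [predict(w, b, x) for x in X]
```

No diagnostic rule is written by hand here: the weights are derived entirely from the labeled examples, which is the property the paragraph above describes.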

The automated ECG analysis software was trained using machine learning techniques on attributes obtained with KPD (Figure 2). More information about the KPD algorithm is provided in work [5].

Figure 2. ECG (12 leads) of patient U., 69 years: (a) initial; (b) with segmentation

To test the Cyberheart-Diagnostics software with the above machine learning techniques, experimental ECG analysis was carried out using the largest open ECG databases processed by medical experts: Arrhythmia Data Set (http://archive.ics.uci.edu/ml/datasets/arrhythmia), PhysioNet PTBDB (https://www.physionet.org/physiobank/database/ptbdb), PhysioNet Competition 2017 (https://physionet.org/challenge/2017). The diagnostic accuracy with respect to the main ECG attributes and classes was taken into account when analyzing the results.

The Arrhythmia Data Set comprises 452 records of 12-lead electrocardiograms. Each ECG is described by 279 attributes and belongs to one of 16 classes.

The PhysioNet PTBDB database consists of 549 ECG records from 290 patients. Each patient belongs to one of nine classes.

The PhysioNet Competition 2017 database consists of 8528 short single-channel ECG records (about 9 to 60 s each), each assigned to one of four classes.

While developing the Cyberheart-Diagnostics software, the authors created their own database (Cardiobase) of 1652 ECG records divided into 9 classes; 1242 of these records were used for training (Table 1).

Table 1. Cardiobase structure according to classes of diagnostic conclusions

The Cardiobase classes were developed on the basis of classic clinical and electrophysiological approaches to ECG findings (see Table 1). The following groups of findings were the most frequently presented: normal ECG, sinus arrhythmia, hypertrophy of the left heart, atrial fibrillation, ischemic changes.

The findings obtained by automated analysis of the 410 Cardiobase ECGs not used for training (1652 − 1242 = 410) were compared with the medical expert reports, which were nominally considered as the reference. Figure 3 shows the ECG report interface.

Figure 3. Example of ECG report interface

The findings were comprehensively evaluated according to attribute classes. Fivefold cross-validation was carried out, with the ROC-AUC metric used to assess diagnostic quality [11, 12].
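The evaluation procedure can be sketched as follows. The code is a minimal pure-Python illustration under stated assumptions: the stratified fold assignment, the toy data, and the trivial midpoint "model" are invented for the example, and ROC-AUC is computed as the share of positive/negative pairs ranked correctly.

```python
import random

def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k=5, seed=0):
    """Split indices into k folds while keeping the class proportions in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def cross_validated_auc(X, y, fit, score, k=5):
    """Train on k-1 folds, score the held-out fold, and average the k AUC values."""
    aucs = []
    for fold in stratified_folds(y, k):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        aucs.append(roc_auc([y[i] for i in fold], [score(model, X[i]) for i in fold]))
    return sum(aucs) / k

# Toy one-attribute data and a trivial "model": the midpoint between the class means.
X = [[v] for v in (1, 2, 3, 4, 5, 2.5, 3.5, 1.5, 4.5, 0.5,
                   6, 7, 8, 9, 10, 6.5, 7.5, 8.5, 9.5, 5.5)]
y = [0] * 10 + [1] * 10

def fit(X_train, y_train):
    m0 = sum(x[0] for x, t in zip(X_train, y_train) if t == 0) / y_train.count(0)
    m1 = sum(x[0] for x, t in zip(X_train, y_train) if t == 1) / y_train.count(1)
    return (m0 + m1) / 2

def score(model, x):
    return x[0] - model

mean_auc = cross_validated_auc(X, y, fit, score)
```

In Scikit-learn, a comparable procedure is available through `cross_val_score` with `cv=5` and `scoring="roc_auc"`; whether the authors used that exact call is not stated in the text.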

Results and Discussion

In accordance with the objectives of the investigation, sequential testing of the developed software was carried out using machine learning techniques (algorithms) and available international databases.

Arrhythmia database. Testing experiments were carried out on a sample of objects (observations) belonging to the most representative classes: class 1 (“Norm”), class 2 (“Ischemic changes”), and class 10 (“Right bundle branch block”), a total of 339 records. The software was tested using the following methods: support-vector machine, random subspace, nearest neighbors, decision tree, neural networks, and linear and quadratic discriminant analysis. Fivefold cross-validation was carried out, with the ROC-AUC metric used for quality assessment. Table 2 shows the parameters of the algorithms that yielded the best results.

Table 2. Results of the experiment with preliminary selection of attributes (%)

An experiment on three-class classification (classes 1, 2, and 10) without preliminary selection of attributes was also carried out. The following machine learning techniques were used: support-vector machine, logistic regression, AdaBoost, random forest. The data were randomly divided into training (70%) and test (30%) samples. This division was carried out 100 times. The achieved accuracy of the algorithms is presented in Table 3.
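A repeated random split of this kind can be sketched as below. The code is an assumption-laden illustration: the toy three-class data and the nearest-centroid classifier are invented stand-ins for the ECG attributes and for the classifiers named above, and the function names are hypothetical.

```python
import random

def repeated_holdout(X, y, fit, predict, test_share=0.3, repeats=100, seed=0):
    """Average test accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    n_test = round(test_share * len(y))
    accuracies = []
    for _ in range(repeats):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        accuracies.append(hits / n_test)
    return sum(accuracies) / repeats, accuracies

# Toy three-class data: one attribute, classes far apart (invented for illustration).
X = [[c * 100 + d] for c in range(3) for d in range(10)]
y = [c for c in range(3) for _ in range(10)]

def fit(X_train, y_train):
    """Nearest-centroid 'training': store the mean attribute value of each class."""
    return {c: sum(x[0] for x, t in zip(X_train, y_train) if t == c) / y_train.count(c)
            for c in set(y_train)}

def predict(centroids, x):
    return min(centroids, key=lambda c: abs(centroids[c] - x[0]))

mean_acc, accs = repeated_holdout(X, y, fit, predict)
```

Averaging over many splits reduces the dependence of the reported accuracy on any single, possibly lucky or unlucky, partition of a small data set.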

Table 3. Three-class classification accuracy obtained on the test sample

Testing on PhysioNet PTBDB database. The task of binary classification between classes 1 (“Myocardial infarction”) and 9 (“Healthy”) was considered. The other classes were under-represented, which made it difficult to identify patterns useful for diagnosis. Logistic regression (McCulloch–Pitts neuron model) with attribute preprocessing through the Batch Normalization layer yielded the best result: an accuracy of 86.1% on the balanced sample.

Testing on PhysioNet Competition 2017 database. A series of binary classification tasks was set; the results of the experiments are presented in Table 4.

Table 4. Results of the experiments using PhysioNet Competition 2017 database

The F1-score (the harmonic mean of precision and sensitivity) was 0.81. For comparison, the best F1-scores reported in PhysioNet Competition 2017 equaled 0.83. Thus, the algorithms implemented by the authors are comparable in diagnostic accuracy with the world analogues.
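For reference, the F1-score as it is usually defined combines precision (the share of positive calls that are correct) and recall, i.e. sensitivity (the share of true cases that are found). The short sketch below computes it from confusion-matrix counts; the counts are invented for illustration and are not taken from Table 4.

```python
def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall (sensitivity)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def f1_from_counts(tp, fp, fn):
    """Equivalent closed form: 2*TP / (2*TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)

# Invented counts for one binary task: precision = recall = 0.8.
tp, fp, fn = 80, 20, 20
f1 = f1_score(tp, fp, fn)
```

Because true negatives do not enter the formula, F1 is less flattering than accuracy when the class of interest is rare, which is one reason it is used for arrhythmia detection benchmarks.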

Testing on the created Cardiobase. The results of testing the software on the created Cardiobase are presented in Table 5.

Table 5. Results of testing the Cyberheart-Diagnostics software on the created Cardiobase (n=410)

Compared with the conclusions of diagnosticians, the sensitivity of the diagnostic conclusions made by the Cyberheart-Diagnostics software was 66.7–100.0% across classes, the specificity was 62.9–99.0%, and the accuracy was 62.9–95.1%.
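The three per-class measures follow directly from the counts of true and false positives and negatives obtained by comparing the software output with the reference conclusions. A minimal sketch is given below; the counts are invented for a hypothetical class on a 410-record test set and are not taken from Table 5.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy from the four confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases that were detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly ruled out
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Invented counts for one class (45 + 20 + 340 + 5 = 410 records).
m = diagnostic_metrics(tp=45, fp=20, tn=340, fn=5)
```

With these counts the sensitivity is 45/50, the specificity is 340/360, and the accuracy is 385/410.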

Conclusion

Testing of the developed Cyberheart-Diagnostics software with the main machine learning techniques showed that the accuracy of its diagnostic conclusions is high when compared with the conclusions of diagnosticians, which were conventionally taken as “ideal”.

The data obtained by the authors as a result of automated ECG decoding with the presented software and the authors’ own Cardiobase materials were compared with the data from three large publicly available ECG databases. Despite the fact that these databases had differences in the duration of cardiosignal recording, the number of leads, sets of classes and other criteria, diagnostic accuracy of the algorithm developed by the authors in the corresponding classes of attributes amounted to 62.9–95.1%.

Study funding. The study was supported by the Ministry of Education and Science of the Russian Federation (contract No.02.G25.31.0157 of 01.12.2015).

Conflict of interests. The authors have no conflict of interests to disclose.

References

1. Yurovskiy A.Yu., Sukhov S.S. Distant analysis of ECG and computerized electrocardiography – modern alternatives to classic “paper” solutions. Prakticheskaya meditsina 2017; 2: 14–17.
2. Strutynskiy A.V. Elektrokardiogramma: analiz i interpretatsiya [Electrocardiogram: analysis and interpretation]. Moscow: MEDpress-inform; 2017; 224 p.
3. Vorobiov L.V. ECG analysis of cardiac activity of a healthy person. Mezhdunarodnyy zhurnal prikladnykh i fundamental’nykh issledovaniy 2016; 10: 549–553.
4. Drozdov D.V., Levanov V.M. Automatic ECG analysis: problems and prospects. Zdravookhranenie i meditsinskaya tekhnika 2004; 1: 10.
5. Kalyakulina A.I., Yusipov I.I., Moskalenko V.A., Nikolskiy A.V., Kozlov A.A., Zolotykh N.Yu., Ivanchenko M.V. Finding morphology points of electrocardiographic signal waves using wavelet analysis. Izvestiya Vuzov. Radiofizika 2018; 61(8): 773–789.
6. Duda R.O., Hart P.E., Stork D.G. Pattern classification. Wiley-Interscience; 2000.
7. Tin Kam Ho. The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 1998; 20(8): 832–844, https://doi.org/10.1109/34.709601.
8. Freund Y., Schapire R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 1997; 55(1): 119–139, https://doi.org/10.1006/jcss.1997.1504.
9. Breiman L. Random forests. Machine Learning 2001; 45(1): 5–32.
10. Ioffe S., Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. 2015. URL: https://arxiv.org/pdf/1502.03167.pdf.
11. Brown C.D., Davis H.T. Receiver operating characteristics curves and related decision measures: a tutorial. Chemometr Intell Lab Syst 2006; 80(1): 24–38, https://doi.org/10.1016/j.chemolab.2005.05.004.
12. Petrov V., Lebedev S., Pirova A., Vasilyev E., Nikolskiy A., Turlapov V., Meyerov I., Osipov G. CardioModel – new software for cardiac electrophysiology simulation. In: Voevodin V., Sobolev S. (editors). Supercomputing. RuSCDays 2018. Communications in computer and information science. Vol. 965. Springer, Cham; 2018; p. 195–207, https://doi.org/10.1007/978-3-030-05807-4_17.