
12SL Statement of Validation and Accuracy

for the Physician's Guide

416791-003

Revision A

P R O G R A M

12SL

NOTE: The information in this manual only applies to the 12SL Statement of Validation and Accuracy. Due to continuing product innovation, specifications in this manual are subject to change without notice.

Listed below are GE Medical Systems Information Technologies trademarks. All other trademarks contained herein are the property of their respective owners. 900 SC, ACCUSKETCH, AccuVision, APEX, AQUA-KNOT, ARCHIVIST, Autoseq, BABY MAC, C Qwik Connect, CardioServ, CardioSmart, CardioSys, CardioWindow, CASE, CD TELEMETRY, CENTRA, CHART GUARD, CINE 35, CORO, COROLAN, COROMETRICS, Corometrics Sensor Tip, CRG PLUS, DASH, Digistore, Digital DATAQ, E for M, EAGLE, Event-Link, FMS 101B, FMS 111, HELLIGE, IMAGE STORE, INTELLIMOTION, IQA, LASER SXP, MAC, MAC-LAB, MACTRODE, MANAGED USE, MARQUETTE, MARQUETTE MAC, MARQUETTE MEDICAL SYSTEMS, MARQUETTE UNITY NETWORK, MARS, MAX, MEDITEL, MEI, MEI in the circle logo, MEMOPORT, MEMOPORT C, MINISTORE, MINNOWS, Monarch 8000, MULTI-LINK, MULTISCRIPTOR, MUSE, MUSE CV, Neo-Trak, NEUROSCRIPT, OnlineABG, OXYMONITOR, Pres-R-Cuff, PRESSURE-SCRIBE, QMI, QS, Quantitative Medicine, Quantitative Sentinel, RAC RAMS, RSVP, SAM, SEER, SILVERTRACE, SOLAR, SOLARVIEW, Spectra 400, Spectra-Overview, Spectra-Tel, ST GUARD, TRAM, TRAM-NET, TRAM-RAC, TRAMSCOPE, TRIM KNOB, Trimline, UNION STATION, UNITY logo, UNITY NETWORK, Vari-X, Vari-X Cardiomatic, VariCath, VARIDEX, VAS, and Vision Care Filter are trademarks of GE Medical Systems Information Technologies registered in the United States Patent and Trademark Office. 12SL, 15SL, Access, AccuSpeak, ADVANTAGE, BAM, BODYTRODE, Cardiomatic, CardioSpeak, CD TELEMETRY-LAN, CENTRALSCOPE, Corolation, EDIC, EK-Pro, Event-Link Cirrus, Event-Link Cumulus, Event-Link Nimbus, HI-RES, ICMMS, IMAGE VAULT, IMPACT.wf, INTER-LEAD, IQA, LIFEWATCH, Managed Use, MARQUETTE PRISM, MARQUETTE RESPONDER, MENTOR, MicroSmart, MMS, MRT, MUSE CardioWindow, NST PRO, NAUTILUS, O2SENSOR, Octanet, OMRS, PHiRes, Premium, Prism, QUIK CONNECT V, QUICK CONNECT, QT Guard, SMART-PAC, SMARTLOOK, Spiral Lok, Sweetheart, UNITY, Universal, Waterfall, and Walkmom are trademarks of GE Medical Systems Information Technologies. 
© GE Medical Systems Information Technologies, 2000. All rights reserved.


Revision A 23 October, 2000

Statement of Validation and Accuracy


This paper reviews several aspects of the development and validation of the GE Marquette 12SL Program. Program accuracy levels for major statement categories are supplied. When the program is used in conjunction with a cardiologist, it has been shown to improve both the speed¹ and the accuracy² of reviewing ECGs.

1. Sheffield et al., 1987. Electrocardiography and computerization: a winning combination. Card. Prod. News. 47-51
2. Willems et al., 1990. Assessment of the diagnostic performance of ECG computer programs and cardiologists. In: Common Standards for Electrocardiography: 10th and Final Progress Report. Leuven: Ref. Nr. CSE 90-12-31, 148-266


Introduction
The first human electrocardiogram was taken over a hundred years ago, and computerized electrocardiography has been in existence since the late 1950s.¹,² In spite of its widespread use,³ long history, and the voluminous literature on the scientific aspects of this technology, little has been written that directly addresses the intent of computerized electrocardiography. The pioneers of this technology had motivations that ranged from the esoteric goal of proving that a computer could mimic human activity to the basic requirement of efficiently recording artifact-free tracings.⁴

Some of the favorable developments that resulted from the evolution of this technology were hardly imagined at its inception. Consider, for example, work patterns at facilities that provide ECG services; they have been greatly streamlined.⁵ Additionally, computerization has resulted in two practical advantages for the overreading physician. First, the computer can serve as an additional expert opinion. Second, physicians have found that they can overread computer-analyzed tracings in half the time required for conventional, nonanalyzed ECGs.⁶ The computer, therefore, is used not only to efficiently record, store, transmit, and present the ECG, but also to assist the physician in overreading an ECG.

Consequently, developers inherit a certain responsibility. GE Marquette has accepted this serious challenge. GE Marquette must develop the most accurate program possible, substantiate the program with data, and document the extent of the computer's capabilities. This is the purpose of this paper.

It should be made clear that a computerized analysis is not a substitute for human interpretation. There are two reasons for this. First, statements of accuracy need to be viewed from a statistical perspective. Although accuracy levels may be high, outliers can and will exist. Second, a computer does not have the ability to include the entire clinical picture of the patient. Despite the fact that the 12SL analysis program has a high level of accuracy, it will occasionally not correctly interpret an ECG. The ECG tracing is significant only when interpreted in conjunction with clinical findings. Thus, it is critical that a physician use his/her best clinical judgement when reviewing the ECG interpretation.

1. Burch et al., 1964. A history of electrocardiography. Year Book Medical Publishers. Chicago, Ill.
2. Pipberger et al., 1975. Computer methods in electrocardiography. Annu. Rev. Biophys. Bioeng. 4:15-42
3. Drazen et al., 1988. Survey of computer-assisted electrocardiography in the United States. J Electro. 21 (suppl):98-104
4. Rowlandson, I., 1990. Computerized electrocardiography: a historical perspective. Ann. New York Academy of Science. 601:343-352
5. Sheffield et al., 1987. Electrocardiography and computerization: a winning combination. Card. Prod. News. 47-51
6. Sheffield et al., 1987. Seminar on computer applications for the cardiologist, computer-aided electrocardiography. JACC, 10(2):448-55


Signal Acquisition/Signal Conditioning


A program's accuracy is directly dependent upon the quality of the signal it acquires. In 1979, GE Marquette introduced an electrocardiograph that simultaneously acquired all of the leads of the 12-lead electrocardiogram. Prior to this time, all commercially available electrocardiographs could only acquire 3 leads at a time. Simultaneous recording was adopted so that the computer could use the signals from all 12 leads to properly detect and classify each QRS complex. The advantage of this technique was independently verified by Willems et al.:

"Conclusion: The simultaneous recording and analysis of all 12 standard leads...is certainly an improvement over the conventional recording of three leads at a time. Similarly...multilead programs proved to be more stable than those obtained by conventional programs analyzing three leads at a time..."¹

1. Willems et al., 1987. A reference data base for multilead electrocardiographic computer measurement programs. JACC, 10(6):1313-21


Median Beat/Signal Averaging


Computer measurements of features within the QRS complex are very susceptible to artifact. To remove artifact, filtering may be used. Beyond filtering, there is another method able to eliminate noise from the QRS complex: signal averaging. Instead of analyzing a single QRS complex, the GE Marquette 12SL Program generates a median complex. In other words, all QRSs of the same shape are aligned in time. Next, the algorithm generates a representative QRS complex from the median voltages that are found at each successive sample time. This is more complicated than creating an average, but the method results in a cleaner signal since it is able to disregard outliers.

Willems et al. independently verified the value of this technique: "On the basis of the finding of the present study, a measurement strategy based on selective averaging is recommended for diagnostic ECG computer programs."¹ A specific study that included results from the GE Marquette 12SL Program also found that "Comparisons of all averaging against all non-averaging programs yields a 70% higher detection instability for the non-averaging programs and worse performance in most tested measurements."²

1. Willems et al., 1987. Influence of noise on wave boundary recognition by ECG measurement programs. Comp. Biomed. Res., 83
2. Zywietz et al., 1987. Noise tolerance of ECG amplitude measurements. Results and recommendations from the CSE project. Comp. in Cardiol.
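
The median-complex idea described in this section can be sketched in a few lines. This is a minimal illustration and not the 12SL implementation: it assumes the beats have already been classified as having the same shape and aligned in time, and the function names and sample voltages are hypothetical.

```python
from statistics import mean, median

def median_complex(aligned_beats):
    """Return the sample-by-sample median of time-aligned beats.

    aligned_beats: list of equal-length lists of voltages, one list per beat.
    Assumption: the beats were already matched by shape and aligned in time.
    """
    if not aligned_beats:
        raise ValueError("at least one beat is required")
    length = len(aligned_beats[0])
    if any(len(beat) != length for beat in aligned_beats):
        raise ValueError("all beats must have the same number of samples")
    # zip(*beats) groups the voltages found at each successive sample time.
    return [median(samples) for samples in zip(*aligned_beats)]

def mean_complex(aligned_beats):
    """Sample-by-sample average, shown only for comparison."""
    return [mean(samples) for samples in zip(*aligned_beats)]

# Hypothetical data: four clean beats and one beat with an artifact spike.
beats = [
    [0, 100, 900, 100, 0],
    [0, 110, 910, 90, 0],
    [0, 90, 890, 110, 0],
    [0, 100, 900, 100, 0],
    [0, 100, 3000, 100, 0],  # artifact at the third sample
]
median_beat = median_complex(beats)
mean_beat = mean_complex(beats)
```

In this toy input the artifact pulls the averaged third sample far above the clean beats, while the median stays at the value shared by the clean beats, which is the sense in which the median "is able to disregard outliers."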


Measurements/Onsets and Offsets


All ECG computer programs are composed of two parts: one that measures the waveforms and one that performs the analysis based upon those measurements. The main task of computerized measurement is to determine the location of the major reference points (onsets and offsets of the P, QRS, and T waves). Consistent with the signal processing portion of the program, the major wave onsets and offsets are delineated by an analysis of the slopes in all 12 simultaneous leads. That is, QRS duration is measured from the earliest onset in any lead to the latest deflection in any lead. Similarly, the QT interval is measured from the earliest detection of depolarization in any lead to the latest detection of repolarization in any lead.

The measurement accuracy of the GE Marquette 12SL Program was evaluated in an independent study that used a sample of 250 ECGs representing a wide spectrum of waveform patterns. The ECGs were analyzed by five referee cardiologists and sixteen different programs. Results from this study can be found in Appendix I. The GE Marquette 12SL Program was found to be within 1 millisecond of the reference standard for the measurement of P duration, PR interval, QRS duration, and QT interval.

In another independent study by Mulcahy et al., the computer determined heart rate, PR interval, QRS duration, QT interval, and mean frontal QRS axis. These results were then compared to measurements by two cardiologists. They stated: "We conclude that the GE Marquette MAC II Computer Augmented Cardiograph can be relied upon for routine electrocardiographic measurements..."¹

After the onsets and offsets of the P, QRS, and T complexes have been demarcated, the waves within each complex are measured according to published standards.² These amplitudes and durations result in a measurement matrix containing more than 1600 values. Measurements are then passed on to the criteria portion of the program so that it can generate an interpretation.

1. Mulcahy et al., 1986. Can a computer assisted electrocardiograph replace a cardiologist for ECG measurements? Irish J. of Med. Sci., 155(12):410-414
2. Willems et al., 1982. Common standards for quantitative electrocardiography. Second progress report. CSE reference 82-11-20. Leuven: Acco, 1982:1-246.
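
The "earliest onset in any lead to the latest offset in any lead" rule from this section can be illustrated as follows. This is only a sketch under stated assumptions: the program is described as analyzing slopes across all 12 simultaneous leads, whereas this example simply assumes that per-lead boundaries are already available. The function name and the millisecond values are hypothetical.

```python
def global_interval(per_lead):
    """Combine per-lead (onset_ms, offset_ms) pairs into one global interval.

    The global onset is the earliest onset in any lead and the global offset
    is the latest offset in any lead, so the duration spans all leads.
    """
    if not per_lead:
        raise ValueError("at least one lead is required")
    onset = min(on for on, _ in per_lead.values())
    offset = max(off for _, off in per_lead.values())
    return onset, offset, offset - onset

# Hypothetical per-lead QRS boundaries, in milliseconds from the start of the beat.
qrs_marks = {
    "I": (402, 490),
    "II": (400, 492),
    "V1": (398, 486),
    "V5": (404, 496),
}
qrs_onset, qrs_offset, qrs_duration = global_interval(qrs_marks)
```

Note that the global duration here is longer than the duration seen in any single lead, because the earliest onset and the latest offset come from different leads.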


Computer Interpretation/Development Process


The GE Marquette 12SL Program was introduced in 1980. All improvements to the program have been accomplished via a systematic, logical, and controlled methodology. A major aspect of this methodology is the use of stored ECGs. GE Marquette stores ECGs in such a fashion that they can be reanalyzed by the 12SL Program.¹,² In other words, the fidelity of the stored ECG is such that it can be used as if it were newly acquired. This allows us to access large volumes of stored ECGs for the purposes of either training or testing the program.

Any change to the program requires a great deal of research. This effort can be instigated by a variety of sources. The constant pursuit of clinically correlated databases can yield statistics that indicate whether a change should be considered. New criteria published in the literature can be evaluated and sometimes incorporated into the program. Consultations with cardiologists also stimulate investigations. This is especially true when they have stored ECGs whose interpretation has been verified by other, non-ECG data.

Before a change can be instituted, it must always be evaluated in relation to the current performance of the program. This validation process is facilitated by a set of research tools developed specifically for this purpose. These tools are used in conjunction with our reanalysis capability. As an ECG is reanalyzed, a score pertaining to the item under investigation is stored and collated by the computer. Later, after many ECG records have been scored, the computer can generate statistics on the entire set of ECGs that it reanalyzed and stored. To determine which ECGs might be affected by a program change, the entire ECG set is reanalyzed twice: once with the change and once without. After this is done, the computer automatically culls out and plots any ECGs that scored differently between the two versions of the program.
This work has resulted in an efficient set of research tools that allows GE Marquette to automatically determine how a change might affect program performance on a large database.³ Given these sophisticated tools, the next issue relevant to the development process is the selection of an appropriate database. Appendix II contains a list of gold standard databases that GE Marquette has used for program validation. These databases are extremely valuable, because they are time consuming and expensive to obtain.

1. Huffman, D.A., 1952. A method for the construction of minimum redundancy codes. Proc. Inst. Radio Eng.: 1098-1101
2. Reddy et al., 1991. Data compression for storage of resting ECGs digitized at 500 samples per second. Association for the Advancement of Medical Instrumentation Meeting
3. Rowlandson I., 1986. New techniques in criteria development. Computerized Interpretation of the Elec. X., 177-184

Nevertheless, they are an essential ingredient. Without an objective yardstick, the program will not excel, since the target for performance will be vaguely and inconsistently defined by the consensus cardiologist.¹ It should also be noted that GE Marquette uses different databases during the development and validation process. This prevents us from developing an algorithm that works beautifully on the training set but cannot be applied, with the same success, to other populations.²,³

During the training phase, we use a database that has been correlated with a gold standard. The choice of the gold standard depends upon the problem we are investigating. For criteria that reference a particular pathophysiologic state (like myocardial infarction), we use a database that is correlated with other non-ECG evidence (like cardiac catheterization, echocardiography, autopsy, cardiac enzymes, patient history, etc.). For measurements, or arrhythmia statements that can be confirmed by the ECG itself, we use the ECG in conjunction with expert opinion.

During the test phase, we not only use an independent gold standard database, but we also use other databases. We do this because gold standard databases have some limits. Examples include the following:⁴
- The gold standard may not be representative of the disease in the clinical setting. For example, an ECG database that contains autopsy-proven myocardial infarctions (MI) may not be indicative of what a typical MI looks like, since many patients survive an MI.
- Gold standard databases often contain only one, isolated disease. For example, a database may only contain MIs and normals. The algorithm, however, must also operate in the presence of ischemia, LVH, drug effects, etc.
- There may be a systematic bias when selecting patients for a gold standard test. CATH-proven normals often receive the test because they were symptomatic.
- A gold standard database may only contain extremes of normal versus abnormal. Algorithms don't operate in a black and white world.
- And finally, a gold standard cannot be considered perfect: every test comes with its own inherent level of inaccuracy.

1. Gorman et al., 1964. Observer variation in interpretation of the electrocardiogram. Med. Ann. DC 33:97-99
2. Devijver P.A. and Kittler J., 1982. Pattern recognition: a statistical approach. Prentice Hall International, Englewood Cliffs, New Jersey
3. Talmon J.L., 1983. Pattern recognition of the ECG: a structured analysis. Univ. of Amsterdam, Amsterdam, Holland
4. Laks MM., 1981. Gold standard for ECG diagnosis - revised 1981. In: Bonner et al., eds., Comp. Interp. of the ECG VI. New York: Engineering Foundation, 1982; 267-75


Given these limitations, testing must go beyond the use of gold standard databases. We must test the algorithm with a wider spectrum of data. GE Marquette accomplishes this by measuring the algorithm's performance on a large database (more than 150,000 ECGs acquired from one of the more than 400 installed GE Marquette ECG storage systems). This process, which the computer can do in less than a day, confronts the algorithm with multiple diseases and varying degrees of abnormality. ECGs whose analysis results changed due to the program modification can be further investigated using confirmation from medical records, expert opinion, or both. Only after this retrospective testing is complete can we incorporate the change into an ECG cart and evaluate its performance at a clinical site. If this last test is successful, the change is incorporated into the program for general release.
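
The "reanalyze twice and cull out the differences" step of this development process can be sketched as below. This is a hedged illustration only: the two analysis functions, the record fields, and the thresholds are hypothetical placeholders with no clinical meaning, and the real research tools are not described at this level of detail in this paper.

```python
def changed_records(records, analyze_old, analyze_new):
    """Return (id, old_result, new_result) for ECGs that score differently.

    records: mapping of ECG identifier to stored ECG data.
    analyze_old / analyze_new: the program without and with the change.
    """
    changed = []
    for ecg_id, ecg in records.items():
        old_result = analyze_old(ecg)
        new_result = analyze_new(ecg)
        if old_result != new_result:
            changed.append((ecg_id, old_result, new_result))
    return changed

# Hypothetical stand-ins for two program versions; the thresholds are arbitrary.
def old_version(ecg):
    return "flagged" if ecg["measurement"] > 25 else "not flagged"

def new_version(ecg):
    return "flagged" if ecg["measurement"] > 20 else "not flagged"

# Hypothetical stored records.
stored = {
    "ecg-001": {"measurement": 15},
    "ecg-002": {"measurement": 22},
    "ecg-003": {"measurement": 30},
}
for_review = changed_records(stored, old_version, new_version)
```

Only the records whose result differs between the two versions are returned, which mirrors how the changed ECGs are singled out for follow-up with medical records or expert opinion.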


Accuracy of Computer Interpretation


The following section summarizes the diagnostic performance for the major statement categories as made by the GE Marquette 12SL Program. These categories include: rhythm, conduction, hypertrophy, infarction, repolarization, and overall classification (that is, normal versus abnormal).

Rhythm

Tables 1 and 2 presented below are from a study that reported retrospective and prospective evaluation results of the GE Marquette 12SL Program.1 The tables show the performance indices of 12SL for major rhythm categories.

Table 1. Sensitivity, Specificity, and Negative and Positive Predictive Accuracy of Rhythm Interpretation of MAC-RHYTHM Analysis During Prospective Testing

Rhythm                  No.     Sensitivity (%)   Specificity (%)   NPA (%)   PPA (%)
Sinus rhythms           9,324   98.7              91.0              91.5      98.6
Atrial fibrillation     832     87.5              99.4              99.0      92.4
Atrial flutter          106     76.4              99.7              99.8      71.7
Junctional              64      92.2              99.5              100.0     52.7 (72.8*)
2nd-degree AV blocks    26      80.8              99.6              100.0     32.8

* After excluding paced ECGs with failed pace detection. NPA, negative predictive accuracy; PPA, positive predictive accuracy.

1. Reddy et al., 1998. Prospective evaluation of a microprocessor-assisted cardiac rhythm algorithm: results from one clinical center. J. Electrocardiol. 30:28-33
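
Table 1 shows high specificity alongside low PPA for the less frequent rhythms. The standard relationship between these indices explains why; it is given here as background and is not taken from the cited study. With sensitivity Se, specificity Sp, and prevalence p (the fraction of ECGs that truly have the rhythm):

```latex
\[
\mathrm{PPA} = \frac{Se \, p}{Se \, p + (1 - Sp)(1 - p)}
\qquad
\mathrm{NPA} = \frac{Sp \, (1 - p)}{Sp \, (1 - p) + (1 - Se) \, p}
\]
```

When p is small, the false-positive term (1 - Sp)(1 - p) can be comparable to or larger than the true-positive term Se p even if Sp is close to 1. As an illustration with an assumed prevalence of 0.25% (not a figure from the study), Se = 0.808 and Sp = 0.996 give a PPA of about 34%.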


Table 2. Sensitivity and Specificity of Current Release (MACR) and an Earlier 12SL Release (Dec. 1993) in Retrospective Testing

                                   Sensitivity (%)           Specificity (%)
Rhythm                    n        MACR    Old     P         MACR    Old     P
Sinus rhythms             3,438    98.6    98.5    NS        91.7    86.5    .0001
Abnormal rhythms
  Atrial fibrillation     283      82.0    65.0    .0001     99.4    99.0    NS
  Atrial flutter          100      80.0    73.0    NS        99.7    99.9    NS
  Junctional rhythm       101      81.2    39.6    .0001     99.1    99.7    NS
  2nd-degree AV block     167      79.6    12.0    .0001     99.7    99.9    NS
  Others*                 59       67.8    15.3    .0001     99.7    99.9    NS
  Subtotal: abnormal      710      79.4    44.2    .0001     98.6    98.5    NS
Total                     4,176    95.2    88.9    .0001

* Others: atrioventricular dissociation and complete heart blocks. NS, not significant


Conduction

Two independent studies have evaluated the performance of the GE Marquette 12SL Program for analyzing conduction abnormalities. As with rhythm, the overreading cardiologist was used as the source for the correct interpretation. These two studies were very large. At Mount Sinai Medical Center in New York City, over 39,000 ECGs were overread and reviewed for computer accuracy.¹ At the Mayo Clinic, a total of over 12,000 ECGs were evaluated.² Table 3 contains the results of the computer analysis for right bundle branch block (RBBB). Both studies found a specificity near 100% and a sensitivity near 90%.

Table 3. Sensitivity and Specificity for RBBB

Study                         SENSITIVITY          SPECIFICITY
Mount Sinai Medical Center    1502 / 1661 (90%)    37736 / 37736 (100%)
Mayo Clinic                   354 / 391 (91%)      12135 / 12159 (100%)

Table 4 contains performance results for left bundle branch block (LBBB). Both studies displayed similar outcomes. The specificity for LBBB was found to be as high as for RBBB. LBBB sensitivity, however, was found to be lower. A detailed inspection of the data from the Mount Sinai study showed that the cardiologist often changed the computer diagnosis to LBBB (n=97) from another conduction abnormality already stated by the program (like ILBBB or nonspecific intraventricular conduction block). If these other conduction abnormalities were included as part of the analysis, the sensitivity would increase from 78% to 88%, thus resulting in essentially the same performance as was found at the Mayo Clinic.

Table 4. Sensitivity and Specificity for LBBB

Study                         SENSITIVITY          SPECIFICITY
Mount Sinai Medical Center    670 / 860 (78%)      38358 / 38378 (100%)
Mayo Clinic                   215 / 248 (87%)      12741 / 12793 (100%)

1. Swartz et al., 1982. Marquette 12SL ECG analysis program: evaluation of physician changes. Comp. in Cardio. 437-440
2. Hammil et al., 1989, personal communication - Mayo Clinic
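
The percentages in Tables 3 and 4 follow directly from the tabulated counts. The sketch below recomputes them for the second row of Table 3 (354 / 391 and 12135 / 12159). The class and method names are hypothetical, and the PPA and NPA values it produces are derived here from those counts rather than reported by the studies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Counts:
    """Two-by-two agreement counts between the program and the reference."""
    tp: int  # reference positive, program positive
    fn: int  # reference positive, program negative
    tn: int  # reference negative, program negative
    fp: int  # reference negative, program positive

    def sensitivity(self) -> float:
        return 100.0 * self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return 100.0 * self.tn / (self.tn + self.fp)

    def ppa(self) -> float:
        """Positive predictive accuracy, in percent."""
        return 100.0 * self.tp / (self.tp + self.fp)

    def npa(self) -> float:
        """Negative predictive accuracy, in percent."""
        return 100.0 * self.tn / (self.tn + self.fn)

# Second row of Table 3 (RBBB): 354 / 391 and 12135 / 12159.
rbbb = Counts(tp=354, fn=391 - 354, tn=12135, fp=12159 - 12135)
rbbb_sensitivity = round(rbbb.sensitivity())
rbbb_specificity = round(rbbb.specificity())
```

Rounded to whole percentages, the recomputed sensitivity and specificity match the values printed in the table.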


Hypertrophy

Two independent studies have evaluated the performance of the GE Marquette 12SL Program for left ventricular hypertrophy (LVH) using echocardiography (ECHO). At the Mayo Clinic, an ECHO was performed within 30 days of the ECG.¹ ECGs demonstrating WPW syndrome, paced rhythm, or LBBB were excluded from the study. ECHO studies were excluded for patients who were less than 21 years of age. All two-dimensional and M-mode ECHO studies were technically adequate and required clear delineation of interventricular septal thickness (IVST), posterior wall thickness (PWT), and left ventricular internal dimension (LVID). Patients with IVST/PWT > 1.5, segmental wall motion abnormalities, pericardial effusion, or infiltrative cardiomyopathy were excluded from the study. This resulted in a test population of 4,300 patients.

ECHO measurements were made according to the American Society of Echocardiography. ECHO studies revealed LVH in 1,029 patients. LVH was defined as an ECHO LV mass > 265 g, where:

$$\text{LV mass} = 1.04\,\left[(\text{LVID} + \text{PWT} + \text{IVST})^3 - (\text{LVID})^3\right] - 13.6\ \text{g}$$

The GE Marquette 12SL Program correctly identified 328 patients with LVH and 3,010 patients without LVH. The program was scored as stating LVH for the full breadth of statements that refer to the abnormality, including "minimal (and moderate) voltage criteria for LVH, may be normal." Table 5 summarizes the program's performance.

Table 5. Sensitivity and Specificity for LVH

SENSITIVITY              SPECIFICITY
328 / 1,029 (31.9%)      3,010 / 3,271 (92%)
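
The ECHO definition of LVH used in the Mayo Clinic study can be written as a short calculation. This is a sketch only: the paper does not state the measurement units, so centimeters are assumed here, and the function names and example dimensions are hypothetical.

```python
LVH_MASS_THRESHOLD_G = 265.0  # study definition of LVH: ECHO LV mass > 265 g

def lv_mass_g(lvid_cm, pwt_cm, ivst_cm):
    """LV mass in grams from the formula quoted in the text.

    Assumption: all three dimensions are given in centimeters.
    """
    outer = (lvid_cm + pwt_cm + ivst_cm) ** 3
    inner = lvid_cm ** 3
    return 1.04 * (outer - inner) - 13.6

def has_echo_lvh(lvid_cm, pwt_cm, ivst_cm):
    """True when the computed LV mass exceeds the study threshold."""
    return lv_mass_g(lvid_cm, pwt_cm, ivst_cm) > LVH_MASS_THRESHOLD_G

# Hypothetical example dimensions (cm).
thin_wall_mass = lv_mass_g(5.0, 1.0, 1.0)
thick_wall_lvh = has_echo_lvh(5.0, 1.3, 1.3)
```

Because wall thickness enters through a cubed term, modest increases in PWT and IVST move the computed mass across the 265 g threshold quickly.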

In addition to the Mayo Clinic study, another international study evaluated program performance for LVH.² There are fewer patients in this study (normals = 382, LVH = 183). A normal individual was defined as being free of significant cardiopulmonary disease on the basis of a health screening examination (negative history, normal physical exam, normal chest X-ray; n=285) or invasive cardiac study (n=97). Invasive studies usually entailed cardiac catheterization (CATH) for atypical chest pain or ST/T abnormalities evident at rest or during exercise. LVH was based on CATH or ECHO. ECHO diagnosis of LVH was based on an increased indexed LV mass >100 g/m² for females and >134 g/m² for males.

1. Hammil et al., 1988. Comparison of a computerized interpretive ECG program using xyz leads with a program using scalar leads. J. of Elect. (Supp) 88
2. Willems et al., 1990. Assessment of the diagnostic performance of ECG computer programs and cardiologists. In: Common Standards for Electrocardiography: 10th and Final Progress Report. Leuven: Ref. Nr. CSE 90-12-31, 148-266

Patients had left valvular acquired or congenital heart disease, including aortic stenosis (NYHA grade 2,3), aortic regurgitation (grade 2,3,4), mitral incompetence (grade 2,3,4), or aortic coarctation. Cardiomyopathy patients were excluded from this study. The statement output of the program was scored similarly to the Mayo study. Table 6 summarizes the performance of the program from this study.

Table 6. Sensitivity and Specificity for LVH

SENSITIVITY    SPECIFICITY
76.2%          91.2%

The difference in sensitivity between these two studies is probably due to the degree of disease in the two populations. The study with the higher sensitivity has LVH of a greater extreme than the wider spectrum of disease that was recorded at the Mayo Clinic. It should also be noted that the criteria for LVH are age sensitive. If age is not entered into the electrocardiograph, the program defaults to the most sensitive criteria (old age, i.e. >80 years).

The international study by Willems et al. also investigated right ventricular hypertrophy (RVH). This abnormality is notoriously difficult to verify. As a result, the presence of RVH was determined via a minimum requirement of clear-cut clinical evidence coupled with investigative evidence of conditions that would be expected to cause RVH. The RVH cases (n=55) had acquired or congenital RV pressure or volume overload, primary or secondary pulmonary hypertension (mean pulmonary pressure > 25 mmHg), or left-to-right shunt at the atrial level of more than 1.5. The results of this study were as follows:

Table 7. Sensitivity and Specificity for RVH

SENSITIVITY    SPECIFICITY
29.1%          100%


Infarction

The acute infarction interpretation package in the current 12SL program has, based on an evaluation at GE Marquette, an overall sensitivity of 65% with a specificity of 98%.¹ Kudenchuk et al.,² in a separate independent study, found that the GE Marquette 12SL program had an overall sensitivity of 74% in chest pain patients whose ECGs had ST elevation and other abnormalities. They also determined the sensitivity for anterior acute infarction alone to be 56% and for inferior infarction alone to be 87%. The overall specificity was found to be 100% for the 12SL program versus 82% for the electrocardiographer in that same chest pain data set. However, in the ECGs where only ST elevation was present without other concomitant repolarization abnormalities, the 12SL program was found to have a sensitivity of 45% for correctly interpreting acute infarction, but retained the same high specificity.

The improvements to the current acute infarction package that led up to these figures are described in detail in the referenced article, "The Dilemma of Sensitivity versus Specificity in Computer-Interpreted Acute Myocardial Infarction." To summarize, the original package had very high specificity (99-100%) but very low sensitivity (21%). GE Marquette worked with Dr. Douglas Weaver and used his enzyme-correlated chest pain acute MI database, which is described in Kudenchuk's article, to develop new interpretation criteria that increase the sensitivity for interpretation of acute infarction without decreasing the high specificity. The package for interpreting acute infarction was required to have extremely high specificity to minimize the possibility of clinicians treating inappropriate patients, thereby needlessly subjecting the patients to the risk of potentially life-threatening complications of the medication.

GE Marquette succeeded in developing more sensitive interpretive criteria without jeopardizing the already high specificity by requiring evidence of concomitant or reciprocal repolarization changes to be present in the ECGs before interpreting them as having evidence of acute infarction. Concomitant repolarization changes refer to evidence of other ST segment depression and/or T wave inversion in leads other than those exhibiting ST elevation. As stated above, the 12SL program has a much higher sensitivity for acute infarction when other repolarization changes are present in the ECG (74%) than when they are not present (45%). The 12SL emphasis is to always maintain the current specificity.

Table 8. Sensitivity and Specificity for Overall Acute MI

SENSITIVITY    SPECIFICITY
65%            98%

1. Elko PP, Weaver WD, Kudenchuk PJ, Rowlandson GI: The dilemma of sensitivity versus specificity in computer interpreted acute myocardial infarction. J of Electrocardiol 24:S2-7, 1991
2. Kudenchuk PJ, Ho MT, Weaver WD, et al.: Accuracy of Computer-Interpreted Electrocardiography in Selecting Patients for Thrombolytic Therapy. J Am Coll Cardiol 1991;17:1486-91
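
The trade-off described in this section, in which requiring concomitant or reciprocal repolarization changes removes false positives at the cost of some sensitivity, can be illustrated with a toy example. Everything below is hypothetical: the two rules are drastic simplifications and are not the 12SL criteria, and the four labeled cases are invented solely to show the mechanism.

```python
def st_elevation_only(ecg):
    """Loose toy rule: any ST elevation is called acute infarction."""
    return ecg["st_elevation"]

def st_elevation_with_reciprocal(ecg):
    """Stricter toy rule: ST elevation must be accompanied by concomitant
    or reciprocal repolarization changes."""
    return ecg["st_elevation"] and ecg["reciprocal_changes"]

def tally(rule, cases):
    """Count true/false positives and negatives of a rule on labeled cases."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for ecg in cases:
        called = rule(ecg)
        if called and ecg["acute_mi"]:
            counts["tp"] += 1
        elif called:
            counts["fp"] += 1
        elif ecg["acute_mi"]:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts

# Invented cases, for illustration only.
cases = [
    {"st_elevation": True, "reciprocal_changes": True, "acute_mi": True},
    {"st_elevation": True, "reciprocal_changes": False, "acute_mi": True},
    {"st_elevation": True, "reciprocal_changes": False, "acute_mi": False},
    {"st_elevation": False, "reciprocal_changes": False, "acute_mi": False},
]
loose = tally(st_elevation_only, cases)
strict = tally(st_elevation_with_reciprocal, cases)
```

On these invented cases the stricter rule produces no false positives but misses the infarction that shows isolated ST elevation, which is the same direction of trade-off the text reports for the 12SL criteria.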


Table 9. Sensitivity and Specificity for Acute MI in Chest Pain Patients

SENSITIVITY    SPECIFICITY
74%            98%

Kudenchuk et al. have presented a very good synopsis of their findings on the performance of the GE Marquette 12SL program in the chest pain screening area. Part is included here:

"in patients with suspected acute myocardial infarction, the overall sensitivity for the current computer algorithm (GE Marquette 12SL) to detect acute injury was 52% compared with the 66% sensitivity of the electrocardiographer in identifying patients with acute myocardial infarction who had ST elevation on their first ECG. This difference occurred in part because the electrocardiographer used less stringent criteria (100 µV ST elevation without consideration of associated QRS or ST changes, unless left ventricular hypertrophy, left bundle branch block or a ventricular arrhythmia was present) than those used by the computer algorithm. The computer algorithm was developed to help differentiate early repolarization and nonspecific ECG changes from those of acute injury and, unlike the electrocardiographer, did not presume that ST elevation in a patient with chest pain was more likely than not to indicate acute infarction. Although more sensitive, the electrocardiographer has an overall incidence of 5% false positive diagnoses, including a 22% incidence of false positive diagnoses in patients with isolated ST segment elevation. In contrast, the computer was nearly perfect at excluding patients without acute myocardial infarction, but did so at the expense of diminished sensitivity. Computer specificity was as high or higher than that of the electrocardiographer, regardless of the patient's clinical characteristics (gender, age, cardiac history or duration of symptoms), and inappropriate patients were extremely unlikely to be designated for thrombolytic intervention. The present algorithm is clearly adequate for first line screening of patients with chest pain by paramedics or in the emergency department. Its sensitivity is no worse than that of the emergency physician and its specificity is superior to that of a trained electrocardiographer. Use of the computer-interpreted ECG will provide for almost immediate triage of patients and obviate the time delays required when consulting with an electrocardiographer before proceeding with treatment. If findings are interpreted as normal or nondiagnostic, the ECG can then be further assessed by a skilled electrocardiographer. We are unaware of any similar extensive testing of other commercially available algorithms."


Otto and Aufderheide¹ reinforce the usefulness of concomitant repolarization evidence for interpretation of acute infarction from the ECG. They state:

ST segment elevation criteria are not the sole factors used to determine the final ECG diagnosis of acute myocardial infarction because they lack the added impact of observer interpretation. ST segment elevation criteria function to triage ECGs and represent a minimally acceptable guideline requiring additional observer interpretation for consideration of the acute myocardial infarction diagnosis. Significant observer variation in interpreting ECGs is well established. Improving the positive predictive value of ST segment elevation criteria is likely to decrease the impact of observer variation. This study demonstrated that inclusion of reciprocal changes in prehospital ST segment elevation criteria improves the positive predictive value from 49% to more than 90%. Clinically, reciprocal changes are present in patients with large myocardial infarctions, lower ventricular ejection fractions, and higher mortality rates. Reciprocal changes also can be seen in some patients with significant disease of a noninfarct related artery. Therefore, inclusion of reciprocal changes into ST segment elevation criteria not only significantly improves positive predictive value but also selects the subgroup of acute myocardial infarction patients that stands to benefit most from rapid intervention.

To summarize, the current 12SL program relies heavily on concomitant or reciprocal repolarization changes together with ST elevation, which yields what GE Marquette believes is the best acute infarction package in the industry in terms of combined sensitivity and specificity.

GE Marquette Medical Systems' position with respect to the use of the 12SL computerized ECG analysis program is as follows. Despite the fact that the 12SL analysis program has a high level of accuracy, it will occasionally fail to diagnose an ECG correctly. The ECG tracing is significant only when interpreted in conjunction with clinical findings. Thus, it is critical that a physician use his or her best clinical judgment when reviewing the ECG interpretation and that all ECGs be reviewed by a physician. This explanation and information are intended to provide the user with a better, more accurate, and realistic understanding of the capabilities of the 12SL program for the detection of acute infarction. GE Marquette continually strives to make the 12SL ECG analysis algorithm better, and believes that the program, although not perfect, is currently the best and most completely validated in the industry.

1. Otto LA, Aufderheide TP: Evaluation of ST Segment Elevation Criteria for the Prehospital Electrocardiographic Diagnosis of Acute Myocardial Infarction. Annals of Emergency Medicine 1994; vol 23, no. 1:17-24

Anterior Myocardial Infarction

Selection of Patients

Using a computerized database of the patients who had undergone cardiac catheterization at the Syracuse Veterans Administration Medical Center (VAMC) from 1975 through 1993, we identified all patients who were angiographically normal (normal subjects) and all patients who had angiographic evidence of previous Anterior Myocardial Infarction (AMI). We defined the former as those who had no evidence of coronary arterial abnormalities on coronary angiography in multiple projections, no abnormalities of the left ventricle in the right anterior oblique projection (mean ejection fraction >70), and normal (<14 mmHg) left ventricular filling pressures. We defined the latter as those who had a 75% or greater narrowing of the left anterior coronary artery and either akinesia or dyskinesia of the anterior wall of the left ventricle as shown on a ventriculogram in the right anterior oblique projection. None of the AMI patients had had a myocardial infarction within 30 days of catheterization.

We eliminated from our study all patients who had significant valvular heart disease, patients whose ECGs revealed either left bundle branch block or paced rhythm, and patients whose ECGs had been obtained on analog ECG systems. However, we made no attempt to identify and exclude patients with either left ventricular enlargement or chronic obstructive pulmonary disease, conditions that can reduce the specificity of ECG criteria for AMI.¹ All the ECGs that we analyzed had been obtained on or near the day of each patient's catheterization.

We studied 137 patients. The normal group consisted of 82 patients (76 men and 6 women) aged 31 to 72 years (mean, 52 years). The AMI group consisted of 55 patients (all men) aged 37 to 81 years (mean, 58 years).

Table 10. Diagnostic Performances on the Current Set of Data

INTERPRETER    SENSITIVITY (%)    SPECIFICITY (%)    RELATIVE ODDS
12SL           64                 99                 142
MDs            75                 98                 117
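The relative odds in Table 10 can be approximately reproduced as a diagnostic odds ratio, (TP/FN)/(FP/TN). The Python sketch below rests on two assumptions that are not stated in the source: that "relative odds" means this ratio, and that the underlying counts are the ones back-calculated here from the rounded percentages and the group sizes (55 AMI patients, 82 normal subjects).

```python
def diagnostic_odds_ratio(tp, fn, fp, tn):
    """Odds of a positive call among diseased patients divided by the odds among normals."""
    return (tp / fn) / (fp / tn)


# Hypothetical counts, inferred from the rounded percentages in Table 10
# and the group sizes (55 AMI, 82 normal). They do not appear in the source.
counts = {
    "12SL": dict(tp=35, fn=20, fp=1, tn=81),
    "MDs": dict(tp=41, fn=14, fp=2, tn=80),
}

for name, c in counts.items():
    sensitivity = 100 * c["tp"] / (c["tp"] + c["fn"])
    specificity = 100 * c["tn"] / (c["tn"] + c["fp"])
    odds = diagnostic_odds_ratio(**c)
    print(f"{name}: sensitivity {sensitivity:.0f}%, "
          f"specificity {specificity:.0f}%, relative odds {odds:.0f}")
```

With these assumed counts the rounded sensitivity, specificity, and odds ratio agree with the tabulated values, which suggests (but does not prove) that the table was derived this way.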

1. Warner R, Reger M, Hill N et al: Electrocardiographic criteria for the diagnosis of anterior myocardial infarction. Importance of the duration of precordial R waves. Am J Cardiol 52:690, 1983


Repolarization

GE Marquette 12SL Program accuracy for recognizing ST segment elevation associated with acute myocardial infarction has been evaluated in a large study (n=1,189) that acquired ECGs from patients within 6 hours of the onset of chest pain. This study used cardiac enzymes as the gold standard.¹ Their conclusion: the positive predictive value of the computer- and physician-interpreted ECG was, respectively, 94% and 86% and the negative predictive value was 81% and 85%. The present algorithm is clearly adequate for first line screening of patients with chest pain by paramedics or in the emergency department. Its sensitivity is no worse than that of the emergency physician and its specificity is superior to that of the trained electrocardiographer. Raw numbers for algorithm performance are given in Table 8.

Table 11. Sensitivity and Specificity for Acute MI as Determined by Cardiac Enzymes

             SENSITIVITY       SPECIFICITY
Computer     202/391 (52%)     785/798 (98%)
Physician    259/391 (66%)     757/798 (95%)
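The predictive values quoted above follow from the raw counts in Table 11. A minimal Python sketch, assuming the standard definitions of sensitivity, specificity, and predictive value (the function and variable names are illustrative and are not part of the 12SL program):

```python
def summarize(tp, diseased, tn, healthy):
    """Derive the four standard accuracy measures from Table 11 style counts."""
    fn = diseased - tp   # infarctions the reader missed
    fp = healthy - tn    # non-infarctions called acute MI
    return {
        "sensitivity": tp / diseased,
        "specificity": tn / healthy,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


table_11 = {
    "Computer": summarize(tp=202, diseased=391, tn=785, healthy=798),
    "Physician": summarize(tp=259, diseased=391, tn=757, healthy=798),
}

for reader, stats in table_11.items():
    line = ", ".join(f"{name} {100 * value:.0f}%" for name, value in stats.items())
    print(f"{reader}: {line}")
```

The computer's 13 false positives against the physician's 41 account for its higher positive predictive value, while its 189 missed infarctions against 132 account for its lower negative predictive value.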

These results also led to the following conclusion: Although more sensitive, the electrocardiographer had an overall incidence of 5% false positive diagnoses, including a 22% incidence of false positive diagnoses in patients with isolated ST segment elevation. In contrast, the computer was nearly perfect at excluding patients without acute myocardial infarction, but did so at the expense of diminished sensitivity.

With regard to other repolarization abnormalities, program evaluation presents a difficult dilemma. Unlike abnormalities such as hypertrophy and infarction, conditions such as metabolic disorders and drug effects have few gold standards against which performance can be quantified. Furthermore, unlike rhythm and conduction, the ECG cannot be reliably used as the source for correct interpretation of these abnormalities. This dilemma is especially evident for statements such as non-specific ST/T abnormality. These statements refer to electrocardiographic features that have been identified without quantification, and there is still no standard for measuring them. It is difficult to quantify performance when the statement has no objective measure, either from the ECG itself or from other non-ECG data.

1. Kudenchuk et al., 1991. Accuracy of computer-interpreted electrocardiography in selecting patients for thrombolytic therapy. JACC 17(7):1486-91


Overall Classification

Several studies have addressed the issue of whether or not the computer can reliably classify the ECG as either normal or abnormal. These studies found:

The GE Marquette program is reliable in diagnosing normality: even the disagreements are arguable.¹

From a practical point of view, the eventual consensus opinion of the cardiologists was that only one tracing reported as normal by the GE Marquette system definitely should have been reported as abnormal to a family doctor, resulting in a negative predictive value of 98.4%. In view of the cardiologists' inter-observer variation with regard to what is normal, this may well be higher than an individual cardiologist's negative predictive value and suggests that the system examined may safely be used to exclude major abnormalities which would affect clinical management.

A total of 39,238 electrocardiograms were reviewed. The program placed the ECGs into the following diagnostic classifications: normal 22%, otherwise normal 6%, borderline 5%, abnormal 66%. The reviewing physician agreed with this classification in 96.3% of all cases. The most striking information shows the agreement of the physicians with the computer diagnosis of an abnormal electrocardiogram in 97.7% of the 25,295 tracings. In only 204 records out of 25,987 tracings (0.8%) did the physicians edit a computer-called abnormal electrocardiogram and change it to normal. Likewise, in only 63 of 8,632 (0.7%) tracings that the computer called normal did the physicians edit the tracing to read abnormal.²
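The edit rates and classification shares quoted above are simple proportions of the stated counts. A short Python check, using only the figures given in the text (the variable names are illustrative):

```python
total_ecgs = 39_238
computer_abnormal = 25_987   # tracings the computer called abnormal
computer_normal = 8_632      # tracings the computer called normal

# Physician edits that reversed the computer's classification.
abnormal_to_normal_rate = 204 / computer_abnormal
normal_to_abnormal_rate = 63 / computer_normal

# Share of all reviewed tracings that fell into each class.
abnormal_share = computer_abnormal / total_ecgs
normal_share = computer_normal / total_ecgs

print(f"abnormal edited to normal: {100 * abnormal_to_normal_rate:.1f}%")
print(f"normal edited to abnormal: {100 * normal_to_abnormal_rate:.1f}%")
print(f"abnormal share: {100 * abnormal_share:.0f}%, normal share: {100 * normal_share:.0f}%")
```

The counts are consistent with the quoted 0.8% and 0.7% edit rates and with the 66% abnormal and 22% normal classification shares.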

1. Graham et al., 1986. User evaluation of a commercially available computerized electrocardiographic interpretation system. In: Willems et al., eds. Computer ECG Analysis: Towards Standardisation. Amsterdam: North Holland, 1986;191-3.
2. Mulcahy et al., 1986. Can a computer diagnosis of normal ECG be trusted? Irish J of Med. Sci., 155(12):410-414


Appendix I Mean Differences and Corresponding Standard Deviations (in ms) of Basic Intervals Derived from the Global Reference Standards

                 P Duration       PR Interval      QRS Duration     QT Interval
                 (n=218)          (n=218)          (n=218)          (n=218)
CSI Program      Mean     SD      Mean     SD      Mean     SD      Mean     SD
GE Marquette     -0.4     9.0     -0.6     5.8     -0.6     5.4      0.9    12.2
Louvain           9.3     8.8      3.3     6.9      0.6     4.2     -7.0    11.6
Hannover          2.4     7.7      2.6     7.3     -1.2     6.0      1.8     9.5
HP                9.3    16.2      2.7    11.5      1.6     7.2      6.0    14.7
IBM               7.8     7.5      1.2     5.8      8.2     7.7      0.2    10.7
Nagoya            8.9    14.1      3.3    13.1      1.8    15.2      5.8    15.8
Lyon             11.4     8.5     -5.6     6.0     -0.1     6.1      0.3    10.4
AVA              -9.6     7.4     -3.8     4.8     -5.1     4.8     -7.3     8.9
Glasgow           0.5     9.3     -1.9     7.0      0.1     4.9      4.5    12.2
Halifax         -12.9    12.4     -3.9     6.3     -2.2     7.0     -3.2     9.2
Padova           -2.8     8.7     -2.4     5.4     -3.4     5.8     -4.1     6.4
Telemed          10.4    12.8     -0.2    10.3      8.6     8.1      4.6    12.3
Modular           6.9     7.9      4.3     4.2      0.3     7.8      3.9     9.8
Sicard-Riedl      1.7     8.1     -3.3     6.5      2.2     5.9     -0.9     7.9

Appendix II Gold Standard Databases


- Normals, proven via health screening examination (5000)
- Prolonged QT syndrome, proven via personal and family history (300)
- Inferior Myocardial Infarction, proven via CATH (700)
- Inferior Myocardial Infarction, proven via ECHO (200)
- Anterior Myocardial Infarction, proven via CATH (500)
- Normals, proven via CATH (500)
- Acute Myocardial Infarction, proven via cardiac enzymes (1200)
- Myocardial Infarction, proven via autopsy (500)
- Left Ventricular Hypertrophy, proven via ECHO (500)
- Normals from Health and Nutrition Evaluation Survey (2000)
- ECGs from Common Standards for Electrocardiography (1000)
- Serial ECGs correlated with evolving infarction (several thousand)
