AI to study heart function using echo

Introduction

Echocardiography is an ultrasound-based diagnostic technique used to evaluate the anatomy and function of the human heart. It allows initial screening for systolic and diastolic heart failure, and echocardiographic imaging is needed to differentiate between systolic and diastolic dysfunction. However, manual interpretation of echocardiograms can be time-consuming and prone to error.

The article summarized here reports how researchers created a fully automated deep-learning workflow to classify, segment, and annotate two-dimensional (2D) videos and Doppler modalities in echocardiograms.

Study Methods

The workflow was developed using data from a prospective heart failure research platform in Asia (Asian Network for Translational Research and Cardiovascular Trials; ATTRaCT), with earlier manual tracings by professional sonographers as the reference. It was then validated against manual measurements in a curated dataset from Canada (Alberta Heart Failure Etiology and Analysis Research Team; HEART; n=1029 echocardiograms), a real-world dataset from Taiwan (n=31 241), the US-based EchoNet-Dynamic dataset (n=10 030), and an independent prospective assessment of the Asian (ATTRaCT) and Canadian (Alberta HEART) datasets (n=142) with repeated independent measurements by two qualified sonographers. The training sample comprised 1145 echocardiograms from 1076 people. A further sample included 406 echocardiograms from 406 patients. Sample sizes were determined by data availability. The accuracy of the automated measurements was then confirmed in the three external datasets without further adjustment of the workflow. Patients in the test and validation sets did not overlap.

Results

In the test set from Asia, convolutional neural networks (CNNs) accurately classified the views of 2D videos and the Doppler modalities, with accuracy ranging from 91.1% for pulsed-wave tissue Doppler imaging (PWTDI) medial to 98.9% for the parasternal long-axis (PLAX) view. The CNNs were also able to segment the cardiac chambers, with a mean Dice similarity coefficient (a measure of the overlap between annotations) ranging from 93.0% to 94.3% for the left atrium and left ventricle. The correlation between automated and manual measurements ranged from r=0.88 for E wave (mean absolute error [MAE] 7.4 cm/s) to r=0.95 for left ventricular end-systolic volume (LVESV; MAE 10.2 mL). For the most clinically important measures, the correlation between automated and manual measurements was r=0.89 (MAE 5.5%) for left ventricular ejection fraction (LVEF), r=0.92 (MAE 0.7 cm/s) for e’ lateral, and r=0.90 (MAE 1.7) for E/e’ ratio. The area under the receiver operating characteristic curve (AUC) for detecting participants with systolic dysfunction (LVEF <40%) was 0.96 (95% CI 0.92-0.99); it was 0.95 (0.88-0.99) for an e’ lateral wave velocity less than 10 cm/s and 0.96 (0.92-0.99) for an E/e’ ratio of 13 or greater. In post hoc interaction analysis, the relationship between ground truth E/e’ ratio and automated E/e’ readings was not altered by age, BMI, or gender (p for interaction >0.10).
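As a rough illustration of the Dice similarity coefficient mentioned above, the following minimal sketch computes it for two small binary masks. The function name `dice_coefficient` and the toy masks are assumptions made for this example only; they are not taken from the study, which does not describe its implementation here.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity between two binary masks given as nested lists of 0/1.

    Dice = 2 * |A intersect B| / (|A| + |B|), ranging from 0 (no overlap)
    to 1 (identical masks).
    """
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same shape")
    overlap = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * overlap / total


# Hypothetical toy masks (1 = pixel labelled as chamber), not study data.
manual_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
automated_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

score = dice_coefficient(manual_mask, automated_mask)
print(f"Dice: {score:.3f}")
```

Here the manual mask has 4 labelled pixels, the automated mask has 3, and they share 3, so the coefficient is 2 x 3 / 7, or about 86%.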

The correlations between automated and manual measurements varied from r=0.67 for e’ medial (MAE 1.0 cm/s) to r=0.91 for LVESV (MAE 17.5 mL). The correlation between automated and manual measures was r=0.75 (MAE 8.6%) for LVEF, r=0.78 (MAE 1.2 cm/s) for e’ lateral, and r=0.75 (MAE 2.2) for E/e’ ratio. Based on automated measures, the AUC was 0.91 (0.88-0.94) for identifying people with LVEF less than 40%, 0.88 (0.84-0.92) for an e’ lateral velocity less than 10 cm/s, and 0.91 (0.88-0.94) for an E/e’ ratio of 13 or above. In the Taiwan dataset, 0-2.9% of 2D videos and Doppler modality images had poor view quality, whereas 1.3-28.1% had poor measurement quality. The correlations between automated and manual measurements varied from r=0.62 for left atrial end-systolic volume (LAESV; MAE 9.2 mL) to r=0.88 for e’ lateral (MAE 1.6 cm/s).
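The agreement statistics reported above (MAE, Pearson correlation r, and AUC) can be sketched in a few lines of plain Python. Everything below is an illustrative assumption: the helper names, the six made-up LVEF pairs, and the pairwise (Mann-Whitney style) way of computing AUC are chosen for clarity and are not the study's code or data.

```python
import math


def mean_absolute_error(manual, automated):
    """Average absolute difference between paired measurements."""
    if len(manual) != len(automated):
        raise ValueError("inputs must have the same length")
    return sum(abs(m - a) for m, a in zip(manual, automated)) / len(manual)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A higher score should indicate the positive class; ties count as half.
    """
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical LVEF values in percent (manual reference vs automated).
manual_lvef = [55.0, 35.0, 60.0, 45.0, 30.0, 38.0]
automated_lvef = [50.0, 40.0, 65.0, 42.0, 33.0, 44.0]

mae = mean_absolute_error(manual_lvef, automated_lvef)
r = pearson_r(manual_lvef, automated_lvef)

# Systolic dysfunction defined on the manual reference as LVEF < 40%.
# Lower automated LVEF should indicate dysfunction, so the score is negated.
has_dysfunction = [v < 40.0 for v in manual_lvef]
auc_value = auc([-v for v in automated_lvef], has_dysfunction)

print(f"MAE = {mae:.1f}%, r = {r:.2f}, AUC = {auc_value:.2f}")
```

In practice, established libraries are normally used for these statistics together with confidence intervals; the pairwise AUC above is only meant to show what the number measures.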

Conclusion

After developing and training the automated workflow for echocardiographic analysis in the training dataset from Asia (ATTRaCT program), the researchers evaluated its performance in an internal test set and in independent, previously unseen validation datasets. The method was externally validated using three datasets: a curated dataset from Canada (Alberta HEART Study), a real-world dataset from Taiwan (Mackay Memorial Hospital), and a reference dataset from the United States (EchoNet-Dynamic dataset). In the Canadian cohort, 0-2.0% of the 2D videos and Doppler modalities had poor view quality, while 1.3-10.9% had poor measurement quality.

In the Canadian and Taiwanese cohorts, the MAEs of the measurements were greater in patients with atrial fibrillation than in those without.

However, the r values for LVESV and LVEDV in the Taiwanese cohort, and for LVEF, LAESV, and E/e’ ratio in the Canadian cohort, were similar or greater in patients with atrial fibrillation than in those without.

The researchers used the US EchoNet-Dynamic dataset, which contains 10 030 clinically recorded LVEF values, to validate their automated LVEF measurements.

Reference

Lancet Digit Health. 2022 Jan;4(1):e46-e54.
