ECG T-wave processing¶

The goal of this study is to find an automatic methodology to identify T-Wave alternants from the analysis of an ECG signal. In order to do that, we will use the wfdb library in Python.

We start by import the usual libraries for signal processing and graphical representation.

import sys
import numpy as np
import math
import wfdb
from wfdb import processing
import matplotlib.pyplot as plt
import pandas as pd

Along the process we will see the different uses of each of this libraries. We use the twadb database from Physionet. This database was used previously in the 2008 Computers in Cardiology Challenge.

We start by defining a variable that will carry the ecg filename and read the annotations.

ecgname = 'twa11'
annsignals = wfdb.rdann(ecgname,'qrs',pb_dir='twadb')

Annotations will matter because, in this case, will give the position of the QRS's. As we will see later, these annotations will produce bad results in the automatically delimitation of the T-wave segments. From these, we compute the much needed mean duration of a heart beat for this specific signal. Notice the degree of variability among all the signals.

diffbeats =np.delete(np.roll(annsignals.sample,-1),len(annsignals.sample)-1) - np.delete(annsignals.sample,len(annsignals.sample)-1)

beatmean = math.ceil(np.mean(diffbeats))
print(beatmean)

394

The idea of the above cell is pretty simple. Instead of making a for loop (on which python is pretty lame), we make the difference of consecutive positions of the array in the following way. We create an array, called tmp1, with consecutive numbers. We shifted each entry one step to the left. After we delete the last entry of this copy and of the original array and, finally, we do the difference elementwise. Successively,

tmp1 = np.arange(10)
print(np.roll(tmp1,-1))
print(np.delete(np.roll(tmp1,-1),len(tmp1)-1))
print(np.delete(np.roll(tmp1,-1),len(tmp1)-1)-np.delete(tmp1,len(tmp1)-1))

[1 2 3 4 5 6 7 8 9 0]
[1 2 3 4 5 6 7 8 9]
[1 1 1 1 1 1 1 1 1]

At the end, we just compute the average and the result is the variable beatmean. We load the main caracheteristics of the file and the physical records into fields and signals. We print the fields content to get some perspective of the information therein.

signals , fields = wfdb.rdsamp(ecgname,channels = 'all',pb_dir='twadb')
print(fields)

{'fs': 500, 'sig_len': 56349, 'n_sig': 12, 'base_date': None, 'base_time': None, 'units': ['mV', 'mV', 'mV', 'mV', 'mV', 'mV', 'mV', 'mV', 'mV', 'mV', 'mV', 'mV'], 'sig_name': ['I', 'II', 'III', 'aVR', 'aVL', 'aVF', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6'], 'comments': []}

Among relevant information, the values of sample frequency 'fs', of signal number of samples 'sig_len' and number of leads 'n_sig', are of special importance for us. Being so, we load them into the variables:

fqsamp = fields['fs']
nsigs = fields['n_sig']
lensig = fields['sig_len']
halffqsamp = int(fqsamp/2)

The first problem that we face is the presence of a baseline wander in our signal. This is characterized by a presence of exogenous fluctuation of the signal. To get some sense of this and a context solution for some individual file, please see the notebook Baseline wander presence in ECG. Here, we compute this for the complete length of the signal and for all leads at the same time.

# we need a basaeline for each lead. We do not need to declare before use. We kept it to be aware of the strucuture.
# baseline = np.zeros((nsigs,lensig-fqsamp))
# flat_signal = baseline

# I just prefer to have the values along the rows
shift_baseline = np.transpose(pd.DataFrame(signals).rolling(fqsamp).mean())
baseline = np.asarray(shift_baseline.dropna(axis = 1))
flat_signal = np.delete(np.roll(np.transpose(signals),-halffqsamp),np.s_[lensig-fqsamp+1:lensig],axis = 1) - baseline

To understand what was done, we make a graphical representation of part of the lead of the signal. Changing the values of firstbeat and last beat we will get more or less detail.

# # this is just a representation of the signals to see that everything is going OK!!!
firstbeat = annsignals.sample[0]
lastbeat = annsignals.sample[30]
    
tt = np.linspace(0,lastbeat - firstbeat, lastbeat - firstbeat)

lead = 1

plt.figure(figsize=(12,9))

plt.subplot(211)
plt.title("ECG with Baseline wander")
plt.plot(tt,np.roll(signals[firstbeat:lastbeat,lead],-halffqsamp))
plt.plot(tt , baseline[lead,firstbeat:lastbeat])

plt.subplot(212)
plt.title("ECG without Baseline wander")
plt.plot(tt,flat_signal[lead,firstbeat:lastbeat])
plt.show()

Now, we have just arrived to the crucial part of the method to identify the T-Wave alternans. To do it, we will look in the first 64 beats and try to locate the presence in the signal of the 64-period and 128-period phenomena. Because the rest of the signal have frequencies that overlap the ones we are looking for, we will create a method to copy the T-Wave segment to a new array. It will be over this array that we will compute the FFT.

# We add one more beat because the diff array has one element less than the origianl
numbeats = 65
init_beatnum = 0 
firstbeat = 0

partial_diffbeats = np.delete(np.roll(annsignals.sample[init_beatnum:init_beatnum + numbeats],-1),numbeats-1)\
- np.delete(annsignals.sample[init_beatnum:init_beatnum + numbeats],numbeats-1)

# the typical duration of the QRS interval is located in the interval [0.08,0.12]
qrs_length = math.ceil(fqsamp * 0.08)
init_tint = qrs_length
finit_tint = init_tint + math.ceil(0.5 * beatmean)

lead = 1

batida = 0
shift_pos = 0 

# just declare an empyt array
tsignal = np.zeros(0)

So we are located at the beginning of the signal. In front of us we have 65 beats of the lead 1 (counting from 0). The array partial_diffbeats tell us how much we need to move to jump to the next QRS. From that point we compute the local peak and put ourselves at the right of this point. To do this, we use the information that the typical QRS-segment's duration is located in the interval (0.08,0.12) seconds. Then a segment with length of 0.5 * beatmean to array tsignal.

for i in range(numbeats-1):
    
    qrspos = processing.find_local_peaks(flat_signal[lead,batida:batida + beatmean],beatmean)
    shift_pos = qrspos[0] + batida
    my_partial = flat_signal[lead,shift_pos + init_tint : shift_pos + finit_tint]
    
    batida += partial_diffbeats[i] 
    
    tsignal = np.concatenate((tsignal,my_partial))

One example of the results of this procedure is represented in the image bellow.

Now we are finally in conditions to use FFT over the tsignal array. We set to zero all the values before a certain entry (typically there are a huge peak at the beginning of the result of the application of the FFT). Then we compute the maximum value in 0:75, set to zero all the entries of the array bellow 90% of this value and count the number of positive entries in the result. If more than 2 (cases of twa55 and twa66) we have evidences of T-Wave alternans. When we have just 1 peak then there are no evidences of T-Wave alternans (the case of twa11). This is done by

# now we compute the fft of the tsignal
fft_tsignal = np.absolute(np.fft.fft(tsignal))

# we are just lookin for the persistence of period 128 over 64
comprimento = 150  #int(len(tsignal)/2)

# Now we will cut the signal to preserve just the most important features
partialfft = fft_tsignal[0:comprimento]
partialfft[0:20] = 0
local64 = np.amax(partialfft[0:int(comprimento/2)])
erase_values = partialfft < 0.9 * local64
partialfft[erase_values] = 0

goldjoy = (partialfft > 0).sum()
print()

if (goldjoy > 1):
    print("There are evidences of T-wave alternans in the first ",numbeats-1," beats of ecg ",ecgname, "\n\n" , sep = "")
else :
    print("There are no evidences of T-wave alternans in the first ", numbeats-1," beats of ecg ",ecgname, "\n\n" , sep = "")

There are no evidences of T-wave alternans in the first 64 beats of ecg twa11

The result of this method over FFT has the followin graphical representation

xx = np.linspace(0,comprimento,comprimento)

plt.figure(figsize=(12,9))
plt.stem(xx,partialfft)
plt.title('Fourier Transform of %s' %ecgname)
plt.show()