### Circuits vs. Software

UPDATE: An improved ECG design was posted in August, 2016.
Check out: http://www.swharden.com/wp/2016-08-08-diy-ecg-with-1-op-amp/

Would I rather design circuits or software? I’m a software guy (likely due to my lack of working knowledge of circuits) so I’d rather record noisy signals and write software to eliminate the noise, rather than assembling circuits to eliminate the noise for me. In the case of my DIY ECG machine, I’d say I’ve done a great job of eliminating noise via the software route. Most DIY ECG circuits on the net use multiple op-amps and diodes to do this, and have a hardware-based band-pass filter to eliminate frequencies around 60 Hz. Instead of all that fancy stuff, I made a super-crude circuit (a single op-amp and two resisters) to record my ECG. It was INCREDIBLY noisy! So, how did I clean it up with software? I’ll tell you.

The first step in removing electrical noise is classifying it. Most of the noise in my signal were overlapping sine waves caused by my electrodes picking up signals not from my body. This was determined by simply close-up observation of the original trace. Since this sine-type interference is consistant, power-spectral analysis could be applied to determine the frequencies of the noise so I could block them out. I used the fast Fourier transform algorithm (FFT) on the values from my wave file to generate a plot of noise level with respect to frequency (noise was seen as sharp peaks). I manually blocked-out certain regions of the FFT trace that I thought were noise-related (colored bands, all the values of which were set to zero) – a process known as band-pass filtering (something which is possible to do electronically with a more complicated circuit) – then performed an inverse FFT algorithm on the trace. The result was a trace with greatly reduced noise and after a moving window smoothing algorithm was applied the signal was better than ever. Below are some figures. Note that I recorded the raw WAV file with “sound recorder” (not GoldWave) and did all of the processing (including band-pass filtering) within Python.

What an awesome chart! On top we have the power spectral analysis with band-stop filters applied at the colored regions. Below is the trace of the original WAV file (light gray), the inverse-FFT-filtered trace (dark gray), and the smoothed inverse-FFT-filtered trace (black) – the final ECG signal I intend to use.

This is a magnified view of a few heartbeats after the inverse FFT filtering (multi-band-blocking) process was applied. Not bad! Not bad at all…

And, of course, the updated code:

``` import wave, struct, numpy, pylab, scipy

fname='./success3.wav'

"""load raw data directly from a WAV file."""
global rate
w=wave.open(wavfilename,'rb')
(nchannel, width, rate, length, comptype, compname) = w.getparams()
print "[%s] %d HZ (%0.2fsec)" %(wavfilename, rate, length/float(rate))
return numpy.array(struct.unpack("%sh" %length*nchannel,frames))

def shrink(data,deg=100):
"""condense a linear data array by a multiple of [deg]."""
global rate
small=[]
print "starting with", len(data)
for i in range(len(data)/deg):
small.append(numpy.average(data[i*deg:(i+1)*deg]))
print "ending with", len(small)
rate = rate/deg
#return small[40000:50000]
return small

def normalize(data):
"""make all data fit between -.5 and +.5"""
data=data-numpy.average(data)
big=float(max(data))
sml=float(min(data))
data=data/abs(big-sml)
data=data+float(abs(min(data)))-.47
return data

def smooth(data,deg=20,expand=False):
"""moving window average (deg = window size)."""
for i in range(len(data)-deg):
if i==0: cur,smooth=sum(data[0:deg]),[]
smooth.append(cur/deg)
cur=cur-data[i]+data[i+deg]
if expand:
for i in range(deg):
smooth.append(smooth[-1])
return smooth

def smoothListGaussian(list,degree=10,expand=False):
window=degree*2-1
weight=numpy.array([1.0]*window)
weightGauss=[]
for i in range(window):
i=i-degree+1
frac=i/float(window)
gauss=1/(numpy.exp((4*(frac))**2))
weightGauss.append(gauss)
weight=numpy.array(weightGauss)*weight
smoothed=[0.0]*(len(list)-window)
for i in range(len(smoothed)):
smoothed[i]=sum(numpy.array(list[i:i+window])*weight)/sum(weight)
if expand:
for i in range((degree*2)-1):
smoothed.append(smoothed[-1])
return smoothed

def goodSmooth(data):
#data=smooth(fix,20,True)
data=smooth(fix,100,True)
#data=smooth(fix,20,True)
return data

def makeabs(data):
"""center linear data to its average value."""
for i in range(len(data)): data[i]=abs(data[i])
return data

def invert(data):
"""obviously."""
for i in range(len(data)): data[i]=-data[i]
return data

"""a do-everything function to get usable, smoothed data from a WAV."""
wav=shrink(wav)
wav=invert(wav)
wav=smooth(wav)
wav=smooth(wav,10)
wav=normalize(wav)
Xs=getXs(wav)
return Xs,wav

def getXs(datalen):
"""calculate time positions based on WAV frequency resolution."""
Xs=[]
for i in range(len(datalen)):
Xs.append(i*(1/float(rate)))
print len(datalen), len(Xs)
return Xs

def integrate(data):
"""integrate the function with respect to its order."""
inte=[]
for i in range(len(data)-1):
inte.append(abs(data[i]-data[i+1]))
inte.append(inte[-1])
return inte

def getPoints(Xs,data,res=10):
"""return X,Y coordinates of R peaks and calculate R-R based heartrate."""
pXs,pYs,pHRs=[],[],[]
for i in range(res,len(data)-res):
if data[i]>data[i-res]+.1 and data[i]>data[i+res]+.1:
if data[i]>data[i-1] and data[i]>data[i+1]:
pXs.append(Xs[i])
pYs.append(data[i])
if len(pXs)>1:
pHRs.append((1.0/(pXs[-1]-pXs[-2]))*60.0)
pHRs.append(pHRs[-1])
return pXs,pYs,pHRs

def bandStop(fft,fftx,low,high,show=True):
lbl="%d-%d"%(low,high)
print "band-stopping:",lbl
if show:
col=pylab.cm.spectral(low/1200.)
pylab.axvspan(low,high,alpha=.4,ec='none',label=lbl,fc=col)
#pylab.axvspan(-low,-high,fc='r',alpha=.3)
for i in range(len(fft)):
if abs(fftx[i])>low and abs(fftx[i])<high :
fft[i]=0
return fft

def getXs(data):
xs=numpy.array(range(len(data)))
xs=xs*(1.0/rate)
return xs

def clip(x,deg=1000):
return numpy.array(x[deg:-deg])

pylab.figure(figsize=(12,8))
xs = getXs(raw)
fftr = numpy.fft.fft(raw)
fft = fftr[:]
fftx= numpy.fft.fftfreq(len(raw), d=(1.0/(rate)))

pylab.subplot(2,1,1)
pylab.plot(fftx,abs(fftr),'k')

fft=bandStop(fft,fftx,30,123)
fft=bandStop(fft,fftx,160,184)
fft=bandStop(fft,fftx,294,303)
fft=bandStop(fft,fftx,386,423)
fft=bandStop(fft,fftx,534,539)
fft=bandStop(fft,fftx,585,610)
fft=bandStop(fft,fftx,654,660)
fft=bandStop(fft,fftx,773,778)
fft=bandStop(fft,fftx,893,900)
fft=bandStop(fft,fftx,1100,max(fftx))
pylab.axis([0,1200,0,2*10**6])
pylab.legend()
pylab.title("Power Spectral Analysis",fontsize=28)
pylab.ylabel("Power",fontsize=20)
pylab.xlabel("Frequency (Hz)",fontsize=20)

pylab.subplot(2,1,2)
pylab.title("Original Trace",fontsize=28)
pylab.ylabel("Potential",fontsize=20)
pylab.xlabel("Time (sec)",fontsize=20)
pylab.plot(clip(xs),clip(raw),color='.8',label='1: raw')

fix = scipy.ifft(fft)
pylab.plot(clip(xs),clip(fix)+5000,color='.6',label='2: band-stop')
pylab.plot(clip(xs),clip(goodSmooth(fix))-5000,'k',label='3: smoothed')
pylab.legend()
pylab.title("Band-Stop Filtered Trace",fontsize=28)
pylab.ylabel("Potential",fontsize=20)
pylab.xlabel("Time (sec)",fontsize=20)

pylab.savefig('out.png',dpi=100)
pylab.show()
print "COMPLETE"
```

### DIY ECG Trial 1

UPDATE: An improved ECG design was posted in August, 2016.
Check out: http://www.swharden.com/wp/2016-08-08-diy-ecg-with-1-op-amp/

I’ve succeeded in building my own electrocardiograph (ECG) to record the electrical activity of my own heart! Briefly, I built a micropotential amplifier using an op-amp and attached it to makeshift electrodes on my chest (pennies and shampoo lol), fed the amplified signal into my sound card, and recorded it as a WAV. The signal is INCREDIBLY noisy though. I was able to do a great job at removing this noise using band/frequency filters in GoldWave (audio editing software designed to handle WAV files). I band-blocked 50-70 Hz (which removed the oscillations from the 60 Hz AC lines running around my apartment). I then wrote the Python code (at the bottom of this entry) to load this WAV file as a single list of numbers (voltage potentials). I performed a data condensation algorithm (converting 100 points of raw WAV data into a single, averaged point, lessening my processing load by 100x), followed by two consecutative moving window averages (20-point window, performed on the condensed data). The result was a voltage reading that had most of the random interference oscillations removed and, behold, a BEAUTIFUL ECG signal!!! I also tossed in some code to determine the peak of the R wave, label it (yellow dot), and use the inverse R-R time distance to calculate heart rate. Nifty, huh?

I swear, this is legitimate data that I recorded myself tonight! This is my actual ECC signal as record by a circuit similar to the one in the previous entry, recorded through my sound card, and processed with the Python script below. Incredible, huh? You can clearly see the Q, R, S, and T spikes (described in the previous couple entries)! I can’t wait to solder-up a prototype (it’s currently breadboarded) and try to analyze hours of data rather than just a few seconds! I’ll take pictures of this device soon. Until then, here’s some real data!

This is a zoomed-in view of a representative QRST wave of mine. Amazing, huh? I’m proud of my ticker! =o)

This is the output of the python script I wrote tonight

And here’s the code I used: note that it relies on the WAV file I recorded – if you want it just ask me for it and I’ll email it to you – it’s about 12 sec long. Oh yeah, not all of the functions in this script are used to create the image (such as integration calculations), but I left ’em in because you may find them useful.

``` import wave, struct, numpy, pylab, scipy

"""load raw data directly from a WAV file."""
global rate
w=wave.open(wavfilename,'rb')
(nchannel, width, rate, length, comptype, compname) = w.getparams()
print "[%s] %d HZ (%0.2fsec)" %(wavfilename, rate, length/float(rate))
return numpy.array(struct.unpack("%sh" %length*nchannel,frames))

def shrink(data,deg=100):
"""condense a linear data array by a multiple of [deg]."""
global rate
small=[]
print "starting with", len(data)
for i in range(len(data)/deg):
small.append(numpy.average(data[i*deg:(i+1)*deg]))
print "ending with", len(small)
rate = rate/deg
return small

def normalize(data):
"""make all data fit between -.5 and +.5"""
data=data-numpy.average(data)
big=float(max(data))
sml=float(min(data))
data=data/abs(big-sml)
data=data+float(abs(min(data)))-.47
return data

def smooth(data,deg=20):
"""moving window average (deg = window size)."""
for i in range(len(data)-deg):
if i==0: cur,smooth=sum(data[0:deg]),[]
smooth.append(cur/deg)
cur=cur-data[i]+data[i+deg]
return smooth

def makeabs(data):
"""center linear data to its average value."""
for i in range(len(data)): data[i]=abs(data[i])
return data

def invert(data):
"""obviously."""
for i in range(len(data)): data[i]=-data[i]
return data

"""a do-everything function to get usable, smoothed data from a WAV."""
wav=shrink(wav)
wav=invert(wav)
wav=smooth(wav)
wav=smooth(wav,10)
wav=normalize(wav)
Xs=getXs(wav)
return Xs,wav

def getXs(datalen):
"""calculate time positions based on WAV frequency resolution."""
Xs=[]
for i in range(len(datalen)):
Xs.append(i*(1/float(rate)))
print len(datalen), len(Xs)
return Xs

def integrate(data):
"""integrate the function with respect to its order."""
inte=[]
for i in range(len(data)-1):
inte.append(abs(data[i]-data[i+1]))
inte.append(inte[-1])
return inte

def getPoints(Xs,data,res=10):
"""return X,Y coordinates of R peaks and calculate R-R based heartrate."""
pXs,pYs,pHRs=[],[],[]
for i in range(res,len(data)-res):
if data[i]>data[i-res]+.1 and data[i]>data[i+res]+.1:
if data[i]>data[i-1] and data[i]>data[i+1]:
pXs.append(Xs[i])
pYs.append(data[i])
if len(pXs)>1:
pHRs.append((1.0/(pXs[-1]-pXs[-2]))*60.0)
pHRs.append(pHRs[-1])
return pXs,pYs,pHRs

px,py,pHR=getPoints(Xs,Ys)

pylab.figure(figsize=(12,6))
pylab.subplot(2,1,1)
#pylab.axhline(color='.4',linestyle=':')
pylab.plot(Xs,Ys,'b-')
pylab.plot(px,py,'y.')
pylab.axis([None,None,-.6,.6])
pylab.title("DIY Electrocardiogram - Trial 1",fontsize=26)
pylab.ylabel("Normalized Potential",fontsize=16)
#pylab.xlabel("Time (sec)")
ax=pylab.axis()
pylab.subplot(2,1,2)
pylab.plot(px,pHR,'k:')
pylab.plot(px,pHR,'b.')
pylab.axis([ax[0],ax[1],None,None])
pylab.ylabel("Heart Rate (BPM)",fontsize=16)
pylab.xlabel("Time (seconds)",fontsize=16)
pylab.savefig("test.png",dpi=120)
pylab.show()
print "DONE"
```

### Compress Strings and Store to Files in Python

While writing code for my graduate research thesis I came across the need to lightly compress a huge and complex variable (a massive 3D data array) and store it in a text file for later retrieval. I decided to use the zlib compression library because it’s open source and works pretty much on every platform. I ran into a snag for a while though, because whenever I loaded data from a text file it wouldn’t properly decompress. I fixed this problem by adding the “rb” to the open line, forcing python to read the text file as binary data rather than ascii data. Below is my code, written in two functions to save/load compressed string data to/from files in Python.

```

import zlib

def saveIt(data,fname):

data=str(data)

data=zlib.compress(data)

f=open(fname,'wb')

f.write(data)

f.close()

return

def openIt(fname,evaluate=True):

f=open(fname,'rb')

f.close()

data=zlib.decompress(data)

if evaluate: data=eval(data)

return data

```

Oh yeah, don’t forget the evaluate option in the openIt function. If set to True (default), the returned variable will be an evaluated object. For example, “[[1,2],[3,4]]” will be returned as an actual 2D list, not just a string. How convenient is that?

### Linear Data Smoothing in Python

Here’s a scrumptious morsel of juicy python code for even the most stoic of scientists to get excited about. Granted, it’s a very simple concept and has surely been done countless times before, but there aren’t any good resources for this code on the internet. Since I had to write my own code to perform a variety of different linear 1-dimensional array data smoothing in python, I decided it would be nice to share it. At the bottom of this post you can see a PNG image which is the file output by the code listen even further below. If you copy/paste the code into an empty text file and run it in Python, it will generate the exact same PNG file (assuming you have pylab and numpy libraries configured).

```

### This is the Gaussian data smoothing function I wrote ###

def smoothListGaussian(list,degree=5):

window=degree*2-1

weight=numpy.array([1.0]*window)

weightGauss=[]

for i in range(window):

i=i-degree+1

frac=i/float(window)

gauss=1/(numpy.exp((4*(frac))**2))

weightGauss.append(gauss)

weight=numpy.array(weightGauss)*weight

smoothed=[0.0]*(len(list)-window)

for i in range(len(smoothed)):

smoothed[i]=sum(numpy.array(list[i:i+window])*weight)/sum(weight)

return smoothed

```

Basically, you feed it a list (it doesn’t matter how long it is) and it will return a smoother version of the data. The Gaussian smoothing function I wrote is leagues better than a moving window average method, for reasons that are obvious when viewing the chart below. Surprisingly, the moving triangle method appears to be very similar to the Gaussian function at low degrees of spread. However, for huge numbers of data points, the Gaussian function should perform better.

```

### This is the code to produce the image displayed above ###

import pylab,numpy

def smoothList(list,strippedXs=False,degree=10):

if strippedXs==True: return Xs[0:-(len(list)-(len(list)-degree+1))]

smoothed=[0]*(len(list)-degree+1)

for i in range(len(smoothed)):

smoothed[i]=sum(list[i:i+degree])/float(degree)

return smoothed

def smoothListTriangle(list,strippedXs=False,degree=5):

weight=[]

window=degree*2-1

smoothed=[0.0]*(len(list)-window)

for x in range(1,2*degree):weight.append(degree-abs(degree-x))

w=numpy.array(weight)

for i in range(len(smoothed)):

smoothed[i]=sum(numpy.array(list[i:i+window])*w)/float(sum(w))

return smoothed

def smoothListGaussian(list,strippedXs=False,degree=5):

window=degree*2-1

weight=numpy.array([1.0]*window)

weightGauss=[]

for i in range(window):

i=i-degree+1

frac=i/float(window)

gauss=1/(numpy.exp((4*(frac))**2))

weightGauss.append(gauss)

weight=numpy.array(weightGauss)*weight

smoothed=[0.0]*(len(list)-window)

for i in range(len(smoothed)):

smoothed[i]=sum(numpy.array(list[i:i+window])*weight)/sum(weight)

return smoothed

### DUMMY DATA ###

data = [0]*30 #30 "0"s in a row

data[15]=1    #the middle one is "1"

### PLOT DIFFERENT SMOOTHING FUNCTIONS ###

pylab.figure(figsize=(550/80,700/80))

pylab.suptitle('1D Data Smoothing', fontsize=16)

pylab.subplot(4,1,1)

p1=pylab.plot(data,".k")

p1=pylab.plot(data,"-k")

a=pylab.axis()

pylab.axis([a[0],a[1],-.1,1.1])

pylab.text(2,.8,"raw data",fontsize=14)

pylab.subplot(4,1,2)

p1=pylab.plot(smoothList(data),".k")

p1=pylab.plot(smoothList(data),"-k")

a=pylab.axis()

pylab.axis([a[0],a[1],-.1,.4])

pylab.text(2,.3,"moving window average",fontsize=14)

pylab.subplot(4,1,3)

p1=pylab.plot(smoothListTriangle(data),".k")

p1=pylab.plot(smoothListTriangle(data),"-k")

pylab.axis([a[0],a[1],-.1,.4])

pylab.text(2,.3,"moving triangle",fontsize=14)

pylab.subplot(4,1,4)

p1=pylab.plot(smoothListGaussian(data),".k")

p1=pylab.plot(smoothListGaussian(data),"-k")

pylab.axis([a[0],a[1],-.1,.4])

pylab.text(2,.3,"moving gaussian",fontsize=14)

#pylab.show()

pylab.savefig("smooth.png",dpi=80)

```

Hey, I had a great idea, why don’t I test it on some of my own data? Due to the fact that I don’t want the details of my thesis work getting out onto the internet too early, I can’t reveal exactly what this data is from. It will suffice to say that it’s fractional density of neurite coverage in thick muscle tissue. Anyhow, this data is wild and in desperate need of some smoothing. Below is a visual representation of the differences in the methods of smoothing. Yayness! I like the gaussian function the best.

I should note that the degree of window coverage for the moving window average, moving triangle, and gaussian functions are 10, 5, and 5 respectively. Also note that (due to the handling of the “degree” variable between the different functions) the actual number of data points assessed in these three functions are 10, 9, and 9 respectively. The degree for the last two functions represents “spread” from each point, whereas the first one represents the total number of points to be averaged for the moving average. Enjoy.