THIS CODE HAS BEEN UPDATED!
THIS CODE HAS BEEN UPDATED!
THIS CODE HAS BEEN UPDATED!

>>> CHECK OUT THE NEW CODE < << [Generate Apache-Style HTTP Access Logs via SQL and PHP]

OBSOLETE CODE IS BELOW…

My web server blocks access to my apache-generated visitor logs (commonly stored in “access.log”). Therefore, many great site usage stats generators (such as awstats – see this example) cannot be used to analyze web traffic to my site. (How many people go what pages? Where do they come from? What search phrases do they type into Google to find my website?) My web host does allow PHP, and access to php.ini, so I figured that I could generate my own access.log using PHP code. I succeeded, but had a hard time doing this because it’s not clearly documented elsewhere – so I’ll make it clear.

Sample line from access.log generated by my PHP script:
132.170.10.227 – – [22/Jan/2009:11:58:49 +0800] “GET /blog/2005-06-29-eva-05-attack-scotts-sanity/ HTTP/1.1” 200 – “http://www.google.com/search?hl=en&client=firefox-a&rls=org.mozilla%3Aen-US%3Aofficial&hs=8Lk&q=swharden+eva-05&btnG=Search” “Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5”

All I had to do was insert the following line at the end of my php.ini file:

  

 auto_append_file = "/home/content/n/i/b/nibjb/html/logme.php"  

 

And I placed logme.php in my root folder with the following code:

  

 $logwriter_logformat = "combined"; // log format,combined or common  

 $logwriter_logdir = "/home/content/n/i/b/nibjb/html/logs/"; // physical path where your log file located  

 $logwriter_logfilename = "access.log"; // your log file's filename  

 $logwriter_timezone = "+0800"; // your server's time zone. +0800 means GMT+8  

   

 function logwriter_writelog($logstring){  

   

 global $logwriter_logdir,$logwriter_logfilename;  

 $fullpathfilename = $logwriter_logdir.$logwriter_logfilename;  

   

 if (!is_file($fullpathfilename)) {  

 print "Log file doesn't exist or file is corrupt.";  

 return;  

 }  

   

 if (!is_writeable($fullpathfilename)) {  

 print "Log file is not writable,please change its permission.";  

 return;  

 }  

   

 if($fp = @fopen($fullpathfilename, "a")) {  

 flock($fp, 2);  

 fputs($fp, $logstring);  

 fclose($fp);  

 }  

 }  

   

 function logwriter_handlevar($varname,$defaultvalue) {  

 $tempvar = getenv($varname);  

 if(!empty($tempvar)) {  

 return $tempvar;  

 } else {  

 return $defaultvalue;  

 }  

 }  

   

 if (!empty($REMOTE_HOST)) {  

 $logwriter_remote_vistor = $REMOTE_HOST;  

 }else{  

 $logwriter_remote_vistor = logwriter_handlevar("REMOTE_ADDR","-");  

 }  

   

 $logwriter_remote_ident = logwriter_handlevar("REMOTE_IDENT","-");  

 $logwriter_remote_user = logwriter_handlevar("REMOTE_USER","-");  

 $logwriter_date = date("d/M/Y:H:i:s");  

   

 $logwriter_server_port = logwriter_handlevar("SERVER_PORT","80");  

 if($logwriter_server_port!="80") {  

 $logwriter_server_port = <a href="".$logwriter_server_port;"></a>  

 }else{  

 $logwriter_server_port = "";  

 }  

   

 $logwriter_request_method = logwriter_handlevar("REQUEST_METHOD","GET");  

 $logwriter_request_uri = logwriter_handlevar("REQUEST_URI","");  

 $logwriter_server_protocol = logwriter_handlevar("SERVER_PROTOCOL","HTTP/1.1");  

   

 if ($logwriter_logformat=="common") {  

 $logwriter_logstring = "$logwriter_remote_vistor $logwriter_remote_ident $logwriter_remote_user [$logwriter_date $logwriter_timezone] "$logwriter_request_method $logwriter_request_uri $logwriter_server_protocol" 200 - 

 ";  

 }else{  

   

 $logwriter_http_referer = logwriter_handlevar("HTTP_REFERER","-");  

 $logwriter_http_user_agent = logwriter_handlevar("HTTP_USER_AGENT","");  

   

 $logwriter_logstring = "$logwriter_remote_vistor $logwriter_remote_ident $logwriter_remote_user [$logwriter_date $logwriter_timezone] "$logwriter_request_method $logwriter_request_uri $logwriter_server_protocol" 200 - "$logwriter_http_referer" "$logwriter_http_user_agent" 

 ";  

   

 }  

   

 logwriter_writelog($logwriter_logstring);  

 

Note that the PHP code must be surrounded with < ? php ?> as demonstrated here

The result? As you can tell, my logme.php dumps data to www.swharden.com/logs/access.log – if you browse a few pages on my website, or even use Google to search for me (ie: google for ‘swharden’ and ‘minidisc’) you can see yourself in the logfile – pretty cool huh? Once I have a good volume of log data I’ll demonstrate how to turn it into useful information.





Additional Resources

UPDATE: An improved ECG design was posted in August, 2016.
Check out: http://www.swharden.com/wp/2016-08-08-diy-ecg-with-1-op-amp/

It’s time for a lecture. I’ve been spending a lot of time creating a DIY dlectrocardiogram and it produces fairly noisy signals. I’ve spent some time and effort researching the best ways to clean-up these signals, and the results are incredibly useful! Therefore, I’ve decided to lightly document these results in a blog entry.

Here’s an example of my magic! I take a noisy recording and turn it into a beautiful trace. See the example figure with the blue traces. How is this possible? Well I’ll explain it for you. Mostly, it boils down to eliminating excess high-frequency sine waves which are in the original recording due to electromagnetic noise. A major source of noise can be from the alternating current passing through wires traveling through the walls of your house or building. My original ECG circuit was highly susceptible to this kind of interference, but my improved ECG circuit eliminates most of this noise. However, noise is still in the trace (see the figure to the left), and it needed to be removed.

The key is the FFT (Fast Fourier Transformation) algorithm which can get pretty intimidating to research at first! I’ll simplify this process. Let’s say you have a trace with repeating sine-wave-shaped noise. The output of the FFT transformation of the signal is the breakdown of the signal by frequency. Check out this FFT trace of a noisy signal from a few posts ago (top graph). High peaks represent frequencies which are common. See the enormous peak around 60 Hz? (Hz means “per second” by the way, so 60 Hz is a sine wave that repeats 60 times a second) That’s from my AC noise. Other peaks (shown in colored bands) are other electromagnetic noise sources, such as wireless networks, TVs, telephones, and your computer processor. The heart produces changes in electricity that are very slow (the heartbeat is about 1 Hz, or 1 beat per second), so if we can eliminate all of the sine waves with frequencies higher than what we want to isolate we can get a pretty clear trace. This is called a band-stop filter (we block-out certain bands of frequencies) A band-pass filter is the opposite, where we only allow frequencies which are below (low-pass) or above (high-pass) a given frequency. By eliminating each of the peaks in the colored regions (setting each value to 0), then performing an inverse fast Fourier transformation (going backwards from frequency back to time), the result is the signal trace (seen as light gray on the bottom graph) with those high-frequency sine waves removed! (the gray trace on the bottom graph). A little touch-up smoothing makes a great trace (black trace on the bottom graph).

Here’s some Python code to get you started in cleaning-up your noisy signals! The image below is the output of the Python code at the bottom of this entry. This python file requires that test.wav (~700kb) (an actual ECG recording of my heartbeat) be saved in the same folder. Brief descriptions of each portion of the graph will follow.

(A) The original signal we want to isolate. (IE: our actual heart signal)

(B) Some electrical noise. (3 sine waves of different amplitudes and periods)

(C) Electrical noise (what happens when you add those 3 sine waves together)

(D) Static (random noise generated by a random number generator)

(E) Signal (A) plus static (D)

(F) Signal (A) plus static (D) plus electrical noise (C)

(G) Total FFT trace of (F). Note the low frequency peak due to the signal and electrical noise (near 0) and the high frequency peak due to static (near 10,000)

(H) This is a zoomed-in region of (F) showing 4 peaks (one for the original signal and 3 for high frequency noise). By blocking-out (set it to 0) everything above 10Hz (red), we isolate the peak we want (signal). This is a low-pass filter.

(I) Performing an inverse FFT (iFFT) on the low-pass iFFT, we get a nice trace which is our original signal!

(J) Comparison of our iFFT with our original signal shows that the amplitude is kinda messed up. If we normalize each of these (set minimum to 0, maximum to 1) they line up. Awesome!

(K) How close were we? Graphing the difference of iFFT and the original signal shows that usually we’re not far off. The ends are a problem though, but if our data analysis trims off these ends then our center looks great.

Here’s the code I used to make the image:

 import numpy, scipy, pylab, random

 # This script demonstrates how to use band-pass (low-pass)
 # filtering to eliminate electrical noise and static
 # from signal data!

 ##################
 ### PROCESSING ###
 ##################

 xs=numpy.arange(1,100,.01) #generate Xs (0.00,0.01,0.02,0.03,...,100.0)
 signal = sin1=numpy.sin(xs*.3) #(A)
 sin1=numpy.sin(xs) # (B) sin1
 sin2=numpy.sin(xs*2.33)*.333 # (B) sin2
 sin3=numpy.sin(xs*2.77)*.777 # (B) sin3
 noise=sin1+sin2+sin3 # (C)
 static = (numpy.random.random_sample((len(xs)))-.5)*.2 # (D)
 sigstat=static+signal # (E)
 rawsignal=sigstat+noise # (F)
 fft=scipy.fft(rawsignal) # (G) and (H)
 bp=fft[:]
 for i in range(len(bp)): # (H-red)
     if i>=10:bp[i]=0
 ibp=scipy.ifft(bp) # (I), (J), (K) and (L)

 ################
 ### GRAPHING ###
 ################

 h,w=6,2
 pylab.figure(figsize=(12,9))
 pylab.subplots_adjust(hspace=.7)

 pylab.subplot(h,w,1);pylab.title("(A) Original Signal")
 pylab.plot(xs,signal)

 pylab.subplot(h,w,3);pylab.title("(B) Electrical Noise Sources (3 Sine Waves)")
 pylab.plot(xs,sin1,label="sin1")
 pylab.plot(xs,sin2,label="sin2")
 pylab.plot(xs,sin3,label="sin3")
 pylab.legend()

 pylab.subplot(h,w,5);pylab.title("(C) Electrical Noise (3 sine waves added together)")
 pylab.plot(xs,noise)

 pylab.subplot(h,w,7);pylab.title("(D) Static (random noise)")
 pylab.plot(xs,static)
 pylab.axis([None,None,-1,1])

 pylab.subplot(h,w,9);pylab.title("(E) Signal + Static")
 pylab.plot(xs,sigstat)

 pylab.subplot(h,w,11);pylab.title("(F) Recording (Signal + Static + Electrical Noise)")
 pylab.plot(xs,rawsignal)

 pylab.subplot(h,w,2);pylab.title("(G) FFT of Recording")
 fft=scipy.fft(rawsignal)
 pylab.plot(abs(fft))
 pylab.text(200,3000,"signals",verticalalignment='top')
 pylab.text(9500,3000,"static",verticalalignment='top',
            horizontalalignment='right')

 pylab.subplot(h,w,4);pylab.title("(H) Low-Pass FFT")
 pylab.plot(abs(fft))
 pylab.text(17,3000,"sin1",verticalalignment='top',horizontalalignment='left')
 pylab.text(37,2000,"sin2",verticalalignment='top',horizontalalignment='center')
 pylab.text(45,3000,"sin3",verticalalignment='top',horizontalalignment='left')
 pylab.text(6,3000,"signal",verticalalignment='top',horizontalalignment='left')
 pylab.axvspan(10,10000,fc='r',alpha='.5')
 pylab.axis([0,60,None,None])

 pylab.subplot(h,w,6);pylab.title("(I) Inverse FFT")
 pylab.plot(ibp)

 pylab.subplot(h,w,8);pylab.title("(J) Signal vs. iFFT")
 pylab.plot(signal,'k',label="signal",alpha=.5)
 pylab.plot(ibp,'b',label="ifft",alpha=.5)

 pylab.subplot(h,w,10);pylab.title("(K) Normalized Signal vs. iFFT")
 pylab.plot(signal/max(signal),'k',label="signal",alpha=.5)
 pylab.plot(ibp/max(ibp),'b',label="ifft",alpha=.5)

 pylab.subplot(h,w,12);pylab.title("(L) Difference / Error")
 pylab.plot(signal/max(signal)-ibp/max(ibp),'k')

 pylab.savefig("SIG.png",dpi=200)
 pylab.show()




Additional Resources

UPDATE: An improved ECG design was posted in August, 2016.
Check out: http://www.swharden.com/wp/2016-08-08-diy-ecg-with-1-op-amp/

No 3-day weekend would be complete without a project that’s, well, virtually useless. I present to you my new and improved ECG machine! Instead of using a single op-amp circuit like the previous entries which gave me decent but staticky traces, I decided to build a more advanced ECG circuit documented by Jason Nguyen which boasted 6 op amps! (I’d only been using one) Luckily I picked up a couple LM 324 quad op amp chips at radioshack for about $1.40 each, so I had everything I needed. I’ll skip to the results. In short, they’re gorgeous. Noise is almost nothing, so true details of the trace are visible. I can now clearly see the P-Q-R-S-T features in the wave (before the P was invisible). I’ll detail how I did this in a later entry. For now, here are some photos of the little device and a video I uploaded to YouTube. It’s not fancy.

UPDATE: Upon analyzing ~20 minutes of heartbeat data I found a peculiarity. Technically this could be some kind of noise (a ‘pop’ in the microphone signal due to the shuffling of wires or a momentary disconnect from the electrodes or perhaps even a static shock to my body from something), but because this peculiarity happened only once in 20 minutes I’m not ruling out the possibility that this is the first irregular heartbeat I captured with my DIY ECG. Note that single-beat irregularities are common, and that this does not alarm me so much as fascinates me. Below is the section of the data which contains this irregular beat.





Additional Resources

While searching for a more efficient bandpass filter for python (to apply to multi-hour WAV files because the WAV->FFT->bandstop->iFFT->smooth process is way too slow) I came across The Stoic Monkey’s VROOM! which apparently is a vector-based racing game game (written in Python with PyGame libraries) . It doesn’t flaunt the capabilities of PyGame for game design, but it was a cool concept because it’s very 3D-ish and is done by mathematically processing anti-aliasing splines at impressively high speed. Check it out! “[download for windows]”http://www.stoicmonkey.com/vroom/VROOM_setup.exe





Additional Resources

UPDATE: An improved ECG design was posted in August, 2016.
Check out: http://www.swharden.com/wp/2016-08-08-diy-ecg-with-1-op-amp/

Last night I finished building my DIY ECG as a prototype (I finally got the circuit off the breadboard and onto a plastic sheet). This is a similar circuit to the one used to record data from the last entry (resister values are now identical to the crude circuit described several posts ago). I left-in the crude band-pass filter (made by grounding my primary electrode sensor through a 0.1µF capacitor) because it seemed to help a great deal, and wasn’t hard to implement. I picked up all of my parts (including the LF324 quad-op-amp microchip) at RadioShack. Of note, the quad-op-amp is overkill because I’m only using one of the 4 op-amps. Theoretically I could add 3 more electrodes to this circuit (which would allow for multi-sensor recording, similar to the electrodes placed at http://en.wikipedia.org/wiki/File:Limb_leads.svg) but this would require multiple microphone jacks, which isn’t very common. I guess I could use 2 microphone jacks, and differentiate right/left channels. Anyway, here are some photos.


I made the prototype by drilling holes in a small rectangular piece of a non-conductive plastic-fiberish material. (I picked up a stack of these rectangular sections for a quarter at a local electrical surplus store and they’re perfect for prototyping). The two green wires coming out the left side attach to a ~5v power supply (either a plugged in AC->DC transformer, 3 or 4 AA batteries, or even a 9V should work). The blue wires on the right attach to the electrodes I put on my chest. The black wires go to a headphone-jack which I plug into the microphone hole of my PC to record the signal.


This is the back of the device which shows my crummy soldering technique. Let it slide folks, I’m a molecular biology not an electrical engineer. Anyhow, basic point-to-point contacts, nothing special. The white/yellow wires correspond to the left/right channels of the microphone connector. I only use the left one (white), but attached the right channel (yellow) to the op-amp just in case I decide to add another sensor later – this is not required.


Here’s the full shabang! You can clearly see the circuit (note its small size – easy to mount inside of a tictac box or something) with the green wires leading to a power supply, black cable being the microphone connector, and the blue wires leading to electrodes made… from… fanta… cans…? Yes, in the spirit of ghetto-rigging electronics (my specialty) I found that surprisingly nice chest electrodes can be made from aluminum soda cans! If you go this route, cut them delicately so you don’t get metal shards in your skin like I did at first. Also, note that you have to firmly scrape each side of the aluminum to get the insulating waxy-plastic stuff off or it just won’t work. (I guess it’s coated with something to prevent the soda from tasting like metal.) Aluminum rapidly transfers heat, so it’s nearly impossible to solder leads onto these pads, so I wrapped a wire (tipped with a bead of solder) with the edge of the aluminum several times and crushed the heck out of it with pliers and it seems to stay on well and make a good connection. Also, before taping these onto your skin, you have to put a highly-conductive goo on it to make the connection better. I chose to use a copious squirt of my wife’s skin moisturizer on each electrode (shh, don’t tell her!) and taped the gooey electrode directly onto my chest.

I recorded ~20 minutes of data last night with this rig and it looked pretty good. I went to analyze it with Python and it kept crashing! The python script I gave you previously loads the entire wave file into an array of numbers, but with a 20 minute wave file (at over 40,000 samples a second) this is just too big for memory. I wrote an updated wave loader function which loads large wave files in parts which is much more efficient. It also performs the condensation method at load time. Basically, it loads 100 data points (or whatever deg is set to), averages them, and adds this value to a list. The result is a smoothed trace with a resolution of ~400Hz instead of ~40,000Hz. I’d apply it to my wave file I recorded last night but it’s on my laptop which is in the car and I’ve got to get back to work. Here’s that function:

 def loadWav(fname,deg=100):
     global hz
     w=wave.open(fname)
     nchannel, width, rate, length, comptype, compname = w.getparams()
     print "[%s]
 rate: %d Hz
 frames: %d
 length: %.02f sec" %
           (fname, rate, length, float(length)/rate)
     hz=rate/deg
     chunks=int(length/deg)
     data=[]
     for i in range(chunks):
         if i%7777==0:
             print "processing chunk %d of %d (%0.02f%%)" %
                   (i,chunks,100.0*i/chunks)
         vals = struct.unpack("%sh" %deg,w.readframes(deg))
         data.append(sum(vals)/float(deg))
     print "complete!"
     return data




Additional Resources

UPDATE: An improved ECG design was posted in August, 2016.
Check out: http://www.swharden.com/wp/2016-08-08-diy-ecg-with-1-op-amp/

Would I rather design circuits or software? I’m a software guy (likely due to my lack of working knowledge of circuits) so I’d rather record noisy signals and write software to eliminate the noise, rather than assembling circuits to eliminate the noise for me. In the case of my DIY ECG machine, I’d say I’ve done a great job of eliminating noise via the software route. Most DIY ECG circuits on the net use multiple op-amps and diodes to do this, and have a hardware-based band-pass filter to eliminate frequencies around 60 Hz. Instead of all that fancy stuff, I made a super-crude circuit (a single op-amp and two resisters) to record my ECG. It was INCREDIBLY noisy! So, how did I clean it up with software? I’ll tell you.

The first step in removing electrical noise is classifying it. Most of the noise in my signal were overlapping sine waves caused by my electrodes picking up signals not from my body. This was determined by simply close-up observation of the original trace. Since this sine-type interference is consistant, power-spectral analysis could be applied to determine the frequencies of the noise so I could block them out. I used the fast Fourier transform algorithm (FFT) on the values from my wave file to generate a plot of noise level with respect to frequency (noise was seen as sharp peaks). I manually blocked-out certain regions of the FFT trace that I thought were noise-related (colored bands, all the values of which were set to zero) – a process known as band-pass filtering (something which is possible to do electronically with a more complicated circuit) – then performed an inverse FFT algorithm on the trace. The result was a trace with greatly reduced noise and after a moving window smoothing algorithm was applied the signal was better than ever. Below are some figures. Note that I recorded the raw WAV file with “sound recorder” (not GoldWave) and did all of the processing (including band-pass filtering) within Python.

What an awesome chart! On top we have the power spectral analysis with band-stop filters applied at the colored regions. Below is the trace of the original WAV file (light gray), the inverse-FFT-filtered trace (dark gray), and the smoothed inverse-FFT-filtered trace (black) – the final ECG signal I intend to use.

This is a magnified view of a few heartbeats after the inverse FFT filtering (multi-band-blocking) process was applied. Not bad! Not bad at all…

And, of course, the updated code:

 import wave, struct, numpy, pylab, scipy

 fname='./success3.wav'

 def readwave(wavfilename):
     """load raw data directly from a WAV file."""
     global rate
     w=wave.open(wavfilename,'rb')
     (nchannel, width, rate, length, comptype, compname) = w.getparams()
     print "[%s] %d HZ (%0.2fsec)" %(wavfilename, rate, length/float(rate))
     frames = w.readframes(length)
     return numpy.array(struct.unpack("%sh" %length*nchannel,frames))

 def shrink(data,deg=100):
     """condense a linear data array by a multiple of [deg]."""
     global rate
     small=[]
     print "starting with", len(data)
     for i in range(len(data)/deg):
         small.append(numpy.average(data[i*deg:(i+1)*deg]))
     print "ending with", len(small)
     rate = rate/deg
     #return small[40000:50000]
     return small

 def normalize(data):
     """make all data fit between -.5 and +.5"""
     data=data-numpy.average(data)
     big=float(max(data))
     sml=float(min(data))
     data=data/abs(big-sml)
     data=data+float(abs(min(data)))-.47
     return data

 def smooth(data,deg=20,expand=False):
     """moving window average (deg = window size)."""
     for i in range(len(data)-deg):
         if i==0: cur,smooth=sum(data[0:deg]),[]
         smooth.append(cur/deg)
         cur=cur-data[i]+data[i+deg]
     if expand:
         for i in range(deg):
             smooth.append(smooth[-1])
     return smooth

 def smoothListGaussian(list,degree=10,expand=False):
     window=degree*2-1
     weight=numpy.array([1.0]*window)
     weightGauss=[]
     for i in range(window):
         i=i-degree+1
         frac=i/float(window)
         gauss=1/(numpy.exp((4*(frac))**2))
         weightGauss.append(gauss)
     weight=numpy.array(weightGauss)*weight
     smoothed=[0.0]*(len(list)-window)
     for i in range(len(smoothed)):
         smoothed[i]=sum(numpy.array(list[i:i+window])*weight)/sum(weight)
     if expand:
         for i in range((degree*2)-1):
             smoothed.append(smoothed[-1])
     return smoothed

 def goodSmooth(data):
     #data=smooth(fix,20,True)
     data=smooth(fix,100,True)
     #data=smooth(fix,20,True)
     return data

 def makeabs(data):
     """center linear data to its average value."""
     for i in range(len(data)): data[i]=abs(data[i])
     return data

 def invert(data):
     """obviously."""
     for i in range(len(data)): data[i]=-data[i]
     return data

 def loadwav(fname):
     """a do-everything function to get usable, smoothed data from a WAV."""
     wav=readwave(fname)
     wav=shrink(wav)
     wav=invert(wav)
     wav=smooth(wav)
     wav=smooth(wav,10)
     wav=normalize(wav)
     Xs=getXs(wav)
     return Xs,wav

 def getXs(datalen):
     """calculate time positions based on WAV frequency resolution."""
     Xs=[]
     for i in range(len(datalen)):
         Xs.append(i*(1/float(rate)))
     print len(datalen), len(Xs)
     return Xs

 def integrate(data):
     """integrate the function with respect to its order."""
     inte=[]
     for i in range(len(data)-1):
         inte.append(abs(data[i]-data[i+1]))
     inte.append(inte[-1])
     return inte

 def getPoints(Xs,data,res=10):
     """return X,Y coordinates of R peaks and calculate R-R based heartrate."""
     pXs,pYs,pHRs=[],[],[]
     for i in range(res,len(data)-res):
         if data[i]>data[i-res]+.1 and data[i]>data[i+res]+.1:
             if data[i]>data[i-1] and data[i]>data[i+1]:
                 pXs.append(Xs[i])
                 pYs.append(data[i])
                 if len(pXs)>1:
                     pHRs.append((1.0/(pXs[-1]-pXs[-2]))*60.0)
     pHRs.append(pHRs[-1])
     return pXs,pYs,pHRs

 def bandStop(fft,fftx,low,high,show=True):
     lbl="%d-%d"%(low,high)
     print "band-stopping:",lbl
     if show:
         col=pylab.cm.spectral(low/1200.)
         pylab.axvspan(low,high,alpha=.4,ec='none',label=lbl,fc=col)
         #pylab.axvspan(-low,-high,fc='r',alpha=.3)
     for i in range(len(fft)):
         if abs(fftx[i])>low and abs(fftx[i])<high :
             fft[i]=0
     return fft

 def getXs(data):
     xs=numpy.array(range(len(data)))
     xs=xs*(1.0/rate)
     return xs

 def clip(x,deg=1000):
     return numpy.array(x[deg:-deg])

 pylab.figure(figsize=(12,8))
 raw = invert(shrink(readwave(fname),10))
 xs = getXs(raw)
 fftr = numpy.fft.fft(raw)
 fft = fftr[:]
 fftx= numpy.fft.fftfreq(len(raw), d=(1.0/(rate)))

 pylab.subplot(2,1,1)
 pylab.plot(fftx,abs(fftr),'k')

 fft=bandStop(fft,fftx,30,123)
 fft=bandStop(fft,fftx,160,184)
 fft=bandStop(fft,fftx,294,303)
 fft=bandStop(fft,fftx,386,423)
 fft=bandStop(fft,fftx,534,539)
 fft=bandStop(fft,fftx,585,610)
 fft=bandStop(fft,fftx,654,660)
 fft=bandStop(fft,fftx,773,778)
 fft=bandStop(fft,fftx,893,900)
 fft=bandStop(fft,fftx,1100,max(fftx))
 pylab.axis([0,1200,0,2*10**6])
 pylab.legend()
 pylab.title("Power Spectral Analysis",fontsize=28)
 pylab.ylabel("Power",fontsize=20)
 pylab.xlabel("Frequency (Hz)",fontsize=20)

 pylab.subplot(2,1,2)
 pylab.title("Original Trace",fontsize=28)
 pylab.ylabel("Potential",fontsize=20)
 pylab.xlabel("Time (sec)",fontsize=20)
 pylab.plot(clip(xs),clip(raw),color='.8',label='1: raw')

 fix = scipy.ifft(fft)
 pylab.plot(clip(xs),clip(fix)+5000,color='.6',label='2: band-stop')
 pylab.plot(clip(xs),clip(goodSmooth(fix))-5000,'k',label='3: smoothed')
 pylab.legend()
 pylab.title("Band-Stop Filtered Trace",fontsize=28)
 pylab.ylabel("Potential",fontsize=20)
 pylab.xlabel("Time (sec)",fontsize=20)

 pylab.subplots_adjust(hspace=.5)
 pylab.savefig('out.png',dpi=100)
 pylab.show()
 print "COMPLETE"




Additional Resources

UPDATE: An improved ECG design was posted in August, 2016.
Check out: http://www.swharden.com/wp/2016-08-08-diy-ecg-with-1-op-amp/

I’ve succeeded in building my own electrocardiograph (ECG) to record the electrical activity of my own heart! Briefly, I built a micropotential amplifier using an op-amp and attached it to makeshift electrodes on my chest (pennies and shampoo lol), fed the amplified signal into my sound card, and recorded it as a WAV. The signal is INCREDIBLY noisy though. I was able to do a great job at removing this noise using band/frequency filters in GoldWave (audio editing software designed to handle WAV files). I band-blocked 50-70 Hz (which removed the oscillations from the 60 Hz AC lines running around my apartment). I then wrote the Python code (at the bottom of this entry) to load this WAV file as a single list of numbers (voltage potentials). I performed a data condensation algorithm (converting 100 points of raw WAV data into a single, averaged point, lessening my processing load by 100x), followed by two consecutative moving window averages (20-point window, performed on the condensed data). The result was a voltage reading that had most of the random interference oscillations removed and, behold, a BEAUTIFUL ECG signal!!! I also tossed in some code to determine the peak of the R wave, label it (yellow dot), and use the inverse R-R time distance to calculate heart rate. Nifty, huh?

I swear, this is legitimate data that I recorded myself tonight! This is my actual ECC signal as record by a circuit similar to the one in the previous entry, recorded through my sound card, and processed with the Python script below. Incredible, huh? You can clearly see the Q, R, S, and T spikes (described in the previous couple entries)! I can’t wait to solder-up a prototype (it’s currently breadboarded) and try to analyze hours of data rather than just a few seconds! I’ll take pictures of this device soon. Until then, here’s some real data!

This is a zoomed-in view of a representative QRST wave of mine. Amazing, huh? I’m proud of my ticker! =o)

This is the output of the python script I wrote tonight

And here’s the code I used: note that it relies on the WAV file I recorded – if you want it just ask me for it and I’ll email it to you – it’s about 12 sec long. Oh yeah, not all of the functions in this script are used to create the image (such as integration calculations), but I left ’em in because you may find them useful.

 import wave, struct, numpy, pylab, scipy

 def readwave(wavfilename):
     """load raw data directly from a WAV file."""
     global rate
     w=wave.open(wavfilename,'rb')
     (nchannel, width, rate, length, comptype, compname) = w.getparams()
     print "[%s] %d HZ (%0.2fsec)" %(wavfilename, rate, length/float(rate))
     frames = w.readframes(length)
     return numpy.array(struct.unpack("%sh" %length*nchannel,frames))

 def shrink(data,deg=100):
     """condense a linear data array by a multiple of [deg]."""
     global rate
     small=[]
     print "starting with", len(data)
     for i in range(len(data)/deg):
         small.append(numpy.average(data[i*deg:(i+1)*deg]))
     print "ending with", len(small)
     rate = rate/deg
     return small

 def normalize(data):
     """make all data fit between -.5 and +.5"""
     data=data-numpy.average(data)
     big=float(max(data))
     sml=float(min(data))
     data=data/abs(big-sml)
     data=data+float(abs(min(data)))-.47
     return data

 def smooth(data,deg=20):
     """moving window average (deg = window size)."""
     for i in range(len(data)-deg):
         if i==0: cur,smooth=sum(data[0:deg]),[]
         smooth.append(cur/deg)
         cur=cur-data[i]+data[i+deg]
     return smooth

 def makeabs(data):
     """center linear data to its average value."""
     for i in range(len(data)): data[i]=abs(data[i])
     return data

 def invert(data):
     """obviously."""
     for i in range(len(data)): data[i]=-data[i]
     return data

 def loadwav(fname='./wavs/bandpassed.wav'):
     """a do-everything function to get usable, smoothed data from a WAV."""
     wav=readwave(fname)
     wav=shrink(wav)
     wav=invert(wav)
     wav=smooth(wav)
     wav=smooth(wav,10)
     wav=normalize(wav)
     Xs=getXs(wav)
     return Xs,wav

 def getXs(datalen):
     """calculate time positions based on WAV frequency resolution."""
     Xs=[]
     for i in range(len(datalen)):
         Xs.append(i*(1/float(rate)))
     print len(datalen), len(Xs)
     return Xs

 def integrate(data):
     """integrate the function with respect to its order."""
     inte=[]
     for i in range(len(data)-1):
         inte.append(abs(data[i]-data[i+1]))
     inte.append(inte[-1])
     return inte

 def getPoints(Xs,data,res=10):
     """return X,Y coordinates of R peaks and calculate R-R based heartrate."""
     pXs,pYs,pHRs=[],[],[]
     for i in range(res,len(data)-res):
         if data[i]>data[i-res]+.1 and data[i]>data[i+res]+.1:
             if data[i]>data[i-1] and data[i]>data[i+1]:
                 pXs.append(Xs[i])
                 pYs.append(data[i])
                 if len(pXs)>1:
                     pHRs.append((1.0/(pXs[-1]-pXs[-2]))*60.0)
     pHRs.append(pHRs[-1])
     return pXs,pYs,pHRs

 Xs,Ys=loadwav()
 px,py,pHR=getPoints(Xs,Ys)

 pylab.figure(figsize=(12,6))
 pylab.subplot(2,1,1)
 #pylab.axhline(color='.4',linestyle=':')
 pylab.plot(Xs,Ys,'b-')
 pylab.plot(px,py,'y.')
 pylab.axis([None,None,-.6,.6])
 pylab.title("DIY Electrocardiogram - Trial 1",fontsize=26)
 pylab.ylabel("Normalized Potential",fontsize=16)
 #pylab.xlabel("Time (sec)")
 ax=pylab.axis()
 pylab.subplot(2,1,2)
 pylab.plot(px,pHR,'k:')
 pylab.plot(px,pHR,'b.')
 pylab.axis([ax[0],ax[1],None,None])
 pylab.ylabel("Heart Rate (BPM)",fontsize=16)
 pylab.xlabel("Time (seconds)",fontsize=16)
 pylab.savefig("test.png",dpi=120)
 pylab.show()
 print "DONE"




Additional Resources

While writing code for my graduate research thesis I came across the need to lightly compress a huge and complex variable (a massive 3D data array) and store it in a text file for later retrieval. I decided to use the zlib compression library because it’s open source and works pretty much on every platform. I ran into a snag for a while though, because whenever I loaded data from a text file it wouldn’t properly decompress. I fixed this problem by adding the “rb” to the open line, forcing python to read the text file as binary data rather than ascii data. Below is my code, written in two functions to save/load compressed string data to/from files in Python.

  

 import zlib  

   

 def saveIt(data,fname):  

     data=str(data)  

     data=zlib.compress(data)  

     f=open(fname,'wb')  

     f.write(data)  

     f.close()  

     return  

   

 def openIt(fname,evaluate=True):  

     f=open(fname,'rb')  

     data=f.read()  

     f.close()  

     data=zlib.decompress(data)  

     if evaluate: data=eval(data)  

     return data  

 

Oh yeah, don’t forget the evaluate option in the openIt function. If set to True (default), the returned variable will be an evaluated object. For example, “[[1,2],[3,4]]” will be returned as an actual 2D list, not just a string. How convenient is that?





Additional Resources

Here’s a scrumptious morsel of juicy python code for even the most stoic of scientists to get excited about. Granted, it’s a very simple concept and has surely been done countless times before, but there aren’t any good resources for this code on the internet. Since I had to write my own code to perform a variety of different linear 1-dimensional array data smoothing in python, I decided it would be nice to share it. At the bottom of this post you can see a PNG image which is the file output by the code listen even further below. If you copy/paste the code into an empty text file and run it in Python, it will generate the exact same PNG file (assuming you have pylab and numpy libraries configured).

  

 ### This is the Gaussian data smoothing function I wrote ###  

 def smoothListGaussian(list,degree=5):  

     window=degree*2-1  

     weight=numpy.array([1.0]*window)  

     weightGauss=[]  

     for i in range(window):  

         i=i-degree+1  

         frac=i/float(window)  

         gauss=1/(numpy.exp((4*(frac))**2))  

         weightGauss.append(gauss)  

     weight=numpy.array(weightGauss)*weight  

     smoothed=[0.0]*(len(list)-window)  

     for i in range(len(smoothed)):  

         smoothed[i]=sum(numpy.array(list[i:i+window])*weight)/sum(weight)  

     return smoothed  

 

Basically, you feed it a list (it doesn’t matter how long it is) and it will return a smoother version of the data. The Gaussian smoothing function I wrote is leagues better than a moving window average method, for reasons that are obvious when viewing the chart below. Surprisingly, the moving triangle method appears to be very similar to the Gaussian function at low degrees of spread. However, for huge numbers of data points, the Gaussian function should perform better.

  

 ### This is the code to produce the image displayed above ###  

 import pylab,numpy  

     

 def smoothList(list,strippedXs=False,degree=10):  

     if strippedXs==True: return Xs[0:-(len(list)-(len(list)-degree+1))]  

     smoothed=[0]*(len(list)-degree+1)  

     for i in range(len(smoothed)):  

         smoothed[i]=sum(list[i:i+degree])/float(degree)  

     return smoothed  

   

 def smoothListTriangle(list,strippedXs=False,degree=5):  

     weight=[]  

     window=degree*2-1  

     smoothed=[0.0]*(len(list)-window)  

     for x in range(1,2*degree):weight.append(degree-abs(degree-x))  

     w=numpy.array(weight)  

     for i in range(len(smoothed)):  

         smoothed[i]=sum(numpy.array(list[i:i+window])*w)/float(sum(w))  

     return smoothed  

   

 def smoothListGaussian(list,strippedXs=False,degree=5):  

     window=degree*2-1  

     weight=numpy.array([1.0]*window)  

     weightGauss=[]  

     for i in range(window):  

         i=i-degree+1  

         frac=i/float(window)  

         gauss=1/(numpy.exp((4*(frac))**2))  

         weightGauss.append(gauss)  

     weight=numpy.array(weightGauss)*weight  

     smoothed=[0.0]*(len(list)-window)  

     for i in range(len(smoothed)):  

         smoothed[i]=sum(numpy.array(list[i:i+window])*weight)/sum(weight)  

     return smoothed  

   

 ### DUMMY DATA ###  

 data = [0]*30 #30 "0"s in a row  

 data[15]=1    #the middle one is "1"  

   

 ### PLOT DIFFERENT SMOOTHING FUNCTIONS ###  

   

 pylab.figure(figsize=(550/80,700/80))  

 pylab.suptitle('1D Data Smoothing', fontsize=16)  

   

 pylab.subplot(4,1,1)  

 p1=pylab.plot(data,".k")  

 p1=pylab.plot(data,"-k")  

 a=pylab.axis()  

 pylab.axis([a[0],a[1],-.1,1.1])  

 pylab.text(2,.8,"raw data",fontsize=14)  

   

 pylab.subplot(4,1,2)  

 p1=pylab.plot(smoothList(data),".k")  

 p1=pylab.plot(smoothList(data),"-k")  

 a=pylab.axis()  

 pylab.axis([a[0],a[1],-.1,.4])  

 pylab.text(2,.3,"moving window average",fontsize=14)  

   

 pylab.subplot(4,1,3)  

 p1=pylab.plot(smoothListTriangle(data),".k")  

 p1=pylab.plot(smoothListTriangle(data),"-k")  

 pylab.axis([a[0],a[1],-.1,.4])  

 pylab.text(2,.3,"moving triangle",fontsize=14)  

   

 pylab.subplot(4,1,4)  

 p1=pylab.plot(smoothListGaussian(data),".k")  

 p1=pylab.plot(smoothListGaussian(data),"-k")  

 pylab.axis([a[0],a[1],-.1,.4])  

 pylab.text(2,.3,"moving gaussian",fontsize=14)  

   

 #pylab.show()  

 pylab.savefig("smooth.png",dpi=80)  

 

Hey, I had a great idea, why don’t I test it on some of my own data? Due to the fact that I don’t want the details of my thesis work getting out onto the internet too early, I can’t reveal exactly what this data is from. It will suffice to say that it’s fractional density of neurite coverage in thick muscle tissue. Anyhow, this data is wild and in desperate need of some smoothing. Below is a visual representation of the differences in the methods of smoothing. Yayness! I like the gaussian function the best.

I should note that the degree of window coverage for the moving window average, moving triangle, and gaussian functions are 10, 5, and 5 respectively. Also note that (due to the handling of the “degree” variable between the different functions) the actual number of data points assessed in these three functions are 10, 9, and 9 respectively. The degree for the last two functions represents “spread” from each point, whereas the first one represents the total number of points to be averaged for the moving average. Enjoy.