Linear Data Smoothing in Python

Here’s a scrumptious morsel of juicy Python code for even the most stoic of scientists to get excited about. Granted, it’s a very simple concept and has surely been done countless times before, but I couldn’t find any good resources for this code on the internet. Since I had to write my own code to perform several kinds of linear smoothing on 1-dimensional array data in Python, I decided it would be nice to share it. At the bottom of this post you can see a PNG image, which is the file output by the code listed further below. If you copy/paste the code into an empty text file and run it in Python, it will generate the same PNG file (assuming you have the pylab and numpy libraries installed).

### This is the Gaussian data smoothing function I wrote ###
import numpy
def smoothListGaussian(list,degree=5):
    window=degree*2-1
    weight=numpy.array([1.0]*window)
    weightGauss=[]
    for i in range(window):
        i=i-degree+1
        frac=i/float(window)
        gauss=1/(numpy.exp((4*(frac))**2))
        weightGauss.append(gauss)
    weight=numpy.array(weightGauss)*weight
    smoothed=[0.0]*(len(list)-window)
    for i in range(len(smoothed)):
        smoothed[i]=sum(numpy.array(list[i:i+window])*weight)/sum(weight)
    return smoothed

Basically, you feed it a list (any length will do, as long as it is longer than the window of degree*2-1 points) and it will return a smoother version of the data, shortened by the width of that window. The Gaussian smoothing function I wrote is leagues better than a moving window average method: it weights the points nearest the center most heavily, so a sharp spike becomes a rounded bump rather than a flat plateau, as the chart below shows. Surprisingly, the moving triangle method appears to be very similar to the Gaussian function at low degrees of spread. However, with larger windows and many data points, I expect the Gaussian function to give the better result.
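For anyone who prefers a vectorized version, here is a minimal sketch of the same weighting done with numpy.convolve. The names gaussian_weights and smooth_gaussian_convolve are made up for this illustration and are not part of the script in this post. It assumes the same weights as smoothListGaussian (1/exp((4*frac)**2)), and because mode="valid" keeps every full window, it returns one more point at the end than the loop version does.

```python
import numpy

def gaussian_weights(degree=5):
    """Hypothetical helper: the same weights smoothListGaussian builds in its loop."""
    window = degree * 2 - 1
    offsets = numpy.arange(window) - degree + 1   # -(degree-1) .. +(degree-1)
    frac = offsets / float(window)
    return numpy.exp(-(4 * frac) ** 2)            # identical to 1/exp((4*frac)**2)

def smooth_gaussian_convolve(data, degree=5):
    """Hypothetical vectorized variant of the Gaussian smoother."""
    weights = gaussian_weights(degree)
    weights = weights / weights.sum()             # normalize so the weights sum to 1
    # The weights are symmetric, so convolution and a sliding weighted
    # average give the same numbers.
    return numpy.convolve(data, weights, mode="valid")

data = [0.0] * 30
data[15] = 1.0                                    # same dummy spike as in the script
smoothed = smooth_gaussian_convolve(data)
print(len(smoothed))                              # 30 - 9 + 1 = 22 points
```

Normalizing the weights once up front replaces the division by sum(weight) that the loop version performs for every output point.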

### This is the code to produce the image shown in this post ###
import pylab,numpy
  
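def smoothList(list,strippedXs=False,degree=10):
    # strippedXs=True expects a global sequence Xs of x values, which this
    # script never defines; every call below uses the default (False).
    if strippedXs==True: return Xs[0:-(len(list)-(len(list)-degree+1))]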
    smoothed=[0]*(len(list)-degree+1)
    for i in range(len(smoothed)):
        smoothed[i]=sum(list[i:i+degree])/float(degree)
    return smoothed

def smoothListTriangle(list,strippedXs=False,degree=5):
    weight=[]
    window=degree*2-1
    smoothed=[0.0]*(len(list)-window)
    for x in range(1,2*degree):weight.append(degree-abs(degree-x))
    w=numpy.array(weight)
    for i in range(len(smoothed)):
        smoothed[i]=sum(numpy.array(list[i:i+window])*w)/float(sum(w))
    return smoothed

def smoothListGaussian(list,strippedXs=False,degree=5):
    window=degree*2-1
    weight=numpy.array([1.0]*window)
    weightGauss=[]
    for i in range(window):
        i=i-degree+1
        frac=i/float(window)
        gauss=1/(numpy.exp((4*(frac))**2))
        weightGauss.append(gauss)
    weight=numpy.array(weightGauss)*weight
    smoothed=[0.0]*(len(list)-window)
    for i in range(len(smoothed)):
        smoothed[i]=sum(numpy.array(list[i:i+window])*weight)/sum(weight)
    return smoothed

### DUMMY DATA ###
data = [0]*30 #30 "0"s in a row
data[15]=1    #the middle one is "1"

### PLOT DIFFERENT SMOOTHING FUNCTIONS ###

pylab.figure(figsize=(550/80.0,700/80.0)) #550x700 pixels at 80 dpi
pylab.suptitle('1D Data Smoothing', fontsize=16)

pylab.subplot(4,1,1)
p1=pylab.plot(data,".k")
p1=pylab.plot(data,"-k")
a=pylab.axis()
pylab.axis([a[0],a[1],-.1,1.1])
pylab.text(2,.8,"raw data",fontsize=14)

pylab.subplot(4,1,2)
p1=pylab.plot(smoothList(data),".k")
p1=pylab.plot(smoothList(data),"-k")
a=pylab.axis()
pylab.axis([a[0],a[1],-.1,.4])
pylab.text(2,.3,"moving window average",fontsize=14)

pylab.subplot(4,1,3)
p1=pylab.plot(smoothListTriangle(data),".k")
p1=pylab.plot(smoothListTriangle(data),"-k")
pylab.axis([a[0],a[1],-.1,.4])
pylab.text(2,.3,"moving triangle",fontsize=14)

pylab.subplot(4,1,4)
p1=pylab.plot(smoothListGaussian(data),".k")
p1=pylab.plot(smoothListGaussian(data),"-k")
pylab.axis([a[0],a[1],-.1,.4])
pylab.text(2,.3,"moving gaussian",fontsize=14)

#pylab.show()
pylab.savefig("smooth.png",dpi=80)

Hey, I had a great idea: why don’t I test it on some of my own data? Because I don’t want the details of my thesis work getting out onto the internet too early, I can’t reveal exactly what this data is from. It will suffice to say that it’s the fractional density of neurite coverage in thick muscle tissue. Anyhow, this data is wild and in desperate need of some smoothing. Below is a visual representation of the differences between the smoothing methods. Yayness! I like the Gaussian function the best.

I should note that the degrees of window coverage for the moving window average, moving triangle, and Gaussian functions are 10, 5, and 5 respectively. Also note that (because the “degree” variable is handled differently by the different functions) the actual numbers of data points assessed by these three functions are 10, 9, and 9 respectively. The degree for the last two functions represents the “spread” from each point, so the window is degree*2-1 points wide, whereas for the first one it is simply the total number of points to be averaged. Enjoy.
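To make the difference in how “degree” is handled concrete, here is a small sketch that computes the number of points each method looks at. The function window_size is a hypothetical helper written for this illustration; it just repeats the arithmetic already used in the three smoothing functions.

```python
def window_size(method, degree):
    """Hypothetical helper: number of data points behind each smoothed value."""
    if method == "average":
        return degree            # degree is the total number of points averaged
    # For the triangle and Gaussian methods, degree is the spread:
    # the center point plus degree-1 neighbors on each side.
    return degree * 2 - 1

for method, degree in [("average", 10), ("triangle", 5), ("gaussian", 5)]:
    print(method, "degree =", degree, "window =", window_size(method, degree))
```

So to compare the methods over roughly the same stretch of data, pick a degree for the moving average that is about twice the degree used for the other two.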



This entry was posted on Monday, November 17th, 2008 at 2:50 pm and is filed under General.





copyright © 2006 swharden@gmail.com