Microcontroller Action Potential Generator

Here I demonstrate how to use a single microcontroller pin to generate action-potential-like waveforms. The output is similar my fully analog action potential generator circuit, but the waveform here is created in an entirely different way. A microcontroller is at the core of this project and determines when to fire action potentials. Taking advantage of the pseudo-random number generator (rand() in AVR-GCC’s stdlib.h), I am able to easily produce unevenly-spaced action potentials which more accurately reflect those observed in nature. This circuit has a potentiometer to adjust the action potential frequency (probability) and another to adjust the amount of overshoot (afterhyperpolarization, AHP). I created this project because I wanted to practice designing various types of action potential measurement circuits, so creating an action potential generating circuit was an obvious perquisite.

 

The core of this circuit is a capacitor which is charged and discharged by toggling a microcontroller pin between high, low, and high-Z states. In the high state (pin configured as output, clamped at 5V) the capacitor charges through a series resistor as the pin sources current. In the low state (pin configured as output, clamped at 0V) the capacitor discharges through a series resistor as the pin sinks current. In the high-Z / high impedance state (pin configured as an input and little current flows through it), the capacitor rests. By spending most of the time in high-Z then rapidly cycling through high/low states, triangular waveforms can be created with rapid rise/fall times. Amplifying this transient and applying a low-pass filter using a single operational amplifier stage of an LM-358 shapes this transient into something which resembles an action potential. Wikipedia has a section describing how to use an op-amp to design an active low-pass filter like the one used here.

The code to generate the digital waveform is very straightforward. I’m using PB4 to charge/discharge the capacitor, so the code which actually fires an action potential is as follows:

// rising part = charging the capacitor
DDRB|=(1<<PB4); // make output (low Z)
PORTB|=(1<<PB4); // make high (5v, source current)
_delay_ms(2); // 2ms rise time
	
// falling part
DDRB|=(1<<PB4); // make output (low Z)
PORTB&=~(1<<PB4); // make low (0V, sink current)
_delay_ms(2); // 2ms fall time
_delay_us(150); // extra fall time for AHP

// return to rest state
DDRB&=~(1<<PB4); // make input (high Z)

Programming the microcontroller was accomplished after it was soldered into the device using test clips attached to my ICSP (USBtinyISP). I only recently started using test clips, and for one-off projects like this it’s so much easier than adding header sockets or even wiring up header pins.

I am very pleased with how well this project turned out! I now have an easy way to make irregularly-spaced action potentials, and have a great starting point for future projects aimed at measuring action potential features using analog circuitry.

Project code (GitHub)

Notes

  • Action potential half-width (relating to the speed of the action potential) could be adjusted in software by reducing the time to charge and discharge the capacitor. A user control was not built in to the circuit shown here, however it would be very easy to allow a user to switch between regular and fast-spiking action potential waveforms.
  • I am happy that using the 1n4148 diode on the positive input of the op-amp works, but using two 100k resistors (forming a voltage divider floating around 2.5V) at the input and reducing the gain of this stage may have produced a more reliable result.
  • Action potential frequency (probability) is currently detected by sensing the analog voltage output by a rail-to-rail potentiometer. However, if you sensed a noisy line (simulating random excitatory and inhibitory synaptic input), you could easily make an integrate-and-fire model neuron which fires in response to excitatory input.
  • Discussion related to the nature of this “model neuron” with respect to other models (i.e., Hodgkin–Huxley) are on the previous post.
  • Something like this would make an interesting science fair project

     

Action Potential Generator Circuit

Few biological cells are as interesting to the electrical engineer as the neuron. Neurons are essentially capacitors (with a dielectric cell membrane separating conductive fluid on each side) with parallel charge pumps, leak currents, and nonlinear voltage-dependent currents. When massively parallelized, these individual functional electrical units yield complex behavior and underlie consciousness. The study of the electrical properties of neurons (neurophysiologically, a subset of electrophysiology) often involves the development and use of sensitive electrical equipment aimed at studying these small potentials produced by neurons and currents which travel through channels embedded in their membranes. It seems neurophysiology has gained an emerging interest from the hacker community, as evidenced by the success of Back Yard Brains, projects like the OpenEEG, and Hack-A-Day’s recent feature The Neuron – a Hacker’s Perspective.

In pondering designs for complex action potential detection and analysis circuitry, I realized that it would be beneficial to be able to generate action-potential-like waveforms on my workbench. The circuit I came up with to do this is a fully analog (technically mixed signal) action potential generator which produces lifelike action potentials.

 

Cellular Neurophysiology for Electrical Engineers (in 2 sentences): Neuron action potentials (self-propagating voltage-triggered depolarizations) in individual neurons are measured in scientific environments using single cell recording tools such as sharp microelectrodes and patch-clamp pipettes. Neurons typically rest around -70mV and when depolarized (typically by external excitatory input) above a threshold they engage in a self-propagating depolarization until they reach approximately +40mV, at which time a self-propagating repolarization occurs (often over-shooting the initial rest potential by several mV), then the cell slowly returns to the rest voltage so after about 50ms the neuron is prepared to fire another action potential. Impassioned budding electrophysiologists may enjoy further reading Active Behavior of the Cell Membrane and Introduction to Computational Neuroscience.

The circuit I describe here produces waveforms which visually mimic action potentials rather than serve to replicate the exact conductances real neurons employ to exhibit their complex behavior. It is worth noting that numerous scientists and engineers have designed more physiological electrical representations of neuronal circuitry using discrete components. In fact, the Hodgkin-Huxley model of the initiation and propagation of action potentials earned Alan Hodgkin and Andrew Huxley the Nobel Prize in Physiology and Medicine in 1936. Some resources on the internet describe how to design lifelike action potential generating circuits by mimicking the endogenous ionic conductances which underlie them, notably Analog and Digital Hardware Neural Models, Active Cell Model, and Neuromorphic Silicon Neuron Circuits. My goal for this project is to create waveforms which resemble action potentials, rather than waveforms which truly model them. I suspect it is highly unlikely I will earn a Nobel Prize for the work presented here.

The analog action potential simulator circuit I came up with creates a continuous series action potentials. This is achieved using a 555 timer (specifically the NE555) in an astable configuration to provide continuous square waves (about 6 Hz at about 50% duty). The rising edge of each square wave is isolated with a diode and used to charge a capacitor*. While the charge on the capacitor is above a certain voltage, an NPN transistor (the 2N3904) allows current to flow, amplifying this transient input current. The capacitor* discharges predictably (as an RC circuit) through a leak resistor. A large value leak resistor slows the discharge and allows that signal’s transistor to flow current for a longer duration. By having two signals (fast and slow) using RC circuits with different resistances (smaller and larger), the transistors are on for different durations (shorter and longer). By making the short pulse positive (using the NPN in common collector configuration) and the longer pulse negative (using the NPN in common emitter configuration), a resistor voltage divider can be designed to scale and combine these signals into an output waveform a few hundred mV in size with a 5V power supply. Pictured below is the output of this circuit realized on a breadboard. The blue trace is the output of the 555 timer.

*Between the capacitance of the rectification diode, input capacitance of the transistor, and stray parasitic capacitance from the physical construction of my wires and the rails on my breadboard, there is sufficient capacitance to accumulate charge which can be modified by changing the value of the leak resistor.

This circuit produces similar output when simulated. I’m using LTspice (free) to simulate this circuit. The circuit shown is identical to the one hand-drawn and built on the breadboard, with the exception that an additional 0.1 µF capacitor to ground is used on the output to smooth the signal. On the breadboard this capacitance-based low-pass filtering already exists due to the capacitive nature of the components, wires, and rails.

A few improvements naturally come to mind when considering this completed, functional circuit:

  • Action potential frequency: The resistor/capacitor network on the 555 timer determines the rate of square pulses which trigger action potentials. Changing these values will cause a different rate of action potential firing, but I haven’t attempted to push it too fast and suspect the result would not be stable is the capacitors are not given time to fully discharge before re-initiating subsequent action potentials.
  • Microcontroller-triggered action potentials: Since action potentials are triggered by any 5V rising edge signal, it is trivially easy to create action potentials from microcontrollers! You could create some very complex firing patterns, or even “reactive” firing patterns which respond to inputs. For example, add a TSL2561 I2C digital light sensor and you can have a light-to-frequency action potential generator!
  • Adjusting size and shape of action potentials: Since the waveform is the combination of two waveforms, you can really only adjust the duration (width) or amplitude (height) of each individual waveform, as well as the relative proportion of each used in creating the summation. Widths are adjusted by changing the leak resistor on the base of each transistor, or by adding additional capacitance. Amplitude and the ratio of each signal may be adjusted by changing the ratio of resistors on the output resistor divider.
  • Producing -70 mV (physiological) output: The current output is electirically decoupled (through a series capacitor) so it can float at whatever voltage you bias it to. Therefore, it is easy to “pull” in either direction. Adding a 10k potentiometer to bias the output is an easy way to let you set the voltage. A second potentiometer gating the magnitude of the output signal will let you adjust the height of the output waveform as desired.
  • The 555 could be replaced by an inverted ramp (sawtooth): An inverted ramp / sawtooth pattern which produces rapid 5V rising edges would drive this circuit equally well. A fully analog ramp generator circuit can be realized with 3 transistors: essentially a constant current capacitor charger with a threshold-detecting PNP/NPN discharge component.
  • This action potential is not all-or-nothing: In real life, small excitatory inputs which fail to reach the action potential threshold do not produce an action potential voltage waveform. This circuit uses 5V rising edges to produce action potential waveforms. However, feeding a 1V rising edge would produce an action potential 1/5 the size. This is not a physiological effect. However, it is unlikely (if not impossible) for many digital signal sources (i.e., common microcontrollers) to output anything other than sharp rising edge square waves of fixed voltages, so this is not a concern for my application.
  • Random action potentials: When pondering how to create randomly timed action potentials, the issue of how to generate random numbers arises. This is surprisingly difficult, especially in embedded devices. If a microcontroller is already being used, consider Make’s write-up on the subject, and I think personally I would go with a transistor-based avalanche nosie generator to create the randomness.
  • A major limitation is that irregularly spaced action potentials have slightly different amplitudes. I found this out the next day when I created a hardware random number generator (yes, that happened) to cause it to fire regularly, missing approximately half of the action potentials. When this happens, breaks in time result in a larger subsequent action potential. There are several ways to get around this, but it’s worth noting that the circuit shown here is best operated around 6 Hz with only continuous regularly-spaced action potentials.

In the video I also demonstrate how to record the output of this circuit using a high-speed (44.1 kHz) 16-bit analog-to-digital converter you already have (the microphone input of your sound card). I won’t go into all the details here, but below is the code to read data from a WAV file and plot it as if it were a real neuron. The graph below is an actual recording of the circuit described here using the microphone jack of my sound card.

import numpy as np
import matplotlib.pyplot as plt
Ys = np.memmap("recording.wav", dtype='h', mode='r')[1000:40000]
Ys = np.array(Ys)/max(Ys)*150-70
Xs = np.arange(len(Ys))/44100*1000
plt.figure(figsize=(6,3))
plt.grid(alpha=.5,ls=':')
plt.plot(Xs,Ys)
plt.margins(0,.1)
plt.title("Action Potential Circuit Output")
plt.ylabel("potential (mV)")
plt.xlabel("time (ms)")
plt.tight_layout()
plt.savefig("graph.png")
#plt.show()

Let’s make some noise! Just to see what it would look like, I created a circuit to generate slowly drifting random noise. I found this was a non-trivial task to achieve in hardware. Most noise generation circuits create random signals on the RF scale (white noise) which when low-pass filtered rapidly approach zero. I wanted something which would slowly drift up and down on a time scale of seconds. I achieved this by creating 4-bit pseudo-random numbers with a shift register (74HC595) clocked at a relatively slow speed (about 200 Hz) having essentially random values on its input. I used a 74HC14 inverting buffer (with Schmidt trigger inputs) to create the low frequency clock signal (about 200 Hz) and an extremely fast and intentionally unstable square wave (about 30 MHz) which was sampled by the shift register to generate the “random” data. The schematic illustrates these points, but note that I accidentally labeled the 74HC14 as a 74HC240. While also an inverting buffer the 74HC240 will not serve as a good RC oscillator buffer because it does not have Schmidt trigger inputs.

The addition of noise was a success, from an electrical and technical sense. It isn’t particularly physiological. Neurons would fire differently based on their resting membrane potential, and the peaks of action potential should all be about the same height regardless of the resting potential. However if one were performing an electrical recording through a patch-clamp pipette in perforated patch configuration (with high resistance between the electrode and the internal of the cell), a sharp microelectrode (with high resistance due to the small size of the tip opening), or were using electrical equipment or physical equipment with amplifier limitations, one could imagine that capacitance in the recording system would overcome the rapid swings in cellular potential and result in “noisy” recordings similar to those pictured above. They’re not physiological, but perhaps they’re a good electrical model of what it’s like trying to measure a physiological voltage in a messy and difficult to control experimental environment.

This project was an interesting exercise in analog land, and is completed sufficiently to allow me to move toward my initial goal: creating advanced action potential detection and measurement circuitry. There are many tweaks which may improve this circuit, but as it is good enough for my needs I am happy to leave it right where it is. If you decide to build a similar circuit (or a vastly different circuit to serve a similar purpose), send me an email! I’d love to see what you came up with.

 

UPDATE: add a microcontroller

I enhanced this project by creating a microcontroller controlled action potential generator. That article is here: https://www.swharden.com/wp/2017-08-20-microcontroller-action-potential-generator/


     

Raspberry Pi RF Frequency Counter

I build a lot of RF circuits, and often it’s convenient to measure and log frequency with a computer. Previously I’ve built standalone frequency counters, frequency counters with a PC interface, and even hacked a classic frequency counter to add USB interface (twice, actually). My latest device uses only 2 microchips to provide a Raspberry Pi with RF frequency measurement capabilities. The RF signal clocks a 32-bit counter SN74LV8154 ($1.04 on Mouser) connected to a 16-bit IO expander MCP23017 ($1.26 on Mouser) accessable to the Raspberry Pi (via I²C) to provide real-time frequency measurements from a python script for $2.30 in components! Well, plus the cost of the Raspberry Pi. All files for this project are on my GitHub page.

img_8773

The entire circuit is only two microchips! I have a few passives to clean up the RF signal (the RF input is loaded with a 1k resistor to ground, decoupled through a series 100 nF capacitor, and balanced at VCC/2 through a voltage divider of two 47k resistors), but if the measured signal is already a strong square wave they could be omitted. The circuit requires a gate pulse which typically will be 1 pulse per second (1PPS) and can be generated by dividing-down a 32.768kHz oscillator, a spare pin on a microcontroller, a fancy 1PPS time reference, or like in my case a GPS module (Neo-6M) with 1PPS output to provide an extremely accurate gate.

schem

The connections are intuitive! The I2C address is 0x20 when A0, A1, and A2 are grounded. GPB(1-4) control the register select of the counter, and GPA(0-7) reads each bit of the selected register. The whole thing is controlled from Python, but could be trivially written in any language.

img_8777

Here’s a quick summary describing how the code works: First I send bytes to address 0 and 1 to set all pins of GPIO A as inputs, and GPIO B as outputs. Note that only 4 of 8 pins are used for the output, so technically 4 extra pins could be used for things like blinking LEDs or controlling other devices. I then set the register select pins by sending a value to 0x13 (GPIO B), and read the entire GPIO A bus (INTCAPB, 0x18). For address details, consult the datasheet. I do this 4 times (1 for each byte of the 32-bit counter), do a little math to turn it into a frequency value, and compare the current value with the last value and take the difference to display as the measured frequency.

screenshot

An advantage of this continuously running mode is that no clock cycles are lost, so a gate which accidentally fires a bit early due to jitter and cuts-off a cycle will compensate for it on a subsequent read. This is shown above, as a very stable 10MHz frequency reference is measured with this method. A “slow” 1PPS clock tick causes a reading slightly higher, compensated-for by the next reading being slightly lower. In this way, clock sources which are extremely accurate but suffer from low precision (like GPS time sources) are able to maximize the long-term measurement of frequency. Combining this frequency measurement technique with the ability to generate an analog voltage with a Raspberry Pi will allow me to perform some interesting experiments with a voltage controlled crystal oscillator.

Useful Links:

 


     

Generating Analog Voltage with Raspberry Pi

I recently had the need to generate analog voltages from the Raspberry PI, which has rich GPIO digital outputs but no analog outputs. I looked into the RPi.GPIO project which can create PWM (which I wanted to smooth using a low pass filter to create the analog voltage), but its output on the oscilloscope looked terrible! It stuttered all over the place, likely because the duty is continuously under software control. I ended up solving my problem with a MCP4921 12-bit DAC chip (about $1.50 on eBay). It’s controlled via SPI, and although I could have written a python program to bit-bang its protocol with RPi.GPIO I realized I could write directly to the Raspberry Pi SPI device using the echo command. Dividing 3.3V into 12-bits (4096) means that I can control voltage in steps of less than 1mV each, right from the bash console!

img_8696

Video: The Problem (RPi PWM jitters)

Video: My Solution (SPI DAC)

Hardware Connection

There’s very little magic in how the microchip is connected to the Pi. It’s a straight shot to its SPI bus! Here’s a quick drawing showing which pins to connect. Check your device against the Raspberry Pi GPIO pinout diagram for different devices.

img_8701-1

Controlling the DAC with a Bus Pirate

Before I used a Raspberry Pi to control the DAC chip, I tested it out with a Bus Pirate. I don’t have a lot of pictures of the project, but I have a screenshot of a serial console used to send commands to the chip. One advantage of the Bus Pirate is that I can type bytes in binary, which helps to see the individual bits. I don’t have this ability when I’m working in the bash console.

serial

I’m less familiar with the Bus Pirate, but this was a good opportunity to get to know it a little better. It look me a long time (requiring I pull out the logic analyzer) to realize that I had to manually enable/disable the chip-select line, using the “[” and “]” commands. When I set up the SPI mode (command m5) I told it to use active low, but I wasn’t sure how to reverse the active level of the chip-select commands, so I just did ]this[ instead of [this] and it worked great.

frompi

This is the signal probed when it was controlled by the Raspberry Pi, but it looked essentially identical when values were sent via the Bus Pirate. The only difference is there was an appreciable delay between the “]” commands and each of the bytes. It worked fine though.

Controlling the DAC with Console Commands

Once the hardware was configured, the software was trivial. I could control analog voltages by sending two properly-formatted bytes to the SPI hardware device. Importantly, you must use raspi-config to enable SPI.

# set analog voltage to minimum value (about 0V)
echo -ne "\x30\x00" > /dev/spidev0.0 # minimum

# set analog voltage to something a little higher
echo -ne "\x30\xAB" > /dev/spidev0.0 

# set analog voltage to maximum value (about 3.3V)
echo -ne "\x3F\xFF" > /dev/spidev0.0

Helpful Links:

 


     

Realtime Audio Visualization in Python

Python’s “batteries included” nature makes it easy to interact with just about anything… except speakers and a microphone! As of this moment, there still are not standard libraries which which allow cross-platform interfacing with audio devices. There are some pretty convenient third-party modules, but I hope in the future a standard solution will be distributed with python. I appreciate the differences of Linux architectures such as ALSA and OSS, but toss in Windows and MacOS in the mix and it gets to be a huge mess. For Linux, would I even need anything fancy? I can run “cat file.wav > /dev/dsp” from a command prompt to play audio. There are some standard libraries for operating system specific sound (i.e., winsound), but I want something more versatile. The official audio wiki page on the subject lists a small collection of third-party platform-independent libraries. After excluding those which don’t support microphone access (the ultimate goal of all my poking around in this subject), I dove a little deeper into sounddevice and PyAudio. Both of these I installed with pip (i.e., pip install pyaudio)

For a more modern, cleaner, and more complete GUI-based viewer of realtime audio data (and the FFT frequency data), check out my Python Real-time Audio Frequency Monitor project.

I really like the structure and documentation of sounddevice, but I decided to keep developing with PyAudio for now. Sounddevice seemed to take more system resources than PyAudio (in my limited test conditions: Windows 10 with very fast and modern hardware, Python 3), and would audibly “glitch” music as it was being played every time it attached or detached from the microphone stream. I tried streaming, but after about an hour I couldn’t get clean live access to the microphone without glitching audio playback. Furthermore, every few times I ran this script it crashed my python kernel! I very rarely see this happening. iPython complained: “It seems the kernel died unexpectedly. Use ‘Restart kernel’ to continue using this console” and I eventually moved back to PyAudio. For a less “realtime” application, sounddevice might be a great solution. Here’s the minimal case sounddevice script I tested with (that crashed sometimes). If you have a better one to do live high-speed audio capture, let me know!

import sounddevice #pip install sounddevice

for i in range(30): #30 updates in 1 second
    rec = sounddevice.rec(44100/30)
    sounddevice.wait()
    print(rec.shape)

Here’s a simple demo to show how I get realtime microphone audio into numpy arrays using PyAudio. This isn’t really that special. It’s a good starting point though. Note that rather than have the user define a microphone source in the python script (I had a fancy menu system handling this for a while), I allow PyAudio to just look at the operating system’s default input device. This seems like a realistic expectation, and saves time as long as you don’t expect your user to be recording from two different devices at the same time. This script gets some audio from the microphone and shows the values in the console (ten times).

import pyaudio
import numpy as np

CHUNK = 4096 # number of data points to read at a time
RATE = 44100 # time resolution of the recording device (Hz)

p=pyaudio.PyAudio() # start the PyAudio class
stream=p.open(format=pyaudio.paInt16,channels=1,rate=RATE,input=True,
              frames_per_buffer=CHUNK) #uses default input device

# create a numpy array holding a single read of audio data
for i in range(10): #to it a few times just to see
    data = np.fromstring(stream.read(CHUNK),dtype=np.int16)
    print(data)

# close the stream gracefully
stream.stop_stream()
stream.close()
p.terminate()

01

I tried to push the limit a little bit and see how much useful data I could get from this console window. It turns out that it’s pretty responsive! Here’s a slight modification of the code, made to turn the console window into an impromptu VU meter.

import pyaudio
import numpy as np

CHUNK = 2**11
RATE = 44100

p=pyaudio.PyAudio()
stream=p.open(format=pyaudio.paInt16,channels=1,rate=RATE,input=True,
              frames_per_buffer=CHUNK)

for i in range(int(10*44100/1024)): #go for a few seconds
    data = np.fromstring(stream.read(CHUNK),dtype=np.int16)
    peak=np.average(np.abs(data))*2
    bars="#"*int(50*peak/2**16)
    print("%04d %05d %s"%(i,peak,bars))

stream.stop_stream()
stream.close()
p.terminate()

The results are pretty good! The advantage here is that no libraries are required except PyAudio. For people interested in doing simple math (peak detection, frequency detection, etc.) this is a perfect starting point. Here’s a quick cellphone video:

I’ve made realtime audio visualization (realtime FFT) scripts with Python before, but 80% of that code was creating a GUI. I want to see data in real time while I’m developing this code, but I really don’t want to mess with GUI programming. I then had a crazy idea. Everyone has a web browser, which is a pretty good GUI… with a Python script to analyze audio and save graphs (a lot of them, quickly) and some JavaScript running in a browser to keep refreshing those graphs, I could get an idea of what the audio stream is doing in something kind of like real time. It was intended to be a hack, but I never expected it to work so well! Check this out…

Here’s the python script to listen to the microphone and generate graphs:

import pyaudio
import numpy as np
import pylab
import time

RATE = 44100
CHUNK = int(RATE/20) # RATE / number of updates per second

def soundplot(stream):
    t1=time.time()
    data = np.fromstring(stream.read(CHUNK),dtype=np.int16)
    pylab.plot(data)
    pylab.title(i)
    pylab.grid()
    pylab.axis([0,len(data),-2**16/2,2**16/2])
    pylab.savefig("03.png",dpi=50)
    pylab.close('all')
    print("took %.02f ms"%((time.time()-t1)*1000))

if __name__=="__main__":
    p=pyaudio.PyAudio()
    stream=p.open(format=pyaudio.paInt16,channels=1,rate=RATE,input=True,
                  frames_per_buffer=CHUNK)
    for i in range(int(20*RATE/CHUNK)): #do this for 10 seconds
        soundplot(stream)
    stream.stop_stream()
    stream.close()
    p.terminate()

Here’s the HTML file with JavaScript to keep reloading the image… 

<html>
<script language="javascript">
function RefreshImage(){
document.pic0.src="03.png?a=" + String(Math.random()*99999999);
setTimeout('RefreshImage()',50);
}
</script>
<body onload="RefreshImage()">
<img name="pic0" src="03.png">
</body>
</html>

Here’s the result! I couldn’t believe my eyes. It’s not elegant, but it’s kind of functional!

Why stop there? I went ahead and wrote a microphone listening and processing class which makes this stuff easier. My ultimate goal hasn’t been revealed yet, but I’m sure it’ll be clear in a few weeks. Let’s just say there’s a lot of use in me visualizing streams of continuous data. Anyway, this class is the truly terrible attempt at a word pun by merging the words “SWH”, “ear”, and “Hear”, into the official title “SWHear” which seems to be unique on Google. This class is minimal case, but can be easily modified to implement threaded recording (which won’t cause the rest of the functions to hang) as well as mathematical manipulation of data, such as FFT. With the same HTML file as used above, here’s the new python script and some video of the output:

import pyaudio
import time
import pylab
import numpy as np

class SWHear(object):
    """
    The SWHear class is made to provide access to continuously recorded
    (and mathematically processed) microphone data.
    """

    def __init__(self,device=None,startStreaming=True):
        """fire up the SWHear class."""
        print(" -- initializing SWHear")

        self.chunk = 4096 # number of data points to read at a time
        self.rate = 44100 # time resolution of the recording device (Hz)

        # for tape recording (continuous "tape" of recent audio)
        self.tapeLength=2 #seconds
        self.tape=np.empty(self.rate*self.tapeLength)*np.nan

        self.p=pyaudio.PyAudio() # start the PyAudio class
        if startStreaming:
            self.stream_start()

    ### LOWEST LEVEL AUDIO ACCESS
    # pure access to microphone and stream operations
    # keep math, plotting, FFT, etc out of here.

    def stream_read(self):
        """return values for a single chunk"""
        data = np.fromstring(self.stream.read(self.chunk),dtype=np.int16)
        #print(data)
        return data

    def stream_start(self):
        """connect to the audio device and start a stream"""
        print(" -- stream started")
        self.stream=self.p.open(format=pyaudio.paInt16,channels=1,
                                rate=self.rate,input=True,
                                frames_per_buffer=self.chunk)

    def stream_stop(self):
        """close the stream but keep the PyAudio instance alive."""
        if 'stream' in locals():
            self.stream.stop_stream()
            self.stream.close()
        print(" -- stream CLOSED")

    def close(self):
        """gently detach from things."""
        self.stream_stop()
        self.p.terminate()

    ### TAPE METHODS
    # tape is like a circular magnetic ribbon of tape that's continously
    # recorded and recorded over in a loop. self.tape contains this data.
    # the newest data is always at the end. Don't modify data on the type,
    # but rather do math on it (like FFT) as you read from it.

    def tape_add(self):
        """add a single chunk to the tape."""
        self.tape[:-self.chunk]=self.tape[self.chunk:]
        self.tape[-self.chunk:]=self.stream_read()

    def tape_flush(self):
        """completely fill tape with new data."""
        readsInTape=int(self.rate*self.tapeLength/self.chunk)
        print(" -- flushing %d s tape with %dx%.2f ms reads"%\
                  (self.tapeLength,readsInTape,self.chunk/self.rate))
        for i in range(readsInTape):
            self.tape_add()

    def tape_forever(self,plotSec=.25):
        t1=0
        try:
            while True:
                self.tape_add()
                if (time.time()-t1)>plotSec:
                    t1=time.time()
                    self.tape_plot()
        except:
            print(" ~~ exception (keyboard?)")
            return

    def tape_plot(self,saveAs="03.png"):
        """plot what's in the tape."""
        pylab.plot(np.arange(len(self.tape))/self.rate,self.tape)
        pylab.axis([0,self.tapeLength,-2**16/2,2**16/2])
        if saveAs:
            t1=time.time()
            pylab.savefig(saveAs,dpi=50)
            print("plotting saving took %.02f ms"%((time.time()-t1)*1000))
        else:
            pylab.show()
            print() #good for IPython
        pylab.close('all')

if __name__=="__main__":
    ear=SWHear()
    ear.tape_forever()
    ear.close()
    print("DONE")

I don’t really intend anyone to actually do this, but it’s a cool alternative to recording a small portion of audio, plotting it in a pop-up matplotlib window, and waiting for the user to close it to record a new fraction. I had a lot more text in here demonstrating real-time FFT, but I’d rather consolidate everything FFT related into a single post. For now, I’m happy pursuing microphone-related python projects with PyAudio.

UPDATE: Displaying a single frequency

Use Numpy’s FFT() and FFTFREQ() to turn the linear data into frequency. Set that target and grab the FFT value corresponding to that frequency. I haven’t tested this to be sure it’s working, but it should at least be close…

import pyaudio
import numpy as np
np.set_printoptions(suppress=True) # don't use scientific notation

CHUNK = 4096 # number of data points to read at a time
RATE = 44100 # time resolution of the recording device (Hz)
TARGET = 2100 # show only this one frequency

p=pyaudio.PyAudio() # start the PyAudio class
stream=p.open(format=pyaudio.paInt16,channels=1,rate=RATE,input=True,
              frames_per_buffer=CHUNK) #uses default input device

# create a numpy array holding a single read of audio data
for i in range(10): #to it a few times just to see
    data = np.fromstring(stream.read(CHUNK),dtype=np.int16)
    fft = abs(np.fft.fft(data).real)
    fft = fft[:int(len(fft)/2)] # keep only first half
    freq = np.fft.fftfreq(CHUNK,1.0/RATE)
    freq = freq[:int(len(freq)/2)] # keep only first half
    assert freq[-1]>TARGET, "ERROR: increase chunk size"
    val = fft[np.where(freq>TARGET)[0][0]]
    print(val)

# close the stream gracefully
stream.stop_stream()
stream.close()
p.terminate()

UPDATE: Display peak frequency

If your goal is to determine which frequency is producing the loudest tone, use this function. I also added a few lines to graph the output in case you want to observe how it operates. I recommend testing this script with a tone generator, or a YouTube video containing tones of a range of frequencies like this one.

import pyaudio
import numpy as np
import matplotlib.pyplot as plt

np.set_printoptions(suppress=True) # don't use scientific notation

CHUNK = 4096 # number of data points to read at a time
RATE = 44100 # time resolution of the recording device (Hz)

p=pyaudio.PyAudio() # start the PyAudio class
stream=p.open(format=pyaudio.paInt16,channels=1,rate=RATE,input=True,
              frames_per_buffer=CHUNK) #uses default input device

# create a numpy array holding a single read of audio data
for i in range(10): #to it a few times just to see
    data = np.fromstring(stream.read(CHUNK),dtype=np.int16)
    data = data * np.hanning(len(data)) # smooth the FFT by windowing data
    fft = abs(np.fft.fft(data).real)
    fft = fft[:int(len(fft)/2)] # keep only first half
    freq = np.fft.fftfreq(CHUNK,1.0/RATE)
    freq = freq[:int(len(freq)/2)] # keep only first half
    freqPeak = freq[np.where(fft==np.max(fft))[0][0]]+1
    print("peak frequency: %d Hz"%freqPeak)

    # uncomment this if you want to see what the freq vs FFT looks like
    #plt.plot(freq,fft)
    #plt.axis([0,4000,None,None])
    #plt.show()
    #plt.close()

# close the stream gracefully
stream.stop_stream()
stream.close()
p.terminate()

 

This program shows left vs right audio level:

import pyaudio
import numpy as np

maxValue = 2**16
p=pyaudio.PyAudio()
stream=p.open(format=pyaudio.paInt16,channels=2,rate=44100,
              input=True, frames_per_buffer=1024)
while True:
    data = np.fromstring(stream.read(1024),dtype=np.int16)
    dataL = data[0::2]
    dataR = data[1::2]
    peakL = np.abs(np.max(dataL)-np.min(dataL))/maxValue
    peakR = np.abs(np.max(dataR)-np.min(dataR))/maxValue
    print("L:%00.02f R:%00.02f"%(peakL*100, peakR*100))

Output

L:47.26 R:45.17
L:47.55 R:45.63
L:49.44 R:45.98
L:45.27 R:49.80
L:44.39 R:45.75
L:47.50 R:46.96
L:41.49 R:42.64
L:42.95 R:41.39
L:49.56 R:49.62
L:48.29 R:48.80
L:45.03 R:47.62
L:47.99 R:49.35
L:41.58 R:49.21

Or with a tweak…

import pyaudio
import numpy as np

maxValue = 2**16
bars = 35
p=pyaudio.PyAudio()
stream=p.open(format=pyaudio.paInt16,channels=2,rate=44100,
              input=True, frames_per_buffer=1024)
while True:
    data = np.fromstring(stream.read(1024),dtype=np.int16)
    dataL = data[0::2]
    dataR = data[1::2]
    peakL = np.abs(np.max(dataL)-np.min(dataL))/maxValue
    peakR = np.abs(np.max(dataR)-np.min(dataR))/maxValue
    lString = "#"*int(peakL*bars)+"-"*int(bars-peakL*bars)
    rString = "#"*int(peakR*bars)+"-"*int(bars-peakR*bars)
    print("L=[%s]\tR=[%s]"%(lString, rString))

graphical output: