I realized that the C code from yesterday wasn’t showing-up properly because of textile, a rapid, inline, tag-based formatting system. The app converted blog code from [“text”:http://www.SWHarden.com/ *like* _this_] to [text like this. ] While it’s fun and convenient to use, it’s not always practical. The problem I was having was that in C code, variable names (such as _delay_) were becoming irrevocably italicized, and nothing I did could prevent textile from ignoring code while styling text. The kicker is that I couldn’t disable it easily, because I’ve been writing in this style for over four years! I decided that the time was now to put my mad Python skills to the test and write code to handle the conversion from textile-format to raw HTML.
I accomplished this feat in a number of steps. Yeah, I could have done hours of research to find a “faster way”, but it simply wouldn’t have been as creative. In a nutshell, I backed-up the SQL database using PHPMyAdmin to a single “x.sql” file. I then wrote a pythons script to parse this [massive] file and output “o.sql”, the same data but with all of the textile tags I commonly used replaced by their HTML equivalent. It’s not 100% perfect, but it’s 99.999% perfect. I’ll accept that. The output? You’re viewing it! Here’s the code I used to do it:

## This Python (1.0) script removes *SOME* textile formatting from WordPress
## backups in plain text SQL format (dumped from PHP MyAdmin). Specifically,
## it corrects bold and itallic fonts and corrects links. It should be easy
## to expand if you need to do something else with it.
## Enjoy! --Scott Harden (www.SWHarden.com)

infile = 'x.sql' # < < THIS IS THE INPUT FILE NAME!

replacements=   ["r"," "],["n"," n "],["*:","* :"],["_:","_ :"],
                ["n","<br>n"],[">*","> *"],["*< ","* <"],
                [">_","> _"],["_< ","_ <"],
                [" *"," <b>"],["* "," "],[" _"," <i>"],["_ ","</i> "]
                #These are the easy replacements

def fixLinks(line):
    ## replace ["links":URL] with [<a href="URL">links</a>]. ##
    words = line.split(" ")
    for i in range(len(words)):
        word = words[i]
        if '":' in word:
            upto=1
            while (word.count('"')&lt;2):
                word = words[i-upto]+" "+word
                upto+=1
            word_orig = word
            extra=""
            word = word.split('":')
            word[0]=word[0][1:]
            for char in ".),'":
                if word[1][-1]==char: extra=char
            if len(extra)>0: word[1]=word[1][:-1]
            word_new='<a href="%s">%s</a>'%(word[1],word[0])+extra
            line=line.replace(word_orig,word_new)
    return line

def stripTextile(orig):
    ## Handle the replacements and link fixing for each line. ##
    if not orig.count("', '") == 13: return orig #non-normal post
    line=orig
    temp = line.split
    line = line.split("', '",5)[2]
    if len(line)&lt;10:return orig #non-normal post
    origline = line
    line = " "+line
    for replacement in replacements:
        line = line.replace(replacement[0],replacement[1])
    line=fixLinks(line)
    line = orig.replace(origline,line)
    return line

f=open(infile)
raw=f.readlines()
f.close
posts=0
for raw_i in range(len(raw)):
    if raw[raw_i][:11]=="INSERT INTO":
        if "wp_posts" in raw[raw_i]: #if it's a post, handle it!
            posts+=1
            print "on post",posts
            raw[raw_i]=stripTextile(raw[raw_i])


print "WRITING..."
out = ""
for line in raw:
    out+=line
f=open('o.sql','w')
f.write(out)
f.close()

I certainly held my breath while the thing ran. As I previously mentioned, this thing modified SQL tables. Therefore, when I uploaded the “corrected” versions, I kept breaking the site until I got all the bugs worked out. Here’s an image from earlier today when my site was totally dead (0 blog posts)

hostingwork





THIS CODE HAS BEEN UPDATED!
THIS CODE HAS BEEN UPDATED!
THIS CODE HAS BEEN UPDATED!

>>> CHECK OUT THE NEW CODE < << [Generate Apache-Style HTTP Access Logs via SQL and PHP]

OBSOLETE CODE IS BELOW…

A few months ago I wrote about a way I use PHP to generate apache-style access.log files since my web host blocks access to them. Since then I’ve forgotten it was even running! I now have some pretty cool-looking graphs generated by Python and Matplotlib. For details (and the messy script) check the original posting.

This image represents the number of requests (php pages) made per hour since I implemented the script. It might be a good idea to perform some linear data smoothing techniques (which I love writing about), but for now I’ll leave it as it is so it most accurately reflects the actual data.