2008-11-07

haskell and scrambled text

I've recently been playing around with Haskell, and one of the toy applications that I like to write is a text "munger." It takes text and outputs the same text with the words labeled as to their order and then it ASCIIbetizes them. This article (up to the end of this sentence) would look like this:

"munger."[22] (up[46] ASCIIbetizes[42] Haskell,[7] I've[1] I[15] It[23] This[44] a[20] and[26] and[39] and[8] applications[13] around[5] article[45] as[35] been[3] end[49] is[19] it[41] labeled[34] like[16] like[55] look[54] of[10] of[50] one[9] order[38] outputs[27] playing[4] recently[2] same[29] sentence)[52] takes[24] text[21] text[25] text[30] that[14] the[11] the[28] the[32] the[48] their[37] them.[43] then[40] this:[56] this[51] to[17] to[36] to[47] toy[12] with[31] with[6] words[33] would[53] write[18]

So you could reconstruct that with a little work, but it is nicer to do it by feeding it into the "demunge" function. Enjoy:

munge.hs
-- Copyright 2008 Chris Wilson
-- Code is licensed under the GNU GPL
--   http://www.gnu.org/licenses/gpl.html

import Data.List

-- Label each string in the order that it is encountered
--   [("Word", 1),..("Lastword", n)]
label :: (Num a, Enum a) => [String] -> [(String, a)]
label [] = []
label sl = zip sl [1..]

-- Add these sequential numbers to the end of words
--   [("Word",1)] -> ["Word[1]"]
attachLabel :: (Show t) => [(String, t)] -> [String]
attachLabel [] = []
attachLabel (s:ss) = (fst s ++ "[" ++ show (snd s) ++ "]") : attachLabel ss

-- Compose all these functions together
munge :: String -> String
munge "" = ""
munge s =  intercalate " " . sort . attachLabel . label $ words s

-- Extract the order from a tagged word
--   "is[5]" -> 5
getOrder :: String -> Int
getOrder "" = 0
getOrder s = read . reverse . tail . fst . span (/= '[') . reverse $ s

-- Extract the word-part of a tagged word
--   "is[5]" -> "is"
getWord :: String -> String
getWord "" = ""
getWord s  = reverse . tail . snd . span (/= '[') . reverse $ s

-- Put the words into a list of tuples: (order, word)
reconstruct :: String -> [(Int, String)]
reconstruct "" = []
reconstruct s = [ (getOrder item, getWord item) | item <- words s ]

-- Put all the extraction functions together
demunge :: String -> String
demunge [] = ""
demunge xs = intercalate " " [ snd item | item <- sort (reconstruct xs) ]
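
For anyone who would rather poke at the idea without a Haskell compiler handy, here is a minimal Python 3 sketch of the same round trip. It is not a translation of the code above, just the same idea under the assumption that words are whitespace-separated and tags look like "word[n]"; the function names are reused only for familiarity.

```python
def munge(text):
    """Tag each word with its 1-based position, then sort the tagged words."""
    tagged = ["%s[%d]" % (word, i)
              for i, word in enumerate(text.split(), start=1)]
    # sorted() compares by code point, which is the "ASCIIbetical" order.
    return " ".join(sorted(tagged))


def demunge(munged):
    """Recover the original word order from the numeric tags."""
    pairs = []
    for item in munged.split():
        # Split on the last '[' so a word that itself contains '[' still parses.
        word, _, order = item.rpartition("[")
        pairs.append((int(order.rstrip("]")), word))
    return " ".join(word for _, word in sorted(pairs))


if __name__ == "__main__":
    scrambled = munge("the cat sat")
    print(scrambled)           # cat[2] sat[3] the[1]
    print(demunge(scrambled))  # the cat sat
```

Note that the round trip normalizes whitespace: runs of spaces and newlines come back as single spaces.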

2008-08-30

A plan for CAPTCHAs

I had the idea (I'm uncertain if it is original [pdf]) of mixing two distinct elements to make a better CAPTCHA. The problem is that there are essentially two ways to solve a CAPTCHA. The first is to design some kind of image recognition system; this is an OCR problem, and there are indications that nearly every major CAPTCHA has been broken. The other main avenue of attack seems to come down to paying people to solve them. This is a much tougher nut to crack because CAPTCHAs are not designed to prevent humans from solving them; in some sense, this attack isn't a break of the system at all, but a design goal of CAPTCHAs. But much like the mathematician in that old joke, this definition of acceptable wouldn't sit well with some.

The solution is to pose a question in a way that is very hard for a computer to perceive but at the same time hard for a person to solve by herself. As an example, imagine a cryptographically hard problem posed in a way that a computer couldn't readily interpret, and that a person would find difficult (or impossible) to solve by hand. This could be the source code for a simple discrete logarithm solver. The code would be obscured in the normal way that a CAPTCHA is (wavy text, visual noise, etc.). The user is instructed to type the code into a text box (where a built-in solver, client-side, can operate on it), and then use the result for entry into the protected resource.

It would be easy to tune the toughness of the discrete log problem (randomly chosen, of course) to generate any penalty you want. This penalty would cost an attacker whatever amount of CPU time the challenger desires. By mixing a character-recognition task (which a human is good at) with a number-crunching task (which a computer is good at, but which can be made arbitrarily CPU-intensive and thus slow) you protect against both types of attack:
  1. If an OCR program can read your text, the attacker still must compute a computationally expensive value.
  2. If humans are being used to circumvent the OCR task, a computer must be used to compute the expensive task as well, still incurring most of the penalty (by tweaking the exact type and hardness of the one-way function, this part can come to dominate the time needed for a successful break).
Some ideas on how to make the computation time more palatable for a legitimate user:
  • Make this be the first question on a form that the user has to fill out.
  • Grant the user provisional access to the site while the computation proceeds in the background.

2008-08-15

Hex Clock

I was reading Hal's blog the other day when I thought that I'd try to implement a clock that uses the hexadecimal time that he talks about. I have seen a bunch of these so-called binary clocks that display the time in some nifty format (blinkenlights). The thing is, what they display is usually just the ordinary hours and minutes written in binary (e.g. 1100:111011 = 12:59). Sometimes they even encode each decimal digit separately, which is binary-coded decimal proper: 0001 0010:0101 1001!

Hal defined the hexadecimal second to be 1/2^16 of a day, making 65,536 "hexeconds" per day. Since there are 86,400 seconds in a day, each hexecond lasts about 1.318 seconds. Naturally, there are 0x100 (256) hexeconds in a "hexinute." A hexinute is 1/2^8 of a day, making each 337.5 seconds long, or 5.625 minutes (5 min, 37.5 sec). This leads to a nice property (besides being easily convertible to binary): a day is recursive. There are 256 hxm per day and 256 hxs per hxm. Much better than 24 hours, with 60 minutes, with 60 seconds! Who could remember that? You just have to get used to your clock displaying a time like BE|EF (~64,440 seconds into your day, or 5:54 PM; clearly not a vegetarian dinner time).

Another thing that I thought about was the "stability" of a given digit. Each digit "stays put" for 16 times as long as the digit to its right. The fastest-changing digit advances every ~1.318 seconds; to its left, the 16's place advances every 16*1.318 = ~21 seconds. The one's place of the hexinutes advances every 337.5 seconds (or 5.625 minutes), and lastly, the most stable digit advances every 90 minutes. That's great for measuring the orbital period of the space shuttle: each orbit is 10|00 long!

That's my proposal for writing the time, by the way. For the programmers out there it will be evocative of a bitwise OR, if you think of the time as something like 0xAB00 | 0xCD. Without further ado:
hexclock.py
#!/usr/bin/env python
#
# Based on discussion here: http://halcanary.org/vv/2007/10/31/706/
#
# Copyright 2008 Chris Wilson.
#
# This program is free software: you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the
# Free Software Foundation, either version 3 of the License, or (at your
# option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General
# Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program.  If not, see <http://www.gnu.org/licenses/>.

from Tkinter import *
from random import random
from time import sleep
from hs import hexarray, hexdisplay
import sys
import math
import getopt

master = Tk()

def usage():
    
    print """\nDisplay a Tk-based animated binary clock. The fastest pixel
blinks about once every 1.318 seconds, there are 2^16 such transitions
per day.

%s [options]

\t-h  --help This message
\t-s --size SIZE. make the display SIZE pixels on a side
\t-d --digits display digits in addition to the graphical binary
\t            representation of the time.
\t-f --foreground COLOR. the "lit up" pixels are this color.  
\t-b --background COLOR. the background pixels are this
\t            color.
\t-t --textcolor COLOR. the text will appear this color. Obviously,
\t            this option only makes sense with -d""" % sys.argv[0]
    sys.exit(1)

digits = False
HEIGHT = WIDTH = 400
on = "blue"
off = "black"
textcolor = "white"
try:
    # Long options that take an argument need a trailing "=".
    opts, args = getopt.getopt(sys.argv[1:],
                               "hs:df:b:t:", ["help", "size=", "digits",
                                              "foreground=","background=",
                                              "textcolor="])
                                          
except getopt.GetoptError, err:
    print str(err)
    usage()
for o, a in opts:
    if o in ("-d", "--digits"):
        digits = True
    elif o in ("-s", "--size"):
        try:
            HEIGHT = WIDTH = int(a)
        except:
            usage()
    elif o in ("-f", "--foreground"):
        on = a
    elif o in ("-b", "--background"):
        off = a
    elif o in ("-h", "--help"):
        usage()
    elif o in ("-t", "--textcolor"):
        textcolor = a
    else:
        usage()

fs = int(math.sqrt(HEIGHT))

w = Canvas(master, width=WIDTH, height=HEIGHT)

def create_grid(canvas, grid, height, width):
    ulx = 0
    uly = 0
    xdimen = width/4
    ydimen = height/4
    item_array = list()
    for square in grid:
            if square == 1:
                i = canvas.create_rectangle(ulx, uly, ulx+xdimen, uly+ydimen,
                                    fill=on)
            else:
                i = canvas.create_rectangle(ulx, uly, ulx+xdimen, uly+ydimen,
                                                    fill=off)
            item_array.append(i)
            if ulx == (width-xdimen):
                uly = uly + ydimen
                ulx = 0
            else:
                ulx = ulx + xdimen
    return item_array


def update(canvas, items, grid):
    for i in range(len(items)):
        if grid[i] == 1:
            canvas.itemconfig(items[i], fill=on)
        else:
            canvas.itemconfig(items[i], fill=off)

if digits:
    def create_timestamp(canvas,h,w):
        return canvas.create_text(h/2,w/2,text=hexdisplay(),fill=textcolor,
                                  justify=CENTER, font="Courier %d bold" % fs)

    def update_time(canvas,text_handle):
        canvas.itemconfig(text_handle, text=hexdisplay())
else:
    def create_timestamp(canvas,h,w): pass
    def update_time(canvas,text_handle): pass
    

items = create_grid(w, hexarray(), HEIGHT, WIDTH)
text = create_timestamp(w, HEIGHT, WIDTH)
w.pack()
while True:
    update(w, items, hexarray())
    update_time(w,text)
    w.update_idletasks()    # redraw
    w.update()              # process events
    sleep(1)
mainloop()
hs.py
#!/usr/bin/env python
#
# Based on blog post on: http://halcanary.org/vv/2007/10/31/706/
#
# Copyright 2008 Chris Wilson
#
# This program is free software: you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the
# Free Software Foundation, either version 3 of the License, or (at your
# option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General
# Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program.  If not, see <http://www.gnu.org/licenses/>.

import math, time

# hex timekeeping
hs = 86400.0/(2**16)    # there are 65,536 hexeconds in a day ~1.318 sec.
hm = 86400.0/(2**8) # there are 256 hexeconds in hexinute  ~337.5 sec

# I'm not defining an hour since there wouldn't be many of them (and they'd
# be redundant: with a hexinute being ~5.6 mins, these are pretty stable).
# Plus, hackers are more precise anyway ;)

def day_secs():
    h, m, s = time.localtime()[3:6]
    return h*3600 + m * 60 + s

def hextime():
    s = day_secs()
    hexmin = int(math.floor(s/hm))
    s = s - (hexmin*hm)
    hexsec = int(math.floor(s/hs))
    return hexmin,hexsec

def hexarray():
    hexmin, hexsec = hextime()
    total = (hexmin << 8) + hexsec
    out = list()
    for x in range(16):
        out.append(total & 1)
        total = total>>1
    out.reverse()
    return out

def hexdisplay():
    return "%02X:%02X" % hextime()

def bindisplay():
    hm, hs = hextime()
    n1, n2 = nybbles(hm)
    n3, n4 = nybbles(hs)
    format_args = map(to_bin, (n1, n2, n3, n4))
    return "%s\n%s\n%s\n%s" %  tuple(format_args)

def nybbles(byte):
    lower = byte&15
    upper = (byte>>4)&15
    return upper, lower

def to_bin(nybble):
    out = list()
    for x in range(4):
        out.append(str(1 & nybble))
        nybble = nybble>>1
    out.reverse()
    return " ".join(out)
    
def _test():
    print "16", nybbles(16)
    u, l = nybbles(16)
    print "bin:",u,l, to_bin(u), to_bin(l)

def run_clock():
    import os
    while True:
        os.system("clear")
        print bindisplay()
        print "("+hexdisplay()+")"
        time.sleep(1)

if __name__ == "__main__":
    #_test()      
    run_clock()
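
As a quick sanity check on the arithmetic in the post (BE|EF being about 5:54 PM, and the top digit ticking every 90 minutes), here is a small Python 3 sketch that converts in both directions. It is separate from hs.py above; the names hex_to_clock and clock_to_hex are mine, and it uses integer arithmetic and rounds down to whole seconds and whole hexeconds.

```python
SECONDS_PER_DAY = 86400
HEXECONDS_PER_DAY = 2 ** 16


def hex_to_clock(hexmin, hexsec):
    """Convert a hex time such as BE|EF to (hours, minutes, seconds)."""
    total_hxs = (hexmin << 8) | hexsec
    seconds = total_hxs * SECONDS_PER_DAY // HEXECONDS_PER_DAY
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return hours, minutes, secs


def clock_to_hex(hours, minutes, secs):
    """Convert a wall-clock time to the 'XX|XX' hex notation."""
    seconds = hours * 3600 + minutes * 60 + secs
    total_hxs = seconds * HEXECONDS_PER_DAY // SECONDS_PER_DAY
    return "%02X|%02X" % (total_hxs >> 8, total_hxs & 0xFF)


if __name__ == "__main__":
    print(hex_to_clock(0xBE, 0xEF))  # (17, 54, 0)
    print(clock_to_hex(12, 0, 0))    # 80|00
    print(clock_to_hex(1, 30, 0))    # 10|00
```

Because both directions round down, converting a time to hex and back can land one tick earlier than where you started.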

2008-06-27

Everything cool has already been done

I guess I shouldn't be surprised by anything that turns up within a million-line codebase, but I was. I had been looking around for a way to keep contacts, you know, collections of vital stats: name, e-mail address and, if I have it, postal address. I've had such a file hanging around for years now; my nickname for it is snailmail. I know that the term has traditionally meant more or less "things one can send through the postal service." But seeing as how e-mail is dead (TIC), it grows increasingly apropos. But anyway, I digress.

I have this file of info and I wanted to keep it up to date, generate lists of names, generate those mailto: links, etc. In keeping with what I feel is the *NIX way, I thought I should create a bunch of shell scripts that operate on my flat file in various ways. So I wrote a few scripts that would dig through the file and spit out whatever it was that I wanted. Okay, I can't resist: I wanted to include one of the ones that I wrote in Scheme (Guile):
snails.scm
#!/usr/bin/guile -s
!#
(use-modules (ice-9 rdelim))

(define (unpack-snailmail line)
  "Currently, this is just lastname:firstname:e-mail"
  (let ((line-elts (string-split line #\:)))
    (format #t "\"~a ~a\" <~a>~%" (cadr line-elts) (car line-elts)
            (caddr line-elts))))

(define (read-lines port f)
  "Read all the lines of a file found at port, applying f to each"
  (let ((current-line (read-line port)))
    (if (not (eof-object? current-line))
        (begin
          (f current-line)
          (read-lines port f)))))

; rudimentary error-checking on the command line
(if (< (length (command-line)) 2)
    (display "Usage: snail snailmail.db\n")
    (read-lines (open-file (cadr (command-line)) "r") 
                unpack-snailmail))
So it is cake to read out the lines of this file. What I wondered more about was how to update and tend this file. Emacs to the rescue! I found out about forms-mode, which does what you would expect: it lets you create and fill out forms. There are two components to the form. The first is your data file; in my case it looks like this:
.snails
Lastname0:Firstname0:E-mail_address0
Lastname1:Firstname1:E-mail_address1
And so on. Next we need an emacs lisp file to tell Emacs how to parse that file and what the form should look like:
sform.el
;; -*- mode: forms -*-
;; This is the 'Control file' (see 'info forms') for .snails
(setq forms-file "~/.snails")
(setq forms-format-list
 (list
  ",------------------,\n"
  "| Snail Mail Ver 1 |\n"
  "'------------------'\n\n"
  "First:   " 2 "\n"
  "Last:    " 1 "\n"
  "E-mail:  " 3 "\n"))
(setq forms-number-of-fields 3)
(setq forms-field-sep ":")
When you launch Emacs and point it at that file (emacs -nw sform.el), Emacs will ask if you want to parse it (answer 'yes'), and then there are helpful commands at the bottom of the screen. The only basic one omitted is the one to create new entries; C-c C-o does the trick.

2008-05-16

webindent.c

Sometimes I want to post code to a website, but they have utterly broken filters (I'm looking at you, Yahoo! Answers), so that when I paste nicely indented code it goes from this:

def double(x):
    return x+x

to this:

def double(x):
return x+x

It is annoying in C or Lisp, but it is wrong in Python, where the syntax relies on proper indentation. When websites do this, it is broken.

So I wrote a little program to rectify the situation. It processes a source file so that it looks the same or similar on the web as in your text editor.

usage: webindent < source.c > page.html

#include <stdio.h>
#define true 1
#define false 0

/* main(): get characters from stdin, replace
 * leading spaces with '&nbsp;'
 * and replace leading tabs with 4 '&nbsp;'s
 * This code is donated to the public domain */
int main()
{
    int start = true; /* we're at the start of a line */
    char *space = "&nbsp;";
    char *fourspaces = "&nbsp;&nbsp;&nbsp;&nbsp;";
    int c; /* the current char; int so that it can hold EOF */

    while ((c = getchar()) != EOF) {
        if (c == ' ') {
            if (start)
                printf(space);
            else
                putchar(c);
        } else if (c == '\t') {
            if (start)
                printf(fourspaces);
            else
                putchar(c);
        } else if (c == '<') {
            printf("&lt;");
        } else if (c == '>') {
            printf("&gt;");
        } else if (c == '\n') {
            printf("<br />");
            putchar(c);
            start = true;
        } else if (c == '&') {
            printf("&amp;");
        } else {
            if (start)
                start = false;
            putchar(c);
        }
    }
    return 0;
}

It does a bit more than is stated in that comment, because it has to escape '&', '<' and '>' (and turn newlines into <br />) for the output to be a well-formed HTML fragment. Note that I used this program to process itself.

2008-05-12

Make a book

I had been wondering about how to create a printed book from the novel that I had written a couple of years ago and I just now got it in my head to put everything together. I'll list the steps that I took from a big ASCII file to the printed page.
  1. Convert ASCII to LaTeX: (beyond the scope of this article)
  2. Produce a PostScript file from the LaTeX source:
    • latex novel.tex
    • dvips novel.dvi
  3. Alternatively, if you're starting from an existing PDF (which is what I had to do, since I had lost the LaTeX source), you can run pdftops to get to this step.
  4. Run psbook -q -s 8 novel.ps > novel_bookorder.ps. This rearranges the PostScript file so that pages are in the following order: 8, 1, 2, 7, 6, 3, 4, and 5. The -s 8 means use a signature of size 8. This will be useful during the next step.
  5. Run psnup -q -n 2 -p letter novel_bookorder.ps > novel_book.ps. This puts two pages onto each 8.5x11 output page. Print the pages back to back. We will now have 4 pages on each sheet, two on the front and two on the back. The page will be in landscape orientation, but the printing will be side-by-side portrait. Coupled with the previous command, we can now fold over two sheets to form an 8-page signature.
  6. Combine the signatures (sewing, stapling, choose your method).
  7. Clamp the sewn signatures and run hot glue along the spine of the book. This would be the time to add a cover.
This should work reasonably well. Though I have not had much luck with staples, they seem to contribute too much thickness to the spine and make clamping awkward. Sewing would likely work much better but I don't have any good suggestions for technique.
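
The page shuffle in step 4 is easier to believe once you see it generated. Here is a minimal Python 3 sketch that produces the order quoted above for one signature. It is my own reconstruction of the pattern (outermost sheet first, front then back), not code taken from psbook, and signature_order is a made-up name.

```python
def signature_order(size):
    """Page order for one signature of `size` pages (a multiple of 4)."""
    if size <= 0 or size % 4 != 0:
        raise ValueError("signature size must be a positive multiple of 4")
    order = []
    low, high = 1, size
    while low < high:
        # Front of the sheet: last remaining page, then first remaining page.
        order += [high, low]
        # Back of the sheet: the next page, then the one before the last.
        order += [low + 1, high - 1]
        low += 2
        high -= 2
    return order


if __name__ == "__main__":
    print(signature_order(8))  # [8, 1, 2, 7, 6, 3, 4, 5]
```

Each group of four numbers is one sheet of paper: two pages on the front, two on the back, which is why the signature size has to be a multiple of 4.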

2008-04-20

manpage

If anyone is actually intending on installing dehexer, here is the manpage:

.\" Dehexer
.TH dehexer 1 "20 April 2008" "1.0" "dehexer"
.SH NAME
.B dehexer 
is a simple program that converts a file of ASCII hex characters into
the equivalent binary file. Any character outside the ranges 0-9,
A-F, or a-f is simply ignored.
.SH EXAMPLES
dehexer < 
.I ASCII_FILE 
> 
.I BINARY_FILE
.SH DESCRIPTION
dehexer is used to convert a human-readable description of
the bytes into an actual binary file. Thus "0a" is
transformed into the byte 00001010 on disk (of course, it would
actually be the bits themselves).
.SH OPTIONS
None, just do normal I/O redirection.
.SH FILES
.P 
.I /usr/share/man/man1/dehexer.1.gz
.SH SEE ALSO
.BR hexer (1)
.SH BUGS
No known bugs at this time. 
.SH AUTHOR
.nf
Chris Wilson (christopher.j.wilson@gmail.com)
This program is dedicated to the public domain.
.fi
.SH HISTORY
2008 - Written as a complement to Hal Canary's hexer program.

Gzip the above text and store it in

/usr/share/man/man1/
(at least on my system).

2008-04-19

dehexer

I was looking through Hal's blog when I came across his hexer program. I thought that I should write the complementary program: given an ASCII file of hex characters (e.g. "01CAFEBABE"), it writes the actual bytes to stdout.

chris@papaya:c$ cc helloworld.c
chris@papaya:c$ ./a.out 
Hello World!
chris@papaya:c$ cat a.out | ./hexer | ./dehexer > helloworld
chris@papaya:c$ chmod a+x helloworld
chris@papaya:c$ ./helloworld 
Hello World!

It seems to work. (I changed the name to 'helloworld' because doing

cat a.out | ./hexer | ./dehexer > a.out

clobbers the file in a bad way: the shell truncates a.out for the output redirection, usually before cat has had a chance to read it.)

Without further ado:
/* dehexer - Convert an ascii file of hex characters into the
   corresponding binary file. Any non-hex characters are silently
   skipped (newlines, tabs, etc.)
   Copyright 2008 Christopher Wilson, based in part on hexer by Hal
   Canary (also DTPD)
   Dedicated to the Public Domain */
/* cc -o dehexer dehexer.c */
#include <stdio.h>

/* declared up front because main() calls it before its definition */
int handle_byte(int x, int *char_out, int low);

int main (int argc, char *argv[])
{
  char x;
  int char_out = 0;
  int low = 0; // 1 (true) when the next digit is the low nybble
  while (fread(&x, sizeof(x), 1, stdin) == 1) {
    if (x > 47 && x < 58) {
      // digit is 0-9
      x = x - 48;
      handle_byte(x, &char_out, low);
    } else if (x > 64 && x < 71) {
      // digit is A-F
      x = x - 55;
      handle_byte(x, &char_out, low);
    } else if (x > 96 && x < 103) {
      // digit a-f
      x = x - 87;
      handle_byte(x, &char_out, low);
    } else {
      // skip anything that isn't 0-9, A-F, or a-f
      continue;
    }
    low = (low + 1) % 2; // flip the high/low nybble marker
  }
  return(0);
}

/* handle_byte - if low is true, add x to char_out and print the full
     byte. If low is false, set char_out to 16 times x.
*/
int handle_byte(int x, int *char_out, int low)
{
  if(low) {
    *char_out += x;
    putc(*char_out, stdout);
  } else {
    *char_out = x * 16;
  }
  return 0;
}
/* EOF */
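
If you just want to check what dehexer should produce for a given input, the same logic fits in a few lines of Python 3. This is a sketch for comparison, not a replacement for the C program; dehex is a made-up name, and it follows the same convention of skipping non-hex characters and pairing up whatever digits remain.

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F


def dehex(text):
    """Turn a string of hex characters into bytes, skipping anything else."""
    digits = [c for c in text if c in HEX_DIGITS]
    # A trailing unpaired digit never completes a byte, so it is dropped.
    if len(digits) % 2:
        digits = digits[:-1]
    return bytes(int(digits[i] + digits[i + 1], 16)
                 for i in range(0, len(digits), 2))


if __name__ == "__main__":
    print(dehex("01CAFEBABE"))  # b'\x01\xca\xfe\xba\xbe'
    print(dehex("48 69\n"))     # b'Hi'
```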

2008-04-13

Comments

So I've had this idea bouncing around in my head and I'm not sure why. Well, that's not true; if I were so unsure I don't think that I'd post this. Maybe it is the idea of what's going on with a project called StupidFilter. It is pretty much what its name would suggest. Its aim is to create a software filter that would remove stupid blog comments (or any other content that you'd like to filter). Note that by my use of 'stupid' above, I'm not really talking about stupidity per se (because that's very likely a hard problem). No, what I'm talking about may be more akin to a symptom of stupidity? Or maybe just gross violations of proper English, which doesn't really speak to a person's intelligence...

I had been thinking about what would be the simplest, most brain-dead method for identifying stupidity in text. The first thing that sprang to mind was entropy. Entropy is the measure of how much uncertainty there is in something (at least as it applies to information theory). A coin toss has 1 bit of entropy. It is calculated like so:

H = -sum over x of p(x) * log2(p(x))

So we get -1/2 * log2(1/2) + -1/2 * log2(1/2) = 1.

I wrote the following code: here (which you can actually run, thanks codepad!). You'll notice that the text from a YouTube comment has a higher entropy than some text that I typed in. I'm actually just going by the letters; no punctuation is included. My hypothesis here is that badly mangled text will have a higher entropy than normal English. Perfectly random text (i.e. text with all 26 letters equally likely) has an entropy of log2(26), about 4.7 bits. This makes sense, since if you'd want to encode all the letters of the alphabet in binary, you'd need at least 5 bits (2^5 = 32, first power of two greater than 26). I think if I include punctuation and all that I may get a better "reading" because an exclamation point, being rare in normal text at least, would have a longer Huffman coding (I haven't really thought about this, could be wrong) and thus lend more to the entropy.

My next idea was taken from my cryptography class. English has a certain frequency distribution for the letters (and numbers, punctuation etc.) that we can exploit. A string of 25 'z's in a row doesn't "look" like any standard sentence. We expect, roughly, that as the length of an English text increases the frequencies of the letters should approach 13% 'e', 9% 't', 8% 'a', and so on (List here). I wrote a very simple program here. I took the sum of the squared differences from the "normal" distribution as a "distance" measurement. We would expect a very long text to get very close to zero (i.e. the frequency distributions will tend to match).

Both of these are really simple and would need a lot of work (multiplying together, weighting?) to be in any way practical. But both constitute a very simple test of English-ness: basically, does the target text resemble English (at least statistically)?
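
Since the two programs above were only linked, here is a minimal Python 3 sketch of both measurements so the idea can be tried directly. It is not the original code: the function names are mine, only the letters a-z are counted, and the reference distribution is passed in by the caller rather than hard-coded, so you can plug in whichever English frequency table you trust.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def letter_frequencies(text):
    """Relative frequency of each letter a-z, ignoring everything else."""
    letters = [c for c in text.lower() if c in ALPHABET]
    counts = Counter(letters)
    total = len(letters)
    return {c: counts[c] / total for c in counts} if total else {}


def letter_entropy(text):
    """Shannon entropy, in bits per letter, of the text's letter distribution."""
    return -sum(p * math.log2(p) for p in letter_frequencies(text).values())


def squared_distance(text, reference):
    """Sum of squared differences between observed and reference frequencies."""
    observed = letter_frequencies(text)
    return sum((observed.get(c, 0.0) - reference.get(c, 0.0)) ** 2
               for c in ALPHABET)


if __name__ == "__main__":
    uniform = {c: 1.0 / 26 for c in ALPHABET}
    print(letter_entropy("heads tails"))
    print(letter_entropy(ALPHABET))            # log2(26), about 4.7
    print(squared_distance("zzzzzzzzzz", uniform))
```

With a real English table as the reference, a long stretch of ordinary prose should score close to zero and a run of 'z's should score close to one.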

2008-04-11

Catching up!

Here are some posts that I've done recently. I'm moving them here because of the simplicity of using Blogger.

Fri, 04 Apr 2008

Stories of the Earth's Demise...

It may be the case that the Earth isn't doomed if the LHC produces a tiny black hole as the product of a high-energy collision. Read about the lawsuit that sparked the rebuttal.

Posted 2008-Apr-04 18:27

Wed, 02 Apr 2008

Learning a new language can be hard

I decided that I needed to try and pick up a new "language." Language deserves the scare quotes here because I'm referring to Vim. I know that emacs will probably remain my editor of choice, but I didn't want the whole vi side of the earth to remain an editor of last resort. I should be able to get around in vimopolis even if I can't converse fluently with the locals.

That said, I do find some of its features pretty appealing. It just seems to get out of your way in a fashion that emacs doesn't. And I like the idea that if you know a movement command, say 'w' for moving over a word, and 'c' for changing something, then you can put them together ('cw') to have vim delete the word that you're sitting on and drop the cursor right in place to type a new one.

So it is nice to see how the other half lives.

Posted 2008-Apr-02 13:42

Fri, 28 Mar 2008

I've been googled!

I just saw that my house is now (sort of) visible with Google street view. You can check it out here.

The driveway in the foreground leads up to my place. This will be nice for hosting parties, or anytime someone needs to know what the area around my house looks like. I have one of those places that is tucked away off the main road and so is usually hard to find.

Posted 2008-Mar-28 15:41

Tue, 25 Mar 2008

Entropy function

I find myself re-typing this into lisp all the time, so here it is, chiseled into digital stone:

(defun entropy (probs)                                                
         (* -1 (apply #'+ (mapcar #'(lambda (p) (* p (log p 2))) probs))))
Usage:
CL-USER> (entropy '(0.5 0.5))
1.0

Just don't expect it to make sure the probabilities sum to 1!

Posted 2008-Mar-25 23:16

Mon, 24 Mar 2008

Affine cipher

I wrote a little program here to do simple affine encryption. I saw something very much like it elsewhere (can't think of where right now). It gives you the option to do a simple affine cipher, but be careful: it will apply the function y = ax + b (mod 26) with whatever a you give it, so you can very easily pick an a with no multiplicative inverse mod 26 (any a that shares a factor with 26, i.e. an even number or a multiple of 13), and then the message can't be uniquely decrypted. Caveat emptor.

If you're interested, and really how could you not be, here is the source code:

#!/usr/bin/env python
from sys import argv, exit

letters = "abcdefghijklmnopqrstuvwxyz"

def usage():
 print """affine
 Do affine encryption (y = ax + b (mod 26)). Case insensitive."""

def to_num(c):
 """
 >>> to_num('a')
 0
 >>> to_num('z')
 25
 """
 c = c.lower()
 if c in letters:
  return letters.find(c)
 else:
  return -1

def to_letter(num):
 """
 >>> to_letter(0)
 'a'
 >>> to_letter(25)
 'z'
 >>> to_letter(34)
 """
 if num >= 0 and num <= 25:
  return str(letters[num])
 else:
  return None

def e(a,b,msg):
 """
 >>> e(1,1,"cat")
 'dbu'
 """
 out = ""
 for c in msg:
  x = to_num(c)
  if x == -1:
    out = out + c
  else:
    out = out + to_letter( (to_num(c) * a + b) % 26 )
 return out

def main():
 if len(argv) != 4:
  usage()
  exit(0)
 a = int(argv[1])
 b = int(argv[2])
 message = str(argv[3])
 print e(a,b,message)

def _test():
 import doctest
 doctest.testmod()

if __name__ == "__main__":
 #_test()
 main()
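
The program above only encrypts. For completeness, here is a Python 3 sketch of the other direction, including the check for a usable a that the caveat above is about. It is an addition of mine, not part of the original script; mod_inverse and decrypt are names I picked, and the inverse is found by simple search since there are only 26 candidates.

```python
from math import gcd

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def mod_inverse(a, m=26):
    """Return the inverse of a modulo m, or None when a and m share a factor."""
    if gcd(a, m) != 1:
        return None
    for candidate in range(1, m):
        if (a * candidate) % m == 1:
            return candidate
    return None


def decrypt(a, b, msg):
    """Undo y = a*x + b (mod 26). Raises ValueError if a is not invertible."""
    a_inv = mod_inverse(a % 26)
    if a_inv is None:
        raise ValueError("a = %d has no inverse mod 26" % a)
    out = []
    for c in msg.lower():
        if c in LETTERS:
            y = LETTERS.index(c)
            out.append(LETTERS[(a_inv * (y - b)) % 26])
        else:
            out.append(c)  # pass spaces and punctuation through unchanged
    return "".join(out)


if __name__ == "__main__":
    print(decrypt(1, 1, "dbu"))            # cat
    print(decrypt(5, 8, "ihhwvc swfrcp"))  # affine cipher
```

Only the twelve values of a that are coprime to 26 (1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, 25) can be undone this way.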

Posted 2008-Mar-24 19:28

Thu, 20 Mar 2008

Emacs autosave

Emacs, the best text editor in the world, has this distressing habit of making backup files all over the place. If I'm writing something about the Square Root of Christmas, say sqrtxmas.txt, then I'll get a little file like sqrtxmas.txt~ in the same directory. It is nice if I lose the file for some reason, but otherwise it can be something of a nuisance.

I found this website with the remedy. Props to you, dude, for making the best text editor in the world (make that the universe) even betterer. You can visit the website for more details, but just to reproduce this little gem in one more place, here it is:

;; Put autosave files (ie #foo#) in one place, *not*
;; scattered all over the file system!
(defvar autosave-dir
  (concat "/tmp/emacs_autosaves/" (user-login-name) "/"))

(make-directory autosave-dir t)

(defun auto-save-file-name-p (filename)
  (string-match "^#.*#$" (file-name-nondirectory filename)))

(defun make-auto-save-file-name ()
  (concat autosave-dir
          (if buffer-file-name
              (concat "#" (file-name-nondirectory buffer-file-name) "#")
            (expand-file-name
             (concat "#%" (buffer-name) "#")))))

;; Put backup files (ie foo~) in one place too. (The backup-directory-alist
;; list contains regexp=>directory mappings; filenames matching a regexp are
;; backed up in the corresponding directory. Emacs will mkdir it if necessary.)
(defvar backup-dir (concat "/tmp/emacs_backups/" (user-login-name) "/"))
(setq backup-directory-alist (list (cons "." backup-dir)))

Posted 2008-Mar-20 10:56

Voting machines

I have a bit of a problem with voting machines. Not the paper system, because that seems to be a problem that has already been solved. What concerns me is the seeming lack of transparency of electronic voting machines. There is a recent story about New Jersey voting officials being told that they may not seek independent security audits of their voting machines. Over on Ed Felten's blog, Freedom to Tinker, he has posted the e-mail that the voting company sent him. He has previously demonstrated that some voting machines can be hacked.

Posted 2008-Mar-20 10:41

Wed, 19 Mar 2008

YACB

Yet another "chris blog"? What's up?

I guess I've decided that I write enough stuff on my Facebook, Myspace and even Orkut (remember that?) that it justifies keeping a semi-regular blog. Besides, my cs account allows for CGI scripts and easy shell access, which lets me write in my favorite CMS, Blosxom.

I'll see how long I keep at it, but if my past record is anything to go by, I've done okay.

Posted 2008-Mar-19 19:47

twopoint718

About Me

A sciency type, but trying to branch out into other areas. After several years out in the science jungle, I'm headed back to school to see what I can make of the other side of the brain.