The Drawing Board

I want to write and illustrate something like Dinotopia for our daughter. This is a table of illustration categories from the first 100 pages of the three big Dinotopia books (First Flight excluded).

Gurney's books are what every anthropologist wishes she could produce from the field. I love the look and feel of a 19th-century naturalist's notebook and would love to imitate it.

Lessons Learned: Lingua Compara 00

I tried to run a few big samples, and it seems that my little device will handle nothing much longer than Beowulf, which for the time being is okay. I can cover a lot with that.

Internet Explorer continues to vex me with its native input character length restrictions.

I think there may be something strange going on with the/to frequencies - more on that in a later post.

I think I may try something like MapReduce to speed up the analysis of really large texts: split the text into chunks, count the words in each chunk separately, then merge the counts.
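Here is a rough sketch of what a map-reduce style word count could look like, just to make the idea concrete. This is not how Lingua Compara works today; the function names (chunk_lines, count_words, merge_counts, word_frequencies) and the chunk size are made up for illustration, and the "map" step here runs in a single process.

```python
import re
from collections import Counter
from functools import reduce


def chunk_lines(text, lines_per_chunk=1000):
    """Split the text into chunks on line boundaries so no word is cut in half."""
    lines = text.splitlines()
    for i in range(0, len(lines), lines_per_chunk):
        yield "\n".join(lines[i:i + lines_per_chunk])


def count_words(chunk):
    """Map step: count the words in one chunk (lowercased, letters and apostrophes only)."""
    return Counter(re.findall(r"[a-z']+", chunk.lower()))


def merge_counts(total, partial):
    """Reduce step: fold one chunk's counts into the running total."""
    total.update(partial)
    return total


def word_frequencies(text, lines_per_chunk=1000):
    """Count words chunk by chunk, then merge the partial counts."""
    partials = map(count_words, chunk_lines(text, lines_per_chunk))
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    sample = "The cat sat\nto the mat\nthe end"
    freqs = word_frequencies(sample, lines_per_chunk=1)
    print(freqs["the"], freqs["to"])
```

Because each chunk is counted on its own, the map step could in principle be handed to several worker processes (for example with multiprocessing.Pool.map in place of the built-in map). Whether that actually helps would depend on how big the texts are.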

Lingua Compara

I wanted to build a kind of microscope (maybe mass spectrometer would be a better analogy) for natural language, something that would give me a window into patterns of word use that are invisible to a gentle reading.

I was inspired by this guy, James Pennebaker at the University of Texas at Austin, who stumbled on a bunch of normally invisible patterns in the way people use language.

In the coming weeks I'll build up a few other tools to deal with the data (graphs and raw numerical metrics) that I hope will start to give a better window into what people are saying between the lines.

I'm calling my tool Lingua Compara.