Tuesday, February 7, 2012
Monday, February 6, 2012
I want to write/illustrate something like Dinotopia for our daughter. This is a table of illustration categories from the first 100 pages of the 3 big Dinotopia books (First Flight excluded).
Gurney's books are what every anthropologist wishes she could produce from the field. I love the look and feel of a 19th-century naturalist's notebook and would love to imitate it.
Friday, February 3, 2012
I tried to run a few big samples, and it seems that my little device can't handle anything longer than Beowulf, which is okay for the time being. I can cover a lot with that.
Internet Explorer continues to vex me with its native input character length restrictions.
I think there may be something strange going on with the/to frequencies - more on that in a later post.
I think I may try something like MapReduce to speed up the analysis of really large texts.
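For the curious, here is a minimal sketch of what a map/reduce style word count could look like. It is written in Python purely for illustration and is not the tool's actual code; the function names (chunk_words, map_chunk, reduce_counts, word_frequencies) and the chunk size are all made up. The text is split into chunks, each chunk is counted on its own (the map step), and the partial counts are merged (the reduce step).

```python
import re
from collections import Counter
from functools import reduce


def chunk_words(text, chunk_size=1000):
    """Lowercase the text, pull out the words, and group them into chunks."""
    words = re.findall(r"[a-z']+", text.lower())
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]


def map_chunk(words):
    """Map step: count the words in a single chunk."""
    return Counter(words)


def reduce_counts(total, partial):
    """Reduce step: fold one chunk's counts into the running total."""
    total.update(partial)
    return total


def word_frequencies(text, chunk_size=1000):
    """Count words across the whole text by mapping over chunks, then reducing."""
    partials = map(map_chunk, chunk_words(text, chunk_size))
    return reduce(reduce_counts, partials, Counter())


counts = word_frequencies("The cat went to the mat to nap", chunk_size=3)
print(counts["the"], counts["to"])
```

As written, the chunks are counted one after another, so this version would not be any faster by itself. Any speed-up would come from handing the map step to several workers at once, since each chunk can be counted independently of the others.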
Thursday, February 2, 2012
I wanted to build a kind of microscope (maybe mass spectrometer would be a better analogy) for natural language, something that would give me a window into patterns of word use that are invisible to a gentle reading.
I was inspired by James Pennebaker at the University of Texas at Austin, who stumbled on a bunch of normally invisible patterns in the way people use language.
In the coming weeks I'll build up a few other tools to deal with the data - graphs and raw numerical metrics - that I hope will start to give a better window into what people are saying between the lines.
I'm calling my tool Lingua Compara.