Birchbark letters

Posted by Quinn

It's been a busy month, though no single thing would merit its own post. So, in summary...

This project aims to develop XML resources to facilitate research into the Novgorod birchbark letters. Transforming existing indices such as those found in Zaliznjak 2004 into machine-readable XML enables us to perform statistical calculations on the data, and brings to light statistically significant patterns that would be difficult for a scholar to recognize due to the size of the corpus.

Posted by Quinn

Birchbark letter XML

I have at least a first pass at the names-in-context data for documents 1-23, to supplement the proof-of-concept data I put together previously.

Posted by Quinn

To enable a more interesting proof-of-concept for the birchbark letter XML project, I've spent the last week making a new, limited data set that lists all the names that occur in a given document and characterizes their role in that document. It covers all documents from 1100-1120, some other 12th-century documents in which the same names appear, and all the documents that include the name Boris. For the time being, I'm calling it "names in context" (NIC).
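
As a rough illustration of the kind of query such a data set allows, here is a minimal Python sketch. The element names (`nic`, `document`, `name`), the `role` attribute, the document ids "A" and "B", and the placeholder name "OtherName" are all assumptions invented for this example; they are not the project's actual schema or data.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical NIC fragment. Element names, attributes, ids, and role labels
# are invented for illustration; they are not the project's real schema/data.
SAMPLE = """
<nic>
  <document id="A">
    <name role="sender">Boris</name>
    <name role="recipient">OtherName</name>
  </document>
  <document id="B">
    <name role="mentioned">Boris</name>
  </document>
</nic>
"""


def roles_by_name(xml_text):
    """Map each name to a Counter of the roles it plays across documents."""
    root = ET.fromstring(xml_text)
    counts = {}
    for doc in root.iter("document"):
        for name in doc.iter("name"):
            counts.setdefault(name.text, Counter())[name.get("role")] += 1
    return counts


def documents_with(xml_text, target):
    """Return the ids of documents in which the target name occurs."""
    root = ET.fromstring(xml_text)
    return [
        doc.get("id")
        for doc in root.iter("document")
        if any(n.text == target for n in doc.iter("name"))
    ]


if __name__ == "__main__":
    print(roles_by_name(SAMPLE))
    print(documents_with(SAMPLE, "Boris"))
```

With real NIC data in place of the placeholder fragment, the same two functions would give a quick tally of how often a name appears in each role and in which documents.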

Posted by Quinn

The frustrating thing about attempting a proof-of-concept for the birchbark letters is that many of the calculations I find most interesting need to be run over the entire corpus, which means a lot more data preparation than I currently have finished. The sample "names in context" (NIC) XML only has entries for 42 documents, so there's a lot of work yet to be done.

Posted by Quinn

I was hoping to delay this until I had the Subversion repository ready to distribute the first versions of the XML, but setting it up is taking a bit longer than I anticipated. I'm working on a proof-of-concept for the kinds of analysis that can be done using the name and date indices together; I'm hoping to finish it in about a week. Here's the current status of the deliverables as of 20 June 2010: