About

About Us

We are three Stanford University students who took Prof. Matthew Jockers' "Digital Humanities" class in fall 2007.

About This Project: Technologies (426 words)

Our project was to encode the novel given us in TEI-compliant XML, display it using either PHP or an XSLT transformation, and write three analysis tools in PHP for use with the text.

TEI: The TEI, or Text Encoding Initiative, is a consortium that develops and maintains standards for encoding electronic texts. It provides a set of standardized XML tags for labeling various aspects of a document. For more information, see http://www.tei-c.org/index.xml. We mostly used the standard tags (for paragraphs, chapters, information about our source, etc.). We had some difficulty encoding the front matter, particularly the byline for the illustrator, the list of illustrations, and the dedication. Our solutions to these problems can be seen in the XML document.
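To give a sense of what TEI-encoded chapters look like and how a tool might read them, here is a minimal sketch. It is written in Python rather than the PHP we used, and the tiny sample document, its titles, and the function name `chapter_word_counts` are all hypothetical. It also assumes the namespaced TEI P5 style of markup, which may differ from the encoding on this site.

```python
import xml.etree.ElementTree as ET

# Assumption: TEI P5-style namespace; the site's own XML may be encoded differently.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical miniature document, invented for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample Novel</title></titleStmt>
      <publicationStmt><p>Unpublished classroom sample.</p></publicationStmt>
      <sourceDesc><p>Transcribed from a print copy.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="chapter" n="1">
        <head>Chapter I</head>
        <p>It was a bright morning.</p>
        <p>The road ran north.</p>
      </div>
      <div type="chapter" n="2">
        <head>Chapter II</head>
        <p>Nothing happened.</p>
      </div>
    </body>
  </text>
</TEI>"""


def chapter_word_counts(tei_xml):
    """Return {chapter number: word count}, counting only paragraph text."""
    root = ET.fromstring(tei_xml)
    ns = {"tei": TEI_NS}
    counts = {}
    for div in root.findall(".//tei:div[@type='chapter']", ns):
        words = 0
        for p in div.findall("tei:p", ns):
            # itertext() also gathers text nested inside child elements
            words += len("".join(p.itertext()).split())
        counts[div.get("n")] = words
    return counts


print(chapter_word_counts(SAMPLE))
```

Because chapters are marked up explicitly, a tool that compares chapter lengths only has to walk the chapter divisions and never has to guess where a chapter begins.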

PHP: PHP is a server-side scripting language widely used for building web pages. We wrote the three analysis tools in PHP, and we also used a nifty little statement called "require" to insert the same header on each page while keeping the header itself in one file for easy editing. For more information about PHP, see http://www.php.net/ or the Wikipedia article on PHP.

XSLT: We initially attempted to create a PHP display for the text, but found this extremely difficult. We therefore decided to use an XSLT transformation (viewable here) to create an XHTML display for the text. An XSLT stylesheet applies template rules to an XML document to produce a document with different tags or encoding, which allowed us to preserve more of the features of the original document (italics, page breaks, etc.). Even here, we had difficulties with em dashes and with including the one necessary line of PHP (see above), so a small amount of hand-coding was necessary after the transformation. For more information about XSLT, see Wikipedia or IBM's page on XSLT.
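The idea of mapping each source tag to an output tag can be sketched in a few lines. The following is not XSLT and is not our stylesheet. It is a small Python stand-in that applies one rule per element in the same spirit, assuming the common TEI conventions of `hi rend="italic"` for italics and `pb` for page breaks. The sample sentence, the `pagebreak` class name, and the function name `render` are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical TEI-like fragment (namespace omitted to keep the sketch short).
SAMPLE = '<p>She read <hi rend="italic">Kim</hi> aloud.<pb n="12"/> Then she slept.</p>'


def render(node):
    """Recursively turn a TEI-like element into an XHTML string."""
    # Gather the element's own text, its rendered children, and their tails.
    inner = escape(node.text or "")
    for child in node:
        inner += render(child)
        inner += escape(child.tail or "")

    # One rule per element, loosely analogous to XSLT template rules.
    if node.tag == "p":
        return "<p>" + inner + "</p>"
    if node.tag == "hi" and node.get("rend") == "italic":
        return "<em>" + inner + "</em>"
    if node.tag == "pb":
        return '<span class="pagebreak">[page ' + escape(node.get("n", "?")) + "]</span>"
    # Unrecognized elements: keep their text but drop the tag.
    return inner


print(render(ET.fromstring(SAMPLE)))
```

A real XSLT stylesheet expresses the same rules declaratively as templates, which is why features such as italics and page breaks survive the conversion.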

CSS: We also used CSS stylesheets to standardize the display of our website. You can see the stylesheet we use here. For more information about CSS, see its Wikipedia article.

About This Project: Theory (433 words)

We encountered a number of theoretical difficulties in the course of this project. In as many cases as possible, we decided to be as faithful as we could to the printed text in our possession. For this reason, we have replaced double hyphens ("--") in our text with em dashes ("—"), retained the asterisks for scene breaks within chapters, and maintained the distinctive line breaks of the dedication in the physical book. We deviated from the literal presentation of the physical text by omitting the list of Kyne's other works from the front of the book, on the grounds that such a list has no real significance for the text at hand and is easily accessible by other means.

The decision to be faithful to the text was itself a significant one. We ultimately chose to be as faithful as we were able because we wanted to put as little of our own interpretation into this text as we could, allowing readers, to the greatest extent possible, to encounter the text as they would if they picked up the book as we had it.

We encountered difficulty with the illustrations: in some cases, the location of an illustration in the book did not match its location in the digital text we were provided. We eventually decided to place references to the illustrations in the XML at positions matching those the illustrations occupy in the print book, as this was both more faithful to the actual document and placed the illustrations at the most reasonable points in relation to the prose. We also chose to scan the illustrations, as they seem to be an important part of the book and are listed in our front matter. However, since one illustration was missing from the book, we decided not to include any illustrations with the HTML text, instead placing them on a separate page for viewing.

We envision researchers using this site as a jumping-off point for their analyses. Word searches, concordances, and tools that calculate relative chapter length do not provide any conclusions; they simply add to the evidence that a literary critic can consider in the course of constructing an argument. We hope to facilitate exploration and provoke thought, suggesting questions and avenues for consideration rather than answers. The ideal researcher would use our tools in a playful manner to generate interesting results that would then need to be explained by thoughtful reference to the text as a whole.
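For readers unfamiliar with concordances, the following minimal sketch shows the basic idea: list every occurrence of a word together with a few words of context on either side. It is written in Python rather than the PHP our tools use, and it is not the code behind this site. The sample sentence, the function name `concordance`, and the `width` parameter are hypothetical.

```python
import re

# Hypothetical sample text, invented for illustration.
SAMPLE = "The valley was quiet. Beyond the valley, the road climbed."


def concordance(text, keyword, width=3):
    """Return (left context, match, right context) for each hit of keyword."""
    # Simple tokenizer: runs of letters and apostrophes count as words.
    tokens = re.findall(r"[A-Za-z']+", text)
    target = keyword.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


for left, word, right in concordance(SAMPLE, "valley", width=2):
    print(left, "[" + word + "]", right)
```

The output is only a list of contexts. Deciding what a pattern in those contexts means is left to the reader, which is the role we intend these tools to play.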