The Budapest Glagolitic Fragments

Introductory Page to the XML edition

The Budapest Glagolitic Fragments (Budapest, Országos Széchényi Könyvtár, MS 12º Eccl. Slav. 2) are two fragments from a parchment manuscript containing nine lines of text from the Old Slavonic translation of the Life of St Symeon Stylites written in round glagolitic script; they are almost certainly the oldest witness to the Slavonic version. The fragments have been edited twice in paper editions, by Király in 1955 and by Reinhart and Turilov in 1989-90. The object of the present study is therefore not primarily to publish the text as such, but rather to demonstrate the potential of the new technology for the encoding, edition and publication of glagolitic texts, and to encourage its use for this purpose. Nevertheless, some new contributions to the study of the fragments are made in the commentary to the edition.

As far as I am aware, these pages represent the first XML encoding of a glagolitic text to be made available on the web. They are not the first glagolitic texts on the web in any form: a number of digital images exist, and the CCMH has made a very important contribution in making available some major Old Church Slavonic texts in an encoding similar to the Greek Beta transcription. This is, however, the first time that XML has been used for this purpose.

The advantage of XML over SGML (of which it is a subset) is that the user requires much less in the way of software and auxiliary files in order to view the documents. Like generic SGML, it allows much greater flexibility and sophistication in displaying and processing a text than encodings which are essentially text strings without markup; at the same time it is platform independent and, since it confines itself to the basic ASCII repertory of characters, capable of transmission across the Internet without risk of corruption. Non-ASCII characters can generally be displayed by means of Unicode character references. However, glagolitic is not yet included in the Basic Multilingual Plane of Unicode. (A proposal has been made by Michael Everson for its inclusion, but there is still a considerable way to go before it is accepted.) In order to display glagolitic, therefore, local entity references have to be mapped to a local font. The entity references used in the present publication are based on Anders Berglund’s 1992 proposal for a Croatian glagolitic entity set, but extensively revised. The mapping is achieved within the DTD. The most easily available glagolitic font in the public domain is Old Church Slavonic Gla, which may be downloaded from Dr. Berlin’s Foreign Font Archive.
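
By way of illustration, the mapping of local entity references to font positions can be sketched as follows. This is only a minimal sketch: the entity names (gazu, gbuky, gvede) and the code positions assigned to them are hypothetical and are not those of the edition's DTD, and it is merely assumed here that the font places its glagolitic glyphs at Latin letter positions. The short Python script parses a document whose internal DTD subset declares the entities and shows what each reference expands to.

```python
import xml.etree.ElementTree as ET

# Hypothetical entity names and code positions, for illustration only;
# they are NOT the entity set actually used in the edition's DTD.
# Assumption: the glagolitic font draws its glyphs at Latin positions,
# so each entity simply expands to the character at that position.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE p [
  <!ENTITY gazu  "&#x61;">
  <!ENTITY gbuky "&#x62;">
  <!ENTITY gvede "&#x76;">
]>
<p>&gazu;&gbuky;&gvede;</p>
"""


def expand_entities(xml_text: str) -> str:
    """Parse the document and return its text with all entities expanded."""
    root = ET.fromstring(xml_text)
    return "".join(root.itertext())


if __name__ == "__main__":
    # The result is the string of font positions that a stylesheet would
    # then have to render in the glagolitic font.
    print(expand_entities(DOCUMENT))
```

Under this arrangement, a font with the glyphs at different positions would require only the entity declarations to be changed, while the encoded text itself stays the same.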

Although Greek and cyrillic are included in Unicode, it is not possible to display them properly unless suitable fonts are installed on the system, and in practice many “Unicode” fonts do not include all the glyphs necessary for polytonic Greek, let alone old cyrillic. Fortunately there is at least one Unicode font in the public domain which has the necessary glyphs for old cyrillic, Christoph Singer’s Kirillica Nova Unicode; his homepage also offers a number of freely downloadable fonts which include the necessary Greek characters, in case these are lacking on your system. If you wish to display glagolitic and old cyrillic text in fonts other than Old Church Slavonic Gla and Kirillica Nova Unicode respectively, it will be necessary to edit the stylesheet; in the case of glagolitic, if the characters are mapped to different positions in the new font, it will also be necessary to edit the DTD.

The Fragments have been encoded in conformity with the Text Encoding Initiative (TEI). Using the TEI to encode a nine-line document may seem like a case of using a sledgehammer to crack a walnut, but using a mark-up scheme which has become standard for many users has distinct advantages, and it also proves that it can be done (and is thus also feasible for more substantial texts). The DTD for the edition was generated by the TEI Pizza Chef and subsequently edited manually. The only significant modification made to the TEI DTD was the introduction of a global ws attribute to denote the writing system of any particular element. The TEI was evidently compiled on the assumption that each language uses a single writing system: thus it prescribes that "The writing system declaration is associated with different portions of a main document by means of the global lang attribute" (TEI P4 ¶25.6 – subject to revision). This of course is a false assumption. In the commentary to the present edition, for example, a value of os for the lang attribute would not indicate whether the content of a given element is to be rendered in glagolitic or cyrillic, or indeed in latin transcription. The ws attribute allows us to indicate the writing system of a given element independently of its language. A similar problem arises in respect of the <language> element within the profile description of the file, which is allowed to refer only to a single WSD, by means of its wsd attribute. This has been solved by omitting the wsd attribute altogether for the Old Slavonic <language> element, and placing the information about the writing system declarations used within the content of the element, where it sits rather uncomfortably inside a <phr> element.
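
The separation of language from writing system can be sketched in the same way. In the sample below, the ws values (glag, cyr, lat), the use of the <seg> element and the placeholder contents are assumptions made for the sake of the example, and the attribute declarations are simplified to CDATA; the edition's own DTD may declare them differently. The Python script reads both attributes from each element and shows that a single lang value can be paired with three different writing systems.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only. The ws values and element contents are
# hypothetical, and both attributes are declared as plain CDATA here,
# which is a simplification rather than the edition's own declaration.
SAMPLE = """<!DOCTYPE p [
  <!ELEMENT p (#PCDATA | seg)*>
  <!ELEMENT seg (#PCDATA)>
  <!ATTLIST seg lang CDATA #IMPLIED
                ws   CDATA #IMPLIED>
]>
<p>
  <seg lang="os" ws="glag">text-1</seg>
  <seg lang="os" ws="cyr">text-2</seg>
  <seg lang="os" ws="lat">text-3</seg>
</p>
"""


def writing_systems(xml_text: str) -> list:
    """Return a (lang, ws, content) tuple for every <seg> element."""
    root = ET.fromstring(xml_text)
    return [(seg.get("lang"), seg.get("ws"), seg.text)
            for seg in root.iter("seg")]


if __name__ == "__main__":
    for lang, ws, content in writing_systems(SAMPLE):
        # lang alone cannot distinguish the three segments; ws can.
        print(lang, ws, content)
```

A stylesheet could then select the font on the basis of ws alone, leaving lang free to record the language.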

In order to view the documents, all you will require are an XML-capable browser and, as mentioned above, the fonts for glagolitic, old cyrillic and Greek. I have found that the documents can be displayed well with Doczilla 1.0; I should be interested to know whether other browsers work equally well. The edition itself is contained in two files, of which the first, http://userwww.port.ac.uk/cleminsr/xml/bgf.xml, contains the text of the fragments, and the second, http://userwww.port.ac.uk/cleminsr/xml/comment.xml, contains the commentary.

First edition, 2002-03-14