R.M.Cleminson

The Budapest Glagolitic Fragments

Introductory Page

This page provides access to an electronic edition of the Budapest Glagolitic Fragments and to the auxiliary files needed to view it. I have tried to be as "un-technical" as possible, and users unfamiliar with SGML should not be put off by the amount of detail given, since it is provided precisely for their benefit: I would hope that the information and instructions below are sufficient to allow anyone access to the edition, even if they have never used SGML before.

The Budapest Glagolitic Fragments (Budapest, Országos Széchényi Könyvtár, MS 12º Eccl. Slav. 2) are two fragments from a parchment manuscript, written in round glagolitic script and containing nine lines of text from the Old Slavonic translation of the Life of St Symeon Stylites; they are almost certainly the oldest witness to the Slavonic version. The fragments have been edited twice in paper editions, by Király in 1955 and by Reinhart and Turilov in 1989-90. The object of the present study is therefore not primarily to publish the text as such, but rather to demonstrate the potential of the new technology for the encoding, edition and publication of glagolitic texts, and to encourage its use for this purpose. Nevertheless, some new contributions to the study of the fragments are made in the commentary to the edition.

As far as I am aware, these pages represent the first SGML encoding of a glagolitic text to be made available on the web. They are not the first glagolitic texts on the web in any form: a number of digital images exist, and the CCMH has made a very important contribution in making available some major Old Church Slavonic texts in an encoding similar to the Greek Beta transcription. This is, however, the first time that SGML has been used for this purpose.

The advantages of SGML are that it allows much greater flexibility and sophistication in displaying and processing a text than encodings which are essentially text strings without markup, while at the same time it is platform-independent and, since it confines itself to the basic ASCII repertory of characters, capable of transmission across the Internet without risk of corruption. In this respect it resembles the standard "web page". However, HTML, the application of SGML used for web pages, is incapable of handling glagolitic text, and, since glagolitic is not yet included in the Basic Multilingual Plane of Unicode, even the most recent versions of HTML, with their capacity to process Unicode character entities, are unlikely to be able to display it for a long time yet. Glagolitic web pages must therefore be written in SGML.
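The point about the ASCII repertory can be illustrated with a small sketch in Python. The entity names used here (glaz, glbuky, glvede) are invented for the purpose and are not those of the entity set prepared for this edition; the sketch shows only that a text written with entity references contains nothing outside the ASCII range, whatever script it represents.

```python
import re

# Hypothetical sample: three glagolitic letters written as SGML entity
# references. The entity names are invented for this illustration.
encoded = "&glaz;&glbuky;&glvede;"


def is_pure_ascii(text):
    """Return True if every character of text lies in the 7-bit ASCII range."""
    return all(ord(ch) < 128 for ch in text)


def entity_names(text):
    """List the entity names referenced in text, in order of appearance."""
    return re.findall(r"&([A-Za-z][A-Za-z0-9.-]*);", text)


print(is_pure_ascii(encoded))  # True
print(entity_names(encoded))   # ['glaz', 'glbuky', 'glvede']
```

The same text stored directly in an eight-bit font encoding would depend on every system along the way agreeing on the code page, which is exactly the risk that entity references avoid.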

First steps towards putting glagolitic on the web were taken in 1992 with Anders Berglund's proposal for a Croatian glagolitic entity set, and subsequently a proposal was made by Michael Everson for the inclusion of glagolitic in the Basic Multilingual Plane of Unicode, but at the time of writing there is still a considerable way to go before it is accepted. The entity references used in the present publication are based on Berglund's set, but extensively revised.

The Fragments have been encoded in conformity with the Text Encoding Initiative (TEI). Using the TEI to encode a nine-line document may seem like using a sledgehammer to crack a walnut, but a mark-up scheme which has become standard for many users has distinct advantages, and the exercise also proves that it can be done (and is thus also feasible for more substantial texts). At the same time, it has proved necessary to write an extension DTD modifying the TEI DTD in two important respects. The first of these concerns the TEI milestone tags (specifically <lb> and <pb>), which, as at present declared in the TEI DTD, may not occur at arbitrary points within a document. (In the recently released preliminary test version of TEI P4, this problem has been rectified. However, it is TEI P3 that is in general use at the moment.) In order to make them available everywhere, the parameter entity for the globincl element class has been replaced with that for the refsys class within the declaration of the <text> element. This has the unfortunate side-effect of disabling the members of the former class, which means that it would not be an acceptable method for anyone who wanted to use these as well as the milestone tags; however, it works very well for our present purposes.

The second modification introduces a global ws attribute to denote the writing system of any particular element. The TEI was evidently compiled on the assumption that each language uses a single writing system: thus it prescribes that "The writing system declaration is associated with different portions of a main document by means of the global lang attribute" (TEI P3 ¶25.6). This of course is a false assumption. In the commentary to the present edition, for example, a value of os for the lang attribute would not indicate whether the content of a given element were to be rendered in glagolitic or cyrillic, or indeed in latin transcription. The ws attribute allows us to indicate the writing system of a given element independently of its language. A similar problem arises in respect of the <language> element within the profile description of the file, which is allowed to refer only to a single WSD, by means of its wsd attribute. This has been solved by omitting the wsd attribute altogether for the Old Slavonic <language> element, and placing the information about the writing system declarations used within the content of the element, sitting rather uncomfortably inside a <phr> element.
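The independence of language and writing system can be shown with a short sketch in Python, using a well-formed XML fragment as a stand-in for SGML. The lang value os is the one mentioned above; the ws values (glag, cyr, lat), the element names and the font table are assumptions made for the sketch and are not taken from the extension DTD.

```python
import xml.etree.ElementTree as ET

# A well-formed XML stand-in for an SGML fragment. lang="os" is the value
# mentioned in the text; the ws values "glag", "cyr" and "lat" are
# hypothetical labels chosen for this sketch.
SAMPLE = """
<p lang="en" ws="lat">
  <foreign lang="os" ws="glag">sample</foreign>
  <foreign lang="os" ws="cyr">sample</foreign>
  <foreign lang="os" ws="lat">sample</foreign>
</p>
"""

# Hypothetical table: the kind of font needed for each writing system.
FONT_FOR_WS = {
    "glag": "glagolitic font",
    "cyr": "old cyrillic font",
    "lat": "latin font",
}


def fonts_by_language(xml_text):
    """Collect, for each lang value, the set of fonts its elements need."""
    result = {}
    for elem in ET.fromstring(xml_text).iter():
        lang, ws = elem.get("lang"), elem.get("ws")
        if lang is not None and ws is not None:
            result.setdefault(lang, set()).add(FONT_FOR_WS[ws])
    return result


fonts = fonts_by_language(SAMPLE)
print(sorted(fonts["os"]))
```

A single language here calls for three different fonts, so a renderer that consulted the lang attribute alone could not choose between them.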

In order to view the documents, you will require certain applications and auxiliary documents, as listed here. Note, particularly in respect of the files which I am making available locally, that it is not possible to view some of them with an HTML browser, which will misinterpret their content, and they must therefore be downloaded as files.

Available publicly on the web you will find:

An SGML browser
The market leaders appear to be Citec's MultiDoc Pro (which I use, and which now seems to have been swallowed up by their new Doczilla browser) and Softquad's Panorama. If you produce something else which I haven't mentioned, don't blame me, blame your marketing department: I haven't heard of it.
The TEI DTD and its associated files
These are available from the TEI home page.
A glagolitic font
There are a number of free ones available on the web, for instance at Dr. Berlin's Foreign Font Archive; a search will probably reveal more. There are also good commercial fonts.

The rest are contained in the compressed file bgf.zip, which you can download and "unzip" here. Make sure you know where you have put the files! They comprise:

Entity sets and writing system declarations for the various scripts used.
Those for languages using the Latin alphabet in its various guises are available together with the TEI DTD above. You will, however, also need entity sets for glagolitic (specially prepared for this edition); for old cyrillic (based on David Birnbaum's article in Computer Standards and Interfaces, vol. 18 (1996), pp. 201-252); and for Greek, since the ISO entity sets provided with the TEI cover only modern (monotonic) Greek.
Some of the WSDs used are similarly available together with the TEI files, but you will also require those for glagolitic and cyrillic, both old and modern.
Since linkage to the above is achieved by using public identifiers, you will need to add the relevant public identifiers to your CATALOG file.
These are listed, together with an explanation of the above for those unfamiliar with the system, in my note on catalogs. This is also included in the zipped file above.
A mapping file allowing the browser to map the glagolitic entity references to the appropriate places in your font.
The one I have used is likewise in the zipped file. It will of course only work properly if your font maps the letters to the same positions as the one I have used, SynthesisSoft's "Glagolica Bulgarian", which very conveniently maps them to the positions occupied by the equivalent cyrillic letters in CP-1251. If your font uses a different mapping, the mapping file will have to be modified. You can also paste the contents of this file into the mapping file you already use, but here a word of warning: MultiDoc Pro, at least, appears to limit the size of the mapping file it will process, reading the file up to a certain point and ignoring anything beyond it. If you add this data at the end of your existing file and nothing happens, this is therefore probably the reason. If you can sacrifice a sufficient number of mathematical symbols or other entities not needed for your purposes (having backed up your original mapping file first, of course), your browser should be able to process this material.
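For anyone adapting the mapping file to a different font, the principle of the CP-1251 mapping described above can be sketched in Python. The entity names are again invented for illustration, and the output is not in the format of any browser's mapping file; the sketch merely computes the font position that each entity would be given if the font places a glagolitic letter where CP-1251 places its cyrillic equivalent.

```python
# Hypothetical entity names paired with the cyrillic letter whose CP-1251
# position the font is assumed to reuse. Only the pairing principle comes
# from the text; the names are invented for this sketch.
EQUIVALENTS = {
    "glaz": "\u0430",    # cyrillic small letter a
    "glbuky": "\u0431",  # cyrillic small letter be
    "glvede": "\u0432",  # cyrillic small letter ve
}


def cp1251_position(letter):
    """Return the code position (0-255) of a single letter in CP-1251."""
    encoded = letter.encode("cp1251")
    if len(encoded) != 1:
        raise ValueError("expected exactly one character")
    return encoded[0]


def build_mapping(equivalents):
    """Map each entity name to the font position of its cyrillic equivalent."""
    return {name: cp1251_position(ch) for name, ch in equivalents.items()}


mapping = build_mapping(EQUIVALENTS)
for name, pos in mapping.items():
    print(f"&{name}; -> position {pos} (0x{pos:02X})")
```

For a font with a different layout, only the table of equivalents would need to change; the positions would then have to be transferred by hand into whatever format the browser's mapping file requires.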

Finally, it will be possible to view the edition. This is contained in two files, the edition of the text itself and the commentary. The URLs for these are http://userwww.port.ac.uk/cleminsr/bgf.tei and http://userwww.port.ac.uk/cleminsr/comment.tei. You can either type these into your SGML browser or else configure your HTML browser to open this type of file with your SGML browser and just click on the links in this paragraph. If you decide to download the two files, you will also need to download the auxiliary files associated with them, namely the TEI extension files and the stylesheets, and place them in the same directory. These files are included in the zipped file available above.

First edition, 2001-07-30