Relish: Rendering Endangered Languages Lexicons Interoperable through Standards Harmonization
Summary of Project Results
RELISH (“Rendering Endangered Lexicons Interoperable through Standards Harmonization”) was funded in 2008 through the Bilateral Digital Humanities Initiative of the German Research Foundation and the National Endowment for the Humanities of the United States. The project was inspired by two pressing concerns: (a) the need to share the scarce lexical resources available on endangered languages and (b) the major barriers to lexicon interoperability. At the time of RELISH’s inception, the most significant barrier was that standards-setting bodies on the two sides of the Atlantic had arrived at different standards for format and markup. The RELISH project was designed to promote language-oriented research by (a) harmonizing the digital standards for lexical information developed in Europe and America, and (b) migrating six heterogeneous lexicons of endangered languages into the new standards-compliant format. The procedure was designed to be generalizable to the large store of lexical resources involved in the American LEGO project, implemented by LINGUIST List (LL) at Eastern Michigan University, and in the extensive European documentation project DoBeS, spearheaded by the Max Planck Institute for Psycholinguistics (MPI). Collaborating throughout the three years of the RELISH project, MPI, LL, and the Johann Wolfgang Goethe-Universität Frankfurt (UF) integrated the two terminology frameworks used for lexical information in Europe and America and created an XML interchange format to promote interoperability among existing lexicons of endangered languages. The team harmonized terminology between the General Ontology for Linguistic Description (GOLD) and the Data Category Registry of the International Organization for Standardization (ISOcat), uploading GOLD into the ISOcat registry.
Additionally, the team harmonized the standards for lexicon structure formalized in the Lexicon Interchange Format (LIFT), used in America, and the Lexical Markup Framework (LMF), a meta-standard for multilingual lexicons that is more influential in Europe. An LMF-compliant version of the LIFT schema was developed to serve as the interchange format within the RELISH project. Three LEGO lexicons (Fulfulde, W. Sissaala, and Potawatomi) and three DoBeS lexicons (Wichita, Tuva, and Udi) were migrated to the RELISH schema, annotated with GOLD/ISOcat data categories, and round-tripped between the American and German projects. Thus, the RELISH project accomplished its goals, uniting the differing sets of lexical concepts and creating an interchange format (the RELISH schema) for the differing structural models. The project also initiated a virtual archive, with the six lexicons round-tripped between the participating American and German projects in order to demonstrate the usability of the harmonized standards. Language data are central to a large scientific community, including anthropologists, archaeologists, historians, geneticists, sociologists, and linguists. Ensuring the interoperability of any individual lexicon greatly increases its potential scientific contribution. The harmonization of standards will streamline the future development of software tools and web services deployed in lexical research. Accordingly, the outcomes of the project add value to other projects already supported by public funds in Europe (e.g., LIRICS, CLARIN) and the US (e.g., E-MELD, the Data-Driven Ontology Project, GOLDComm, LEGO), and they contribute to the ongoing effort of developers and funding agencies to make the most efficient use of scarce resources.
As a collaboration between two of the organizations that have been instrumental in promoting both endangered-language documentation and standards development in Europe and the US, this project has provided impetus for other standards harmonization efforts and has offered the scientific research community flexible and integrated access to important new digital materials.
Project-Related Publications (Selection)
- Aristar-Dry, H., Drude, S., Windhouwer, M., Gippert, J., & Nevskaya, I. (2012). “Rendering Endangered Lexicons Interoperable through Standards Harmonization”: The RELISH Project. In N. Calzolari (Ed.), Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, May 23rd-25th, 2012 (pp. 766-770). European Language Resources Association (ELRA).