PhD Alumni

Olaf Matuschek Freiburg

Data-Mining in den Reisetagebüchern James Silk Buckinghams 1815/1816:
Neu entwickelte Suchalgorithmen und effiziente Arbeitsmethoden in der Historischen Klimatologie

First supervisor: Prof. Dr. Rüdiger Glaser
Second supervisor: Prof. Dr. Andreas Matzarakis

The public discussion on climate change and its effects has now reached policymakers. Coupled with this is great uncertainty about the true extent of climate change and its consequences. It follows that a better understanding of our climate system is a pressing necessity. For accurate predictions of future climate, the natural climate changes of the past have to be understood. Instrumental measurement records cover only the last 100 years, and so are not sufficient for this purpose. Historical Climatology, on the other hand, provides tried and tested methods for reconstructing the climate of the past 1000 years. To ensure results of a high calibre, it is crucial to use as many historical sources as possible. Consequently, both the number of accessible sources and their geographical coverage must be considerably increased in the future. Many potential sources are already available in libraries in printed or indeed digital form.

The central question asked in this thesis is therefore: Is it possible to apply modern, software-supported methods of data mining to historical sources? The goal is to enable the analysis of more sources in less time, while maintaining or raising the standards of quality, by increasing the efficiency of historical climatologists.

The newly developed software Konkordanz enables semi-automated text extraction and considerably improves the step in which the relevant parts of historical sources are extracted. Experiments show that it needs only a quarter of the time required for manually extracting quotes, at a precision of around 95%. Moreover, these figures can be expected to improve once the new methods become more familiar.

In order to build Konkordanz, new search and assessment algorithms were developed. These are based on natural language processing, lexicometrics and data mining. The user interface was developed according to the relevant modern design principles.
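The extraction step can be pictured with a minimal keyword-in-context sketch. This is not the Konkordanz implementation, whose algorithms are not described here; the term list, the function name `keyword_in_context` and the sample sentence are all hypothetical and serve only to show the general idea of proposing candidate quotes for a human to review.

```python
import re

# Hypothetical list of weather-related stems; not taken from the thesis.
WEATHER_TERMS = ("rain", "snow", "drought", "wind", "heat", "frost")


def keyword_in_context(text, terms=WEATHER_TERMS, width=40):
    """Return (matched word, surrounding snippet) pairs for each hit.

    A word matches when it starts with one of the stems, so "rains"
    and "windy" are found. False positives (e.g. "heathen") are
    expected and left for the human reviewer to reject.
    """
    stems = "|".join(re.escape(t) for t in terms)
    pattern = re.compile(r"\b(?:" + stems + r")\w*", re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        hits.append((match.group(0), text[start:end].strip()))
    return hits


# Invented sample sentence, not a quote from the diaries.
sample = "We left at sunrise. Heavy rains fell all night, and the wind was cold."
for word, snippet in keyword_in_context(sample):
    print(word, "|", snippet)
```

A stem-based match keeps recall high at the cost of some noise, which suits a semi-automated workflow in which every proposed quote is still checked by a person.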

A potentially confounding factor in the textual analysis of a travel diary is the constantly changing roster of locations it contains. Historical climatological studies can only be performed when these locations are pinpointed. In the diaries analysed in this thesis, spoken Arabic location names were transcribed ad hoc into English. In order to pinpoint these locations, a phonetic search based on the DoubleMetaphone algorithm was implemented, together with a newly developed phonetic rating that complements this search. Development and testing were driven by methods of data mining and machine learning.
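The principle of phonetic matching followed by a rating can be sketched as follows. Note the assumptions: the key below is a deliberately crude consonant skeleton, not DoubleMetaphone, and the similarity score from `difflib` merely stands in for the thesis's phonetic rating, which is not specified here. The gazetteer entries and historical spellings are illustrative only.

```python
from difflib import SequenceMatcher

# Hypothetical groups of similar-sounding consonants (NOT DoubleMetaphone).
_EQUIV = {"c": "k", "q": "k", "g": "k", "z": "s", "d": "t",
          "b": "p", "f": "p", "v": "p"}


def phonetic_key(name):
    """Reduce a name to a rough consonant skeleton."""
    out = []
    for ch in name.lower():
        if not ch.isalpha():
            continue
        ch = _EQUIV.get(ch, ch)
        if ch in "aeiou":
            ch = "a" if not out else ""   # keep only a leading vowel
        elif ch in "hwy" and out:
            ch = ""                        # drop weak sounds inside a name
        if ch and (not out or out[-1] != ch):
            out.append(ch)                 # collapse repeated symbols
    return "".join(out)


def rank_candidates(query, gazetteer):
    """Return (score, name) pairs, best phonetic match first."""
    query_key = phonetic_key(query)
    scored = [
        (SequenceMatcher(None, query_key, phonetic_key(name)).ratio(), name)
        for name in gazetteer
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative gazetteer and spellings, not data from the thesis.
gazetteer = ["Dimashq", "Halab", "Baghdad", "Basra"]
print(rank_candidates("Bussorah", gazetteer)[0])
print(rank_candidates("Damascus", gazetteer)[0])
```

Separating the key from the rating means that an exact key match is not required: candidates are ordered by similarity, so a near miss in the transcription can still surface the right place.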

The phonetic search finds more relevant and fewer irrelevant locations than previously available methods. It enables us to pinpoint nearly twice as many of Buckingham’s travel locations as was possible with the previously best “fuzzy” GeoNames search.

With the newly developed method for semi-automated text extraction and the support for coding of quotes in place, the reconstruction of travel routes remains one of the most time-consuming steps in the analysis of travel diaries. Therefore, a RouteFinder algorithm was developed. From a list of location names and the expected mean distance between them, it reconstructs the most probable travel routes. Development of the algorithm was driven by graph theory. Optimization was done through data mining, multivariate data analysis and machine learning. The results of automated route reconstruction are fascinating.
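One way to picture route reconstruction is as a shortest-path problem over a layered graph: each stop has several candidate coordinates, and the route chosen is the one whose leg lengths deviate least from the expected mean distance. The sketch below assumes exactly that formulation; it is not the RouteFinder algorithm itself, and the function names, the cost function and the coordinates are hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def most_probable_route(stops, expected_km):
    """Pick one candidate per stop, minimising total deviation of leg
    lengths from expected_km (dynamic programming over layers).

    stops: non-empty list of non-empty lists of (lat, lon) candidates.
    Returns (total_cost, route).
    """
    # best holds, per candidate of the current stop, the cheapest path to it.
    best = [(0.0, [c]) for c in stops[0]]
    for candidates in stops[1:]:
        new_best = []
        for c in candidates:
            cost, path = min(
                ((prev_cost + abs(haversine_km(prev_path[-1], c) - expected_km),
                  prev_path)
                 for prev_cost, prev_path in best),
                key=lambda pair: pair[0],
            )
            new_best.append((cost, path + [c]))
        best = new_best
    return min(best, key=lambda pair: pair[0])


# Synthetic coordinates: the second and third stops each have one
# plausible candidate and one far-away namesake.
stops = [
    [(0.0, 0.0)],
    [(0.0, 0.3), (0.0, 5.0)],
    [(0.0, 0.6), (10.0, 10.0)],
]
cost, route = most_probable_route(stops, expected_km=30.0)
print(route)
```

Because each layer only needs the best path into every candidate of the previous layer, the work grows with the number of stops times the square of the candidates per stop, rather than with the number of all possible routes.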

The newly developed methods were applied to James Silk Buckingham’s six travel diaries, which are available digitally in online archives. From December 1815 to December 1816 Buckingham travelled from Alexandria in Egypt through Palestine, Syria, Mesopotamia and Persia to India. Travel records from this period are particularly interesting to climatologists, because Mount Tambora in Indonesia erupted in spring 1815. In the aftermath of the eruption, large parts of Europe and North America experienced the so-called “year without a summer” in 1816.

Buckingham’s localised and coded weather records were transformed into weather tables. These tables, three pages in total, capture most of the climatically relevant information in over 3100 pages of travel reports.

During his travels, Buckingham experienced two droughts: one in the southern Levant, starting in winter 1815/1816, and another that extended across Persia and must have started between two and three years earlier, depending on the region. For a “Tambora-like” eruption, modern climate models predict weather patterns precisely the opposite of those observed by Buckingham.



Current research

DIS-AGREE (Grant: The European Campus "Seed Money")

This humanities project in linguistics is led by the University of Freiburg and carried out jointly with the Universities of Basel, Haute-Alsace and Strasbourg. Information and contact

Upcoming Events

Tuesday, 19th November, 9-12pm
"Linguists Anonymous" Writing Group

Wednesday, 20 November 2019, 18:15-19:45
Redefining endangerment and indigenousness. The case of Chabacano (lecture series "Visibilizar lo invisible: lenguas indígenas del mundo iberorrománico")


From 1 December 2018, Prof. Dr. Juan Ennis (U. La Plata / Buenos Aires), Prof. Dr. Mar Garachana (U. Barcelona), Prof. Dr. Elisabeth Gülich (U. Bielefeld) and Prof. Dr. Michael B. Buchholz (I.P.U. Berlin) are available as supervisors and/or reviewers in their capacity as associated professors of our school. We welcome our new colleagues!

On 15 October 2018 the Corpus Salcedo by Pieter Muysken was published. It was edited in Freiburg and Basel in collaboration with an international team and can now be used via moca3 (Daniel Alcón), the corpus management tool developed in Freiburg.


PhD Scholarships Hermann Paul Scholarships in Linguistics 2019

The Hermann Paul Scholarship in Linguistics 2019 in Basel went to Ye Ji Lee. Congratulations!

PhD Scholarships Hermann Paul Scholarships in Linguistics 2018

The Hermann Paul Scholarship in Linguistics 2018 in Basel went to Joelle Loew. Congratulations!

PhD Scholarships Hermann Paul Scholarships in Linguistics 2017

The Hermann Paul Scholarships in Linguistics 2017 in Basel went to Robert Reinecke and Valentina Saccone. Congratulations!

Hermann Paul Prize for Outstanding Dissertations

Since winter 2018 we have awarded the Hermann Paul Prize for outstanding dissertations annually.