Development of a geographical dictionary in the context of NOKIS
The North and Baltic Sea Information System (NOKIS) aims to develop an information infrastructure for the German coast. This article describes how a geographical dictionary, a "gazetteer", has been developed within the context of NOKIS. It also describes the problems encountered and the evaluation of the concept, including some field research.
Background of NOKIS
NOKIS, the North and Baltic Sea Information System, aims to establish a metadata-driven information infrastructure for the German coast. The system uses the international standard ISO 19115 for metadata and provides a working environment for metadata production, consisting of an editor developed for this purpose and a map-based search that retrieves existing metadata.
Based on NOKIS, a comprehensive information infrastructure was developed between 2001 and 2004 with the participation of eight typical coastal departments (Lehfeldt & Heidmann, 2004). Fourteen departments from the German coastal Federal States and the central government are now involved. The German Federal Ministry for Education and Research (BMBF) has supported the NOKIS++ project since 2005. In this project new functionalities are being developed (Lehfeldt et al., 2006a; Lehfeldt et al., 2006b; Kohlus & Heidmann, 2006a), with the additional objectives of making the described data usable and of establishing web services for coastal departments in accordance with ISO 19119. Among other things, the search for information by toponyms should be supported. Beyond that, the gazetteer service should be made available as an independent service. The gazetteer of the Alexandria Digital Library project (ADL 2005) was selected as the technical basis for implementing the NOKIS gazetteer. After some conceptual considerations (Kohlus & Heidmann, 2006b) it became clear that a coastal gazetteer needs some special characteristics that are not covered by the ADL data model.
Since the end of 2005, a concrete concept for the data contents has been developed, and it was implemented for test areas during 2006. The data were acquired using very different procedures, ranging from data transformation and the evaluation of data sources to field work. The heterogeneous material is used to examine the present gazetteer concept critically and to assess deviations from the data model and services.
What is a gazetteer?
The expression “gazetteer” goes back to the Italian word gazzetta. A gazetteer is a geographical dictionary that lists place names together with some additional explanations.
History of gazetteers
The Topographia Germaniae Inferioris (1642 to 1655) by Matthaeus Merian and Martin Zeiller became well known. Published at the time of the Thirty Years’ War, it delivered a complete survey of the German empire in sixteen volumes with copper engravings of city views and maps. One of the earliest topographies of Schleswig-Holstein, today a German federal state, was written by Dankwerth in 1652 under the title Newe Landesbeschreibung der zwey Hertzogthümer Schleswich und Holstein. In Britain the first gazetteers of this kind were written as early as the 16th century. They became common particularly in the 19th century, when many publishing companies released detailed local geographical listings. These took on a lexical character, with alphabetically arranged place names and descriptions of the locations. Gazetteers became as popular and widespread as dictionaries, encyclopaedias and other common reference works. In 1855 the renowned publisher W.G. Blackie praised his self-published Imperial Gazetteer:
“Next to a good dictionary, the most generally useful book is a good gazetteer” (after Gittings & Munro, 2004).
With the exponential growth of knowledge and information since the middle of the 19th century, interest declined in such encyclopaedic collections of knowledge, which quickly became incomplete and outdated.
Recent development of gazetteers
Only with the pervasive availability of computers have concepts for managing and providing encyclopaedic information received new impetus. World-wide information collections are developing, and search engines make much of this information accessible. Following NASA and some other providers, the search-engine provider Google also offers a free way of viewing any part of the earth in satellite or airborne images and maps (Zota, 2005). Toponyms are the key to linguistic communication about spatial objects, and satellite images or maps without toponyms are of limited use. For this reason Google Earth has implemented a gazetteer based on information from the National Geospatial-Intelligence Agency (NGA 2005). Digital gazetteers are also well known from route planning software, where they form the database alongside the route listings.
Such modern gazetteers are primarily catalogues of spatial units that are assigned to coordinates. Many projects concentrate on the development of gazetteers. They differ in the units described, the areas covered and the depth of information. They mainly list names of settlements and the administrative units to which these belong, but names of natural features and man-made structures (bridges, buildings) are also considered (Hill & Zheng, 1999). Global projects exist alongside regional projects, which often target local and historic themes. The World Gazetteer contains data on cities, extended by hierarchical information about the country and province and by other descriptive data such as total population and coordinates. For each country it is possible to retrieve information about provinces and cities, as well as a map. The regional Gazetteer for Scotland (Gittings & Munro, 2004) contains not only administrative units but also topographic names, and it links spatial units with historical events or family names.
On the basis of such gazetteers, search tools can be developed that use place names to find information with a spatial reference. These software tools are called gazetteer services. They use the semantic information and geographical ontology of the gazetteer database to generate queries on the datasets. The main goal of the gazetteer project is to make it possible to find datasets on measurements, mapping or literature, among other things, with the help of toponyms.
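The principle of such a gazetteer service can be illustrated with a minimal Python sketch. It assumes that every toponym is stored with a rectangular footprint and that every dataset carries a rectangular extent in the same coordinate system. The names `Footprint`, `overlaps` and `find_datasets`, as well as all coordinates and dataset titles, are hypothetical and are not taken from the NOKIS or ADL implementation.

```python
from typing import Dict, List, NamedTuple


class Footprint(NamedTuple):
    """Rectangle given by minimum and maximum coordinates (assumed layout)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def overlaps(a: Footprint, b: Footprint) -> bool:
    """Return True if the two rectangles share at least one point."""
    return (a.min_x <= b.max_x and b.min_x <= a.max_x
            and a.min_y <= b.max_y and b.min_y <= a.max_y)


def find_datasets(toponym: str,
                  gazetteer: Dict[str, Footprint],
                  datasets: Dict[str, Footprint]) -> List[str]:
    """Translate a place name into a footprint and return overlapping datasets."""
    footprint = gazetteer.get(toponym)
    if footprint is None:
        return []  # unknown name: nothing to search for
    return [title for title, extent in datasets.items()
            if overlaps(footprint, extent)]


# Illustrative data only; the coordinates are arbitrary local units.
gazetteer = {"Trischen": Footprint(10, 10, 20, 20)}
datasets = {
    "Sediment mapping": Footprint(15, 15, 40, 40),
    "Harbour survey": Footprint(50, 50, 60, 60),
}
print(find_datasets("Trischen", gazetteer, datasets))
```

A real service would additionally evaluate feature types, name variants and temporal validity, which this sketch leaves out.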
Determination of the study area
At the end of 2005, first considerations were made about the volume and sources of the vocabulary and about how it should be compiled. First, a spatial limit was set. The discussion of coastal areas in the context of ICZM showed that the extent of the coastal zone, seawards as well as towards the hinterland, depends on the question being asked. Questions such as the input of harmful substances by river systems or the linkage of transport networks and ports can only be addressed from an overview perspective. Questions concerning the protection of salt marshes or the planning of coastal protection structures need detailed local information. The relevant influencing factors are concentrated in the narrow transition area from sea to land. In the hinterland, therefore, only a coarse net of settlement names and administrative units is generated. Compared to the land, only a few toponyms are likely to be found in the offshore area west of the "Wadden". For NOKIS, mainly the names within the Wadden Sea (coastal waters according to the European Water Framework Directive, EWFD) and within a ten-kilometre zone along the coast will be processed. In some test areas, micro-toponyms relating to specific aspects will also be collected (Figure 1). The borders and designations of research areas, map sheet boundaries and areas with special status (protected areas, military areas etc.) are originally determined by the partner institutes of the NOKIS project. These data are made directly available by the project partners and can be used as named areas for the search.
Other place names, in particular field names, cannot be provided with full coverage for the entire area because of the quantity and complexity involved in evaluating the sources. However, in the test areas field names are acquired for some typical landscape features such as polders, dwelling mounds or bluffs. With the aid of the data acquired in the test areas, the technical concept for the gazetteer can be examined. Based on this data collection, the data model was analysed and extended to meet the requirements of geographical ontology, temporal validity and the provision of references. Figure 1 shows that toponyms are made available seawards for the German sovereign territory and up to 50 km into the hinterland, measured from the inner borders of the coastal waters according to the EWFD. In the test areas (outlined in black), different data sources for compiling the vocabulary are made available and used to examine the data modelling and the services.
A framework of settlement and marine water names is provided for the whole area. In the marine area, the digital landscape model 1:250,000 of the BKG (Federal Office for Cartography and Geodesy) is used as the basic source; it already contains validity areas for many toponyms. Names and their spatial allocations are harmonized with and completed from other sources. The location and dimensions of the objects are updated with the help of the bathymetric information in current nautical charts (see next chapter). For the terrestrial area, the digital base landscape model (DLM) of ATKIS (Official Topographic-Cartographic Information System; AdV, 2005) of the land surveying offices of Germany’s Federal States is used. NOKIS uses the homogenised and standardized dataset of the BKG. Names of settlements, administrative areas and waters are extracted and geometrically simplified.
The settlement area of one locality usually consists of several separate parts. In ATKIS this area is often additionally divided along map sheet boundaries or supply units. In their current state, the ATKIS data do not allow such separate parts to be distinguished from another locality with the same name. To create a unique identification code for settlements with homonyms, the data are reworked with a GIS. Spatial neighbourhood and intersections with administrative areas are used to create this code. The identification code is also needed to link additional information, such as historical designations or designations in other languages, to the framework objects. Besides the settlements, the administrative municipal areas are taken from ATKIS (AdV, 2005). Separate polygons can be joined by a unique municipal code. Since the ATKIS data contain no version number or publication date, the file date of the available data is assigned as the temporal validity of the geometries. A tool has been created to convert the data, toponyms and geometries, into an XML structure that can be imported into the database of the gazetteer service.
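How separate parts of one locality might be joined under a common identification code and then exported can be sketched in Python as follows. The sketch assumes that the GIS step has already attached a municipal code to each polygon; the ID scheme, the function names and the XML element and attribute names are hypothetical and do not reproduce the actual NOKIS tool or the ADL import format.

```python
import xml.etree.ElementTree as ET
from collections import OrderedDict


def settlement_id(name, municipal_code):
    """Hypothetical ID scheme: municipal code plus normalised place name."""
    return f"{municipal_code}:{name.strip().lower().replace(' ', '_')}"


def group_parts(parts):
    """Join polygons that share name and municipal code into one entry.

    parts: iterable of (name, municipal_code, polygon), where polygon is a
    list of (x, y) tuples. Homonyms in other municipalities get their own ID.
    """
    grouped = OrderedDict()
    for name, code, polygon in parts:
        sid = settlement_id(name, code)
        entry = grouped.setdefault(sid, {"name": name, "polygons": []})
        entry["polygons"].append(polygon)
    return grouped


def to_xml(grouped, file_date):
    """Serialise the entries; the file date stands in for temporal validity."""
    root = ET.Element("gazetteer")
    for sid, entry in grouped.items():
        node = ET.SubElement(root, "entry", id=sid, validFrom=file_date)
        ET.SubElement(node, "name").text = entry["name"]
        for polygon in entry["polygons"]:
            coords = " ".join(f"{x},{y}" for x, y in polygon)
            ET.SubElement(node, "polygon").text = coords
    return ET.tostring(root, encoding="unicode")


# Illustrative input: two parts of one locality and a homonym elsewhere.
parts = [
    ("Neuendorf", "A1", [(0, 0), (1, 0), (1, 1)]),
    ("Neuendorf", "A1", [(2, 2), (3, 2), (3, 3)]),
    ("Neuendorf", "B2", [(9, 9), (10, 9), (10, 10)]),
]
print(to_xml(group_parts(parts), "2006-01-01"))
```

Using the municipal code as part of the key is only one possible way to separate homonyms; the text above also mentions spatial neighbourhood, which this sketch does not evaluate.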
Evaluation of exemplary sources
Littoral Toponyms and Water-Names
A special challenge for the gazetteer service is the fast-changing geomorphology of the coastal area. The best-known case is the small island of "Trischen", which shifts eastwards by about 30 m per year (Figure 2). On the west coast of this island, the remains of a polder built after World War I have completely disappeared. The North Frisian barrier islands (Nordfriesische Außensände) have shifted eastwards at a similar speed. Hence, the footprint of a toponym (Hill, 2006) can only be identified for a certain point in time.
In the northern part, seawards of the mouth of the Elbe, as well as in the mouths of large tidal channels such as the Hever, the spatial shifting of sandbanks, tidal shoals and creeks is in part even more pronounced. Large tidal shoals, sometimes with designated names, may become divided, grow together, change their form or disappear completely within a few years. The water names and littoral toponyms are acquired from the DLM 1:250,000 of the BKG. Within the Wadden Sea, i.e. for the names of tidal shoals, tidal channels and creeks as well as shallows and bays, the maps of Geographical Names in the German Coastal Waters (StAGN, 2005) are evaluated alongside current nautical charts. In the test area of the Hallig Wadden Sea the Frisian exonyms are mapped additionally. For the Dithmarscher Wadden Sea the extensive collection by Falkson (2000) can be used. For many of the historical names that Falkson (2000) found in documents, no spatial reference can be given. Likewise, names from sailing instructions or from maps created before the 19th century could not be located satisfactorily by coordinates. Only the region in which a certain object was located could be determined. Such a relative spatial reference differs in character from the representation of a real object by its absolute geometry. The use of bounding boxes, i.e. the definition of a rectangle in the coordinate system by maximum and minimum coordinate values, is a common procedure and is also used in the gazetteer.
In the Wadden Sea area the spatial boundaries of named objects are not equally sharp. Typical geographic forms often received their names from fishermen or other seamen and reflect their perspective. Typically a tidal channel is named after the shape of its course through the intertidal shoals (Falkson, 2000). The naming of intertidal shoals is often based on attributes that are characteristic from the viewpoint of spatial interaction and the observer’s position. For example, the tidal flats in front of Wesselburener Koog are called Wesselburener Watt; further west the name changes to Linnenplate with the Linnensand, and the part most exposed to the sea is called Isern Hinnerk. The borders between these named sections are diffuse and haphazard.
Likewise, the channels, which are partly bounded laterally and marked with buoys, gradually merge seawards into deep water. There, the borders of a toponym’s validity area can only be specified by definition or, better, described by a transition area. Such borders have a soft character, in contrast to those at the edges of tidal creeks. The accuracy of the borders not only varies from object to object but can also differ greatly within the same object. For searching with a gazetteer, it is desirable that such inaccuracies can be taken into account in the spatial search with the help of buffer areas.
For the region of the North Frisian Islands, the Wadden Sea and the mainland, a map with the Frisian place names is analysed (Holander & Jörgensen, 1973). In a first step, the Frisian names are assigned to the settlements derived from the ATKIS data. Furthermore, the inclusion of water names, dwelling mounds and other features is being prepared in the test area of the Hallig Wadden Sea.
Names in other languages (in the test area, for example, Danish) as well as designations in dialects or historical language forms should also be supported by the gazetteer. Special characters from several languages have to be managed in the database. Especially for names in rare languages and dialect forms, documentation of the pronunciation appears desirable. Transcription into the IPA (International Phonetic Alphabet; IPA, 1999) offers a basic approach.
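The bounding-box definition given above can be written down in a few lines of Python. This is a minimal sketch; the function name `bounding_box` and the tuple order (min x, min y, max x, max y) are assumptions made for illustration.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bounding_box(points: Iterable[Point]) -> Box:
    """Return the smallest axis-parallel rectangle enclosing all points."""
    pts = list(points)
    if not pts:
        raise ValueError("at least one point is required")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))


# Outline of a hypothetical tidal shoal in arbitrary map units.
outline = [(3.0, 4.0), (1.0, 7.0), (5.0, 2.0)]
print(bounding_box(outline))  # (1.0, 2.0, 5.0, 7.0)
```

For a historical name whose location is known only as a region, the same rectangle would describe that region rather than the outline of the object itself.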
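One simple way to consider soft borders in a search is to store a buffer distance with each footprint and to widen the footprint by that distance when a position is tested. The following Python sketch assumes rectangular footprints and a single buffer value per object, which cannot express accuracy that varies along one border; the class `FuzzyFootprint` and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyFootprint:
    """Rectangular footprint with an assumed border uncertainty."""
    name: str
    min_x: float
    min_y: float
    max_x: float
    max_y: float
    buffer: float = 0.0  # same units as the coordinates

    def contains(self, x: float, y: float, use_buffer: bool = True) -> bool:
        """Test a position, optionally widening the box by the buffer."""
        b = self.buffer if use_buffer else 0.0
        return (self.min_x - b <= x <= self.max_x + b
                and self.min_y - b <= y <= self.max_y + b)


# A channel with a soft seaward border (illustrative values only).
channel = FuzzyFootprint("Example channel", 0, 0, 100, 20, buffer=15)
print(channel.contains(110, 10))                    # True: inside the buffer
print(channel.contains(110, 10, use_buffer=False))  # False: outside the box
```

An object with sharp borders, such as a polder enclosed by dykes, would simply keep the default buffer of zero.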
Field names of polders and islands
The field names of the polders currently existing on the west coast were gathered quickly thanks to the good data sources. Digital data on the polder areas are stored in the information system of the national park administration for the Schleswig-Holstein Wadden Sea. They had been revised in the course of the publications used here (Kunz & Panten, 1997; Kunz et al., 1997; Kohlus, 2000). Errors noticed after publication were corrected, and toponyms determined by Falkson (2000) were added.
The sources contain information about the type of names (alternative names, former names and, in part, Frisian names) and about the building history of the polders, including the date of construction. Polders have a clear boundary drawn by dykes. For many polders the exact year of construction is known, and many received their names not by adopting existing field names but through a concrete act of designation.
In principle, the names of polders seem to be a simple topic for a gazetteer, which assigns names to objects in space and time. Nevertheless, a polder can be lost, and for some polders even the date of the collapse is well known. Names can be lost as well: polders that were named after leading figures during the National Socialist era, for example, were renamed after 1945. Conversely, a name can survive the loss of an object and, more frequently, a change in its character. Typically, a summer polder becomes part of a larger embankment. The name of the summer polder then survives as the designation of a subsection within the new polder (e.g. Kettelsbüller Sommerkoog in the Speicherkoog Nord).
Even though construction and designation often took place as a simultaneous act, the examples show, particularly clearly in the case of explicit renaming, that the temporal validity of names, geometries and objects has to be described independently. For older polders, a particular year of embankment often cannot be determined; the source material only allows a temporal allocation to a century or a part of it. A polder built in the 17th century did not exist under its name before 1601. The same applies to a polder dated to the first half, the first quarter etc. of the 17th century. For searching and calculating the temporal validity, a numeric definition is essential, but this cannot be achieved with a single term. For time spans described in this way, a characteristic value and a term of temporal haziness are used: the 17th century can be expressed as 1650 ± 50 years, the first half of the 17th century as 1625 ± 25 years. The likewise frequent indication of a minimum age (before 1725) or the exclusion of a pre-existence (after the 17th century) can be described by giving the term of temporal haziness a direction (1650 - 100 years).
Such complex management of temporal data is relevant not only for polders but is necessary for many other objects of the geosciences and history. There are many other forms of temporal definition, such as historic eras, periods of art etc. The most complex field is that of archaeological objects such as tumuli or old dwelling mounds in the investigation area. Basically, one can differentiate between events and eras (the year 800, or around 800 AD). Like periods of art, periods defined by cultural and technical characteristics (e.g. the Iron Age) have no fixed dating and are not independent of geographical location. First solutions for working with periods have been worked out by Feinberg et al. (2003) and Petras et al. (2006). However, archaeological and geological datings are mostly correlated only with relative time systems such as the C14 rate. This correlation, and the correlation of the relative time system with an absolute dating system, depend on the state of knowledge and are not static.
In the focal area of the middle North Frisian Wadden Sea, the opportunity arose to extend the database by means of field research. With the assistance of residents of the island of Amrum and the Hallig Oland, it was possible to ask inhabitants and tenants about common field names. Heavy migration, the dissolution of social communities and stable social structures, particularly in rural areas, together with increasing communication via print and electronic media, result in the increasing loss of orally transmitted information and local names such as field names (see e.g. Falkson 2000, Bd. 1, p. 105). On a Hallig, by contrast, with its high degree of isolation, small population and very stable population, unusually favourable conditions for the conservation of toponyms still exist. In the framework of the coastal gazetteer some of these toponyms could be collected and made available. Furthermore, the current collection shows which of the toponyms extracted from historical sources are still in common use today; designations that are no longer used can be assigned an end value of temporal validity. Many of these place names have probably been out of common use for decades, but in the gazetteer the date of the survey was used as the secured end value of the temporal validity unless there were plausible arguments for another date.
On the Deutsche Grundkarte (the German base map) of Oland, which is used as the basis for the name collection, only three field names are shown: a sports field, the word Teich (pond) on the site of a former sediment extraction, and the Seeslot, a tidal creek draining into the port. In contrast to the conditions observed by Falkson (2000) in the Kirchspiel Büsum, some field names on Oland are actively used in communication. Even the first results, which can be cited here, show that changes in language and pronunciation are ongoing processes. Thus, Nickelshaage (Schmidt-Petersen, 1925) has become Nickelsjage. Di Pöpe, in the transcription of Schmidt-Petersen (1925), is today referred to as Pipe. Instead of Schwinnehalli (Schmidt-Petersen, 1925), an accumulation near the port is today called Schwinnshallig. The folk-etymological change of meaning explained by A. Schmidt-Petersen (1975), and already carried out by J. Schmidt-Petersen (1925), has thus manifested itself linguistically. Other toponyms such as Presters Feen or Klerebüll (Schmidt-Petersen, 1925) are used unchanged. Field names of recent origin were also added: a stone reclamation ditch from the beginning of the 20th century, designated as Wohlerts Fanger, and the salt marshes along the causeway to Oland built in 1896 (Steensen, 1996), designated as the Neueres Land and divided into Nordervorland and Südervorland. The place names were documented primarily in High German. The aim is also to record the field names in their Low German form.
Where possible, the field names obtained from oral reports should also be stored in IPA phonetic notation (see Falkson, 2000). The advantage of including an IPA transcription in the gazetteer has already been pointed out in a previous section.
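The combination of a characteristic value, a term of temporal haziness and an optional direction can be sketched as a small Python data structure. The class name `HazyDate`, its field names and the encoding of the direction as -1, 0 or +1 are assumptions for illustration and do not describe the NOKIS data model.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HazyDate:
    """Year with a haziness term; direction limits the haziness to one side."""
    value: int          # characteristic year, e.g. 1650
    haziness: int       # width of the uncertainty in years
    direction: int = 0  # 0: plus/minus, -1: only earlier, +1: only later

    def interval(self) -> Tuple[int, int]:
        """Return the earliest and latest year covered by the description."""
        earliest = self.value - self.haziness if self.direction <= 0 else self.value
        latest = self.value + self.haziness if self.direction >= 0 else self.value
        return earliest, latest

    def may_include(self, year: int) -> bool:
        """True if the given year lies inside the hazy interval."""
        earliest, latest = self.interval()
        return earliest <= year <= latest


print(HazyDate(1650, 50).interval())       # (1600, 1700), the 17th century
print(HazyDate(1625, 25).interval())       # (1600, 1650), its first half
print(HazyDate(1650, 100, -1).interval())  # (1550, 1650), directed haziness
```

Storing the interval borders as plain numbers keeps temporal queries as simple as the comparison in `may_include`.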
- NOKIS - Information Infrastructure for the North and Baltic Sea
- Bernard, L., Einspanier, U., Haubrock, S., Hübner, S., Kuhn, W., Lessing, R., Lutz, M. & U. Visser (2003), Ontologies for Intelligent Search and Semantic Translation in Spatial Data Infrastructures. Photogrammetrie–Fernerkundung-Geoinformation 6/2003, p. 451-462.
- Hill, L. L., Frew, J. & Q. Zheng (1999), Geographic names: The implementation of a gazetteer in a georeferenced digital library. D-Lib Magazine, Vol. 5/1.
- Kazakos, W. & F. Sellerhof (2006), Web-Services und Geodaten. In: Traub, K-P. & Kohlus, J., GIS im Küstenzonenmanagement. Grundlagen und Anwendungen. Wichmann Verlag, Heidelberg.
- ISO - International Organization for Standardization (ed.) (2003): ISO 19112:2003. Geographic information - Spatial referencing by geographic identifiers.
- Lehfeldt, R. & C. Heidmann (2004), Erstellung eines Metadaten Informationssystems für die Küstenforschung und das Küsteningenieurwesen. Final Report, BAW Hamburg.
- Lehfeldt, R., Heidmann, C., Reimers, H-C., Kohlus, J. & M. von Weber (2006a) NOKIS - Nord- und Ostsee KüstenInformationsSystem - Netzwerk der Metadaten. In: Traub, K.-P. & Kohlus, J. (ed.) GIS im Küstenzonenmanagement - Grundlagen und Anwendungen. Wichmann Verlag, Heidelberg, p. 150 - 160.
- Lehfeldt, R., Heidmann, C., Sellerhoff, F. & H.-C. Reimers (2006b) Managing Information in the Coastal Zone. Proc. 7th Intl. Conf. Hydro-Science and -Engineering, Philadelphia, USA.
- Kohlus, J. & C. Heidmann (2006a) Data Retrieval and Usage - The North- and Baltic Sea Information System (NOKIS). Proc. NatureProtection:GIS - International Symposium on Geoinformatics in Nature Protection Regions. 13th – 14th Nov. 2006, Dresden.
- Kohlus, J. & C. Heidmann (2006b): Ein digitaler Gazetteer für die Küste. In: Traub, K.-P. & Kohlus, J.(ed.): GIS im Küstenzonenmanagement - Grundlagen und Anwendungen. Wichmann Verlag, Heidelberg, p. 180 - 191.
- Merian, M. & M. Zeiller (1642 – 1655), Topographia Germaniae Inferioris. Merian, Frankfurt a. M.
- Dankwerth, C. (1652), Newe Landesbeschreibung der zwey Hertzogthümer Schleswich und Holstein. Schleswig.
- Gittings, B. & D. Munro (2004), Background to the Gazetteer for Scotland
- Zota, V. (2005) Deutschsprachige Geodaten für Google Earth. c't news of 3 July 2005.
- NGA - National Geospatial-Intelligence Agency (ed.) (2005)
- Hill, L. L. & Q. Zheng (1999). Indirect Geospatial Referencing through Place Names in the Digital Library: Alexandria Digital Library Experience with Developing and Implementing Gazetteers. Proc. of the 62nd Annual Meeting of the American Society for Information Science, Washington.
- Arbeitsgemeinschaft der Vermessungsverwaltungen (AdV; ed.) (2005) ATKIS- Objektartenkatalog Basis-DLM
- Hill, L. (2006) Georeferencing: The Geographic Associations of Information. MIT Press.
- StAGN-Ständiger Ausschuss für Geographische Namen (ed.) (2005), Geographische Namen in den deutschen Küstengewässern. 4 maps 1: 200.000. In collaboration with the offices for land survey of Lower Saxony, Schleswig-Holstein and Mecklenburg-Vorpommern, Frankfurt a. M.
- Falkson, K. (2000), Die Flurnamen des Kirchspiels Büsum (Dithmarschen) – einschließlich der Flurnamen des Dithmarscher Wattenmeeres. In: Kieler Beiträge zur Deutschen Sprachgeschichte, Bd. 20.1 and 20.2, Wachholtz Verlag, Neumünster.
- Wieland, P. (2000), Trischen - die Geschichte einer alluvialen Insel im Dithmarscher Wattenmeer. Die Küste, H. 62.
- Holander, R. K. & V. T. Jörgensen (1973) Nordfriesland / Nordfriislon mit den friesischen Ortsnamen, Landkarte M. 1:100.000 und Register; Bredstedt / Bräist, Nordfriisk Instituut.
- IPA - International Phonetic Association (ed.) (1999) Handbook of the International Phonetic Association. A Guide to the Use of the International Phonetic Alphabet. Cambridge University Press.
- Kunz, H. & A. Panten (1997) Die Köge Nordfrieslands. Quellensammlung. Verlag Nordfriisk Instituut. Bräist/Bredstedt.
- Kunz, H., Kohlus, J. & A. Panten (1997) Die Köge Nordfrieslands. Map. Verlag Nordfriisk Instituut. Bräist/Bredstedt.
- Kohlus, J. (2000), Die Köge Dithmarschens. In: Gietzelt, M. (ed.), Geschichte Dithmarschens. Edited for Verein für Dithmarscher Landeskunde e.V.
- Feinberg, M., Mostern, R., Stone, S. & M. Buckland (2003) Application of geographical gazetteer standards to named time periods. Technical report, Electronic Cultural Atlas Initiative, Berkeley.
- Petras, V., Larson, R. R., & M. Buckland (2006) Time period directories: a metadata infrastructure for placing events in temporal and geographic context. Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries, p. 151 – 160; ACM Press, New York.
- Schmidt-Petersen, Jürgen (1925) Die Orts- und Flurnamen Nordfrieslands. Husum.
- Schmidt-Petersen, Asmus (1975) Beiträge zur Kenntnis der Orts- und Flurnamen der Insel Amrum und der Halligen. Husum.
- Steensen, T. (1996): 19. und 20. Jahrhundert. In: Nordfriisk Instituut (ed.), Geschichte Nordfrieslands. 2nd act. edition, Heide.