Utah News, June 2011
Employees at the University of Utah’s Marriott Library have developed a process to enhance access to digital collections using a geographic interface. The innovative part of this process is a script that automatically populates the metadata with latitude and longitude coordinates, which can then be used to plot the location of a digital resource, such as a historical photograph, on a Google Map.
Many collections have numerous digital objects with a geographic dimension that are natural candidates for mapping, but the geospatial metadata may vary greatly in quality, and often the geographic coordinates are not known. An example is the Western Soundscape Archive, which is hosted by the Marriott Library. Our team included a computer programmer who wrote a script to read the place name metadata and generate latitude-longitude pairs using a Google geocoding application programming interface (API). He also extracted a list of unique place names and sent it to the team’s metadata cataloger (WAML’s own Ken Rockwell), who assigned a ranking to each place name. This ranking system was necessary so that the subsequent script would update each item with the most local and accurate coordinate data. Because records could contain multiple place names separated by semicolons, the script was designed to populate the latitude and longitude fields from the most specific place name first. His system for coordinate ranking:
1 = Very localized, such as a specific address within a city or a specific landmark (e.g., Pioneer Square in Seattle, Delicate Arch in Utah, etc.)
2 = Locality, such as a town, city, or smallish region
3 = Larger region but within one county (or in the case of Yellowstone and Alaskan regions, no county)
4 = County
5 = Multi-county regions (e.g., National Forests, mountain ranges like the Cascades, etc.)
6 = States, Provinces
7 = Countries [used for recordings from outside the U.S. and Canada]
To use a Utah example: He assigned a “1” to Silver Lake in Brighton, which is a “2.” Brighton is in Big Cottonwood Canyon (“3”), which is in Salt Lake County (“4”) and in the Wasatch Mountains (“5”), in Utah (“6”).
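The selection rule implied by this ranking can be sketched in a few lines of Python. This is only an illustration, not the team's actual script: the rank table, the function name `most_specific_place`, and the exact spelling of the place-name strings are assumptions made for the example.

```python
# Hypothetical rank table. In the project, ranks were assigned by hand
# by the cataloger; these entries mirror the Utah example above.
PLACE_RANKS = {
    "Silver Lake": 1,
    "Brighton": 2,
    "Big Cottonwood Canyon": 3,
    "Salt Lake County": 4,
    "Wasatch Mountains": 5,
    "Utah": 6,
}


def most_specific_place(place_field, ranks=PLACE_RANKS):
    """Return the lowest-ranked (most local) place name in a
    semicolon-separated metadata field, or None if none is ranked."""
    # Split on semicolons and drop surrounding whitespace and empty entries.
    names = [name.strip() for name in place_field.split(";") if name.strip()]
    # Ignore names the cataloger has not ranked.
    ranked = [name for name in names if name in ranks]
    if not ranked:
        return None
    # Rank 1 is the most specific, so the minimum rank wins.
    return min(ranked, key=lambda name: ranks[name])


print(most_specific_place("Utah; Salt Lake County; Brighton; Silver Lake"))
```

With the table above, the call returns "Silver Lake", the name that would then be sent to the geocoder first.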
Following coordinate generation, he reviewed the retrieved geocoordinate pairs for problems. There were a few. For example, when the geocoding API could not find a locality, it returned the center coordinates of the next-ranked place, which might be the state. For obscure places such as a small natural preserve within a city, some exploration on the Web, the USGS Geographic Names Information System, and Google Earth helped to locate a more precise geocoordinate pair. Also, linear features such as canyons and streams usually retrieve a single point (usually at the lowest elevation, the stream or canyon outlet), which may be less than ideal as a representative of the recording place, so a midpoint may be selected using Google Earth.
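The review described here was done by hand. As one possible aid, a short Python check could pre-screen results for the fallback problem by flagging any place whose coordinates coincide with those of a broader place in the same record. The function name `flag_fallbacks`, the tuple layout, the tolerance, and the sample coordinates are all assumptions for illustration, not real project data.

```python
def flag_fallbacks(results, tolerance=1e-4):
    """Flag place names whose coordinates match a broader place.

    `results` is a list of (name, rank, lat, lon) tuples for one record.
    A match with a higher-ranked (less specific) place suggests the
    geocoder fell back to the broader place's center point.
    """
    flagged = []
    for name, rank, lat, lon in results:
        for _other, other_rank, other_lat, other_lon in results:
            same_point = (abs(lat - other_lat) <= tolerance
                          and abs(lon - other_lon) <= tolerance)
            if other_rank > rank and same_point:
                flagged.append(name)
                break  # one match is enough to flag this place
    return flagged


# Illustrative values only: the preserve shares the state's point.
record = [
    ("Some Preserve", 1, 39.32, -111.09),
    ("Brighton", 2, 40.60, -111.58),
    ("Utah", 6, 39.32, -111.09),
]
print(flag_fallbacks(record))
```

Flagged names would still need the manual lookup described above; the check only narrows down where to look.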
Once the coordinates were finalized, they were loaded into the collection in separate latitude and longitude fields, and another team member, the digital initiatives librarian, produced a .kml file from them. This was fed into a free program, Earth Point, that generated a Google map. Once the map interface is uploaded, a user will be able to click on a locality and see a hyperlinked thumbnail that links to the sound recording associated with a species found at that location.
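For readers unfamiliar with the format, a .kml file of this kind can be generated with Python's standard library. The sketch below is not the method used at the Marriott Library; the record keys (`title`, `url`, `lat`, `lon`), the function name `build_kml`, and the sample URL are hypothetical. Note that KML writes coordinates as longitude first, then latitude.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(records):
    """Build a KML document with one Placemark per record.

    Each record is a dict with hypothetical keys: title, url, lat, lon.
    """
    # Serialize with KML as the default namespace (no prefix on tags).
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for rec in records:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = rec["title"]
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = rec["url"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinate order is longitude,latitude.
        coords = f"{rec['lon']},{rec['lat']}"
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = coords
    return ET.tostring(kml, encoding="unicode")


# Illustrative record; the title, URL, and coordinates are made up.
sample = [{"title": "Example recording", "url": "https://example.org/item/1",
           "lat": 40.6, "lon": -111.58}]
print(build_kml(sample))
```

The resulting string can be saved with a .kml extension and opened in any tool that reads KML.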
An article describing the programming steps is being submitted to an online journal on metadata. This procedure may be of value to other libraries with digital collections. We are in the process of exploring other collections to which we may apply it. --Ken Rockwell