Everything You Ever Wanted to Know About Data Models (But Were Afraid to Ask)

Data models. The topic elicits either a "knowing nod" or a sense of "fear and desperation" from geospatial professionals. In my experience, many of us have only a vague sense of what a data model is, along with questions about how it affects our GIS work. I want to take a look at those topics, along with how data models are changing the marketplace, how they are changing our day-to-day work, and what they may bring in the future.

What is a Data Model?
As users of geospatial technology, we model the world according to our own specific purposes. We might be interested in land management, site location, water network maintenance, or any one of the topics in the long list of applications for GIS. Each of us chooses how to model our data to fit the type of problem we are trying to solve, whether it be a raster or vector model, or a digital elevation model. Many of us mix and match the models. Some of us require built-in topology, while others do not. Some of us separate the spatial information and the database information. These are all "higher level" data model questions.

With GIS now thirty-odd years old, we are looking even more deeply at application-specific data models. Vendors, companies, schools, government agencies, and standards organizations are developing data models aimed at specific disciplines. Put simply, data models are structures into which data is placed. (A structure is a set of objects and the relationships between them.) The model ideally makes the work to be done using the GIS software and GIS data "easier." ESRI puts it this way: "Our basic goals [in developing data models] are to simplify the process of implementing projects and to promote standards within our user communities."

Data models can detail field names (e.g., "roads" vs. "streets"), the number of attributes for each geographic feature (e.g., 2 or 200), where network nodes are required, how relationships of connected linear features are indicated, and much more. Some are aimed at a single software platform, while others are widely applicable across platforms, such as the Tri-Service's Spatial Data Standards for Facilities, Infrastructure, and Environment (SDSFIE). (Tri-Service uses both the terms "standard" and "data model" to describe its products.)
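To make the earlier definition concrete (a data model as a structure: a set of objects and the relationships between them), here is a minimal, hypothetical sketch in Python. None of these class names, field names, or types come from any actual vendor model or standard; they simply illustrate the kinds of decisions a data model records, such as the "roads" vs. "streets" naming choice and where features connect:

```python
from dataclasses import dataclass

@dataclass
class FieldDef:
    """One attribute of a geographic feature: its name and type."""
    name: str
    type: str  # e.g., "text", "integer", "geometry"

@dataclass
class FeatureClass:
    """An object type in the model: a named layer with its attributes."""
    name: str
    fields: list  # list of FieldDef

@dataclass
class Relationship:
    """A relationship between two object types, e.g., a connectivity rule."""
    origin: str
    destination: str
    kind: str  # e.g., "connects_to", "contains"

# A tiny illustrative "roads" model. The layer name, the attribute list,
# and the relationship are all decisions the data model fixes in advance.
roads = FeatureClass("roads", [FieldDef("road_name", "text"),
                               FieldDef("surface_type", "text"),
                               FieldDef("geom", "geometry")])
intersections = FeatureClass("intersections", [FieldDef("node_id", "integer"),
                                               FieldDef("geom", "geometry")])
model = {
    "feature_classes": [roads, intersections],
    "relationships": [Relationship("roads", "intersections", "connects_to")],
}
```

A discipline-specific data model is, in effect, a much larger and more carefully reviewed version of this structure, agreed on before anyone loads data into it.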

Who Creates Data Models? And Why?
From my research I've found three reasons to develop data models: a GIS user (an individual or group) develops a model to meet its organization's needs; a vendor develops a model and makes it available to customers to help "jumpstart" their implementations; and industry organizations and government agencies craft models with the idea that they will simplify data sharing. Motives can mix across those groups, so these distinctions are somewhat simplified.

ESRI and Data Models
I spoke with Steve Grise, ESRI Product Manager for Data Models. When I asked how standards play into data models, he pointed out that most data models are built on standards. He cited the existing Federal Geographic Data Committee (FGDC) Cadastral Data Content Standard, which details what information "should be" included in such a GIS layer. That Standard is the basis of the ESRI parcel data model, which incorporates all of the definitions included in the FGDC standard. As Grise puts it, "They are exactly the same." In essence, ESRI implemented the standard in ArcGIS; any other vendor or user can do the same.

The history of ESRI's data models dates back six to eight years, according to Grise. In typical ESRI-user fashion, a well-put-together model for water utilities from Glendale, California, built on the then-current version of ArcInfo, started to make the rounds to other users. That was the basis of the first formal ESRI data model, developed for water utilities. The project grew from there, and models, in various states of completion, are now available for 21 application areas.

Intergraph and Data Models
Data models may seem like a "hot topic" these days, but David Holmes, director of worldwide product strategy, Intergraph Mapping and Geospatial Solutions, makes it clear that the company has been developing such models for 15 years, since MGE and FRAMME came on the scene. Intergraph has models for utilities and communications, local government, transportation, cartography, geospatial intelligence, and other areas.

Holmes and his colleagues Brimmer Sherman (vice president, Utilities & Communications) and Kecia Pierce (industry consultant, Utilities & Communications) suggested a few reasons the topic seems to be "in the news" just now. The database focus of GIS (the trend toward storing both spatial and non-spatial data in a standard database) may have helped rekindle interest, according to Sherman. The new connections with information technology (IT) departments, which have always been active data modelers, may be a factor, too. Finally, the three suggested that new players in the GIS market may only now be realizing the return data models offer: it costs users considerable time and money to build from scratch rather than begin with a well-thought-out model from a vendor.

Open GIS Consortium and Data Models
To get a non-vendor perspective I spoke with Kurt Buehler, Chief Technical Officer of the Open GIS Consortium (OGC). (I am a consultant to that organization.) He made it clear that OGC is only interested in one type of data model: one that enhances interoperability.

The FGDC Data Content Standards are of particular interest because they are designed for exchange, and OGC and other organizations have been involved in their deployment. The objectives of the Cadastral Data Content Standard, for example, include providing "common definitions for cadastral information found in public records, which will facilitate the effective use, understanding, and automation of land records," along with aims "to standardize attribute values, which will enhance data sharing," and "to resolve discrepancies … which will minimize duplication within and among those systems." The reference models play two important roles: one for those crafting their own models, and another for organizations looking to make their data shareable with neighbors, states, commercial entities, and the National Spatial Data Infrastructure (NSDI). Data model implementers can take the common model and use it as the basis of their own data models, developed for their own purposes.

For those who want to make data available for sharing, the data model is the basis for a translation capability, or "layer" (that can be implemented in many different ways), that will allow an organization's currently implemented data model to behave "as if" it is the reference model. Essentially, a layer sits between the current implementation and the rest of the Internet world. That layer makes the underlying model look and act as though it matches the reference data model exactly.
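The translation layer described above can be sketched as a simple adapter. Everything in this example is hypothetical: the field names, the mapping, and the class are invented for illustration, and a real deployment would sit behind web services rather than a single Python class. The point is only the shape of the idea: local data keeps its own schema, while consumers see the reference model's names:

```python
# Hypothetical translation layer: a local dataset stores street records
# under its own field names; the adapter exposes them under invented
# "reference model" names, so consumers see only the reference schema.

LOCAL_TO_REFERENCE = {      # mapping invented for this illustration
    "str_name": "road_name",
    "pave": "surface_type",
}

class ReferenceModelAdapter:
    """Makes a locally modeled record behave 'as if' it used the reference model."""
    def __init__(self, local_record: dict):
        self._local = local_record

    def as_reference(self) -> dict:
        # Rename each locally named field to its reference-model equivalent;
        # fields with no mapping pass through unchanged.
        return {LOCAL_TO_REFERENCE.get(k, k): v for k, v in self._local.items()}

local = {"str_name": "Main St", "pave": "asphalt", "lanes": 2}
shared = ReferenceModelAdapter(local).as_reference()
# shared now uses the reference-model names: road_name, surface_type, lanes
```

The design choice here mirrors the article's point: the organization never has to restructure its internal data; only the thin layer between the implementation and the outside world changes.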

What's Ahead?
This is an active time in the GIS data model world. The FGDC Data Content Standards are well along. So, too, are many of the vendor-offered data models. Will vendor models be widely implemented in user sites? Will the FGDC Standards enhance data sharing in the near future or long term? Is there friction between those trying to best serve their customers and those who envision NSDI? These are questions to ask in the coming months and years.

» Back to our September 2003 Issue
