Mapping Classes to Relational Tables
Meet the challenge of impedance mismatch with object to relational mapping. Part 1 overviews best practice approaches and looks at horizontal mapping
by Richard Jensen
Posted April 28, 2004
Editor's Note: This is the first of a two-part article on best practices for object to relational mapping. Part 1 provides an overview of the necessity to overcome the challenge of impedance mismatch between object-oriented and relational data and of using the horizontal mapping approach. Part 2 will look at the union and vertical mapping approaches and provide examples of both.
The object to relational (O/R) mapping layer is a critical component of any enterprise system. Whether you're writing new code or porting legacy applications to a new platform, the impedance mismatch between object technologies and relational databases presents a difficult challenge.
Complex systems often require a rich data model with inheritance and complex relationships to take advantage of the object-oriented language in which they are being written. Depending on the platform and development tools, the process of mapping the data model to the physical database schema may be completely manual, automated, or somewhere in between.
However, regardless of the process, simplistic and/or inefficient mapping can have a detrimental effect on applications at run time. Under load, data access can easily become a bottleneck for the whole application. While we can alleviate some of the risk by realistic testing and subsequent tweaking of the mappings, such tweaking by hand can become a maintenance nightmare.
Let's look at several best practice approaches for mapping object classes to relational tables. In particular, we'll discuss the pros and cons for various approaches to mapping object inheritance hierarchies to relational tables.
Reconciling Interests
In the ideal developer-centric world, developers would be free to define the most expressive data model and let tools create the most efficient schema mapping automatically for it. At times, this approach would require restructuring an existing database schema.
Unfortunately for developers, they rarely have complete control over the relational schema. In an enterprise environment, there are often multiple data sources accessed by many different applications. The needs of a particular application need to be weighed against the needs of the many. The flexibility usually declines when an application must access legacy data.
In the ideal database administrator (DBA)-centric world, the database would be normalized to the desired form, and the administrator would be able to enforce all the constraints necessary to ensure database integrity. Applications would be required to have explicit knowledge of the relational schema and access it in a well-behaved fashion.
Unfortunately, this approach results in a great deal of unnecessary complexity being exposed to all applications that access the database. Designing applications in a relational-centric way removes many of the benefits of object-oriented development.
The way to reconcile the interests of developers and DBAs is to architect applications with a rich O/R mapping layer that hides the complexity of underlying data structures from the business logic developers. The O/R mapping layer should support complex application data models while exploiting native relational database features under the covers. It should provide performant and transactional access to relational data, which allows the database schema to enforce as few constraints as possible on the application and vice versa.
Such an approach gives the developer options for mapping. The mapping chosen can have a great impact on application performance, particularly in the way object inheritance is handled. Let's consider what is involved in mapping an object model to the database.
Inheritance or generalization/specialization allows related entities to share common structure and behavior. The benefits of using inheritance include implementation reuse and the ability to treat parts of a hierarchy in a generic manner (for example, perform operations on an Employee object without needing to know if it is a Manager or a PartTime object). Hierarchical mapping considerations include two different aspects: whether a particular entity is concrete or abstract and how the concrete classes map to tables.
For each class in a hierarchy, data modelers must decide whether it should be possible to create (instantiate) instances for that class in memory. Entities that will not be instantiated become abstract classes and those that will be instantiated are concrete classes.
Just as a leaf on a tree terminates a branch, leaf classes are classes from which no other classes inherit. Typically, all leaf classes are concrete classes, while base classes are often abstract. The object model in Figure 1 shows a number of leaf and nonleaf classes. Class A is the common parent of all classes in the hierarchy. Any data or function members defined on that class are available to all of the other classes because of the inheritance relationships. If instances of A never need to be created—that is, if only subclasses of A appear in memory—A can be defined as an abstract class. Class C is also a base class. It is the parent class of the leaf classes D and E. The developer determines whether instances of this class could exist on their own.
Once the developer has defined a suitable hierarchy, it is time to determine how classes will map to physical data tables.
Mapping Approaches
There are several common strategies for mapping the inheritance structure of an object model to a relational database schema. Each mapping approach has capabilities that make it appropriate for solving particular problems. The basic differences are:
-
Horizontal mapping associates every concrete class to its own table (which also contains attributes for all the base classes). This is the simplest approach to object/relational mapping since a class equals a table.
-
Vertical mapping associates every class (even those that are abstract) to a separate table in the database. Vertical mapping requires accessing multiple tables to extract all of the data for an object. This is the most flexible mapping approach for dealing with complex legacy data.
-
Union mapping associates a part of an inheritance hierarchy to a single table (many classes map to one table). This can be the most efficient way to map classes where queries may be performed at a base class level.
Consider the Employee subtree of the model shown in Figure 2. For this discussion we'll assume that the classes PartTime, FullTime, Exempt, and Manager are concrete, while NonExempt and Employee are abstract.
When you map a hierarchy horizontally, each concrete class corresponds to a distinct table. The mapped table contains columns for all of the persistent data of the class. The name horizontal refers to way that each leaf class and attribute is mapped directly to its own table and column. Using our example, we might have four tables as shown in Figure 3.
Because each concrete class maps to a table that contains all the data needed for that class, this is typically the easiest way for a developer to represent object inheritance within the O/R mapping. This simplicity comes at a price, however. There are a couple of situations where vertical inheritance is either unworkable or suboptimal: Inefficient queries: these are queries at the superclass level that must be duplicated across each leaf class. Thus, a query for all exempt employees with blue eyes would have to be repeated for the FullTime, PartTime, Exempt, and Manager tables. May not match underlying table structures: for legacy tables, it may not be possible to map each table to a concrete class without creating a very unnatural object model.
Let's look at some snippets that demonstrate the SQL portion of the mapping code for creating, reading, updating, removing, and retrieving persistent data (with the example object model mapped horizontally to the database tables). To create the database row corresponding to a new PartTime employee object, we insert one row into the PartTime table:
insert into PartTime name, dob,
hiredate, ssn, payrate
values ("Elijah Banks",
"03/21/85", "07/01/2003",
"901-23-4567", 8.90)
To instantiate a FullTime employee object, we only need to read from one table:
select name, dob, hiredate, ssn,
payrate from FullTime
where name = 'Sarah Jackson'
Similarly, we can update an Exempt employee by writing to one table:
update Exempt set dob=
"06/30/1962", set hiredate=
"06/15/1997", set ssn=
"567-89-0123", set salary =
52000.00
where name = 'Nicky Colson'
Note that updating unchanged columns is not necessary, but will highlight some of the differences between horizontal/union mapping and vertical mapping. To remove an object:
delete from Manager where name =
'Susan Jekel'
In the next case, we need to perform the query on each of the horizontal tables and return the combined results to find employees hired since "01/01/1998":
select name, dob, hiredate, ssn,
payrate from PartTime
where hiredate >= "01/01/1998"
select name, dob, hiredate, ssn,
payrate from FullTime
where hiredate >= "01/01/1998"
select name, dob, hiredate, ssn,
salary from Exempt
where hiredate >= "01/01/1998"
select name, dob, hiredate, ssn,
salary from Manager
where hiredate >= "01/01/1998"
Part 2 of this article will take a look at union and vertical inheritance mapping and include SQL examples for creating, reading, updating, removing, and retrieving objects under those approaches.
About the Author
Richard Jensen is a senior architect at Persistence Software.
|