
The database abstraction framework strikes back

Part 2: C++ Generic Coding

The first question is: how do we access the data model in order to generate the code? In theory, the data model is expressed in SQL and we can parse that to generate C++, but this has some issues. Firstly, how do we know which relationships exist? We could interpret the constraints on each table, but this may be impractical because the constraints may not be expressed in the same SQL file as the tables.

Furthermore, in some cases the constraints may simply not be expressed in SQL at all. Secondly, we need to map SQL types to C++ types. We could hard-code the C++ type that corresponds to each SQL type, but this is inflexible, especially when one SQL type could map to several C++ types (e.g. an SQL Number could be int, unsigned, long, float, double, etc.). We would also like to allow developers to use their own user-defined types, to maximise the flexibility of our framework.

For this reason, it makes sense to start with a model description that describes the entities and their relationships, and from which we'll generate both the SQL and the C++ object-oriented interface. In this file, we will define each of the entities, their attributes and the relationships they participate in. It will also define both the C++ and SQL types for each attribute. We want to keep the discussion focused on the core here, but there is clearly room in such a description to separate the relational and the object-oriented aspects, and to support indexes, constraints, the handling of primary keys, the namespace that code is generated in, etc.

From this file, we have enough information to generate our SQL schema and the object-oriented interface. Each entity will have a class file; each attribute will have its accessors; and each relationship will become a function that returns a list of objects. The code generator that turns all this information into SQL and C++ code is included in the package on SourceForge.net here. A link to the generated SQL schema is here, and to the generated dbi object header code here.
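To make the idea concrete, here is a minimal sketch of what such a generator might do once the model description has been read into memory. The names `Attribute`, `Entity`, `generateCreateTable` and `generateAccessors` are assumptions made for illustration; the generator in the SourceForge package may be organised quite differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory form of one attribute from the model description.
// It carries both the SQL type and the C++ type, as proposed in the text.
struct Attribute {
    std::string name;
    std::string sqlType;
    std::string cppType;
};

// Hypothetical in-memory form of one entity: a name plus its attributes.
struct Entity {
    std::string name;
    std::vector<Attribute> attributes;
};

// Emit a CREATE TABLE statement for one entity.
std::string generateCreateTable(const Entity& e) {
    std::string sql = "CREATE TABLE " + e.name + " (";
    for (std::size_t i = 0; i < e.attributes.size(); ++i) {
        if (i != 0) sql += ", ";
        sql += e.attributes[i].name + " " + e.attributes[i].sqlType;
    }
    sql += ");";
    return sql;
}

// Emit the accessor declarations for the generated C++ class.
std::string generateAccessors(const Entity& e) {
    std::string out;
    for (const Attribute& a : e.attributes) {
        out += a.cppType + " get_" + a.name + "() const;\n";
        out += "void set_" + a.name + "(const " + a.cppType + "& value);\n";
    }
    return out;
}
```

Because each attribute names its own C++ type, the same SQL type can be paired with `int` in one entity and with a user-defined type in another without touching the generator.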

OK, so now we've done the simple part: we've generated a C++ class hierarchy that will allow us to access the relational data. However, we haven't answered questions such as: how do we retrieve these objects from the database; how do we update the values of attributes in objects; how do we insert new objects and delete old ones; and how do we manage relationships between objects? The following sections describe how this could all work, and the generated code from the example is included at the end for the practical-minded among us.

So, let's start with the insert and update; the easiest approach is to have a user create a new object, set its attributes, and tell us that s/he wants to store it in the database.

To simplify lifetime issues, we want these objects created on the heap, as this way the object can't go out of scope and be destroyed while references to it exist in other places in the framework. For this reason, the generated constructors are declared private and object creation is by a factory method.
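The private-constructor-plus-factory pattern described above might look roughly like this. The class name `Customer`, its `name` attribute and the use of `std::shared_ptr` are assumptions for the sake of the sketch; the generated code may well manage ownership differently.

```cpp
#include <memory>
#include <string>
#include <utility>

class Customer {
public:
    // Factory method: the only way to obtain a Customer, and always on the heap.
    static std::shared_ptr<Customer> create(std::string name) {
        // std::make_shared cannot call a private constructor, so use new here.
        return std::shared_ptr<Customer>(new Customer(std::move(name)));
    }

    const std::string& name() const { return name_; }
    void setName(std::string name) { name_ = std::move(name); }

private:
    // Private constructor: no stack or static instances can be created.
    explicit Customer(std::string name) : name_(std::move(name)) {}

    // Copying is disabled so that every handle refers to the same object.
    Customer(const Customer&) = delete;
    Customer& operator=(const Customer&) = delete;

    std::string name_;
};
```

With shared ownership, the object stays alive for as long as either the user or the framework still holds a handle to it.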

For the "store in database" behaviour, we are going to have a "store" member function that overrides a virtual function in the base class. The implementation of this store function will be the execution of an SQL query generated from the model description file, by the same class generator used above. Any resulting errors are wrapped in an exception and propagated.
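As an illustration only, a generated `store` override could take a shape like the one below. `DbObject`, `StoreError` and `Executor` are hypothetical names, and the database connection is reduced to a callable that runs one SQL statement and throws on failure. The SQL is built by string concatenation purely to keep the sketch short; real code should use bound parameters.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical exception type that wraps any database error.
class StoreError : public std::runtime_error {
public:
    explicit StoreError(const std::string& what) : std::runtime_error(what) {}
};

// Stand-in for a database connection: executes SQL, throws on failure.
using Executor = std::function<void(const std::string&)>;

// Hypothetical base class shared by all generated classes.
class DbObject {
public:
    virtual ~DbObject() = default;
    virtual void store(const Executor& exec) = 0;
};

class Customer : public DbObject {
public:
    Customer(long id, std::string name) : id_(id), name_(std::move(name)) {}

    void setName(std::string name) { name_ = std::move(name); }

    // INSERT on the first call, UPDATE afterwards. A generator would emit
    // this SQL text from the model description.
    void store(const Executor& exec) override {
        const std::string sql = stored_
            ? "UPDATE customer SET name = '" + name_ +
                  "' WHERE id = " + std::to_string(id_)
            : "INSERT INTO customer (id, name) VALUES (" +
                  std::to_string(id_) + ", '" + name_ + "')";
        try {
            exec(sql);
        } catch (const std::exception& e) {
            // Wrap the underlying error and propagate it to the caller.
            throw StoreError("store failed for customer: " + std::string(e.what()));
        }
        stored_ = true;
    }

private:
    long id_;
    std::string name_;
    bool stored_ = false;
};
```

Passing the executor in as a parameter is a choice made for this sketch, so that the example can be exercised without a live database.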

Another option would be to use templates and generic programming - more interesting and much cooler - but here we are out to prove a concept, so simple is good. In any event, the speed gains made by statically bound functions as opposed to virtual calls are going to be hidden by the cost of network round trips to the database. We get some complexity because an object that is not yet stored may have been added into relationships; but you can see from the implementations below that this is manageable.

Retrieving objects is another challenge. A simple function that gets an object from a database is one option, but we also want to be able to retrieve a set of objects. Similarly, for delete, a common operation is to delete all the objects corresponding to some criteria so we'll want to provide an algorithm that can iterate over a list of objects calling delete where appropriate. An added complication is that when we delete an object we also need to remove that object from all the relationships it participates in.
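The delete-by-criteria algorithm could be sketched along the following lines. Everything here is an assumption made for illustration: the `Customer` and `Salesperson` structs, the `deleteWhere` name, and the decision to return the DELETE statements rather than execute them.

```cpp
#include <algorithm>
#include <functional>
#include <memory>
#include <string>
#include <vector>

struct Customer {
    long id;
    std::string region;
};

using CustomerPtr = std::shared_ptr<Customer>;

struct Salesperson {
    std::vector<CustomerPtr> customers;  // the relationship to clean up
};

// Deletes every customer in `objects` that matches `criteria`:
//  1. removes it from the salesperson's relationship list,
//  2. records the DELETE statement that would be sent to the database,
//  3. removes it from `objects` itself.
// `objects` and `owner.customers` must be distinct containers.
std::vector<std::string> deleteWhere(
        std::vector<CustomerPtr>& objects,
        Salesperson& owner,
        const std::function<bool(const Customer&)>& criteria) {
    std::vector<std::string> statements;
    for (const CustomerPtr& c : objects) {
        if (!criteria(*c)) continue;
        auto& rel = owner.customers;
        rel.erase(std::remove(rel.begin(), rel.end(), c), rel.end());
        statements.push_back("DELETE FROM customer WHERE id = " +
                             std::to_string(c->id));
    }
    objects.erase(std::remove_if(objects.begin(), objects.end(),
                                 [&](const CustomerPtr& c) { return criteria(*c); }),
                  objects.end());
    return statements;
}
```

A call such as `deleteWhere(all, salesperson, [](const Customer& c) { return c.region == "north"; })` would then drop the matching customers from both the working set and the relationship.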

Managing the relations of an object is a little harder; this is a key difference between an object-oriented model and a relational model. In the relational model for a one-to-many relation, the relation is normally expressed on the many side. In an object model the many side is often the child of a relation: if a salesperson is associated with many customers then we expect the salesperson object to contain a list of his customers.

There are a number of approaches; the easiest is to express the relation in only one object, either on the salesperson object or on the customer object, but not both. This is convenient and easier to implement, but eventually we want both objects to be able to access their associates easily.
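For comparison, a two-sided version of the salesperson/customer relation might look like the sketch below. It is an assumption-laden illustration, not the framework's generated code: the class layout, the `addCustomer` and `removeCustomer` names, and the use of `std::weak_ptr` for the back-reference are all choices made here.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Salesperson;

struct Customer {
    explicit Customer(std::string n) : name(std::move(n)) {}
    std::string name;
    // Back-reference from the "many" side. It is weak so that the two
    // objects do not keep each other alive in a reference cycle.
    std::weak_ptr<Salesperson> salesperson;
};

struct Salesperson : std::enable_shared_from_this<Salesperson> {
    explicit Salesperson(std::string n) : name(std::move(n)) {}
    std::string name;

    // Keeps both sides of the relation consistent. The Salesperson must
    // already be owned by a std::shared_ptr for shared_from_this() to work.
    void addCustomer(const std::shared_ptr<Customer>& c) {
        if (auto previous = c->salesperson.lock()) previous->removeCustomer(c);
        customers_.push_back(c);
        c->salesperson = shared_from_this();
    }

    // Taken by value so the argument cannot alias an element of customers_.
    void removeCustomer(std::shared_ptr<Customer> c) {
        customers_.erase(std::remove(customers_.begin(), customers_.end(), c),
                         customers_.end());
        c->salesperson.reset();
    }

    const std::vector<std::shared_ptr<Customer>>& customers() const {
        return customers_;
    }

private:
    std::vector<std::shared_ptr<Customer>> customers_;
};
```

Routing every change through `addCustomer` and `removeCustomer` is what keeps the salesperson's list and the customer's back-reference from drifting apart.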
