Reg Developer: Many of the refactorings in your book involve applying triggers to ensure data integrity. So database refactoring differs from code refactoring, in the sense that code changes are “absolute” and replace what went before, whereas DB refactorings accumulate in layers of criss-crossing triggers. Do you see this as a problem?
Scott: This is why there is a defined transition window. At the end of the window, you need to remove the old schema and any supporting triggers. This way the patches are only there for a short period of time. If you don’t remove them then you’re going to run into problems. To be fair, how complex is it to schedule the running of a script which does the appropriate cleanup? To make things easy for our readers we included examples of such scripts for each refactoring.
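To make the transition window concrete, here is a minimal sketch of one refactoring (renaming a column) and its scheduled cleanup. It is an illustration only, not one of the scripts from the book, and it assumes SQLite driven from Python. The `customer` table, the `fname` and `first_name` columns, the trigger names, and the functions `start_transition` and `finish_transition` are all hypothetical.

```python
import sqlite3


def start_transition(conn: sqlite3.Connection) -> None:
    """Open the transition window: add the new column next to the old one
    and install temporary triggers that keep the two in sync."""
    conn.executescript("""
        ALTER TABLE customer ADD COLUMN first_name TEXT;
        UPDATE customer SET first_name = fname;

        -- A row inserted through only one of the columns gets the other filled in.
        CREATE TRIGGER sync_name_insert AFTER INSERT ON customer
        WHEN NEW.first_name IS NULL OR NEW.fname IS NULL
        BEGIN
            UPDATE customer
               SET first_name = COALESCE(NEW.first_name, NEW.fname),
                   fname      = COALESCE(NEW.fname, NEW.first_name)
             WHERE id = NEW.id;
        END;

        -- Legacy applications still write fname; copy it to first_name.
        CREATE TRIGGER sync_fname_update AFTER UPDATE OF fname ON customer
        WHEN NEW.fname IS NOT NEW.first_name
        BEGIN
            UPDATE customer SET first_name = NEW.fname WHERE id = NEW.id;
        END;

        -- Updated applications write first_name; copy it back to fname.
        CREATE TRIGGER sync_first_name_update AFTER UPDATE OF first_name ON customer
        WHEN NEW.first_name IS NOT NEW.fname
        BEGIN
            UPDATE customer SET fname = NEW.first_name WHERE id = NEW.id;
        END;
    """)


def finish_transition(conn: sqlite3.Connection) -> None:
    """Close the transition window: remove the supporting triggers and the
    old column. The table is rebuilt so the sketch does not depend on
    DROP COLUMN support in the installed SQLite version."""
    conn.executescript("""
        DROP TRIGGER sync_name_insert;
        DROP TRIGGER sync_fname_update;
        DROP TRIGGER sync_first_name_update;

        CREATE TABLE customer_new (id INTEGER PRIMARY KEY, first_name TEXT);
        INSERT INTO customer_new (id, first_name)
            SELECT id, first_name FROM customer;
        DROP TABLE customer;
        ALTER TABLE customer_new RENAME TO customer;
    """)
```

In this sketch `start_transition` would run when the refactoring is deployed, and `finish_transition` is the cleanup that would be scheduled for the end of the window, once every application has moved to the new column.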
Reg Developer: Of course there's also a danger that all those triggers will end up a drag on the system (i.e. slow the database down).
Scott: Yes, you’re going to take a performance hit. But you’re already taking a data quality hit and, as TDWI argues, the cost of that adds up. So you need to make a trade-off between performance and financial cost. Costs include development team productivity loss because data professionals can’t react swiftly to changes during development, the cost of business mistakes due to poor quality data, and the additional development costs associated with additional programming code to work with less-than-ideal database schemas.
And what about the performance hit taken by the applications with all of the additional workaround code because of database problems? Why is it that this is rarely mentioned?
Reg Developer: Can EDD work with a database schema shared by many projects? It's effectively decentralizing the ER design, which sounds as if you're inviting Mr Chaos to tea. (I guess this is where an effective DBA comes in, to act as a central gateway for all changes).
Scott: The book is written under the assumption that there are hundreds of other disparate systems, all of which are outside the scope of your control, accessing your database. These systems run on different platforms and have different release cycles. This is why the transition window is critical to your success.
Reg Developer: I've been a programmer in projects where the database design is handled by a separate team. Often the schema design was out of control, changing radically from one day to the next, breaking my code. I can just imagine the frustration of a programming team having to cope with a separate DB team practising EDD, and being told to “get with the times” when they complain that the shifting database keeps breaking their code. How does EDD address this issue?
Scott: It’s incredibly inefficient to have the data team working separately from the development team, and this is something that I’ve written about extensively in Agile Database Techniques and in other books. Any organization that chooses to work this way gets what they deserve.
Reg Developer: I agree up to a point. Yes, the DBA should be helping the developers from day -1 and should be part of the team; but looking at the data independently of the programmers is a great way of finding defects (such as gaps in developer understanding) well before they get built into the code.
Another issue raised by the DBAs I spoke to is that EDD shakes up the core competencies: that is, the physical database design is suddenly being done by programmers, without the necessary step of a DBA optimizing (or correcting, some might say) the design before it goes live. You effectively lose the shield of having separate logical and physical ER designs.
Scott: It’s pretty much recognized within the agile community that it’s inefficient to build teams made up of specialists. Instead, you want people who are generalizing specialists with one or more specialities, such as database development or Java programming, and a general knowledge of software development and the domain. Generalizing specialists are far more effective than specialists, so you increase your overall productivity. It’s an assumption that you need a DBA to optimize the design; you just need someone on the team with the requisite skills to do the work, and it doesn’t have to be a DBA per se. There are several performance refactorings and they’re pretty straightforward in practice, so perhaps the real issue here is the cultural one of moving away from overspecialization towards the more skilled paradigm of generalizing specialists.
Also, is having separate logical and physical ER designs really a shield or is it simply busy work?
Reg Developer: If you want to make changes to an enterprise schema in use by multiple applications, you're likely to run into resistance in the organization. Different projects will operate according to their own testing and release schedules. Can an evolutionary design approach really work in this sort of environment?
Scott: Like I said, the cultural issues are the difficult ones. Such an organization would have to choose to succeed, but unfortunately, it’s very easy to choose to fail. Hence the $611 billion data quality problem that we currently have on our hands.
Reg Developer: To finish off, where do you see evolutionary database design heading in the near future?
Scott: Right now we’re at the beginning of the adoption curve. The techniques are in place but we need to educate and mentor people in them. This will definitely take time as we need to overcome some very serious cultural challenges within the existing data community.
Scott Ambler is a noted author and speaker on object-oriented software development, software process and the like. He currently works with IBM as Practice Leader, Agile Development, within the IBM Methods group. He is Canadian and still lives in Canada, although he spends a lot of time consulting in the United States and Europe.
Matt Stephens is a Java developer and project leader based in Central London. He’s the co-author of Extreme Programming Refactored (which objectively throws XP into a pit of rabid hamsters), Agile Development with the ICONIX Process and, most recently, Use Case Driven Object Modelling with UML: Theory and Practice.