"I've lost some documents!" the Head of IT gasps, bounding into Mission Control with beads of sweat dotting his puffy red brow.
"Documents?" the PFY asks.
"Yes, I scanned our licence agreements into the computer and now they're gone!"
"Gone from your computer?" the PFY sighs, firing up the backup software.
"No, no, I put them into the content management system."
"Ah," the PFY sighs. "So they're gone alright."
"But I only put them in last week!"
"It sent me an email telling me they'd been added to the system!"
"There was a link to the documents!!"
"I TESTED IT!!"
"Course you did - and it worked the first time around didn't it?"
"And maybe you checked, a day later, just to be certain?"
"I did - and they were there! But they're not now!"
"No, they wouldn't be would they?"
"WHY NOT!?" he snarls.
"Because we bought a budget content management system. The one based around a relational database that the developers designed themselves."
"The one we repeatedly told the company not to buy a couple of years back," I chip in.
"Yes, but it was..."
"The one that the developers abandoned development on six months later because we were the only UK customer to buy it."
"They weren't to know th..." the Head pleads.
"The one with the referential integrity of an Alzheimer's patient meaning a document will be there one day, gone the next and back - briefly - at some indeterminate time in the future?" the PFY says, really labouring the point now.
"Yes, yes, well it's done now, so how do we recover the data?"
"Get it back into the content management system."
"It's already back," the PFY says. "It's in there somewhere, just the database indexing is corrupt."
"So can you uncorrupt it?"
"You mean do an index rebuild?"
"Yes," the Head sighs, seeing a happy ending.
"Sure - though there's only a small chance we'll get your docs back in the index but a large chance that we'll lose other documents from the index."
"Ok," I say, going to the whiteboard in lecture mode. "The database >scribble< >squeak< will rebuild indexes which are corrupt. The indexes got corrupt >scribble< >scribble< somehow, which means there's every chance that there's duplicates >squeak< >squeak< in the database - which in turn means that when you rebuild the indexes one of the dups will disappear >scribble<. Alternatively, because the integrity's so bad, we could delete >scribble< documents that the database can index in the hopes that when you rebuild the indexes >squeak< >squeak< the missing licence documents will reappear."
"We need those licenses back, so do what you have to!" the Head snaps.
"Which licences were they exactly?" the PFY asks.
"All of them."
"All... uh even the ones in the document safe!?" the PFY gasps.
"Yes - especially those."
"Our site licenses... for... everything?!"
"Yes, but if you get them back you ca..."
"You destroyed the originals didn't you?" I sigh.
"Of course. What's the point in scanning them if you're going to keep the documents?"
"What was the point in scanning them in the first place?"
"We needed space in the document vault for some new contracts."
"So you destroyed licence documents - some of which are proof-of-purchase, some of which are one-time licences and will not be reissued by the vendor."
"But as you say, they're still in the content management system somewhere. Can't you just do a search on the content management server and find them?"
"Don't be silly - no content management server allows that - or you'd be able to change systems to some cheaper vendor. No, a proper content management system makes it next to impossible to extract your content in any automated manner so that you're forced to use their product and pay their licence fees no matter how crap it is."
"But you said this wasn't a proper system."
"No, we said that this was a budget system - so it's worse. In their wisdom the designers adopted a file system model and split the files into 128K chunks with a pointer to the first chunk and a linked list thereafter. Once you lose the first pointer, it's gone - unless of course you rebuild the database and the right pointer wins."
"So we should delete some documents from the system?"
"In theory we should delete all the documents from the system to free up pointers then rebuild the indexes which should retrieve all the missing documents. But it'd take ages to do that..."
"Not if we work together," the Head gasps. "But - how do we get the existing documents back?"
"When we've rebuilt the index we extract all the recovered data files. Then we just recover the content management system from backups and reinsert the documents into it, safely."
"Right," the Head says, dashing off to get deleting.
"So when do we tell him that there's no index rebuild function?" the PFY asks.
"AFTER I ring that company that gives you £50 for dobbing in companies who pirate software..." I reply, picking up the Yellow Pages.
"But that would be after I discover that the backup utility on the content management system has been silently failing for months...All of which would be a week or so before we tell him that the licences were only colour photocopies of the originals in the tape safes..."
"It's a plan!" the PFY chirps.