You must remember: An archive isn't a thing, it's a strategy
Think of the future, edit the past
The biggest asset your organisation has is its data. And since IT is a world of compromises and paradoxes, the thing you have to work hardest to manage is... your organisation's data.
The task of data management is a big one. First you have to purchase or rent somewhere to put it – buy storage or run up a cloud repository. Then you have the whole matter of managing access to the data: layering a directory service on top of the server and storage infrastructure, defining roles and permissions, making individuals accountable for managing those permissions and controlling changes to access privileges. The particularly fun bit is managing access to data for staff who change department within the organisation – most companies I've come across make an appalling job of this, and when someone moves to a different division they inadvertently retain privileges they had in their previous role.
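To make the department-move problem concrete, here is a minimal Python sketch of one way to avoid leftover privileges: on a transfer, replace the user's permissions with those of the new role rather than adding to them. Everything in it (the `ROLE_PERMISSIONS` table, the `User` class, the `transfer` function and the permission strings) is hypothetical and for illustration only; a real directory service will have its own model.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission mapping; names are illustrative only.
ROLE_PERMISSIONS = {
    "finance": {"ledger:read", "ledger:write"},
    "marketing": {"campaigns:read", "campaigns:write"},
}


@dataclass
class User:
    name: str
    role: str
    permissions: set = field(default_factory=set)


def transfer(user: User, new_role: str) -> set:
    """Move a user to new_role and return the permissions that were revoked."""
    granted = set(ROLE_PERMISSIONS[new_role])
    revoked = user.permissions - granted
    user.role = new_role
    # Replace rather than merge, so nothing from the old role lingers.
    user.permissions = granted
    return revoked


alice = User("alice", "finance", set(ROLE_PERMISSIONS["finance"]))
revoked = transfer(alice, "marketing")
print(sorted(revoked))
```

Returning the revoked set gives whoever is accountable for the change something to log and review, which is the accountability half of the problem.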
And that's just the data that you're using. Things get even more complicated when it's time to move the data offline.
What archiving isn't
If you think a filing cabinet or firesafe full of tapes is an archive, you're sadly mistaken. Taking your monthly backup tapes offsite to a secure location and storing them in an environment that won't rot or dissolve them is not an archiving strategy. In fact it's an invitation to a life of paying long-term maintenance on archaic tape drives and keeping a server that can run prehistoric backup software so that you stand a modest chance of getting a data item back on the once-in-a-blue-moon occasion that you need to do so.
What you need, then, is an archiving strategy that serves your requirements and is sensibly futureproof.
Many organisations I've worked with over the years think that the first step of running up a new system or service is to buy something their IT manager read about recently and then try (in vain) to make it do what they want. Well, I'm a bit old-fashioned and I start with requirements.
The basic starting point in requirements for an archiving strategy goes like this:
- How old does data have to be in order to be archived? This will probably vary from application to application, but define it properly.
- What's the shortest time I must keep data in the archive for? The legal guys and gals will enlighten you with regard to data retention required by law – a stated number of years for tax-related information, for instance.
- This is often forgotten, but it's essential that you have a policy for disposing of data from the archive after a stated time. Of course you're obliged to keep data for a minimum amount of time as I've just mentioned. But equally, in these days of Freedom of Information laws and other such legislation you most likely don't want to give people the right to dig into your affairs back to the year dot unless the law requires you to keep the information to hand. So know and implement both a minimum and maximum retention period.
- Where will I store it? I'm a huge believer that the cloud storage market is about to go absolutely ballistic, because it's reached the stage where it's easy (i.e. cloud storage APIs are widely supported by hardware and software) and we have appliances that make interfacing to cloud storage a doddle. Oh, and incidentally: because you pay by the gigabyte, the maximum retention period gives a bit of cost control instead of simply allowing indefinite capacity growth.
- What format(s) shall I store it in? You need to work out what the oldest piece of data will be, and decide what format it'll be stored in and hence what software you'll need in order to read it once it's been retrieved.
- The archiving tools you select need to be integrated into your software futures roadmap: if your successors change horses, software-wise, they need to be aware of the requirement for restoring from the archive.
- Understand how the archive will be indexed: it's one thing storing the files, but quite another finding them – particularly the correct instance of a file that exists in numerous versions – when the time comes.
- Speaking of your successors, work on the premise that it's not you that will have to retrieve files but those who come after you. Document everything to death in a safe, well-known repository that's backed up and has longevity, and ensure that you have the documents validated by someone who's not you to be sure they're comprehensible.
- Devise and action a test schedule for your archives – have a calendar for search, restore and access tests and ensure it's adhered to.
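
The first three requirements above (archive age, minimum retention, maximum retention) can be written down as a small policy object. What follows is a rough Python sketch under stated assumptions, not a definitive implementation: the `RetentionPolicy` class, the `Action` names, the `decide` function and the example periods are all invented for illustration, and the real figures must come from your legal people.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Action(Enum):
    KEEP_LIVE = "keep live"        # too young to archive
    ARCHIVE = "archive"            # old enough, not yet archived
    MUST_RETAIN = "must retain"    # inside the minimum retention period
    MAY_DISPOSE = "may dispose"    # past the minimum, not yet at the maximum
    MUST_DISPOSE = "must dispose"  # past the maximum retention period


@dataclass(frozen=True)
class RetentionPolicy:
    archive_after: timedelta  # how old data must be before it is archived
    min_retention: timedelta  # shortest time it must stay in the archive
    max_retention: timedelta  # longest time it may stay in the archive

    def __post_init__(self):
        if self.max_retention < self.min_retention:
            raise ValueError("max_retention is shorter than min_retention")


def decide(policy: RetentionPolicy, created: date,
           archived_on: Optional[date], today: date) -> Action:
    """Return what should happen to one data item under the policy."""
    if archived_on is None:
        age = today - created
        return Action.ARCHIVE if age >= policy.archive_after else Action.KEEP_LIVE
    time_in_archive = today - archived_on
    if time_in_archive < policy.min_retention:
        return Action.MUST_RETAIN
    if time_in_archive < policy.max_retention:
        return Action.MAY_DISPOSE
    return Action.MUST_DISPOSE


# Hypothetical periods: archive after one year, keep for six to seven years.
policy = RetentionPolicy(
    archive_after=timedelta(days=365),
    min_retention=timedelta(days=6 * 365),
    max_retention=timedelta(days=7 * 365),
)
today = date(2024, 1, 1)
print(decide(policy, date(2022, 1, 1), None, today))
print(decide(policy, date(2010, 1, 1), date(2016, 1, 1), today))
```

Since the periods will probably vary from application to application, one such policy per application is a reasonable starting point, and running `decide` over the archive index on a schedule sits naturally alongside the test calendar.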