“It is completely unreasonable to expect that today’s existing operational environment will exist to enable access to archived data.” – Craig S. Mullins
In his book Database Administration: The Complete Guide to DBA Practices and Procedures, Craig Mullins discusses database archiving for long-term data retention. The author points out what’s probably the most important consideration: the archived data must be hardware and software independent. He says, “Independence is crucial because of the duration over which the archived data must exist. With a lifespan of decades (or longer) it is likely that the production system from which the data was archived will no longer exist – at least not in the same form, and perhaps not at all.”
Additionally, the author identifies the key capabilities required of a database archiving solution:
- Data must be archived at the business object level (records).
- The archive solution must be able to store a large amount of data.
- The archive must be able to manage data for very long time periods. The data will outlive the systems that generated it, and it will also outlive the media it is stored on.
- To support regulatory compliance, data must remain unchanged once it is archived.
- The archive requires metadata to be useful.
- The database archiving solution must enable query retrieval of the archived data in a meaningful format until it is discarded.
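The "must remain unchanged" requirement can be checked mechanically. The following is a minimal sketch, not something taken from the book: it assumes each record can be represented as a plain dictionary, and it stores a SHA-256 digest at archive time so that later changes can be detected. The function names and the sample record are hypothetical.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Return a SHA-256 digest of a record's canonical JSON form."""
    # sort_keys and fixed separators make the serialization deterministic,
    # so the same record always yields the same digest.
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_unchanged(record: dict, stored_digest: str) -> bool:
    """Compare a retrieved record against the digest stored at archive time."""
    return fingerprint(record) == stored_digest


# Hypothetical record; the field names are illustrative only.
invoice = {"id": "INV-001", "customer": "ACME", "total": "1200.00"}
digest_at_archive_time = fingerprint(invoice)
```

A digest only detects modification; preventing it still depends on how and where the archive is stored.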
How do these recommendations stand up in the context of IBM Lotus Notes archiving?
Data and applications are bound together
The fact that a Domino NSF file contains both an application and a complete database makes the task easier at first. If you restore an NSF file, you can rest assured that the data and the application will be fully compatible. That’s one of the greatest features of Lotus Notes Domino from an archiver’s point of view.
On the other hand, whenever business users need to access the archived data, they will be fully dependent on Notes and Domino. For classic Notes applications, the only tool capable of displaying the record in a meaningful way is the Notes client. So you will probably want to keep the Notes client on users’ computers, even when Notes is not needed in day-to-day activities. Would you rather use a web browser with Domino? That too is a safe way to go, but it comes with a drawback: you will likely lose the document layout and probably much of the context of the original record.
To sum it up, if we archive Notes data as NSF files, we will also need a functional Notes and Domino system for future record retrievals. As IT professionals, we should ask ourselves whether, and to what extent, we can rely on IBM Notes/Domino and the proprietary NSF format for long-term data archiving.
Record retrieval without DB restore
What’s the alternative? Exporting content out of Notes databases to an open standard repository sounds like a no-brainer. This approach brings a number of benefits, but let’s focus on one from the business user’s point of view: easy access to archived data without the need to restore an entire monolithic database to its old operational environment. That’s the real gain: with an open standard repository, historical records remain accessible without much manual intervention, which should in turn reduce the flow of retrieval requests to IT.
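To illustrate the idea of retrieval without a database restore, here is a minimal sketch that assumes the records have already been exported from Notes by some other tool. It writes each record as its own JSON file and reads a single record back by ID. The directory layout, the function names and the sample fields are all assumptions made for this example, and the record ID is assumed to be safe to use as a file name.

```python
import json
import tempfile
from pathlib import Path


def archive_record(archive_dir: Path, record_id: str, fields: dict, metadata: dict) -> Path:
    """Write one record, with its metadata, as a self-describing JSON file."""
    payload = {"id": record_id, "metadata": metadata, "fields": fields}
    path = archive_dir / f"{record_id}.json"
    path.write_text(json.dumps(payload, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


def retrieve_record(archive_dir: Path, record_id: str) -> dict:
    """Read a single record back; no other record has to be loaded or restored."""
    path = archive_dir / f"{record_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))


# Hypothetical usage with a temporary directory standing in for the repository.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    archive_record(
        repo,
        "REQ-2009-0042",
        fields={"subject": "Travel request", "status": "Approved"},
        metadata={"source_db": "travel.nsf", "form": "Request"},
    )
    record = retrieve_record(repo, "REQ-2009-0042")
    print(record["fields"]["subject"])
```

Plain JSON files are just one possible open format; the point is that any tool able to read text can open the record decades later.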
Importance of data in context
As a document-based system, Notes handles data as whole business objects, or records, which can be readily transformed into a format suitable for long-term archiving. However, a complete archive must also include the context of the records and the relationships between them. Without that context, important information is missing, and users may be unable to trace how one record relates to another.
In Notes, this data and information context includes document hierarchies and links, as well as metadata. On top of that, the original document layout is also an important and integral part of the record, and should therefore be preserved as well. An ideal archiving solution is thus one which ensures that both the data and the data context are preserved and remain unchanged once saved in a new archiving environment.
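One way to picture preserved context is to keep the hierarchy and the links as explicit references between archived records. The sketch below is only an illustration under that assumption: the ArchivedRecord type and its fields are hypothetical, not a Notes or Domino structure. It shows how child documents can be found and how references whose target did not make it into the archive can be detected.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a record cannot be modified after creation
class ArchivedRecord:
    record_id: str
    title: str
    parent_id: Optional[str] = None   # document hierarchy, e.g. a response to a parent
    links: tuple[str, ...] = ()       # IDs of other records this one links to


def children_of(archive: dict[str, ArchivedRecord], record_id: str) -> list[ArchivedRecord]:
    """Return the records whose parent is the given record."""
    return [r for r in archive.values() if r.parent_id == record_id]


def dangling_references(archive: dict[str, ArchivedRecord]) -> list[tuple[str, str]]:
    """Return (record_id, missing_target_id) pairs for references with no target in the archive."""
    missing = []
    for r in archive.values():
        targets = list(r.links)
        if r.parent_id is not None:
            targets.append(r.parent_id)
        missing.extend((r.record_id, t) for t in targets if t not in archive)
    return missing


# Hypothetical sample: a request, a response to it, and a memo with one broken link.
records = [
    ArchivedRecord("R1", "Purchase request"),
    ArchivedRecord("R2", "Approval", parent_id="R1"),
    ArchivedRecord("R3", "Follow-up memo", links=("R1", "R9")),
]
archive = {r.record_id: r for r in records}
```

Running a check like dangling_references at export time is a cheap way to confirm that the context was archived along with the data.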
Rising importance of database archiving
Finally, it is definitely worth repeating that database archiving will become more prevalent over time. Archived data will probably outlive our existing applications, platforms and storage media – perhaps even ourselves. We should make our plans accordingly.
To sum up, here is what to consider if you plan to preserve historical data from Notes and Domino applications:
- INDEPENDENCE: Data and content need to be detached from the original applications and platform.
- DATA IN CONTEXT: Retained documents need to be accessible in a meaningful format. Views, categories, document hierarchies, and links should be preserved, as should document layouts.
- LONG-TERM ARCHIVAL: Do not expect that today’s operational environment will exist tomorrow to enable access to archived data.