A computer system, like any other mechanical or electrical system, is subject to failure. There are a variety of causes, including disk crashes, power failures, software errors, a fire in the machine room, or even sabotage. Whatever the cause, information may be lost. The database system must therefore take actions in advance to ensure that the atomicity and durability properties of transactions are preserved. An integral part of a database system is a recovery scheme that is responsible for restoring the database to the consistent state that existed before the failure occurred.
The major types of failures involving data integrity (as opposed to data security) are: transaction failure (a logical error or a system error causes an individual transaction to fail), system crash (a hardware fault or a software bug causes the loss of the contents of volatile storage), and disk failure (a head crash or similar fault destroys part of nonvolatile storage).
To implement stable storage, we need to replicate the needed information on several nonvolatile media with independent failure modes, and to update the information in a controlled manner so that a failure during data transfer does not damage it.
RAID systems guarantee that the failure of a single disk will not result in the loss of data. The simplest and fastest form of RAID is the mirrored disk, which keeps two copies of each block on separate disks. RAID systems cannot, however, guard against the failure of an entire site (due to fire or flooding, for example). The most secure systems keep a copy of each block of stable storage at a remote site, writing it out over a computer network in addition to storing it on a local disk system.
Transfer of data between memory and disk storage can result in: successful completion (the transferred information arrives safely at its destination), partial failure (a failure occurs in the midst of the transfer and the destination block has incorrect information), or total failure (the failure occurs early enough that the destination block remains intact).
If a data transfer failure occurs, the system must detect it and invoke a recovery procedure to restore the block to a consistent state. To do so, the system must maintain two physical blocks for each logical database block.
A transfer of a block of data would be: first, write the information onto the first physical block; then, when the first write completes successfully, write the same information onto the second physical block; the output is complete only after the second write completes successfully.
During recovery, each pair of physical blocks is examined. If both of them are the same and no detectable error exists, then no further actions are necessary.
If one block contains a detectable error, then we replace its content with the contents of the other block.
When the blocks contain different data without a detectable error in either, then the contents of the second block are written to the first block.
This recovery procedure ensures that a write to stable storage either succeeds completely or results in no change.
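A minimal sketch of this discipline in Python may help. The file-per-copy layout, the SHA-256 checksum standing in for the disk's error-detecting code, and all function names here are illustrative assumptions, not part of any particular system:

    import hashlib
    import os

    def checksum(data: bytes) -> bytes:
        # Stand-in for a disk's error-detection code: a stored hash
        # that must match the payload for the copy to be readable.
        return hashlib.sha256(data).digest()

    def write_copy(path: str, data: bytes) -> None:
        with open(path, "wb") as f:
            f.write(checksum(data) + data)
            f.flush()
            os.fsync(f.fileno())

    def read_copy(path: str):
        # Returns the payload if the stored checksum matches, else None.
        try:
            with open(path, "rb") as f:
                record = f.read()
        except FileNotFoundError:
            return None
        digest, data = record[:32], record[32:]
        return data if checksum(data) == digest else None

    def stable_write(path_a: str, path_b: str, data: bytes) -> None:
        # Write the first physical copy and force it to disk before
        # the second, so a crash can leave at most one copy inconsistent.
        write_copy(path_a, data)
        write_copy(path_b, data)

    def stable_recover(path_a: str, path_b: str) -> None:
        a, b = read_copy(path_a), read_copy(path_b)
        if a is not None and b is not None:
            if a != b:
                # Both readable but different: copy the second block onto
                # the first, so the interrupted write appears never to
                # have happened.
                write_copy(path_a, b)
        elif b is not None:
            write_copy(path_a, b)   # first copy damaged: repair from second
        elif a is not None:
            write_copy(path_b, a)   # second copy damaged: repair from first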
The most widely used structure for recording database modifications is the log. The log is a sequence of log records and maintains a history of all update activities in the database. There are several types of log records.
An update log record describes a single database write and has these fields: the transaction identifier, the data-item identifier, the old value (the value of the data item prior to the write), and the new value (the value the data item will have after the write).
Whenever a transaction performs a write, it is essential that the log record for that write be created before the database is modified. Once a log record exists, we can output the modification to the database if that is desirable. We also have the ability to undo a modification that has already been output to the database, by using the old-value field in the log records.
For log records to be useful for recovery from system and disk failures, the log must reside on stable storage. However, since the log contains a complete record of all database activity, the volume of data stored in the log may become unreasonably large.
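As a sketch of the update record and the log-before-data rule, assuming a toy in-memory database (a dict) and illustrative names such as write_item:

    from dataclasses import dataclass
    from typing import Any, List

    @dataclass
    class UpdateRecord:
        # One record per database write: <Ti, Xj, old value, new value>.
        txn_id: str      # transaction identifier Ti
        item: str        # data-item identifier Xj
        old_value: Any   # value before the write, used to undo
        new_value: Any   # value after the write, used to redo

    log: List[Any] = []   # append-only; in a real system this must
                          # reach stable storage, not just main memory

    def write_item(db: dict, txn_id: str, item: str, value: Any) -> None:
        # The log record is appended BEFORE the database is modified.
        log.append(UpdateRecord(txn_id, item, db.get(item), value))
        db[item] = value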
The deferred-modification technique ensures transaction atomicity by recording all database modifications in the log, but deferring all write operations of a transaction until the transaction partially commits (i.e., once the final action of the transaction has been executed). Then the information in the log is used to execute the deferred writes. If the system crashes or if the transaction aborts, the information in the log is simply ignored.
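A minimal sketch of deferred modification, again over a dict database with tuple-shaped log records (the class name DeferredTxn and the record layout are assumptions for illustration):

    class DeferredTxn:
        # Deferred modification: writes go only to the log until commit.
        def __init__(self, txn_id: str, db: dict, log: list):
            self.txn_id, self.db, self.log = txn_id, db, log
            self.log.append(("start", txn_id))

        def write(self, item, value):
            # No old value needs to be logged: the database is untouched
            # until commit, so there is never anything to undo.
            self.log.append(("update", self.txn_id, item, value))

        def commit(self):
            self.log.append(("commit", self.txn_id))
            # Only now are the deferred writes applied to the database.
            for rec in self.log:
                if rec[0] == "update" and rec[1] == self.txn_id:
                    self.db[rec[2]] = rec[3]

        def abort(self):
            # The database was never modified, so the update records
            # are simply ignored.
            self.log.append(("abort", self.txn_id))

Note the design consequence: because nothing reaches the database before commit, deferred-modification log records need only the new value.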
The immediate-update technique allows database modifications to be output to the database while the transaction is still in the active state. These modifications are called uncommitted modifications. In the event of a crash or transaction failure, the system must use the old-value field of the log records to restore the modified data items.
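Under immediate update the log record must also carry the old value, since there may be uncommitted data in the database to roll back. A hedged sketch, with illustrative names:

    def write_immediate(db: dict, log: list, txn_id: str, item, value):
        # Log first (capturing the old value), then update the database.
        log.append(("update", txn_id, item, db.get(item), value))
        db[item] = value

    def undo(db: dict, log: list, txn_id: str) -> None:
        # Restore old values in reverse order of the writes.
        for rec in reversed(log):
            if rec[0] == "update" and rec[1] == txn_id:
                _, _, item, old, _ = rec
                db[item] = old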
When a system failure occurs, we must consult the log to determine those transactions that need to be redone and those that need to be undone. Rather than reprocessing the entire log, which is time-consuming and largely unnecessary, we can use checkpoints: periodically, the system outputs all log records currently residing in main memory onto stable storage, outputs all modified buffer blocks to disk, and then writes a checkpoint record onto stable storage.
Recovery then needs to process only the log records written since the last checkpoint record, together with the records of any transaction that was still active when the checkpoint was taken.
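A sketch of such a recovery pass over the tuple-shaped records used above. For simplicity it assumes every record of a transaction active at the checkpoint appears after the checkpoint record, which a real scheme must not assume:

    def recover(db: dict, log: list) -> None:
        # Skip everything before the most recent checkpoint record.
        start = 0
        for i, rec in enumerate(log):
            if rec[0] == "checkpoint":
                start = i
        tail = log[start:]
        committed = {rec[1] for rec in tail if rec[0] == "commit"}
        # Redo pass: reapply new values of committed transactions, in order.
        for rec in tail:
            if rec[0] == "update" and rec[1] in committed:
                db[rec[2]] = rec[4]
        # Undo pass: restore old values of the rest, in reverse order.
        for rec in reversed(tail):
            if rec[0] == "update" and rec[1] not in committed:
                db[rec[2]] = rec[3]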
Shadow paging is an alternative to log-based recovery techniques, with both advantages and disadvantages: it may require fewer disk accesses, but it is hard to extend to allow multiple concurrent transactions. The scheme is very similar to the paging schemes used by operating systems for memory management.
The idea is to maintain two page tables during the life of a transaction: the current page table and the shadow page table. When the transaction starts, both tables are identical. The shadow page table is never changed during the life of the transaction, while the current page table is updated with each write operation. Each table entry points to a page on disk. When the transaction commits, the current page table becomes the new shadow page table, and the disk blocks holding the old data are released. If the shadow page table is stored in nonvolatile storage and a system crash occurs, the shadow page table is copied into the current page table. This guarantees that the shadow page table points to the database pages corresponding to the state of the database prior to any transaction that was active at the time of the crash, making aborts automatic.
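A toy copy-on-write version of this idea, with page tables as dicts mapping page numbers to disk locations (the allocator and function names are assumptions):

    def begin(shadow_table: dict) -> dict:
        # The current page table starts as a copy of the shadow table;
        # the shadow table itself is never modified.
        return dict(shadow_table)

    def write_page(current_table: dict, pages: dict, page_no: int, data):
        # Copy-on-write: the new data goes to a fresh disk page, and
        # only the current page table is redirected to point at it.
        new_loc = max(pages, default=-1) + 1   # toy page allocator
        pages[new_loc] = data
        current_table[page_no] = new_loc

    def commit(current_table: dict) -> dict:
        # Installing the current table as the new shadow table is the
        # commit point; a real system makes this atomic by writing a
        # single pointer in stable storage.
        return dict(current_table)

After a crash, recovery simply discards the current table and resumes from the shadow table, which is why aborts come for free.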
There are drawbacks to the shadow-page technique: the commit overhead is high (the actual data blocks, the current page table, and the disk address of the current page table must all be output), data become fragmented as updated pages move to new locations on disk, the old pages must eventually be garbage-collected, and the scheme is hard to extend to multiple concurrent transactions.
Regardless of the number of concurrent transactions, the system has only one disk buffer and one log, shared by all transactions. We allow immediate updates, and permit a buffer block to hold data items updated by one or more transactions.
The cost of performing the output of a block to stable storage is sufficiently high that it is desirable to output multiple log records at once, using a buffer. When the buffer is full, it is output with as few output operations as possible. However, a log record may reside only in main memory for a considerable time before it is actually written to stable storage, and such log records are lost if the system crashes. It is necessary, therefore, to force out all log records of a transaction when it commits. There is no problem in writing out the log records of other, uncommitted transactions at the same time.
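A small sketch of this log-buffering discipline (the class name LogManager and the in-memory stand-in for the on-disk log are assumptions):

    class LogManager:
        # Buffers log records in memory; forces them out on commit.
        def __init__(self):
            self.buffer = []   # records not yet on stable storage
            self.stable = []   # stands in for the on-disk log

        def append(self, record):
            self.buffer.append(record)

        def flush(self):
            # One output operation covers every buffered record,
            # including records of transactions that have not
            # committed yet; that does no harm.
            self.stable.extend(self.buffer)
            self.buffer.clear()

        def commit(self, txn_id: str):
            self.append(("commit", txn_id))
            # A transaction counts as committed only once its commit
            # record (and everything before it) is on stable storage.
            self.flush()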
Database buffering is analogous to the standard operating-system concept of virtual memory. Whenever a modified block of the database must be replaced in memory, the block is written to disk, but all log records pertaining to data in that block must be output to stable storage first.
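Reusing the LogManager sketch above, a hedged illustration of that ordering rule on block eviction (Block and evict are illustrative names):

    from dataclasses import dataclass

    @dataclass
    class Block:
        block_no: int
        data: bytes = b""
        dirty: bool = False

    def evict(block: Block, log_mgr, disk: dict) -> None:
        # Write-ahead rule: all log records pertaining to data in this
        # block must reach stable storage before the block itself does.
        if block.dirty:
            log_mgr.flush()                 # force the buffered log first
            disk[block.block_no] = block.data
            block.dirty = False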
We can manage the database buffer using one of two approaches: the database system can reserve part of main memory and manage the buffer itself, or it can implement its buffer within the virtual memory of the operating system.
The basic scheme is to dump the entire contents of the database to stable storage periodically. No transaction can be active during the dump procedure.
To recover from the loss of nonvolatile storage, we restore the database from the most recent dump and then use the log to redo all the transactions that have committed since that dump.
This is also known as an archival dump. Dumps of the database and checkpointing are very similar.
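A sketch of this redo-from-dump recovery, assuming the log passed in contains only the records written after the dump was taken:

    def recover_from_disk_loss(dump_db: dict, log_since_dump: list) -> dict:
        # Start from the most recent archival dump, then redo every
        # transaction that committed after the dump was taken.
        db = dict(dump_db)
        committed = {r[1] for r in log_since_dump if r[0] == "commit"}
        for r in log_since_dump:
            if r[0] == "update" and r[1] in committed:
                db[r[2]] = r[4]   # reapply the new value
        return db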