Subject: TECH: Internals of Recovery
Type: REFERENCE
Creation Date: 13-SEP-1996

Oracle7 v7.2 Recovery Outline

Authors: Andrea Borr & Bill Bridge
Version: 1 May 3, 1995

Abstract

This document gives an overview of how database recovery works
in Oracle7 version 7.2. It is assumed that the reader is familiar
with the Database Administrator's Guide for Oracle7 version 7.2.
The intention of this document is to describe the recovery
algorithms and data structures, providing more details than the
Administrator's Guide.

Table of Contents

1 Introduction
  1.1 Instance Recovery and Media Recovery: Common Mechanisms
  1.2 Instance Failure and Recovery, Crash Failure and Recovery
  1.3 Media Failure and Recovery

2 Fundamental Data Structures
  2.1 Controlfile
    2.1.1 Database Info Record (Controlfile)
    2.1.2 Datafile Record (Controlfile)
    2.1.3 Thread Record (Controlfile)
    2.1.4 Logfile Record (Controlfile)
    2.1.5 Filename Record (Controlfile)
    2.1.6 Log-History Record (Controlfile)
  2.2 Datafile Header
  2.3 Logfile Header
  2.4 Change Vector
  2.5 Redo Record
  2.6 System Change Number (SCN)
  2.7 Redo Logs
  2.8 Thread of Redo
  2.9 Redo Byte Address (RBA)
  2.10 Checkpoint Structure
  2.11 Log History
  2.12 Thread Checkpoint Structure
  2.13 Database Checkpoint Structure
  2.14 Datafile Checkpoint Structure
  2.15 Stop SCN
  2.16 Checkpoint Counter
  2.17 Tablespace-Clean-Stop SCN
  2.18 Datafile Offline Range

3 Redo Generation
  3.1 Atomic Changes
  3.2 Write-Ahead Log
  3.3 Transaction Commit
  3.4 Thread Checkpoint
  3.5 Online-Fuzzy Bit
  3.6 Datafile Checkpoint
  3.7 Log Switch
  3.8 Archiving Log Switches
  3.9 Thread Open
  3.10 Thread Close
  3.11 Thread Enable
  3.12 Thread Disable

4 Hot Backup
  4.1 BEGIN BACKUP
  4.2 File Copy
  4.3 END BACKUP
  4.4 "Crashed" Hot Backup

5 Instance Recovery
  5.1 Detection of the Need for Instance Recovery
  5.2 Thread-at-a-Time Redo Application
  5.3 Current Online Datafiles Only
  5.4 Checkpoints
  5.5 Crash Recovery Completion

6 Media Recovery
  6.1 When to Do Media Recovery
  6.2 Thread-Merged Redo Application
  6.3 Restoring Backups
  6.4 Media Recovery Commands
    6.4.1 RECOVER DATABASE
    6.4.2 RECOVER TABLESPACE
    6.4.3 RECOVER DATAFILE
  6.5 Starting Media Recovery
  6.6 Applying Redo, Media Recovery Checkpoints
  6.7 Media Recovery and Fuzzy Bits
    6.7.1 Media-Recovery-Fuzzy
    6.7.2 Online-Fuzzy
    6.7.3 Hotbackup-Fuzzy
  6.8 Thread Enables
  6.9 Thread Disables
  6.10 Ending Media Recovery (Case of Complete Media Recovery)
  6.11 Automatic Recovery
  6.12 Incomplete Recovery
    6.12.1 Incomplete Recovery UNTIL Options
    6.12.2 Incomplete Recovery and Consistency
    6.12.3 Incomplete Recovery and Datafiles Known to the Controlfile
    6.12.4 Resetlogs Open after Incomplete Recovery
    6.12.5 Files Offline during Incomplete Recovery
  6.13 Backup Controlfile Recovery
  6.14 CREATE DATAFILE: Recover a Datafile Without a Backup
  6.15 Point-in-Time Recovery Using Export/Import

7 Block Recovery
  7.1 Block Recovery Initiation and Operation
  7.2 Buffer Header RBA Fields
  7.3 PMON vs. Foreground Invocation

8 Resetlogs
  8.1 Fuzzy Files
  8.2 Resetlogs SCN and Counter
  8.3 Effect of Resetlogs on Threads
  8.4 Effect of Resetlogs on Redo Logs
  8.5 Effect of Resetlogs on Online Datafiles
  8.6 Effect of Resetlogs on Offline Datafiles
  8.7 Checking Dictionary vs. Controlfile on Resetlogs Open

9 Recovery-Related V$ Fixed-Views
  9.1 V$LOG
  9.2 V$LOGFILE
  9.3 V$LOG_HISTORY
  9.4 V$RECOVERY_LOG
  9.5 V$RECOVER_FILE
  9.6 V$BACKUP

10 Miscellaneous Recovery Features
  10.1 Parallel Recovery (v7.1)
    10.1.1 Parallel Recovery Architecture
    10.1.2 Parallel Recovery System Initialization Parameters
    10.1.3 Media Recovery Command Syntax Changes
  10.2 Redo Log Checksums (v7.2)
  10.3 Clear Logfile (v7.2)


1 Introduction

The Oracle RDBMS provides database recovery facilities capable
of preserving database integrity in the face of two major failure
modes:

1. Instance failure: loss of the contents of a buffer cache, or data
residing in memory.

2. Media failure: loss of database file storage on disk.

Each of these two major failure modes raises its own set of
challenges for database integrity. For each, there is a set of
requirements that a recovery utility addressing that failure mode
must satisfy.

Although recovery processing for the two failure modes has much
in common, the requirements differ enough to motivate the
implementation of two different recovery facilities:

1. Instance recovery: recovers data lost from the buffer cache
due to instance failure.

2. Media recovery: recovers data lost from disk storage.

1.1 Instance Recovery and Media Recovery: Common Mechanisms

Both instance recovery and media recovery depend for their
operation on the redo log. The redo log is organized into redo
threads, referred to hereafter simply as threads. The redo log of a
single-instance (non-Parallel Server option) database consists of a
single thread. A Parallel Server redo log has a thread per instance.

A redo log thread is a set of operating system files in which an
instance records all changes it makes - committed and
uncommitted - to memory buffers containing datafile blocks.
Since this includes changes made to rollback segment blocks, it
follows that rollback data is also (indirectly) recorded in the redo
log.

The first phase of both instance and media recovery processing is
roll-forward. Roll-forward is the task of the RDBMS recovery
layer. During roll-forward, changes recorded in the redo log are
re-applied (as needed) to the datafiles. Because changes to rollback
segment blocks are recorded in the redo log, roll-forward also
regenerates the corresponding rollback data. When the recovery
layer finishes its task, all changes recorded in the redo log have
been restored by roll-forward. At this point, the datafile blocks
contain not only all committed changes, but also any uncommitted
changes recorded in the redo log.

The second phase of both instance and media recovery processing
is roll-back. Roll-back is the task of the RDBMS transaction layer.
During roll-back, undo information from rollback segments (as
well as from save-undo/deferred rollback segments, if appropriate)
is used to undo uncommitted changes that were applied during the
roll-forward phase.
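
The two phases can be illustrated with a toy model. The sketch below
is not Oracle's implementation: the record layout (op, txn, block,
old, new) and the function name recover are assumptions made purely
for illustration, and the undo list merely stands in for rollback
segments. It re-applies every logged change, committed or not, and
then undoes the changes of transactions that have no commit record.

```python
from collections import defaultdict

def recover(blocks, redo_log):
    """Toy two-phase recovery: roll forward all redo, then roll back
    transactions that never committed."""
    blocks = dict(blocks)        # work on a copy of the "datafiles"
    undo = defaultdict(list)     # stands in for rollback segments
    committed = set()

    # Phase 1: roll-forward. Every logged change is re-applied, and the
    # undo information is regenerated as a side effect.
    for rec in redo_log:
        if rec["op"] == "commit":
            committed.add(rec["txn"])
        else:
            undo[rec["txn"]].append((rec["block"], rec["old"]))
            blocks[rec["block"]] = rec["new"]

    # Phase 2: roll-back. Undo uncommitted changes, newest first.
    for txn, entries in undo.items():
        if txn not in committed:
            for block, old in reversed(entries):
                blocks[block] = old
    return blocks

log = [
    {"op": "change", "txn": "T1", "block": "A", "old": 1, "new": 10},
    {"op": "change", "txn": "T2", "block": "B", "old": 2, "new": 20},
    {"op": "commit", "txn": "T1"},
]
result = recover({"A": 1, "B": 2}, log)
```

Here T1 committed, so block A keeps the value 10. T2 did not, so
block B is rolled forward to 20 and then rolled back to 2.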

1.2 Instance Failure and Recovery, Crash Failure and Recovery

Instance failure, a failure resulting in the loss of the instance's
buffer cache, occurs when an instance is aborted, either
unexpectedly or expectedly. Examples of reasons for unexpected
instance aborts are operating system crash, power failure, or
background process failure. Examples of reasons for expected
instance aborts are use of the commands SHUTDOWN ABORT
and STARTUP FORCE.

Crash failure is the failure of all instances accessing a database. In
the case of a single-instance (non-Parallel Server option) database,
the terms crash failure and instance failure are used
interchangeably. Crash recovery (equivalent to instance recovery in
this case) is the process of recovering all online datafiles to a
consistent state following a crash. This is done automatically in
response to the ALTER DATABASE OPEN command.

In the case of the Parallel Server option, the term crash failure is
used to refer to the simultaneous failures of all open instances.
Parallel Server crash recovery is the process of recovering all
online datafiles to a consistent state after all instances accessing the
database have failed. This is done automatically in response to the
ALTER DATABASE OPEN command. Parallel Server instance
failure refers to the failure of an instance while a surviving instance
continues in operation. Parallel Server instance recovery is the
automatic recovery by a surviving instance of a failed instance.

Instance failure impairs database integrity because it results in loss
of the instance's dirty buffer cache. A "dirty" buffer is one whose
memory version differs from its disk version. An instance that
aborts has no opportunity to write out "dirty" buffers so as to
prevent database integrity breakage on disk following a crash. Loss
of the dirty buffer cache is a problem because the cache
manager uses algorithms optimized for OLTP performance rather
than for crash-tolerance. Examples of performance-optimizing
cache management algorithms that make the task of instance
recovery more difficult are as follows:

• LRU (least recently used) based buffer replacement
• no-datablock-force-at-commit (see 3.3).

As a consequence of the performance-oriented cache management
algorithms, instance failure can cause database integrity breakage
as follows:

A. At crash time, the datafiles on disk might contain some but not
all of a set of datablock changes that constitute a single atomic
change to the database with respect to structural integrity
(see 2.5).

B. At crash time, the datafiles on disk might contain some
datablocks modified by uncommitted transactions.

C. At crash time, the datafiles on disk might contain some
datablocks missing changes from committed transactions.

During instance recovery, the RDBMS recovery layer repairs
database integrity breakages A and C. It also enables subsequent
repair - by the RDBMS transaction layer - of database integrity
breakage B.

In addition to the requirement that it repair any integrity breakages
resulting from the crash, instance recovery must meet the following
requirements:

1. Instance recovery must accomplish the repair using the current
online datafiles (as left on disk after the crash).

2. Instance recovery must use only the online redo logs. It must
not require use of the archived logs. Although instance recovery
could work successfully from archived logs (except for a
database running in NOARCHIVELOG mode), it could not
work autonomously (requirement 4) if an operator were
required to restore archived logs.

3. The invocation of instance recovery must be automatic,
implicit at the next database startup.

4. Detection of the need for repair and the repair itself must
proceed autonomously, without operator intervention.

5. The duration of the roll-forward phase of instance recovery is
governed by both RDBMS internal mechanisms (checkpoint)
and user-configurable parameters (e.g. number and sizes of
logfiles, checkpoint-frequency tuning parameters, parallel
recovery parameters).

As seen above, Oracle's buffer cache component is optimized for
OLTP performance rather than for crash-tolerance. This document
describes some of the mechanisms used by the cache and recovery
components to solve the problems posed by use of
performance-optimizing cache algorithms such as LRU buffer
replacement and no-datablock-force-at-commit. These mechanisms
enable instance recovery to meet its requirements while allowing
optimal OLTP performance. These mechanisms include:

• Log-Force-at-Commit: see 3.3.
  Facilitates repair of breakage type C by guaranteeing that, at
  transaction commit time, all of the transaction's redo records,
  including its "commit record," are stored on disk in the online
  redo log.

• Checkpointing: see 3.4, 3.6.
  Bounds the amount of transaction redo that instance recovery
  must potentially apply.
  Works in conjunction with online-log switch management to
  ensure that instance recovery can be accomplished using only
  online logs and current online datafiles.

• Online-Log Switch Management: see 3.7.
  Works in conjunction with checkpointing to ensure that
  instance recovery can be accomplished using only online logs
  and current online datafiles. It guarantees that the current
  checkpoint is beyond an online logfile before that logfile is
  reused.

• Write-Ahead-Log: see 3.2.
  Facilitates repair of breakage types A and B by guaranteeing
  that: (i) at crash time there are no changes in the datafiles that
  are not in the redo log; (ii) no datablock change was written to
  disk without first writing to the log sufficient information to
  enable undo of the change should a crash intervene before
  commit.

• Atomic Redo Record Generation: see 3.1.
  Facilitates repair of breakage types A and B.

• Thread-Open Flag: see 5.1.
  Enables detection at startup time of the need for crash
  recovery.

1.3 Media Failure and Recovery

Instance failure affects logical database integrity. Because instance
failure leaves a recoverable version of the online datafiles on the
post-crash disk, instance recovery can use the online datafiles as a
starting point.

Media failure, on the other hand, affects physical storage media
integrity or accessibility. Because the original datafile copies are
damaged, media recovery uses restored backup copies of the
datafiles as a starting point. Media recovery then uses the redo log
to roll-forward these files, either to a consistent present state or to a
consistent past state. Media recovery is run by issuing one of the
following commands: RECOVER DATABASE, RECOVER
TABLESPACE, RECOVER DATAFILE.

Depending on the failure scenario, a media failure has the potential
for causing database integrity breakages similar to those caused by
an instance failure. For example, an integrity breakage of type A,
B, or C could result if I/O accessibility to a datablock were lost
between the time the block was read into the buffer cache and the
time DBWR attempted to write out an updated version of the
block. More typical, however, is the case of a media failure that
results in the permanent loss of the current version of a datafile, and
hence of all updates to that datafile that occurred since the last time
the file was backed up.

Before media recovery is invoked, backup copies of the damaged
datafiles are restored. Media recovery then applies relevant
portions of the redo log to roll-forward the datafile backups,
making them current. Current implies a pre-failure state consistent
with the rest of the database.

Media recovery and instance recovery have in common the
requirement to repair database integrity breakages A-C. However,
media recovery and instance recovery differ with respect to
requirements 1-5. The requirements for media recovery are as
follows:

1. Media recovery must accomplish the repair using restored
backups of damaged datafiles.

2. Media recovery can use archived logs as well as the online
logs.

3. Invocation of media recovery is explicit, by operator
command.

4. Detection of media failure (i.e. the need to restore a backup) is
not automatic. Once a backup has been restored, however,
detection of the need to recover it via media recovery is
automatic.

5. The duration of the roll-forward phase of media recovery is
governed solely by user policy
(e.g. frequency of backups, parallel recovery parameters)
rather than by RDBMS internal mechanisms.


2 Fundamental Data Structures

2.1 Controlfile

The controlfile contains records that describe and keep state
information about all the other files of the database.

The controlfile contains the following categories of records:

• Database Info Record (1)
• Datafile Records (1 per datafile)
• Thread Records (1 per thread)
• Logfile Records (1 per logfile)
• Filename Records (1 per datafile or logfile group member)
• Log-History Records (1 per completed logfile)

Fields of the controlfile records referenced in the remainder of this
document are listed below, together with the number(s) of the
section(s) describing their use:

2.1.1 Database Info Record (Controlfile)

• resetlogs timestamp: 8.2
• resetlogs SCN: 8.2
• enabled thread bitvec: 8.3
• force archiving SCN: 3.8
• database checkpoint thread (thread record index): 2.13, 3.10

2.1.2 Datafile Record (Controlfile)

• checkpoint SCN: 2.14, 3.4
• checkpoint counter: 2.16, 5.3, 6.2
• stop SCN: 2.15, 6.5, 6.10, 6.13
• offline range (offline-start SCN, offline-end checkpoint): 2.18
• online flag
• read-enabled, write-enabled flags (1-1: read/write, 1-0: read-only)
• filename record index

2.1.3 Thread Record (Controlfile)

• thread checkpoint structure: 2.12, 3.4, 8.3
• thread-open flag: 3.9, 3.11, 8.3
• current log (logfile record index)
• head and tail (logfile record indices) of list of logfiles in
  thread: 2.8

2.1.4 Logfile Record (Controlfile)

• log sequence number: 2.7
• thread number: 8.4
• next and previous (logfile record indices) of list of logfiles in
  thread: 2.8
• count of files in group: 2.8
• low SCN: 2.7
• next SCN: 2.7
• head and tail (filename record indices) of list of filenames in
  group: 2.8
• "being cleared" flag: 10.3
• "archiving not needed" flag: 10.3

2.1.5 Filename Record (Controlfile)

• filename
• filetype
• next and previous (filename record indices) of list of filenames
  in group: 2.8

2.1.6 Log-History Record (Controlfile)

• thread number: 2.11
• log sequence number: 2.11
• low SCN: 2.11
• low SCN timestamp: 2.11
• next SCN: 2.11

2.2 Datafile Header

Fields of the datafile header referenced in the remainder of this
document are listed below, together with the number(s) of the
section(s) describing their use:

• datafile checkpoint structure: 2.14
• backup checkpoint structure: 4.1
• checkpoint counter: 2.16, 3.4, 5.3, 6.2
• resetlogs timestamp: 8.2
• resetlogs SCN: 8.2
• creation SCN: 8.1
• online-fuzzy bit: 3.5, 6.7.1, 8.1
• hotbackup-fuzzy bit: 4.1, 4.4, 6.7.1, 8.1
• media-recovery-fuzzy bit: 6.7.1, 8.1

2.3 Logfile Header

Fields of the logfile header referenced in the remainder of this
document are listed below, together with the number(s) of the
section(s) describing their use:

• thread number: 2.7
• sequence number: 2.7
• low SCN: 2.7
• next SCN: 2.7
• end-of-thread flag: 6.10
• resetlogs timestamp: 8.2
• resetlogs SCN: 8.2

2.4 Change Vector

A change vector describes a single change to a single datablock. It
has a header that gives the Data Block Address (DBA) of the block,
the incarnation number, the sequence number, and the operation.
After the header is information that depends on the operation. The
incarnation number and sequence number are copied from the
block header when the change vector is constructed. When a block
is made "new," the incarnation number is set to a value that is
greater than its previous incarnation number and the sequence
number is set to one. The sequence number on the block is
incremented after every change is applied.
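
The bookkeeping described above can be sketched as follows. This is
an illustrative model only: the class and function names are
hypothetical, the incarnation is bumped by exactly one although the
text only requires a greater value, and the equality check in
apply_change is an assumption about how the copied numbers might be
used, not something stated in this section.

```python
from dataclasses import dataclass

@dataclass
class BlockHeader:
    dba: int            # Data Block Address
    incarnation: int
    sequence: int

@dataclass(frozen=True)
class ChangeVector:
    dba: int
    incarnation: int
    sequence: int
    operation: str
    payload: bytes = b""    # operation-dependent information

def make_change_vector(block, operation, payload=b""):
    """Copy incarnation and sequence from the block header."""
    return ChangeVector(block.dba, block.incarnation, block.sequence,
                        operation, payload)

def apply_change(block, cv):
    """Apply cv to block, then increment the block's sequence number.
    ASSUMPTION: a vector is applied only if it was built against
    exactly this version of the block."""
    if (cv.dba, cv.incarnation, cv.sequence) != (
            block.dba, block.incarnation, block.sequence):
        return False
    block.sequence += 1
    return True

def make_new(block):
    """Making a block "new": greater incarnation, sequence set to one."""
    block.incarnation += 1      # any greater value would satisfy the text
    block.sequence = 1

blk = BlockHeader(dba=0x00400010, incarnation=3, sequence=1)
cv = make_change_vector(blk, "insert row")
first = apply_change(blk, cv)     # applied; block sequence becomes 2
second = apply_change(blk, cv)    # same vector again: skipped
```

Under the stated assumption, re-presenting an already applied vector
is harmless, which is one way to read the "as needed" qualifier on
redo application in 1.1.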

2.5 Redo Record

A redo record is a group of change vectors describing a single
atomic change to the database. For example, a transaction's first
redo record might group a change vector for the transaction table
(rollback segment header), a change vector for the undo block
(rollback segment), and a change vector for the datablock. A
transaction can generate multiple redo records. The grouping of
change vectors into a redo record allows multiple database blocks
to be changed so that either all changes occur or no changes occur,
despite arbitrary intervening failures. This atomicity guarantee is
one of the fundamental jobs of the cache layer. Recovery preserves
redo record atomicity across failures.

2.6 System Change Number (SCN)

An SCN defines a committed version of the database. A query
reports the contents of the database as it looked at some specific
SCN. An SCN is allocated and saved in the header of a redo record
that commits a transaction. An SCN may also be saved in a record
when it is necessary to mark the redo as being allocated after a
specific SCN. SCNs are also allocated and stored in other data
structures such as the controlfile or datafile headers. An SCN is at
least 48 bits long. Thus SCNs can be allocated at a rate of 16,384
per second for over 534 years without running out.
We will run out of SCNs in June, 2522 AD (we use 31-day months
for time stamps).
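
The 534-year figure can be checked with a few lines of arithmetic.
The constant names are chosen here for illustration. The year length
of 12 x 31 days follows the text's remark about time stamps.

```python
SCN_BITS = 48
RATE_PER_SECOND = 16_384            # 2**14 SCNs per second
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 12 * 31             # 31-day months, as the text notes

total_scns = 2 ** SCN_BITS
seconds = total_scns // RATE_PER_SECOND      # 2**34 seconds
years = seconds / (SECONDS_PER_DAY * DAYS_PER_YEAR)

print(f"{seconds} seconds, about {years:.1f} years")
# prints "17179869184 seconds, about 534.5 years"
```

Counting back 534 years from 2522 gives 1988, which suggests a
time-stamp origin in that year. That origin is an inference from the
figures above, not something the text states.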

2.7 Redo Logs

All changes to database blocks are made by constructing a redo
record for the change, saving this record in a redo log, then
applying the change vectors to the datablocks. Recovery is the
process of applying redo to old versions of datablocks to make
them current. This is necessary when the current version has been
lost.

When a redo log becomes full it is closed and a log switch occurs.
Each log is identified by its thread number (see below), sequence
number (within thread), and the range of SCNs spanned by its redo
records. This information is stored in the thread number, sequence
number, low SCN, and next SCN fields of the logfile header.

The redo records in a log are ordered by SCN. Moreover, redo
records containing change vectors for a given block occur in
increasing SCN order across threads (case of Parallel Server). Only
some records have SCNs in their header, but every record is
applied after the allocation of the SCN appearing with or before it
in the log. The header of the log contains the low SCN and the next
SCN. The low SCN is the SCN associated with the first redo record
(unless there is an SCN in its header). The next SCN is the low
SCN of the log with the next higher sequence number for the same
thread. The current log of an enabled thread has an infinite next
SCN, since there is no log with a higher sequence number.
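
A minimal sketch of how the low SCN and next SCN fields identify the
log that spans a given SCN. The LogHeader class and the
log_containing function are hypothetical names. Treating the range as
half-open (low SCN inclusive, next SCN exclusive) is an assumption
that follows from the next SCN being the low SCN of the following
log.

```python
import math
from dataclasses import dataclass

@dataclass
class LogHeader:
    thread: int
    sequence: int
    low_scn: int
    next_scn: float     # math.inf for the current log of an enabled thread

def log_containing(logs, thread, scn):
    """Return the log of `thread` whose range [low_scn, next_scn)
    holds `scn`, or None if no such log is known."""
    for log in logs:
        if log.thread == thread and log.low_scn <= scn < log.next_scn:
            return log
    return None

logs = [
    LogHeader(thread=1, sequence=41, low_scn=1000, next_scn=1500),
    LogHeader(thread=1, sequence=42, low_scn=1500, next_scn=math.inf),
]
```

With these two headers, SCN 1499 falls in sequence 41, while SCN 1500
and every later SCN fall in the current log, sequence 42.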

2.8 Thread of Redo

The redo generated by an instance - by each instance in the
Parallel Server case - is called a thread of redo. A thread
consists of an online portion and (in ARCHIVELOG mode) an
archived portion. The online portion of a thread consists of
two or more online logfile groups. Each group consists of one
or more replicated members. The set of members in a group is
referred to variously as a logfile group, group, redo log, online log,
or simply log. A redo log contains only redo generated by one
thread. Log sequence numbers are independently allocated for each
thread. Each thread switches logs independently.

For each logfile, there is a controlfile record that describes it. The
index of a log's controlfile record is referred to as its log number.
Note that log numbers are equivalent to log group numbers, and are
globally unique (across all threads). The list of a thread's logfile
records is anchored in the thread record (i.e. via head and tail
logfile record indices), and linked through the logfile records, each
of which stores the thread number. The logfile record also has fields
identifying the number of group members, as well as the head and
tail (i.e. filename record indices) of the list (linked through
filename records) of filenames in the group.
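
The index-linked lists described above can be pictured with plain
dictionaries. Everything concrete in this sketch is hypothetical: the
field names, the filenames, and the use of record index 0 as the "no
record" sentinel. Only the shape comes from the text: a thread record
anchors a list of logfile records, and each logfile record anchors a
list of filename records.

```python
# ASSUMPTION: record index 0 means "no record" (end of list).
thread_records = {1: {"head": 1, "tail": 3}}
logfile_records = {
    1: {"thread": 1, "next": 2, "prev": 0, "members": 2,
        "fn_head": 1, "fn_tail": 2},
    2: {"thread": 1, "next": 3, "prev": 1, "members": 1,
        "fn_head": 3, "fn_tail": 3},
    3: {"thread": 1, "next": 0, "prev": 2, "members": 1,
        "fn_head": 4, "fn_tail": 4},
}
filename_records = {
    1: {"name": "log1a.rdo", "next": 2},
    2: {"name": "log1b.rdo", "next": 0},
    3: {"name": "log2a.rdo", "next": 0},
    4: {"name": "log3a.rdo", "next": 0},
}

def logs_of_thread(thread_no):
    """Yield (log number, member filenames) for each group in a thread,
    following the head index and then the next links."""
    log_no = thread_records[thread_no]["head"]
    while log_no != 0:
        rec = logfile_records[log_no]
        names = []
        fn = rec["fn_head"]
        while fn != 0:
            names.append(filename_records[fn]["name"])
            fn = filename_records[fn]["next"]
        yield log_no, names
        log_no = rec["next"]

groups = list(logs_of_thread(1))
```

The key of each logfile record plays the role of the log number, so
the walk yields log numbers 1, 2 and 3 with their member filenames.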

2.9 Redo Byte Address (RBA)

An RBA points to a specific location in a particular redo thread. It
is ten bytes long and has three components: log sequence number,
block number within log, and byte number within block.
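
One way to picture a ten-byte, three-component address is shown
below. The text gives only the total length and the component order.
The individual field widths (4 + 4 + 2 bytes) and the big-endian
layout are assumptions made for this sketch, as are the names RBA,
pack_rba and unpack_rba.

```python
import struct
from typing import NamedTuple

# ASSUMED layout: 4-byte sequence, 4-byte block, 2-byte offset,
# big-endian with no padding, for a total of 10 bytes.
RBA_FORMAT = ">IIH"

class RBA(NamedTuple):
    sequence: int   # log sequence number
    block: int      # block number within the log
    offset: int     # byte number within the block

def pack_rba(rba):
    return struct.pack(RBA_FORMAT, *rba)

def unpack_rba(raw):
    return RBA(*struct.unpack(RBA_FORMAT, raw))

a = RBA(sequence=42, block=7, offset=16)
b = RBA(sequence=42, block=8, offset=0)
raw = pack_rba(a)
```

Because the components are ordered from most to least significant,
ordinary tuple comparison (a < b) orders addresses by their position
within the thread.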

2.10 Checkpoint Structure

The checkpoint structure is a data structure that defines a point in
all the redo ever generated for a database. Checkpoint structures
are stored in datafile headers and in the per-thread records of the
controlfile. They are used by recovery to know where to start
reading the log thread(s) for redo application.

The key fields of the checkpoint structure are the checkpoint SCN
and the enabled thread bitvec.

The checkpoint SCN effectively demarcates a specific location in
each enabled thread (for a definition of enabled see 3.11). For each
thread, this location is where redo was being generated at some
point in time within the resolution of one commit. The redo record
headers in the log can be scanned to find the first redo record that
was allocated at the checkpoint SCN or higher.

The enabled thread bitvec is a mask defining which threads were
enabled at the time the checkpoint SCN was allocated. Note that a
bit is set for each thread that was enabled, regardless of whether it
was open or closed. Every thread that was enabled has a redo log
that contains the checkpoint SCN. A log containing this SCN is
guaranteed to exist (either online or archived).

The checkpoint structure also stores the time that the checkpoint
SCN was allocated. This timestamp is only used to print a message
to aid a person looking for a log.

In addition, the checkpoint structure stores the number of the
thread that allocated the checkpoint SCN and the current RBA in
that thread when the checkpoint SCN was allocated. Having an
explicitly-stored thread RBA (as opposed to only having the
checkpoint SCN as an implicit thread location "pointer") makes the
log sequence number (part of the RBA) and archived log name
readily available for the single-instance (i.e. single-thread,
non-Parallel Server) case.

A checkpoint structure for a port that supports up to 1023 threads
of redo is 150 bytes long. A VMS checkpoint is 30 bytes and
supports up to 63 threads of redo.
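
The fields named in this section can be collected into a small model.
The Checkpoint class, its method names and the helper bitvec_bytes
are hypothetical. The field types are chosen for convenience rather
than taken from any on-disk format, and reserving bit 0 of the bit
vector is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Checkpoint:
    scn: int                 # checkpoint SCN
    timestamp: int           # when the SCN was allocated (informational)
    thread: int              # thread that allocated the SCN
    rba: tuple               # (sequence, block, offset) in that thread
    enabled_bitvec: int = 0  # bit n set => thread n was enabled

    def enable(self, thread_no):
        self.enabled_bitvec |= 1 << thread_no

    def is_enabled(self, thread_no):
        return bool((self.enabled_bitvec >> thread_no) & 1)

    def enabled_threads(self):
        return [n for n in range(self.enabled_bitvec.bit_length())
                if self.is_enabled(n)]

def bitvec_bytes(max_threads):
    """Bytes for a bit vector with bits 0..max_threads.
    ASSUMPTION: bit 0 is reserved, so 1023 threads need 1024 bits."""
    return (max_threads + 1) // 8

cp = Checkpoint(scn=5000, timestamp=0, thread=1, rba=(42, 7, 16))
cp.enable(1)
cp.enable(3)
```

Under the bit-0 assumption, the bit vectors for 1023 and 63 threads
differ by 128 - 8 = 120 bytes, which matches the 150 - 30 = 120 byte
difference between the two checkpoint sizes quoted above. That is
consistent with the bit vector accounting for the whole size
difference, though the text does not say so.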

2.11 Log History

The controlfile can be configured (using the MAXLOGHISTORY
clause of the CREATE DATABASE or CREATE CONTROLFILE
command) to contain a history record for every logfile that is
completed. Log history records are small (24 bytes on VMS). They
are overwritten in a circular fashion so that the oldest information
is lost.

For each logfile, the log-history controlfile record contains the
thread number, log sequence number, low SCN, low SCN
timestamp, and next SCN (i.e. low SCN of the next log in
sequence). The purpose of the log history is to reconstruct archived
logfile names from an SCN and thread number. Since a log
sequence number is contained in the checkpoint structure (part of
the RBA), single thread (i.e. non-Parallel Server) databases do not
need log history to construct archived log names.
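
The circular overwrite and the SCN-to-sequence lookup can be sketched
with a bounded deque. The function names are hypothetical, the
history size is deliberately tiny, the low SCN timestamp field is
omitted for brevity, and the half-open SCN range is an assumption.

```python
from collections import deque

MAXLOGHISTORY = 3       # tiny value so the wrap-around is visible

# A deque with maxlen drops its oldest entry when a new one is appended.
history = deque(maxlen=MAXLOGHISTORY)

def record_completed_log(thread, sequence, low_scn, next_scn):
    history.append({"thread": thread, "sequence": sequence,
                    "low_scn": low_scn, "next_scn": next_scn})

def sequence_for(thread, scn):
    """Return the log sequence number of `thread` that spans `scn`,
    or None if the record was overwritten or never existed."""
    for rec in history:
        if (rec["thread"] == thread
                and rec["low_scn"] <= scn < rec["next_scn"]):
            return rec["sequence"]
    return None

record_completed_log(1, 10, 100, 200)
record_completed_log(2, 7, 150, 260)
record_completed_log(1, 11, 200, 300)
record_completed_log(2, 8, 260, 400)    # overwrites the oldest record
```

After the fourth append, the record for thread 1 sequence 10 is gone,
so a lookup of SCN 150 in thread 1 returns None. The thread number
and sequence number returned by a successful lookup are the inputs an
archived log name would be built from.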

The fields of the log history records are viewable via the
V$LOG_HISTORY "fixed-view" (see Section 9 for a description
of the recovery-related "fixed-views"). Additionally,
V$RECOVERY_LOG, which displays information about archived
logs needed to complete media recovery, is derived from
information in the log history records. Although log history is not
strictly needed for easy administration of single-instance
(non-Parallel Server) databases, enabling use of V$LOG_HISTORY and
V$RECOVERY_LOG might be a reason to configure it.

2.12 Thread Checkpoint Structure

Each enabled thread's controlfile record contains a checkpoint
structure called the thread checkpoint. The SCN field in this
structure is known as the thread checkpoint SCN. The thread
number and RBA fields in this structure refer to the associated
thread.

The thread checkpoint structure is updated each time an instance
checkpoints its thread (see 3.4). During such thread checkpoint
events, the instance associated with the thread writes to disk in the
online datafiles all dirty buffers modified by redo generated before
the thread checkpoint SCN.

A thread checkpoint event guarantees that all
pre-thread-checkpoint-SCN redo generated in that thread for all online
datafiles has been written to disk. (Note that if the thread is closed,
then there is no redo beyond the thread checkpoint SCN; i.e. the
RBA points just past the last redo record in the current log.)

It is the job of instance recovery to ensure that all of the thread's
redo for all online datafiles is applied. Because of the guarantee
that all of the thread's redo prior to the thread checkpoint SCN has
already been applied, instance recovery can make the guarantee
that, by starting redo application at the thread checkpoint SCN, and
continuing through end-of-thread, all of the thread's redo will have
been applied.

2.13 Database Checkpoint Structure

The database checkpoint structure is the thread checkpoint of the
thread that has the lowest checkpoint SCN of all the open threads.
The number of the database checkpoint thread - the number of
the thread whose thread checkpoint is the current database
checkpoint - is recorded in the database info record of the
controlfile. If there are no open threads, then the database
checkpoint is the thread checkpoint that contains the highest
checkpoint SCN of all the enabled threads.

Since each instance guarantees that all redo generated before its
own thread checkpoint SCN has been written, and since the
database checkpoint SCN is the lowest of the thread checkpoint
SCNs, it follows that all pre-database-checkpoint-SCN redo in all
instances has been written to all online datafiles.

Thus, all pre-database-checkpoint-SCN redo generated in all
threads for all online datafiles is guaranteed to be in the files on
disk already. This is described by saying that the online datafiles
are checkpointed at the database checkpoint. This is the rationale
for using the database checkpoint to update the online datafile
checkpoints (see below) when an instance checkpoints its thread
(see 3.4).
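
The selection rule from the first paragraph of this section can be
written out directly. The function name and the dictionary layout
are hypothetical. The sketch also assumes that at least one thread
is enabled, since otherwise there is nothing to select.

```python
def database_checkpoint_thread(threads):
    """Pick the database checkpoint thread.

    threads maps thread number -> {"enabled": bool, "open": bool,
    "scn": thread checkpoint SCN}.
    """
    open_threads = {n: t for n, t in threads.items() if t["open"]}
    if open_threads:
        # Lowest thread checkpoint SCN among the open threads.
        return min(open_threads, key=lambda n: open_threads[n]["scn"])
    # No open threads: highest checkpoint SCN among enabled threads.
    enabled = {n: t for n, t in threads.items() if t["enabled"]}
    return max(enabled, key=lambda n: enabled[n]["scn"])

running = {
    1: {"enabled": True, "open": True, "scn": 500},
    2: {"enabled": True, "open": True, "scn": 450},
    3: {"enabled": True, "open": False, "scn": 300},
}
all_closed = {n: dict(t, open=False) for n, t in running.items()}

while_open = database_checkpoint_thread(running)        # thread 2
after_close = database_checkpoint_thread(all_closed)    # thread 1
```

In the first case the closed thread 3 is ignored even though its SCN
is lowest, because only open threads are considered. In the second
case no thread is open, so the highest enabled SCN wins.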

2.14 Datafile Checkpoint Structure

The header of each datafile contains a checkpoint structure known
as the datafile checkpoint. The SCN field in this structure is known
as the datafile checkpoint SCN.

All pre-checkpoint-SCN redo generated in all threads for a given
datafile is guaranteed to be in the file on disk already. An online
datafile has its checkpoint SCN replicated in its controlfile record.
Note: Oracle's recovery layer code is designed to "tolerate" a
discrepancy in checkpoint SCN between the file header and the
controlfile record. These values could get out of sync should an
instance failure occur between the time the file header was updated
and the time the controlfile "transaction" committed. (Note: A
controlfile "transaction" is an RDBMS internal mechanism,
independent of the Oracle transaction layer, that allows an
arbitrarily large update to the controlfile to be "committed"
atomically.)

The execution of a datafile checkpoint (see 3.6) for a given datafile
updates the checkpoint structure in the file header, and guarantees
that all pre-checkpoint-SCN redo generated in all threads for that
datafile is on disk already.

A thread checkpoint event (see 3.4) guarantees that all
pre-database-checkpoint-SCN redo generated in all threads for all
online datafiles has been written to disk. The execution of a thread
checkpoint may advance the database checkpoint (e.g. in the
single-instance case; or if the thread having the oldest checkpoint
changed from being the current thread to another thread). If the
database checkpoint does advance, then the new database
checkpoint is used to update the datafile checkpoints of all the
online datafiles (except those in hot backup: see Section 4).

It is the job of media recovery (see Section 6) to ensure that all redo
for a recovery-datafile (i.e. a datafile being media-recovered)
generated in any thread through the recovery end-point is applied.
Because of the guarantee that all recovery-datafile-redo generated
in any enabled thread prior to that datafile's checkpoint SCN has
already been applied, media recovery can make the guarantee that,
by starting redo application in each enabled thread with the datafile
checkpoint SCN and continuing through the recovery end-point
(e.g. end-of-thread on all threads in the case of complete media
recovery), all redo for the recovery-datafile from all threads will
have been applied.

Since the datafile checkpoint is stored in the header of the datafile
itself, it is also present in backup copies of the datafile. It is the job
of hot backup (see Section 4) to ensure that - despite the
occurrence of ongoing updates to the datafile during the backup
copy operation - the version of the datafile's checkpoint captured
in the backup copy satisfies the checkpoint-SCN guarantee with
respect to the versions of the datafile's datablocks captured in the
backup copy.
823
8242.15 Stop SCN
825
826Each datafile's controlfile record has a field called the stop SCN. If
827the file is offline or read-only, the stop SCN is the SCN beyond
828which no further redo exists for that datafile. If the file is online and
829any instance has the database open, the stop SCN is set to
830"infinity." The stop SCN is used during media recovery to
831determine when redo application for a particular datafile can stop.
832This ensures that media recovery will terminate when recovering
833an offline file while the database is open.
834
The stop SCN is set whenever a datafile is taken offline or set
read-only. This is true whether the offline was "immediate" (due to
an I/O error, or due to taking the file's tablespace offline
"immediate"), "temporary" (due to taking the file's tablespace
offline "temporary"), or "normal" (due to taking the file's
tablespace offline "normal"). In the case of a datafile taken
offline "immediate," however, there is no file checkpoint (see 3.6),
and dirty buffers are discarded. Hence, media recovery may need to
apply redo from before the stop SCN in order to bring the datafile
online. It never needs to look for redo after the stop SCN, since
none exists. If the stop SCN is equal to the datafile checkpoint
SCN, then the file does not need recovery.
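
The two uses of the stop SCN described above can be sketched in
Python. This is an illustration only, not Oracle code: the function
names are hypothetical, floating-point infinity stands in for the
"infinity" stop SCN of an online file, and the exact boundary
comparison is an assumption.

```python
import math

# Stands in for the "infinity" stop SCN of an online file.
INFINITY = math.inf


def needs_recovery(checkpoint_scn, stop_scn):
    """A file whose checkpoint SCN equals its stop SCN needs no recovery."""
    return checkpoint_scn != stop_scn


def can_stop_applying(applied_through_scn, stop_scn):
    """Redo application for the file may stop once the stop SCN is reached.

    With an infinite stop SCN this never becomes true, so some other
    end-point (e.g. end-of-thread) must terminate recovery instead.
    """
    return applied_through_scn >= stop_scn


# Invented example values.
clean_offline = needs_recovery(checkpoint_scn=1000, stop_scn=1000)
offline_immediate = needs_recovery(checkpoint_scn=900, stop_scn=1000)
```

The second case models a file taken offline "immediate": its
checkpoint lags its stop SCN, so redo between the two must be applied
before the file can come online.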
847
8482.16 Checkpoint Counter
849
850There is a checkpoint counter kept in both the datafile header and
851in the datafile's controlfile record. Its purpose is to allow detection
852of the fact that a datafile or controlfile is a restored backup.
853
854The checkpoint counter is incremented every time checkpoints of
855online files are being advanced (e.g. by thread checkpoint). Thus
856the datafile's checkpoint counter is incremented even though the
857datafile's checkpoint is not being advanced because the file is in hot
858backup (see Section 4), or because its checkpoint SCN is already
859beyond that of the intended checkpoint (e.g. the file is new or has
860undergone a recent datafile checkpoint).
861
862The old value of the checkpoint counter - matching the
863checkpoint counter in the datafile's controlfile record - is also
864remembered in the file header. It is usually one less than the current
865counter in the header, but may differ from the current counter by
866more than one if the previous file header update failed after the
867header was written but before the controlfile "transaction"
868committed.
869
870A mismatch in checkpoint counters between the datafile header and
871the datafile's controlfile record is used to detect when a backup
872datafile (or a backup controlfile) has been restored.
873
8742.17 Tablespace-Clean-Stop SCN
875
876TS$, a data dictionary table that describes tablespaces, has a
877column called the tablespace-clean-stop-SCN. It identifies an SCN
878at which a tablespace was taken offline or set read-only "cleanly":
879i.e. after checkpointing its datafiles (see 3.6). The SCN at which the
880datafiles are checkpointed is recorded in TS$ as the
881tablespace-clean-stop SCN. It allows such a "clean-stopped"
882tablespace to survive (i.e. not need to be dropped after) a
883RESETLOGS open (see 8.6). During media recovery, prior to
884resetlogs, the "clean-stopped" tablespace would be set offline.
885After resetlogs, the tablespace - which needs no recovery - is
886permitted to be brought online and/or set read-write. (An
887immediate backup of the tablespace is recommended).
888
889The tablespace-clean-stop SCN is set to zero (after being set
890momentarily to "infinity" during datafile state transition) when
891bringing an offline-clean tablespace online, or setting a read-only
892tablespace read-write. The tablespace-clean-stop SCN is also
893zeroed when taking a tablespace offline "immediate" or
894"temporary."
895
896A tablespace that has a non-zero tablespace-clean-stop SCN in TS$
897is clean at that SCN: the tablespace currently contains all redo up
898through that SCN, and no redo for the tablespace beyond that SCN
899exists. If the tablespace's datafiles are still in the state they had
900when the tablespace was taken offline "normal" or set read-only -
901i.e. they are not restored backups, are not fuzzy, and are
902checkpointed at the clean-stop SCN - then the tablespace can be
903brought online without recovery. Note that the semantics of the
904tablespace-clean-stop SCN differ from those of a constituent
905datafile's stop SCN in the datafile's controlfile record. The
906controlfile stop SCN designates an SCN beyond which no redo for
907the datafile exists. This does not imply that the datafile currently
908contains all redo up through that SCN.
909
The tablespace-clean-stop SCN is stored in TS$ rather than in the
controlfile so that it is covered by redo and will finish in the correct
state - i.e. reflecting the correct online/offline state of the
tablespace - following an incomplete recovery (see 6.12). Its
value will not be lost if a backup controlfile is restored, or if a new
controlfile is created. Furthermore, the presence of the
tablespace-clean-stop SCN in TS$ allows an offline normal (or
read-only) tablespace to survive (not need to be dropped after) a
RESETLOGS open, since it is known that no redo application is
needed to bring it online (see 8.6 for more detail). Thus, for
example, an offline normal (or read-only) tablespace that was
offline during an incomplete recovery can be brought online (or set
read-write) subsequent to a RESETLOGS open. Without the
tablespace-clean-stop SCN, there would be no way of knowing that
the tablespace does not need recovery using redo that was
discarded by the resetlogs. The only alternative would have been to
force the tablespace to be dropped.
927
9282.18 Datafile Offline Range
929
The offline-start SCN and offline-end checkpoint fields of the
controlfile datafile record describe the offline range. If valid, they
delimit a log range guaranteed not to contain any redo for the
datafile. Thus, media recovery can skip this log range when
recovering the datafile, obviating the need to access old archived
log data (which may be unavailable or unusable due to resetlogs: see
Section 7). This optimization aids in recovering a datafile that is
presently online (or read-write), but that was offline-clean (or
read-only) for a long time, and whose last backup dates from that
time. For example, this would be the case if, after a RESETLOGS
open, an offline normal (or read-only) tablespace had been brought
online (or set read-write), but not yet backed up.
942
943When a datafile transitions from offline-clean to online (or from
944read-only to read-write), the offline range is set as follows: The
945offline-start SCN is set from the tablespace-clean-stop SCN saved
946when setting the file offline (or read-only). The offline-end
947checkpoint is set from the file checkpoint taken when setting the
948file online (or read-write).
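
The way media recovery can use the offline range to skip logs can be
sketched in Python. This is an illustration under stated
assumptions, not Oracle code: OfflineRange and log_can_be_skipped are
hypothetical names, each log is modeled as covering the SCN interval
from its low SCN up to (but not including) its next SCN, and the
boundary comparisons are assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OfflineRange:
    start_scn: int  # offline-start SCN
    end_scn: int    # SCN of the offline-end checkpoint


def log_can_be_skipped(low_scn: int, next_scn: int,
                       offline: Optional[OfflineRange]) -> bool:
    """Return True if the log lies wholly inside a valid offline range.

    Such a log is guaranteed to hold no redo for the datafile.
    """
    if offline is None:  # no valid offline range recorded
        return False
    return offline.start_scn <= low_scn and next_scn <= offline.end_scn


# Invented example: the file was offline-clean between SCN 1000 and 5000.
offline = OfflineRange(start_scn=1000, end_scn=5000)
logs = [(800, 1200), (1200, 3000), (3000, 5000), (5000, 5600)]
skippable = [log_can_be_skipped(lo, nxt, offline) for lo, nxt in logs]
```

Only the two logs that fall entirely within the range are skipped;
logs that straddle either end of the range must still be read.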
949
9583 Redo Generation
959
960Redo is generated to describe all changes made to database blocks.
961This section describes the various operations that occur while the
962database is open and generating redo.
963
9643.1 Atomic Changes
965
966The most fundamental operation is to atomically change a set of
967datablocks. A foreground process intending to change one or more
968datablocks first acquires exclusive access to cache buffers
969containing those blocks. It then constructs the change vectors
970describing the changes. Space is allocated in the redo log buffer to
971hold the redo record. The redo log buffer - the buffer from which
972LGWR writes the redo log - is located in the SGA (System
973Global Area). It may be necessary to ask LGWR to write the buffer
974to the redo log in order to make space. If the log is full, LGWR
975may need to do a log switch in order to make the space available.
976Note that allocating space in the redo buffer also allocates space in
977the logfile. Thus, even though the redo buffer has been written, it
978may not be possible to allocate redo log space. After the space is
979allocated, the foreground process builds the redo record in the redo
980buffer. Only after the redo record has been built in the redo buffer
981may the datablock buffers be changed. Writing the redo to disk is
982the real change to the database. Recovery ensures that all changes
983that make it into the redo log make it into the datablocks (except in
984the case of incomplete recovery).
985
9863.2 Write-Ahead Log
987
988Write-ahead log is a cache-enforced protocol governing the order
989in which dirty datablock buffers are written vs. when the redo log
990buffer is written. According to write-ahead log protocol, before
991DBWR can write out a cache buffer containing a modified
992datablock, LGWR must write out the redo log buffer containing
993redo records describing changes to that datablock.
994
995Note that write-ahead log is independent of log-force-at-commit
996(see 3.3).
997
Note also that write-ahead log protocol only applies to datafile
writes that originate from the buffer cache. In particular,
write-ahead log does not apply to so-called direct path writes (e.g.
originating from direct path load, table create via subquery, or
index create). Direct path writes (targeted above the segment
high-water mark) originate not as writes out of the buffer cache, but
as bulk-writes out of the foreground process's data space. Indeed,
correct handling of direct path writes by media recovery dictates a
write-behind-log protocol. (The basic reason is that, because the
bulk-writes do not go through the buffer cache, there is no
mechanism to guarantee their completion at checkpoint.)
1009
1010One guarantee made by write-ahead log protocol is that there are
1011no changes in the datafiles that are not in the redo log, regardless of
1012intervening failure. This is what enables recovery to preserve the
1013guarantee of redo record atomicity despite intervening failure.
1014
1015Another guarantee made by write-ahead log protocol is that no
1016datablock change can be written to disk without first writing to the
1017redo log sufficient information to enable the change to be undone
1018should the transaction fail to commit. That undo-enabling
1019information is written to the redo log in the form of "redo" for the
1020rollback segment.
1021
1022Write-ahead log protocol plays a key role in enabling the
1023transaction layer to preserve the guarantee of transaction atomicity
1024despite intervening failure.
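
The ordering rule of write-ahead log can be sketched in Python. This
is a toy model, not Oracle code: the classes RedoLog and BufferCache
and all of their methods are hypothetical, redo positions are simple
record counts rather than RBAs, and LGWR and DBWR are reduced to
plain method calls.

```python
class RedoLog:
    """Toy redo log: an in-memory buffer plus the records already on disk."""

    def __init__(self):
        self.buffer = []   # redo records not yet written by "LGWR"
        self.on_disk = []  # redo records already written to the log

    def append(self, record):
        self.buffer.append(record)
        # Position just past this record, counted in records.
        return len(self.on_disk) + len(self.buffer)

    def flush(self):
        self.on_disk.extend(self.buffer)
        self.buffer.clear()

    @property
    def flushed_upto(self):
        return len(self.on_disk)


class BufferCache:
    """Toy cache that refuses to write a block ahead of its redo."""

    def __init__(self, log):
        self.log = log
        self.dirty = {}     # block id -> (contents, redo position of last change)
        self.datafile = {}  # block id -> contents on disk

    def change(self, block, value):
        pos = self.log.append((block, value))  # redo record is built first
        self.dirty[block] = (value, pos)       # only then is the buffer changed

    def write_block(self, block):
        value, pos = self.dirty.pop(block)
        if self.log.flushed_upto < pos:
            self.log.flush()  # write-ahead: redo must reach disk first
        self.datafile[block] = value


log = RedoLog()
cache = BufferCache(log)
cache.change("block-7", "new row")
redo_on_disk_before = list(log.on_disk)  # nothing has been forced yet
cache.write_block("block-7")
```

In this model the datafile can never contain a change whose redo is
missing from the on-disk log, which is the first guarantee described
above.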
1025
10263.3 Transaction Commit
1027
Transaction commit allocates an SCN and builds a commit redo
record containing that SCN. The commit is complete when all of
the transaction's redo (including the commit redo record) is on disk
in the log. Thus, commit forces the redo log to disk - at least up to
and including the transaction's commit record. This is termed
log-force-at-commit.
1034
1035Recovery is designed such that it is sufficient to write only the redo
1036log at commit time - rather than all datablocks changed by the
1037transaction - in order to guarantee transaction durability despite
1038intervening failure. This is termed no-datablock-force-at-commit.
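
Log-force-at-commit combined with no-datablock-force-at-commit can be
sketched in Python. This is a toy model, not Oracle code: the
RedoLog class, the commit function, and the counter used as an SCN
allocator are all hypothetical.

```python
import itertools


class RedoLog:
    """Toy redo log with a hypothetical SCN allocator."""

    def __init__(self):
        self.buffer = []  # redo not yet on disk
        self.disk = []    # redo on disk
        self._next_scn = itertools.count(100)  # invented starting SCN

    def add(self, record):
        self.buffer.append(record)

    def force(self):
        # Write everything buffered so far, in order, to the log on disk.
        self.disk.extend(self.buffer)
        self.buffer.clear()

    def allocate_scn(self):
        return next(self._next_scn)


def commit(log, txn_id):
    """Commit by forcing the log; no datablock is written here."""
    scn = log.allocate_scn()
    log.add(("commit", txn_id, scn))
    log.force()  # commit is complete once this returns
    return scn


log = RedoLog()
log.add(("change", "txn-1", "block 7"))
commit_scn = commit(log, "txn-1")
```

The commit returns as soon as the change record and the commit record
are in the on-disk log; the changed datablock may still be dirty in
the cache.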
1039
10403.4 Thread Checkpoint
1041
1042A thread checkpoint event, executed by the instance associated
1043with the redo thread being checkpointed, forces to disk all dirty
1044buffers in that instance that contain changes to any online datafile
1045before a designated SCN - the thread checkpoint SCN. Once all
1046redo in the thread prior to the checkpoint SCN has been written to
1047disk, the thread checkpoint structure in the thread's controlfile
1048record is updated in a controlfile transaction.
1049
1050When a thread checkpoint begins, an SCN is captured and a
1051checkpoint structure is initialized. Then all the dirty buffers in the
1052instance's cache are marked for checkpointing. DBWR proceeds to
1053write out the marked buffers in a staged manner. Once all the
1054marked buffers have been written, the SCN in the checkpoint
1055structure is set to the captured SCN, and the thread checkpoint
1056structure in the thread's controlfile record is updated in a controlfile
1057transaction.
1058
A thread checkpoint might or might not advance the database
checkpoint. If only one thread is open, the new checkpoint is the
new database checkpoint. If multiple threads are open, the database
checkpoint will advance if the local thread's old checkpoint is the
current database checkpoint. Since the new checkpoint SCN was
allocated recently, it is most likely greater than the thread
checkpoint SCN in some other open thread. If it advances, the
database checkpoint becomes the new lowest-SCN open thread
checkpoint. If the old checkpoint SCN for the local thread was
higher than the current checkpoint SCN of some other open thread,
then the database checkpoint does not change.
1070
1071If the database checkpoint is advanced, then the checkpoint counter
1072is advanced in every online datafile header. Furthermore, for each
1073online datafile that is not in hot backup (see Section 4), and not
1074already checkpointed at a higher SCN (e.g. as would be the case for
1075a recently added or recovered file), the datafile header checkpoint is
1076advanced to the new database checkpoint, and the file header is
1077written to disk. Also, the checkpoint SCN in the datafile's
1078controlfile record is advanced to the new database checkpoint SCN.
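
The per-file effect of an advancing database checkpoint can be
sketched in Python. This is an illustration, not Oracle code: the
Datafile class and advance_database_checkpoint function are
hypothetical, and the controlfile copy of the checkpoint SCN is
omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Datafile:
    name: str
    checkpoint_scn: int          # header checkpoint SCN
    checkpoint_counter: int = 0  # header checkpoint counter
    in_hot_backup: bool = False


def advance_database_checkpoint(online_files, new_db_scn):
    """Apply a newly advanced database checkpoint to the online datafiles."""
    for f in online_files:
        f.checkpoint_counter += 1  # the counter advances in every online file
        if f.in_hot_backup:
            continue               # header checkpoint stays "frozen"
        if f.checkpoint_scn > new_db_scn:
            continue               # already checkpointed at a higher SCN
        f.checkpoint_scn = new_db_scn


# Invented example values.
files = [
    Datafile("users01", checkpoint_scn=100),
    Datafile("users02", checkpoint_scn=100, in_hot_backup=True),
    Datafile("new01", checkpoint_scn=500),
]
advance_database_checkpoint(files, new_db_scn=300)
```

After the call all three counters have advanced, but only users01
has had its header checkpoint moved to the new database checkpoint.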
1079
10803.5 Online-Fuzzy Bit
1081
Note that more changes - beyond those already in the marked
buffers - may be generated after the start of checkpoint. Such
changes would be generated at SCNs higher than the SCN that will
be recorded in the file header. They could either be changes to
marked buffers that were added since checkpoint start, or else
changes to unmarked buffers. Buffers containing these changes
could be written out for a variety of reasons. Thus, the online files
are online-fuzzy; that is, they generally contain changes in the
future of (i.e. generated at higher SCNs than) their header
checkpoint SCN. A datafile is virtually always online-fuzzy while it
is online and the database is open.
1093
1094Online-fuzzy state is indicated by setting the so-called online-fuzzy
1095bit in the datafile header. The online-fuzzy bits of all online
1096datafiles are set at database open time. Also, when a datafile is
1097brought online while the database is open, its online-fuzzy bit is
1098set.
1099
1100The online-fuzzy bits are cleared after the last instance does a
1101shutdown "normal" or "immediate." Other occasions for clearing
1102the online-fuzzy bits are: (i) the finish of crash recovery; (ii) when
1103media recovery "checkpoints" (flushes its buffers) after
1104encountering an end-crash-recovery redo record (see 5.5); (iii)
1105when taking a datafile offline "temporary" or "normal" (i.e. an
1106offline operation that is preceded by a file checkpoint); (iv) when
1107BEGIN BACKUP is issued (see 4.1).
1108
1109As will be seen in 8.1, open with resetlogs will fail if any online
1110datafile has the online-fuzzy bit (or any fuzzy bit) set.
1111
11123.6 Datafile Checkpoint
1113
A datafile checkpoint event, executed by all open instances (for all
open threads), forces to disk all dirty buffers in any instance that
contain changes to a particular datafile (or set of datafiles) before a
designated SCN - the datafile checkpoint SCN. Once all
datafile-related redo from all open threads prior to the checkpoint
SCN has been written to disk, the datafile checkpoint structure in
the file header is updated and written to disk.
1121
1122Datafile checkpoints occur as part of operations such as beginning
1123hot backup (see Section 4) and offlining datafiles as part of taking a
1124tablespace offline normal (see 2.17).
1125
11263.7 Log Switch
1127
1128When an instance needs to generate more redo but cannot allocate
1129enough blocks in the current log, it does a log switch. The first step
1130in a log switch is to find an online log that is a candidate for reuse.
1131
1132The first requirement for the candidate log is that it must not be
1133active: i.e. it must not be needed for crash/instance recovery. In
1134other words, it must be overwritable without losing redo data
1135needed for instance recovery. The principle enforced is that a
1136logfile cannot be reused until the current thread checkpoint is
1137beyond that logfile. Since instance recovery starts at the current
1138thread checkpoint SCN/RBA (and expects to find that RBA in an
1139online redo log), the ability to do instance recovery using only
1140online logs translates into the requirement that the current thread
1141checkpoint SCN be beyond the highest SCN associated with redo
1142in the candidate log. If this is not the case, then the thread
1143checkpoint currently in progress - e.g. the one started when the
1144candidate log was originally switched into (see below) - is
1145hurried up to complete.
1146
1147The other requirement for the candidate log is that it does not need
1148archiving. Of course, this requirement only applies to a database
1149running in ARCHIVELOG mode. If archiving is required, the
1150archiver is posted.
1151
1152As soon as the log switch completes, a new thread checkpoint is
1153started in the new log. Hopefully, the checkpoint will complete
1154before the next log switch is needed.
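
The two reuse requirements for a candidate log can be sketched in
Python. This is an illustration, not Oracle code: OnlineLog and
reuse_blockers are hypothetical names, and the wording of the blocker
strings is invented.

```python
from dataclasses import dataclass


@dataclass
class OnlineLog:
    group: int
    high_scn: int   # highest SCN associated with redo in this log
    archived: bool


def reuse_blockers(log, thread_checkpoint_scn, archivelog_mode):
    """List the reasons, if any, why a candidate log cannot be reused yet."""
    blockers = []
    # The thread checkpoint must be beyond all redo in the log; otherwise
    # the log is still needed for crash/instance recovery.
    if thread_checkpoint_scn <= log.high_scn:
        blockers.append("thread checkpoint not yet beyond this log")
    # Archiving is only a requirement in ARCHIVELOG mode.
    if archivelog_mode and not log.archived:
        blockers.append("log needs archiving")
    return blockers


# Invented example values.
ready = reuse_blockers(OnlineLog(1, high_scn=900, archived=True), 1000, True)
blocked = reuse_blockers(OnlineLog(2, high_scn=1200, archived=False), 1000, True)
```

An empty list means the log may be switched into. Each blocker
corresponds to an action described above: hurrying the checkpoint
along, or posting the archiver.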
1155
11563.8 Archiving Log Switches
1157
1158Each thread switches logs independently. Thus, when running
1159Parallel Server, an SCN is almost never at the beginning of a log in
1160all threads. However, it is desirable to have roughly the same range
1161of SCNs in the archived logs of all enabled threads. This ensures
1162that the last log archived in each thread is reasonably current. If an
1163unarchived log for an enabled thread contained a very old SCN (as
1164would occur in the case of a relatively idle instance), it would not
1165be possible to use archived logs from a primary site to do recovery
1166to a higher SCN at a standby site. This would be true even if the log
1167with the low SCN contained no redo.
1168
1169This problem is solved by forcing log switches in other threads
1170when their current log is significantly behind the log just archived.
1171For the case of an open thread, a lock is used to "kick" the laggard
1172instance into switching logs and archiving when it can. For the case
1173of a closed thread, the archiving process in the active instance does
1174the closed thread's log switch and archiving for it. Note that this
1175can result in a thread that is enabled but never used having a bunch
1176of archived logs with only a file header. A force archiving SCN is
1177maintained in the database info controlfile record to implement this
1178feature. The system strives to archive any log that contains that
1179SCN or less. In general, the log with the lowest SCN is archived
1180first.
1181
1182The command ALTER SYSTEM ARCHIVE LOG CURRENT can
1183be used to manually archive the current logs of all enabled threads.
1184It forces all threads, open and closed, to switch to a new log. It
1185archives what is necessary to ensure all the old logs are archived. It
1186does not return until all redo generated before the command was
1187entered is archived. This command is useful for ensuring all redo
1188logs necessary for the recovery of a hot backup are archived. It is
1189also useful for ensuring the potential currency of a standby site in a
1190configuration in which archived logs from a primary site are
1191shipped to a standby site for application by recovery in case of
1192disaster (i.e. "standby database").
1193
11943.9 Thread Open
1195
1196When an instance opens the database, it needs to open a thread for
1197redo generation. The thread is chosen at mount time. A system
1198initialization parameter can be used to specify the thread to mount
1199by number. Otherwise, any available publicly-enabled thread can
1200be chosen by the instance at mount time. A thread-mounted lock is
1201used to prevent two instances from mounting the same thread.
1202When an instance opens a thread, it sets the thread-open flag in the
1203thread's controlfile record. While the instance is alive, it holds a set
1204of thread-opened locks (one held by each of LGWR, DBWR,
1205LCK0, LCK1, ...). (These are released at instance death, enabling
1206one instance to detect the death of another in the Parallel Server
1207environment: see 5.1). Also at thread open time, a new checkpoint
1208is captured and used for the thread checkpoint. If this is the first
1209database open, this becomes the new database checkpoint, ensuring
1210all online files have their header checkpoints advanced at open
1211time. Note that a log switch may be forced at thread open time.
1212
12133.10 Thread Close
1214
1215When an instance closes the database, or when a thread is
1216recovered by instance/crash recovery, the thread is closed. The first
1217step in closing a thread is to ensure that no more redo is generated
1218in it. The next step is to ensure that all changes described by
1219existing redo records are in the online datafiles on disk. In the case
1220of normal database close, this is accomplished by doing a thread
1221checkpoint. The SCN from this final thread checkpoint is said to be
1222the "SCN at which the thread was closed." Finally, the thread's
1223controlfile record is updated to clear the thread-open flag.
1224
1225In the case of thread close by instance recovery, the presence in the
1226online datafiles of all changes described by thread redo records is
1227ensured by starting redo application at the most recent thread
1228checkpoint and continuing through end-of-thread. Once all changes
1229described by thread redo records are in the online datafiles, the
1230thread checkpoint is advanced to the end-of-thread. Just as in the
1231case of a normal thread checkpoint, this checkpoint may advance
1232the database checkpoint. If this is the last thread close, the database
1233checkpoint thread field in the database info controlfile record -
1234which normally points to an open thread - will be left pointing at
1235this thread, even though it is closed.
1236
12373.11 Thread Enable
1238
1239In order for a thread to be opened, it must be enabled. This ensures
1240that its redo will be found during media recovery. A thread may be
1241enabled in either public or private mode. A private thread can only
1242be mounted by an instance that specifies it in the THREAD system
1243initialization parameter. This is analogous to rollback segments. A
1244thread must have at least two online redo log groups while it is
1245enabled. An enabled thread always has one online log that is its
1246current log. The next SCN of the current log is infinite, so that any
new SCN allocated will be within the current log. A special
thread-enable redo record is written in the thread of an instance enabling a
1249new thread (i.e. via ALTER DATABASE ENABLE THREAD).
1250The thread-enable redo record is used by media recovery to start
1251applying redo from the new thread. Note that this means it takes an
1252open thread to enable another thread. This chicken and egg
1253problem is resolved by having thread one automatically enabled
1254publicly at database creation. This also means that databases that
1255do not run in Parallel Server mode do not need to enable a thread.
1256
12573.12 Thread Disable
1258
1259If a thread is not going to be used for a long while, it is best to
1260disable it. This means that media recovery will not expect any redo
1261to be found in the thread. Once a thread is disabled, its logs may be
1262dropped. A thread must be closed before it can be disabled. This
1263ensures all its changes have been written to the datafiles. A new
1264SCN is allocated to save as the next SCN for the current log. The
1265log header is marked with this SCN and flags saying it is the end of
1266a disabled thread. It is important that a new current SCN is
1267allocated. This ensures the SCN in any checkpoint with this thread
1268enabled will appear in one of the logs from the thread. Note that
1269this means a thread must be open in order to disable another thread.
1270Thus, it is not possible to disable all threads.
1271
1272
12804 Hot Backup
1281
1282A hot backup is a copy of a datafile that is taken while the file is in
1283active use. Datafile writes (by DBWR) go on as usual during the
1284time the backup is being copied. Thus, the backup gets a "fuzzy"
1285copy of the datafile:
1286
- Some blocks may be ahead in time versus other blocks of the
  copy.

- Some blocks of the copy may be ahead of the checkpoint SCN
  in the file header of the copy.

- Some blocks may contain updates that constitute breakage of
  the redo record atomicity guarantee with respect to other
  blocks in this or other datafiles.

- Some block copies may be "fractured" (due to front and back
  halves being copied at different times, with an intervening
  update to the block on disk).
1300
The "hotbackup-fuzzy" copy is unusable without the "focusing" (via
the redo log) that occurs when the backup is restored and
undergoes media recovery. Media recovery applies redo (from all
threads) from the begin-backup checkpoint SCN (see Step 2 in
Section 4.1) through the end-point of the recovery operation (either
complete or incomplete). The result is a transaction-consistent
"focused" version of the datafile.
1308
1309There are three steps to taking a hot backup:
1310
- Execute the ALTER TABLESPACE ... BEGIN BACKUP
  command.

- Use an operating system copy utility to copy the constituent
  datafiles of the tablespace(s).

- Execute the ALTER TABLESPACE ... END BACKUP
  command.
1319
13204.1 BEGIN BACKUP
1321
1322The BEGIN BACKUP command takes the following actions (not
1323necessarily in the listed order) for each datafile of the tablespace:
1324
1. It sets a flag in the datafile header - the hotbackup-fuzzy
   bit - to indicate that the file is in hot backup. The header
   with this flag set (copied by the copy utility) enables the
   copy to be recognized as a hot backup. A further purpose of
   this flag in the online file header is to cause the checkpoint
   in the file header to be "frozen" at the begin-backup
   checkpoint value that will be set in Step 4. This is the value
   that it must have in the backup copy in order to ensure that,
   when the backup is recovered, media recovery will start redo
   application at a sufficiently early checkpoint SCN so as to
   cover all changes to the file in all threads since the
   execution of BEGIN BACKUP (see 6.5). Since we cannot guarantee
   that the file header will be the first block to be written out
   by the copy utility, it is important that the file header
   checkpoint structure remain "frozen" until END BACKUP time.
   This flag keeps the datafile checkpoint structure "frozen"
   during hot backup, preventing it (and the checkpoint SCN in the
   datafile's controlfile record) from being updated during thread
   checkpoint events that advance the database checkpoint. New in
   v7.2: While the file is in hot backup, a new "backup"
   checkpoint structure in the datafile header receives the
   updates that the "frozen" checkpoint would have received.

2. It executes a datafile checkpoint, capturing the resultant
   "begin-backup" checkpoint information, including the
   begin-backup checkpoint SCN. When the file is checkpointed, all
   instances are requested to write out all dirty buffers they
   have for the file. If the need for instance recovery is
   detected at this time, the file checkpoint operation waits
   until it is completed before proceeding. Checkpointing the file
   at begin-backup time ensures that only file blocks changed
   after begin-backup time might have been written to disk during
   the course of the file copy. This guarantee is crucial to
   enabling block before-image logging to cope with the fractured
   block problem, as described in Step 3.

3. [Platform-dependent option]: It starts block before-image
   logging for the file. During block before-image logging, all
   instances log a full block before-image to the redo log prior
   to the first change to each block of the file (since the backup
   started, or since the block was read anew into the buffer
   cache). This is to forestall a recovery problem that would
   arise if the backup were to contain a fractured block copy
   (mismatched halves). This could happen if (the database block
   size is greater than the operating system block size, and) the
   front and back halves of the block were copied to the backup at
   different times - with an intervening update to the block on
   disk. In this eventuality, recovery can reconstruct the block
   using the logged block before-image.

4. It sets the checkpoint in the file header equal to the
   begin-backup checkpoint captured in Step 2. This file header
   checkpoint will be "frozen" until END BACKUP is executed.

5. It clears the file's online-fuzzy bit. The online-fuzzy bit
   remains clear during the course of the file copy operation,
   thus ensuring a cleared online-fuzzy bit in the file copy. Note
   that the online-fuzzy bit is set again by the execution of END
   BACKUP.
1384
13854.2 File Copy
1386
1387The file copy is done by utilities that are not part of Oracle. The
1388presumption is that the platform vendor will have backup facilities
1389that are superior to any portable facility that we could develop. It is
1390the responsibility of the administrator to ensure that copies are only
1391taken between the BEGIN BACKUP and END BACKUP
1392commands, or when the file is not in use.
1393
13944.3 END BACKUP
1395
1396The END BACKUP command takes the following actions for each
1397datafile of the tablespace:
1398
13991. It restores (i.e. sets) the file's online-fuzzy bit.
1400
2. It creates an end-backup redo record (end-backup "marker")
for the datafile. This record, interpreted only by media
recovery, contains the begin-backup checkpoint SCN (i.e. the SCN
matching that in the "frozen" checkpoint in the backup's
header). This record serves to mark the end of the redo
generated during the backup. The end-backup "marker" is used by
media recovery to determine when all redo generated between
BEGIN BACKUP and END BACKUP has been applied to the
datafile. Upon encountering the end-backup "marker", media
recovery can (at the next media recovery checkpoint: see
6.7.1) clear the hotbackup-fuzzy bit. This is only important in
preventing an incomplete recovery that might erroneously
attempt to end before all redo generated between BEGIN
BACKUP and END BACKUP has been applied. Ending
incomplete recovery at such a point may result in an
inconsistent file, since the backup copy may already have
contained changes beyond this endpoint. As will be seen in 8.1,
open with resetlogs following incomplete media recovery will
fail if any online datafile has the hotbackup-fuzzy bit (or any
other fuzzy bit) set.
1421
14223. It clears the file's hotbackup-fuzzy bit.
1423
14244. It stops block before-image logging for the file.
1425
5. It advances the file checkpoint to the current database
checkpoint. This compensates for any file header update(s) missed
during thread checkpoints that may have advanced the database
checkpoint while the file was in hot backup state, with its
checkpoint "frozen".
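
The header changes made by BEGIN BACKUP and END BACKUP can be
summarized in a short Python sketch. This is a simplified model under
stated assumptions: the DatafileHeader class, its field names, and the
representation of the end-backup "marker" as a tuple are all
hypothetical, and before-image logging is modeled as a per-file flag
purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class DatafileHeader:
    checkpoint_scn: int
    online_fuzzy: bool = True            # file is online in an open database
    hotbackup_fuzzy: bool = False
    before_image_logging: bool = False   # modeled as a per-file flag


def begin_backup(hdr, begin_backup_scn):
    hdr.hotbackup_fuzzy = True
    hdr.before_image_logging = True      # platform-dependent option
    hdr.checkpoint_scn = begin_backup_scn  # "frozen" until END BACKUP
    hdr.online_fuzzy = False             # so the copy has the bit clear


def end_backup(hdr, database_checkpoint_scn, redo_log):
    hdr.online_fuzzy = True
    # The marker carries the begin-backup (still frozen) checkpoint SCN.
    redo_log.append(("end-backup", hdr.checkpoint_scn))
    hdr.hotbackup_fuzzy = False
    hdr.before_image_logging = False
    hdr.checkpoint_scn = database_checkpoint_scn


hdr = DatafileHeader(checkpoint_scn=500)
redo_log = []
begin_backup(hdr, begin_backup_scn=520)
frozen_scn = hdr.checkpoint_scn
end_backup(hdr, database_checkpoint_scn=610, redo_log=redo_log)
```

A file whose header still shows hotbackup_fuzzy set and the frozen
checkpoint after an instance failure corresponds to the "crashed" hot
backup state discussed in 4.4.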
1431
14324.4 "Crashed" Hot Backup
1433
1434A normal shutdown of the instance that started a backup, or the last
1435remaining instance, is not allowed while any files are in hot
1436backup. Nor may a file in backup be taken offline normal or
1437temporary. This is to ensure an end-backup "marker" is generated
1438whenever possible, and to make administrators aware that they
1439forgot to issue the END BACKUP command, and that the backup
1440copy is unusable.
1441
1442When an instance failure or shutdown abort leaves a hot backup
1443operation incomplete (i.e. lacking termination via END BACKUP),
1444any file that was in backup before the failure has its hotbackup-
1445fuzzy bit set and its checkpoint "frozen" at the begin-backup
1446checkpoint. Even though the online file's datablocks are actually
1447current to the database checkpoint, the file's header makes it look
1448like a restored backup that needs media recovery and is current
1449only to the begin-backup checkpoint. Crash recovery will fail -
1450claiming media recovery is required - if it encounters an online
1451file in "crashed" hot backup state. The file does not actually need
1452media recovery, however, but only an adjustment to its file header
1453to take it out of "crashed" hot backup state.
1454
1455Media recovery could be used to recover and allow normal open of
1456a database that has files left in "crashed" hot backup state. For v7.2
1457however, a preferable option - because it requires no archived
1458logs - is to use the (new in v7.2) command ALTER DATABASE
1459DATAFILE... END BACKUP on the files left in "crashed" hot
1460backup state (identifiable using the V$BACKUP fixed-view: see
14619.6). Following execution of this command, crash recovery will
1462suffice to open the database. Note that the ALTER TABLESPACE
1463... END BACKUP format of the command cannot be used when the
1464database is not open. This is because the database must be open in
1465order to translate (via the data dictionary) tablespace names into
1466their constituent datafile names.
1467
14765 Instance Recovery
1477
1478Instance recovery is used to recover from both crash failures and
1479Parallel Server instance failures. Instance recovery refers either to
1480crash recovery or to Parallel Server instance recovery (where a
1481surviving instance recovers when one or more other instances fail).
1482
1483The goal of instance recovery is to restore the datablock changes
1484that were in the cache of the dead instance and to close the thread
1485that was left open. Instance recovery uses only online redo logfiles
1486and current online datafiles (not restored backups). It recovers one
1487thread at a time, starting at the most recent thread checkpoint and
1488continuing until end-of-thread.
1489
14905.1 Detection of the Need for Instance Recovery
1491
1492The kernel performs instance recovery automatically upon
1493detecting that an instance died leaving its thread-open flag set in
1494the controlfile. Instance recovery is performed automatically on
1495two occasions:
1496
14971. at the first database open after a crash (crash recovery);
1498
14992. when some but not all instances of a Parallel Server fail.
1500
1501In the case of Parallel Server, a surviving instance detects the need
1502to perform instance recovery for one or more failed instances by
1503the following means:
1504
15051. A foreground process in a surviving instance detects an
1506"invalid block lock" condition when it attempts to bring a
1507datablock into the buffer cache. This is an indication that
1508another instance died while a block covered by that lock was
1509in a potentially "dirty" state in its buffer cache.
1510
15112. The foreground process sends a notification to its instance's
1512SMON process, which begins a search for dead instances.
1513
15143. The death of another instance is detected if the current
1515instance is able to acquire that instance's thread-opened locks
1516(see 3.9).
1517
1518SMON in the surviving instance obtains a stable list of dead
1519instances, together with a list of "invalid" block locks. Note: After
1520instance recovery is complete, locks in this list will undergo "lock
1521cleanup" (i.e. they will have their "invalid" condition cleared,
1522making the underlying blocks accessible again).
1523
15245.2 Thread-at-a-Time Redo Application
1525
1526Instance recovery operates by processing one thread at a time,
1527thereby recovering one instance at a time. It applies all redo (from
1528the thread checkpoint through the end-of-thread) from each thread
1529before starting on the next thread. This algorithm depends on the
1530fact that only one instance at a time can have a given block
1531modified in its cache. Between changes to the block by different
1532instances, the block is written to disk. Thus, a given block (as read
1533from disk during instance recovery) can need redo applied from at
1534most one thread - the thread containing the most recent
1535modification.
1536
1537Instance recovery can always be accomplished using the online
1538redo logs for the thread being recovered. Crash recovery operates
1539on the thread with the lowest checkpoint SCN first. It proceeds to
1540recover the threads in the order of increasing thread checkpoint
1541SCNs. This ensures that the database checkpoint is advanced by
1542each thread recovered.
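
The thread-at-a-time algorithm and its ordering rule can be sketched
in Python as follows. This is a minimal model, not the kernel's
implementation: redo records are assumed to be (SCN, block, value)
tuples, and the function name recover_thread_at_a_time and the
dictionary layout are hypothetical.

```python
def recover_thread_at_a_time(threads, blocks):
    """Apply redo one thread at a time, lowest thread checkpoint SCN first.

    threads: {thread_no: {"checkpoint_scn": int,
                          "redo": [(scn, block, value), ...]}}
    blocks:  {block: value} as read from the current online datafiles
    Returns the order in which the threads were recovered.
    """
    order = sorted(threads, key=lambda t: threads[t]["checkpoint_scn"])
    for thread_no in order:
        thread = threads[thread_no]
        for scn, block, value in thread["redo"]:
            if scn < thread["checkpoint_scn"]:
                continue                 # before the thread checkpoint
            blocks[block] = value        # redo through the end-of-thread
    return order


threads = {
    1: {"checkpoint_scn": 120, "redo": [(125, "A", "a2")]},
    2: {"checkpoint_scn": 100, "redo": [(90, "B", "old"), (105, "B", "b1")]},
}
blocks = {"A": "a1", "B": "b0"}
order = recover_thread_at_a_time(threads, blocks)
```

Because a given block needs redo from at most one thread, finishing
one thread completely before starting the next cannot apply changes
to a block out of order.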
1543
15445.3 Current Online Datafiles Only
1545
1546The checkpoint counters are used to ensure that the datafiles are the
1547current online files rather than restored backups. If a backup copy
1548of a datafile is restored, then media recovery is required.
1549
1550Media recovery is required for a restored backup even if recovery
1551can be accomplished using the online logs. The reason is that crash
1552recovery applies all post-thread-checkpoint redo from each thread
1553before starting on the next thread. Crash recovery can use this
1554thread-at-a-time redo application algorithm because a given
1555datablock can need redo application from at most one thread.
1556
1557However, starting recovery from a restored backup enables no such
1558assumption about the number of threads that have relevant redo.
1559Thus, the thread-at-a-time algorithm would not work. Recovering a
1560backup requires thread-merged redo application: i.e. application of
1561all post-file-checkpoint redo, simultaneously merging redo from all
1562threads in SCN order. This thread-merged redo application
1563algorithm is the one used by media recovery (see Section 6).
1564
1565Crash recovery would not suffice - even with thread-merged redo
1566application - to recover a backup datafile, even if it were
1567checkpointed at the current database checkpoint. The reason is that
1568in all but the database checkpoint thread, crash recovery would
1569miss applying redo between the database checkpoint and the
1570(higher) thread checkpoint. By contrast, media recovery would
1571start redo application at the file checkpoint in all threads.
1572Furthermore, crash recovery might fail even if it started redo
1573application at the file checkpoint in all threads. The reason is that
1574crash recovery assumes that it will need only online logfiles. All
1575but the database checkpoint thread might have already archived
1576and re-used a needed log.
1577
1578If the STARTUP RECOVER command is used (in place of simple
1579STARTUP), and crash recovery fails due to datafiles needing
1580media recovery (e.g. they are restored backups), then media
1581recovery via RECOVER DATABASE (see 6.4.1) is automatically
1582executed prior to database open.
1583
15845.4 Checkpoints
1585
1586Instance recovery does not attempt to apply redo that is before the
1587checkpoint SCN of a datafile. (The datafile header checkpoint
1588SCNs are not used to decide where to start recovery, however.)
1589
The redo from the thread checkpoint through the end-of-thread
must be read to find the end-of-thread and the highest SCN
allocated by the thread. These are then used to close the thread and
advance the thread checkpoint. The end of an instance recovery
almost always advances the datafile checkpoints, and always
advances the checkpoint counters.
1596
15975.5 Crash Recovery Completion
1598
1599At the termination of crash recovery, the "fuzzy bits" - online-
1600fuzzy, hotbackup-fuzzy, media-recovery-fuzzy - of all online
1601datafiles are cleared. A special redo record, the end-crash-recovery
1602"marker," is generated. This record is interpreted by media
1603recovery to know when it is permissible to clear the online-fuzzy
1604and hotbackup-fuzzy bits of the datafiles undergoing recovery (see
16056.6).
1606
16156 Media Recovery
1616
1617Media recovery is used to recover from a lost or damaged datafile,
1618or from a lost current controlfile. It is used to transform a restored
1619datafile backup into a "current" datafile. It is also used to restore
1620changes that were lost when a datafile went offline without a
1621checkpoint. Media recovery can apply archived logs as well as
1622online logs. Unlike instance or crash recovery, media recovery is
1623invoked only via explicit command.
1624
16256.1 When to Do Media Recovery
1626
1627As was seen in 5.3, a restored datafile backup always needs media
1628recovery, even if its recovery can be accomplished using only
1629online logs. The same is true of a datafile that went offline without
1630a checkpoint. The database cannot be opened if any of the online
1631datafiles needs media recovery. A datafile that needs media
1632recovery cannot be brought online until media recovery has been
executed. While the database is open in any instance, media
recovery can operate only on offline files. Media recovery may be
1635explicitly invoked to recover a database prior to open even when
1636crash recovery would have sufficed. If so, crash recovery - though
1637it may find nothing to do - will still be invoked automatically at
1638database open. Note that media recovery may be run - and, in
1639cases such as restored backups or datafiles that went offline
1640immediate, must be run - even if recovery can be accomplished
1641using only the online logs. Media recovery may find nothing to do
1642- and signal the "no recovery required" error - if invoked for
1643files that do not need recovery.
1644
1645If the current controlfile is lost and a backup controlfile is restored
1646in its place, media recovery must be done. This is the case even if
1647all of the datafiles are current.
1648
16496.2 Thread-Merged Redo Application
1650
1651Media recovery uses a thread-merged redo application algorithm:
1652i.e. it applies redo from all threads simultaneously, merging redo
1653records in increasing SCN order. The process of media-recovering
1654a backup datafile differs from the process of crash-recovering a
1655current online datafile in the following fundamental way: Crash
1656recovery applies redo from one thread at a time because any block
1657of a current online file can need redo from at most one thread (one
instance at a time can dirty a block in cache). With a restored
backup, however, no assumption can be made about the number of
threads that have redo relevant to a particular block. In general,
1661recovering a backup requires simultaneous application of redo
1662from all threads, with merging of redo records across threads in
1663SCN order. Note that this algorithm depends on a redo-generation-
1664time guarantee that changes for a given block occur in increasing
1665SCN order across threads (case of Parallel Server).
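
Thread-merged redo application amounts to a k-way merge of per-thread
redo streams on SCN. The Python sketch below illustrates the idea with
the standard heapq.merge function; the record layout (SCN, change) and
the function name merge_threads are assumptions made for this example.

```python
import heapq


def merge_threads(redo_by_thread):
    """Merge per-thread redo, each already in SCN order, into one stream.

    redo_by_thread: {thread_no: [(scn, change), ...]}
    Returns a list of (scn, thread_no, change) in increasing SCN order.
    """
    streams = [
        [(scn, thread_no, change) for scn, change in records]
        for thread_no, records in sorted(redo_by_thread.items())
    ]
    # key= restricts the comparison to the SCN of each record.
    return list(heapq.merge(*streams, key=lambda rec: rec[0]))


redo_by_thread = {
    1: [(10, "block A v1"), (30, "block A v3")],
    2: [(20, "block A v2"), (40, "block B v1")],
}
merged = merge_threads(redo_by_thread)
```

Here block A receives its changes in the order v1, v2, v3 even though
they come from two different threads, which is what recovery of a
restored backup requires.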
1666
16676.3 Restoring Backups
1668
1669The administrator may copy backup versions of datafiles to the
1670current datafile while the database is shut down or the file is offline.
1671There is a strong assumption that backups are never copied to files
1672that are currently accessible. Every file header read verifies that this
1673has not been done by comparing the checkpoint counter in the file
1674header with the checkpoint counter in the datafile's controlfile
1675record.
1676
16776.4 Media Recovery Commands
1678
1679 There are three media recovery commands:
1680
- RECOVER DATABASE

- RECOVER TABLESPACE

- RECOVER DATAFILE
1686
1687The only essential difference in these commands is in how the set
1688of files to recover is determined. They all use the same criteria for
1689determining if the files can be recovered. There is a lock per
1690datafile that is held exclusive by a process doing media recovery on
1691a file, and is held shared by an instance that has the database open
1692with the file online. Media recovery signals an error if it cannot get
1693the lock for a file it is asked to recover. This prevents two recovery
1694sessions from recovering the same file, and prevents media
1695recovery of a file that is in use.
1696
16976.4.1 RECOVER DATABASE
1698
This command does media recovery on all online datafiles that
need any redo applied. If all instances were shut down cleanly, and
no backups were restored, this command will signal the "no
recovery required" error. It will also fail if any instances have the
database open, since they will have the datafile locks.
1704
17056.4.2 RECOVER TABLESPACE
1706
1707This command does media recovery on all datafiles in the
1708tablespaces specified. In order to translate (i.e. via the data
1709dictionary) the tablespace names into datafile names, the database
1710must be open. This means that the tablespaces and their constituent
datafiles must be offline in order to do the recovery. An error is
signalled if none of the tablespace's constituent files needs recovery.
1713
17146.4.3 RECOVER DATAFILE
1715
1716This command specifies the datafiles to be recovered. The database
1717may be open; or it may be closed, as long as the media recovery
1718locks can be acquired. If the database is open in any instance, then
1719datafile recovery can only recover offline files.
1720
17216.5 Starting Media Recovery
1722
1723Media recovery starts by finding the media-recovery-start SCN: i.e.
1724the lowest SCN of the datafile header checkpoints of the files being
1725recovered. Note: An exception occurs if a file's checkpoint is in its
1726offline range (see 2.18). In that case, the file's offline-end
1727checkpoint is used in place of its datafile header checkpoint in
1728computing the media-recovery-start SCN.
1729
1730A buffer for reading redo is allocated for each thread in the enabled
1731thread bitvec of the media-recovery-start checkpoint (i.e. the
1732datafile checkpoint with the lowest SCN). The initial file header
1733checkpoint SCN of every file is saved to ensure that no redo from a
1734previous use of the file number is applied, as well as to eliminate
1735needlessly attempting to apply redo to a file from before its
1736checkpoint. The stop SCNs (from the datafiles' controlfile records)
1737are also saved. If finite, the highest stop SCN can be used to allow
1738recovery to terminate without needlessly searching for redo beyond
1739that SCN to apply (see 6.10). At recovery completion, any datafile
1740initially found to have a finite stop SCN will be left checkpointed at
1741that stop SCN (rather than at the recovery end-point). This allows
1742an offline-clean or read-only datafile to be left checkpointed at an
1743SCN that matches the tablespace-clean-stop-SCN of its tablespace.
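
The computation of the media-recovery-start SCN can be sketched in
Python as follows. This is an illustrative model only: the dictionary
keys header_ckpt_scn and offline_range are hypothetical, and treating
the offline range as the half-open interval [start, end) is an
assumption, since the exact boundary rule is not stated here.

```python
def media_recovery_start_scn(files):
    """Return the lowest effective checkpoint SCN of the files to recover.

    files: list of dicts with "header_ckpt_scn" and, optionally,
    "offline_range" given as (offline_start_scn, offline_end_scn).
    """
    def effective_scn(f):
        rng = f.get("offline_range")
        if rng is not None and rng[0] <= f["header_ckpt_scn"] < rng[1]:
            return rng[1]                # use the offline-end checkpoint
        return f["header_ckpt_scn"]

    return min(effective_scn(f) for f in files)


files = [
    {"header_ckpt_scn": 400},
    {"header_ckpt_scn": 250, "offline_range": (200, 300)},
    {"header_ckpt_scn": 350},
]
start_scn = media_recovery_start_scn(files)
```

In this example the second file's checkpoint (250) falls inside its
offline range, so its offline-end SCN (300) is used and becomes the
media-recovery-start SCN.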
1744
17456.6 Applying Redo, Media Recovery Checkpoints
1746
1747A log is opened for each thread of redo that was enabled at the time
1748the media-recovery-start SCN was allocated (i.e. for each thread in
1749the enabled thread bitvec of the media-recovery-start checkpoint).
1750If the log is online, then it is automatically opened. If the log was
1751archived, then the user is prompted to enter the name of the log
1752(unless automatic recovery is being used). The redo is applied from
1753all the threads in the order it was generated, switching threads as
1754needed. The order of application of redo records without an SCN is
1755not precise, but it is good enough for rollback to make the database
1756consistent.
1757
1758Except in the case of cancel-based incomplete recovery (see
17596.12.1) and backup controlfile recovery (see 6.13), the next online
1760log in sequence is accessed automatically, if it is on disk. If not, the
1761user is prompted for the next log.
1762
1763At log boundaries, media recovery executes a "checkpoint." As
1764part of media recovery checkpoint, the dirty recovery buffers are
1765written to disk and the datafile header checkpoints of the files
1766undergoing recovery are advanced, so that the redo does not need
1767to be reapplied. Another type of media recovery "checkpoint"
1768occurs when a datafile initially found to have a finite stop SCN
1769reaches that stop SCN. At such a stop SCN boundary, all dirty
1770recovery buffers are written to disk, and the datafiles that have been
1771made current have their datafile header checkpoints advanced to
1772their stop SCN values.
1773
17746.7 Media Recovery and Fuzzy Bits
1775
17766.7.1 Media-Recovery-Fuzzy
1777
1778The media-recovery-fuzzy bit is a flag in the datafile header that is
1779used to indicate that - due to ongoing redo application by media
1780recovery - the file may contain changes in the future of (at SCNs
1781beyond) the current header checkpoint SCN. The media-recovery-
1782fuzzy bit is set at the start of media recovery for each file
1783undergoing recovery. Generally the media-recovery-fuzzy bits can
1784be cleared when a media recovery checkpoint advances the
1785checkpoints in the datafile headers. They are left clear when a
media recovery session completes successfully or is cancelled. As
will be seen in 8.1, open with resetlogs following incomplete
media recovery will fail if any online datafile has the
media-recovery-fuzzy bit (or any fuzzy bit) set.
1790
17916.7.2 Online-Fuzzy
1792
1793Upon encountering an end-crash-recovery "marker" (or a file-
1794specific offline-immediate "marker": generated when a datafile
1795goes offline without a checkpoint), media recovery can (at the next
1796media recovery checkpoint) clear (if set) the online-fuzzy and
1797hotbackup-fuzzy bits in the appropriate datafile header(s).
1798
17996.7.3 Hotbackup-Fuzzy
1800
1801Upon encountering an end-backup "marker" (or an end-crash-
1802recovery "marker"), media recovery can (at the next media
1803recovery checkpoint) clear the hotbackup-fuzzy bit. Open with
1804resetlogs following incomplete media recovery will fail if any
1805online datafile has the hotbackup-fuzzy bit (or any fuzzy bit) set.
1806This prevents a successful RESETLOGS open following an
1807incomplete recovery that terminated before all redo generated
1808between BEGIN BACKUP and END BACKUP had been applied.
1809Ending incomplete recovery at such a point would generally result
1810in an inconsistent file, since the backup copy may already have
1811contained changes between this endpoint and the END BACKUP.
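
The fuzzy-bit test applied at RESETLOGS open can be expressed as a
simple predicate. The Python sketch below is a model under stated
assumptions: datafile headers are represented as dictionaries, and the
key names and the function resetlogs_open_allowed are hypothetical.

```python
FUZZY_BITS = ("online_fuzzy", "hotbackup_fuzzy", "media_recovery_fuzzy")


def resetlogs_open_allowed(datafiles):
    """Allow the open only if no online datafile has any fuzzy bit set."""
    return not any(
        f["online"] and any(f[bit] for bit in FUZZY_BITS)
        for f in datafiles
    )


clean = {"online": True, "online_fuzzy": False,
         "hotbackup_fuzzy": False, "media_recovery_fuzzy": False}
# Recovery ended before the end-backup "marker" was reached.
mid_backup = dict(clean, hotbackup_fuzzy=True)
offline_file = dict(mid_backup, online=False)
```

With these definitions, resetlogs_open_allowed([clean, mid_backup]) is
False, whereas a fuzzy file that is offline does not block the open.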
1812
18136.8 Thread Enables
1814
1815A special thread-enable redo record is written in the thread of an
1816instance enabling a new thread. If media recovery encounters a
1817thread-enable redo record, it allocates a new redo buffer, opens the
1818appropriate log in the new thread, and prepares to start applying
1819redo from the new thread.
1820
18216.9 Thread Disables
1822
1823When a thread is disabled, its current log is marked as the end of a
1824disabled thread. After media recovery finishes applying redo from
1825such a log, it deallocates the thread's redo buffer and stops looking
1826for redo from the thread.
1827
18286.10 Ending Media Recovery (Case of Complete Media Recovery)
1829
1830The current (i.e. last) log in every enabled thread has the end-of-
1831thread flag set in its header. Complete (as opposed to incomplete:
1832see 6.12) media recovery always continues redo application
1833through the end-of-thread in all threads. The end-of-thread log can
1834be identified without having the current controlfile, since the end-
1835of-thread flag is in the log header rather than in the logfile's
1836controlfile record.
1837
1838Note: Backing up and later restoring copies of current online logs
1839is dangerous, and can lead to mis-identification of the current true
1840end-of-thread. This is because the end-of-thread flag in the backup
1841copy will in general be out-of-date with respect to the current end-
1842of-thread log.
1843
1844If the datafiles being recovered have finite stop SCNs in their
1845controlfile records (assuming a current controlfile), then media
1846recovery can stop prior to the end-of-threads. Redo application for
1847a datafile with a finite stop SCN can terminate at that SCN, since it
1848is guaranteed that no redo for that datafile beyond that SCN was
1849generated.
1850
As described in 2.15, the stop SCN is set when a datafile goes
1852offline. Note that without the optimization that allows recovery of a
1853file with a finite stop SCN to terminate at that SCN, it could not be
1854guaranteed that recovery of an offline datafile while the database is
1855open would terminate.
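
The stop SCN optimization can be sketched in Python as follows. This
is a minimal illustration: an infinite stop SCN is modeled as
float("inf"), and the function name recovery_stop_point and the use of
file names as dictionary keys are assumptions.

```python
INFINITE_SCN = float("inf")


def recovery_stop_point(stop_scns):
    """Return the SCN at which redo application may end.

    stop_scns: {datafile_name: stop SCN from its controlfile record}
    Returns None when some file has an infinite stop SCN, in which case
    recovery must continue through the end-of-thread in all threads.
    """
    highest = max(stop_scns.values())
    return None if highest == INFINITE_SCN else highest


offline_files = {"users01.dbf": 900, "users02.dbf": 950}
mixed_files = {"users01.dbf": 900, "system01.dbf": INFINITE_SCN}
```

For offline_files the function returns 950, so recovery can end there
even while other instances keep generating redo; for mixed_files it
returns None.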
1856
18576.11 Automatic Recovery
1858
1859Automatic recovery is invoked by using the AUTOMATIC option
1860of the media recovery command. It saves the user the trouble of
1861entering the names of archived logfiles, provided they are on disk.
1862If the sequence number of the log can be determined, then a name
1863can be constructed by concatenating the current values of the
1864initialization parameters LOG_ARCHIVE_DEST and
1865LOG_ARCHIVE_FORMAT. The current LOG_ARCHIVE_DEST
1866is assumed, unless the user overrides it by specifying a different
1867archiving destination for the recovery session. The media-
1868recovery-start checkpoint (see 6.5) contains (in the RBA field) the
1869initial log sequence number for one thread (i.e. the thread that
1870generated the checkpoint). If multiple threads of redo are enabled,
1871the log history section of the controlfile (if configured) can be used
1872to map the media-recovery-start SCN to a log sequence number for
1873each thread. Once the initial recovery log is found for a thread, all
1874subsequent logs needed from the thread follow in order. If it is not
1875possible to determine the initial log sequence number, the user will
1876have to guess and try logs until the right one is accepted. The
1877timestamp from the media-recovery-start checkpoint is reported to
1878aid in this effort.
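
The construction of an archived log name can be sketched in Python as
follows. This is a hedged illustration: the text states only that the
two parameter values are concatenated, so the substitution of %s (log
sequence number) and %t (thread number) in the format, the sample
parameter values, and the function name archived_log_name are
assumptions.

```python
def archived_log_name(log_archive_dest, log_archive_format, thread, sequence):
    """Build a candidate archived log name for automatic recovery."""
    name = log_archive_format.replace("%s", str(sequence))
    name = name.replace("%t", str(thread))
    # Plain concatenation: the destination supplies any path separator.
    return log_archive_dest + name


candidate = archived_log_name("/u01/arch/", "arch_%t_%s.arc",
                              thread=1, sequence=205)
```

Once the initial sequence number for a thread is known, the same
function can be called with sequence + 1, sequence + 2, and so on to
name the subsequent logs of that thread.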
1879
18806.12 Incomplete Recovery
1881
1882A RECOVER DATABASE execution can be stopped and the
1883database opened before all the redo has been applied. This type of
1884recovery is termed incomplete recovery. The subsequent database
1885open is termed a RESETLOGS open.
1886
1887Incomplete recovery effectively sets the entire database backwards
1888in time to a transaction-consistent state at or near the recovery end-
1889point. All subsequent updates to the database are lost and must be
1890re-entered.
1891
1892Use of incomplete recovery is indicated in the following
1893circumstances:
1894
- Media recovery is necessary (e.g. due to datafile damage or
loss), but cannot be complete (i.e. all redo cannot be applied)
because all copies of a needed online or archived redo log
were lost.

- All copies of an active (i.e. needed for instance recovery) log
were damaged or lost while the database was open. Since
crash recovery is precluded, this case reduces to the previous
case.

- It is necessary to reverse the effect of an erroneous user action
(e.g. table drop or batch run); and it is acceptable to set the
entire database - not just the affected schema objects -
backwards to a point-in-time before the error.
1909
19106.12.1 Incomplete Recovery UNTIL Options
1911
1912There are three types of incomplete recovery. They differ in the
1913means used to stop the recovery:
1914
- Cancel-Based (RECOVER DATABASE UNTIL CANCEL)

- Change-Based (RECOVER DATABASE UNTIL CHANGE)

- Time-Based (RECOVER DATABASE UNTIL TIME)
1920
1921The UNTIL CANCEL option terminates recovery when the user
1922enters "cancel" rather than the name of a log. Online logs are not
1923automatically applied in this mode in case cancellation at the next
1924log is desired. If multiple threads of redo are being recovered, there
1925may be logs in other threads that are partially applied when the
1926recovery is cancelled.
1927
1928The UNTIL CHANGE option terminates redo application just
1929before any redo associated with the specified SCN or higher. Thus
1930the transaction that committed at that SCN will be rolled back. If
1931you want to recover through a transaction that committed at a
1932specific SCN, then add one to the specified SCN.
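
The UNTIL CHANGE boundary can be illustrated with a short Python
sketch. It assumes a thread-merged redo stream of (SCN, description)
tuples in increasing SCN order; the function name redo_to_apply is
hypothetical.

```python
from itertools import takewhile


def redo_to_apply(merged_redo, until_scn):
    """Return the redo applied by UNTIL CHANGE until_scn.

    Application stops just before the first record whose SCN is equal
    to or higher than until_scn.
    """
    return list(takewhile(lambda rec: rec[0] < until_scn, merged_redo))


merged_redo = [(100, "change"), (101, "commit of T1"), (102, "change")]
excluding_t1 = redo_to_apply(merged_redo, 101)
including_t1 = redo_to_apply(merged_redo, 101 + 1)
```

Specifying 101 leaves the commit of T1 unapplied, so T1 is rolled
back; specifying 102 (the commit SCN plus one) recovers through it.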
1933
1934The UNTIL TIME option works similarly to the UNTIL CHANGE
1935option, except that a time rather than an SCN is specified.
1936Recovery uses the timestamps in the redo block headers to convert
1937the specified time into an SCN. Then recovery is stopped when that
1938SCN is reached.
1939
19406.12.2 Incomplete Recovery and Consistency
1941
1942In order to avoid database corruption when running incomplete
1943recovery, all datafiles must be recovered to the exact same point.
Furthermore, no datafile may have any changes in the future of this
point. This requires that incomplete media recovery start from
1946datafiles restored from backups whose copies completed prior to
1947the intended stop time. The system uses file header fuzzy bits (see
19488.1) to ensure that the datafiles contain no changes in the future of
1949the stop time.
1950
19516.12.3 Incomplete Recovery and Datafiles Known to the Controlfile
1952
1953If recovering to a time before a datafile was dropped, the dropped
1954file must appear in the controlfile used for recovery. Otherwise it
1955would not be recovered. One alternative for achieving this is to
1956recover using a backup controlfile made before the datafile was
1957dropped. Another alternative is to use the CREATE
1958CONTROLFILE command to construct a controlfile that lists the
1959dropped datafile.
1960
1961Recovering to a time before a file was added is not a problem. The
1962extra datafile will be eliminated from the controlfile after the
1963database is open. The unwanted file may be taken offline before the
1964recovery to avoid accessing it.
1965
19666.12.4 Resetlogs Open after Incomplete Recovery
1967
1968The next database open after an incomplete recovery must specify
1969the RESETLOGS option. Amongst other effects (see Section 7),
1970resetlogs throws away the redo that was not applied during the
1971incomplete recovery, and marks the database so that the skipped
1972redo can never be accidentally applied by a subsequent recovery. If
1973the incomplete recovery was a mistake (e.g. the lost log was
1974found), the next open can specify the NORESETLOGS option.
1975However, for the open with NORESETLOGS to succeed, it must
1976be preceded by a successful execution of complete recovery (i.e.
1977one in which all redo is applied).
1978
19796.12.5 Files Offline during Incomplete Recovery
1980
1981If a file is offline during incomplete recovery, it will not be
recovered. This is acceptable if the file is part of a tablespace that was taken
1983offline normal, and that is still offline normal at the recovery end-
1984point. Otherwise, if the file is still offline when the resetlogs is
1985done, the tablespace containing the file will have to be dropped.
1986This is because it will need media recovery with logs from before
1987the resetlogs. In general V$DATAFILE should be checked to
1988ensure that files are online before running an incomplete recovery.
1989Only files that will be dropped and files that are part of offline
1990normal (or read-only) tablespaces should be offline (Section 8.6).
1991
19926.13 Backup Controlfile Recovery
1993
1994If recovery is done with a controlfile other than the current one,
1995then backup controlfile recovery (RECOVER
1996DATABASE...USING BACKUP CONTROLFILE) must be used.
1997This applies both to the case of a restored controlfile backup, and to
1998the case of a "backup" controlfile created via CREATE
1999CONTROLFILE...RESETLOGS.
2000
2001Use of CREATE CONTROLFILE...RESETLOGS makes a
2002controlfile that is a "backup." Only a backup controlfile recovery
2003can be run after executing CREATE
2004CONTROLFILE...RESETLOGS. Only a RESETLOGS open can
2005be used after executing CREATE
2006CONTROLFILE...RESETLOGS. Use of CREATE
2007CONTROLFILE...RESETLOGS is indicated if (all copies of) an
2008online redo log were lost in addition to (all copies of) the control
2009file.
2010
2011By contrast, CREATE CONTROLFILE...NORESETLOGS makes
2012a controlfile that is "current"; i.e. it has knowledge of the current
2013state of the online logfiles and log sequence numbers. A backup
2014controlfile recovery is not necessary following CREATE
2015CONTROLFILE...NORESETLOGS. Indeed, no recovery at all is
2016required if there was a clean shutdown, and if no datafile backups
2017have been restored. A normal or NORESETLOGS open may
2018follow CREATE CONTROLFILE ...NORESETLOGS.
2019
2020A backup controlfile lacks valid information about the current
2021online logs and datafile stop SCNs. Hence, recovery cannot look
for online logs to automatically apply. Moreover, recovery must
assume infinite stop SCNs. A RESETLOGS open corrects this
2024information. The backup controlfile may have a different set of
2025threads enabled than did the original controlfile. That set will be the
2026effective enabled thread set following RESETLOGS open.
2027
2028The BACKUP CONTROLFILE option may be used either alone or
2029in conjunction with an incomplete recovery option. Unless an
2030incomplete recovery option is included, all threads must be applied
2031to the end-of-thread. This is validated at open resetlogs time.
2032
2033It is currently required that a RESETLOGS open follow execution
2034of backup controlfile recovery, even if no incomplete recovery
2035option was used. The following procedure could be used to avoid a
2036backup controlfile recovery and resetlogs in case the only problem
2037is a lost current controlfile (and a backup controlfile exists):
2038
20391. Copy the backup controlfile to the current control file and do a
2040STARTUP MOUNT.
2041
20422. Issue ALTER DATABASE BACKUP CONTROLFILE TO
2043TRACE NORESETLOGS.
2044
20453. Issue the CREATE CONTROLFILE...NORESETLOGS com-
2046mand from the SQL script output by Step 2.
2047
2048It is important to assure that the CREATE CONTROLFILE
2049command issued in Step 3 creates a controlfile reflecting a database
2050structure equivalent to that of the lost current controlfile. For
2051example, if a datafile was added since the backup controlfile was
2052saved, then the CREATE CONTROLFILE command should be
2053modified to declare the added datafile.
2054
2055Failure to specify the BACKUP CONTROLFILE option on the
2056RECOVER DATABASE command when the controlfile is indeed a
2057backup can frequently be detected. One indication of a restored
2058backup controlfile would be a datafile header checkpoint count that
2059is greater than the checkpoint count in the datafile's controlfile
2060record. However, this test may not catch the backup controlfile if
2061the datafiles are also backups. Another test validates the online
2062logfile headers against their corresponding controlfile records, but
2063this too may not always catch an old controlfile.
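
The checkpoint-count test described above can be illustrated with a
small Python sketch. This is a toy model only: the function name and
the dictionaries mapping file numbers to checkpoint counts are
hypothetical, and the kernel's actual check is not shown in this
document.

```python
def looks_like_backup_controlfile(header_counts, controlfile_counts):
    """Return True if any datafile header is ahead of its controlfile record.

    Both arguments map file number -> checkpoint count (hypothetical layout).
    """
    return any(
        header_counts[file_no] > controlfile_counts[file_no]
        for file_no in header_counts
    )

# Controlfile restored from backup, datafiles current: the test fires.
detected = looks_like_backup_controlfile({1: 57, 2: 60}, {1: 40, 2: 40})

# Controlfile and datafiles restored from the same backup: the test is blind.
missed = looks_like_backup_controlfile({1: 40, 2: 40}, {1: 40, 2: 40})
```

The second call shows why the test is described as one that "may not
catch" a backup controlfile when the datafiles are also backups.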

6.14 CREATE DATAFILE: Recover a Datafile Without a Backup

If a datafile is lost or damaged and no backup of the file is
available, it can be recovered using only information in the redo
logs and control file. The following conditions must be met:

1. All redo logs written since the datafile was originally created
must be available.

2. A control file in which the datafile is declared (i.e. name and
size information) must be available or re-creatable.

The CREATE DATAFILE clause of the ALTER DATABASE
command is first used to create a new, empty replacement for the
lost datafile. RECOVER DATAFILE is then used to apply all redo
generated for the file from the time of its original creation until the
time it was lost. After all redo logs written since the datafile was
originally created have been applied, the file will have been
restored to its state at the time it was lost. This mechanism is useful
for recovering a recently-created datafile for which no backup has
yet been taken. The original datafiles of the SYSTEM tablespace
cannot be recovered by this means, however, since relevant redo
data is not saved at database creation time.

6.15 Point-in-Time Recovery Using Export/Import

Occasionally, it may become necessary to reverse the effect of an
erroneous user action (e.g. table drop or batch run). One approach
would be to perform an incomplete media recovery to a
point-in-time before the corruption, then open the database with the
RESETLOGS option. Using this approach, the entire database -
not just the affected schema objects - would be set backwards in
time.

This approach has an undesirable side-effect: it discards committed
transactions. Any updates that occurred subsequent to the resetlogs
SCN are lost and must be re-entered. Resetlogs has another
undesirable side-effect: it renders all pre-existing backups unusable
for future recovery.

Setting a mission-critical database globally back in time is often
not an acceptable solution. The following procedure is an
alternative whose effect on the mission-critical database is to set
just the affected schema objects - termed the recovery-objects -
backwards in time.

Point-in-time incomplete media recovery is run against a side-copy
of the production database, called the recovery-database. The
initial version of the recovery-database is created using backups of
the production database that were taken before the corruption
occurred. Non-relevant objects in the recovery-database can be
taken offline in order to avoid unnecessarily recovering them.
However, the SYSTEM tablespace and all tablespaces containing
rollback segments must participate in the media recovery in order
to allow a clean open. (Note that this is a good reason to place
rollback segments and data segments into separate tablespaces.)

After it has undergone point-in-time incomplete media recovery,
the recovery-database is opened with the RESETLOGS option.
The recovery-database is now set backwards to a point-in-time
before the recovery-objects were corrupted. This effectively
creates pre-corruption versions of the recovery-objects in the
recovery-database. These objects can then be exported from the
recovery-database and imported back into the production database.
Prior to importing the recovery-objects, the production database is
prepared as follows:

- In the case of recovering an erroneously updated schema
object, the copy of the object in the production database is
prepared by discarding just the data; e.g. the table is truncated.

- In the case of recovering an erroneously dropped schema
object, the object is re-created (empty) in the production
database.

The import operation is then executed, using the data-only option
as appropriate. Since export/import can be a lengthy process, it
may be desirable to postpone it until a time when recovery-object
unavailability can be tolerated. In the meantime, the
recovery-objects can be made available, albeit at degraded
performance, via a database link between the production database
and the recovery-database.

An undesirable side-effect of this approach is that transaction
consistency across objects is lost. This side-effect can be avoided
by widening the recovery-object set to include all objects that must
be kept transaction-consistent.

7 Block Recovery

Block recovery is the simplest type of recovery. It is performed
automatically by the system during normal operation of the
database, and is transparent to the user.

7.1 Block Recovery Initiation and Operation

Block recovery is used to clean up the state of a buffer whose
modification by a foreground process (in the middle of invoking a
redo application callback to apply a change vector to the buffer)
was interrupted by the foreground process dying or signalling an
error. Recovery involves (i) reading the block from disk; (ii) using
the current thread's online redo logs to reconstruct the buffer to a
state consistent with the redo already generated; and (iii) writing
the recovered block back to disk. If block recovery fails, then after
a second attempt, the block is marked logically corrupt (by setting
the block sequence number to zero) and a corrupt block error is
signalled.

Block recovery is guaranteed to be possible using only the current
thread's online redo logs, since:

1. Block recovery cannot require redo from another thread or
from before the last thread checkpoint.

2. Online logs are not reused until the current thread checkpoint
is beyond the log.

3. No buffer currently in the cache can need recovery from
before the last thread checkpoint.

7.2 Buffer Header RBA Fields

The buffer header (an in-memory data structure) contains the
following fields pertaining to block recovery:

Low-RBA and High-RBA: Delineate the range of redo (from the
current thread) that needs to be applied to the disk version of the
block in order to make it consistent with redo already generated.

Recovery-RBA: A place marker for recording progress in case the
invoker of block recovery is PMON and complete recovery in
one invocation would take too long (see next section).

7.3 PMON vs. Foreground Invocation

If an error is signalled while a foreground process is in a redo
application callback, then the process itself executes block
recovery. If foreground process death is detected during a redo
application callback, on the other hand, PMON executes block
recovery.

Block recovery may require an unbounded amount of time and I/O.
However, PMON cannot be allowed to spend an inordinate amount
of time working on the recovery of one block while neglecting
other necessary time-critical tasks. Therefore, a limit is placed on
the amount of redo applied by one PMON call to block recovery.
(A port-specific constant specifies the maximum number of redo
log blocks applied per invocation). As PMON applies redo during
invocations of block recovery, it updates the recovery-RBA in the
buffer header to record its progress. When a PMON call to block
recovery causes the recovery-RBA to reach the high-RBA, then
block recovery for that block is complete.
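
The bounded, resumable behaviour of PMON-driven block recovery can
be sketched in Python as follows. This is an illustrative model, not
kernel code: an RBA is simplified to a single integer counting redo
log blocks, MAX_REDO_BLOCKS_PER_CALL is a hypothetical stand-in for
the port-specific constant, and the actual application of change
vectors is reduced to a comment.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the port-specific per-invocation limit.
MAX_REDO_BLOCKS_PER_CALL = 4

@dataclass
class BufferHeader:
    low_rba: int        # start of the redo range needed by the block
    high_rba: int       # end of the redo range needed by the block
    recovery_rba: int   # progress marker carried across PMON calls

def pmon_block_recovery_step(buf: BufferHeader) -> bool:
    """Apply a bounded amount of redo; return True once recovery is complete."""
    end = min(buf.recovery_rba + MAX_REDO_BLOCKS_PER_CALL, buf.high_rba)
    # Redo in [recovery_rba, end) would be applied to the block here.
    buf.recovery_rba = end
    return buf.recovery_rba >= buf.high_rba

# Ten redo blocks at four per call: PMON needs three invocations,
# and is free to do other work between them.
buf = BufferHeader(low_rba=100, high_rba=110, recovery_rba=100)
calls = 0
done = False
while not done:
    done = pmon_block_recovery_step(buf)
    calls += 1
```

Because the progress marker lives in the buffer header rather than in
the caller, each invocation can pick up exactly where the previous one
stopped.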

8 Resetlogs

The RESETLOGS option is needed on the first database open
following:

- Incomplete recovery

- Backup controlfile recovery

- CREATE CONTROLFILE...RESETLOGS.

The primary function of resetlogs is to discard the redo that was not
applied during incomplete recovery, ensuring that the skipped redo
can never be accidentally applied by a subsequent recovery. To
accomplish this, resetlogs effectively invalidates all existing redo
in all online and archived redo logfiles. This has the side effect of
making any existing datafile backups unusable for future recovery
operations.

Resetlogs also reinitializes the controlfile information about online
logs and redo threads, clears the contents of any existing online
redo log files, creates the online redo log files if they do not
currently exist, and resets the log sequence number in all threads to
one.

8.1 Fuzzy Files

The most important requirement when doing a RESETLOGS open
is that all datafiles be validated as recovered to the same
point-in-time. This is what ensures that all the changes in a single
redo record are done atomically. It is also important for other
consistency reasons. If all threads of redo have been applied
through end-of-thread to all online datafiles, then we can be sure
that the database is consistent.

If incomplete recovery was done, there is the possibility that a file
was not restored from a sufficiently old backup. In the general case,
this is detectable if the file has a different checkpoint than the other
files (exceptions: offline or read-only files).

The other possibility is that the file is fuzzy - i.e. it may contain
changes in the future of its checkpoint. As seen earlier, the
following "fuzzy bits" are maintained in the file header to
determine if a file is fuzzy:

- online-fuzzy bit (see 3.5, 6.7.2)

- hotbackup-fuzzy bit (see 4, 6.7.3)

- media-recovery-fuzzy bit (see 6.7.1)

Open with resetlogs following incomplete media recovery will fail
if any online datafile has any of the three fuzzy bits set.

Redo records are created at the end of a hot backup (the
end-backup "marker") and after crash recovery (the
end-crash-recovery "marker") to enable media recovery to determine
when it can clear the fuzzy bits. Resetlogs signals an error if any of
the datafiles has any of the fuzzy bits set.

Except in the following special circumstances, resetlogs signals an
error if any of the datafiles is recovered to a checkpoint SCN
different from the one at which the other files are checkpointed (i.e.
the resetlogs SCN: see 8.2):

1. A file recovered to an SCN earlier than the resetlogs SCN
would be tolerated in case there were no redo generated for the
file between its checkpoint SCN and the resetlogs SCN. For
example, such would be the case if the file were read-only, and
its offline range spanned the checkpoint SCN and resetlogs
SCN. In this case, resetlogs would allow the file but set it
offline.

2. A file checkpointed at an SCN later than the resetlogs SCN
would be tolerated in case its creation SCN (allocated at file
creation time and stored in the file header) showed it to have
been created after the resetlogs SCN. During the data dictionary
vs. controlfile check performed by RESETLOGS open
(see 8.7), such a file would be found to be missing from the
data dictionary but present in the controlfile. As a consequence,
it would be eliminated from the controlfile.
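
The fuzzy-bit and checkpoint rules of this section can be summarized
in a short Python sketch. It is a simplified model under stated
assumptions: the Datafile class, its field names, and the result
strings are hypothetical, the three fuzzy bits are collapsed into one
flag, and the offline range is represented as a (start SCN, end SCN)
pair.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Datafile:
    name: str
    checkpoint_scn: int
    creation_scn: int
    fuzzy: bool = False                             # any of the three fuzzy bits
    offline_range: Optional[Tuple[int, int]] = None

def resetlogs_check(f: Datafile, resetlogs_scn: int) -> str:
    """Classify one online datafile at RESETLOGS open (toy model)."""
    if f.fuzzy:
        return "error"
    if f.checkpoint_scn == resetlogs_scn:
        return "ok"
    if f.checkpoint_scn < resetlogs_scn:
        # Exception 1: no redo exists for the file between its checkpoint
        # and the resetlogs SCN, e.g. the offline range spans both.
        r = f.offline_range
        if r is not None and r[0] <= f.checkpoint_scn and resetlogs_scn <= r[1]:
            return "accept-but-set-offline"
        return "error"
    # Exception 2: the file was created after the resetlogs SCN.
    if f.creation_scn > resetlogs_scn:
        return "drop-from-controlfile"
    return "error"

files = [
    Datafile("users01", checkpoint_scn=500, creation_scn=10),
    Datafile("ro01", checkpoint_scn=300, creation_scn=10, offline_range=(250, 600)),
    Datafile("old01", checkpoint_scn=300, creation_scn=10),
    Datafile("new01", checkpoint_scn=700, creation_scn=650),
    Datafile("fuzzy01", checkpoint_scn=500, creation_scn=10, fuzzy=True),
]
results = {f.name: resetlogs_check(f, resetlogs_scn=500) for f in files}
```

In this model "old01" is the case of a file restored from a backup
that was not recovered far enough, and it is rejected.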

8.2 Resetlogs SCN and Counter

A resetlogs SCN and resetlogs timestamp - known together as the
resetlogs data - are kept in the database info record of the
controlfile. The resetlogs data is intended to uniquely identify each
execution of a RESETLOGS open. The resetlogs data is also stored
in each datafile header and in each logfile header. A redo log cannot
be applied by recovery if its resetlogs data does not match that in
the database info record of the controlfile. Except for some very
special circumstances (e.g. offline normal or read-only
tablespaces), a datafile cannot be recovered or accessed if its
resetlogs data does not match that of the database info record of the
controlfile. This ensures that changes discarded by resetlogs do not
get back into the database. It also renders previous backups
unusable for future recovery operations, making it prudent to take a
database backup immediately after a resetlogs.

8.3 Effect of Resetlogs on Threads

Each thread's controlfile record is updated to clear the thread-open
flag and to set the thread-checkpoint SCN to the resetlogs SCN.
Thus, the thread appears to have been closed at the resetlogs SCN.
The set of enabled threads from the enabled thread bitvec of the
database info controlfile record is used as is. It does not matter
which threads were enabled at the end of recovery, since none of
the old redo can ever be applied to the database again. The log
sequence numbers in all threads are also reset to one. One of the
enabled threads is picked as the database checkpoint.

8.4 Effect of Resetlogs on Redo Logs

The redo is thrown away by zeroing all the online logs. Note that
this means that redo in the online logs would be lost forever - and
there would be no way to undo the resetlogs in an emergency - if
the online logs were not backed up prior to executing resetlogs.
Note that ensuring the ability to undo an erroneous resetlogs is the
only valid rationale for making backups of online logs. Undoing an
erroneous resetlogs requires re-running the entire recovery
operation from the beginning, after restoring backups of all
datafiles, controlfile, and online logs.

One log is picked to be the current log for every enabled thread.
That log header is written as log sequence number one. Note that
the set of logs and their thread association is picked up from the
controlfile (i.e. using the thread number and log list fields of the
logfile records). If it is a backup controlfile, this may be different
from what was current the last time the database was open.
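
The effect on the online logs can be pictured with the following
Python sketch. It is illustrative only: the dictionary layout of a
logfile record is hypothetical, and choosing the first listed log of
each enabled thread as the new current log is an assumption (the text
says only that one log is picked).

```python
def resetlogs_online_logs(logfile_records, enabled_threads):
    """Discard old redo and pick one current log per enabled thread (toy model)."""
    current_by_thread = {}
    for rec in logfile_records:
        rec["contents"] = b""      # the old redo is thrown away
        rec["sequence"] = None     # no longer holds any log sequence
        thread = rec["thread"]
        if thread in enabled_threads and thread not in current_by_thread:
            rec["sequence"] = 1    # header written as log sequence number one
            current_by_thread[thread] = rec["group"]
    return current_by_thread

logs = [
    {"group": 1, "thread": 1, "sequence": 41, "contents": b"old"},
    {"group": 2, "thread": 1, "sequence": 42, "contents": b"old"},
    {"group": 3, "thread": 2, "sequence": 17, "contents": b"old"},
]
current = resetlogs_online_logs(logs, enabled_threads={1, 2})
```

Since the thread association comes from the records passed in, a
backup controlfile with a different log list would yield a different
result, as the paragraph above points out.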

8.5 Effect of Resetlogs on Online Datafiles

The headers of all the online datafiles are updated to be
checkpointed at the new database checkpoint. The new resetlogs
data is also written to the header.

8.6 Effect of Resetlogs on Offline Datafiles

The controlfile record for an offline file is set to indicate the file
needs media recovery. However that will not be possible because it
would be necessary to apply redo from logs with the wrong
resetlogs data. This means that the tablespace containing the file
will have to be dropped. There is one important exception to this
rule. When a tablespace is taken offline normal or set read-only, the
checkpoint SCN written to the headers of the tablespace's
constituent datafiles is saved in the data dictionary TS$ table as the
tablespace-clean-stop SCN (see 2.17). No recovery is ever needed
to bring a tablespace and its files online if the files are not fuzzy
and are checkpointed at exactly the tablespace-clean-stop SCN.
Even the resetlogs data in the offline file header is ignored in this
case. Thus a tablespace that is offline normal is unaffected by any
resetlogs that leaves the database at a time when the tablespace is
offline.
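
The exception for cleanly stopped tablespaces amounts to a simple
test, sketched here in Python. The class and function names are
hypothetical, and a tablespace that was not taken offline normal or
set read-only is modelled by passing None as its clean-stop SCN.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class OfflineFile:
    checkpoint_scn: int
    resetlogs_data: Tuple[int, int]   # (resetlogs SCN, resetlogs timestamp)
    fuzzy: bool = False

def can_online_without_recovery(f: OfflineFile, clean_stop_scn: Optional[int]) -> bool:
    """True if the file can come online with no recovery (toy model).

    f.resetlogs_data is deliberately not consulted: in this case the
    resetlogs data in the file header is ignored.
    """
    return (
        clean_stop_scn is not None
        and not f.fuzzy
        and f.checkpoint_scn == clean_stop_scn
    )

# Taken offline normal at SCN 900 under an older resetlogs: still usable.
clean = OfflineFile(checkpoint_scn=900, resetlogs_data=(100, 1))
# No clean-stop SCN recorded: media recovery would be needed.
dirty = OfflineFile(checkpoint_scn=880, resetlogs_data=(100, 1))

clean_ok = can_online_without_recovery(clean, clean_stop_scn=900)
dirty_ok = can_online_without_recovery(dirty, clean_stop_scn=None)
```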

8.7 Checking Dictionary vs. Controlfile on Resetlogs Open

After the rollback phase of RESETLOGS open, the datafiles listed
in the data dictionary FILE$ table are compared with the datafiles
listed in the controlfile. This is also done on the first open after a
CREATE CONTROLFILE. There is the possibility that incomplete
recovery ended at a time when the files in the database were
different from those in the controlfile used for the recovery. Using a
backup controlfile or creating one can have the same problem.
Checking the dictionary does not do any harm, so it could be done
on every database open; however there is no point in wasting the
time under normal circumstances.

The entry in FILE$ is compared with the entry in the controlfile
for every file number. Since FILE$ reflects the space allocation
information in the database, it is correct, and the controlfile might
be wrong. If the file does not exist in FILE$ but the controlfile
record says the file exists, then the file is simply dropped from the
controlfile.

If a file exists in FILE$ but not in the controlfile, a placeholder
entry is created in the control file under the name MISSINGnnnn
(where nnnn is the file number in decimal). MISSINGnnnn is
flagged in the control file as being offline and needing media
recovery. The actual file corresponding (with respect to the file
header contents as opposed to the file name) to MISSINGnnnn can
be made accessible by renaming MISSINGnnnn to point to it.

In the RESETLOGS open case however, rename can succeed in
making the file usable only in case the file was read-only or offline
normal. If, on the other hand, MISSINGnnnn corresponds to a file
that was not read-only or offline normal, then the rename operation
cannot be used to make it accessible, since bringing it online would
require media recovery with redo from before the resetlogs. In this
case, the tablespace containing the datafile must be dropped.

When the dictionary check is due to open after CREATE
CONTROLFILE...NORESETLOGS rather than to open resetlogs,
media recovery may be used to make the file current.

Another option is to repeat the entire operation that led up to the
dictionary check with a controlfile that lists the same datafiles as
the data dictionary. For incomplete recovery, this would involve
restoring all backups and repeating the recovery.
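
The reconciliation of FILE$ against the controlfile can be sketched
in Python as follows. This is a toy model: the record layout is
hypothetical, and rendering nnnn as a zero-padded four-digit decimal
number is an assumption.

```python
def reconcile(file_dollar, controlfile):
    """Reconcile controlfile entries with FILE$, which is taken as correct.

    file_dollar: set of file numbers present in FILE$.
    controlfile: dict mapping file number -> controlfile record.
    """
    result = {}
    for file_no in sorted(file_dollar | set(controlfile)):
        if file_no not in file_dollar:
            continue   # in the controlfile only: dropped from the controlfile
        if file_no in controlfile:
            result[file_no] = controlfile[file_no]
        else:
            # In FILE$ only: placeholder, offline and needing media recovery.
            result[file_no] = {
                "name": f"MISSING{file_no:04d}",
                "online": False,
                "needs_media_recovery": True,
            }
    return result

controlfile = {
    1: {"name": "system01.dbf", "online": True, "needs_media_recovery": False},
    2: {"name": "users01.dbf", "online": True, "needs_media_recovery": False},
    4: {"name": "temp01.dbf", "online": True, "needs_media_recovery": False},
}
reconciled = reconcile({1, 2, 3}, controlfile)
```

Here file 4 disappears from the controlfile and file 3 appears as a
placeholder that must subsequently be renamed to point at the real
file.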

9 Recovery-Related V$ Fixed-Views

The V$ fixed-views contain columns that extract information from
data structures dynamically maintained in memory by the kernel.
These "views" make this information accessible to the DBA under
SYS. The following is a summary of recovery-related information
that is viewable via V$ views:

9.1 V$LOG

Contains log group information from the controlfile:

GROUP#
THREAD#
SEQUENCE#
SIZE_IN_BYTES
MEMBERS_IN_GROUP
ARCHIVED_FLAG
STATUS_OF_GROUP (unused, current, active, inactive)
LOW_SCN
LOW_SCN_TIME

9.2 V$LOGFILE

Contains log file (i.e. group member) information from the
controlfile:

GROUP#
STATUS_OF_MEMBER (invalid, stale, deleted)
NAME_OF_MEMBER

9.3 V$LOG_HISTORY

Contains log history information from the controlfile:

THREAD#
SEQUENCE#
LOW_SCN
LOW_SCN_TIME
NEXT_SCN

9.4 V$RECOVERY_LOG

Contains information (from the controlfile log history) about
archived logs needed to complete media recovery:

THREAD#
SEQUENCE#
LOW_SCN_TIME
ARCHIVED_NAME

9.5 V$RECOVER_FILE

Contains information on the status of files needing media recovery:

FILE#
ONLINE_FLAG
REASON_MEDIA_RECOVERY_NEEDED
RECOVERY_START_SCN
RECOVERY_START_SCN_TIME

9.6 V$BACKUP

Contains status information relative to datafiles in hot backup:

FILE#
FILE_STATUS (no-backup-active, backup-active, offline-normal, error)
BEGIN_BACKUP_SCN
BEGIN_BACKUP_TIME

10 Miscellaneous Recovery Features

10.1 Parallel Recovery (v7.1)

The goal of the parallel recovery feature is to use compute and I/O
parallelism to reduce the elapsed time required to perform crash
recovery, single-instance recovery, or media recovery. Parallel
recovery is most effective at reducing recovery time when several
datafiles on several disks are being recovered concurrently.

10.1.1 Parallel Recovery Architecture

Parallel recovery partitions recovery processing into two
operations:

1. Reading the redo log.

2. Applying the change vectors.

Operation #1 does not easily lend itself to parallelization. The redo
log(s) must be read in sequentially, and merged in the case of
media recovery. Thus, this task is assigned to one process: the
redo-reading-process.

Operation #2, on the other hand, easily lends itself to
parallelization. Thus, the task of change vector application is
delegated to some number of redo-application-slave-processes.
The redo-reading-process sends change vectors to the
redo-application-slave-processes using the same IPC
(inter-process-communication) mechanism used by parallel query.
The change vectors are distributed based on the hash function that
takes the block address as argument (i.e. DBA modulo #
redo-application-slave-processes). Thus, each
redo-application-slave-process handles only change vectors for
blocks whose DBAs hash to its "bucket" number. The
redo-application-slave-processes are responsible for reading the
datablocks into cache, checking whether or not the change vectors
need to be applied, and applying the change vectors if needed.

This architecture achieves parallelism in log read I/O, datablock
read I/O, and change vector processing. It allows overlap of log
read I/Os with datablock read I/Os. Moreover, it allows overlap of
datablock read I/Os for different hash "buckets." Recovery elapsed
time is reduced as long as the benefits of compute and I/O
parallelism outweigh the costs of process management and
inter-process-communication.
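
The distribution of change vectors to slaves can be illustrated with
a short Python sketch. It models only the hashing step described
above (DBA modulo the number of slaves); the function names are
hypothetical, a change vector is reduced to a (DBA, payload) pair, and
the IPC mechanism is replaced by in-memory lists.

```python
from collections import defaultdict

def slave_for(dba: int, n_slaves: int) -> int:
    """Bucket number for a data block address: DBA modulo number of slaves."""
    return dba % n_slaves

def distribute(change_vectors, n_slaves):
    """Route (dba, payload) pairs, in log order, to per-slave work lists."""
    buckets = defaultdict(list)
    for dba, payload in change_vectors:
        buckets[slave_for(dba, n_slaves)].append((dba, payload))
    return dict(buckets)

# Change vectors in the order the redo-reading-process encounters them.
redo_stream = [(10, "a"), (11, "b"), (10, "c"), (14, "d")]
buckets = distribute(redo_stream, n_slaves=4)
```

One property worth noting in this model: every change vector for a
given block lands in the same bucket and keeps its log order there,
so no two slaves ever need to coordinate on a single block.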

10.1.2 Parallel Recovery System Initialization Parameters

PARALLEL_RECOVERY_MAX_THREADS
PARALLEL_RECOVERY_MIN_THREADS

These initialization parameters control the number of
redo-application-slave-processes used during crash recovery or
media recovery of all datafiles.

PARALLEL_INSTANCE_RECOVERY_THREADS

This initialization parameter controls the number of
redo-application-slave-processes used during instance recovery.

10.1.3 Media Recovery Command Syntax Changes

RECOVER DATABASE has a new optional parameter for specifying
the number of redo-application-slave-processes. If specified,
it overrides PARALLEL_RECOVERY_MAX_THREADS.

RECOVER TABLESPACE has a new optional parameter for
specifying the number of redo-application-slave-processes. If
specified, it overrides PARALLEL_RECOVERY_MIN_THREADS.

RECOVER DATAFILE has a new optional parameter for specifying
the number of redo-application-slave-processes. If specified,
it overrides PARALLEL_RECOVERY_MIN_THREADS.

10.2 Redo Log Checksums (v7.2)

The log checksum feature allows a potential corruption in an online
redo log to be detected when the log is read for archiving. The goal
is to prevent the corruption from being propagated, undetected, to
the archive log copy. This feature is intended to be used in
conjunction with a new command, CLEAR LOGFILE, that allows
a corrupted online redo log to be discarded without having to
archive it.

A new initialization parameter, LOG_BLOCK_CHECKSUM,
controls activation of log checksums. If it is set, a log block
checksum is computed and placed in the header of each log block
as it is written out of the redo log buffer. If present, checksums are
validated whenever log blocks are read for archiving or recovery. If
a checksum is detected as invalid, an attempt is made to read
another member of the log group (if any). If an irrecoverable
checksum error is detected - i.e. the checksum is invalid in all
members - then the log read operation fails.
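
The read-with-fallback behaviour can be sketched in Python as
follows. This is an illustration under stated assumptions: zlib.crc32
merely stands in for the checksum algorithm (which this document does
not name), each group member is modelled as a dictionary mapping a
block number to a (stored checksum, payload) pair, and LogReadError
is a hypothetical exception type.

```python
import zlib

class LogReadError(Exception):
    """Raised when a log block fails its checksum in every group member."""

def checksum(payload: bytes) -> int:
    # Stand-in only; the real checksum algorithm is not specified here.
    return zlib.crc32(payload)

def read_log_block(members, block_no):
    """Return the first copy of the block whose checksum validates."""
    for member in members:
        stored, payload = member[block_no]
        if checksum(payload) == stored:
            return payload
    raise LogReadError(f"block {block_no}: checksum invalid in all members")

good = b"redo-block-0"
member_a = {0: (checksum(good), b"redo-blocX-0")}   # corrupted copy
member_b = {0: (checksum(good), good)}              # intact copy

# The corrupt member is skipped and the intact member satisfies the read.
recovered = read_log_block([member_a, member_b], 0)
```

With only member_a available, the same call raises LogReadError,
which corresponds to the irrecoverable case in which the log read
operation fails.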

Note that a rudimentary mechanism for detecting log block header
corruption was added, along with log group support, in v7.1. The
log checksum feature extends corruption detection to the whole
block.

If an irrecoverable checksum error prevents a log from being read
for archiving, then the log cannot be reused. Eventually log switch
- and redo generation - will stall. If no action is taken, the
database will hang. The CLEAR LOGFILE command provides a
way to obviate the requirement that the log be archived before it
can be reused.

10.3 Clear Logfile (v7.2)

If all members of an online redo log group are "lost" or "corrupted"
(e.g. due to checksum error, media error, etc.), redo generation may
proceed normally until it becomes necessary to reuse the logfile.
Once the thread checkpoints of all threads are beyond the log, it is a
potential candidate for reuse. Possible scenarios preventing reuse
are the following:

1. The log cannot be archived due to a checksum error; it cannot
be reused because it needs archiving.

2. A log switch attempt fails because the log is inaccessible (e.g.
due to a media error). The log may or may not have been
archived.

The ALTER DATABASE CLEAR LOGFILE command is
provided as an aid to recovering from such scenarios involving an
inactive online redo log group (i.e. one that is not needed for crash
recovery). CLEAR LOGFILE allows an inactive online logfile to
be "cleared": i.e. discarded and reinitialized, in a manner analogous
to DROP LOGFILE followed by ADD LOGFILE. In many cases,
use of this command obviates the need for database shutdown or
resetlogs.

Note: CLEAR LOGFILE cannot be used to clear a log needed for
crash recovery (i.e. a "current" or "active" log of an open thread).
Instead, if such a log becomes lost or corrupted, shutdown abort
followed by incomplete recovery and open resetlogs will be
necessary.

Use of the UNARCHIVED option allows the log clear operation to
proceed even if the log needs archiving: an operation that would be
disallowed by DROP LOGFILE. Furthermore, CLEAR LOGFILE
allows the log clear operation to proceed in the following cases:

- There are only two logfile groups in the thread.

- All log group members have been lost through media failure.

- The logfile being cleared is the current log of a closed thread.

All of these operations would be disallowed in the case of DROP
LOGFILE.

Clearing an unarchived log makes unusable any existing backup
whose recovery would require applying redo from the cleared log.
Therefore, it is recommended that the database be immediately
backed up following use of CLEAR LOGFILE with the
UNARCHIVED option. Furthermore, the UNRECOVERABLE
DATAFILE option must be used if there is a datafile that is offline,
and whose recovery prior to onlining requires application of redo
from the cleared logfile. Following use of CLEAR LOGFILE with
the UNRECOVERABLE DATAFILE option, the offline datafile,
together with its entire tablespace, will have to be dropped from the
database. This is due to the fact that redo necessary to bring it
online has been cleared, and there is no other copy of it.

The foreground process executing CLEAR LOGFILE processes
the command in several steps:

- It checks that the logfile is not needed for crash recovery and
is clearable.

- It sets the "being cleared" and "archiving not needed" flags in
the logfile controlfile record. While the "being cleared" flag is
set, the logfile is ineligible for reuse by log switch.

- It creates a new logfile, and performs multiple writes to clear
it to zeroes (a lengthy process).

- It resets the "being cleared" flag.
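
The steps above can be modelled roughly in Python as follows. This is
a simplified sketch, not the actual implementation: LogfileRecord and
its fields are hypothetical, the multiple zeroing writes are collapsed
into one assignment, and rejecting an unarchived log when UNARCHIVED
is not given is inferred from the description of that option.

```python
from dataclasses import dataclass

@dataclass
class LogfileRecord:
    status: str                    # "current", "active", "inactive" or "unused"
    thread_open: bool = True
    needs_archiving: bool = False
    being_cleared: bool = False
    contents: bytes = b""

def clear_logfile(rec: LogfileRecord, size: int, unarchived: bool = False) -> LogfileRecord:
    # Step 1: a current or active log of an open thread is needed for
    # crash recovery and cannot be cleared.
    if rec.thread_open and rec.status in ("current", "active"):
        raise ValueError("log is needed for crash recovery")
    if rec.needs_archiving and not unarchived:
        raise ValueError("log needs archiving; UNARCHIVED not specified")
    # Step 2: flag the log; while flagged it is ineligible for log switch.
    rec.being_cleared = True
    rec.needs_archiving = False
    # Step 3: create the new logfile and clear it to zeroes.
    rec.contents = bytes(size)
    # Step 4: the log is usable again.
    rec.being_cleared = False
    return rec

rec = LogfileRecord(status="inactive", needs_archiving=True, contents=b"old redo")
clear_logfile(rec, size=8, unarchived=True)
```

If the process died between steps 2 and 4 in this model, the record
would be left with being_cleared set, which is the situation that
reissuing the command is meant to repair.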

If the foreground process executing CLEAR LOGFILE dies while
execution is in progress, the log will not be usable as the current log.
Redo generation may stall and the database may hang, much as
would happen if log switch had to wait for checkpoint completion,
or for log archive completion. Should the process executing
CLEAR LOGFILE die, the operation should be completed by
reissuing the same command. Another option would be to drop the
partially-cleared log. CLEAR LOGFILE could also fail due to an
I/O error encountered while writing zeros to a log group member. An
option for recovering would be to drop that member and add
another to replace it.