Thursday, November 10, 2011

TSM replacing damaged primary storage pool volume - tape

A failing tape can be replaced in several ways; I'll cover two of them:

If the tape can still be read, you can issue the move data command to move its data from the failing tape to other tape(s) in the same storage pool:
move data VOLUME
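
As a rough illustration of the readable-tape path, the sketch below only builds the admin command strings; it does not talk to a TSM server. The helper name salvage_commands and the volume name A00017 are made up for the example, and wrapping MOVE DATA with QUERY VOLUME and QUERY CONTENT is my assumption about a sensible check before and after, not something the procedure requires.

```python
from typing import List


def salvage_commands(volume: str) -> List[str]:
    """Return admin commands for draining a tape that can still be read."""
    if not volume or " " in volume:
        raise ValueError("volume name must be a single non-empty token")
    vol = volume.upper()
    return [
        # Look at the access mode and error counters before touching the tape.
        f"QUERY VOLUME {vol} FORMAT=DETAILED",
        # With no target pool given, data moves to other volumes in the same pool.
        f"MOVE DATA {vol}",
        # After the move finishes, check whether any files remain on the tape.
        f"QUERY CONTENT {vol}",
    ]


if __name__ == "__main__":
    for command in salvage_commands("A00017"):  # A00017 is a hypothetical volume
        print(command)
```

The printed lines can be pasted into an administrative command-line session one at a time, waiting for the move process to finish before running the last query.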

If the tape cannot be read, its data can be restored from the copy (backup) storage pool.
For the "how to", read more below:


Problem (Abstract)

My primary storage pool volume is physically damaged and cannot be reused. What are the steps for recovering the data on the volume?

Resolving the problem

NOTE: The key factor is that a copy storage pool must exist for the primary storage pool, and the data on the volume(s) must already have been copied with the 'backup stgpool' command. If the data on the damaged volume(s) has not been copied to the copy storage pool, or if the data on the volume was damaged before it was copied, then it will not be recoverable. Any files that cannot be restored are eligible to be backed up from the client again.
1. The damaged volume must be marked as destroyed to prevent access:
UPDATE VOLUME ACCESS=DESTROYED

2. Check the volume out of the library before proceeding. When the restore commands are executed, the destroyed volume is deleted from the TSM database, and it can no longer be removed from the library within TSM if it has not been checked out first. To check out the volume:
CHECKOUT LIBVOLUME

3. This command will preview the restore and not move any physical data. This will produce a list of volumes needed for the restore:
RESTORE VOLUME PREVIEW=YES

4. Look in the Activity log for the list of volumes that must be returned from offsite storage and checked into the library.

5. Place the volumes from step #4 in the BULK I/O door and then check them into the library:
CHECKIN LIBVOLUME SEARCH=BULK CHECKLABEL=BARCODE STATUS=PRIVATE

If you do not have a bulk I/O door then place them in empty slots inside the library and change the SEARCH parameter:
CHECKIN LIBVOLUME SEARCH=YES CHECKLABEL=BARCODE STATUS=PRIVATE

6. The checked-in volumes must be marked READONLY to prevent other processes from using them. The important thing to remember is that all volumes listed in the preview need to be in an ACCESS=READONLY state so that the next command can run to completion:
UPDATE VOLUME ACCESS=READONLY WHERESTGPOOL=

7. The next command will start the volume restore process. The 'maximum number of processes' needs to be at least 2: one process to read from the copy pool volume and one process to write the new primary storage pool volume. Ensure that sufficient scratch volumes are available.
RESTORE VOLUME MAXPROCESS=

8. Once the restore has completed, the copy volumes need to be sent back offsite. The first step is to change their access back to offsite. Once all volumes from step #5 have been updated to ACCESS=OFFSITE, the server will not try to use them during reclamation.
UPDATE VOLUME * ACCESS=OFFSITE WHERESTGPOOL=

9. Check out the copy pool volumes so that they can be delivered back to the vault location:
CHECKOUT LIBVOLUME CHECKLABEL=NO

10. This command needs to be run on the damaged primary volume only if the TSM Server reports that files are still located on the volume. This could be because of earlier problems with the 'backup stgpool' command, or because files on that volume had not yet been copied to the copy storage pool. These files, as long as they are still present on the client, will be backed up again from the owning client during its next incremental:
DELETE VOLUME DISCARD=YES
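
To see how the steps above fit together for one concrete case, here is a minimal sketch that prints the commands in order with the names filled in. It only formats strings and never contacts a server. Everything that names a specific object is an assumption for illustration: the helper restore_plan, the damaged volume A00017, the library LIB1, and the copy pool volumes C00101 and C00102. One deliberate difference from the steps: the sketch updates each copy volume by name instead of using the wildcard form with WHERESTGPOOL, so that only the tapes recalled for this restore change access mode.

```python
from typing import List, Sequence, Tuple


def restore_plan(
    volume: str,
    library: str,
    copy_volumes: Sequence[str],
    max_process: int = 2,
    bulk_door: bool = True,
) -> List[Tuple[int, str]]:
    """Return (step number, admin command) pairs for the restore procedure."""
    if max_process < 2:
        raise ValueError("MAXPROCESS must be at least 2 (one reader, one writer)")
    # SEARCH=BULK reads the I/O door; SEARCH=YES scans slots inside the library.
    search = "BULK" if bulk_door else "YES"
    plan = [
        (1, f"UPDATE VOLUME {volume} ACCESS=DESTROYED"),
        (2, f"CHECKOUT LIBVOLUME {library} {volume}"),
        (3, f"RESTORE VOLUME {volume} PREVIEW=YES"),
        # Step 4 is manual: read the activity log and recall the listed tapes.
        (5, f"CHECKIN LIBVOLUME {library} SEARCH={search} "
            "CHECKLABEL=BARCODE STATUS=PRIVATE"),
    ]
    plan += [(6, f"UPDATE VOLUME {v} ACCESS=READONLY") for v in copy_volumes]
    plan.append((7, f"RESTORE VOLUME {volume} MAXPROCESS={max_process}"))
    plan += [(8, f"UPDATE VOLUME {v} ACCESS=OFFSITE") for v in copy_volumes]
    plan += [(9, f"CHECKOUT LIBVOLUME {library} {v} CHECKLABEL=NO")
             for v in copy_volumes]
    # Step 10 applies only if the server still reports files on the volume.
    plan.append((10, f"DELETE VOLUME {volume} DISCARD=YES"))
    return plan


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    for step, command in restore_plan("A00017", "LIB1", ["C00101", "C00102"]):
        print(f"step {step:>2}: {command}")
```

The copy_volumes list is whatever the preview in step 3 reported, so in practice the plan can only be completed after steps 1 to 4 have actually been run.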
