I have done a lot of tweaking with tune2fs, snap-restore procedures, and so on to reduce the time it takes to restore an FC LUN exported from a storage system, both unified systems and pure block-level arrays (direct-attached storage only). We have only recently moved to a requirement for a 15 TB FC LUN, formatted as ext3 on a RHEL 5u5 guest OS. The filesystem sits on LVM and is multipathed through two single-port QLogic HBAs to the backend storage device in an active-backup mpath configuration.
My question is: are there any best practices or guidelines for reducing the chances of an fsck being required on the guest OS (RHEL), or workarounds to get things operational again ASAP? One idea I am considering is a snap-restore from the most recent snapshot: first blow away the entire ext3 filesystem via a reformat, then restore the snapshot. Occasionally (twice in two years across 170+ systems) we have had the journal become corrupt; in those cases I just removed the journal via tune2fs and recreated it, since data retention is not critical. The partition holds HDF5 files that are purged on various intervals and never retained for more than five days.
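For reference, the journal-drop-and-recreate sequence mentioned above looks roughly like this. It is sketched here against a loopback image file so it can be tried safely; on the real system you would point tune2fs/e2fsck at the unmounted mapper device instead (the image path and sizes below are illustrative only):

```shell
# Build a small throwaway ext3 image to demonstrate on (stand-in for the LUN).
IMG=/tmp/demo-ext3.img
dd if=/dev/zero of="$IMG" bs=1M count=64 2>/dev/null
mkfs.ext3 -q -F "$IMG"

# 1. Drop the (possibly corrupt) journal entirely.
tune2fs -O ^has_journal "$IMG"

# 2. Force a full check now that there is no journal left to replay.
e2fsck -f -y "$IMG"

# 3. Recreate a fresh journal before remounting.
tune2fs -j "$IMG"
```

The key point is step 2: running e2fsck after removing the journal feature means the checker never tries to replay the damaged journal, which is what usually wedges the recovery.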
The filesystem is also part of a resource group that can fail over between two servers.
Anyway -- other than the obvious tune2fs settings to disable auto-checking on mount count or elapsed time, and having a valid backup/restore policy, are there any good TRs or guidelines for such a recovery plan within a single storage system (i.e. not using SnapMirror, SyncMirror, etc.)?
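By "the obvious tune2fs settings" I mean disabling the mount-count- and time-based forced checks. Sketched on a loopback image again (substitute the real, unmounted device path in practice):

```shell
# Throwaway ext3 image to demonstrate on (stand-in for the LUN).
IMG=/tmp/demo-ext3-nocheck.img
dd if=/dev/zero of="$IMG" bs=1M count=64 2>/dev/null
mkfs.ext3 -q -F "$IMG"

# -c 0: never force an fsck based on mount count
# -i 0: never force an fsck based on time since the last check
tune2fs -c 0 -i 0 "$IMG"
```

Afterward `tune2fs -l` reports a maximum mount count of -1 and a check interval of 0 (none), so boots and HA failovers never trigger a scheduled fsck of the 15 TB volume; checks then only happen when corruption is actually detected or forced manually.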
Google and the forums have of course provided some insight, but I am curious what others with large (I am sure much larger) FC LUNs presented to Linux guests as ext3 have experienced in the past.