[ale] Remounting R/W After Aborted Journal

Tue Jan 11 14:16:45 EST 2011

On 01/11/2011 01:29 PM, Drew Wade wrote:
> Jim,
> 
> You really need to fsck that volume to correct the problem.
> 
> Since it is in read only mode, you need to umount it or umount -l it if
> it doesn't respond to umount.
> 
> Then fsck the logical volume.  Then once that completes you need to
> remount it.  Even if the application is happily running, if it tries to
> retrieve a piece of data on an errored filesystem you risk feeding it bad
> data which it could send downstream (to a DB or another post processing
> application) and run into even more problems.
> 
> I'd suggest telling the customer that you need to fix the filesystem and
> get some outage time from the app owners, then umount and fsck it.  If
> that fails, you'll have to go to single user mode and fsck it (just make
> sure you unmount it even in single user mode before the fsck).

FYI, I was the one asking the question, not Jim.

On three other volumes that suffered the same problem, I already
unmounted and remounted without an fsck. These are ext3 filesystems with
the journal in ordered mode.
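
As an aside, here is a minimal Python sketch for spotting which ext3
volumes the kernel has flipped to read-only. It assumes the usual
/proc/mounts layout (device, mount point, type, comma-separated
options). The function name read_only_mounts and the device names in
the comments are made up for illustration. It only reports mount state
and says nothing about whether a filesystem actually needs an fsck.

```python
def read_only_mounts(mounts_text, fstypes=("ext3",)):
    """Return (device, mountpoint) pairs for filesystems mounted read-only.

    mounts_text is expected to look like /proc/mounts, e.g. a line such as
    "/dev/mapper/vg0-data /data ext3 ro,data=ordered 0 0"
    (hypothetical device and mount point).
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        # Match the exact "ro" option, not substrings like errors=remount-ro.
        if fstype in fstypes and "ro" in options.split(","):
            found.append((device, mountpoint))
    return found
```

Feeding it the contents of /proc/mounts, for example
read_only_mounts(open("/proc/mounts").read()), should list the volumes
that are currently read-only.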

I am not sure if I should be treating this situation (someone unplugging
the connection to the storage array) any differently than I would treat
someone unplugging the power cables from the server itself. In that
case, I would expect the journal to keep filesystem metadata consistent.
File contents might still be corrupt if a program was overwriting data
inside an existing file (instead of extending it), but that isn't
something fsck can fix.

http://www.ibm.com/developerworks/library/l-fs7.html
http://www.redhat.com/support/wpapers/redhat/ext3/tuning.html

-- 
All the best,
Brian Pitts