[ale] fun fun changing Linux swap partition to a swap file

Michael B. Trausch mike at trausch.us
Thu Jan 20 03:56:12 EST 2011


On Thu, 2011-01-20 at 01:17 -0500, Ron Frazier wrote:
> Before I detail the steps, I just have to say this.  I don't intend to
> start a flame war, but most of what I did here, I could have done on
> the Windows side of my dual boot fence within 15 minutes from the
> graphical, easy to use, utility built into Vista.  Doing it on Linux
> took me over an hour the first time and involved numerous trips to the
> command line and a dance with a live boot disk.  I think it could and
> should be much easier. 

Given that the Windows operating system has never used a dedicated swap
partition, I think it's fair to say that the task that you have
described in this message does not even apply to Windows.  Partition and
filesystem resizing aside, that is.  And while it is possible to move
the swap file from one drive to another (or, at least, it used to be),
Windows treats the swap file (along with a few other files) as a special
case.  That is, it is not a "normal" file on the filesystem; the
operating system treats it differently from other files, and the way
that Windows does this makes the filesystem quite a bit less flexible
than it is in, say, most UNIX-like operating systems (Linux included).

Now, Linux does not make any special provisions for swap files (or, for
that matter, any other files that happen to be on the filesystem).
Unlike Windows, which guarantees that the swap file is only ever
exactly one extent (IOW, one contiguous run of blocks on the platter),
Linux makes it entirely possible for a swap file to be (or to become,
if resized) fragmented.  A swap file must not be sparse, and it should
live on a filesystem whose driver is built into the kernel or the
initrd image so that the swap file is available when the system comes
up.  (There is a minimal setup sketch after the list below.)  However,
there are a few "gotchas":

 * A swap file on a filesystem is only usable for virtual memory
   if the filesystem is mounted read-write, because the virtual memory
   layer must be able to read and write swap or it is useless as a
   backing store.

 * If the filesystem becomes corrupted (which, trust me, can happen for
   any number of reasons), then the swap file can become corrupted.  If
   the swap file is corrupted, it won't be a matter of if you suffer a
   crash, but when.

 * If you are using the hibernation feature of the kernel, it is almost
   a given that you should be using a full swap partition, and not a
   swap file.  Hibernation will leave files open and will preserve the
   state of many devices, including all virtual (software) devices such
   as filesystem drivers.  However, the act of writing to a file on
   a filesystem means that you will wind up with not only the swap file
   contents being updated, but also the file's metadata, which will land
   in the filesystem journal (if you are using a journaled filesystem).
   While it is not a regular occurrence, it is possible in certain cases
   to have changes made to the filesystem as a result of using a swap
   file that are not recorded in the filesystem driver's state as saved
   within the swap file itself.  This can cause problems in certain
   situations.

 * Don't use a swap file on btrfs.  Or any networked filesystem.  Or a
   FUSE filesystem.  Or a normal filesystem that is accessed on a
   block device that is provided over the network.  And certainly not
   on any filesystem that isn't yet time-tested and heavily stressed,
   one that hasn't received a "proper shakedown".  Unless, that is,
   you very much understand the workings of the virtual memory
   subsystem, or are looking to learn them (or other complex things,
   like figuring out WTF went wrong) by experimentation.  :-)
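
For completeness, here is a minimal sketch of the setup being talked
about, assuming a 2 GiB swap file at the hypothetical path /swapfile on
the root filesystem (adjust the size and path to taste); dd, rather
than any sparse-file trick, is used precisely so that the file is fully
allocated:

    # Allocate a fully-populated (non-sparse) 2 GiB file.
    dd if=/dev/zero of=/swapfile bs=1M count=2048

    # Only root should be able to read or write it.
    chmod 600 /swapfile

    # Write the swap signature, then start using it.
    mkswap /swapfile
    swapon /swapfile

    # To make it permanent, add a line like this to /etc/fstab:
    #   /swapfile  none  swap  sw  0  0

    # Verify that the kernel is actually using it.
    cat /proc/swaps

Once the file is in use and survives a reboot, the old swap partition
can be taken out of service with swapoff and its /etc/fstab entry
removed or repurposed.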

Perhaps the biggest thing with using a swap file on a real filesystem
is that it is then up to the filesystem whether or not the swap file
will work as intended or as efficiently as it can.  Even worse, if you
put your swap file on a filesystem and your system runs out of memory,
you will usually have even less chance of getting it back without
pressing the reset button on the computer, especially if the problem
with the system is the filesystem driver itself (*cough, cough* btrfs
and its way-slow sync I/O, if you want a recent example).

One reason that UNIX-like systems all tend to use a dedicated partition
for swap storage is that it is a lot easier:  you have a contiguous set
of blocks that have no metadata in them and essentially represent an
extension to live system RAM (when the system is up and running,
anyway).  It's harder to screw up a dedicated swap partition, since
raw access to partitions is normally off-limits to every user of a
system except the superuser.  It's also something that has been well
tested; the code for handling swap partitions hasn't needed to change
in ages.
However, filesystems come and go, and yes, we still find bugs even in
the extX series of filesystems (more in the newer ones than the older
ones, but I have yet to find a filesystem driver that is free of
bugs---and it is nearly impossible to prove that one is free of bugs,
because filesystems tend to be relatively complex).
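
As a quick sanity check either way, the kernel will tell you exactly
which swap areas it is using and whether each one is a partition or a
file:

    # List the swap areas currently in use (type, size, usage).
    swapon -s
    # ...or, equivalently:
    cat /proc/swaps

    # Overall memory and swap usage.
    free -m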

I'd like to know more about your cloning setup to try to find out what
happened to the swap partition during the clone process.  It might have
to do with the fact that most systems refer to just about everything
(including swap) using unique identifiers that aren't partition numbers,
which means that if you have two filesystems or swap spaces that use
identical unique IDs, the system can get confused.  If your setup
refers to partition numbers instead, it is just as easy to confuse,
since device enumeration does not have to remain static (and can
differ based on things like the time it takes for a drive to respond
to init/identification commands).
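
To make "unique identifiers" concrete: here is roughly how to see them
and how they typically show up in /etc/fstab.  The device names below
are placeholders, and the UUID values are elided:

    # Show the filesystem/swap signatures and their UUIDs.
    blkid /dev/sda1 /dev/sda5
    # /dev/sda1: UUID="..." TYPE="ext4"
    # /dev/sda5: UUID="..." TYPE="swap"

    # A typical /etc/fstab entry then refers to the swap space by UUID
    # rather than by device name, so it survives device reordering:
    #   UUID=<uuid-of-the-swap-space>  none  swap  sw  0  0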

This brings up another point: block-by-block clones of filesystems are
almost never useful, are typically wasteful, are usually not as
compressible as a smart image (e.g., there can be junk data in unused
sectors that won't compress as well as the all-null bytes that would
be there if you used a smarter utility), and can in fact cause harm in
subtle ways.  Imagine this: you have two hard disk drives, /dev/sda
and /dev/sdb.  Now, imagine that your backup plan is to dd the
contents of /dev/sda to /dev/sdb every day.  If your system is set up
as most modern systems
are, this is going to cause a bit of an interesting problem: now, you
have two drives that appear to be different (/dev/sda and /dev/sdb) but
have filesystems with the same identifiers on them.  Now you have
created the ability for a race condition to occur.  UUIDs prevent
confusion caused by device reordering, but only if every filesystem on
every drive has a truly unique identifier.

Instead, what happens if at boot time, the drives change logical spots?
Maybe the drives are on two different controllers, and sometimes one
controller responds faster than the other and other times the second one
responds faster.  Or maybe the wires are just the tiniest bit
squirrelly.  Whichever filesystem with the specified UUID is the first
filesystem to be found by the operating system is the one that will be
used.  That means that you might inadvertently find yourself using your
backup drive as your primary drive.  Or even worse, you won't realize it
before you clobber the drive that you have been using as your primary
drive.  And in the worst case scenario, you won't notice it at all until
you find files mysteriously missing, or filesystems that are seemingly
inconsistent with each other.
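
To make the danger (and the manual cleanup) concrete, here is roughly
what that scenario looks like; the device names are placeholders, and
the exact commands depend on the filesystems involved:

    # A raw, block-for-block clone of the whole drive...
    dd if=/dev/sda of=/dev/sdb bs=1M

    # ...leaves both drives carrying identical UUIDs:
    blkid /dev/sda1 /dev/sdb1

    # If you insist on cloning this way, give the copy fresh
    # identifiers afterwards: a new random UUID for an ext2/3/4
    # filesystem...
    tune2fs -U random /dev/sdb1

    # ...and a rebuilt swap signature (which also gets a new UUID)
    # for the cloned swap partition.
    mkswap /dev/sdb5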

It doesn't have to be this painful, however.  There is a very useful
component that is available with every modern Linux-based system at
installation time: LVM, the Logical Volume Manager.  LVM makes *so*
many things a great deal easier.  Among other things:

  * You can create (and mount) read-only or read-write snapshots of
    live filesystems so that you can safely run backups while the
    system is up and running.  Snapshots usually take only a second
    or so to create, can be destroyed almost instantly, and only need
    to stick around long enough to create a new full or incremental
    backup.  (There is a short example of this after the list.)

  * Logical volume management means that you can move, grow, shrink,
    create, delete, reshape, and reconfigure logical volumes without
    worrying about whether the filesystem will be happy or not.  Oh,
    and filesystems running on LVM do not have to be 100% contiguous,
    either; you can have them in extents all over the place, and you
    won't wind up with the odd problem of having to shuffle things
    around to create enough contiguous space later on should you have
    to expand.  LVM storage areas (called "physical volumes") and
    pools (called "volume groups") can grow and shrink as well, so
    you can increase your storage pool if desired, or shrink it if
    need be.

  * LVM's capabilities are filesystem agnostic.  Snapshotting applies
    to all filesystems, including FAT, NTFS, ext2, ext3, ext4, minix,
    btrfs...

  * LVM does not have a (practical) limit on the number of logical
    volumes present.  Unlike the MBR partition table, which is
    limited to 4 primary partitions, or extended partitions, which
    only raise that limit somewhat (and the Linux drivers, which
    often cap a device that uses the MBR partition table at 16
    partitions in total), logical volumes have either no limit,
    or a limit that is high enough that it is not likely to be
    encountered.  I create tens of volumes (many of them snapshots)
    on production systems all the time.  It's part of my daily
    backup processes.

  * Once you know how to use LVM and all of its functionality, you
    can perform volume management tasks far more quickly than
    without, especially if you do things like take backups quite
    often (as, IMHO, one should).

  * If you're using LVM, you no longer need to use UUIDs in your
    system files like /etc/fstab.  Instead, you can use the stable
    device names that LVM creates under /dev/myvolumegroup/.
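
Since I promised an example above, here is a rough sketch of the
snapshot-backup and resize workflow, assuming a volume group named
"myvolumegroup" holding a logical volume "home" with an ext4
filesystem on it; every name, path, and size here is a placeholder:

    # Take a snapshot of the live volume.  The 2G is space reserved
    # for blocks that change while the snapshot exists, not a full
    # copy of the volume.
    lvcreate --snapshot --size 2G --name home-snap /dev/myvolumegroup/home

    # Mount it read-only, back it up at your leisure, then drop it.
    mkdir -p /mnt/home-snap
    mount -o ro /dev/myvolumegroup/home-snap /mnt/home-snap
    tar -czf /backups/home-$(date +%Y%m%d).tar.gz -C /mnt/home-snap .
    umount /mnt/home-snap
    lvremove /dev/myvolumegroup/home-snap

    # Growing a volume later is just as quick; the filesystem on top
    # still has to be resized to match (resize2fs for ext3/ext4).
    lvextend --size +10G /dev/myvolumegroup/home
    resize2fs /dev/myvolumegroup/home

The snapshot only consumes as much space as actually changes during
the backup window, which is why it can be created and thrown away so
cheaply.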

Yes, most UNIX-like systems expect that you know what you're doing
around the command line to perform truly advanced, and sometimes
dangerous (from the point of view of your data), tasks.  On the other
hand, Windows has a habit of exposing some of those advanced and
sometimes dangerous (again, from the point of view of your data) tasks
via its GUI, leading people to believe that somehow it is safer
because they are using a mouse.  Then people have systems that they
don't know how to fix, because there simply isn't a low-level way to
get at them, or at least nothing as low-level as UNIX family systems
(and the software that runs on them) tend to allow.  It gets pretty
difficult to accurately learn about your computer and how it works
when so much of it is tucked away under the covers.  And that's fine,
when it works.  But boy, when it doesn't, it's a whole different world
of hurt.  Then not only is it not working, but the answer to the
question "how do I get it working again" is usually "I don't know" or
"reinstall the system".

Even if the only thing that's wrong is that a file was deleted that
needs nothing more than to be restored from the latest backup or the
operating system installation media.

	--- Mike