[mirror-admin] Inode caching [was: Re: ERROR: chroot failed for fedora-web]
Chris Schanzle
schanzle at nist.gov
Wed Jan 14 12:17:09 EST 2009
On 01/13/2009 08:48 PM, Matt Domsch wrote:
> a) various kernel tunables to keep more NFS inodes and directory trees
> in cache on the server.
Funny this has come up, as in the last couple of weeks I've been
investigating this issue and noticed a huge benefit on my local mirror /
file server from NOT caching file data, ironic as that sounds, and
caching inode data instead. After some tweaking, I can only imagine the
seeking a public rsync server would suffer without a preference for
caching inode data. And when I connect to a mirror and see rsync (with
the -vP options) take seconds just to count files by the hundreds, yet
transfer them at > MB/sec speeds, I know its inode cache is not working
optimally, or the server needs more RAM.
The Linux kernel tunable is /proc/sys/vm/vfs_cache_pressure. It's a
balance knob, centered at 100: at 100 the kernel reclaims inode/dentry
cache and cached file data equally, and lower values make it prefer to
keep inode/dentry data.
E.g., near the extreme:
echo 10 > /proc/sys/vm/vfs_cache_pressure
or add to /etc/sysctl.conf for persistence across reboots:
vm.vfs_cache_pressure = 10
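The same knob can be read and written through sysctl instead of echoing into /proc; a minimal sketch (the write is commented out since it requires root):

```shell
# Read the current value (world-readable, no root needed).
cat /proc/sys/vm/vfs_cache_pressure

# Equivalent to the echo above; requires root:
# sysctl -w vm.vfs_cache_pressure=10

# Verify through the sysctl interface, if the sysctl binary is present.
command -v sysctl >/dev/null && sysctl vm.vfs_cache_pressure
```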
Monitor inode/dentry cache usage via the Slab line in /proc/meminfo;
cached file data is listed as "Cached". For more info, see
/usr/share/doc/kernel-doc-*/Documentation/filesystems/proc.txt
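The balance between the two caches can be watched directly from /proc/meminfo; a quick sketch (field names are standard, sizes are in kB):

```shell
# Slab holds the inode/dentry caches (among others); Cached is file data.
grep -E '^(Slab|Cached):' /proc/meminfo

# Repeat every 2 seconds to watch the balance shift under load:
# watch -n 2 'grep -E "^(Slab|Cached):" /proc/meminfo'
```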
Caution: like all things in life, too much of a good thing is bad.
Twisting the knob too low (e.g., all the way to zero) and then loading
up the inode cache with "du -sh" on a 3.4-million-file filesystem
brought my 4GB x86_64 server to a near freeze. Character echo through
ssh took about two minutes, even with the "du" stopped. As soon as I
got vfs_cache_pressure back up to 10, the system was normal again. The
OOM killer had been kicking in (dmesg | grep "Out of memory"), knocking
out httpd processes. At a vfs_cache_pressure value of 10, my Slab
worked its way up to 3.6-3.7GB and was managed without adverse side
effects.
It's worth noting that a 4GB system cannot hold all 3.4 million inodes
in RAM.
I also suspect increasing the device readahead buffer with 'blockdev
--setra NN DEVICE' would be beneficial, to reduce seeking back for more
file data while streaming it out the network. But the file-data caching
this encourages essentially conflicts with a low vfs_cache_pressure;
I'm not sure what instruments can be used to determine the right
balance.
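For illustration, a readahead change might look like the following; /dev/sda is a placeholder device name, the unit is 512-byte sectors, and the write requires root (so it is commented out):

```shell
DEV=/dev/sda   # hypothetical device; substitute your mirror's disk

# Current readahead, in 512-byte sectors (256 sectors = 128 kB),
# printed only if the device and the blockdev tool are present.
if [ -b "$DEV" ] && command -v blockdev >/dev/null; then
    blockdev --getra "$DEV"
fi

# Raise it to 4096 sectors = 4096 * 512 bytes = 2 MB (root required):
# blockdev --setra 4096 "$DEV"
```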
Please share your experience with tuning parameters in our mirroring
application.
-Chris