[nylug-talk] Ext3 Disk Reservation and Defaults -- WAS: extfs journalling percentage
Bryan J. Smith
b.j.smith at ieee.org
Wed May 28 14:16:53 EDT 2008
Michael Bubb wrote:
> The file system in this case is ext3.
> On this and other similar servers in my company,
> I have noticed a 5.1% 'overhead' on the
> difference between the partition's capacity and its
> amount usable.
> I realize this is the journal and the blocks reserved
> for root.
Chris Knadle wrote:
> I'm not 100% positive, but I don't think the ext3
> journal counts in that 5% reserved for root-only usage.
Chris, you are most definitely correct. That 5% is purely a block
reservation, writable only with superuser privilege.
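To see the reservation on a live system: the gap between the free
blocks and the blocks available to unprivileged users is the root-only
reserve. Below is a minimal Python sketch, assuming a POSIX system
where os.statvfs is available. The helper names
reservation_from_counts and reservation_for_path are made up for
illustration, and the fraction is taken relative to the block count
that statvfs reports, which may differ slightly from the percentage
set when the filesystem was created.

```python
import os
from typing import NamedTuple


class Reservation(NamedTuple):
    reserved_blocks: int      # blocks only the superuser may allocate
    reserved_fraction: float  # reserved blocks / total reported blocks


def reservation_from_counts(f_blocks, f_bfree, f_bavail):
    """Derive the root-only reserve from statvfs-style block counts.

    f_bfree counts every free block; f_bavail counts only the free
    blocks an unprivileged user may use. The difference is the reserve.
    """
    reserved = f_bfree - f_bavail
    fraction = reserved / f_blocks if f_blocks else 0.0
    return Reservation(reserved, fraction)


def reservation_for_path(path):
    """Query the filesystem holding `path` (hypothetical helper)."""
    st = os.statvfs(path)
    return reservation_from_counts(st.f_blocks, st.f_bfree, st.f_bavail)
```

Calling reservation_for_path("/") on an Ext3 filesystem created with
the defaults should report a fraction somewhere near 0.05, as long as
the filesystem has not already eaten into the reserve.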
The Ext3 meta-data journal is tiny -- tens of MiBs. In fact, even
back in the kernel 2.2 Ext3 full data journaling days, the upper,
recommended limit for "full data" journaling was still under 0.1%
IIRC (that was long ago).
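For a sense of scale, here is the arithmetic as a small Python sketch.
The 32 MiB journal and 100 GiB filesystem are assumed figures chosen
for illustration, not measurements from this thread.

```python
MIB = 2**20
GIB = 2**30


def journal_fraction(journal_bytes, filesystem_bytes):
    """Return the journal size as a fraction of the filesystem size."""
    return journal_bytes / filesystem_bytes


# Assumed example sizes: a 32 MiB journal on a 100 GiB filesystem.
fraction = journal_fraction(32 * MIB, 100 * GIB)
print(f"journal is {fraction:.4%} of the filesystem")
```

With those assumed sizes the journal comes to roughly 0.03% of the
filesystem, far too small to explain a 5% difference.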
Chris Knadle wrote:
> Mainly I think that this is space set aside for root for
> having a place to do emergency filesystem repair, like using
> debugfs or similar utilities. I've often wondered the same
> thing you have and on occasion I've set up a test box with 0%
> reserved space and nothing terrible happened, but then again
> I also didn't have a problem such that I'd need a space to
> work with debugfs, etc.
Fragmentation is the most common issue as filesystems fill up.
This is especially the case with user data filesystems where file
sizes vary greatly**. The 5% default marks roughly the free-space
threshold below which this is most likely to occur, although it
varies with the type and size of files, frequency, etc... I've
actually reserved 10% on some filesystems myself, especially if the
file sizes are quite large, or some operations require contiguous
blocks.
The other reason is as you alluded to -- in general -- you don't want
privileged operations at the mercy of non-privileged operations.
E.g., users filling up filesystems so nothing can be resolved.
That's really bad, especially for things like syslogd and other,
crucial operations. /tmp is another filesystem where bad things
happen when it fills up. So it's good that /tmp and/or /var have
such a reservation, or root (/) if they are not separate filesystems.
**NOTE: Some filesystems use Extents to manage varying file size
commits better from the standpoint of fragmentation. E.g., XFS does
this. Extents also introduce overhead, so they are not ideal when
the file sizes do not vary, or when there are lots of small files.
E.g., I would never use XFS for /tmp, /var, etc... ;)
The 5% default is a proliferated "good practice," although it can be
2% in some cases on very large Ext3 filesystems. I know many
ReiserFS advocates would disagree, and have even bashed e2fsprogs for
making this a default -- which they see as "legacy" -- but it has
saved my butt several times in just recent years as well. One of my
counter-arguments was always that if you really care about such
things, then both Ext3 and ReiserFS v3 (v4 not being out at the time)
have serious design deficiencies compared to other filesystems for
data and other, non-/, /tmp, /var filesystems (like XFS ;).
> i.e. I believe whether you need it or not is a question of
> your own administration needs rather than the filesystem's
> own requirements.
People have their own experiences. I don't tell anyone what they
should do, and I expect them not to tell me either. ;)
Some never segment filesystems, and rely on a single root (/)
filesystem and the reservation to keep /tmp and /var/log/messages
(and anything else targeted by syslogd) from being filled.
I started with SunOS and a few others long ago, so I'm biased toward
mounting different, segmented filesystems, although I differ --
heavily -- from most Solaris administrators at the same time on my
sizes. I may segment my filesystems so they are the same size, and I
deliberately do not make them small. But I always segment out /tmp
and /var (sometimes several /var/* and, if used, /srv/* filesystems).
In honesty, I like to minimize the writes to the root (/) filesystem
in general.
For Ext3, I not only sometimes change the default reservation, but I
also increase the ratio of inodes to data blocks (Ext3 pre-allocates
its inodes when the filesystem is created) for select filesystems.
This includes /var, which often has lots of small files. I've seen a
root (/) filesystem that was only 30% full but had 100% of its inodes
used, because of /var.
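That failure mode is easy to miss if you only watch block usage. Here
is a minimal Python sketch that reports both figures side by side,
assuming a POSIX system with os.statvfs. The helper names percent_used
and usage_for_path are hypothetical, chosen only for this example.

```python
import os


def percent_used(total, free):
    """Percentage of a resource in use, given total and free counts."""
    if total == 0:
        # Guard against filesystems that report a zero total.
        return 0.0
    return 100.0 * (total - free) / total


def usage_for_path(path):
    """Return (block % used, inode % used) for the filesystem at path."""
    st = os.statvfs(path)
    blocks = percent_used(st.f_blocks, st.f_bfree)
    inodes = percent_used(st.f_files, st.f_ffree)
    return blocks, inodes
```

A filesystem in the state described above would come back from
usage_for_path as roughly (30.0, 100.0): plenty of free blocks, but
no inodes left to create new files.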
We all have our own experiences.
--
Bryan J. Smith Professional, Technical Annoyance
b.j.smith at ieee.org http://www.linkedin.com/in/bjsmith
------------------------------------------------------
Fission Power: An Inconvenient Solution