Finding the best filesystem (for USB flash drive installs)

Post by pizzar0 » 16 Aug 2011, 20:07

No sweat!

I shall correct the apparent omissions in some of the points;

First and foremost, underlying all previous and present points is the following undeniable fact: one SHOULD NOT run ANY journaling file system on a loop device that is not backed by a block device (i.e. a file-backed loop) under Linux, due to the current implementation of the mainline loop driver!
[Ask Linus or anyone else in the know, for that matter. - @fanthom - This is one of those things you cannot just script your way out of!] There is no exception to this rule unless the loop driver itself is actually altered or replaced, including the accompanying util-linux, mount, etc. (which is what most have been doing since the beginning of time for that precise reason).

The reason I stressed the importance of the default file system choice is twofold:
1) With Porteus likely ending up as a frugal install on (removable) USB drives, due to its intended use, the file system choice will have a much bigger performance impact than it would in a regular, full installation on traditional (legacy) hardware.
1a) While the differences between the major file systems usually depend on how they are used, one thing has become apparent, and is well documented out there: EXT4 is undeniably "the" worst choice of all available file systems when running on flash media. (Just how "bad" can be debated, but that it is the worst is hardly questionable.)
2) There is wide and long-standing evidence that two main factors turn new users off any distribution (or anything else, for that matter):
A) An ugly, cumbersome, unintuitive user interface;
B) File system corruption and consequent data loss.

Now, if Porteus is used solely as intended, as a live system, essentially reliant on its read-only modules then the only obvious choice (under Linux) is ext2 by its very nature.
In our experience (with Slax), however, we found that unless the read-only, modular nature (as intended) can be easily and reliably maintained (WORKING modules created, downloaded, etc.), a great majority of new(er) users will simply continue to install (i.e. "dump") EVERYTHING into the persistence portion of their installation, where the choice and implementation of the default file system will make or break a distro!
In short, how Porteus is intended to be used and how it will most likely end up being used are two wildly different things! Unless, of course, the modules and their management are absolutely perfect, in both function and reliability, requiring minimal user interaction.
As long as the latter holds true (i.e. "perfect" in the true Biblical sense), ext2 it is, as there is no other intelligent choice. Said "perfection", however, is yet to be observed "in the wild"!
Just as importantly, where is the "average user" going to keep his mp3 library and baby/cat photos?? -Will there be a read-only module for those, too?

Quote:
were you using XFS as the native filesystem on your block devices, running it as the default filesystem inside the initrd and /dev/root (i.e., changing the default initrd/linuxrc, which I believe are set to create virtual ext2 systems for both Slax and porteus v1.0),


The XFS case is ONLY (naturally) applicable to "Persistence".
Ext2 remains for everything read-only. (Why in hell would anyone change that, to ext4 no less???)

Quote:
your first post in this thread included your mkfs and mount options for the Corsair Voyager drive, as follows:...
... I have tried to recreate this on my end, but have repeatedly failed to establish a loop device as the external log device


Let me correct some obvious omissions (see the first and most essential point): all the mount (and other) options illustrated for XFS ONLY hold with a loop backed by a block device (partition, etc.), as is always the case!

All that said, let me throw out a few sentences, just as keywords, to give some context to what may otherwise seem like a purely academic exercise:
- Currently only data centers and workstations use ext4, and not very often;
- USB is the obvious future choice because A) nobody wants to be wired; B) existing infrastructure; C) power requirements - versus SCSI; D) cloud computing; E) upcoming performance.
- Who is the "average user" and what does he want from a distro? Experts will continue to "roll their own" (modules) and beginners jump ship the moment they lose grandma's pictures. Who is Porteus for? Why will it make the top 10 on Distro Watch? (If that's a metric?)
(There is no such thing as "the average user" in OSS because those are called "windows users" and between them and Apple they constitute 97.2% of the "average" eyeballs.)
- Why/how did Mint come from obscurity and shoot to the top in record time? What did they do? (And it wasn't much radical innovation.)
- Why was Slax so successful, then suddenly stopped, and why is there still no Slax 7.0? (And no, it's not because Thomas is still hanging drywall.) Did the "average user" download it a few million times? (No!) Who did, and why?
Last edited by pizzar0 on 16 Aug 2011, 20:52, edited 1 time in total.

Post by Ahau » 16 Aug 2011, 20:50

I neglected to mention that fanthom's initrd (ext4 formatted) has the journal removed. More information on that is here:
viewtopic.php?f=44&t=747

I've not tested it yet, but will do so later today.

If you could please provide some rationale as to why EXT4 is the worst FS for use on a USB drive, I would be grateful (I haven't run into this statement before). I can document performance gains in EXT4 over EXT2 on my media, but I'm just getting into the reliability testing, so I suppose that could be a source of issues. I know that EXT4 doesn't sync data to disk as often as EXT3, but this may be outdated information. I know a lot of people use EXT2 because they don't want to have the I/O associated with journaling in EXT3 and EXT4, but this seems to be less of an impact (on performance, anyway) when running EXT4 in data=ordered mode.

EDIT: I'll agree that most users will continue to dump their data to the drive (either the native filesystem on the USB stick or the save.dat if they are using FAT or NTFS). You're correct that this increases the need for reliability when considering the filesystem for use in both instances.

Post by pizzar0 » 16 Aug 2011, 20:55

Ext4 is perfectly OK and good to go, of course, with the journal OFF. (It is/was all about the journaling, that being the entire issue, not block handling etc.)

"Why EXT4 is a "bad" choice for USB, etc?..."
Because it takes an immense hit (200-300% on average) in read performance if the physical sectors are misaligned! (It's almost even with XFS in the write performance penalty numbers.)
- And read performance is rather essential here, while sector alignment is less than guaranteed, or even likely, in the case of the "average user".

Post by fanthom » 17 Aug 2011, 08:23

some highlights:
the main reason for using ext4 for initrd is that i wanted to get rid of ext2 and ext3 from the kernel: they have been supported by the ext4 subsystem for the last few kernel releases, so this solution should be stable at this point.
ext4 without journal produces a smaller initrd and porteus boots marginally faster (measured in VBox: 0.64 sec until linuxrc starts for ext2 and 0.62 sec for ext4), so i have decided to give it a try.

it doesn't mean that i would recommend this combination for SSD/USB flash drives.
initrd is unpacked to RAM and its data is wiped at each reboot - users do not care what happens with its content. it is a totally different subject than saving data (with or without journal) on flash media.

btw: thanks for the input guys - this topic has plenty of valuable info.

Post by Ahau » 17 Aug 2011, 18:50

Thanks, fanthom!

@pizzar0, I previously tested differing alignments with ext4, so last night I offset my test partition again by 63 sectors and ran some additional tests in ext2 and XFS to compare the results with those I already had for ext4 (note: this uses default mkfs and mount options for each filesystem, so XFS is not using a delayed log, etc., and EXT4 is mounted in data=ordered mode, so metadata is journaled but actual data is not).

Percentages expressed below represent the drop in transfer rate between a partition that starts on an aligned sector (in this case, a sector that is a multiple of 8192), versus a partition that starts on a sector that is unaligned (in this case, a sector that is 63 greater than the sector used for the previous alignment):

Write speeds for a single large file (300MB ISO) fell for all three FS's, but not evenly.
XFS: 7.68%
EXT2: 21.449%
EXT4: 17.48%
Speeds when aligned: XFS: 11.59 MB/s EXT2: 5.34 MB/s EXT4: 12.02 MB/s

Write speeds for many, small files (untarring the kernel source tarball to the device) fell as well, but not as much (this is to be expected, because the tarball takes a while to extract in RAM before it starts writing to disk, and this delay is included in the total time for my tests):
XFS: 5.39%
EXT2: 1.76%
EXT4: 6.46%
Speeds when aligned: XFS: 1.22 MB/s EXT2: 2.58 MB/s EXT4: 5.38 MB/s

Write speed for creating a 128MB file full of zeros with a 1024k block size, through dd if=/dev/zero:
XFS: -0.56% (write speed actually went up a tiny bit, but well within the margin of error for these tests)
EXT2: 15.52%
EXT4: 17.85%
Speeds when aligned: XFS: 10.71 MB/s EXT2: 4.79 MB/s EXT4: 12.13 MB/s

Read speeds also went down --

Read speed for a single large file (300MB ISO):
XFS: 2.93%
EXT2: 4.94%
EXT4: 8.30%
Speeds when aligned: XFS: 20.46 MB/s EXT2: 20.64 MB/s EXT4: 20.03 MB/s

Read speed for many small files (copy the extracted kernel source files from the device to RAM):
XFS: -62.56% (again, XFS got better, and substantially so -- additional testing would be needed to find out if this is an anomaly in the test or some other variable -- the error is likely in the XFS aligned test, as that value is quite a bit slower than all other values for this test, and the unaligned speed is comparable to the performance of the other two FS's)
EXT2: 7.37%
EXT4: -6.30%
Speeds when aligned: XFS: 5.44 MB/s EXT2: 8.96 MB/s EXT4: 9.30 MB/s

Read speed for a 128MB file full of zeros w/1024k block size:
XFS: 1.30%
EXT2: -14.86%
EXT4: -8.36%
Speeds when aligned: XFS: 20.53 MB/s EXT2: 21.42 MB/s EXT4: 20.17 MB/s

Copy2ram speed (measures the speed of copying the contents of a loop-mounted module -- 000-kernel.xzm-- to a directory in RAM):
XFS: -0.16% (again an increase, but within the margin of error)
EXT2: -5.38% (increase, within the margin of error without running additional tests)
EXT4: 3.22%
Speeds when aligned: XFS: 23.50 MB/s EXT2: 22.26 MB/s EXT4: 23.71 MB/s
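
For reference, the percentages above follow the usual relative-drop formula, and the unaligned transfer rates (which are not listed) can be back-calculated from the aligned speed and the drop. This is a minimal Python sketch, assuming the drop was computed as (aligned - unaligned) / aligned; the function names are hypothetical, and the only inputs are the large-file write figures quoted in this post.

```python
def percent_drop(aligned_mb_s, unaligned_mb_s):
    """Drop in transfer rate, as a percentage of the aligned rate.

    A negative result means the unaligned run was actually faster.
    """
    return (aligned_mb_s - unaligned_mb_s) / aligned_mb_s * 100.0


def unaligned_speed(aligned_mb_s, drop_percent):
    """Back-calculate the unaligned rate from the aligned rate and the drop."""
    return aligned_mb_s * (1.0 - drop_percent / 100.0)


# Large-file write figures quoted above: (aligned MB/s, drop in %)
large_file_write = {
    "XFS": (11.59, 7.68),
    "EXT2": (5.34, 21.449),
    "EXT4": (12.02, 17.48),
}

for fs, (aligned, drop) in large_file_write.items():
    print(f"{fs}: {aligned:.2f} -> {unaligned_speed(aligned, drop):.2f} MB/s")
```

On these figures, EXT4 still writes a large file more than twice as fast as EXT2 when both are out of alignment, even though both lose a similar share of their aligned speed.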


Thus, on my media, for the tasks performed in my tests, EXT4 can be shown to slow down when out of alignment, but in most cases this is fairly comparable to the hit that EXT2 takes. XFS appears not to drop as much when put out of alignment, but perhaps this is because I lack the experience to correctly configure XFS for the particulars of my device. (Or maybe there are a certain number of sectors at the beginning of the partition that push the actual data out of alignment? I recently read some articles that discussed modifying the number of reserved sectors for FAT partitions to compensate for the area taken up by the boot sector and file allocation table at the beginning of the partition, so that the actual data starts on a flash block-aligned sector.) In any case, I don't see a speed reduction anywhere near the magnitude you mention above, especially in read speeds, as all filesystems performed about equally in the read-speed tests. Do you know what benchmarks (if any) were used to derive those results, and on what hardware? I know my tests are relatively simplistic, and don't test speeds for multithreaded applications. I'm also testing an extremely limited number of devices (2 thus far, though I have my eye on a 16GB Patriot Rage USB flash drive!).
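
The FAT trick mentioned above can be sketched as follows. On FAT32 the data area begins after the reserved sectors and the FAT copies, so padding the reserved-sector count moves the first cluster onto a chosen boundary. This is only an illustration: the 8192-sector boundary is the alignment unit used in these tests, while the partition start, FAT size and reserved-sector count in the example are made-up values, and the function names are hypothetical.

```python
def fat32_data_start(partition_start, reserved_sectors, num_fats, sectors_per_fat):
    """Absolute sector where the FAT32 data area (first cluster) begins."""
    return partition_start + reserved_sectors + num_fats * sectors_per_fat


def padded_reserved_sectors(partition_start, reserved_sectors, num_fats,
                            sectors_per_fat, boundary=8192):
    """Smallest reserved-sector count >= the given one that aligns the data area."""
    start = fat32_data_start(partition_start, reserved_sectors,
                             num_fats, sectors_per_fat)
    padding = (-start) % boundary  # sectors missing up to the next boundary
    return reserved_sectors + padding


# Made-up example: aligned partition start, 32 reserved sectors, two FATs
reserved = padded_reserved_sectors(8192, 32, 2, 3990)
print(reserved, fat32_data_start(8192, reserved, 2, 3990) % 8192)
```

In practice, changing the reserved-sector count can slightly change the number of clusters and therefore the FAT size, so a real formatting tool would have to recompute the layout rather than apply the padding once.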

My tests would indicate that EXT2 and EXT4 have more or less comparable read speeds, whether in or out of alignment (though EXT4 read speeds do dip a little below EXT2, particularly when out of alignment), but EXT4 provides much better write performance, especially when dealing with small files. I have not yet finished my "crash testing" for EXT4 or XFS (I'm about 3/4 done with EXT2, then will do EXT3 and EXT4), but I imagine both XFS and EXT4 will prove better at dealing with power failures and data recovery versus EXT2. Personally, I take stability and data loss more seriously than a slight drop in performance, so I'm looking forward to completing those tests (but, they are very time consuming as I crash each FS about 20 times, then have to reboot and check the data and repair it if necessary, representing greater than 40 boot cycles per filesystem tested).

Post by pizzar0 » 19 Aug 2011, 01:03

Finally a good investigative start! Good job.
Those numbers look familiar. (Let me guess: current Kingston <16GB USB/controller? Although you could've just told us... :)

... and here it is, in your own words - assuming that you used Parted > ver. 0.6 or sfdisk/fdisk > ver. 2.17 but Not cfdisk:
Quote:
maybe there are a certain number of sectors at the beginning of the partition that push the actual data out of alignment?

Yes, there are. -And Yes, "aligned" means what the built-in micro-controller decides it means, not what Parted says - although ver. >0.6 goes some way to fixing that.

ON AVERAGE(!!), running the same/similar tests as you've just completed on the 15-20 top-selling USB drives currently on the market, one will find that the "deterioration" of XFS's performance due to bad NAND chips, messed-up controllers, etc. is significantly less than that of EXT4; XFS is more fault tolerant, or "compensatory". In some quite common instances, the performance differences you have observed simply go off the charts. (Read penalty, EXT4 vs. XFS: >200% in favor of XFS.)
Of course, there remains XFS's bias toward working with large(r) files. (Its data-recovery potential still remains without peer!) Compare that to EXT4's preference to not work at all (for all practical purposes) on many currently available USB drives.

Post by Ahau » 19 Aug 2011, 02:59

Thanks -- you're confirming my fears (namely, that USB controllers are all over the board, and there really is no one best filesystem for use with all USB flash drives).

Yes, I should have specified, this is my 8 GB Kingston DT101 G2.

Here's the log for how I set up the offset partition for ext2:

Code:
bash-4.1# fdisk /dev/sdb

Command (m for help): p

Disk /dev/sdb: 7998 MB, 7998537728 bytes
128 heads, 32 sectors/track, 3814 cylinders, total 15622144 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x5a5a5a5a

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *        8192     4095999     2043904   83  Linux
/dev/sdb2         4096000    15622143     5763072   83  Linux

Command (m for help): d
Partition number (1-4): 2

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4, default 2): 2
First sector (2048-15622143, default 2048): 4096063
Last sector, +sectors or +size{K,M,G} (4096063-15622143, default 15622143):
Using default value 15622143

Command (m for help): p

Disk /dev/sdb: 7998 MB, 7998537728 bytes
128 heads, 32 sectors/track, 3814 cylinders, total 15622144 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x5a5a5a5a

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *        8192     4095999     2043904   83  Linux
/dev/sdb2         4096063    15622143     5763040+  83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.
bash-4.1# partprobe
bash-4.1# fdisk /dev/sdb

Command (m for help): p

Disk /dev/sdb: 7998 MB, 7998537728 bytes
128 heads, 32 sectors/track, 3814 cylinders, total 15622144 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x5a5a5a5a

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *        8192     4095999     2043904   83  Linux
/dev/sdb2         4096063    15622143     5763040+  83  Linux

Command (m for help): w   
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.
bash-4.1#
bash-4.1# mkfs.ext2 /dev/sdb2
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
360448 inodes, 1440760 blocks
72038 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1476395008
44 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736

Writing inode tables: done                           
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 20 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
bash-4.1#


This was using fdisk from util-linux 2.19 (x86_64 -- default for V1.0 Porteus 64 bit edition). I previously set the geometry to 128 heads / 32 sectors per track using cfdisk (fdisk seems to be buggy when it comes to setting the geometry). I've been setting my partitions to start on an even multiple of 8192 sectors--however, it sounds like some FS's (e.g. FAT32) will use an uneven number of sectors at the start of the partition (not the drive) for the FAT table and some unallocated space, before the first datablock. Thus, aligning your partition to an even sector might not actually line up your data blocks with the NAND pages... reference: http://www.patriotmemory.com/forums/sho ... ning-FAT32
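
A quick way to double-check the start sectors in the log above is to take them modulo the alignment unit. This is a minimal Python sketch, assuming 512-byte sectors (as fdisk reports) and the 8192-sector unit (4 MiB) used in these tests; the helper name is hypothetical.

```python
SECTOR_BYTES = 512          # logical sector size reported by fdisk
ALIGN_SECTORS = 8192        # alignment unit used in these tests (4 MiB)


def misalignment(start_sector, unit=ALIGN_SECTORS):
    """Sectors past the previous aligned boundary (0 means aligned)."""
    return start_sector % unit


for name, start in [("sdb1", 8192), ("sdb2 aligned", 4096000),
                    ("sdb2 offset", 4096063)]:
    off = misalignment(start)
    print(f"{name}: start={start} off_by={off} sectors ({off * SECTOR_BYTES} bytes)")
```

Note that this only checks the partition boundary. Whether the first data block inside the filesystem also lands on a boundary depends on the filesystem's own layout, which is the open question here.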

When I create an EXT partition, it tells me that the first data block is block 0, which I assume to mean that the first block in the partition lines up with the first sector of the partition on the disk; but I don't know that for sure. XFS doesn't really give me an indication:

bash-4.1# mkfs.xfs -f /dev/sdb2
meta-data=/dev/sdb2              isize=256    agcount=4, agsize=360190 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=1440760, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
bash-4.1#
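
The block counts reported by both mkfs runs can be cross-checked against the partition table. This is a minimal Python sketch using only numbers taken from the logs above; the variable names are my own.

```python
import math

SECTOR_BYTES = 512
BLOCK_BYTES = 4096

first_sector, last_sector = 4096063, 15622143
sectors = last_sector - first_sector + 1              # partition length in sectors
fs_blocks = sectors * SECTOR_BYTES // BLOCK_BYTES     # whole 4 KiB blocks that fit

# ext2: block groups and inode count as reported by mke2fs
blocks_per_group, inodes_per_group = 32768, 8192
block_groups = math.ceil(fs_blocks / blocks_per_group)
inodes = block_groups * inodes_per_group

# XFS: allocation groups as reported by mkfs.xfs
agcount, agsize = 4, 360190

print(sectors, fs_blocks, block_groups, inodes, agcount * agsize)
```

Both tools report the same 1440760 blocks, which is the whole partition minus one leftover 512-byte sector. This confirms the sizes only; it does not show where the data lands relative to the NAND pages.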


BTW, if you like the data I presented above...I have lots (and lots) more data where that came from :)

Post by pizzar0 » 19 Aug 2011, 04:07

Yes, that data has practical relevance, although mostly only in the case of Kingston.
But all may not be lost...

Here is something that may be worth pondering:

According to Zsolt Kerekes (he runs probably the most trusted developer/enterprise SSD tech portal on the web):
"You don't need to understand semiconductor physics to buy a new processor or a server - but at this particular generation in the evolution of the SSD market we're in today - you do have to be aware of a heck of a lot of technology and architectural concepts if you want to successfully leverage SSD technology in large scale deployments. Eventually that will change. But it could take another 3 years."

The above said, and true as all that is... Sandisk modeled its new PRO 2000 Enterprise series with the dm_raid (mainline) driver (linear RAID) and XFS. THEN they translated the whole combo into silicon and BAAAM!! They came out with their 2000 PRO series drives, the fastest SSD drives in history and 2x faster than their nearest competitor (550 MB/s r/w, encrypted by default). Probably the simplest of all designs, even according to Sandisk. Beautiful!
So, it is somewhat telling what can be achieved with the correct use and combination of tools, even if they are only emulated in software. (Which is how one gets >300 out of a $60 stick.)
One cannot help but wonder: what if someone threw together an optimized FS+controller combo that would probably work for ~80% of the USB sticks out there (more or less) and then dropped that into a "module"...
(just wondering...)