Finding the best filesystem (for USB flash drive installs)

Ahau
King of Docs
Posts: 1331
Joined: 28 Dec 2010, 15:18
Distribution: LXDE & Xfce 32/64-bit
Location: USA

Re: Finding the best filesystem (for USB flash drive install

Post#31 by Ahau » 11 Aug 2011, 02:48

Thanks, Blackrider. I have to do some more research to fully understand this, but it sounds like this could be a double strike against XFS on flash. In my limited understanding of flash devices under Linux, commands to write to a particular location are not explicitly obeyed. Instead, the flash controller tracks the physical locations of the data and maps them to logical locations. Data cannot be overwritten on flash; it must be erased entirely and then written again. As such, a command to overwrite would instead result in the data being marked as deleted on the drive (but not actually overwritten), and a bunch of 0's written to a new location.

If this is the case (and I'm not sure that it is), then you would lose direct access to your data, but a skilled attacker would likely still be able to get to it. Not very helpful, really.

Corrections or additions would be welcome...

I'll try some tests on various filesystems and report on them in my final analysis. Were you using a flash device when this occurred? Did you simply lose power during a write, and was it the files you were writing that were lost, or were any others? To simulate this, I'll install Porteus to a drive and pull it out while booting up, and then again after booting up while performing a write.

Thanks!
Please take a look at our online documentation, here. Suggestions are welcome!

BlackRider
Black ninja
Posts: 70
Joined: 13 Jul 2011, 11:04
Location: Nowhere

Post#32 by BlackRider » 11 Aug 2011, 09:27

Ahau wrote: As such, a command to overwrite would instead result in the data being marked as deleted on the drive (but not actually overwritten), and a bunch of 0's written to a new location.
The overwriting is not a protection against forensic attacks. It is a way to prevent the data from being taken by unprivileged users through the filesystem interface.

JFS uses the same method.

Reiserfs does not "zero" the files, but if a file is locked (in use) when the filesystem crashes, you can bet it will be corrupted.

The same goes for ext3/ext4 in writeback mode, where the consistency of the filesystem is guaranteed but files in use at the moment of the crash can easily be deleted or corrupted. In ordered mode, only files that are being overwritten are in danger (if a file is only being read, or is only having new data appended, there is not much chance of it getting corrupted). In journal mode, as most data is kept duplicated, corruption is extremely unlikely in any case. Writing speed is nearly halved for most scenarios that use journal mode.

If you love your data, I think the best option is to use ext3/4 in ordered or journal mode. Any other option offers little protection for individual files.
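As a rough sketch of how the three ext3/ext4 journaling modes discussed above are selected at mount time: the `data=` mount option takes `writeback`, `ordered`, or `journal`. The helper name `ext_mount_cmd`, the device `/dev/sdb2`, and the mount point `/mnt/usb` are placeholders chosen for illustration, not taken from this thread. The function only prints the command, so nothing is mounted.

```shell
#!/bin/sh
# Sketch: print (not run) the mount command for a chosen ext4 journaling mode.
# /dev/sdb2 and /mnt/usb are placeholder defaults, not values from the thread.

ext_mount_cmd() {
    mode=$1
    dev=${2:-/dev/sdb2}
    dir=${3:-/mnt/usb}
    case "$mode" in
        writeback|ordered|journal)
            # data= selects how file data is handled relative to the journal
            echo "mount -t ext4 -o data=$mode $dev $dir"
            ;;
        *)
            echo "unknown journaling mode: $mode" >&2
            return 1
            ;;
    esac
}

# Example: the default mode recommended in the post above
ext_mount_cmd ordered
```

Run the printed command as root once the device and mount point have been checked. The data mode is chosen when the filesystem is mounted, so switching modes means unmounting and mounting again.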
Ahau wrote: Were you using a flash device when this occurred? Did you simply lose power during a write, and was it the files you were writing that were lost, or were any others?
It was an internal drive. The data loss was caused by an improper shutdown script that did not stop a process that was writing to the files (I had backed up the data because I knew the shutdown script was not trustworthy. Don't ask). I cannot tell whether only the data that was being written was destroyed, or whether there was collateral damage.

Post#33 by Ahau » 11 Aug 2011, 16:35

Thanks for the clarifications and details!

I've also heard that Reiser is kind of easy to break...we'll see how easy :evil:

JFS was absolutely horrible (performance-wise) in my initial tests; I've dropped it from consideration, but I'm still hoping to find a reason to keep XFS as a contender.

Personally speaking, I think that data=ordered is the right balance between performance and data integrity, when matched with frequent backups (this ought to be done even with full journals, as a journal won't save you if you lose your drive, format it by accident, the hardware fails quickly, etc).

Post#34 by BlackRider » 11 Aug 2011, 20:35

I don't think there are many reasons to keep XFS as a filesystem for Porteus. XFS is a good option when you need to handle many big files (over 4 GB). As the data container will surely be tiny, it does not make much sense to use it as a filesystem hosted in a file smaller than 4 GB ... But please, keep benchmarking it!

Also, if you have had a look at the thread about cryptoloop, you will know that journaled filesystems are not a good idea with the current cryptographic implementation Porteus has. Cryptoloop+Journaling=Corruption+Cryptographical_Weakness.

Also, what version(s) of Reiserfs are you testing? Reiser4 seems to me a total disaster (you could not expect something better when the original author is jailed because he killed his wife), judging by the results of Linux Gazette's test I have been reading lately.

Post#35 by Ahau » 11 Aug 2011, 20:50

I've been testing Reiser3, as all tests have been done within Porteus, and support for Reiser4 is not built in. I have seen the thread about using journaled FS's in an encrypted, loop-back device. What I don't fully understand is whether that would apply to EXT4 with the default mount option (data=ordered) -- As this keeps the same kind of data in its log as XFS (logging metadata, but not the data itself), I'd imagine it would be OK. That said, my personal favorite so far has been a NILFS2 'save.dat' container image for saving changes on a FAT formatted device. Maybe someday relatively soon NILFS will be stable and that can be implemented in the default Porteus ISO :)

Post#36 by BlackRider » 11 Aug 2011, 22:58

One of the many reasons is that file backed loop devices ***will*** dead-lock under certain circumstances because "With file backed loop devices, correct write ordering may extend only to page cache (which resides in RAM) of underlying file system. VM can write such pages to disk in any order it wishes, and thus break write order expectation of journaling file system."
This means that any journaling loop filesystem is at risk, because nothing guarantees that the data will be written in the proper order under some circumstances.
Ahau wrote: What I don't fully understand is if that would apply to EXT4 with the default mount option (data=ordered) -- As this keeps the same kind of data in its log as XFS
Are you referring to the cryptoloop issue, or to usual corruption of files?

Ext3 and 4 don't zero the files because they use another method in order to avoid data leaks. I read it somewhere, but I can't remember how it was exactly.

I guess the cryptoloop issue is not OK. Ext3/4 is still vulnerable to having the metadata and the data itself written to the journal when it shouldn't.

pizzar0
White ninja
Posts: 28
Joined: 08 May 2011, 22:26
Location: Chritmas Island

Post#37 by pizzar0 » 15 Aug 2011, 17:56

OK, having no intent to beat a long dead horse too much further but here is a novel idea;
Aside from actually READING the posted links in "SSD Installation Considerations"...
...
Buffalo, Samsung, Sony, SGI and just about all others with any, significant, vested interest in the "usb/FS business" use the XFS file system WITH EXT. LOGs on their A) own servers; B) On the consumer products they sell by the millions. The Buffalo TeraStation could be special interest here since its implementation is entirely based on XFS + USB + External log, precisely because:
In a worst case scenario of the total loss of the Buffalo TeraStation, one would be able to recover the files on the XFS formatted Hard Drive provided the drive itself is intact.
Furthermore, they ALL use XFS on USB Drives with external logs AND use those drives also to hold the actual logs (meta data) itself on their R&D (workstation) and some portions of their data center (server) environments.
So, here is what I wonder: WHAT WOULD THAT TELL YOU?!?!...

Let me take a wild guess here and reflect on the fact that all the - above - "analysis" is based on, as well as its results derive from, the totally misguided notions that
A) One can run a journaling file system on a non-block device (such as: loop);
B) That the external meta-data (log) is only "mildly important";
C) The mounting options "dont really matter" in the case of XFS.

None of which is even remotely true, under any circumstance!
Maybe users got used to the idea that the default values provided for XFS are close to optimum, out of the box. All that may have been reasonably true up to ~2.6.30.x, but since Porteus appears to be running >2.6.38, that long-accepted notion simply does not hold water any more - at all! (One shining example that jumps to mind is the current way XFS handles barriers and the radical evolution of device inter-layer communication with respect to file systems such as XFS, starting around >2.6.33.)

A further example of a hard-to-break "belief" would be the notion that one could not boot "/" on XFS with external logs. (Really?!)
http://mindplusplus.wordpress.com/2008/ ... hing-an-i/

All in all, the only vested interest here is that since administering 1000+ nodes worldwide for >10 years, mainly running XFS, I'd very much welcome the opportunity to take my hat off and to shake the man's hand who could show me how anything (other than NILFS) would even come close in performance to that of XFS on any SSDs, USB included. Please, do not be shy!

p.s.
The XFS file system was invented as "recently" as 1994. Perhaps that's just not enough time to read up on it? (or to get the bugs fixed which, of course, would be much more easily done in EXT4 since it was conjured up as "long ago" as yesterday, mostly in an attempt to combine XFS's features with ext3. -NO?!)

Post#38 by Ahau » 15 Aug 2011, 22:45

Thanks, pizzar0, I was hoping you'd have some comments to share on this.

In response:

1) I did read the posted links in the SSD Installation Considerations thread. In terms of performance, what I gathered from them is that XFS underperforms most other FS's, and that was confirmed by my tests (but your results earlier in this thread show otherwise -- reaffirming your point that mount options DO matter). I tried and failed to configure XFS with an external log pointed to a loop device, as you suggested earlier in this thread (perhaps I need to do more to set up the loop device before I point to it as the external log device?). I would really like to experiment more with XFS, and I'll do more research into the mount options and how to implement them. That said, the goal of this thread and my research in general is to find the best filesystem to recommend to relatively new users (as experienced users will have already formed their own opinion on the matter, as you have); most of these users will be as unfamiliar with XFS mount options as I am, if not more, and most of them have absolutely no idea how their drives are structured internally, what their page size is, or how many channels their drive uses. That is why the default mkfs and mount options have been used to this point (further, the developers of XFS recommend the use of the default settings for most implementations: http://xfs.org/index.php/XFS_FAQ#Q:_I_w ... mething.3E). If I am able to learn and understand the relevant options and relay them in simple terms in my documentation, and if XFS performs well and is stable with those options present, then I will gladly recommend it as the filesystem of choice. This is and has been a learning process with a steep curve for me; that is why I have been working on it for so long without publishing my full results as of yet.

2) The fact that the manufacturers use XFS on their drives does indicate that they must have a solid reason for doing so. However, that fact, in and of itself, does not dictate that it is the best filesystem for use with running Porteus on USB flash. I'll need to know the reasons why they use XFS over other filesystems if I'm going to incorporate this information into my analysis. In the case of the Buffalo TeraStation, it appears that XFS is used on USB drives when backing up the information on the server -- is this use comparable to running Porteus from a flash drive? Are they backing up the data itself, or only copying the XFS metadata to the flash drive? Please understand, my only issue with XFS thus far is the performance when handling many small files; this is a minor issue in the overall scheme of things, but in my opinion, unless XFS performs better than other filesystems in other regards (such as reliability, repairability, ease of use, compatibility with bootloaders, filesize efficiency, degradation, etc), I don't yet see a compelling reason to use it.

3) Your wild guesses are incorrect in my opinion, per the following:

A) We have had some discussions about using non-block devices in this thread, but the main point of this thread is in regards to the best filesystem for use on a block device (i.e., running a native filesystem on flash drives). In my opinion, it is better to format your flash drive with a native POSIX compatible filesystem, and run Porteus (or at least your saved changes) from that partition, rather than using loop-backed, POSIX-formatted devices living on FAT or NTFS formatted drives. I do plan to perform additional research and testing on loop-backed devices, in order to see if we can find a better way to implement them for those users who do not want to reformat their partitions. That will be the subject of another thread, and another document, when I have time to put it together. XFS remains the prime candidate for this purpose, because I feel that we shouldn't change the status quo without doing some kind of analysis of the alternatives; I also plan to evaluate EXT2, EXT4 without a journal, and NILFS2 (though NILFS2 is still experimental, it has performed well in early tests, and is just plain fun to play with). I would love it if every Windows+Porteus user formatted their drive with a Windows-compatible partition in front and a POSIX partition in the back; however, many users will (and always will) prefer to keep their FAT stick as purely FAT, and they won't use Porteus if we force them to repartition their drive during installation. Save.dat's are an accommodation for these folks, and though it is far from perfect, it is leaps and bounds ahead of posixovl in terms of reliability. I don't believe we've received any complaints about corrupted or otherwise deadlocked save.dat's.

B) I don't believe that meta-data is only "mildly important". To the contrary, I believe that it is very important, and this is why I need to understand the use of an external log device better -- I need to understand how to implement it in Porteus if we are going to use it, and how the meta-data is saved and used when put on an external device (assuming it is saved to a loop device), thus affecting the reliability of the filesystem. That said, the developers of XFS seem to feel that using an external log is more likely to cause corruption issues, as it disables the use of barriers: http://xfs.org/index.php/XFS_FAQ#Q:_How ... e_cache.3F Is their FAQ out of date, or am I misunderstanding the issues?

C) The mounting options DO matter, especially with XFS, as XFS seems to have more extensive and valuable mount options than most other FS's. In fact, I have results that show substantial performance differences (for better and for worse), by manipulating the mount options. That said, if it requires expert knowledge to set up an XFS partition that is customized for the aspects of individual drives, this should be set apart as a separate document, and written by someone who is an expert with XFS, for those who want to become experts with it. If there are general settings that would be beneficial when applied in all cases (such as the use of a delaylog, which will become moot once we advance to a newer kernel), this might be more doable. I'll continue my research and experimentation (right now, I'm testing the behaviour of various filesystems when I deliberately crash the system, by pulling the flash drive out while reading/writing. I'll work with XFS more while performing these tests, attempt to use , and run my performance tests again while I have the drive formatted to XFS).
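The pull-the-drive tests mentioned above could be made repeatable with something like the following sketch. It writes a baseline set of small files, records their checksums in a manifest, and later checks the files against that manifest. The function names, file naming scheme, and example paths are all hypothetical; `md5sum` is assumed to be available. Because the baseline is synced before the drive is pulled, a failed check afterwards would point to damage in files that were not being written at the time of the crash.

```shell
#!/bin/sh
# Hypothetical helpers for a pull-the-plug test; names and layout are illustrative.

# make_testset DIR COUNT MANIFEST
# Writes COUNT small files into DIR and records their checksums in MANIFEST.
# MANIFEST must be an absolute path, ideally on a different drive than DIR.
make_testset() {
    dir=$1
    count=$2
    manifest=$3
    mkdir -p "$dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'test file %s\n' "$i" > "$dir/file_$i.txt"
        i=$((i + 1))
    done
    (cd "$dir" && md5sum file_*.txt) > "$manifest"
    sync   # flush the baseline to the device before any crash is provoked
}

# verify_testset DIR MANIFEST
# Returns 0 only if every listed file exists and still matches its checksum.
verify_testset() {
    (cd "$1" && md5sum -c "$2" >/dev/null 2>&1)
}

# Example session (paths are placeholders):
#   make_testset /mnt/usb/baseline 100 /root/baseline.md5
#   ...start a large copy to /mnt/usb, pull the drive, replug, remount...
#   verify_testset /mnt/usb/baseline /root/baseline.md5 && echo "baseline intact"
```

Keeping the manifest off the drive under test matters here, since a manifest stored on the crashed filesystem could itself be lost or damaged.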

All in all, I can certainly understand your frustration with regards to my inexperience with this topic. It's frustrating for me, too :D I know that XFS can do more, but I have been unable to get it to perform as I wish. If you can help me create and mount an XFS partition that beats EXT4 and BTRFS, then it is I who would take my hat off to shake your hand!

Posted after 47 minutes 24 seconds:
For posterity's sake, I just found the following information, that might give some more detail on the 'deadlocking' issue:

Author: Dave Chinner <david@fromorbit.com>
Date:   Wed Feb 17 05:36:29 2010 +0000

    xfs: Non-blocking inode locking in IO completion

    The introduction of barriers to loop devices has created a new IO
    order completion dependency that XFS does not handle. The loop
    device implements barriers using fsync and so turns a log IO in the
    XFS filesystem on the loop device into a data IO in the backing
    filesystem. That is, the completion of log IOs in the loop
    filesystem are now dependent on completion of data IO in the backing
    filesystem.

    This can cause deadlocks when a flush daemon issues a log force with
    an inode locked because the IO completion of IO on the inode is
    blocked by the inode lock. This in turn prevents further data IO
    completion from occuring on all XFS filesystems on that CPU (due to
    the shared nature of the completion queues). This then prevents the
    log IO from completing because the log is waiting for data IO
    completion as well.

    The fix for this new completion order dependency issue is to make
    the IO completion inode locking non-blocking. If the inode lock
    can't be grabbed, simply requeue the IO completion back to the work
    queue so that it can be processed later. This prevents the
    completion queue from being blocked and allows data IO completion on
    other inodes to proceed, hence avoiding completion order dependent
    deadlocks.
I found this information here: https://bugzilla.redhat.com/show_bug.cgi?id=667707 Not sure how much it helps, but I think I understand the issue a bit better now, or at least how the problem can arise.

Post#39 by pizzar0 » 16 Aug 2011, 06:30

I totally sympathize with your efforts and the above points. I mean it!

WHICH IS WHY my entire point - tirade, "wonderings", monologue, rants, tangents - can be summarized in one single paragraph, in the form of a question:

Would it not be important to Porteus' ultimate success to find, select AND implement (as the "Default") the optimum file system for the intended use of THIS specific Distribution?

That's all! - The true essence of what I am attempting to convey, no more no less.

Also, to quickly address the "outdated" description of barriers vs no barriers by the xfs designers... Yes, it is very much outdated and no longer holds as it is stated!
Here is a tad interesting bit about XFS's code development:
http://sandeen.net/wordpress/?p=532

In our experience, including running Slax, starting from 5.x on, on several hundred server and workstation installations, XFS possesses all the features and has proven itself time after time in real-world implementations, far superior to all other commonly available Linux FS implementations, PROVIDED it is implemented and configured optimally.
Which is why it should NOT be left up to the average user to configure the default file system - out of the box - because it simply has too many gotchas and rather diverse, mostly questionable documentation.
But read it for what it is/says: "in our experience". (Although rather in-depth, as one may observe, given the unusual amount of time and resources we were allotted to investigate.)

Opinion-wise... Everyone is entitled to their own opinion but not to their own facts!
A little fact checking in all the right places - Intel, Microsoft, Samsung, SGI, Kingston, SuperMicro, etc. - should leave very little to be desired. No?! (They pretty much invent, design and manufacture all that is at issue.)

That "the manufacturers' preference in and of itself does not matter..." I would rather strongly disagree with that - on more levels than I could count.
OSS is a beautiful thing! -But let's refrain from being blinded by "purity" and "intentions" alone because when SGI put close to $40 Mill. into the XFS development (in 1992-1994 Dollars), they were not exactly just bored, looking for a hobby. Which is why by far the most (important?) applications which eventually make their way into the public domain originate in the "big houses" and usually evolve from proprietary roots. (Democracy is nice but it only works to a point and after some point.)
Most importantly, the whole business "near the metal [hardware]" is not only rather complex and difficult but it is also a very unforgiving, low margin undertaking, in most cases. E.g. there is very little reason to take it for granted that just because it's a big company with proprietary product intentions, they set out from the get go to trick you "just in case". How likely is that? (I'm just rationalizing here without any vested interests any which way.) Is it not reasonable to assume that they actually want whatever it is they are trying to sell to work? (Like the TeraStation, etc.) How it all turns out, of course, is case dependent.

Which is why, in my experience/knowledge/observations, if Patrick Volkerding (or Ubuntu, or Microsoft, or Redhat, or...) were to release a purely live Distro, I can't help but to think that the default file system issue/selection/implementation would be no lower than 1-3 place on their list of "really important things to do", pretty much before all else. No?!

Now, in order to keep this post from being just another academic rant, here is some practical observation;
To know a given manufacturer's device micro-controller/algorithm (in the case of USB devices) is almost the entire game. Sometimes it's revealed and often it remains hidden - by design or due to a lack of will. ANY file system will only be able to do so much - and usually far less than one likes to imagine - to run optimally on a particular hardware.
However, much like in the case of some girl you just met and she likes to do the things you like to be done to you, as much as you like to receive them... you marry her right then and there! -The same could be said for SSD devices, as it stands.
If one can find out the (innate) particulars of a given manufacturer's device, chances are that something more can be achieved with the right selection of the file system - the more flexible the better; XFS?!
Here is how; In the case of the current Corsair Voyager (USB 3.0) 16 GB drive on the market the 0x050 section in the boot sector DOES contain/hint at that particular drive's (cell) layout and controller implementation. That means that there is a rather ingenious way (one might even call it simple) to set up a simple Linear RAID configuration (including XFS tuned to it) and have the RAID+XFS combo reasonably closely tuned to the device's characteristics.
It may sound complicated at first read but the prize in this case would be the 333 MB/s sustained read/write one can observe on everyday hardware.
Of course, there is no such thing as a "free lunch" so what is applicable, how and when varies. (Not in this particular case, though!) So the only question is whether one likes this particular device and is willing to invest a nominal amount of time into the setup to out-run just about any known drive out there.
Of course, this has nothing to do with Distro design, in general but rather an attempt to illustrate the complexities of the issue at hand.

p.s. To satisfy my own curiosity I have double-checked our logs for ALL the Slax/XFS installs (servers and workstations) since 01/01/2009. (80 -> 400 machines, as the numbers increased over time.) There was not a single unrecoverable file system corruption/failure in that time. Maybe we are just really (really!!) lucky?!...

Post#40 by Ahau » 16 Aug 2011, 17:36

Thanks, pizzar0! Please understand that I very much appreciate your comments and contributions -- my only goal here is to understand all of this better, to improve my own knowledge and to share what I learn with the rest of the community. You have been a great help towards this goal.

To answer your single question, I think that it would be of moderate importance to Porteus' ultimate success to find and select the optimum file system for the intended use of this specific distribution (this is in my opinion only). Why do I think it is of only moderate importance, and why do I think it should not necessarily be implemented as a default during installation? Because at the end of the day, if we're talking about the native filesystem for a flash drive on which Porteus is installed, it is of only moderate importance, and highly open to user customization without much risk. This is because the majority of the system is constructed from read-only modules, so the internals of the initrd and the live filesystem (not to mention the content thereof) have much more to do with the actual usability, performance and end-user experience of Porteus. Furthermore, many users will never get to the point where they will want to repartition their drive away from Windows compatibility, so we ought to remain flexible enough for users to customize their choices in regards to their native filesystem of choice (to this point, let me add that read speed ultimately will be of more importance to most users than write speed, because Porteus reads from the disk far more often than it writes to it; XFS is in a virtual tie with most other filesystems in this regard, as read speed seems to be much more a function of hardware capabilities than filesystem attributes). The question that naturally arises then is, "If this is not really that important, why the hell is Ahau spending so much god damn time on it?" I don't have a good answer myself (I wish I did!), other than I'm curious, and I want to find the answer, and the more time I spend on it without a resolution, the more it bugs me. Also, I want to share my findings with the rest of the community, so that they don't have to spend a month doing what I'm doing, in order to answer a complex question of minor importance.

Thank you for clarifying about the barriers vs no barriers comments by the designers :) See the end of my post for more on this.

Also, thank you for explaining your experiences with Slax and XFS -- were you using XFS as the native filesystem on your block devices, running it as the default filesystem inside the initrd and /dev/root (i.e., changing the default initrd/linuxrc, which I believe are set to create virtual ext2 systems for both Slax and Porteus v1.0), or both? I ask the question because expert configuration can be applied to the live filesystem on the administration end (in fanthom's latest development release, the initrd was updated to be an ext4 image instead of ext2, so now would be the time to dig into this question and resolve which is really best), but this becomes more problematic on block devices when we have to factor in various device specifications and user experience/knowledge levels (I think we've sufficiently beaten that dead horse to the point where we agree on this).

I agree that we are all entitled to our own opinions, and not our own facts -- I have seen a lot of opinions about flash drives, alignment, geometry, and filesystems, but not a lot of reliable facts on the internet -- that is why I have set out to generate some testable conclusions backed by reproducible data. I should have worded my statement about "the manufacturer's preferences" more carefully -- my intent was not to discard the opinion of the manufacturers. Rather, I wanted to reinforce the need to understand the reasoning behind why they made their decisions and in what instances they are applying them, so that I could evaluate (to the best of my ability) whether or not those same conclusions are valid for our uses. If, for example, they are using XFS because they are primarily working on enterprise class servers with huge files, and are backing up their logs to USB devices, this is a much different situation than running a linux-live distribution on a flash drive. You've explained that they use it in many instances, but I would like to know if it was selected for reasons of performance, reliability, compatibility with other hardware, etc. (note that this is a hypothetical statement, and I'm not asking for or expecting some kind of detailed report from you -- but if you have some more links that point to this, I'd gladly read them, and I'll do some more research on my own)

I don't have any illusions that OSS is naturally better than commercially developed products, nor would I imagine that SGI was out to trick anybody. I entered this experiment with no knowledge of the origins of XFS or its speed and reliability; my conclusions have been based on the data produced in my (admittedly simple) experiments and the information I have been able to gather elsewhere. I have no vested interests or axe to grind against XFS or big corporations (I work for one myself).

I couldn't agree more with you that the characteristics of the device are of far more importance than which filesystem you choose, when discussing performance issues. Yes, experts can and will tweak devices to obtain awesome performance (as you illustrate), but by and large, better and newer hardware will outperform older and crappier hardware, regardless of how you format the drive.

What I'm after here is a simple answer to a complex question. If a (relatively) new user asks, "what filesystem should I put on my flashdrive, for use with Porteus", I want to be able to provide a somewhat generic, easy to implement response, such as, "In my experience, XXX filesystem provides the best performance, but XXX filesystem is somewhat more reliable in cases of power failure, but in all cases, you're better off using XXX or YYY than using FAT with a save.dat"

Thank you as well for your discussion of the Corsair Voyager (333MB--sweet!) and your experience with Slax installed on XFS -- it's useful to know how reliable XFS has been on so many systems! This is one area where it is especially difficult for me to gather data myself. I can force a number of crashes and document the results, but I simply don't have the time necessary to gather long term data on reliability in normal use for each filesystem.



And finally, back to the external log and barriers discussion: your first post in this thread included your mkfs and mount options for the Corsair Voyager drive, as follows:
mkfs.xfs -f -l logdev=/dev/loop(X-1),size=128m,lazy-count=1 -d agcount=32
mount -t xfs -o noatime,nodiratime,logdev=/dev/loop9,logbufs=8 /dev/loop10 /VoyTest
I have tried to recreate this on my end, but have repeatedly failed to establish a loop device as the external log device. I've also been able to find little to no information on the web about how to set this up (primarily, I find information about putting the external log on a block device). Could you please provide some more detail on this process, so that I can recreate it and incorporate the results into my analysis?
I assume that /dev/loop(X-1) refers to a loop device that is one lower than the loop device you've established for your filesystem (which itself resides on a loop-backed image file). If I do nothing to establish /dev/loop18, for example, mkfs.xfs will not accept it as the external log device. If I create an image file and mount it on /dev/loop18, it fails again (but tells me that the specified device contains a formatted filesystem). To clarify, I am trying to create an actual partition on my flashdrive (e.g. /dev/sdb2), not a loop-backed image file, but I want the log to go someplace else. I guess my ultimate question is this: if you really are putting your log on a loop device that isn't tied to a separate block device, where does the log go, in case of a power failure? And, if you are placing the log on a separate block device, it seems that you would be limited to only booting Porteus on that one machine (or I suppose you could put the log on another flash drive, and use both in concert for each machine).

Using a higher agcount seems to slow down my drive (I think it's probably a dual channel drive, and mkfs.xfs defaults to 4, which would be the appropriate amount per your initial instructions). I realize how this would be beneficial if I had a quad channel (or higher) device -- your Corsair drive seems to have 16 channels -- egads, that's quick!

Again, many thanks for your participation and help!

Posted after 51 minute 47 seconds:
Re: Finding the best filesystem (for USB flash drive installs)
Some interesting, if outdated info on filesystem reliability, from Toshiba, recorded here for posterity:

http://elinux.org/images/e/e2/Evaluatio ... ayashi.pdf

pizzar0
White ninja
Posts: 28
Joined: 08 May 2011, 22:26
Location: Chritmas Island

Re: Finding the best filesystem (for USB flash drive install

Post#41 by pizzar0 » 16 Aug 2011, 19:07

No sweat!

I shall correct the apparent omissions in some of the points;

First and foremost, underlying any and all previous and present points is the following, undeniable fact: One SHOULD NOT run ANY journaling file system on a non-block device backed loop, under Linux, due to the current implementation of the main-line loop driver!!!
[Ask Linus or anyone else in the know, for that matter. - @fanthom - One of those things you can not just script your way out of!] There is no exception to this rule unless the loop driver itself is physically altered or replaced, including the accompanying util-linux, mount etc. (Which is what most have been doing since the beginning of time for that precise reason.)

The reason I proposed the perceived importance of the Default File System choice is twofold:
1) With Porteus likely ending up in a frugal install on (removable) USB drives, due to its intended use/nature, the correct file system choice will have a much bigger performance impact than on a regular, full installation on traditional (legacy) hardware.
1a) While the differences between major file systems usually depend on their use, the one thing that became apparent - and is well documented out there - is that EXT4 is undeniably "The" worst choice of all available FS when running on flash media. (Just how "bad" can be debated, but that it is the worst is hardly questionable.)
2) There is wide and long-standing evidence that there are two main factors which turn off new users from any distribution (or anything else, for that matter):
A) An ugly, cumbersome, unintuitive user interface;
B) A file system corruption and consequent data loss.

Now, if Porteus is used solely as intended, as a live system, essentially reliant on its read-only modules then the only obvious choice (under Linux) is ext2 by its very nature.
In our experience (with Slax), however, we found that unless the read-only, modular nature (as intended) can be easily and reliably maintained (WORKING modules created, downloaded, etc.), a great majority of new(er) users will simply continue to install (e.g. "dump") EVERYTHING into the persistence portions of their installations, where the choice and the implementation of the default file system will make or break a distro!!
In short, how Porteus is intended to be used and how it most likely will end up being used are two wildly different things!! -Unless, of course, the modules and their management are absolutely perfect, including function and reliability, requiring minimal user interaction.
As long as the latter holds true - e.g. "perfect" in the true Biblical sense - then ext2 it is, as there is no other intelligent choice. Said "perfection", however, is yet to be observed "in the wild"!
Just as importantly, where is the "average user" going to keep his mp3 library and baby/cat photos?? -Will there be a read-only module for those, too?
were you using XFS as the native filesystem on your block devices, running it as the default filesystem inside the initrd and /dev/root (i.e., changing the default initrd/linuxrc, which I believe are set to create virtual ext2 systems for both Slax and porteus v1.0),
The XFS case is ONLY (naturally) applicable to "Persistence".
Ext2 remains for everything read-only. (Why in hell would anyone change that, to ext4 no less???)
your first post in this thread included your mkfs and mount options for the Corsair Voyager drive, as follows:...
... I have tried to recreate this on my end, but have repeatedly failed to establish a loop device as the external log device
Let me correct some obvious omissions - see the first and most essential point - in that all the mount (and other) options, as illustrated for XFS, ONLY hold with a block device (partition, etc.) backed loop - as is always the case!
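One possible reading of "block device backed loop" is sketched below. This is an assumption for illustration, not a confirmed record of the setup used on the Corsair Voyager: the partitions /dev/sdb2 and /dev/sdb3, the helper name print_xfs_loop_plan, and the use of /dev/loop10 as the mkfs target are all hypothetical. The mkfs and mount options are the ones quoted earlier in the thread. The script only prints the commands; running them needs root and would destroy whatever is on the two partitions.

```shell
# Print (do not run) the commands for an XFS filesystem whose data device and
# external log both sit on loop devices backed by real partitions.
# Every device name here is a placeholder chosen for illustration.
# Usage: print_xfs_loop_plan [DATA_PART] [LOG_PART] [DATA_LOOP] [LOG_LOOP]
print_xfs_loop_plan() {
    data_part=${1:-/dev/sdb2}
    log_part=${2:-/dev/sdb3}
    data_loop=${3:-/dev/loop10}
    log_loop=${4:-/dev/loop9}
    cat <<EOF
losetup $log_loop $log_part
losetup $data_loop $data_part
mkfs.xfs -f -l logdev=$log_loop,size=128m,lazy-count=1 -d agcount=32 $data_loop
mount -t xfs -o noatime,nodiratime,logdev=$log_loop,logbufs=8 $data_loop /VoyTest
EOF
}

# Show the plan with the placeholder defaults.
print_xfs_loop_plan
```

With both loops tied to partitions on the same stick, the log would travel with the drive, which speaks to the question of where the log lives after a power failure.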

All that said let me throw out a few sentences, just in lead-words, trying to give some context to what otherwise may seem like a purely academic exercise;
- Currently only data centers and workstations use ext4, and not very often;
- USB is the obvious future choice because A) nobody wants to be wired; B) existing infrastructure; C) power requirements - versus SCSI; D) cloud computing; E) upcoming performance.
- Who is the "average user" and what does he want from a Distro? Experts will - continue to - "roll their own" (modules) and beginners jump ship the moment they lose grandma's pictures. Who is Porteus for? Why will it make top 10 on Distro Watch? (If that's a metric?)
(There is no such thing as "the average user" in OSS because those are called "windows users" and between them and Apple they constitute 97.2% of the "average" eyeballs.)
- Why/How did Mint come from obscurity and shoot to the top in record time? What did they do? (And it wasn't much radical innovation.)
- Why was Slax so successful, then suddenly stopped, and there is still no Slax 7.0 yet? (And no, it's not because Thomas is still hanging drywall.) Did the "average user" download it a few million times? (No!) Who did, and why?
Last edited by pizzar0 on 16 Aug 2011, 19:52, edited 1 time in total.

Ahau
King of Docs
Posts: 1331
Joined: 28 Dec 2010, 15:18
Distribution: LXDE & Xfce 32/64-bit
Location: USA

Re: Finding the best filesystem (for USB flash drive install

Post#42 by Ahau » 16 Aug 2011, 19:50

I neglected to mention that fanthom's initrd (ext4 formatted) has the journal removed. More information on that is here:
http://porteus.org/forum/viewtopic.php?f=44&t=747

I've not tested it yet, but will do so later today.

If you could please provide some rationale as to why EXT4 is the worst FS for use on a USB drive, I would be grateful (I haven't run into this statement before). I can document performance gains in EXT4 over EXT2 on my media, but I'm just getting into the reliability testing, so I suppose that could be a source of issues. I know that EXT4 doesn't sync data to disk as often as EXT3, but this may be outdated information. I know a lot of people use EXT2 because they don't want to have the I/O associated with journaling in EXT3 and EXT4, but this seems to be less of an impact (on performance, anyway) when running EXT4 in data=ordered mode.

EDIT: I'll agree that most users will continue to dump their data to the drive (either the native filesystem on the USB stick or the save.dat if they are using FAT or NTFS). You're correct that this increases the need for reliability when considering the filesystem for use in both instances.

pizzar0
White ninja
Posts: 28
Joined: 08 May 2011, 22:26
Location: Chritmas Island

Re: Finding the best filesystem (for USB flash drive install

Post#43 by pizzar0 » 16 Aug 2011, 19:55

Ext4 is perfectly OK and good-to-go, of course, with the journal OFF. (It is/was all about the "journaling", it being the entire issue - not block handling etc.)
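For reference, a minimal sketch of what "journal OFF" can look like in practice, assuming e2fsprogs (mkfs.ext4, dumpe2fs) is installed. It works on a throwaway image file rather than a real partition; the function names and the image path are hypothetical.

```shell
# Create a small ext4 image with the has_journal feature disabled.
# Usage: make_ext4_nojournal IMAGE_FILE [SIZE_MB]
make_ext4_nojournal() {
    img=$1
    size_mb=${2:-16}
    # Zero-filled backing file, then format it; -F is needed because the
    # target is a regular file, not a block device.
    dd if=/dev/zero of="$img" bs=1024k count="$size_mb" 2>/dev/null &&
        mkfs.ext4 -q -F -O '^has_journal' "$img"
}

# Print the "Filesystem features" line of an ext2/3/4 image or device.
# If has_journal is missing from that line, the filesystem has no journal.
fs_features() {
    dumpe2fs -h "$1" 2>/dev/null | grep -i '^Filesystem features'
}

# Example (the image path is a placeholder):
# make_ext4_nojournal /tmp/nojournal.img && fs_features /tmp/nojournal.img
```

On an existing, unmounted ext4 filesystem the journal can also be dropped afterwards with tune2fs -O '^has_journal' followed by the device name.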

"Why EXT4 is a "bad" choice for USB, etc?..."
Because it takes an immense hit (200-300% on average) in Read Performance if the physical sectors are misaligned! (It's almost even with xfs in the write performance penalty numbers.)
- And read performance being rather essential here while sector alignment is less than guaranteed, or even likely, in the case of the "average user".

fanthom
Moderator Team
Posts: 5666
Joined: 28 Dec 2010, 02:42
Distribution: Porteus Kiosk
Location: Poland
Contact:

Re: Finding the best filesystem (for USB flash drive install

Post#44 by fanthom » 17 Aug 2011, 07:23

some highlights:
the main reason for using ext4 for initrd is that i wanted to get rid of ext2 and ext3 from the kernel: they have been supported by the ext4 subsystem for a few kernel releases now, so this solution should be stable at this point.
ext4 without journal produces smaller initrd and porteus boots marginally faster (measured in Vbox: 0,64 sec till starting linuxrc for ext2 and 0,62 sec for ext4) so i have decided to give it a try.

it doesn't mean that i would recommend this combination for ssd/USB flash drives.
initrd is unpacked to RAM and its data are wiped at each reboot - users do not care what happens with its content. it is a totally different subject than saving data (with or without journal) on a flash media.

btw: thanks for the input guys - this topic has plenty of valuable info.
Please add [Solved] to your thread title if the solution was found.

Ahau
King of Docs
Posts: 1331
Joined: 28 Dec 2010, 15:18
Distribution: LXDE & Xfce 32/64-bit
Location: USA

Re: Finding the best filesystem (for USB flash drive install

Post#45 by Ahau » 17 Aug 2011, 17:50

Thanks, fanthom!

@pizzar0, I previously tested differing alignments with ext4, so last night I offset my test partition again by 63 sectors and ran some additional tests in ext2 and XFS to compare the results with those I already had for ext4 (note: this uses default mkfs and mount options for each filesystem, so XFS is not using a delayed log, etc., and EXT4 is mounted in data=ordered mode, so metadata is journaled but actual data is not).
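A quick way to sanity-check alignment, sketched under the assumption that "aligned" means the partition's start sector is a multiple of 8192 sectors (the boundary used for these tests). The function name is made up for illustration; on Linux the start sector can typically be read from /sys/block/DISK/PARTITION/start.

```shell
# Report how many sectors a partition start lies past the nearest lower
# alignment boundary. A result of 0 means the start sector is aligned.
# Usage: alignment_offset START_SECTOR [BOUNDARY_SECTORS]
alignment_offset() {
    start=$1
    boundary=${2:-8192}   # default boundary assumed from the tests here
    echo $(( start % boundary ))
}

# An aligned start, and the same start pushed out by 63 sectors.
alignment_offset 8192    # prints 0
alignment_offset 8255    # prints 63
```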

Percentages expressed below represent the drop in transfer rate between a partition that starts on an aligned sector (in this case, a sector that is a multiple of 8192), versus a partition that starts on a sector that is unaligned (in this case, a sector that is 63 greater than the sector used for the previous alignment):

Write speeds for a single, large file (300MB ISO) fell for all three FS's, but not evenly.
XFS: 7.68%
EXT2: 21.449%
EXT4: 17.48%
Speeds when aligned: XFS: 11.59 MB/s EXT2: 5.34 MB/s EXT4: 12.02 MB/s
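The aligned speed and the percentage drop are enough to recover the unaligned speed. A small sketch of that arithmetic, assuming the drop is defined as (aligned - unaligned) / aligned * 100, which matches the description above; a negative drop then means the unaligned run was faster. The function names are hypothetical.

```shell
# Percentage drop in transfer rate going from an aligned to an unaligned
# partition. Negative output means the unaligned run was faster.
# Usage: percent_drop ALIGNED_MBPS UNALIGNED_MBPS
percent_drop() {
    LC_ALL=C awk -v a="$1" -v u="$2" 'BEGIN { printf "%.2f\n", (a - u) / a * 100 }'
}

# Inverse calculation: the unaligned speed implied by an aligned speed
# and a percentage drop.
# Usage: unaligned_speed ALIGNED_MBPS DROP_PERCENT
unaligned_speed() {
    LC_ALL=C awk -v a="$1" -v d="$2" 'BEGIN { printf "%.2f\n", a * (1 - d / 100) }'
}

# XFS large-file write: 11.59 MB/s aligned with a 7.68% drop.
unaligned_speed 11.59 7.68    # prints 10.70
```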

Write speeds for many, small files (untarring the kernel source tarball to the device) fell as well, but not as much (this is to be expected, because the tarball takes a while to extract in RAM before it starts writing to disk, and this delay is included in the total time for my tests):
XFS: 5.39%
EXT2: 1.76%
EXT4: 6.46%
Speeds when aligned: XFS: 1.22 MB/s EXT2: 2.58 MB/s EXT4: 5.38 MB/s

Write speed for creating a 128MB file full of zero's with a 1024k block size, through dd if=/dev/zero:
XFS: -0.56% (write speed actually went up a tiny bit, but well within the margin of error for these tests)
EXT2: 15.52%
EXT4: 17.85%
Speeds when aligned: XFS: 10.71 MB/s EXT2: 4.79 MB/s EXT4: 12.13 MB/s
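A sketch of this kind of dd write test. The function name, the target path, and the conv=fsync flag are assumptions added for illustration, not the exact command behind the numbers above; conv=fsync is included so that dd flushes the data to the device before it reports a transfer rate.

```shell
# Write SIZE_MB megabytes of zeros to TARGET using a 1024k block size.
# dd reports the elapsed time and transfer rate on stderr when it finishes.
# Usage: write_zero_file TARGET [SIZE_MB]
write_zero_file() {
    target=$1
    count_mb=${2:-128}
    dd if=/dev/zero of="$target" bs=1024k count="$count_mb" conv=fsync
}

# Example (the mount point is a placeholder for the filesystem under test):
# write_zero_file /mnt/usbtest/zero.bin 128
```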

Read speeds also went down --

Read speed for a single large file (300MB ISO):
XFS: 2.93%
EXT2: 4.94%
EXT4: 8.30%
Speeds when aligned: XFS: 20.46 MB/s EXT2: 20.64 MB/s EXT4: 20.03 MB/s

Read speed for many small files (copy the extracted kernel source files from the device to RAM):
XFS: -62.56% (again, XFS got better, and substantially so -- additional testing would be needed to find out if this is an anomaly in the test or some other variable -- the error is likely in the XFS aligned test, as that value is quite a bit slower than all other values for this test, and the unaligned speed is comparable to the performance of the other two FS's)
EXT2: 7.37%
EXT4: -6.30%
Speeds when aligned: XFS: 5.44 MB/s EXT2: 8.96 MB/s EXT4: 9.30 MB/s

Read speed for a 128MB file full of zero's w/1024k block size:
XFS: 1.30%
EXT2: -14.86%
EXT4: -8.36%
Speeds when aligned: XFS: 20.53 MB/s EXT2: 21.42 MB/s EXT4: 20.17 MB/s

Copy2ram speed (measures the speed of copying the contents of a loop-mounted module -- 000-kernel.xzm-- to a directory in RAM):
XFS: -0.16% (again an increase, but within the margin of error)
EXT2: -5.38% (increase, within the margin of error without running additional tests)
EXT4: 3.22%
Speeds when aligned: XFS: 23.50 MB/s EXT2: 22.26 MB/s EXT4: 23.71 MB/s


Thus, on my media, for the tasks performed in my tests, EXT4 can be shown to slow down when out of alignment, but this is fairly comparable to the hit that EXT2 takes in most cases. XFS appears to not drop as much when put out of alignment, but perhaps this is due to the fact that I lack the experience to correctly configure XFS for the particulars of my device (or, maybe there are a certain number of sectors at the beginning of the partition that push the actual data out of alignment? I recently read some articles that discussed modifications to the number of reserved sectors for FAT partitions to compensate for the area taken up by the file allocation table and superblock at the beginning of the drive, so that the actual data starts on a flash block-aligned sector). In any case, I don't see a speed reduction that is within the order of magnitude you mention above, especially in read speeds, as all filesystems performed about equally in the read-speed tests. Do you know what benchmarks (if any) were used to derive those results, and on what hardware? I know my tests are relatively simplistic, and don't test speeds for multithreaded applications. I'm also testing an extremely limited number of devices (2 thus far, though I have my eye on a 16GB Patriot Rage USB flash drive!).

My tests would indicate that EXT2 and EXT4 have more or less comparable read speeds, whether in or out of alignment (though EXT4 read speeds do dip a little below EXT2, particularly when out of alignment), but EXT4 provides much better write performance, especially when dealing with small files. I have not yet finished my "crash testing" for EXT4 or XFS (I'm about 3/4 done with EXT2, then will do EXT3 and EXT4), but I imagine both XFS and EXT4 will prove better at dealing with power failures and data recovery versus EXT2. Personally, I take stability and data loss more seriously than a slight drop in performance, so I'm looking forward to completing those tests (but, they are very time consuming as I crash each FS about 20 times, then have to reboot and check the data and repair it if necessary, representing greater than 40 boot cycles per filesystem tested).
