Rawhide Watch

Daily warnings for rawhide victims

ext4 on Intel X25-M SSD needs nodelalloc

Posted by rawhidewatch on April 13, 2009

Many early SSD owners paid hundreds of dollars for a solid state drive, only to find that performance was terrible, with “stuttering” caused by poorly designed JMicron SSD controllers that cannot handle random write patterns.  Intel’s X25-M and X25-E drives have consistently beaten all other consumer grade SSDs thanks to their superbly designed controller, but at a huge price premium.  Users of Intel’s SSDs have generally avoided the “stuttering” problems of lesser quality drives.

Unfortunately, it seems that 15-60 seconds of deadlock-like unresponsiveness can happen even with an Intel X25-M when using the ext4 filesystem.  Ever since I upgraded my Thinkpad T60 laptop to an Intel X25-M 160GB drive, my entire system would occasionally feel like it froze, while the hard drive light flashed intermittently.  After maybe 30-60 seconds the hard drive light would stay solid on for a few seconds, then the system would recover.  I cannot figure out how to reproduce this on demand, but it seems to happen 2-3 times a day, and only when disk activity is very low.

In any case, this problem goes away entirely if I add “nodelalloc” to the mount flags of my ext4 filesystems.  This turns off ext4 delayed allocation, which can delay writing to the disk for longer than ext3’s 5 seconds, perhaps 30-60 seconds.  This benefits hard drives because you can save power by spinning up less often.  But for Intel’s X25-M drives, you are better off writing as early and often as possible: the drive doesn’t spin (so there is no power benefit), and it reportedly handles parallelism internally very well, so the OS gains no performance by holding writes back on its behalf.
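To check which ext4 filesystems are actually mounted without nodelalloc, something like the following rough Python sketch might help. It assumes the mount table is in the /proc/mounts format (device, mount point, filesystem type, comma-separated options) and that the kernel lists nodelalloc among the options when delayed allocation is off. The function name and the sample mount table are made up for illustration.

```python
def ext4_mounts_without_nodelalloc(mounts_text):
    """Return mount points of ext4 filesystems whose options lack nodelalloc.

    mounts_text is expected to look like the contents of /proc/mounts.
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type == "ext4" and "nodelalloc" not in options.split(","):
            missing.append(mount_point)
    return missing


# Hypothetical sample; on a real system you would pass in the text
# read from /proc/mounts instead.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime,nodelalloc 0 0
/dev/sda2 /home ext4 rw,relatime 0 0
proc /proc proc rw 0 0
"""

print(ext4_mounts_without_nodelalloc(SAMPLE_MOUNTS))
```

With the sample above, only /home is reported, since / already carries nodelalloc and proc is not ext4.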

In related news, it was suggested to me that adding elevator=noop to the kernel cmdline in grub.conf is useful if your disks are SSDs.  I am trying it myself, but I am uncertain whether it is a good thing.  An elevator wouldn’t be necessary with a theoretically perfect solid state disk offering constant-time random access to any address, but SSDs are not perfect.  Especially when writing many small changes to non-contiguous parts of the disk, SSDs suffer from what is known as write amplification, because they need to read, erase and re-write an entire block even if you change only a tiny portion of it.  I wonder if the write-combining that happens in the elevator prevents a few writes to non-adjacent but nearby blocks from triggering multiple write events to the same SSD block.  If it turns out the elevator does help to minimize write amplification before the I/O gets to the SSD’s controller, then it would both improve performance and help the longevity of the disk.  I have no clue how to measure this.  Perhaps we need Intel engineers + people who know filesystems to answer this.
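To make the write amplification worry concrete, here is a toy Python model. It is only a sketch of the reasoning, not a measurement: it assumes every small write forces a rewrite of the whole erase block it lands in, and that "combining" means all writes landing in the same erase block cost a single rewrite. Real SSD controllers remap writes internally, so actual behavior will differ. The 512 KiB erase block size and the write offsets are hypothetical values chosen for illustration.

```python
ERASE_BLOCK_SIZE = 512 * 1024  # hypothetical erase block size in bytes


def rewrites_without_combining(offsets):
    """Each small write triggers its own erase-block rewrite."""
    return len(offsets)


def rewrites_with_combining(offsets, erase_block_size=ERASE_BLOCK_SIZE):
    """Writes that land in the same erase block share one rewrite."""
    touched_blocks = {offset // erase_block_size for offset in offsets}
    return len(touched_blocks)


# Three nearby small writes plus one further away (byte offsets).
offsets = [0, 4096, 8192, 600000]

print("separate rewrites:", rewrites_without_combining(offsets))
print("combined rewrites:", rewrites_with_combining(offsets))
```

In this model the first three writes fall in the same erase block, so combining them would cost two block rewrites instead of four. Whether the elevator actually achieves this on a real drive is exactly the open question.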

In any case, this ext4 behavior with Intel X25-M seems to be a bug in ext4?  I don’t know.

UPDATE:
Jeff Moyer pointed out that recent versions of the CFQ elevator have smart SSD detection.  If an SSD is detected, CFQ disables idling during random access reads because the penalty for reading is not severe.  He thinks you are better off using the default CFQ scheduler with these drives.  Jeff also pointed out that elevator=noop still does write combining.
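If you want to confirm which elevator a disk is really using, the kernel exposes it in /sys/block/<device>/queue/scheduler, with the active one shown in square brackets. A small Python sketch for picking it out of that text follows. The function name and the sample line are illustrative, and the device name in the comment is an assumption about your system.

```python
def active_scheduler(scheduler_text):
    """Return the scheduler name shown in [brackets], or None if absent."""
    for name in scheduler_text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


# Sample contents; on a real system you would read the text from
# /sys/block/sda/queue/scheduler (device name assumed).
sample = "noop anticipatory deadline [cfq]"
print(active_scheduler(sample))
```

With the sample line above this reports cfq, meaning the default scheduler is still in effect and elevator=noop was not applied.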

UPDATE April 17th, 2009:
Intel released a firmware update for their X25-M and X18-M SSDs that improves performance in the worst case scenario.  I flashed my X25-M and turned off nodelalloc to see if this makes a difference.  It seems ext4 still behaves badly without nodelalloc, with occasional episodes of 15-60 seconds of unresponsiveness.
