Flash testing tools - CrystalDiskMark

CrystalDiskMark is the type of tool I wouldn't normally even look at due to its target market, but one of our SEs asked about it last week, so...

CrystalDiskMark is a tool you would normally see used on standalone hard disks/SSDs, in particular on consumer-focused websites like Tom's Hardware - not on enterprise All-Flash Arrays.

Diskspd

According to its website, CrystalDiskMark is basically just a front-end to Microsoft Diskspd. Unfortunately Diskspd itself is not a good tool for testing storage that supports de-duplication - see my separate post on Diskspd for the full details, but in short :

  • The initial test file written by Diskspd contains completely repetitive (non-unique) data
  • The data written during write tests can be made somewhat unique, but requires massive amounts of memory to do so

CrystalDiskMark does actually make an effort to overcome the first of these issues, but unfortunately its target market means that it fails in countless other ways as a suitable tool for testing AFAs.

CrystalDiskMark

The UI for CrystalDiskMark gives a fairly clear idea of the tests that it runs :

The first two tests imply a queue depth of 32 and 1 thread (Q32T1), first for sequential and then for "4K" IO, followed by the same two tests with no mention of queue depth or threads. Each test has a read phase and a write phase (interestingly, all of the read tests are run first, followed by all of the write tests).

The details of exactly what each of these tests does are vague, so I grabbed the command line it passes to Diskspd for each of them to see exactly what they are doing.

The options passed to Diskspd for each of the tests were :

Test        Read                               Write
Seq Q32T1   -b128K -d5 -o32 -t1 -W0 -S -w0     -b128K -d5 -o32 -t1 -W0 -S -w100 -Z128K
4K Q32T1    -b4K -d5 -o32 -t1 -W0 -r -S -w0    -b4K -d5 -o32 -t1 -W0 -r -S -w100 -Z4K
Seq         -b1M -d5 -o1 -t1 -W0 -S -w0        -b1M -d5 -o1 -t1 -W0 -S -w100 -Z1M
4K          -b4K -d5 -o1 -t1 -W0 -r -S -w0     -b4K -d5 -o1 -t1 -W0 -r -S -w100 -Z4K

Run Time

The first thing that stands out from this table is the -d option - the duration of the test. Each test is only being run for 5 seconds - far too short to get anything close to a realistic result, especially for write tests where 5 seconds of data will never make it beyond the cache. Each of these tests is run a configurable number of times (1-9 times, defaulting to 5), so that at least brings us up to 25 seconds per test by default - but that's still not even close to enough for realistic results.
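
To put the run time in perspective, the arithmetic can be sketched as below. The 5-second duration and the default of 5 passes come from the options discussed above; the 2 GB/s write rate is purely an illustrative assumption, not a measured figure.

```python
def total_runtime_s(duration_s: int, passes: int) -> int:
    """Total measured time for one test across all of its passes."""
    return duration_s * passes


def data_written_gb(duration_s: float, write_gb_per_s: float) -> float:
    """Data written during a single pass at a sustained write rate."""
    return duration_s * write_gb_per_s


if __name__ == "__main__":
    # -d5 with the default of 5 passes
    print(total_runtime_s(5, 5))      # 25 seconds in total
    # Assumed (hypothetical) sustained write rate of 2 GB/s
    print(data_written_gb(5, 2.0))    # 10.0 GB per pass
```

Under that assumed rate a single pass writes only around 10 GB, which gives a feel for how little data a 5-second write test actually pushes at the storage.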

Queuing/Outstanding IO/Threads

The last two tests run with only a single outstanding IO (-o) and a single thread (-t) - nowhere near enough to test even a single disk, and it also turns the testing into not a test of IOPS/bandwidth, but instead just a latency test (see Queue Depth, IOPS and Latency to understand why). This basically makes these two tests worthless in even a single disk environment - let alone for an All-Flash Array.

The first two tests use a single thread, but 32 outstanding IO. Although better than the latter two tests, this is still not enough concurrency to generate any real load on an all-flash array - especially for a 4K block size.
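
The link between outstanding IO and throughput follows from Little's Law: the achievable IOPS is at most the number of outstanding IOs divided by the per-IO latency. A minimal sketch, where the 0.5 ms latency is an assumed figure chosen only for illustration:

```python
def max_iops(outstanding_io: int, latency_ms: float) -> float:
    """Upper bound on IOPS from Little's Law: concurrency / latency."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return outstanding_io * 1000.0 / latency_ms


def bandwidth_mib_s(iops: float, block_size_bytes: int) -> float:
    """Bandwidth in MiB/s implied by an IOPS figure and a block size."""
    return iops * block_size_bytes / (1024 * 1024)


if __name__ == "__main__":
    assumed_latency_ms = 0.5  # hypothetical per-IO latency
    for name, outstanding in (("4K (-o1 -t1)", 1), ("4K Q32T1 (-o32 -t1)", 32)):
        iops = max_iops(outstanding, assumed_latency_ms)
        print(name, iops, "IOPS,", bandwidth_mib_s(iops, 4096), "MiB/s")
```

With a single outstanding IO the result is fixed entirely by latency, which is why those tests say nothing about what the storage can deliver under load.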

Block Sizes

As expected, the two "4K" tests use a block size (-b) of 4K, whilst the block size for the "Seq" test varies - it's 128KB for the test with 32 outstanding IO, but 1MB for the single outstanding IO test (both use a single thread). I can only presume this was done to limit the impact of having a single outstanding IO on performance.

Unique Data

The write test does at least use the -Z option in an attempt to generate random data - however it's clear that the author doesn't understand this option, as it's set to the same value as the block size being used, which will completely negate its benefit. With a random buffer that is only one block long, every write ends up drawing on the same data. In order to use this option correctly it needs to be set to a value much larger than the block size being used.

The end result will be that the data being used for the write test will be massively duplicated - a valid test for a single SSD (without dedup support), but invalid for any form of array with deduplication support.
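
The effect of the -Z size can be sketched with a simple model. It assumes each write payload is a block-sized window taken from somewhere inside the -Z source buffer; the 4-byte selection granularity is an assumption here and is left as a parameter, and the function name is hypothetical. The result is only an upper bound on how many different payloads are possible, not a prediction of what a dedup engine will see.

```python
def distinct_write_payloads(source_buffer_bytes: int,
                            block_bytes: int,
                            granularity_bytes: int = 4) -> int:
    """Upper bound on distinct block-sized windows in the source buffer.

    granularity_bytes is an assumed selection granularity, not a
    confirmed Diskspd constant.
    """
    if block_bytes > source_buffer_bytes:
        return 0
    return (source_buffer_bytes - block_bytes) // granularity_bytes + 1


if __name__ == "__main__":
    KIB = 1024
    MIB = 1024 * KIB
    # As CrystalDiskMark passes it: -b4K with -Z4K
    print(distinct_write_payloads(4 * KIB, 4 * KIB))
    # A much larger source buffer for the same block size
    print(distinct_write_payloads(1 * MIB, 4 * KIB))
```

When the buffer is the same size as the block there is exactly one possible window, so every write carries identical data.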

Test File

There is some (slightly) good news around the test file, and that is that CrystalDiskMark overcomes one of the issues that Diskspd has. Rather than allowing Diskspd to generate its initial (completely repetitive) test file, CrystalDiskMark appears to generate this file itself.

This would be a good thing, if CrystalDiskMark didn't just generate a repetitive pattern of its own - which is exactly what it does. It's a little less repetitive than the default Diskspd generation, but it still repeats blocks hundreds of times each.
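
One way to check how repetitive a test file is would be to hash it in fixed-size blocks and count how many distinct blocks turn up. The sketch below is a generic approach, not the method used to produce the observation above; the 4 KiB block size and the function name are assumptions, and real arrays may dedup at a different granularity.

```python
import hashlib
import io
from collections import Counter
from typing import BinaryIO, Tuple


def block_repeat_stats(stream: BinaryIO, block_size: int = 4096) -> Tuple[int, int]:
    """Return (total_blocks, unique_blocks) for a binary stream.

    Blocks are compared by SHA-256 digest, so identical blocks are
    counted once in unique_blocks.
    """
    counts: Counter = Counter()
    while True:
        block = stream.read(block_size)
        if not block:
            break
        counts[hashlib.sha256(block).digest()] += 1
    return sum(counts.values()), len(counts)


if __name__ == "__main__":
    # Synthetic example: one block repeated 100 times plus one different block
    sample = b"\x00" * 4096 * 100 + b"\x01" * 4096
    total, unique = block_repeat_stats(io.BytesIO(sample))
    print(total, unique, total / unique)
```

To run it against a real test file, pass a file object opened in "rb" mode in place of the in-memory sample.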

Test File Size

The size of the file used for testing is configurable - but the available values are only between 50MiB and 32GiB (with the default being 1GiB). Even if this is bumped up to 32GiB it's still well within the size that would fit within the cache of a storage array - especially given the high level of duplicate data generated. Thus the tests are not actually going to require the array to go to the SSDs, which is going to generate completely invalid results - for both reads and writes.

Conclusion

As a tool to test the performance of a single HDD or SSD in a laptop, CrystalDiskMark likely does a fairly reasonable job. The options it passes to Diskspd are valid for this type of use case, and Diskspd will do a reasonable job of testing performance on a single non-dedup/non-compressing disk.

However, for an All-Flash Array (or really, anything more than a single disk) it is a completely unsuitable tool. Non-unique data, insufficient queuing, limited dataset size and limited run time mean that any results generated are going to be completely unrealistic.