
I know that I can use fio to benchmark my disks with any given static workload. However, does any high-quality open source benchmarking tool support a test where I set the following parameters as constants:

  • Test file size (e.g. 500 MB)
  • Static QD (settable at least to 1, 2, 4 and 8)
  • Workload (e.g. random 4k read over whole file span)
  • Direct I/O access (similar to libaio of fio)
  • Define max latency in µs

The benchmark should then slowly increase IOPS until latency goes over the set limit, after which the benchmark is done. The test result would be the latency for each IOPS value, or better yet, minimum+average+max latency for each IOPS value.

Basically I'm asking for a tool that can do benchmarking similar to what is required for this storagereview.com graph (a graph of latency for random 4k reads over different IOPS values).

I know that I can repeatedly run fio with different settings to generate the required data, but I'm wondering if there's some premade tool for this purpose. Does such a benchmark tool exist?

2 Answers


fio has an option that discovers the highest IOPS that can be done under a certain latency... From the "I/O latency" section of the fio documentation:

latency_target=time

If set, fio will attempt to find the max performance point that the given workload will run at while maintaining a latency below this target. When the unit is omitted, the value is interpreted in microseconds. See latency_window and latency_percentile.

Please see the whole I/O latency section of the fio help, as there are a bunch of options that interact with each other. You may also find fio's Steady State mode and the separate tool Diskplorer (which itself drives fio) useful. However, I note you've clarified your question, and the aforementioned options/tools don't generate latency numbers at a set number of different "max IOPS" points (although Diskplorer does generate latency/IOPS numbers against I/O depth).
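For reference, here is a rough sketch of how the latency_target options quoted above might be combined on one command line. The job name, file name, size, runtime and the numbers given to the three latency options are illustrative assumptions rather than recommended values, and the helper function `latency_probe_cmd` is hypothetical. It only prints the command so that it can be reviewed before anything touches the disk.

```shell
# latency_probe_cmd <target latency, e.g. 500us> [extra fio options]
# Prints a fio command line instead of running it.
#   --latency_target      the latency goal fio tries to stay under
#   --latency_window      how long each sampling period lasts
#   --latency_percentile  share of I/Os that must meet the goal
#   --iodepth             upper bound for the queue depth fio may try
# All values below are placeholders (assumptions), adjust them to your device.
latency_probe_cmd()
{
    probe_target="$1"
    shift
    echo fio --name=latency-probe --filename=fio-latency-probe.temp \
        --rw=randread --blocksize=4k --size=500m \
        --ioengine=libaio --direct=1 --iodepth=32 \
        --time_based --runtime=60 \
        --latency_target="$probe_target" \
        --latency_window=5s \
        --latency_percentile=99 \
        "$@"
}

# Show the command for a 500 µs goal.
latency_probe_cmd 500us
```

As the answer says, this yields a single operating point for one latency goal, not a latency curve over many IOPS values.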

Away from fio, you could also look at the vdbench tool, which StorageReview themselves actually seem to be using in that review (despite their page claiming that they use fio). However, you'll have to wave goodbye to libaio-like submission: I'm fairly sure vdbench doesn't do platform-specific AIO because it is trying to be platform agnostic, so it can only use multiple threads/processes to raise the depth.

Anon
  • If I've understood correctly, latency_target lets me find one IOPS number for one pre-decided max latency. That does not give me performance numbers for lower IOPS (see the example graph in the question). I can run fio repeatedly with different IOPS numbers and graph the latency, but I'm wondering if there exists a better tool for the job. – Mikko Rantalainen Jan 11 '19 at 12:39
  • I'd say Diskplorer is as close as I've seen in terms of a premade tool that uses fio (and perhaps it wouldn't be too much work to change the code to do what you wanted). Did you also see my mention of the vdbench tool (which is what Storage Review appear to be using)? – Anon Jan 12 '19 at 05:49

Here's a bash script I cooked up (fio-ramp hosted at github):

#!/bin/bash
# Copyright 2019 Mikko Rantalainen
# License: MIT X License
#
# Debian/Ubuntu requirements (bc is needed for the IOPS comparison below):
# sudo apt install fio jq bc
#
# See also: https://fio.readthedocs.io/en/latest/fio_doc.html
#
set -e

if test -z "$1"
then
    echo "usage: $(basename "$0") <result.csv> [fio options]" 1>&2
    echo "<result.csv> will contain CSV with µs latency for different IOPS" 1>&2
    echo "  For example, " 1>&2
    echo "  $(basename "$0") output.csv --blocksize=8k --rw=randwrite --iodepth=4" 1>&2
    echo "  will compute IOPS latency values for 8K random write QD4." 1>&2
    # Note: if --numjobs=4 then actual IOPS will be 4x the amount targeted because targeted is per job - prefer increasing iodepth instead!
    # Stop here instead of continuing without a result file name.
    exit 1
fi

resultfile="$1";
shift; # remove filename from parameters, leave the rest for fio

log10_series()
{
    count=1
    step=1

    echo 1
    while (( $step < 1000000 ))
    do
        for (( i=1; i < 10; i++ ))
        do
            count=$(( $count + $step ))
            echo $count
        done
        step=$(( 10 * $step ))
    done
}

echo "Writing output to '$resultfile' ..."

# Note: "| while read ..." loop causes shell to create subshell, we have to share data via actual file because variables do not work over subshell boundaries :-/
best_actual_iops_file=$(mktemp --tmpdir fio-ramp-best-actual-iops.XXXXXXXX.tmp)
echo 0 > "$best_actual_iops_file"
trap "rm '$best_actual_iops_file'" EXIT

echo '"Target IO/s", "Actual IO/s", "Min latency (µs)", "Avg latency (µs)", "Max latency (µs)"' | tee "$resultfile"
log10_series | while read iops
do
    LC_ALL=C fio --name TEST --filename=fio-ramp.benchmark.temp --rw=randread \
        --size=500m --io_size=10g --blocksize=4k \
        --ioengine=libaio --direct=1 --numjobs=1 --iodepth=1 \
        --ramp_time=1 --runtime=5 --end_fsync=1 --group_reporting \
        --rate_iops=$iops --rate_iops_min=1 --max_latency=1s \
        --warnings-fatal --output-format=json "$@" \
    | jq '.jobs[] | (.read.iops, .read.lat.min, .read.lat.mean, .read.lat.max)' \
    | xargs -r printf "%s %s %s %s\n" | while read actual_iops min avg max
    do
        printf "% 13s, % 13s, % 18s, % 18s, % 18s\n" "$iops" "$actual_iops" "$min" "$avg" "$max" | tee -a "$resultfile"
        if [ "$(echo "$(cat "$best_actual_iops_file") <= $actual_iops" | bc -l)" == "1" ]; then
            echo "$actual_iops" > "$best_actual_iops_file"
        else
            echo "Actual IOPS dropped when target IOPS was increased, aborting." 1>&2
            exit 1
        fi
    done
done
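The script above shares state through a temp file because `| while read` runs the loop in a subshell. As a side note, the following sketch shows one way to keep the loop in the current shell, by feeding it from a here-document filled by command substitution (in bash, `done < <(log_series 1000)` should behave the same way). `log_series` is a parameterized variant of the script's `log10_series`, and `measure_iops` is a hypothetical stand-in for a real fio run that pretends the device saturates at 40 IOPS. Note one difference: this sketch stops as soon as the result no longer improves, while the script only aborts on an actual drop.

```shell
# log_series <max>: prints 1..10, 20..100, 200..1000, ... up to <max>
log_series()
{
    ls_max="$1"
    ls_count=1
    ls_step=1
    echo 1
    while [ "$ls_step" -lt "$ls_max" ]
    do
        ls_i=1
        while [ "$ls_i" -lt 10 ]
        do
            ls_count=$(( ls_count + ls_step ))
            echo "$ls_count"
            ls_i=$(( ls_i + 1 ))
        done
        ls_step=$(( 10 * ls_step ))
    done
}

# Hypothetical measurement: the fake device never delivers more than 40 IOPS.
measure_iops()
{
    if [ "$1" -gt 40 ]; then echo 40; else echo "$1"; fi
}

best=0
while read -r target
do
    actual=$(measure_iops "$target")
    # Stop as soon as raising the target no longer raises the result.
    if [ "$actual" -le "$best" ]; then
        break
    fi
    best=$actual
done <<EOF
$(log_series 1000)
EOF

# The loop ran in the current shell, so $best is still set here.
echo "best=$best"
```

The trade-off is that the whole series is generated before the loop starts, which is harmless for a few dozen targets.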

Example graph (Latency vs IOPS on Intel SSD 910, random 4K read QD32, log-log graph): Graph displaying lowest average latency around 30,000 IOPS and steep increase in latency above 60,000 IOPS.

Compared to Diskplorer, this runs a new fio process for each IOPS target and collects minimum, average and maximum latency. I know bash better than Python, so this was easier for me to write. In the long run, improving Diskplorer could be the better option if its license is acceptable (currently no license has been defined for that project).
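To get back to the "max latency in µs" limit from the question, the CSV can be post-processed afterwards. The following is a minimal sketch that assumes the five-column layout written by the script above (target IOPS, actual IOPS, min, avg, max latency) and returns the highest actual IOPS whose average latency stays at or below a limit. The function name `max_iops_under_latency` is made up for this example.

```shell
# max_iops_under_latency <result.csv> <max average latency in µs>
# Prints the highest "Actual IO/s" value whose average latency is within the
# limit, or an empty line if no row qualifies.
max_iops_under_latency()
{
    awk -F',' -v limit="$2" '
        NR == 1 { next }                      # skip the CSV header row
        { iops = $2 + 0; avg = $4 + 0 }       # force numeric interpretation
        avg <= limit + 0 && iops > best { best = iops; row = $2 }
        END { gsub(/ /, "", row); print row } # strip the column padding
    ' "$1"
}

# Example: max_iops_under_latency output.csv 200
```

Comparing against column 5 instead of column 4 would apply the limit to the maximum latency rather than the average.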

  • Nicely done on getting Diskplorer to select a license (I see the issue you commented on has been solved)! May I suggest that you accept your own answer, since technically you made an open source project that solves your question! – Anon Jan 21 '19 at 07:31
  • Note that the above script only outputs and monitors read values. If you pass --rw=write, the script will not work correctly. – Mikko Rantalainen Jun 28 '19 at 08:04
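One possible way around the read-only limitation mentioned in the last comment is to let jq pick whichever direction actually did I/O. This is only a sketch: the function name `pick_direction_stats` is hypothetical, and it assumes that every job in fio's JSON output has both a `read` and a `write` object and that latency is reported under `.lat`, as in the script above. Some fio releases may report it under `lat_ns` in nanoseconds instead, in which case the field name and units would need adjusting.

```shell
# pick_direction_stats: reads fio JSON on stdin and prints one tab-separated
# line per job: iops, min latency, mean latency, max latency.
# Uses the write statistics when the job wrote more than it read, otherwise
# the read statistics. Field names follow the script above (assumption).
pick_direction_stats()
{
    jq -r '.jobs[]
        | (if .write.iops > .read.iops then .write else .read end)
        | [.iops, .lat.min, .lat.mean, .lat.max]
        | @tsv'
}

# Example: fio ... --output-format=json | pick_direction_stats
```

Mixed workloads such as --rw=randrw would still need both directions reported separately, which this sketch does not attempt.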