18 July 2010

"".equals(val.trim()) vs. 0 == val.trim().length()

Trimming string values is quite a common task on the Web. I personally prefer to avoid it at the source: just do not store redundant symbols. But you know, shit happens... Sometimes the data you have to use comes from a side you cannot control. Or trimming is even required by the protocol.

String transformations are heavy by nature, but that is not a problem if the operation is not time-sensitive.

I am not sure where and when I picked up this habit, but I usually used this construction:

if (null == value || "".equals(value.trim())) {
    // data is empty or blank
}

But recently I was told that this check is much slower compared to:

if (null == value || 0 == value.trim().length()) {
    // data is empty or blank
}
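
Both conditions should accept exactly the same inputs, since trim() returns an empty string precisely when the trimmed length is zero. A minimal sketch to confirm that on a few samples (the class name BlankCheckDemo and its method names are made up for illustration):

```java
public class BlankCheckDemo {
    // Variant 1: compare the trimmed value with the empty string.
    public static boolean isBlankByEquals(final String value) {
        return null == value || "".equals(value.trim());
    }

    // Variant 2: check the length of the trimmed value.
    public static boolean isBlankByLength(final String value) {
        return null == value || 0 == value.trim().length();
    }

    public static void main(final String[] args) {
        final String[] samples = {null, "", "   ", "\t\n", " tt ", "tttt"};
        for (final String sample : samples) {
            // Both variants are expected to agree on every sample.
            final boolean agree = isBlankByEquals(sample) == isBlankByLength(sample);
            System.out.println("'" + sample + "' agree: " + agree);
        }
    }
}
```

So the only open question is the cost of each variant, not its result.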

From my point of view they are both heavy and should not be used in code that claims to be fast. It is even funny to hear about performance tricks in cases where they never turn out to matter. But I became curious how much slower it really is. So I decided to run a quick test.


import org.junit.Test;

import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;

/**
 * Test suite for blank string check measurement
 */
public class StringTest {
    private static final long MAX_ITERATION = 10000000L;

    public long measure(final Algorithm algorithm, final String data) {
        final boolean expected = algorithm.check(data);
        // Warming up...
        for (long i = MAX_ITERATION << 4; i-- > 0;) {
            final boolean result = algorithm.check(data);
            assert expected == result : "Expected is not equal to actual";
        }
        // Measure...
        long time = System.nanoTime();
        for (long i = MAX_ITERATION; i-- > 0;) {
            final boolean result = algorithm.check(data);
            assert expected == result : "Expected is not equal to actual";
        }
        return System.nanoTime() - time;
    }

    @SuppressWarnings({"unchecked"})
    public static void main(final String[] arguments) throws Exception {
        if (2 != arguments.length) {
            System.out.println("Usage: " + StringTest.class.getName() + " <class> <sample>");
            System.exit(-1);
        }
        final StringTest test = new StringTest();
        final Class<Algorithm> clazz = (Class<Algorithm>) Class.forName(arguments[0]);
        String data = arguments[1];
        if ("null".equals(data)) {
            data = null;
        }
        System.out.print(String.format("Measure sample '%s' for %-30s: ", data, clazz.getSimpleName()));
        final long time = test.measure(clazz.newInstance(), data);
        System.out.println(String.format("%.5f ns (%d)", (double) time / MAX_ITERATION, time));
    }

    @Test
    public void testEqualsStrategy() {
        final Algorithm object = new EqualsStrategy();
        assertThat(object.check("ttt"), is(true));
        assertThat(object.check("   "), is(false));
        assertThat(object.check(null), is(false));
    }

    @Test
    public void testLengthStrategy() {
        final Algorithm object = new LengthStrategy();
        assertThat(object.check("ttt"), is(true));
        assertThat(object.check("   "), is(false));
        assertThat(object.check(null), is(false));
    }

    @Test
    public void testImprovedEqualsStrategy() {
        final Algorithm object = new ImprovedEqualsStrategy();
        assertThat(object.check("ttt"), is(true));
        assertThat(object.check("   "), is(false));
        assertThat(object.check(null), is(false));
    }

    @Test
    public void testQuickStrategy() {
        final Algorithm object = new QuickStrategy();
        assertThat(object.check("ttt"), is(true));
        assertThat(object.check("   "), is(false));
        assertThat(object.check("\t"), is(false));
        assertThat(object.check(null), is(false));
    }

    public interface Algorithm {
        boolean check(String value);
    }

    public static class EqualsStrategy implements Algorithm {
        public boolean check(final String value) {
            return null != value && !"".equals(value.trim());
        }
    }

    public static class ImprovedEqualsStrategy implements Algorithm {
        public boolean check(final String value) {
            return null != value && !"".contentEquals(value.trim());
        }
    }

    public static class LengthStrategy implements Algorithm {
        public boolean check(final String value) {
            return null != value && 0 < value.trim().length();
        }
    }

    public static class QuickStrategy implements Algorithm {
        public boolean check(final String value) {
            return null != value && !isBlank(value);
        }

        private boolean isBlank(final CharSequence value) {
            int pos = value.length();
            for (; pos-- > 0 && value.charAt(pos) <= ' ';) {}
            return 0 > pos;
        }
    }

    public static class DronStrategy implements Algorithm {
        public boolean check(final String value) {
            return null != value && !isBlank(value);
        }

        private boolean isBlank(final CharSequence value) {
            for (int pos = 0; pos < value.length(); pos++) {
                if (value.charAt(pos) > ' ') {
                    return false;
                }
            }
            return true;
        }
    }
}

As you can see, I added a few more check strategies (or rather tactics?) just for comparison. The results really surprised me.

Val:pyramid-jmeter striped$ ./test.sh 
Measure sample 'tttt' for LengthStrategy                : 9.63080 ns (96308000)
Measure sample 'tttt' for EqualsStrategy                : 10.10210 ns (101021000)
Measure sample 'tttt' for ImprovedEqualsStrategy        : 10.22600 ns (102260000)
Measure sample 'tttt' for QuickStrategy                 : 9.61880 ns (96188000)
Measure sample 'tttt' for DronStrategy                  : 9.66010 ns (96601000)
Measure sample ' tt ' for LengthStrategy                : 21.71350 ns (217135000)
Measure sample ' tt ' for EqualsStrategy                : 21.61700 ns (216170000)
Measure sample ' tt ' for ImprovedEqualsStrategy        : 21.86830 ns (218683000)
Measure sample ' tt ' for QuickStrategy                 : 10.66180 ns (106618000)
Measure sample ' tt ' for DronStrategy                  : 9.61820 ns (96182000)
Measure sample '    ' for LengthStrategy                : 23.66650 ns (236665000)
Measure sample '    ' for EqualsStrategy                : 23.67160 ns (236716000)
Measure sample '    ' for ImprovedEqualsStrategy        : 24.30890 ns (243089000)
Measure sample '    ' for QuickStrategy                 : 14.67670 ns (146767000)
Measure sample '    ' for DronStrategy                  : 13.41650 ns (134165000)
Measure sample 'null' for LengthStrategy                : 7.98250 ns (79825000)
Measure sample 'null' for EqualsStrategy                : 7.97830 ns (79783000)
Measure sample 'null' for ImprovedEqualsStrategy        : 7.98630 ns (79863000)
Measure sample 'null' for QuickStrategy                 : 7.98430 ns (79843000)
Measure sample 'null' for DronStrategy                  : 8.01670 ns (80167000)
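
These figures come from one run per combination, so differences of a few tenths of a nanosecond are hard to judge. A rough sketch of one way to gauge the spread, assuming the same measurement is simply repeated and summarized by its minimum, median and maximum (RepeatedMeasure, Check, the run count and the iteration count are hypothetical and not part of the test above):

```java
import java.util.Arrays;

public class RepeatedMeasure {
    /** Hypothetical stand-in for the Algorithm interface of the test. */
    public interface Check {
        boolean check(String value);
    }

    // Written once per run so the measured loop has an observable result.
    private static volatile boolean sink;

    /** Times the given number of calls and returns the elapsed nanoseconds. */
    public static long measureOnce(final Check check, final String data, final long iterations) {
        boolean result = false;
        final long start = System.nanoTime();
        for (long i = iterations; i-- > 0;) {
            result ^= check.check(data);
        }
        final long elapsed = System.nanoTime() - start;
        sink = result;
        return elapsed;
    }

    /** Median of a non-empty array; the input array is left untouched. */
    public static long median(final long[] values) {
        final long[] sorted = values.clone();
        Arrays.sort(sorted);
        final int mid = sorted.length / 2;
        return (sorted.length % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    public static void main(final String[] args) {
        final Check lengthCheck = new Check() {
            public boolean check(final String value) {
                return null != value && 0 < value.trim().length();
            }
        };
        final long iterations = 1000000L;
        final long[] runs = new long[11];
        measureOnce(lengthCheck, " tt ", iterations * 10); // warm-up, result discarded
        for (int i = 0; i < runs.length; i++) {
            runs[i] = measureOnce(lengthCheck, " tt ", iterations);
        }
        final long[] sorted = runs.clone();
        Arrays.sort(sorted);
        System.out.println(String.format("min %.3f ns, median %.3f ns, max %.3f ns",
                (double) sorted[0] / iterations,
                (double) median(runs) / iterations,
                (double) sorted[sorted.length - 1] / iterations));
    }
}
```

If the gap between two strategies is smaller than the min-to-max range of a single strategy, it is probably noise.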

As expected, the difference between the analyzed methods is pretty small (we win ~0.4 ns with val.trim().length(), which could be attributed to measurement uncertainty). Performance degrades in the cases where trimming actually has to happen. But the thing I failed to explain is that ImprovedEqualsStrategy gives no gain or does even worse. According to the sources, String#contentEquals should compare lengths first, so there should be no losses on instanceof as in String#equals. Shouldn't it?
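
For reference, the calls used in EqualsStrategy and ImprovedEqualsStrategy return the same result when the argument is a String. The difference is in what they accept: contentEquals takes any CharSequence, while equals returns false for anything that is not a String. A small sketch of that behavior (ContentEqualsDemo and its method names are made up for illustration):

```java
public class ContentEqualsDemo {
    // equals() takes an Object and only matches other String instances.
    public static boolean viaEquals(final Object value) {
        return "".equals(value);
    }

    // contentEquals() compares character content of any CharSequence.
    public static boolean viaContentEquals(final CharSequence value) {
        return "".contentEquals(value);
    }

    public static void main(final String[] args) {
        final String trimmed = "   ".trim();
        System.out.println(viaEquals(trimmed));        // true
        System.out.println(viaContentEquals(trimmed)); // true

        final StringBuilder builder = new StringBuilder();
        System.out.println(viaEquals(builder));        // false: not a String
        System.out.println(viaContentEquals(builder)); // true: same (empty) content
    }
}
```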


P.S.

Yet another paradox: the classical iteration from the start of the string to its end seems to work better. (Thanks, Dron!)
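
For what it is worth, on the samples used here both directions look at the same number of characters, so the gap is unlikely to come from the amount of work done. A sketch that counts the charAt calls each loop would make (ScanSteps and both method names are hypothetical; the loops only mirror the scanning order of QuickStrategy and DronStrategy):

```java
public class ScanSteps {
    /** Counts charAt calls of a scan from the end towards the start. */
    public static int backwardSteps(final CharSequence value) {
        int steps = 0;
        for (int pos = value.length() - 1; pos >= 0; pos--) {
            steps++;
            if (value.charAt(pos) > ' ') {
                break; // first non-blank character found
            }
        }
        return steps;
    }

    /** Counts charAt calls of a scan from the start towards the end. */
    public static int forwardSteps(final CharSequence value) {
        int steps = 0;
        for (int pos = 0; pos < value.length(); pos++) {
            steps++;
            if (value.charAt(pos) > ' ') {
                break; // first non-blank character found
            }
        }
        return steps;
    }

    public static void main(final String[] args) {
        final String[] samples = {"tttt", " tt ", "    "};
        for (final String sample : samples) {
            System.out.println("'" + sample + "' backward: " + backwardSteps(sample)
                    + ", forward: " + forwardSteps(sample));
        }
    }
}
```

The two counts match for 'tttt', ' tt ' and '    ', so whatever favors the forward loop has to come from somewhere else, for example how the JIT compiles each loop shape (a guess, not something measured here).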


P.P.S.

The script (for the laziest! And do not forget to compile with the optimize flag and without debug info!)

#!/bin/sh
JAVA_OPTS='-server -XX:-UseParallelGC -XX:CompileThreshold=10'

for d in 'tttt' ' tt ' '    ' 'null'
do
    for c in '$LengthStrategy' '$EqualsStrategy' '$ImprovedEqualsStrategy' '$QuickStrategy' '$DronStrategy'
    do
        java -cp target/test-classes $JAVA_OPTS -ea StringTest StringTest$c "$d"
    done
done
