Friday, February 27, 2015

Architects, engineers?

Delightful passage from Sam Newman's new book, Building Microservices:

Part of us wants recognition, so we borrow names from other professions that already have the recognition we as an industry crave. But this can be doubly harmful. First, it implies we know what we are doing, when we plainly don’t. I wouldn’t say that buildings and bridges never fall down, but they fall down much less than the number of times our programs will crash, making comparisons with engineers quite unfair. Second, the analogies break down very quickly when given even a cursory glance. To turn things around, if bridge building were like programming, halfway through we’d find out that the far bank was now 50 meters farther out, that it was actually mud rather than granite, and that rather than building a footbridge we were instead building a road bridge. Our software isn’t constrained by the same physical rules that real architects or engineers have to deal with, and what we create is designed to flex and adapt and evolve with user requirements.

A lot of times I really don't understand my own job.

Sunday, February 01, 2015

Dice, Test First and saving time

tl;dr — Test first saves time. Also a dice expression calculator.

The setup

Larry Wall famously remarked, "the three great virtues of a programmer: laziness, impatience, and hubris." A skill earned by good programmers is knowing when to favor which virtue. I failed while writing a parser for "dice expressions" ala Dungeons & Dragons. (Example: "3d6 + 1" means roll 3 6-sided dice, sum the results and add 1.)

The parser is straight-forward using the popular ANTLR4 tools:

grammar DiceExpression;

expr: left=expr op=('*' | '/') right=expr #opExpr
    | left=expr op=('+' | '-') right=expr #opExpr
    | '(' group=expr ')' #groupExpr
    | van=('adv' | 'dis')? number=NUMBER? sides=SIDES #diceExpr
    | number=NUMBER #numberExpr
    ;

dice: number=NUMBER? sides=SIDES;

NUMBER: '1'..'9' '0'..'9'*;
SIDES: 'd' ('4' | '6' | '8' | '10' | '12' | '20' | '100')
    { setText(getText().substring(1)); };

WS: [ \t]+ -> skip;

The grammar supports a 5th Edition feature, Advantage/Disadvantage—where you get to roll twice and pick the better/worse result—, with the keywords adv/dis. A less-5th Edition-specific grammar would leave out that term from "#diceExpr".

Enter the rube

All tests pass: good! Then I thought, "What about whitespace in dice rolls or with adv/dis?" I was concerned with expressions such as "1 +disd6" or "3 d4". Surely those are invalid, but I don't check for them. So I began adding explicit whitespace parsing in the #diceExpr rule. Mistake!

Several of my tests failed for perfectly good dice expressions.

Okay, so I reverted my grammar changes. To understand what was going on, I added tests I should have started with before editing (lines marked "FAIL"):

@RunWith(Parameterized.class)
public final class DiceExpressionTest {
    @Parameters(name = "{index}: '{0}'")
    public static Collection<Object> parameters() {
        return asList(args("1", true),
                args("d6", true),
                args("3d6", true),
                args("1+3d6", true),
                args("adv d6", true),
                args("dis 3d6", true),
                args("disd6", false), // FAIL
                args("1 + 3 d6", false)); // FAIL
    }

    private final ExpectedException thrown = ExpectedException.none();

    @Parameter(0)
    public String in;
    @Parameter(1)
    public boolean ok;

    @Test
    public void shouldParse() {
        if (!ok)
            thrown.expect(ParseCancellationException.class);

        // Just check parsing works, ignore results
        DiceExpression.valueOf(in).evalWith(Roll.die());
    }

    private static Object[] args(final Object... args) {
        return args;
    }
}

And ... the "FAIL" tests pass with the original grammar. This got me to thinking more about how ANTLR works. It is a lexer/parser. The lexer grabs strings from the input and hands them to the parser to figure out. The lexer has no brains, it's an expert at chopping up input. (Great post on this.)

The cases I worried about? The lexer does not know how to chop up the first one ("disd6"): it matches no token pattern. The parser does not understand the tokens in the second one ("1 + 3 d6"): it expects an arithmetic operator between "3" and "d6". My grammar already did the right thing.

Had I started with tests for these cases, instead of jumping right to coding, I would have not spent time messing with whitespace in the grammar. Coding first cost time rather than saved it.

Enlightenment by coding Kōan

Write tests first! No, really. Even when you "know" the solution, write tests first. There are many ways it saves time in the long run, it helps you think about your API, and it can save time in the short run. My dice expression example is not unique: test-first avoids unneeded work.

Epilogue

What does the dice expression implementation look like?

public final class DiceExpression {
    private final ExprContext expression;

    @Nonnull
    public static DiceExpression valueOf(@Nonnull final String expression) {
        final DiceExpressionLexer lexer = new DiceExpressionLexer(
                new ANTLRInputStream(expression));
        final DiceExpressionParser parser = new DiceExpressionParser(
                new CommonTokenStream(lexer));
        parser.setErrorHandler(new BailErrorStrategy());

        return new DiceExpression(parser.expr());
    }

    private DiceExpression(final ExprContext expression) {
        this.expression = expression;
    }

    public int evalWith(final Roll roll) {
        return expression.accept(new EvalWithVisitor(roll));
    }

    private static final class EvalWithVisitor
            extends DiceExpressionBaseVisitor<Integer> {
        private final Roll roll;

        public EvalWithVisitor(final Roll roll) {
            this.roll = roll;
        }

        @Override
        public Integer visitOpExpr(final OpExprContext ctx) {
            final int left = visit(ctx.left);
            final int right = visit(ctx.right);
            switch (ctx.op.getText().charAt(0)) {
            case '+': return left + right;
            case '-': return left - right;
            case '*': return left * right;
            case '/': return left / right;
            }
            throw new ArithmeticException(
                    "Unknown operator: " + ctx.getText());
        }

        @Override
        public Integer visitGroupExpr(final GroupExprContext ctx) {
            return visit(ctx.group);
        }

        @Override
        public Integer visitDiceExpr(final DiceExprContext ctx) {
            return Dice.valueOf(ctx).evalWith(roll);
        }

        @Override
        public Integer visitNumberExpr(final NumberExprContext ctx) {
            return Integer.valueOf(ctx.number.getText());
        }
    }
}

And Dice, overly complicated until I can think through Advantage/Disadvantage further:

public class Dice
        implements Comparable<Dice> {
    @FunctionalInterface
    public interface Roll {
        int roll(final int sides);

        static Roll die() {
            final Random random = new Random();
            return sides -> random.nextInt(sides) + 1;
        }
    }

    public enum Advantage {
        DIS(2, Math::min),
        NONE(1, (a, b) -> a),
        ADV(2, Math::max);

        private final int rolls;
        private final IntBinaryOperator op;

        Advantage(final int rolls, final IntBinaryOperator op) {
            this.rolls = rolls;
            this.op = op;
        }

        public final int evalWith(final Roll roll, final Dice dice) {
            return range(0, rolls).
                    map(n -> dice.rollWith(roll)).
                    reduce(op).
                    getAsInt();
        }

        public final String toString(final Dice dice) {
            switch (this) {
            case NONE:
                return dice.stringOf();
            default:
                return name().toLowerCase() + ' ' + dice.stringOf();
            }
        }
    }

    private final Optional<Integer> number;
    private final int sides;
    private final Advantage advantage;

    @Nonnull
    public static Dice valueOf(@Nonnull final String expression) {
        final DiceExpressionLexer lexer = new DiceExpressionLexer(
                new ANTLRInputStream(expression));
        final DiceExpressionParser parser = new DiceExpressionParser(
                new CommonTokenStream(lexer));
        parser.setErrorHandler(new BailErrorStrategy());

        return valueOf((DiceExprContext) parser.expr());
    }

    @Nonnull
    static Dice valueOf(@Nonnull final DiceExprContext ctx) {
        final Optional<Integer> number = Optional.ofNullable(ctx.number).
                map(Token::getText).
                map(Integer::valueOf);
        final Integer sides = Integer.valueOf(ctx.sides.getText());
        final Advantage advantage = Optional.ofNullable(ctx.van).
                map(Token::getText).
                map(String::toUpperCase).
                map(Advantage::valueOf).
                orElse(NONE);
        return new Dice(number, sides, advantage);
    }

    public Dice(@Nonnull final Optional<Integer> number, final int sides,
            final Advantage advantage) {
        this.number = number;
        this.sides = sides;
        this.advantage = advantage;
    }

    public Dice(@Nullable final Integer number, final int sides,
            final Advantage advantage) {
        this(Optional.ofNullable(number), sides, advantage);
    }

    public Dice(@Nullable final Integer number, final int sides) {
        this(Optional.ofNullable(number), sides, NONE);
    }

    public int number() {
        return number.orElse(1);
    }

    public int sides() {
        return sides;
    }

    public Advantage advantage() {
        return advantage;
    }

    public int evalWith(final Roll roll) {
        return advantage.evalWith(roll, this);
    }

    protected int rollWith(final Roll roll) {
        return rangeClosed(1, number()).
                map(ignored -> roll.roll(sides())).
                sum();
    }

    @Override
    public int compareTo(@Nonnull final Dice that) {
        // Sort by roll average - adv/dis messes this up though
        final int compare = Integer.compare(number() + (sides() + 1),
                that.number() * (that.sides() + 1));
        return 0 == compare ? advantage().compareTo(that.advantage())
                : compare;
    }

    @Override
    public String toString() {
        return advantage.toString(this);
    }

    protected String stringOf() {
        return number.
                map(number -> number + "d" + sides()).
                orElse("d" + sides());
    }

    @Override
    public boolean equals(final Object o) {
        if (this == o)
            return true;
        if (o == null || getClass() != o.getClass())
            return false;

        final Dice that = (Dice) o;

        return sides() == that.sides() && number() == that.number()
                && advantage() == that.advantage();
    }

    @Override
    public int hashCode() {
        return 31 * (31 * sides() + number()) * advantage().hashCode();
    }
}
UPDATE: If you wonder about the parser visitor, you have to explicitly enable it in ANTLR4 (even though it is recommended over tree rewriting). Use the command line flag, or in Maven

<plugin>
    <groupId>org.antlr</groupId>
    <artifactId>antlr4-maven-plugin</artifactId>
    <version>${antlr4.version}</version>
    <configuration>
        <visitor>true</visitor>
    </configuration>
    <executions>
        <execution>
            <id>antlr4</id>
            <goals>
                <goal>antlr4</goal>
            </goals>
        </execution>
    </executions>
</plugin>