Step driven evaluation
If you have heard of table-driven tests, the idea described in this article will be easy to grasp: it is the same technique, applied outside of tests.
Suppose you have a function that executes a lot of other functions. This function probably does two main things:
- It checks for all returned errors as they occur.
- It passes the outputs of one function as the inputs to another.
// queryFile is an example pipeline-like function.
// errors.Errorf is assumed to come from github.com/pkg/errors;
// with only the standard library, fmt.Errorf does the same job.
func queryFile(filename, queryText string) (string, error) {
	data, err := readData(filename)
	if err != nil {
		return "", errors.Errorf("read data: %v", err)
	}
	rows, err := splitData(data)
	if err != nil {
		return "", errors.Errorf("split data: %v", err)
	}
	q, err := compileQuery(queryText)
	if err != nil {
		return "", errors.Errorf("compile query: %v", err)
	}
	rows, err = filterRows(rows, q)
	if err != nil {
		return "", errors.Errorf("filter rows: %v", err)
	}
	result, err := rowsToString(rows)
	if err != nil {
		return "", errors.Errorf("rows to string: %v", err)
	}
	return result, nil
}
This function consists of 5 steps. Five relevant calls, to be precise. Everything else is a distraction. The order of those calls matters: it is a sequence, the algorithm.
Let’s rewrite the code above using step driven evaluation.
func queryFile(filename, queryText string) (string, error) {
	// ctx carries the intermediate results from one step to the next.
	// Real code would also hand filename and queryText to it.
	var ctx queryFileContext
	steps := []struct {
		name string
		fn   func() error
	}{
		{"read data", ctx.readData},
		{"split data", ctx.splitData},
		{"compile query", ctx.compileQuery},
		{"filter rows", ctx.filterRows},
		{"rows to string", ctx.rowsToString},
	}
	for _, step := range steps {
		if err := step.fn(); err != nil {
			return "", errors.Errorf("%s: %v", step.name, err)
		}
	}
	return ctx.result, nil
}
The pipeline is now explicit: it is easier to reorder steps and to insert or remove them. It is also trivial to add debug logging inside that loop. You need only one new statement, as opposed to N statements, one next to every function call.
This approach shines with 4 or more steps, when the benefits outweigh the cost of introducing a new type like queryFileContext.
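// queryFileContext might look like the struct below.
// Each field holds the output of one step so later steps can read it.
type queryFileContext struct {
	data   []byte // set by readData
	rows   []row  // set by splitData, replaced by filterRows
	q      *query // set by compileQuery
	result string // set by rowsToString
}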
Methods like queryFileContext.splitData just call the original function and store its result in the ctx object.
func (ctx *queryFileContext) splitData() error {
	var err error
	ctx.rows, err = splitData(ctx.data)
	return err
}
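For a version that runs end to end, here is a cut-down sketch with only two steps. Everything specific in it is an assumption made for the example: the input field (one way to hand the caller's arguments to the steps), the string-based splitData, the comma-joining rowsToString and the joinLines entry point. It also uses fmt.Errorf instead of errors.Errorf to stay within the standard library.

```go
package main

import (
	"fmt"
	"strings"
)

// queryFileContext holds the input and every intermediate value.
type queryFileContext struct {
	input  string   // hypothetical: filled in by the caller before the steps run
	rows   []string // written by splitData, read by rowsToString
	result string   // written by rowsToString
}

// splitData is a toy stand-in for the real function: one row per line.
func splitData(data string) ([]string, error) {
	if data == "" {
		return nil, fmt.Errorf("empty input")
	}
	return strings.Split(data, "\n"), nil
}

// The method wraps the function of the same name and stores its output.
func (ctx *queryFileContext) splitData() error {
	var err error
	ctx.rows, err = splitData(ctx.input)
	return err
}

func (ctx *queryFileContext) rowsToString() error {
	ctx.result = strings.Join(ctx.rows, ",")
	return nil
}

// joinLines runs the two-step pipeline over input.
func joinLines(input string) (string, error) {
	ctx := &queryFileContext{input: input}
	steps := []struct {
		name string
		fn   func() error
	}{
		{"split data", ctx.splitData},
		{"rows to string", ctx.rowsToString},
	}
	for _, step := range steps {
		if err := step.fn(); err != nil {
			return "", fmt.Errorf("%s: %v", step.name, err)
		}
	}
	return ctx.result, nil
}

func main() {
	result, err := joinLines("a\nb")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(result)
}
```

Because every method has a pointer receiver, all steps share the same ctx, which is what lets one step read what the previous one wrote.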
This pattern works particularly well for main functions.
func main() {
	ctx := &context{}
	steps := []struct {
		name string
		fn   func() error
	}{
		{"parse flags", ctx.parseFlags},
		{"read schema", ctx.readSchema},
		{"dump schema", ctx.dumpSchema}, // Before transformations
		{"remove builtin constructors", ctx.removeBuiltinConstructors},
		{"add adhoc constructors", ctx.addAdhocConstructors},
		{"validate schema", ctx.validateSchema},
		{"decompose arrays", ctx.decomposeArrays},
		{"replace arrays", ctx.replaceArrays},
		{"resolve generics", ctx.resolveGenerics},
		{"dump schema", ctx.dumpSchema}, // After transformations
		{"decode combinators", ctx.decodeCombinators},
		{"dump decoded combinators", ctx.dumpDecodedCombinators},
		{"codegen", ctx.codegen},
	}
	for _, step := range steps {
		ctx.debugf("start %s step", step.name)
		if err := step.fn(); err != nil {
			log.Fatalf("%s: %v", step.name, err)
		}
	}
}
An additional benefit is the ease of testing. Even though we use log.Fatalf, which is a bad thing because it exits the process, it’s trivial to re-create this pipeline inside a test and run a set of steps that fail the test instead of calling os.Exit.
You can also omit some CLI-related steps inside tests, like "dump schema" or "codegen", and inject test-specific steps into that list.
There are a few drawbacks, as always:
- You need to introduce a new type and probably a few methods for it.
- It’s not always straightforward to figure out an appropriate layout for the context object, one that satisfies the needs of the entire pipeline without getting overly complex.
Try using it. Maybe you’ll like it.