gogrep: structural search and replace
Original (ru): https://habr.com/post/481696/.
gogrep is one of my favorite tools. It gives you the ability to search Go code using syntax patterns, filter matches by expression types, and perform structural replace.
Today I’ll talk about how to use gogrep
and about VS Code extension that integrates gogrep
into the editor.
Why do you need gogrep?
In a nutshell, gogrep
is useful for:
- Refactoring (search and replace)
- Codebase exploration
- Suspicious code detection (example: ruleguard)
I’ll show you an example that demonstrates the power and elegance of the structural search.
In the snippet below, both a()
and b()
functions perform the same operation:
func a(xs []int) []int {
xs = append(xs, 1)
xs = append(xs, 2)
return xs
}
func b(xs []int) []int {
xs = append(xs, 1, 2)
return xs
}
Suppose we want to re-write all places where append
calls can be merged.
Let’s try gogrep
:
gogrep -w -x '$x=append($x,$a);$x=append($x,$b)' -s '$x=append($x,$a,$b)' ./...
- Find all replacement candidates with
-x
command - Re-write matches with
-s
command$x=append($x,$a,$b)
pattern - Add a
-w
argument to write replacements to the source files instead of stdout
With gogrep extension for VS Code it becomes even easier.
Here is an example of +=1
replacement with ++
:
Real-world example: once upon a time, I wanted to perform a global slice[:] -> slice
replacement. I’ve even opened an issue in staticcheck. The major problem is that you can’t just search for [:]
without knowing the types. array[:]
expression makes sense, we only want to match strings and slices.
Here is how you can perform such replacement for []byte
slices inside Go stdlib:
# Search only.
gogrep -x '$s[:]' -a 'type([]byte)' std
# Search+replace.
gogrep -x '$s[:]' -a 'type([]byte)' -s '$s' -w std
If you wonder what would that query reveal
Only the first 30 results are showed (out of 300+):
$GOROOT/src/archive/tar/format.go:163:59: b[:]
$GOROOT/src/archive/tar/reader.go:345:33: tr.blk[:]
$GOROOT/src/archive/tar/reader.go:348:17: tr.blk[:]
$GOROOT/src/archive/tar/reader.go:348:28: zeroBlock[:]
$GOROOT/src/archive/tar/reader.go:349:34: tr.blk[:]
$GOROOT/src/archive/tar/reader.go:352:18: tr.blk[:]
$GOROOT/src/archive/tar/reader.go:352:29: zeroBlock[:]
$GOROOT/src/archive/tar/reader.go:396:23: tr.blk[:]
$GOROOT/src/archive/tar/reader.go:497:36: blk[:]
$GOROOT/src/archive/tar/reader.go:528:33: blk[:]
$GOROOT/src/archive/tar/reader.go:531:14: blk[:]
$GOROOT/src/archive/tar/writer.go:392:26: blk[:]
$GOROOT/src/archive/tar/writer.go:477:23: zeroBlock[:]
$GOROOT/src/archive/zip/reader.go:233:29: buf[:]
$GOROOT/src/archive/zip/reader.go:236:15: buf[:]
$GOROOT/src/archive/zip/reader.go:251:30: buf[:]
$GOROOT/src/archive/zip/reader.go:254:15: buf[:]
$GOROOT/src/archive/zip/writer.go:92:17: buf[:]
$GOROOT/src/archive/zip/writer.go:110:19: buf[:]
$GOROOT/src/archive/zip/writer.go:116:30: buf[:]
$GOROOT/src/archive/zip/writer.go:132:27: buf[:]
$GOROOT/src/archive/zip/writer.go:157:17: buf[:]
$GOROOT/src/archive/zip/writer.go:177:27: buf[:]
$GOROOT/src/archive/zip/writer.go:190:16: buf[:]
$GOROOT/src/archive/zip/writer.go:198:26: buf[:]
$GOROOT/src/archive/zip/writer.go:314:18: mbuf[:]
$GOROOT/src/archive/zip/writer.go:319:31: mbuf[:]
$GOROOT/src/archive/zip/writer.go:386:16: buf[:]
$GOROOT/src/archive/zip/writer.go:398:23: buf[:]
$GOROOT/src/bytes/bytes.go:172:24: b[:]
Search patterns
A search pattern is a small Go code fragment that can include $-expressions (we’ll refer to them as “pattern variables”). A pattern can be an expression, a statement, or a declaration.
Pattern variables are Go variables with $
prefix. Pattern variables with the same name always match identical AST nodes. The only exception is $_
variable that can be used multiple times to express “whatever” parts of the pattern.
If *
is placed before the pattern variable name then it will match zero or more nodes instead of exactly one.
Search pattern | Interpretation |
---|---|
$_ |
Anything. |
$x |
Identical to the example above, “anything”. |
$x = $x |
Self-assignment. |
(($_)) |
Anything surrounded by two pairs of parens. |
if $init; $cond {$x} else {$x} |
if with duplicated then/else blocks. |
fmt.Fprintf(os.Stdout, $*_) |
Fprintf call with os.Stdout argument. |
As already demonstrated in the example with append()
, the pattern can contain multiple statements. The “$x; $y
” syntax means “find $x that is followed by $y”.
gogrep
performs a genuine backtracking for the patterns with *
. For example, the pattern below can find all map
literals that contain at least 1 duplicated key expression:
map[$_]$_{$*_, $key: $val1, $*_, $key: $val2, $*_}
The Go compiler finds the duplicated map keys in literal, but only if they’re constant expressions.
gogrep
pattern will find more suspicious candidates.
Pipelines, commands, and attributes
We already used -x
and -s
commands before, but now it’s time to describe them in detail.
gogrep
accepts a list of commands that form a pipeline. The order of the commands matters. The full synopsis looks like this:
gogrep command [more commands...] [targets...]
A target
can be a file, a folder, or a package (same as the go build
targets).
My number-1 frequently used target is ./...
which performs a recursive search.
Command | Description |
---|---|
-x pattern |
Find nodes that match the pattern . |
-g pattern |
Discard matches that do not match the pattern . |
-v pattern |
Discard matches that do match the pattern . |
-a attribute |
Discard matches that do not have the attribute . |
-s pattern |
Re-write the match using the `pattern. |
-p n |
For every match, navigate up n node parents. |
Normally, -x
is the very first pipeline command and is followed by other commands that perform filtering or substitution.
Examples should help you to understand this idea better.
// file foo.go
package foo
func bar() {
println(1)
println(2)
println(3)
}
# Find all println() calls.
$ gogrep -x 'println($*_)' foo.go
foo.go:4:2: println(1)
foo.go:5:2: println(2)
foo.go:6:2: println(3)
# The first -v discards matches that contain a literal 1.
# The second -v discards matches that contain a literal 2.
$ gogrep -x 'println($*_)' -v 1 -v 2 foo.go
foo.go:6:2: println(3)
# -p 2 traverses two nodes up, leading us to the containing *ast.BlockStmt.
$ gogrep -x 'println($*_)' -v 1 -v 2 -p 2 foo.go
foo.go:3:12: { println(1); println(2); println(3); }
There are a lot of attributes and they’re quite underdocumented.
You can learn them from the source code though.
An example below will illustrate the concept of the attributes.
# Will match both numerical and string "+" operations.
gogrep -x '$lhs + $rhs'
# Matches only string concatenations.
gogrep -x '$lhs + $rhs' -a 'type(string)'
By default, gogrep
does not perform a search inside test files. If you want to include these files, use -tests
argument.
VS Code extension overview
The extension exposes gogrep search commands (Ctrl+Shift+P
or Cmd+Shift+P
):
Every command creates a search pattern prompt:
Search results are printed to the output channel named gogrep
:
To perform a search and replace, divide the “Find” and “Replace” patterns with ->
token:
The trailing !
serves as a -w
switch. If you end your search and replace patterns with it, your files will be updated. Otherwise, the replacement results are printed to the output channel.
An example of how you can find append()
calls that can be combined:
By default, the extension commands are not bound to any hotkey combination. If you want quick access to them, you can assign any shortcut, following your sense of what is ergonomic.
The extension can automatically install the gogrep
binary for linux-amd64
, windows-amd64
and darwin-amd64
. If something goes wrong or you’re using a different platform, consider building gogrep
from the source:
GO111MODULE=on go get mvdan.cc/gogrep
If you have a feature request (or a bug report), consider opening the issue on the GitHub.
Closing words
I hope this tutorial will make gogrep
more accessible to people.
If you’re using JetBrains IDE, you may be familiar with structural search and replace (SSR) mechanism. It’s basically doing the same thing, but you’re can’t use it outside of the IDE, so gogrep
wins here.
In case you find yourself doing the same refactoring commands, again and again, consider to use the ruleguard with -fix
option on the file save hook.
// These 3 rules will find Fprint* calls with Stdout arguments and
// will replace them with Print* equivalents.
// The patterns inside Match() are gogrep patterns.
m.Match(`fmt.Fprint(os.Stdout, $*args)`).Suggest(`fmt.Print($args)`)
m.Match(`fmt.Fprintln(os.Stdout, $*args)`).Suggest(`fmt.Println($args)`)
m.Match(`fmt.Fprintf(os.Stdout, $*args)`).Suggest(`fmt.Printf($args)`)
Additional resources: