R, pipes, & readability
Or: How I Learned to Stop Worrying and Love the Right Assignment Operator
In version 4.1.0[1], R introduced the native pipe operator:
R now provides a simple native forward pipe syntax |>. The simple form of the forward pipe inserts the left-hand side as the first argument in the right-hand side call. The pipe implementation as a syntax transformation was motivated by suggestions from Jim Hester and Lionel Henry.
[1] R News
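One way to see the "syntax transformation" mentioned in the release note is to quote a piped expression and look at what the parser hands back. This is a minimal sketch, assuming R 4.1.0 or later; x, y, and f are placeholder names that are never evaluated.

```r
# The native pipe is rewritten when the code is parsed, so quoting a
# piped expression yields the ordinary call with the left-hand side
# inserted as the first argument.
piped <- quote(x |> f(y))

print(piped)
# f(x, y)

# The deparsed text of the quoted expression contains no pipe at all
deparse(piped)
```

Because the rewrite happens at parse time, the piped and nested forms should be the same call by the time R evaluates them.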
This is an incredibly useful tool, as it allows functions to be composed into clear, sequential pipelines[2]. Personally, I find using functions with pipes far more readable than the more common “onion-style” function calls, since pipes let me read from left to right (or top to bottom) rather than from right to left:
# Onion style
quux(bar(foo(df)))
# Pipes!
df |>
  foo() |>
  bar() |>
  quux()
In the pipe example above, I think that giving each function its own line also makes the code easier to parse at a glance. While I don’t have any cold hard data to support this, I’d venture that pipes and composable functions are a major part of why the Tidyverse has been so successful.
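To make the comparison concrete, here is a small runnable version of the two styles. The functions foo(), bar(), and quux() above are placeholders, so this sketch swaps in base R's round(), unique(), and sort() on a made-up numeric vector; it assumes R 4.1.0 or later for the native pipe.

```r
# Made-up input data for illustration
values <- c(3.14, 2.72, 3.14, 1.41)

# Onion style: read from the innermost call outward
onion <- sort(unique(round(values, 1)))

# Pipe style: read from top to bottom
piped <- values |>
  round(1) |>
  unique() |>
  sort()

# Both styles compute the same result
identical(onion, piped)
```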
A more unusual pattern that I like to use with pipes is to assign the result of a pipeline using the right assignment operator because this follows the natural flow of the code. When adopting this approach, my eyes don’t have to jump back to the beginning of the pipeline to remind myself of which variable I’m binding the results to. Compare the following examples.
# Left assignment
result <- df |>
  foo() |>
  bar() |>
  quux()
# Right assignment
df |>
  foo() |>
  bar() |>
  quux() -> result
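The two forms above bind the same value; only the position of the variable name changes. A minimal runnable sketch, using base R's sqrt() and sum() as stand-ins for the placeholder functions and assuming R 4.1.0 or later:

```r
# Made-up input data for illustration
values <- c(4, 9, 16)

# Left assignment: the name comes first, the pipeline follows
left_result <- values |>
  sqrt() |>
  sum()

# Right assignment: the pipeline comes first, the name closes it.
# `->` binds less tightly than `|>`, so the whole pipeline is assigned.
values |>
  sqrt() |>
  sum() -> right_result

identical(left_result, right_result)
```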
Usually, the right assignment operator is considered bad form. For example, the assignment_linter in the lintr package specifies that the right assignment operator should not be allowed by default[3]. While I agree with this as a rule of thumb, I nevertheless think that pipes are an exception for the sake of readability. As Abelson and Sussman noted in the preface to the first edition of SICP:
[P]rograms must be written for people to read, and only incidentally for machines to execute.
[3] Note that the assignment_linter is a default linter.
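If you want lintr to tolerate right assignment at the end of a pipeline, one option is to relax assignment_linter rather than disable it. This is a sketch under assumptions: it needs the lintr package installed and relies on the allow_right_assign argument as found in lintr 3.x. Newer releases may spell this option differently, so check ?assignment_linter for your installed version.

```r
library(lintr)

# Assumption: the installed lintr version accepts `allow_right_assign`.
code <- "df |> foo() -> result"

# Default settings: the right assignment should be flagged
default_lints <- lint(text = code, linters = assignment_linter())

# Relaxed settings: the right assignment should pass
relaxed_lints <- lint(
  text = code,
  linters = assignment_linter(allow_right_assign = TRUE)
)

length(default_lints)
length(relaxed_lints)
```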
As an alternative to the two examples shown above, magrittr also has a compound assignment pipe operator, %<>%, which pipes the left-hand side through the right-hand side and assigns the result back to the left-hand side. For example, df %<>% foo() %>% bar() %>% quux() is essentially equivalent to df <- df |> foo() |> bar() |> quux().
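A small runnable illustration of that equivalence, assuming the magrittr package is installed and R is 4.1.0 or later; sqrt() and rev() stand in for the placeholder functions, and the input vector is made up.

```r
library(magrittr)

# Compound assignment pipe: x is piped through the calls and overwritten
x <- c(1, 4, 9)
x %<>% sqrt() %>% rev()

# The same update written with the native pipe and explicit reassignment
y <- c(1, 4, 9)
y <- y |> sqrt() |> rev()

identical(x, y)
```

Note that %<>% overwrites its left-hand side, so it suits in-place updates rather than binding a result to a new name.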