[RFC 0148] Pipe operator #148

Open. Wants to merge 7 commits into base branch `master`.
330 changes: 330 additions & 0 deletions rfcs/0148-pipe-operator.md
@@ -0,0 +1,330 @@
---
Member

Plan for moving forward/stabilizing feature in Nix

Now that the experimental feature has been released, it may be time to start talking about a plan for evaluating it.

I see a number of possibilities:

  • We do nothing and hope it happens organically that enough enthusiasts try the feature out to give it a good trial. In all likelihood, people will only comment if they dislike something about the feature, which means we could receive little or no feedback and be left wondering if enough people have looked at it. This requires least effort but is likely to take the longest.
  • We promote the feature in official Nix channels and set up some sort of polling for both positive and negative feedback, with an expectation of how long the poll will be running before moving this RFC to FCP. This requires a bit of up-front work and may be biased towards people who pay attention to official Nix channels, but has the advantage of resolving faster than the first possibility.
  • We devise an actual experiment to measure people's ability to learn the new operators. This might take the form of a survey asking people to rate the readability of various Nix expressions, some with the new operators and some without, and testing if there is a statistically significant difference in the ratings. We could send the survey to a representative sample of community members instead of (or in addition to) the public polling from the previous possibility, to try to control for bias. I'm not aware of the Nix community undertaking this level of rigor in the past but it would help us bring a stronger case for assuaging people's concerns about the mental cost of adding new operators. It would be a substantial investment of time from a few people (I'm happy to be one of those people but it probably shouldn't be me alone).

Anyone else have thoughts?

Member

> We do nothing and hope it happens organically that enough enthusiasts try the feature out to give it a good trial. In all likelihood, people will only comment if they dislike something about the feature, which means we could receive little or no feedback and be left wondering if enough people have looked at it. This requires least effort but is likely to take the longest.

I think it would be beneficial to give the new feature more visibility and promotion, rather than leaving it as an experimental feature indefinitely.
Collecting feedback through some form of polling or voting mechanism could provide valuable insights.
Additionally, introducing new features in a programming language is a significant event, especially for a community-driven project like Nix. It's likely that experienced PL experts would be needed to assess its impact and stability.

Member

@rhendric rhendric Aug 15, 2024

(Attempting to corral feedback into a thread.)

@illustris [#148 (comment)]

nix-repl> double = x: x*2

# current syntax
nix-repl> 3 |> (x: 1 + x) |> double |> toString
"8"

# proposed alternative
nix-repl> 3 |> x: 1 + x |> double |> toString

# current syntax
nix-repl> 3 |> (x: 1 + (x |> double)) |> toString
"7"

# proposed alternative
nix-repl> 3 |> x: 1 + (x |> double) |> toString

Both of those proposals seem, to me, to be ‘more likely’ to mean something else:

3 |> (x: (1 + x |> double |> toString))
3 |> (x: (1 + (x |> double) |> toString))

In other words, if |> binds more loosely than the lambda-forming :, it would be the first thing in the language to do so and thus would be very surprising to me. So far in the language, a lambda always extends as far rightward as it can.

Member Author

Partially applied pipes

To me this smells like function composition with extra steps. Function composition is discussed in the RFC as well, and while I am not completely opposed to it, my fear is that it reduces code readability given a lack of type annotations. In x = |> inc |> toString it is a lot less intuitively clear that x is a function.

This is one of these questions where I'd like to see more usage to find out whether this is a thing that comes up sufficiently often in practice that it is worth dealing with.

(Side note: I think we would get this feature for free if Nix had operator sections)

Lambdas inside pipe

While I agree that not requiring parentheses here would be nice, especially given that one goal of the pipe operator is to reduce parentheses, I don't see any way this would be realistically implementable without throwing the entire language under the bus. Making the binding strength of abstractions context dependent just doesn't sound great

Member

> Making the binding strength of abstractions context dependent just doesn't sound great

agreed, it’s a recipe for disaster

@illustris illustris Aug 16, 2024

Lambdas inside pipe

> Making the binding strength of abstractions context dependent

Right. I didn't think that through. This is a bad idea.

Partially applied pipes

> In x = |> inc |> toString it is a lot less intuitively clear that x is a function

In my opinion it is slightly more clear than, for example,

concatLines = concatStringsSep "\n"
getBin = getOutput "bin"

or any other partially applied function. The open |> (or |) at the start suggests to me that this is a function.

feature: pipe-operator
start-date: 2023-05-23
author: @piegamesde
shepherd-team: @roberth @rhendric @illustris @adrian-gierakowski
shepherd-leader: @rhendric
related-issues: (will contain links to implementation PRs)
---

# Summary
[summary]: #summary

Introduce a new "pipe" operator, `|>`, to the Nix language, defined as `f a` = `a |> f`.
Additionally, elevate `lib.pipe` to a built-in function.

As a reminder, `pipe a [ f g h ]` is defined as `h (g (f a))`.
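The definition above can be sketched in plain Nix as a left fold. This is a minimal illustration that runs without Nixpkgs; the helper names `inc` and `double` are made up for the example, and the local `pipe` is an assumed stand-in for `lib.pipe`.

```nix
let
  # Assumed stand-in for lib.pipe: fold the value through the list, left to right.
  pipe = builtins.foldl' (x: f: f x);
  # Hypothetical helpers, for illustration only.
  inc = x: x + 1;
  double = x: x * 2;
in
{
  # pipe a [ f g h ] ...
  piped = pipe 3 [ inc double toString ];
  # ... is h (g (f a))
  nested = toString (double (inc 3));
}
```

Both attributes evaluate to `"8"`.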

# Motivation
[motivation]: #motivation

Advanced data processing, such as transforming a list, is common in Nixpkgs.
Yet the language has no support for function concatenation/composition,
which results in such constructs looking unwieldy and difficult to format well.
`lib.pipe` may be the most powerful library function in that regard,
but it is unknown and overlooked by many because it is not easily discoverable:
Despite its great usefulness, it is currently used in fewer than 30 files in Nixpkgs
(`rg '[\. ]pipe .* \['`).
Additionally, it is not accessible to Nix code outside of Nixpkgs,
and due to Nix's lazy evaluation, debugging type errors is really difficult.

Let's have a look at an arbitrarily chosen snippet of Nixpkgs code:

```nix
defaultPrefsFile = pkgs.writeText "nixos-default-prefs.js" (lib.concatStringsSep "\n" (lib.mapAttrsToList (key: value: ''
// ${value.reason}
pref("${key}", ${builtins.toJSON value.value});
'') defaultPrefs));
```

It is arguably pretty hard to read and reason about, even when applying some more whitespace-generous formatting:

```nix
defaultPrefsFile = pkgs.writeText "nixos-default-prefs.js" (
  lib.concatStringsSep "\n" (
    lib.mapAttrsToList
      (
        key: value: ''
          // ${value.reason}
          pref("${key}", ${builtins.toJSON value.value});
        ''
      )
      defaultPrefs
  )
);
```

One can observe the following issues:

- If you want to follow the data flow, you must read it from bottom to top,
  from the inside to the outside (the input here is `defaultPrefs`).
- Adding a function call to the output would require wrapping the entire
  expression in parentheses and increasing its indentation.

Compare this to the equivalent call with `lib.pipe`:

```nix
defaultPrefsFile = pipe defaultPrefs [
  (lib.mapAttrsToList (
    key: value: ''
      // ${value.reason}
      pref("${key}", ${builtins.toJSON value.value});
    ''
  ))
  (lib.concatStringsSep "\n")
  (pkgs.writeText "nixos-default-prefs.js")
];
```

The code now clearly reads from top to bottom in the order the data is processed,
and it is easy to add and remove processing steps at any point.

With a dedicated pipe operator, it would look like this:

```nix
defaultPrefsFile = defaultPrefs
  |> lib.mapAttrsToList (
    key: value: ''
      // ${value.reason}
      pref("${key}", ${builtins.toJSON value.value});
    ''
  )
  |> lib.concatStringsSep "\n"
  |> pkgs.writeText "nixos-default-prefs.js";
```

The artificial distinction between the first input and the functions via the list is now gone,
and so are the parentheses around the functions.
With the lower character overhead, using the operator becomes attractive in more situations,
whereas a `pipe` pays for its overhead only in more complex scenarios (usually three functions or more).
Having a dedicated operator also increases visibility and discoverability of the feature.

# Detailed design
[design]: #detailed-design

## `|>` operator

A new operator `|>` is introduced into the Nix language.
It is defined as function application with the order of arguments swapped: `f a` = `a |> f`.
It is left-associative and has a binding strength weaker than function application:
`a |> f |> g b |> h` = `h ((g b) (f a))`.
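
As a small sketch of these rules, the following expression spells out the same pipeline twice. It assumes a Nix build in which the experimental pipe operator is enabled; `add` and `double` are hypothetical helpers introduced only for this example.

```nix
let
  # Hypothetical helpers, for illustration only.
  add = a: b: a + b;
  double = x: x * 2;
in
{
  # Function application binds tighter than |>, so `add 10` is one pipeline stage.
  piped = 3 |> double |> add 10 |> toString;
  # The same computation written as nested applications: h ((g b) (f a)).
  nested = toString ((add 10) (double 3));
}
```

Both attributes evaluate to `"16"`.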

# Examples and Interactions
[examples-and-interactions]: #examples-and-interactions

## Tooling support

Like any language extension, this will require the available Nix tooling to be updated.
Updating parsers should be pretty easy, as the syntax changes to the language are fairly minimal.
Tooling that evaluates Nix code in some way or does static code analysis should be easy to support too,
since one may treat the operator as syntactic sugar for function application.
No fundamentally new semantics are introduced to the language.

## Nixpkgs interaction

As soon as the Nixpkgs minimum version contains `|>`, using it will be allowed and encouraged in the documentation.

Is it allowed to backport the pipe operator to the Nixpkgs minimum nix version?

Do major feature additions like this usually get a backport?

Even if it was backported, wouldn't it defeat the purpose of having a minimal version that's likely to be installed by most users?

A backport would still be a new release, even if it was based on an old major version number.

Member

@alyssais alyssais Sep 29, 2024

Sometimes. Support for zstd cache compression was backported to 2.3, so it's within the realms of possibility.

The reason we have a minimum version of 2.3 is not so much to support people who haven't updated Nix in several years; it's that later versions of Nix have unresolved regressions in things that some people depend on.

There might be efforts to automatically convert existing `lib.pipe` usage or even discourage/deprecate using that,
see future work.

### Existing lib functions

Nixpkgs `lib` contains a couple of functions that are concatenated versions of other lib functions,
for example `concatMapStringsSep` being a fuse of `map` and `concatStringsSep`.
This is not unusual in many programming languages,
nevertheless the existence of easy to use piping functionality would reduce the need for some of them.

Of course removing existing lib functions is not an option, but in the future,
newly added functions should meet stronger criteria than being purely convenience helpers replacing two function calls with one.

To stay with that example, is the function called `concatMapStringsSep` or `concatMapStringSep`?
Do you provide the mapper or the separator first?
Using `map (…) |> concatStringsSep` requires memorizing less information.
Some examples with different alternatives:

```nix
lib.concatMapStringsSep "\n" (test: writeTest "success" test.name "${test}/bin/${test.name}") (lib.attrValues bin)

lib.concatStringsSep "\n" (map (test: writeTest "success" test.name "${test}/bin/${test.name}") (lib.attrValues bin))

lib.attrValues bin |> map (test: writeTest "success" test.name "${test}/bin/${test.name}") |> lib.concatStringsSep "\n"

lib.concatStringsSep "\n" <| map (test: writeTest "success" test.name "${test}/bin/${test.name}") <| lib.attrValues bin
```

# Prior art

Nickel has `|>` too, with the same name and semantics.

F# has `|>`, called "pipe-forward" operator, with the same semantics.
Additionally, it also has "pipe-backward" `<|` and `>>`/`<<` for forwards and backwards function composition.
`<|` is equivalent to function application, but its lower binding strength allows removing parentheses:
`g (f a)` = `g <| f a`. All these operators have the same precedence and are left-associative.
F#'s `<|` being left-associative strongly reduces its usefulness;
this can be considered a mistake/compromise/collateral in the language design.
All other discussed variants of `<|` in other languages are right-associative.

Elm has the same operators as F#.

Haskell has the (backwards) function composition operator `.` in its prelude: `(g . f) a` = `g (f a)`.
It also has "reverse application" `&`, which is roughly equivalent to `|>`,
and `$`, which is function application again but right-associative and very weakly binding.
`.` binds stronger than both.

`|>` is definable as an infix function in several other programming languages,
and in even more languages as macro or higher-order function (including Nix, that's `lib.pipe`).
Notably, the Haskell package `flow` provides some common operators like `|>` and `<|`,
with the usual associativity and same binding strength (unlike Haskell's `$` and `&` discussed above).

Languages that allow for custom operators with custom associativity and precedence, like Haskell and Scala
(but unlike F#), usually reject mixing same-strength operators of different associativity without parentheses
as a syntax/compile error.

# Alternatives
[alternatives]: #alternatives

For each change this RFC proposes, there is always the trivial alternative of not doing it. See #drawbacks.

We could use the occasion and introduce more operators like those mentioned above.

## Function composition operators

Function composition is mostly interesting for the so-called "point-free" programming style,
where partially applied compositions of functions are preferred over the introduction of lambda terms.
However, Nix is not well suited for that programming style for various reasons,
nor would that point-free style have nearly as many applications in typical Nixpkgs code.

Take for example this library function, written in a point-free style by using `flip pipe` as function concatenation operator:

```nix
concatMapAttrs = f: flip pipe [ (mapAttrs f) attrValues (foldl' mergeAttrs { }) ];
```

When reading this code, one has to manually do the headwork of inferring the types to understand what this function does.
In Haskell, its powerful type system and type inference would quickly spot any mistakes made.
But in Nix, this can lead to very confusing runtime errors instead
(even ignoring the additional stack trace noise of using `flip pipe`).
Compare this to the fully specified version of the same function:

```nix
concatMapAttrs = f: v: pipe v [ (mapAttrs f) attrValues (foldl' mergeAttrs { }) ];
```

Would you have guessed correctly from the first code example whether it's `f: v:` or `v: f:`?
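
To check the answer, here is a self-contained sketch that evaluates both variants without Nixpkgs. The local `pipe`, `flip` and `mergeAttrs` are assumed stand-ins for the corresponding `lib` functions, and `dup` and `input` are made up for the example.

```nix
let
  # Assumed stand-ins for the lib functions used above.
  pipe = builtins.foldl' (x: f: f x);
  flip = f: a: b: f b a;
  mergeAttrs = a: b: a // b;

  # Point-free variant: the second argument is implicit.
  pointFree = f: flip pipe [ (builtins.mapAttrs f) builtins.attrValues (builtins.foldl' mergeAttrs { }) ];
  # Fully specified variant: the mapper comes first, then the attribute set.
  explicit = f: v: pipe v [ (builtins.mapAttrs f) builtins.attrValues (builtins.foldl' mergeAttrs { }) ];

  # Hypothetical mapper and input, for illustration only.
  dup = name: value: { "${name}-a" = value; "${name}-b" = value + 1; };
  input = { x = 1; y = 10; };
in
pointFree dup input == explicit dup input
```

The comparison evaluates to `true`, so the argument order is `f: v:`.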

## Pipe-forward vs pipe-backward

We could use `<|` instead of `|>`:

```nix
defaultPrefsFile =
  pkgs.writeText "nixos-default-prefs.js" <|
  lib.concatStringsSep "\n" <|
  lib.mapAttrsToList (
    key: value: ''
      // ${value.reason}
      pref("${key}", ${builtins.toJSON value.value});
    ''
  ) <| # the '<|' here is optional/redundant
  defaultPrefs
  ;
```

`<|` also opens up other scenarios for which `|>` might be less well suited
(examples inspired by https://github.com/NixOS/nix/issues/1845):

```nix
lib.makeOverridable <|
{ foo, bar }:

builtins.trace "my debug stuff" <|
# some more code here
```

While only one of them would probably be sufficient for most use cases, we could also have both `|>` and `<|`.
Given that we want to call them `|>` and `<|`, users should assume that both have equal binding strength.
Therefore, mixing them without parentheses should be forbidden, as in other languages;
making `<|` weaker than `|>`, like Haskell's `$` and `&`, would be a bad idea.

## Change the `pipe` function signature

There are many equivalent ways to declare this function, instead of just using the current design.
For example, one could flip its arguments so that it becomes function composition on a list of functions.
Not only can this function trivially replace pipe; it can also be readily used where a function is expected, such as in `map`.
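
A minimal sketch of such a flipped variant follows. The name `compose` is hypothetical and not part of `lib`; the example only illustrates how the result can be passed directly to `map`.

```nix
let
  # Hypothetical name: `pipe` with its two arguments flipped,
  # i.e. left-to-right composition of a list of functions.
  compose = fs: v: builtins.foldl' (x: f: f x) v fs;
  inc = x: x + 1;
in
# The composed function is used where a function is expected.
map (compose [ inc toString ]) [ 1 2 3 ]
```

This evaluates to `[ "2" "3" "4" ]`.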

The current design of `pipe` has the advantage that its asymmetry points at its operating direction, which is quite valuable.

## `apply` keyword

As suggested in https://github.com/NixOS/rfcs/pull/148#discussion_r1206966546,
one could introduce a keyword (tentatively called `apply`) for piping,
which is syntactically similar to `with` and `assert` statements:

```nix
apply f;
apply g;
x

# The same as
f (g x)
```

The biggest disadvantage with it is backwards compatibility of adding a new keyword into the language,
which would require solving language versioning first (see RFC #137).

This approach would be roughly equivalent to introducing a `<|` operator.
See the above for a discussion on the overall design space of that.

## `builtins.pipe`

`lib.pipe`'s functionality could be implemented as a built-in function.

The main motivation for this is that it allows giving better error messages,
such as line numbers, when some part of the pipeline fails:
Currently `lib.pipe` internally uses a fold over the list,
therefore any type mismatches will give a trace which points into `lib.fold`,
leaving the user without the information at which stage of the pipeline it failed.
(This is less of a problem when used in packages, but significant enough that currently,
`lib.pipe` unfortunately should not be used in the implementation of any library functions.)
This could probably be fixed within Nixpkgs alone,
however not without incurring a significant performance penalty for using "reflection".
A built-in operator would be able to provide this more detailed error information basically for free.

Additionally, it allows easy usage outside of Nixpkgs and increases discoverability.

While Nixpkgs is bound to minimum Nix versions and thus `|>` won't be available until
several years after its initial implementation,
it can directly benefit from `builtins.pipe` and its better error diagnostics by overriding `lib.pipe`.
Elevating a Nixpkgs library function to a builtin has been done several times before,
for example `bitAnd`, `splitVersion` and `concatStringsSep`.
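
One way such an override could look is sketched below. `builtins.pipe` does not exist at the time of writing, and the sketch assumes it would take the same arguments as `lib.pipe`. The `or` fallback is ordinary attribute-selection syntax, so the expression also evaluates on a Nix without the builtin.

```nix
let
  # Pure-Nix fallback, assumed equivalent to today's lib.pipe.
  libPipe = builtins.foldl' (x: f: f x);
  # `builtins.pipe` is hypothetical; `or` selects the fallback when it is absent.
  pipe = builtins.pipe or libPipe;
in
pipe 3 [ (x: x + 1) toString ]
```

With the fallback in effect, this evaluates to `"4"`.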

The main drawback is that once `|>` is available, there is little use for `builtins.pipe` anymore,
so the main purpose of that would be as a stop-gap for Nixpkgs
until the minimum Nix version is sufficiently high to allow using `|>`.

# Drawbacks
[drawbacks]: #drawbacks

- Introducing `|>` has the drawback of adding complexity to the language, and it will break older tooling.

# Unresolved questions
[unresolved]: #unresolved-questions

- What is the precise binding strength of the operator?
- Who is going to implement this in Nix?
- How difficult will the implementation be?
- Will this affect evaluation performance in some way?
  - There is reason to expect that replacing `lib.pipe` with a builtin will reduce its overhead,
    and that the builtin should have little to no overhead compared to regular function application.

In order to decide which operators to add to the language (see Alternatives),
a larger survey across the Nixpkgs code will be conducted.
This will give us quantitative information to better make any decisions involving tradeoffs.

# Future work
[future]: #future-work

Once introduced and usable in Nixpkgs, existing code may benefit from being migrated to using these features.
Automatically transforming nested function calls into pipelines is unlikely,
as doing so is not guaranteed to always be a subjective improvement to the code.
It might be possible to write a lint which detects opportunities for piping, for example in nixpkgs-hammering.
On the other hand, the migration from `pipe` to `|>` should be a straightforward transformation on the syntax tree.
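
As an illustration of that transformation, the following sketch shows one pipeline in both forms. It assumes a Nix build with the experimental pipe operator enabled; the local `pipe` is an assumed stand-in for `lib.pipe`, and `names` is made-up input.

```nix
let
  # Assumed stand-in for lib.pipe.
  pipe = builtins.foldl' (x: f: f x);
  # Hypothetical input, for illustration only.
  names = [ "b" "a" "c" ];
in
{
  # Before: each list element is one stage, parenthesized where it takes arguments.
  before = pipe names [
    (builtins.sort (a: b: a < b))
    (map (n: "${n}.nix"))
    (builtins.concatStringsSep " ")
  ];
  # After: the list brackets and the per-stage parentheses disappear.
  after = names
    |> builtins.sort (a: b: a < b)
    |> map (n: "${n}.nix")
    |> builtins.concatStringsSep " ";
}
```

Both attributes evaluate to `"a.nix b.nix c.nix"`. Lambdas passed as arguments to a stage still need their parentheses.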