In praise of Nushell

28 Feb 2024

I'm a little old school and spend a ton of time in the terminal; many programmers and data scientists are the same.

The magic of pipes

Most people will use the default shell that is installed on their system, either bash or zsh.1 Running commands is OK, but the real magic is when commands are designed to do one small thing well, and then you compose them together with pipes:

# look for "hello" in all text files
find . -name '*.txt' | xargs grep 'hello'

What pipes give you is a powerful, incremental way of working, where you can gradually chain together simple commands to build up complex operations, all the while getting to see the intermediate output as you go.
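To make that incremental workflow concrete, here is a tiny sketch (the file name and its contents are invented for illustration): each step adds one more stage to the pipe, and you can inspect the output at every step.

```shell
# Invented sample data, just to have something to pipe.
printf 'cherry\napple\nbanana\napple\ncherry\napple\n' > /tmp/fruit.txt

cat /tmp/fruit.txt                  # step 1: see the raw stream
cat /tmp/fruit.txt | sort           # step 2: group identical lines together
cat /tmp/fruit.txt | sort | uniq -c # step 3: count each distinct line
cat /tmp/fruit.txt | sort | uniq -c | sort -rn | head -1  # step 4: the most common line
```

Nothing was planned up front; the final one-liner grew out of checking each intermediate result.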

Commands in this workflow -- find, grep, cat, sort, uniq -- are all built around streams of bytes, which makes them very flexible. But in reality, very little of what we push through them is "just bytes". Look around your projects: what's actually there is not plain text files but databases, APIs, and configuration files, in formats like JSON, YAML, TOML, and CSV.

In short, we're nearly always working with structured data of some kind.

Wouldn't it be great if instead of just passing around bytes, our shell could also understand this structure and help us make the most of it? What would that give us?

Enter Nushell

Nushell is a non-POSIX shell implemented in Rust and based around the concept of structured data.

Non-POSIX means that everyday commands like ls, mkdir, find and rm have been redefined to work better with structured data, and that things like environment variables are configured differently from common shells like bash and zsh.
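As a small sketch of that difference: in Nushell, environment variables are set with ordinary assignment rather than export, and (with the default config on Unix-like systems) PATH is a real list rather than a colon-delimited string. The directory name below is invented:

```nu
# bash/zsh equivalent: export EDITOR=vim
$env.EDITOR = "vim"

# PATH is a list you can manipulate structurally
# ("/opt/tools/bin" is a hypothetical path):
$env.PATH = ($env.PATH | append "/opt/tools/bin")
```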

Having to re-learn everyday things is a real cost, but Nushell brings benefits that more than make up for it.

A beautiful data explorer

The first and most obvious thing about Nushell is that it is a beautiful in-terminal data explorer.

Say we have a YAML or JSON file. Sure, we can use cat or less to look at it, or open it in our editor of choice. But Nushell's native open command can display it beautifully, even including the fact that there is nested hierarchical data in there.
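A minimal sketch, with an invented file name and keys: open parses the file by its extension and hands you structured data you can drill into, instead of a wall of text.

```nu
open config.yaml                     # rendered as a navigable table
open config.yaml | get server.port   # reach straight into nested values
```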

Common commands redefined

Nushell redefines common shell commands and has them output structured data, rather than rendering them as text.

For example, instead of pretty-printed text, ls outputs a table that you can sort, filter, and transform as you like.
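For instance (the size threshold here is arbitrary):

```nu
# real comparisons on real values -- no awk or cut required
ls | where size > 1mb | sort-by modified
```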

Pervasive, structured data

Nushell's basic commands and pipes give you a swiss-army knife for transforming data, right in your shell. How does it achieve that?

Unlike bash or zsh, Nushell is built around the idea of structured data. It has a range of basic types, including numbers, strings, dictionaries and lists -- in short, the types a modern programming language is built on. On top of these it supports tables, which are built from any sequence of dictionaries.
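A quick sketch of those types as Nushell literals (all values invented):

```nu
42                                    # integer
"hello"                               # string
[1 2 3]                               # list
{name: "nu", year: 2019}              # record (dictionary)
[{name: "a", size: 10} {name: "b", size: 20}]  # a table: a sequence of records
```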

Anything you might want to do with sequences, records and tables is doable in Nushell with elegant, composable commands and a lovely closure syntax for full flexibility: filtering, sorting, grouping, joining, and aggregating data are all first-class.
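For example, a small sketch with made-up data, using a closure with each:

```nu
[{name: "a.txt", size: 120} {name: "b.txt", size: 30} {name: "c.txt", size: 200}]
| where size > 50                             # filter
| sort-by size                                # sort
| each {|row| $"($row.name): ($row.size)"}    # transform with a closure
```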

Rough edges and tradeoffs

There are some ways in which Nushell can be challenging or annoying to use, but also some where it's so good you wish it could be even better.

Learning curve

Nushell breaks from POSIX to achieve great things, but that does mean learning a lot of new commands, and the initial environment set-up takes some study. It likely took you years to build the muscle memory to type POSIX commands just how you like them; unlearning them can be mildly annoying.

Lack of stability

The language is still evolving. If you install via a rolling distribution like Homebrew on macOS, you'll periodically find a piece of your shell config breaking because of a change in Nushell's syntax or semantics. It's usually not hard to fix, but it's happened several times in the six months I've been using it, versus never for bash or zsh.

Streaming and type inference

Part of Nushell's power and usability is that it (a) does great type inference so you don't have to, and (b) streams data through commands, so you can work with large datasets. But these two features can interact badly: a pipeline that works on the first few records of a stream can throw type errors when run on the whole stream, because later records cue a different inferred type.
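A hypothetical illustration of the pitfall (file and column names invented): if a CSV column holds numbers in its early rows but a string like "N/A" further down, a comparison that succeeds on a sample can fail on the full file.

```nu
open measurements.csv | first 10 | where value > 100  # works: early rows are numeric
open measurements.csv | where value > 100             # may error: a later row holds "N/A"
```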

Tuning performance

Since it's so nice as a data workspace, it's convenient to push as much data through Nushell as it can handle, e.g. whole database tables. The limiting factor here is the lack of query planning.

A database would know to only compute the columns you've specified at the end of your pipeline, but Nushell is not able to propagate constraints backwards like that; it's still up to you to push constraints early in your pipelines and select just the fields you need to keep a large pipeline performant.
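In practice that means putting where and select as early in the pipeline as you can; a hypothetical sketch:

```nu
# slower: every field of every record flows through the whole pipeline
open events.json | sort-by timestamp | where status >= 500 | select timestamp status

# faster: narrow the rows and columns first, then do the expensive sort
open events.json | where status >= 500 | select timestamp status | sort-by timestamp
```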

Overall assessment

Despite these concerns, Nushell is absolutely usable as a daily shell today.

It vastly raises the bar on what you can expect to do and achieve in a terminal environment, and has for example completely replaced all other database query tools in my workflow.

Overall it's a joy to use, and I'm excited to see how the language and capabilities will evolve.


  1. This might seem to exclude Windows users, but they are increasingly using the Windows Subsystem for Linux (WSL) which brings them into the same Unix framework as everyone else.