ECMAScript Pattern Matching
Status
Stage: 1
Authors: Originally Kat Marchán (Microsoft, @zkat__); now, the below champions.
Champions: Jordan Harband (Coinbase, @ljharb), Mark Cohen (@mpcsh_), Tab Atkins-Bittner (Google, @tabatkins), Yulia Startsev (Mozilla, @ioctaptceb), Daniel Rosenwasser (Microsoft, @drosenwasser), Jack Works (Sujitech, @Jack-Works), Ross Kirsling (Sony, @rkirsling)
Problem
There are many ways to match values in the language, but there are no ways to match patterns beyond regular expressions for strings. switch
is severely limited: it may not appear in expression position; an explicit break
is required in each case
to avoid accidental fallthrough; scoping is ambiguous (block-scoped variables inside one case
are available in the scope of the others, unless curly braces are used); the only comparison it can do is ===
; etc.
Priorities for a solution
This section details this proposal’s priorities. Note that not every champion may agree with each priority.
Pattern matching
The pattern matching construct is a full conditional logic construct that can do more than just pattern matching. As such, there have been (and there will be more) trade-offs that need to be made. In those cases, we should prioritize the ergonomics of structural pattern matching over other capabilities of this construct.
Subsumption of switch
This feature must be easily searchable, so that tutorials and documentation are easy to locate, and so that the feature is easy to learn and recognize. As such, there must be no syntactic overlap with the switch
statement.
This proposal seeks to preserve the good parts of switch
, and eliminate any reasons to reach for it.
Be better than switch
switch
contains a plethora of footguns such as accidental case fallthrough and ambiguous scoping. This proposal should eliminate those footguns, while also introducing new capabilities that switch
currently can not provide.
Expression semantics
The pattern matching construct should be usable as an expression:
return match { … }
let foo = match { … }
() => match { … }
- etc.
The value of the whole expression is the value of whatever clause is matched.
Exhaustiveness and ordering
If the developer wants to ignore certain possible cases, they should specify that explicitly. A development-time error is less costly than a production-time error from something further down the stack.
If the developer wants two cases to share logic (what we know as “fall-through” from switch
), they should specify it explicitly. Implicit fall-through inevitably silently accepts buggy code.
Clauses should always be checked in the order they’re written, i.e. from top to bottom.
User extensibility
Userland objects should be able to encapsulate their own matching semantics, without unnecessarily privileging builtins. This includes regular expressions (as opposed to the literal pattern syntax), numeric ranges, etc.
Prior Art
This proposal adds a pattern matching expression to the language, based in part on the existing Destructuring Binding Patterns.
This proposal was approved for Stage 1 in the May 2018 TC39 meeting, and slides for that presentation are available. Its current form was presented to TC39 in the April 2021 meeting (slides).
This proposal draws from, and partially overlaps with, corresponding features in Rust, Python, F#, Scala, Elixir/Erlang, and C++.
Userland matching
A list of community libraries that provide similar matching functionality:
- Optionals — Rust-like error handling, options and exhaustive pattern matching for TypeScript and Deno
- ts-pattern — Exhaustive Pattern Matching library for TypeScript, with smart type inference.
Code samples
match (res) {
// match (matchable) {
when ({ status: 200, body, ...rest }) {
// when (pattern) { … }
// ───────↓────── ───↓───
// LHS RHS (sugar for do-expression)
// ───────────↓──────────
// clause
handleData(body, rest);
}
when ({ status: 301 | 304, destination: url }) {
// ↳ `|` (pipe) is the “or” combinator
// ↳ `url` is an irrefutable match, effectively a new name for `destination`
handleRedirect(url);
}
when({ status: 404 }) { retry(req); }
else { throwSomething(); }
// ↳ cannot coexist with top-level irrefutable match, e.g. `when (foo)`
}
res
is the “matchable”. This can be any expression.when (…) { … }
is the “clause”.- the
…
inwhen (…)
is the “pattern”. - Everything after the pattern is the “right-hand side” (RHS), and is sugar for a
do
expression. 301 | 304
uses|
to indicate “or” semantics for multiple patterns- Any valid object or array destructuring is a valid pattern
- An explicit
else
clause handles the “no match” scenario by always matching. It must always appear last when present, as any clauses after anelse
are unreachable.
match (command) {
when ([ 'go', ('north' | 'east' | 'south' | 'west') as dir ]) { … }
when ([ 'take', item ]) { … }
else { … }
}
This sample is a contrived parser for a text-based adventure game. Note the as
keyword, which introduces bindings. In this case, the first clause will match on any of the four compass directions, binding whatever is passed in to dir
for the right-hand side.
match (res) {
if (isEmpty(res)) { … }
when ({ pages, data }) if (pages > 1) { … }
when ({ pages, data }) if (pages === 1) { … }
else { … }
}
This sample is fetching from a paginated endpoint. Note the use of guards (the if
statements), which provide additional conditional logic where patterns aren’t expressive enough.
match (res) {
if (isEmpty(res)) { … }
when ({ data: [page] }) { … }
when ({ data: [frontPage, ...pages] }) { … }
else { … }
}
This is another way to write the previous code sample without a guard, and without checking the page count.
The first when
clause matches if data
has exactly one element, and binds that element to page
for the right-hand side. The second when
clause matches if data
has at least one element, binding that first element to frontPage
, and an array of any remaining elements to pages
.
Note that for this to work properly, iterator results will need to be cached until there’s a successful match, for example to allow checking the first item more than once.
match (arithmeticStr) {
when (/(?<left>\d+) \+ (?<right>\d+)/) as ({ groups: { left, right } }) { process(left, right); }
when (/(?<left>\d+) \+ (?<right>\d+)/) { process(left, right); } // maybe?
else { ... }
}
This sample is a contrived arithmetic expression parser. Regexes are patterns, with the expected semantics.
Named capture groups motivate the user-extensible protocol. It would be intuitive for named capture groups to introduce bindings to the right-hand side. And surely, if regexes can do this, then userland objects should be able to do this as well.
Note the use of the as
keyword to pattern-match the result of this matching protocol (read on for a few more code samples for further detail on this protocol).
Additionally, it would be nice for regex literals to be able to introduce bindings without the with
keyword. This would be a magic special case, but we find it acceptable since it’s still possible to statically analyze the source of all bindings.
const LF = 0x0a;
const CR = 0x0d;
match (token) {
when ^LF { ... }
when ^CR { ... }
else { ... }
}
Here we see the pin operator (^
), which is the escape-hatch from irrefutable matches.
Without ^
, LF
would be an irrefutable match, which would always match regardless of the value of the matchable (token
, here). Then, in the right-hand side, LF
would be bound to the value of token
, shadowing the outer const LF = 0x0a
binding at the top.
With ^
, LF
is evaluated as an expression, which results in the primitive value 0x0a
. This is then matched against token
, and the clause matches only if token
is 0x0a
. The right-hand side sees no new bindings.
^
can only be followed by an identifier, a chain (^foo.bar.baz
), a function call (^foo()
), or a parenthesized expression.
Note: the champions group is not settled on ^
, and is very open to different sigils, a keyword, or any other ideas to distinguish expressions from irrefutable matches.
class Name {
static [Symbol.matcher](matchable) {
const pieces = matchable.split(' ');
if (pieces.length === 2) {
return {
matched: true,
value: pieces
};
}
}
}
match ('Tab Atkins-Bittner') {
when (^Name with [first, last]) if (last.includes('-')) { … }
when (^Name with [first, last]) { … }
else { ... }
}
This sample has two significant parts. First is a contrived name parser, which simply tries to split a string into exactly two space-separated pieces. This parser is contained in a special static [Symbol.matcher]()
method. Next is a match
construct with three clauses: the first matches hyphenated last names, the second matches all names, and the third (the else
) matches anything.
In this case, the pin operator functions a little differently. Name
is still evaluated as an expression, but this time, the result is a class. The engine would then check if that class has a static [Symbol.matcher]()
method, and if so, calls that method on the matchable.
We also see the with
keyword, which is used to destructure and match against the value returned by the matcher protocol.
This operator is probably the thing we’re least happy with, as a champions group. This turns out to be a hard problem to solve. Prior art is a bit of a mixed bag; this is Elixir’s approach. We’re very open to other spellings and other ideas.
Motivating Examples
Matching fetch()
responses:
const res = await fetch(jsonService)
match (res) {
when ({ status: 200, headers: { 'Content-Length': s } }) {
console.log(`size is ${s}`);
}
when ({ status: 404 }) {
console.log('JSON not found');
}
when ({ status }) if (status >= 400) {
throw new RequestError(res);
}
};
More concise, more functional handling of Redux reducers. Compare with this same example in the Redux documentation:
function todoApp(state = initialState, action) {
return match (action) {
when ({ type: 'set-visibility-filter', payload: visFilter }) {
({ ...state, visFilter });
}
when ({ type: 'add-todo', payload: text }) {
({ ...state, todos: [...state.todos, { text, completed: false }] });
}
when ({ type: 'toggle-todo', payload: index }) {
const newTodos = state.todos.map((todo, i) => {
return i !== index ? todo : {
...todo,
completed: !todo.completed
};
});
({
...state,
todos: newTodos,
});
}
else { state } // ignore unknown actions
}
}
Concise props handling inlined with JSX (via Divjot Singh):
<Fetch url={API_URL}>
{props => match (props) {
when ({ loading }) { <Loading />; }
when ({ error }) { <Error error={error} />; }
when ({ data }) { <Page data={data} />; }
}}
</Fetch>
Possible future enhancements
async match
If the match
construct appears inside a context where await
is allowed, await
can already be used inside it, just like inside do
expressions. However, just like async do
expressions, there’s uses of being able to use await
and produce a Promise, even when not already inside an async function
.
async match (await matchable) {
when ({ a }) { await a; }
when ({ b }) { b.then(() => 42); }
else { await somethingThatRejects(); }
} // produces a Promise
AND combinator (&
)
The OR combinator (|
) that we saw earlier tries patterns until one succeeds; this tries patterns until one fails. It allows for more expressive match clauses without having to reach for guards.
There is no precedence relationship between |
and &
, so they cannot be mixed at the same expression level; parentheses are required to avoid the early syntax error.
Nil pattern
match (someArr) {
when [_, _, someVal] { … }
}
Most languages that have structural pattern matching have the concept of a “nil matcher”, which fills a hole in a data structure without creating a binding.
In JS, the primary use-case would be skipping spaces in arrays. This is already covered in destructuring by simply omitting an identifier of any kind in between the commas.
With that in mind, and also with the extremely contentious nature, we would only pursue this if we saw strong support for it.
Destructuring enhancements
Both destructuring and pattern matching should remain in sync, so enhancements to one would need to work for the other.
Catch guards
Allow a catch
statement to conditionally catch an exception:
try {
throw new TypeError('a');
} catch match (e) {
if (e instanceof RangeError) { … }
when (/^abc$/) { … }
else { throw e; } // default behavior
}
Sugar for clause RHS as single expression
To avoid the extra boilerplate of { }
. This would require an explicit separator, to allow for object literals in the bare expression form versus a statement list in the do
expression form.
Using an above example:
const getLength = vector => match (vector) {
when ({ x, y, z }) Math.hypot(x, y, z);
when ({ x, y }) Math.hypot(x, y);
when ([...etc]) vector.length;
};
Terminology
Terms we use when discussing this proposal:
Match construct
Refers to the entire match (…) { … }
expression.
Matchable
The expression to match against; shows up in match (matchable) { … }
.
TODO: non-top-level matchables
Pattern
There are several types of patterns:
“Leaf” patterns
- Primitives, and near-primitives: such as
1
,false
,undefined
,-Infinity
,"foo"
. These match if the matched value is SameValueZero with them. They do not introduce a binding. The set of near-primitive matchers is predefined. It’s not an arbitrary expression:-Infinity
is allowed, but-1 * Infinity
is not. - Irrefutable match / identifier pattern: any identifier, such as
foo
. These always match, and bind the matched value to the given binding name. - Regex literal pattern: the pattern can be any regular expression literal. The matchable is stringified, and this clause matches if the regex matches. If the regex defines named capture groups, the names are automatically bound to the matched substrings.
Destructuring patterns
Array/iterable destructuring
These contain a comma-separated list of zero or more patterns, possibly ending in rest syntax (like ...rest
).
This pattern first verifies that the matched value is iterable, then obtains and consumes the entire iterator. If the result doesn’t have enough values for the provided patterns, the match fails. If the matcher doesn’t end with rest syntax, and the iterator has leftover values after the provided patterns, the match fails (so [a, b]
only matches things with exactly two items). It then recursively applies the patterns to the corresponding items from the iterator, matching only if all of the child clauses match. It accumulates the bindings from each child pattern, and if it ends in rest syntax (like ...someIdentifier
), binds the remainder of the iterator’s values in a fresh Array to that identifier as well.
Iteration results for the matchable are cached for the lifetime of the overarching match construct, so that successive iterations are not required.
Object destructuring
Contains a comma-separated list of <ident>
or <key>
: <pattern>
entries. A key is either an ident, like { foo: … }
, or a computed-key expression, like { [Symbol.foo]: … }
.
An <ident>
by itself (not followed by a : <pattern>
), is treated as if it was followed by an ident pattern of the same name: { foo }
and { foo: foo }
are equivalent (just like in destructuring).
The pattern requires the key to exist on the matched value; if it’s missing, the match fails. The value of the key is then matched against the pattern provided after the key; if that fails, the match fails.
Like array destructuring patterns, the object destructuring pattern can also contain rest syntax (like ...someIdentifier
), which creates a fresh Object
containing all the keys of the matched value that weren’t explicitly matched, and binds it to the provided identifier (just like in object destructuring).
Custom expression matchers
Any identifier, dotted or bracketed expressions, and/or function calls can be immediately prefixed with ^
. Anything more complex must be wrapped in parentheses such as ^(foo + 1)
.
The expression following ^
is then evaluated. If the result is an Object with a [Symbol.matcher]
key, then the engine fetches that key, throws if it’s present and not a function, and calls it on the matchable. The result, like IterationResult
s, must return an Object
, with a truthy matched
property, for the match to be considered successful.
Otherwise, a SameValueZero
test is performed against the matchable.
If the match is successful and the custom matcher has an as
binding declared, the value
property on the MatchResult object will be used for that binding. Example:
const hasMatcher = {
[Symbol.matcher](matchable) {
return {
matched: matchable === 3,
value: { a: 1, b: { c: 2 } },
};
}
};
match (3) {
when ^hasMatcher as { a, b: { c } } {
assert(a === 1);
assert(c === 2);
}
}
Combinators
Patterns can be joined together with a combinator - like |
which has short-circuiting “or” semantics, or perhaps &
which has short-circuting “and” semantics. Patterns can also be followed by with <pattern>
, which for a custom matcher will match against the matcher protocol’’s returned value.
Clause
This refers to either a when
and its associated pattern or an else
, and the expression representing the RHS
TODO: bare guards, with
, as
, etc; “intuitive” definition
Right-hand side (RHS)
The statement list (surrounded with curly braces, with do
expression semantics), or possibly expression, that evaluates when a clause matches successfully, and produces the value that the surrounding match
construct evaluates to.
Pin operator (^
)
May appear inside any pattern, immediately preceding an identifier (^Foo
), a chained expression (^foo?.bar.Class
), a function call (^foo()
or ^foo.bar()
), or a parenthesized expression (^(<any expression>)
). Used to escape from “pattern mode” and enter “expression mode”.