Regular Expression Buffer Boundaries for ECMAScript
Status
Stage: 1
Champion: Ron Buckton (@rbuckton)
For detailed status of this proposal see TODO, below.
Authors
- Ron Buckton (@rbuckton)
Motivations
NOTE: See https://github.com/rbuckton/proposal-regexp-features for an overview of how this proposal fits into other possible future features for Regular Expressions.
Buffer Boundaries are a common feature across a wide array of regular expression engines. They allow you to match the start or end of the entire input regardless of whether the m (multiline) flag has been set. Buffer Boundaries also allow you to match the start/end of a line and the start/end of the input in a single RegExp that uses the m flag.
Prior Art
See https://rbuckton.github.io/regexp-features/features/buffer-boundaries.html for additional information.
Syntax
Buffer boundaries are similar to the ^ and $ anchors, except that they are not affected by the m (multiline) flag:
- \A: Matches the start of the input.
- \z: Matches the end of the input.
- \Z: A zero-width assertion that matches at the end of the input, or immediately before a single optional newline at the end of the input. Equivalent to (?=\R?\z).
NOTE: Requires the u or v flag, as \A, \z, and \Z are currently just escapes for A, z, and Z without the u or v flag.
NOTE: Not supported inside of a character class.
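The reason for the flag requirement can be observed in engines that do not implement this proposal. The following is a minimal sketch of that current behavior; the helper name compiles is hypothetical and only exists for this illustration.

```javascript
// Without the u or v flag, \A is an identity escape for the letter "A",
// so existing patterns already give it a meaning.
const legacy = /\Afoo/;
const legacyMatchesLiteralA = legacy.test("Afoo"); // matches the literal text "Afoo"
const legacyMatchesFoo = legacy.test("foo");       // no literal "A" to match

// Hypothetical helper: reports whether a pattern compiles with the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// With the u flag, \A is currently rejected as an invalid escape, which leaves
// the syntax free to be given a new meaning. An engine that implements this
// proposal would be expected to report true here instead.
const unicodeCompiles = compiles(String.raw`\Afoo`, "u");

console.log({ legacyMatchesLiteralA, legacyMatchesFoo, unicodeCompiles });
```

Because the non-Unicode form already matches a literal letter, changing it would alter the behavior of existing patterns, whereas the Unicode-mode form is a syntax error today.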
For more information about the v flag, see https://github.com/tc39/proposal-regexp-set-notation.
For more information about the \R escape sequence, see https://github.com/rbuckton/proposal-regexp-r-escape.
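Until buffer boundaries are available, similar behavior can be approximated with lookaround assertions. The sketch below is an illustration, not part of the proposal: the constant names are hypothetical, and NEWLINE is only an assumed approximation of \R (a CRLF pair or a single line-terminator character). It relies on [\s\S] matching any character, so a negative lookbehind or lookahead on it can only succeed at the very start or end of the input, whatever the m flag says.

```javascript
// Hypothetical stand-ins for the proposed escapes, written as pattern fragments.
const START_OF_INPUT = String.raw`(?<![\s\S])`; // approximates \A
const END_OF_INPUT = String.raw`(?![\s\S])`;    // approximates \z

// Assumed approximation of \R: CRLF, or one line-terminator character.
const NEWLINE = String.raw`(?:\r\n|[\n\v\f\r\x85\u2028\u2029])`;

// Approximates \Z, i.e. (?=\R?\z): an optional newline, then the end of input.
const END_OR_FINAL_NEWLINE = `(?=${NEWLINE}?${END_OF_INPUT})`;

// Counterpart of \Afoo\z with the m flag set.
const wholeInput = new RegExp(`${START_OF_INPUT}foo${END_OF_INPUT}`, "um");
console.log(wholeInput.test("foo"));      // true
console.log(wholeInput.test("foo\nbar")); // false

// Counterpart of \Afoo|^bar$|baz\z with the m flag set.
const mixed = new RegExp(`${START_OF_INPUT}foo|^bar$|baz${END_OF_INPUT}`, "um");
console.log(mixed.test("\nfoo")); // false
console.log(mixed.test("\nbar")); // true
console.log(mixed.test("baz\n")); // false

// Counterpart of end\Z.
const trailing = new RegExp(`end${END_OR_FINAL_NEWLINE}`, "u");
console.log(trailing.test("end"));     // true
console.log(trailing.test("end\n"));   // true
console.log(trailing.test("end\n\n")); // false
```

The lookaround forms are noticeably harder to read than \A, \z, and \Z, which is part of what dedicated syntax would address.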
Examples
// Each example is wrapped in its own block so the const names can be reused.

// without buffer boundaries
{
  const pattern = String.raw`^foo$`;
  const re1 = new RegExp(pattern, "u");
  re1.test("foo"); // true
  re1.test("foo\nbar"); // false
  const re2 = new RegExp(pattern, "um");
  re2.test("foo"); // true
  re2.test("foo\nbar"); // true ($ matches before "\n" when m is set)
}

// with buffer boundaries
{
  const pattern = String.raw`\Afoo\z`;
  const re1 = new RegExp(pattern, "u");
  re1.test("foo"); // true
  re1.test("foo\nbar"); // false
  const re2 = new RegExp(pattern, "um");
  re2.test("foo"); // true
  re2.test("foo\nbar"); // false (\z is not affected by m)
}

// mixing buffer boundaries and anchors
{
  const re = /\Afoo|^bar$|baz\z/um;
  re.test("foo"); // true
  re.test("foo\n"); // true
  re.test("\nfoo"); // false
  re.test("bar"); // true
  re.test("bar\n"); // true
  re.test("\nbar"); // true
  re.test("baz"); // true
  re.test("baz\n"); // false
  re.test("\nbaz"); // true
}

// trailing buffer boundary
{
  const re = /end\Z/u;
  re.test("end"); // true
  re.test("end\n"); // true (optional newline)
  re.test("end\n\n"); // false
}
History
- October 28, 2021 — Proposed for Stage 1 (slides)
- Outcome: Advanced to Stage 1
TODO
The following is a high-level list of tasks to progress through each stage of the TC39 proposal process:
Stage 1 Entrance Criteria
- Identified a "champion" who will advance the addition.
- Prose outlining the problem or need and the general shape of a solution.
- Illustrative examples of usage.
- High-level API.
Stage 2 Entrance Criteria
- Initial specification text.
- Transpiler support (Optional).
Stage 3 Entrance Criteria
- Complete specification text.
- Designated reviewers have signed off on the current spec text.
- The ECMAScript editor has signed off on the current spec text.
Stage 4 Entrance Criteria
- Test262 acceptance tests have been written for mainline usage scenarios and merged.
- Two compatible implementations which pass the acceptance tests: [1], [2].
- A pull request has been sent to tc39/ecma262 with the integrated spec text.
- The ECMAScript editor has signed off on the pull request.