JSChoi.org
Holistic JavaScript dataflow proposals v2
J. S. Choi • 2021-12-24 • CC-BY 4.0
This is an updated version of an
original article from 2021-12.
Introduction
The TC39 proposal process for JavaScript, since ES6 in 2015, has had
many advantages. But it also has a pathology: it
emphasizes new individual features and deemphasizes their mutual
cross-cutting concerns, overlap, and costs, which may result in a
disunited and bloated language. A manifestation of this pathology is
that, as of 2022-03, there are five TC39 proposals for JavaScript that
address
dataflow expressions. Multiple TC39
representatives have expressed concerns about redundancies between these
proposals—that the space as a whole needs to be holistically considered,
that goals need to be more specifically articulated, and that
there is not enough “syntax budget” in the language
to approve all of these proposals. This article is part of an effort to
address the pathology by broadly strategizing about dataflow in general.
In the time since this article’s first version was published in 2021-12,
TC39 has started to discuss the dataflow proposals’ goals and mutual
relationships in earnest. These discussions have generated several
articles and committee meetings.
In addition, the
call-this operator
has changed in response to those discussions. This article has been
updated with those changes.
Five proposals
The five TC39 dataflow proposals are…
Overview of pipe operator
The pipe operator can apply any operation to an
input value. This includes the operations of nearly all
other dataflow proposals. This also includes numerous operations
that are not possible with any other dataflow proposal.
- Status quo
-
1 + input
- Pipe operator
-
input |> 1 + @
- Status quo
-
[ ...input ]
- Pipe operator
-
input |> [ ...@ ]
- Status quo
-
[ input ]
- Pipe operator
-
input |> [ @ ]
- Status quo
-
{ key: input }
- Pipe operator
-
input |> { key: @ }
- Status quo
-
`${ input }`
- Pipe operator
-
input |> `${ @ }`
- Status quo
-
new input
- Pipe operator
-
input |> new @
- Status quo
-
new C(input, 0)
- Pipe operator
-
input |> new C(@, 0)
- Status quo
-
await input
- Pipe operator
-
input |> await @
- Status quo
-
yield input
- Pipe operator
-
input |> yield @
Overview of Extensions syntaxes
The Extensions proposal adds several new
syntaxes, some of which have meanings determined at
runtime. In particular, Extensions’ ternary form
input::owner:method(arg0)
is polymorphic on the owner object’s type at runtime:
When the owner is a constructor at runtime, then
input::owner:method(arg0)
is equivalent to
owner.prototype.method.call(input, arg0)
– overlapping
with the call-this operator.
When the owner is not a constructor at runtime, then
input::owner:method(arg0)
is equivalent to
owner.method(input, arg0)
– overlapping with Function.pipe and the pipe operator.
Extensions also would add special syntax for the get and set
methods of property descriptors.
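The runtime dispatch described above can be approximated in today's JavaScript. The following is only a minimal sketch, not the proposal's specification: `isConstructor` and `extensionCall` are hypothetical helper names, the constructor check is an assumption (a `Reflect.construct` probe standing in for whatever test the proposal would use), and the `Symbol.extension` hook is ignored.

```javascript
// Hypothetical helper: approximates a constructor check by probing
// whether the value is accepted as a `newTarget` by Reflect.construct.
function isConstructor(value) {
  if (typeof value !== 'function') return false;
  try {
    Reflect.construct(Object, [], value);
    return true;
  } catch {
    return false;
  }
}

// Hypothetical emulation of `input::owner:method(...args)`.
function extensionCall(input, owner, method, ...args) {
  if (isConstructor(owner)) {
    // Constructor owner: borrow the prototype method (object-oriented).
    return owner.prototype[method].call(input, ...args);
  }
  // Any other owner: pass the input as the zeroth argument (functional).
  return owner[method](input, ...args);
}

// Array is a constructor, so this borrows Array.prototype.map.
const doubled = extensionCall([1, 2, 3], Array, 'map', (x) => x * 2);
// Math is not a constructor, so this is Math.max(4, 9).
const largest = extensionCall(4, Math, 'max', 9);
console.log(doubled, largest);
```

The same call syntax therefore takes one of two quite different code paths, and which one is taken cannot be known without inspecting `owner` at runtime.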
Object-oriented paradigm—call-this is a subset of Extensions
Both the Extensions operators and the
call-this operator can
call a function with an
input receiver object:
- Status quo
-
owner.method.call(receiver, arg0)
- Pipe operator
-
receiver |> owner.method.call(@,
arg0)
-
In this case, despite having an improved word order, the pipe
operator is less readable than the status quo and is therefore
not recommended.
- Call-this operator
-
receiver~>owner.method(arg0)
- Extensions
-
receiver::(owner.method)(arg0)
-
Extensions also provides a shorthand with its ternary
:: :
operator:
receiver::owner:method(arg0)
However, the :: :
operator
is polymorphic—the owner
must be a constructor at runtime.
…and
get an extracted property descriptor’s value
from an input receiver object:
- Status quo
-
property.get.call(receiver)
- Pipe operator
-
receiver
|> property.get.call(@)
-
In this case, despite having an improved word order, the pipe
operator is less readable than the status quo and is therefore
not recommended.
- Call-this operator
-
receiver~>property.get()
- Extensions
-
receiver::property
…and
set an extracted property descriptor’s value
on an input receiver object:
- Status quo
-
property.set.call(receiver,
value)
- Pipe operator
-
receiver |> property.set.call(@,
value)
-
In this case, despite having an improved word order, the pipe
operator is less readable than the status quo and is therefore
not recommended.
- Call-this operator
-
receiver~>property.set(value)
- Extensions
-
receiver::property = value
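For readers unfamiliar with extracted property descriptors, the status-quo forms above can be tried directly in current JavaScript. This is a small sketch: `Set.prototype.size` is a real built-in accessor, while the `Thermostat` class and its `celsius` property are hypothetical names invented for the example.

```javascript
// Extract the `size` accessor from Set.prototype (a built-in getter).
const size = Object.getOwnPropertyDescriptor(Set.prototype, 'size');
const count = size.get.call(new Set(['a', 'b']));

// Hypothetical class with a getter/setter pair, for the setter case.
class Thermostat {
  constructor() { this._celsius = 20; }
  get celsius() { return this._celsius; }
  set celsius(value) { this._celsius = value; }
}
const celsius = Object.getOwnPropertyDescriptor(Thermostat.prototype, 'celsius');

const receiver = new Thermostat();
celsius.set.call(receiver, 25);              // property.set.call(receiver, value)
const reading = celsius.get.call(receiver);  // property.get.call(receiver)
console.log(count, reading);
```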
Functional paradigm—pipe operator, Extensions, and Function.pipe
etc. overlap depending on arity and async
Function.pipe, Function.pipeAsync,
Function.flow, and Function.flowAsync are four
simple helper functions that can apply or compose a
series of unary functions in a tacit (aka
point-free) style.
Function.pipe, Extensions, and the pipe operator all can
sequentially call a series of
unary functions on a solitary
input argument (although Extensions only does this when the
function’s given owner o
is not a
constructor):
- Status quo
-
o.fn1(o.fn0(arg0))
- Pipe operator
-
arg0 |> o.fn0(@) |>
o.fn1(@)
- Function.pipe etc.
-
Function.pipe(arg0, o.fn0,
o.fn1)
- Extensions
-
arg0::o:fn0()::o:fn1()
Extensions and the pipe operator (but not Function.pipe) are also
able to call an n‑ary function on an
input argument, as long as it is the
zeroth argument:
- Status quo
-
o.fn(arg0, arg1, arg2)
- Pipe operator
-
arg0 |> o.fn(@, arg1,
arg2)
- Extensions
-
arg0::o:fn(arg1, arg2)
-
The
:: :
operator is polymorphic—the owner
o
must not be a
constructor at runtime.
The pipe operator alone (neither Function.pipe nor Extensions) can
call an n‑ary function on an
input argument when it is not the zeroth argument:
- Status quo
-
o.fn(arg0, arg1, arg2)
- Pipe operator
-
arg1 |> o.fn(arg0, @,
arg2)
Function.pipeAsync (and the pipe operator) can sequentially
compose a series of async unary functions:
- Status quo
-
await o.fn1(await o.fn0(await arg0))
- Pipe operator
-
await arg0 |> await o.fn0(@)
|> await o.fn1(@)
- Function.pipe etc.
-
await Function.pipeAsync(arg0,
o.fn0,
o.fn1)
In addition, Function.flow (and the pipe operator) can compose a
series of unary functions into a new function of a
solitary input argument:
- Status quo
-
arg0 => o.fn1(o.fn0(arg0))
- Pipe operator
-
arg0 => arg0
|> o.fn0(@)
|> o.fn1(@)
- Function.pipe etc.
-
Function.flow(o.fn0,
o.fn1)
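The helper functions compared above are simple enough to sketch in userland. The following is an illustrative approximation only: `pipe`, `flow`, and `pipeAsync` are hypothetical stand-ins for Function.pipe, Function.flow, and Function.pipeAsync, and the proposal's exact semantics (argument validation, for instance) may differ.

```javascript
// Hypothetical stand-in for Function.pipe: apply each unary function in turn.
function pipe(input, ...fns) {
  return fns.reduce((value, fn) => fn(value), input);
}

// Hypothetical stand-in for Function.flow: compose without calling yet.
function flow(...fns) {
  return (input) => pipe(input, ...fns);
}

// Hypothetical stand-in for Function.pipeAsync: await between steps.
async function pipeAsync(input, ...fns) {
  let value = await input;
  for (const fn of fns) {
    value = await fn(value);
  }
  return value;
}

const o = {
  fn0: (x) => x + 1,
  fn1: (x) => x * 2,
};

const piped = pipe(5, o.fn0, o.fn1);   // same as o.fn1(o.fn0(5))
const composed = flow(o.fn0, o.fn1);   // same as (x) => o.fn1(o.fn0(x))
const pipedLater = pipeAsync(Promise.resolve(5), o.fn0, o.fn1);
```

In this sketch the functions are passed detached from `o`, so a method that relies on `this` would lose its receiver; that limitation is part of why these helpers serve only the functional paradigm.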
Among the dataflow proposals, only the pipe operator can call an
n‑ary async function on an
input argument:
- Status quo
-
await o.fn(await arg0, arg1,
arg2)
- Pipe operator
-
await arg0 |> await o.fn(@, arg1, arg2)
- Status quo
-
await o.fn(arg0, await arg1,
arg2)
- Pipe operator
-
await arg1 |> await o.fn(arg0, @, arg2)
Other Extensions features
In addition to its ::
and
:: :
operators, Extensions would add a syntax devoted
to extracting property descriptors from an
input owner object, using an import-like
declaration syntax. This overlaps with the pipe operator insofar
as the pipe operator arguably improves the word order of the
extraction.
- Status quo
-
const property =
Object.getOwnPropertyDescriptor(owner,
'property');
- Pipe operator
-
const property = owner |>
Object.getOwnPropertyDescriptor(@, 'property');
- Extensions
-
const ::{ property } from owner;
Lastly, Extensions would add a runtime metaprogramming system
based on a proposed well-known symbol (Symbol.extension), as well
as an Extensions global namespace object containing convenience
functions. These are not shown here.
Although Extensions originally also proposed a special variable
namespace for extracted properties, its champion has voiced
willingness to drop the special namespace, and it is not included
in this article.
Dataflow fluency
Developers often transform raw data with a sequence of steps, which here
are called “dataflows”. The five TC39 dataflow
proposals attempt to improve dataflows by making them more natural,
linear, and readable.
Reading order
For example, dataflows with chains of properties/methods (from the
object prototype chain) already benefit from a linear, left-to-right
word order:
kitchen.getFridge().find(pred).count().toString()
In contrast, dataflows that use other expressions (especially function
calls) result in deeply nested expressions, which have nonlinear word
orders that zigzag between left-to-right and right-to-left directions:
count(find(kitchen.getFridge(), pred)).toString()
Right now, these “fluent interfaces” are possible only with
prototype-based method chaining, where each method
already belongs to the prototype chain of its receiver. TC39 should
make the benefits of fluent interfaces applicable to more code. The
dataflow proposals, like the pipe operator, would make this possible
in such situations.
kitchen.getFridge()
|> find(@, pred)
|> count(@).toString()
Excessive boilerplate
Reading order is not the only determinant of dataflow fluency.
Excessive boilerplate and visual noise can also compromise dataflow
fluency.
For example, although the pipe operator improves fluency in many
expressions by rearranging their reading order, developers should be
discouraged from using the pipe operator on ordinary method chains, such
as:
kitchen.getFridge().find(pred).count().toString()
Even though the version with the pipe operator still has a fluent word
order, its excessive boilerplate and visual noise unnecessarily worsen
the dataflow’s readability:
kitchen
|> @.getFridge()
|> @.find(pred)
|> @.count()
|> @.toString()
Furthermore, .call is very common in object-oriented code for various
reasons. Despite this, the .call method is clunky and not fluent. It
introduces an unnatural, zigzagging reading order that breaks
otherwise left-to-right linear dataflows:
find.call(kitchen.getFridge(), pred)
|> count(@).toString()
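As an illustration of why .call appears so often in existing code, the following sketch shows two common situations in which a method has to be borrowed rather than looked up on the receiver. The variable names are invented for the example; the built-in methods are real.

```javascript
// An object created without Object.prototype has no hasOwnProperty
// method of its own, so the method must be borrowed with .call.
const dict = Object.create(null);
dict.milk = 2;
const hasMilk = Object.prototype.hasOwnProperty.call(dict, 'milk');

// Array methods can likewise be borrowed for array-like objects.
const arrayLike = { 0: 'egg', 1: 'ham', length: 2 };
const upper = Array.prototype.map.call(arrayLike, (s) => s.toUpperCase());

console.log(hasMilk, upper);
```

In both calls the receiver appears after the function, which is the inverted reading order that the text describes.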
Introducing the pipe operator improves the word order to be
consistently left to right, but the result is perhaps even less
readable. Excessive boilerplate separates the function from its
receiver and arguments:
kitchen.getFridge()
|> find.call(@, pred)
|> count(@).toString()
Developers would be discouraged from using the pipe operator with
.call due to this poor readability. A separate operator like call-this
would be needed to improve the word order without otherwise
compromising readability:
kitchen.getFridge()~>find(pred)
|> count(@).toString()
Likewise, dataflows could generally be rewritten as a series of
temporary variables, but temporary variables may also
reduce fluency by introducing excessive boilerplate and
visual noise:
const fridge = kitchen.getFridge();
const findResult = find.call(fridge, pred);
const c = count(findResult);
c.toString()
Ecosystem schism—status quo
JavaScript is a multiparadigm language, but its design has resulted in
an ecosystem schism, fragmented between two paradigms:
-
“Object-oriented” APIs that use this-based functions.
-
“Functional” APIs that use non-this-based functions.
Furthermore, both the
object-oriented paradigm and
functional paradigm in JavaScript may be split
into several sub-paradigms:
-
For the object-oriented paradigm, function calls can use either
functions from prototype chains or free this-using functions.
-
For the functional paradigm, function calls may be unary (e.g.,
curried) or n-ary, and n-ary function calls’ inputs of interest may be
either their zeroth arguments or their last arguments.
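To make the sub-paradigms concrete, here is one small operation written in each style. This is only a sketch: `Shelf`, `take`, `takeCurried`, and `takeFrom` are hypothetical names invented for illustration.

```javascript
// 1. Object-oriented, prototype chain: the method lives on the class.
class Shelf {
  constructor(items) { this.items = items; }
  take(n) { return this.items.slice(0, n); }
}

// 2. Object-oriented, free this-using function: invoked with .call.
function take(n) { return this.items.slice(0, n); }

// 3. Functional, unary (curried): the input of interest comes last.
const takeCurried = (n) => (shelf) => shelf.items.slice(0, n);

// 4. Functional, n-ary: the input of interest is the zeroth argument.
const takeFrom = (shelf, n) => shelf.items.slice(0, n);

const shelf = new Shelf(['milk', 'eggs', 'jam']);
const results = [
  shelf.take(2),
  take.call(shelf, 2),
  takeCurried(2)(shelf),
  takeFrom(shelf, 2),
];
console.log(results);
```

All four calls produce the same value; they differ only in where the input of interest sits relative to the function, which is exactly what the dataflow proposals try to smooth over.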
These two paradigms have different trade-offs, and developers must
choose between APIs that use one or the other. Perhaps the two most
important trade-offs are in dataflow fluency and in
module splitting.
Virality vs. interoperability & module splitting
Dataflow fluency and module splitting exert different pressures on API
design, and these pressures have shifted over time.
Dataflow fluency has always been important, and it may have been a
major factor in object-oriented APIs’ general
dominance of the JavaScript ecosystem—such as with jQuery or
Mocha—since the 2000s or earlier.
In contrast, although code payload weight and
time to interactivity have
also always been important, it was relatively difficult to perform
module splitting with then-existing solutions such as
Google Closure Compiler.
Only the recent advent of native modules in the core language has made
module tree shaking and module splitting by
esbuild,
Rollup, and
Webpack more generally practical
and popular. Thus, module splitting has exerted significant pressure
on API designs only recently.
Module splitting’s recent pressure has grown so great that prominent
libraries such as the
Firebase JS SDK have switched from the
object-oriented paradigm to a
functional paradigm.
Developers are willing to trade the prototype-based object-oriented
paradigm’s dataflow fluency for the functional paradigm’s module
splitting, and, to do so, they are willing to make breaking,
extensively backwards-incompatible changes.
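The trade-off can be sketched with a toy API written in both shapes. Everything here is hypothetical (`Database`, `createDb`, `put`, `get`, `exportCsv`), and the comments about bundler behavior describe the general tendency of tree-shaking tools rather than any specific tool's guarantees.

```javascript
// Object-oriented shape: every method is attached to the prototype,
// so a consumer that imports Database generally receives all of them.
class Database {
  constructor() { this.rows = new Map(); }
  put(key, value) { this.rows.set(key, value); return this; }
  get(key) { return this.rows.get(key); }
  exportCsv() { return [...this.rows].map((r) => r.join(',')).join('\n'); }
}

// Functional shape: each operation is a standalone function. If these
// were separate exports, a bundler could drop the ones never imported.
const createDb = () => ({ rows: new Map() });
const put = (db, key, value) => { db.rows.set(key, value); return db; };
const get = (db, key) => db.rows.get(key);
const exportCsv = (db) => [...db.rows].map((r) => r.join(',')).join('\n');

// Fluent, left-to-right method chain.
const viaMethods = new Database().put('a', 1).get('a');
// Splittable, but nested and read from the inside out.
const viaFunctions = get(put(createDb(), 'a', 1), 'a');
console.log(viaMethods, viaFunctions);
```

The two usage lines show the exchange in miniature: the method chain reads linearly, while the function calls are individually removable but nest.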
Currently, object-oriented dataflow and functional dataflow
do not mix well;
as illustrated before, mixing the two results in zigzagging reading
orders. In addition, only the functional paradigm,
and not the object-oriented paradigm, is
currently compatible with code-splitting tools that
work with separately importable or tree-shakeable functions.
Therefore, as those tools (and APIs that rely on them) spread, so too
will the functional paradigm spread through the ecosystem, exerting
pressure on APIs to make breaking changes away from the
object-oriented paradigm…and, therefore, to give up its dataflow
fluency.
This ecosystem virality is a manifestation of a
principle articulated by Waldemar Horwat in the
2022-01-27 TC39 dataflow meeting:
The answer to whether multiple ways or syntaxes of doing something
are harmful critically depends on the duplication’s effect on APIs
and how viral it is.
Suppose we’re considering having two syntaxes
𝘟 and 𝘠 to use APIs.
If module or person 𝘈 uses syntax
𝘟 which interoperates better with syntax
𝘟 than syntax 𝘠 and
that pressures module or person 𝘉 to use
syntax 𝘟 in their new APIs to interoperate
with person 𝘈’s APIs, that virality
encourages ecosystem forking and API wars. Introducing multiple such
ways into the language is bad.
On the other hand, if person 𝘈’s choice of
syntax [i.e., 𝘟] has no effect on person
𝘉[’s choice of syntax,
𝘠,] and they can interoperate without any
hassles, then that’s generally benign.
In this case, 𝘟 would be
functional APIs (and the module-splitting,
tree-shaking tools that encourage them), and
𝘠 would be
object-oriented APIs. Mixing
𝘟 and 𝘠 results in
scrambled dataflow and inability to split and tree-shake all module
code. Because of this poor interoperability, developers who use
𝘟 are pressured to use
𝘟 in their new APIs.
Prescriptivism vs. flexibility
Whether this ongoing ecosystem schism is desirable depends on one’s
opinion regarding prescriptivism. Constraints are a source of
creativity and thinking hard about problems. Therefore, one may be
inclined to encourage migration to one single paradigm such as
functional programming.
However, JavaScript is a multiparadigm language. Its core design
combines procedural, functional, and object-oriented principles.
Programmers can work in a variety of styles within JavaScript – freely
intermixing constructs from different paradigms, admitting that a
single paradigm cannot solve all problems in the most efficient way,
and choosing the best tool for the job. To prescribe
one single paradigm as preferred reduces the
flexibility intrinsic to the language.
More concretely, there is a large body of existing JavaScript code
that still uses the object-oriented paradigm,
in addition to the built-in objects of the core language. Developers
will always need to fluently interoperate with such
object-oriented APIs, even when they prefer the
functional paradigm and use tools that work
better with the functional paradigm.
Therefore, TC39 should still seek to bridge the growing ecosystem
schism between two of JavaScript’s fundamental paradigms. Although
constraints are a source of creativity,
TMTOWTDI
is a part of the language’s core identity and everyday work.
Bridging the schism
Assuming that TC39 did want to bridge the ecosystem schism between the
object-oriented paradigm and the functional paradigm, the solution
would be to improve interoperability between them. This would involve
improvements in both dataflow fluency and module splitting: a dataflow
proposal should make it easier to mix any paradigm in fluent, linear
dataflows using tree-shakeable functions.
Each of the five dataflow proposals attempts to make one of the
paradigms more fluent and more tree shakeable, therefore improving
interoperability between the paradigms:
Here, the call-this operator and the
pipe operator are grouped together because they are
complementary. Because of its generality, the pipe
operator would ostensibly cover the entire space of paradigms, including
the object-oriented paradigm. However,
as illustrated before, the pipe operator does not improve dataflow
fluency of object-oriented calls. Therefore, the pipe operator and the call-this operator do not
practically overlap. Instead, they are complementary: the call-this
operator improves object-oriented dataflow, and
the pipe operator improves only
functional dataflow.
For example, the call-this operator and the pipe operator together would
improve this:
import { x0, x1 } from 'fp';
import { y0, y1 } from 'oop';
y1.call(x1(y0.call(x0(input.m0(a0), a1), a2), a3).m1(a4), a5);
…into this:
input.m0(a0)
|> x0(@, a1)~>y0(a2)
|> x1(@, a3).m1(a4)~>y1(a5);
With the call-this operator and pipe operator combined, the developer
can freely switch between the two paradigms while maintaining readable,
linear dataflow. Object-oriented calls may use the prototype chain
(like .m0() and .m1()) or may use individually imported,
tree-shakeable this-using functions (like ~>y0() and ~>y1()).
Functional calls may also be fluently mixed in as needed (like
|> x0() and |> x1()). In this way, this
combination of new operators could bridge the ecosystem schism.
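Because neither operator can be run today, the equivalence between the nested status-quo expression and the linear form can be checked only by unrolling the linear form into temporary variables. The sketch below does that. The bodies of `x0`, `x1`, `y0`, `y1`, and `input.m0` are hypothetical placeholders (the article names these functions but does not define them), and the argument values are arbitrary.

```javascript
// Hypothetical definitions; only the call shapes come from the article.
const x0 = (value, a) => value + a;                        // functional
const x1 = (value, a) => ({ m1: (b) => value * a + b });   // functional
function y0(a) { return this - a; }                        // this-using
function y1(a) { return this + a; }                        // this-using
const input = { m0: (a) => a * 10 };
const [a0, a1, a2, a3, a4, a5] = [1, 2, 3, 4, 5, 6];

// Status quo, nested: read from the inside out.
const nested = y1.call(x1(y0.call(x0(input.m0(a0), a1), a2), a3).m1(a4), a5);

// The same dataflow, one step per line, in the order in which the
// pipe and call-this operators would present it.
const s0 = input.m0(a0);        // input.m0(a0)
const s1 = x0(s0, a1);          // |> x0(@, a1)
const s2 = y0.call(s1, a2);     // ~>y0(a2)
const s3 = x1(s2, a3).m1(a4);   // |> x1(@, a3).m1(a4)
const s4 = y1.call(s3, a5);     // ~>y1(a5)

console.log(nested === s4);
```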
Function.pipe and partial function application (PFA) syntax may also
improve dataflow fluency for the functional paradigm, although they do
not affect the object-oriented paradigm. In addition, without PFA
syntax, Function.pipe affects only unary function calls and cannot
improve n-ary function calls without currying helper metafunctions.
The Extensions syntaxes may also improve dataflow
fluency for the object-oriented paradigm—as well
as the functional paradigm in a
limited scenario (when a function call belongs to an owner that
is not a constructor, and when its zeroth argument is the
input of interest).
Ecosystem schism—future risk
Ratifying any of the dataflow proposals would hopefully narrow the
ecosystem schism, but it may worsen the schism instead. The risk of
worsened schism depends on paradigm interoperability and the balance
between pressures on API designers (i.e., dataflow fluency and module
splitting). Only proposals that interoperate well—both with the status
quo and with each other—should be added to the language.
Because call-this improves object-oriented dataflows (specifically free
this-using functions) but not functional dataflows, it will encourage
developers to use the former instead of the latter. This would be
exacerbated without the pipe operator, without which developers cannot
use n-ary functions in linear dataflows.
Likewise, because the pipe operator improves functional dataflows but
not object-oriented dataflows (see excessive boilerplate), adding the
pipe operator may accelerate API migration to the former from the
latter. This would be exacerbated without the call-this operator,
without which developers cannot use free, tree-shakeable this-using
functions in linear dataflows.
In contrast, combining the call-this operator and pipe operator would
equalize fluency and tree shaking between
object-oriented dataflows and
functional dataflows—increasing mutual
compatibility in dataflow expressions and in module-splitter tooling—and
therefore, perhaps, reducing ecosystem-schism risk.
Similarly to the pipe operator, PFA syntax with Function.pipe also
emphasizes functional dataflows. Therefore, PFA syntax with
Function.pipe may accelerate migration to functional dataflows from
object-oriented dataflows. This schism may be ameliorated by
call-this.
Function.pipe alone, without PFA syntax or the pipe operator, may
encourage developers to use more
curried unary function calls, instead of either
n-ary function calls or
object-oriented dataflows.
Lastly, Extensions improves a mixture of
object-oriented dataflows and a limited subset of
functional dataflows (when a
function call belongs to an owner that is
not a constructor, and when its zeroth argument is the
input of interest).
Runtime cost
There is one more factor of the dataflow proposals to consider, which is
runtime cost.
The following proposals would be
“zero-cost” abstractions, which do not have memory or
time overhead during runtime:
-
The pipe operator is rewritable as a series of nested lexical
variables, each assigned once and then referenced by the next step.
-
The call-this operator is rewritable as a call to
Function.prototype.call; it may in fact be faster than using .call,
because no dynamic property lookup from Function.prototype is
necessary.
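The rewriting described in these two items can be sketched by hand. This is an illustrative approximation, not the output of any real transpiler: `find`, `count`, `fridge`, `pred`, and the `topic0`/`topic1` variable names are all hypothetical.

```javascript
// Hypothetical functions and data for the example.
const find = function (pred) { return this.filter(pred); };
const count = (items) => items.length;
const fridge = ['milk', 'eggs', 'jam'];
const pred = (item) => item.length === 4;

// `fridge~>find(pred) |> count(@).toString()` could be rewritten
// roughly as follows: no closures or helper functions are created.
let result;
{
  // The call-this step becomes a plain .call on the function.
  const topic0 = find.call(fridge, pred);
  {
    // Each pipe step binds a fresh lexical variable for its topic.
    const topic1 = count(topic0).toString();
    result = topic1;
  }
}
console.log(result);
```

Every intermediate value lives in an ordinary block-scoped variable, which is why these two operators need not add runtime overhead beyond the operations that the developer wrote.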
The following proposals involve runtime allocation or dynamism:
-
Function.pipe necessarily involves calls on
constructed functions; it encourages prolific
use of higher-order functions, particularly
curried unary functions. Therefore, it may
increase object allocation, memory churn, and garbage collection.
-
Like Function.pipe, PFA syntax also necessarily
involves constructed functions. It encourages
prolific use of higher-order functions.
Therefore, it increases object allocation, memory churn, and garbage
collection.
-
Extensions involves several syntaxes, several of which involve dynamic
type dispatch during runtime. In particular, the lhs::owner:fn(a0)
operator is polymorphic on its owner’s type. If owner is a
constructor, then the operation is equivalent to
owner.prototype.fn.call(lhs, a0). Otherwise, the operation is
equivalent to owner.fn(lhs, a0).
In addition, Extensions would add a runtime metaprogramming system
based on a well-known symbol (Symbol.extension). This affects the
behavior of both the :: and :: : operators.
Due to both their type polymorphism and their metaprogramming
hook, the Extensions operators may therefore be difficult to
statically analyze and necessarily involve dynamic dispatch (at
least before just-in-time analysis).
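The allocation concern about Function.pipe and curried functions can be made visible with a counter. This is a rough sketch under stated assumptions: `pipe` is a hypothetical stand-in for Function.pipe, `countingAdd` is an invented curried helper, and the counter tracks only the closures that the helper itself creates (engines may optimize some of these away).

```javascript
// Hypothetical stand-in for Function.pipe.
const pipe = (input, ...fns) => fns.reduce((value, fn) => fn(value), input);

// An n-ary function must be adapted to a unary one before piping.
const add = (a, b) => a + b;

let allocations = 0;
const countingAdd = (b) => {
  allocations += 1;           // one new closure per partial application
  return (a) => add(a, b);
};

// Two partial applications are constructed for every element processed.
function total(values) {
  return values.map((v) => pipe(v, countingAdd(1), countingAdd(2)));
}

const totals = total([10, 20, 30]);
console.log(totals, allocations);
```

Hoisting the partial applications out of the loop would avoid the repeated construction, but the point stands that this style creates function objects where the pipe operator would create none.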
Appendix—recent TC39 findings
From the
2022-01-26 TC39 plenary meeting
and the
2022-01-27 TC39 ad-hoc post-plenary meeting:
The meeting on 2022-01-27 agreed that it would be okay to have both the
pipe operator and call-this (although Mark Miller, MM, was not present
at the 2022-01-27 meeting).
Mark Miller (MM) does not wish for more than one syntactic dataflow
proposal to advance; he is most positive about the pipe
operator.
Jordan Harband (JHD) will block the pipe operator if call-this does
not advance.
Yulia Startsev (YSV) is uncomfortable with the pipe operator’s
flexibility, but she will not block the pipe operator.
“In general, some overlap is okay, but too much is bad; we have to
decide this on a case-by-case basis.”
“The overlap between the pipe operator and Function.pipe is okay.”
“Extensions and call-this are still mutually exclusive.”
Extensions continues to polarize TC39. HE Shi-Jun aka John Hax (JHX,
Extensions’ champion) plans to present an update on Extensions to a
future plenary, in an effort to increase TC39 support.
PFA syntax also continues to polarize TC39. Ron Buckton (RBN, PFA
syntax’s champion) is waiting for the pipe operator to advance further
before trying PFA syntax again.
The discarded F#-style pipe operator is still off the table.