Hi ZmnSCPxj,

Funny you mention BlueSpec -- I actually took 6.175[1] in my undergraduate
studies with Arvind, BlueSpec's creator, and have often cited it as an
inspiration for Sapio given that the target of program compilation is
essentially a transaction circuit and I have a decent amount of experience
working in BlueSpec.

It's entirely the point that Sapio program expansion is Turing complete
whereas the resultant program is "fixed". Via updatable finish clauses
(which require some signature) it's also possible to track the logic for
what states should be generated upon cooperation of parties, making Sapio
turing complete description language for federated operators. I've
previously described it as follows:

*But how is Sapio different from Miniscript? From Solidity? A metaphor that
I like is that tools like Miniscript operate at the physical-key level
(house keys, car keys, train pass, office key). Sapio operates at the
commute level. Miniscript answers, "how do I unlock my car? How do I pay
for the train? How do I get in my office?"*. *Sapio lets you specify that,
"my morning commute is to leave my house, go to my car, start it and drive
to the office, and park in the lot; OR go to the train station, use the
train pass to pay fare, take the train for an hour, then* *walk 3 blocks
north to my  office; and in either case unlock the office door, and enter".
Sapio has plans to integrate Miniscript as it stabilizes *[currently
integrated]* as the backend key description language. Solidity and Sapio
are more similar. Sapio's metaprogramming language is Turing Complete and
allows you to specify a rich set of constraints for the contract you're
building, but can only ever produce a finite deterministic "binary" of
transactions. Solidity on the other hand compiles deterministically, but
the executed binary is Turing Complete. Further, Sapio contracts are
"stateless", whereas Solidity has mutable state. Lastly, Solidity contracts
are non-isolated from one another in the EVM. In Sapio, contracts execute
only with the components you specify in scope. In sum, Sapio has a rich
descriptive power for smart contract programming flows but a limited and
safe execution semantics.*


The DSL v.s. e-DSL is a great question, and while there are surely benefits
to being a full-fledged standalone DSL, here's why the e-DSL approach is
superior for everything Sapio cares about.

Sapio is built as a shallow e-DSL in Rust. Everything in Sapio can be
expressed (macro free -- they're relatively light conveniences) as pure
rust. There's also a more "API Like" interface where Sapio objects can be
built out dynamically by other rust code easily. This means that if you
wanted to come up with a custom DSL for a subset of Sapio programs, you
could still very easily target Sapio (at runtime) as the AST processor.
Said "SapioScript" can be an entirely separate crate targeting this base.

This is also nice because it keeps the Sapio core codebase relatively tight
compared to what would be required were Sapio to be a "full language", and
have tens of thousands of lines of custom lexing, parsing, type systems,
etc etc.

Yes, you have to learn Rust, but Rust is one of the most popular languages
for systems programming. It means that there's a library for almost any
functionality you could want. There's tooling for building for any platform
-- Sapio targets WASM happily, which helps with compile-once run sandboxed
anywhere (this really helps for using Sapio as a replicated state machine
for channels).

I agree with you that Haskell was previously a limiting factor for
BlueSpec, but Rust is not Haskell. Rust is incredibly popular, and easier
to learn (author's opinion) than Haskell, C++, or even Python (perhaps
biased, but I always struggled with python for more than simple scripts
knowing when objects were copied or referenced -- an initial version of
Sapio was in python, but that codebase collapsed under its own
complexity... and I had *great* MyPy coverage). It's certainly easier to
learn Rust than to learn Sapio as a DSL -- Sapio requires learning an
entirely novel way of thinking about structuring programs already, a DSL
would require learning both the "Sapio Programming Model" and "The Sapio
DSL". With embedded Rust, you can transfer all existing knowledge on Rust
programming and add a veneer of Sapio.

Further Sapio is designed to not just compile to smart contracts as Bitcoin
addresses, but be able to be deeply integrated inside of an application.
For example, suppose you wanted to fetch keys for a contract from a
database, query a network oracle for a state resolution, or something else.
A DSL would scope creep infinitely or require numerous hacks, and at that
point we're largely better off benefitting from the larger Rust community's
efforts at providing excellent APIs for any task.

Therefore to make Sapio functional for building and deploying real bitcoin
smart contract applications, a rust eDSL was not just the natural, it was
the only choice.

W.r.t. to succinctness and "extra concepts" I'll admit that there is some
disadvantage to Rust. There's a borrow checker -- which can be mostly
defeated if you don't care about performance with Arc / Clone. You have to
manually impl some traits -- but the trait system ends up not being bloat,
but central to making contract state machines
https://learn.sapio-lang.org/ch08-01-state-machines.html. And if you do
actually look at the Sapio programs themselves, they are still quite
succinct comparatively. I can imagine them being a bit shorter, but I think
optimizing for the shortest possible utterance is an anti-goal for safety
-- I aim for clarity. And that's where I don't exactly hate the borrow
checker, since it makes it easier to tell when sub-contracts are using the
same or modified data.

Best,

Jeremy

[1] http://csg.csail.mit.edu/6.175/archive/2014/index.html
--
@JeremyRubin <https://twitter.com/JeremyRubin>
<https://twitter.com/JeremyRubin>


On Fri, Apr 16, 2021 at 7:36 AM ZmnSCPxj <ZmnSCPxj@protonmail.com> wrote:

> Good morning Jeremy, et al.,
>
>
> > Bitcoin Developers,
> >
> > I'm very excited to introduce Sapio[0] formally to you all.
>
> This seems quite interesting to me as well!
>
> I broadly agree with the rant on monetary units.
> In C-Lightning we always (except for some legacy fields that will
> eventually be removed) output values as strings with an explicit `msat`
> unit, even for onchain values (the smallest of which are satoshi, but for
> consistency we always print as millisatoshi), and accept explicit `btc`,
> `sat`, and `msat` units.
>
> --
>
> Personally I would have used a non-embedded DSL.
>
> In practice an embedded DSL requires a user to learn two languages --- the
> hosting language and the embedded language.
> Whereas if you designed a non-embedded DSL, a new user would have to learn
> only one language.
> For instance, if an error is emitted, then the user has to know whether
> the error comes from the hosting language compiler, or the embedded
> language implementation.
>
> In a past career embedded DSLs for hardware description languages were
> being pushed, and we found that one of the drawbacks was the need to learn
> as well the hosting language --- at some point Haskell-embedded DSLs became
> so unpopular that anything that was even Haskell-related had a negative
> reaction in some hardware design shops.
> For example BlueSpec originally was a Haskell-embedded DSL, and eventually
> implemented a Verilog-like syntax that was not embedded in Haskell,
> becoming BlueSpecSystemVerilog.
>
> Further, as per coding theory, the hosting language is often quite generic
> and can talk about anything, including other embedded languages, thus we
> expect (all other things being equal) that in general, an utterance in an
> embedded DSL will be longer than an utterance in a non-embedded DSL (as
> there is more things to talk about, more symbols are necessary, and thus we
> expect things to be longer in the generic hosting language).
> Whereas a non-embedded DSL can cut away most of the extra verbage needed
> to introduce to the hosting language implementation, in order to indicate
> the "entry" into the domain-specific language.
>
> --
>
> If my understanding is correct, I seem, that the hosting language is a
> full, general, Turing-complete language, that "builds up" a total
> (non-Turing-complete) contract description.
>
> I have had (private) speculations before that it would be possible to
> design a language with two layers:
>
> * A non-Turing-complete total "base language".
> * A syntax meta-language similar to Scheme `syntax-rules`, which
> constructs ASTs for the "base language".
>
> Note that Scheme `syntax-rules` is indeed Turing-complete, as a macro can
> expand to a form with two lists that form two "ends" of a tape, and act as
> a Turing machine on that tape, thus Turing-equivalent.
> It is not a general language as it lacks many basic practicalities, but as
> pure computation, indeed it is possible to compute anything in that
> language.
>
> The advantage of this scheme is that the meta-language is executed at
> language compile time, and the developer can see (by observing the
> compilation process) whether the meta-program halts or not.
> However, the end-user executing the program is assured that the program,
> delivered as a compiled binary, will indeed terminate, as the base language
> is total and non-Turing-complete (i.e. the halting problem is trivial for
> the base language --- all programs halt).
>
> I even have started designing a syntax scheme that adds in infix notation
> and indent-sensitivity to a Lisp-like syntax, at the cost of disallowing
> typical Lisp-like names like `pair?`, e.g.
>
>     foo x = value (bar x)
>       where
>         bar x = x
>
> is equivalent to:
>
>     (`=` (foo x)
>          (value (bar x)
>                 (where
>                   (`=` (bar x) x))))
>
> I can provide more details if interested.
>
> Note that the base language is not embedded in the meta-language, as the
> meta-language is effectively only capable of talking about how the
> utterance in the base language is constructed --- the meta-language is not
> quite general enough (i.e. the meta-language cannot implement "Hello
> World").
> Thus coding theory should imply that this should lead to more succinct
> utterances (in general).
> From this point of view, language design is about striking a balance
> between the low input bandwidth of neurotypical human brains (thus
> compression is needed, i.e. the language encourages succinct programs) and
> the limited processing power of neurotypical human brains (thus
> decompression speed is needed, i.e. it should be obvious what something
> expands to).
>
>
> Regards,
> ZmnSCPxj
>