Re: [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed

From: Anthony Towns <aj@erisian•com.au>
To: ZmnSCPxj <ZmnSCPxj@protonmail•com>,
	Bitcoin Protocol Discussion
	<bitcoin-dev@lists•linuxfoundation.org>
Subject: Re: [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks
Date: Wed, 23 Mar 2022 09:11:05 +1000	[thread overview]
Message-ID: <20220322231104.GA11179@erisian.com.au> (raw)
In-Reply-To: <NGFW5p2Gl4t6AqL2E29THMT5DbppMJlB6bdUE6nxAdMajxeFcoRNdt5axNLql08EoyIMsBgZHHHYt_MiITZwzyGZIz0iFX4vaKIYrVV2QhU=@protonmail.com>

On Tue, Mar 22, 2022 at 05:37:03AM +0000, ZmnSCPxj via bitcoin-dev wrote:
> Subject: Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

(Have you considered applying a jit or some other compression algorithm
to your emails?)

> Microcode For Bitcoin SCRIPT
> ============================
> I propose:
> * Define a generic, low-level language (the "RISC language").

This is pretty much what Simplicity does, if you optimise the low-level
language to minimise the number of primitives and maximise the ability
to apply tooling to reason about it, which seem like good things for a
RISC language to optimise.

> * Define a mapping from a specific, high-level language to
>   the above language (the microcode).
> * Allow users to sacrifice Bitcoins to define a new microcode.

I think you're defining "the microcode" as the "mapping" here.

This is pretty similar to the suggestion Bram Cohen was making a couple
of months ago:

https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-December/019722.html
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019773.html
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019803.html

I believe this is done in chia via the block being able to
include-by-reference prior blocks' transaction generators:

] transactions_generator_ref_list: List[uint32]: A list of block heights of previous generators referenced by this block's generator.
  - https://docs.chia.net/docs/05block-validation/block_format

(That approach comes at the cost of not being able to do full validation
if you're running a pruning node. The alternative is to effectively
introduce a parallel "utxo" set -- where you're mapping the "sacrificed"
BTC as the nValue and instead of just mapping it to a scriptPubKey for
a later spend, you're permanently storing the definition of the new
CISC opcode)

> We can then support a "RISC" language that is composed of
> general instructions, such as arithmetic, SECP256K1 scalar
> and point math, bytevector concatenation, sha256 midstates,
> bytevector bit manipulation, transaction introspection, and
> so on.

A language that includes instructions for each operation we can think
of isn't very "RISC"... More importantly it gets straight back to the
"we've got a new zk system / ECC curve / ... that we want to include,
let's do a softfork" problem you were trying to avoid in the first place.

> Then, the user creates a new transaction where one of
> the outputs contains, say, 1.0 Bitcoins (exact required
> value TBD),

Likely, the "fair" price would be the cost of introducing however many
additional bytes to the utxo set that it would take to represent your
microcode, and the cost it would take to run jit(your microcode script)
if that were a validation function. Both seem pretty hard to manage.

"Ideally", I think you'd want to be able to say "this old microcode
no longer has any value, let's forget it, and instead replace it with
this new microcode that is much better" -- that way nodes don't have to
keep around old useless data, and you've reduced the cost of introducing
new functionality.

Additionally, I think it has something of a tragedy-of-the-commons
problem: whoever creates the microcode pays the cost, but then anyone
can use it and gain the benefit. That might even end up creating
centralisation pressure: if you design a highly decentralised L2 system,
it ends up expensive because people can't coordinate to pay for the
new microcode that would make it cheaper; but if you design a highly
centralised L2 system, you can just pay for the microcode yourself and
make it even cheaper.

This approach isn't very composable -- if there's a clever opcode
defined in one microcode spec, and another one in some other microcode,
the only way to use both of them in the same transaction is to burn 1
BTC to define a new microcode that includes both of them.

> We want to be able to execute the defined microcode
> faster than expanding an `OP_`-code SCRIPT to a
> `UOP_`-code SCRIPT and having an interpreter loop
> over the `UOP_`-code SCRIPT.
>
> We can use LLVM.

We've not long ago gone to the effort of removing openssl as a consensus
critical dependency; and likewise previously removed bdb.  Introducing a
huge new dependency to the definition of consensus seems like an enormous
step backwards.

This would also mean we'd be stuck at the performance of whatever version
of llvm we initially adopted, as any performance improvements introduced
in later llvm versions would be a hard fork.

> On the other hand, LLVM bugs are compiler bugs and
> the same bugs can hit the static compiler `cc`, too,

"Well, you could hit Achilles in the heel, so really, what's the point
of trying to be invulnerable anywhere else?"

> Then we put a pointer to this compiled function to a
> 256-long array of functions, where the array index is
> the `OP_` code.

That's a 256-long array of functions for each microcode, which increases
the "microcode-utxo" database storage size substantially.

Presuming there are different jit targets (x86 vs arm?) it seems
difficulty to come up with a consistent interpretation of the cost for
these opcodes.

I'm skeptical that a jit would be sufficient for increasing the
performance of an implementation just based on basic arithmetic opcodes
if we're talking about something like sha512 or bls12-381 or similar.

> Bugs in existing microcodes can be fixed by basing a
> new microcode from the existing microcode, and
> redefining the buggy implementation.
> Existing Tapscripts need to be re-spent to point to
> the new bugfixed microcode, but if you used the
> point-spend branch as an N-of-N of all participants
> you have an upgrade mechanism for free.

It's not free if you have to do an on-chain spend... 

The "1 BTC" cost to fix the bug, and the extra storage in every node's
"utxo" set because they now have to keep both the buggy and fixed versions
around permanently sure isn't free either. If you're re-jitting every
microcode on startup, that could get pretty painful too.

If you're proposing introducing byte vector manipulation and OP_CAT and
similar, which enables recursive covenants, then it might be good to
explain how this proposal addresses the concerns raised at the end of
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-March/020092.html

Cheers,
aj

next prev parent reply	other threads:[~2022-03-22 23:11 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-22  5:37 ZmnSCPxj
2022-03-22 15:08 ` Russell O'Connor
2022-03-22 16:22   ` ZmnSCPxj
2022-03-22 16:28     ` Russell O'Connor
2022-03-22 16:39       ` ZmnSCPxj
2022-03-22 16:47         ` ZmnSCPxj
2022-03-22 23:11 ` Anthony Towns [this message]
2022-03-23  0:20   ` ZmnSCPxj

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220322231104.GA11179@erisian.com.au \
    --to=aj@erisian$(echo .)com.au \
    --cc=ZmnSCPxj@protonmail$(echo .)com \
    --cc=bitcoin-dev@lists$(echo .)linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox