public inbox for bitcoindev@googlegroups.com
From: ZmnSCPxj <ZmnSCPxj@protonmail•com>
To: Anthony Towns <aj@erisian•com.au>
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists•linuxfoundation.org>
Subject: Re: [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks
Date: Wed, 23 Mar 2022 00:20:16 +0000	[thread overview]
Message-ID: <6z4zgwg-r_EKOmZKCC1KyCmSjkZBbzHOKXHiMQf6th4r_PHDbMuCqSQ366hz6LRhdX25YI6IElcr9bFOVsu78UUns-ZNIt-YPgMqEwyg9ZM=@protonmail.com> (raw)
In-Reply-To: <20220322231104.GA11179@erisian.com.au>

Good morning aj,

> On Tue, Mar 22, 2022 at 05:37:03AM +0000, ZmnSCPxj via bitcoin-dev wrote:
>
> > Subject: Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks
>
> (Have you considered applying a jit or some other compression algorithm
> to your emails?)
>
> > Microcode For Bitcoin SCRIPT
> >
> > =============================
> >
> > I propose:
> >
> > -   Define a generic, low-level language (the "RISC language").
>
> This is pretty much what Simplicity does, if you optimise the low-level
> language to minimise the number of primitives and maximise the ability
> to apply tooling to reason about it, which seem like good things for a
> RISC language to optimise.
>
> > -   Define a mapping from a specific, high-level language to
> >     the above language (the microcode).
> >
> > -   Allow users to sacrifice Bitcoins to define a new microcode.
>
> I think you're defining "the microcode" as the "mapping" here.

Yes.

>
> This is pretty similar to the suggestion Bram Cohen was making a couple
> of months ago:
>
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-December/019722.html
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019773.html
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019803.html
>
> I believe this is done in chia via the block being able to
> include-by-reference prior blocks' transaction generators:
>
> ] transactions_generator_ref_list: List[uint32]: A list of block heights of previous generators referenced by this block's generator.
>
> -   https://docs.chia.net/docs/05block-validation/block_format
>
>     (That approach comes at the cost of not being able to do full validation
>     if you're running a pruning node. The alternative is to effectively
>     introduce a parallel "utxo" set -- where you're mapping the "sacrificed"
>     BTC as the nValue and instead of just mapping it to a scriptPubKey for
>     a later spend, you're permanently storing the definition of the new
>     CISC opcode)
>
>

Yes, the latter is basically what microcode is.

> > We can then support a "RISC" language that is composed of
> > general instructions, such as arithmetic, SECP256K1 scalar
> > and point math, bytevector concatenation, sha256 midstates,
> > bytevector bit manipulation, transaction introspection, and
> > so on.
>
> A language that includes instructions for each operation we can think
> of isn't very "RISC"... More importantly it gets straight back to the
> "we've got a new zk system / ECC curve / ... that we want to include,
> let's do a softfork" problem you were trying to avoid in the first place.

`libsecp256k1` runs on purely RISC machines like ARM, so the claim that a "RISC" set of opcodes cannot implement some arbitrary ECC curve, merely because the instruction set does not directly support that curve, seems incorrect.

Any new zk system / ECC curve would have to be implementable in C++, so if you have micro-operations that would be needed for it, such as XORing two multi-byte vectors together, multiplying multi-byte precision numbers, etc., then any new zk system or ECC curve would be implementable in microcode.
For that matter, you could re-write `libsecp256k1` there.

> > Then, the user creates a new transaction where one of
> > the outputs contains, say, 1.0 Bitcoins (exact required
> > value TBD),
>
> Likely, the "fair" price would be the cost of introducing however many
> additional bytes to the utxo set that it would take to represent your
> microcode, and the cost it would take to run jit(your microcode script)
> if that were a validation function. Both seem pretty hard to manage.
>
> "Ideally", I think you'd want to be able to say "this old microcode
> no longer has any value, let's forget it, and instead replace it with
> this new microcode that is much better" -- that way nodes don't have to
> keep around old useless data, and you've reduced the cost of introducing
> new functionality.

Yes, but that invites "I accidentally the smart contract" behavior: a still-unspent contract may depend on a microcode that has been declared worthless and forgotten.

> Additionally, I think it has something of a tragedy-of-the-commons
> problem: whoever creates the microcode pays the cost, but then anyone
> can use it and gain the benefit. That might even end up creating
> centralisation pressure: if you design a highly decentralised L2 system,
> it ends up expensive because people can't coordinate to pay for the
> new microcode that would make it cheaper; but if you design a highly
> centralised L2 system, you can just pay for the microcode yourself and
> make it even cheaper.

The same "tragedy of the commons" applies to FOSS.
"whoever creates the FOSS pays the cost, but then anyone can use it and gain the benefit"
This seems like an argument against releasing a FOSS node software.

Remember, microcode is software too, and copying software does not have a tragedy of the commons --- the main point of a tragedy of the commons is that the commons is *degraded* by the use but nobody has incentive to maintain against the degradation.
But using software does not degrade the software; if I give you a copy of my software, I do not lose my own copy, which is why FOSS works.

In order to make a highly-decentralized L2, you need to cooperate with total strangers, possibly completely anonymously, in handling your money.
I imagine that the level of cooperation needed in, say, Lightning network, would be far above what is necessary to gather funds from multiple people who want a particular microcode to happen until enough funds have been gathered to make the microcode happen.

For example, create a fresh address for an amount you, personally, are willing to contribute in order to make the microcode happen.
(If you are willing to spend the time and energy arguing on bitcoin-dev, then you are willing to contribute, even if others benefit in addition to yourself; that time and energy has a corresponding Bitcoin value.)
Then spend it using `SIGHASH_ANYONECANPAY | SIGHASH_SINGLE`, with the microcode-introduction output as the single output you are signing.
Gather enough such signatures from a community around a decentralized L2, and you can achieve the necessary total funds for the microcode to happen.
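A minimal sketch of that gathering step (all names, amounts, and the `Contribution` structure are hypothetical; real code would assemble an actual transaction with a wallet library):

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical: one contributor's input, pre-signed with
// SIGHASH_ANYONECANPAY | SIGHASH_SINGLE over the shared microcode output.
struct Contribution {
    std::string outpoint;   // txid:vout of the output being spent
    uint64_t amount_sats;   // value of that output
    std::string signature;  // ANYONECANPAY|SINGLE signature (placeholder)
};

// Combine pledged contributions until the microcode-introduction output
// is funded.  Returns the selected contributions, or empty if underfunded.
std::vector<Contribution> gather(const std::vector<Contribution>& pool,
                                 uint64_t target_sats) {
    std::vector<Contribution> chosen;
    uint64_t total = 0;
    for (const auto& c : pool) {
        chosen.push_back(c);
        total += c.amount_sats;
        if (total >= target_sats) return chosen;
    }
    return {};  // not enough pledged yet; keep collecting signatures
}
```

Because each signature commits only to the contributor's own input and the single shared output, anyone can merge the contributions into one transaction, and no contributor's pledge can be redirected elsewhere.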


> This approach isn't very composable -- if there's a clever opcode
> defined in one microcode spec, and another one in some other microcode,
> the only way to use both of them in the same transaction is to burn 1
> BTC to define a new microcode that includes both of them.

Yes, that is indeed a problem.

> > We want to be able to execute the defined microcode
> > faster than expanding an `OP_`-code SCRIPT to a
> > `UOP_`-code SCRIPT and having an interpreter loop
> > over the `UOP_`-code SCRIPT.
> > We can use LLVM.
>
> We've not long ago gone to the effort of removing openssl as a consensus
> critical dependency; and likewise previously removed bdb. Introducing a
> huge new dependency to the definition of consensus seems like an enormous
> step backwards.
>
> This would also mean we'd be stuck at the performance of whatever version
> of llvm we initially adopted, as any performance improvements introduced
> in later llvm versions would be a hard fork.

Yes, LLVM is indeed the weak link in this idea.
We could use NaCl instead, which probably has fewer issues /s.

> > On the other hand, LLVM bugs are compiler bugs and
> > the same bugs can hit the static compiler `cc`, too,
>
> "Well, you could hit Achilles in the heel, so really, what's the point
> of trying to be invulnerable anywhere else?"

Yes, LLVM is indeed the weak point here.

We could just concatenate some C++ code together when a new microcode is introduced, compile it statically, store the resulting binary somewhere, and invoke it whenever validation needs to run.
At least LLVM would be isolated into its own process in that case.
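The per-microcode dispatch could then be sketched as a table of function pointers, copied from a default table and selectively overridden with the compiled handlers (all names here are illustrative):

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using Stack = std::vector<std::vector<uint8_t>>;
using OpFn = void (*)(Stack&);

// Default behavior shared by all microcodes (placeholders).
void op_nop(Stack&) {}
void op_drop(Stack& s) { s.pop_back(); }

// Base table: the 256 default OP_ handlers.
std::array<OpFn, 256> base_table() {
    std::array<OpFn, 256> t;
    t.fill(op_nop);
    t[0x75] = op_drop;  // e.g. OP_DROP
    return t;
}

// A new microcode starts from the base table and overrides only the
// opcodes it redefines, pointing them at its compiled implementations.
std::array<OpFn, 256> make_microcode(uint8_t opcode, OpFn compiled) {
    auto t = base_table();
    t[opcode] = compiled;
    return t;
}
```

Interpretation of a Tapscript under a given microcode is then just indexing this table with each `OP_` code and calling through the pointer.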

> > Then we put a pointer to this compiled function to a
> > 256-long array of functions, where the array index is
> > the `OP_` code.
>
> That's a 256-long array of functions for each microcode, which increases
> the "microcode-utxo" database storage size substantially.
>
> Presuming there are different jit targets (x86 vs arm?) it seems
> difficulty to come up with a consistent interpretation of the cost for
> these opcodes.
>
> I'm skeptical that a jit would be sufficient for increasing the
> performance of an implementation just based on basic arithmetic opcodes
> if we're talking about something like sha512 or bls12-381 or similar.

Static compilation seems to work well enough --- and JIT vs static is a spectrum, not either/or.
The difference is really how much optimization you are willing to spend time on.
If microcodes are costly enough that they happen rarely, then using optimizations normally reserved for static compilation seems a reasonable tradeoff.

> > Bugs in existing microcodes can be fixed by basing a
> > new microcode from the existing microcode, and
> > redefining the buggy implementation.
> > Existing Tapscripts need to be re-spent to point to
> > the new bugfixed microcode, but if you used the
> > point-spend branch as an N-of-N of all participants
> > you have an upgrade mechanism for free.
>
> It's not free if you have to do an on-chain spend...
>
> The "1 BTC" cost to fix the bug, and the extra storage in every node's
> "utxo" set because they now have to keep both the buggy and fixed versions
> around permanently sure isn't free either.

Heh, poor word choice.

What I meant is that we do not need a separate upgrade mechanism, the design work here is "free".
*Using* the upgrade mechanism is costly and hence not "free".

> If you're re-jitting every
> microcode on startup, that could get pretty painful too.

When LLVM is used in a static compiler, it writes the resulting code to disk; I imagine the same mechanism can be used here, so compiled microcode is cached across restarts rather than re-jitted on every startup.

> If you're proposing introducing byte vector manipulation and OP_CAT and
> similar, which enables recursive covenants, then it might be good to
> explain how this proposal addresses the concerns raised at the end of
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-March/020092.html

It does not; I am currently exploring and generating ideas, not particularly tying myself to one idea or another.

Regards,
ZmnSCPxj


Thread overview: 8+ messages
2022-03-22  5:37 ZmnSCPxj
2022-03-22 15:08 ` Russell O'Connor
2022-03-22 16:22   ` ZmnSCPxj
2022-03-22 16:28     ` Russell O'Connor
2022-03-22 16:39       ` ZmnSCPxj
2022-03-22 16:47         ` ZmnSCPxj
2022-03-22 23:11 ` Anthony Towns
2022-03-23  0:20   ` ZmnSCPxj [this message]
