Thanks for bringing this discovery up and a big thanks to Peter Dettman for working on this.

I second what Nadav said. Removing pointless complexity is worth it even at this stage. I also maintain a non-libsecp implementation of BIP340 etc. Having two ways to convert an xonly to a point is a pain if you are trying to maintain type safe apis. If there is no performance penalty (or even a small one in the short term) to unifying xonly -> point conversion it's worth it from my perspective.

LL

On Thu, Aug 13, 2020 at 6:29 AM Nadav Kohen via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
Hello Pieter and all,

I am one of the maintainers of Bitcoin-S[1] and I maintain our secp256k1 bindings (via JNI) as well as our (inefficient) bouncy castle fallback implementations of all secp256k1 functionality we depend on including Schnorr signatures. In light of this new information that there is no real downside to using evenness as the nonce tie-breaker, I am personally very in favor of this change as it strictly simplifies things as well as making types consistent between nonces and persistent signing keys (I can get rid of our SchnorrNonce type :). An additional minor benefit not already mentioned is that in places in our codebase where deserialized data is just being passed around and not used, we currently require a computation to go from a (x-only) SchnorrNonce to an ECPublicKey whereas going from a SchnorrPublicKey simply requires pre-pending a 0x02 byte.

I am likely not aware of the entire impact that changing the BIP at this stage would have but from my view (of having to update bindings and test vectors and my fallback implementation, as well as wanting to get a stable branch on secp256k1-zkp containing both ECDSA adaptor signatures and Schnorr signatures for use in Discreet Log Contracts), I think this change is totally worth it and it will only become harder to make this simplification in the future. The schnorrsig branch has not yet been merged into secp256k1 (and is nearing this stage I think) and so long as making this change doesn't set us back more than a month (which seems unlikely) I am personally in favor of making this change. Glad to hear other's thoughts on this of course but I figured I'd voice my support :)

Best,
Nadav

[1] https://github.com/bitcoin-s/bitcoin-s/



On Wed, Aug 12, 2020 at 2:04 PM Pieter Wuille via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
Hello all,

The current BIP340 draft[1] uses two different tiebreakers for conveying the Y coordinate of points: for the R point inside signatures squaredness is used, while for public keys evenness is used. Originally both used squaredness, but it was changed[2] for public keys after observing this results in additional complexity for compatibility with existing systems.

The reason for choosing squaredness as tiebreaker was performance: in non-batch signature validation, the recomputed R point must be verified to have the correct sign, to guarantee consistency with batch validation. Whether the Y coordinate is square can be computed directly in Jacobian coordinates, while determining evenness requires a conversion to affine coordinates first.

This argument of course relies on the assumption that determining whether the Y coordinate is square can be done more efficiently than a conversion to affine coordinates. It appears now that this assumption is incorrect, and the justification for picking the squaredness tiebreaking doesn't really exist. As it comes with other trade-offs (it slows down signing, and is a less conventional choice), it would seem that we should reconsider the option of having the R point use the evenness tiebreaker (like public keys).

It is late in the process, but I feel I owe this explanation so that at least the possibility of changing can be discussed with all information. On the upside, this was discovered in the context of looking into a cool improvement to libsecp256k1[5], which makes things faster in general, but specifically benefits the evenness variant.


# 1. What happened?

Computing squaredness is done through the Jacobi symbol (same inventor, but unrelated to Jacobian coordinates). Computing evenness requires converting points to affine coordinates first, and that needs a modular inverse. The assumption that Jacobi symbols are faster to compute than inverses was based on:

* A (possibly) mistaken belief about the theory: fast algorithms for both Jacobi symbols and inverses are internally based on variants of the same extended GCD algorithm[3]. Since an inverse needs to extract a full big integer out of the transition steps made in the extgcd algorithm, while the Jacobi symbol just extracts a single bit, it had seemed that any advances applicable to one would be applicable to the other, but inverses would always need additional work on top. It appears however that a class of extgcd algorithms exists (LSB based ones) that cannot be used for Jacobi calculations without losing efficiency. Recent developments[4] and a proposed implementation in libsecp256k1[5] by Peter Dettman show that using this, inverses in some cases can in fact be faster than Jacobi symbols.

* A broken benchmark. This belief was incorrectly confirmed by a broken benchmark[6] in libsecp256k1 for the libgmp-based Jacobi symbol calculation and modular inverse. The benchmark was repeatedly testing the same constant input, which apparently was around 2.5x faster than the average speed. It is a variable-time algorithm, so a good variation of inputs matters. This mistake had me (and probably others) convinced for years that Jacobi symbols were amazingly fast, while in reality they were always very close in performance to inverses.


# 2. What is the actual impact of picking evenness instead?

It is hard to make very generic statements here, as BIP340 will hopefully be used for a long time, and hardware advancements and algorithmic improvements may change the balance. That said, performance on current hardware with optimized algorithms is the best approximation we have.

The numbers below give the expected performance change from squareness to evenness, for single BIP340 validation, and for signing. Positive numbers mean evenness is faster. Batch validation is not impacted at all.

In the short term, for block validation in Bitcoin Core, the numbers for master-nogmp are probably the most relevant (as Bitcoin Core uses libsecp256k1 without libgmp, to reduce consensus-critical dependencies). If/when [5] gets merged, safegcd-nogmp will be what matters. On a longer time scale, the gmp numbers may be more relevant, as the Jacobi implementation there is certainly closer to the state of the art.

* i7-7820HQ: (verify) (sign)
  - master-nogmp: -0.3% +16.1%
  - safegcd-nogmp: +6.7% +17.1%
  - master-gmp: +0.6% +7.7%
  - safegcd-gmp: +1.6% +8.6%

* Cortex-A53: (verify) (sign)
  - master-nogmp: -0.3% +15.7%
  - safegcd-nogmp: +7.5% +16.9%
  - master-gmp: +0.3% +4.1%
  - safegcd-gmp: 0.0% +3.5%

* EPYC 7742: (verify) (sign)
  - master-nogmp: -0.3% +16.8%
  - safegcd-nogmp: +8.6% +18.4%
  - master-gmp: 0.0% +7.4%
  - safegcd-gmp: +2.3% +7.8%

In well optimized cryptographic code speedups as large as a couple percent are difficult to come by, so we would usually consider changes of this magnitude relevant. Note however that while the percentages for signing speed are larger, they are not what is unexpected here. The choice for the square tiebreaker was intended to improve verification speed at the cost of signing speed. As it turns out that it doesn't actually benefit verification speed, this is a bad trade-off.


# 3. How big a change is it

* In the BIP:
  - Changing both invocations of `has_square_y` to `has_even_y`.
  - Changing the `lift_x_square_y` invocation to `lift_x_even_y`.
  - Applying the same change to the test vector generation code, and the resulting test vectors.
* In the libsecp256k1:
  - An 8-line patch to the proposed BIP340 implementation[7]: see [8]
* In Bitcoin Core:
  - Similarly small changes to the Python test reimplementation[9]
* Duplicating these changes in other draft implementations that may already exist.
* Review for all the above.


# 4. Conclusion

We discovered that the justification for using squaredness tiebreakers in BIP340 is based on a misunderstanding, and recent developments show that it may in fact be a somewhat worse choice than the alternative. It is a relatively simple change to address this, but that has be weighed against the impact of changing the standard at this stage.

Thoughts?


# 5. References

  [1] https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki#design
  [2] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2020-February/017639.html
  [3] https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm
  [4] https://gcd.cr.yp.to/safegcd-20190413.pdf
  [5] https://github.com/bitcoin-core/secp256k1/pull/767
  [6] https://github.com/bitcoin-core/secp256k1/pull/797
  [7] https://github.com/bitcoin-core/secp256k1/pull/558
  [8] https://github.com/sipa/secp256k1/commit/822311ca230a48d2c373f3e48b91b2a59e1371d6
  [9] https://github.com/bitcoin/bitcoin/pull/17977


Cheers,

--
Pieter

_______________________________________________
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
_______________________________________________
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev