--- Day changed Mon Nov 27 2017
00:16 -!- nickler [~nickler@185.12.46.130] has quit [Ping timeout: 255 seconds]
00:17 -!- nickler [~nickler@185.12.46.130] has joined #secp256k1
02:35 -!- jtimon [~quassel@164.31.134.37.dynamic.jazztel.es] has joined #secp256k1
04:13 -!- nickler [~nickler@185.12.46.130] has quit [Ping timeout: 252 seconds]
04:13 -!- nickler [~nickler@185.12.46.130] has joined #secp256k1
06:12 -!- SopaXorzTaker [~SopaXorzT@unaffiliated/sopaxorztaker] has joined #secp256k1
06:32 -!- SopaXorzTaker [~SopaXorzT@unaffiliated/sopaxorztaker] has quit [Remote host closed the connection]
06:32 -!- SopaXorzTaker [~SopaXorzT@unaffiliated/sopaxorztaker] has joined #secp256k1
09:47 -!- arubi [~ese168@gateway/tor-sasl/ese168] has quit [Remote host closed the connection]
09:54 -!- arubi [~ese168@gateway/tor-sasl/ese168] has joined #secp256k1
10:48 -!- arubi [~ese168@gateway/tor-sasl/ese168] has quit [Ping timeout: 248 seconds]
10:53 < andytoshi> so, i finally finished the rangeproof verify (had to add the missing 5 exponentiations of equation (61) and randomize those).. today my laptop is back to giving 130us/bit for the old rangeproofs, so 8250us for 64-bits, 4130 for 32-bit
10:54 < andytoshi> meanwhile the 64-bit bulletproof verifies in 2400us, for a 3.4x speedup
10:54 -!- arubi [~ese168@gateway/tor-sasl/ese168] has joined #secp256k1
10:54 < sipa> w00t
10:54 < andytoshi> yeppers
10:54 < andytoshi> and this is for one proof, when we aggregate we can expect better
10:55 < andytoshi> we'd be going from a 145-point exp to a (145+128)-point exp for an aggregate of two proofs
10:56 < andytoshi> specifically, using N/log(N) and assuming all the verify time is in the multiexp, it looks like we go up to 4.07x
11:00 < andytoshi> https://github.com/ElementsProject/secp256k1-zkp/pull/16
11:02 < andytoshi> oh, lemme rebase to not have so many extraneous commits
11:04 < andytoshi> better
11:04 < gmaxwell> Fantastic!
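The 4.07x estimate quoted above can be reproduced as a rough worked calculation. It assumes, as andytoshi states, that verification cost scales as N/log(N) in the number of points N and that all verify time is spent in the multiexp. An aggregate of two proofs then costs one 273-point multiexp shared between two proofs, against one 145-point multiexp per proof; the base of the logarithm cancels in the ratio.

```latex
\[
\frac{145/\ln 145}{\tfrac{1}{2}\,\bigl(273/\ln 273\bigr)}
  \approx \frac{29.14}{24.33}
  \approx 1.197,
\qquad
3.4 \times 1.197 \approx 4.07
\]
```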
11:05 < andytoshi> BTW regarding 48-bit or whatever rangeproofs, benedikt and i chatted by email about this and i'm fairly confident that the prover can just round up to the nearest power of 2, using the identity in place of the extra generators. then the verifier doesn't change at all, it just truncates its multiexp after 48 generators rather than doing the full 64
11:06 < andytoshi> i want to update the proof in the paper and PR this so that the stanford people can take a look at it. i think i know how but i haven't worked out all of the details at once yet
11:07 < sipa> so... another 4/3 / log(4/3) speedup?
11:07 < gmaxwell> you're still using the jonasless multi-exp, right? those are big enough that it'll be faster though not enormously so.
11:08 < andytoshi> gmaxwell: correct
11:08 < sipa> eh, 4/3 / (log(64)/log(48))
11:08 < andytoshi> sipa: yeah. though it's a little artificial since we're actually using 32-bit proofs in practice right now
11:08 < sipa> i see
11:09 < andytoshi> so even though it's unfair it might make more sense to compare the 32-bit proofs to 64-bit bulletproofs, so only a ~1.75x speedup
11:09 < gmaxwell> yes, though the 32-bit proofs are too small. they were going to drive us to implement private exponent, which would be a further slowdown.
11:09 < andytoshi> good point
11:10 < gmaxwell> esp when the goal of the CT is CJ, you can't just leak the LSBs of your amounts. (vs the goal being commercial contracting amount privacy, where you can)
11:10 < andytoshi> very good point
11:20 < andytoshi> with the endomorphism on, the old rangeproofs are 91.5us/bit (so 5856us for 64-bits). bulletproofs actually take 2-3% longer (matching sipa's graph). though with endo on, i'm firmly into "should be using pippenger" territory
11:21 < andytoshi> the speedup drops to a paltry 2.37x because the old rangeproofs do so much better
11:22 < sipa> right, multiexp with larger point count removes the advantages of endo
11:25 < andytoshi> i will email you two, and benedikt and dan, with these numbers
11:25 < gmaxwell> perhaps patch in jonas' code and try again? :P
11:25 < andytoshi> oo yeah, good call, that might be trivial
11:26 < gmaxwell> endo is useful longer with the wnaf-pippenger, IIRC.
11:26 < andytoshi> also, do either of you have a machine you can benchmark on? i am just running `./bench_rangeproof` vs `./bench_bulletproof`, it's super easy to do. my laptop is not a good benchmark machine, it's doing lots at once plus the CPU clock speed is constantly changing
11:26 < gmaxwell> well, you can pin the cpu speed.
11:26 < andytoshi> yeah, sipa's graph showing endo being useful with wnaf-pippenger out to thousands of points
11:27 < andytoshi> i'm not sure, a few kernel upgrades ago cpufreq stopped working because intel something something..
11:27 < sipa> andytoshi: you need to boot with kernel option intel_pstate=no
11:27 < andytoshi> oh, neat, thanks
11:28 < sipa> sorry, intel_pstate=disable
11:28 < sipa> https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html lists more values for that option; perhaps some other ones don't interfere with benchmarking either
11:36 -!- hdevalence [~hdevalenc@199-188-193-243.PUBLIC.monkeybrains.net] has joined #secp256k1
11:37 -!- hdevalence is now known as hdevalence_
11:41 < andytoshi> bleh, in dc055200 (get rid of precomputed H tables) sipa changed the WNAF_SIZE macro to take two arguments. jonas moves the one that takes 1 from ecmult_const_impl to ecmult_impl because he reuses it in strauss
11:42 < andytoshi> ok, just renamed one of them for now
11:51 < andytoshi> ok, with pippenger we drop from 2460us to 2260us, an additional 9%. total speedup 3.78x
11:52 < andytoshi> lemme reboot with that pstate option though and benchmark without a gui running
12:13 < andytoshi> ok, done. numbers are roughly the same: without the endomorphism both strauss and pippenger give 2360us for a 64-bit bulletproof. the old code would've taken 8200us.
12:14 < andytoshi> with the endomorphism, strauss slows down to 2420us while pippenger speeds up to 2230us. old code speeds up to 5840us.
12:15 < andytoshi> having said this, during compilation, but not during the benchmarks, i would get kernel messages whining about the CPU overheating and it throttling cores. i'd walk away from the system for a minute or two each time this happened, but it still makes me lose confidence in these numbers
12:17 < andytoshi> anyway i'll email y'all and dan/benedikt with this data
12:18 < sipa> well artificially reduce your cpu clock
12:19 < sipa> which may not be entirely representative, as you get relatively speaking faster RAM that way
12:19 < sipa> but it at least gives a fair comparison
12:22 < andytoshi> ah, i see, i should be able to set frequencies if pstate is disabled
12:23 < sipa> indeed
12:47 < andytoshi> awesome, thanks! TIL
12:47 < andytoshi> so, with my cpu set to 800mhz i get similar results (though more consistent ones after multiple runs)
12:48 < andytoshi> with endo on, a 64-bit rangeproof takes 13760us, vs 5706us for bulletproof-strauss and 5230us for bulletproof-pip. so 2.36x speedup
12:48 < andytoshi> without endo, the rangeproof takes 19584us, bf-strauss 5341us, and bf-pip 5554us. so 3.67x speedup
13:16 < andytoshi> ok, email sent
13:18 < maaku> andytoshi: each aggregation adds 2*b point exp, where b is the number of bits? so 17 + 128*b*n?
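For maaku's question about how the multiexp grows under aggregation, the count andytoshi settles on in the replies that follow can be sketched in C. This is a rough illustration, not code from the library: the function names are hypothetical, and the formula is an assumed reading of the chat in which the "128" stands for 2*64, i.e. 2*b*n + 2*log2(b*n) + 6 for n aggregated proofs of b bits each. That reading gives the corrected 146-point figure for a single 64-bit proof.

```c
#include <stddef.h>

/* Hypothetical helper: floor(log2(x)) for x >= 1. */
size_t log2_floor(size_t x) {
    size_t r = 0;
    while (x > 1) {
        x >>= 1;
        r++;
    }
    return r;
}

/* Assumed point count for verifying n_proofs aggregated proofs of `bits`
 * bits each: 2*b*n generators, 2*log2(b*n) inner-product points, and a
 * constant 6 (which, per the chat, includes the separately passed
 * generator scalar). */
size_t multiexp_points(size_t bits, size_t n_proofs) {
    size_t total_bits = bits * n_proofs;
    return 2 * total_bits + 2 * log2_floor(total_bits) + 6;
}
```

Under these assumptions, `multiexp_points(64, 1)` evaluates to 146 and `multiexp_points(64, 2)` to 276, so the "145+128" figure quoted earlier is a slight underestimate of the two-proof aggregate.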
13:18 < sipa> andytoshi: i suspect you wrote "without" twice in the performance line of your mail
13:19 < andytoshi> oops
13:19 < andytoshi> maaku: yep
13:19 < maaku> cool thanks
13:20 < andytoshi> 17 seems high, i should investigate that, i had expected 9
13:21 < andytoshi> oh, i see, it's not a fixed 17
13:21 < andytoshi> it's 128*b*n + 2*log2(b*n) + 9
13:21 < andytoshi> err, +5
13:22 < andytoshi> and i'm undercounting by 1, it should be +6 (and my earlier "145" should've been 146). it's a quirk of the multiexp API that it takes a separate scalar to multiply the generator by
15:09 < andytoshi> in C, left-shifting by too many bits is actually UB right? even if i don't use the value ever?
15:45 < sipa> unsure about that
15:45 < sipa> the specification has a concept of "indeterminate value", which is not actually UB to produce, but is UB if you use it
16:30 < gmaxwell> I am vaguely thinking it is implementation defined.
16:36 < hdevalence_> it is UB
16:37 < hdevalence_> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf#page=574
16:38 < sipa> yup, shifting negative values is implementation defined
16:38 < sipa> shifting beyond the range is undefined
16:49 < gmaxwell> sipa: C89 and C99 differ on the negative values front.
16:51 < sipa> oh?
16:57 -!- andytoshi-web [ac3a8820@gateway/web/freenode/ip.172.58.136.32] has joined #secp256k1
16:59 < andytoshi-web> :/ that's annoying. can't do `const size_t n = 1 << d` without risking UB, can't declare it after checking because of declaration/statement ordering rules, can't declare it then check then initialize because of constness rules
16:59 < andytoshi-web> sometimes C89 really irritates me
17:01 -!- hdevalence_ [~hdevalenc@199-188-193-243.PUBLIC.monkeybrains.net] has quit [Quit: hdevalence_]
17:01 < sipa> what's the problem with checking first and returning before defining?
17:04 < andytoshi-web> ?? if i return then my function won't do anything
17:05 < andytoshi-web> if i do `const size_t n; /* ...check... */ n = 1 << d;` the compiler claims i am assigning to an immutable variable
17:06 < sipa> but you can check whether d is outside of the range
17:06 < sipa> maybe i misunderstand what your conditional would do if it's outside of the range
17:07 < andytoshi-web> just return
17:07 < sipa> if (d within range) {
17:08 < andytoshi-web> the check is not the problem, the problem is that I can never assign n. If i do it before the check it's UB. if i do it after it's mutating a const
17:08 < sipa> const size_t n = 1 << d;
17:08 < sipa> ...
17:08 < sipa> } else {
17:08 < sipa> return;
17:08 < sipa> }
17:08 < andytoshi-web> ohh yes, i can wrap my entire function in an if
17:09 < andytoshi-web> that's a very hard-to-read sanity check pattern
17:10 < sipa> otherwise, wrap the function
17:10 < sipa> make a version that takes as input n
17:10 < sipa> and another function which takes as input d, and either returns immediately or calls the n version
17:11 < andytoshi-web> that is still verbose and moves logic far apart and now the inner function has a contract that can be violated by any direct callers
17:16 < sipa> perhaps just don't make it const :)
17:20 < andytoshi-web> Yeah, that'd be the cleanest thing I think :)
17:22 < sipa> constness is a tool provided by the compiler that helps you avoid certain mistakes
17:22 < sipa> but if the tool gets in the way of writing readable code, don't use it
17:29 -!- andytoshi-web [ac3a8820@gateway/web/freenode/ip.172.58.136.32] has quit [Ping timeout: 260 seconds]
17:36 -!- andytoshi-web [ac3a89f5@gateway/web/freenode/ip.172.58.137.245] has joined #secp256k1
17:37 < andytoshi-web> well, const on local variables is a tool to aid readability ... but i guess if it gets in the way of compileability I shouldn't use it
17:39 < sipa> in C99 you could have size_t val_mutable; if (cond) { return; } else { val_mutable = ...; }; const size_t val = val_mutable; ...
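The "just don't make it const" resolution from the discussion above can be sketched in C89 as follows. This is a minimal illustration rather than code from the library: `pow2_checked` is a hypothetical name, and the out-parameter style is an assumption made only to keep the example self-contained. The range check runs before the shift, so the shift is never evaluated with an out-of-range count. The cast matters too: a bare `1` has type `int`, so `1 << d` would be shifted in `int` width rather than `size_t` width.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: stores 2^d in *out and returns 1 when the result
 * fits in a size_t; returns 0 (leaving *out untouched) otherwise. */
int pow2_checked(unsigned int d, size_t *out) {
    size_t n; /* deliberately not const, so it can be assigned after the check */

    /* Shifting by >= the width of the promoted left operand is undefined,
     * so reject such shift counts before shifting. */
    if (d >= sizeof(size_t) * CHAR_BIT) {
        return 0;
    }

    n = (size_t)1 << d; /* shift a size_t, not an int */
    *out = n;
    return 1;
}
```

A caller would test the return value before using the result, for example `if (!pow2_checked(d, &n)) return;` at the top of the function that needs `n`.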
17:41 -!- nickler [~nickler@185.12.46.130] has quit [Ping timeout: 268 seconds]
17:44 < andytoshi-web> yeah, that'd be nice
17:49 -!- nickler [~nickler@185.12.46.130] has joined #secp256k1
17:53 -!- andytoshi-web [ac3a89f5@gateway/web/freenode/ip.172.58.137.245] has quit [Ping timeout: 260 seconds]
18:21 -!- kallewoof [~karl@67.205.138.199] has quit [Ping timeout: 260 seconds]
18:22 -!- kallewoof [~karl@67.205.138.199] has joined #secp256k1
19:10 -!- jtimon [~quassel@164.31.134.37.dynamic.jazztel.es] has quit [Ping timeout: 268 seconds]