--- Log opened Fri Dec 20 00:00:42 2019 03:30 < elichai2> About #702, do we want it to be compile time / runtime or both? (the illegal callback started as runtime but now it's both) 03:44 < gmaxwell> I don't think runtime is interesting. If someone wants some runtime thing they can compile time override with something runtime adjustable. 03:44 < elichai2> gmaxwell: although bitcoin core now wants to add runtime tests to see which hash impl is actually faster 03:44 < gmaxwell> I also don't think runtime can be done without crapping over every interface with a bunch of additional function pointers (and function pointers themselves are not great for security...) 03:45 < elichai2> sometimes AVX/SSE/SHA-NI might be actually slower. especially when some processors are faking it 03:45 < gmaxwell> elichai2: _testing_ for runtime usually doesn't work well, because your measurements get messed up by other processes. 03:45 < elichai2> gmaxwell: you could do something like `secp256k1_context_set_illegal_callback` 03:45 < gmaxwell> not thread safe. 03:45 < gmaxwell> just makes it harder to safely use the library. 03:46 < elichai2> really? it's not thread safe? why? 03:46 < gmaxwell> I think you missed my comment above: if you're dying to have some dynamic substitution, just simply substitute with your dynamic switching function. 03:46 < elichai2> (anyhow the bigger problem I see is the state sturct. we either decide the size of it and make it an opaque struct that you can then cast into your own struct, or make init return a void pointer which requires heap allocation :/ ) 03:46 < gmaxwell> elichai2: because it mutates the state. 03:47 < elichai2> gmaxwell: the state is created by the init function. the functions themselves don't contain the state AFAIU 03:48 < elichai2> gmaxwell: oh right. that makes sense (the dynamic switching thingy) 03:49 < gmaxwell> 03:45:03 < elichai2> sometimes AVX/SSE/SHA-NI might be actually slower. especially when some processors are faking it 03:50 < gmaxwell> I don't believe thats the case for any of those options, have a reference? 03:50 < elichai2> we could force the current state struct `typedef struct {uint32_t s[8];uint32_t buf[16]; /* In big endian */size_t bytes;} secp256k1_sha256;` but that's a bad idea. we could also make an opaque 128 byte struct that could then cast to whatever 03:51 < gmaxwell> (there are some SIMD versions that were slower on some systems, but they were at best only slightly faster on the machines they were faster on, so we didn't include them.) 03:51 < elichai2> gmaxwell: I remember AMD faking hardware support which were implemented in software but I might be wrong here 03:54 < gmaxwell> What you're remembering is just that sometimes earlier SIMD hardware is less wide than the vectors themselves, so say the vector is 256 bits wide, but the processor only has enough execution units to do 128 bits in parallel. This isn't unique to AMD, it's happened for every new SIMD instruction set from intel too. It was notable on AMD Zen because the wider version on intel had been common for a 03:54 < gmaxwell> long time when zen came out. (And zen2 avx2 is now 256 bits wide). In any case, none of this resulted in any of the functions in bitcoin core being slower than an alternative in it. 03:55 < elichai2> sound like it 03:55 < gmaxwell> probably in those sorts of cases if something really were slower than you'd expect from the support-- you'd probably be best off just blacklisting some chips. 03:55 < gmaxwell> simply because of benchmarking being slow, unreliable, or both. :P 03:56 < elichai2> so no one is talking about runtime benchmarks? if so my memory is really garbage today lol 03:57 < gmaxwell> well someone might have! but if so I'd like to know where so I can suggest against it! 03:58 < elichai2> i'll try to look for it. you have any thoughts on how to manage the state with the external functions? (without adding yet another pointer to the API) 03:59 -!- jonatack [~jon@2a01:e0a:53c:a200:bb54:3be5:c3d0:9ce5] has quit [Ping timeout: 248 seconds] 04:01 < gmaxwell> manage which state? 04:02 < elichai2> the sha256 state 04:02 < elichai2> we could force the current state struct `typedef struct {uint32_t s[8];uint32_t buf[16]; /* In big endian */size_t bytes;} secp256k1_sha256;` but that's a bad idea. we could also make an opaque 128 byte struct that could then cast to whatever 04:02 < elichai2> or make the init function return a void pointer which IMHO is the worst because it means heap allocation 04:03 < elichai2> (or a static state which is thread unsafe) 04:05 < gmaxwell> ah. require the user to provide the non-opaque type in scope, the define just tells us its name... which is exactly how sha256 gets implemented in codebases anyways. We'd already require it provide a particular signature. 04:05 < gmaxwell> I think one of the comments on that issue suggested making the option simply omit the code... and then let the user provide somethings that links. 04:06 < elichai2> thats what i'm trying to do. but you still need to carry some state between the 3 functions 04:06 < elichai2> and you need to have it in scope (ie included) so just making it a macro doesn't quite help unless the macro also gives the struct definition (unless there are cool tricks I don't know about) 04:09 < gmaxwell> the 'state' is just some ctx object. Which the user needs to get its typedef into our view somehow. I suppose it can just let you provide the definition, thats kinda ugly. 04:11 < elichai2> yeah. that's why I thought we could maybe require to use some opaque struct which is big enough to be casted into whatever 04:12 < elichai2> and then the user will cast it into his own type as long as our opaque is bigger 04:12 < elichai2> (altough alignment might be a problem) 04:13 < gmaxwell> thats just ugly. perhaps my suggestion just wasn't a good one. :) 04:14 < gmaxwell> (and maybe it should have been 'reorg the codebase slightly to make it really easy to modify to use your own sha256') 04:14 < elichai2> I'm sure there's a better way to do it :/ void pointers are the most ergonomic but adding heap back sucks 04:14 < elichai2> to modify the codebase? not compile time? 04:15 < elichai2> that sucks too :( 04:15 < gmaxwell> not only does it suck, but that just isn't how anyone's existant sha256 code works. having to create a heap wrapper would be more work than just modifying the code to change an include and leave something out of the linker. 04:16 < sipa> also remember that for many uses of sha256 the amount of bytes hashed is tiny 04:16 < gmaxwell> and the malloc could be more expensive than the hash (or at least more expensive than the difference between a fast and a slow hash) 04:16 < sipa> and asking the caller to copy them into a buffer may be less work than serializing/deserializing the sha256 state tk memory 04:17 < sipa> so at least for simole use cases a single full function may actually be faster 04:17 < elichai2> that's why heap isn't a good idea. (especially after we got the library to work without heap) 04:17 < elichai2> a single function does solve this 04:18 < gmaxwell> sipa: yes, but that requires you to materialize the input... which is a non-starter in at least some cases. 04:18 < sipa> gmaxwell: in some cases, yes 04:18 < sipa> but do we have any of those? 04:18 < gmaxwell> also is unlike how typical implementations work, so you're back to asking the user to write some custom wrapper, instead of changing and include and a build setting. :) 04:19 < gmaxwell> sipa: presumably the hashing in batch validation takes a lot of input. 04:19 < sipa> oh yes, good point 04:21 < gmaxwell> probably the code could be adjusted so that swapping out the sha256 compression function requires litterally one include change and one define (to ifdef out the internal copy) ... that would be pretty minimial. (assuming your own implementation provides the same interface as our internal one) 04:21 < gmaxwell> (aside the typedef on the interface in bitcoin core is somewhat different, ints instead of bytes or something, IIRC) 04:21 < sipa> i think our internal interface is oretty typical 04:22 < gmaxwell> yep. 04:23 < elichai2> yep. it's a very standard API. 04:23 < elichai2> sipa: you really think that sometimes the `memcpy` from the opaque to the actual might be more expensive than a hash? 04:24 < sipa> elichai2: no 04:25 < elichai2> oh I misunderstood you. so what are your thoughts on doing something like `typedef struct {unsigned char data[128];} secp256k1_sha256`for the state? 04:25 < elichai2> and then the implementor `memcpy` his part into his own state struct 04:26 < gmaxwell> this stops being particularly interesting if the caller has to do any coding beyond adapting a function name. 04:26 < elichai2> which the only requirement is that it should be less than 128 byte 04:26 < gmaxwell> having to use some weirdo opaque type that has dubious size/alignment properties sounds like custom coding. 04:26 < gmaxwell> The thing any change has to be easier then is basically 10 seconds work changing a build script. 04:26 < sipa> but memcpy the data to be hashed into a single buffer may be less work than loading/saving between init uodate finalize 04:26 < gmaxwell> (well not currently, but it could be made that easy) 04:27 < elichai2> gmaxwell: he anyway needs to add `extern "C" ` functions/wrappers. 04:27 < elichai2> altough in C++ it's almost no work if you adapt the names and interface 04:28 < elichai2> gmaxwell: i'm also thinking of languages using bindings like rust. which there it doesn't help you to modify the C code 04:28 < elichai2> sipa: but it's not practical for batch verify 04:28 < gmaxwell> Replacing parts of the library with code written in weird languages that aren't naturally C-callable isn't interesting to me. 04:29 < gmaxwell> (I mean, something might well want to do that-- good for them, but I don't care about it) 04:29 < gmaxwell> I do care about things like HW wallets not wanting to use libsecp256k1 because it's 30kb larger than their existing code, and 19kb of that is just from the redundant sha2 implementation... 04:30 < elichai2> I do :) idk the performance of the current rust sha256 libraries but if there are some very fast ones I'd appreciate being able to let users of the lib set their own functions in some way (or even rust-secp itself might want to do that) 04:30 < gmaxwell> or where signing or batch veriy is quite a few percent slower than it would be because it's not using sha-ni. 04:30 < elichai2> but I understand the HW usecase might be more interesting 04:31 < gmaxwell> elichai2: none of these rust implementations are anything but just transliterations from a C one. just use the C one. 04:31 < sipa> for the HW use case you'd rather have some #defines that you can override with your own sha256 implementation that gets statically linked in 04:31 < sipa> gmaxwell: they may take advantage of sha-ni & sse & co 04:32 < sipa> which is of course also perfectly possible in c, but yiu still need to have it 04:32 < gmaxwell> sipa: sure, because they're transmiterations of C code that does. :) so just use that. 04:32 < elichai2> it's easier in rust to take advantage of sha-ni/sse/avx because you don't need autotools for that 04:33 < elichai2> it's a language built in to check for those on compile time 04:33 < gmaxwell> what?! you don't need autotools to do that elsewhere either. 04:33 < sipa> autotools lets you detect whether your compiler supports things 04:33 < gmaxwell> And in fact, if you feel like using gnu extensions C++ will also do runtime capability detection and overriding for you. 04:34 < sipa> not whether they're available at runtime 04:34 < gmaxwell> (and why snub gnu extensions, they're supported much more widely than rust itself is. :) ) 04:34 < gmaxwell> in any case, I don't think "this one application has to do a bunch of wrapper work" makes an interface that requires wrapper work particularly interesting. 04:36 < gmaxwell> Like I could just stop spending time on this discussion and we're left with the status quo where someone can substutite the function with a few minutes work by modifing the code. (I know because I recently stubbed in core's sha-ni for benchmarking, and that took about 15 minutes) It's more work than need be, because the sha2 part isn't fully isolated from the hmac wrapper, etc. 04:36 < gmaxwell> (and core's code is written in C++ so that 15 minutes also included 'converting' it to C) 04:37 < elichai2> fair. we can settle on just seperating the sha256 code better 04:38 < elichai2> even though it also means even for core maintaining patches to libsecp which will probably mess with git subtree 04:39 < sipa> can't it just be some ./configure flags to define the size/init/update/finalize functions? 04:40 < elichai2> size of the state? and then what do you do with the size? 04:40 < gmaxwell> sipa: thats what my issue was attempting to suggest. It asl needs to give it an include to get the context typedef. 04:40 < gmaxwell> also* 04:41 < sipa> or that, and a type name 04:41 < sipa> so you don't need to give size and alignment etc 04:41 < gmaxwell> Right. 04:42 < gmaxwell> you'd just pass it defines with the names of the function and the context type. ... and the header file name that needs to get included. 04:43 < gmaxwell> I don't think I've ever passed in a header file name into a build system beore... not sure if there are gotchas there. 04:44 < gmaxwell> I think I have things use defines to decide to include an optional file that you create. 04:44 < gmaxwell> analogous to config.h. 04:47 < elichai2> why can't we declare a struct but link in the definition :/ 04:48 < gmaxwell> because you don't know its size without the definition. 04:48 < sipa> because types don't exist at link time 04:48 < elichai2> you could do `--external-sha256=sha256_sse41.h` and then align the interface to be correct 04:49 < elichai2> I know i'm just complaining 04:50 < gmaxwell> essentially. if a header has to be provided, you could also expect it to define all the function names. 04:50 < elichai2> you still need to know what to call 04:50 < gmaxwell> like just use secp256k1_sha256_foo (whatever it is now) and that header can just #define that string to whatever your function is called. 04:51 < elichai2> oh with macros. right 05:22 -!- jonatack [~jon@54.76.13.109.rev.sfr.net] has joined #secp256k1 07:13 -!- andytoshi [~apoelstra@unaffiliated/andytoshi] has quit [Ping timeout: 260 seconds] 07:29 -!- jonatack [~jon@54.76.13.109.rev.sfr.net] has quit [Ping timeout: 260 seconds] 09:02 -!- jonatack [~jon@2a01:e0a:53c:a200:bb54:3be5:c3d0:9ce5] has joined #secp256k1 09:39 -!- andytoshi [~apoelstra@unaffiliated/andytoshi] has joined #secp256k1 12:56 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has quit [Quit: ZNC - http://znc.sourceforge.net] 12:58 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has joined #secp256k1 13:33 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has quit [Quit: ZNC - http://znc.sourceforge.net] 13:34 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has joined #secp256k1 14:17 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has quit [Read error: Connection reset by peer] 14:19 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has joined #secp256k1 15:16 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has quit [Quit: ZNC - http://znc.sourceforge.net] 15:17 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has joined #secp256k1 16:03 -!- BlueMatt [~BlueMatt@unaffiliated/bluematt] has quit [Ping timeout: 250 seconds] 16:04 -!- BlueMatt [~BlueMatt@unaffiliated/bluematt] has joined #secp256k1 19:11 -!- andytoshi [~apoelstra@unaffiliated/andytoshi] has quit [Ping timeout: 260 seconds] --- Log closed Fri Dec 20 19:36:00 2019 --- Log opened Fri Dec 20 19:36:00 2019 20:19 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has quit [Quit: ZNC - http://znc.sourceforge.net] 20:21 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has joined #secp256k1 --- Log closed Sat Dec 21 00:00:44 2019