--- Day changed Tue Sep 15 2015
01:48 -!- andytoshi [~andytoshi@unaffiliated/andytoshi] has quit [Ping timeout: 250 seconds]
--- Log opened Thu Sep 24 12:57:41 2015
12:57 -!- kanzure [~kanzure@unaffiliated/kanzure] has joined #secp256k1
12:57 -!- Irssi: #secp256k1: Total of 19 nicks [2 ops, 0 halfops, 0 voices, 17 normal]
12:57 -!- Irssi: Join to #secp256k1 was synced in 1 secs
12:57 -!- Guest24663 [~jorn@g227014.upc-g.chello.nl] has joined #secp256k1
12:58 -!- Guest24663 [~jorn@g227014.upc-g.chello.nl] has left #secp256k1 ["Konversation terminated!"]
12:58 -!- CodeShark [~CodeShark@cpe-76-167-237-202.san.res.rr.com] has joined #secp256k1
13:00 <@gmaxwell> I guess that brings us back to which DER subset it handles. Right now it's broad enough that it will not roundtrip.
13:02 < andytoshi> current consensus rules are too broad to roundtrip...if we can change consensus rules i think we should use the compact encoding
13:02 < andytoshi> (i think)
13:03 <@gmaxwell> andytoshi: In Bitcoin the BIP66 rules will round-trip.
13:03 <@sipa> unless the signature is {}
13:03 <@sipa> which is allowed by BIP66, but not a valid signature
13:05 <@gmaxwell> The context for my question is that I'm using a fuzz tester to generate parser test cases. A round tripping test sounded good, but I knew it wouldn't work. And indeed it doesn't work. I'll probably move over the bip66 test from my older harness and test roundtripping only in that case.
13:05 < andytoshi> i think it'd be very useful if we could roundtrip
13:05 < andytoshi> except for the one bit in s, ECDSA is a strong signature. i'd hate to lose that property for the sake of encoding
13:05 <@sipa> gmaxwell: so you think we should document exactly which DER subset is supported?
13:06 <@gmaxwell> andytoshi: we do roundtrip if restricted to the BIP66 subset (minus the empty signature, as sipa notes). Or at least we damn well should. I'll have results later. :)
13:07 <@gmaxwell> sipa: In terms of safe interfaces our current interface encourages people to reproduce Bitcoin's mistakes.
13:07 < andytoshi> ok, so i'm suggesting that be the only subset we accept
13:07 <@sipa> gmaxwell: agree
13:07 <@gmaxwell> andytoshi: so we have a compatibility problem because in bitcoin we need to check the historical chain.
13:07 < andytoshi> ahh gross
13:08 <@gmaxwell> We could have separate parse functions, e.g. a normal one and a _lax.
13:08 <@sipa> gmaxwell: that is fine by me
13:09 < andytoshi> ditto
13:10 < andytoshi> that makes me a bit more comfortable since _lax wouldn't be "consensus code" except in the sense that it'd have to validate the historical chain, which is an easily determinable property
13:10 < andytoshi> s/a bit//
13:12 <@gmaxwell> We could even make the availability of lax a module. __ducks__
13:14 < andytoshi> hehe, actually i would like that
13:15 < andytoshi> it really ought to be bitcoin-specific, any new blockchains shouldn't need it
13:15 <@sipa> we can of course also switch to actually pure strict DER in libsecp256k1
13:38 -!- CoinMuncher [~jannes@178.132.211.90] has joined #secp256k1
14:10 -!- CoinMuncher [~jannes@178.132.211.90] has quit [Quit: Leaving.]
14:27 -!- zmanian [uid113594@gateway/web/irccloud.com/x-rdnzemvwameoygoy] has joined #secp256k1
14:31 <@gmaxwell> I have to say, the parsing split has made testing a lot easier.
14:41 <@gmaxwell> FWIW, currently generating parser test cases using AFL and https://people.xiph.org/~greg/parse_harness.patch ... opinions on putting tools like this (without build system integration?) in a verify directory?
14:57 <@sipa> sounds good
14:58 <@sipa> also, making it compile as part of the test system may be useful
14:58 <@sipa> so its code does not go uncompilable
14:59 <@gmaxwell> K. I guess I could add ifdefs for module support.
14:59 -!- btcdrak [uid115429@gateway/web/irccloud.com/x-mmesabtwospzgqsy] has joined #secp256k1
15:01 <@gmaxwell> hm. we don't really have a way to test if a pubkey or a signature object is invalid.
15:01 <@gmaxwell> I mean externally to the library.
15:02 <@sipa> of course not... they always are
15:02 <@sipa> (valid)
15:03 <@sipa> ah, no
15:03 <@gmaxwell> e.g. if parsing fails.
15:03 <@sipa> when pubkey_parse fails, it creates an explicitly invalid object
15:03 <@sipa> which is guaranteed to not result in undefined behaviour if used
15:03 <@sipa> yes, we should add that
15:04 <@gmaxwell> I want to test that in the api tests, but it seemed to me like something that should be public. ... ugh, more API surface area. :(
15:14 <@gmaxwell> sipa: we have some defines like DETERMINISTIC and VERIFY which should be namespaced.
15:14 <@sipa> ugh, yes
15:20 <@gmaxwell> speaking of ugh. so most of the gap remaining for MISRA conformance is small, one thing that is a little obnoxious is in the C spec (e.g. C99) it is only required that the first 31 characters of identifiers are significant (in C90 this was actually only the first 6 which is nuts). We violate the 31 rule, in enough places to be annoying but (somewhat surprisingly) not everywhere.
15:22 <@gmaxwell> E.g. secp256k1_fe_normalizes_to_zero_var is allowed to be interpreted the same as secp256k1_fe_normalizes_to_zero according to C99. You can have more characters, you just can't depend on them being distinct.
15:22 <@sipa> ugh
15:23 <@sipa> let's prefix every internal function name by the base32 encoding of the md5 sum of what follows
15:23 <@gmaxwell> fortunately, a checker tool will tell me where all the cases are.... but it will not prevent you from wanting to kill me if I go fix it. :)
15:24 <@gmaxwell> I think a few abbreviations e.g. normalizes -> norms and likewise will handle it. At least for C99 (and fuck C90 on this, I've never personally encountered a compiler that would only look at 6 characters.. except maybe the DMR compiler on PDP)
15:25 <@sipa> ok
15:25 <@sipa> because certifications are awesome
15:25 <@sipa> and being able to say "We are MISRA compliant!" sounds almost as good as "We are ISO9001 compliant"
15:26 <@gmaxwell> beyond certifying, which is fun, I could imagine someone using some embedded compiler and getting some really nasty bugs.
15:26 < TD-Linux> gmaxwell, does that mean the entirety of daala's API with the "daala_" prefix is not C89 compliant?
15:27 <@gmaxwell> TD-Linux: I believe it's implementation defined, indeed.
15:29 <@sipa> gmaxwell: So, that means that having two functions with the same 31/6-character prefix in a codebase can lead to 1) working fine 2) the wrong function being called 3) a link error (if the linker truncates the names)?
15:29 <@gmaxwell> sipa: Yes.
15:30 <@sipa> epic
15:30 <@gmaxwell> Fixing suckiness in C like this is why these standards exist.
15:30 * TD-Linux submits a patch to gold to speed up linking
15:30 <@gmaxwell> Fortunately GCC/clang sucks less than C, but who knows what awful compiler someone will use on the code.
15:31 <@gmaxwell> same people also probably will not bother running the tests (their target has no screen, how would they see their output?? :P )
15:32 <@gmaxwell> sipa: with respect to static linking, with libtool the source doesn't even get compiled twice for static and shared. That's part of why I really don't like having preprocessor macros for static vs not.
15:38 <@gmaxwell> In any case, most of the work required for MISRA left is documentation work. To claim compliance we'll need to write a requirements document (and all the software functionality should be traceable to the requirements), and create a compliance matrix which documents deviations; here is what one looks like: www.state-machine.com/doc/AN_QP-C_MISRA.pdf (though I'll make an ascii one... :P )
15:39 <@gmaxwell> I figure it doesn't need to hold up the release, we can just make progress and get any remaining disruptive changes in...
15:48 < maaku> there's no reason _lax has to be in the secp256k1 codebase. it could be a 'secp256k1_legacy.c' file in bitcoin repo
15:51 <@gmaxwell> that's true, though this would potentially make the library less useful for others.
15:52 <@gmaxwell> as until very recently openssl was also lax, though even laxer than what we'd implement. :-/
15:55 <@sipa> so there are 3 options
15:56 <@sipa> 1) have an as-wide-as-possible parser (in addition to maybe a strict der only one) in libsecp and require that consensus-critical callers do their own sanity checking on the input
15:56 <@sipa> 2) only strict DER, and put a parse-and-reencode-as-DER in bitcoin (which may be skipped post BIP66, but that's somewhat of a layer violation)
15:58 <@sipa> 3) have a bunch of flags to the parser to indicate what der violations are allowed
15:58 <@gmaxwell> (1) is a large development exercise, and would require a lot of testing. E.g. as it would really have to be a BER parser. It also has the bad effect of presenting an unsafe default. Even outside of consensus most applications really do not want non-canonical signatures.
15:58 <@gmaxwell> oh lol we actually need to report another vulnerability to openssl.
15:58 <@gmaxwell> damnit.
16:02 <@gmaxwell> (3) is also perhaps just as bad as (1) depending on how far we go with it.
16:03 <@sipa> i guess we could do (2) and have an exposed parse+reencode function that is more lax, with no guarantees about what it actually supports
16:03 <@sipa> that will make it less convenient to use the non-safe behaviour in new apps
16:04 <@gmaxwell> I think that's my preference. Or instead of parse/reencode.. just a second parser?
16:04 <@sipa> a second parser works too
16:04 <@sipa> (though is less annoying)
16:05 <@sipa> i feel like we shouldn't intentionally have anti features, though
16:05 <@sipa> so a second parser it is
16:05 <@gmaxwell> we can call it _risky. :)
16:05 <@gmaxwell> or _sloppy ... who wants to use sloppy?
16:06 <@sipa> secp256k1_signature_parse_yes_i_have_read_the_terms_and_conditions(...);
16:06 <@gmaxwell> _postel_was_wrong_but_I_wont_admit_it()
16:06 <@sipa> _dont_use_if_uncertain()
16:06 <@gmaxwell> _ThErE_bE_dRaGoNs()
16:07 <@sipa> oh, 31-character limit :(
16:07 <@gmaxwell> you can go over, it just has to be unique before that point. :)
16:07 <@sipa> yes, so a suffix won't help
16:07 <@sipa> function names are case sensitive and preserving, right?
16:07 <@gmaxwell> Yes.
16:08 <@sipa> if so, we can encode one bit of checksum in each of the first 31 characters
16:08 <@sipa> in its lower/uppercaseness
16:08 * gmaxwell stabs
16:08 <@sipa> is there a word for that?
16:08 < TD-Linux> does that mean my type names aren't reserved if they end with a _t and are longer than 31 characters?
16:08 <@gmaxwell> sipa: abusive
16:08 <@gmaxwell> TD-Linux: LOL
16:08 <@sipa> gmaxwell: a word for "lower/upper caseness" i mean
16:09 <@gmaxwell> sipa: capitolization. (I likely spelled that wrong)
16:09 <@sipa> capitalization, i believe
16:10 <@sipa> capitolization would either be the act of turning a city into a capitol
16:10 <@sipa> or the influence of the california town capitola :p
16:13 <@gmaxwell> In secp256k1_ec_pubkey_serialize's docs... the flags parameter seems to not tell you how to get uncompressed. :P
16:14 <@sipa> we should list the flags' bits and their effect explicitly at the call site
16:14 <@sipa> also, they shouldn't be passed through to the module
16:15 <@sipa> module shouldn't depend on public api definitions, only the .c file
16:17 < cfields> sipa: i forgot to update yesterday.. i kinda gave up on the clang formatting thing. it was taking forever and no end in sight :\
16:17 < cfields> i was hoping to break it off into chunks, but i'm beginning to think that doing it all at once might actually be the less painful route
16:19 <@sipa> cfields: agree
16:20 <@gmaxwell> I'm fine with it happening at once, especially if doing it that way makes it easier to shed paint the settings without wasting your time. :)
16:20 < cfields> sipa: there is one quick/easy one for me though, if you're still in favor of moving the pregenerated file from .h -> .c
16:21 <@sipa> cfields: that's fine by me
16:21 < cfields> gmaxwell: yea, i think that's what sucked the life out of me. knowing that even after it was done, there would be a list of ~30 things that would just cause more bickering
16:22 < cfields> a good example was that it wanted to do (void*) -> (void *) all over the place
16:22 < TD-Linux> moving the generated file to .c makes sense. the only reason it wasn't was because the idea before was to only have one object file generated
16:23 <@sipa> we can support multiple object files just fine... but it prevents inter-module optimization, which we need
16:23 <@sipa> but for a bit of pregenerated data it is not useful
16:24 <@gmaxwell> cfields: yea, I was kinda worrying about that when I heard you were doing a lot of manual work. I doubt we'll care about most things, but I don't want to get stuck with some formatting decision that will be hard to stick with just because I don't want to burn you out with redoing work.
16:24 < cfields> sipa: not arguing that point, but out of curiosity, have you measured lto's effect on smarter inlining?
16:25 <@sipa> cfields: nope, i assume lto greatly avoids that
16:25 <@sipa> though what's the point, compiling is super fast
16:25 < cfields> gmaxwell: well it was basically just 'clang-format blah...' && git add -p. but i underestimated how long it'd take
16:25 < cfields> sipa: avoids smarter inlining?
16:25 <@gmaxwell> cfields: well currently it has ~no effect, so it's hard to measure. Still-- not available everywhere, and the inlining is really critical in this codebase.
16:26 < cfields> sipa: again, i wasn't arguing for it. just curious as to the effect
16:26 <@sipa> cfields: avoid the problem
16:26 <@sipa> cfields: so i think lto would work fine as a replacement for having everything in one compilation unit
16:26 <@gmaxwell> cfields: based on my experience elsewhere LTO should more or less work here.
16:26 <@sipa> cfields: but with little benefit
16:27 <@sipa> which re
16:27 < cfields> sipa: ah, thanks for clarifying
16:27 <@sipa> which reminds me: benchmark the effects of -O1 -O2 -O3 -Os etc
16:27 < cfields> gmaxwell: roger
16:28 < cfields> sipa: you're forgetting -Ofast for the ricers :)
16:28 <@sipa> oh
16:29 <@sipa> and -O42 -fomit-broken-code, i guess
16:29 <@gmaxwell> in GCC Ofast should be the same as O3 for us, we have no floating point!. :P
16:29 <@sipa> the gentoo default
16:29 <@gmaxwell> (IIRC Ofast is just O3 and ffast-math currently)
16:29 < cfields> gmaxwell: i was just about to ask that, actually. Are there no flags that make non-floating point operations less safe somehow?
16:30 <@gmaxwell> cfields: none that matter for us. There is a flag that breaks errno handling IIRC.
16:30 < cfields> seems like a ridiculous question as i ask it, but i'm usually surprised by what compilers let you get away with
16:32 <@gmaxwell> cfields: or another way to look at it, the 'less safe' flags are O2 and especially O3.
16:32 <@gmaxwell> shocking as that may sound.
16:32 <@gmaxwell> Due to C aliasing rules.
16:32 < cfields> on obscure platforms? or real world concerns?
16:32 <@sipa> the strict aliasing rules... are they C99?
16:32 <@gmaxwell> Oh real world, I'm just referring to the fact that C has very strict rules for what pointers can alias other pointers, _most_ C programmers are not very familiar with them, lots of code violates them.. and optimization with respect to them can cause miscompilation in practice. And exploiting these rules is enabled by default.
16:33 <@gmaxwell> sipa: no. they apply everywhere.
16:33 < cfields> sipa: we hit some vocal aliasing warnings in the sha256 code, i'm not sure what set those apart
16:34 <@gmaxwell> esp. people who try to write in-place parsing code that accesses the same memory with two different non-char types and without a union, this stuff actually does get 'miscompiled' in practice on modern compilers.
16:35 < cfields> gmaxwell: i assumed it mostly had to do with alignment where (for ex) a dereferenced int64 ptr has 32bit alignment. Is that the usual case you're referring to, or is there more black magic i've missed out on?
16:35 <@sipa> it's not about alignment
16:36 <@sipa> but more about the fact that some value written to a variable may still be in a register and not flushed to memory if you access the same memory address through a different type pointer
16:36 <@sipa> because the compiler cannot infer that you're referring to that same memory
16:36 <@gmaxwell> cfields: No-- though alignment is another thing that people get wrong (mostly because x86 is astonishingly permissive).
16:37 < cfields> oh wow
16:37 * cfields has some reading to do
16:37 <@gmaxwell> The simplest statement of the rule is that you may not have two pointers of different types to the same memory, except where one is a character type.
16:38 <@gmaxwell> The compiler is allowed to assume that pointers to non-character types never alias, and can optimize loops with respect to the assumption (e.g. leaving data in registers as sipa notes).
16:38 <@gmaxwell> C99 adds the restrict flag to get even _more_ strict aliasing control, where you say that a pointer doesn't even alias any other pointer of the same type.
16:39 < cfields> i see
16:39 <@sipa> which we actually use, and actually improves the generated code
16:39 <@gmaxwell> In general the strict aliasing stuff in C improves performance a lot, that's why the compiler writers are so keen to use it.
16:40 <@sipa> cfields: if the hashing code gives warnings we should definitely look at it
16:40 < cfields> mm, isn't that very common in (for ex) byte-swapping macros?
16:40 < cfields> sipa: no, you were right. that was an alignment issue, and you took care of it already
16:40 <@sipa> those typically have char pointers :)
16:40 <@sipa> ah!
16:41 <@gmaxwell> cfields: you should use a union for that.. hopefully. (or go through char). (there is even some language lawyers' debate if union is enough to bypass the rules, but that debate is not taken seriously because deserialization code without using unions would be hell).
16:42 < cfields> gmaxwell: yea, i can't think of any actual cases off the top of my head. but i can swear i've seen macros that do high/low swapping of integral types that way
16:43 <@gmaxwell> Some of this gets wrapped up in this decades long debate where the language authors say that C is a language, with no promises that the commands have any relation to what the machine does. Vs lots of engineers who think C is a fancy macro assembler that maps directly to the machine. ... which is how compilers more or less worked... 30 years ago. :)
16:43 < cfields> well, that was great to learn. I'll read up on the details for sure. Thank you both for the quick tips :)
16:44 <@gmaxwell> cfields: oh yea, there is lots of code that is flat out wrong with respect to the aliasing rules.
16:44 < cfields> heh
17:50 < CodeShark> the direct mapping to machine code is a lot more relevant to those who work on embedded systems
17:50 < CodeShark> generally speaking, that is
17:50 < CodeShark> most people who program for PCs don't even know how to use a compiler :p
17:52 < CodeShark> I should add systems programming to that list, I suppose - not just embedded systems
17:52 < CodeShark> but to most app developers, meh :p
18:42 -!- Pierre_Rochard [~Pierre@unaffiliated/pierre-rochard/x-3593157] has joined #secp256k1
19:10 <@sipa> :)
19:13 <@gmaxwell> holy crap. debian installer is kind of offensive. It needlessly hard binds language/country/timezone... you pick a language and it limits what countries you can select, pick a country and it limits what timezones you can select.
19:13 <@gmaxwell> "screw you computer, I want US english + GMT"
19:15 <@sipa> hmm, never noticed!
19:15 <@sipa> but you can easily change the timezone later
19:15 <@sipa> independently of the rest
19:16 <@gmaxwell> yea, I know. just the assumption that language implies country implies timezone is really culturally/politically myopic. (also, I dunno why everyone doesn't keep the computer times in GMT; geesh.)
19:18 <@sipa> gmaxwell: that question is equivalent to "I dunno why everyone doesn't just use GMT everywhere; geesh"
19:18 <@gmaxwell> well I'm fine with using civil time generally, but as soon as you have infrastructure managed by multiple people all over the world...
19:37 < midnightmagic> Don't suppose y'all noticed my comment about the benchmark/unification+windowG override commit I stuffed into my secp branch..?
19:39 <@sipa> i have not looked at the commit at all
19:40 < midnightmagic> Okay. It's enough for me to know you're aware of it.
19:41 <@sipa> i am not aware of anything, and shall soon forget this conversation
19:49 -!- adam3us1 [~Adium@195.138.228.20] has quit [Quit: Leaving.]
19:58 < midnightmagic> MY KIND OF HUMAN
20:32 <@gmaxwell> heh. New desktop here does 392k ECDSA verifies per second.
20:41 <@sipa> 2.55us per verify?
20:41 <@sipa> is that a 32-core machine...?
20:45 <@gmaxwell> 24 cores of haswell v3.
20:45 <@gmaxwell> sipa: you are asleep.
20:55 <@sipa> is that actually benchmarked by running 24 bench_verify's in parallel?
20:55 <@sipa> or by taking the number from one and extrapolating
20:59 <@gmaxwell> 24 in parallel.
21:00 <@gmaxwell> I think these CPUs don't have turbo enabled in any case.
21:49 -!- Pierre_Rochard [~Pierre@unaffiliated/pierre-rochard/x-3593157] has quit [Quit: Pierre_Rochard]
22:07 -!- maaku [~quassel@173-228-107-141.dsl.static.fusionbroadband.com] has quit [Remote host closed the connection]
22:26 -!- maaku [~quassel@173-228-107-141.dsl.static.fusionbroadband.com] has joined #secp256k1
22:27 -!- maaku is now known as Guest16568