--- Log opened Sun Jan 26 00:00:20 2020
02:22 < elichai2> Reviewing #710 now. I can't seem to find good info on if `vpcmpeqd` and `pcmpeqd` are constant time or not. on one hand the pseudo code in the intrinsic website looks variable time but on the other it's a single instruction
02:24 < gmaxwell> elichai2: https://www.agner.org/optimize/instruction_tables.pdf
02:27 < elichai2> I searched there before, but I now see that they're there but without the double precision (ie `vpcmpeq`)
02:28 < elichai2> anyhow looks constant time
02:31 < elichai2> is `1-2` in Reciprocal throughput means it might be variable time? or do we only care about Ops?
02:32 < gmaxwell> Reciprocal throughput and latency are the relevant columns. ops is about the instruction decoder.
02:33 < elichai2> so `vmovdqu` can be variable time on AMD Bulldozer
02:34 < elichai2> which gcc trunk with `-O3 -march=native` produces.
02:34 < elichai2> * on the cmov function
02:34 < elichai2> and icc 19.0.1 is literally introducing branches :O
02:35 < elichai2> even on `-O2`
02:35 < elichai2> https://godbolt.org/z/882jPA
02:40 < gmaxwell> elichai2: I suspect those branches are for alignment.
02:40 < elichai2> I hope you're right. this assembly isn't easy/fun to read
02:41 < gmaxwell> oh not alignment, overlapping.
02:41 < gmaxwell> it checks that the input and output are not overlapping, and if they're not it uses vector operations.
02:42 < elichai2> I like how it do `cmp` twice and negate between hehe but yeah seems like you're right. so I guess i'll remove my comment on that :)
02:43 < elichai2> weird though. because gcc(trunk) uses vector ops without checking
02:44 < gmaxwell> can you change compiler flags on icc on godbolt? it would be interesting to see if restrict annotating it would make it omit the check.
02:44 < gmaxwell> oh I see where you can.
02:47 < gmaxwell> elichai2: fwiw https://godbolt.org/z/K4gwZi
02:48 < elichai2> ha, so with that icc produces the same asm as gcc(trunk)
02:48 < elichai2> well, almost hehe
02:48 < elichai2> also uses the weird `vmovdqu` though
02:49 < gmaxwell> (probably the code in the library should get restrict annotated)
02:50 < elichai2> gmaxwell: it also looks like it's working without the `-restrict`
02:51 < gmaxwell> at least in the past ICC would ignore restricts without an argument, guess they changed.
02:55 < gmaxwell> I don't see why GCC doesn't need extra code to handle overlap.
02:55 < gmaxwell> maybe because its a struct it can't be partially overlapped?
02:57 < gmaxwell> yep, making it pointers to integer arrays, gcc emits the overlap handling if there is no restrict.
02:58 < gmaxwell> oh actually it just refuses to vectorize it at all.
02:58 < gmaxwell> but with the restrict it vectorizes it.
02:59 < elichai2> hmm but what if the structs are literally the same?
02:59 < elichai2> *the pointers
02:59 < gmaxwell> that case would be okay.
02:59 < gmaxwell> the case that would be wrong is if they overlap, like [0] in the first one is [1] in the second one.
03:03 < elichai2> ok
12:44 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has quit [Quit: ZNC - http://znc.sourceforge.net]
12:46 -!- luke-jr [~luke-jr@unaffiliated/luke-jr] has joined #secp256k1
17:11 -!- belcher [~belcher@unaffiliated/belcher] has quit [Quit: Leaving]
--- Log closed Mon Jan 27 00:00:23 2020
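
For readers without the godbolt links, here is a minimal sketch of the kind of restrict-annotated, branch-free cmov function the log discusses. It is not the code from #710 or from the library: the type `fe_storage`, the function names, and the eight 32-bit limb layout are all hypothetical, chosen only for illustration. Whether a given compiler vectorizes this loop, or which instructions it picks, can only be settled by reading the generated assembly, as done in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 256-bit storage type with eight 32-bit limbs (illustrative only). */
typedef struct {
    uint32_t n[8];
} fe_storage;

/* Branch-free conditional move: if flag is 1, copy *a into *r; if flag is 0,
 * leave *r unchanged. flag must be 0 or 1. The restrict qualifiers tell the
 * compiler that r and a do not overlap. */
static void fe_storage_cmov(fe_storage *restrict r,
                            const fe_storage *restrict a, int flag) {
    /* flag == 0 gives mask0 = 0xFFFFFFFF; flag == 1 wraps around to mask0 = 0. */
    uint32_t mask0 = (uint32_t)flag + ~(uint32_t)0;
    uint32_t mask1 = ~mask0;
    int i;
    for (i = 0; i < 8; i++) {
        /* Select each limb with masks instead of a data-dependent branch. */
        r->n[i] = (r->n[i] & mask0) | (a->n[i] & mask1);
    }
}

/* Small self-check: returns 1 if both flag values behave as described. */
int fe_storage_cmov_selftest(void) {
    fe_storage r = {{1, 2, 3, 4, 5, 6, 7, 8}};
    const fe_storage a = {{9, 9, 9, 9, 9, 9, 9, 9}};
    fe_storage_cmov(&r, &a, 0);
    assert(r.n[0] == 1 && r.n[7] == 8); /* flag 0: r is untouched */
    fe_storage_cmov(&r, &a, 1);
    assert(r.n[0] == 9 && r.n[7] == 9); /* flag 1: r now equals a */
    return 1;
}
```

The masks keep the C source free of secret-dependent branches, but that alone does not guarantee constant-time machine code. The compiler output still has to be checked per target and per flag set.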