--- Day changed Mon May 29 2017
00:00 < gmaxwell> The major secret operation is the same as generating a public key.
00:01 < gmaxwell> yea, it would be interesting to measure if the power signatures of signing twice with the same nonce (resp. two public key generations with the same secret) are more correlated than ones with different secrets, and if so by how much.
00:02 < TD-Linux> yeah okay so with a locked message it's a lot more stable. it's very content dependent.
00:02 < gmaxwell> I wonder if that power dip is a cache miss when going to the table?
00:02 < TD-Linux> I'll need to check if I have caching on
00:03 < gmaxwell> a prefetcher probably hides the table accesses, esp since it reads 16 entries, does some calculations, then 16 more.
00:03 < gmaxwell> but it wouldn't hide the first one.
00:04 < gmaxwell> TD-Linux: how does the probing for this work?
00:04 < TD-Linux> we only care about key bits and not msg bits correct?
00:05 < TD-Linux> gmaxwell, it's a remade amplifier board with 1 ohm resistor across the power line
00:05 < gmaxwell> TD-Linux: yea, the attack you do to compromise signing is that you read off bits from the nonce.
00:05 < gmaxwell> if you can learn k you can recover the private key from the resulting signature.
00:05 < TD-Linux> I was able to see a less high quality but otherwise quite similar output just from the power rail.
00:05 < TD-Linux> alright let's see if I can extract 1 bit of information just visually
00:07 < gmaxwell> keep in mind that the table entries are in 64 16-entry tables, which are accessed in a way that should make them constant time, but perhaps not constant power. (and, in fact, it looks to me that those spikes are pretty nicely consistently timed).
00:09 < gmaxwell> You can simplify your attack by generating a compressed public key, since that will do all the same operations, but let you set the bits going into it more directly. uh also the operations are blinded.
00:11 < gmaxwell> you might want to disable that for the moment.
The blinding takes the key you've input (or the nonce) and adds it to some random number stored in the context, and adjusts the sequence of operations so that the initial value of the addition includes the negative of this number to cancel it out. If you're not calling the reblind call, that blinding factor is a constant, so it doesn't really add much
00:11 < gmaxwell> in the way of security... but it may make your analysis confused.
00:12 < gmaxwell> commenting out the call to secp256k1_ecmult_gen_blind in the ecmult_gen_create will turn that off.
00:15 < TD-Linux> yup can definitely extract 1 bit of information visually
00:15 < gmaxwell> oh yea?
00:15 < TD-Linux> well maybe not one bit
00:15 < gmaxwell> hah.
00:16 < TD-Linux> I can make two messages, store one as ref, and identify on the scope
00:16 < gmaxwell> will the scope helpfully align a stored trace on the trigger for you?
00:17 < TD-Linux> yes
00:19 < gmaxwell> sweet.
00:22 < TD-Linux> compare https://people.xiph.org/~tdaede/pics/sidechannel/DS1Z_QuickPrint43.png https://people.xiph.org/~tdaede/pics/sidechannel/DS1Z_QuickPrint44.png
00:22 < gmaxwell> these are certainly much cleaner than the stuff you recorded before.
00:22 < TD-Linux> yeah I haven't piped this to the lfrx yet though I certainly can
00:27 < TD-Linux> and yeah to be clear here the constant timeness is perfect, to the cycle.
00:28 < gmaxwell> yea, though I wonder if the phase of the operations changes. e.g. our constant time moves might consume more power earlier or later depending on which table entry they read.
00:29 < gmaxwell> on those zooms, where is that zoomed in? in the nonce generation in the middle somewhere?
00:30 < indutny> TD-Linux: awesome work!
00:32 < TD-Linux> gmaxwell, that's near the beginning of the 64 cycles.
00:32 < TD-Linux> there's a bit before that that's also key dependent
00:36 < gmaxwell> TD-Linux: but this trace I see, this is one of the 64 things?
--- I think the flat top is the table access, it seems to do 16 things, and if so then the dip down after is the point addition. if so, that's a little sad, because that is a really big power signature difference at the end of it.
00:36 * TD-Linux drops out serialization too as it's not relevant
00:36 < TD-Linux> gmaxwell, well we can get an answer to this real quick.
00:36 < TD-Linux> which function should I annotate on another channel? :)
00:37 < gmaxwell> you could put a led pulse in secp256k1_ecmult_gen right before ge_from_storage perhaps?
00:37 < TD-Linux> sure
00:39 < gmaxwell> or maybe turn on the led before the storage cmov loop and turn it off after?
00:39 < gmaxwell> we have a candidate patch that changes how the cmov works that might make its power signature less, so it would be good to try to just do that measurement right now.
00:50 < TD-Linux> gmaxwell, high = cmov loop https://people.xiph.org/~tdaede/pics/sidechannel/DS1Z_QuickPrint45.png
00:51 < gmaxwell> oh interesting!
00:51 < TD-Linux> also compare these two very first cmov loops with 2 messages
00:51 < TD-Linux> https://people.xiph.org/~tdaede/pics/sidechannel/DS1Z_QuickPrint46.png
00:51 < TD-Linux> https://people.xiph.org/~tdaede/pics/sidechannel/DS1Z_QuickPrint47.png
00:51 < TD-Linux> *keys
00:51 < gmaxwell> so it was the cmov loop that had the strong sidechannel.
00:51 < gmaxwell> that's good news.
00:52 < gmaxwell> I think the first cmov loop is suffering a cache miss.
00:52 < gmaxwell> or something?
00:54 < TD-Linux> indeed, the timing doesn't change though
00:54 < gmaxwell> that is strange. hm.
00:55 < gmaxwell> I don't get it.
01:01 < TD-Linux> are there any degenerate keys? like if key = 0
01:01 < TD-Linux> (this is literally key 0 and 1)
01:01 < gmaxwell> key = 0 is not valid (it produces the point at infinity which cannot be serialized)
01:01 < TD-Linux> okay let me change that.
01:01 < TD-Linux> I'll set the high bit as well.
01:01 < gmaxwell> how are you setting key 0?
it should get rejected.
01:02 < gmaxwell> are you just timing a secp256k1_ecmult_gen directly?
01:02 < gmaxwell> have you disabled the blinding? the values are randomized.
01:03 < TD-Linux> yes blinding is off
01:03 < TD-Linux> and I'm calling the high level sign function
01:03 < TD-Linux> also I can come up with other key values that do the same thing I just showed so that's not related
01:04 < TD-Linux> (for some values the difference is smaller)
01:04 < gmaxwell> if (!overflow && !secp256k1_scalar_is_zero(&sec)) {
01:04 < TD-Linux> it is really weird because the *exact same* table values are iterated over in both cases, right?
01:04 < gmaxwell> ^ _sign should reject a sec key of zero, and do so in a very not constant time way.
01:06 < gmaxwell> TD-Linux: well the way the cmov works is that it does bit masking, where the same table entries are read, and the one we want gets anded with 1s or 0s then ored into the target buffer.
01:06 < TD-Linux> ... oh lol my key is stack allocated and I didn't zero everything.
01:07 < gmaxwell> I suggest you change to using secp256k1_ec_pubkey_create as it will let you directly change the secret value that you're working with, and has a lot less garbage that can't leak secrets.
01:08 < gmaxwell> All secp256k1_ec_pubkey_create does is take your input, call mulgen on it and serialize the result.
01:09 < TD-Linux> does it still have the nonce stuff
01:10 < TD-Linux> (because that's also leaking, although less)
01:16 < gmaxwell> no, it's effectively just ecmultgen. but the signature process is k = hash(secret || message) then R = kG (ecmultgen) then s = (secret * r + m) * (1/k) then it serializes r,s.
01:19 < gmaxwell> TD-Linux: if you're looking to bisect for other leak sources, all that after setting the k happens in secp256k1_ecdsa_sig_sign. if you're saying there is a leak _before_ the secp256k1_ecmult_gen that would have to be in the RFC6979 hash... which seems pretty hopeless/unlikely?
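[editor's note] The signing equation quoted at 01:16, together with the 00:05 remark that learning k gives up the private key, can be checked with plain modular arithmetic. The following is a minimal sketch, not libsecp256k1 code: `sign_s` and `recover_secret` are hypothetical helper names, and `r` is passed in as an arbitrary nonzero value rather than computed as the x coordinate of kG, since only the algebra is being illustrated. The only real constant is the secp256k1 group order.

```python
# Group order of secp256k1 (a public curve constant).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def sign_s(secret, k, r, m):
    """Compute s = (secret * r + m) * (1/k) mod N, as written in the log.

    r is supplied by the caller here (assumption for illustration); in real
    ECDSA it is derived from the point R = kG.
    """
    return (secret * r + m) * pow(k, -1, N) % N


def recover_secret(k, r, s, m):
    """Invert the signing equation for the secret once the nonce k is known.

    From s * k = secret * r + m (mod N) it follows that
    secret = (s * k - m) * (1/r) (mod N).
    """
    return (s * k - m) * pow(r, -1, N) % N
```

Anyone who reads k off a power trace and also sees the public (r, s) and message hash m can call `recover_secret(k, r, s, m)` to get the signing key back, which is why the nonce multiplication is the operation worth probing. `pow(x, -1, N)` needs Python 3.8 or later.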
01:22 < TD-Linux> yeah I think I'll look at that one later
01:23 < TD-Linux> first I want to figure out what's up with the cmov loop
01:25 < gmaxwell> well if you use the pubkey create you can access the input to that cmov directly, esp with blinding off. I think an interesting key set to look at would be 0xFF 0x00 ... x ... 0x00 0x00 with dots filled in with zeros and x filled in with the numbers 0-15
01:25 < gmaxwell> er 0xFF 0x00 ... x 0x00 0x00
01:26 < gmaxwell> so you'll have a bunch of constant cmovs, then one that varies with the key, then 16 cmovs that are constant.
01:27 < gmaxwell> or something like that, you should be able to look at the one that varies and see if the 16 different values leave obviously different power signatures.
01:28 < gmaxwell> the reason I keep suggesting pubkey create is because that gives you direct control of the cmov loop... while signing only lets you give different hash inputs to it.
01:29 < gmaxwell> (unless you replace the nonce function pointer with one that outputs what you want... which you can do too, but simpler to just use the pubkey create function.)
01:33 -!- arubi [~ese168@gateway/tor-sasl/ese168] has quit [Ping timeout: 248 seconds]
01:34 < TD-Linux> gmaxwell, okay so I swapped for pubkey create
01:34 -!- arubi [~ese168@gateway/tor-sasl/ese168] has joined #secp256k1
01:34 < TD-Linux> the huge initial cmov thing is totally gone
01:35 < TD-Linux> now I can set individual bytes in the key, and see them show up in the corresponding cmov
01:35 < TD-Linux> (first byte is second last cmov)
01:37 < gmaxwell> 'huge initial cmov thing is gone' hm. that seems suspect to me.
01:40 < TD-Linux> yes same
01:58 < TD-Linux> here's a pic of the super professional setup https://people.xiph.org/~tdaede/pics/sidechannel/20170529_0003.jpg
02:03 -!- jtimon [~quassel@117.29.134.37.dynamic.jazztel.es] has joined #secp256k1
02:09 < gmaxwell> TD-Linux: what amplifier is that?
02:13 < TD-Linux> AD8132
02:14 < TD-Linux> set for 10x gain
02:14 < TD-Linux> I have a couple others I could try as well
02:15 < TD-Linux> using a 1x probe because the output impedance is quite low
02:25 < gmaxwell> signal looks more than clean enough. probably bigger improvements would come from shielding the setup, and using a test board with less unrelated junk that might be making noise.
02:31 < gmaxwell> TD-Linux: what clockrate are you running at?
05:21 < gmaxwell> TD-Linux: it would be interesting to try to characterize the CPU's impulse response. E.g. do a burst of heavy arithmetic, then go into whatever nop loop should use the least power. ( asm("wfi") ? ) I wonder if you get a similar delay as we see during the cmov?
05:22 < gmaxwell> s/delay/decay/
05:22 < gmaxwell> e.g. the current may be dropping in that exponential shape due to internal capacitance filling up during a time of lower load.
05:28 < gmaxwell> hm. but it's too slow to be capacitance. Perhaps some kind of dynamic voltage control inside the cpu that cuts things back when the cpu is doing boring stuff like the cmov.
10:37 -!- jtimon [~quassel@117.29.134.37.dynamic.jazztel.es] has quit [Read error: Connection reset by peer]
10:38 -!- jtimon [~quassel@117.29.134.37.dynamic.jazztel.es] has joined #secp256k1
10:43 -!- jtimon [~quassel@117.29.134.37.dynamic.jazztel.es] has quit [Remote host closed the connection]
11:46 -!- jtimon [~quassel@117.29.134.37.dynamic.jazztel.es] has joined #secp256k1
13:39 -!- instagibbs_ [~instagibb@pool-100-15-117-236.washdc.fios.verizon.net] has quit [Ping timeout: 245 seconds]
15:04 -!- instagibbs [~instagibb@pool-100-15-117-236.washdc.fios.verizon.net] has joined #secp256k1
17:30 -!- jtimon [~quassel@117.29.134.37.dynamic.jazztel.es] has quit [Ping timeout: 245 seconds]
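[editor's note] Two ideas from the log lend themselves to a small sketch: the masked table read described at 01:06 (every entry is read, the wanted one is ANDed with all ones, the rest with zeros, and the results are ORed together) and the key sweep proposed at 01:25 (0xFF 0x00 ... x 0x00 0x00 with x in 0-15). The code below only shows the data flow and is not the library's implementation: Python integers are not constant time, table entries are modeled as 32-bit words, and the names `masked_select`, `sweep_keys` and `windows` are hypothetical. The window split assumes 64 windows of 4 bits taken from the low end of a big-endian 32-byte key, which matches the "64 16 entry tables" and "first byte is second last cmov" remarks, and it assumes blinding is disabled.

```python
def masked_select(table, index):
    """Read every table entry and keep only table[index] via AND/OR masking."""
    out = 0
    for i, entry in enumerate(table):
        # All ones when this is the wanted entry, zero otherwise.
        mask = -(i == index) & 0xFFFFFFFF
        out |= entry & mask
    return out


def sweep_keys():
    """Build the 16 test keys 0xFF 0x00 ... x 0x00 0x00 for x in 0..15."""
    keys = []
    for x in range(16):
        key = bytearray(32)   # 32-byte secret key, all zeros to start
        key[0] = 0xFF         # fixed leading byte
        key[29] = x           # third byte from the end carries x
        keys.append(bytes(key))
    return keys


def windows(key):
    """Split a key into 64 four-bit table indices, lowest bits first.

    Assumption: this mirrors the order in which the 64 tables are indexed.
    """
    scalar = int.from_bytes(key, "big")
    return [(scalar >> (4 * j)) & 0xF for j in range(64)]
```

Under these assumptions only window 4 changes across the sweep, while windows 62 and 63 stay at 15 and every other window stays at 0, so the 16 traces should differ in exactly one cmov loop if the leak is driven by the table index.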