--- Day changed Mon May 29 2017
00:00 < gmaxwell> The major secret operation is the same as generating a public key.
00:01 < gmaxwell> yea, it would be interesting to measure if the power signatures of signing twice with the same nonce (resp. two public key generations with the same secret) are more correlated than ones with different secrets, and if so by how much.
00:02 < TD-Linux> yeah okay so with a locked message it's a lot more stable. it's very content dependent.
00:02 < gmaxwell> I wonder if that power dip is a cache miss when going to the table?
00:02 < TD-Linux> I'll need to check if I have caching on
00:03 < gmaxwell> a prefetcher probably hides the table accesses, esp since it reads 16 entries, does some calculations, then 16 more.
00:03 < gmaxwell> but it wouldn't hide the first one.
00:04 < gmaxwell> TD-Linux: how does the probing for this work?
00:04 < TD-Linux> we only care about key bits and not msg bits correct?
00:05 < TD-Linux> gmaxwell, it's a remade amplifier board with 1 ohm resistor across the power line
00:05 < gmaxwell> TD-Linux: yea, the attack you do to compromise signing is that you read off bits from the nonce.
00:05 < gmaxwell> if you can learn k you can recover the private key from the resulting signature.
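Why a leaked nonce is fatal: in ECDSA, s = (d*r + m) / k mod n, so anyone who knows k can solve for d = (s*k - m) / r mod n. A minimal sketch of that arithmetic follows. It uses a small toy prime (7919) in place of the real group order and treats r as a given nonzero value rather than deriving it from a curve point; all names here are hypothetical and not part of libsecp256k1.

```c
#include <stdint.h>

/* Toy modulus: a small prime standing in for the secp256k1 group order. */
#define TOY_N 7919u

/* Square-and-multiply modular exponentiation (operands stay below 2^64). */
static uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t mod) {
    uint64_t result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % mod;
        base = (base * base) % mod;
        exp >>= 1;
    }
    return result;
}

/* Modular inverse via Fermat's little theorem (valid because TOY_N is prime). */
static uint64_t mod_inv(uint64_t a) {
    return mod_pow(a, TOY_N - 2, TOY_N);
}

/* Signing equation: s = (d*r + m) * k^-1 mod n. */
uint64_t toy_sign_s(uint64_t d, uint64_t k, uint64_t r, uint64_t m) {
    return ((d * r + m) % TOY_N) * mod_inv(k) % TOY_N;
}

/* Rearranged for the attacker who learned k: d = (s*k - m) * r^-1 mod n. */
uint64_t toy_recover_key(uint64_t s, uint64_t k, uint64_t r, uint64_t m) {
    uint64_t sk = (s * k) % TOY_N;
    uint64_t diff = (sk + TOY_N - (m % TOY_N)) % TOY_N;
    return diff * mod_inv(r) % TOY_N;
}
```

With d = 1234, k = 567, r = 4321, m = 999, feeding the output of toy_sign_s into toy_recover_key returns 1234 again.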
00:05 < TD-Linux> I was able to see a less high quality but otherwise quite similar output just from the power rail.
00:05 < TD-Linux> alright let's see if I can extract 1 bit of information just visually
00:07 < gmaxwell> keep in mind that the table entries are in 64 16-entry tables, which are accessed in a way that should make them constant time, but perhaps not constant power. (and, in fact, it looks to me like those spikes are pretty nicely consistently timed).
00:09 < gmaxwell> You can simplify your attack by generating a compressed public key, since that will do all the same operations, but let you set the bits going into it more directly. uh also the operations are blinded.
00:11 < gmaxwell> you might want to disable that for the moment. The blinding takes the key you've input (or the nonce) and adds it to some random number stored in the context, and adjusts the sequence of operations so that the initial value of the addition includes the negative of this number to cancel it out. If you're not calling the reblind call, that blinding factor is a constant, so it doesn't really add much
00:11 < gmaxwell> in the way of security... but it may make your analysis confused.
00:12 < gmaxwell> commenting out the call to secp256k1_ecmult_gen_blind  in the ecmult_gen_create will turn that off.
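The blinding described above can be illustrated with a toy additive group, where "multiplying the generator by k" is just k*G mod n. This is only a sketch of the cancellation idea: the group, the constants, and the function names are assumptions for illustration, not the library's implementation.

```c
#include <stdint.h>

/* Toy additive group of prime order, with an arbitrary generator. */
#define TOY_ORDER 7919u
#define TOY_GEN   5u

/* Unblinded "scalar multiplication": k*G in the toy group. */
uint64_t toy_mul_gen(uint64_t k) {
    return (k % TOY_ORDER) * TOY_GEN % TOY_ORDER;
}

/* Blinded variant: the accumulator starts at -b*G and the scalar that is
 * actually processed is k + b, so the secret-dependent steps operate on a
 * masked value while the final result is unchanged. */
uint64_t toy_mul_gen_blinded(uint64_t k, uint64_t blind) {
    uint64_t initial = (TOY_ORDER - toy_mul_gen(blind)) % TOY_ORDER; /* -b*G */
    uint64_t blinded_scalar = (k + blind) % TOY_ORDER;               /* k + b */
    return (initial + toy_mul_gen(blinded_scalar)) % TOY_ORDER;
}
```

Because the blinding value here is fixed unless it is refreshed, two runs with the same k process the same masked scalar, which matches the remark that a constant blinding factor adds little protection.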
00:15 < TD-Linux> yup can definitely extract 1 bit of information visually
00:15 < gmaxwell> oh yea?
00:15 < TD-Linux> well maybe not one bit
00:15 < gmaxwell> hah.
00:16 < TD-Linux> I can make two messages, store one as ref, and identify on the scope
00:16 < gmaxwell> will the scope helpfully align a stored trace on the trigger for you?
00:17 < TD-Linux> yes     
00:19 < gmaxwell> sweet.
00:22 < TD-Linux> compare https://people.xiph.org/~tdaede/pics/sidechannel/DS1Z_QuickPrint43.png https://people.xiph.org/~tdaede/pics/sidechannel/DS1Z_QuickPrint44.png
00:22 < gmaxwell> these are certainly much cleaner than the stuff you recorded before.
00:22 < TD-Linux> yeah I haven't piped this to the lfrx yet though I certainly can
00:27 < TD-Linux> and yeah to be clear here the constant timeness is perfect, to the cycle.
00:28 < gmaxwell> yea, though I wonder if the phase of the operations changes. e.g. our constant time moves might consume more power earlier or later depending on which table entry they read.
00:29 < gmaxwell> on those zooms, where is that zoomed in? in the nonce generation in the middle somewhere?
00:30 < indutny> TD-Linux: awesome work!
00:32 < TD-Linux> gmaxwell, that's near the beginning of the 64 cycles.
00:32 < TD-Linux> there's a bit before that that's also key dependent
00:36 < gmaxwell> TD-Linux: but this trace I see, this is one of the 64 things? --- I think the flat top is the table access, it seems to do 16 things, and if so then the dip down after is the point addition. if so, that's a little sad, because that is a really big power signature difference at the end of it.
00:36  * TD-Linux drops out serialization too as it's not relevant
00:36 < TD-Linux> gmaxwell, well we can get an answer to this real quick.
00:36 < TD-Linux> which function should I annotate on another channel? :)
00:37 < gmaxwell> you could put an LED pulse in secp256k1_ecmult_gen right before ge_from_storage perhaps?
00:37 < TD-Linux> sure
00:39 < gmaxwell> or maybe turn on the LED before the storage cmov loop and turn it off after?
00:39 < gmaxwell> we have a candidate patch that changes how the cmov works that might make its power signature less, so it would be good to try to just do that measurement right now.
00:50 < TD-Linux> gmaxwell, high = cmov loop https://people.xiph.org/~tdaede/pics/sidechannel/DS1Z_QuickPrint45.png
00:51 < gmaxwell> oh interesting!
00:51 < TD-Linux> also compare these two very first cmov loops with 2 messages
00:51 < TD-Linux> https://people.xiph.org/~tdaede/pics/sidechannel/DS1Z_QuickPrint46.png
00:51 < TD-Linux> https://people.xiph.org/~tdaede/pics/sidechannel/DS1Z_QuickPrint47.png
00:51 < TD-Linux> *keys
00:51 < gmaxwell> so it was the cmov loop that had the strong sidechannel.
00:51 < gmaxwell> that's good news.
00:52 < gmaxwell> I think the first cmov loop is suffering a cache miss.
00:52 < gmaxwell> or something?
00:54 < TD-Linux> indeed, the timing doesn't change though
00:54 < gmaxwell> that is strange. hm.
00:55 < gmaxwell> I don't get it.
01:01 < TD-Linux> are there any degenerate keys? like if key = 0
01:01 < TD-Linux> (this is literally key 0 and 1)
01:01 < gmaxwell> key = 0 is not valid (it produces the point at infinity which cannot be serialized)
01:01 < TD-Linux> okay let me change that.
01:01 < TD-Linux> I'll set the high bit as well.
01:01 < gmaxwell> how are you setting key 0?  it should get rejected.
01:02 < gmaxwell> are you just timing a secp256k1_ecmult_gen  directly?
01:02 < gmaxwell> have you disabled the blinding? the values are randomized.
01:03 < TD-Linux> yes blinding is off
01:03 < TD-Linux> and I'm calling the high level sign function
01:03 < TD-Linux> also I can come up with other key values that do the same thing I just showed so that's not related
01:04 < TD-Linux> (for some values the difference is smaller)
01:04 < gmaxwell>     if (!overflow && !secp256k1_scalar_is_zero(&sec)) {
01:04 < TD-Linux> it is really weird because the *exact same* table values are iterated over in both cases, right?
01:04 < gmaxwell> ^ _sign should reject a sec key of zero, and do so in a very not constant time way.
01:06 < gmaxwell> TD-Linux: well the way the cmov works is that it does bit masking, where the same table entries are read, and the one we want gets anded with 1s or 0s  then ored into the target buffer.
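The masking approach described above can be sketched as follows: every table entry is read on every call, each is ANDed with an all-ones or all-zeros mask, and the result is ORed into the output. This is a minimal illustration with hypothetical names and a flat table of 16 entries of 8 words each; it is not the library's actual cmov code.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_ENTRIES 16
#define ENTRY_WORDS   8

/* Copy table entry `index` into `out` without any secret-dependent branch
 * or memory address: all TABLE_ENTRIES entries are always read. `table`
 * holds TABLE_ENTRIES * ENTRY_WORDS words, entry after entry. */
void ct_table_select(uint32_t out[ENTRY_WORDS], const uint32_t *table,
                     uint32_t index) {
    for (size_t w = 0; w < ENTRY_WORDS; w++) {
        out[w] = 0;
    }
    for (uint32_t i = 0; i < TABLE_ENTRIES; i++) {
        uint32_t diff = i ^ index;                       /* 0 only when i == index */
        uint32_t nonzero = (diff | (0u - diff)) >> 31;   /* 1 if diff != 0, else 0 */
        uint32_t mask = nonzero - 1u;                    /* all ones if i == index */
        for (size_t w = 0; w < ENTRY_WORDS; w++) {
            out[w] |= table[(size_t)i * ENTRY_WORDS + w] & mask;
        }
    }
}
```

The instruction sequence and the addresses touched are identical for every index, which is what makes this constant time. The data values flowing through the AND and OR still depend on the index, so the power drawn can differ even when the timing does not.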
01:06 < TD-Linux> ... oh lol my key is stack allocated and I didn't zero everything.
01:07 < gmaxwell> I suggest you change to using secp256k1_ec_pubkey_create as it will let you directly change the secret value that you're working with, and has a lot less garbage that can't leak secrets.
01:08 < gmaxwell> All secp256k1_ec_pubkey_create does is take your input, call mulgen on it and serialize the result.
01:09 < TD-Linux> does it still have the nonce stuff
01:10 < TD-Linux> (because that's also leaking, although less)
01:16 < gmaxwell> no, it's effectively just ecmultgen. but the signature process is k = hash(secret || message), then R = kG (ecmultgen), then s = (secret * r + m) * (1/k), then it serializes r,s.
01:19 < gmaxwell> TD-Linux: if you're looking to bisect for other leak sources, all that after setting the k happens in secp256k1_ecdsa_sig_sign. if you're saying there is a leak _before_ the secp256k1_ecmult_gen, that would have to be in the RFC6979 hash... which seems pretty hopeless/unlikely?
01:22 < TD-Linux> yeah I think I'll look at that one later
01:23 < TD-Linux> first I want to figure out what's up with the cmov loop
01:25 < gmaxwell> well if you use the pubkey create you can access the input to that cmov directly, esp with blinding off. I think an interesting key set to look at would be 0xFF 0x00 ... x ... 0x00 0x00 with dots filled in with zeros and x filled in with the numbers 0-15
01:25 < gmaxwell> er 0xFF 0x00 ... x 0x00 0x00
01:26 < gmaxwell> so you'll have a bunch of constant cmovs, then one that varies with the key, then 16 cmovs that are constant.
01:27 < gmaxwell> or something like that, you should be able to look at the one that varies and see if the 16 different values leave obviously different power signatures.
01:28 < gmaxwell> the reason I keep suggesting pubkey create is because that gives you direct control of the cmov loop... while signing only lets you give different hash inputs to it.
01:29 < gmaxwell> (unless you replace the nonce function pointer with one that outputs what you want... which you can do too, but simpler to just use the pubkey create function.)
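A possible way to generate the suggested key set (0xFF 0x00 ... x 0x00 0x00, with x running over 0-15) is sketched below. The helper name is hypothetical, and placing x in the third-from-last byte is an assumption taken from the corrected pattern in the chat. Each resulting 32-byte key would then be passed as the secret to secp256k1_ec_pubkey_create while a trace is captured.

```c
#include <stdint.h>
#include <string.h>

#define KEY_LEN 32

/* Fill `key` with the probe pattern: first byte 0xFF, all other bytes zero
 * except the third-from-last byte, which carries the 4-bit value x. */
void make_probe_key(uint8_t key[KEY_LEN], uint8_t x) {
    memset(key, 0, KEY_LEN);
    key[0] = 0xFF;
    key[KEY_LEN - 3] = (uint8_t)(x & 0x0F);
}
```

Calling make_probe_key once for each x from 0 to 15 gives 16 keys that differ only in one nibble, so any difference between the 16 traces should be confined to the one table lookup that consumes that nibble.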
01:34 < TD-Linux> gmaxwell, okay so I swapped for pubkey create
01:34 < TD-Linux> the huge initial cmov thing is totally gone
01:35 < TD-Linux> now I can set individual bytes in the key, and see them show up in the corresponding cmov
01:35 < TD-Linux> (first byte is second last cmov)
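The byte-to-cmov mapping observed above is what one would expect if the 256-bit key is consumed in 64 windows of 4 bits, counted from the least significant end of a big-endian key. That windowing order is an assumption here, and the function below is a hypothetical helper for predicting which lookup a given key byte should affect, not library code.

```c
#include <stdint.h>

/* Return the 4-bit value used by window `window` (0..63) of a 32-byte
 * big-endian key, where window 0 is the least significant nibble. */
uint8_t key_window(const uint8_t key[32], unsigned window) {
    uint8_t byte = key[31 - window / 2];
    return (window & 1u) ? (uint8_t)(byte >> 4) : (uint8_t)(byte & 0x0F);
}
```

Under this assumption the first key byte feeds windows 62 and 63, so a small value such as 0x01 in key[0] changes only window 62, the second-to-last of the 64 lookups.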
01:37 < gmaxwell> 'huge initial cmov thing is gone' hm. that seems suspect to me.
01:40 < TD-Linux> yes same
01:58 < TD-Linux> here's a pic of the super professional setup https://people.xiph.org/~tdaede/pics/sidechannel/20170529_0003.jpg
02:09 < gmaxwell> TD-Linux: what amplifier is that?
02:13 < TD-Linux> AD8132
02:14 < TD-Linux> set for 10x gain
02:14 < TD-Linux> I have a couple others I could try as well
02:15 < TD-Linux> using a 1x probe because the output impedance is quite low
02:25 < gmaxwell> signal looks more than clean enough. probably bigger improvements would come from shielding the setup, and using a test board with less unrelated junk that might be making noise.
02:31 < gmaxwell> TD-Linux: what clockrate are you running at?
05:21 < gmaxwell> TD-Linux: it would be interesting to try to characterize the CPU's impulse response. E.g. do a burst of heavy arithmetic, then go into whatever nop loop should use the least power. ( asm("wfi") ? ) I wonder if you get a similar delay as we see during the cmov?
05:22 < gmaxwell> s/delay/decay/
05:22 < gmaxwell> e.g. the current may be dropping in that exponential shape due to internal capacitance filling up during a time of lower load.
05:28 < gmaxwell> hm. but it's too slow to be capacitance. Perhaps some kind of dynamic voltage control inside the cpu that cuts things back when the cpu is doing boring stuff like the cmov.