I'm wondering whether (don't laugh) moving signing into the kernel and then using the MTRRs to disable caching entirely for a small scratch region of memory would also work. You could then disable pre-emption and prevent anything on the same core from interrupting or timing the signing operation.

However, I suspect that just writing a hardened secp256k1 signer implementation in userspace would be of similar difficulty, in which case it would naturally be preferable.


On Wed, Mar 5, 2014 at 11:25 PM, Gregory Maxwell <gmaxwell@gmail.com> wrote:
On Wed, Mar 5, 2014 at 2:14 PM, Eric Lombrozo <elombrozo@gmail.com> wrote:
> Everything you say is true.
>
> However, branchless does reduce the attack surface considerably - if nothing else, it significantly ups the difficulty of an attack for a relatively low cost in program complexity, and that might still make it worth doing.

Absolutely. I believe these things are worth doing.

My comment on it being insufficient was only that "my signer is
branchless" doesn't make other defense measures (avoiding reuse,
multisig with multiple devices, not sharing hardware, etc.)
unimportant.

> As for uniform memory access, if we avoided any kind of heap allocation, wouldn't we avoid such issues?

No. At a minimum, to hide a memory timing side-channel you must perform
no data-dependent loads (e.g. no operation where an offset into memory
is calculated from secret data). A strategy for this is to always load
the same values, but then mask out the ones you didn't intend to
read... even that I'd worry about on sufficiently advanced hardware,
since I would very much not be surprised if the processor was able to
determine that the load had no effect and eliminate it! :)
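
To make the "always load the same values, then mask" strategy concrete,
here is a minimal sketch in C. The names ct_eq_mask and ct_lookup are
made up for illustration and are not taken from any existing library.
It assumes a small table of 32-bit words and a secret index; the
caveats in this thread still apply, since a compiler or CPU could in
principle undo the property.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 0xFFFFFFFF when a == b and 0 otherwise,
 * using only arithmetic and bitwise operations (no branch on the inputs). */
static uint32_t ct_eq_mask(uint32_t a, uint32_t b)
{
    uint32_t x = a ^ b;                        /* zero iff a == b */
    uint32_t nonzero = (x | (0u - x)) >> 31;   /* 1 iff x != 0 */
    return nonzero - 1u;                       /* 0 -> all ones, 1 -> 0 */
}

/* Hypothetical helper: reads table[secret_index] by loading every entry
 * and masking out the ones that were not wanted, so the sequence of
 * addresses touched does not depend on secret_index.
 * Returns 0 if secret_index is out of range. */
static uint32_t ct_lookup(const uint32_t *table, size_t len,
                          uint32_t secret_index)
{
    uint32_t result = 0;
    for (size_t i = 0; i < len; i++) {
        uint32_t mask = ct_eq_mask((uint32_t)i, secret_index);
        result |= table[i] & mask;             /* keeps only the match */
    }
    return result;
}
```

The cost is one pass over the whole table per lookup, which is why
this is usually only done for small tables.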

Maybe in practice if your data dependencies end up only picking around
in the same cache-line it doesn't actually matter... but it's hard to
be sure, and unclear when a future optimization in the rest of the
system might leave it exposed again.

(In particular, you can't generally write timing side-channel immune
code in C (or another high-level language) because the compiler is
freely permitted to optimize things in a way that breaks the property.
... It may be _unlikely_ for it to do this, but it's permitted, and
it will actually do so in some cases, so you cannot be completely sure
unless you check and freeze the toolchain.)

> Anyhow, without having gone into the full details of this particular attack, it seems the main attack point is differences in how squaring and multiplication (in the case of field exponentiation) or doubling and point addition (in the case of ECDSA) are performed. I believe using a branchless implementation where each phase of the operation executes the exact same code and accesses the exact same stack frames would not be vulnerable to FLUSH+RELOAD.

I wouldn't be surprised.
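
As a rough illustration of "each phase executes the exact same code",
here is a toy Montgomery-ladder modular exponentiation in C that does
one multiply and one square per exponent bit and selects between them
with a masked swap rather than a branch. The names ct_cswap and
ladder_pow_mod are hypothetical, and the arithmetic is on plain 64-bit
integers with a modulus below 2^32 rather than on secp256k1 field
elements or curve points. Note that the `%` operator stands in for
real field reduction and is not itself guaranteed to be constant-time.

```c
#include <stdint.h>

/* Hypothetical helper: swaps *a and *b when bit == 1 and leaves them
 * unchanged when bit == 0, without branching on bit. */
static void ct_cswap(uint64_t *a, uint64_t *b, uint64_t bit)
{
    uint64_t mask = (uint64_t)0 - bit;   /* all ones if bit == 1, else 0 */
    uint64_t t = (*a ^ *b) & mask;
    *a ^= t;
    *b ^= t;
}

/* Toy ladder: computes base^secret_exp mod m.
 * Assumes 0 < m < 2^32 so that products fit in 64 bits.
 * Every iteration performs the same multiply and the same square,
 * whatever the value of the exponent bit. */
static uint64_t ladder_pow_mod(uint64_t base, uint32_t secret_exp,
                               uint64_t m)
{
    uint64_t r0 = 1 % m;
    uint64_t r1 = base % m;
    for (int i = 31; i >= 0; i--) {
        uint64_t bit = (secret_exp >> i) & 1u;
        ct_cswap(&r0, &r1, bit);
        r1 = (r0 * r1) % m;              /* always multiply */
        r0 = (r0 * r0) % m;              /* always square */
        ct_cswap(&r0, &r1, bit);
    }
    return r0;
}
```

Whether the compiled output keeps this shape still has to be checked
against the actual toolchain, as noted above.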

_______________________________________________
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development