[+cc aaron]

We recently added an implementation of BIP 38 (password protected private
keys) to bitcoinj. It came to my attention that the third test vector may
be broken. It gives a hex version of what the NFC normalised version of the
input string should be, but this does not match the results of the Java
unicode normaliser, and in fact I can't even get Python to print the names
of the characters past the embedded null. I'm curious where this normalised
version came from.

Given that "pile of poo" is not a character I think any sane user would put
into a passphrase, I question the value of this test vector. NFC form is
intended to collapse things like umlaut control characters onto their prior
code point, but here we're feeding the algorithm what is basically garbage
so I'm not totally surprised that different implementations appear to
disagree on the outcome.

Proposed action: we remove this test vector as it does not represent any
real world usage of the spec, or if we desperately need to verify NFC
normalisation I suggest using a different, more realistic test string, like
Zürich, or something written in Thai.



Test 3:

   - Passphrase ϓ␀𐐀💩 (\u03D2\u0301\u0000\U00010400\U0001F4A9; GREEK
   UPSILON WITH HOOK <http://codepoints.net/U+03D2>, COMBINING ACUTE ACCENT
   <http://codepoints.net/U+0301>, NULL <http://codepoints.net/U+0000>, DESERET
   CAPITAL LETTER LONG I <http://codepoints.net/U+10400>, PILE OF POO
   <http://codepoints.net/U+1F4A9>)
   - Encrypted key:
   6PRW5o9FLp4gJDDVqJQKJFTpMvdsSGJxMYHtHaQBF3ooa8mwD69bapcDQn
   - Bitcoin Address: 16ktGzmfrurhbhi6JGqsMWf7TyqK9HNAeF
   - Unencrypted private key (WIF):
   5Jajm8eQ22H3pGWLEVCXyvND8dQZhiQhoLJNKjYXk9roUFTMSZ4
   - *Note:* The non-standard UTF-8 characters in this passphrase should be
   NFC normalized to result in a passphrase of0xcf9300f0909080f09f92a9 before
   further processing