public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
From: Peter Todd <pete@petertodd•org>
To: Gavin Andresen <gavinandresen@gmail•com>
Cc: Bitcoin Dev <bitcoin-development@lists•sourceforge.net>
Subject: Re: [Bitcoin-development] Payment Protocol Proposal: Invoices/Payments/Receipts
Date: Wed, 28 Nov 2012 03:33:06 -0500	[thread overview]
Message-ID: <20121128083306.GA13919@savin> (raw)
In-Reply-To: <CABsx9T0PsGLEAWRCjEDDFWQrb+DnJWQZ7mFLaZewAEX6vD1eHw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3445 bytes --]

On Mon, Nov 26, 2012 at 05:37:31PM -0500, Gavin Andresen wrote:
> Why not JSON?
> -------------
> 
> Invoice, Payment and Receipt messages could all be JSON-encoded. And
> the Javascript Object Signing and Encryption (JOSE) working group at
> the IETF has a draft specification for signing JSON data.
> 
> But the spec is non-trivial. Signing JSON data is troublesome because
> JSON can encode the same data in multiple ways (whitespace is
> insignificant, characters in strings can be represented escaped or
> un-escaped, etc.), and the standards committee identified at least one
> security-related issue that will require special JSON parsers for
> handling JSON-Web-Signed (JWS) data (duplicate keys must be rejected
> by the parser, which is more strict than the JSON spec requires).
> 
> A binary message format has none of those complicating issues. Which
> encoding format to pick is largely a matter of taste, but Protocol
> Buffers is a simple, robust, multi-programming-language,
> well-documented, easy-to-work-with, extensible format.

I'm not sure this is actually as much of an advantage as you'd expect. I
looked into Google Protocol buffers a while back for a timestamping
project and unfortunately there are many ways in which the actual binary
encoding of a message can differ even if the meaning of the message is
the same, just like JSON.

First of all while the order in which fields are encoded *should* be
written sequentially, parsers are also required to accept the fields in
any order. There is also a repeated fields feature where the
fields can either be serialized as one packed key-list pair, or multiple
key-value(s) pairs; in the latter case the payloads are concatenated.

The general case of how to handle a duplicated field that isn't supposed
to be repeated seems to be undefined in the standard. Yet at the same
time the standard mentions creating messages by concatenating two
messages together. Presumably parsers treat that case as an error, but I
wouldn't be surprised if that isn't always true.

Implementations differ as well. The current Java and C++ implementations
write unknown fields in arbitrary order after the sequentially-ordered
known fields, while on the other hand the Python implementation simply
drops unknown fields entirely. As far as I know no implementation
preserves order for unknown fields.

Finally, while not a Protocol Buffers specific problem, UTF8 encoded
text isn't guaranteed to survive a UTF8-UTFx-UTF8 round trip.  Multiple
code point sequences can be semanticly identical so you can expect some
software to convert one to the other. Similarly lots of languages
internally store unicode strings by converting to something like UTF16.
One solution is to use one of the normalization forms such as NFKD - an
idempotent transformation - although I wouldn't be surprised if
normalization itself is complex enough that implementation bugs exist,
not to mention the fact that the normalization forms have undergone
different versions.

I think the best way(1) to handle (most) the above by simply treating the
binary message as immutable and never re-serializing a deserialized
message, but if you're willing to do that just using JSON isn't
unreasonable either.


1) Of course I went off an created Yet Another Binary Serialization for
my project, but I'm young and foolish...

-- 
'peter'[:-1]@petertodd.org

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

  parent reply	other threads:[~2012-11-28  9:33 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-26 22:37 Gavin Andresen
2012-11-26 23:02 ` Mike Hearn
2012-11-26 23:13   ` Luke-Jr
2012-11-26 23:16     ` Mike Hearn
2012-11-26 23:19       ` Luke-Jr
2012-11-26 23:27         ` Mike Hearn
2012-11-26 23:32         ` Gregory Maxwell
2012-11-26 23:44           ` Luke-Jr
2012-11-27  0:16             ` Gregory Maxwell
2012-11-27  0:26               ` Mike Hearn
2012-11-27  0:45                 ` Rick Wesson
2012-11-27  1:09                   ` Gavin
2012-11-27  8:44                   ` Mike Hearn
2012-11-27  0:44               ` Luke-Jr
2012-11-26 23:38 ` Rick Wesson
2012-11-26 23:52 ` Jeff Garzik
2012-11-27  0:02   ` Rick Wesson
2012-11-27  0:31     ` Luke-Jr
2012-11-27  0:37       ` Rick Wesson
2012-11-27  2:16 ` Walter Stanish
2012-11-27  2:47   ` Gregory Maxwell
2012-11-27  3:16     ` Walter Stanish
2012-11-27  3:29       ` Rick Wesson
2012-11-27  3:31         ` Walter Stanish
2012-11-27  3:54           ` Rick Wesson
2012-11-27  4:17             ` Walter Stanish
2012-11-27  8:43               ` Michael Gronager
2012-11-27 10:23                 ` Mike Hearn
2012-11-27 10:42                   ` Michael Gronager
2012-11-27 11:36                     ` Pieter Wuille
2012-11-27 11:46                       ` Michael Gronager
2012-11-27 12:03                     ` Mike Hearn
2012-11-27 12:39                       ` Michael Gronager
2012-11-27 14:05                         ` Gavin Andresen
2012-11-27 14:26                           ` Gavin Andresen
2012-11-28 13:55                           ` Walter Stanish
2012-11-27 17:03 ` Andy Parkins
2012-11-27 17:14   ` Mike Hearn
2012-11-27 17:26     ` Andy Parkins
2012-11-27 18:16       ` Mike Hearn
2012-11-27 21:39         ` Gavin Andresen
2012-11-28 10:43           ` Mike Hearn
2012-11-28 12:57             ` Peter Todd
2012-11-28 14:09               ` Gavin Andresen
2012-11-28  8:33 ` Peter Todd [this message]
2012-11-28 23:36 ` Roy Badami
2012-11-29  0:30   ` Watson Ladd
2012-11-29  8:16     ` slush
2012-11-29 16:11   ` Gavin Andresen
2012-11-29 17:07     ` Roy Badami
2012-11-29 17:30       ` Gavin Andresen
2012-11-29 17:31       ` Mike Hearn
2012-11-29 18:53         ` Roy Badami
2012-12-01 19:25           ` Gavin Andresen
2012-12-03 19:35             ` Mike Koss
2012-12-03 20:59               ` Gavin Andresen
2012-12-03 21:28               ` Mike Hearn
2012-12-03 22:26                 ` Roy Badami
2012-12-03 22:34                   ` Jeff Garzik
2012-12-03 22:48                     ` Roy Badami
2012-12-16 21:15               ` Melvin Carvalho
2012-12-17  2:18                 ` Jeff Garzik
2012-12-17  8:24                   ` Melvin Carvalho
2012-12-17  9:19                     ` Mike Hearn
2012-12-17  9:31                       ` Gary Rowe
2012-12-17 11:23                       ` Melvin Carvalho
2012-12-17 17:57                         ` Gavin Andresen
2012-12-20 16:53                           ` Stephen Pair
2012-12-20 17:43                             ` Mike Hearn
2012-12-20 19:32                               ` Stephen Pair
2012-12-21 17:05                                 ` Stephen Pair
2012-12-24  0:38                                   ` Elden Tyrell
2012-12-04 17:06             ` Mike Hearn
2012-12-05 19:34               ` Gavin Andresen
2012-12-06  6:31                 ` Andreas Petersson
2012-12-06  8:53                   ` Mike Hearn
2012-12-06 16:56                     ` Gavin Andresen
2012-12-06 17:55                       ` Mike Hearn
2012-12-06 19:13                         ` Gavin Andresen
2012-12-07 10:45                           ` Mike Hearn
2012-12-07 11:01                             ` Mike Hearn
2012-12-07 16:19                               ` Gavin Andresen
2012-12-07 16:27                                 ` Mike Hearn
2012-12-06 18:13                       ` Alan Reiner
     [not found]                       ` <CALf2ePx5jS@mail.gmail.com>
2014-09-17 19:28                         ` Vezalke
2012-12-03 21:42         ` Gregory Maxwell
2012-12-23  2:33 ` Mark Friedenbach

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121128083306.GA13919@savin \
    --to=pete@petertodd$(echo .)org \
    --cc=bitcoin-development@lists$(echo .)sourceforge.net \
    --cc=gavinandresen@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox