On Tue, Jun 19, 2012 at 10:33 AM, Alan Reiner wrote: > I hope that someone else here would chime in on the issue raised in the > thread, about using a tree-structure that has multiple valid > configurations for the same set of unspent-TxOuts. If you use any > binary tree, you must replay the entire history of insertions and > deletions in the correct order to get the tree structure and correct > root. Along those lines, using something like a red-black tree, while > theoretically well-known, could be subject to implementation errors. > One implementation of a red-black tree may do the rebalancing > differently, and still work for it's intended purpose in the majority of > applications where it doesn't matter. One app developer updates their > RB tree code which updated the RB-tree optimizations/rebalancing, and > now a significant portion of the network can't agree on the correct > root. Not only would that be disruptive, it would be a disaster to > track down. > Then use a 2-3-4 tree (aka self-balancing B-tree of order 4), which is a generalization of RB-trees that doesn't allow for implementation choices in balancing (assuming ordered insertion and deletion). As gmaxwell points out, this is an trivially fixable 'problem'. Choose a standard, mandate it, and write test cases. If we were to use a raw trie structure, then we'd have all the above > issues solved: a trie has the same configuration no matter how elements > are inserted or deleted, and accesses to elements in the tree are > constant time -- O(1). There is no such thing as an unbalanced trie. > But overall space-efficiency is an issue. > > A PATRICIA tree/trie would be ideal, in my mind, as it also has a > completely deterministic structure, and is an order-of-magnitude more > space-efficient. Insert, delete and query times are still O(1). > However, it is not a trivial implementation. I have occasionally looked > for implementations, but not found any that were satisfactory. > No, a trie of any sort is dependent upon distribution of input data for balancing. As Peter Todd points out, a malicious actor could construct transaction or address hashes in such a way as to grow some segment of the trie in an unbalanced fashion. It's not much of an attack, but in principle exploitable under particular timing-sensitive circumstances. Self-balancing search trees (KVL, RB, 2-3-4, whatever) don't suffer from this problem. Mark