Are we just talking about pruning the spent transactions from an old block? We already have a data structure that allows us to replace any un-needed transaction by just it's hash - and possibly a whole sub-tree if we get lucky in that the un-needed transaction all fall within a common node of the merkle tree. If a lite client only cares to retain a single transaction in a block (the most common case) - it will only need O(log2(T)) merkle hashes plus the transaction it cares about. Does it really make sense to adopt a more complex data-structure than the merkle tree for inclusing in the bticoin protocol? And we're not talking about blocks with millions of transactions in them - I don't understand the relevance of Order statistics for random access to a transaction given its block. On Tue, Jun 19, 2012 at 11:30 AM, Alan Reiner wrote: > On 06/19/2012 02:18 PM, Mark Friedenbach wrote: > > On Tue, Jun 19, 2012 at 10:33 AM, Alan Reiner wrote: > > If we were to use a raw trie structure, then we'd have all the above >> issues solved: a trie has the same configuration no matter how elements >> are inserted or deleted, and accesses to elements in the tree are >> constant time -- O(1). There is no such thing as an unbalanced trie. >> But overall space-efficiency is an issue. >> >> A PATRICIA tree/trie would be ideal, in my mind, as it also has a >> completely deterministic structure, and is an order-of-magnitude more >> space-efficient. Insert, delete and query times are still O(1). >> However, it is not a trivial implementation. I have occasionally looked >> for implementations, but not found any that were satisfactory. >> > > No, a trie of any sort is dependent upon distribution of input data for > balancing. As Peter Todd points out, a malicious actor could construct > transaction or address hashes in such a way as to grow some segment of the > trie in an unbalanced fashion. It's not much of an attack, but in principle > exploitable under particular timing-sensitive circumstances. > > Self-balancing search trees (KVL, RB, 2-3-4, whatever) don't suffer from > this problem. > > Mark > > > I was using "unbalanced" to refer to "query time" (and also insert/delete > time). If your trie nodes branch based on the next byte of your key hash, > then the max depth of your trie is 32. Period. No one can do anything to > ever make you do more than 32 hops to find/insert/delete your data. And > if you're using a raw trie, you'll always use *exactly* 32 hops > regardless of the distribution of the underlying data. Hence, the trie > structure is deterministic (history-independent) and cannot become > unbalanced in terms of access time. > > My first concern was that a malicious actor could linearize parts of the > tree and cause access requests to take much longer than log(N) time. With > the trie, that's not only impossible, you're actually accessing in O(1) > time. > > However, you are right that disk space can be affected by a malicious > actor. The more branching he can induce, the more branch nodes that are > created to support branches with only one leaf. > > > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Bitcoin-development mailing list > Bitcoin-development@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/bitcoin-development > > -- Mike Koss CTO, CoinLab (425) 246-7701 (m) A Bitcoin Primer - What you need to know about Bitcoins.