--- Log opened Fri Jun 12 00:00:47 2020
00:39 -!- ThomasV [~thomasv@unaffiliated/thomasv] has joined #utreexo
02:48 < ThomasV> is deletion of leaves order-independent?
02:52 < ThomasV> also, does batch deletion result in the same tree as multiple DeleteOne operations?
05:04 -!- ThomasV [~thomasv@unaffiliated/thomasv] has quit [Ping timeout: 272 seconds]
05:52 -!- gojiHmPFPN [~textual@c-73-47-220-190.hsd1.ma.comcast.net] has joined #utreexo
06:03 -!- gojiHmPFPN [~textual@c-73-47-220-190.hsd1.ma.comcast.net] has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
06:17 -!- gojiHmPFPN [~textual@c-73-47-220-190.hsd1.ma.comcast.net] has joined #utreexo
06:25 -!- rafalcpp [~racalcppp@ip-178-214.ists.pl] has joined #utreexo
08:11 -!- ThomasV [~thomasv@unaffiliated/thomasv] has joined #utreexo
09:20 < adiabat> ThomasV: from the accumulator's point of view, deletion order does matter, but we delete everything in a block at the same time
09:21 < adiabat> I don't think DeleteOne is used at all... maybe we should get rid of it
09:21 < ThomasV> adiabat: ok, so it means an implementation that deletes in a different order would still be compatible
09:22 < adiabat> Possibly not; calling DeleteOne many times is much less efficient than deleting all at once, and it may result in the leaves being in a different order
09:22 < ThomasV> oh, sorry, I read "does not matter" :(
09:23 < adiabat> so that would change the proofs. It may end up with the same ordering, but there's no guarantee
09:23 < ThomasV> hmm, ok, so another implementation would have to go block after block
09:23 < adiabat> It's possible to get the same result deleting one at a time, but it would be a lot slower
09:24 < adiabat> because with each deletion, you need to hash all the way back up to the root
09:24 < ThomasV> yeah, that's what I'm currently doing :D
09:24 < adiabat> in a tree with say 1 million leaves, and you delete 1000, that'd be about 1000*20 hashes
09:25 < adiabat> but deleting all at the same time ends up being much fewer hashes
09:25 < adiabat> especially if they are near each other
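[Aside: a minimal Go sketch of the cost adiabat describes. It assumes a perfect tree in the classic heap layout (root at index 1, parent of node i at i/2); that is illustrative position arithmetic, not utreexo's actual forest layout, and parentHash/hashUp are made-up names.]

    package main

    import (
        "crypto/sha256"
        "fmt"
    )

    type Hash [32]byte

    // parentHash combines two child hashes into their parent hash.
    func parentHash(l, r Hash) Hash {
        return sha256.Sum256(append(l[:], r[:]...))
    }

    // hashUp re-hashes from one changed leaf all the way back up to the
    // root and returns how many hashes that took: ~log2(n) per deletion.
    func hashUp(tree []Hash, i int) int {
        hashes := 0
        for i > 1 {
            i /= 2
            tree[i] = parentHash(tree[2*i], tree[2*i+1])
            hashes++
        }
        return hashes
    }

    func main() {
        n := 1 << 20              // ~1 million leaves: the tree is 20 levels high
        tree := make([]Hash, 2*n) // leaves at tree[n:2n], root at tree[1]
        total := 0
        for k := 0; k < 1000; k++ { // delete one at a time: re-hash every time
            total += hashUp(tree, n+k)
        }
        fmt.Println(total) // 20000: the "about 1000*20 hashes" above
    }

Batch deletion wins because nodes near the top are re-hashed once instead of once per deletion, especially when the deleted leaves are near each other.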
09:26 < ThomasV> adiabat: in the paper you mention maintaining an index of utxo->path_to_leaf. how is that cheaper than using pointers and not having to maintain that index?
09:26 < adiabat> How would pointers work?
09:26 < ThomasV> (for the bridge node, where all hashes are written contiguously)
09:27 < adiabat> The main thing is: given a hash, you need to know where it is
09:27 < adiabat> so you could do a linear search, it just takes too long
09:27 < ThomasV> well, you could do a malloc for each node and have a pointer to the parent
09:28 < adiabat> yes, you can get there, but you don't know where "there" is. If I'm given some 32-byte leaf hash, I know that leaf is somewhere in my tree
09:28 < adiabat> but my tree has millions of leaves, and they're not in order
09:29 < ThomasV> oh, but you would still have an index, utxo->pointer to the node in memory
09:29 < adiabat> right, which is basically what it is
09:29 < ThomasV> the thing is, you would not need to maintain it
09:29 < adiabat> it's just a mapping of hash to position, which is a uint64
09:29 < ThomasV> yes, but you need to update those positions every time you shuffle a subtree
09:29 < adiabat> oh, if it was all pointer-based, the pointer would follow it, you mean
09:29 < ThomasV> yes
09:30 < adiabat> yeah, I actually had pollard working that way for a while
09:30 < adiabat> and changed it because it was trickier... it does work though
09:30 < ThomasV> was it slower?
09:30 < adiabat> I don't think so
09:31 < adiabat> but... it's not really needed in pollard (the sparse forest)
09:31 < adiabat> the full forest is where it's needed, and I haven't tried it there
09:31 < ThomasV> maybe batch deletion is easier with the array approach, because you can easily sort leaves
09:32 < adiabat> I can see how it might be faster, you just swap pointers when moving things
09:32 < adiabat> but it also seems a bit harder to code, but maybe worth it
09:33 < ThomasV> hmm.. it's actually easier to code.. unless there is something I don't get in the array approach
09:33 < adiabat> I'm not sure how much the position map slows things down... I don't *think* it's a bottleneck compared to hashing, but I haven't measured
09:33 < ThomasV> btw, why go and not c++/bitcoind?
09:34 < adiabat> I just like working with go better :) Eventually I will need to make a c++ version, I think
09:35 < adiabat> the part that would be harder with pointers is keeping it all on disk. In RAM it's not too bad, but you then need to write to disk sometimes, maybe not every block
09:36 < adiabat> also, forest currently has no pointers involved, so switching to pointers is 50% more memory usage
09:36 < adiabat> (every 32-byte hash gets an extra two 8-byte pointers, 32 -> 48 bytes per node in the tree)
09:39 < ThomasV> how do you update the position map? do you maintain a list of all leaves that have been shuffled in the current batch?
09:42 < adiabat> every time a leaf moves (anything on the bottom row) the position map is updated
09:42 < adiabat> https://github.com/mit-dci/utreexo/blob/master/accumulator/forest.go#L226
09:43 < adiabat> I think that function is the only place it happens
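[Aside: a minimal sketch of the bookkeeping just described: a flat array of hashes plus a hash -> position map that must be updated whenever a bottom-row leaf moves. Type and method names here are made up for illustration; the linked forest.go is the real thing.]

    package main

    import "fmt"

    type Hash [32]byte

    // forest stores hashes contiguously and keeps a map from leaf hash to
    // its uint64 position. This map is what a pure pointer design would
    // avoid maintaining, at the cost of ~16 bytes of pointers per node.
    type forest struct {
        data      []Hash
        positions map[Hash]uint64
    }

    // swapLeaves exchanges two bottom-row leaves and updates both map
    // entries: the "every time a leaf moves" update discussed above.
    func (f *forest) swapLeaves(a, b uint64) {
        f.data[a], f.data[b] = f.data[b], f.data[a]
        f.positions[f.data[a]] = a
        f.positions[f.data[b]] = b
    }

    func main() {
        f := &forest{
            data:      []Hash{{1}, {2}, {3}, {4}},
            positions: map[Hash]uint64{{1}: 0, {2}: 1, {3}: 2, {4}: 3},
        }
        f.swapLeaves(0, 3)
        fmt.Println(f.positions[Hash{1}], f.positions[Hash{4}]) // 3 0
    }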
09:47 < ThomasV> adiabat: another question. when you do a swap, I guess you swap with the closest leaf on the row. choosing another candidate would probably affect the result, just like deletion is order-dependent
09:47 < ThomasV> so, to perform swaps in a compatible way, I kind of need the position index, regardless of using pointers
09:49 < adiabat> The swaps are not always with the closest
09:49 < adiabat> there's some freedom in how the swaps work, especially when deleting many things at once
09:49 < ThomasV> ok.. but the choice would affect the result, right?
09:49 < adiabat> with DeleteOne there's basically only one way to do it
09:49 < adiabat> yeah, that would affect the ordering of the leaves
09:50 < adiabat> It's very possible that there are better ways to swap things, that result in shorter proofs by keeping leaves together
09:50 < ThomasV> so it would be good to document how the candidate is chosen :-)
09:50 < adiabat> I haven't played around with it, and it probably doesn't make a huge difference, but it could be something to optimize
09:51 < ThomasV> my question is not about performance, it's about reproducibility
09:51 < adiabat> ah, OK, then yeah, everyone has to do it the same way, otherwise they'll get different proofs
09:52 < ThomasV> exactly
09:52 < adiabat> right now it's pretty simple: it's "extract twins", so ignore two deletions that are siblings
09:53 < adiabat> then the swaps are with the sibling of the next deletion to the right
09:53 < ThomasV> ok, that's what I thought :-)
09:53 < adiabat> so that's "closest" but only looking at the next one over; there could be an algorithm that minimizes swap distance over the whole range
09:53 < adiabat> https://github.com/mit-dci/utreexo/blob/master/accumulator/transform.go#L104
09:53 < ThomasV> but you can see that if I only use pointers, I will not know who is the 'next to the right'
09:54 < adiabat> that's the swap algo now, which is pretty simple
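[Aside: a toy Go reading of the two steps adiabat just described: "extract twins" first, then swap in the sibling of the next deletion to the right. The position arithmetic (sibling = p^1, parent = p/2) is a simplification, not utreexo's row-offset numbering, so treat this as a sketch of the idea; the linked transform.go is the authoritative algorithm.]

    package main

    import "fmt"

    // arrow records one swap: the node at 'from' moves to position 'to'.
    type arrow struct{ from, to uint64 }

    // extractTwins scans sorted bottom-row deletion positions: when both
    // siblings (p and p^1) are deleted, no swap is needed and the
    // deletion simply moves up to their parent on the next row.
    func extractTwins(dels []uint64) (rem, parents []uint64) {
        for i := 0; i < len(dels); i++ {
            if i+1 < len(dels) && dels[i]^1 == dels[i+1] {
                parents = append(parents, dels[i]/2)
                i++ // skip the twin
                continue
            }
            rem = append(rem, dels[i])
        }
        return rem, parents
    }

    // makeSwaps pairs the surviving deletions left to right: the sibling
    // of the next deletion to the right fills the current deletion's
    // slot, and the pair's right parent becomes a deletion one row up.
    func makeSwaps(dels []uint64) (swaps []arrow, parents []uint64) {
        for len(dels) >= 2 {
            swaps = append(swaps, arrow{from: dels[1] ^ 1, to: dels[0]})
            parents = append(parents, dels[1]/2)
            dels = dels[2:]
        }
        return swaps, parents
    }

    func main() {
        rem, up := extractTwins([]uint64{2, 3, 5, 8}) // 2 and 3 are twins
        fmt.Println(rem, up)                          // [5 8] [1]
        swaps, more := makeSwaps(rem)
        fmt.Println(swaps, more) // [{9 5}] [4]: 8's sibling 9 fills slot 5
    }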
09:54 < adiabat> well, even with pointers it should be possible, I think
09:55 < adiabat> you can tell position by the path you take to get to a node: how many left / right pointers you take
09:55 < ThomasV> I think so too.. I just need to know what you are doing
09:56 < adiabat> the current way is to look at the positions of all the deletions
09:56 < adiabat> and operate first on those positions to get a list of what to swap
09:57 < adiabat> in DeleteOne, you can't actually swap anything, you can only move roots
09:58 < adiabat> swapping is the optimization deleting many at a time gives you, as you can move everything around into the final ordering and then re-hash everything from there
09:58 < adiabat> interestingly, when *adding* leaves, adding one at a time is just as efficient as adding many at once
10:01 < adiabat> there are a bunch of things where you could definitely do it differently and it would still work, and maybe even be better
10:01 < adiabat> one of the biggest: right now it takes the block, deletes all inputs, then adds all outputs
10:01 < adiabat> this seems simplest, but might not be most efficient!
10:02 < adiabat> doing add first then delete would be simple to change, but I think that would be worse / slower
10:02 < adiabat> but what might be better is to mark all the deletions, overwrite them with the new additions, and then only do deletions / swapping / etc. if there are more deletions than additions
10:03 < adiabat> that would minimize the swapping part; in many (most!) blocks there wouldn't be any
10:03 < ThomasV> oh, but it would be hard to maintain compatibility with your implementation then
10:03 < adiabat> the downside is the leaves would not be grouped together as much by creation time. So maybe it ends up being worse due to longer proofs
10:03 < adiabat> yeah, this would be totally incompatible
10:04 < adiabat> right now though, nobody's using it in the wild yet, so
10:04 < adiabat> easy to switch :)
10:05 < adiabat> I haven't tried looking at the performance of different ways, maybe I should... my guess is it won't make much difference, so I've just left it the simplest way
10:05 < adiabat> but if it ends up being 20% faster or something, definitely worth switching
10:09 < ja> with the c bindings i made, it would be easier for any alternate implementation to ensure compatibility ;)
10:10 < adiabat> ja: yeah, bindings are easiest. I think it's also cool if there's a different-language rewrite, because then we might find some edge cases & nail down the spec better
10:11 < adiabat> but yeah, it is a "consensus" kind of thing; if two different computers do different things with the accumulator, their proofs won't work for each other
10:12 < ThomasV> heh
10:14 < adiabat> hmm, also if we do find better ways to do it, there can be different versions, so when they connect they should announce versions to check compatibility
10:14 < adiabat> like maybe people start using it and a year later there's a better algorithm
10:15 < adiabat> for nodes to upgrade / update to a different algo, they can be presented with a proof
10:15 < adiabat> that whatever accumulator state they have now, here's the equivalent with the new algo
10:16 < adiabat> the proof would be pretty big though. But it beats having to re-sync the whole blockchain
10:22 < ThomasV> adiabat: suppose I have a light node with only the accumulator. I get a block, with a proof for each utxo that is spent in the block. I guess I will also need to receive a special ordering, that matches the batch deletion process?
10:24 < ThomasV> or at least I will need to implement a strategy that mimics the batch deletion
10:24 < adiabat> the way we have it now, it gets the proof, and the utxo data for each input
10:25 < adiabat> and the proof does contain a list of positions within the forest
10:26 < ThomasV> oh, but the swap candidate is part of the proof, so I guess that's how you can do it
10:26 < adiabat> yes, the deletion algorithm is such that only the proofs are needed / used for swaps
10:26 < ThomasV> ...my brain hurts..
10:29 < adiabat> heh :) You can kind of think of "verify proof" and "delete" as the same function;
10:29 < adiabat> they both use the exact same data
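[Aside: a toy single-leaf illustration of that last point, assuming sha256 parent hashing and pos&1 sibling directions; utreexo's real batch verify/delete handles many leaves at once. The point is that one pass over the same proof hashes both checks the leaf against the root and yields the root you get once the leaf is gone (its sibling moves up).]

    package main

    import (
        "crypto/sha256"
        "fmt"
    )

    type Hash [32]byte

    func parentHash(l, r Hash) Hash {
        return sha256.Sum256(append(l[:], r[:]...))
    }

    // verifyAndDelete folds the proof upward once. 'cur' reproduces the
    // root to verify the leaf; 'del' computes the post-deletion root by
    // starting from the leaf's sibling instead of the leaf. Same data,
    // one function: "verify proof" and "delete" together.
    func verifyAndDelete(root, leaf Hash, pos uint64, proof []Hash) (Hash, bool) {
        if len(proof) == 0 { // single-leaf tree: deleting empties it
            return Hash{}, leaf == root
        }
        cur, del := leaf, proof[0]
        for i, sib := range proof {
            if pos&1 == 0 {
                cur = parentHash(cur, sib)
                if i > 0 {
                    del = parentHash(del, sib)
                }
            } else {
                cur = parentHash(sib, cur)
                if i > 0 {
                    del = parentHash(sib, del)
                }
            }
            pos >>= 1
        }
        return del, cur == root
    }

    func main() {
        leaves := []Hash{{1}, {2}, {3}, {4}}
        n01 := parentHash(leaves[0], leaves[1])
        n23 := parentHash(leaves[2], leaves[3])
        root := parentHash(n01, n23)
        // prove leaf 2 at position 2: proof is its sibling, then n01
        newRoot, ok := verifyAndDelete(root, leaves[2], 2, []Hash{leaves[3], n01})
        fmt.Println(ok, newRoot == parentHash(n01, leaves[3])) // true true
    }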
12:57 -!- ThomasV [~thomasv@unaffiliated/thomasv] has quit [Ping timeout: 272 seconds]
13:37 -!- ThomasV [~thomasv@unaffiliated/thomasv] has joined #utreexo
13:53 -!- jb55 [~jb55@gateway/tor-sasl/jb55] has quit [Remote host closed the connection]
13:54 -!- jb55 [~jb55@gateway/tor-sasl/jb55] has joined #utreexo
14:18 -!- ThomasV [~thomasv@unaffiliated/thomasv] has quit [Ping timeout: 272 seconds]
14:54 -!- ThomasV [~thomasv@unaffiliated/thomasv] has joined #utreexo
15:21 -!- ThomasV [~thomasv@unaffiliated/thomasv] has quit [Ping timeout: 246 seconds]
19:05 -!- gojiHmPFPN [~textual@c-73-47-220-190.hsd1.ma.comcast.net] has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
19:07 -!- gojiHmPFPN [~textual@c-73-47-220-190.hsd1.ma.comcast.net] has joined #utreexo
19:35 -!- fanquake [sid369002@gateway/web/irccloud.com/x-dpentetezvtqljkq] has quit [Ping timeout: 256 seconds]
20:03 -!- fanquake [sid369002@gateway/web/irccloud.com/x-atimgvzlvyuuhsrv] has joined #utreexo
20:09 -!- gojiHmPFPN [~textual@c-73-47-220-190.hsd1.ma.comcast.net] has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
20:54 -!- fanquake [sid369002@gateway/web/irccloud.com/x-atimgvzlvyuuhsrv] has quit [Ping timeout: 264 seconds]
20:59 -!- ThomasV [~thomasv@unaffiliated/thomasv] has joined #utreexo
21:18 -!- fanquake [sid369002@gateway/web/irccloud.com/x-zpfuishxfwzmfiwj] has joined #utreexo
22:02 -!- ThomasV [~thomasv@unaffiliated/thomasv] has quit [Ping timeout: 256 seconds]
22:34 -!- ThomasV [~thomasv@unaffiliated/thomasv] has joined #utreexo
23:17 -!- ThomasV [~thomasv@unaffiliated/thomasv] has quit [Ping timeout: 264 seconds]
--- Log closed Sat Jun 13 00:00:48 2020