--- Log opened Fri Dec 18 00:00:52 2020
02:53 -!- jb55 [~jb55@gateway/tor-sasl/jb55] has quit [Ping timeout: 240 seconds]
03:00 -!- jb55 [~jb55@gateway/tor-sasl/jb55] has joined #utreexo
07:57 -!- belcher_ [~belcher@unaffiliated/belcher] has joined #utreexo
08:00 -!- belcher [~belcher@unaffiliated/belcher] has quit [Ping timeout: 268 seconds]
08:01 -!- belcher_ is now known as belcher
08:58 < adiabat> hm ok. Should utreexo stuff point to utcd?
08:58 < adiabat> anyway works if I just remove that line in go.mod
10:12 < kcalvinalvin> adiabat maybe? If there's a need in the future
11:22 < dergoegge> the meetup was interesting. TLDR: smaller proofs but slower? calvin might have a better tldr i think he understood the thing a bit better
11:23 < adiabat> oh shoot I missed the meetup...
11:23 < adiabat> I'm in calls today a lot, maybe couldn't go
11:23 < adiabat> do they have a paper to read?
11:25 < dergoegge> not yet it seems
11:25 < adiabat> yeah overlapping call I'm still stuck in... oh well
11:25 < adiabat> would have been good to ask questions. maybe can get them on IRC
11:26 < dergoegge> damn it we should have asked for them to come on here...
11:27 < adiabat> so the slower is more cpu / hashing I guess?
11:27 < adiabat> could be a good tradeoff as apparently the code doesn't spend any time hashing... or at least not long enough that 2 cores would help :)
11:28 < dergoegge> he said generating the proofs took several days longer...
11:30 < dergoegge> boltonb2@illinois.edu thats his email, we could ask him to come to one of the meetings
11:30 < adiabat> hm could be an implementation thing.
11:30 < adiabat> but yeah that's slow
11:30 < adiabat> although, genproofs taking longer and ibd faster is a good tradeoff since genproofs only has to happen once
11:47 < adiabat> OK thinking out loud (on IRC) I don't know what their paper is, but if you limit yourself to only previous blocks, there are serious proof size reductions possible
11:53 < adiabat> like, since the whole thing is after the fact, you could sort additions in an optimal way so that leaves cluster together
11:55 < adiabat> to minimize proof size. Like use TTL data to cluster leaf additions. Right now leaf addition just happens in the block order
11:55 < adiabat> which is super naive. Even without TTL data you could sort it in different ways. Like...
11:57 < adiabat> utxos created by long lived utxos might tend to live longer
11:57 < adiabat> like if a tx spends 2 blocks old utxos, maybe the utxos it creates won't live long
11:57 < adiabat> that would be doable with only past data.
11:58 < adiabat> with future data could do much more without any heuristic, just cluster based on spend time.
11:58 < adiabat> So... this is probably *not* what's in their paper but... if the goal is "reduce proof size" I have lots of ideas too :)
12:02 < adiabat> The problem is as blocks come out, you gain information which allows you to reduce the proof size of that block, but in order to do so you need to change things that happened a long time ago
12:03 < adiabat> what would be really great is a way to modify old stuff retroactively...
12:11 < adiabat> it gets complicated though, but I think there are ways to do it. I guess the first question is how much the proof size goes down
12:12 < adiabat> if it's complicated and proof size only goes down 10 or 20% then not worth it. But if it's like 80% reduction then...
12:12 < adiabat> yeah
12:12 < adiabat> OK I should work on the PR stuff but that *is* a fun idea that would be testable now
12:12 < adiabat> do 1 run of genproofs, make proofs & TTL data, measure proof size
12:14 < adiabat> then do a 2nd run, where you have TTL data for all the new utxo every block. Instead of adding leaves in the same order that utxos appear in the blocks,
12:14 < adiabat> add them in TTL order, so the ones that disappear in the same block get clustered together at insertion time
12:15 < adiabat> then measure the size of that 2nd run's proofs.
12:15 < adiabat> That's actually only a few dozen lines probably. Something to think about anyway
14:59 < adiabat> actually there's even more flexibility
14:59 < adiabat> (hm not sure this is the best place to write this, maybe should make a doc in the repo or something but ... )
15:01 < adiabat> right now the Modify() call is straightforward - delete leaves, compact the trees down to eliminate the gaps, then add new leaves on the right
15:01 < adiabat> but you could have the remove & add steps mixed. So you're not limited to putting new leaves just on the right
15:02 < adiabat> the new leaves could go to any spot opened up by the deletions
15:02 < adiabat> not just the deletion spots directly, but even places "near" that, revealed by the proofs
15:05 < adiabat> for example - printout.txt smallest tree (8 leaves) :
15:05 < adiabat> you delete 1 leaf, 02, and add 4 new ones
15:05 < adiabat> you can make the new ones into a 4-tree, and the proof for 02 has 13 in it.
15:07 < adiabat> so you can put the new 4-tree in position 12 or 13, and 00, 01 and 03 end up on the right
15:07 < adiabat> if the forest keeps TTL data associated with leaves (which wouldn't be hard to do) you have a lot of freedom to move things around
15:08 < adiabat> in many cases you would be able to move all the leaves for a block's inputs all together into a big subtree just in time for those leaves to be deleted
15:08 < adiabat> in which case the proof would be tiny, just a few hashes for the whole block
15:09 < adiabat> that's probably worth it, or would be worth it, as the additional genproofs work only needs to happen once, and after that everyone benefits from it
15:10 < adiabat> the problem is it's non-causal; you don't know which leaves to group together until they all get deleted (in the block)
15:11 < adiabat> and once that happens it's too late; the leaves weren't in the right place
15:12 < adiabat> on one hand, this is similar to TTL data. TTL only works after the fact, for historic IBD, and doesn't work once you're synced up
15:12 < adiabat> on the other hand, TTL is just a caching optimization and has no effect on the consensus critical arrangement of leaves. If you ignore TTL data everything will still work
15:14 < adiabat> but arranging leaves so that they all end up next to each other in time to be removed is something everyone would need to do, and can only do for historic data...
15:14 < adiabat> also it's not something that can be continually updated
15:14 < adiabat> for TTL data, say we start and the blockchain is 1000 blocks long, and we use TTL data going up to 1000
15:14 < adiabat> then time passes, there are 1300 blocks
15:15 < adiabat> we keep adding TTL data into the flat files every time a block arrives, and incrementally help new nodes connecting
15:15 < adiabat> nothing changes about block 1200 data because of block 1300
15:16 < adiabat> arranging things for optimal deletion proofs doesn't work that way though..
15:16 < adiabat> you can make perfect (or very close) proofs up to block 1000
15:17 < adiabat> but then if you want to extend that so that block 1200 also gets a perfect proof, you need to change the proofs for block 900
15:17 < adiabat> feels like there's no way around that. So it's great for old data IBD, but then you only get 1 shot
15:18 < adiabat> you could do something really ugly, like every 1000 blocks have the bridge nodes recompute the optimal swapping, and then give a re-arrange proof
15:19 < adiabat> like hey everyone, it's block 5000, move all these things around for no apparent reason. I'll give you the proofs so you know it's legit and just swaps, no adds or deletes
15:20 < adiabat> and then everyone does that, and now all the old proofs changed completely, and proofs for 4000-5000 are now optimal
15:20 < adiabat> that's so ugly though
15:21 < adiabat> It's interesting to think about long term though. It does seem like you can probably get very small proof sizes if you really want to
15:22 < adiabat> I'm also assuming the bridge nodes have enough flexibility in where to place things, which I think is likely.
15:22 < adiabat> Also even if they didn't, it wouldn't be that crazy to include an "extra proof" where something is proven just so a new leaf can be swapped to that position to make later proofs smaller
15:22 < adiabat> guessing you wouldn't need many of those
15:23 < adiabat> anyway I'm pretty sure this stuff has *nothing* to do with that paper :)
16:48 < adiabat> there's also a tradeoff in this optimization
16:49 < adiabat> the better (closer grouped together, smaller proof) you make the proof for one block, the less freedom you have to improve later blocks.
16:49 < adiabat> for example, say there are 255 leaves (8 trees). We've got all the leaves perfectly grouped together, they take up the entire 64-tree
16:49 < adiabat> so the proof for that block is... nothing. 0 byte proof, 0 hashes. Which is very cool.
16:49 < adiabat> but that also means we learn nothing about leaves or anything below the roots, so we can't move anything around.
16:49 < adiabat> we can only arrange the newly inserted leaves into different places
16:49 < adiabat> there is still a good amount of freedom there. We have the ordering, but we can also shuffle around the existing trees.
16:49 < adiabat> in this example, say there are 64 more leaves to be added. So we end up back at 255 with 8 trees.
16:50 < adiabat> We can collapse first and put the new leaves on the right, in whatever order we want.
16:50 < adiabat> Or we can put them in the recently vacated 64-tree.
16:50 < adiabat> Or really anywhere in between. Take 32 long-ttl leaves, group them with the 32-tree, filling the 64-tree. Then the rest go on the right.
16:50 < adiabat> Hm. So even in the extreme case where there's *no* proof, you still have a lot of freedom to spread hashes around.
16:50 < adiabat> Not total freedom though; the 128-tree is off limits.
16:50 < adiabat> 255 is the best-case though, as there are lots of trees & the accumulator state is large
16:50 < adiabat> if it's 256 (the other extreme) the proof just becomes 2 hashes long (to the 64 tree) and then you only know a 64-tree and a 128-tree. Pretty much stuck putting everything on the right.
16:50 < adiabat> Usually we're somewhere in between though. Also the number of deletions isn't going to be a power of 2.
16:50 < adiabat> So the proofs will have more hashes because there will be several subtrees. But not too many.
16:50 < adiabat> But yeah that's the tradeoff; the smaller the proofs the less freedom you have to arrange things in the right way to make future proofs small.
16:50 < adiabat> Sounds hard to optimize. Maybe a worse proof for block n can allow a better proof for block n+247.
16:50 < adiabat> Anyway if we really want to get proof sizes down, there's tons of stuff to try...
16:50 < adiabat> but probably should get what we have working first :)
17:08 -!- belcher [~belcher@unaffiliated/belcher] has left #utreexo ["Leaving"]
--- Log closed Sat Dec 19 00:00:51 2020