--- Log opened Thu Oct 21 00:00:10 2021
03:08 -!- promag [~promag@188.250.84.129] has joined #bitcoin-core-pr-reviews
03:08 -!- promag_ [~promag@188.250.84.129] has quit [Read error: Connection reset by peer]
03:29 -!- yonson [~yonson@2600:8801:d900:7bb:1e69:7aff:fea2:4e85] has quit [Remote host closed the connection]
03:29 -!- yonson [~yonson@2600:8801:d900:7bb:1e69:7aff:fea2:4e85] has joined #bitcoin-core-pr-reviews
03:39 -!- esraa [~esraa@147.236.159.129] has joined #bitcoin-core-pr-reviews
03:39 -!- esraa [~esraa@147.236.159.129] has quit [Client Quit]
04:07 -!- luke-jr [~luke-jr@user/luke-jr] has quit [Quit: ZNC - http://znc.sourceforge.net]
04:07 -!- luke-jr [~luke-jr@user/luke-jr] has joined #bitcoin-core-pr-reviews
04:23 -!- luke-jr [~luke-jr@user/luke-jr] has quit [Quit: ZNC - http://znc.sourceforge.net]
04:24 -!- luke-jr [~luke-jr@user/luke-jr] has joined #bitcoin-core-pr-reviews
04:31 -!- luke-jr [~luke-jr@user/luke-jr] has quit [Quit: ZNC - http://znc.sourceforge.net]
04:32 -!- luke-jr [~luke-jr@user/luke-jr] has joined #bitcoin-core-pr-reviews
07:01 < pinheadmz> sipa interesting comment, I've been running bitcoind on a RPi with USB-SSD for years. I even build and sell pre-synced nodes as an art project. I've only rarely ever had issues and almost none in the last year or two
07:02 < pinheadmz> This is my go-to: https://www.amazon.com/gp/product/B0874XN4D8
07:08 < sipa> pinheadmz: i'm sure my statement isn't absolute, and if you find the right hardware it can work
07:08 < sipa> but for a number of years, a large majority of all corrupted blockchain reports i saw were all ones with datadirs on usb drives
07:08 < pinheadmz> heh, dang
07:08 < pinheadmz> I used to get better performance with ext2 but the newer RPis work just as fast with ext4
07:09 < pinheadmz> the ext2 would get corrupted in a nasty way, that was like 2016/2017 though
07:32 < DavidBakin> (agree: USB is fairly reliable now - but for many years was unreliable as heck for storage. I think things got better with 1) experience on the part of chip makers and driver writers that shows up in USB 3/3.1/etc and MAINLY 2) higher power available to the drive on the newer USB standards - I bet drives starved of power during heavy load was the principal cause, with bad chipsets/drivers second.)
07:34 < DavidBakin> (also agree that the Samsung Tn series external USB are absolutely great - reliable as anything and can be powered from the usual connector (doesn't need hub). Though TBH, my RaspiBlitz setup only worked reliably with the T5 - the newer T7 draws a bit more power ...)
07:52 < sipa> actual USB SSDs is probably a lot better than spinning disks in enclosures, or cheap usb sticks
07:52 < sipa> *are
07:53 < DavidBakin> oh ... well ... the sticks ... didn't even occur to me we were talking about _those_ since we were talking about high duty cycle and reliability ...
07:54 < DavidBakin> but it's true that for portable devices SSD is far far ahead of HDD in reliability ... also true IMO for non-portable machines but the reliability of HDDs is so high there, given proper enclosures and cooling, that in practice I don't think there's that much of a difference
07:56 < sipa> i wasn't solely talking about usb sticks; mostly about spinning disks actually
07:56 < sipa> i think my information may date from a time when SSD USB drives weren't very common
07:58 < DavidBakin> now ... on a related topic ... I would have thought there were reliability problems due to leveldb - at least, if you read various articles on the web (remembering they might be written with various biases and also that they might be out of date due to subsequent progress) - the main reason for the rocksdb fork was leveldb UNreliability ... true/false?
07:59 < sipa> i've never been able to corrupt a leveldb database on my own hardware
07:59 < sipa> including with power failures etc
07:59 < DavidBakin> ok good to know! and a good reminder to be careful interpreting what I read ...
08:00 < sipa> i thought rocksdb was mainly faster and better for multi-threaded applications etc
08:01 < sipa> bitcoin's usable is fairly unusual, in that its records are typically written once, read once, deleted once
08:01 < sipa> *use case
08:01 < DavidBakin> yes - and though I'm sure that's taken advantage of in the current implementation I'm wondering if there are other advantages of that yet to be taken - I'm actively looking at the code with that in mind (it's an interest of mine)
08:02 < sipa> it'd be interesting to try swapping it out and seeing if you get a performance difference
08:02 < sipa> just changing src/dbwrapper.cpp is probably enough (+ build system changes)
08:03 < sipa> though, rocksdb is also more complex, which raises questions about attack surface etc
08:03 < sipa> not that that's likely, but in the context of bitcoin we do need to be extra careful - an actual bug in the database layer could mean a consensus fork
08:03 < sipa> (see the march 2013 bdb/leveldb event)
08:11 < DavidBakin> yes i remember the 2013 issue ... (though IMO from my reading that was about process too: recommending a change in consensus only a month before pushing a release with a completely rewritten layer - happened to be the database layer but could have been a different component/layer - but that is (again) based on reading, correct me PLEASE if I'm wrong)
08:12 < sipa> 2013 was a different time :)
08:12 < sipa> which consensus change are you talking about though?
08:13 < DavidBakin> that was my interpretation of changing to allow more transactions/block? or something like that ... it was the combination that made the problem right - too many transactions caused too many blocks to be locked but that was something that had just been told to the miners they could up ... I definitely remember reading that ...
08:13 < sipa> there was no such consensus change, only a policy
08:14 < sipa> i don't think anyone could have expected an interaction between those two
08:14 < sipa> it wasn't just more transactions that caused the issue
08:14 < DavidBakin> oh, ok, guess i'm not clear on the difference between consensus and policy - but the effect was that the problem _could_ have been caught if the policy change had had time to be field tested (miners would have complained that blocks they thought would be valid were causing problems)
08:15 < sipa> i don't think so, actually
08:15 < sipa> the problem wasn't just the number of transactions - though it compounded the effect
08:15 < sipa> the real issue was transactions spending an unusually high number of inputs compared to the number of outputs
08:16 < sipa> as the total number of transaction inputs correlated with the number of database pages that needed to be locked in bdb
08:16 < sipa> (in the whole block)
08:16 < sipa> and that number of locks was set based on a conservative overestimate based on what was actually seen in production, a long time before
08:16 < DavidBakin> ok, obviously i defer to your knowledge - you're suggesting that at that time - coincidentally - there were transactions getting more inputs due to the way bitcoin usage was evolving ... and it didn't just trigger the lock bug but also happened to coincide with the new release?
08:17 < sipa> not due to evolving
08:17 < sipa> it was due to tons of dust being cleaned up by miners or something like that
08:17 < sipa> or maybe it was all the tiny "correct horse battery staple" outputs being spent simultaneously
08:17 < DavidBakin> oh! i thought it had to do with exchanges beginning to group transactions - i forget what that's called ...
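
The lock-limit point sipa makes above (lock demand follows the total number of inputs in a block, not the number of transactions) can be sketched as a toy model. Everything here is hypothetical: the `Tx` type, the one-lock-per-input assumption, and the `LOCK_LIMIT` value are illustrative and are not taken from Bitcoin Core or BDB.

```python
from dataclasses import dataclass


@dataclass
class Tx:
    """Hypothetical transaction summary: only input/output counts matter here."""
    n_inputs: int
    n_outputs: int


def locks_needed(block):
    # Toy assumption: one database lock per spent input, summed over the whole block.
    return sum(tx.n_inputs for tx in block)


def fits_lock_budget(block, lock_limit):
    # True if the block can be processed without exhausting the fixed lock budget.
    return locks_needed(block) <= lock_limit


# Hypothetical limit, standing in for a value tuned long ago on typical traffic.
LOCK_LIMIT = 10_000

# Many ordinary transactions: 500 txs, 2 inputs each -> 1,000 locks.
typical_block = [Tx(n_inputs=2, n_outputs=2) for _ in range(500)]

# Fewer transactions, but each sweeps many small outputs: 200 txs, 60 inputs each -> 12,000 locks.
sweep_block = [Tx(n_inputs=60, n_outputs=1) for _ in range(200)]

print(locks_needed(typical_block), fits_lock_budget(typical_block, LOCK_LIMIT))
print(locks_needed(sweep_block), fits_lock_budget(sweep_block, LOCK_LIMIT))
```

In this model the block with fewer transactions is the one that exceeds the budget, which matches the remark that transaction count alone was not the cause.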
08:17 < sipa> it was a very one-time thing
08:18 < sipa> it took almost a year before the old code saw the same issue again
08:18 < sipa> (long after the network was upgraded to the leveldb code which didn't have a limit)
08:18 < DavidBakin> well! if it was a one time thing why are we talking about it! you brought up the problem of a DB bug causing an inadvertent fork but if it was a one-time thing! (just kidding)
08:19 < sipa> really, i'd say the root cause was a bug in the code - it treated failure to acquire a database lock as an error, which resulted in the block being marked as permanently invalid
08:19 < sipa> without that, we'd have had a DoS attack, but not a fork
08:19 < DavidBakin> (because "one time thing" is definitely "famous last words" to anyone who's ever dismissed a bug report/ticket due to "transient error" ... as I'm sure you know!)
08:20 < sipa> of course, i'm definitely not excluding that something vaguely similar could occur again (which is why i caution against just switching to another database layer)
08:21 < DavidBakin> hmm, i'll have to think about it but generally the correction to a failure to acquire a lock is to retry - but in this case we're talking about the single process was in fact _holding_ all the locks and would _continue_ to hold the locks therefore retry would be useless - and so why would anyone have coded it that way? treating it as an error seems correct, no?
08:21 < sipa> it should have been treated as a system error, and do a graceful shutdown or so
08:21 < sipa> not mark the block as invalid
08:21 < DavidBakin> ah! a different kind of error, yes I see that
08:21 < sipa> it's like a file system error
08:22 < sipa> but still, it made it clear how the database layer could in subtle ways influence consensus
08:22 < sipa> if somehow the database would write/read some record incorrectly, and it'd do so in a vaguely deterministic way, we could actually have a problem
08:23 < sipa> again, that's highly unlikely, but it's the reason why the database code is subtreed into the source code, and it's not using system leveldb
08:23 < sipa> so that at least every node uses the same one (and we can review code changes before updating)
08:23 < DavidBakin> hmm, and yet, the database usage here is so simple - doesn't even need ACID since, as you pointed out, it's mostly write once (and the parts that aren't - UTXO - probably could be ...)
08:24 < sipa> leveldb is also entirely single-process
08:24 < DavidBakin> only a single writer ...
08:24 < sipa> and writing is restricted to a single thread at a time
08:24 < sipa> indeed
08:24 < DavidBakin> yep
08:29 < DavidBakin> here's something I'd like to know about bitcoin core development: has any interest ever been expressed in having sufficient modularization so that you could do A/B comparison of different implementations of modules - e.g., have the database layer not just pluggable but also able to call _two_ different implementations of the module with the same calls and verify you get the same results? I've used this before
08:29 < DavidBakin> on major $$-producing production systems as both a way to _evolve_ it (compare new vs old) and also as a way to test for agreement to the spec (in the same way that avionics will implement the same function with two different approaches)
08:29 < DavidBakin> if it has been talked about what was the result?
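
The comparison technique DavidBakin describes (send the same calls to two implementations and check they agree) might look roughly like the sketch below. It is only an illustration: `DictStore`, `ListStore`, `ComparingStore`, and `Divergence` are hypothetical names, and the put/get/delete interface is an assumed stand-in, not Bitcoin Core's dbwrapper API.

```python
class DictStore:
    """Hypothetical primary backend: a plain dict."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


class ListStore:
    """Hypothetical second backend with a different internal layout (list of pairs)."""

    def __init__(self):
        self._pairs = []

    def put(self, key, value):
        self.delete(key)  # keep at most one entry per key
        self._pairs.append((key, value))

    def get(self, key):
        for k, v in self._pairs:
            if k == key:
                return v
        return None

    def delete(self, key):
        self._pairs = [(k, v) for k, v in self._pairs if k != key]


class Divergence(Exception):
    """Raised when the two backends disagree on a read."""


class ComparingStore:
    """Forwards every call to both backends and checks that reads agree."""

    def __init__(self, primary, shadow):
        self.primary = primary
        self.shadow = shadow

    def put(self, key, value):
        self.primary.put(key, value)
        self.shadow.put(key, value)

    def delete(self, key):
        self.primary.delete(key)
        self.shadow.delete(key)

    def get(self, key):
        a = self.primary.get(key)
        b = self.shadow.get(key)
        if a != b:
            raise Divergence(f"get({key!r}): primary={a!r} shadow={b!r}")
        return a


store = ComparingStore(DictStore(), ListStore())
store.put(b"utxo:1", b"coin-a")
store.put(b"utxo:2", b"coin-b")
store.delete(b"utxo:1")
result_missing = store.get(b"utxo:1")  # both backends should report the key as absent
result_present = store.get(b"utxo:2")  # both backends should return the stored value
```

A wrapper like this can only confirm that the two backends agree with each other; it says nothing about whether either one is correct.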
08:37 < sipa> "is there interest in sufficient modularization": more modularization is always welcome of course, for a variety of reasons, but refactoring code is often risky in itself
08:38 < sipa> and it's important to realize, for almost everything in consensus-related aspect: consistency is more important than correctness
08:42 < DavidBakin> yes I do get that. I was talking about that that technique could be used to _confirm_ consistency (in fact, that's _all_ it can do)
08:43 < sipa> so we do have functional tests that compare behavior between old and new nodes
08:43 < sipa> as in: literally run an old bitcoind and a new one, and subject them to certain P2P/RPC calls, and see if they behave identically
08:44 < DavidBakin> yes, that's the traditional way to do it! just curious because, as I said, I've actually used the technique when migrating important legacy systems to newer technology. thanks!
08:44 < DavidBakin> (in production!)
08:45 < DavidBakin> (where the issue is that things happen in production in different orders, different timings w.r.t. each other, different relationships to other things happening at the same time ... you see more than functional tests - but I'm not trying to beat a drum here ... just asking)
10:17 -!- Talkless [~Talkless@mail.dargis.net] has joined #bitcoin-core-pr-reviews
12:07 -!- lsilva_ [sid489830@id-489830.helmsley.irccloud.com] has joined #bitcoin-core-pr-reviews
12:11 -!- Talkless [~Talkless@mail.dargis.net] has quit [Quit: Konversation terminated!]
12:39 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te55mxt1kkm1n0k.ipv6.telus.net] has quit [Remote host closed the connection]
12:41 -!- Kaizen_Kintsugi [~Kaizen_Ki@d137-186-173-66.abhsia.telus.net] has joined #bitcoin-core-pr-reviews
12:46 -!- gene [~gene@gateway/tor-sasl/gene] has joined #bitcoin-core-pr-reviews
13:00 -!- Kaizen_Kintsugi [~Kaizen_Ki@d137-186-173-66.abhsia.telus.net] has quit [Ping timeout: 260 seconds]
13:22 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4d87wto2h7yqi.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
13:50 -!- nickbar [~nickbar@host-78-146-218-200.as13285.net] has joined #bitcoin-core-pr-reviews
13:56 -!- nickbar [~nickbar@host-78-146-218-200.as13285.net] has quit [Quit: Connection closed]
13:57 -!- ghost43 [~ghost43@gateway/tor-sasl/ghost43] has quit [Remote host closed the connection]
13:58 -!- ghost43 [~ghost43@gateway/tor-sasl/ghost43] has joined #bitcoin-core-pr-reviews
14:28 -!- gene [~gene@gateway/tor-sasl/gene] has quit [Ping timeout: 276 seconds]
14:29 -!- gene [~gene@gateway/tor-sasl/gene] has joined #bitcoin-core-pr-reviews
14:30 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4d87wto2h7yqi.ipv6.telus.net] has quit [Remote host closed the connection]
14:33 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4d87wto2h7yqi.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
14:38 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4d87wto2h7yqi.ipv6.telus.net] has quit [Ping timeout: 252 seconds]
14:49 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4d87wto2h7yqi.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
14:55 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4d87wto2h7yqi.ipv6.telus.net] has quit [Ping timeout: 258 seconds]
15:03 -!- luke-jr [~luke-jr@user/luke-jr] has quit [Quit: ZNC - http://znc.sourceforge.net]
15:05 -!- luke-jr [~luke-jr@user/luke-jr] has joined #bitcoin-core-pr-reviews
15:09 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4d87wto2h7yqi.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
15:38 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4d87wto2h7yqi.ipv6.telus.net] has quit [Remote host closed the connection]
15:38 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4d87wto2h7yqi.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
16:41 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4d87wto2h7yqi.ipv6.telus.net] has quit [Remote host closed the connection]
16:50 -!- jamesob [uid180710@id-180710.helmsley.irccloud.com] has quit [Quit: Connection closed for inactivity]
16:57 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
17:03 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has quit [Ping timeout: 258 seconds]
17:04 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
17:09 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has quit [Ping timeout: 264 seconds]
17:38 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
17:43 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has quit [Ping timeout: 258 seconds]
18:31 -!- Kaizen_Kintsugi [~Kaizen_Ki@d137-186-173-66.abhsia.telus.net] has joined #bitcoin-core-pr-reviews
18:35 -!- Kaizen_Kintsugi [~Kaizen_Ki@d137-186-173-66.abhsia.telus.net] has quit [Ping timeout: 260 seconds]
19:17 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
19:19 -!- yonson [~yonson@2600:8801:d900:7bb:1e69:7aff:fea2:4e85] has quit [Remote host closed the connection]
19:19 -!- yonson [~yonson@2600:8801:d900:7bb:1e69:7aff:fea2:4e85] has joined #bitcoin-core-pr-reviews
19:22 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has quit [Ping timeout: 258 seconds]
19:22 -!- gene [~gene@gateway/tor-sasl/gene] has quit [Quit: gene]
19:26 -!- pg2 [sid518209@id-518209.hampstead.irccloud.com] has quit [Ping timeout: 260 seconds]
19:27 -!- lsilva_ [sid489830@id-489830.helmsley.irccloud.com] has quit [Ping timeout: 260 seconds]
19:27 -!- schmidty [sid297174@id-297174.lymington.irccloud.com] has quit [Ping timeout: 260 seconds]
19:27 -!- stick [sid403625@user/prusnak] has quit [Ping timeout: 258 seconds]
19:28 -!- schmidty [sid297174@lymington.irccloud.com] has joined #bitcoin-core-pr-reviews
19:28 -!- stick [sid403625@user/prusnak] has joined #bitcoin-core-pr-reviews
19:29 -!- lsilva_ [sid489830@helmsley.irccloud.com] has joined #bitcoin-core-pr-reviews
19:43 -!- pg2 [sid518209@hampstead.irccloud.com] has joined #bitcoin-core-pr-reviews
19:50 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
19:54 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has quit [Ping timeout: 252 seconds]
20:40 -!- grettke [~grettke@cpe-65-29-228-30.wi.res.rr.com] has joined #bitcoin-core-pr-reviews
20:44 -!- grettke [~grettke@cpe-65-29-228-30.wi.res.rr.com] has quit [Client Quit]
20:44 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
20:49 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has quit [Ping timeout: 264 seconds]
20:51 -!- grettke [~grettke@cpe-65-29-228-30.wi.res.rr.com] has joined #bitcoin-core-pr-reviews
20:52 -!- grettke [~grettke@cpe-65-29-228-30.wi.res.rr.com] has quit [Client Quit]
21:20 -!- belcher [~belcher@user/belcher] has quit [Ping timeout: 258 seconds]
21:24 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
21:31 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has quit [Ping timeout: 252 seconds]
21:33 -!- belcher [~belcher@user/belcher] has joined #bitcoin-core-pr-reviews
22:09 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
22:14 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has quit [Ping timeout: 258 seconds]
22:49 -!- commmon [~common@096-033-221-075.res.spectrum.com] has quit [Read error: Connection reset by peer]
22:49 -!- commmon [~common@096-033-221-075.res.spectrum.com] has joined #bitcoin-core-pr-reviews
23:00 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
23:05 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has quit [Ping timeout: 252 seconds]
23:51 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has joined #bitcoin-core-pr-reviews
23:55 -!- Kaizen_Kintsugi [~Kaizen_Ki@node-1w7jr9yi65te4021dky9o7qud.ipv6.telus.net] has quit [Ping timeout: 252 seconds]
--- Log closed Fri Oct 22 00:00:10 2021