public inbox for bitcoindev@googlegroups.com
* [Bitcoin-development] On bitcoin testing
@ 2012-10-09 23:12 Jeff Garzik
  2012-10-09 23:42 ` Arklan Uth Oslin
  2012-10-10  0:03 ` Gregory Maxwell
  0 siblings, 2 replies; 3+ messages in thread
From: Jeff Garzik @ 2012-10-09 23:12 UTC (permalink / raw)
  To: Bitcoin Development

Copying from a response posted to "Bitcoin software testing effort"
https://bitcointalk.org/index.php?topic=117487.0 as it is relevant to
a recent thread here...

Any level of testing is useful and appreciated.  Various types of
testing that are helpful:

* "it works" testing:  Simply run the latest Release Candidate (or
latest version, if released).  Make sure all the basics work (for
whatever definition of "basics" you desire).  This is the level most
accessible to casual users.
* Major features testing:  Develop a short checklist of must-work
features, and organize volunteers to work together and go through that
checklist, item by item.  Test each major feature on each major
platform.
* Stress and fuzz testing:  Attempt to "stress" the system somehow, or
randomly corrupt bits of data.  See what breaks.
* Regression testing:  Record bugs fixed, and develop automated test
cases that successfully reproduce the bugs on older versions, and
verify newer versions remain fixed.
* Unit function testing:  Rigorously exercise each C++ class to ensure
it behaves as expected at a micro level.
* Full peer automated testing:  Automated testing of RPC and P2P
functions is non-existent, because of the difficulty in doing so.
Find a solution to this problem.
* Data-driven tests: If possible, write software-neutral, data-driven
tests.  This enables clients other than the reference one (Satoshi
client) to be tested.  Embed tests in testnet3 chain, if possible.
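
To make the "randomly corrupt bits of data" idea concrete, here is a
minimal sketch in Python.  The record format (2-byte length, payload,
1-byte checksum) and every function name are hypothetical and chosen
only for illustration; none of this is the Bitcoin wire format or code
from the reference client.  The point is the shape of the test: mutate
a known-good input, and insist that the parser either accepts it or
rejects it through its documented error path, never in any other way.

```python
import random
import struct


class ParseError(Exception):
    """Raised for malformed input: the expected, documented rejection path."""


def encode_record(payload: bytes) -> bytes:
    # Hypothetical toy format: 2-byte big-endian length, payload, 1-byte checksum.
    checksum = sum(payload) % 256
    return struct.pack(">H", len(payload)) + payload + bytes([checksum])


def parse_record(data: bytes) -> bytes:
    if len(data) < 3:
        raise ParseError("too short")
    (length,) = struct.unpack(">H", data[:2])
    if len(data) != 2 + length + 1:
        raise ParseError("length mismatch")
    payload = data[2:2 + length]
    if sum(payload) % 256 != data[-1]:
        raise ParseError("bad checksum")
    return payload


def flip_random_bit(data: bytes, rng: random.Random) -> bytes:
    # Corrupt exactly one randomly chosen bit of the input.
    corrupted = bytearray(data)
    index = rng.randrange(len(corrupted))
    corrupted[index] ^= 1 << rng.randrange(8)
    return bytes(corrupted)


def fuzz(iterations: int, seed: int = 0) -> dict:
    # A fixed seed keeps any failure reproducible.
    rng = random.Random(seed)
    good = encode_record(b"hello bitcoin")
    counts = {"accepted": 0, "rejected": 0}
    for _ in range(iterations):
        mutated = flip_random_bit(good, rng)
        try:
            parse_record(mutated)
            counts["accepted"] += 1
        except ParseError:
            counts["rejected"] += 1
        # Any other exception type propagates and fails the run:
        # that is the bug the fuzzer is looking for.
    return counts
```

Calling fuzz(1000) returns the accept/reject tallies.  In this toy
format a single flipped bit always changes either the length or the
checksum relationship, so every mutation should land in "rejected".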


The community at large can be a big help simply by doing the first
item:  download and run the Release Candidates and the latest version,
and report any problems.  Even reporting success is fine by me, for
example: "Version 0.7.1 works for me on Windows 7/32-bit" posted on a
forum thread.

It is always very difficult to organize any sort of testing regime
with open source volunteers that come and go.  Each volunteer chooses
their level of involvement.  Any amount of testing and test-case
writing, large or small, is helpful to bitcoin.

-- 
Jeff Garzik
exMULTI, Inc.
jgarzik@exmulti•com




* Re: [Bitcoin-development] On bitcoin testing
  2012-10-09 23:12 [Bitcoin-development] On bitcoin testing Jeff Garzik
@ 2012-10-09 23:42 ` Arklan Uth Oslin
  2012-10-10  0:03 ` Gregory Maxwell
  1 sibling, 0 replies; 3+ messages in thread
From: Arklan Uth Oslin @ 2012-10-09 23:42 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Bitcoin Development


thanks for the great reply jeff. i'm going to get a virtual machine set up
on my system later tonight so at the very least, i myself can start testing.

steve - haven't heard from you in almost a week. I'd still really like to
get a look at the test cases and such you set up.

Arklan

----------
As long as there is light, the darkness holds no fear. And yet, even in the
deepest black, there is life. - Arklan Uth Oslin

I want to leave this world the same way I came into it: backwards and on
fire. - Arklan Uth Oslin





* Re: [Bitcoin-development] On bitcoin testing
  2012-10-09 23:12 [Bitcoin-development] On bitcoin testing Jeff Garzik
  2012-10-09 23:42 ` Arklan Uth Oslin
@ 2012-10-10  0:03 ` Gregory Maxwell
  1 sibling, 0 replies; 3+ messages in thread
From: Gregory Maxwell @ 2012-10-10  0:03 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Bitcoin Development

On Tue, Oct 9, 2012 at 7:12 PM, Jeff Garzik <jgarzik@exmulti•com> wrote:
> * Data-driven tests: If possible, write software-neutral, data-driven
> tests.  This enables clients other than the reference one (Satoshi
> client) to be tested.  Embed tests in testnet3 chain, if possible.

The mention of testnet3 here reminds me to make a point.  Confirmation
bias is a common problem in software testing: people often over-test
the success cases and under-test the failure cases.  This is certainly
the case in Bitcoin.  For example, testnet3 plus the packaged tests
exercise all the branches inside the interior script evaluation engine
_except_ the rejection cases.

For us, failure cases can be harder to package up (e.g. they can't be
placed in testnet), but Matt's node-simulation-based tester provides a
good example of how to create a data-driven test set that tests both
failure cases and dynamic behavior (e.g. reorgs).

Testing of failure cases is absolutely critical for testing of
implementation compatibility: The existence of a difference in what
gets rejected in a widely deployed alternative node could result in an
utterly devastating network split.
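
One way to look for such differences is to feed the same inputs,
including ones that should be rejected, to two implementations and
flag every case where their accept/reject decisions differ.  A minimal
Python sketch follows.  Both validators and the "at most 8 bytes" rule
are hypothetical stand-ins, not real consensus rules or real client
code; the second validator carries a deliberate off-by-one bug so the
harness has something to find.

```python
from typing import Callable, Iterable, List

Validator = Callable[[bytes], bool]


def validate_reference(script: bytes) -> bool:
    # Hypothetical rule: accept non-empty inputs of at most 8 bytes.
    return 0 < len(script) <= 8


def validate_alternative(script: bytes) -> bool:
    # Deliberately buggy stand-in: an off-by-one also accepts 9 bytes.
    return 0 < len(script) <= 9


def find_disagreements(
    a: Validator, b: Validator, cases: Iterable[bytes]
) -> List[bytes]:
    # Return every input on which the two validators disagree.
    return [case for case in cases if a(case) != b(case)]


# Include inputs that must fail, not only inputs that must pass.
CASES = [b"", b"\x01", b"\x01" * 8, b"\x01" * 9, b"\x01" * 10]

if __name__ == "__main__":
    for case in find_disagreements(validate_reference, validate_alternative, CASES):
        print("implementations disagree on input of length", len(case))
```

A suite made only of valid inputs (lengths 1 through 8 here) would
report no disagreement at all; the difference only shows up in the
rejection cases.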

Generally every test of something which must succeed should be
matched by a test of something that must fail. Personally, I like to
test the boundary cases, e.g. if something has an allowed range of
[0-8], I'll test -1,0,8,9 at a minimum. Though reasoning trumps rules
of thumb.
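
The [0-8] example can be written down as a small table of values and
expected outcomes.  This is only a sketch in Python; the function and
table names are made up for illustration.

```python
from typing import Callable, List


def in_allowed_range(value: int, low: int = 0, high: int = 8) -> bool:
    # The check under test: inclusive range [low, high].
    return low <= value <= high


# Each boundary gets one value just inside and one just outside.
BOUNDARY_CASES = [
    (-1, False),  # just below the lower bound: must fail
    (0, True),    # lower bound: must succeed
    (8, True),    # upper bound: must succeed
    (9, False),   # just above the upper bound: must fail
]


def run_boundary_cases(check: Callable[[int], bool] = in_allowed_range) -> List[int]:
    # Return the values whose outcome differs from the expectation.
    return [value for value, expected in BOUNDARY_CASES if check(value) != expected]
```

run_boundary_cases() returns an empty list for the correct check.  An
implementation that mistakenly treats the upper bound as exclusive is
caught by the value 8, which a test using only mid-range values would
never exercise.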

Confirmation bias is another reason why it's important to have a more
diverse collection of testers than the core developers.  People who
work closely with the software have strong expectations of how the
software should work and are less likely to test crazy corner cases
because they "know" the outcome, sometimes erroneously.


To reinforce Jeff's list of different approaches: I've long found that
each mechanism of software testing has diminishing returns the more of
it you apply. So you're best off using as many different approaches a
little rather than spending all your resources going as deep as
possible with any one approach.

There are also some kinds of testing which are synergistic: Almost all
testing is enhanced enormously by combining it with valgrind because
it substantially lowers the threshold of issue detection
(e.g. detecting bogus memory accesses which are not _currently_ causing
a crash for you but could). If I could only test one of "with valgrind"
or "without" I'd test with every time.  Sadly valgrind doesn't exist
on Windows and it's rather slow. Dr. Memory
(http://code.google.com/p/drmemory/) may be an alternative on Windows,
and there is work to port ASAN to GCC so it may be possible to make
mingw ASAN builds before too long.

I've also found that any highly automatable testing (coded data-driven,
unit, and fuzz testing) combines well with diverse compilation,
e.g. building on as many system types and architectures as possible,
including production-irrelevant ones, in the hopes that some
system-specific quirk makes a bug easier to detect.


