public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
From: Peter Todd <pete@petertodd•org>
To: Andy Parkins <andyparkins@gmail•com>
Cc: bitcoin-development@lists•sourceforge.net
Subject: Re: [Bitcoin-development] Code review
Date: Fri, 4 Oct 2013 07:35:17 -0400	[thread overview]
Message-ID: <20131004113517.GA8373@savin> (raw)
In-Reply-To: <3552695.aET6a1zFq8@momentum>

[-- Attachment #1: Type: text/plain, Size: 3789 bytes --]

On Fri, Oct 04, 2013 at 11:42:29AM +0100, Andy Parkins wrote:
> On Friday 04 October 2013 12:30:07 Mike Hearn wrote:
> > Git makes it easy to fork peoples work off and create long series of
> > commits that achieve some useful goal. That's great for many things.
> > Unfortunately, code review is not one of those things.
> > 
> > I'd like to make a small request - when submitting large, complex pieces of
> > work for review, please either submit it as one giant squashed change, or
> 
> Don't do this.  It throws away all of the good stuff that git lets you record.  
> There is more to a git branch than just the overall difference.  Every single 
> log message and diff is individually valuable.  It's easy to make a squashed 
> diff from many little commits; it's impossible to go the other way.
> 
> Command line for you so you don't have to think about it:
> 
>   git diff $(git merge-base master feature-branch) feature-branch 
> 
> git-merge-base finds the common ancestor between master and feature-branch, 
> and then compares feature-branch against that.

Git is a revision *communication* system that happens to also make for a
good revision *control* system.

Remember that every individual commit is two things: what source code
has changed, and a message explaining why you thought that change should
be made. Commits aren't valuable in of themselves, they're valuable
because they serve to explain to the other people you are working with
why you thought a change should be made. Sometimes it makes sense to
explain your changes in 10 commits, sometimes it makes sense to squash
them all up into one commit, but there's no hard and fast rule other
than "Put yourself in your fellow coders' shoes - what's the best way to
explain to them what you are trying to accomplish and why?" You may have
generated a lot of little commits in the process of creating your patch
that tell a story that no-one else cares about, or equally by squashing
everything into one big commit you wind up with a tonne of changes with
little explanation as to why they were made.

Two caveats apply however: git-bisect works best if every commit in the
tree you are trying to debug works well enough that you can run tests
without errors - that is you don't "break the build". Don't make commits
that don't compile at the very least, and preferably everything you do
should be refactored to the point where the commit as a whole "works".

The second caveat is more specific to Bitcoin: people tend to rebase
their pull-requests over and over again until they are accepted, but
that also means that code review done earlier doesn't apply to the later
code pushed. Bitcoin is a particularly high profile, and high profit,
target for people trying to get malicious code into the codebase. It may
be the case that we would be better off asking reviewers making small
changes to their pull-requests to add additional commits for those
changes rather than rebasing, to make it clear what changes were
actually made so that code reviewers don't have to review the whole
patch from scratch. After all, the place where the most eyes will
actually look at the commits is during the pull-req process; after the
code has been pulled the audience for those commits is in most cases
almost no-one.

Having said that, there's currently a lot of other holes in the review
and source code integrity process, so fixing this problem is probably
not the low-hanging fruit right now.


FWIW personally I tend to review patches by both looking at the
individual commits to try to understand why someone wanted to make a
change, as well as all commits merged into one diff for a "what actually
changed here?" review.

-- 
'peter'[:-1]@petertodd.org

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

  parent reply	other threads:[~2013-10-04 11:35 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-04 10:30 Mike Hearn
2013-10-04 10:42 ` Andy Parkins
2013-10-04 11:32   ` Mike Hearn
2013-10-04 12:34     ` Andy Parkins
2013-10-04 11:35   ` Peter Todd [this message]
2013-10-04 11:58     ` Arto Bendiken
2013-10-04 12:14       ` Peter Todd
2013-10-04 12:34     ` Andy Parkins
2013-10-04 11:53 ` Peter Todd
2013-10-04 12:14   ` Mike Hearn
2013-10-04 12:22     ` Eugen Leitl
2013-10-05  2:31 ` Gavin Andresen
2013-10-05  4:02   ` Warren Togami Jr.
2013-10-05 11:36   ` Mike Hearn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131004113517.GA8373@savin \
    --to=pete@petertodd$(echo .)org \
    --cc=andyparkins@gmail$(echo .)com \
    --cc=bitcoin-development@lists$(echo .)sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox