Reposting something that came up recently in a private discussion with some
academics:

Concretely, let's define a prunable MMR with the following grammar. This
definition is an improvement on whats in the python-proofmarshal by committing
to the number of items in the tree implicitly; an obvious max-log2(n)-sized
proof-of-tree-size can be obtained by following the right-most nodes:

    Maybe(T) := UNPRUNED <T> | PRUNED <Commitment(T)>

    FullNode(0) := <Value>
    FullNode(n) := <Maybe(FullNode(n-1)> <Maybe(FullNode(n-1))>

    PartialNode(0) := SOME <FullNode(0)> | NONE
    PartialNode(n) := <Maybe(FullNode(n-1))> <Maybe(PartialNode(n-1))>

    MMR := FULL <N> <FullNode(n)> | PARTIAL <N> <PartialNode(n)>

Basically we define it in four parts. First we define Maybe(T) to represent
pruned and unpruned (hash only) data. Secondly we define full nodes within 2^n
sized trees. Third we define partial nodes. And finally we define the MMR
itself as being either a full or partial node.

First of all, with pruning we can define a rule that if any operation (other
than checking commitment hashes) attempts to access pruned data, it should
immediately fail. In particular, no operation should be able to determine if
data is or isn't pruned. Equally, note how an implementation can keep track of
what data was accessed during any given operation, and prune the rest, which
means a proof is just the parts of the data structure accessed during one or
more operations.

With that, notice how proving the soundness of the proofs becomes trivial: if
validation is deterministic, it is obviously impossible to construct two
different proofs that prove contradictory statements, because a proof is simply
part of the data structure itself. Contradiction would imply that the two
proofs are different, but that's easily rejected by simply checking the hash of
the data.

-- 
https://petertodd.org 'peter'[:-1]@petertodd.org