As Tier says, the current network message limit is 2MB (reduced from 32MB in the... uhh, 0.10? release).
I think keeping the consensus rules distinct from limitations of the p2p network makes sense-- we are already seeing different protocols for announcing transactions and blocks (Matt's relay network is, essentially, a separate protocol). I could write a separate BIP describing the change to the p2p network protocol, but that feels like busy-work to me.
RE: setting the DoS size check farther than 2 hours into the future: the block, itself, will be rejected if it has a timestamp more than 2 hours in the future. That is already a consensus rule.
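To make that argument concrete, here is a rough Python sketch, not code from any implementation. It assumes the node compares the block timestamp against its own notion of the current time, and that the size schedule never decreases over time. The names `timestamp_acceptable`, `dos_size_limit` and `toy_schedule` are made up for illustration, and the toy schedule's numbers are not the proposal's.

```python
from typing import Callable

MAX_FUTURE_DRIFT = 2 * 60 * 60  # two hours, in seconds


def timestamp_acceptable(block_time: int, now: int) -> bool:
    """True if the block's timestamp is no more than two hours ahead of `now`."""
    return block_time <= now + MAX_FUTURE_DRIFT


def dos_size_limit(now: int, max_block_size_at: Callable[[int], int]) -> int:
    """Largest block size any currently-acceptable block could be allowed.

    Assumes `max_block_size_at` (hypothetical) is non-decreasing in time, so
    evaluating it at now + 2 hours bounds every acceptable timestamp.
    """
    return max_block_size_at(now + MAX_FUTURE_DRIFT)


def toy_schedule(t: int) -> int:
    # Illustrative only: the limit doubles at t = 1000. Not the real schedule.
    return 1_000_000 if t < 1000 else 2_000_000


now = 0
limit = dos_size_limit(now, toy_schedule)
```

Under those assumptions, any block whose timestamp passes `timestamp_acceptable` has a size limit no larger than `limit`, so there is nothing to gain from looking further ahead than two hours.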
RE: what happens if block timestamps are not in chronological order: Nothing.
The activation counting happens in block-height order, so the timestamp on the "activating" block is the only one that matters.
Code that looks for the activation condition must properly handle re-orgs around the activation block, of course.
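Roughly, the counting described above could look like the Python sketch below. It is a sketch under assumptions, not the actual implementation: the `Block` type, the `signals` flag, and the `window`, `threshold` and `earliest_time` parameters are all hypothetical, and I am assuming the candidate block counts toward its own window. Because the result is recomputed from whatever chain is passed in, a re-org is handled by simply running it again on the new best chain.

```python
from collections import deque
from typing import NamedTuple, Optional, Sequence


class Block(NamedTuple):
    timestamp: int  # header timestamp, seconds since the epoch
    signals: bool   # hypothetical flag: this block votes for the change


def find_activation_height(
    chain: Sequence[Block], window: int, threshold: int, earliest_time: int
) -> Optional[int]:
    """Return the height of the first activating block, or None.

    Blocks are visited in height order. Votes are counted over the last
    `window` blocks; the only timestamp consulted is the candidate's own.
    """
    recent = deque(maxlen=window)  # signal flags of the last `window` blocks
    votes = 0
    for height, block in enumerate(chain):
        if len(recent) == window:
            votes -= recent[0]  # the oldest flag is about to be evicted
        recent.append(block.signals)
        votes += block.signals
        if votes >= threshold and block.timestamp >= earliest_time:
            return height
    return None


# Timestamps deliberately out of chronological order.
chain = [Block(100, True), Block(50, True), Block(300, True),
         Block(20, False), Block(400, True)]
height = find_activation_height(chain, window=3, threshold=2, earliest_time=250)

# A re-org replaces the block at height 2 with one carrying an earlier timestamp.
reorged = chain[:2] + [Block(200, True)] + chain[3:]
height_after_reorg = find_activation_height(
    reorged, window=3, threshold=2, earliest_time=250)
```

In this toy chain the out-of-order timestamps at heights 0, 1 and 3 never enter the decision. After the re-org the activation point moves to a later block, which is why callers should not cache the answer across re-orgs.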
RE: testnet parameters: big blocks can be tested in -regtest mode with arbitrary timestamps in the past or future. Testing maximum-8MB-blocks mined "in the past" on testnet will just result in a testnet that is even more useless for ordinary testing of products or services being developed -- part of what makes testnet useful for things like testing transaction creation code is it syncs quickly.
That said, I have thought for a while now somebody should take a fresh look at the testnet, talk to people who might be customers for a reset testnet or testnets (we probably want separate testnets for people testing mining and people testing transaction creation, for example), and implement testnets designed to make it easy to test what people need testing.