I had these following ideas as I was thinking about how to allow larger blocks without incentivizing the creation of excessively large blocks. I don't know how much of this is novel, I'd appreciate if anybody could link to any relevant prior art. I'm making no claims on this, anything novel in here is directly released into public domain. 

In short, I'm trying to rely on simple but direct incentives to improve the behavior of Bitcoin. 

Feedback requested. Some simulations requested, see below if you're willing to help. Any questions are welcome. 

---

Expedience fees. Softfork compatible. 

You want to really make sure your transaction gets processed quickly? Transactions could have a second fee type, a specially labeled anyone-can-spend output with an op_return value defining a "best-before" block number and some function describing the decline of the fee value for every future block, such that before block N the miners can claim the full expedience fee + the standard fee (if any), between block N+1 and N+X the miner can claim a reduced expedience fee + standard fee, afterwards only the standard fee. 

When a transaction is processed late such that not the full expedience fee can be claimed, the remainder of the expedience fee output is returned to the specified address among the inputs/outputs (could be something like in#3 for the address used by the 3rd UTXO input). This would have to be done for all remaining expedience fees within the last transaction in the block, inserted there by the miner. 

These additional UTXO:s does increase overhead somewhat, but hopefully not by too much. If we're going to modify the transaction syntax eventually, then we could take the chance to design for this to reduce overhead. 

My current best idea for how to handle returned expedience fees in multiuser transactions (coinjoin, etc) is to donate it to an agreed upon address. For recurring donation addresses (the fee pool included!), this reduces the number of return UTXO:s in the fee processing transaction. 

The default client policy may be to split the entire fee across an expedience fee and a fee pool donation, where the donation part becomes larger the later the transaction gets processed. This is expected to slow down the average inclusion speed of already delayed transactions, but they remain profitable to include. 

The dynamics here is simple, a miner is incentivized to process a transaction with an expedience fee before a standard fee of the same value-per-bit in order to not reduce the total value of the available fees of all standing transactions they can process. The longer they wait, the less total fees available. 

Sidenote: a steady stream of expedience fees reduces the profitability of block withholding attacks (!), at some threshold it should make it entirely unprofitable vs standard mining. This is due to the increased risk of losing valuable expedience fees added after you finished your first block (as the available value will be reduced in your block #2, vs what other miners can claim while still mining on that previous block). 
(Can somebody verify this with simulations?) 

---

Fee pool. Softfork compatible. 

We want to smooth out fee payments too for the future when the subsidy drops, to prevent deliberate forking to steal fees. We can introduce a designated P2SH anyone-can-spend fee pool address. The miner can never claim the full fees from his block or claim the full amount in the pool, only some percentage of both. The remainder goes back into the pool (this might be done at the end of the same expedience fee processing transaction described above). Anybody can deliberately pay to the pool. 

The fee pool is intended to act as a "buffer" such that it remains profitable to not try to steal fees but to just mine normally, even during relatively extreme fee value variance (consider the end of a big international shopping weekend). 

The fee value claimed by the miners between blocks is allowed to vary, but we want to avoid order-of-magnitude size variation (10x). We do however want the effect of expedience fees to have an impact. Perhaps some logarithmic function can smooth it out? Forcing larger fees to be distributed over longer time periods? 

---

Block size dependent difficulty scaling. Hardfork required. 

Larger blocks means greater difficulty - but it doesn't scale linearly, rather a little less than linearly. That means miners can take a penalty in difficulty to claim a greater number of high fee transactions in the same amount of time (effectively increasing "block size bitrate"), increasing their profits. When such profitable fees aren't available, they have to reduce block size.

In other words, the users literally pay miners to increase block size (or don't pay, which reduces it). 

(Sidenote: I am in favor of combining this with the idea of a 32 MB max blocksize (or larger), with softforked scheduled lower size caps (perhaps starting at 4 MB max) that grows according to a schedule. This reduces the risk of rapidly increasing load before we have functional second layer scaling in place.) 

In order for a miner to profit from adding additional transactions, their fees must exceed the calculated cost of the resulting difficulty penalty to make it worth it to create a larger block. Such loads are expected during international shopping weekends. 
With only a few available high value transactions the incentive becomes the reverse, to create a smaller block with lower difficulty to faster claim those fees.

To keep the average 10 minute block rate and to let this mechanism shift the "block size bitrate" as according to the fee justified block size changes, we set an Expected blocksize value that changes over time, and we change the difficulty target into the Standard difficulty target, where each block must reach a Scaled difficulty target . 

In terms of math we do something like this:
Scaled difficulty = Standard difficulty * f(blocksize), where f would likely be some logarithmic function, and blocksize is defined in terms of units of Expected blocksize (a block 1.5x larger than Expected blocksize gets a value of 1.5). 

When we retarget the Standard difficulty and Expected blocksize we do this: 
Standard difficulty = Network hashrate per 10 minutes (approximately same as before, but now we take the Scaled difficulty of the last period's previous blocks into consideration)
Standard blocksize = Recent average effective block bitrate = (sum of recent (weighted!) block sizes / length of timeperiod) / number of blocks in a retargeting period. 

Thus, generating larger blocks drives up the long term standard block bitrate, smaller blocks reduces it, in both cases we strive to average 1 block per 10 minutes. 

Combining this with expedience fees makes it even more effective;

There's always a cutoff for where a miner stops including unprocessed transactions and let the rest remain for the next block. For standard fees, this would result in a fairly static block size and transactions backlog. 
With expedience fees your transaction can bypass standard fees with same value-per-bit, as explained above, because otherwise the miners reduces the value of their future expected fees. The more people that do this, the greater incentive to not delay transactions and instead increase the blocksize. (Can somebody help with the math here? I want simulations of this.) 

(Sidenote: I'm in favor of RBF, replace-by-fee. This makes the above work much more smoothly. Anybody relying on the security of unconfirmed transactions for any significant value *have to* rely on some kind of incentive protected multisignature transaction, including LN type second layer schemes. The other option is just not secure.) 

If load is low then you can add a high expedience fee to incentivize the creation of a smaller block with your transaction, since difficulty will be reduced for the smaller block. This means the miner has a higher chance of beating the competition. Adding additional lower fee transactions may reduce his average value-per-bit to become less profitable. 

Miners simply aim to maximize their fees-per-bit, while also paying as little as possible in mining costs. 

To make this work as intended for those willing to explicitly pay to reduce block size, one could tag such an expedience fee with a maximum allowed blocksize (where the fee will be claimed in such a smaller block if it is the more profitable option), such that it won't be countered by others making more high expedience fees to increase blocksize. Note: I'm not particularly in favor of this idea, just mentioning the possibility.