As an interested party not intimately familiar with the bitcoin codebase
who also spent some time setting up a node a while ago, I would like to
add one thing to the above list - network rate limiting.

The problem is that this is easier said than done. Bitcoin Core won't notice a remote peer is working but slow and switch to a faster one, and even if it did, it'd just mean throttling your connection would cause all remote nodes to give up and hit the other unthrottled peers even more.

The best way to implement this is to do chain pruning, so your node will still try and shovel bytes as fast as possible, but it's limited by how many bytes it has to shovel. Remote nodes that are pulling down the block chain can then switch between nodes depending on what they have available in order to try and avoid hitting one node too hard. Nodes that were offline for a while and just catching up would prefer nodes that have less of the chain.

It'd be great if someone could experiment with this. The first step is extending the p2p protocol so addr broadcasts and version messages include how much of the chain (counting blocks from the head?) the peer is willing to serve, and then updating the downloading code so it tries to be smarter about peer selection. Unfortunately all this work is sort of backed up waiting for sipa to finish merging in headers-first downloading.