You cannot consider the outcome resulting from replace-by-fee fraudulent, as it could be the world as observed by some.

Fraudulent in what sense?

If you mean the legal term, then you'd use the legal "beyond reasonable doubt" test. You mined, just once, a double spend that ~everyone thinks came 5 minutes after the original? OK, that could be a fluke. Reasonable doubt. You do it 500 times in a row? Probably not a fluke.
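
To put rough numbers on the "once vs. 500 times" intuition, here is a minimal sketch. It assumes each event is independent and that an honest miner has some fixed chance of genuinely having seen the late double spend first. The 5% figure and the function name are made up for illustration, not measured or taken from any real proposal.

```python
import math


def log10_prob_all_flukes(p_fluke: float, n_events: int) -> float:
    """Return log10 of the probability that n independent events were all flukes.

    Works in log space because p_fluke ** 500 underflows to 0.0 as a float.
    """
    if not 0.0 < p_fluke <= 1.0:
        raise ValueError("p_fluke must be in (0, 1]")
    if n_events < 0:
        raise ValueError("n_events must be non-negative")
    return n_events * math.log10(p_fluke)


# Hypothetical: chance that an honest miner really did see the
# "5 minutes later" double spend first on any single occasion.
P_FLUKE = 0.05

once = log10_prob_all_flukes(P_FLUKE, 1)        # a single occurrence
repeated = log10_prob_all_flukes(P_FLUKE, 500)  # 500 occurrences in a row

print(f"once:      about 10^{once:.1f}")
print(f"500 times: about 10^{repeated:.1f}")
```

Whatever per-event probability you pick below 1, the exponent grows linearly with the number of repeats, which is why a single occurrence leaves reasonable doubt and a long streak does not.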

If you mean a technical definition, then I think Tom Harding has been researching this topic, though I've only kept half an eye on it. I guess it would be some statistical approximation of the above, i.e. sufficient to ensure good incentives with only small false-positive losses. Sort of like how the block chain algorithm already works w.r.t. orphans.