The total number of factors that make combat hard to scale is so high that I'm not sure I could list them all if I tried. But let me try to pare them down to the big ones.
Available PC resources - The closer you are to the end of a logistics period, the harder it is to know what resources the PCs have available. In monster camp, we often try to count life spells (roughly) and use that as a metric for other resources, but it is a weak metric at best. For any given encounter, you are pretty much guessing how many defenses are available, how much curing is available, how much offense is available, and even how many PCs will engage it (except in a module). I've seen a wave battle that appeared underscaled on paper cause multiple deaths because nobody realized that the two biggest PC teams in town had left site for breakfast.
Effects that scale poorly - Is shatter a powerful effect? That entirely depends on whether the PCs that face it are level 3 or level 15. For the former, shatter is close in power to a death spell. For the latter, disarm is generally more powerful. Similarly, how deadly is drain on a carrier attack? On 1 or 2 enemies, it's usually not a big deal. But on 4 or 5, it can suddenly be a TPK. And the tipping point is incredibly sudden (and hard to judge), especially if there is a mix of other enemies in the battle. To put it simply, there are a number of effects in the game that don't scale consistently. It takes experience and, honestly, a little guesswork to work with them.
NPC Fatigue - In my experience, the two deadliest moments in the game are shortly after game on and when PCs jump fence. The main reason is that those are the situations when your NPCs are freshest. Sure, PCs tire over the weekend, too, but not at the same rate. An influx of new NPCs who show up Saturday morning (because they couldn't get out to game Friday night) can easily tip the scales much more than you might expect.
Level Disparity - I am including this one last because I think it has the least effect on scaling difficulty of the things I listed (though it still has a notable effect). Level disparity primarily makes defensive effects like Threshold and <damage type> to hit a problem. It also changes the significance of Body values. For some types of enemies that have really low body (like plants), level disparity isn't that big a deal. It also doesn't affect enemies with carrier attacks much (a 1st level PC and a 50th level PC can both wear the same maximum armor value). But for something like a Colossal Juggernaut (huge health, 1/2 damage from weapons, moderate damage), level disparity is a major pain in the butt. Usually the only way to deal with level disparity is to include a range of enemies and give NPCs very good instructions (and that only helps so much).
The first two issues, in my opinion, are the biggest. In fact, I listed them as broad categories, but the truth is that they involve lots of very specific problems that combine in very ugly ways. For any scaling, the more you can reduce the number of variables involved, the easier it is to scale. But other than modules (where the PC count variable can be set in stone), there are simply a lot of variables that you can do little more than estimate.
-MS