Because we web developers still suck at reliably generating correct HTML. A lot of us learned, back when we were working with interpreted languages on much slower computers, that generating HTML the sensible way, by calling functions to generate one tag at a time, was slow. So we've been doing string concatenation with sloppy escaping instead, for like 25+ years. I'd venture to say it's still particularly a problem in PHP.
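To make the contrast concrete, here's a rough Python sketch of the two approaches (Python only because the standard library makes it short; the function names and the comment-rendering scenario are made up for illustration, not taken from any real codebase). The first function glues strings together and escapes nothing; the second builds the tags as objects and lets the serializer handle the escaping.

```python
import xml.etree.ElementTree as ET


def render_comment_concat(author, text):
    # String concatenation: nothing is escaped, so markup in the
    # input ends up in the page as live HTML.
    return "<p><b>" + author + "</b>: " + text + "</p>"


def render_comment_tree(author, text):
    # Build one element at a time; the serializer escapes &, < and >
    # in text content when it writes the tree out.
    p = ET.Element("p")
    b = ET.SubElement(p, "b")
    b.text = author
    b.tail = ": " + text
    return ET.tostring(p, encoding="unicode", method="html")


if __name__ == "__main__":
    print(render_comment_concat("A&B", "1 < 2"))
    print(render_comment_tree("A&B", "1 < 2"))
```

On any modern machine the cost of the second version is negligible for typical pages, which is the point: the old performance excuse for concatenation mostly doesn't hold anymore.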
Or, in the case you're talking about, it's possible that the content was escaped for HTML on its way into the database, on the assumption that it would be going into a web page, and then used in a context where that pre-escaping wasn't needed. That at least used to be a common pattern in PHP, "sanitizing" data at input time rather than escaping it at output time for the specific output format.
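Here's a small sketch of how that pattern produces the stray entities, again in Python with hypothetical function names (the "database" is just a variable). Escaping at input time bakes HTML entities into the stored value, so any non-HTML consumer sees them literally, and an HTML template that escapes again double-escapes them.

```python
import html


def sanitize_on_input(raw):
    # The old pattern: escape for HTML before storing.
    return html.escape(raw)


def render_html(value):
    # Escape at output time, for the HTML context specifically.
    return "<p>" + html.escape(value) + "</p>"


def render_plain_text(value):
    # Plain text (email body, notification, etc.) needs no HTML escaping.
    return "Subject: " + value


raw = "Fish & Chips"
stored_escaped = sanitize_on_input(raw)  # entities are now in the stored data

if __name__ == "__main__":
    print(render_plain_text(stored_escaped))  # entity shows up literally
    print(render_html(stored_escaped))        # entity gets escaped a second time
    print(render_plain_text(raw))             # storing raw and escaping late works
    print(render_html(raw))
```

Storing the raw value and escaping per output format is what avoids both failure modes.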
@cachondo Yeah, that's usually when something doesn't understand the character and generates this crap. It's like the %20 that you sometimes see in web addresses.
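For anyone curious where the %20 comes from, here's a quick Python illustration using the standard library (the file name is just an example). A space isn't allowed in a URL as-is, so it gets percent-encoded; you only see the %20 when something displays the encoded form without decoding it.

```python
from urllib.parse import quote, unquote

encoded = quote("my file.txt")  # the space becomes %20
decoded = unquote(encoded)      # decoding restores the original

if __name__ == "__main__":
    print(encoded)
    print(decoded)
```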
Matt Campbell
Deciphering Glyph :: Data In, Garbage Out
blog.glyph.im