Tuesday May 16, 2006

Firefox Encodings

Has anybody else noticed Firefox mis-guessing the character encoding of web pages when it's set to "Auto-Detect, Universal"? I keep going to pages that I think I've visited without trouble before in Firefox and seeing lots of these: �. That's the Unicode REPLACEMENT CHARACTER, U+FFFD, which is "used as a substitute for an uninterpretable character from another encoding". I guess this means auto-detect is deciding that some pages are UTF-8, and then, wherever non-ASCII bytes occur in patterns that don't form valid UTF-8 sequences, they get replaced with �. This seems to have been happening to me a lot recently. Did they change something in a recent patch?
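
To show the effect I think I'm seeing, here is a small Python sketch. It is only an illustration of my guess, not Firefox's actual detection code: the function name `decode_as_utf8` and the sample word are made up for the example. It takes text stored as ISO-8859-1 bytes and decodes it as if it were UTF-8, substituting U+FFFD for anything that isn't a valid UTF-8 sequence.

```python
def decode_as_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, replacing invalid sequences with U+FFFD.

    Hypothetical helper for illustration; it mimics what a browser
    appears to do after wrongly deciding that a page is UTF-8.
    """
    return raw.decode("utf-8", errors="replace")


# "café" stored as ISO-8859-1: the é is the single byte 0xE9.
latin1_page = "café".encode("latin-1")

# 0xE9 looks like the start of a three-byte UTF-8 sequence, but no
# continuation bytes follow, so the decoder substitutes U+FFFD.
wrong_guess = decode_as_utf8(latin1_page)

# The same text stored as real UTF-8 decodes without any trouble.
right_guess = decode_as_utf8("café".encode("utf-8"))

print(wrong_guess)
print(right_guess)
```

The first line printed ends in the replacement character instead of é, and the second comes out intact, which matches what I'd expect if the page bytes are fine and only the encoding guess is wrong.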

