Request #31148

From:
Account Type:
Seed Account
Dreamwidth:
Account Name: [personal profile] pseudomonas
Style: (S2) core: public, layout: public, theme: public, user: custom
Email confirmed? Yes
cluster: 10
data version: 10
scheme: tropo-purple
Media storage used: 6.373 MB (0.2%)
Support category:
Time posted:
Mon, 29 Jun 2015 08:00:17 GMT (527 weeks ago)
Status:
Open
Summary:
Encoding bug in FOAF XML - server response headers are declaring UTF8
Original Request:
https://cedara.dreamwidth.org/data/foaf has some encoding issues, I think. The server headers declare it to be UTF-8, but my browser and XML parser don't think it is. Any ideas?
Diagnostics: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0
[personal profile] pseudomonas
Comment (#81841)
Posted: Mon, 29 Jun 2015 08:11:49 GMT (527 weeks ago)
$ iconv -f latin1 -tutf8 cedara.xml | xmllint - | less

This seems to fix the problem (though if the ü is the only non-ASCII character in the file, that doesn't prove the data is actually latin1-encoded).
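
To check whether it really is just the one character, something like the following rough sketch could list every byte span in the downloaded file that fails UTF-8 decoding. The function name `find_invalid_utf8` and the sample bytes are made up for illustration; they are not taken from the actual FOAF file.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, bad_bytes) for each span that is not valid UTF-8."""
    bad = []
    pos = 0
    while pos < len(data):
        try:
            # Decoding succeeds only if the rest of the data is clean.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice we decoded.
            start = pos + err.start
            end = pos + err.end
            bad.append((start, data[start:end]))
            pos = end
    return bad


# Hypothetical sample: a lone latin1 "ü" (0xFC) inside otherwise ASCII XML.
sample = b"<foaf:name>M\xfcller</foaf:name>"
print(find_invalid_utf8(sample))
```

For the real file, `find_invalid_utf8(open("cedara.xml", "rb").read())` would be the equivalent call, assuming the file was saved under that name as in the command above.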
[personal profile] pseudomonas
Comment (#82604)
Posted: Thu, 06 Aug 2015 14:16:09 GMT (521 weeks ago)
So, I *think* the issue is not that the declared encoding is wrong (DW's FOAF generation normally escapes all non-ASCII characters, and that appears to be working OK), but that the user has been able to get a badly encoded byte into their profile.

There's a byte 0xFC which has got in there somehow; it is never valid in UTF-8, but in latin1 it is ü. It displays wrongly on the profile page too (tooltip of the hourglass icon).

My guess is that this was at some point an importer bug pulling the page from http://cedara.livejournal.com/profile - but the ü displays correctly there.
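
As a quick illustration of the byte-level point above: this is only a sketch of how 0xFC behaves under each encoding. The numeric character reference at the end is an assumption about what an escaped ü would look like, not a statement of how DW's FOAF code actually does its escaping.

```python
raw = b"\xfc"

# 0xFC cannot appear anywhere in well-formed UTF-8, so decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# The same byte read as latin1 is the single character ü (U+00FC).
as_latin1 = raw.decode("latin-1")

# A correctly UTF-8-encoded ü is two bytes, 0xC3 0xBC.
proper_utf8 = "ü".encode("utf-8")

# One way to escape it for ASCII-only XML output (assumed form).
escaped = "ü".encode("ascii", "xmlcharrefreplace")

print(valid_utf8, as_latin1, proper_utf8, escaped)
```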