Character encoding

graphex

Matt, are you able to set the encoding on the forum to UTF-8?
Right now it looks like ISO-8859-1, which isn't so great for fun German translations like Tauchsägen.
 
graphex,
Yes, this has come up in the past, but I have let it slip.  I'll look into this today.
Thanks for reminding me.
Matthew
 
Now I remember what the holdup was!

Switching the forum over to UTF-8 requires a sensitive database conversion, which I have been rather hesitant to do.  However, with the growing international presence here, I am revisiting the matter and checking with technical sites on what it means to convert.  I do think it would be worthwhile for international characters to show up properly.

We have a few members here who are programmers and database experts.  Perhaps they have some experience with such conversions, or with UTF-8 in particular?

Here's a link to the technical page for SMF regarding UTF-8: http://docs.simplemachines.org/index.php?topic=865.0

Matthew
 
Unfortunately I'm an MSSQL/C#/ASP.net and Flash/Flex guy, so my MySQL experience is pretty limited. In terms of the actual conversion process, it doesn't look too bad (especially if you're doing it on a backup) based on the process you linked to. Since SMF does have a process in place for the conversion, it doesn't look like you're exactly in the wild west - I'm sure others have paved the way to some extent.

Of course, if you don't have a backup and mess things up, this wonderful forum is SOL - this forum is backed up, isn't it? My guess is that, if the process isn't perfect, only a few existing posts might get some ?'s or a few ??-type things where they used to have a non-English character. Then again, it wasn't really possible to put an umlaut (?) in a post as it stands now (at least in Safari on a Mac - maybe it would work if I used a Windows machine, but truthfully I don't know how to make an umlaut in Windows offhand). Pretty minor collateral damage, and I seriously doubt any performance difference would be noticeable.
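
For anyone curious where the ?'s and odd two-character sequences come from, here is a small Python sketch. It is only an illustration, not what SMF itself does, and it assumes the garbled word in the first post was meant to be "Tauchsägen". The variable names are made up for the example.

```python
# Assumed example word; the thread only shows it in garbled form.
word = "Tauchsägen"

# The bytes a UTF-8 browser would send for this word.
utf8_bytes = word.encode("utf-8")

# Case 1: UTF-8 bytes read back as ISO-8859-1.
# The two bytes of "ä" are shown as two separate Latin-1 characters.
mojibake = utf8_bytes.decode("iso-8859-1")

# Case 2: the text is forced into a charset that lacks the character.
# Each character that cannot be encoded is replaced with "?".
lossy = word.encode("ascii", errors="replace").decode("ascii")

# Case 3: the bytes are written and read with the same encoding.
round_trip = utf8_bytes.decode("utf-8")

print(mojibake)    # two junk characters where the umlaut was
print(lossy)       # Tauchs?gen
print(round_trip)  # Tauchsägen
```

The first case is reversible as long as the raw bytes are kept; the second is not, because the original character is gone once it has been replaced.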

 
graphex,
The forum is backed up every night.  Even so, I don't want to have to do a restore if I don't have to!
Probably, doing this on the test forum is the best method, although I would need to populate the test forum with posts that can test the new functionality.
Another thing to consider -- switching to UTF-8 may cause certain modifications to function badly, or not function at all.  I have made a lot of modifications to the forum.
I'll look into getting UTF-8 installed on the test forum and we can go from there.
Matthew
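
On populating the test forum: a rough Python sketch of how sample posts could be chosen. The sample strings and function names are my own assumptions, not anything from SMF. The idea is to include text that ISO-8859-1 can already hold alongside text it cannot, so the test covers both kinds.

```python
# Hypothetical sample posts for a test forum.
SAMPLE_POSTS = [
    "plain ASCII post",
    "Tauchsägen",          # German umlaut, representable in ISO-8859-1
    "zażółć gęślą jaźń",   # Polish, not representable in ISO-8859-1
    "日本語のテスト",        # Japanese, not representable in ISO-8859-1
    "price: 100 €",        # euro sign, not part of ISO-8859-1
]


def fits_latin1(text: str) -> bool:
    """Return True if every character can be stored as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


def survives_utf8(text: str) -> bool:
    """Return True if the text is unchanged after a UTF-8 round trip."""
    return text.encode("utf-8").decode("utf-8") == text


for post in SAMPLE_POSTS:
    print(f"{post!r}: latin1={fits_latin1(post)} utf8={survives_utf8(post)}")
```

Posts for which `fits_latin1` is False are the ones worth checking by eye after the switch, since they could not have been stored correctly before.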
 
Sounds like SMF does the heavy lifting!

Yeah, it's going to hurt a little bit, but getting everything over to UTF-8 now, when everything is still mostly straight ASCII, will make it easier, rather than having a proliferation of character sets that have to get converted over later.
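
To illustrate why mostly-ASCII content makes this easier: pure ASCII bytes are identical in ISO-8859-1 and UTF-8, so only posts containing bytes above 0x7F actually change. A minimal Python sketch, assuming the existing posts really are stored as ISO-8859-1 (the function names are hypothetical, and this is not SMF's own converter):

```python
def needs_conversion(raw: bytes) -> bool:
    """True if the stored bytes contain anything outside 7-bit ASCII."""
    return any(b > 0x7F for b in raw)


def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode bytes assumed to be ISO-8859-1 as UTF-8."""
    return raw.decode("iso-8859-1").encode("utf-8")


ascii_post = b"plain ASCII post"
german_post = b"Tauchs\xe4gen"  # 0xE4 is "ä" in ISO-8859-1

# ASCII-only posts come out byte-for-byte identical.
print(needs_conversion(ascii_post), latin1_to_utf8(ascii_post) == ascii_post)

# Posts with accented characters grow, because "ä" takes two bytes in UTF-8.
print(needs_conversion(german_post), latin1_to_utf8(german_post))
```

If posts had accumulated in several different character sets, the "assumed to be ISO-8859-1" step would no longer be safe, which is the argument for converting sooner rather than later.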
 