Daniel,
Strongly agree on the read-only. Given the number of messages, that could be a big issue unless it can be set globally.
Regarding the mapping of Yahoo users to current FOG users... Assuming that we have can parse the Yahoo user names precisely, I'd suggest:
1) Create a "YahooFOGMap" mapping table containing two columns - "YahooUserName" and "FOGUserName".
2) Insert the Yahoo User Names into the "YahooUserName" column.
3) If possible, identify the most prodigious Yahoo posters - Matthew of course, John Lucas, Jerry Work, Per, and several others. We can map the top 10-20 pretty quickly. Update the those users with their FOG names.
4) Retrieve a list of users with a "ORDER BY YahooUserName" clause.
5) Send list to Matthew, who posts the list and promotes the heck out of it. Each current FOG user is supposed to find their Yahoo name and respond to the post with their Yahoo name + a separator value (tilde or something easily identifiable) + FOG name.
6) After some period of time, Matthew runs a SQL query to retrieve the posts into a format that can be easily parsed.
7a) Create a Default Migration User with a name like "Yahoo Migration User" for those posts with not mapped FOG name.
7b) We parse the list and use it to update the YahooFOGMap map table. Hopefully the active members will have updated the list and we'll have most of the key Yahoo posts covered. Then we run a query that sets the FOG name to "Yahoo Migration User" where it's NULL. We'd then run a query something like:
UPDATE YahooFOGMap
SET FOGUserName = "Yahoo Migration User"
WHERE FOGUserName IS NULL
That's ANSI SQL, but I'm not sure if MySQL has any issues with that.
8 ) We run a migration program that 1) converts the original posts into the FOG format, 2) updates the original user names to the FOG user names, and 3) shoves the data into the correct FOG MySQL tables (TEST SERVER!!!) in a relationally consistent format.
9) We test to ensure a valid conversion. RI should be correct, data is not corrupted, and posts are assigned to the correct user. More importantly, we need to ensure that the CURRENT system data and metadata is still valid.
10) Once we've ensured that all is well, we can create a complete rollout plan including detailed steps for system downtime (mandatory), system backup, data migration, validation testing, and system startup. We'd also need to a set of rollback steps just in case.
Some initial thoughts.
Regards,
Dan.
p.s., It turns out that I'm currently in charge of the one-time migration process of a major new system for my customer. After a couple of weeks of coding and tweaking, the migration program (currently about 15 stored procedures) migrates slightly more than 1 billion rows of data, including derivations and data cleansing, in 121 minutes. (I'm a happy guy.)