by TheBrain » Wed, 10Aug11 03:47
I've had a quick look at the script (running it was a bit tricky without admin access to the forum... :P). To put it mildly, it has issues. Most importantly, in pretty much all cases the script assumes that nothing goes wrong. For example, the function that makes a new HTTP connection can return failure, but the code that calls it never checks for that. Normally you could spot problems like this from PHP notices/warnings, but due to a very mediocre implementation it emits a lot of those even when nothing is technically wrong, so the real failures drown in the noise.
So with the script as it is, it might fail to retrieve a page, but you'll never really see the failure in the output. I'd say this is probably how posts "get lost".
While I'm not sure why the bbcode extraction would fail, I presume it's because it drastically increases the number of requests made (by a factor of 20, if I'm not mistaken). With that many more requests, a failure somewhere along the way is probably much more common.
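To put rough numbers on that, here is a back-of-the-envelope sketch. It's in Python just for brevity (the script itself is PHP), and it assumes every request fails independently with the same small probability. The 0.1% failure rate and the 500-request baseline are made-up illustration values, not measurements from the forum.

```python
def expected_failures(p_fail, requests):
    """Average number of failed requests, assuming independent failures."""
    return p_fail * requests


def chance_of_any_failure(p_fail, requests):
    """Probability that at least one request out of `requests` fails."""
    return 1 - (1 - p_fail) ** requests


# Hypothetical figures: 0.1% of requests fail, 500 requests without bbcode
# extraction, 20 times as many with it.
P_FAIL = 0.001
BASELINE_REQUESTS = 500

for label, n in (("without bbcode", BASELINE_REQUESTS),
                 ("with bbcode", BASELINE_REQUESTS * 20)):
    print(label,
          "expected failures:", expected_failures(P_FAIL, n),
          "chance of losing something:", round(chance_of_any_failure(P_FAIL, n), 3))
```

The exact numbers don't matter. The point is that with 20 times the requests and no error checking, "nothing failed" goes from plausible to very unlikely.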
Although I'm not making any promises, I'll try to refactor the script a bit so it's more robust against these failures (retrying when a connection fails, checking whether a request actually failed, etc.).
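Roughly what I have in mind, sketched in Python rather than PHP to keep it short. This is not code from the script: `fetch_with_retry`, `FetchError` and the `fetch` callback are names I made up, and I'm assuming the fetch function signals failure either by returning `None` or by raising `OSError`.

```python
import time


class FetchError(Exception):
    """Raised when a page could not be fetched after all attempts."""


def fetch_with_retry(fetch, url, attempts=3, delay=1.0, sleep=time.sleep):
    """Call fetch(url) until it yields a result or the attempts run out.

    `fetch` is a hypothetical callback that returns the page body, or
    returns None / raises OSError on failure.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            result = fetch(url)
        except OSError as exc:
            # Connection-level failure: remember it and try again.
            last_error = exc
        else:
            if result is not None:
                return result
            # The call "succeeded" but gave us nothing: treat as failure.
            last_error = FetchError(f"empty response for {url}")
        if attempt < attempts:
            # Wait a little longer after each failed attempt.
            sleep(delay * attempt)
    # Fail loudly instead of silently skipping the page.
    raise FetchError(f"giving up on {url} after {attempts} attempts") from last_error
```

The important part is the last line: when a page really can't be fetched, the caller finds out, instead of the post quietly going missing.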
Sadly, making it go back and recover anything that wasn't recovered the first time is quite a bit harder.
On the username issue: assuming that you're in no rush with the migration and you'd rather do it right the first time, it really shouldn't be that hard to adapt the emitted queries to the phpbb3 database structure.