Ma.gnolia Data Recovery Status
Some background: Ma.gnolia's database server suffered from file system corruption, which also corrupted its database backup, even though it was on a separate system. This much was bad luck. I was relying on a single backup; the database was fast approaching half a terabyte and I had been unable to implement a practical, economical solution to version that quantity of data. Having a more robust and comprehensive backup system in place was my responsibility, and, believe me, I know that I let you, the Ma.gnolia community, and myself down in that.
I am currently working with a data recovery company in hopes that they can recover a working version of the database. I will update this thread, our home page, and Twitter account as I hear from them, which unfortunately could be as late as next week.
-
Thank you for the transparency. I can't imagine how hard it is to be open about a mistake with the magnitude of consequences you're dealing with, but I appreciate knowing what's going on so I can cheer you on in your efforts.
To recovery and beyond!
I’m recoverable!
-
Thanks for the update, Larry. What a challenging time for you, and like Lisa, I can't imagine what you have been going through. I thank you for your continued updates and hope the next week brings better news for you. Lisa says it best: "To recovery and beyond!"
I’m thankful
-
Thank you for the transparency and the honesty. That is just a horrible thing to happen, and I feel for you. I'm struggling to efficiently back up a 4GB database on a regular basis ... I can't imagine trying to deal with that much data. Good luck with the recovery.
I’m grateful for the honesty
-
Finally, some useful information. Thank you, Larry. Which filesystem were you using? Was this caused by a hardware failure? Were you using MySQL? How were backups being done and how were they being transported to the "separate system"?
For future reference, Amazon S3 is a very economical backup solution for large amounts of data.
-
Are you kidding me? Backing up 500GB to S3 would take forever. This is not someone's rinky dink app we're talking about. -
Does S3 support incremental backups? That way you'd only have to send over the changes since the last one. Assuming that people aren't making changes more quickly than they can be relayed to S3, it wouldn't literally take forever. Of course, if you wanted to actually test the backups, well, that would take a pretty long time. What's the up/down bandwidth like with S3? -
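One way to approximate "only send the changes" on the client side, whatever the storage backend, is to hash the backup file in fixed-size chunks and compare against the previous run's hashes. The following is only a rough sketch: the chunk size and the function names are assumptions for illustration, and it says nothing about what S3 itself supports.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # assumed chunk size (4 MiB); tune to taste


def chunk_digests(path, chunk_size=CHUNK_SIZE):
    """Return a list of SHA-256 hex digests, one per fixed-size chunk of the file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digests.append(hashlib.sha256(chunk).hexdigest())
    return digests


def changed_chunks(old_digests, new_digests):
    """Return indexes of chunks that are new or differ from the previous run."""
    return [
        i
        for i, digest in enumerate(new_digests)
        if i >= len(old_digests) or old_digests[i] != digest
    ]
```

Only the chunks listed by `changed_chunks` would need uploading. Note that fixed-size chunking handles in-place changes and appends well, but an insertion near the start of the file shifts every later chunk.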
I originally posted this (inappropriately, I now realize) over on the recovery tools thread... it seems better-placed here.
Since there was no claim made (at least not that I noticed) that my data would be kept safe and sound, I don't feel like we have much of a right to complain. I was nonetheless pretty shocked to learn that there was no backup made that could still be accessed. I just sort of assumed that a project of this nature would be using offsite backups of some variety, which I'm gathering was not the case. It doesn't seem an unreasonable assumption that something along these lines would be done, but still, no system is infallible, and things happen: viruses, crackers, toilet flush spiral direction reversals, etc. So I'm not broken-hearted about losing the 40 or so bookmarks that I tagged since my last backup, which is otherwise known as my delicious account.
At this point, I'm still kind of into keeping my bookmarks on ma.gnolia, mostly since it integrates with Epiphany and Firefox, and I think Konqueror, which is 1 or 2 browsers more than delicious. Also, I would rather keep my data with Larry, who seems like a decent enough guy, though presumably and understandably too busy to keep this free service running infallibly, than with whoever ultimately owns Yahoo these days. And the fact that it uses open standards is another factor. The main point I must concede to delicious is that their Firefox integration is a lot better than ma.gnolia's.
To better inform my decision, I would like to know a) if more reliable backup procedures will be followed in the future, and b) if there will be (or is already?) a convenient and preferably regularly-scheduled (i.e., not requiring me to do anything) system for creating and updating a local copy of my bookmarks. Ideal for me would be to have new bookmarks emailed to my account as I add them. Another decent option would be a script of some variety that could be added to my crontab.
Anyway I do hope that ma.gnolia continues to run and that the loss of the data store didn't result from Larry's house burning down.
Cheers -
Larry, Thanks for the Transparency and No Worries, I truly believe I wouldn't be where I am in the Social Space if it weren't for Ma.gnolia and its Members. You will always have my loyalty.
Truth be told I've done the same thing with my own blog data. So I feel the pain. Plus I have a backup of my own bookmarks on my HD.
I’m confident
-
Larry recommended Diigo. Well, does anyone know what THEIR backup procedures are like?
-
For fear of the exact same thing, I'm using Delicious, which has the support of Yahoo!. -
@jimmyjo, ever heard of rsync? You take the hit once, when you upload the data. Assuming the server wasn't sitting in Larry's basement at the other end of a DSL line, transferring 500GB over a 100Mb/s connection, like you'd have in a server in a colo facility, is perfectly feasible. Thereafter, rsync only transfers the deltas. Works like a charm for us and we're backing up almost triple the "not someone's rinky dink app" amount on a daily, weekly, monthly basis. Who'd a thunk that people might need to do backups for large datasets too?
@intuited, as I wrote in the recovery thread, don't confuse the service being no cost to you with being free. There was a revenue model. That it might not have been a viable one isn't my concern beyond the immediate impact it has on me and others. It doesn't seem to me that this had much to do with lack of revenue so much as lack of clue in disaster recovery procedures. Whether Larry is a decent guy or not isn't at issue. Those, like you, who seem to think that just backing up our own bookmarks should have been sufficient seem to have been using Magnolia far differently than I had been. I care not just about the bookmarks. I care about the countless hours that I spent categorizing them. If just a bunch of bookmarks arranged in one, rigid format would have been sufficient, I would have just continued to use Firefox for managing my bookmarks as I had been for years.
Anyway, let's hope that the data recovery company can recover something.
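For a rough sense of the one-time hit mentioned above (500GB over a 100Mb/s link), here is a back-of-the-envelope sketch. It assumes decimal units (1GB = 10^9 bytes, 1Mb = 10^6 bits), and the `efficiency` parameter is a made-up knob for protocol overhead and contention, not a measured figure.

```python
def transfer_hours(size_gb, link_mbps, efficiency=1.0):
    """Estimate hours needed to move size_gb gigabytes over a link_mbps link.

    efficiency is the assumed fraction of the line rate actually achieved.
    """
    bits = size_gb * 1e9 * 8                      # gigabytes -> bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600


# Full line rate versus an assumed 50% effective throughput.
print(round(transfer_hours(500, 100), 1))
print(round(transfer_hours(500, 100, efficiency=0.5), 1))
```

At full line rate this works out to roughly 11 hours, and about twice that at half the line rate, so the initial upload is on the order of a day rather than "forever".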
-
`rsync` would be the wrong tool for this job. What you want is something that can post incrementals, like `duplicity` (which also supports encryption and signing.) -
@onions
Rsyncing a static file is very different from rsyncing a DB file that is constantly changing. Are you saying that's how you're currently "backing up" your production MySQL servers?
Besides, that's beside the point. Larry HAD a hot copy, which was corrupted because, well, apparently the "rsync" (I'm sure he wasn't using that, but for argument's sake) copied over the corruption.
Hence you always need to be keeping distinct copies of 500GB data sets, and that gets unwieldy quickly.
On another note... I wonder what the terms were for the "paid" subscribers... are they getting the remainder of their subscription back or the full amount?
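The "distinct copies" idea above can be sketched as a simple rotation: each run copies the dump to a new timestamped file rather than overwriting the last one, and only the oldest copies are pruned. This is a minimal illustration, not what Ma.gnolia ran; the file naming, the `keep` count, and the function name are all assumptions.

```python
import pathlib
import shutil
import time


def snapshot(dump_path, backup_root, keep=7, stamp=None):
    """Copy dump_path into backup_root under a new timestamped name,
    then delete the oldest snapshots so that at most `keep` remain."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backup_root = pathlib.Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)
    # Timestamps in this format sort correctly as plain strings.
    stamp = stamp or time.strftime("%Y%m%dT%H%M%S")
    target = backup_root / f"dump-{stamp}.sql"
    shutil.copy2(dump_path, target)
    snapshots = sorted(backup_root.glob("dump-*.sql"))
    for old in snapshots[:-keep]:
        old.unlink()
    return target
```

Because a corrupted dump lands in a new file instead of replacing the previous one, earlier good copies survive until they age out. The cost is storage: `keep` copies of a 500GB dump is exactly the unwieldiness being described.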
-
We use the hot dump tools provided by the various databases we run and then we rsync the dump file. People have been backing up large databases for a while, you know.
Larry apparently didn't have a backup. The "hot backup" being corrupted due to filesystem damage on the live server illustrates how that scheme was no different than relying on RAID for backups. It was a bad idea.
I read that paid subscribers were going to be issued refunds (no idea how much), but they're no better off in terms of data recovery, so there goes the "You have no right to complain since the service was free" argument.
Again, I hope for the sake of all concerned that the data recovery company comes through.
-
Does MySQL have an equivalent to Postgres' write-ahead-logging+live-snapshots+point-in-time-recovery? If it did, then one could easily implement an economical and practical S3 approach. For next time anyway.
-
To clarify, I'm suggesting that someone present a way to create a local backup that preserves the subtleties of the tagging system, i.e., when a bookmark is retagged, another email would be sent out giving the tags and the other information for the bookmark, and a backup script would download & store a new version of it. Presumably an email-based system could also be configured to send out categorized lists of bookmarks on a daily, weekly, etc. basis. It seems like the basis for this already exists: as I recall, there is an option in the interface to create a backup of your bookmarks. All that's needed is for this procedure to be made to happen on a regular basis. Actually, probably a two-line bash script using wget would do the trick, if crudely, and in a non-Windows-friendly way.
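As a rough sketch of that kind of scheduled export, here is a Python version (rather than bash and wget) that could be run from cron. The export URL is purely hypothetical, since the thread doesn't give the real one, and the file naming and directory are assumptions too.

```python
import datetime
import pathlib
import urllib.request

# Hypothetical export URL: substitute whatever address the service's
# "back up your bookmarks" option actually uses.
EXPORT_URL = "https://example.com/bookmarks/export"


def fetch(url):
    """Download the export and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def save_snapshot(data, backup_dir, today=None):
    """Write data to a dated file so earlier exports are never overwritten."""
    today = today or datetime.date.today()
    backup_dir = pathlib.Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"bookmarks-{today.isoformat()}.xml"
    target.write_bytes(data)
    return target
```

A daily cron job would then call something like `save_snapshot(fetch(EXPORT_URL), "bookmark-backups")`. Keeping one dated file per run means a bad or empty export can't clobber an older good one.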
Another, arguably more important, system that seems to be lacking in our age of migrating precious and personal data to online data stores is one of certification of these bodies. There is no well-known certification agency that gives a stamp of approval to organizations following established and thorough backup procedures. If there had been, clients of ma.gnolia could have known to look for it and, in its absence, been more inclined to make regular backups of their bookmarks. Of course the effectiveness of such a system might be limited to those conscientious enough to take proper precautions; likely most of these people already do. In reality, the main effect of such a system might be to encourage those who have, for example, spent hours arranging their bookmarks, and are knowledgeable about good data management practices, to save a bit of time by not creating a personal copy of their updated bookmarks that are kept on certified sites.
-
Call Percona. You don't just need filesystem recovery, you need MySQL-specific data recovery, and we are the only ones who can do that well.
-
Larry, I also want to thank you for your honesty and transparency through this process, and also for your efforts to recover the data.
I look forward to following your work on the next great thing.
I’m thankful
-
Who is the data recovery company?
What was the RAID system (Linux, Windows, Other)?
...Stephen -
Good Morning! Where is GetSatisfaction's recovery options page? It was erased from Ma.gnolia's website. Can somebody please post a link, or was it erased completely? Thanks for the info!
-
Heidi,
I can see GetSatisfaction's recovery options page without any problem; check http://cli.gs/H4VT4e. Refresh your cache? Perhaps your ISP had DNS issues?
The Ma.gnolia recovery page lists options as well at http://cli.gs/ZSvmQm -
Thank you kindly, Deborah. -
No problem. I follow the thread and get email updates, with the link to GetSatisfaction's Recovery Options page, when others comment.
-
I was finally contacted today by the data recovery folks, but only with an update that they're still working and need more time, until Monday. I'll report back then when I hear more.
-
Aside from the recovery, is the site usable? I have a backup that I can upload and keep going. Or, is this the end of Magnolia?
In any event I think it would be a great service to the IT community if you could document how you were doing things, what happened, how you fixed them, and what you will be doing differently (and how) from now on.
Good luck!
- Charles
I’m thankful
-
If you haven't seen it, you might be interested in this video interview with Larry on exactly that subject:
http://tr.im/fj_cg11 -
Unfortunately, the attempts to recover the Ma.gnolia database file were unsuccessful and the tools posted for recovering your public bookmarks remain the only source for restoring your bookmark collections.
I will keep the Ma.gnolia twitter account and homepage updated with developments for relaunching the Ma.gnolia service in the coming months under a more robust infrastructure.
If you want to know more about what happened and the history of Ma.gnolia in general, I recommend watching the latest Citizen Garden which was recorded last week.
-
Larry,
Please see my original comment on this thread, and then further discussion on http://www.xaprb.com/blog/2009/02/19/... where another person points out that my original comment is not phrased well. I believe there is still a chance of recovering this data. Whether it's worth pursuing that is totally up to you and how much the data matters to your users. If you have the corrupted backup, perhaps you'd be willing to attach that drive to some server with an Ubuntu live cd running and let us SSH into it to assess the situation. There is not necessarily a need to recover the corrupted filesystem -- it may be possible to extract the data right from the raw device. MySQL data recovery is a very specialized skill, and there's a better chance at recovering the data than with more generalized data recovery techniques.
You can also contact us to discuss this separately: http://www.percona.com/contacts.html
Baron
-
Excuse me, but I don't know where to ask.
I had an account on ma.gnolia, but the screen name is long forgotten; I only ever used OpenID. I've tried several possible names. Is it possible to find one's screen name from the OpenID used (otkin.livejournal.com)? -
Are there plans to recover the publicly-cached bookmarks and establish a new magnolia database using them, or otherwise make them available to people through magnolia.com? This would be much more accessible to people and prevent people from losing their data when the caches are rebuilt. I'm disappointed that this hasn't already been done. What percentage of the bookmarks are available through these means?
Thanks -
Hello Larry,
We were told "Ma.gnolia's database server suffered from file system corruption, which also corrupted its database backup, even though it was on a separate system."
What caused this "file system corruption"? Was it a physical disk failure?
Was it a problem with the file system? If it was, OS X uses a journaling file system. Why didn't that work? What version of OS X were you running?
Was it a problem with MySQL? If so, what caused it? What version of MySQL? What sort of tables, MyISAM or InnoDB? Were you using replication?
How did the backup also get corrupted? What types of backups were being done, how frequently, and were they automated?
In any case, how was it that the backup and the database on the production disk drive had no useful data? What was the nature of this corruption? It's very unusual for there to be no recoverable data on not one but two separate drives.
What were you doing in that week or ten days before you handed the disk drive over to the data recovery folks? Was the backup drive also handed over to them? Were you/they working on disk images of the original or were you/they working on the original?
I'm interested in this not only from the perspective of being a Ma.gnolia user but also from a professional perspective. Plane crashes are investigated and documented extensively with the aim of preventing future crashes. While it's impossible to prevent computer system failures, we can all learn lessons from them. You would be doing an invaluable service to many people if you documented this in detail and put it up somewhere, not necessarily here, so that we can all learn from this unfortunate incident.
Regards,
Clifford
-
My experience using Mac OS's HFS+ journaled filesystem is that it's unfortunately very unreliable. It corrupts in completely routine jobs such as copying files, with no reason at all, and is utterly unable to recover itself. I have been able to reproduce this behavior across a wide array of hardware and software. FAT32 is much more reliable, believe it or not. -
Wow! I only use OS X very occasionally so I wasn't aware of that. I have no idea why anyone would run OS X for a hosting platform anyway but that's another story. -
First of all, you should decide whether the problem is physical, logical, or a combination of both. Logical issues lie in the software side of your computer, while physical problems come from hardware failures. Once you identify the type of problem, you can plan an appropriate data recovery solution.
I’m happy