Thursday, 2 June 2011

WMF in 2016 - a User Story

Here's a SciFi user story, with the Wikimedia Foundation in mind, to illustrate the importance of a verified backup process and archive policy. It's not a prediction of the future!

In 2016 San Francisco has a major earthquake and the servers and operational facilities for the WMF are damaged beyond repair. The emergency hot switchover to Hong Kong is delayed due to an ongoing DoS attack from Eastern European countries. The switchover eventually appears successful and data is synchronized with Hong Kong for the next 3 weeks. At the end of 3 weeks, with a massive raft of escalating complaints about images disappearing, it is realized that this is a result of local data caches expiring. The DoS attack covered the tracks of a passive data worm that only activates during backup cycles, and the loss is irrecoverable because backups older than 2 weeks are automatically deleted. Due to the lack of an operational archive strategy, it is estimated that the majority of digital assets (including over 80 million photographs) have been permanently lost, and estimates for a 60% partial reconstruction from remaining cache snapshots and independent global archive sites run to over 2 years of work.


  1. This is a fascinating scenario, but just to be clear: in my understanding, an earthquake in San Francisco would not affect the servers that support the projects. All such servers are in the data centers in Virginia, Florida, and the Netherlands (the last are caching only).

    Before the Foundation started migrating its primary data center to Virginia, it really was an act of god that a hurricane didn't wipe out Wikipedia. But Wikimedia is, technically speaking, pretty safe from any California earthquakes. The staff are another story. ;)

  2. Glad to be reassured. Of course, it's just SciFi.

    Now let's consider that an escalating price of copper wire in 2015 leads to mass theft of cabling in the USA, the backups fail, and a mass trojan account-hijacking scam happens whilst a new targeted Chinese network virus attack is corrupting data caches...

    I hope long-term scenario planning is part of the archive strategy :)