Thursday, July 28, 2011

Facebook has just moved 30 petabytes of Hadoop data

Facebook, a huge Hadoop user, has just moved its whopping 30-petabyte cluster from one data center to another.

FYI:
1 petabyte (PB) = 1,024 terabytes (TB) = 1,048,576 gigabytes (GB)
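
To put the 30 PB figure in more familiar units, here is a quick back-of-the-envelope conversion. It is only a sketch: it assumes the binary convention used in the line above (factors of 1,024), and the function names are made up for illustration.

```python
# Binary-convention conversion factors, matching the FYI line above.
TB_PER_PB = 1024
GB_PER_TB = 1024


def petabytes_to_terabytes(pb):
    """Convert petabytes to terabytes (1 PB = 1,024 TB)."""
    return pb * TB_PER_PB


def petabytes_to_gigabytes(pb):
    """Convert petabytes to gigabytes (1 PB = 1,048,576 GB)."""
    return pb * TB_PER_PB * GB_PER_TB


if __name__ == "__main__":
    cluster_pb = 30  # size of the cluster reported in this post
    print(f"{cluster_pb} PB = {petabytes_to_terabytes(cluster_pb):,} TB")
    print(f"{cluster_pb} PB = {petabytes_to_gigabytes(cluster_pb):,} GB")
```

Under that convention, 30 PB works out to 30,720 TB, or 31,457,280 GB.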

The move was necessary because Facebook had run out of both power and space to expand the cluster — very likely the largest in the world — and had to find it a new home. Facebook’s data team undertook a multi-step process to copy over data, trying to ensure that any file changes made during the copying process were accounted for before the new system went live.
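
The general idea of "copy everything, then account for what changed in the meantime" can be illustrated with a toy model. This is not Facebook's actual tooling: plain Python dictionaries stand in for the two clusters, and the names `bulk_copy` and `catch_up` are hypothetical, chosen only for this sketch.

```python
def bulk_copy(source, destination):
    """Step 1: copy a point-in-time snapshot of every file to the new cluster."""
    destination.clear()
    destination.update(dict(source))


def catch_up(source, destination):
    """Step 2: reconcile changes made on the source after the snapshot.

    Returns the number of paths that had to be added, updated, or deleted.
    """
    changed = 0
    # Files that were created or modified while the bulk copy was running.
    for path, content in source.items():
        if destination.get(path) != content:
            destination[path] = content
            changed += 1
    # Files that were deleted on the source after the snapshot was taken.
    for path in list(destination):
        if path not in source:
            del destination[path]
            changed += 1
    return changed


if __name__ == "__main__":
    old_cluster = {"/logs/a": "v1", "/logs/b": "v1"}
    new_cluster = {}

    bulk_copy(old_cluster, new_cluster)

    # The old cluster stays live, so files keep changing after the snapshot.
    old_cluster["/logs/a"] = "v2"   # modified
    old_cluster["/logs/c"] = "v1"   # created
    del old_cluster["/logs/b"]      # deleted

    fixed = catch_up(old_cluster, new_cluster)
    print(f"reconciled {fixed} paths; in sync: {new_cluster == old_cluster}")
```

In a real migration the catch-up step would presumably be repeated until the remaining difference is small enough to switch over, which is where the "small amount of lag" comes in.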

Unlike a traditional warehouse using SAN/NAS storage, HDFS-based warehouses lack built-in data-recovery functionality. With this migration, Facebook's data team showed that it is possible to efficiently keep an active multi-petabyte cluster properly replicated, with only a small amount of lag.

For Facebook, though, the fast-growing Hadoop data warehouse looks like just one part of a larger trend toward needing more space. Facebook recently confirmed that it is building a second data center in Prineville, Ore., next to its existing one. That will make three for the company, which is also building a data center in Forest City, N.C.
