Facebook, a huge Hadoop user, has just moved its whopping 30-petabyte cluster from one data center to another.
FYI:
1 Petabyte (PB) = 1.024 Terabytes (TB) = 1.048.576 Gigabytes (GB)
The move was necessary because Facebook had run out of both power and space to expand the cluster — very likely the largest in the world — and had to find it a new home. Facebook’s data team undertook a multi-step process to copy over data, trying to ensure that any file changes made during the copying process were accounted for before the new system went live.
Unlike a traditional warehouse using SAN/NAS storage, HDFS-based warehouses lack built-in data-recovery functionality. They showed that it was possible to efficiently keep an active multi-petabyte cluster properly replicated, with only a small amount of lag.
For Facebook, though, it looks like its fast-growing Hadoop data warehouse is just part of a larger trend toward needing more space. Recently, Facebook confirmed it’s building a second data center in Prineville, Ore., next to its existing one. That will make three for the company, which also is building a data center in Forest City, N.C.
No comments:
Post a Comment