Hadoop 3: Erasure coding catastrophe

Day 1

RU

Imagine this: your HDFS cluster is close to 100 PB, every year you order machines for another dozen petabytes, bring them up, spend months rebalancing, and repeat the procedure over and over again. Then Hadoop 3 comes out, promising erasure coding that halves your storage footprint with the same durability guarantees, and you want to adopt it right away. But you're experienced: you wait for version 3.1+, test it, roll it out gradually, and test again. And yet six months later you watch your data turn into a pumpkin, and not only at midnight. Can you imagine 100 PB of data disappearing? It hurts!

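For scale, here is a minimal back-of-the-envelope sketch of where the "half the space" figure comes from. The numbers are purely illustrative (not from the talk) and compare HDFS's default 3x replication with the built-in RS-6-3 erasure coding policy in Hadoop 3:

```python
# Illustrative storage arithmetic: 3x replication vs. RS-6-3 erasure coding.
# Classic HDFS keeps 3 full replicas of every block; the RS-6-3-1024k policy
# in Hadoop 3 stores 6 data blocks plus 3 parity blocks per block group.

logical_tb = 1000  # hypothetical amount of user data, in TB

replication_factor = 3
raw_replicated = logical_tb * replication_factor  # 3000 TB of raw disk

data_blocks, parity_blocks = 6, 3  # RS-6-3 policy
raw_erasure_coded = logical_tb * (data_blocks + parity_blocks) / data_blocks  # 1500 TB

print(f"3x replication:  {raw_replicated} TB raw, survives losing 2 copies of a block")
print(f"RS-6-3 coding:   {raw_erasure_coded:.0f} TB raw, survives losing any 3 blocks in a group")
print(f"Raw-space savings: {raw_replicated / raw_erasure_coded:.1f}x")
```
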
The team took a walk on the wild side and learned a lot. This talk covers those findings and mistakes, the new experience with Hadoop 3, and how to avoid dangerous situations.

Target audience: engineers and big data developers who are using Hadoop or planning to.

  • #hadoop

Speakers

Invited experts