February 19, Tuesday
12:00 – 13:00
Data Reduction for Enterprise Storage: Estimation and Effective Resource Utilization
Computer Science seminar
Lecturer : Ronen Kat
Affiliation : IBM Haifa Research Labs
Location : 202/37
Host : Dr. Aryeh Kontorovich
Real-time compression and deduplication for primary storage are quickly
becoming widespread as data continues to grow exponentially, but
adding compression and deduplication on the data path consumes scarce
CPU and memory resources on the storage system.
In this talk we present different approaches to efficient estimation
of the potential data reduction ratio and how these methods
can be applied in advanced storage systems.
The main focus is on compression ratio evaluation, where we employ two
levels of filtering. The first level operates at the data set level
(e.g., volume or file system), where we estimate the overall
compressibility of the data at rest. According to the outcome, we may
choose to enable or disable compression for the entire data set, or to
employ a second level of finer-grained filtering. The second
filtering scheme examines data being written to the storage system in
an online manner and determines its compressibility.
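
The two filtering levels described above can be illustrated with a minimal
sketch. It is not the method presented in the talk: the random chunk
sampling, the use of zlib, the chunk size, the thresholds, and all function
names are assumptions chosen only for illustration.

```python
import os
import random
import zlib

CHUNK_SIZE = 4096  # assumed chunk size in bytes


def chunk_ratio(chunk: bytes) -> float:
    """Compressed size divided by original size (lower is more compressible)."""
    if not chunk:
        return 1.0
    return len(zlib.compress(chunk)) / len(chunk)


def estimate_dataset_ratio(data: bytes, samples: int = 64, seed: int = 0) -> float:
    """First level: estimate the ratio of data at rest from sampled chunks."""
    n_chunks = max(1, (len(data) + CHUNK_SIZE - 1) // CHUNK_SIZE)
    rng = random.Random(seed)
    picks = rng.sample(range(n_chunks), min(samples, n_chunks))
    original = compressed = 0
    for i in picks:
        chunk = data[i * CHUNK_SIZE:(i + 1) * CHUNK_SIZE]
        original += len(chunk)
        compressed += len(zlib.compress(chunk))
    return compressed / original if original else 1.0


def decide_policy(ratio: float, low: float = 0.5, high: float = 0.9) -> str:
    """Map the estimate to a data-set policy (thresholds are hypothetical)."""
    if ratio <= low:
        return "compress_all"
    if ratio >= high:
        return "compress_none"
    return "filter_per_write"


def should_compress_write(block: bytes, prefix: int = 1024,
                          threshold: float = 0.9) -> bool:
    """Second level: test-compress a short prefix of an incoming write."""
    return chunk_ratio(block[:prefix]) < threshold


if __name__ == "__main__":
    text_like = b"storage systems compress data " * 4000
    random_like = os.urandom(len(text_like))
    print(decide_policy(estimate_dataset_ratio(text_like)))
    print(decide_policy(estimate_dataset_ratio(random_like)))
    print(should_compress_write(text_like[:CHUNK_SIZE]))
```

Sampling only a bounded number of chunks keeps the cost of the data-set
estimate independent of the data set size, and compressing only a prefix of
each write limits the CPU spent on data that turns out to be incompressible.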
We also discuss the challenges in achieving similar results when
deduplication is involved and suggest alternatives for this scenario.