Hardware compression solution for Apache Hadoop

Date: 28/10/2013
Exar, a provider of data management solutions, has announced a hardware-accelerated compression solution for Apache Hadoop. The solution, called AltraHD, enables all applications in the Hadoop stack to be transparently compressed using hardware acceleration.

AltraHD integrates seamlessly into the Hadoop stack and can transparently compress all files, including those stored in the Hadoop Distributed File System (HDFS) as well as intermediate data stored locally outside of the file system. According to Exar, AltraHD is the only compression solution for Hadoop to offer the following key features:

- An application-transparent file system filter that sits below the Hadoop Distributed File System (HDFS) and compresses all files stored in HDFS.

- A compression codec that compresses intermediate data during the MapReduce phase of Hadoop processing (a configuration sketch follows this list).

- Exar's high-performance PCIe-based hardware compression card, which automatically accelerates all compression operations, maximizing performance while offloading the host CPU. A single card provides up to 3.2 GB/s of compression throughput.
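
Hadoop lets a job plug in any CompressionCodec implementation for its intermediate map output, which is how a codec like the one described above would be wired in. The article does not name AltraHD's codec class, so the minimal sketch below uses Hadoop's built-in GzipCodec as a stand-in; the property names are the Hadoop 2.x ones (older releases used mapred.compress.map.output and mapred.map.output.compression.codec).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.mapreduce.Job;

public class CompressedShuffleSetup {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Compress intermediate map output before it is spilled to local
        // disk and shuffled to the reducers. A hardware-backed codec such
        // as AltraHD's would be plugged in here in place of GzipCodec.
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec",
                GzipCodec.class, CompressionCodec.class);

        Job job = Job.getInstance(conf, "job-with-compressed-intermediate-data");
        // ... configure mapper, reducer, and input/output paths as usual ...
    }
}
```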

"Exar has taken big data analytics to the next level," stated Rob Reiner, Exar's director of marketing for Data Compression and Security products. "AltraHD typically improves Terasort performance by 50 – 100%, and typically provides a 3x - 5x savings in storage capacity, enabling unmatched Hadoop cluster performance and efficiency, and establishing AltraHD as the industry leading solution."

AltraHD is available now.

The Apache Hadoop project describes the framework at http://hadoop.apache.org/ as follows: "The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures."