Hedvig technical and architectural overview whitepaper

Executive overview

Today’s enterprises are global, information-led businesses, regardless of what they produce or what services they deliver. The agility and precision of their IT systems mean the difference between winning and losing customers and determine how quickly new concepts translate into revenue-producing products and services. The world’s most admired and best-run businesses use optimal IT systems as a competitive advantage.

Virtualization, automation and self-service are the cornerstones of modern enterprise data centers. Traditional approaches to storage do not fit this new paradigm. A new software-defined storage (SDS) approach keeps pace with the exponential growth of data, while still achieving an automated, agile and cost-effective infrastructure. It is based on the hyperscale (also known as web-scale) techniques pioneered by Amazon, Facebook and Google.

Hedvig provides software-defined storage built on a truly hyperscale architecture that uses modern distributed system techniques to meet all of your primary, secondary and cloud data needs. The Hedvig Distributed Storage Platform transforms commodity x86 or ARM servers into a storage cluster that scales from a few nodes to thousands of nodes.

Hedvig’s patented Universal Data Plane™ architecture stores, protects and replicates data across any number of private and/or public cloud data centers. The advanced software stack of the Hedvig Distributed Storage Platform simplifies all aspects of storage with a full set of enterprise data capabilities that can be granularly provisioned at the application level and automated via a complete RESTful API.

This whitepaper describes the Hedvig Distributed Storage Platform architecture and its enterprise storage capabilities — and how this platform delivers business agility, IT flexibility, a competitive advantage, enhanced scalability and a significant cost reduction.

Introduction

Hedvig system and component overview

The Hedvig Distributed Storage Platform transforms how you deliver and manage enterprise storage. It is fully programmable with a complete RESTful API that simplifies and automates management and plugs into any orchestration framework.
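
To illustrate that programmability, the sketch below provisions a virtual disk over HTTP in Python. The endpoint path and field names are hypothetical placeholders, not the documented Hedvig API; consult the Hedvig RESTful API reference for the actual routes.

    import requests  # third-party HTTP client

    # Hypothetical endpoint and payload; field names are illustrative only.
    HEDVIG_API = "https://hedvig.example.com/rest/v1"

    payload = {
        "name": "vm-datastore-01",             # unique virtual disk name
        "size": "10TB",                        # thinly provisioned by default
        "diskType": "BLOCK",                   # block or file (NFS)
        "replicationFactor": 3,
        "replicationPolicy": "DataCenterAware",
        "deduplication": True,
        "compression": True,
    }

    resp = requests.post(f"{HEDVIG_API}/virtualdisks", json=payload, timeout=30)
    resp.raise_for_status()
    print("Created virtual disk:", resp.json())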

Figure 1: Hedvig Distributed Storage Platform Architecture

1. The Hedvig Storage Service — Hedvig’s patented distributed systems engine — is the primary component of the platform. It installs on commodity x86 or ARM servers to transform existing server and storage assets — including SSD/flash media and hard disk drives — into a full-featured elastic storage cluster. The software deploys to on-premises infrastructure or to hosted or public clouds to create a single storage cluster that is implicitly hybrid.

2. The Hedvig Virtual Disk is the fundamental abstraction of the Hedvig Distributed Storage Platform. Organizations can spin up any number of virtual disks — each thinly provisioned and instantly available.

Figure 2: Hedvig WebUI: “Add New Virtual Disk” dialog

Each virtual disk is provisioned with a set of attributes (Figure 2). These attributes include:

  • Name — to specify a unique name to identify the virtual disk.
  • Size — to set the desired virtual disk size. A single block or NFS virtual disk can be of unlimited size.
  • Disk Type — to specify the type of storage protocol to use for the virtual disk: block or file (NFS). Note: Object containers/buckets are provisioned directly via the OpenStack Swift API or the Amazon S3 API.
  • Encryption — to encrypt both data at rest and data in flight for the virtual disk.
  • RDM — to utilize raw device mapping (RDM) with the virtual disk. RDM provides a VM with direct access to a LUN.
  • Backup — to enable backup, using either Hedvig or the Hedvig OST Plugin for NetBackup, with a retention policy of two weeks, one month, two months or six months.
  • Enable Deduplication — to enable inline global deduplication.
  • Clustered File System — to indicate that the virtual disk will be used with a clustered file system. When selected, Hedvig enables concurrent read/write operations from multiple VMs or hosts.
  • Description — to provide an optional brief description of the virtual disk.
  • Compressed — to enable virtual disk compression to reduce data size.
  • Client-side Caching — to cache data to local SSD or PCIe devices at the application compute tier to accelerate read performance.
  • CSV — to enable Cluster Shared Volumes for failover (or HA) clustering. A CSV is a shared disk containing an NTFS or ReFS volume that is made accessible for read and write operations by all nodes within a Windows Server failover cluster.
  • Replication Policy — to set the policy for how data will replicate across the cluster: Agnostic, Rack Aware or Data Center Aware. See Replication for more detail.
  • Replication Factor — to designate the number of replicas for each virtual disk. Replication factor is tunable, ranging from one to six. See Replication for more detail.
  • Block Size — to set the block size of a block virtual disk to 512 bytes, 4k or 64k. File (NFS) virtual disks have a standard 512-byte block size, and object-based virtual disks have a standard 64k block size (see the validation sketch after this list).
  • Residence — to select the type of media on which the data is to reside: hard drive (HDD) or Flash (including support for NVMe and 3D NAND SSDs).
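
To make the attribute rules concrete, here is a minimal Python sketch that validates the Disk Type and Block Size combinations described in this list. The allowed values come from the list itself; the function and dictionary names are illustrative only.

    # Allowed block sizes by disk type, per the attribute list above.
    ALLOWED_BLOCK_SIZES = {
        "BLOCK": {512, 4096, 65536},   # 512 bytes, 4k or 64k
        "NFS": {512},                  # file (NFS) disks use a standard 512-byte size
        "OBJECT": {65536},             # object disks use a standard 64k size
    }

    def validate_block_size(disk_type: str, block_size: int) -> None:
        """Raise ValueError if the block size is invalid for the disk type."""
        allowed = ALLOWED_BLOCK_SIZES.get(disk_type.upper())
        if allowed is None:
            raise ValueError(f"unknown disk type: {disk_type}")
        if block_size not in allowed:
            raise ValueError(
                f"{disk_type} virtual disks support block sizes {sorted(allowed)}, "
                f"got {block_size}"
            )

    validate_block_size("BLOCK", 4096)   # OK
    validate_block_size("NFS", 512)      # OK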

3. The Hedvig Storage Proxy is a lightweight software component that deploys at the application tier as a VM or Docker container — or on bare metal to provide storage access to any physical host or VM in the application tier. The storage proxy provides intelligent access to the hyperscale storage nodes, directing I/O requests to the relevant backend storage nodes based on latency response times.
The storage proxy runs in user space and can be managed by any virtualization management or orchestration tool. The storage proxy acts as a gatekeeper for all I/O requests to the Hedvig Storage Cluster Nodes. It acts as a protocol (iSCSI/NFS/S3/Swift) converter, load balances I/O requests to the backend storage nodes and provides edge caching using local flash (SSD/NVMe) devices to optimize storage performance directly at the application hosts. It also caches data fingerprints and eliminates the transmission of duplicate data over network links. See I/O basics for more detail.

Hedvig deployment options

Hedvig Distributed Storage Platform components can be configured to support two types of deployment: hyperscale and hyperconverged.

Figure 3: Hyperscale and hyperconverged deployment options 
  • Hyperscale deployments scale storage resources independently from application compute resources. With hyperscale, storage capacity and performance scale out horizontally by adding commodity servers running the Hedvig Storage Service. Application hosts consuming storage resources scale separately with the Hedvig Storage Proxy, allowing for the most efficient usage of storage and compute resources.
  • Hyperconverged deployments scale compute and storage in lockstep, with workloads and applications residing on the same physical nodes as data. In this configuration, the Hedvig Storage Proxy and the Hedvig Storage Service software are packaged and deployed as VMs on a compute host with a hypervisor installed.
  • Hedvig provides plug-ins for hypervisor and virtualization tools, such as VMware vCenter, to provide a single management interface for the hyperconverged solution.

Hedvig technical differentiators

Harnessing the power of distributed systems, the simplicity of cloud and a complete set of enterprise capabilities, the Hedvig Distributed Storage Platform is the only software-defined storage solution designed to collapse primary and secondary storage into a single platform.

This platform architecture provides six unique features:

  1. A single-storage platform with multi-protocol support. The Hedvig Distributed Storage Platform eliminates the need for disparate primary and secondary storage solutions by providing native support for block, file and object storage, combined with a complete platform RESTful API for orchestration and automation. With storage proxies that run in user space, the Hedvig solution is compatible with any hypervisor, container, OS or bare metal compute environment.
    Business benefit: A consolidated platform cuts down the cost of learning and managing disparate storage solutions. By eliminating a “siloed” infrastructure, Hedvig simplifies and improves overall cost efficiencies.
  2. Advanced enterprise storage services. The Hedvig Distributed Storage Platform provides a rich set of enterprise storage services, including deduplication, compression, snapshots, clones, replication, auto-tiering, multitenancy and self-healing (of both silent corruption and disk/node failures) to support production storage operations and enterprise SLAs.
    Business benefit: Hedvig eliminates the need for enterprises to deploy bolted-on or disparate solutions to deliver a complete set of data services. This simplifies infrastructure and further reduces overall IT CapEx and OpEx.
  3. Application-level provisioning. This rich set of enterprise storage capabilities can be configured at the granularity of a Hedvig Virtual Disk, providing each application, VM or container with its own unique storage policy. Every storage feature can be switched on or off to fit the specific needs of any given workload.
    Business benefit: The granular provisioning of features empowers administrators to avoid the challenges and compromises of a “one size fits all” approach to storage and helps effectively support business SLAs, while decreasing operational costs.
  4. Unmatched scale with performance optimization. The Hedvig Distributed Storage Platform scales out seamlessly with off-the-shelf commodity servers. Its superior metadata architecture and intelligent Client-side Caching help optimize read performance for different workloads. It can start with as few as three nodes and scale to thousands. Performance and capacity can be scaled up or down independently and linearly.
    Business benefit: Storage administrators are empowered to provision accurately, scale only as and when the business needs it and eliminate dreaded forklift upgrades. This improves business alignment and eliminates wasted CapEx.
  5. Multi-site disaster recovery. The Hedvig Distributed Storage Platform inherently supports multi-site high availability, which removes the need for additional costly disaster recovery solutions. This empowers businesses to achieve native high availability for applications across geographically dispersed data centers by setting a unique replication policy and replication factor at the virtual disk level. Administrators simply choose one to six replicas and indicate a replica destination, which can be a private, public or hosted data center.
    Business benefit: Enterprises no longer need to deploy complex, disparate and expensive replication technologies on top of their storage infrastructure to meet business continuity and disaster recovery SLAs. This eliminates costly downtime, the risk of business outages and the cost of additional replication solutions.
  6. Cloud-like simplicity with superior economics. The Hedvig WebUI provides intuitive workflows to streamline and automate storage provisioning. Administrators can monitor and even provision storage assets from any device, including mobile devices via a native HTML 5 interface that does not require Flash or Java. This brings the provisioning simplicity of public clouds, such as AWS, to any data center.
    Business benefit: Hedvig reduces the overhead of storage operations and enables tasks that would normally take days, weeks or even months to be completed in minutes. This improves business responsiveness, eliminates downtime due to human error and significantly reduces OpEx costs.

Core architecture and I/O flow

Hedvig’s main differentiator is its patented distributed systems technology — a pure “shared-nothing” distributed computing architecture in which each node is independent and self-sufficient. Thus, Hedvig eliminates any single point of failure, allows self-healing capabilities, provides non-disruptive upgrades and scales indefinitely by adding nodes.

Each node stores and processes metadata and data, then communicates with other nodes for data distribution. All nodes participate in data services, such as deduplication, compression and replication, as well as cluster operations, such as data re-protection (rebuild from failure or corruption) and rebalancing.

Put simply, the Hedvig Distributed Storage Platform gets better and faster as the cluster grows.

Architecture basics

Hedvig Virtual Disks, Containers and Storage Pools

Figure 4: Hedvig Distributed Storage Platform abstractions: Virtual Disks, Containers, Storage Pools 

A Hedvig Virtual Disk is partitioned into fixed-size virtual chunks, each of which is called a Hedvig Container. Different replicas are assigned for each container. Since replica assignment occurs at the container level — not simply at the virtual disk level — the data for a virtual disk is spread across the Hedvig Cluster, eliminating hotspots and allowing increased parallelism during client I/Os or disk rebuilds. Replicas are chosen according to replication factor and replication policy settings to support the application’s data protection needs.

Within a replica, containers are assigned to a Hedvig Storage Pool. Hedvig Storage Pools are logical groupings of disks/drives in the Hedvig Storage Cluster Nodes and are configured as the protection group for disk/drive failures and rebuilds. A typical storage node will host two to four storage pools. See Disk failures for more detail.
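
A minimal sketch of the partitioning idea: a virtual disk's byte range is carved into fixed-size containers, and each container is independently assigned a set of replica storage pools. The 16 GiB container size and the round-robin assignment below are illustrative assumptions, not Hedvig's actual values or placement algorithm.

    CONTAINER_SIZE = 16 * 2**30  # assumed fixed container size (16 GiB), illustrative only

    def container_for_offset(offset_bytes: int) -> int:
        """Map a virtual disk byte offset to its container index."""
        return offset_bytes // CONTAINER_SIZE

    def assign_replicas(num_containers: int, pools: list[str], rf: int) -> dict[int, list[str]]:
        """Spread each container's replicas across distinct storage pools.
        Round-robin stands in for Hedvig's policy-driven placement."""
        return {
            c: [pools[(c + r) % len(pools)] for r in range(rf)]
            for c in range(num_containers)
        }

    pools = ["node1-pool1", "node1-pool2", "node2-pool1", "node2-pool2", "node3-pool1"]
    print(container_for_offset(40 * 2**30))            # offset at 40 GiB -> container 2
    print(assign_replicas(num_containers=4, pools=pools, rf=3))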

Metadata and processes

Figure 5: Hedvig metadata and data processes 
  • Pages refers to the metadata process. Metadata in the Hedvig Storage Service is partitioned and replicated across all Pages processes. Each Pages process tracks the state of all other HBlock and Pages processes in the Hedvig Cluster to form a global view of the cluster. The Pages process is responsible for optimal replica assignment and tracks all writes in the cluster, including where they were successfully written.
  • HBlock refers to the data process. The HBlock process is responsible for replicating user data to other HBlock processes and striping data within and across storage pools.

Block, file and object storage support

Hedvig Storage Proxies provide native block, file and object protocol support, as follows:

  • Block storage — Hedvig presents a block-based virtual disk through a storage proxy as a LUN. Access to the LUN, with the properties applied during virtual disk provisioning, such as compression, deduplication and replication, is given to a host as an iSCSI target. After the virtual disk is in use, the storage proxy translates and relays all LUN operations to the underlying cluster.
  • File storage — Hedvig presents a file-based virtual disk to one or more storage proxies as an NFS export, which is then consumed by the hypervisor as an NFS datastore. Administrators can then provision VMs on that NFS datastore. The storage proxy acts as an NFS server that traps NFS requests and translates them into the appropriate RPC calls to the backend.
  • Object storage — Buckets created via the Amazon S3 API, or containers created via the OpenStack Swift API, are translated via the storage proxies and internally mapped to virtual disks. The Hedvig Storage Cluster acts as the object (S3/Swift) target, which clients can utilize to store and access objects.
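
Because the cluster presents a standard S3-compatible target, any ordinary S3 client can talk to it. In the Python sketch below, only the endpoint URL and credentials are placeholders for a Hedvig deployment; the calls themselves are the standard boto3 API.

    import boto3  # AWS SDK for Python; works with any S3-compatible endpoint

    s3 = boto3.client(
        "s3",
        endpoint_url="https://hedvig-s3.example.com",   # placeholder Hedvig S3 target
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )

    s3.create_bucket(Bucket="backups")   # internally mapped to a virtual disk
    s3.put_object(Bucket="backups", Key="db.dump", Body=b"example object data")
    print(s3.list_objects_v2(Bucket="backups").get("KeyCount"))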

Hedvig Storage Proxy caches

Figure 6: Hedvig Storage Proxy caching mechanisms 

Metacache

The metacache is a mandatory cache of the metadata that is stored locally on the Hedvig Storage Proxy, preferably on SSDs. This cache prevents the need to traverse the network for metadata lookups, leading to significant read acceleration.

Block cache

For virtual disks enabled with the Client-side Caching option during provisioning, the block cache stores a working set of disk blocks to local SSD/PCIe drives to accelerate reads. By returning blocks directly from local flash media, read operations avoid network hops when accessing recently used data.

Dedupe cache

For virtual disks enabled with the Enable Deduplication option during provisioning, the dedupe cache resides on local SSD media and stores fingerprint information for virtual disks that use the deduplication policy. This cache allows the storage proxy to determine whether blocks have been previously written and, if so, to bypass the need to write the data over the network to the storage cluster.

The storage proxy first queries the cache to determine whether the data is a duplicate. If so, the storage proxy simply updates the Pages process to map the new block(s) and immediately sends a write acknowledgement back to the application. If the data is unique, the storage proxy queries the Pages process to see if the data has been written anywhere in the cluster. If so, the dedupe cache and the Pages process are updated, and the acknowledgement goes back to the client. If not, the write proceeds as a normal new data write.
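
The decision flow in the preceding paragraph can be summarized in a short Python sketch. The SHA-256 fingerprint and the set-based stand-ins for the dedupe cache and the Pages process are assumptions for illustration; the source does not specify Hedvig's hash function or internal interfaces.

    import hashlib

    def write_block(block: bytes, dedupe_cache: set, cluster_fingerprints: set) -> str:
        """Hedged sketch of the dedupe write path described above."""
        fp = hashlib.sha256(block).hexdigest()   # fingerprint (hash choice assumed)

        if fp in dedupe_cache:
            # Local cache hit: update metadata only and ack immediately,
            # sending no data over the network.
            return "ack (local dedupe hit)"

        if fp in cluster_fingerprints:
            # Already written somewhere in the cluster: update the local
            # cache and the Pages mapping, then ack.
            dedupe_cache.add(fp)
            return "ack (cluster dedupe hit)"

        # Unique data: proceed as a normal new write to the storage cluster.
        dedupe_cache.add(fp)
        cluster_fingerprints.add(fp)
        return "normal write"

    cache, cluster = set(), set()
    print(write_block(b"block A", cache, cluster))   # normal write
    print(write_block(b"block A", cache, cluster))   # ack (local dedupe hit)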

I/O basics

The Hedvig Distributed Storage Platform delivers a unique distributed systems architecture, but overall storage operations will be very familiar to server, storage or virtualization administrators.

Here is an example of the usual workflow.

  1. An administrator provisions a Hedvig Virtual Disk with the associated storage policies via the Hedvig WebUI, CLI or RESTful API.
  2. Block and file virtual disks are attached to a Hedvig Storage Proxy, which presents the storage to application hosts.
    In the case of object storage, applications directly interact with the virtual disk via Amazon S3 or OpenStack Swift protocols.
  3. The Hedvig Storage Proxy captures guest I/O through the native storage protocol and communicates it to the underlying Hedvig Storage Cluster via remote procedure calls (RPCs).
  4. The Hedvig Storage Service distributes and replicates data throughout the cluster based on individual virtual disk policies.
  5. The Hedvig Storage Service conducts background processes to auto-tier and balance across racks, data centers and even public clouds based on individual virtual disk policies.

The following two sections provide more detail about the write and read operations.

Summary of write operations

Figure 7: Hedvig I/O workflow for Write operations 
  1. The Hedvig Storage Proxy determines the replica nodes for the blocks to be written and sends the blocks to one of the replica nodes in a load-balanced manner. If the virtual disk is enabled for deduplication, the storage proxy calculates a fingerprint, queries the dedupe cache and, if necessary, the Pages process, and either makes a metadata update or proceeds with a new write.
  2. The HBlock process on the replica node receives and writes the blocks locally and forwards them to the other replica nodes. The HBlock process writes the incoming blocks to memory and to the Hedvig Commit Log on SSD. Data is later flushed sequentially to local storage (see step 4).
  3. A successful write acknowledgment is sent back to the client after a quorum of HBlock process replicas have completed step 2. For Replication Factor = 3, the quorum is two ACKs (⌊RF/2⌋ + 1; see the sketch after this list). Two of the three replicas are written synchronously, and one is written asynchronously.
  4. An atomic write is made into the metadata subsystem, after which the write is deemed successful.
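
The quorum arithmetic in step 3 is a simple majority, as this small sketch shows; the simulated acknowledgment counts are illustrative.

    def quorum(rf: int) -> int:
        """Acks required before a write is reported successful: floor(RF/2) + 1."""
        return rf // 2 + 1

    def write_acknowledged(rf: int, acks_received: int) -> bool:
        return acks_received >= quorum(rf)

    print(quorum(3))                   # 2 acks needed for Replication Factor = 3
    print(write_acknowledged(3, 2))    # True: the client receives success here;
                                       # the third replica completes asynchronously.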

Summary of read operations

Figure 8: Hedvig I/O workflow for Read operations 
  1. The Hedvig Storage Proxy queries the local metacache for a particular block to be read and consults the Pages process if the information is not found.
  2. The Hedvig Storage Proxy sends the block details to one of the closest HBlock processes, based on observed latency.
  3. The HBlock process reads the data and sends the block(s) back if found.

If the read operation fails due to any error, the read is attempted from another replica.

If the Client-side Caching option is enabled during provisioning, the Hedvig Storage Proxy queries the local cache to fetch the data instead, bypassing the remote HBlock process and eliminating the need to traverse the network.
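
Putting the read path together, here is a hedged Python sketch of the proxy's lookup order: metacache, then the optional block cache, then the lowest-latency replica with fallback on error. All data structures and field names are illustrative stand-ins, not Hedvig's internal API.

    def read_block(block_id, metacache, pages, block_cache, replicas):
        """Illustrative read path for a Hedvig Storage Proxy."""
        # 1. Resolve replica locations: local metacache first, Pages on a miss.
        location = metacache.get(block_id)
        if location is None:
            location = pages[block_id]       # Pages process lookup (stubbed as a dict)
            metacache[block_id] = location   # warm the metacache for the next read

        # 2. With Client-side Caching enabled, local flash may satisfy the
        #    read with no network hop at all.
        if block_cache is not None and block_id in block_cache:
            return block_cache[block_id]

        # 3. Otherwise read from the lowest-latency replica, falling back
        #    to the remaining replicas on failure.
        for node in sorted(location, key=lambda n: replicas[n]["latency_ms"]):
            data = replicas[node]["blocks"].get(block_id)
            if data is not None:
                return data
        raise IOError(f"all replicas failed for block {block_id}")

    pages = {"blk7": ["node1", "node2", "node3"]}
    replicas = {
        "node1": {"latency_ms": 0.4, "blocks": {"blk7": b"payload"}},
        "node2": {"latency_ms": 0.9, "blocks": {"blk7": b"payload"}},
        "node3": {"latency_ms": 1.3, "blocks": {}},
    }
    print(read_block("blk7", metacache={}, pages=pages, block_cache=None, replicas=replicas))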

Enterprise storage capabilities

Scalability and performance

The Hedvig Distributed Storage Platform is an elastic storage offering that scales from a few terabytes to tens of petabytes.

The platform scales performance and capacity by adding backend storage nodes. Scaling out of a cluster is completely automated and seamless.

The compute tier can also be scaled out (independent of the backend) by adding Hedvig Storage Proxies. If SSD assets are provided to the storage proxies, then read performance will scale linearly.

In general, it is recommended that you keep storage nodes in homogeneous groups for consistent performance. However, Hedvig supports heterogeneous nodes with different hardware specifications throughout the cluster. You can add groups of nodes as the system grows to take advantage of faster processors and higher capacity/performance drives as they become available.

Note that while the performance of the system is subject to the workload and the actual system configuration, the Hedvig Distributed Storage Platform can support sub-millisecond latency and scale to tens of thousands of IOPS and hundreds of MB/sec per virtual disk.

Note: See the Hedvig hardware guide for specific recommendations on capacity and performance-optimized configurations and expected performance.

Storage efficiency

The Hedvig Distributed Storage Platform contains a rich set of advanced storage efficiency capabilities, grouped in five major categories: thin provisioning, deduplication, compression, compaction and auto-tiering.

Thin provisioning

Each Hedvig Virtual Disk is thinly provisioned by default and does not consume capacity until data is written. This space-efficient dynamic storage allocation capability is especially significant in DevOps environments that use Docker, OpenStack and other cloud platforms, where volumes do not inherently support thin provisioning but gain it through Hedvig.

Deduplication

The Hedvig Distributed Storage Platform supports inline global deduplication that delivers space savings across the entire storage cluster. Deduplication is not “one size fits all”: it can be toggled at the virtual disk level to optimize I/O and lower the cost of storing data for applications whose data is suited to reduction. As writes occur, the storage system calculates the unique fingerprint of each data block and replaces redundant data with a small pointer. The deduplication process can be configured to begin at the storage proxy, as previously mentioned, improving write performance and eliminating redundant data transfers over the network. Data reduction rates vary based on data type, with most clusters seeing an average reduction of 75%.

Compression

The Hedvig Distributed Storage Platform supports inline compression that can be toggled at the virtual disk (application) level to optimize capacity usage. Hedvig uses the standard Snappy compression library and stores only compressed data on-disk. While the actual compression ratio and speed depend on data type and system configuration, Hedvig itself never looks into the data to perform any content-specific compression.
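
Because Snappy is a widely available library, the compress-on-write behavior is easy to demonstrate. The snippet below uses the python-snappy bindings, an assumption about tooling made for illustration, not a component of the Hedvig stack.

    import snappy  # python-snappy bindings for the Snappy library

    raw = b"log line: status=OK " * 1000         # repetitive, highly compressible data
    compressed = snappy.compress(raw)            # the form that would be stored on-disk
    assert snappy.decompress(compressed) == raw  # lossless round trip

    print(f"{len(raw)} bytes raw -> {len(compressed)} bytes compressed")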

Compaction

To improve read performance, as well as to optimize disk space, Hedvig periodically performs garbage collection to compact redundant blocks and generate large sequential chunks of data.

Auto-tiering

The Hedvig Distributed Storage Platform balances performance and cost by supporting tiering of data. To accelerate read operations, the platform supports Client-side Caching of data on SSDs accessible by the storage proxy. Data is also cached on storage node SSDs. For all caching activities, Hedvig supports the use of PCIe and NVMe SSDs. All writes are executed in memory and flash (SSD/NVMe) and flushed sequentially to disk when the appropriate thresholds are met. For persistent storage of data, Hedvig supports Flash (MLC or 3D NAND SSD) or HDD (spinning disk) residence options at the virtual disk level.

Enterprise resiliency

The Hedvig Distributed Storage Platform is designed to survive disk, node, rack and data center outages without any application downtime and with minimal performance impact. These resiliency features are grouped in five categories: high availability, non-disruptive upgrades (NDU), disk failures, replication, and snapshots and clones.

High availability

Storage nodes running the Hedvig Storage Service support a distributed redundancy model with a recommended minimum of three nodes. Redundancy can be set as agnostic, at the rack level or at the data center level. The system initiates transparent failover in case of failure. During node, rack or site failures, reads and writes continue as usual from remaining replicas.

Figure 9: Hedvig Storage Service redundancy options

To protect against a single point of failure, Hedvig Storage Proxies install as a high availability (HA) active/passive pair. A virtual IP (VIP) assigned to the HA pair redirects network traffic automatically to the active storage proxy at any given time. If one storage proxy instance is lost or interrupted, operations fail over seamlessly to the passive instance to maintain availability. This happens without requiring any intervention by applications, administrators or users.

Figure 10: Hedvig Storage Proxy high availability 

During provisioning, administrators can indicate that a host will use a clustered file system. This automatically sets internal configuration parameters to ensure seamless failover when using VM migration to a secondary physical host running its own Hedvig Storage Proxy. During live VM migration, such as VMware vMotion or Microsoft Hyper-V Live Migration, any necessary block and file storage “follows” guest VMs to another host.

Non-disruptive upgrades (NDUs)

The Hedvig Distributed Storage Platform supports non-disruptive software upgrades by staging and rolling the upgrade across individual components, using the highly available nature of the platform to eliminate any downtime or data unavailability.
Storage nodes running the Hedvig Storage Service are upgraded first, one node at a time. I/O continues to be serviced from alternate available nodes during the process. Storage proxies are upgraded next, starting with the passive storage proxy. After the passive storage proxy upgrade is complete, it is made active, and the formerly active storage proxy is upgraded and resumes as the passive member of the pair. This process eliminates any interruption to reads or writes during the upgrade procedure.

Disk failures

The Hedvig Distributed Storage Platform supports efficient data rebuilds that are initiated automatically when there is a disk failure. The Hedvig Storage Service recreates data from other replicas across the cluster.

Figure 11: Hedvig provides self-healing from disk failures 

The rebuild is an efficient background process that happens without impact to primary I/O. Rebuild time reduces as the size of a cluster increases. As such, the platform easily supports the latest 8TB and higher capacity drives, which are not feasibly supported by traditional RAID-protected systems.

Replication

The Hedvig Storage Service uses a combination of synchronous and asynchronous replication processes (two of the three replicas are written synchronously, and one is written asynchronously) to distribute and protect data across the cluster and provide near-zero recovery point objectives (RPO) and recovery time objectives (RTO).

Hedvig supports an unlimited number of active data centers in a single cluster with up to six copies of data per virtual disk, through the tunable replication factor and replication policy options. Replication factor designates the number of replicas to create for each virtual disk, and replication policy defines the destination for the replicas across the cluster.

It is important to note that replication occurs at the container level of abstraction. If a 100 GB virtual disk with Replication Factor = 3 is created, the entire 100 GB is not stored as contiguous chunks on three nodes. Instead, the 100 GB is divided among several containers, and the replicas of each container are spread across different storage pools within the cluster.
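
To make the arithmetic concrete, the sketch below assumes a 16 GiB container size (illustrative; the source does not state the actual container size) and computes how the 100 GB example breaks down.

    GiB = 2**30
    CONTAINER_SIZE = 16 * GiB    # assumed container size, for illustration only

    disk_size = 100 * GiB
    rf = 3                       # Replication Factor = 3

    containers = -(-disk_size // CONTAINER_SIZE)   # ceiling division: 7 containers
    replicas = containers * rf                     # 21 container replicas in total
    print(f"{containers} containers x RF {rf} = {replicas} container replicas "
          f"spread across the cluster's storage pools")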

The table below summarizes the replication factor recommendations for different workloads and the associated system impact.

Replication Factor = 1
  • non-critical data (test/dev and data that can be rebuilt for batch jobs/transcoding)
  • highest write performance
  • lowest capacity overhead (raw capacity = usable capacity)
  • deduplication/compression optional for capacity efficiency

Replication Factor = 2
  • non-critical availability with average data protection
  • higher write performance
  • low capacity overhead (raw capacity = 2x usable capacity)
  • deduplication/compression optional for capacity efficiency

Replication Factor = 3
  • mission-critical availability and data protection
  • high write performance
  • medium capacity overhead (raw capacity = 3x usable capacity)
  • deduplication/compression optimal to reduce overhead

For additional disaster recovery protection against rack and data center failures, the Hedvig Distributed Storage Platform supports replication policies that can span multiple racks or data centers using structured IP addressing, DNS naming/suffix or customer-defined snitch endpoints. Hedvig recommends using the fully qualified domain name to identify the rack and datacenter location, for example, node.rack.datacenter.localhost.

Replication policy: data distribution methodology
  • Agnostic: Data is spread across the cluster using a “best-effort” approach to improve availability.
  • Rack Aware: Data is spread across as many physically distinct racks as possible, in a single data center.
  • Data Center Aware: Data replicates to additional physical sites, which can include private or hosted data centers and public clouds. Locations can be specified from the Hedvig CLI or selected from a dropdown list in the Hedvig WebUI.

Replication example

In a disaster recovery setup where the Replication Policy = Data Center Aware and the Replication Factor = 3, the Hedvig Distributed Storage Platform divides the data into Hedvig Containers and ensures that three copies of each container are spread to geographically dispersed physical sites — Data Centers A, B and C. Two copies of the data are written synchronously, and the third is written asynchronously. At any time, if a data copy fails, re-replication is automatically initiated from replicas across the data centers.
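
A hedged sketch of Data Center Aware placement using the FQDN convention recommended earlier (node.rack.datacenter): one replica is chosen per distinct data center, up to the replication factor. The parsing and selection logic are illustrative, not Hedvig's snitch implementation.

    def datacenter(fqdn: str) -> str:
        """Extract the data center label from node.rack.datacenter[.suffix]."""
        return fqdn.split(".")[2]

    def place_dc_aware(nodes: list[str], rf: int) -> list[str]:
        """Pick one replica node in each distinct data center, up to RF."""
        chosen, seen_dcs = [], set()
        for node in nodes:
            dc = datacenter(node)
            if dc not in seen_dcs:
                chosen.append(node)
                seen_dcs.add(dc)
            if len(chosen) == rf:
                return chosen
        raise RuntimeError(f"only {len(seen_dcs)} data centers available, RF={rf} requested")

    nodes = ["n1.r1.dcA.localhost", "n2.r1.dcA.localhost",
             "n3.r2.dcB.localhost", "n4.r1.dcC.localhost"]
    print(place_dc_aware(nodes, rf=3))   # one replica each in dcA, dcB and dcC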

Figure 12: Example of a three-data center disaster recovery setup 

Snapshots and clones

In addition to replication policies, data management tasks include taking snapshots and making “zero-copy” clones of virtual disks. There is no limit to the number of snapshots or clones that can be created. Snapshots and clones are space-efficient, requiring capacity only for changed blocks.

You can set new policies for virtual disk clones, for example selecting a different replication factor and/or residence. Snapshot management is intuitive and can be done in the Hedvig WebUI or CLI or automated via the Hedvig RESTful API.

Figure 13: Hedvig WebUI: “Snapshot Management” dialog 

Encryption

Hedvig provides software-based encryption with the Encrypt360 feature. This enables encryption of data at the point of ingestion (on the storage proxy server). Data encrypted in this way remains protected in flight between the storage proxy and the storage nodes, in flight between storage nodes (or sites) as part of replication, in use at the storage proxy and at rest.

Encryption is 256-bit AES and is intended to help meet compliance requirements such as PCI, HIPAA and Gramm-Leach-Bliley. Hedvig integrates with KMIP-compliant key management systems (KMS), and any third-party key management system can be plugged in to alleviate key management concerns.
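
The encrypt-at-ingest pattern can be illustrated with 256-bit AES-GCM from Python's cryptography package. This is a generic sketch of the pattern (encrypt before data leaves the proxy tier), not the Encrypt360 implementation; a real deployment would source keys from the KMIP-compliant KMS rather than generating them locally.

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=256)   # in production, fetched from the KMS
    aead = AESGCM(key)

    block = b"application data block"
    nonce = os.urandom(12)                      # must be unique per encryption
    ciphertext = aead.encrypt(nonce, block, None)

    # Data remains protected in flight and at rest; decryption happens only
    # where the key is authorized.
    assert aead.decrypt(nonce, ciphertext, None) == block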

Manageability

The Hedvig Distributed Storage Platform is designed to be simple and intuitive for storage, server, virtualization and DevOps professionals. Complex storage operations can be completed in seconds. Hedvig supports three interfaces for deployment and ongoing management: the Hedvig WebUI, the Hedvig CLI and the Hedvig RESTful API.

Hedvig WebUI

The Hedvig Distributed Storage Platform provides a simple, intuitive graphical user interface — the Hedvig WebUI — which is customizable and skinnable (new themes can be applied). It supports a rich set of metrics per virtual disk or per storage proxy. Administrators can get real-time insights into performance, including IOPS, throughput and latency statistics. The WebUI is delivered with HTML 5 support and works responsively across all modern devices, including locked-down servers and mobile phones.

Figure 14: Hedvig WebUI: Across multiple devices with multiple themes 

Hedvig CLI

The Hedvig Distributed Storage Platform provides a comprehensive Command Line Interface — the Hedvig CLI — which is fully scriptable and provides complete control of all features and functionality. The CLI is accessible via an SSH connection, using a Linux shell or a PuTTY-like utility.

Hedvig RESTful API

The Hedvig Distributed Storage Platform provides a RESTful API to simplify configuration, provisioning, management and monitoring. The API calls are a “first class citizen” in the platform, providing access to all Hedvig capabilities. This is essential for DevOps environments and ensures seamless integration into automation and orchestration frameworks.

Hedvig also uses API calls to provide storage support for OpenStack (Cinder and Swift), Amazon S3, the Docker Volume API and the VMware vSphere Storage APIs Array Integration (VAAI). Hedvig is continuously expanding its API support and integration.

Ecosystem integration

VMware

The Hedvig Distributed Storage Platform features a vCenter plug-in that enables provisioning, management, snapshotting and cloning of Hedvig Virtual Disks directly from the vSphere Web Client, as shown in Figure 15.

Additionally, Hedvig incorporates support for the VMware vSphere Storage APIs Array Integration (VAAI), allowing the offloading of host operations to the platform.

Figure 15: Hedvig vSphere Web Client Plugin 

Docker

The Hedvig Distributed Storage Platform provides persistent storage for Docker containers through the Hedvig Volume plugin.

The Hedvig Volume plugin enables an end user to create a persistent Docker volume backed by a Hedvig Virtual Disk. Different options, such as deduplication, compression, replication factor and block size, can be set for each Docker volume, using the “Volume options” in the Docker Universal Control Plane (UCP) or using the “docker volume” command line. The disk can then be attached to any host.

The Hedvig Volume plugin also creates a file system on this virtual disk and mounts it using the path provided by the user. The file system type can also be configured by the user. All I/O to the Docker volume goes to the Hedvig Virtual Disk. As the container moves in the environment, the virtual disk will automatically be made available to any host, and data will be persisted using the policies chosen during volume creation.
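
Using the Docker SDK for Python, creating a persistent volume backed by a Hedvig Virtual Disk might look like the sketch below. The driver name and option keys are illustrative assumptions; consult the Hedvig Volume plugin documentation for the exact names.

    import docker  # Docker SDK for Python

    client = docker.from_env()

    # Hypothetical driver name and volume options mirroring the policies above.
    volume = client.volumes.create(
        name="pgdata",
        driver="hedvig",                  # assumed plugin name
        driver_opts={
            "replicationFactor": "3",     # option keys are illustrative
            "deduplication": "true",
            "compression": "true",
            "blockSize": "4096",
        },
    )

    # Attach the volume to a container; data persists per the chosen policies
    # and follows the container to any host.
    client.containers.run(
        "postgres:16",
        detach=True,
        environment={"POSTGRES_PASSWORD": "example"},
        volumes={"pgdata": {"bind": "/var/lib/postgresql/data", "mode": "rw"}},
    )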

For container orchestration platforms, such as Kubernetes and OpenShift, the Hedvig Distributed Storage Platform provides persistent storage for containers through the Hedvig Dynamic Provisioner.

OpenStack

The Hedvig Distributed Storage Platform delivers block, file and object storage for OpenStack all from a single platform via native Cinder and Swift integration. Using Hedvig, you can set granular, per-volume (Cinder) or per-container (Swift) policies for capabilities, such as compression, deduplication and snapshots and clones.

Figure 16: Hedvig components for OpenStack support 

OpenStack administrators can provision the full set of Hedvig storage capabilities in OpenStack Horizon via OpenStack’s QoS functionality. As with VMware, administrators do not need to use the Hedvig WebUI or RESTful API. Storage can be managed from within the OpenStack interface.

Hedvig also provides a validated Mirantis Fuel plugin that automates installation of Hedvig storage drivers with Mirantis OpenStack.

Multitenancy

Hedvig supports the use of Rack Aware and Data Center Aware replication policies for customers who must satisfy regulatory compliance and restrict certain data by region or site.

Additionally, these capabilities provide the backbone of a multitenant architecture, which Hedvig supports with three forms of architectural isolation.

1. LUN masking — With this option, different tenants are hosted on a shared (virtual) infrastructure, as shown in Figure 17. Logical separation is achieved by presenting virtual disks only to certain VMs and/or physical hosts (IP range). Quality of Service (QoS) is delivered at the VM level.

Figure 17: Hedvig multitenancy capability: LUN masking 

2. Dedicated storage proxies — With this option, storage access is provided with a dedicated Hedvig Storage Proxy per tenant, as shown in Figure 18. Storage proxies can be deployed on a physical host or a dedicated shared host. This provides storage as a shared infrastructure, while compute is dedicated to each tenant. Quality of Service (QoS) is at the VM level.

Figure 18: Hedvig multitenancy capability: Dedicated storage proxies 

3. Complete physical isolation — With this option, different tenants are hosted on dedicated Hedvig Storage Clusters (each running its own Hedvig Storage Service and Hedvig Storage Proxies) to provide complete logical and physical separation between tenants, as shown in Figure 19. The number of clusters does not impact the licensing model.


Figure 19: Hedvig multitenancy capability: Complete physical isolation 

For all of these multitenant architectures, each tenant can have unique virtual disks with tenant-specific storage policies, because the Hedvig platform supports policies at the virtual disk level.

Policies can be grouped to create classes of service (CoS), as shown in the following examples:

  • CoS = Gold. Virtual Disk properties: Residence = Flash, Replication Factor = 3, Replication Policy = Data Center Aware, Client-side Caching = enabled, Deduplication = enabled.
  • CoS = Silver. Virtual Disk properties: Residence = HDD, Replication Factor = 3, Replication Policy = Rack Aware, Client-side Caching = enabled, Deduplication = enabled, Compression = enabled.
  • CoS = Bronze. Virtual Disk properties: Residence = HDD, Replication Factor = 1, Replication Policy = Agnostic, Client-side Caching = disabled, Deduplication = disabled, Compression = disabled.
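
Because every policy is applied per virtual disk, these classes of service reduce to reusable policy templates. A minimal sketch, with illustrative field names:

    # Classes of service expressed as reusable virtual disk policy templates.
    CLASSES_OF_SERVICE = {
        "gold": {
            "residence": "Flash", "replicationFactor": 3,
            "replicationPolicy": "DataCenterAware",
            "clientSideCaching": True, "deduplication": True,
        },
        "silver": {
            "residence": "HDD", "replicationFactor": 3,
            "replicationPolicy": "RackAware",
            "clientSideCaching": True, "deduplication": True, "compression": True,
        },
        "bronze": {
            "residence": "HDD", "replicationFactor": 1,
            "replicationPolicy": "Agnostic",
            "clientSideCaching": False, "deduplication": False, "compression": False,
        },
    }

    def provision_request(name: str, size: str, cos: str) -> dict:
        """Merge a CoS template into a virtual disk request (illustrative)."""
        return {"name": name, "size": size, **CLASSES_OF_SERVICE[cos]}

    print(provision_request("tenant1-db", "2TB", "gold"))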

Summary and conclusion

The Hedvig Distributed Storage Platform is the only software-defined storage solution built with true distributed system DNA. With the capability to connect to any OS, hypervisor, container or cloud, this unique platform also has the versatility to deploy in hyperscale or hyperconverged mode.

Scaling seamlessly and linearly from a few nodes to thousands of nodes, the Hedvig Distributed Storage Platform provides a rich set of enterprise storage capabilities that can be configured at the virtual disk (application) level. The architecture is implicitly hybrid and protects each application with its own custom disaster recovery policy across multiple data centers or clouds.

The elastic, simple and flexible Hedvig Distributed Storage Platform is the solution of choice for enterprises seeking to modernize data centers and build private and hybrid clouds.

For environments where explosive growth in data is affecting the bottom line, both in the cost-per-terabyte of storing the data and in the operational overhead of managing silos of disparate storage infrastructure, the Hedvig Distributed Storage Platform is the perfect fit.

Simply stated, the Hedvig Distributed Storage Platform empowers IT to reduce costs and to improve business responsiveness.