VAST Data Unveils New VAST Data Platform

VAST Data Platform storage

Context

Traditional storage products focus, of course, on the business of storing data. A typical enterprise-class storage array might look at the bits flowing through the system to encrypt or decrypt data, but it doesn’t attempt to interpret that data. There is no real “data intelligence” in a storage array. Interpreting and understanding the bits within a storage system is the job of a software stack that typically lives somewhere else.

The software stack for a modern analytics solution can become very complicated, involving many software components. VAST Data shared a slide, reproduced below, illustrating this complexity. Modern analytics and AI workloads could be vastly more efficient with a data store that assists in understanding the data.

Graphic illustrating the before-and-after complexity in AI stack.
Simplifying the AI Stack. VAST DATA

VAST has been slowly enabling increasing levels of data awareness within its products. This week, the company unveiled more elements, while teasing a future one, that prove VAST Data is true to its word. VAST is a data company whose product is the VAST Data Platform.

News: The VAST Data Platform

The VAST Data Platform extends VAST’s universal storage capabilities with new data access and analytics features. The VAST Data Platform combines the VAST DataStore, VAST DataBase, and VAST DataSpace. The company also teased the VAST DataEngine, coming sometime next year. Let’s look at each of these elements.

It Starts with the VAST DataStore

The VAST DataStore is the product most associated with VAST Data. VAST can linearly scale its multi-protocol file services to exabytes of data using its unique disaggregated, shared-everything architecture. It delivers this using standard off-the-shelf server and storage hardware. The VAST DataStore supports the full range of features you want to see in an enterprise storage product, from data protection to security.

Graphic showing the capabilities of VAST DataStore
VAST DataStore. VAST DATA

The first public proof of VAST’s ambitions to move up the data stack came earlier this year with its announcement of the VAST Data Catalog (and, implicitly, the VAST DataBase). The data catalog is a feature of VAST’s Universal Storage that lets users tag unstructured data with user-defined metadata, which is collected into a queryable table for future analysis – essentially giving structure to unstructured data.

The VAST Data Catalog eliminates the need to perform inefficient operations that walk filesystems to build this data, allowing users and administrators to gain instant insights directly from the storage system. This is foundational to what VAST Data is delivering.
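To make the catalog idea concrete, here is a minimal sketch of tagged file metadata held in a queryable table. It uses Python's built-in sqlite3 module purely for illustration and is not VAST's API or schema; the table name, columns, paths, and tags are all hypothetical.

```python
import sqlite3

# Hypothetical sample entries: (path, size in bytes, user-defined tag).
SAMPLE = [
    ("/scans/lidar_001.bin", 4096, "lidar"),
    ("/images/cat.jpg", 2048, "image"),
    ("/scans/lidar_002.bin", 8192, "lidar"),
]

def build_catalog(entries):
    """Create an in-memory catalog table and load the given entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog (path TEXT PRIMARY KEY, size_bytes INTEGER, tag TEXT)"
    )
    conn.executemany("INSERT INTO catalog VALUES (?, ?, ?)", entries)
    conn.commit()
    return conn

def find_by_tag(conn, tag):
    """Return the paths carrying a tag, answered from the table alone."""
    rows = conn.execute(
        "SELECT path FROM catalog WHERE tag = ? ORDER BY path", (tag,)
    )
    return [row[0] for row in rows]

if __name__ == "__main__":
    conn = build_catalog(SAMPLE)
    print(find_by_tag(conn, "lidar"))
```

The point of the sketch is that the lookup is a single table query, with no directory tree to traverse at query time.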

VAST DataBase: Enabling Analytics

As VAST tells it, the VAST DataBase combines an exabyte-scale namespace for natural data types such as images, video, LIDAR, genomes, and other rich, real-world data with a tabular database to hold the catalog of metadata associated with that data. This metadata includes user-defined tags.

The VAST DataBase provides easy integration with nearly all the most popular data wrangling and query interfaces, including Apache Spark, Parquet, Databricks, RAPIDS, and Vertica (among others). VAST also has its own SQL-based query language, and a feature-rich API, for those who want to get even closer to the data.

Chart showing integrations possible with VAST DataBase.
VAST DataBase Connectivity. VAST DATA

VAST has performed significant tuning to match the needs of data analytics with the sometimes-different needs of storing and managing data. When designing the VAST DataBase, the company built its own database engine instead of leveraging an existing open-source product.

This has paid off. VAST’s approach to storing columnar data has enabled the VAST DataBase to achieve remarkable levels of query filtration, reducing the number of records a query engine must sift through.
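VAST did not detail the mechanism in this announcement, so the following is only a generic sketch of what filtering at the storage layer can look like. It assumes per-block minimum and maximum statistics, a common columnar technique that I am substituting for illustration; the `Block` class, function names, and data are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Block:
    """One block of a single numeric column (hypothetical layout)."""
    values: list

    @property
    def hi(self):
        return max(self.values)

def scan_all(blocks):
    """Unfiltered store: hand every row to the query engine."""
    return [v for block in blocks for v in block.values]

def scan_filtered(blocks, threshold):
    """Filtering store: skip blocks that cannot match, then filter rows."""
    out = []
    for block in blocks:
        if block.hi < threshold:
            continue  # no value in this block can satisfy v >= threshold
        out.extend(v for v in block.values if v >= threshold)
    return out

BLOCKS = [Block([1, 2, 3]), Block([4, 5, 6]), Block([7, 8, 9])]

if __name__ == "__main__":
    # Same answer either way; the difference is how many rows the engine receives.
    engine_side = [v for v in scan_all(BLOCKS) if v >= 8]
    print(len(scan_all(BLOCKS)), "rows sent without filtering")
    print(len(scan_filtered(BLOCKS, 8)), "rows sent with filtering")
    print(engine_side == scan_filtered(BLOCKS, 8))
```

Both paths produce the same result set. What changes is the volume of data the query engine has to receive and discard.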

VAST illustrated the power of its VAST DataBase, comparing the same Trino query on the VAST DataBase and on a Fast S3 datastore. The query against Fast S3 returned 580M rows of data in just over 40 seconds, while the VAST DataBase returned just 2,000 rows in only 1.84 seconds. Those are stunning results. I’m eager to see if real-world performance is as impressive.

Illustration showing the query performance of VAST DataBase
Query Performance. VAST DATA

VAST DataSpace: Data Consistency without Performance Tradeoffs

One of the critical challenges of managing data across a distributed architecture is managing locks on the data. Lock management can make or break a distributed system. This is something that every distributed file system vendor is relentlessly focused on. VAST is no different, solving the distributed data problem with its new VAST DataSpace architecture.

The VAST DataSpace does two things well: it implements an element-level (e.g., file, object, table) locking scheme, and it contains a unique cache integrity architecture that ensures read consistency without sacrificing performance on hot data.

As VAST Data describes it, reads can achieve peak performance while writes maintain consistency. This happens because, as a write happens, all globally cached copies of that element are removed. At the same time, any references are directed to the cluster holding the lock. There’s not enough space here to do justice to what VAST has delivered, so I encourage the curious reader to walk through VAST Data’s description of the technology.
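For readers who think better in code, the write path described above can be sketched as a toy, single-process model. This is my own simplification, not VAST's implementation: the `GlobalNamespace` class, its methods, and the cluster names are hypothetical, and there is no real networking or concurrency control here.

```python
class GlobalNamespace:
    """Toy model: per-cluster caches plus one lock owner per element."""

    def __init__(self, cluster_names):
        self.caches = {name: {} for name in cluster_names}
        self.owner = {}  # element -> cluster currently holding its lock
        self.data = {}   # element -> authoritative value at the owner

    def write(self, cluster, element, value):
        """Take the element-level lock, store the value, drop cached copies."""
        self.owner[element] = cluster
        self.data[element] = value
        for cache in self.caches.values():
            cache.pop(element, None)  # remove every globally cached copy

    def read(self, cluster, element):
        """Serve from the local cache, else fetch from the lock holder."""
        cache = self.caches[cluster]
        if element in cache:
            return cache[element], "cache"
        value = self.data[element]  # redirected to the owning cluster
        cache[element] = value
        return value, self.owner[element]

if __name__ == "__main__":
    ns = GlobalNamespace(["nyc", "lon"])
    ns.write("nyc", "report.csv", "v1")
    print(ns.read("lon", "report.csv"))  # first read goes to the lock holder
    print(ns.read("lon", "report.csv"))  # repeat read is served locally
    ns.write("nyc", "report.csv", "v2")
    print(ns.read("lon", "report.csv"))  # cache was invalidated by the write
```

In this model, repeat reads of hot data stay local, while any write forces the next read back to the cluster holding the lock, so a reader never sees a stale cached copy.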

The end result of all this technology is that the VAST Data Platform enables global access from the edge to the cloud using unified file and object semantics, as well as table APIs. VAST does this without sacrificing performance.

Analysis

I talked to a technology reporter just after VAST’s launch event for the VAST Data Platform. He asked me whether the VAST announcement will disrupt the legacy storage vendors. I don’t think it will. There’s a traditional approach to storage that’s deeply embedded within nearly every tier-one storage company, and there’s little motivation to change that. Seeing the world through the same lens as VAST Data requires a specific kind of vision, different from the one most legacy OEMs possess.

The reality is that traditional storage-focused offerings work just fine for most of today’s enterprise workloads. Where classic storage breaks down is along the edges, where extreme performance and scalability live, and where globally distributed data may be important. This is also where the future of enterprise IT might live, where real-time analytics and data-hungry AI clusters are more prominent.

VAST Data may be ahead of the technology curve, but that’s OK. VAST enables a specific future, but it allows its users to adapt at their own speed. VAST isn’t charging a premium for any of its features. Customers can deploy traditional multi-protocol storage features and, as needs evolve, begin to take advantage of what the VAST DataBase and VAST DataSpace offer. As analytics and AI of all varieties permeate the enterprise, VAST is ready to take care of those workloads.

Those in the digital transformation business like to talk about how an enterprise’s data impacts its competitiveness. Central to the task of making data a competitive differentiator is making data queryable. That’s a complex challenge, or at least it was until VAST Data unveiled its VAST Data Platform this week. VAST Data simplifies the understanding of an enterprise’s data. It does this while offering some of the most performant and feature-rich storage technology available.

Disclosure: The author is an industry analyst, and NAND Research an industry analyst firm, that engages in, or has engaged in, research, analysis, and advisory services with many technology companies, which may include those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.