Unstructured Data Management for AI and Beyond Just Got Faster, Easier and More Cost-Effective

Unstructured data accounts for up to 90 percent of the data businesses generate each day, so you need to maximize its value by curating it and delivering it to your general business and generative AI workflows. Three recent innovations in data discovery, reporting and visualization from CloudSoda, the technology-leading platform for automating unstructured data management, make this faster, easier and more cost-effective.

Selectable Scanning for Faster Data Discovery

To ensure efficient data management, CloudSoda scans your entire on-premises and multi-cloud storage environment to find, index and manage your unstructured data folders and files at ultra-high speed. Three new selectable scanning modes for on-premises storage maximize the speed at which unstructured data is discovered, depending on the target:

  • Folder Multi-Threaded: Maximizes scanning speed of a single storage system
  • Volume Multi-Threaded: Maximizes scanning speed across multiple storage systems
  • Hybrid Multi-Threaded: Maximizes scanning speed for deep folder structures with a very high number of files (5 million or more) per folder

Volume Multi-Threaded is the default scanning mode for on-premises environments.

For cloud, CloudSoda provides Single-Threaded scanning mode for individual buckets and Multi-Threaded scanning for simultaneously scanning multiple buckets.
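To give a feel for why multi-threaded scanning speeds up discovery, here is a minimal Python sketch of a parallel folder scan. It is a generic illustration, not CloudSoda's implementation: the function names `scan_folder` and `scan_tree`, the `max_workers` setting and the level-by-level strategy are all assumptions made for this example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def scan_folder(path):
    """List one folder and return (file_records, subfolder_paths)."""
    files, subfolders = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subfolders.append(entry.path)
            elif entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                files.append((entry.path, size))
    return files, subfolders


def scan_tree(root, max_workers=8):
    """Walk a tree level by level, listing each level's folders in parallel.

    Hypothetical helper: the thread count and traversal order are
    illustrative choices, not settings taken from the product.
    """
    index = []
    pending = [root]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending:
            next_level = []
            # Each folder listing is I/O-bound, so threads can overlap the waits.
            for files, subfolders in pool.map(scan_folder, pending):
                index.extend(files)
                next_level.extend(subfolders)
            pending = next_level
    return index


if __name__ == "__main__":
    # Build a tiny throwaway tree so the sketch can be run as-is.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "projects", "raw"))
        for rel in ("a.txt", "projects/b.txt", "projects/raw/c.txt"):
            with open(os.path.join(root, rel), "w") as fh:
                fh.write("demo")
        print(len(scan_tree(root)), "files indexed")
```

Because folder listings mostly wait on storage I/O, running many of them at once tends to keep the storage system busy rather than idle between requests. How much that helps in practice depends on the storage hardware and the shape of the folder tree.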

In competitive bake-offs, CloudSoda has clocked a 4-to-1 performance advantage and consistently pushes performance to the limits of the infrastructure hardware.

Figure 1: Selectable scanning modes

New Reporting Engine for Easier Decision Making

CloudSoda now extends beyond unstructured data discovery screens and Google-style file index search to include flexible reports based on default and user-defined criteria that make it easier to identify and manage files and folders. This new feature offers three selectable report types:

  • Duplicated: Identifies duplicate files and folders across multiple storage locations (storages, folders, and projects) against a reference list.
  • Unique: Highlights unique files and folders in the reference list that do not have copies in the comparison list.
  • Search Query: Enables advanced search queries to flag files and folders based on specific criteria.

New, easy-to-read insights help you ensure that storage is used efficiently, that important files are backed up, that files are not unnecessarily duplicated, and that on-premises storage isn't wasted on data already archived in the cloud. The reports also help you plan and estimate the unstructured data needed to train large language models for generative AI applications.
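The idea behind the Duplicated and Unique reports can be sketched as a comparison between a reference list and a comparison list. The Python below is only an illustration under stated assumptions: it matches files by a SHA-256 content fingerprint, which may not be how CloudSoda matches them, and the function names and example paths are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(data: bytes) -> str:
    """Content fingerprint used for matching (an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def build_reports(reference, comparison):
    """Compare two {path: fingerprint} mappings.

    Returns (duplicated, unique):
      duplicated: {reference_path: [matching comparison paths]}
      unique:     reference paths with no copy in the comparison list
    """
    by_fingerprint = defaultdict(list)
    for path, digest in comparison.items():
        by_fingerprint[digest].append(path)

    duplicated, unique = {}, []
    for path, digest in reference.items():
        copies = by_fingerprint.get(digest)
        if copies:
            duplicated[path] = sorted(copies)
        else:
            unique.append(path)
    return duplicated, sorted(unique)


if __name__ == "__main__":
    # Hypothetical example: on-premises files checked against a cloud archive.
    reference = {
        "/nas/a.mov": fingerprint(b"clip-1"),
        "/nas/b.mov": fingerprint(b"clip-2"),
    }
    comparison = {"s3://archive/a.mov": fingerprint(b"clip-1")}
    duplicated, unique = build_reports(reference, comparison)
    print("Duplicated:", duplicated)
    print("Unique:", unique)
```

In this example, the duplicated entry points to on-premises data that already has an archived copy, while the unique entry flags a file that has no copy in the comparison list and may still need a backup.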

Figure 2: Reporting engine user interface

Once you have these insights, CloudSoda lets you automate unstructured data movement to any on-premises or cloud location and tier, as well as update, sync and delete that data anywhere.

New Visualization for Cost-Effective Data Management

CloudSoda has enhanced how it displays the current cost of unstructured data storage, helping you make faster, better-informed decisions about moving and deleting files to reduce costs. A new panel displays the cost of storage, the percentages and costs of cold data and duplicate data, and the potential savings from moving or deleting the cold and duplicate data. These analytics can be searched by storage device or by tags such as business unit and project, and are conveniently displayed next to graphic visualizations of related cloud and on-premises storage metrics.
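As a rough illustration of the arithmetic behind a panel like this, the Python sketch below derives percentages, costs and potential savings from a handful of inputs. Every name and number here is hypothetical, including the prices and the choice to treat cold data as moved to a cheaper tier and duplicates as deleted. It is not CloudSoda's calculation.

```python
from dataclasses import dataclass


@dataclass
class StorageCosts:
    total_tb: float              # capacity currently in use
    cold_tb: float               # portion classified as cold data
    duplicate_tb: float          # portion identified as duplicates
    price_per_tb: float          # monthly cost per TB on the current tier
    archive_price_per_tb: float  # monthly cost per TB on a cheaper target tier


def cost_panel(s: StorageCosts) -> dict:
    """Compute illustrative panel figures (all monthly, in the same currency)."""
    if s.total_tb <= 0:
        raise ValueError("total_tb must be positive")
    return {
        "total_cost": s.total_tb * s.price_per_tb,
        "cold_pct": 100 * s.cold_tb / s.total_tb,
        "cold_cost": s.cold_tb * s.price_per_tb,
        "duplicate_pct": 100 * s.duplicate_tb / s.total_tb,
        "duplicate_cost": s.duplicate_tb * s.price_per_tb,
        # Moving cold data saves the price difference between the two tiers.
        "savings_move_cold": s.cold_tb * (s.price_per_tb - s.archive_price_per_tb),
        # Deleting duplicates saves their full cost on the current tier.
        "savings_delete_duplicates": s.duplicate_tb * s.price_per_tb,
    }


if __name__ == "__main__":
    # Hypothetical figures: 100 TB in use, 40 TB cold, 10 TB duplicated.
    panel = cost_panel(StorageCosts(100, 40, 10, 20.0, 5.0))
    for name, value in panel.items():
        print(f"{name}: {value}")
```

The sketch treats cold and duplicate data as separate pools. If the same files are both cold and duplicated, the two savings figures overlap and should not simply be added together.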

Figure 3: Storage cost panel

With just a few clicks, CloudSoda lets you delete the unstructured data or move it anywhere to meet your financial objectives.

Combined with the new storage cost panel, a new capacity usage timeline helps you plan storage capacity better by showing the change in size between the first day and the last day of the timeline view. The timeline can be displayed for any on-premises storage device or cloud repository.

Figure 4: Capacity usage timeline

You can learn more about these new features in the CloudSoda November 2024 Technical Release Notes.