Azure HPC Cache is a managed file-caching service for high-performance computing (HPC) workloads on Azure. It sits between compute clients and back-end storage and keeps frequently accessed files in a low-latency cache close to the compute nodes.
Serving repeated reads from the cache instead of from back-end storage shortens data access time. This matters most for HPC workloads that need fast access to large amounts of data.
Azure HPC Cache works with Azure Blob Storage, so you can keep your data in Blob Storage and still get the performance benefit of the cache. This makes it easy to integrate with an existing Azure setup.
What Is Azure HPC Cache?
Azure HPC Cache is a cloud-based caching service that accelerates access to frequently used file data held in Azure Blob Storage or in NFS storage systems, including on-premises NAS.
It reduces latency and improves the performance of HPC workloads by caching data closer to the compute resources.
Because frequently accessed data is served from the cache layer, fewer requests have to travel to the back-end storage, which results in faster data access.
Clients mount the cache over NFS, and its storage targets can be NFS systems or Azure Blob containers. It does not expose Lustre or SMB interfaces.
Because it presents a standard file interface, file-based HPC applications such as OpenFOAM and ANSYS can generally use it without application changes.
Getting Started
To get started with Azure HPC Cache, you'll first need to log in to the Azure portal using your credentials. This will give you access to the various tools and resources you'll need to set up your cache.
You can create a new HPC Cache resource by clicking on the "Create a resource" button in the left-hand menu and searching for "HPC Cache" in the search bar. This will take you to the creation page where you can fill in the necessary details.
In the Basics tab, you'll need to provide a name for your HPC Cache, select the subscription you want to use, and specify the resource group. You can either create a new group or choose an existing one.
In the service details section, select the location for the cache. You can choose any Azure region where HPC Cache is available. Next, select the virtual network and subnet where the cache will be located.
You'll also need to choose a cache type. At the time of writing, Microsoft's documentation describes two types: standard caches, which support both reads and writes, and read-only caches, which are aimed at read-heavy workloads that need higher throughput.
The main differences between the two are whether client writes are cached and which throughput and size combinations are available.
Once you've selected your cache type, you'll need to choose the right cache size and throughput. The size of the cache determines how much data it can store, while the throughput determines how much data can be read or written per second.
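The same choices (name, resource group, region, subnet, size, and throughput SKU) can also be expressed on the command line. The sketch below only assembles an argument list for the Azure CLI's hpc-cache extension and does not run anything. The resource names, region, subnet ID, size, and SKU are placeholder values, and the flags are worth confirming with az hpc-cache create --help for your installed version.

```python
def build_create_command(name, resource_group, location, subnet_id,
                         cache_size_gb, sku_name):
    """Return the argument list for creating a cache with the Azure CLI."""
    return [
        "az", "hpc-cache", "create",
        "--name", name,
        "--resource-group", resource_group,
        "--location", location,
        "--subnet", subnet_id,
        "--cache-size-gb", str(cache_size_gb),
        "--sku-name", sku_name,
    ]


# Hypothetical values for illustration only.
subnet = (
    "/subscriptions/<subscription_id>/resourceGroups/example-rg"
    "/providers/Microsoft.Network/virtualNetworks/example-vnet"
    "/subnets/cache-subnet"
)
cmd = build_create_command(
    "example-cache", "example-rg", "eastus", subnet, 3072, "Standard_2G"
)
print(" ".join(cmd))
```

Building the list separately from running it makes it easy to review the command before passing it to subprocess.run or pasting it into a terminal.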
Configuration and Management
To configure Azure HPC Cache, you first create the cache in the Azure portal, which involves selecting a resource group, virtual network, and subnet. You can use a new or an existing resource group.
After the cache exists, most of the configuration work is adding storage targets, the back-end systems the cache reads from and writes to, and deciding how each one appears to clients.
Azure HPC Cache can be managed through the Azure portal, where you can monitor performance metrics, view cache usage, and update cache settings.
Important Configuration Settings
When configuring a system, set the time zone correctly so that scheduling and notifications use the right time. Incorrect time zone settings can lead to scheduling conflicts and notifications sent at the wrong time. For Azure HPC Cache this matters when you schedule software upgrades, because the portal shows the date and time you choose in your browser's local time zone.
The time zone is usually set in the system settings or control panel, as an offset from UTC such as UTC-5 for Eastern Standard Time or UTC-8 for Pacific Standard Time. A fixed offset does not change with the seasons, so pick a setting that accounts for daylight saving time if your location observes it.
Set the date and time format to match local conventions, for example MM/DD/YYYY or DD/MM/YYYY.
Set the language to match your users, such as English or Spanish, so that the interface and messages are displayed in the correct language.
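To make the offset and format settings concrete, here is a small sketch using only Python's standard library. The timestamp is an arbitrary example, and the two zones are modeled as fixed UTC offsets, which by design do not follow daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets from UTC; these never shift for daylight saving time.
EST = timezone(timedelta(hours=-5), "EST")
PST = timezone(timedelta(hours=-8), "PST")

# An arbitrary example moment, expressed in UTC.
moment_utc = datetime(2024, 1, 15, 17, 30, tzinfo=timezone.utc)

eastern = moment_utc.astimezone(EST)
pacific = moment_utc.astimezone(PST)

# The same local time rendered in two date conventions.
us_style = eastern.strftime("%m/%d/%Y %H:%M")    # MM/DD/YYYY
intl_style = eastern.strftime("%d/%m/%Y %H:%M")  # DD/MM/YYYY

print(us_style)    # 01/15/2024 12:30
print(intl_style)  # 15/01/2024 12:30
print(pacific.strftime("%H:%M"))  # 09:30
```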
Flush Data
The Flush button on the overview page tells the cache to immediately write all changed data that is stored in the cache to the back-end storage targets.
You can use the Flush button before taking a storage snapshot or checking the data set size to make sure the back-end storage system is up to date.
During the flush process, the cache can't serve client requests and cache access is suspended.
The cache status on the overview page changes to Flushing during the flush process.
The flush process can take a few minutes or over an hour, depending on how much data needs to be flushed.
After all the data is saved to storage targets, the cache automatically starts taking client requests again.
The cache status returns to Healthy after the flush process is completed.
To flush the cache, click the Flush button and then click Yes to confirm the action.
You can also use the az hpc-cache flush command to force the cache to write all changed data to the storage targets.
When the flush finishes, a success message is returned.
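If you script the flush, for example before a nightly snapshot, the CLI call can be wrapped in a small helper. This is a minimal sketch: the cache and resource group names are placeholders, and the runner parameter is an illustrative addition that allows a dry run without the Azure CLI installed.

```python
import subprocess


def build_flush_command(name, resource_group):
    """Return the argument list for flushing a cache with the Azure CLI."""
    return [
        "az", "hpc-cache", "flush",
        "--name", name,
        "--resource-group", resource_group,
    ]


def flush_cache(name, resource_group, runner=subprocess.run):
    """Run the flush command; pass a different runner to do a dry run."""
    return runner(build_flush_command(name, resource_group), check=True)


def dry_run(cmd, check=True):
    """Print the command instead of executing it."""
    print("would run:", " ".join(cmd))
    return cmd


# Hypothetical names for illustration only.
flush_cache("example-cache", "example-rg", runner=dry_run)
```

With the default runner, check=True makes subprocess.run raise an error if the command exits with a non-zero status, so a failed flush does not go unnoticed before a snapshot.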
Upgrade Software
Keeping the cache software up to date is part of maintaining a healthy cache. If a new version is available, a message about updating software appears at the top of the page and the Upgrade button becomes active.
The software update can take several hours, and cache performance will slow down during this time. Plan to upgrade during non-peak usage hours or in a planned maintenance period.
You'll have a week or so to apply the update manually, and the end date is listed in the upgrade message. If you don't upgrade by that date, Azure automatically applies the new software to your cache.
You can use the Azure portal to schedule a more convenient time for the upgrade. To do this, click the Upgrade button and select Schedule later to choose a new date and time.
The date and time you choose will be shown in your browser's local time zone. You can't choose a later time than the deadline in the original message.
Here's a quick rundown of the upgrade options:
- Upgrade now: Upgrades the software immediately.
- Schedule later: Allows you to choose a new date and time for the upgrade.
If you want to revise your scheduled upgrade date, click the Upgrade button again and select the Reset date link. This will remove your scheduled date and allow you to choose a new one.
Note that you can't change the schedule if there are fewer than 15 minutes remaining before the upgrade.
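The two scheduling rules above (no later than the deadline, and no changes within the last 15 minutes) can be written as simple checks. This is an illustrative model, not portal code: the function names, the dates, and the UTC-5 browser offset are assumptions, as is the requirement that the proposed time lies in the future.

```python
from datetime import datetime, timedelta, timezone

LOCK_WINDOW = timedelta(minutes=15)


def can_schedule(proposed, deadline, now):
    """A proposed time must be in the future and no later than the deadline."""
    return now < proposed <= deadline


def can_change_schedule(scheduled, now):
    """The schedule is locked once fewer than 15 minutes remain."""
    return scheduled - now >= LOCK_WINDOW


# Hypothetical browser time zone and dates for illustration only.
local = timezone(timedelta(hours=-5))
deadline = datetime(2024, 6, 10, 0, 0, tzinfo=timezone.utc)
now = datetime(2024, 6, 5, 12, 0, tzinfo=timezone.utc)

evening = datetime(2024, 6, 9, 18, 0, tzinfo=local)   # 23:00 UTC, before the deadline
too_late = datetime(2024, 6, 9, 20, 0, tzinfo=local)  # 01:00 UTC the next day

print(can_schedule(evening, deadline, now))   # True
print(can_schedule(too_late, deadline, now))  # False
print(can_change_schedule(evening, evening - timedelta(minutes=10)))  # False
```

Using timezone-aware datetime values means the comparison stays correct even though the deadline is stored in UTC and the chosen time is entered in local time.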
Delete the Cache
Deleting an Azure HPC Cache is a straightforward process, but it's essential to follow the right steps to avoid data loss.
To delete a cache, you'll need to stop it first, making sure it shows the status Stopped. This ensures all data in the cache has been written to long-term storage.
The Delete button permanently removes the cache, destroying all its resources and stopping account charges. You can reuse the back-end storage volumes used as storage targets in a future cache or decommission them separately.
Use the Azure CLI command az hpc-cache delete to permanently remove the cache.
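The stop-then-delete order can be captured as an ordered list of CLI commands. The sketch below only builds the argument lists and does not execute them. The names are placeholders, and the middle step assumes you inspect the output of az hpc-cache show yourself to confirm the cache is stopped before deleting.

```python
def build_delete_sequence(name, resource_group):
    """Return the CLI commands for removing a cache, in the order to run them."""
    common = ["--name", name, "--resource-group", resource_group]
    return [
        # 1. Stop the cache so changed data is written to long-term storage.
        ["az", "hpc-cache", "stop", *common],
        # 2. Show the cache and confirm its status before going further.
        ["az", "hpc-cache", "show", *common],
        # 3. Permanently remove the cache.
        ["az", "hpc-cache", "delete", *common],
    ]


# Hypothetical names for illustration only.
steps = build_delete_sequence("example-cache", "example-rg")
for step in steps:
    print(" ".join(step))
```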
Performance and Scalability
Azure HPC Cache accelerates file access when many clients request data from the same source at the same time.
It does this by keeping frequently accessed data in a shared cache, which reduces the number of requests that reach the remote source. All clients see better performance, and the back-end storage is less likely to become the bottleneck as the number of clients grows.
High Read-to-Write Ratio
Azure HPC Cache is well suited to workloads with a high read-to-write ratio, meaning workloads that perform many more read operations than write operations, such as jobs that repeatedly access the same files.
Because the data is read frequently but changes rarely, cached copies stay valid for a long time, which makes this pattern ideal for caching.
Azure HPC Cache can cache the frequently accessed data and serve it from the cache, eliminating the need to access the remote storage each time.
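A toy model shows why this pattern benefits so much from a cache. The sketch below is a generic least-recently-used read cache written for illustration. It is not how Azure HPC Cache is implemented, and the file names and sizes are made up.

```python
from collections import OrderedDict


class ReadCache:
    """A tiny LRU read cache in front of a dict that stands in for remote storage."""

    def __init__(self, capacity, backend):
        self.capacity = capacity
        self.backend = backend
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, path):
        if path in self.entries:
            self.entries.move_to_end(path)  # mark as most recently used
            self.hits += 1
            return self.entries[path]
        self.misses += 1                    # this read goes to remote storage
        data = self.backend[path]
        self.entries[path] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return data


# Four hypothetical files, each read 25 times.
backend = {f"/data/file{i}": f"contents-{i}" for i in range(4)}
cache = ReadCache(capacity=4, backend=backend)
for _ in range(25):
    for path in backend:
        cache.read(path)

print(cache.hits, cache.misses)  # 96 4
```

Only the first read of each file reaches the back end. The other 96 of the 100 reads are served from the cache.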
Heavy Request Loads
Heavy request loads, where many clients ask for the same data at once, can overwhelm a back-end storage system. HPC Cache absorbs the repeated requests in a shared cache.
Fewer requests reach the remote source, so the same storage system can support many more clients.
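The effect of a shared cache under concurrent load can be sketched with threads standing in for clients. This is an illustrative model only: the class, the file path, and the client count are invented, and a single lock is used to keep the example simple and deterministic.

```python
import threading


class SharedCache:
    """A cache shared by many clients; each path is fetched from the back end once."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._data = {}
        self._lock = threading.Lock()
        self.backend_requests = 0

    def read(self, path):
        # The lock covers the check and the fetch, so concurrent clients
        # asking for the same path cannot trigger duplicate fetches.
        with self._lock:
            if path not in self._data:
                self.backend_requests += 1
                self._data[path] = self._fetch(path)
            return self._data[path]


def fetch_from_backend(path):
    """Stand-in for a read from remote storage."""
    return f"contents of {path}"


cache = SharedCache(fetch_from_backend)
results = []


def client():
    # list.append is safe to call from multiple threads in CPython.
    results.append(cache.read("/data/reference.db"))


threads = [threading.Thread(target=client) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results), cache.backend_requests)  # 50 1
```

Fifty clients receive the data, but the back end sees a single request.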
Use Cases and Examples
Research institutes that want to move genomic analysis workflows into Azure can use Azure HPC Cache to do so without client-side changes, because it provides POSIX file access.
Many life sciences workflows can benefit from scale-out file caching, such as genomic analysis, secondary analysis, pharmacological simulation, and AI-driven image analysis.
Financial services companies can use Azure HPC Cache to speed up quantitative analysis calculations, risk analysis workloads, and Monte Carlo simulations.
File-Based Analytics Workloads
File-based analytics workloads are a good fit for Azure HPC Cache. It provides a shared, high-speed cache for frequently accessed data, reducing latency and improving performance for all clients accessing the data.
This is particularly useful for pipelines that use file-based data and involve multiple clients. Azure HPC Cache helps prevent time-consuming file access from slowing down performance.
In media and entertainment, Azure HPC Cache can speed up data access for time-critical rendering projects. VFX rendering workflows often require last-minute processing by large numbers of compute nodes.
By caching file data in the cloud, Azure HPC Cache reduces latency and enhances flexibility for on-demand rendering. This is especially important in industries where time is of the essence.
Azure HPC Cache is also useful for workloads that require remote data access. It caches frequently accessed files close to the Azure compute nodes, which lets you burst file-based workloads into the cloud while the data stays in its original location.
This is particularly useful for data-intensive applications that require high-throughput access to large datasets stored in remote locations.
Silicon Design Verification
Silicon design verification is a compute-intensive process that can be run on large-scale virtual machine compute grids. This process is crucial for the silicon design industry.
Azure HPC Cache can provide on-cloud caching of design data, libraries, binaries, and rule database files from on-premises storage systems. This results in local-like response times for directory listings, metadata, and data reads.
The need for complex data migration, syncing, and copying operations is eliminated with Azure HPC Cache. This simplifies the process and reduces the workload.
Azure HPC Cache can also be set up to cache output files being written by the compute jobs. This configuration gives immediate acknowledgement to the compute workflow and subsequently writes the changes back to the on-premises NAS.
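The write path described here follows the general write-back caching pattern, which can be sketched in a few lines. This is a simplified illustration, not the service's implementation. The class, the paths, and the dict standing in for the on-premises NAS are all hypothetical.

```python
class WriteBackCache:
    """Acknowledge writes immediately and copy them to the back end later."""

    def __init__(self, backend):
        self.backend = backend  # dict standing in for the on-premises NAS
        self.dirty = {}         # changed data not yet written back

    def write(self, path, data):
        self.dirty[path] = data
        return "ack"            # the compute job can continue right away

    def read(self, path):
        # Prefer the newest data, which may exist only in the cache.
        if path in self.dirty:
            return self.dirty[path]
        return self.backend[path]

    def write_back(self):
        """Copy all changed data to the back end; return how many files moved."""
        count = len(self.dirty)
        self.backend.update(self.dirty)
        self.dirty.clear()
        return count


nas = {"/design/rules.db": "v1"}
cache = WriteBackCache(nas)

print(cache.write("/out/job42.log", "PASS"))  # ack
print("/out/job42.log" in nas)                # False: not on the NAS yet
print(cache.write_back())                     # 1
print(nas["/out/job42.log"])                  # PASS
```

The gap between the acknowledgement and the write-back is why flushing matters before you snapshot or inspect the back-end storage.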
Chip designers can scale EDA verification jobs to tens of thousands of cores with ease using Azure HPC Cache. This allows them to focus on their work without worrying about storage performance.