Azure GPUs: ND vs NG Family Comparison

Visak Krishnakumar

Graphics Processing Units (GPUs) were initially designed for video game graphics but have since become a driving force in fields like artificial intelligence (AI), machine learning (ML), high-performance computing (HPC), and media production. 

Unlike Central Processing Units (CPUs), which perform tasks sequentially, GPUs excel in parallel processing, enabling them to handle thousands of operations simultaneously. This makes them crucial for applications requiring immense computational power, such as AI model training, real-time graphics rendering, scientific simulations, and data analytics.

Cloud-based GPUs offer businesses a way to scale without worrying about hardware limitations. But not all workloads require the same kind of GPU power. Azure has specifically tailored its ND and NG GPU families to address different types of demands—whether you're training AI models or rendering complex graphics.

If you’ve worked with AWS GPU instances (P family and G family) before, you might wonder: Why explore Azure GPUs? Are they a better fit for my workload?

This blog will help you answer that question by exploring the key differences between Azure’s ND and NG families of GPUs, offering a detailed comparison based on real-world applications. 

The ND Family: Powering AI and ML at Scale

The Azure ND family is built for AI training, high-performance computing, and data-intensive tasks. With GPUs powered by NVIDIA A100 and V100 chips, the ND family excels at handling the matrix-heavy calculations necessary for deep learning and AI models, significantly reducing training times.

Real-World Example: AI in Healthcare
For a healthcare AI startup working on medical image analysis, the ND series' high memory and parallel compute power—along with HBM2 memory and Tensor Cores—accelerate model training and enable faster iterations, ultimately bringing life-saving solutions to market sooner.

Core Features of the ND Family:

  • Tensor Cores for AI/ML acceleration
  • HBM2 memory (16 GB to 80 GB per GPU) for large datasets
  • High compute power for deep learning and scientific simulations

The NG Family: Optimized for Graphics and Media Workloads

While the ND family dominates in AI and HPC, the NG family is designed for graphics-intensive applications like video rendering, 3D modeling, and gaming. For industries like media production and gaming, where real-time performance is critical, NG series GPUs deliver the power needed for high-quality graphics and low-latency experiences.

Real-World Example: Game Streaming Platform
In cloud gaming, where users need real-time game rendering and streaming with minimal lag, the NG family’s NVIDIA T4 GPUs provide exceptional performance, ensuring smooth gameplay even on less powerful devices.

Core Features of the NG Family:

  • NVIDIA T4 GPUs for real-time graphics and video inferencing
  • GDDR6 Memory for optimal media performance
  • Optimized for real-time video transcoding and 3D rendering

ND vs. NG: Key Differences

While both the ND and NG GPU families offer high-performance capabilities, they excel in different domains. Here’s how they stack up against each other:

| Feature | ND Family (AI/ML, HPC) | NG Family (Graphics & Media) |
| --- | --- | --- |
| Ideal Workloads | AI/ML Training, Scientific Simulations, Data Analytics | Gaming, 3D Rendering, Video Transcoding, Virtual Workstations |
| GPU Models | NVIDIA A100, V100 | NVIDIA T4 |
| Memory Type | HBM2 (16 GB to 80 GB) | GDDR6 (16 GB) |
| Memory Bandwidth | 1,555 GB/s (A100 40 GB), 900 GB/s (V100) | 320 GB/s (T4) |
| Compute Focus | High-Performance Parallel Computing | Real-Time Graphics Rendering and Media Tasks |
| Networking Performance | Up to 100 Gbps | Up to 25 Gbps |
| Tensor Cores (for AI) | Yes (Optimized for AI/ML) | No (Graphics-focused) |
| Power Consumption | ~300W (A100), ~250W (V100) | ~70W (T4) |

Performance Comparison: ND Family vs NG Family

In this section, we’ll compare the ND and NG families on three key performance metrics: computational power, memory bandwidth, and power efficiency.

ND Family Performance

  1. Computational Power: Optimized for parallel processing, the ND family (e.g., A100, V100) delivers high computational performance, excelling in handling complex matrix operations and large-scale data processing.
  2. Memory & Bandwidth: Features HBM2/HBM3 memory (up to 80 GB per GPU) with a bandwidth of 900 GB/s or more, enabling efficient handling of massive datasets and reducing data transfer bottlenecks.
  3. Power Efficiency: Power consumption ranges from 250W to 300W per GPU, reflecting the high power demands of computationally intensive tasks.

NG Family Performance

  • Computational Power: The NG family (e.g., T4) offers strong performance but is less computationally intensive than the ND family, focusing on tasks with lower latency requirements.
  • Memory & Bandwidth: Equipped with GDDR6 memory (up to 16GB) and 320 GB/s bandwidth, the NG family is optimized for efficient data processing in real-time tasks.
  • Power Efficiency: Power consumption is lower, ranging from 70W to 100W per GPU, making it more energy-efficient.

| Feature | ND Family | NG Family |
| --- | --- | --- |
| Workload Focus | AI model training, scientific simulations, big data | Gaming, 3D rendering, video transcoding, media |
| Computational Power | High parallel processing for large-scale tasks | Optimized for low-latency graphics and media tasks |
| Memory Type | HBM2/HBM3 (up to 80 GB) | GDDR6 (up to 16 GB) |
| Memory Bandwidth | 900 GB/s and above | Up to 320 GB/s |
| Power Consumption | Higher (~250W–300W) | More power-efficient (~70W–100W) |
| Real-World Applications | AI research, big data, HPC simulations | Graphics rendering, video streaming, gaming |

Key Takeaways

  • Computational Power:
    • The ND family excels in highly parallel processing and is ideal for compute-heavy tasks like AI/ML training and big data processing.
    • The NG family focuses on low-latency tasks and delivers solid performance for media and real-time applications, but doesn't match the computational capacity of the ND family.
  • Memory & Bandwidth:
    • The ND family’s HBM2/HBM3 memory (up to 80 GB per GPU) and bandwidth of 900 GB/s or more make it highly capable of handling large datasets and complex workloads, such as AI and scientific simulations.
    • The NG family uses GDDR6 memory (up to 16GB) and 320 GB/s bandwidth, which is efficient for real-time rendering and media tasks but doesn't provide the same scale for AI-heavy applications.
  • Power Efficiency:
    • The NG family is significantly more power-efficient, consuming just 70W–100W per GPU, making it suitable for energy-conscious environments.
    • The ND family consumes 250W–300W, prioritizing performance over power efficiency for demanding workloads like AI training and scientific simulations.
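
To make the bandwidth and power figures above concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the per-GPU peak figures quoted in this article (900 GB/s and about 300W for an ND-class GPU, 320 GB/s and about 70W for the T4). The function names and the 16 GB working set are illustrative assumptions, and real workloads rarely sustain peak bandwidth.

```python
def sweep_time_ms(data_gb: float, bandwidth_gb_s: float) -> float:
    """Ideal time, in milliseconds, to read data_gb once at peak bandwidth."""
    return data_gb / bandwidth_gb_s * 1000.0


def energy_kwh(power_watts: float, hours: float) -> float:
    """Energy drawn by one GPU running at power_watts for the given hours."""
    return power_watts * hours / 1000.0


# Peak figures quoted in this article (not measurements).
ND_BANDWIDTH_GB_S, ND_POWER_W = 900.0, 300.0
NG_BANDWIDTH_GB_S, NG_POWER_W = 320.0, 70.0

if __name__ == "__main__":
    working_set_gb = 16.0  # hypothetical dataset that fits on either GPU
    print(f"ND: {sweep_time_ms(working_set_gb, ND_BANDWIDTH_GB_S):.1f} ms per pass")
    print(f"NG: {sweep_time_ms(working_set_gb, NG_BANDWIDTH_GB_S):.1f} ms per pass")
    print(f"ND: {energy_kwh(ND_POWER_W, 24):.2f} kWh per GPU-day")
    print(f"NG: {energy_kwh(NG_POWER_W, 24):.2f} kWh per GPU-day")
```

Under these assumptions, a single pass over the same data takes roughly a third as long on the ND-class GPU, while a day of full-load operation draws about four times the energy.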

Scalability: Optimizing for Growth and Long-Term Success

When planning for large-scale deployments, scalability becomes a critical factor in ensuring that your infrastructure can grow alongside increasing demand. The ND family and NG family each offer distinct advantages when it comes to scaling infrastructure. 

ND Family Scalability: Vertical Scaling for Large-Scale AI and HPC Tasks

Scaling Approach: Vertical Scaling

  • The ND family is designed for vertical scaling, which is ideal for environments where high memory and computational power are required within a single system or across multi-GPU clusters. This approach allows for scaling resources within a single node or multiple nodes, supporting large memory capacities and high computational demands.

Why Vertical Scaling?

  • The ND family benefits from vertical scaling, where tasks can leverage direct memory access within a single system, minimizing the need for complex inter-node communication and enhancing performance for memory-intensive workloads.
  • The high memory bandwidth (900 GB/s or more) and larger memory capacities (up to 80 GB per GPU with HBM2/HBM3) make the ND family well-suited to handle memory-intensive tasks within a tightly connected system, where fewer nodes are needed but each requires substantial power and memory.

Scalability Features:

  • Single-Node Scaling: The ND family supports vertical scaling within a single system, enabling upgrades to GPUs with larger memory capacities (up to 80 GB per GPU with HBM2/HBM3) and increased computational power. This scaling approach is well-suited for memory-intensive and high-computation tasks.
  • Multi-Node Scaling: ND GPUs can also scale effectively across multi-node clusters, using NVLink for GPU-to-GPU communication within a node and high-bandwidth interconnects such as InfiniBand between nodes, which keeps data-transfer latency low for large-scale workloads.
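
As a rough illustration of the scale-up-then-scale-out reasoning above, the sketch below estimates how many GPUs a job needs purely from its memory footprint. The function name, the 10% headroom default, and the sample footprints are assumptions for illustration. Real sizing also depends on activation memory, optimizer state, and the parallelism strategy.

```python
import math


def gpus_needed(footprint_gb: float, gpu_memory_gb: float, headroom: float = 0.10) -> int:
    """Minimum GPU count whose combined usable memory holds footprint_gb.

    headroom is the fraction of each GPU's memory kept free (an assumed
    safety margin, not an Azure or NVIDIA recommendation).
    """
    if footprint_gb <= 0 or gpu_memory_gb <= 0 or not 0 <= headroom < 1:
        raise ValueError("sizes must be positive and headroom in [0, 1)")
    usable_gb = gpu_memory_gb * (1.0 - headroom)
    return max(1, math.ceil(footprint_gb / usable_gb))


if __name__ == "__main__":
    for footprint in (30, 120, 500):  # hypothetical model footprints in GB
        print(f"{footprint} GB -> {gpus_needed(footprint, 40)} x 40 GB GPUs, "
              f"{gpus_needed(footprint, 80)} x 80 GB GPUs")
```

When the count fits inside one VM, the job stays on a single node. Once it exceeds the GPUs available in one VM, the multi-node path described above becomes necessary.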

NG Family Scalability: Horizontal Scaling for Real-Time, High-Concurrency Workloads

Scaling Approach: Horizontal Scaling

  • The NG family is optimized for horizontal scaling, ideal for workloads distributed across multiple nodes or GPUs. This approach allows tasks to be efficiently spread out over many instances, ensuring smooth performance for distributed and concurrent workloads.

Why Horizontal Scaling?

  • The NG family supports horizontal scaling because workloads in real-time applications (e.g., cloud gaming, video transcoding, interactive media) can be broken down into smaller, parallel tasks. These tasks benefit from being distributed across multiple GPUs or nodes, which allows for efficient processing with minimal latency.
  • Horizontal scaling ensures that many GPUs can work together in parallel, processing tasks simultaneously without overloading a single node. This makes it ideal for high-concurrency tasks where low-latency and high-throughput performance are essential.

Scalability Features:

  • Multi-Node Scaling: The NG family is designed to scale horizontally by distributing tasks across multiple GPUs or nodes. This scalability feature is ideal for applications where the workload can be divided into smaller tasks and processed concurrently.
  • Efficient Task Distribution: Horizontal scaling in the NG family ensures that tasks are distributed effectively across many GPUs, enabling high concurrency. The family supports scalable applications that can benefit from parallel processing without significant performance degradation.
  • Low-Latency Communication: The NG family prioritizes low-latency communication between nodes, ensuring that even as workloads are distributed across multiple GPUs, the system maintains responsiveness, which is essential for real-time applications.

| Feature | ND Family | NG Family |
| --- | --- | --- |
| Scaling Type | Vertical Scaling (within single nodes or clusters) | Horizontal Scaling (across multiple nodes) |
| Key Use Case | AI model training, scientific simulations, big data | Cloud gaming, media rendering, real-time graphics |
| Scalable Workload Type | High-computational tasks and large datasets | Interactive and media-intensive tasks |
| Scaling Efficiency | Efficient for memory-heavy, data-intensive tasks | Efficient for concurrent, real-time tasks |
| Ideal Environment | AI research labs, HPC environments, cloud platforms | Gaming platforms, media production, virtual workstations |
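
The horizontal-scaling idea described for the NG family (split the work into independent jobs and spread them over many GPU instances) can be sketched with a simple greedy scheduler. This is only an illustration: the job names, cost estimates, and worker names are hypothetical, and a production system would normally rely on a queue or an orchestrator rather than a static plan.

```python
import heapq
from typing import Dict, List, Tuple


def assign_jobs(job_costs: Dict[str, float], workers: List[str]) -> Dict[str, List[str]]:
    """Greedy least-loaded assignment: longest jobs first, each given to
    the worker with the smallest accumulated load so far."""
    if not workers:
        raise ValueError("at least one worker is required")
    # Heap entries are (accumulated_load, worker_name).
    heap: List[Tuple[float, str]] = [(0.0, w) for w in workers]
    heapq.heapify(heap)
    plan: Dict[str, List[str]] = {w: [] for w in workers}
    for job, cost in sorted(job_costs.items(), key=lambda kv: kv[1], reverse=True):
        load, worker = heapq.heappop(heap)
        plan[worker].append(job)
        heapq.heappush(heap, (load + cost, worker))
    return plan


if __name__ == "__main__":
    # Hypothetical transcoding jobs with estimated GPU-seconds each.
    jobs = {"clip-a": 40, "clip-b": 25, "clip-c": 25, "clip-d": 10}
    print(assign_jobs(jobs, ["ng-vm-1", "ng-vm-2"]))
```

Because each job is independent, adding a third worker name to the list is all it takes to spread the same jobs more thinly, which is the property that makes this class of workload a natural fit for horizontal scaling.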

Key Takeaways: Scalability

  1. ND Family: Focused on vertical scaling, the ND family supports the scaling of memory and computational power within single or multi-node systems. It is suited for environments that require a significant increase in system resources within nodes or clusters.
  2. NG Family: The NG family, optimized for horizontal scaling, is ideal for environments that need to distribute workloads across multiple nodes. It ensures efficient task distribution with low-latency communication, making it highly suitable for real-time, high-concurrency applications.

Cost Efficiency

When it comes to managing cloud costs for your specific workload, selecting the right VM series within the ND and NG families can have a significant impact on your overall budget. In this section, we’ll break down on-demand pricing, reserved instances, and spot instances for each VM series to help you choose the most cost-efficient option based on your needs.

ND Family Cost Efficiency: High-Performance GPUs for Compute-Intensive Workloads

The ND family of VMs is designed for compute-heavy tasks such as AI training, deep learning, and scientific simulations. The VM series in this family cater to businesses requiring powerful GPUs like the A100, V100, and H100 for parallel processing.

VMs in the ND Family:

  • NDasrA100_v4 series: High-performance VMs powered by NVIDIA A100 GPUs, designed for AI and ML workloads.
  • NDm_A100_v4 series: Similar to the NDasrA100_v4 but built on the 80 GB variant of the A100; like it, these VMs support multi-node distributed AI training.
  • NDv2 series: Powered by NVIDIA V100 GPUs, optimized for large-scale AI model training.
  • ND-H100-v5 series: Latest generation VMs, utilizing the NVIDIA H100 GPUs, for cutting-edge AI/ML models and high-performance computing.
  • ND-H200-v5 series: Features NVIDIA H200 GPUs for even more specialized and performance-demanding applications.
  • ND-MI300X-v5 series: Newer VM series with the AMD MI300X GPUs, aimed at workloads requiring significant computational and memory resources.

On-Demand Pricing for ND VMs:

Below is an overview of on-demand pricing for select ND family VM series. These prices reflect the hourly cost for each VM configuration.

| VM Series | GPU Model | GPU Memory (per GPU) | On-Demand Price (Hourly) |
| --- | --- | --- | --- |
| NDasrA100_v4 | NVIDIA A100 | 40 GB HBM2 | $32.77 |
| NDm_A100_v4 | NVIDIA A100 | 80 GB HBM2e | $32.77 |
| NDv2 | NVIDIA V100 | 32 GB HBM2 | $24.48 |
| ND-H100-v5 | NVIDIA H100 | 80 GB HBM3 | $45.56 |
| ND-H200-v5 | NVIDIA H200 | 141 GB HBM3e | $50.12 |
| ND-MI300X-v5 | AMD MI300X | 192 GB HBM3 | $55.60 |

Note: Refer to the official Azure Pricing Calculator for accurate and up-to-date pricing. 
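
Because list prices change, it can help to pull current rates programmatically rather than copying them by hand. The sketch below builds a query for the public Azure Retail Prices API and picks the lowest pay-as-you-go rate from a response. Treat it as a starting point under stated assumptions: the SKU name and region in the usage note are examples, the response field names (Items, type, skuName, productName, retailPrice) follow our reading of the public response format and should be checked against the current API documentation, and the Spot/Low Priority/Windows filtering is a simplifying heuristic.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import quote, urlencode

# Public, unauthenticated endpoint of the Azure Retail Prices API.
RETAIL_PRICES_URL = "https://prices.azure.com/api/retail/prices"


def build_price_query(arm_sku_name: str, region: str) -> str:
    """Build a query URL for one VM size in one region."""
    odata_filter = (
        "serviceName eq 'Virtual Machines' "
        f"and armSkuName eq '{arm_sku_name}' "
        f"and armRegionName eq '{region}'"
    )
    return f"{RETAIL_PRICES_URL}?{urlencode({'$filter': odata_filter}, quote_via=quote)}"


def cheapest_on_demand(payload: Dict[str, Any]) -> Optional[float]:
    """Lowest pay-as-you-go hourly price in a response payload, skipping
    reservation, Spot, Low Priority, and Windows line items (heuristic)."""
    prices: List[float] = []
    for item in payload.get("Items", []):
        if item.get("type") != "Consumption":
            continue
        label = f"{item.get('skuName', '')} {item.get('productName', '')}"
        if any(tag in label for tag in ("Spot", "Low Priority", "Windows")):
            continue
        prices.append(float(item["retailPrice"]))
    return min(prices) if prices else None
```

For example, build_price_query("Standard_ND96asr_v4", "eastus") returns a URL that can be fetched with any HTTP client; the decoded JSON body is then passed to cheapest_on_demand.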

Optimizing Costs with Reserved Instances:

For long-term projects, you can opt for reserved instances (RIs) for ND family VMs, saving up to 75% on the hourly rate compared to on-demand pricing. Below is an example of reserved pricing for NDasrA100_v4 series.

| VM Series | On-Demand Price (Hourly) | 1-Year Reserved Price (Hourly) | Savings |
| --- | --- | --- | --- |
| NDasrA100_v4 | $32.77 | $8.19 | ~75% |

Maximizing Savings with Spot Instances:

If your workloads can handle interruptions, spot instances provide the highest cost savings, with discounts of up to 90% compared to on-demand pricing. Here’s how much you could save with spot instances for the NDasrA100_v4:

| VM Series | On-Demand Price (Hourly) | Spot Price (Hourly) | Savings |
| --- | --- | --- | --- |
| NDasrA100_v4 | $32.77 | $3.28 | ~90% |
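
The reserved and spot figures above follow from simple arithmetic on the on-demand rate. A minimal sketch, assuming the maximum discounts quoted in this article (75% reserved, 90% spot) actually apply; the function names are illustrative, and real discounts vary by region, term, and spot-market conditions.

```python
def discounted_rate(on_demand: float, discount: float) -> float:
    """Hourly rate after a fractional discount (0.75 means 75% off)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return round(on_demand * (1.0 - discount), 2)


def savings_percent(on_demand: float, discounted: float) -> float:
    """Percentage saved relative to the on-demand rate."""
    return (on_demand - discounted) / on_demand * 100.0


if __name__ == "__main__":
    on_demand = 32.77  # NDasrA100_v4 hourly rate quoted in this article
    reserved = discounted_rate(on_demand, 0.75)
    spot = discounted_rate(on_demand, 0.90)
    print(f"reserved: ${reserved:.2f}/h, spot: ${spot:.2f}/h")
    print(f"spot saves {savings_percent(on_demand, spot):.0f}%")
```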

NG Family Cost Efficiency: Affordable Graphics-Optimized VMs for Media and Real-Time Processing

The NG family of VMs is optimized for graphics-intensive workloads, such as 3D rendering, real-time video transcoding, and game streaming. The VMs in this family are equipped with NVIDIA T4 GPUs, which offer an excellent balance of cost and performance for these types of tasks.

VMs in the NG Family:

  • NGads V620 series: These VMs feature NVIDIA T4 GPUs and are ideal for running graphics-heavy applications and workloads like gaming and media streaming.

On-Demand Pricing for NG VMs:

Here is the on-demand pricing for NGads V620 series instances:

| VM Series | GPU Model | GPU Memory | On-Demand Price (Hourly) |
| --- | --- | --- | --- |
| NGads V620 | NVIDIA T4 | 16 GB GDDR6 | $0.53 |

Optimizing Costs with Reserved Instances:

For predictable workloads like real-time rendering and video transcoding, reserved instances offer up to 75% savings on the hourly rate compared to on-demand pricing.

| VM Series | On-Demand Price (Hourly) | 1-Year Reserved Price (Hourly) | Savings |
| --- | --- | --- | --- |
| NGads V620 | $0.53 | $0.13 | ~75% |

Maximizing Savings with Spot Instances:

If you can tolerate interruptions in your graphics-heavy workloads, spot instances for the NGads V620 series offer roughly 80% savings off on-demand pricing.

| VM Series | On-Demand Price (Hourly) | Spot Price (Hourly) | Savings |
| --- | --- | --- | --- |
| NGads V620 | $0.53 | $0.10 | ~81% |

Cost Comparison for Typical Use Cases: ND vs. NG Family VMs

Let’s break down the costs for real-world use cases for both the ND and NG families, based on the pricing options for on-demand, reserved, and spot instances.

Example 1: AI Model Training

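Let’s assume you have a workload that consumes 1,500 VM-hours per month (spread across several instances, since a single VM can run for at most about 744 hours in a month).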

| VM Series | On-Demand Price (Hourly) | Total Monthly Cost | Use Case |
| --- | --- | --- | --- |
| NDasrA100_v4 | $32.77 | $49,155 | AI Training |
| NGads V620 | $0.53 | $795 | Graphics Processing |

Example 2: Real-Time Game Streaming

Let’s assume a workload of 1,000 VM-hours per month across multiple instances.

| VM Series | On-Demand Price (Hourly) | Total Monthly Cost | Use Case |
| --- | --- | --- | --- |
| NDasrA100_v4 | $32.77 | $32,770 | AI Inference |
| NGads V620 | $0.53 | $530 | Game Streaming |
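
The monthly totals in both examples are just the hourly rate multiplied by the hours consumed. A small sketch that reproduces them, using the on-demand rates quoted in this article; the dictionary and function names are illustrative.

```python
HOURLY_RATES = {  # on-demand rates quoted in this article, in USD per hour
    "NDasrA100_v4": 32.77,
    "NGads V620": 0.53,
}


def monthly_cost(vm_series: str, vm_hours: float) -> float:
    """On-demand cost in USD for the given number of VM-hours."""
    if vm_hours < 0:
        raise ValueError("vm_hours must be non-negative")
    return round(HOURLY_RATES[vm_series] * vm_hours, 2)


if __name__ == "__main__":
    for hours in (1500, 1000):
        for series in HOURLY_RATES:
            print(f"{series:<14} {hours:>5} h  ${monthly_cost(series, hours):>10,.2f}")
```

Swapping in a reserved or spot rate for the on-demand figure gives the equivalent monthly total for those pricing models.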

Summary: Which Family Offers the Best Cost Efficiency?

  • ND Family VMs are ideal for businesses needing powerful compute-heavy resources, such as AI training, deep learning, and high-performance computing. While the cost is higher, using reserved or spot instances can help reduce expenses.
  • NG Family VMs offer more affordable solutions for graphics-intensive tasks, such as real-time rendering, game streaming, and video transcoding. They provide substantial savings through spot pricing and are ideal for businesses with graphics-optimized workloads that need budget-friendly options.

By understanding the specific needs of your workload and selecting the appropriate VM series and pricing model, you can significantly optimize your cloud costs.

Industry-Specific Use Cases for ND and NG GPUs

To provide a clearer picture of how ND and NG GPUs are applied in real-world scenarios, let's explore how these GPU families address specific needs across industries. This section highlights the most relevant use cases for each family, focusing on how their unique strengths enable businesses and organizations to solve industry-specific challenges.

ND Family: Accelerating AI and Data-Intensive Industries

The ND family excels in industries that require large-scale data processing and advanced computational power for running complex models and simulations.

  1. AI in Autonomous Vehicles: Enhancing Real-Time Decision Making

ND GPUs are pivotal in the development of AI models for autonomous vehicles, enabling real-time decision-making from sensor data. 

These GPUs accelerate the processing of vast amounts of data from LIDAR, radar, and cameras, ensuring that vehicles can interpret their environment and make critical driving decisions instantaneously.

  2. Scientific Research and High-Performance Computing (HPC)

In scientific domains like genomics, climate science, and physics, ND GPUs provide the computational power needed to handle enormous datasets and complex simulations. 

These GPUs drastically reduce the time required for simulations, advancing research in fields like climate modeling, genomics, and physics by accelerating critical computations.

  3. AI/ML in Finance and Retail

In the financial sector, ND GPUs are leveraged for predictive analytics, fraud detection, and risk modeling. By processing large volumes of transactional and historical data, these GPUs help financial institutions identify trends, optimize portfolios, and predict market movements, while simultaneously enhancing fraud detection systems with real-time capabilities.

NG Family: Powering Graphics and Interactive Media

The NG family is optimized for industries requiring real-time graphics rendering, video transcoding, and interactive media. These GPUs excel in providing low-latency performance for visual applications like gaming, media production, and virtual workstations.

  1. Entertainment and Rendering: Transforming Creative Industries

NG GPUs play a key role in the entertainment industry, powering tasks such as CGI rendering and special effects creation. 

With their ability to handle complex graphical workloads, NG GPUs help produce realistic 3D models and animations for films and video games, ensuring high standards of visual fidelity and immersion.

  2. Media and Broadcast: Enabling Video Streaming and Real-Time Processing

NG GPUs are essential for video transcoding in media streaming platforms, enabling the real-time conversion of video files into different formats for various devices. 

This capability ensures smooth, high-quality playback on a wide range of platforms, from high-end TVs to mobile devices with limited bandwidth.

  3. Virtual Workstations: Facilitating Remote Work for Creative Professionals

NG GPUs support virtual workstations that allow creative professionals, such as designers and video editors, to work remotely on demanding tasks like 3D rendering and interactive design. 

These GPUs enable seamless access to high-performance design software from any location, facilitating collaboration on complex projects without the need for on-premises hardware.

The Future of Azure GPUs: Evolving Technologies and Trends

As demand for AI, ML, and HPC grows, GPU technology is evolving rapidly. Azure is committed to ensuring that both ND and NG GPUs stay at the forefront of technological innovation, integrating next-generation GPUs, faster memory such as HBM3, and specialized AI chips for even more advanced performance.

The expansion of quantum computing and the growing demand for real-time decision-making will continue to push the limits of what these GPUs can do, ensuring Azure’s GPUs remain critical to future workloads.
