Name: Fundamentals of Accelerated Computing with Modern CUDA C++
Price: 500 USD

Fundamentals of Accelerated Computing with Modern CUDA C++ (FACCC)

Course Overview

This workshop provides a comprehensive introduction to general-purpose GPU programming with CUDA. You'll learn how to write, compile, and run GPU-accelerated code, leverage CUDA core libraries to harness the power of massive parallelism provided by modern GPU accelerators, optimize memory migration between CPU and GPU, and implement your own algorithms. At the end of the workshop, you'll have access to additional resources to create your own GPU-accelerated applications.

Please note that once a booking has been confirmed, it is non-refundable. This means that after you have confirmed your seat for an event, it cannot be cancelled and no refund will be issued, regardless of attendance.

Prerequisites

Basic C++ competency, including familiarity with lambda expressions, loops, conditional statements, functions, standard algorithms and containers.
No previous knowledge of CUDA programming is assumed.

Course Objectives

At the conclusion of the workshop, you'll have an understanding of the fundamental concepts and techniques for accelerating C++ code with CUDA and be able to:

Write and compile code that runs on the GPU
Optimize memory migration between CPU and GPU
Leverage powerful parallel algorithms that simplify adding GPU acceleration to your code
Implement your own parallel algorithms by directly programming GPUs with CUDA kernels
Utilize concurrent CUDA streams to overlap memory traffic with compute
Know where, when, and how to best add CUDA acceleration to existing CPU-only applications

Outline: Fundamentals of Accelerated Computing with Modern CUDA C++ (FACCC)

Introduction

Meet the instructor.
Create an account at courses.nvidia.com/join

CUDA Made Easy: Accelerating Applications with Parallel Algorithms

To make your first steps in GPU programming as easy as possible, this lab teaches you how to leverage powerful parallel algorithms that let you GPU-accelerate your code by changing only a few lines. While doing so, you’ll learn fundamental concepts such as execution space and memory space, parallelism, heterogeneous computing, and kernel fusion. These concepts will serve as a foundation for your advancement in accelerated computing. By the time you complete this lab, you will be able to:

Write, compile, and run GPU code
Refactor standard algorithms to execute on GPU
Extend standard algorithms to fit your unique use cases
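
As a rough illustration of the ideas named in this lab (it is not taken from the course material), the sketch below uses only standard C++ running on the CPU. It contrasts a two-pass computation, which stores an intermediate result, with a fused single pass, which is the idea behind kernel fusion. The temperature conversion and all function names are hypothetical. GPU libraries such as Thrust offer similarly shaped algorithms, but the exact calls used in the lab are not specified here.

```cpp
#include <algorithm>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical example: convert Celsius readings to Fahrenheit
// with a standard algorithm and a lambda.
std::vector<float> to_fahrenheit(const std::vector<float>& celsius) {
    std::vector<float> out(celsius.size());
    std::transform(celsius.begin(), celsius.end(), out.begin(),
                   [](float c) { return c * 9.0f / 5.0f + 32.0f; });
    return out;
}

// Two passes: the converted values are written to memory first,
// then read back to find the maximum. Assumes a non-empty input.
float max_fahrenheit_two_pass(const std::vector<float>& celsius) {
    std::vector<float> f = to_fahrenheit(celsius);
    return *std::max_element(f.begin(), f.end());
}

// Fused: conversion and reduction happen in one pass, so no
// intermediate vector is allocated or traversed (requires C++17).
float max_fahrenheit_fused(const std::vector<float>& celsius) {
    return std::transform_reduce(
        celsius.begin(), celsius.end(),
        std::numeric_limits<float>::lowest(),           // initial value
        [](float a, float b) { return std::max(a, b); }, // reduction
        [](float c) { return c * 9.0f / 5.0f + 32.0f; }  // transformation
    );
}
```

Both functions return the same value for a non-empty input. The fused version avoids the extra round trip through memory, which tends to matter more on GPUs, where memory bandwidth is often the limiting factor.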

Break (60 mins)

Unlocking the GPU’s Full Potential: Harnessing Asynchrony with CUDA Streams

In the previous lab, you learned how to use parallel algorithms. However, parallelism alone is not sufficient for accelerating your applications. To fully utilize GPUs, this lab will teach you another fundamental concept: asynchrony. You'll learn how and when to leverage asynchrony, and you’ll use Nsight Systems to distinguish synchronous and asynchronous algorithms and identify performance bottlenecks. By the time you complete this lab, you will be able to:

Use CUDA streams to overlap execution and memory transfers
Use CUDA events for asynchronous dependency management
Profile CUDA code with NVIDIA Nsight Systems

Break (15 mins)

Implementing New Algorithms with CUDA Kernels

Previous labs gave you the necessary understanding of how standard parallel algorithms can provide GPU acceleration that is both convenient and speed-of-light. However, sometimes your unique use cases are not covered by accelerated libraries. In this lab, you’ll learn the CUDA SIMT programming model to program the GPU directly using CUDA kernels. This lab will also cover utilities provided by the CUDA ecosystem to facilitate development of custom CUDA kernels. By the time you complete this lab, you will be able to:

Write and launch custom CUDA kernels
Control thread hierarchy
Leverage shared memory
Use cooperative algorithms

Final Review

Review key learnings and wrap up questions.
Complete the assessment to earn a certificate.
Take the workshop survey.

Prices & Delivery methods

Online Training

Duration
8 hours

Price

US $ 500

Classroom Training

Duration
8 hours

Price

United States: US $ 500

Instructor-led Online Training: This is an Instructor-Led Online (ILO) course. Sessions are conducted via WebEx in a VoIP environment and require an internet connection and a headset with a microphone connected to your computer or laptop. If you have any questions about our online courses, feel free to contact us by phone or email at any time.

Germany

Sep 11, 2026	Online Training	Time zone: Central European Summer Time (CEST)
Oct 16, 2026	Online Training	Time zone: Central European Summer Time (CEST)
Nov 6, 2026	Online Training	Time zone: Central European Time (CET)
Dec 4, 2026	Online Training	Time zone: Central European Time (CET)