Course Outline

Overview of the Domestic AI GPU Ecosystem

  • Comparison of Huawei Ascend, Biren, and Cambricon MLU
  • Contrasting CUDA with CANN, Biren SDK, and BANGPy frameworks
  • Industry trends and vendor ecosystems

Preparation for Migration

  • Evaluating your CUDA codebase
  • Identifying target platforms and SDK versions
  • Installing toolchains and setting up environments
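Evaluating an existing CUDA codebase usually starts with an inventory of how heavily it leans on the CUDA runtime. As a minimal sketch (the regexes and the `audit_cuda_source` helper are illustrative, not part of any vendor toolchain), a script like this can count CUDA identifiers and kernel launches in a source file to gauge porting effort:

```python
import re
from collections import Counter

# Illustrative audit helper: counts CUDA identifiers (API calls, enums)
# and triple-chevron kernel launches in a source string.
CUDA_IDENTIFIER = re.compile(r"\bcuda[A-Z]\w+\b")  # e.g. cudaMalloc, cudaMemcpy
KERNEL_LAUNCH = re.compile(r"<<<[^>]*>>>")         # CUDA launch syntax

def audit_cuda_source(source: str) -> dict:
    """Return counts of CUDA identifiers and kernel launches."""
    calls = Counter(CUDA_IDENTIFIER.findall(source))
    return {
        "identifiers": dict(calls),
        "kernel_launches": len(KERNEL_LAUNCH.findall(source)),
    }

sample = """
cudaMalloc(&d_a, n * sizeof(float));
cudaMemcpy(d_a, h_a, n * sizeof(float), cudaMemcpyHostToDevice);
add<<<blocks, threads>>>(d_a, d_b, n);
cudaFree(d_a);
"""
report = audit_cuda_source(sample)
```

Running the audit across a whole repository gives a rough map of which files touch the runtime directly and which only contain device-agnostic logic.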

Code Translation Techniques

  • Porting CUDA memory access patterns and kernel logic
  • Mapping compute grid and thread models
  • Exploring automated versus manual translation options
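The grid/thread mapping above can be made concrete with a small sketch. The point is that a CUDA kernel body is written against a flat global index, and on a target platform with a different parallel model that launch collapses into the platform's own loop or task structure. The `launch_1d` emulator below is purely illustrative:

```python
def cuda_global_index(block_idx: int, block_dim: int, thread_idx: int) -> int:
    # CUDA's 1-D flattening rule: blockIdx.x * blockDim.x + threadIdx.x
    return block_idx * block_dim + thread_idx

def launch_1d(kernel, grid_dim: int, block_dim: int, *args):
    """Serially emulate a 1-D CUDA launch; on a ported backend, the
    target SDK's loop or task model replaces this double loop."""
    for b in range(grid_dim):
        for t in range(block_dim):
            kernel(cuda_global_index(b, block_dim, t), *args)

def saxpy(i, a, x, y, out, n):
    # Kernel body written once against the flat index; the bounds
    # guard carries over unchanged from the CUDA original.
    if i < n:
        out[i] = a * x[i] + y[i]

n = 5
x = [1.0] * n
y = [2.0] * n
out = [0.0] * n
launch_1d(saxpy, 2, 4, 2.0, x, y, out, n)  # 2 blocks of 4 threads, 5 valid indices
```

Keeping the kernel body index-based like this is what makes both automated and manual translation tractable: only the launch wrapper changes per platform.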

Platform-Specific Implementations

  • Utilizing Huawei CANN operators and custom kernels
  • Understanding the Biren SDK conversion pipeline
  • Rebuilding models using BANGPy (Cambricon)
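The three platform workflows above share a common shape: a logical operator name is registered against a per-backend implementation, and the framework dispatches by name at run time. The sketch below shows only that shape; the registry, backend labels, and `register_op` decorator are hypothetical and are not the real CANN or BANGPy registration APIs, which each have their own decorators and build steps:

```python
# Illustrative operator registry: one logical op name, one
# implementation per backend. Backend labels here are placeholders.
OP_REGISTRY: dict = {}

def register_op(name: str, backend: str):
    """Hypothetical decorator that records an op implementation."""
    def wrap(fn):
        OP_REGISTRY[(name, backend)] = fn
        return fn
    return wrap

@register_op("relu", backend="cpu_reference")
def relu_ref(xs):
    # A ported Ascend or Cambricon kernel would be registered under the
    # same op name with its own backend label.
    return [max(0.0, v) for v in xs]

def dispatch(name: str, backend: str, *args):
    try:
        return OP_REGISTRY[(name, backend)](*args)
    except KeyError:
        raise NotImplementedError(f"{name} has no {backend} implementation")

result = dispatch("relu", "cpu_reference", [-1.0, 0.5, 2.0])
```

Keeping a CPU reference implementation registered alongside each custom kernel also gives a built-in correctness oracle for the ported operators.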

Cross-Platform Testing and Optimization

  • Profiling execution on each target platform
  • Tuning memory usage and analyzing parallel execution efficiency
  • Monitoring performance and iterative refinement
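For a like-for-like comparison across platforms, the same timing harness should wrap the workload on each target. A minimal sketch using only wall-clock timing (real device measurements should also synchronize the device queue before and after each run, which is omitted here):

```python
import statistics
import time

def profile(fn, *args, repeats: int = 20, warmup: int = 3) -> float:
    """Median wall-clock time of fn(*args) over several repeats.
    Warmup runs absorb one-time costs such as JIT or cache effects."""
    for _ in range(warmup):
        fn(*args)
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)

# Example workload; on a real port this would be a kernel launch.
median_seconds = profile(sum, range(10_000))
```

The median is preferred over the mean here because a single slow outlier (scheduler preemption, first-touch page faults) would otherwise skew the comparison between platforms.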

Managing Mixed GPU Environments

  • Implementing hybrid deployments across multiple architectures
  • Establishing fallback strategies and device detection mechanisms
  • Utilizing abstraction layers for improved code maintainability
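Device detection with a fallback chain can be sketched as follows. The probe functions are placeholders: in practice each would test for its vendor runtime, for example by attempting to import the Ascend or Cambricon Python bindings and enumerate devices.

```python
# Placeholder probes; real ones would check for vendor runtimes.
def probe_ascend() -> bool:
    return False

def probe_mlu() -> bool:
    return False

def probe_cpu() -> bool:
    return True  # CPU fallback is always available

# Preference order for a mixed deployment, most capable first.
FALLBACK_ORDER = [
    ("ascend", probe_ascend),
    ("mlu", probe_mlu),
    ("cpu", probe_cpu),
]

def select_device() -> str:
    """Return the first available backend in the fallback order."""
    for name, probe in FALLBACK_ORDER:
        if probe():
            return name
    raise RuntimeError("no usable compute backend found")

device = select_device()  # "cpu" with the placeholder probes above
```

An abstraction layer then keys all allocation and dispatch off the selected backend name, so application code never branches on vendor specifics directly.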

Case Studies and Best Practices

  • Porting vision and NLP models to Ascend or Cambricon
  • Adapting inference pipelines for Biren clusters
  • Addressing version mismatches and API discrepancies
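Version mismatches are cheapest to handle by failing fast at startup. A small stdlib-only sketch (the version numbers below are invented for illustration; real values come from the installed toolchain):

```python
def parse_version(v: str) -> tuple:
    """Parse a dotted version string like '6.3.0' into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))

def check_min_version(found: str, required: str, component: str) -> None:
    """Raise if an installed SDK is older than the version the port targets."""
    if parse_version(found) < parse_version(required):
        raise RuntimeError(
            f"{component} {found} is older than required {required}"
        )

# Hypothetical numbers for illustration only.
check_min_version("7.0.1", "6.3.0", "toolkit")  # passes silently

compatible = True
try:
    check_min_version("5.1.0", "6.3.0", "toolkit")
except RuntimeError:
    compatible = False
```

Tuple comparison handles multi-digit components correctly (e.g. 6.10 sorts after 6.3), which naive string comparison would get wrong.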

Summary and Next Steps

Requirements

  • Practical experience with CUDA programming or GPU-based applications
  • Understanding of GPU memory architectures and compute kernels
  • Familiarity with AI model deployment or acceleration workflows

Target Audience

  • GPU developers
  • System architects
  • Porting specialists

Duration

21 hours
