About the Role

The role involves designing, implementing, and optimizing CUDA kernels to achieve maximum performance on GPU architectures, with a strong emphasis on low-level code efficiency and scalability.

Responsibilities

Design and write efficient CUDA kernels for parallel computing tasks
Optimize existing GPU code for improved throughput and reduced latency
Collaborate with performance engineers to profile and benchmark kernel execution
Debug and resolve issues in GPU-accelerated applications
Analyze hardware limitations and adapt algorithms accordingly
Implement memory access patterns that maximize bandwidth utilization
Work with minimal supervision in a fast-paced technical environment
Contribute to code reviews and technical design discussions
Ensure kernel compatibility across different GPU architectures
Translate algorithmic specifications into high-performance GPU code
Maintain documentation for kernel functionality and performance characteristics
Respond to performance regressions with targeted optimizations
Stay current with advancements in GPU computing and CUDA capabilities
Support integration of CUDA modules into larger software systems
Assist in defining best practices for GPU programming within the team

Nice to Have

Experience with HPC workloads or scientific computing
Background in algorithm acceleration on GPUs
Familiarity with modern CUDA features like cooperative groups
Knowledge of GPU memory hierarchy and caching behavior
Experience with cross-platform GPU development
Exposure to machine learning or data-intensive applications
Contributions to open-source GPU computing projects
Understanding of power and thermal constraints in GPU systems

Compensation

Competitive salary based on experience

Work Arrangement

Remote within the United States

Team

Small, focused engineering team working on high-performance computing solutions

What You’ll Do

Develop and refine CUDA kernels to meet strict performance targets
Work closely with system architects to align kernel design with hardware capabilities
Use profiling data to guide optimization efforts and validate improvements

What We Look For

Deep technical expertise in GPU programming models
A methodical approach to performance analysis and tuning
Commitment to writing clean, maintainable, and efficient code

Not available

Pragmatike is hiring a CUDA Kernel Engineer (Remote US)

About the Role

Responsibilities

Nice to Have

Compensation

Work Arrangement

Team

What You’ll Do

What We Look For

Similar Jobs

Customer Success Manager (US)

Enterprise Solutions Strategist

Developer Growth Manager

Account Manager - Strategic Accounts (USA Remote)

Strategic Account Executive | Housing

Level 2 Technical Support Agent – Europe (On-site | Relocation Required | Local Language Mandatory)