A leading AI technology firm in California is seeking a skilled professional to optimize distributed training systems built on PyTorch. The ideal candidate will have more than 8 years of experience in distributed systems and high-performance computing, along with a strong command of Python and of low-level performance optimization in CUDA. Responsibilities include developing monitoring tools and improving GPU cluster performance. The role offers the opportunity to work on cutting-edge AI solutions in a collaborative environment.