Senior AI Research Engineer, Model Inference (Remote)

Overview
Fintech opportunity with emphasis on AI and GPU-accelerated model optimization. Remote work available for a global team. Join a pioneering organization focused on innovative products that empower businesses and individuals.
Responsibilities
Implement and optimize custom inference and fine-tuning kernels for language models across multiple hardware backends
Design and customize Vulkan compute shaders for quantized operators and fine-tuning workflows
Investigate and resolve GPU acceleration issues on Vulkan and integrated/mobile GPUs
Collaborate with research and engineering teams to prototype, benchmark, and scale new model optimization methods
Deliver production-grade, efficient language model deployment for mobile and edge use cases
Qualifications
Proven expertise in GPU acceleration with the Vulkan API
Strong background in quantization and mixed-precision model optimization
Proficiency in C++ and GPU kernel programming
Hands-on experience with mobile GPU acceleration and model inference
Familiarity with large language model architectures (e.g., Qwen, Gemma, LLaMA, Falcon)
Preferred Qualifications
Experience with LoRA fine-tuning and parameter-efficient training methods
Ability to debug GPU-specific performance and stability issues on desktop and mobile devices
Experience creating and curating custom datasets for style transfer and domain-specific fine-tuning
Demonstrated ability to apply empirical research to overcome challenges in model optimization
We prioritize candidate privacy and champion equal-opportunity employment. Central to our mission is our partnership with companies that share this commitment. We aim to foster a fair, transparent, and secure hiring environment for all. If you encounter any employer not adhering to these principles, please bring it to our attention immediately.
We are not the EOR (Employer of Record) for this position. Our role in this specific opportunity is to connect outstanding candidates with a top-tier employer.
Location: San Francisco, CA, United States
Salary: $200,000 - $250,000
Job Type: Full-time
Category: Engineering