HPC Engineer

About Us

RCH Solutions is an established and rapidly growing global provider of computational, research, and data science expertise within Life Sciences and Healthcare. At RCH Solutions, our team rallies around a culture crafted for learning and achieving. We’re relentless in our pursuit of innovation and demanding of ourselves to deliver a ground-breaking computing experience for our clients, enabling them to deliver life-saving science to humanity.

Core Values

At RCH, our Core Values are more than just words; they embody the fabric of our culture, guiding our hiring, performance evaluation, and client engagement. Our core values include:
Embrace Excellence: Striving for best-in-class delivery of innovation and service
Be Accountable: Upholding integrity, ownership, and accountability
Adventure Together: Fostering a culture of continuous improvement
Succeed as a Team: Harnessing the power of teamwork to achieve outcomes
Boundaries and Balance: Valuing work-life balance as a core aspect of our culture

If you share these values, we encourage you to read on; you might find a great home for your career.

Job Description

RCH Solutions is seeking an HPC Engineer to work closely with customer stakeholders, scientists, and IT professionals. The role involves delivering Compute at Scale and supporting scientific initiatives through developing, evolving, and administering HPC platforms, supporting scientific applications, workflows, and infrastructure on-premises and in the cloud. The ideal candidate will have hands-on experience with Linux system administration, solution architecting, and engineering (both on-prem and cloud-based), and will play a key role in transforming IT computing services to support client growth. Responsibilities include architecture, project execution, and establishing IT infrastructure best practices.

Responsibilities include full-stack support: platform design and evolution, application administration, workflow support, performance tuning, system monitoring and maintenance, troubleshooting hardware/software/network issues, solution architecting, engineering, and documentation. You will leverage your expertise to provide consulting and recommendations in research or analytics initiatives, and support client projects. Specific focuses include:
Collaborating with teams and clients to deliver HPC and compute services
Understanding industry best practices
Supporting architecture and design efforts
Advising on and supporting customer workflows
Documenting computational assets
Supporting AWS cloud applications, migrations, and modernization
Implementing CloudOps / Infrastructure as Code (IaC)
Configuring AWS infrastructure for new platforms
Ensuring security and compliance standards
Supporting transition to operational teams
Providing troubleshooting and service restoration
Mentoring junior team members
Managing multiple client engagements

Essential Qualifications

Bachelor’s or Master’s degree in Computer Science or related field
5+ years of experience with HPC clusters and systems administration
Experience with SLURM and Grid Engine preferred
5+ years in Solution Architecture or Cloud Infrastructure
Experience with Scientific/Research IT, preferably Life Sciences
Familiarity with POSIT products
Extensive Linux command-line system administration skills
Knowledge of Active Directory, DNS, DHCP, LDAP, NFS, SMB
Experience with application building, installation, and troubleshooting
Linux OS installation and tuning
Familiarity with Linux package management
Intermediate networking knowledge
Scripting, automation, and configuration management skills (Ansible, Terraform, CloudFormation)
Experience with Scientific/Research applications
Strong time-management and communication skills
Proactive problem-solving abilities
Attention to detail and ability to handle multiple clients
Ability to work independently or as part of a team
Must not require sponsorship now or in the future

Preferred Qualifications

Experience with Python, R, or related data science languages
Experience with databases and big data management
Experience with AI/ML technologies
Experience containerizing workloads with Docker or Singularity
Experience with Nvidia DGX systems

Additional Information

We offer a competitive salary, comprehensive health benefits, 401(k), continuing education, and a remote work environment aligned with West Coast hours. Candidates must not need sponsorship now or in the future.
Location: San Francisco, CA, United States