TID Scientific Data Architect
Job ID: 6470
Location: SLAC - Menlo Park, CA
Full-Time, Regular
Position Overview
SLAC National Accelerator Laboratory, operated by Stanford University for the U.S. Department of Energy's Office of Science, is a premier research facility advancing knowledge in disciplines such as particle physics, astrophysics, and materials science. The Scientific Computing Systems (SCS) division within the Technology and Innovation (TID) Directorate at SLAC is seeking a Big Data Architect to contribute to the development of data management tools that support SLAC's diverse scientific programs, including the Linac Coherent Light Source (LCLS).
LCLS is the world's leading X-ray free-electron laser, featuring a suite of instruments capable of generating large, high-velocity, and high-variety datasets. These datasets are essential for elucidating atomic structures and dynamics at the femtosecond timescale.
In this role, you will focus on the development, integration, and management of systems that transport, store, search, retrieve, analyze, and visualize petascale scientific datasets, including those generated by LCLS. You will collaborate with a diverse team to create systems that facilitate the seamless transfer of data to Department of Energy Leadership Class Facilities for real-time analysis. Additionally, you will play a key role in ensuring that these systems are adaptable to meet the diverse needs of various scientific programs at SLAC, including but not limited to LCLS.
As a Big Data Architect in the SCS Data Management Department, you will collaborate within an interdisciplinary team of software developers, scientists, and engineers to oversee scientific data management systems. This includes responsibilities related to data curation, timely data transfer, and ensuring archival integrity for various scientific programs utilizing the SLAC Shared Science Data Facility (S3DF).
Key responsibilities include managing data transfer, replication, and backup processes; building and testing automation tools; and ensuring the effective handling of data received from next-generation scientific instruments that connect to remote computing resources for executing time-intensive and data-intensive workflows. You will work closely with scientific and operations support staff across SLAC to develop data management solutions that enhance the capabilities of various scientific programs.
Given the nature of this position, SLAC is open to on-site and hybrid work options.
Your specific responsibilities include:
Collaborate with science groups, the SLAC Shared Science Data Facility (S3DF), and SLAC IT to ensure the complete and timely delivery of data from scientific instruments to S3DF and remote Department of Energy computing facilities.
Oversee scientific data repositories, ensuring their integrity, enforcing data retention and access policies, and managing data movement for analysis or archival purposes.
Interface with scientific teams and partners from the Department of Energy Advanced Scientific Computing Research facilities to understand future data requirements and design Big Data systems that are scalable, optimized, and fault-tolerant, particularly in the context of multi-facility data processing.
Develop, test, implement, and maintain database management applications.
Contribute to the development of guidelines, standards, and processes to ensure data quality, integrity, and security across systems and datasets.
Partner with scientific data owners and team members to understand the types of data collected in various programs and suggest new tools or methods to enhance data ingestion, storage, and access, ensuring datasets are FAIR (Findable, Accessible, Interoperable, and Reusable) and AI-ready.
Document system builds and application configurations, maintaining and updating documentation as needed.
Serve as a technical resource for applications.
Follow the team's software development methodology.
Other duties may also be assigned.
To be successful in this position you will bring:
Bachelor's degree and five years of relevant experience, or a combination of education and relevant experience.
Demonstrated experience in designing, developing, testing, and deploying applications.
Strong understanding of data design, architecture, relational databases, and data modeling.
Knowledge of key data structures, algorithms, and techniques pertinent to systems that support high-volume, high-velocity, or high-variety datasets, including data mining, machine learning, NLP, and data retrieval.
Experience with parallel and distributed data processing techniques and platforms (e.g., MPI, MapReduce, batch processing).
Experience in scripting languages and debugging.
Ability to analyze systems and data pipelines and propose solutions that leverage emerging technologies.
Experience deploying reliable data systems and data quality management.
Ability to research, evaluate, architect, and deploy new tools, frameworks, and patterns to build scalable Big Data platforms.
Thorough understanding of all aspects of the software development life cycle and quality control practices.
Strong communication skills with both technical and non-technical clients.
Ability to select, adapt, and effectively use a variety of programming methods.
Ability to recognize and recommend needed changes in user and/or operations procedures.
SLAC Employee Competencies
Effective Decisions: Uses job knowledge and solid judgment to make quality decisions in a timely manner.
Self-Development: Pursues a variety of venues and opportunities to continue learning and developing.
Dependability: Can be counted on to deliver results with a sense of personal responsibility for expected outcomes.
Initiative: Pursues work and interactions proactively with optimism, positive energy, and motivation to move things forward.
Adaptability: Flexes as needed when change occurs, maintaining an open outlook while adjusting and accommodating changes.
Communication: Ensures effective information flow to various audiences and creates and delivers clear, appropriate written and spoken messages.
Relationships: Builds relationships to foster trust, collaboration, and a positive climate to achieve common goals.
Job-Specific Competencies
Comfortable writing efficient, scalable, and documented code in Python or C/C++.
Experience with libraries and frameworks such as Rucio, Kafka, ZeroMQ, Kubernetes, Jupyter, and Grafana is a plus.
Physical Requirements and Working Conditions:
Consistent with its obligations under the law, the University will provide reasonable accommodation to any employee with a disability who requires accommodation to perform the essential functions of his or her job.
Work Standards:
Interpersonal Skills: Demonstrates the ability to work well with Stanford colleagues and clients and with external organizations.
Promote Culture of Safety: Demonstrates commitment to personal responsibility and value for environment, safety and security; communicates related concerns; uses and promotes safe behaviors based on training and lessons learned. Meets the applicable roles and responsibilities as described in the ESH Manual, Chapter 1, General Policy and Responsibilities: http://www-group.slac.stanford.edu/esh/eshmanual/pdfs/ESHch01.pdf
Subject to and expected to comply with all applicable University policies and procedures, including but not limited to the personnel policies and other policies found in the University's Administrative Guide, http://adminguide.stanford.edu.
Classification Title: Big Data Architect 1
Grade: K, Job Code: 4734
Employment Duration: Regular Continuing
The expected pay range for this position is $157,945 - $177,385 per annum. SLAC National Accelerator Laboratory/Stanford University provides pay ranges representing its good faith estimate of what the university reasonably expects to pay for a position. The pay offered to a selected candidate will be determined based on factors such as (but not limited to) the scope and responsibilities of the position, the qualifications of the selected candidate, departmental budget availability, internal equity, geographic location, and external market pay for comparable jobs.
SLAC National Accelerator Laboratory is an Affirmative Action / Equal Opportunity Employer and supports diversity in the workplace. All employment decisions are made without regard to race, color, religion, sex, national origin, age, disability, veteran status, marital or family status, sexual orientation, gender identity, or genetic information. All staff at SLAC National Accelerator Laboratory must be able to demonstrate the legal right to work in the United States. SLAC is an E-Verify employer.