The Sr. Systems Engineer will coordinate the planning of, and carry out, advanced research computing (RC) engineering duties. Implement current RC solutions and develop new ones to keep pace with complex research problems. Work independently to build, monitor, and maintain the integrity of RC systems. Provide technical expertise to teams and projects alongside research programs. Be a key contributor to multiple projects simultaneously.
Working with the FASRC Lead Storage Engineer, assist in the design, implementation, and lifecycle management of current and future storage deployments (currently Lustre, Ceph, and DellEMC Isilon systems).
Engage with FASRC groups to make available production-ready storage instances using the storage management/accounting services developed internally.
Work with the Lead Storage Engineer, vendors, other related groups on campus, and national/international groups to identify storage trends and new technologies. Discuss and coordinate with these groups when appropriate.
Faculty of Arts & Sciences Research Computing
Research Computing at Harvard is an enterprise that continues to reflect the University’s decentralized heritage, the evolution of research-computing infrastructure and funding opportunities, and the strategic development of its central information-technology organization, Harvard University Information Technology (HUIT), and its University Research Computing team.
Starting in 2007, FAS began consolidating and centralizing research-computing resources within the Division of Sciences and soon began expanding across the School as faculty in the social sciences and humanities began to use advanced computing in their research. The organization came to be called FAS Research Computing and, as more and more faculty across Harvard came to it for support, it extended its services beyond FAS, increasingly in collaboration with HUIT. With HUIT now developing an array of University-wide services and platforms to support faculty beyond FAS, FAS has the opportunity to strategically redefine what resources and services it will continue to support for the Arts & Sciences at Harvard and to broaden the base of users across the School who are advancing their research through computing.
In this context, FAS Research Computing continues to evolve, to expand its offerings, and to support research faculty across the School and their collaborators around the world. It has earned a reputation for building partnerships to accelerate research and collaboration. The Director of FAS RC will continue this legacy.
FAS’ research computing team directly engages with researchers through help requests, office hours, training, and in-depth consultations. FAS resources include a Top500.org high-performance computing cluster, virtual machines, storage, databases, instrumentation core facility workstations, and other development platforms. FAS Research Computing has numerous other successful collaborations, including building the MGHPCC (http://www.mghpcc.org/) in Holyoke, MA with leading partner universities. With these and other institutions, FAS launched the NSF-funded NESE project (http://nese.mghpcc.org), which creates a regional cloud storage repository.
This position occasionally requires work outside of normal business hours, and the engineer may be contacted during off hours.
This is a full-time position with flexible hours and a hybrid in-person/remote work schedule option to be agreed upon at hire. The selected candidate will periodically need to be on campus as business needs require. All remote work must be performed in a state in which Harvard is registered to do business (CA, CT, MA, MD, ME, NH, NY, RI, and VT).
Basic Qualifications
Minimum of seven years’ post-secondary education or relevant work experience
Additional Qualifications and Skills
Broad knowledge of the deployment and management of systems (e.g. storage, cluster computing, network, database, virtualized systems)
Demonstrated team performance skills, a service-minded approach, and the ability to act as a trusted advisor
Experience with git and version control in general
Advanced *NIX/Linux system administration experience
Experience with automation and configuration management (Puppet)
General networking skills (OSI model, Layer 2 & 3 switching and routing, TCP/IP…)
Knowledge of storage, network, and server hardware
Experience evaluating and deploying open-source tools
Cloud/AWS/Azure (EC2/EBS/S3/SES) experience
Source management (git/mercurial/svn)
SAN and NAS knowledge, including iSCSI, NFS, and FC (EMC / Isilon)