Date: October 2017
Title: IT Engineer 4 (Cyberinfrastructure Engineer)
Department: University Technology - Research Computing and Cyberinfrastructure
POSITION OBJECTIVE Working under general direction, serve as lead architect to manage, develop, and design high-level systems. The position shares responsibility for the research computing services provided to the university community, which leverage the university's investment in cyberinfrastructure. This includes identifying appropriate computational platforms (local and external) for research projects and diagnosing and resolving issues in the configuration, installation, tuning, and management of very large distributed and tightly coupled computer systems based on the Linux OS. Additionally, the incumbent will architect, procure, deploy, and operate various storage systems, fast parallel filesystems, peta-scale storage, and tape archival systems. The incumbent will be a key contributor to the development of innovative solutions for supporting campus research work with the high performance computing cluster, research data storage, data visualization resources, secure research environment, and other services provided to faculty and researchers at Case Western Reserve University. The incumbent will work in a challenging environment on mid-scale and large-scale local HPC, cloud computing platforms, and national supercomputing facilities. The incumbent will help research, evaluate, and develop new technologies to support the university's research mission. The incumbent will work directly with university researchers on projects that require specialized HPC skills. The position requires honesty and integrity when handling confidential research data, compliance with all regulatory requirements connected to such activities, and adherence to established change control procedures.
Architect, design, build, operate, and manage core technical services, specifically including high performance computing, fast parallel storage for HPC, and general research data storage services. Evaluate storage and file system options, including those based on PanFS, GPFS, Lustre, Gluster, and other technologies, as well as computational options. Develop and use tools to manage task automation on the computational systems. (25%)
Provide consulting, programming, and other effort to faculty and research staff in identifying and using high performance clusters, research storage, and research archival systems. Collaborate with faculty on research projects and identify external funding opportunities, and write research proposals to federal, state and private entities; facilitate interdisciplinary research. (20%)
Develop and teach workshops, course modules, seminars, and training sessions in basic use of HPC resources, MPI programming, C/C++/FORTRAN debugging, GPU computing, and other topics. (10%)
Lead (or collaborate on) systems programming projects to maintain and enhance system functionality, in areas such as large systems monitoring, systems and cluster management and file systems and I/O subsystems. Define and scope assigned projects, including budgets. Synchronize, plan, and manage multiple projects that include technology, financial, and staffing components, with a goal of enhancing reliability, stability, usability, performance, and security. (10%)
Provide detailed systems support that involves direct interaction on a regular basis with a growing group of faculty, postdoctoral scholars, staff, and students who use RCCI services. Respond to questions, troubleshoot, and provide advice on optimal use of the facility and on opportunities to increase research productivity and reduce cost. (10%)
Work with vendors and contractors on project efforts designed to better meet the needs of our research computational users. (7%)
Generate internal technical documentation as well as user documentation for the university's growing community of users of research computing and cyberinfrastructure resources. (7%)
Diagnose and fix system problems, help analyze system issues and develop and implement workarounds and/or patches for software bugs. (6%)
NONESSENTIAL FUNCTIONS Other duties as assigned. (5%)
CONTACTS Department: Frequent interaction with all RCCI staff and regular interaction with other UTech staff, especially the network engineering team and the servers and storage team, to coordinate installations, network configuration, hardware deployment, and hardware support.
University: Regular interaction with university faculty and research staff to carry out main collaboration and support functions.
External: Regular contact with vendors of computing, storage, and networking equipment and services (e.g., Dell, Panasas, Arista, Mellanox, NVIDIA) to obtain the optimal hardware computing platform. Regular contact with counterparts in other organizations (e.g., NSF, NIH, affiliated hospitals, other universities) to carry out main collaboration and support functions.
Students: Regular interaction with students to support and advise on research group computational work.
SUPERVISORY RESPONSIBILITY May coordinate efforts of 3-5 staff in support of project deliverables. May include supervision and team leadership of core technical services projects. May direct the work of student employees.
QUALIFICATIONS Experience: 5 to 7 years of progressive experience, success, and leadership in support of computational resources or services used for research or in use of such resources or services in conducting computationally intensive research.
Education: Bachelor's degree required in a scientific discipline (master's or doctoral degree in a scientific discipline preferred). Must have relevant level of certification or equivalent technical expertise.
Expertise in systems programming and management of large-scale UNIX/Linux based systems, preferably in a high-performance computing (HPC) environment.
Knowledge, skills and experience in optimizing processor, interconnect, and storage technologies for high performance computing systems.
Knowledge and understanding of UNIX/Linux internals, preferably RedHat Enterprise Linux.
Strong understanding of network concepts including TCP/IP, DNS, routing, and firewalls.
Some familiarity with installation, configuration, monitoring, and tuning of workload management systems such as SLURM or PBS/Torque.
Strong skills in several of the following areas of scientific and high performance computing: parallel computing using distributed and shared memory with MPI and OpenMP; scientific programming in C, C++, FORTRAN or other languages; scientific programming experience in Python, Java, or MATLAB preferred; use of debugging and profiling tools; porting scientific code across architectures; modeling and simulation of physical and simulated systems; numerical methods and use of mathematical libraries such as IMSL, NAG, GSL, ScaLAPACK, and MKL; writing research proposals to federal agencies such as NIH and NSF; writing manuscripts for publication in peer-reviewed journals; Shell/PERL/Python systems programming.
Expert skill in computer programming and algorithm development sufficient for designing new scientific or other software, modifying existing scientific algorithms, debugging and profiling existing scientific or other algorithms, and assisting other programmers in these tasks across a wide variety of programming languages such as C, C++, FORTRAN, Python, Java, etc.
Expert skill in design and development of parallel algorithms for scientific or other research application, using MPI, OpenMP, or OpenACC.
High level of proficiency with word processor, spreadsheet, presentation, and diagramming tools such as Word, Excel, PowerPoint, and Visio.
Demonstrated ability to work independently as well as collaboratively on large projects and to contribute to an active intellectual environment. Must be self-motivated and able to learn quickly.
Strong technical and collaboration skills needed to create and deploy innovative ways of allowing our diverse user base to effectively utilize the unique resources that RCCI provides.
Experience in the use of open source software. Prefer experience in supporting university enterprise technology at the enterprise or school level.
Knowledge of high-performance computing platforms, cloud computing platforms, and application models, applications programming, parallel programming, database technologies, data and network infrastructure, and software deployment management tools and methods used in compute-, data-, and network-intensive research preferred.
Ability to develop personal networks and use them to strengthen internal and external support. Ability to identify opportunities and take action to build strategic relationships between UTech and other university areas, teams, departments, etc., to help achieve business goals. (Building Strong Alliances Skills)
Excellent written and oral communication skills. Ability to listen actively and respond to verbal and non-verbal cues.
Ability to respond to difficult, stressful, or sensitive interpersonal situations in ways that reduce or minimize potential conflict and maintain good working relationships among internal and external customers. Ability to recognize awkward or potentially embarrassing situations that sometimes arise. Remains aware of tone and chooses words carefully, while ensuring that the intended message is clear, polite, and readily understood. (Tact & Diplomacy Skill)
Ability to look at situations from multiple perspectives, break problems into component parts, look for underlying causes, and think through the consequences of different courses of action. (Analytical Skills)
Ability to optimize the use of time and resources to achieve the desired results; plan and organize work effectively to minimize crises; and prioritize appropriately. (Planning and Organization Skills)
Ability to identify various types of problems along with the creation of workable solutions. Requires the identification and analysis of problems, evaluation of alternatives, and provision of solutions. (Problem Solving Skills)
Ability to develop in-depth understanding of client needs in order to be more helpful. The ability to consider how different audiences are likely to respond and choose the best method of communicating the message to each audience. (Customer Focus Skill)
Ability to recognize the importance of certain tasks and responsibilities and to prioritize so that deadlines are met; knows how and when to escalate issues to higher levels. (Dependability & Reliability Skill)
Consistently models high standards of honesty, integrity, trust, openness and respect for the individual. Embraces diversity.
Ability to work in a fast-paced environment while working on multiple projects.
Ability to work with and maintain confidential information.
Ability to understand and follow project management processes to meet continuing assignments.
WORKING CONDITIONS Working conditions are typical of an office environment and computer rooms. The position requires typing on a computer keyboard and using a computer mouse. The position occasionally requires entering a UTech data center and reaching above shoulder height or bending, stooping, or crouching to push a cable into a connector or pull a cable from a connector. Some heavy lifting of up to 40 pounds may be required. Bending and kneeling may be required. The employee may be required to attend meetings or functions outside normal working hours, including weekends. The employee may be required to carry a cell phone during and after normal work hours, including weekends, to attend to after-hours emergencies.
DIVERSITY STATEMENT In employment, as in education, Case Western Reserve University is committed to Equal Opportunity and Diversity. Women, veterans, members of underrepresented minority groups, and individuals with disabilities are encouraged to apply.
REASONABLE ACCOMMODATIONS Case Western Reserve University provides reasonable accommodations to applicants with disabilities.
Applicants requiring a reasonable accommodation for any part of the application and hiring process should contact the Office of Inclusion, Diversity and Equal Opportunity at 216-368-8877 to request a reasonable accommodation. Determinations as to granting reasonable accommodations for any applicant will be made on a case-by-case basis.