System Overview

All Systems Operational

System Identity

whoami
Walid Abu Al-Afia


Computational Engineer @ St. Jude Children's Research Hospital

M.S. Computer Science @ The University of Texas at Austin

Memphis, TN • Originally from Amman, Jordan • NVIDIA Certified: AI Infrastructure & Operations

CPU Cores

across 6 clusters

GPUs

managed & allocated

Cluster Environments

6
HPC & SCCE clusters

Data Migrated

30 PB
cryo-EM/ET imaging data

Active Roles

systemctl status
RUNNING • Computational Engineer • St. Jude Children's Research Hospital • May 2024 – Present
RUNNING • IS Internship Coordinator • St. Jude Children's Research Hospital • May 2023 – Present
RUNNING • M.S. Computer Science • UT Austin • Aug 2025 – Dec 2028

System Uptime

since June 2022

Recent Events

tail -f /var/log/career
[2025-08] Started M.S. Computer Science at UT Austin
[2025-01] NVIDIA Certified Associate: AI Infrastructure & Operations
[2024-05] Promoted to Computational Engineer II
[2024-03] Completed 30 PB cryo-EM data migration
[2024-01] Piloted AI Agent adoption across HPC clusters
[2023-10] Published at IEEE/RSJ IROS 2023
[2023-05] Promoted to Computational Engineer I
[2023-05] Graduated Rhodes College — Magna Cum Laude
[2022-09] Started HPC Engineering Student role at St. Jude
[2022-06] HPC Research Computing Intern at St. Jude

Experience

squeue -u walid

Job Queue

squeue -u walid --format="%.8i %.12P %.30j %.8T %.12M %.6D"
JOBID PARTITION NAME STATE TIME NODES
$ cat slurm-100006.out
  • Designs and implements large-scale monitoring infrastructure using Prometheus and Grafana for comprehensive cluster observability
  • Piloted secure adoption of AI Agents across all St. Jude HPC clusters; negotiated contracts with OpenAI, Anthropic, GitHub, Cursor
  • Serves as the institutional resource and subject matter expert for AI Agents at St. Jude
  • Architected 20,000-line MLOps-focused Python package for low-code/no-code ML model training
  • Expanded Open OnDemand from single to multi-cluster deployment spanning 4 environments (Slurm HPC, SCCE/GDPR, colocation, main HPC)
  • Led SCCE cluster (GDPR-compliant) and Slurm HPC Model Training cluster implementation
  • Sole resource for 19 CryoSPARC instances; built On-The-Fly cryo-EM/ET processing pipeline
  • Conducted 4-month, 30 PB data migration to dedicated Imaging Storage system
  • Transformed module installation system from bare-metal to container-based builds
$ cat slurm-100005.out
  • Manages internship program, reviewing 300-500 applications yearly
  • Conducts interviews and coordinates hiring of 15 interns annually
  • Mentors interns throughout the summer program
  • Established intern-to-full-time pipeline creating career tracks for permanent positions
$ cat slurm-100004.out
  • Built and deployed Open OnDemand instance serving 20,000+ core cluster
  • Authored multiple Interactive Applications (Maestro, VMD, Scipion)
  • Managed all software and module installations across RHEL7/RHEL8
  • Organized and taught seminars on HPC programming tools
  • Optimized parallel programs (MPI, OpenMP, CUDA) for researchers
$ cat slurm-100003.out
  • Built Prometheus + Grafana metrics collection environment
  • Designed cluster monitoring dashboards for resource utilization insights
  • Performed routine module installations on RHEL7 HPC Cluster
  • Gained proficiency in LSF workload manager administration
$ cat slurm-100002.out
  • Developed VR application using Unity + Meta SDK for pediatric patient training
  • Built REST API for AlphaFold-based protein structure prediction
  • Compiled deployment documentation for HPC engineering team
$ cat slurm-100001.out
  • CS Head Tutor: hired, trained, managed 9 tutors; built TutoringBot queue system
  • CS Tutor: tutored ~20 students in Python, Java, C, data structures, algorithms
  • Cloud Admin: managed JupyterHub Kubernetes cluster on GCP; integrated OneLogin SSO
  • Research Fellow: co-authored IROS 2023 paper on human-robot interaction; built AR app in Unity
$ cat slurm-100000.out
  • Created Bitcoin trading bot with live price data and technical indicators
  • Integrated cryptocurrency exchange APIs for automated trading
  • Developed data processing pipelines in Python and Java

Resource Allocation Timeline

sacct --starttime=2018-06-01 --format=JobName,Start,End,State
2018 2019 2020 2021 2022 2023 2024 2025 2026
INTRASOFT
CS Tutor
Research Fellow
Cloud Admin
HPC Intern
HPC Student
CE I
IS Coordinator
CE (Current)
M.S. UT Austin

Skills & Technologies

nvidia-smi && module avail

GPU Utilization

nvidia-smi
+--------------------------------------------------------------------------------------------+
| WALID-SMI 550.127       Driver Version: 550.127       CUDA Version: 12.4                   |
|-----------------------------------------+------------------------+-------------------------|
| Skill                                   | Proficiency            | Utilization             |
|=========================================+========================+=========================|
| Python                                  | Expert                 | ███████████████████ 95% |
| C/C++                                   | Advanced               | ████████████████    80% |
| Rust                                    | Intermediate           | ███████████         55% |
| Go                                      | Intermediate           | ██████████          50% |
| Bash/Shell                              | Expert                 | ██████████████████  92% |
| JavaScript/React                        | Advanced               | ██████████████      70% |
| Java                                    | Advanced               | ███████████████     75% |
|-----------------------------------------+------------------------+-------------------------|
| Slurm/LSF                               | Expert                 | ███████████████████ 95% |
| MPI/OpenMP/CUDA                         | Advanced               | ████████████████    82% |
| Prometheus/Grafana                      | Expert                 | ██████████████████  90% |
| Docker/K8s/Apptainer                    | Advanced               | ████████████████    80% |
| MLOps/Deep Learning                     | Advanced               | ████████████████    78% |
| AI Agents/LLMs                          | Advanced               | █████████████████   85% |
| CryoSPARC/Cryo-EM                       | Expert                 | ██████████████████  90% |
+--------------------------------------------------------------------------------------------+

Loaded Modules

module avail

--- /opt/languages ---

python/3.x  c-cpp/gcc  rust/stable  go/1.x  java/jdk  r/4.x  ruby/3.x  bash/5.x  javascript/es6  racket/8.x  csharp/dotnet

--- /opt/hpc ---

slurm/23.x  lsf/10.x  openmpi/4.x  openmp/5.x  cuda/12.x  ondemand/3.x  modules/5.x

--- /opt/devops ---

prometheus/2.x  grafana/10.x  docker/25.x  kubernetes/1.x  apptainer/1.x  singularity/3.x  conda/24.x  gcp-sdk/latest

--- /opt/ml ---

pytorch/2.x  transformers/latest  jupyter/lab  mlops-toolkit/1.0  alphafold/2.x  cryosparc/4.x

--- /opt/editors ---

vim/9.x  emacs/29.x  vscode/latest  cursor/latest  intellij/2024  rstudio/latest

Language Interfaces

ip link show
lang0: UP English (Native)
lang1: UP Arabic (Native)
lang2: PARTIAL French (Working Proficiency)
lang3: PARTIAL Spanish (Working Proficiency)

Certifications

/etc/certs

NVIDIA Certified Associate

AI Infrastructure and Operations

2025

Focus Areas

research interests
Parallel Systems • Distributed Systems • Compiler Design • Systems Programming • Scientific Computing • Human-Robot Interaction • VR/AR Development

Education

/var/log/education

System Log

journalctl -u education --no-pager
Aug 2025 LOAD kernel: Loading module ms_cs_utaustin... GPA 4.0
The University of Texas at Austin — M.S. Computer Science (Expected Dec 2028)
Focus: Parallel Systems, Deep Learning
May 2023 DONE kernel: Module bs_cs_rhodes loaded successfully — GPA 3.87, Magna Cum Laude
Rhodes College — B.S. Computer Science, Minor: Religious Studies (2019 – 2023)
May 2019 DONE kernel: Module hs_kings_academy loaded — GPA 3.86
King's Academy, Manja-Madaba, Jordan — High School Diploma
Apr 2016 NET Remote node connection established: scotch_college_perth (Australia)
Round Square Program Global Cultural Exchange — Study Abroad

Coursework Modules

lsmod | grep coursework

--- UT Austin (Graduate) ---

parallel_systems  deep_learning

--- Rhodes College (Undergraduate) ---

parallel_systems  distributed_systems  operating_systems  artificial_intelligence  automata_theory  computer_organization  programming_languages  data_structures_algorithms

Awards & Honors

achievements unlocked
Magna Cum Laude
Upsilon Pi Epsilon (CS Honor Society)
Theta Alpha Kappa
Joseph Reeves Hyde Award
Jack U. Russell Award
Presidential Scholar
Dean's List
Student Body President (King's Academy)
President's Award for Outstanding Leadership

Research

Publications & Data Analysis

Publication Record

SELECT * FROM publications ORDER BY year DESC
IEEE/RSJ IROS 2023 • Detroit, MI, USA

Development and Evaluation of Exploratory Experiences to Facilitate Reasoning About Robotic Systems

S. Balali, M. Hudspeth, I. Afflerbach, H. Helgesen, J. McCurry, W. Abu Al-Afia et al.

ABSTRACT: This paper introduces a novel interactive approach — Exploratory Experiences — that aims to improve the ability of people to reason about the capabilities and limitations of robotic technology. We focus on two areas: robot navigation and object detection. We evaluate the Exploratory Experiences with a novel approach that measures the participant's ability to predict when the robot will fail.
DOI: 10.1109/IROS55552.2023.10342409 • pp. 4107-4114

Research Keywords

indexed terms
Human-Robot Interaction • Navigation • Object Detection • Explainable AI • Atmospheric Measurements • Cognition • Intelligent Robots

External Data Source

curl -s https://ieeexplore.ieee.org
Connected

ieeexplore.ieee.org/document/10342409


Projects

systemctl list-units --type=service

Running Services

docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
mlops-toolkit
Status: Active
20,000-line MLOps Python package for low-code ML model training with comprehensive Jupyter notebooks
Tags: Python, MLOps, Jupyter

hpc-monitoring-stack
Status: Active
Prometheus + Grafana monitoring infrastructure for CPU, GPU, and Slurm job-level metrics
Tags: Prometheus, Grafana, Slurm

ondemand-multi-cluster
Status: Active
Open OnDemand deployment spanning 4 cluster environments with LSF and Slurm support
Tags: Ruby, Open OnDemand, HPC

cryosparc-fleet
Status: Active
19 CryoSPARC instances with On-The-Fly processing pipeline for structural biology research
Tags: CryoSPARC, Cryo-EM, Pipeline

vr-patient-training
Status: Archived
Unity + Meta SDK VR application for pediatric radiation oncology patient preparation
Tags: Unity, C#, VR

alphafold-api
Status: Archived
REST API endpoint for AlphaFold-based protein structure prediction on HPC
Tags: Python, REST, AlphaFold

Container Registry

github.com/walidabualafia
Connected


Distributed systems, HPC tools, system-level programming, and more


Build Pipeline

CI/CD Status
Research → Prototype → Build → Deploy

More projects in development. Stay tuned.

Contact

ifconfig && ping -c 1

Connection Status

health check
Available for Opportunities

Open to relocation • Remote-friendly

Download Artifacts

build output

abualafia-curriculum-vitae.pdf

Latest build • 6 pages


Node Location

hostname --fqdn

Memphis, TN

Open to Relocation

Originally from Amman, Jordan