Resume


Infrastructure, Production, DevOps, and Site Reliability Engineer


Overview

Versatile engineer with over 15 years of relevant experience optimizing secure, scalable, and cloud-native infrastructure. Expert in coding for software and infrastructure deployment, on-the-fly incident troubleshooting across network, system, and application layers, and mentoring team members to drive operational excellence. Proven success in automating deployments for FedRAMP and PCI-DSS compliant environments, reducing mean time to resolution, and fostering cross-functional collaboration to deliver robust systems.

Seeking a role as an Infrastructure, Production, DevOps, or Site Reliability Engineer to deliver highly available products and services to customers.


Key Skills

  • Cloud and Infrastructure: AWS, GCP, Kubernetes, Terraform, Helm
  • CI/CD and Automation: Jenkins, GitLab CI, GitHub Actions, Argo, Flux
  • Monitoring and Observability: Prometheus, Thanos, Alertmanager, Grafana, Splunk, Loki, Elasticsearch
  • Security and Compliance: PCI-DSS, FedRAMP, WireGuard, IPsec, OpenVPN
  • Programming: Go, Python, Ruby, Bash, Git
  • Core Competencies: Incident Response, Scalability, Cloud-Native Architecture, Cross-Functional Collaboration, Mentorship

Experience

2024 ~ present / Cisco Systems

Site Reliability Engineer / Frisco, TX (remote)

  • Architect service re-implementation in Kubernetes with Argo and Helm.
  • Achieve 99.99% uptime for FedRAMP-compliant environments at Moderate and High Impact Levels.
  • Secure Provisional Authorization to Operate (P-ATO) for new environments, enabling $2M to $3M in federal contracts.
  • Streamline deployment pipelines for over 150 component services using GitHub, Kubernetes, and Argo, reducing service onboarding time by more than 50%.

2022 ~ 2024 / Schmoll Systems LLC

President / Frisco, TX (remote)

  • Developed Go-based cloud resource management software, automating infrastructure provisioning for multiple clients across AWS and GCP.
  • Reduced client infrastructure costs by more than 30% through server consolidation and improved autoscaling configurations.
  • Mentored client teams in Kubernetes and Terraform, improving their operational effectiveness.
  • Led incident response for critical outages, resolving issues within SLAs 98% of the time.

2020 ~ 2022 / Salesforce.com (MuleSoft)

Site Reliability Engineer / Santa Fe, NM (remote)

  • Enhanced stability of FedRAMP Moderate Impact Level (IL) GovCloud environments, achieving 99.9% uptime and uplifting them to High IL.
  • Automated incident remediation workflows, reducing manual interventions by more than 40%.
  • Collaborated with development teams to implement cloud-native monitoring with Prometheus and Grafana, improving availability of common Service Level Indicators (SLIs) and establishing useful Service Level Objectives (SLOs).
  • Mentored 3 junior engineers in advanced troubleshooting techniques, fostering a culture of proactive incident management.

2018 ~ 2020 / Subsplash

Site Reliability Engineer / Santa Fe, NM (remote)

  • Migrated 20+ Go-based microservices from AWS EC2 instances to AWS EKS, reducing deployment time by 50%, and standardized deployments using Terraform, GitLab CI, and Helm.
  • Ensured PCI-DSS compliance for payment card processing systems, passing all audits with zero outstanding findings.
  • Oversaw and implemented the consolidation of infrastructure from 3 distinct acquisitions, unifying networking and systems and scaling infrastructure to handle 200% user growth.
  • Trained 10+ developers in Kubernetes best practices, enabling daily production deployments.

2013 ~ 2018 / Salesforce.com (Pardot)

Site Reliability Engineer / Seattle, WA (remote)

  • Automated infrastructure deployments with Chef and Terraform, supporting 10+ daily application code deployments in a dynamic environment of more than 50 developers.
  • Ensured Salesforce Trust compliance, reducing security vulnerabilities by 25% through proactive monitoring with standard tooling.
  • Led cross-functional teams to optimize system performance, improving application response times by more than 50%.
  • Mentored junior SREs and built a scalable incident response framework.

Earlier Career

  • ServiceNow, Performance Engineer (2012 ~ 2013): Optimized MySQL database performance for 1,000+ instances, improving query response times by more than 25%.
  • SAP Concur, Unix Systems Engineer (2007 ~ 2011): Managed 100+ Red Hat Linux systems and supported a Hadoop cluster for data mining; assisted with over 1,300 servers across multiple sites.
  • Breakwater Security Associates, Network Engineer (2005 ~ 2006): Responded to network and system outages under strict Service Level Agreements; assisted clients with system modifications and updates.

Education

2005 ~ 2008 / University of Washington

Bothell, WA

  • Attained Bachelor of Science in Computing & Software Systems