Staff Site Reliability Engineer - Trust @ LinkedIn
I'm a Systems Engineer specializing in Linux, with a broad background including Unix, VMware, Windows, networking, and scripting in Python, Perl, and Bash. I love DevOps and automation. I've spent some time doing development full time, and I loved that as well. I love learning new things and solving problems, and engineering is how I get to do that every day.
Specialties: Linux (RedHat, CentOS, Fedora, OEL, Debian, Ubuntu)
Unix (Solaris, FreeBSD, OneFS, HP-UX, AIX)
Python and Perl programming, bash and other various scripting
Clustering and SAN/NAS
Windows Server (2000, 2003)
Microsoft Exchange (2000-2003)
Most of my programming/scripting work, and my best work, belongs to previous employers, and is not available publicly, but you can find a very small subset of code I’ve written at: https://github.com/csuttles
Senior Systems Engineer @ Twitter
• I work on the hardware engineering team at Twitter, where we characterize the performance of all server hardware and OEM network hardware before it is purchased; we also collaborate with ODMs and deploy designs co-authored by our team and hardware vendors.
• I am the lead systems engineer for the hardware team's frontend platform, and the lead systems engineer for all GPU/deep learning platforms.
• I collaborate with several service engineering (SRE) teams, as well as the OS, kernel, and PE (deployment automation) teams.
• System engineers on my team also contribute to the code base for automating deployment.
• My role includes maintaining our test lab infrastructure (DevOps for my team). This includes core infrastructure services like DNS, as well as some more esoteric services specific to our environment.
• I developed and maintain a widely used custom live OS, built on CentOS using open source tools (livecd-tools); this OS includes many specialized tools alongside standard open source tools, and I produce several build flavors for different testing purposes.
• I’ve written several tools in Python, including a tool for power qualification and regression testing, and wrappers for industry benchmarks such as fio, SPEC CPU, SPECjvm, and STREAM. I significantly improved our memory qualification testing after inheriting the code from another engineer. I also wrote several test cases for autotest, integrated the most popular tools I wrote into the autotest suite, and automated log shipping between us and a key vendor, enabling us to assess the quality of that vendor’s testing and other deliverables.
• I mentor other engineers, including systems engineers and some of the more hardware-focused people on my team. Primary areas of mentorship include Linux, Python, and Bash.
From July 2012 to Present (3 years 6 months), San Francisco Bay Area

Senior Systems Engineer @ Yahoo!
• I worked with a small team of engineers to maintain the core services for deployment and development tools for all of Yahoo!. These included devtools services (Bugzilla, TWiki, etc.), deptools services providing system role data, system state data, and package repositories (for the internal/proprietary yinst package system), and the deployment engine (pogo).
• Within my team, I rotated among the properties/services that were problematic and stabilized them. My first challenge was our deployment engine; once it was stable and performing far better, my next focus was the system roles service. Most of the services were built around a REST API and passed data as JSON, and the most common language for tools and services was Perl.
• Where appropriate, I contributed best-practice policy to improve the scalability and maintainability of the services within the group.
• I authored and committed scripts, packages, and patches, both of my own creation and in collaboration with development on their existing code base.
• I was part of an on-call rotation.
• My daily tasks primarily included system and log analysis, troubleshooting, and Perl authoring and debugging.
From September 2011 to July 2012 (11 months)

Advanced Support Engineer @
• I worked with a small team of engineers as the bridge between support and development engineering.
• I analyzed complex problems, and when appropriate, filed defects and worked with development engineering to provide insight into the problem.
• I authored many tools for myself, my team, and some for the entire support organization. Almost all of those scripts were in Perl. I used Subversion for version control on a Fedora Linux server, which I also administered.
• I also worked with customers during serious outages and recovery scenarios. This ranged from re-cabling and replacing components to running commands on the console (serial or SSH) of various system components to recover and prevent data loss (on the scale of terabytes to petabytes).
• I was part of an on-call rotation.
• I was the Linux, UNIX, and scripting SME (Subject Matter Expert) for my team and organization.
• My daily tasks included log analysis and troubleshooting, Perl and Bash scripting (mostly Perl), and mentoring of new ASEs (Advanced Support Engineers) as well as TSEs (Technical Support Engineers, or first-level support).
From February 2011 to September 2011 (8 months)

Senior Systems Engineer @ Clearwire
• My duties were to operate, build, test, and support the production AAA application environments within a fast-paced WiMAX network.
• I also worked with other groups to enhance network reliability and support new features. As the WiMAX standard was enhanced, and vendors added new features, our network grew and changed to accommodate. As our subscriber count grew, so did the technology required to support those subscribers.
• Worked with the appropriate vendors toward resolution when problems could not be solved in-house.
• My day to day tasks included Solaris and Linux administration, scripting, and networking tasks. This included but was not limited to: log analysis, packet capture and analysis, network troubleshooting, load balancer config, package and OS install/config, Perl/shell scripting, and database usage.
• I was on call 24x7 and, after only a month, the only engineer responsible for the production AAA instance; I held this responsibility for the remainder of my tenure at Clearwire.
From November 2009 to February 2011 (1 year 4 months)

Technical Support Engineer @ Isilon
• Technical support for clustered storage running Isilon’s OneFS (based on FreeBSD)
• Support integration with Active Directory (via Samba), OpenLDAP, and NIS
• In depth troubleshooting, including packet capture and analysis of network traffic, tuning for CIFS and NFS, and other various break/fix issues.
• Log analysis, remote support, configuration, and troubleshooting; I commonly leveraged remote access to resolve customer issues on demand.
• Hardware troubleshooting
• Identified workarounds for existing bugs and identified/escalated new bugs to Application Engineering
From July 2008 to October 2009 (1 year 4 months)

System Administrator - Global DC Operations / Developer - Global IT Tools @ Oracle
• Perl development: utilities, reporting, and automation for multiple UNIX platforms; EPM packaging for deployment on Linux, Solaris, HP-UX, and AIX; similar shell scripting.
• Member of the Global Data Center Operations team, supporting Oracle’s critical, high-visibility, high-uptime environments and critical internal operations.
• Managed petabytes of backups per month in Oracle’s largest, primary data center in Austin, Texas, which houses more than 12,000 servers.
• Provisioning of new systems, per company build SOP.
• Contributor to development of Oracle Enterprise Linux Certification.
• Deliver internal training on backups and infrastructure.
• Part of weekly On Call rotation; it's a 24x7 operation.
• Management of large tape libraries such as the ADIC/Quantum Scalar i2000.
• Some SAN/NAS administration.
From March 2007 to May 2009 (2 years 3 months)

Lab System Administrator @
• Supported and administered a lab of roughly 300 systems running many versions of various operating systems, including Red Hat Linux, SuSE Linux, OpenBSD, AIX, HP-UX, Sun Solaris, Windows, and NetWare
• Administrator for all high-end hardware, including a Sun Fire E6900, an EqualLogic iSCSI device, and our more esoteric/expensive platforms, such as PA-RISC, PPC, and Itanium.
• Administered network devices and infrastructure, including firewalls, routers, switches, DNS, DHCP, etc.
• Supported testing for both our database and integration products, which meant testing several releases of many products on all platforms. Applications I managed included Oracle, Microsoft SQL Server, MySQL, PostgreSQL, Informix, and IBM DB2.
• Support testers and developers using various combinations of operating systems, applications, and network services
• Assisted with production network responsibilities, similar to the lab responsibilities.
From August 2006 to March 2007 (8 months)

Systems Administrator @ Zebra Imaging
• Supported and administered all machines in the Zebra Imaging network, primarily focusing on Linux clusters. We had five clusters: a test cluster of only 4 nodes, three production clusters of 16-20 nodes each, and a 50-node cluster running on PPC. We also had about a dozen infrastructure Linux machines across two sites, and about 6 important Windows machines for Active Directory and corporate users.
• Supported the development environment, developers, and the cutting-edge technology Zebra developed. Most of the code base was written in C, C++, Tcl, and an extension of Tcl.
• Managed data integrity, backups, and security for the entire organization. This included developing maintenance plans, password and security policy, and their implementation.
• Scripting in Perl and Bash
• System monitoring and performance tuning.
• Redesigned, implemented, and reconfigured two sites to meet government security, auditing, integrity, and availability standards (DCID 6/3) for processing Top Secret data (SCIF).
From January 2006 to August 2006 (8 months)

Server Support Analyst @ Dell
POD Server Support Analyst • 06/2005 – 10/2005
• Linux, VMWare ESX, Active Directory, Exchange, MS Cluster Services, HPCC, Dell Open Manage skill sets
• Enterprise hardware and software, advanced configuration, deployment, and disaster recovery; this included calls requiring billable software resolutions, with emphasis on complex software configuration. Took calls with end-to-end case ownership in a “Top Tier” support role.
• One of the highest escalation points at Dell
CSD Network / Tech Analyst • 08/2004 – 06/2005
• SME (Subject Matter Expert) for Active Directory, Exchange, and Linux
• Troubleshooting of issues from break-fix to advanced configuration and deployment / disaster recovery
• Enterprise level hardware and software support, such as SCSI, RAID, Clustering
Network Tech / CSD • 12/2003 – 08/2004
• Networking and Escalations SME (Subject Matter Expert)
• Wireless support
• Troubleshooting of various issues from basic break-fix to advanced configuration
From December 2003 to October 2005 (1 year 11 months)
Computer Science (no degree) @ Austin Community College, 2006

Chris Suttles is skilled in: Solaris, DNS, Red Hat Linux, Unix, VMware, Linux, Perl, Data Center, Servers, Shell Scripting, CentOS, Operating Systems, Hardware, Cluster, Networking