About the Role
We are seeking a Systems Engineering Architect to lead the design, implementation, and operation of our large-scale distributed systems infrastructure. This role is critical to driving architectural decisions and ensuring reliability, scalability, and performance across our platform. You will work cross-functionally with product engineering, DevOps, SRE, and support teams to evolve the foundational systems that power our data pipeline, RDBMS solutions, container-based microservices, and cloud-native deployments.
Key Responsibilities
● Architect and design large-scale distributed systems across major cloud service providers (AWS, Azure, GCP).
● Drive the strategy and execution of data infrastructure leveraging Postgres, Elasticsearch/ELK, Snowflake, and Databricks.
● Design and scale distributed data ingestion systems using Kafka or Redpanda.
● Lead the integration of CI/CD systems and automation pipelines.
● Guide the implementation and evolution of containerized microservices on Kubernetes across Linux and Windows environments.
● Own production infrastructure scalability, resiliency, and high availability strategies.
● Implement and optimize observability solutions, including logging, monitoring, and metrics systems.
● Conduct deep performance tuning and capacity planning across application and database tiers.
● Lead incident escalation, conduct root cause analysis (RCA), and drive continuous improvement for production reliability.
● Collaborate with internal stakeholders to balance architectural excellence with pragmatic delivery timelines.
Required Qualifications
● 3+ years of experience in systems engineering, with at least 5 years in an architectural or technical leadership role.
● Proven experience designing and operating systems at cloud scale on AWS, Azure, or GCP.
● Deep expertise in large-scale RDBMS (Postgres preferred) and Elastic Stack (ELK), with working knowledge of Snowflake and Databricks.
● Hands-on experience building and managing data streaming pipelines using Kafka or Redpanda.
● Proficiency in container orchestration with Kubernetes and in managing microservices at scale.
● Strong background in both Linux and Windows systems engineering.
● Track record of leading production-grade CI/CD and deployment automation strategies.
● Deep understanding of monitoring, metrics, and alerting systems (e.g., Prometheus, Grafana, Datadog).
● Exceptional debugging, performance tuning, and troubleshooting skills in large-scale environments.
● Demonstrated experience with RCA and handling critical production escalations under pressure.
Preferred Qualifications
● Experience in security-conscious or regulated environments (e.g., financial services, healthcare, government).
● Familiarity with Infrastructure as Code (IaC) tools such as Terraform, Pulumi, or CloudFormation.
● Contributions to open-source infrastructure or systems projects.
3 years of experience required
No management responsibility