We are hiring a dedicated Kernel Test Operations Engineer to take full ownership of the operational health and stability (often referred to as "Greenness") of all kernel test versions managed by the Android Systems team. This critical engineering role ensures continuous, high-fidelity testing of Android Common Kernels by maintaining robust test infrastructure, rapidly diagnosing failures, and driving core test metrics towards zero failures.
This role requires a proactive approach, deep debugging skills, and strict adherence to Service Level Objectives (SLOs).
Key Responsibilities
1. Test Infrastructure Maintenance & Health
• Sustain Operations: Maintain the internal pre-submit, post-submit, and related test infrastructure for all Android Common Kernels to ensure optimal functionality and availability.
• Continuous Coverage: Enforce continuous testing across Android kernel trees. Guarantee that no branches or targets remain untested for more than 2 business days.
• Test Integrity: Monitor pre-submit and post-submit test suites to prevent unintended test auto-disabling. Ensure all enabled Test Suites (TS) are constantly running.
• Technical Patching: Resolve test infrastructure issues, including writing minor test code or configuration patches (using C++, Python, Java, and Bash) to fix unreliable tests and restore test harness stability.
2. Breakage Analysis and Resolution
• Dashboard Ownership: Review and monitor all established kernel build and boot test dashboards. Investigate immediately whenever a dashboard turns red ("drill down when red").
• Rapid Triage: Lead the bisection, isolation, and triage of test breakages. Revert the culpable commit if necessary to quickly restore the harness to a "green state."
• Escalation: Partner with EngProd and other teams on complex infrastructure issues, escalating to the Technical Lead (TL) only when deep root cause analysis or urgency demands it.
• SLO Adherence (Patching): Submit urgent/critical test configuration updates or minor test code patches assigned by the TL within 1 business day. Complete all other assigned patches within 3 business days.
3. Quality Fine-Tuning and Goal Setting
• Drive Greenness: Actively work to ensure the list of test failures is stable and trending down, targeting an ideal state of zero failures.
• Test Optimization: Fine-tune the pre-submit CTS and VTS test configurations. This includes enabling new tests to achieve better kernel coverage and disabling tests with duplicate coverage.
• Issue Tracking: Ensure the status of every identified bug and issue is updated daily.
Qualifications
• 3+ years of experience in Test Engineering, Quality Assurance, or DevOps/SRE with a focus on large-scale CI/CD pipelines and infrastructure stability.
• Demonstrated technical proficiency in scripting and configuration (e.g., Python, Bash) and familiarity with application-level languages (e.g., C++, Java) relevant to test maintenance.
• Proven expertise in test failure analysis, bisection, isolation, and debugging to identify root causes.
• Strong ability to operate under strict Service Level Objectives (SLOs) and deliver rapid technical resolutions.
• Experience maintaining test operations for large software platforms, preferably in a kernel or operating system context.
IQronix Limited is a global technology services company specializing in embedded systems, edge AI, and large-scale production support. With over twenty years of system-level expertise, IQronix partners with major telecom and electronics brands to deliver reliable, scalable, and high-performance solutions. The company provides OS integration, firmware development, BSP engineering, system bring-up, QA automation, cloud-connected diagnostics, and edge AI deployment. Headquartered in Taipei, Taiwan, with affiliated offices in the US, UK, and Germany, IQronix offers flexible engagement models to meet global customer needs.