Cake Job Search

Logo of 裕順資訊有限公司.
Why do people love working at RichWell Co.Ltd.?
1. Flexible hours: no rushing to clock in in the morning. Sleeping in a little or avoiding the commuter rush is fine.
2. Generous annual leave: you don't have to wait a full year before taking leave. We are more generous than the law requires, because time off should be enjoyed.
3. Great bonuses and benefits: year-end and performance bonuses are all provided, so your hard work never goes unrewarded.
4. Birthday surprises: the company remembers each of your important moments.
5. Regular team dinners / team building: we are more than coworkers, we are comrades who grow together, and eating and drinking together brings us closer.
6. Technical courses and internal sharing sessions: whatever you want to learn, we support it, so you keep evolving instead of regressing.

About the role
We are building a reliability-first platform. Over the next 12 months, we will stabilize our Windows-based services, strengthen observability, and progressively containerize into Kubernetes. You will be a key contributor driving self-service operations and data-driven reliability across the stack.

What you'll do
• Operational automation: Build self-service runbooks for Windows services (AWX/Rundeck), and implement Ansible/PowerShell DSC workflows, health checks, and safe rollbacks.
• Observability: Standardize metrics/logs/traces (Prometheus/Grafana, windows_exporter, OpenTelemetry; ELK/Loki). Create golden-signal dashboards and actionable alerts.
• Reliability engineering: Participate in on-call, handle incidents and post-incident reviews (PIR), and lead game days to institutionalize SOPs.
• Resilience: Design and implement backup & disaster recovery, capacity planning, and performance tuning.
• Long-term: Drive service containerization and Kubernetes adoption (Helm/Kustomize, Argo CD/Flux, ConfigMap/Secrets) with a strong focus on security and compliance.
Windows Server
Site Reliability Engineer
Prometheus/Grafana
1.6M ~ 2.2M TWD / year
4 years of experience required
No management responsibility
Logo of InAddition Consultants Ltd..
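- Infrastructure as Code (IaC): Maintain and optimize cloud-based infrastructure (primarily AWS), using container technologies (Docker/K8s) to build elastically scalable deployment architectures.
- Observability engineering: Design and maintain a comprehensive monitoring system (Prometheus / Grafana). Beyond setting up alerts, build a complete log analysis pipeline (Log Pipeline) and tracing mechanisms.
- Incident management and review: Lead troubleshooting and root cause analysis (RCA) for production incidents, and write post-mortem reports to prevent recurrence.
- Performance and capacity planning: Continuously analyze system bottlenecks, and carry out architecture optimization and performance tuning for databases and application services.
- Operations automation: Develop automation tools with scripts (Python / Shell), and optimize CI/CD pipelines to improve delivery efficiency.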
1M ~ 1.2M TWD / year
3 years of experience required
No management responsibility
Logo of MoMo.
MoMo is the market leader in mobile payments in Vietnam, driven by a commitment to enhancing the lives of Vietnamese citizens through technological innovation.

Within the MoMo BigData & AI department, we prioritize Smart, Efficient, and Excellent execution. We are currently undergoing a major transformation to build a new hybrid data platform spanning multiple cloud vendors (GCP & AWS).

We are seeking an experienced Data Engineer to help us architect this platform to optimize for both budget control and technological flexibility. You will play a pivotal role in shifting our mindset from "managing data" to creating valuable Data Products that empower our internal consumers.

Job description
With MoMo's AI-first mission, we are designing and building a self-serve data platform to empower both internal teams and external partners. This platform allocates resources based on users' needs to support:
- Ingesting data from diverse sources, in batch or streaming, using both pull and push mechanisms
- Developing and deploying resilient data pipelines across the data lake, data warehouse, and streaming systems
- Delivering high-quality, derived datasets to downstream tools such as BI solutions (e.g., Apache Superset, Looker Data Studio), via multiple delivery methods including APIs, datasets, and streaming data
- Monitoring data quality throughout all data pipelines in the platform to ensure high-quality data, resulting in better decision-making, accurate reporting, and reliable machine learning outputs
- Tracking and optimising resource usage for efficiency

Additionally, we are building Data Management Systems that enable the Data Governance team and data consumers to:
- Manage the full data lifecycle within the big data platform
- Explore the MoMo data ecosystem independently
- Provide a single source of truth with high data quality to downstream consumers
- Track and manage infrastructure costs across major projects, teams, and departments

Job requirements
The Mindset
- Passion for Data: You dream in SQL ("SELECT COUNT(SHEEP)...") and care deeply about data accuracy.
- Product Thinking: You view data as a product, focusing on the usability and reliability of what you deliver to stakeholders.

The Tech Stack
- Strong Coding Skills: Proficiency in Java/Kotlin (for robust backend services) and Python (for data processing/scripting).
- Hybrid Cloud Infrastructure: Hands-on experience with GCP. Proficiency in Kubernetes, Docker, and IaC tools like Pulumi or Terraform.
- Big Data Engines: Deep understanding of computing engines like Spark, Trino, BigQuery, and ClickHouse.
- Orchestration: Experience building DAGs and workflows in Airflow or Temporal.
- Data Sources: Familiarity with diverse sources including App Events, CDC from transactional DBs (Oracle, MySQL, MSSQL), and streaming systems (Kafka, PubSub).

Soft Skills
- Strong problem-solving abilities with a focus on root-cause analysis.
- Collaborative spirit: You can explain complex infrastructure decisions to non-technical stakeholders.
No requirement for relevant working experience
Logo of 彼特思方舟.
About BTSE:
彼特思方舟 is a specialized service provider dedicated to delivering a full spectrum of front-office and back-office support solutions, each of which is tailored to the unique needs of global financial technology firms. 彼特思方舟 is engaged by BTSE Group to offer several key positions, enabling the delivery of cutting-edge technology and tailored solutions that meet the evolving demands of the fintech industry in a competitive global market.

BTSE Group is a leading global fintech and blockchain company that is committed to building innovative technology and infrastructure. BTSE empowers businesses and corporate clients with the advanced tools they need to excel in a rapidly evolving and competitive market. BTSE has pioneered numerous trading technologies that have been widely adopted across the industry, setting new benchmarks for innovation, performance, and security in fintech. BTSE's diverse business lines serve both retail (B2C) customers and institutional (B2B) clients, enabling them to launch, operate, and scale fintech businesses. BTSE is seeking ambitious, motivated professionals to join our B2C and B2B teams.

About the Opportunity:
We are seeking an experienced and visionary Senior Manager / Associate Director to lead our high-performing Cloud Engineering team. In this role, you will drive the design, implementation, and ongoing optimization of a reliable, scalable, secure, and high-performance cloud infrastructure that supports our mission-critical applications and services. You will collaborate closely with cross-functional teams to ensure our cloud environment meets the evolving needs of the business while adhering to best practices and compliance standards.

Responsibilities:
Leadership & Strategy
- Lead, mentor, and inspire a team of skilled cloud engineers, fostering a culture of innovation, accountability, and continuous improvement.
- Define and execute the cloud infrastructure strategy in alignment with organizational goals.
Cloud Architecture & Operations
- Oversee the design, deployment, and maintenance of cloud infrastructure to ensure high availability, performance, and security.
- Implement automation, monitoring, and optimization strategies to improve operational efficiency.
Collaboration & Stakeholder Management
- Partner with R&D, security, SRE, and product teams to deliver integrated, reliable solutions.
- Communicate technical concepts effectively to both technical and non-technical stakeholders.
Governance & Compliance
- Ensure adherence to security, compliance, and regulatory requirements.
- Establish and maintain cloud governance frameworks, policies, and best practices.

Requirements:
- Bachelor's degree in Computer Science, Engineering, or related field (Master's preferred).
- 8+ years of experience in cloud infrastructure engineering, with at least 3 years in a leadership role.
- Proven expertise in public cloud platforms, especially AWS.
- Strong understanding of networking, security, automation, and infrastructure-as-code (IaC) tools (e.g., Terraform, Ansible).
- Deep knowledge of CI/CD, DevOps, and SRE concepts and implementation.
- Excellent communication, leadership, and stakeholder management skills.
- Experience in large-scale, mission-critical environments is highly desirable.

Perks & Benefits
- Competitive total compensation package
- Various team building programs and company events
- Comprehensive healthcare schemes for employees and dependants
- And many more! Apply and let us tell you more!
Negotiable
No requirement for relevant working experience
Logo of 彼特思方舟.
About BTSE:
彼特思方舟 is a specialized service provider dedicated to delivering a full spectrum of front-office and back-office support solutions, each of which is tailored to the unique needs of global financial technology firms. 彼特思方舟 is engaged by BTSE Group to offer several key positions, enabling the delivery of cutting-edge technology and tailored solutions that meet the evolving demands of the fintech industry in a competitive global market.

BTSE Group is a leading global fintech and blockchain company that is committed to building innovative technology and infrastructure. BTSE empowers businesses and corporate clients with the advanced tools they need to excel in a rapidly evolving and competitive market. BTSE has pioneered numerous trading technologies that have been widely adopted across the industry, setting new benchmarks for innovation, performance, and security in fintech. BTSE's diverse business lines serve both retail (B2C) customers and institutional (B2B) clients, enabling them to launch, operate, and scale fintech businesses. BTSE is seeking ambitious, motivated professionals to join our B2C and B2B teams.

About the Opportunity:
We are looking for a seasoned and proactive Solutions Architect to lead the design, implementation, and optimization of highly robust and scalable cloud infrastructure solutions. The ideal candidate possesses deep expertise in Amazon Web Services (AWS) and a proven track record of collaborating with business stakeholders, developers, and cloud engineers to successfully execute application modernization and cloud migration projects.

Responsibilities:
- Cloud Architecture & Development: Design, develop, and implement secure, scalable, and cost-efficient cloud solutions, with a primary focus on the AWS platform.
- Stakeholder Collaboration: Partner with internal or external cross-functional teams to translate complex business requirements into clear, actionable technical solutions.
- Automation Leadership: Drive infrastructure automation initiatives using Infrastructure as Code (IaC) tools like Terraform or Ansible.
- DevOps & Security: Establish and enforce best practices for CI/CD pipelines, comprehensive monitoring, and cloud security.
- Mentorship & Optimization: Mentor team members on best practices in cloud architecture and continuously fine-tune systems for peak performance and cost-efficiency.

Requirements:
- Experience: 7+ years in cloud-related roles, including at least 4 years of hands-on experience with AWS.
- Programming Skills: 2+ years of experience in programming with Python.
- Technical Depth:
  - Strong expertise in AWS services, architecture patterns, and best practices.
  - Proficiency in Terraform for Infrastructure as Code.
  - Solid understanding of system monitoring, logging, and alerting tools.
  - In-depth knowledge of DevOps practices, CI/CD pipelines, GitOps workflows, and containerization technologies (Docker, Kubernetes, Helm, ArgoCD, Istio).
- General Skills:
  - Familiarity with networking concepts and advanced security best practices.
  - Excellent problem-solving and communication skills.
  - A proactive, initiative-taking approach to driving projects forward.
  - Fluent in English and Mandarin for effective partner communication.

Nice to Have
- AWS certifications (e.g., AWS Certified Solutions Architect – Professional).
- Experience with hybrid cloud environments and multi-cloud strategies.
- Experience integrating and working with monitoring and tracing tools (e.g., CloudWatch, Prometheus, tracing frameworks).

Perks & Benefits
- Competitive total compensation package
- Various team building programs and company events
- Comprehensive healthcare schemes for employees and dependants
- And many more! Apply and let us tell you more!
Negotiable
No requirement for relevant working experience
Logo of 彼特思方舟.
About BTSE:
彼特思方舟 is a specialized service provider dedicated to delivering a full spectrum of front-office and back-office support solutions, each of which is tailored to the unique needs of global financial technology firms. 彼特思方舟 is engaged by BTSE Group to offer several key positions, enabling the delivery of cutting-edge technology and tailored solutions that meet the evolving demands of the fintech industry in a competitive global market.

BTSE Group is a leading global fintech and blockchain company that is committed to building innovative technology and infrastructure. BTSE empowers businesses and corporate clients with the advanced tools they need to excel in a rapidly evolving and competitive market. BTSE has pioneered numerous trading technologies that have been widely adopted across the industry, setting new benchmarks for innovation, performance, and security in fintech. BTSE's diverse business lines serve both retail (B2C) customers and institutional (B2B) clients, enabling them to launch, operate, and scale fintech businesses. BTSE is seeking ambitious, motivated professionals to join our B2C and B2B teams.

About the Opportunity:
We are looking for a Senior Database Administrator with deep PostgreSQL expertise and hands-on experience automating database deployment, monitoring, and performance tuning in data center and AWS environments. You will be part of the core infrastructure team responsible for scaling and maintaining BTSE's mission-critical database systems that power our trading and analytics platforms. This role offers the opportunity to design, automate, and optimize high-availability PostgreSQL clusters while ensuring reliability, minimal downtime, and strong observability across production environments operating 24/7.

Responsibilities:
- Design, deploy, and maintain PostgreSQL clusters in AWS environments using Infrastructure as Code (IaC) tools.
- Develop automation for provisioning, configuration management, and monitoring of PostgreSQL databases.
- Perform regular database maintenance including vacuuming, reindexing, and analyzing statistics to ensure peak performance.
- Identify, analyze, and optimize slow SQL queries and workloads to improve efficiency and throughput.
- Build resilient, high-availability database architectures that minimize downtime during upgrades, maintenance, and failovers.
- Implement robust backup, recovery, and disaster recovery strategies to guarantee data integrity and uptime.
- Support and monitor 24/7 production systems, participating in incident response and root cause analysis when needed.
- Collaborate with SRE, DevOps, and engineering teams to improve scalability, performance, and CI/CD integration for database operations.

Requirements:
- At least 4 years of experience managing PostgreSQL databases in production environments.
- Strong knowledge of SQL query optimization, indexing strategies, and PostgreSQL performance tuning.
- Proven ability to design and maintain highly available, fault-tolerant systems with minimal or zero downtime.
- Experience supporting mission-critical 24/7 systems in production, with on-call or incident response participation.
- Hands-on experience with AWS services such as EC2, RDS, EBS, S3, and IAM.
- Solid understanding of PostgreSQL replication, PITR (Point-in-Time Recovery), and HA solutions (Repmgr, Patroni, ProxySQL, or similar).
- Experience automating infrastructure using Terraform, Ansible, or similar tools.
- Proficiency in Linux systems administration and scripting (Bash, Python, or Go).

Nice to Have:
- Familiarity with observability and alerting stacks (Prometheus, Loki, Grafana, CloudWatch).
- Exposure to additional database systems such as ClickHouse or TimeScale Postgres.
- Experience building internal automation or self-healing tools for query analysis and database management.

Perks & Benefits
- Competitive total compensation package
- Various team building programs and company events
- Comprehensive healthcare schemes for employees and dependants
- And many more! Apply and let us tell you more!
Negotiable
No requirement for relevant working experience
Logo of BitoGroup 幣託科技股份有限公司.
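▍Team introduction: your future teammates
We are a technology-driven core operations team whose members include distributed-systems experts and cloud architects. At BitoGroup, SREs are not just operators; they are guardians of system stability and drivers of architectural improvement. We work side by side with the development teams and hold a "stability veto" over system releases.

Our current technology ecosystem includes:
- Infrastructure: AWS (EKS, Lambda, CloudFront…), GCP
- IaC: Terraform / CDK / Helm
- Observability: Prometheus, Grafana, ELK Stack, OpenSearch, OpenTelemetry
- CI/CD: GitLab CI/CD / Jenkins / ArgoCD

▍Team culture: how we work
- Blame-free culture and security honesty (Blame-free & Security Honesty): We firmly believe that mistakes drive system evolution. When an incident occurs, we do not look for scapegoats; we find the root cause through transparent post-incident reviews. We also emphasize immediate reporting of abnormal operations: we do not blame technical mistakes, but we require transparency about any suspected security breach or abnormal operation, so the team can fix security risks together right away.
- Eliminating toil (Eliminating Toil): We dislike meaningless manual work. The team strictly controls the proportion of toil, and we encourage you to "evolve out of laziness" by spending more than 50% of your time on building automation tools and on system design.
- Data-driven decisions (Data-Driven Decisions): Stability is not a matter of feel. We define system health with precise SLIs / SLOs / SLAs and use the Error Budget to balance business development velocity against system reliability.
- Engineering-driven operations (Software Engineering Approach): We manage infrastructure with a software development mindset. Every change must go through the GitOps workflow and Code Review to ensure the quality and traceability of operations work.

▍About the role: what you will be responsible for
1. System reliability and security: Design and maintain highly available cloud architecture, and promote the IAM (Identity and Access Management) principle of least privilege so that production access complies with security standards.
2. Secret management (Secret Management): Manage and enforce enterprise-grade protection of sensitive information (e.g., using HashiCorp Vault, AWS Secrets Manager), and build secure key and certificate management processes to prevent leaks of sensitive information.
3. Automation platform development: Build and optimize CI/CD pipelines, develop internal tools that simplify deployment, and embed automated security scanning in the process.
4. Comprehensive monitoring and observability: Build observability systems (Logging, Metrics, Tracing) and set up automated alerts for abnormal access behavior.
5. Incident response and resilience: Participate in the on-call rotation and lead technical post-incident reviews; while handling system failures, ensure that recovery procedures meet information security requirements.
6. Performance and architecture tuning: Work with development teams on system optimization, cache tuning, and network latency troubleshooting.

▍Benefits and environment
- Salary range: NTD $90,000 - $150,000 per month (depending on experience and ability).
- Technical growth: Full subsidies for technical books, online courses (Udemy/Coursera), and tickets to domestic and international technical conferences, plus an AI tool subsidy.
- Work flexibility: Remote work and flexible working hours.
- Transparent communication: We practice a blame-free culture; our goal is to solve problems, not to blame individuals.
- SRE-specific perks: Monthly technical sharing sessions and discussions of technology trends (keeping up with the latest CNCF developments); a dedicated technical channel that pushes the latest global technology news and Incident case studies; regular department birthday parties, activities, and team meals.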
Prometheus
K8S
GitLab
90K ~ 150K TWD / month
3 years of experience required
No management responsibility
Logo of Cake Recruitment Consulting.
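【Responsibilities】
- Lead and continuously evolve a data- and platform-centric system architecture that supports high-traffic real-time services and analytical workloads.
- Take part in designing and delivering the next-stage system architecture, focusing on scalability, stability, and operational efficiency.
- Work closely with backend, platform, and product engineers to turn requirements into solutions that can run long-term in production.
- Reduce operating costs and improve engineering team efficiency by improving system observability, stability, and maintainability.
- Participate in cross-level technical discussions and key architecture decisions, shaping the technical direction of the overall platform.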
On-Premise
Monitoring
IaC
1M ~ 2M TWD / year
3 years of experience required
No management responsibility
Logo of 蓋亞資訊有限公司.
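Main responsibilities:
1. Design and implement highly available, scalable, and secure Google Cloud Platform architectures that meet customers' business needs and technical requirements.
2. Assess existing system architectures and provide optimization plans and cloud migration strategies.
3. Build and manage core GCP services, including compute services such as Compute Engine, App Engine, and Cloud Functions, and networking services such as VPC, Cloud Interconnect, and Cloud Armor.
4. Design and implement database solutions; be familiar with deploying and optimizing services such as Cloud SQL, Cloud Spanner, Firestore, and BigQuery.
5. Implement an Infrastructure as Code (IaC) strategy, automating deployment with tools such as Deployment Manager and Terraform.
6. Build comprehensive monitoring and alerting with Cloud Monitoring and Cloud Logging to ensure system performance and security compliance, and perform troubleshooting and performance optimization.
7. Design and implement AI/ML solutions, including the use of GCP AI services such as Vertex AI, Gemini API, and Vision AI, helping customers integrate AI capabilities into their applications and business processes.
8. Develop and deploy machine learning models with tools such as AutoML and BigQuery ML, and establish model monitoring and management processes.
9. Help customers resolve technical problems, provide consulting and support, and serve as the communication bridge between customers and the technical team.
10. Keep up with new GCP services and best practices, and continually improve architecture designs.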
GCP
Negotiable
1 year of experience required
No management responsibility
Logo of TSMC 台積電.
Established in 1987 and headquartered in Taiwan, TSMC pioneered the pure-play foundry business model with an exclusive focus on manufacturing its customers' products. As of 2024, TSMC serves more than 500 customers and manufactures over 11,000 products for high-performance computing, smartphones, the Internet of Things (IoT), automotive, and digital consumer electronics. It is the world's largest provider of logic ICs, with an annual capacity of 16 million 12-inch equivalent wafers. TSMC operates fabs in Taiwan as well as manufacturing subsidiaries in Washington State, Japan and China, and the Company began construction on a specialty technology fab in Dresden, Germany, in 2024. In Arizona, TSMC is building three fabs, with the first starting 4nm production in 2025, the second by 2028, and the third by the end of the decade.

As a platform engineer, you will focus on designing, implementing, and maintaining scalable features and services on the platform to support the productionization of applications that support the company's R&D/Fab/Business/IT/Security functions and improve productivity and work quality.

Responsibilities:
1. Automation and Scripting
(1) Develop and maintain automation scripts for configuration, monitoring, and management using tools such as Ansible and Python.
(2) Transform repeatable tasks into automation tools to streamline operations and maximize efficiency.
(3) Implement Infrastructure as Code (IaC) to automate resource provisioning and CI/CD workflows.
2. Application Development
(1) Develop scalable cloud-native microservice architectures for IT applications.
(2) Develop state-of-the-art applications and refactor existing ones to improve performance and maintainability.
(3) Apply software design principles, such as 12-factor app, to ensure sustainability and quality.
(4) Write and implement tests (unit/feature/integration) to guarantee software integrity.
3. Monitoring and Troubleshooting
(1) Implement monitoring solutions to proactively identify and resolve network and application issues.
(2) Conduct root cause analyses and apply corrective actions to troubleshoot technical challenges, including Linux-related systems and logs (e.g., Go code/log analysis).
(3) Lead evaluation and adoption of new IT technologies for continuous improvement.
4. (Optional) Network Design and Management
(1) Operate and manage network infrastructure, including LAN, WLAN, Firewall, and Proxy.
(2) Design, implement, and manage scalable network architecture aligned with business goals and industry best practices.

Additional information for the job:
Job Location: Hsinchu Site, Taichung Site, Tainan Site, Taipei Office (Experienced Only)
On-call needs: On-call 1 week every 3 months

The complete interview process includes:
1. Manager interview
2. Hackerrank test
3. On-site personality and English test (which can be replaced by a transcript of an official English test)
4. HR interview
5. Second manager interview (Optional assessment)
6. Technical review (Optional assessment)
Negotiable
5 years of experience required
No management responsibility
