Cake Job Search

Logo of Cake Recruitment Consulting.
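We are looking for an engineer who thinks about problems from a system-level perspective to help build and operate a highly available service environment. The role deals with large-scale traffic and multi-node deployments, and uses automation and observability to keep services stable under all conditions. You will work closely with development teams to design resilient architectures, reduce the share of manual operations, and quickly pinpoint problems and propose improvements when incidents occur. The job is not only about keeping services running, but about continuously improving overall system reliability and operability.
Responsibilities:
* Operate and plan cloud and edge-node services (including AWS environments and Kubernetes clusters such as EKS)
* Build automated deployment pipelines, managing delivery and infrastructure with GitLab CI, ArgoCD, Helm, and Terraform
* Build and maintain system observability, using tools such as Prometheus, Grafana, Loki, or ELK for monitoring and alert design
* Design and tune CDN, WAF, and DDoS protection strategies (e.g., Cloudflare, CloudFront, or Aliyun ESA) to handle abnormal traffic
* Take part in the on-call rotation: handle incidents, perform RCA, and improve runbooks and SOPs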
Terraform
SRE
Prometheus
800K ~ 1.2M TWD / year
3 years of experience required
No management responsibility
Logo of Pinkoi.
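At Pinkoi, an engineer's mission is to give everyone who visits Pinkoi an unforgettable experience. You need great passion, because we believe an admirable web service comes from every engineer behind it striving for excellence, and Pinkoi cares deeply about every small detail! Pinkoi is committed to building a smooth, tailored discovery and shopping experience that lets customers easily find products, brands, and content they care about and deepens their connection with the platform. We also apply innovative AI technology and deep industry knowledge to help brands increase sales and scale quickly, making Pinkoi an indispensable partner for design brands as they scale up and go international.
Responsibilities and challenges:
* Maintain and iterate on the website's underlying architecture, in both production and development environments, e.g., CI/CD pipelines, programming language upgrades, and database architecture changes.
* Help build and optimize internal platform tools to improve the developer experience.
* Design and maintain monitoring, alerting, and observability (logging / metrics / tracing).
* Produce high-quality, stable, maintainable, and readable code, and build good automated-testing habits.
* Communicate and collaborate smoothly with partners across domains, such as frontend, backend, app, and data engineers.
* Quickly identify system problems and reliably fix them.
* Continuously improve system architecture, performance, and stability, and enjoy writing technical documentation and sharing knowledge to improve the product development experience.
* Continuously research and introduce new technologies (such as Cloud Native tools and AI-assisted development) and innovative ideas, evaluate their feasibility and value in real environments, and create greater impact.
Traits we hope you have:
* Passionate: you want to build a world-class product.
* Eager to learn: curious and open-minded about new technologies.
* Good communicator: able to convey your ideas clearly to your Pinkoi colleagues.
* Proactive: you spot shortcomings or areas for improvement in the system and propose ways to address them.
* Self-demanding: high standards, a sense of deadlines, and a sense of responsibility.
Requirements:
* 2+ years of backend development experience, with the ability to develop and maintain systems.
* Solid programming fundamentals and a willingness to keep improving (Pinkoi mainly uses Python, but we welcome applicants who use any programming language).
* Basic understanding of databases, data structures, and algorithms, and of how they affect your program's performance.
* Basic understanding of the Linux operating system, particularly processes and threads.
* Familiarity with container technologies (e.g., Docker, Kubernetes).
Nice to have:
* Experience with application performance tuning.
* Observability experience (Prometheus, Grafana, OpenTelemetry).
* Familiarity with common services such as MySQL, Elasticsearch, and Redis, and an understanding of the design and development considerations their distributed nature requires (e.g., consistency, availability, performance bottlenecks).
* Familiarity with LLM-assisted development, integrated into daily work to improve efficiency and quality.
* Familiarity with cloud platforms and infrastructure operations (AWS / GCP / Kubernetes / CI/CD pipeline).
* Familiarity with Cloud Native and the Kubernetes ecosystem (Helm, ArgoCD, Operators, etc.).
#Stop thinking it over and come be a Pinkoi person
If you are confident you can do this job, send us your résumé and your answers to the short assignment, and apply now!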
800K ~ 1.6M TWD / year
No requirement for relevant working experience
Logo of InAddition Consultants Ltd..
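* Infrastructure as Code (IaC): maintain and optimize cloud-based infrastructure (mainly AWS), using container technologies (Docker/K8s) to achieve an elastically scalable deployment architecture.
* Observability engineering: design and maintain a comprehensive monitoring system (using Prometheus / Grafana); beyond setting up alerts, build a complete log pipeline for analysis and a tracing mechanism.
* Incident management and review: lead troubleshooting and root cause analysis (RCA) of production incidents, and write post-mortem reports to prevent recurrence.
* Performance and capacity planning: continuously analyze system bottlenecks, and carry out architecture optimization and performance tuning for databases and application services.
* Automated operations: develop automation tools with scripts (Python / Shell), and optimize CI/CD pipelines to improve delivery efficiency.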
1M ~ 1.2M TWD / year
3 years of experience required
No management responsibility
Logo of 新芽網路股份有限公司.
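❗️Please submit all applications through the dedicated job page: https://25sprout.teamdoor.io/s/ML8ElGFS
This is an ongoing opening; applications submitted directly through the Cake platform will not receive a reply.
We are looking for a mid-level SRE (Site Reliability Engineer) to be the team's reliable backbone. Your mission is to keep systems running stably, manage cloud environments efficiently, and keep automating processes, so that users get a smoother experience and engineers can focus on development. If you love new technology, enjoy hands-on problem solving, and like collaborating with different roles, come join us :)
▍Your work will include:
* Linux administration and operations (RedHat / Debian / Ubuntu, etc.)
* Building and maintaining website/application environments (LAMP / LNMP)
* CI/CD integration and optimization (Jenkins / GitLab CI/CD)
* Certificate, key, and secrets management (SSL/TLS, Vault, etc.)
* Cloud platform resource management (AWS EC2 / S3 / RDS, Azure, GCP, etc.)
* Building monitoring and alerting systems to ensure high service availability (Prometheus / Grafana / ELK)
* Introducing automation tools and Infrastructure as Code (Terraform / Ansible / CloudFormation)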
60K ~ 70K TWD / month
3 years of experience required
No management responsibility
Logo of Cake Recruitment Consulting.
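Company overview:
This is a multinational technology company ranked among the top three worldwide in sports entertainment, with its R&D center in Taipei and teams across Europe, Southeast Asia, the US, and Canada. Its products center on real-time, high-concurrency, highly stable trading platforms serving millions of users. The company culture is open and technology-focused, encouraging engineers to have design autonomy and technical decision-making power, and the team includes members from well-known foreign firms and top technology companies. Known for "stable profitability + high flexibility", it continued to pay bonuses during the pandemic.
Responsibilities:
* Develop core backend systems, design architecture, and optimize performance
* Work with the product team to design new features and user-experience improvements
* Maintain high-concurrency microservice systems, ensuring stability and scalability
* Write technical documentation, take part in code review, and introduce best practices
* Mentor junior engineers and improve the team's development process
Tech stack:
* Java 8 / Spring Boot / Spring Cloud
* MySQL (Aurora, RDS), Redis, Kafka, RabbitMQ
* Docker, Kubernetes, Jenkins, ArgoCD
* Elasticsearch, Prometheus, Grafana, CloudWatch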
multi-thread
Kafka
Java
1.5M ~ 2.3M TWD / year
7 years of experience required
No management responsibility
Logo of 彼特思方舟.
About BTSE:
彼特思方舟 is a specialized service provider dedicated to delivering a full spectrum of front-office and back-office support solutions, each tailored to the unique needs of global financial technology firms. 彼特思方舟 is engaged by BTSE Group to offer several key positions, enabling the delivery of cutting-edge technology and tailored solutions that meet the evolving demands of the fintech industry in a competitive global market.
BTSE Group is a leading global fintech and blockchain company that is committed to building innovative technology and infrastructure. BTSE empowers businesses and corporate clients with the advanced tools they need to excel in a rapidly evolving and competitive market. BTSE has pioneered numerous trading technologies that have been widely adopted across the industry, setting new benchmarks for innovation, performance, and security in fintech. BTSE’s diverse business lines serve both retail (B2C) customers and institutional (B2B) clients, enabling them to launch, operate, and scale fintech businesses. BTSE is seeking ambitious, motivated professionals to join our B2C and B2B teams.
About the Opportunity:
We are looking for a Senior Infrastructure Cloud Engineer to design, build, and maintain robust AWS-based cloud infrastructure that powers our mission-critical systems. You will champion Infrastructure as Code, automation, and modern DevOps practices to deliver scalable, reliable, and secure cloud environments. This is a hands-on technical role where you’ll collaborate across teams to drive operational excellence and innovation in our cloud platform.
Responsibilities:
* Architect & Build: Design and implement secure, scalable, and automated AWS infrastructure solutions.
* CI/CD Excellence: Maintain and enhance CI/CD pipelines using GitLab CI, ArgoCD, and related tools.
* Observability: Build and manage monitoring and logging stacks (e.g., CloudWatch, EFK, Prometheus, Grafana) to ensure system health and performance.
* Automation: Automate provisioning and configuration using Terraform/Terragrunt, Ansible, and GitOps workflows.
* Governance & Resilience: Implement cloud governance, backup, disaster recovery, and cost optimization strategies.
* Collaboration: Partner with application and platform teams to support cloud adoption and ensure high availability.
* Incident Response: Troubleshoot incidents, perform root cause analysis, and drive continuous improvement.
* Documentation: Maintain clear, up-to-date architecture, operations, and automation documentation.
Requirements:
* 5+ years in cloud engineering, DevOps, or SRE roles.
* 3+ years of hands-on AWS experience.
* Proficiency with Infrastructure-as-Code tools (Terraform, Ansible).
* Strong background in containerization and Kubernetes (EKS preferred).
* Solid experience with CI/CD and GitOps workflows (GitLab CI, ArgoCD).
* Hands-on experience with observability tools (EFK, Prometheus, Grafana).
* Strong understanding of cloud networking, IAM, and security best practices.
Perks & Benefits:
* Competitive total compensation package
* Various team building programs and company events
* Comprehensive healthcare schemes for employees and dependants
* And many more! Apply and let us tell you more!
#LI-JY1
Negotiable
No requirement for relevant working experience
Logo of 55688集團_台灣智慧生活網股份有限公司.
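We started with a taxi-hailing app. The 55688 App now has more than 7.2 million members and over 1 million cumulative downloads, and maintains a 4.8-star rating. As our services expand to courier delivery, expert finding, laundry, and other lifestyle services, we are moving toward a Super App that can handle high real-time traffic and high reliability requirements.
The team already has R&D and front-line operations staff and is now building SRE (reliability engineering) capability. We would like to invite engineers who are passionate about system stability and engineering-driven improvement to help lay a solid foundation and build up our practices.
I. Role
1. Maintain the stability, availability, and scalability of systems operating 7x24x365; use engineering to reduce incident rates and shorten recovery time; and build automated, standardized deployment and operations processes so the system can be delivered continuously in a safe, fast, and predictable way. You will also work closely with R&D engineers to build stability, operability, and delivery capability into the product development process.
2. This is an exploratory, build-from-scratch (0→1) SRE / DevOps role. We do not expect you to establish a complete SRE practice on arrival; we expect you to work with the team to gradually build the foundational capabilities and shared understanding of reliability engineering.
II. Incident / on-call division of responsibilities
1. L1 (front-line) immediate response: handled by operations staff.
2. This position is an L2 on-call support role focused on reliability and stability.
3. Its core value lies in:
 * Post-incident improvement.
 * Establishing processes.
 * Using engineering to reduce the rate and impact of incidents.
III. What you will do (responsibilities)
(1) SRE (reliability engineering | L2)
1. Work with the team to inventory critical services and gradually introduce service reliability targets:
 * Service Level Agreement
 * Service Level Objective
 * Error Budget
2. Help design and improve system architecture:
 * High-availability architecture (Load Balancer, Auto Scaling, Failover).
 * Health checks and automatic recovery mechanisms.
3. Carry out capacity planning and load assessment:
 * Capacity Planning.
 * Assess congestion and resource-shortage risks in advance.
4. Build and improve observability:
 * Metrics (CPU, Memory, QPS, Latency, Error Rate)
 * Logs (centralized logging)
 * Tracing (distributed tracing)
5. Design a sensible alerting strategy:
 * Avoid large numbers of useless or overly frequent alerts.
 * Make alerts reflect actual risk and business impact.
6. Take part in L2 on-call support:
 * Help analyze systemic problems and root causes.
 * Assess whether it is necessary to:
 a. Roll back a release.
 b. Degrade the service.
 c. Take cross-system action.
7. Lead or assist with Incident Reports and Postmortems:
 * Systematically document the course and impact of each incident.
 * Turn every incident into concrete improvement actions and processes.
 * Track the implementation of improvement measures.
(2) DevOps
1. Build and maintain CI/CD pipelines:
 * For example Jenkins, GitLab CI, GitHub Actions.
 * Ensure pipelines are stable, repeatable, and easy to maintain.
2. Automate the following steps to reduce the risk of manual operations:
 * Build.
 * Test.
 * Security Scan.
 * Deploy.
3. Support consistency and deployment efficiency across environments:
 * Dev.
 * Staging.
 * Production.
4. Introduce Infrastructure as Code:
 * For example Terraform.
 * Improve the reproducibility and traceability of environment management and deployment.
5. Build and refine release and rollback mechanisms.
6. Collaborate with QA and R&D:
 * Reduce release risk through process and tool design.
 * Strike a balance between speed and stability.
(3) Collaboration with R&D and operations teams
1. Work with R&D to bring stability and observability into the development process, for example:
 * Design health-check mechanisms so system status can be detected and monitored automatically.
 * Plan service degradation and redundancy so core flows keep working when some features fail.
 * Continuously eliminate single points of failure (SPOF) to improve the availability of the overall architecture.
2. Provide standardized platform capabilities that product teams can share:
 * CI/CD pipeline templates.
 * Standard monitoring modules.
 * Standard alerting rules.
3. Establish basic SRE practices together with the R&D and operations teams:
 * Incident handling process:
 a. Notification.
 b. Response.
 c. Recovery.
 * Runbook writing and continuous improvement:
 a. Give common scenarios a standard operating manual to follow.
 * Basic SLO / Error Budget adoption and tracking.
4. Through documentation, sharing, and hands-on collaboration:
 * Improve the team's understanding of SRE thinking and methods.
 * Build a shared language and consensus on stability across teams.
IV. What we expect from you
(1) Required
1. At least 3 to 5 years of hands-on DevOps or SRE experience.
2. Familiarity with operating system and networking fundamentals:
 * TCP/IP.
 * DNS.
 * HTTP.
 * Load Balancer and related concepts.
3. Familiarity with at least one cloud platform:
 * For example GCP or Azure.
4. Familiarity with container and orchestration technologies.
5. Experience building or maintaining CI/CD pipelines.
6. Familiarity with, or exposure to, observability tools, for example:
 * Prometheus / Grafana.
 * ELK (Elasticsearch / Logstash / Kibana).
 * Datadog.
 * OpenTelemetry, etc.
7. Able to support L2 on-call:
 * Accept a rotation schedule.
 * Willing to keep reducing on-call burden and frequency through engineering.
8. Experience leading junior colleagues and assigning work, and collaborating to get the work done.
(2) Nice to have
1. Experience with real-time, high-traffic systems (real-time services, e-commerce, payments).
2. Hands-on experience with performance tuning, capacity planning, or stress testing.
3. Hands-on cloud or platform security experience, for example:
 * Access-control design.
 * Security protection.
 * Compliance and audit experience.
This is not a role that "just fights fires on the front line"; it is a role for building SRE capability and practices from zero together with the team. If you enjoy turning chaos into order, incidents into process, and manual response into engineering-driven improvement, we look forward to talking with you.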
50K ~ 80K TWD / month
5 years of experience required
No management responsibility
Logo of MoMo.
The MoMo Recommendation Platform is a complex system that powers personalized experiences for millions of users using a diverse range of technologies. We’re looking for a Senior Software Engineer with strong system thinking, architecture design skills, and a product mindset to help build the MLOps platform that transforms any AI/ML solution into a production-grade system at scale.
Job description
* Think like a product engineer: you don’t just “code a solution”; you build a platform that empowers others to deliver intelligent systems
* Design and develop a flexible platform that turns AI/ML solutions into production-ready systems: microservices, batch pipelines, or real-time APIs
* Build infrastructure to support:
 * Model training pipelines
 * Packaging & deployment
 * Serving & rollout
 * Monitoring & alerting
* Collaborate closely with Data Scientists, Business, and Product teams to deeply understand requirements and design adaptable, scalable solutions
* Integrate platform components into MoMo’s broader infrastructure: promotion engine, A/B testing, analytics, real-time scoring, etc.
Job requirements
Must-Have
* 5+ years of experience in software development, system architecture, or backend/platform engineering
* Proficiency in one or more of the following: Python, Bash, C++, JavaScript, Java, or Go
* Strong problem-solving skills and teamwork spirit
* Experience with:
 * Platform & Deployment: Kubernetes, Helm, Argo CD, Argo Rollouts, Docker, Google Cloud Platform (GCP) or Amazon Web Services (AWS)
 * Serving & APIs: FastAPI, gRPC, MLflow, KServe, custom logic services, REST APIs
 * Data & Messaging: BigQuery, Redis, MongoDB, PostgreSQL, Oracle, MySQL, Kafka, Pub/Sub
 * Orchestration & Workflow: Airflow, Argo Workflows
 * CI/CD & Monitoring: GitHub Actions, Prometheus, Grafana
 * Data Sources: App event streams, relational databases, messaging systems, APIs
* Solid understanding of distributed systems and cloud-native architecture
* Ability to design systems that support diverse solution types
* Platform mindset: you build for stability, scalability, and long-term maintainability
* Strong communication and collaboration skills: able to work cross-functionally with Data Scientists, DevOps, and Product teams
Nice-to-Have
* Experience working with both AI and ML
* Experience scaling low-latency / real-time systems
* Familiarity with A/B testing, canary release, and shadow deployment strategies
* Product-oriented mindset: you build systems that others can easily adopt and extend
No requirement for relevant working experience
Logo of Cake Recruitment Consulting.
🚀 Company Overview
Our client is a US-based technology company operating several large-scale global platforms. The products serve users from different cultures and regions, with millions of active members worldwide. The engineering team focuses on building high-traffic, high-availability systems and continues to upgrade the platform with AI-driven solutions. The company has a mature remote-work culture and values ownership, system reliability, and long-term technical growth.
This role offers the opportunity to:
* Work on global-scale products
* Solve distributed system and high-concurrency challenges
* Apply AI in real production environments
* Participate in real architecture and technical decisions
If you are looking for both technical depth and real impact, this could be a great next step in your career.
🧩 Responsibilities
Full Stack Product Development
* Build and maintain end-to-end features across backend and frontend systems
* Design clean APIs and scalable data models
* Collaborate with product and design teams to deliver stable and user-friendly features
AI Pipeline & Data Systems
* Build and improve AI data processing and serving pipelines
* Evaluate and integrate new databases, including search and vector databases
* Optimize data retrieval, ranking, and performance
Live Streaming System Reliability
* Troubleshoot issues across ingest, transcoding, delivery, and playback
* Improve system observability and platform health monitoring
* Develop tools to prevent repeated production incidents
Reporting & Monitoring
* Build dashboards and stability reports for uptime, latency, and errors
* Improve alerting systems with clear and actionable signals
* Use AI tools for anomaly detection and incident analysis
🛠 Tech Stack
* Backend: Python / Java / Node.js
* Frontend: React
* Databases: Postgres / MySQL / Redis / NoSQL
* Search & Vector DB: OpenSearch / Elasticsearch / Qdrant
* Streaming: RTMP / HLS / WebRTC / FFmpeg
* Observability: Grafana / Kibana / Datadog / Prometheus
* Cloud: AWS / GCP / Azure
* DevOps: Docker / Kubernetes / CI/CD
React
Python
FullStack Development
1.2M ~ 1.7M TWD / year
5 years of experience required
No management responsibility
Logo of 彼特思方舟.
About BTSE:
彼特思方舟 is a specialized service provider dedicated to delivering a full spectrum of front-office and back-office support solutions, each tailored to the unique needs of global financial technology firms. 彼特思方舟 is engaged by BTSE Group to offer several key positions, enabling the delivery of cutting-edge technology and tailored solutions that meet the evolving demands of the fintech industry in a competitive global market.
BTSE Group is a leading global fintech and blockchain company that is committed to building innovative technology and infrastructure. BTSE empowers businesses and corporate clients with the advanced tools they need to excel in a rapidly evolving and competitive market. BTSE has pioneered numerous trading technologies that have been widely adopted across the industry, setting new benchmarks for innovation, performance, and security in fintech. BTSE’s diverse business lines serve both retail (B2C) customers and institutional (B2B) clients, enabling them to launch, operate, and scale fintech businesses. BTSE is seeking ambitious, motivated professionals to join our B2C and B2B teams.
About the Opportunity:
We are looking for talented and motivated individuals to join a dedicated and diverse team in building up an exciting new product. You will be working with our crypto backend expert team and cooperating with resourceful PJM and PDM teams. We offer a fantastic culture and great career prospects.
Responsibilities:
* Collaborate on design, development, and unit testing of new features
* Design data structures and apply design patterns for business logic
* Ensure production systems’ reliability: monitoring, alerting, incident response
* Conduct root cause analysis and post-mortem reviews for production issues
* Optimize system performance, scalability, and capacity planning
* Work with SRE/DevOps on observability (e.g., Prometheus, Grafana, ELK)
* Define and monitor SLOs/SLIs and act on breaches
* Write runbooks, playbooks, and disaster recovery guides
Requirements:
* Strong Java skills, 10+ years’ experience
* Proficient with Spring Boot for Java, Ktor for Kotlin
* Experience with event-driven systems (MQ, Kafka)
* Familiarity with Redis caching
* Solid knowledge of data structures and Java concurrency
* Crypto/blockchain/Web3/trading experience is a plus
* SQL and relational database proficiency
* Positive, adaptable mindset suited to a fast-paced environment
* Strong communicator, independent contributor, great problem-solver
Perks & Benefits:
* Competitive total compensation package
* Various team building programs and company events
* Comprehensive healthcare schemes for employees and dependents
* And many more! Apply and let us tell you more!
#LI-MC1
Negotiable
No requirement for relevant working experience
