Cake Job Search

Mid-senior level
Job Summary: As a Machine Learning Engineer, you will design, develop, and deploy machine learning models and algorithms to address complex challenges and improve our products and services. You will also play a key role in AI-enhanced customer service, collaborating closely with data scientists, software engineers, and domain experts to implement state-of-the-art machine learning solutions.
Key Responsibilities:
Model Development: Design, build, and maintain scalable machine learning models and algorithms.
Data Analysis: Analyze and preprocess data from various sources to prepare it for model training.
Model Training and Evaluation: Train, validate, and tune machine learning models to achieve optimal performance.
Deployment: Deploy models into production and integrate them with existing systems.
Monitoring and Maintenance: Monitor model performance in production and update models as necessary to ensure they remain accurate and relevant.
Collaboration: Work with cross-functional teams, including data scientists, software engineers, and product managers, to understand requirements and deliver solutions.
Research: Stay current with the latest developments in machine learning and AI to continuously improve our technology stack.
TensorFlow
SQL
Spark
8,000 ~ 17,000 MYR / month
Requires 4+ years of work experience
No management responsibility
This role requires you to work in a shift pattern or non-standard work hours as required. This may include weekend work.
Note: By applying to this position you will have an opportunity to share your preferred working location from the following: Pune, Maharashtra, India; Bengaluru, Karnataka, India.
Minimum qualifications:
Bachelor's degree in Computer Science, Engineering, Mathematics, a related technical field, or equivalent practical experience.
2 years of experience in a technical role such as technical support, software engineering, or solutions engineering.
Experience coding in one or more general purpose languages (e.g., Python, Java, Go, C or C++), including data structures and algorithms.
Experience with Artificial Intelligence (AI) concepts and Machine Learning (ML) techniques.
Experience with computer networking (e.g., TCP/IP, DNS, load balancing, routing) and Linux/Unix system administration.
Preferred qualifications:
Professional-level certification on Google Cloud, such as the Professional Machine Learning Engineer or Professional Cloud Architect.
Experience with Google Cloud's AI/ML product portfolio, including Vertex AI (Vertex AI Workbench, Pipelines, Endpoints, TensorBoard) and Generative AI tools (Gemini, Gen AI Studio).
Experience in specialized ML areas like Natural Language Processing (NLP), Computer Vision, or Recommendation Systems.
Experience with public cloud infrastructure and core services (e.g., Compute Engine, Cloud Storage, BigQuery).
Knowledge of ML frameworks such as TensorFlow, Keras, or PyTorch.
About the job
The Google Cloud Platform team helps customers transform and build what's next for their business, all with technology built in the cloud. Our products are developed for security, reliability and scalability, running the full stack from infrastructure to applications to devices and hardware. Our teams are dedicated to helping our customers (developers, small and large businesses, educational institutions and government agencies) see the benefits of our technology come to life. As part of an entrepreneurial team in this rapidly growing business, you will play a key role in understanding the needs of our customers and help shape how businesses of all sizes use technology to connect with customers, employees and partners.
As a Technical Solutions Engineer, you will own our large and important customer issues in addition to providing level two support to our other support teams. You will be a part of a global team that provides 24x7 support to help customers seamlessly make the switch to Google Cloud.
In this role, you will troubleshoot technical problems for customers with a mix of debugging, networking, system administration, updating documentation, and when needed, coding/scripting. You will make our products easier to adopt and use by making improvements to the product, tools, processes and documentation. You will help drive the success of Google Cloud by understanding and advocating for our customers' issues.
Google Cloud accelerates every organization's ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google's technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.
Responsibilities
Troubleshoot and resolve technical issues across the Google Cloud AI/ML portfolio, focusing on customer-reported deployment failures, model performance degradation and infrastructure-related problems.
Work directly with customers on their ML deployments, including generative AI models, to ensure production readiness and high availability.
Utilize coding and scripting skills (primarily Python) to read, debug, and reproduce customer issues within their ML models (TensorFlow, PyTorch) or deployment environments (Kubernetes, Compute Engine).
Manage customer problems through effective diagnosis, clear documentation, and the development and implementation of new investigation tools to increase diagnostic speed.
Develop an understanding of Google Cloud's AI/ML solutions and share this knowledge to upskill the wider global support organization.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google's EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
Fullstack in Computer Vision and NLP
Support building, testing, and deploying machine learning models and algorithms to enhance user experience.
Build the prompting and model orchestration for a production application backed by a language model.
Design and build agent affordances that unlock new capabilities for internal use and deployed products.
Design and build a novel eval that measures how many agents interact in groups to solve problems.
Assist with the development of AI-powered assistant bots that automate workflows.
Job Description
Develop and test AI-driven features in collaboration with the AI engineering team.
Support the deployment and maintenance of machine learning models and ensure their effectiveness in production environments.
Continuously monitor AI systems and suggest improvements based on user feedback and system performance.
Work in an agile environment, participating in sprint planning, development, and testing cycles.
Job Requirements
Bachelor's degree in Computer Science, Engineering, Mathematics, Data Science, or a related field.
Basic understanding of AI/ML concepts and experience with machine learning frameworks such as TensorFlow, PyTorch, or Scikit-learn.
Experience developing complex agentic systems using LLMs.
Time spent prompting and/or building products with language models.
Good communication skills and an interest in working with other researchers on difficult tasks.
Experience building LLM-based agents with frameworks like LangChain and LangGraph, and with monitoring tools such as LangSmith.
Strong understanding of agent system design, including orchestration, context/memory management, tool usage, and multi-step reasoning.
Experience building and operating agent lifecycle systems: job scheduling, execution pipelines, state management, and failure/retry handling.
Understanding of stateless vs. stateful agent architectures and their trade-offs (scalability, latency, memory persistence).
Familiarity with data manipulation and analysis tools (e.g., Pandas, NumPy).
Familiarity with cloud platforms (AWS, GCP, Azure) is a plus.
Strong analytical and problem-solving skills, with a willingness to learn and adapt to new technologies and challenges.
Good communication skills and the ability to work collaboratively within a team.
Experience with projects in computer vision and NLP.
This role requires you to work in a shift pattern or non-standard work hours as required. This may include weekend work.
Note: By applying to this position you will have an opportunity to share your preferred working location from the following: Bengaluru, Karnataka, India; Pune, Maharashtra, India.
Minimum qualifications:
Bachelor's degree in Computer Science, Engineering, Mathematics, a related technical field, or equivalent practical experience.
5 years of experience in a technical role such as Technical Support, Software Engineering, or Solutions Engineering.
Experience coding in one or more general purpose languages (e.g., Python, Java, Go, C or C++), including data structures, algorithms, and software design.
Experience with Artificial Intelligence (AI) concepts and Machine Learning (ML) techniques.
Experience with computer networking (e.g., TCP/IP, DNS, load balancing, routing) and Linux/Unix system administration.
Preferred qualifications:
Professional-level certification on Google Cloud, such as the Professional Machine Learning Engineer or Professional Cloud Architect.
Experience with Google Cloud's AI/ML product portfolio, including Vertex AI (Vertex AI Workbench, Pipelines, Endpoints, TensorBoard) and Generative AI tools (Gemini, Gen AI Studio).
Experience in specialized ML areas like Natural Language Processing (NLP), Computer Vision, or Recommendation Systems.
Experience with public cloud infrastructure and core services (e.g., Compute Engine, Cloud Storage, BigQuery).
Knowledge of ML frameworks such as TensorFlow, Keras, or PyTorch.
Ability to lead the design and implementation of AI-based solutions or debugging tools, demonstrating strong collaboration skills.
About the job
The Google Cloud Platform team helps customers transform and build what's next for their business, all with technology built in the cloud. Our products are developed for security, reliability and scalability, running the full stack from infrastructure to applications to devices and hardware. Our teams are dedicated to helping our customers (developers, small and large businesses, educational institutions and government agencies) see the benefits of our technology come to life. As part of an entrepreneurial team in this rapidly growing business, you will play a key role in understanding the needs of our customers and help shape how businesses of all sizes use technology to connect with customers, employees and partners.
Our Technical Solutions Engineers lead and own our large and important customer issues in addition to providing level two support to our other support teams. You will be a part of a global team that provides 24x7 support to help customers seamlessly make the switch to Google Cloud.
In this role, you will troubleshoot technical problems for customers with a mix of debugging, networking, system administration, updating documentation, and when needed, coding/scripting. You will make our products easier to adopt and use by making improvements to the product, tools, processes and documentation. Our Technical Solutions team is driven by customers, and you will help drive the success of Google Cloud by understanding and advocating for our customers' issues.
Google Cloud accelerates every organization's ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google's cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.
Responsibilities
Troubleshoot and resolve highly technical issues across the Google Cloud AI/ML portfolio, focusing on customer-reported deployment failures, model performance degradation and infrastructure-related problems.
Work directly with customers on their ML deployments (including Generative AI models) to ensure production readiness and high availability.
Utilize coding and scripting skills (primarily Python) to read, debug, and reproduce customer issues within their ML models (TensorFlow, PyTorch) or deployment environments (Kubernetes, Compute Engine).
Manage customer problems through effective diagnosis, clear documentation, and the development and implementation of new investigation tools to increase diagnostic speed.
Develop an in-depth understanding of Google Cloud's AI/ML solutions and share this knowledge to upskill the wider global support organization.
Participate in an on-call rotation, which may include working non-standard hours, nights, or weekends as part of our global 24/7 support model.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google's EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
MoMo is Vietnam's leading financial super-app, redefining how millions manage their money through AI-driven innovation. Our Big Data AI Team doesn't just support the product; we are the product. From hyper-personalization and eKYC to fraud detection, AI is the heartbeat of MoMo.
As a Senior AI Engineer, you will lead the evolution of our Generative AI ecosystem. You won't just be prompting LLMs; you'll be architecting sophisticated Agentic workflows and productionizing state-of-the-art models that serve millions of users in real time.
Job Description
Lead Architecture: Design and deploy scalable Agentic frameworks and Generative AI solutions that integrate seamlessly into the MoMo ecosystem.
Production-Grade AI: Build and maintain robust LLM-based products using SOTA techniques (RAG, fine-tuning, and prompt orchestration) and open-source libraries.
Cross-Functional Leadership: Partner with Data Scientists, Backend Engineers, and Product Managers to bridge the gap between experimental models and high-availability production systems.
Engineering Excellence: Write clean, high-performance production code and establish best practices for LLM Ops (CI/CD for models, versioning, and monitoring).
Evaluation and Optimization: Define rigorous evaluation metrics (faithfulness, relevancy, latency) and conduct A/B experiments to iterate on model performance.
Scale: Optimize AI systems to handle high-concurrency traffic while maintaining low latency.
Job Requirements
Experience: 3+ years in professional software/AI engineering, with a deep focus on Voice Agentic and Generative AI.
LLM Mastery: Hands-on experience with model families like Llama 3, Qwen 2, GPT-4, and BERT. You understand their architectures, tokenization nuances, and limitations.
System Design: Proven track record of building Agentic systems or Voice-AI from scratch, taking them from preprocessing to real-world drift monitoring.
Tech Stack: Mastery of Python and deep learning frameworks (PyTorch or TensorFlow). Experience with vector databases (Milvus, Pinecone, or similar) is a plus.
Mindset: A product-oriented engineer who cares about the "Why" as much as the "How."
MoMo is the market leader in mobile payments in Vietnam, striving to make all transactions fast, easy, and joyful. You will join our Big Data AI team, where we position AI/Machine Learning as the core component of almost every product feature.
Specifically, you will operate as a key technical leader in the Moni team, the squad behind MoMo's flagship AI Assistant. Moni currently serves hundreds of thousands of Monthly Active Users, scaling from a chatbot into a fully autonomous AI Agent. As a Senior / Technical Lead, you will drive the architectural decisions and engineering standards that power the next generation of our Agentic AI.
Job Description
Technical Leadership and Architecture: Define the technical vision and architecture for autonomous AI Agents. Make critical decisions on tech stacks, model selection, and system design to ensure scalability and reliability.
Architect and Build AI Agents: Lead the end-to-end development of complex Agentic workflows (Tool Calling, Planning, Reasoning) that integrate deep into the MoMo ecosystem.
Multi-Agent Orchestration: Design and implement orchestration layers where multiple specialized agents collaborate to solve intricate user financial tasks.
Advanced RAG Strategy: Engineer robust RAG pipelines (Hybrid Search, GraphRAG, Re-ranking) to handle vast knowledge bases with high precision.
System Evaluation and Quality Assurance: Establish "Gold Standard" evaluation frameworks for Agentic AI (reasoning capabilities, hallucination rates, safety metrics) and drive the optimization loop.
Mentorship and Best Practices: Mentor senior/junior engineers, conduct code reviews, and set high standards for code quality, MLOps practices, and GenAI engineering across the team.
Production Excellence: Partner with DevOps/MLOps to ensure high availability and low latency for AI services serving massive concurrent traffic.
Job Requirements
Experience: 5+ years of professional experience in AI/ML/Software Engineering, with a strong track record in leading technical initiatives.
Agentic AI Mastery: Deep hands-on experience in building AI Agents and Multi-Agent systems. Proficient in Agentic Design Patterns like Tool Calling, Planning and Reasoning, and frameworks such as LangChain, LangGraph, or Agents SDK.
Advanced RAG and Search: Expert knowledge of retrieval strategies, vector databases, and semantic search optimization.
LLM and Model Strategy: Strong capability in selecting and benchmarking Foundation Models (Open vs. Closed source) and applying fine-tuning/alignment (RLHF, DPO) strategies.
System Evaluation: Experience implementing rigorous evaluation pipelines for Agentic AI (using Ragas, Langfuse, or custom metrics).
Engineering Excellence: Proficient in Python, PyTorch, and modern Data/AI stacks. Experience in designing high-load distributed systems is a plus.
Leadership Mindset: Ability to navigate ambiguity, drive technical consensus, and balance engineering perfection with product delivery speed.
Google welcomes people with disabilities.
Note: By applying to this position you will have an opportunity to share your preferred working location from the following: Zhubei, Zhubei City, Hsinchu County, Taiwan; New Taipei, Banqiao District, New Taipei City, Taiwan.
Minimum qualifications:
Bachelor's degree in Electrical Engineering, Computer Engineering, Computer Science, a related field, or equivalent practical experience.
8 years of experience with software programming languages (C/C++, Python) and application processor development.
Experience with AI/ML workloads, including prefill, decode, and multimodal processing steps in Large Language Models (LLMs).
Preferred qualifications:
Master's degree or PhD in Electrical Engineering, Computer Engineering or Computer Science, with an emphasis on computer architecture, next generation memory systems, or AI hardware accelerators.
Experience with power and performance modeling and activity profiling using traces from power measurements and performance monitoring counters.
Experience influencing silicon or memory roadmaps through high-fidelity performance projections of emerging technologies.
Experience with ML frameworks (e.g., PyTorch, JAX, TensorFlow).
Experience with SQL for data querying and analysis.
About the job
Be part of a team that pushes boundaries, developing custom silicon solutions that power the future of Google's direct-to-consumer products. You'll contribute to the innovation behind products loved by millions worldwide. Your expertise will shape the next generation of hardware experiences, delivering unparalleled performance, efficiency, and integration.
As a Senior System Architect within the Silicon team, you will work on GenAI use cases across hardware and software. You will be responsible for modeling and analyzing trade-offs for on-device vs. cloud AI execution of Gemini AI models. This role is critical in influencing the hardware and software roadmaps for SoC, AI accelerator, and new memory technologies.
Google's mission is to organize the world's information and make it universally accessible and useful. Our team combines the best of Google AI, Software, and Hardware to create radically helpful experiences. We research, design, and develop new technologies and hardware to make computing faster, seamless, and more powerful. We aim to make people's lives better through technology.
Responsibilities
Model and estimate power and performance for next-generation SoC and memory technologies.
Optimize hardware and software architectures for future GenAI use cases.
Measure and compare on-device AI and cloud AI to provide guidance for Hybrid AI development.
Support emerging technology initiatives with alignment across silicon process, IP design, Android OS, and Gemini model teams.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google's EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
Google welcomes people with disabilities.
Minimum qualifications:
Bachelor's degree in Science, Technology, Engineering, Mathematics, or equivalent practical experience.
3 years of experience in Python and a related machine learning package (e.g., Keras, PyTorch, HF Transformers).
Experience in applied AI, building systems around pre-trained models (e.g., prompt engineering, fine-tuning, Retrieval-Augmented Generation (RAG), orchestrating model interactions with external tools to deliver solutions).
Experience with architecting, deploying, or managing solutions on a Cloud Platform (e.g., Google Cloud Platform).
Ability to communicate in Japanese and English fluently to interact with internal and external stakeholders.
Preferred qualifications:
Master's degree or PhD in AI, Computer Science, or a related technical field.
Experience with implementing multi-agent systems using frameworks (e.g., LangGraph, CrewAI, or Google's ADK) and patterns like ReAct, self-reflection, and hierarchical delegation.
Knowledge of Large Language Model (LLM)-native metrics (e.g., tokens/sec, cost-per-request) and techniques for optimizing state management and granular tracing.
About the job
In this role, you will be an embedded builder who bridges the gap between frontier Artificial Intelligence (AI) products and production-grade reality for customers. You will manage blockers to production, including solving the integration issues, data readiness issues, and state-management tests that prevent AI from reaching enterprise-grade maturity. You will deploy AI systems and act as a feedback loop, transforming real-world field insights into Google Cloud's future product roadmap.
It's an exciting time to join Google Cloud's Go-To-Market team, leading the AI revolution for businesses worldwide. You'll excel by leveraging Google's brand credibility, a legacy built on inventing foundational technologies and proven at scale. We'll provide you with the world's most advanced AI portfolio, including frontier Gemini models and the complete Vertex AI platform, helping you to solve business problems. We're a collaborative culture providing direct access to DeepMind's engineering and research minds, empowering you to solve customer challenges. Join us to be the catalyst for our mission, drive customer success, and define the new cloud era. The market is yours.
Responsibilities
Serve as a developer for Artificial Intelligence (AI) applications, transitioning from prototypes to production-grade agentic workflows (e.g., multi-agent systems, Model Context Protocol (MCP) servers) that drive Return on Investment (ROI).
Architect and code the connection between Google's AI products and customers' live infrastructure, including Application Programming Interfaces (APIs), legacy data silos, and security perimeters, as part of a team.
Build evaluation pipelines and observability frameworks to ensure agentic systems meet requirements for accuracy, safety and latency.
Identify repeatable field patterns and friction points in Google's AI stack, converting them into reusable modules or formal product feature requests for the Engineering teams.
Co-build with customer engineering teams to instill Google-grade development best practices, ensuring project success and end-user adoption.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google's EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
WorldQuant develops and deploys systematic financial strategies across a broad range of asset classes and global markets. We seek to produce high-quality predictive signals (alphas) through our proprietary research platform to employ financial strategies focused on market inefficiencies. Our teams work collaboratively to drive the production of alphas and financial strategies, the foundation of a balanced, global investment platform.
WorldQuant is built on a culture that pairs academic sensibility with accountability for results. Employees are encouraged to think openly about problems, balancing intellectualism and practicality. Excellent ideas come from anyone, anywhere. Employees are encouraged to challenge conventional thinking and possess an attitude of continuous improvement. Our goal is to hire the best and the brightest. We value intellectual horsepower first and foremost, and people who demonstrate an outstanding talent. There is no roadmap to future success, so we need people who can help us build it.
Technologists at WorldQuant research, design, code, test and deploy firmwide platforms and tooling while working collaboratively with researchers and portfolio managers. Our environment is relaxed yet intellectually driven. We seek people who think in code and are motivated by being around like-minded people.
The Role
We are seeking an exceptional senior-level Python engineer to join a small team working on complex data pipelines, AI/ML systems, and cutting-edge software solutions. This role will be responsible for managing technical objectives, providing technical leadership, and maintaining a hands-on approach to development. The ideal candidate will work closely across teams within WorldQuant as part of our business-facing technology organization. A successful candidate will possess deep expertise in Python development, data engineering, software architecture, and design principles. They should be able to mentor junior team members, conduct code reviews, and drive architectural decisions. Experience with AI and large language models (LLMs) is highly desirable.
What You'll Bring
Master's degree or higher in Computer Science, Engineering, or a related technical field from a top-tier institution.
7+ years of experience as a Python developer, with a strong focus on data engineering and AI/ML systems.
Expert-level knowledge of Python and its ecosystem, including experience with data processing libraries like Pandas, NumPy, and PySpark.
Proficiency in designing and implementing scalable, maintainable, and efficient data pipelines.
Experience with cloud platforms (AWS, GCP, or Azure) and containerization technologies (Docker, Kubernetes).
Expertise in version control systems (Git), CI/CD practices, and agile methodologies.
Strong communication skills with the ability to explain complex technical concepts to both technical and non-technical stakeholders.
Experience in the finance industry is a plus but not required.
Experience with AI/ML frameworks such as PyTorch or scikit-learn, and with LLMs, agents, or systems of agents, is a significant plus.
By submitting this application, you acknowledge and consent to terms of the WorldQuant Privacy Policy. The privacy policy offers an explanation of how and why your data will be collected, how it will be used and disclosed, how it will be retained and secured, and what legal rights are associated with that data (including the rights of access, correction, and deletion). The policy also describes legal and contractual limitations on these rights. The specific rights and obligations of individuals living and working in different areas may vary by jurisdiction.
Copyright © 2025 WorldQuant, LLC. All Rights Reserved.
WorldQuant is an equal opportunity employer and does not discriminate in hiring on the basis of race, color, creed, religion, sex, sexual orientation or preference, age, marital status, citizenship, national origin, disability, military status, genetic predisposition or carrier status, or any other protected characteristic as established by applicable law.
Minimum qualifications:
Bachelor's degree or equivalent practical experience.
5 years of experience with software development in one or more programming languages.
3 years of experience testing, maintaining, or launching software products, and 1 year of experience with software design and architecture.
3 years of experience with ML infrastructure (e.g., model deployment, model evaluation, optimization, data processing, debugging).
Experience with distributed computing, infrastructure as code, infrastructure as a service, and system design.
Preferred qualifications:
Master's degree or PhD in Computer Science or related technical field.
5 years of experience with data structures and algorithms.
3 years of experience developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, storage or hardware architecture.
Experience as a software engineer.
Experience in any one of GCP or other cloud providers, or other data center management stack.
Knowledge in three or more of the following areas: APIs and services, distributed systems, tools, testing infrastructure, and monitoring infrastructure.
About the job
Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google's needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full-stack as we continue to push technology forward.
The CMCS (Cloud ML Compute Services) team defines and drives the overall Cloud ML Compute IaaS and IaaS+ product offering and technical strategy.
In this role, you will enable customers with the best Machine Learning (ML) and High Performance Computing (HPC) platform in the world for top talent, powered by TPUs, GPUs, CPUs and all ML frameworks (TensorFlow, PyTorch and JAX).
Responsibilities
Own the design, development, and deployment of scalable software components that enable the deployment of AI and ML infrastructure.
Troubleshoot complex distributed system issues across the stack (hardware, kernel, network); build the automation, tooling, and telemetry needed to turn operational findings into permanent software fixes and improved SLOs.
Collaborate closely with Hardware, Networking, Storage, CE, Product and other partner teams to define requirements and deliver high-quality solutions.
Lead code reviews, drive engineering best practices (testing, release safety), and mentor junior engineers to help grow the technical capability of the team.
Contribute to the team's technical roadmap by identifying infrastructure gaps and proposing architectural improvements to support future growth.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google's EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
