About the Role
We are seeking a Data Scientist with deep expertise in both classical machine learning and Generative AI to drive digital transformation in the process industries, including energy, chemicals, manufacturing, and industrial automation. You will work on high-impact use cases such as predictive maintenance, process optimization, supply chain forecasting, and intelligent automation, drawing on massive industrial datasets, domain-specific knowledge, and cutting-edge AI/ML techniques. Drawing inspiration from the work done at AspenTech, Aveva, Siemens, C3.ai, and Palantir, you will lead the development and deployment of production-grade models that create measurable business value for mission-critical operations.
Key Responsibilities
● Design, develop, and deploy advanced machine learning and deep learning models for predictive analytics, anomaly detection, optimization, and simulation in process industries.
● Lead the application of Generative AI (LLMs, foundation models) to use cases such as natural language querying of industrial data, automated reporting, operator assistance, and control recommendations.
● Collaborate with SMEs, engineers, and product teams to define high-impact use cases grounded in operational and engineering realities.
● Work with large-scale industrial time-series, sensor, and historian data (e.g., OSIsoft PI, Aspen IP.21).
● Build and evaluate digital twins for industrial assets and processes, including physics-informed ML models.
● Apply techniques such as reinforcement learning, probabilistic modeling, and graph learning where applicable.
● Stay current with the GenAI and ML research landscape, and translate breakthroughs into practical solutions.
● Contribute to data pipelines, model monitoring, retraining strategies, and deployment using MLOps tools and platforms.
● Communicate results and insights clearly to both technical and non-technical stakeholders.
Required Qualifications
● 1 to 3+ years of experience in applied data science, with a strong track record of delivering real-world ML/AI solutions.
● Deep understanding of supervised/unsupervised learning, time-series analysis, and generative models (e.g., transformers, diffusion models).
● Experience working with industrial datasets and process industry platforms such as Aveva, AspenTech, Honeywell, or Siemens Xcelerator.
● Proficiency with Python, ML libraries (e.g., scikit-learn, XGBoost, PyTorch, TensorFlow), and GenAI frameworks (e.g., LangChain, Hugging Face, OpenAI APIs).
● Familiarity with MLOps tooling such as MLflow, Airflow, SageMaker, or Azure ML.
● Strong analytical and problem-solving skills with the ability to work across messy and heterogeneous data sources.
● Solid grounding in statistics, optimization, and experimental design.
● Excellent communication and collaboration skills, with the ability to partner with engineers, domain experts, and executives.
Preferred Qualifications
● Background in chemical, mechanical, or process engineering.
● Experience with digital twins, control systems, or simulation tools (e.g., Aspen Plus, gPROMS).
● Knowledge of graph data models, semantic layers, or ontologies used in industrial contexts.
● Understanding of domain-specific regulatory, safety, and operational requirements.
● Experience with secure and scalable AI deployment in enterprise or OT/IT hybrid environments.