Gurgaon district, Haryana, India
Contributed to the modernization of a platform product, using Spark and Scala on an AWS EC2 cluster for aggregations and transformations. The project focused on standardizing terabytes of incremental Market Ownership data for millions of companies.
Project/Product - Ownership Incremental Standardization
Key Achievements:
- Implemented a Spark-based parallel ingestion and processing framework that significantly improved performance over the legacy systems.
- Led the end-to-end delivery of a complex system involving multiple frameworks for ingestion, initialization, standardization, and aggregation.
- Optimized the performance of complex workloads, such as hierarchical queries and aggregations, using caching, repartitioning, and broadcasting.
Project/Product - Modernization of Loaders using Kafka
Key Contributions:
- Implemented a real-time Kafka pipeline that handles millions of transactional CDC messages, applying business logic to them before persisting them in the target database.
- Led the end-to-end delivery of the pipeline, designing and developing the Kafka consumer in line with best practices and guidelines.
- Improved performance and addressed latency issues in message consumption through Kafka tuning and the use of Docker in production.