I am a data professional passionate about transforming raw datasets into actionable insights. With strong expertise in ETL pipelines, big data tools, and cloud-native platforms, I build scalable, reliable data solutions. My work combines data engineering with advanced analytics, positioning me to transition into data science roles where I can apply statistical modeling and machine learning to decision-making.
To build intelligent, automated data platforms that empower businesses to make real-time, data-backed decisions.
To integrate engineering precision with analytical depth, ensuring efficient, compliant, and sustainable data ecosystems.
Proficient in Python, SQL, Java, and shell scripting for data parsing, analysis, and transformation.
Experienced in Informatica PowerCenter, SSIS, Talend, Git, CI/CD pipelines, and ETL automation.
Skilled in Hadoop, PySpark, Hive, Kafka, AWS (S3, Redshift, Lambda), and GCP BigQuery.
Specialized in end-to-end ETL/ELT pipeline design, deployment, monitoring, and recovery.
Hands-on experience with Agile, Scrum, and Jira boards, and with version-controlled deployments on cloud-native platforms.
Strong background in data modeling, performance tuning, and statistical validation for BI reporting.
Migrated 4TB+ of legacy data to Oracle with checksum validation, automated 200+ ETL workflows, and integrated Jenkins CI/CD pipelines, improving stability and shortening release cycles by 55%.
Designed and optimized 15+ ETL pipelines and implemented real-time data transformations across 6 heterogeneous systems, reducing data mismatches by 95% and cutting pipeline runtime by 43%.