Ashok - Data Engineer Portfolio

Current Role

Bestbuy India Apr 2025 - Present

Software Engineer - II (Data)

Primarily focused on analyzing SAP reports and tracing data lineage from the reporting layer back through the medallion architecture (gold, silver, and bronze layers) to the source RDBMS. The existing architecture is built on Teradata, with ETL handled by Informatica. This analysis is used to reconstruct the data flow in Google Cloud using BigQuery and to build new reports in Looker Studio Pro.

Major Contributions:

Conducted in-depth lineage analysis of SAP reports across Teradata-based medallion architecture
Mapped data flow from gold layer to source systems to support migration to Google Cloud
Rebuilt data pipelines and models in BigQuery based on legacy architecture insights
Designed and developed reports in Looker Studio Pro for enhanced visualization and accessibility
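
The lineage tracing described above can be pictured as a walk over a dependency graph, from a report back to its source tables. The following is a minimal sketch under stated assumptions, not the tooling used on the project: every table name and the UPSTREAM mapping are hypothetical, and a real lineage map would have to be derived from the existing ETL mappings and view definitions.

```python
from collections import deque

# Hypothetical lineage map: each object points to the objects it reads from.
UPSTREAM = {
    "report.sales_summary": ["gold.sales_agg"],
    "gold.sales_agg": ["silver.sales_clean", "silver.store_dim"],
    "silver.sales_clean": ["bronze.sales_raw"],
    "silver.store_dim": ["bronze.store_raw"],
    "bronze.sales_raw": ["source.pos_transactions"],
    "bronze.store_raw": ["source.store_master"],
}


def trace_lineage(start, upstream):
    """Return every upstream object reachable from `start`, breadth-first."""
    ordered = []
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in upstream.get(node, []):
            if parent not in visited:
                visited.add(parent)
                ordered.append(parent)
                queue.append(parent)
    return ordered


def source_tables(start, upstream):
    """Return the upstream objects that have no parents of their own."""
    return sorted(t for t in trace_lineage(start, upstream) if not upstream.get(t))
```

Calling source_tables("report.sales_summary", UPSTREAM) lists the objects at the end of the chain, which is the set a migration would need to land in the new platform first.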

SQL GCP Teradata Looker Studio Pro Medallion Architecture

Previous Experience

Carelon Global Solutions Dec 2022 - Mar 2025

Senior Software Engineer - Data Engineering

Focused on developing scalable data models and pipelines using Python and SQL across AWS and GCP environments to process high-volume healthcare claims data, while ensuring compliance with privacy and security standards.

Built ETL pipelines using Python and SQL on GCP (Cloud Functions & GKE) to monitor and process $250M in finalized claims daily (MBM Commercial model), orchestrated with Apache Airflow
Developed and maintained cloud-based workflows using AWS Lambda, Glue, EMR, Step Functions, and GCP services
Designed data models and contributed to architecture reviews with a focus on PHI/PII risk mitigation
Presented
Specialized in query optimization, code reviews, and lifecycle management to enhance data quality and security

AWS GCP SQL Airflow Snowflake

Legato Health Technologies Jan 2021 - Nov 2022

Software Engineer - Data Engineering

Contributed to enterprise-scale data pipeline development and deployment across cloud and big data platforms, with a focus on automation, data quality, and team mentorship.

Developed an HBase audit tool to detect missing loads between Hadoop and Snowflake for 1000+ tables
Built an AWS Glue job for incremental and bulk data transfers between S3 buckets, used daily for 1000+ tables
Led complex production deployments with strong quality control using Bitbucket and deployment trackers
Designed and optimized complex Snowflake SQL scripts for layered data warehousing
Mentored new associates and conducted training on Big Data, Hadoop, Bitbucket, and AEDL architecture
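
As an illustration of the missing-load audit mentioned in this list, the core comparison can be reduced to a set difference between the loads recorded on each side. This is a simplified sketch only: the LoadRecord shape (table name plus load date) and the function name are assumptions, and it does not reproduce how the actual tool read its metadata from HBase.

```python
from typing import Iterable, NamedTuple


class LoadRecord(NamedTuple):
    table: str
    load_date: str  # ISO date string, e.g. "2022-03-01" (assumed format)


def find_missing_loads(source_loads: Iterable[LoadRecord],
                       target_loads: Iterable[LoadRecord]) -> dict:
    """Group load dates present in the source but absent from the target, by table."""
    missing = set(source_loads) - set(target_loads)
    report = {}
    for record in sorted(missing):
        report.setdefault(record.table, []).append(record.load_date)
    return report
```

Keying the comparison on (table, load_date) keeps the check cheap enough to run across a large number of tables, since each side is scanned only once.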

Python SQL Snowflake AWS Hadoop

Accenture Inc. Jun 2019 - Dec 2020

Software Engineer - Data Engineering

Led performance optimization and framework enhancement efforts for big data ingestion tools, supporting cross-team collaboration and large-scale ETL operations in a Hadoop ecosystem.

Improved Hive partition deletion tool, reducing runtime from 30 minutes to under 1 minute per table; awarded "Best of the Month" at Accenture BDF
Built a PySpark-based automation tool for converting Streamsets ETL pipelines to Spark jobs for 1800+ Hive tables
Enhanced ingestion frameworks using Shell Script, PySpark, Sqoop, Hive, and HBase to support historical and incremental data loads
Acted as the primary support for 8+ teams using the "Streamsets to Accelerator Converter" framework, ensuring reliable maintenance and issue resolution
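
The portfolio does not say how the runtime of the partition deletion tool was reduced. One common approach, shown here purely as an assumption, is to drop many partitions per HiveQL statement instead of issuing one statement per partition. The function name, the batch size, and the dt partition column are hypothetical.

```python
def build_drop_statements(table, partition_col, values, batch_size=50):
    """Build HiveQL ALTER TABLE statements that drop partitions in batches.

    Values are interpolated directly, so they must come from a trusted
    source such as the metastore, never from user input.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    statements = []
    for i in range(0, len(values), batch_size):
        batch = values[i:i + batch_size]
        specs = ", ".join(f"PARTITION ({partition_col}='{v}')" for v in batch)
        statements.append(f"ALTER TABLE {table} DROP IF EXISTS {specs}")
    return statements
```

Each generated statement could then be submitted through whichever Hive client the environment provides; batching cuts the number of round trips rather than the work done per partition.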

Spark Hadoop Shell Script Python Hive HBase UNIX

Accenture Inc. Jun 2018 - May 2019

Associate Software Engineer - Data Engineering

Worked on large-scale data ingestion and automation projects using Hadoop ecosystem tools, with a focus on streamlining data workflows and ensuring data quality.

Automated deletion of invalid partitions across 1800+ tables using shell scripting
Built 100+ Streamsets pipelines for efficient RDBMS to Hadoop data integration
Developed automation frameworks using test-driven development in an Agile environment
Gained deep expertise in Apache HBase and Hive data transformations
Created an auditing framework for 1600+ pipelines, reducing manual validation time by two days

Streamsets ETL SQL Hadoop Hive HBase Shell Scripting UNIX

Accenture Inc. Sept 2017 - Jun 2018

Associate Software Engineer - Data Engineering

Completed comprehensive training in data engineering and business intelligence, focusing on data integration, processing, and visualization tools.

Trained in Hadoop ecosystem for big data processing
Gained proficiency in SQL and ETL ingestion techniques
Learned project management using Agile and Waterfall methodologies
Developed workflows using Informatica
Built interactive dashboards and reports in Power BI

SQL ETL Informatica Power BI Hadoop

Skills

Programming Languages

Python
SQL
Shell Script
SparkSQL

Cloud Platforms

AWS
GCP
Snowflake

Big Data Technologies

Hadoop
Spark
Hive
HBase
Kafka
Airflow
NiFi

BI & Visualization

Power BI
Tableau
Looker Studio Pro

DevOps & Tools

Terraform
Git
GitHub
Bitbucket
Bamboo
CI/CD
Jira
Confluence
Control-M

Database & Operating Systems

Oracle
MySQL
Teradata
Mainframe
Informatica
Linux/Unix

Core Competencies

Data Engineering & Architecture

Data Modeling & Warehousing
ETL/ELT Pipeline Development
Data Quality & Governance
Security & Compliance

Cloud & Pipeline Engineering

Distributed Systems
Pipeline Orchestration
CI/CD Automation
Scalable Solutions

Data Operations & Management

Agile Methodologies
Project Management
Technical Documentation
Technical Training

Personal Projects

Data Lake Architecture

Designed and implemented a modern data lake architecture using AWS services. Built an end-to-end solution for data ingestion, processing, and analytics with automated governance and security controls.

AWS S3 AWS Glue Athena Python Terraform

Real-time Analytics Platform

Built a real-time analytics platform processing millions of events per day. Implemented stream processing, real-time aggregations, and interactive dashboards for monitoring key metrics.
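
The real-time aggregations mentioned above are typically computed over fixed time windows. The sketch below shows the idea in plain Python rather than Spark Streaming, so it is a stand-in for the technique and not the project's code; the event shape (epoch seconds, event type) and the 60-second window are assumptions.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=60):
    """Count events per (window_start, event_type) in fixed, non-overlapping windows.

    `events` is an iterable of (epoch_seconds, event_type) pairs.
    """
    counts = Counter()
    for timestamp, event_type in events:
        # Align each timestamp to the start of the window that contains it.
        window_start = int(timestamp) - int(timestamp) % window_seconds
        counts[(window_start, event_type)] += 1
    return dict(counts)
```

A streaming engine applies the same bucketing continuously and adds handling for late or out-of-order events, which this sketch leaves out.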

Kafka Spark Streaming Redis Grafana Docker

ML Feature Store

Developed a centralized feature store for machine learning projects. Implemented feature computation, storage, and serving layers with support for both batch and real-time feature serving.
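
To make the storage and serving layers concrete, here is a toy in-memory sketch of a feature store. It is illustrative only: the class and method names are hypothetical, and a dictionary stands in for the Redis and PostgreSQL backends listed for this project.

```python
class InMemoryFeatureStore:
    """Toy feature store: batch writes land in a dict that also serves lookups."""

    def __init__(self):
        self._features = {}  # (entity_id, feature_name) -> value

    def write_batch(self, rows):
        """Store an iterable of (entity_id, feature_name, value) triples."""
        for entity_id, name, value in rows:
            self._features[(entity_id, name)] = value

    def get_online(self, entity_id, names, default=None):
        """Return the requested features for one entity as a dict."""
        return {n: self._features.get((entity_id, n), default) for n in names}
```

Keying values by (entity, feature name) lets the batch path and the low-latency lookup path share one contract, which is the property a feature store is meant to provide.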

Python FastAPI Redis PostgreSQL MLflow

Data Quality Framework

Created an automated data quality framework with customizable rules engine. Implemented data profiling, validation, and monitoring with alerts and detailed reporting capabilities.
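
A customizable rules engine of the kind described above can be sketched as a list of named checks applied to each row. This is a minimal plain-Python illustration and does not use Great Expectations; the Rule class, the validate function, and the two example rules are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Rule:
    name: str
    column: str
    check: Callable[[Any], bool]  # returns True when the value is valid


def validate(rows, rules):
    """Return a list of (row_index, rule_name) pairs for every failed check."""
    failures = []
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.check(row.get(rule.column)):
                failures.append((index, rule.name))
    return failures


# Example rules; a missing column is passed to the check as None.
RULES = [
    Rule("id_not_null", "id", lambda v: v is not None),
    Rule("amount_non_negative", "amount",
         lambda v: isinstance(v, (int, float)) and v >= 0),
]
```

Because each rule is just data plus a callable, new checks can be added without touching the validation loop, and the failure list can feed alerting or reporting downstream.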

Great Expectations Airflow dbt Snowflake Slack API

IoT Data Pipeline

Built a scalable IoT data pipeline handling sensor data from thousands of devices. Implemented real-time processing, anomaly detection, and predictive maintenance capabilities.
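
The portfolio does not specify which anomaly detection method was used. One simple option for sensor streams, shown here as an assumption, is a rolling z-score: flag a reading when it deviates strongly from the mean of the preceding readings. The window size and threshold are illustrative defaults.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings more than `threshold` standard deviations
    away from the mean of the preceding `window` readings."""
    history = deque(maxlen=window)
    anomalies = []
    for index, value in enumerate(readings):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            # A flat history has zero spread, so no z-score can be computed.
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(index)
        history.append(value)
    return anomalies
```

Note that a flagged reading still enters the history, so one large spike temporarily widens the spread and makes the next few readings harder to flag; a production version might exclude flagged values from the window.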

AWS IoT Kinesis Lambda Timestream Python

Education

Anna University, Panimalar Engineering College, Chennai Aug 2013 - Aug 2017

Bachelor of Engineering in Electrical and Electronics Engineering

Completed undergraduate studies with First Class, maintaining consistent academic performance throughout.

Final Year Project

DC-DC Converter for BLDC Motor Using Ultracapacitor & Battery

Designed and implemented a hybrid power supply system to improve efficiency and performance in Brushless DC motor applications.

Beyond Code

Hiking

Hiking and trekking in the Himalayas. Love exploring new trails and challenging myself.

Basketball

Occasionally play basketball and enjoy staying active through the sport.

Geopolitics

Listen to a lot of geopolitical news and analysis.

Beyond my professional life, I'm an avid hiker, basketball player, and a keen follower of global affairs. I believe in maintaining a healthy work-life balance and continuously expanding my horizons through new experiences and challenges.

Let's Connect

Email LinkedIn GitHub Resume
