Big Data Engineer · Technical Lead · Solution Architect

I build data platforms that scale under real workloads, and stay close enough to the system to know when something needs to change.

Extensive experience delivering enterprise-grade analytics, ETL / ELT, and large-scale observability systems across GCP, Azure, AWS, and on-premise environments, from architecture design through to hands-on implementation.

Focus areas

  • Scalable data platforms
  • Real-time log ingestion
  • Observability architecture
  • Distributed ETL / ELT
  • Platform engineering & CI/CD
  • Technical leadership

7+ years in software & data engineering
1TB/day: real-time log volume designed for production systems
Multi-cloud: proven delivery across GCP, Azure, AWS, and on-prem
Architecture: delivery, pre-sales, and cross-functional technical leadership

01

Profile

Senior Data Engineer and Technical Lead with a track record of building and evolving large-scale data platforms across cloud and on-premise environments. Designed real-time ingestion and observability systems handling up to 1TB/day, and led platform improvements that touched architecture, deployment efficiency, and codebase maintainability, not just feature delivery.

Comfortable moving between architecture decisions and hands-on implementation: whether that's designing a metadata-driven DAG system from the ground up, tuning Elasticsearch shard allocation under production load, or reworking CI pipelines to cut deployment overhead. I care most about systems that stay maintainable and operationally reliable as they grow, not just systems that work on day one.

02

Strengths

01 / Architecture

Systems designed to scale, not just to ship

From distributed ingestion to observability pipelines, I approach architecture with reliability, fault tolerance, and operational longevity as first-class constraints, not afterthoughts.

02 / Hands-on Engineering

Still in the code when it matters

Leading a team doesn't mean stepping back from technical problems. I stay directly involved: implementing critical paths, debugging production issues, and improving the systems I design.

03 / Leadership

Engineering ownership, end to end

I translate business requirements into technical decisions, keep teams aligned through delivery, and take ownership of platform quality, from architecture decisions down to codebase maintainability.

03

Experience Timeline

Platform engineering · Technical lead

Data Engineering Tech Lead @ CMC Global, Hanoi

Feb 2026 – Present

Azure · Databricks · Power BI · Python
  • Joined the project as a hands-on Senior Data Engineer, contributing directly to feature development and data workflows before stepping into the Technical Lead role.
  • Refactored and standardized helper modules across the codebase, improving readability, maintainability, and engineering consistency, while extracting reusable vendor-specific logic into shared libraries to reduce duplication.
  • Designed and implemented a metadata-driven master DAG architecture to replace isolated job orchestration, improving scalability and operational consistency across the platform.
  • Collaborated closely with stakeholders to align platform improvements with reporting requirements and delivery timelines for a global automotive client in the Korean market.
  • Redesigned Azure DevOps CI/CD pipelines around an incremental deployment strategy, cutting deployment overhead and tightening the delivery cycle across the team.
  • Built and optimized Databricks incremental refresh jobs to improve Power BI data freshness without increasing compute cost.
  • Supported team members in understanding the metadata-driven architecture and CI/CD approach, helping the team take ownership of the patterns independently.

Observability architecture · Infrastructure

Technical Lead & Solution Architect: Observability Platform @ CMC Global, Hanoi

Jul 2025 – Mar 2026

⭐ Rising Star: Technical Lead, CMC Global
Elastic · Logstash · Elasticsearch · Kibana
  • Served as Technical Lead and Solution Architect for a large-scale log ingestion and observability platform built for a semiconductor manufacturing environment.
  • Architected an on-premise, air-gapped Elastic Stack system handling up to 1TB/day log volume with near real-time processing (30–120s latency).
  • Directly configured inter-node networking and cluster communication across distributed Elasticsearch infrastructure, from network binding to transport layer settings.
  • Designed and implemented Index Lifecycle Management (ILM) policies, index templates, and rollover rules to govern large-scale log retention, storage tiering (hot/warm/cold), and query performance.
  • Tuned indexing strategy, shard allocation, and disk utilization under sustained high-volume workloads to maintain cluster stability and operational efficiency.
  • Monitored and optimized Logstash ingestion throughput and pipeline performance, resolving bottlenecks under production load conditions.
  • Built resilient ingestion pipelines with Persistent Queue and Dead Letter Queue to ensure zero data loss during interruptions.
  • Led security design across the stack β€” network segmentation, HTTPS enforcement, and role-based access control (RBAC).
  • Delivered dashboards, alerting, and anomaly detection capabilities in Kibana for operational teams.

Database architecture · Migration

Technical Lead: Database Migration & Redesign @ CMC Global, Hanoi

Mar 2025 – Jul 2025

MongoDB Atlas · Amazon Aurora · Prisma · TypeScript
  • Led the database team in a cross-functional database migration project, transitioning from MongoDB Atlas to Amazon Aurora.
  • Designed and implemented data migration strategies ensuring data consistency, integrity, and minimal downtime.
  • Leveraged Prisma ORM with TypeScript to redesign data access layers and optimize query performance.
  • Collaborated closely with backend teams to refactor services and align with relational database design principles.
  • Identified and resolved schema mismatches between NoSQL and relational models, ensuring seamless migration.

Pre-sales · Solution design

Technical Lead: PoC & Solution Architecture @ CMC Global, Hanoi

Jul 2024 – Dec 2024

Elastic Stack · Elasticsearch · Kibana
  • Acted as Technical Lead for a critical Proof-of-Concept project to validate a large-scale log management and analytics solution for a semiconductor manufacturing system.
  • Led the end-to-end solution design and proposal process, working closely with stakeholders to define system requirements and architecture using Elastic Stack.
  • Designed and delivered a production-grade PoC system that was actively used by the client on a daily basis, serving as a foundation for the full-scale implementation.
  • Conducted technical presentations, solution demonstrations, and architecture walkthroughs to convince stakeholders and secure project buy-in.
  • Played a key role in technical proposal writing, estimation, and solution positioning, contributing directly to business acquisition.
  • Led and coordinated the engineering team, ensuring alignment between business requirements and technical implementation.
  • Built strong experience in client communication, negotiation, and cross-functional collaboration in a pre-sales / solutioning context.

Analytics engineering · Distributed data

Data Engineer: Fleet Analytics Platform @ NTT Data VDS., Hanoi

Jan 2024 – Jul 2024

Azure · Databricks · Elasticsearch · Kibana · Python

Embedded in a Kanban team responsible for automated fleet measurement analytics, working alongside Vietnamese and European colleagues in a cross-timezone delivery setup.

  • Built data ingestion and processing pipelines on Azure, using Data Factory for orchestration and Databricks for large-scale analysis of terabyte-range datasets from data lakes.
  • Designed transformation workflows to handle petabytes of raw data, applying distributed processing principles to maintain throughput and reliability at scale.
  • Used Elasticsearch and Kibana for exploratory data analysis and stakeholder-facing visualizations, bridging raw data and operational insight.
  • Collaborated across a mixed Vietnamese–European team, contributing to both technical delivery and cross-cultural project alignment.

Backend architecture · AWS microservices

Software Engineering Team Lead: Microservices Platform @ NTT Data VDS., Hanoi

Jun 2023 – Dec 2023

AWS · Java · SQL
  • Led a 5-person Vietnamese engineering team delivering a talent management application on AWS, built around a microservice architecture.
  • Designed a resilient database layer using DocumentDB with auto-scaling, and architected Spring Boot microservices for high availability under variable load.
  • Configured and integrated core AWS services (DocumentDB, S3, EC2) directly into backend microservices, handling both infrastructure decisions and implementation.
  • Implemented AWS Video on Demand for large-scale video streaming and built reusable Spring Boot API modules shared across services.
  • Engineered a CI/CD workflow on EKS and GitLab CI that enabled parallel deployment of five microservices with consistent reliability.

Data platform · GCP · Team growth

Data Engineering Team Lead: GCP Analytics Platform @ NTT Data VDS., Hanoi

Oct 2021 – May 2023

GCP · BigQuery · Airflow · Scala · Python
  • Grew the Vietnamese sub-team from 2 to 8 engineers while coordinating closely with EU counterparts across multiple active delivery tracks.
  • Designed and implemented distributed ETL pipelines using Google Cloud Dataflow, Scala, and Scio, ingesting millions of streaming records from Salesforce, Siebel, and enterprise sources into GCS.
  • Architected BigQuery data models with star schema design, optimizing complex queries across dozens of dimension tables and high-volume fact tables at terabyte scale.
  • Ensured ACID compliance for large transactions and drove a significant architecture migration from wildcard tables to Slowly Changing Dimension (SCD Type 2) implementations in BigQuery.
  • Orchestrated ETL workflows and parallel BigQuery jobs using Apache Airflow (Cloud Composer), improving pipeline reliability and scheduling efficiency.
  • Mentored engineers on distributed data workflows, Airflow orchestration patterns, and BigQuery modeling, shortening the onboarding ramp-up and raising the team's ability to work independently on complex pipelines.

Full-stack · Java · Early career

Software Engineer: Backend & Full-stack Development @ NTT Data VDS., Hanoi

May 2021 – Oct 2021

Java · TypeScript · SQL
  • Built Java reactive backend services with Quarkus and frontend features with TypeScript and Angular, working within a Scrum team on a mock testing application.
  • Implemented authentication and authorization with Keycloak, integrating security features across backend and Angular components.

Internship

Intern Software Engineer @ VTI Vietnam, Hanoi

Jul 2020 – Nov 2020

Java · SQL
  • Optimized an email notification feature to handle 5,000 emails per second using multi-threading and Thymeleaf templates.
  • Collaborated on a Spring Boot / Vue.js web application, contributing to backend services and SQL procedures under senior engineering guidance.

Starting point

Intern Software Engineer @ SODO Asia, Hanoi

Feb 2019 – Oct 2019

Java · TypeScript · MongoDB
  • Developed FAQ and barcode scanning features with Angular and Spring Boot for a web and mobile platform serving 30,000 monthly users.
  • Integrated RabbitMQ for async microservice communication and accelerated search with MongoDB full-text indexing.

04

Education

2016 → 2021

Engineer, Posts and Telecommunications Institute of Technology

Oct 2016 – Sept 2021

05

Highlights

  • Designed and operated data and observability systems processing up to 1TB/day in production.
  • Delivered across GCP, Azure, AWS, and air-gapped on-premise environments.
  • Led platform evolution: from architecture and system design to CI/CD optimization and codebase quality.
  • Hands-on with Databricks, Elasticsearch, BigQuery, Airflow, and distributed ETL at scale.
  • Led cross-functional engineering teams in delivery, pre-sales, and platform migration contexts.