Big Data Engineer · Technical Lead · Solution Architect
I build data platforms that scale under real workloads, and stay close enough to the system to know when something needs to change.
Extensive experience delivering enterprise-grade analytics, ETL / ELT, and large-scale observability systems across GCP, Azure, AWS, and on-premise environments, from architecture design through to hands-on implementation.
Focus areas
- Scalable data platforms
- Real-time log ingestion
- Observability architecture
- Distributed ETL / ELT
- Platform engineering & CI/CD
- Technical leadership
Profile
Senior Data Engineer and Technical Lead with a track record of building and evolving large-scale data platforms across cloud and on-premise environments. Designed real-time ingestion and observability systems handling up to 1TB/day, and led platform improvements that touched architecture, deployment efficiency, and codebase maintainability, not just feature delivery.
Comfortable moving between architecture decisions and hands-on implementation: whether that's designing a metadata-driven DAG system from the ground up, tuning Elasticsearch shard allocation under production load, or reworking CI pipelines to cut deployment overhead. I care most about systems that stay maintainable and operationally reliable as they grow, not just systems that work on day one.
Strengths
01 / Architecture
Systems designed to scale, not just to ship
From distributed ingestion to observability pipelines, I approach architecture with reliability, fault tolerance, and operational longevity as first-class constraints, not afterthoughts.
02 / Hands-on Engineering
Still in the code when it matters
Leading a team doesn't mean stepping back from technical problems. I stay directly involved: implementing critical paths, debugging production issues, and improving the systems I design.
03 / Leadership
Engineering ownership, end to end
I translate business requirements into technical decisions, keep teams aligned through delivery, and take ownership of platform quality, from architecture decisions down to codebase maintainability.
Experience Timeline
Platform engineering · Technical lead
Data Engineering Tech Lead @ CMC Global, Hanoi
Feb 2026 – Present
- Joined the project as a hands-on Senior Data Engineer, contributing directly to feature development and data workflows before stepping into the Technical Lead role.
- Refactored and standardized helper modules across the codebase to improve readability, maintainability, and engineering consistency, and extracted reusable vendor-specific logic into shared libraries to reduce duplication.
- Designed and implemented a metadata-driven master DAG architecture to replace isolated job orchestration, improving scalability and operational consistency across the platform.
- Collaborated closely with stakeholders to align platform improvements with reporting requirements and delivery timelines for a global automotive client in the Korean market.
- Redesigned Azure DevOps CI/CD pipelines around an incremental deployment strategy, cutting deployment overhead and tightening the delivery cycle across the team.
- Built and optimized Databricks incremental refresh jobs to improve Power BI data freshness without increasing compute cost.
- Supported team members in understanding the metadata-driven architecture and CI/CD approach, helping the team take ownership of the patterns independently.
Observability architecture · Infrastructure
Technical Lead & Solution Architect – Observability Platform @ CMC Global, Hanoi
Jul 2025 – Mar 2026
- Served as Technical Lead and Solution Architect for a large-scale log ingestion and observability platform built for a semiconductor manufacturing environment.
- Architected an on-premise, air-gapped Elastic Stack system handling up to 1TB/day log volume with near real-time processing (30–120s latency).
- Directly configured inter-node networking and cluster communication across distributed Elasticsearch infrastructure, from network binding to transport layer settings.
- Designed and implemented Index Lifecycle Management (ILM) policies, index templates, and rollover rules to govern large-scale log retention, storage tiering (hot/warm/cold), and query performance.
- Tuned indexing strategy, shard allocation, and disk utilization under sustained high-volume workloads to maintain cluster stability and operational efficiency.
- Monitored and optimized Logstash ingestion throughput and pipeline performance, resolving bottlenecks under production load conditions.
- Built resilient ingestion pipelines with Persistent Queue and Dead Letter Queue to ensure zero data loss during interruptions.
- Led security design across the stack: network segmentation, HTTPS enforcement, and role-based access control (RBAC).
- Delivered dashboards, alerting, and anomaly detection capabilities in Kibana for operational teams.
Database architecture · Migration
Technical Lead – Database Migration & Redesign @ CMC Global, Hanoi
Mar 2025 – Jul 2025
- Led the database team in a cross-functional database migration project, transitioning from MongoDB Atlas to Amazon Aurora.
- Designed and implemented data migration strategies ensuring data consistency, integrity, and minimal downtime.
- Leveraged Prisma ORM with TypeScript to redesign data access layers and optimize query performance.
- Collaborated closely with backend teams to refactor services and align with relational database design principles.
- Identified and resolved schema mismatches between NoSQL and relational models, ensuring seamless migration.
Pre-sales · Solution design
Technical Lead – PoC & Solution Architecture @ CMC Global, Hanoi
Jul 2024 – Dec 2024
- Acted as Technical Lead for a critical Proof-of-Concept project to validate a large-scale log management and analytics solution for a semiconductor manufacturing system.
- Led the end-to-end solution design and proposal process, working closely with stakeholders to define system requirements and architecture using Elastic Stack.
- Designed and delivered a production-grade PoC system that was actively used by the client on a daily basis, serving as a foundation for the full-scale implementation.
- Conducted technical presentations, solution demonstrations, and architecture walkthroughs to convince stakeholders and secure project buy-in.
- Played a key role in technical proposal writing, estimation, and solution positioning, contributing directly to business acquisition.
- Led and coordinated the engineering team, ensuring alignment between business requirements and technical implementation.
- Developed strong client communication, negotiation, and cross-functional collaboration skills in a pre-sales / solutioning context.
Analytics engineering · Distributed data
Data Engineer – Fleet Analytics Platform @ NTT Data VDS., Hanoi
Jan 2024 – Jul 2024
Embedded in a Kanban team responsible for automated fleet measurement analytics, working alongside Vietnamese and European colleagues in a cross-timezone delivery setup.
- Built data ingestion and processing pipelines on Azure, using Data Factory for orchestration and Databricks for large-scale analysis of terabyte-range datasets from data lakes.
- Designed transformation workflows to handle petabytes of raw data, applying distributed processing principles to maintain throughput and reliability at scale.
- Used Elasticsearch and Kibana for exploratory data analysis and stakeholder-facing visualizations, bridging raw data and operational insight.
- Collaborated across a mixed Vietnamese–European team, contributing to both technical delivery and cross-cultural project alignment.
Backend architecture · AWS microservices
Software Engineering Team Lead – Microservices Platform @ NTT Data VDS., Hanoi
Jun 2023 – Dec 2023
- Led a 5-person Vietnamese engineering team delivering a talent management application on AWS, built around a microservice architecture.
- Designed a resilient database layer using DocumentDB with auto-scaling, and architected Spring Boot microservices for high availability under variable load.
- Configured and integrated core AWS services (DocumentDB, S3, EC2) directly into backend microservices, handling both infrastructure decisions and implementation.
- Implemented AWS Video on Demand for large-scale video streaming and built reusable Spring Boot API modules shared across services.
- Engineered a CI/CD workflow on EKS and GitLab CI that enabled parallel deployment of five microservices with consistent reliability.
Data platform · GCP · Team growth
Data Engineering Team Lead – GCP Analytics Platform @ NTT Data VDS., Hanoi
Oct 2021 – May 2023
- Grew the Vietnamese sub-team from 2 to 8 engineers while coordinating closely with EU counterparts across multiple active delivery tracks.
- Designed and implemented distributed ETL pipelines using Google Cloud Dataflow, Scala, and Scio, ingesting millions of streaming records from Salesforce, Siebel, and enterprise sources into GCS.
- Architected BigQuery data models with star schema design, optimizing complex queries across dozens of dimension tables and high-volume fact tables at terabyte scale.
- Ensured ACID compliance for large transactions and drove a significant architecture migration from wildcard tables to Slowly Changing Dimension (SCD Type 2) implementations in BigQuery.
- Orchestrated ETL workflows and parallel BigQuery jobs using Apache Airflow (Cloud Composer), improving pipeline reliability and scheduling efficiency.
- Mentored engineers on distributed data workflows, Airflow orchestration patterns, and BigQuery modeling, shortening onboarding ramp-up and raising the team's ability to work independently on complex pipelines.
Full-stack · Java · Early career
Software Engineer – Backend & Full-stack Development @ NTT Data VDS., Hanoi
May 2021 – Oct 2021
- Built Java reactive backend services with Quarkus and frontend features with TypeScript and Angular, working within a Scrum team on a mock testing application.
- Implemented authentication and authorization with Keycloak, integrating security features across backend and Angular components.
Internship
Intern Software Engineer @ VTI Vietnam, Hanoi
Jul 2020 – Nov 2020
- Optimized an email notification feature to handle 5,000 emails per second using multi-threading and Thymeleaf templates.
- Collaborated on a Spring Boot / Vue.js web application, contributing to backend services and SQL procedures under senior engineering guidance.
Starting point
Intern Software Engineer @ SODO Asia, Hanoi
Feb 2019 – Oct 2019
- Developed FAQ and barcode scanning features with Angular and Spring Boot for a web and mobile platform serving 30,000 monthly users.
- Integrated RabbitMQ for async microservice communication and accelerated search with MongoDB full-text indexing.
Education
2016 – 2021
Engineer, Posts and Telecommunications Institute of Technology
Oct 2016 – Sep 2021
Highlights
- Designed and operated data and observability systems processing up to 1TB/day in production.
- Delivered across GCP, Azure, AWS, and air-gapped on-premise environments.
- Led platform evolution: from architecture and system design to CI/CD optimization and codebase quality.
- Hands-on with Databricks, Elasticsearch, BigQuery, Airflow, and distributed ETL at scale.
- Led cross-functional engineering teams in delivery, pre-sales, and platform migration contexts.