Core Competencies
Data Engineering and Pipeline Development
- Deep, hands-on experience with Google BigQuery — including dataset design, partitioning/clustering strategies, materialized views, and cost-optimization techniques.
- Proficiency in Cloud Composer (Apache Airflow) for orchestrating complex, production-grade data pipelines with proper scheduling, retry logic, and dependency management.
- Experience building and maintaining Vertex AI Pipelines for ML workflows and data transformation at scale.
- Advanced SQL skills: able to write complex, performant, and maintainable queries over large datasets, using window functions, CTEs, and recursive queries, and to tune query performance.
- Strong Python proficiency — comfortable building data transformation scripts, pipeline logic, custom Airflow operators, API integrations, and automation tooling.
Data Architecture and Scalable Design
- Proven ability to design layered data architectures using patterns such as Medallion (bronze/silver/gold), Dimensional Modeling (star schema), Data Vault, and targeted denormalization, with the judgment to apply each where the use case calls for it.
- Track record of building modular, multi-purpose datasets rather than project-specific tables, thinking in terms of canonical models and shared dimensions.
- Understands when to create new tables versus when to extend, view, or restructure existing assets to avoid unnecessary duplication and table sprawl.
- Applies best practices around naming conventions, schema organization, documentation, and lifecycle management so that the architecture remains navigable as it scales.
Tableau Dashboard Development
- Hands-on experience building production-quality Tableau dashboards — from data source configuration and extract optimization to interactive visual design.
- Ability to translate business questions into clear, intuitive visualizations that non-technical stakeholders can explore on their own.
- Familiarity with Tableau performance tuning, published data sources, and server/cloud publishing workflows.
- Understands the relationship between upstream data modeling decisions and downstream dashboard performance — designs the data layer with the visualization in mind.
Technical Stack
- Cloud Platform: Google Cloud Platform (GCP)
- Data Warehouse: BigQuery (advanced)
- Orchestration: Cloud Composer / Apache Airflow
- ML Pipelines: Vertex AI Pipelines
- Visualization: Tableau (Desktop, Server/Cloud)
- Languages: Python (advanced), SQL (advanced)
- Infrastructure: Terraform (preferred), GCS, Cloud Functions
- Version Control: Git / GitLab
Experience and Qualifications
- 5+ years in a data engineering role, with meaningful GCP/BigQuery experience
- Advanced proficiency in Python and SQL as daily working languages
- Demonstrated experience designing and maintaining shared, reusable data models in an enterprise or multi-team environment
- Familiarity with data architecture patterns including Medallion, star schema, and Data Vault
- Portfolio or examples of Tableau dashboards built on well-structured data layers
- Familiarity with CI/CD practices for data pipelines and infrastructure-as-code concepts
- Strong communicator who can work with cross-functional teams to gather requirements and translate them into scalable data solutions