Leadtech

Junior Data Engineer - Mobile Apps

Leadtech · Remote

Full-time · Lead · Python · Java · GCP · Docker


About the Role

We are looking for a Junior Data Engineer to help design, develop, and optimize our data infrastructure on Databricks. You will be involved in building pipelines with BigQuery, Google Cloud Storage, Apache Airflow, dbt, Dataflow, and Pub/Sub, ensuring high availability and performance across our ETL/ELT processes. A successful candidate has knowledge of cloud-native data solutions, experience with ETL/ELT frameworks, and a passion for building robust, cost-effective pipelines.

Key Responsibilities

Data Architecture

- Support the development and maintenance of our data platform on GCP, including data warehousing in BigQuery/Databricks and data lake storage in Google Cloud Storage.
- Help organize data into clear layers and domain-focused Data Marts for analytics and reporting.
- Assist with Terraform-based Infrastructure as Code to provision and manage cloud resources consistently.
- Contribute to batch and near-real-time data workflows with a focus on reliability, scalability, and cost awareness.

Pipeline Development & Orchestration

- Build, maintain, and improve ETL/ELT pipelines under guidance, using Apache Airflow for workflow orchestration.
- Develop and maintain dbt transformations to create clean, version-controlled data models in BigQuery.
- Support data ingestion and processing using tools such as Google Dataflow, Apache Beam, or Pub/Sub where needed.
- Monitor scheduled jobs, troubleshoot failures, and help ensure data is delivered on time for analytics and reporting.

Data Quality, Governance & Security

- Help implement and maintain data quality checks using Great Expectations, dbt tests, or similar tools.
- Support documentation of datasets, metadata, lineage, and audit processes.
- Follow security best practices, including IAM, encryption, and secure handling of sensitive data.
- Assist in maintaining compliance with data privacy and governance requirements such as GDPR or CCPA.

Data Science & Analytics Enablement

- Partner with Analytics, Product, and Data Science teams to provide reliable datasets for dashboards, reporting, and experimentation.
- Help maintain Data Marts that support key business domains and stakeholder needs.
- Support data availability and accessibility for analytics and machine learning use cases.
- Learn from senior team members and grow into owning larger parts of the data platform over time.

Requirements

Experience

- 1+ year of experience in data engineering or a related data role.
- Exposure to mobile, product, or marketing data is a plus.

Technical Expertise with the GCP Stack

- Basic hands-on experience with GCP services such as BigQuery and Google Cloud Storage.
- Familiarity with Apache Airflow for scheduling and orchestrating data workflows.
- Some experience with dbt or similar transformation tools.
- Exposure to Pub/Sub, Dataflow, or other batch/streaming tools is a plus.
- Understanding of Data Mart concepts and interest in Infrastructure as Code tools such as Terraform.

Programming & Containerization

- Good coding skills in Python; Java or Scala is a plus.
- Ability to write scripts for automation and data processing tasks.
- Familiarity with Docker and basic container concepts.
- Exposure to CI/CD and version control workflows such as GitHub Actions, GitLab CI, Jenkins, or similar.

Data Quality & Governance

- Understanding of data quality principles; experience with dbt tests, Great Expectations, or similar tools is a plus.
- Basic knowledge of data governance concepts such as lineage, metadata, and access control.
- Awareness of privacy and compliance principles such as GDPR is a plus.
- General understanding of OLTP and OLAP systems.
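To illustrate the kind of data quality checks mentioned above (not-null and uniqueness tests of the sort dbt tests or Great Expectations provide), here is a minimal, hypothetical sketch in plain Python; the function and column names are illustrative, not the team's actual tooling:

```python
# Minimal sketch of two common data quality checks (illustrative only;
# in practice these would be dbt tests or Great Expectations suites).

def check_not_null(rows, column):
    """Return True if no row has a null/missing value in `column`."""
    return all(row.get(column) is not None for row in rows)

def check_unique(rows, column):
    """Return True if every value in `column` is distinct."""
    values = [row.get(column) for row in rows]
    return len(values) == len(set(values))

# Hypothetical mobile-app event rows.
events = [
    {"user_id": 1, "event": "install"},
    {"user_id": 2, "event": "open"},
]

print(check_not_null(events, "user_id"))  # True
print(check_unique(events, "user_id"))    # True
```

In a real pipeline, failures of checks like these would typically block a downstream job or raise an alert rather than just print a boolean.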
Communication

- Clear communication skills and willingness to work closely with technical and non-technical stakeholders.
- Organized, proactive, and eager to learn.
- Strong problem-solving mindset and attention to detail.

Preferred Skills

- Interest in machine learning workflows and exposure to tools such as Vertex AI or similar ML platforms.
- Familiarity with monitoring and observability tools such as Prometheus, Grafana, Datadog, or New Relic.
- Basic awareness of security and compliance best practices.
- Exposure to real-time or streaming data tools such as Kafka, Spark Streaming, or similar technologies is a plus.
- Relevant cloud or data certifications, especially GCP certifications, are a plus.
- Experience working with LLMs and automation is a plus.

Benefits

Why should you join us?

Growth and career development
At Leadtech (https://himalayas.app/companies/leadtech), we prioritize your growth. Enjoy a flexible career path with personalized internal training and an annual budget for external learning opportunities.

Work-life balance
Benefit from flexible start and end times and the option of working fully remote or from our Barcelona office. Enjoy free Friday afternoons with a 7-hour workday, plus a 35-hour workweek in July and August so you can savor summer!

Comprehensive benefits
Competitive salary, full-time permanent contract, and top-tier private health insurance (including dental and psychological services). 25 days of vacation plus your birthday.
