About the Role
We're looking for a Data Engineer to build and maintain the data pipelines that turn scholarly publishing activity into insights for our clients — some of the most recognized names in academic research. Our platform runs on Azure (Synapse, Data Factory, Confluent), and we're actively evaluating a move to a modern lakehouse architecture. You'll bring solid experience in data engineering fundamentals, contribute to the current platform from day one, and learn the systems deeply.
You'll be joining a small, senior analytics team at Silverchair, a company with a long-established presence in scholarly publishing and the agility of a nimble software organization. The team operates with high autonomy, strong support from leadership, and real ownership of the platform you work on. You'll work alongside a Senior Data Engineer, a Senior Quality Engineer, and a Business Analyst.
About Silverchair
Silverchair is the premier independent platform partner for scholarly and professional publishers, dedicated to expanding the reach of the world's most valuable knowledge. By connecting creators, publishers, and users, we amplify the impact of scholarship and enhance the accessibility of critical information. Our global teams develop, build, and host websites, online products, and digital libraries for prestigious publishers, including the American Medical Association, MIT Press, and Oxford University Press.
DEI Statement
At Silverchair, we celebrate and embrace diversity in all its forms. We are committed to fostering an inclusive environment from the moment you consider joining our team. We actively encourage candidates from diverse backgrounds to apply, believing that a variety of perspectives and experiences enriches our community, drives innovation, and strengthens our impact.
Equity and inclusion are at the core of our hiring practices, and we strive to build a team that reflects a broad spectrum of cultures, experiences, and viewpoints. We are particularly committed to increasing representation from groups historically underrepresented in technology careers. Your unique experiences and perspectives are not just welcomed but are integral to our collective success. Join us in our mission to create a culture that unites and brings out the best in all of us. Learn more about our commitment to diversity, equity, and inclusion at Silverchair.
Our Tech Stack
• Streaming ingestion: Confluent (Kafka)
• Pipelines and orchestration: Azure Data Factory
• Transformation: Spark / PySpark and SQL stored procedures
• Data warehouse: Azure Synapse Analytics (Dedicated SQL Pool)
• Future direction: Actively evaluating modern lakehouse platforms (Databricks, Microsoft Fabric)
Essential Functions
• Data Pipeline Development: Design, build, and maintain data pipelines that ensure reliable data flow from source systems through transformation layers to reporting. Integrate data quality checks and validation into the pipeline workflow. Implement error handling, logging, and retry capabilities to keep pipelines robust and recoverable.
• Data Transformation & Modeling: Develop SQL and Python-based transformations that cleanse, enrich, and structure data for analytical use. Design and implement dimensional models including fact tables and dimension tables.
• Performance & Optimization: Monitor and tune pipeline and query performance. Use execution plans and profiling tools to identify bottlenecks and improve throughput and efficiency.
• Production Support: Troubleshoot and resolve production data issues using logs, monitoring tools, and systematic debugging. Ensure pipelines run reliably and data is delivered on schedule.
• Collaboration & Documentation: Work closely with your scrum team and cross-functional partners across analytics, product, and engineering. Document pipeline designs, data lineage, and business rules. Participate in code reviews and contribute to team knowledge sharing.
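To give a flavor of what "error handling, logging, and retry capabilities" looks like in practice, here is a minimal sketch in plain Python — no Azure dependencies, and all names (`run_with_retry`, `load_daily_usage`) are illustrative, not Silverchair code:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def run_with_retry(step, *, attempts=3, backoff_seconds=2.0):
    """Run a pipeline step, retrying failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            result = step()
            log.info("step %s succeeded on attempt %d", step.__name__, attempt)
            return result
        except Exception as exc:  # in production, catch only transient error types
            log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # surface the failure so orchestration can alert and rerun
            time.sleep(backoff_seconds * 2 ** (attempt - 1))

def load_daily_usage():
    """Hypothetical extract step; a real step would pull from a source system."""
    return [{"article_id": 1, "views": 42}]

rows = run_with_retry(load_daily_usage)
```

In a real pipeline the orchestrator (e.g. Azure Data Factory's activity retry policy) often supplies the retry layer, but the same idempotent, log-everything mindset applies either way.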
Required Skills
• SQL Proficiency: Strong SQL skills including complex joins, CTEs, window functions, aggregations, views, functions, and stored procedures. Awareness of execution plans and indexing strategies for writing performant queries.
• Python Development: Ability to write clean, modular Python using functions and classes. Experience with data engineering libraries such as PySpark for data transformation and processing.
• Data Modeling & Warehousing: Experience designing dimensional models (star schema, fact/dimension tables). Understanding of data warehouse architecture concepts and layered data organization patterns.
• ETL/ELT Pipeline Development: Hands-on experience building data pipelines with orchestration tools. Familiarity with incremental/delta loading patterns, error handling, and idempotent pipeline design.
• Azure Data Platform (3-5 years): Production experience with Azure Data Factory and Azure Synapse Analytics (Dedicated SQL Pool, Serverless, Spark) is required. Exposure to Microsoft Fabric, Databricks, or dbt is a plus as we evaluate our future platform direction.
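As a concrete sense of the SQL level expected (CTEs combined with a window function), here is a small self-contained example using Python's built-in `sqlite3` module (SQLite 3.25+ supports window functions); the table and data are invented purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE downloads (journal TEXT, month TEXT, count INTEGER);
INSERT INTO downloads VALUES
  ('JAMA', '2024-01', 100), ('JAMA', '2024-02', 150),
  ('MITP', '2024-01', 80),  ('MITP', '2024-02', 60);
""")

# The CTE aggregates monthly totals; the window function ranks journals per month.
query = """
WITH monthly AS (
    SELECT journal, month, SUM(count) AS total
    FROM downloads
    GROUP BY journal, month
)
SELECT journal, month, total,
       RANK() OVER (PARTITION BY month ORDER BY total DESC) AS rnk
FROM monthly
ORDER BY month, rnk;
"""
rows = conn.execute(query).fetchall()
# rows → [('JAMA', '2024-01', 100, 1), ('MITP', '2024-01', 80, 2),
#         ('JAMA', '2024-02', 150, 1), ('MITP', '2024-02', 60, 2)]
```

The same pattern — aggregate in a CTE, rank or deduplicate with a window function — carries over directly to Synapse SQL and Spark SQL.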