About the Role
THE ORGANIZATION
The Charter School Growth Fund (CSGF) is a leading nonprofit venture philanthropy fund that has spent 20+ years identifying high-quality public charter schools and investing in their growth. Today the portfolio spans 200+ networks, 1,700+ schools, and more than 840,000 students.
This role sits within a new public data infrastructure project that CSGF is incubating alongside its core operations. The project is designed to evolve our internal assessment data pipeline into free, open infrastructure that researchers, funders, policymakers, and school networks can use independently. School performance data is currently fragmented across dozens of state agencies, inconsistently formatted, and practically inaccessible to anyone without significant technical resources. This platform processes standardized assessment data across 40+ charter states, calculates school performance metrics, and publishes results through a public-facing portal. A core part of our vision is building toward an open source codebase, and we're looking for someone who is excited about ultimately public-facing technical work, not just internal tooling.
THE OPPORTUNITY
About Our Team
This role sits within a small, dedicated team building public data infrastructure for the education investing, research, and policy community. The team operates as a focused engineering and data function: processing state assessment files, maintaining the data pipeline, publishing metrics, and keeping the platform and its documentation current. Everyone on the team works close to the data and close to the code.
The team is organized around four core functions:
• Data Collection and Management
• Infrastructure and Engineering
• Data Validation and QA
• Publishing and Documentation
Our Analytics Stack
• DuckDB + dbt (data modeling)
• Python (metrics and validation)
• Quarto (portal and documentation)
About This Role
As an Analytics Engineer, you will help build and maintain the data models, validation scripts, and documentation that the platform depends on. This is a one-year contract role with the possibility of extension. The role reports to the Vice President, who owns product direction and architecture. The team is highly collaborative, and there are always opportunities to develop skills outside the core responsibilities of an individual role.
The project runs on an annual state release cycle, with the bulk of new assessment data arriving in the summer months. This role will need to be actively contributing to parser updates and dbt model changes within the first four to six weeks of starting. We are looking for someone who can orient in an unfamiliar codebase quickly and move from observation to independent contribution without an extended onboarding period.
KEY RESPONSIBILITIES
Below is a general outline of responsibilities. Roles and responsibilities may evolve to meet the needs of the project.
Data Modeling and Transformation
• Build and maintain dbt models that transform raw state assessment files into clean, analysis-ready datasets
• Develop new state parsers for assessment files that vary significantly in format, structure, and quality across states. This requires independent problem solving, not just following established patterns.
• Contribute to school and CMO classification reference datasets, keeping them accurate as the underlying data evolves
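Because state files vary in format and structure, parser work like the above usually means mapping each state's raw layout onto a shared schema. A minimal sketch of that pattern, assuming one parser per state; the raw column names and output fields here are invented for illustration, not the project's actual schema:

```python
# Hypothetical per-state parser sketch. The raw headers ("Campus Number",
# "Pct Met Standard") and output fields are illustrative assumptions.
import csv
import io

def parse_state_x(raw_csv: str) -> list[dict]:
    """Normalize one state's raw file into a shared schema.

    This hypothetical state reports proficiency as a string like '62.0%'
    and pads school IDs with whitespace.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        rows.append({
            "school_id": row["Campus Number"].strip(),
            "pct_proficient": float(row["Pct Met Standard"].rstrip("%")) / 100,
        })
    return rows

sample = "Campus Number,Pct Met Standard\n 101 ,62.0%\n"
print(parse_state_x(sample))  # → [{'school_id': '101', 'pct_proficient': 0.62}]
```

Keeping every parser's output on one schema is what lets the downstream dbt models stay state-agnostic.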
Data Quality, Testing, and Documentation
• Write dbt tests and maintain data documentation so that outputs are auditable by external researchers
• Perform in-depth QA on state data, identifying and resolving issues across heterogeneous source formats
• Author data dictionaries and methodology documentation with the same care as the code itself
Data Validation and Portal Publishing
• Use Python to perform rigorous QA and data validation: cross-state consistency checks, metric range validation, outlier detection, and regression testing against prior releases
• Write validation scripts that can be re-run as data updates, creating a consistent and reviewable QA record
• Publish documentation, data dictionaries, and methodology updates to the Quarto-based portal as part of regular data releases. The portal structure exists; this role maintains and extends its content.
• Write clearly for a mixed audience: technical enough for researchers building on the data, accessible enough for non-technical partners tracking what changed and why
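The validation work described above can be sketched as a small, re-runnable check. The thresholds, field names, and rules here are illustrative assumptions, not the project's actual QA criteria:

```python
# Hypothetical validation sketch: range checks plus a regression check
# against the prior release. The 0.15 swing threshold is an assumption.

def validate_release(current: dict[str, float], prior: dict[str, float],
                     max_swing: float = 0.15) -> list[str]:
    """Return human-readable issues: out-of-range metrics and large
    year-over-year swings versus the prior release."""
    issues = []
    for school, pct in current.items():
        if not 0.0 <= pct <= 1.0:
            issues.append(f"{school}: proficiency {pct} outside [0, 1]")
        old = prior.get(school)
        if old is not None and abs(pct - old) > max_swing:
            issues.append(f"{school}: swing {pct - old:+.2f} vs prior release")
    return issues

current = {"School A": 0.62, "School B": 1.20, "School C": 0.70}
prior = {"School A": 0.60, "School C": 0.50}
for issue in validate_release(current, prior):
    print(issue)
```

Because the function is deterministic and takes its inputs explicitly, re-running it on each data update produces a consistent, reviewable QA record.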
REQUIRED QUALIFICATIONS
Expected Skills and Characteristics
• Hands-on experience writing and maintaining dbt models in a production or near-production environment
• Solid SQL skills, including the ability to debug complex transformations and identify data quality issues
• Comfort working with messy, real-world data: inconsistent formats, missing values, undocumented quirks, and files that require investigation before they can be modeled
• Comfort with Python for data analysis tasks: reading files, transforming data, writing validation scripts
• Ability t