AssemblyAI is looking for a Senior Software Engineer, AI Data to build robust, scalable systems that power its AI data platform. In this role, you will drive technical execution, own significant features, and work on high-impact projects that directly influence the company's ability to train and evaluate AI models at scale.
What You'll Do
- Design scalable, future-proof data platforms optimized for AI research workloads.
- Build efficient self-serve data processing pipelines leveraging GCP's advanced services.
- Implement cost-effective storage and monitoring solutions for ML at scale.
- Create flexible training resource management with intelligent queuing.
- Optimize resource allocation for maximum training efficiency.
- Participate in an on-call rotation to ensure system reliability.
- Lead adoption of cutting-edge ML tools and frameworks, continuously evaluating and integrating best-in-class solutions.
- Streamline existing workflows while introducing new tooling that further reduces complexity.
- Enhance tooling and documentation to accelerate team velocity.
- Implement guardrails for cost, quality, and performance.
- Identify and eliminate technical bottlenecks in the data processing and training pipelines.
What We're Looking For
- 5+ years of professional software engineering experience.
- Strong proficiency in Python and SQL with demonstrated ability to write production-quality code.
- Solid understanding of software engineering fundamentals: data structures and algorithms, system design and architectural patterns, testing strategies (unit, integration, end-to-end), and code review practices and technical collaboration.
- Experience with RESTful APIs and distributed systems concepts.
- Experience with containerization (Docker) and basic cloud infrastructure.
- Track record of delivering high-quality software in a team environment.
- Ability to thrive in a startup environment with changing priorities and rapid iteration.
Nice to Have
- Experience with GCP services (BigQuery, GCS, Cloud Run, GKE).
- Familiarity with distributed processing frameworks (Apache Beam, PySpark).
- Experience with workflow orchestration tools (Airflow, Prefect, Dagster).
- Understanding of ML/AI infrastructure and data pipelines.
- Experience with monitoring and observability tools (Datadog).
- Experience working with researchers directly.
- Background in data engineering roles.
Technical Stack
- Languages: Python, SQL
- Infrastructure: Docker, GCP
- GCP Services: BigQuery, GCS, Cloud Run, GKE
- Frameworks & Tools: Apache Beam, PySpark, Airflow, Prefect, Dagster, Datadog
Team & Environment
You will be part of the AI Data team, collaborating with researchers, platform engineers, and other stakeholders.
Benefits & Compensation
- Competitive equity grants.
- 100% employer-paid benefits.
- Germany/Ireland: €141,267 – €184,512; United Kingdom: £117,159 – £153,024 + equity.
Work Mode
This is a fully remote position open to candidates based in Germany, Ireland, or the United Kingdom.
We're committed to creating a space where our employees can bring their full selves to work and have equal opportunity to succeed. No matter your race, gender identity or expression, sexual orientation, religion, origin, ability, age, or veteran status, if joining this mission speaks to you, we encourage you to apply.