About the Role
Design and develop core components of a platform that enables autonomous agent behaviors, integrating machine learning, decision logic, and scalable infrastructure to support complex workflows.
Responsibilities
- Lead architecture and implementation of distributed systems supporting agent-driven workflows
- Collaborate with research and product teams to translate AI models into production-grade services
- Optimize system reliability, scalability, and performance under variable loads
- Define and enforce software engineering standards across the platform
- Mentor engineers and contribute to technical strategy and roadmap planning
- Troubleshoot complex issues across multiple service boundaries
- Integrate observability and monitoring to ensure system health
- Design secure APIs and data pipelines for agent communication
- Evaluate and adopt new technologies that improve development velocity
- Ensure compliance with data privacy and security requirements
- Work closely with data scientists to operationalize machine learning models
- Build tools and frameworks to streamline agent lifecycle management
- Drive best practices in testing, CI/CD, and deployment automation
- Participate in system design reviews and code quality initiatives
- Contribute to documentation and knowledge sharing across teams
Nice to Have
- Experience with agent-based modeling or autonomous systems
- Exposure to reinforcement learning or decision-making frameworks
- Contributions to open-source projects or technical publications
- Hands-on experience with Kubernetes and service mesh technologies
- Prior work in AI-first product environments
Compensation
Competitive salary with equity and performance incentives
Work Arrangement
Hybrid with flexible remote options
Team
Part of a core platform engineering team building next-generation automation systems
About the Team
The team operates at the intersection of AI and infrastructure, creating systems that enable intelligent, autonomous behaviors. Work is collaborative, technically deep, and focused on long-term platform stability and innovation.
Tech Stack
- Primary languages include Python and Go
- Infrastructure built on Kubernetes and cloud platforms
- Machine learning models deployed via scalable serving frameworks
- Monitoring through Prometheus, Grafana, and custom tooling
Available for qualified candidates