Responsibilities
- Design, Build, and Maintain Scalable ML Infrastructure: Lead the design and development of scalable machine learning infrastructure on AWS, using services such as Amazon SageMaker for efficient model training and deployment.
- Collaborate with Product Teams: Work closely with product teams to develop MVPs for AI-driven features, ensuring quick iterations and market testing to refine solutions effectively.
- Develop Monitoring & Alerting Frameworks: Create and enhance monitoring and alerting systems for machine learning models to ensure high performance, reliability, and minimal downtime.
- Support the Marketing Team's AI Utilization with an Eye on Cross-Departmental Impact: Enable the marketing team and other departments within the organization to leverage AI/ML models, including cutting-edge Generative AI solutions, for their respective use cases.
- Provide Production Support: Offer expertise in debugging and resolving issues related to machine learning models in production, participating in on-call rotations for operational troubleshooting and incident resolution.
- Scale ML Architecture: Design and scale machine learning architecture to support rapid user growth, leveraging deep knowledge of AWS and ML best practices to ensure robustness and efficiency.
- Mentor and Elevate Team Skills: Conduct code reviews, mentor team members, and elevate overall team capabilities through knowledge sharing and collaboration.
- Stay Ahead of the Curve: Keep up with the latest advancements in machine learning technologies and AWS services, and drive the adoption of cutting-edge solutions to maintain a competitive edge.
Requirements
- Bachelor's degree in Computer Science, Computer Engineering, Machine Learning, Statistics, Physics, or a relevant technical field, or equivalent practical experience.
- 6+ years of experience in machine learning engineering, with demonstrated success in deploying scalable ML models in a production environment.
- Proficiency with Python is required.
Nice to Have
- Deep expertise in one or more of the following areas: machine learning, recommendation systems, pattern recognition, data mining, artificial intelligence, or related technical fields.
- Proven track record of developing machine learning models from inception to business impact, demonstrating the ability to solve complex challenges with innovative solutions.
- Experience with Golang is a plus.
- Demonstrated technical leadership in guiding teams, owning end-to-end projects, and setting the technical direction to achieve project goals efficiently.
- Experience working with relational databases, data warehouses, and using SQL to explore them.
- Strong familiarity with AWS cloud services, especially in deploying and managing machine learning solutions and scaling them in a cost-effective manner.
- Knowledge of Kubernetes, Docker, and CI/CD pipelines for efficient deployment and management of ML models.
- Comfortable with monitoring and observability tools applied to machine learning models (e.g., Prometheus, Grafana, Amazon CloudWatch).
- Experience developing recommender systems or enhancing user experiences through personalized recommendations.
- Solid foundation in data processing and pipeline frameworks (e.g., Apache Spark, Kafka) for handling real-time data streams.