Toronto or Ajax Employment

Bell Canada Enterprises is hiring a Senior AI DevOps Architect

About the Role

Bell Canada Enterprises is looking for a Senior AI DevOps Architect to serve as the architect and strategist for our AI/ML developer experience. You will define the vision, design the frameworks, and ensure the long-term success of our AI development lifecycle through innovative DevOps practices.

What You'll Do

  • Proactively identify pain points in the developer journey and architect solutions to streamline workflows and enhance productivity.
  • Design and implement AI-optimized CI/CD pipelines that automate build, test, and deployment processes.
  • Integrate AI-powered tools to automate code reviews and identify errors, vulnerabilities, and style inconsistencies.
  • Implement AI-driven systems for continuous security monitoring, enabling proactive threat detection.
  • Evaluate and recommend new AI capabilities and tools that can enhance developer experience and operational efficiency.
  • Collaborate with the Platform team to establish organizational standards, security policies, and governance frameworks.
  • Develop and execute strategies to ensure widespread adoption of MLOps best practices across engineering teams.
  • Guide and mentor intermediate engineers and provide expert consultation on complex MLOps challenges.

What We're Looking For

  • Ability to define and articulate a long-term vision for AI/ML developer experience and architect robust, scalable solutions.
  • Deep understanding of the end-to-end machine learning lifecycle, including data management, model development, training, deployment, monitoring, and governance.
  • Proven ability to design, implement, and optimize sophisticated CI/CD pipelines specifically for AI/ML workloads.
  • Experience with major cloud platforms (AWS, GCP) and services relevant to AI/ML, including containerization (Docker, Kubernetes) and Infrastructure as Code (Terraform, Ansible).
  • Familiarity with a broad range of AI/ML frameworks, libraries, and platforms (e.g., TensorFlow, PyTorch, MLflow, Kubeflow, SageMaker, Vertex AI).
  • Expertise in integrating security best practices throughout the AI/ML lifecycle, including threat detection and vulnerability management.
  • Excellent ability to diagnose complex technical challenges, identify root causes, and develop innovative solutions.
  • Demonstrated capability to lead technical initiatives, guide junior engineers, and provide expert consultation.
  • Strong interpersonal and communication skills, with the ability to collaborate with engineering teams, platform teams, and stakeholders.
  • A proactive approach to staying abreast of the rapidly evolving landscape of AI, ML, DevOps, and cloud technologies.

Nice to Have

  • Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related technical field.
  • 7-10 years of experience in DevOps, Site Reliability Engineering, or Software Engineering roles.
  • 3-5 years of direct experience implementing and managing MLOps practices for AI/ML projects.
  • Proven track record of designing scalable CI/CD pipelines for complex applications, preferably including ML models.
  • Hands-on experience with container orchestration platforms like Kubernetes.
  • Experience with Infrastructure as Code tools (e.g., Terraform, CloudFormation).
  • Experience evaluating, selecting, and integrating new tools to improve developer workflows.
  • Experience in defining and enforcing technical standards, policies, and governance frameworks.
  • Experience mentoring engineers and leading technical discussions.
  • Strong programming skills, particularly in Python.
  • Familiarity with monitoring and logging solutions (e.g., Prometheus, Grafana, ELK Stack).
  • Knowledge of security best practices in cloud and DevOps environments.

Technical Stack

  • Cloud: AWS, GCP
  • Infrastructure & Orchestration: Docker, Kubernetes, Terraform, Ansible
  • AI/ML Frameworks & Platforms: TensorFlow, PyTorch, MLflow, Kubeflow, SageMaker, Vertex AI
  • Programming & Monitoring: Python, Prometheus, Grafana, ELK Stack

Team & Environment

This role is part of the Customer Experience team and involves close collaboration with the Platform team.

Bell Canada Enterprises is an equal opportunity employer.

Required Skills
AWS, GCP, Docker, Kubernetes, Terraform, Ansible, TensorFlow, PyTorch, MLflow, Kubeflow, CI/CD, MLOps, Infrastructure as Code, Cloud Architecture, Machine Learning

About the Company
Bell Canada Enterprises

Bell builds world-class networks, develops innovative services, and creates original multiplatform media content. The Bell Mobility team offers mobile devices, wireless services, and Internet of Things solutions to consumer and business customers.

Job Details
Department: Information Technology
Category: Infrastructure
Posted 14 days ago