Remote (Global) Full-time

AI Inference Engineer, QVAC at Tether Operations Limited (100% remote, worldwide)

About the Role

Tether Operations Limited is hiring an AI Inference Engineer for QVAC. You will own the inference backbone behind our local AI stack: the C++ systems layer that makes models run fast, reliably, and predictably on real user hardware. Your work directly enables private, on-device AI experiences and helps set the technical foundation for QVAC's next generation of peer-to-peer AI products.

What You'll Do

  • Deploy machine learning models to edge devices using frameworks like llama.cpp, ggml, and ONNX.
  • Collaborate closely with researchers to assist in coding, training, and transitioning models from research to production environments.
  • Integrate AI features into existing products, enriching them with the latest advancements in machine learning.
  • Define and evolve the core abstractions that inference features depend on, so new capabilities can be added without sacrificing performance or maintainability.
  • Work on the C++ layer that powers local AI, porting and enhancing inference engines to run efficiently on edge devices.
  • Focus on the runtime: making models load faster, run leaner, and perform well across different hardware.
  • Ensure the inference layer is stable, optimized, and ready for integration with the rest of the stack.

What We're Looking For

  • Excellent programming skills in C++.
  • Strong experience with the llama.cpp and ggml inference engines.
  • Good understanding of deep learning concepts and model architectures.
  • Experience with transformers, LLMs, and diffusion models.
  • Demonstrated ability to rapidly assimilate new technologies and techniques.
  • A degree in Computer Science, AI, Machine Learning, or a related field, complemented by a solid track record in AI R&D.
  • Excellent English communication skills.

Nice to Have

  • Experience with JavaScript.

Technical Stack

  • C++
  • JavaScript
  • llama.cpp
  • ggml
  • ONNX

Team & Environment

You will collaborate closely with researchers to transition models from research to production.

Work Mode

This is a 100% remote position open to candidates worldwide.

Tether Operations Limited is an equal opportunity employer.

Required Skills
C++, JavaScript, llama.cpp, ggml, ONNX, Deep Learning, Transformers, LLMs, Diffusion Models, GPU Architecture, Model Deployment, AI Inference
About company
Tether Operations Limited

Tether Operations Limited pioneers a global financial revolution with cutting-edge solutions that empower businesses to integrate reserve-backed tokens across blockchains. Its product suite includes the USDT stablecoin, energy solutions for Bitcoin mining, data solutions for AI and P2P tech, digital education, and ventures at the intersection of technology and human potential.

Job Details
Category data
Posted 14 days ago