Orderfox is hiring a Lead Data Architect to design the data platform behind our AI-powered market intelligence agent. You will drive the vision, ship early iterations, boost performance, and help define innovation, not just support it.
What You'll Do
- Re-architect the web-to-profile pipeline, designing distributed crawling systems with polite scheduling and deduplication.
- Co-design and evolve a multimodal knowledge base combining Postgres, pgvector, Cosmos DB Graph, DuckDB, and others to serve SQL, graph, and semantic queries with sub-second latency.
- Build LLM-powered enrichment chains: develop RAG and tool-calling pipelines that convert raw documents into structured profiles, and guide prompt and chain versioning.
- Integrate and steward diverse data sources (finance, trade, product catalogues), negotiate contracts, resolve entities, and define refresh cadence.
- Implement data operations and observability: CI/CD for data (GitHub Actions + dbt/Airflow), lineage (OpenLineage), and cost and quality dashboards.
- Mentor and collaborate cross-functionally, translating product questions into schemas, shaping the roadmap, and supporting peer growth.
- Lead technically while staying hands-on, working closely with the CTO and growing into a player-coach leadership role as the team scales.
What We're Looking For
- Extensive experience designing end-to-end data architectures at web scale – from raw HTML through SQL, graph, or vector databases to production-grade APIs.
- Proficiency in modern, high-performance Python (async, typed) and comfort with Spark / Dask for large transforms.
- Solid track record developing distributed web crawlers and anti-bot strategies using Firecrawl, Playwright or Crawlee.
- Deep understanding of LLM / RAG engineering with Semantic Kernel / LangChain / LlamaIndex, including prompt and chain versioning.
- Working knowledge of the Azure lakehouse stack (Blob, Data Lake, Fabric) or a similar cloud platform, and of open table formats (Delta/Iceberg).
- Strong communicator with the ability to clearly explain complex data topics to both executives and colleagues.
Technical Stack
- Languages & Frameworks: Python, Spark, Dask
- Databases: Postgres, pgvector, Cosmos DB Graph, DuckDB
- Crawling: Firecrawl, Playwright, Crawlee
- LLM Tooling: Semantic Kernel, LangChain, LlamaIndex
- Cloud & Data Platforms: Azure Blob, Azure Data Lake, Microsoft Fabric, Delta, Iceberg
- Data Ops: GitHub Actions, dbt, Airflow, OpenLineage
Team & Environment
You will join a ~30-person team with a flat hierarchy where initiative beats bureaucracy. You will work closely with and report directly to the CTO.
Benefits & Compensation
- Real-world impact in industrial AI, contributing to Gieni AI’s multi-agent infrastructure and Microsoft Copilot integration.
- Modern tech stack with autonomy to co-own a greenfield stack and choose the right tool for the job.
- Flat hierarchy in a ~30-person team.
- Gorgeous lakeside Zurich HQ with flexible hybrid and work-from-abroad options.
- Personal growth with structured onboarding and clear paths to player-coach or other leadership tracks.
Work Mode
This is a hybrid role based in Zurich.
Orderfox is an equal opportunity employer.



