Remote (Global), Full-time

Clarivate is hiring a Senior Data Scientist (NLP)

About the Role

Clarivate is hiring a Senior Data Scientist (NLP) to design and implement large-scale AI-enabled solutions that modernize our content delivery systems. You will specialize in Natural Language Processing (NLP) and modern retrieval-augmented generation (RAG) architectures, focusing on text processing pipelines, indexing, vectorization, prompting, fine-tuning, and context management.

What You'll Do

  • Design scalable NLP workflows for text ingestion, cleaning, normalization, and tokenization.
  • Implement and maintain robust indexing systems and vector databases for semantic search and retrieval.
  • Develop reusable prompting strategies and lead fine-tuning initiatives for LLMs tailored to business tasks.
  • Build dynamic knowledge systems and agentic workflows using LangChain and LangGraph.
  • Integrate advanced RAG architectures like VRAG and GraphRAG to enrich information retrieval.
  • Conduct benchmark testing and model evaluations to improve accuracy, efficiency, and scalability of NLP systems.
  • Collaborate with engineering, product, and research stakeholders to deliver integrated AI-driven features.
  • Mentor junior data scientists, guide best practices, and drive innovation across AI projects.

What We're Looking For

  • Bachelor’s degree in Computer Science, Data Science, Computational Linguistics, or a related field.
  • At least 5 years of hands-on experience in data science, focused on natural language processing (NLP).
  • At least 5 years of experience using Python, with expertise in NLP libraries such as LangChain, LangGraph, or other “Lang”-based toolkits.
  • Proven experience in model development and applying machine learning techniques to real-world problems.

Nice to Have

  • Expertise in retrieval-based LLM workflows (RAG, VRAG, GraphRAG).
  • Deep understanding of embedding models, semantic search, and vector stores (e.g., FAISS, Pinecone).
  • Experience with document loaders and text splitters/document splitting strategies.
  • Familiarity with MLOps practices and production-level deployment of AI pipelines.
  • Experience with cloud platforms (e.g., AWS, Azure, or GCP).
  • Experience applying Graph Neural Networks (GNNs) to retrieval-enhanced generation.
  • Knowledge of LangSmith and vector orchestration platforms.
  • Familiarity with multilingual NLP and cross-lingual embeddings.
  • Exposure to real-time knowledge graphs and stream-based RAG systems.
  • A Master’s or PhD in a technical field (Computer Science, Data Science, etc.).

Technical Stack

  • Python
  • LangChain
  • LangGraph
  • FAISS
  • Pinecone
  • AWS
  • Azure
  • GCP

Team & Environment

This role is part of the Life Sciences & Healthcare (LS&H) segment under the Content Technology team. You will work closely with the VP of Content Technology, Solutions Architects, and internal SMEs, reporting directly to the VP of AI, Content.

Benefits & Compensation

  • Medical insurance
  • Dental insurance
  • Prescription drug coverage
  • Life insurance
  • 401(k) with match
  • Long-term disability coverage
  • Vacation
  • Sick time
  • Volunteer time
  • Discount programs
  • Annual salary range: $117,000 - $147,000 USD

Work Mode

This is a global position open to candidates in the US.

At Clarivate, we are committed to providing equal employment opportunities for all qualified persons with respect to hiring, compensation, promotion, training, and other terms, conditions, and privileges of employment. We comply with applicable laws and regulations governing non-discrimination in all locations.

Required Skills
Python, LangChain, LangGraph, FAISS, Pinecone, AWS, Azure, GCP, NLP, Machine Learning, LLMs, Data Science, Cloud Platforms, Vector Databases
About company
Clarivate

Clarivate provides innovative data and analytical solutions to the largest biopharmaceutical and medical technology companies in the world.

Job Details
Category: data
Posted 7 months ago