Clarivate is seeking a Senior Data Scientist (NLP) to join our Life Sciences & Healthcare team. In this role, you will leverage your expertise in natural language processing and modern retrieval-augmented generation architectures to build large-scale, AI-enabled solutions that modernize our content delivery systems.
What You'll Do
- Design scalable NLP workflows for text ingestion, cleaning, normalization, and tokenization.
- Implement robust indexing and vectorization strategies for semantic search and retrieval.
- Develop reusable prompting strategies and lead fine-tuning initiatives for LLMs.
- Build dynamic knowledge systems and agentic workflows using LangChain and LangGraph.
- Integrate advanced RAG architectures like VRAG and GraphRAG.
- Conduct benchmark testing and model evaluations to optimize system performance.
- Collaborate closely with engineering, product, and research stakeholders.
- Provide technical leadership, mentor junior data scientists, and drive innovation.
What We're Looking For
- Bachelor’s degree in Computer Science, Data Science, Computational Linguistics, or a related field.
- At least 5 years of hands-on experience in data science focused on natural language processing.
- At least 5 years of experience using Python, plus expertise in LLM application frameworks such as LangChain and LangGraph.
- Proven experience in model development and applying machine learning techniques to real-world problems.
Nice to Have
- Expertise in retrieval-based LLM workflows (RAG, VRAG, GraphRAG).
- Deep understanding of embedding models, semantic search, and vector stores (e.g., FAISS, Pinecone).
- Experience with document loaders and text splitting strategies.
- Familiarity with MLOps practices and production-level deployment of AI pipelines.
- Experience with cloud platforms (AWS, Azure, or GCP).
- Experience applying Graph Neural Networks to retrieval-enhanced generation.
- Knowledge of LangSmith and vector orchestration platforms.
- Familiarity with multilingual NLP and cross-lingual embeddings.
- Exposure to real-time knowledge graphs and stream-based RAG systems.
- A Master’s degree or PhD in a technical field (Computer Science, Data Science, etc.).
Technical Stack
- Python, LangChain, LangGraph, FAISS, Pinecone, AWS, Azure, GCP
Team & Environment
This role sits within the Life Sciences & Healthcare (LS&H) segment under the Content Technology team and reports to the VP of AI, Content.
Benefits & Compensation
- Compensation: $117,000 - $147,000 USD per year.
- Medical, dental, and prescription drug coverage.
- Life insurance and long-term disability coverage.
- 401(k) with employer match.
- Vacation, sick time, and volunteer time.
- Discount programs.
Work Mode
This is a remote position open to candidates located in the US.
Clarivate is committed to providing equal employment opportunities for all qualified persons with respect to hiring, compensation, promotion, training, and other terms, conditions, and privileges of employment.





