Socure is looking for a Staff Data Scientist to lead advanced data science and R&D efforts for our foundational ID Graph platform. You will work at the intersection of graph modeling, machine learning, and product innovation to shape the core intelligence backbone that powers downstream identity products. This role operates at platform scale, focusing on durable, explainable, and compliant solutions.
What You'll Do
- Lead evaluation and continuous improvement of entity resolution and entity linking pipelines.
- Debug new builds, identify anomalies, and recommend modeling or system-level improvements.
- Define, implement, and maintain scalable performance and quality metrics, leveraging automation and LLMs.
- Partner with Engineering to optimize entity linking and ranking systems using Learning-to-Rank.
- Design methods to assess and classify entity confidence and quality across the graph.
- Design and implement a comprehensive data quality framework for graph-based identity data.
- Translate abstract quality concepts into measurable signals to guide modeling and product decisions.
- Identify and operationalize high-impact predictive signals from graph structure and relational patterns.
- Develop scalable approaches to link prediction, label propagation, and semi-supervised learning.
- Explore and evaluate advanced graph modeling techniques such as Graph Neural Networks (GNNs) and knowledge graphs.
- Collaborate closely with Engineering, Product Management, Compliance, and downstream product teams.
- Act as a technical leader influencing modeling standards, experimentation rigor, and best practices.
- Translate complex technical findings into clear insights for technical and non-technical stakeholders.
- Support the launch of new product capabilities built on top of the ID Graph.
What We're Looking For
- Master’s or PhD in Computer Science, Data Science, Machine Learning, Statistics, Mathematics, or a related field.
- 5+ years of experience in applied data science, machine learning, or AI, with a focus on graph-based modeling and large-scale data systems.
- Strong proficiency in Python and PySpark.
- Deep experience with classification models, Learning-to-Rank, anomaly detection, and statistical modeling.
- Experience building and maintaining production-grade ML systems at scale.
- Hands-on experience with Databricks.
- Familiarity with graph databases and query languages such as NeptuneDB and OpenCypher.
- Experience with graph processing frameworks (e.g., GraphFrames).
- Proven ability to drive cross-functional projects, mentor peers, and influence technical and business outcomes.
- Excellent communication skills for technical and non-technical audiences.
Nice to Have
- Experience applying LLMs for evaluation, automation, or signal discovery.
- Familiarity with Knowledge Graphs and Graph Neural Networks (GNNs).
Technical Stack
- Python, PySpark, Databricks
- NeptuneDB, OpenCypher, GraphFrames
Team & Environment
You will be part of the Identity organization, collaborating with Engineering, Product Management, Compliance, and multiple product teams. Our culture is built on a high bar for team performance, moving fast, thinking critically, and acting like owners. We care deeply about solving customer problems with precision and embrace feedback to adapt resiliently.
Socure is an equal opportunity employer that values diversity in all its forms within our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.