Responsibilities
- Grow and nurture the open-source data/infra community—launch initiatives, collaborate with data-focused groups, and organize events or challenges.
- Engage with communities like Apache Parquet, Open Table Formats, and data engineering forums to promote best practices and Hugging Face tools.
- Promote the Hugging Face Hub as the go-to platform for data storage, versioning, and collaboration—curate and showcase datasets, benchmarks, and tools like Xet.
- Highlight use cases like efficient large dataset updates, Parquet editing, and deduplication to demonstrate the Hub’s value for data workflows.
- Create demos, benchmarks, and tools (e.g., Colab notebooks) to illustrate best practices for data storage and versioning.
- Experiment with Xet, Parquet, and other data formats to showcase their potential for ML and data engineering.
- Produce high-quality tutorials, blog posts, and videos that make complex topics accessible.
- Share insights on storage optimization, dataset versioning, and deduplication to empower developers.
- Actively participate in online communities (Discord, GitHub, forums) to highlight contributions, answer questions, and foster collaboration.
- Ensure datasets and tools released on the Hub are well-documented, with clear examples, benchmarks, and use cases.
Requirements
- Strong technical skills in Python, data libraries (e.g., pandas, pyarrow, huggingface/datasets), and storage formats and systems (Parquet, Open Table Formats, S3).
- Hands-on builder who loves experimenting with data tools, storage optimization, and dataset versioning.
- Ability to clearly explain complex topics (e.g., deduplication, compression, Parquet editing) through writing, demos, or talks.
- Active in developer communities (GitHub, Discord, forums) and passionate about open source and knowledge sharing.
- Thrive in fast-moving environments and enjoy building in public to inspire others.
Additional Information
- Hugging Face encourages candidates to apply even if they don’t meet every requirement.
- The company is building a diverse team whose skills, experiences, and backgrounds complement one another.
- Hugging Face is an equal opportunity employer and does not discriminate based on race, ethnicity, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or ability status.
- Flexible working hours and remote options are available.
- All remote employees have the opportunity to visit offices in NYC and Paris.
- Workstation outfitting is provided if needed.
- Employees receive company equity as part of compensation.
