<p>Role Overview</p> <p>Alignerr is Labelbox's human data organization - we produce the training data that frontier AI labs use to build their most capable models. Our Forward Deployed Research Team sits at the intersection of research science and client delivery, embedding research capability directly into the engagements that drive our business.</p> <p>This is not a traditional research scientist role. You will not spend months pursuing a single research question. You will work on multiple client engagements simultaneously, operating on timescales of days to weeks. You will sit in scoping meetings with research teams at major AI labs, reason scientifically about data strategy in real time, fine-tune open-weight models to validate our data methodology, and collaborate with our Applied Research team to turn client-grounded findings into published work. The pace is fast, the problems are applied, and the feedback loops are short.</p> <p>We are looking for someone who finds that energizing, not compromising.</p> <p>Your Impact</p> <ul> <li>Engage directly with frontier lab research teams. You will be in the room during client scoping meetings - not as support staff, but as a technical peer. You'll engage on methodology, challenge assumptions about data requirements, and shape project specifications based on a scientific understanding of how data composition affects model outcomes. </li><li>Develop deep scientific understanding of client engagements. For each project, you will build a working model of the client's architecture, training methodology, and target capabilities. You'll use this understanding to reason about why a particular data strategy will or won't work, identify risks early, and iterate with empirical grounding - not intuition. </li><li>Run ablation studies and fine-tune open-weight models. You will fine-tune models on client data (and proxy data) to empirically measure the impact of our data on model performance. This is how we validate that what we deliver actually improves our customers' models - and how we catch problems before the client does. </li><li>Consult on workflow and quality systems. You will partner with our Human Data Operations team to review annotation schemas, task designs, and quality rubrics before projects go into execution. Your job is to ensure the spec is technically sound - that the data we produce will actually serve the client's training objectives. </li><li>Collaborate with Applied Research on publications and benchmarks. Our Applied Research team owns the long-horizon research agenda. Your role is to feed them signal from the field - generalizable findings, reusable methodologies, empirical results - and help drive joint projects to completion. You will contribute to benchmarks, white papers, and conference submissions that establish Labelbox's research credibility. </li></ul> <p>What You Bring</p> <p>Required</p> <ul> <li>MS or PhD in Machine Learning, NLP, Computer Science, or a related quantitative field. </li><li>Hands-on experience fine-tuning large language models (open-weight models such as Llama, Mistral, Qwen, or similar). </li><li>Strong understanding of LLM training pipelines - pretraining, supervised fine-tuning, RLHF/DPO, and how data quality and composition affect each stage. </li><li>Experience designing and executing experiments with rigor - hypothesis formation, controlled comparisons, statistical analysis of results. </li><li>Ability to operate at speed. You should be comfortable going from problem definition to experimental results in days, not months. </li><li>Strong written and verbal communication. You will present findings to client research teams and contribute to published work. </li></ul> <p>Strongly Preferred</p> <ul> <li>Prior experience at a frontier AI lab, applied ML startup, or in a research role with direct client/stakeholder interaction.
</li><li>Experience with evaluation and benchmarking of LLMs - designing metrics, building eval harnesses, interpreting results critically. </li><li>Familiarity with human data pipelines - annotation workflows, quality assurance methodology, inter-annotator agreement analysis. </li><li>Experience with reinforcement learning, reward modeling, or RLHF environments. </li><li>Published research (conferences, journals, or technical reports) in ML/NLP or adjacent fields. </li></ul> <p>What Matters More Than Credentials</p> <ul> <li>Applied instinct over academic purity. The measure of success here is client impact and publishable-but-practical results - not methodological novelty for its own sake. If your first instinct when handed a problem is to build a framework, this isn't the role. If your first instinct is to run an experiment and get a result, it is. </li><li>Comfort with ambiguity and incomplete information. Client engagements rarely come with clean problem statements. You'll need to extract the real question from a noisy conversation, scope an approach quickly, and iterate. </li><li>Cross-functional fluency. You will work daily with field engineers, project managers, operations teams, and an independent Applied Research team. Someone who can only operate within a pure research silo will struggle here. </li><li>Intellectual honesty. When an ablation study shows the data isn't working, you need to say so - clearly and constructively - even when it's inconvenient for the deal timeline. </li></ul> <p>What You Should Know About This Team</p> <ul> <li>We are small and high-leverage. The FDRT is a team of five today. Every person's work directly influences client outcomes and Labelbox's market position. </li><li>We operate at the tempo of client delivery. Two-week sprints. SLAs measured in days. If you want months of uninterrupted focus on a single problem, our Applied Research team is a better fit. </li><li>We are at the intersection of several teams. 
FDRT works with Field Delivery Engineers, Human Data Operations, Applied Research, and client research teams. The role requires navigating those interfaces with credibility and without ego. </li><li>We protect time for research. 25-30% of team capacity is allocated to research collaboration with Applied Research. This is not aspirational - it is a structural commitment. You will have the opportunity to publish. </li></ul>
POST A JOB
It's completely FREE to post your jobs on ZiNG! There's no catch, no credit card needed, and no limit to the number of job posts.
The first step is to SIGN UP so that you can manage all your job postings under your profile.
If you already have an account, you can LOGIN to post a job or manage your other postings.
Thank you for helping us get Americans back to work!