Research Engineer

San Francisco, CA | Full-time

Datacurve provides the frontier coding data that powers the world's most advanced models. We absorb and standardize deep, highly specialized knowledge to create the world's first autonomous data engine, allowing us to teach the next generation of models (big and small) mastery across all types of knowledge work. We work with foundational labs and large, highly specialized enterprises alike.

About the Role

As a Research Engineer at Datacurve, you will study the data we produce. You will design, build, and improve the infrastructure we use to understand our data. This work is intentionally open-ended: we don't know what we don't know. You're tasked with understanding our data and how machines learn from it, and with developing novel techniques to extract the maximum possible value across some of the most complex domains of knowledge.

You'll work side-by-side with researchers at the world's leading AI labs — not necessarily to fulfill their data needs — but to help them understand the performance of their models across different surfaces of the same domain. You will produce the benchmarks, artifacts, and technical narratives that define our work. You will significantly amplify the impact of the data we produce and the industries it touches — beyond improving just a small number of models in a handful of labs.

Your work will materially influence the velocity of frontier model improvement inside the world's leading AI labs and beyond — improvement capable of causing material shifts in the global economy.

What we're looking for
  • Deep technical taste: An intuitive, almost intangible understanding of what is valuable, informative, and safe for frontier AI models — as well as the ability to rigorously test and develop that understanding
  • Experimental mindset: Strong instincts for evaluation design, benchmark creation, and error analysis in highly ambiguous domain spaces
  • Scientific storytelling: Exceptional communication skills with the ability to synthesize complex technical findings into elegant, compelling narratives and artifacts; you're able to whiteboard your thinking
  • Relentless — borderline obsessive — curiosity: You're motivated to push the boundaries of what machines are capable of, in often non-obvious ways; you don't like unanswered questions
  • Deeply independent — almost iconoclastic — thinker: You hold strong opinions, but understand the limits of your knowledge; you lean toward questioning consensus rather than following it

What you'll do
  • Design, run, and continuously improve experiments across their entire lifecycle
  • Define domain-specific data quality bars and design evaluation loops to probe frontier model capabilities
  • Build data synthesis pipelines and feedback mechanisms that target subtle model failure modes
  • Collaborate with external research labs and internal SMEs to conceptualize and execute custom data strategies
  • Own your research goals to completion, often culminating in a paper, benchmark, or public-facing technical narrative (such as a blog post) held to the same standard as top conferences

How to Apply

To apply, please email careers@datacurve.ai with your resume, GitHub, or LinkedIn and a note on why the work we do excites you. We love cool projects, deep-dive writing, and unconventional backgrounds!