This research introduces 'AutoResearch,' a framework defining the spectrum of AI-powered scientific workflow automation. It analyzes the transition from task-level assistance to full research automation, identifying critical gaps in reproducibility, provenance, and accountability.
The paper examines how AI systems are moving beyond isolated task assistance to manage long-horizon scientific workflows that span literature grounding, hypothesis generation, experimentation, and reporting. It finds that current systems are fragmented in their degree of autonomy, their domain scope, and how well they preserve evidence. The authors define AutoResearch as the developmental spectrum of this workflow automation and analyze how control, evidence, and accountability are distributed across it, organizing the field around five key workflow conditions. A central finding is that the autonomy of research systems is highly domain-conditioned: it is most credible in structured settings and limited in complex, embodied, or ethically accountable contexts. The authors propose five dimensions for evaluating these systems: novelty, validity, impact, reliability, and provenance.