Agentic Radiotherapy Planning: Automating Outer-Loop Tuning via TextGrad

13 Jun 2026

3.5 Radiotherapy treatment plan optimization

Radiation therapy, also known as radiotherapy, is a cancer treatment that uses beams of intense energy, such as X-rays, to kill cancer cells. Before treatment begins, a radiotherapy team, including radiation oncologists and planners, collaborates to design an effective treatment plan. This involves determining the necessary dose of radiotherapy and pinpointing the exact locations that need treatment.

Radiotherapy treatment planning can be formulated as a two-loop optimization problem. The inner loop, known as inverse planning, includes processes such as fluence map optimization and direct aperture optimization [62]. It is typically a constrained problem in which a numerical optimizer minimizes a weighted cost function that balances multiple conflicting objectives [63]. These objectives include delivering the prescribed dose to the planning target volume (PTV), which encompasses the tumor plus a margin that accounts for uncertainties in planning or treatment delivery, while protecting critical normal tissues, known as organs at risk (OARs), from unsafe doses.
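To make the idea of a weighted cost function concrete, the following is a minimal sketch under stated assumptions. The squared-deviation penalty for the target, the squared-overdose penalty for organs at risk, the function names, and all dose values are illustrative choices, not the objective used by any particular planning system.

```python
# Hypothetical illustration of a weighted inverse-planning objective.
# Penalty forms, structure names, and dose values are assumptions.

def mean_squared_deviation(doses, prescribed):
    """Target penalty: average squared gap from the prescribed dose."""
    return sum((d - prescribed) ** 2 for d in doses) / len(doses)

def mean_squared_overdose(doses, limit):
    """Organ-at-risk penalty: only dose above the limit is penalized."""
    return sum(max(d - limit, 0.0) ** 2 for d in doses) / len(doses)

def weighted_cost(plan_doses, weights, prescribed, limits):
    """Sum of per-structure penalties, each scaled by its importance weight."""
    cost = 0.0
    for name, doses in plan_doses.items():
        if name in prescribed:
            penalty = mean_squared_deviation(doses, prescribed[name])
        else:
            penalty = mean_squared_overdose(doses, limits[name])
        cost += weights[name] * penalty
    return cost

# Toy per-voxel doses in Gy (made up for the example).
plan_doses = {"PTV": [68.0, 70.0, 72.0], "rectum": [30.0, 50.0]}
weights = {"PTV": 100.0, "rectum": 10.0}
cost = weighted_cost(plan_doses, weights,
                     prescribed={"PTV": 70.0}, limits={"rectum": 40.0})
print(round(cost, 3))
```

Raising the PTV weight makes deviations inside the target dominate the sum, which is the trade-off the weights are meant to control.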

The main challenge in treatment planning is translating overall clinical goals into weighted objective functions and dose constraints that yield an acceptable plan [62]. Human planners often use a trial-and-error approach, iteratively adjusting optimization hyperparameters based on the results of the optimization until the plan meets clinical requirements [62]. These hyperparameters include the weights assigned to PTVs, organs, and other tissues in the objective function. The process can be subjective, influenced by the planner’s experience and the available time, and it requires running a computationally expensive optimization algorithm many times. This makes it inefficient, time-consuming, and costly [64].

Method. We apply TEXTGRAD to the outer loop, i.e., hyperparameter optimization for the inner-loop numerical optimizer. Instance optimization is performed with gpt-4o over the hyperparameters represented as a string: θ = “weight for PTV: [PTV WEIGHT], weight for bladder: [BLADDER WEIGHT], weight for rectum: [RECTUM WEIGHT], weight for femoral heads: [FH WEIGHT], weight for body: [BODY WEIGHT]”. Given a set of hyperparameters, we first solve the inner optimization loop with the numerical optimizer matRad [65] to obtain the corresponding treatment plan P(θ) = matRad(θ). The loss is the mismatch between the current plan and the clinical objectives: it is computed on the treatment plan P and the clinical goals g using an LLM, with the prompts provided in Section G.1. The textual gradient with respect to θ is then obtained from this loss.
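The control flow of this outer loop can be sketched as follows. This is a simplified illustration, not the authors' implementation: `parse_theta`, `format_theta`, `outer_loop`, and the three `toy_*` functions are hypothetical names, and the toy solver, critic, and update rule are rule-based stand-ins for matRad and the gpt-4o calls. They are not part of the TextGrad or matRad APIs.

```python
import re

def format_theta(weights):
    """Render the weights as the hyperparameter string theta."""
    return ", ".join(f"weight for {name}: {value:g}"
                     for name, value in weights.items())

def parse_theta(theta):
    """Recover {structure: weight} from the hyperparameter string."""
    pairs = re.findall(r"weight for ([^:]+): ([0-9]*\.?[0-9]+)", theta)
    return {name.strip(): float(value) for name, value in pairs}

def outer_loop(theta, solve_inner, critique, update, steps=3):
    """Alternate inner solve, textual feedback, and an edit of theta."""
    history = []
    for _ in range(steps):
        plan = solve_inner(parse_theta(theta))  # stands in for P(theta)
        feedback = critique(plan)               # stands in for the LLM loss/gradient
        history.append((theta, feedback))
        theta = update(theta, feedback)         # stands in for the LLM-driven update
    return theta, history

# --- Toy stand-ins (assumptions; no dose calculation happens here) ---
def toy_inner_solver(weights):
    return {"ptv_share": weights["PTV"] / sum(weights.values())}

def toy_critique(plan):
    return "increase PTV weight" if plan["ptv_share"] < 0.8 else "acceptable"

def toy_update(theta, feedback):
    weights = parse_theta(theta)
    if feedback == "increase PTV weight":
        weights["PTV"] *= 2
    return format_theta(weights)

theta0 = format_theta({"PTV": 10, "bladder": 5, "rectum": 5,
                       "femoral heads": 1, "body": 1})
final_theta, history = outer_loop(theta0, toy_inner_solver,
                                  toy_critique, toy_update)
print(final_theta)
```

Keeping θ as a plain string mirrors the text-based representation described above; the parse and format helpers exist only so that a numerical solver can consume the same values.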

Evaluation metrics. To evaluate a treatment plan, we adopt several commonly used dose metrics as a plan cannot be evaluated using a single metric. We consider the mean dose delivered to the target/organ volume, as well as Dq, which denotes the minimum dose received by q% of the target/organ volume.
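As a small worked example of these two metrics, the sketch below computes the mean dose and Dq from a list of per-voxel doses. It assumes equal-volume voxels, and the function names and dose values are made up for illustration.

```python
import math

def mean_dose(doses):
    """Average dose over all voxels of a structure."""
    return sum(doses) / len(doses)

def d_q(doses, q):
    """Minimum dose received by the best-covered q% of the volume.

    Assumes every voxel represents the same volume.
    """
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    ranked = sorted(doses, reverse=True)        # highest dose first
    k = math.ceil(q * len(ranked) / 100)        # voxels that make up q%
    return ranked[k - 1]                        # lowest dose among them

# Twenty toy voxel doses in Gy: 61, 62, ..., 80.
doses = [float(d) for d in range(61, 81)]
print(mean_dose(doses))   # 70.5
print(d_q(doses, 95))     # 62.0
print(d_q(doses, 50))     # 71.0
```

Here D95 is 62 Gy because 19 of the 20 voxels (95%) receive at least 62 Gy, while the mean alone (70.5 Gy) would hide the underdosed voxel.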

Results. The gradients generated by TEXTGRAD provide meaningful guidance for improving the hyperparameters. As illustrated in Figure 3, when there is dose spillage outside the planning target volume (PTV), the gradient suggests increasing the importance weight for the PTV. This adjustment yields a more uniform and confined dose for the PTV. However, it can leave the bladder and rectum insufficiently protected, because their weights become smaller relative to the PTV weight. In the following step, the gradients therefore suggest slightly increasing the weights for the bladder and rectum, which results in better protection for these organs.

We compared TEXTGRAD-optimized plans with the clinical plans used to treat five prostate cancer patients. In Figure 3(c), we assess TEXTGRAD’s ability to achieve the clinical goals for the PTV region. TEXTGRAD outperforms the clinical plans across all metrics, achieving a higher mean dose and a D95 that exactly matches the prescribed dose. In Figure 3(d), we focus on the sparing of healthy organs. TEXTGRAD-optimized plans achieve lower mean doses for these organs, suggesting better organ sparing than the human-optimized plans. We report averages across the five plans, with standard deviations in brackets.

Authors:

(1) Mert Yuksekgonul, Co-first author from Department of Computer Science, Stanford University;

(2) Federico Bianchi, Co-first author from Department of Computer Science, Stanford University;

(3) Joseph Boen, Co-first author from Department of Biomedical Data Science, Stanford University;

(4) Sheng Liu, Co-first author from Department of Biomedical Data Science, Stanford University;

(5) Zhi Huang, Co-first author from Department of Biomedical Data Science, Stanford University;

(6) Carlos Guestrin, Department of Computer Science, Stanford University and Chan Zuckerberg Biohub;

(7) James Zou, Department of Computer Science, Stanford University, Department of Biomedical Data Science, Stanford University, and Chan Zuckerberg Biohub.

This paper is available on arXiv under a CC BY 4.0 license.
