Optimizing Scoring Models: Effective Prompting Formats

25 Sept 2024

Authors:

(1) Chengrun Yang, Google DeepMind (equal contribution);

(2) Xuezhi Wang, Google DeepMind;

(3) Yifeng Lu, Google DeepMind;

(4) Hanxiao Liu, Google DeepMind;

(5) Quoc V. Le, Google DeepMind;

(6) Denny Zhou, Google DeepMind;

(7) Xinyun Chen, Google DeepMind (equal contribution).

Abstract and 1. Introduction

2 OPRO: LLM as the Optimizer and 2.1 Desirables of Optimization by LLMs

2.2 Meta-Prompt Design

3 Motivating Example: Mathematical Optimization and 3.1 Linear Regression

3.2 Traveling Salesman Problem (TSP)

4 Application: Prompt Optimization and 4.1 Problem Setup

4.2 Meta-Prompt Design

5 Prompt Optimization Experiments and 5.1 Evaluation Setup

5.2 Main Results

5.3 Ablation Studies

5.4 Overfitting Analysis in Prompt Optimization and 5.5 Comparison with EvoPrompt

6 Related Work

7 Conclusion, Acknowledgments and References

A Some Failure Cases

B Prompting Formats for Scorer LLM

C Meta-Prompts and C.1 Meta-Prompt for Math Optimization

C.2 Meta-Prompt for Prompt Optimization

D Prompt Optimization Curves on the Remaining BBH Tasks

E Prompt Optimization on BBH Tasks – Tabulated Accuracies and Found Instructions

B PROMPTING FORMATS FOR SCORER LLM

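Figures 14, 15, and 16 show examples of the Q_begin, Q_end, and A_begin prompting formats when the “QA” pattern is present. In these formats, the instruction is placed before the question (Q_begin), after the question (Q_end), or at the beginning of the scorer's answer (A_begin). The “QA” pattern is dropped when prompting instruction-tuned scorer models such as text-bison with the Q_begin and Q_end formats (Figures 17 and 18).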

Figure 14: The Q_begin prompting format on a GSM8K test exemplar with the "QA" pattern.

Figure 15: The Q_end prompting format on a GSM8K test exemplar with the "QA" pattern.

Figure 16: The A_begin prompting format on a GSM8K test exemplar.

Figure 17: The Q_begin prompting format on a GSM8K test exemplar without the "QA" pattern.

Figure 18: The Q_end prompting format on a GSM8K test exemplar without the "QA" pattern.
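Since the figures themselves are not reproduced here, the following is a minimal Python sketch of how the three instruction positions could be assembled into a scorer prompt. The function name `format_scorer_prompt`, its parameters, and the exact whitespace and "Q:"/"A:" markers are assumptions made for illustration; they are not taken from the authors' code. Rejecting A_begin without the "QA" pattern is also a choice of this sketch, made because the text above only describes dropping the pattern for Q_begin and Q_end.

```python
def format_scorer_prompt(instruction: str, question: str,
                         position: str, qa_pattern: bool = True) -> str:
    """Build a scorer prompt with the instruction at the given position.

    Hypothetical helper: the separators and the "Q:"/"A:" markers are
    assumed, not copied from the paper's figures.
    """
    if position == "Q_begin":
        # Instruction comes before the question.
        body = f"{instruction}\n{question}"
        return f"Q: {body}\n\nA:" if qa_pattern else body
    if position == "Q_end":
        # Instruction comes after the question.
        body = f"{question}\n{instruction}"
        return f"Q: {body}\n\nA:" if qa_pattern else body
    if position == "A_begin":
        # Instruction starts the answer, so an answer marker is needed.
        if not qa_pattern:
            raise ValueError("A_begin is only sketched with the QA pattern")
        return f"Q: {question}\n\nA: {instruction}"
    raise ValueError(f"unknown position: {position!r}")


# Placeholder inputs, not an actual GSM8K exemplar.
example = format_scorer_prompt(
    "Let's think step by step.", "<question text>", "A_begin")
print(example)
```

With `qa_pattern=False`, the Q_begin and Q_end variants return only the instruction and question, which corresponds to the format used for instruction-tuned scorers in Figures 17 and 18.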

This paper is available on arXiv under the CC0 1.0 DEED license.