Using Language Models to Simulate Human Samples: Appendix

11 Jun 2024

Authors:

(1) TIMNIT GEBRU, Black in AI;

(2) JAMIE MORGENSTERN, University of Washington;

(3) BRIANA VECCHIONE, Cornell University;

(4) JENNIFER WORTMAN VAUGHAN, Microsoft Research;

(5) HANNA WALLACH, Microsoft Research;

(6) HAL DAUMÉ III, Microsoft Research; University of Maryland;

(7) KATE CRAWFORD, Microsoft Research.

1 Introduction

1.1 Objectives

2 Development Process

3 Questions and Workflow

3.1 Motivation

3.2 Composition

3.3 Collection Process

3.4 Preprocessing/cleaning/labeling

3.5 Uses

3.6 Distribution

3.7 Maintenance

4 Impact and Challenges

Acknowledgments and References

Appendix

A Appendix

In this appendix, we provide an example datasheet for Pang and Lee’s polarity dataset [22] (Figs. 1 to 4).

Fig. 1. Example datasheet for Pang and Lee’s polarity dataset [22], page 1.

Fig. 2. Example datasheet for Pang and Lee’s polarity dataset [22], page 2.

Fig. 3. Example datasheet for Pang and Lee’s polarity dataset [22], page 3.

Fig. 4. Example datasheet for Pang and Lee’s polarity dataset [22], page 4.

This paper is available on arXiv under a CC 4.0 license.