Overview
With the rise of generative AI (GenAI), the need for large and diverse user bases to participate in AI evaluation and auditing has grown. GenAI developers are increasingly adopting crowdsourcing approaches to test and audit their AI products and services. However, how to design and deploy responsible and effective crowdsourcing pipelines for AI auditing and evaluation remains an open question.
This one-day workshop at HCOMP 2024 aims to take a step towards bridging this gap. Our interdisciplinary team of organizers will work with workshop participants to explore several key questions: how to improve output quality and worker productivity in crowdsourcing tasks for GenAI evaluation, which differ from tasks involving discriminative AI systems; how to guide crowds in auditing problematic AI-generated content while managing the psychological impact on workers; how to ensure marginalized voices are heard; and how to set up responsible and effective crowdsourcing pipelines for real-world GenAI evaluation. We hope this workshop will produce a research agenda and best practices for designing responsible crowd-based approaches to AI auditing and evaluation.