Andrew Gordon, Senior Research Consultant, Prolific – Interview Series

Andrew Gordon draws on his robust background in psychology and neuroscience to uncover insights as a researcher. With a BSc in Psychology, MSc in Neuropsychology, and Ph.D. in Cognitive Neuroscience, Andrew leverages scientific principles to understand consumer motivations, behavior, and decision-making.

Prolific was created by researchers for researchers, aiming to offer a superior method for obtaining high-quality human data and input for cutting-edge research. Today, over 35,000 researchers from academia and industry rely on Prolific to collect definitive human data and feedback. The platform is known for its reliable, engaged, and fairly treated participants, with a new study launched every three minutes.

How do you leverage your background in cognitive neuroscience to help researchers who are undertaking projects involving AI?

A good starting point is defining what cognitive neuroscience actually encompasses. Essentially, cognitive neuroscience investigates the biological underpinnings of cognitive processes. It combines principles from neuroscience and psychology, and occasionally computer science, among others, to understand how the brain enables various mental functions. In practice, anyone doing cognitive neuroscience research needs a strong grasp of research methodologies and a good understanding of how people think and behave. These two aspects are crucial, and they combine to support the development and running of high-quality AI research as well. One caveat, though, is that AI research is a broad term; it can involve anything from foundational model training and data annotation all the way to understanding how people interact with AI systems. Running research projects with AI is no different from running research projects outside of AI: you still need a good understanding of methods, well-designed studies that produce the best data, correct sampling to avoid bias, and effective analyses of that data to answer whatever research question you're addressing.

Prolific emphasizes ethical treatment and fair compensation for its participants. Could you share insights on the challenges and solutions in maintaining these standards?

Our compensation model is designed to ensure that participants are valued and rewarded, so they feel like they are playing a significant part in the research machine (because they are). We believe that treating participants fairly and paying them at a fair rate motivates them to engage more deeply with research and consequently provide better data.

Unfortunately, most online sampling platforms do not enforce these principles of ethical payment and treatment. The result is a participant pool that is incentivized not to engage with research but to rush through it as quickly as possible to maximize earning potential, leading to low-quality data. Maintaining the stance we take at Prolific is challenging; we're essentially fighting against the tide. The status quo in AI research and other forms of online research has focused not on participant treatment or well-being but on maximizing the amount of data that can be collected at the lowest cost.

Making the wider research community understand why we've taken this approach, and the value they'll see by using us rather than a competing platform, presents quite the challenge. Another challenge, from a logistical point of view, involves devoting a significant amount of time to responding to concerns, queries, or complaints from our participants or researchers in a timely and fair manner. We dedicate a lot of time to this because it keeps users on both sides – participants and researchers – happy, encouraging them to keep coming back to Prolific. However, we also rely heavily on the researchers using our platform to adhere to our high standards of treatment and compensation once participants move to the researcher's task or survey and thus leave the Prolific ecosystem. What happens off our platform is really in the control of the research team, so we depend not only on participants letting us know if something is wrong but also on our researchers upholding the highest possible standards. We try to provide as much guidance as we possibly can to ensure that this happens.

Considering the Prolific business model, what are your thoughts on the essential role of human feedback in AI development, especially in areas like bias detection and social reasoning improvement?

Human feedback in AI development is crucial. Without human involvement, we risk perpetuating biases, overlooking the nuances of human social interaction, and failing to address the ethical concerns associated with AI. This could hinder our progress towards creating responsible, effective, and ethical AI systems. In terms of bias detection, incorporating human feedback during the development process is crucial because we should aim to develop AI that reflects as wide a range of views and values as possible, without favoring one over another. Different demographics, backgrounds, and cultures all carry unconscious biases that, while not necessarily negative, might still reflect a viewpoint that wouldn't be widely held.

Collaborative research between Prolific and the University of Michigan highlighted how the backgrounds of different annotators can significantly affect how they rate aspects such as the toxicity of speech or politeness. To address this, involving participants from diverse backgrounds, cultures, and perspectives can prevent these biases from being ingrained in AI systems under development. Additionally, human feedback allows AI researchers to detect more subtle forms of bias that automated methods might miss. This creates the opportunity to address biases through adjustments to the algorithms, underlying models, or data preprocessing techniques.
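
As an illustration of how such annotator-background effects might be surfaced in practice, here is a minimal Python sketch; the data, group labels, and column names are entirely hypothetical, not the schema of the Prolific and University of Michigan study:

```python
import pandas as pd

# Toy annotations: one row per annotator rating of one text item (1-5 scale).
# All values and column names are hypothetical, for illustration only.
annotations = pd.DataFrame({
    "item_id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "annotator_group": ["A", "B", "C", "A", "B", "C", "A", "B", "C"],
    "toxicity": [4, 2, 3, 5, 3, 4, 2, 1, 2],
})

# Mean rating per annotator group: persistent gaps suggest that annotator
# background, not just the texts themselves, is shaping the labels.
print(annotations.groupby("annotator_group")["toxicity"].mean())

# Per-item disagreement across groups: the most contested items are good
# candidates for review before the labels are used as training data.
spread = (
    annotations.pivot_table(index="item_id",
                            columns="annotator_group",
                            values="toxicity")
    .apply(lambda row: row.max() - row.min(), axis=1)
    .sort_values(ascending=False)
)
print(spread)
```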

The situation with social reasoning is essentially the same. AI often struggles with tasks requiring social reasoning because, by nature, it is not a social being, while humans are. Detecting the context in which a question is asked, understanding sarcasm, or recognizing emotional cues requires human-like social reasoning that AI cannot learn on its own. We, as humans, learn socially, so the only way to teach an AI system these types of reasoning is by using actual human feedback to train the AI to interpret and respond to various social cues. At Prolific, we developed a social reasoning dataset specifically designed to teach AI models this important skill.
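
To make that concrete, a single item in a social-reasoning dataset might pair an utterance and its context with human judgments about the social meaning. The sketch below is purely illustrative; it is not the actual format of Prolific's dataset:

```python
# Hypothetical example item; the fields shown are assumptions, not the
# actual schema of Prolific's social reasoning dataset.
social_reasoning_item = {
    "context": "A colleague arrives an hour late to a meeting.",
    "utterance": "Nice of you to join us.",
    "literal_meaning": "It is pleasant that you are here.",
    "human_label": {
        "is_sarcastic": True,          # human-provided social judgment
        "speaker_emotion": "annoyed",
        "appropriate_reply": "Sorry I'm late, traffic was terrible.",
    },
}
```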

In essence, human feedback not only helps identify areas where AI systems excel or falter but also enables developers to make the necessary improvements and refinements to the algorithms. A practical example of this is how ChatGPT operates. When you ask a question, ChatGPT sometimes presents two answers and asks you to choose which is better. This approach is taken because the model is always learning, and the developers understand the importance of human input in determining the best answers, rather than relying solely on another model.
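
The two-answer comparison described here is an instance of pairwise preference collection, the kind of signal used in preference-based fine-tuning methods such as RLHF. Below is a minimal sketch of what one preference record and collection step could look like; the function and field names are illustrative, not OpenAI's actual pipeline:

```python
from dataclasses import dataclass

@dataclass
class PreferenceRecord:
    """One human judgment: which of two candidate answers is better."""
    prompt: str
    answer_a: str
    answer_b: str
    preferred: str  # "a" or "b"

def collect_preference(prompt: str, answer_a: str, answer_b: str) -> PreferenceRecord:
    """Show a rater two candidate answers and record their choice."""
    print(f"Prompt: {prompt}\n[a] {answer_a}\n[b] {answer_b}")
    choice = ""
    while choice not in ("a", "b"):
        choice = input("Which answer is better? (a/b): ").strip().lower()
    return PreferenceRecord(prompt, answer_a, answer_b, choice)

# Records like this are typically aggregated across many raters and then
# used to train a reward model that scores new candidate answers.
record = collect_preference(
    "Explain overfitting in one sentence.",
    "Overfitting is when a model memorizes noise in the training data.",
    "Overfitting is when a model is too big.",
)
print(record)
```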

Prolific has been instrumental in connecting researchers with participants for AI training and research. Can you share some success stories or significant advancements in AI that were made possible through your platform?

Due to the commercial nature of a lot of our AI work, especially in non-academic spaces, most of the projects we're involved in are under strict Non-Disclosure Agreements. This is primarily to ensure the confidentiality of techniques or methods, protecting them from being replicated. However, one project we are at liberty to discuss is our partnership with Remesh, an AI-powered insights platform. We collaborated with OpenAI and Remesh to develop a system that utilizes representative samples of the U.S. population. In this project, thousands of individuals from a representative sample, which Prolific was able to provide, engaged in discussions on AI-related policies through Remesh's system, enabling the development of AI policies that reflect the broad will of the public rather than a select demographic.

Looking forward, what is your vision for the future of ethical AI development, and how does Prolific plan to contribute to achieving this vision?

My hope for the future of AI, and its development, hinges on the recognition that AI will only be as good as the data it's trained on. The importance of data quality cannot be overstated for AI systems. Training an AI system on poor-quality data inevitably results in a subpar AI system. The only way to obtain high-quality data is to recruit a diverse and motivated group of participants who are eager to provide the best data possible. At Prolific, our approach and guiding principles aim to foster exactly that. By creating a bespoke, thoroughly vetted, and trustworthy participant pool, we anticipate that researchers will use this resource to develop more effective, reliable, and trustworthy AI systems in the future.

What are some of the biggest challenges you face in the collection of high-quality, human-powered AI training data, and how does Prolific overcome these obstacles?

The most significant challenge, without a doubt, is data quality. Bad data is not merely unhelpful; it can actually lead to detrimental outcomes, particularly when AI systems are employed in critical areas such as financial markets or military operations. This concern underscores the essential principle of "garbage in, garbage out": if the input data is subpar, the resultant AI system will inherently be of low quality or utility. Most online samples tend to produce data of lesser quality than what's optimal for AI development. There are numerous reasons for this, but one key factor that Prolific addresses is the general treatment of online participants. Often, these individuals are viewed as expendable, receiving low compensation, poor treatment, and little respect from researchers. By committing to the ethical treatment of participants, Prolific has cultivated a pool of motivated, engaged, thoughtful, honest, and attentive contributors. Data collected through Prolific is therefore of assured high quality, underpinning reliable and trustworthy AI models.
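
Fair treatment is the main lever described above; as a technical complement, researchers often also screen submissions programmatically before using them, for instance for failed attention checks or implausibly fast completion times. Here is a minimal sketch of that kind of screening, with assumed field names and thresholds rather than Prolific's actual checks:

```python
# Hypothetical per-submission records: completion time in seconds and
# whether the participant passed an embedded attention check.
submissions = [
    {"participant": "p1", "seconds": 412, "passed_attention_check": True},
    {"participant": "p2", "seconds": 35,  "passed_attention_check": True},
    {"participant": "p3", "seconds": 388, "passed_attention_check": False},
]

MIN_SECONDS = 60  # assumed floor: faster than this suggests rushing

def is_usable(sub: dict) -> bool:
    """Keep only submissions that pass basic quality screens."""
    return sub["passed_attention_check"] and sub["seconds"] >= MIN_SECONDS

usable = [s for s in submissions if is_usable(s)]
print([s["participant"] for s in usable])  # ['p1']
```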

Another challenge we face with AI training data is ensuring diversity within the sample. While online samples have significantly broadened the scope and variety of individuals we can study compared to in-person methods, they are predominantly limited to people from Western countries. These samples often skew towards younger, computer-literate, highly educated, and more left-leaning demographics, which doesn't fully represent the global population. To address this, Prolific has participants from over 38 countries worldwide. We also provide our researchers with tools to specify the exact demographic makeup of their sample in advance. Additionally, we offer representative sampling through census-matched templates on dimensions such as age, gender, and ethnicity, or even by political affiliation. This ensures that studies, annotation tasks, and other projects receive a diverse range of participants and, consequently, a wide variety of insights.
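
To make "census match" concrete: representative sampling of this kind typically allocates participant quotas to demographic cells in proportion to their census shares. A minimal sketch of that allocation follows, using made-up shares rather than real census figures or Prolific's actual algorithm:

```python
# Made-up population shares for illustration; a real template would use
# actual census figures for age, gender, ethnicity, etc.
census_shares = {
    ("18-34", "female"): 0.15,
    ("18-34", "male"):   0.15,
    ("35-54", "female"): 0.18,
    ("35-54", "male"):   0.17,
    ("55+",   "female"): 0.18,
    ("55+",   "male"):   0.17,
}

def allocate_quotas(total_n: int, shares: dict) -> dict:
    """Split a target sample size across demographic cells by share."""
    quotas = {cell: round(total_n * share) for cell, share in shares.items()}
    # Rounding can leave the total slightly off; adjust the largest cell.
    diff = total_n - sum(quotas.values())
    if diff:
        largest = max(quotas, key=quotas.get)
        quotas[largest] += diff
    return quotas

print(allocate_quotas(500, census_shares))
```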

Thank you for the great interview. Readers who wish to learn more should visit Prolific.