Large language models (LLMs) are taking the world by storm. An avalanche of use cases is being unlocked each day, and there are no signs of slowing. Various industries, including those previously thought to be "safe" from automation, are seeing their core business change at breakneck speed.
As machine learning engineers, we find ourselves at the forefront of developing and deploying these models that have the potential to transform numerous industries. While these advancements offer exciting possibilities, it is essential to address the ethical considerations and implications that arise from their development and deployment.
In this blog post, we aim to shed light on key ethical considerations and provide practical guidance to our fellow developers and clients, ensuring responsible LLM development.
Large language models, such as GPT-4 by OpenAI, represent a significant leap in artificial intelligence technology, demonstrating a powerful ability to generate and understand human-like text. These models can author compelling prose, answer questions, translate languages, tutor in diverse subject matter, and even write computer code. As we delve deeper into the age of AI, the power these models hold is profound. They can democratize information by providing access to knowledge and learning tools to anyone with internet access and have the potential to revolutionize industries, from healthcare to education. However, this power is not without its complexities. While such models can drive incredible innovation and societal benefits, they also raise critical ethical considerations that must be addressed to ensure that their deployment does not lead to unintended negative consequences. With the rise of LLMs and foundation models (FMs), we need to move more and more toward putting energy into model alignment and keeping them under control.
In this blog post we will focus on three key aspects under which most ethical considerations can be categorized:
Some LLMs are supposedly trained on "the entire internet". While this breadth of data enhances the model's capabilities and is the reason it is so good at understanding and producing human language, it also includes data we would rather not have reflected in our model's output. This includes, but is not limited to, language or content that is harmful or discriminatory, particularly towards marginalized or protected groups.
To counteract this, carefully curating your training (and fine-tuning) data is crucial: the dataset should be diverse and representative, yet devoid of harmful content. This is not just about removing offensive content, but about promoting inclusivity and fairness in the AI's responses.
One step further is to train a "guardrail" model which detects and filters out any inappropriate content in the training data or the data produced. This model is trained on examples of both appropriate and inappropriate samples. The goal is for the model to learn to distinguish between acceptable and unacceptable content. Once trained, this model is used for filtering the training or fine-tuning data, or it's integrated with the primary LLM. Every time the primary LLM generates output, it's also passed through the guardrail model. The guardrail model then flags the output for review, blocks it, or guides the primary model to modify it.
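To make the filtering flow concrete, here is a minimal sketch of how a guardrail sits between the primary LLM and the user. In practice the guardrail would be a trained classifier; a toy blocklist-based scorer (with hypothetical flagged terms) stands in for it here, purely to illustrate the block-or-pass decision.

```python
# Hypothetical flagged terms; a real guardrail would be a trained classifier.
BLOCKLIST = {"slur_example", "hate_example"}

def guardrail_score(text: str) -> float:
    """Fraction of tokens that match the blocklist (0.0 means clean)."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    flagged = sum(1 for token in tokens if token in BLOCKLIST)
    return flagged / len(tokens)

def filter_output(text: str, threshold: float = 0.0):
    """Pass LLM output through the guardrail: return it, or block it."""
    if guardrail_score(text) > threshold:
        return None  # blocked -- could instead flag for review or regenerate
    return text
```

The same `filter_output` step can be applied to training samples before fine-tuning, or to model responses at inference time.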
Humans can also be involved to label data in order to avoid toxic or harmful content and encourage the useful completion of a task. This is done by letting human annotators rate multiple completions of your LLM for one prompt at a time. This data is then used to train a reward model which takes over the human's task of rating and forwards the scores to a reinforcement learning algorithm to update the weights of the LLM.
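The step where annotator rankings become a training signal for the reward model is commonly formalized with a pairwise (Bradley-Terry style) preference loss: the reward model should assign a higher score to the completion the annotator preferred. A minimal sketch of that objective, with the scalar rewards as stand-ins for the reward model's outputs:

```python
import math

def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry probability that the chosen completion beats the rejected one."""
    return 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))

def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood the reward model minimizes on one ranked pair."""
    return -math.log(preference_probability(reward_chosen, reward_rejected))
```

Minimizing this loss over many annotated pairs pushes the reward model to reproduce human preferences, after which its scores can drive the reinforcement learning update of the LLM.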
When the data first needs to be annotated for training the LLM or a reward model, make sure to provide your annotators with enough guidance on how to label certain data. Additionally, selecting a diverse group of annotators helps avoid potential biases and contributes to a more balanced and representative model.
Lastly, being careful in your prompt design can help mitigate toxicity. Instead of:
"Explain why people who like pineapple on pizza have bad taste."
which assumes a negative stereotype and could perpetuate divisive attitudes, you could ask:
"Discuss the varying global perspectives on the topic of pineapple as a pizza topping, making sure to represent all views in a respectful and unbiased manner."
setting clear boundaries for a respectful conversation.
We previously built demos to detect toxicity, profanity, and hate speech in short text for German and Dutch.
What in entirely different circumstances might read as toxicity takes the form, in the context of LLMs, of the production of misinformation or "fake news".
One explanation is that we often don't know what a model is actually learning when presented with a certain dataset. Consequently, the model will attempt to fill gaps where it is missing data.
Many models allow for parameters to be adjusted. The most common ones which influence the model's "creative freedom" (and thus indirectly its hallucination) are temperature, top_p, and top_k. The more factual your use case, the stricter you should be with these parameters:
With all these parameters set to low numbers, an LLM could produce something like
"The cat sits on the balcony, meowing at the pigeons."
Experimenting with these parameters, the same prompt could result in something like this:
"The cat sits on the moon, meowing at the stars."
So choose your parameters wisely.
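To see why these parameters matter, it helps to look at what they do to the token distribution at decoding time. The sketch below, using toy logits, shows how temperature scaling and top-k filtering reshape the distribution the model samples from: low temperature concentrates probability mass on the most likely token (factual, predictable), high temperature flattens it (creative, hallucination-prone).

```python
import math

def sample_distribution(logits, temperature=1.0, top_k=None):
    """Turn raw logits into a sampling distribution, as decoding parameters would."""
    # Temperature scaling: < 1 sharpens the distribution, > 1 flattens it.
    scaled = [logit / temperature for logit in logits]
    # Top-k filtering: keep only the k highest-scoring tokens.
    if top_k is not None:
        cutoff = sorted(scaled, reverse=True)[top_k - 1]
        scaled = [s if s >= cutoff else float("-inf") for s in scaled]
    # Softmax (shifted by the max for numerical stability).
    max_s = max(scaled)
    exps = [math.exp(s - max_s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

Top-p (nucleus) sampling works analogously, cutting the sorted distribution off once its cumulative probability exceeds p rather than after a fixed number of tokens.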
Another way to limit hallucination is augmenting the LLM with independent and verified sources to cross-verify the data returned by the model. This can be triggered by prompting the LLM to state its source each time it claims to produce a response with facts (e.g. "refer to the source of the information you provide in your response" or "only answer if you can refer to the source of the information you provide. If you can't answer the question, say <…>"). This can help users ascertain the reliability of the information.
Taking this one step further would be using retrieval-augmented generation (RAG) which adds a knowledge base to your LLM. We refer to another blog post explaining its concept and implementation.
You can combine source verification and RAG to achieve even better results.
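At its core, RAG retrieves documents relevant to the user's question and prepends them to the prompt so the model answers from provided context rather than from memory. A minimal sketch of that flow, with a two-document knowledge base and simple keyword-overlap retrieval standing in for a real embedding-based vector store:

```python
KNOWLEDGE_BASE = [
    "Hawaiian pizza was invented in Canada in 1962 by Sam Panopoulos.",
    "The Leaning Tower of Pisa is located in Italy.",
]

def retrieve(question: str, documents, k: int = 1):
    """Rank documents by word overlap with the question and keep the top k."""
    question_words = set(question.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question: str) -> str:
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(retrieve(question, KNOWLEDGE_BASE))
    return (
        "Answer using only the context below and cite it.\n"
        f"Context: {context}\n"
        f"Question: {question}"
    )
```

The instruction to cite the context is where source verification and RAG combine: the model is both grounded in retrieved facts and asked to point back to them.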
This can be another effective way to avoid hallucinations and steer the LLM toward more useful responses. Therefore, give clear context and create prompts that are as specific as possible and avoid ambiguity. Instead of:
"Describe the famous international law passed in 2020 banning pineapple on pizza."
which makes this prompt a source of potential misinformation, you could ask:
"Could you provide an overview of the origins and popular opinion about the 'Hawaiian pizza', which includes pineapple as a topping, based on information available up to 2021?"
which asks for factual information within a specific time frame, reducing the chance of the model generating fictitious or speculative information. Additionally, you can design your prompts in a way that describes the role of the LLM (e.g. "You are a customer service agent…").
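Role descriptions of this kind are typically placed in a system message that precedes the user's question. The sketch below shows one way to compose such a message list; the dictionary structure mirrors common chat-completion APIs, but treat it as an illustrative assumption rather than any specific vendor's schema.

```python
def build_messages(role_description: str, user_question: str):
    """Compose a chat-style message list that pins down the model's role."""
    return [
        # The system message constrains the model's persona and scope.
        {"role": "system", "content": role_description},
        # The user message carries the actual question.
        {"role": "user", "content": user_question},
    ]

messages = build_messages(
    "You are a customer service agent. Only answer questions about orders.",
    "Where is my order?",
)
```

Keeping the role instruction in a fixed system message, rather than mixed into the user's text, makes it harder for an off-topic question to pull the model outside its intended use case.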
Another simple but important step in terms of explainability is transparency with end users. Let the users know that they're dealing with an artificial intelligence system and that the reality of this technology is that it can produce wrong answers. If an LLM is used for an unintended use case, it is more likely to hallucinate. Adding disclaimers not only improves user understanding but also sets realistic expectations about the reliability of the AI system.
When training or fine-tuning your LLM, you need to make sure not to run into any copyright or protection issues surrounding your data and model.
From a legal perspective, you need to be aware of where your model and your data are hosted in order to comply with data and privacy protection regulations. This for example includes not storing or using personal data without a legal basis and respecting inter alia the general principles of (i) data minimization, (ii) purpose limitation, (iii) storage limitation, (iv) accuracy, (v) fairness and transparency and (vi) integrity and confidentiality.
The question of intellectual property is currently widely discussed due to the lawsuits in the US against various companies training LLMs. LLMs learn from vast amounts of data, often ingesting copyrighted material, such as books, articles, or web content, during their training phase. Does an LLM infringe intellectual property when copyright-protected works are used to train or fine-tune it, or when it generates text based on a copyrighted source?
It will be very interesting to see how judges in different jurisdictions deal with this question.
At the time of writing, we are working toward a solution to this debate by establishing a framework that facilitates collecting and preprocessing image data for foundation models. The framework leverages Creative Commons (CC) licensed images to create free-to-use datasets for AI image generation. We previously published another blog post about image generation without copyright infringement.
The EU AI Act is planned to regulate the development and deployment of foundation models and thus LLMs. This includes (but is not limited to):
Even after following all these steps and building the most responsible AI system possible, the process isn't done (and will never truly be). Creating AI systems is a continuous, iterative cycle in which responsibility must be built into both the concept and deployment stages. Monitoring and updating the models is important to address emerging toxic patterns, trends, or novel misuse. This needs to be rounded out with governance policies and accountability measures for all involved stakeholders throughout the life cycle.
Building an AI system responsibly is a multifaceted task that demands a proactive approach, careful planning, and ongoing commitment. But the payoff, a reliable, user-friendly system that respects and safeguards users' interests and societal values, is worth the effort.
The development and deployment of AI systems represent a significant technological advancement. However, along with this progress comes the responsibility to ensure that the use of such technology is ethically sound and respects the rights and values of all stakeholders. By understanding and addressing the ethical considerations involved, we can harness the full potential of AI while minimizing the risks and harm associated with it.
Thank you for bearing with me until the end! I hope you found this blog post useful. In case you'd like to read more about LLMs from my wonderful colleagues, this way please: