Foundation Models (FMs) have arrived, and they are bound to change the landscape of AI systems for good. These large pretrained AI models, which can be adapted to new use cases with relative ease, are already revolutionizing creative work and are expected to augment or take over ever more knowledge work in the coming years as industries identify and tackle new use cases.
Below, we focus on foundation models in general, across modalities and applications. For more specific information on Large Language Models (LLMs) - foundation models for natural language - please visit this page. More on generative AI applications and use cases can be found here.
Large language models - foundation models specialized in generating text - are advanced artificial intelligence (AI) systems that have revolutionized natural language processing and understanding. These models, such as OpenAI's GPT-3 (Generative Pre-trained Transformer 3) and its successors, are designed to understand and generate human-like text, making them powerful tools for a wide range of applications.
Foundation models for computer vision are large, high-performance models that have been pretrained on huge amounts of data. Examples include Vision Transformer (ViT) for image classification, You Only Look Once (YOLO) for object detection, and Segment Anything (SAM) for segmentation. Typically, you select one of these models based on its performance on the task at hand and then fine-tune it on your use-case-specific dataset.
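The select-then-fine-tune workflow can be sketched as follows. This is a minimal illustration in PyTorch: a tiny toy backbone stands in for a large pretrained model such as ViT, and all sizes, class counts, and data are made-up placeholders; in practice you would load real pretrained weights (e.g. from torchvision or Hugging Face).

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained backbone; its weights are assumed to come
# from large-scale pretraining, so we freeze them.
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
for p in backbone.parameters():
    p.requires_grad = False

# New task-specific head, trained on the use-case-specific dataset.
head = nn.Linear(64, 5)  # 5 hypothetical target classes
model = nn.Sequential(backbone, head)

# Only the head's parameters are optimized during fine-tuning.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One fine-tuning step on a dummy batch (toy features and labels).
x = torch.randn(8, 32)
y = torch.randint(0, 5, (8,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable} / {total}")
```

Freezing the backbone is only one option: with enough data, you can also unfreeze some or all backbone layers and train them with a lower learning rate.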
While foundation models can perform general tasks very well, they are still outperformed by expert models on specific tasks. How to efficiently teach a foundation model to be an expert in a specific area or topic is the next big question. One technique for adapting foundation models further is fine-tuning, whereby the model is specialized on additional data and knowledge, or taught a specific style of generation.
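One widely used way to make such adaptation efficient is parameter-efficient fine-tuning, for example low-rank adaptation (LoRA): instead of updating a large frozen weight matrix, a small low-rank correction is learned on top of it. The sketch below, with illustrative shapes chosen here for the example, shows the core idea in NumPy.

```python
import numpy as np

d, k, r = 1024, 1024, 8  # hidden sizes, with low rank r much smaller than d and k

rng = np.random.default_rng(0)
W = rng.standard_normal((d, k))          # frozen pretrained weight matrix
A = rng.standard_normal((d, r)) * 0.01   # trainable low-rank factor
B = np.zeros((r, k))                     # starts at zero: no change to W at first

def adapted_forward(x):
    # Frozen path plus the learned low-rank correction A @ B.
    return x @ W + x @ A @ B

frozen = W.size
trainable = A.size + B.size
print(f"trainable: {trainable} vs frozen: {frozen}")
```

Only A and B are updated during fine-tuning, so the number of trainable parameters drops from roughly a million (the full matrix) to a few tens of thousands, which is what makes specializing very large models tractable.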
Foundation models are increasingly becoming multimodal, meaning that a single model can process and relate multiple modalities of data (text, image, video, audio, ...).
AI solutions are increasingly becoming standardized and productized, especially for traditional AI tasks. We expect this trend to continue and expand towards generative AI as well.
Regulation often lags behind innovation, and the foundation model domain is no exception. New legal issues are arising with foundation models, such as privacy and copyright considerations for the data used to train them, as well as questions of transparency and risk management for downstream use cases built on top of them. The most recent draft of the upcoming EU AI Act adds provisions on foundation models and generative AI.