Survival of the Fittest: Compact Generative AI Models Are the Future for Cost-Effective AI at Scale
After a decade of rapid growth in artificial intelligence (AI) model complexity and compute, 2023 marks a shift in focus to efficiency and the broad application of generative AI (GenAI). As a result, a new crop of models with fewer than 15 billion parameters, referred to as nimble AI, can closely match the capabilities of ChatGPT-style giant models containing more than 100B parameters, especially when targeted at particular domains. While GenAI is already being deployed across industries for a wide range of business uses, the use of compact yet highly intelligent models is rising. In the near future, I expect there will be a small number of giant models and a giant number of small, more nimble AI models embedded in countless applications.
While there has been great progress with larger models, bigger is certainly not better when it comes to training and environmental costs. TrendForce estimates that training GPT-4 alone reportedly cost more than $100 million, while nimble-model pre-training costs are orders of magnitude lower (for example, approximately $200,000 quoted for MosaicML's MPT-7B). Moreover, most compute costs accrue during continuous inference execution, where larger models face a similar challenge of expensive compute. Furthermore, giant models hosted on third-party environments raise security and privacy challenges.