On March 30, Bloomberg announced the launch of a large language model (LLM) built for the financial community: BloombergGPT.

Image source: Screenshot of Bloomberg website

Bloomberg is a global provider of business and financial information and news. On March 30, the company released a research paper detailing the development of BloombergGPT, a large-scale generative artificial intelligence (AI) model. The large language model (LLM) is trained on a wide range of financial data to support natural language processing (NLP) tasks across the financial field.

According to Bloomberg's WeChat official account, the model will help Bloomberg improve existing financial NLP tasks such as market sentiment analysis, named entity recognition, news classification, and question answering. Beyond that, BloombergGPT will unlock new opportunities to put the massive amount of data on Bloomberg terminals to work, bringing the potential of artificial intelligence to the financial sector.
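BloombergGPT itself is not publicly available, but tasks like these are commonly posed to a generative LLM as few-shot prompts. The sketch below shows how such a prompt might be assembled for sentiment analysis; the headlines and labels are invented for illustration and do not come from Bloomberg.

# Sketch: framing financial sentiment analysis as a few-shot prompt
# for a generative LLM. Example headlines and labels are invented.
FEW_SHOT_EXAMPLES = [
    ("Company X beats quarterly earnings estimates", "positive"),
    ("Regulator opens probe into Company Y", "negative"),
]

def build_sentiment_prompt(headline: str) -> str:
    # Assemble a few-shot classification prompt for a causal LM.
    lines = ["Classify the sentiment of each financial headline."]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Headline: {text}\nSentiment: {label}")
    lines.append(f"Headline: {headline}\nSentiment:")
    return "\n\n".join(lines)

print(build_sentiment_prompt("Company Z shares fall after CEO resigns"))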

According to the report, Bloomberg researchers pioneered a mixed training approach that combines financial data with general-purpose datasets, so that the model achieves the best results on financial benchmarks while remaining competitive on general-purpose LLM benchmarks.
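The article does not say how the two data sources are combined, but a common scheme is to sample training sequences from each corpus in proportion to its size. A minimal sketch under that assumption (the 363:345 split matches the token counts cited below):

import random

def mixed_stream(fin_seqs, gen_seqs, p_fin=363 / (363 + 345)):
    # Yield training sequences, drawing from the financial corpus with
    # probability p_fin and the general corpus otherwise. This is a common
    # mixing scheme, not necessarily Bloomberg's exact method.
    fin_it, gen_it = iter(fin_seqs), iter(gen_seqs)
    while True:
        source = fin_it if random.random() < p_fin else gen_it
        try:
            yield next(source)
        except StopIteration:
            return  # stop once either corpus is exhausted

# Toy usage with placeholder "sequences"
fin = [f"FIN-{i}" for i in range(5)]
gen = [f"GEN-{i}" for i in range(5)]
for seq in mixed_stream(fin, gen):
    print(seq)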

At the same time, Bloomberg's ML Product and Research group and AI Engineering team leveraged the company's resources in data creation, collection, and curation to build one of the largest domain-specific datasets to date.

As a financial data company, Bloomberg's data analysts have collected and maintained financial-language documents over more than four decades. From this massive archive of English financial documents, the development team extracted and created a financial dataset of 363 billion tokens. This data was combined with a public dataset of 345 billion tokens, yielding a training corpus of more than 700 billion tokens (363 billion plus 345 billion is roughly 708 billion).

Using part of this corpus, Bloomberg's research team trained a decoder-only causal language model with 50 billion parameters. The team then benchmarked the trained model: NLP tasks in the financial domain were evaluated against a set of Bloomberg's own benchmarks, while general-purpose NLP tasks were evaluated against popular public benchmarks.
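For a sense of scale, the parameter count of a decoder-only Transformer can be estimated from its configuration: roughly 12·d² weights per layer (attention plus MLP) and vocab·d embedding weights. The configuration below is a BLOOM-style setup in the 50-billion-parameter range, used here as an illustrative assumption rather than a statement of BloombergGPT's exact architecture.

def estimate_decoder_params(vocab_size: int, d_model: int, n_layers: int) -> int:
    # Rough parameter count for a decoder-only Transformer:
    # 4*d^2 for the attention projections plus 8*d^2 for the MLP per layer,
    # plus the token embedding matrix (biases and LayerNorms ignored).
    embedding = vocab_size * d_model
    per_layer = 12 * d_model ** 2
    return embedding + n_layers * per_layer

# Illustrative BLOOM-style configuration (an assumption, not from the article)
params = estimate_decoder_params(vocab_size=131_072, d_model=7_680, n_layers=70)
print(f"~{params / 1e9:.1f}B parameters")  # prints ~50.6B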

According to Bloomberg, the BloombergGPT model far outperforms open models of similar size on financial tasks, while also performing on par with or better than them on general-purpose NLP benchmarks.

Shawn Edwards, chief technology officer at Bloomberg, added: "BloombergGPT will allow us to handle many new types of applications that not only perform better than custom models, but also work out of the box and significantly reduce time to market."