Applying AI successfully to investing is a very different challenge from applying it in the many industries where it has already succeeded. The combination of the inherent risk of financial markets and the crucial role asset managers play in the economy makes reliability the most important factor to consider: the same trustworthiness required of structural engineers building a 100-story skyscraper in the center of Manhattan.
Just as buildings need solid foundations to resist earthquakes, AI models need to learn from meaningful data to deliver what they promise: greater portfolio efficiency, risk control and the ability to continuously adapt and improve over each market cycle. Yet it is often unclear how to achieve this. In this paper, we discuss the essentials of building AI that investors can rely on, so that we invest with models we can trust and control, and not the other way around.
2002 was a year marked by several significant events: euro notes and coins entered circulation in twelve European Union member states, former US president Jimmy Carter won the Nobel Peace Prize, and Brazil won the FIFA World Cup, defeating Germany 2-0 in a memorable final with both goals scored by an unstoppable 25-year-old Ronaldo.
Among all these - and many more - high-profile events, the release of the first Spider-Man film might have gone unnoticed. Yet this movie fixed in our collective imagination not only the famous superhero but also the following quote: “With Great Power Comes Great Responsibility”.
In the movie, when the young Peter Parker hears these words from his Uncle Ben, he is still figuring out how to deal with his supernatural powers. Only after his uncle’s tragic death does Peter fully grasp the meaning and importance of this advice, choosing to become the just and fair superhero on whom New Yorkers can always rely.
To a certain extent, this principle aptly frames the increasingly relevant role that AI is playing in the asset management industry.
The technological progress of the last few decades has indeed equipped asset and investment managers with new, powerful tools. Just as that spider bite enabled Peter Parker to run on walls, shoot webs and jump across buildings, new technologies such as Artificial Intelligence (AI) are allowing investors to better analyze financial data and gain a deeper understanding of the inner dynamics of financial markets. Consequently, this is leading to the construction of better diversified and more efficient portfolios that can adapt to the constant evolution of today’s markets.
However, the unprecedented power, speed and precision of AI call for greater responsibility from those building AI-driven investment solutions. In this regard, AI developers have a duty to develop models that investors can fully understand and control. As financial markets are increasingly noisy and complex, institutional investors seek investment strategies that deliver results aligned with their expectations, in good times but especially in bad ones.
In this paper, we will outline the key factors needed to fully unlock the potential of AI to build reliable investment solutions for institutional investors. As we are going to see, this requires combining state-of-the-art technology, the best available and most meaningful data, and a rigorous training process in which humans still play a vital role.
Just as Michelin-starred chefs take great care in sourcing, selecting and using only ingredients of the highest quality, using accurate and meaningful data to train Artificial Intelligence models is the first step toward consistent results that investors can trust.
Indeed, AI models have an unparalleled capacity to analyze data and make accurate predictions. Yet, if the wrong input data is given, the model could infer wrong or non-existent relationships, leading to incorrect conclusions. Thus, the more high-quality data it receives, the better its estimations will be.
From this standpoint, it is clear why the expression “garbage-in, garbage-out” is not just a famous computer science rule but an extremely important warning on how to correctly use AI in investments. In fact, because AI is always working with data, the wrong input can hit twice, even with the most accurate model: once when the model is first trained, and again as it keeps learning from new observations.
Indeed, after an initial period of training, AI is fed data it has not seen before, so that the knowledge acquired is always up to date, expanded and improved with new observations. Thanks to this process, AI is able to uncover non-linear and complex relationships among variables by continuously learning from vast datasets, instead of being explicitly programmed to perform a sequence of predetermined tasks like traditional quant techniques.
For those reasons, financial AI models need data of exceptionally high quality. First, to facilitate the AI learning process, input data must be accurate and indisputable - that is, data everyone agrees on. Furthermore, it must be truly informative so that the model does not infer connections that are just the result of pure randomness.
Even once the potential of AI techniques and the importance of data quality are understood, it is not always clear which type of data to actually use.
Indeed, investors are finding themselves with an immense quantity and variety of financial data to choose from. First, financial markets and financial statements provide a huge quantity of market data (e.g. asset prices and returns, volumes) and company fundamentals (e.g. assets, liabilities, revenues). Second, the advent of the Internet has increased the coverage and spread of financial news. Lastly, investors are looking more and more at unstructured and alternative datasets like satellite images and social media posts as a way to expand their competitive edge.
Obviously, each type of financial data holds some unique pieces of information, has its own specific strengths and weaknesses, and, therefore, is better suited for a different objective.
Yet, if the goal is to reliably build and train an AI model, then input data must not only be informative but also be relevant for the investment horizon chosen to execute a particular strategy. The benefits of staying invested and accumulating time in the market are well known and widely exploited, in particular by institutional investors. For this specific purpose, AI models need to adopt a medium- to long-term perspective and avoid being affected by temporary short-term shocks.
To identify the most effective type of financial data in this sense, in Exhibit 3 we outline how macro-types of financial data differ in terms of predictive power (vertical axis) and predictive horizon (horizontal axis). Predictive power is related to how much essential and useful information is aggregated in the data - that is, cause-effect relationships and underlying connections among assets. Predictive horizon refers to the time it usually takes for that information to be priced into securities and turned into exploitable investment opportunities.
Financial news represents a widely available source of information for many investors. However, when used to train the AI, it risks adding noise and information that is only relevant for a short period of time.
Despite the potential of the vast and non-standard world of alternative data, the existing financial literature is still divided on how much signal this data actually conveys. In this sense, alternative data risks adding an extra layer to the analysis that merely blurs the real causality among asset dynamics.
Financial statements and fundamentals have long been the cornerstone of long-term value investing. Unfortunately, they present structural impediments, such as infrequent and delayed publication, that keep AI models from being responsive to current market conditions, so that investment opportunities are missed before they disappear.
From this standpoint, historical market data emerges as the type of financial data that best balances the predictive power-horizon tradeoff.
Indeed, asset prices are a standardized, indisputable and widely available information source that offers investors a 360-degree view of the market. In fact, the metrics that can be calculated from them - such as variances, covariances and return distributions - describe well the connections and the structure of financial markets.
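To make these metrics concrete, the following is a minimal sketch of how returns, variances and a covariance could be derived from closing prices using only the Python standard library. The price series and the helper names (`simple_returns`, `sample_covariance`) are illustrative assumptions, not data or code from this paper.

```python
from statistics import mean, variance


def simple_returns(prices):
    """Period-over-period simple returns from a price series."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equally long return series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Hypothetical closing prices for two assets (illustrative numbers only).
asset_a = [100.0, 102.0, 101.0, 105.0, 107.0]
asset_b = [50.0, 49.5, 50.5, 51.0, 50.0]

ret_a = simple_returns(asset_a)
ret_b = simple_returns(asset_b)

var_a = variance(ret_a)                    # sample variance of asset A returns
var_b = variance(ret_b)                    # sample variance of asset B returns
cov_ab = sample_covariance(ret_a, ret_b)   # how the two assets move together
corr_ab = cov_ab / (var_a ** 0.5 * var_b ** 0.5)

print("variance A:", var_a)
print("variance B:", var_b)
print("covariance:", cov_ab)
print("correlation:", corr_ab)
```

In practice the same quantities would be computed over many assets and much longer histories; the point here is only that they follow directly from prices, with no additional data source required.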
Moreover, historical market data partially embeds all the above-mentioned sources of information, thus better reflecting the underlying market dynamics. In this sense, the principle that market prices reflect all available information, only reacting to new or unexpected information, represents one of the cornerstones of the modern approach to investing since the 1970 groundbreaking review of theoretical and empirical research around the Efficient Market Hypothesis by American economist Eugene Fama.
Additionally, the information contained in market data is well suited to a medium- to long-term horizon. Indeed, through a disciplined approach that includes, for instance, regular rebalancing, it provides information useful for adapting to the gradual unfolding of financial markets while avoiding sharp and sudden changes in volatility.
Ultimately, market data fits the criteria outlined in the previous chapter to ensure that AI models function efficiently and correctly. On the one hand, financial markets offer data that is standardized and indisputable (once markets close, investors cannot argue against the price of securities). On the other hand, market data is constantly updated. This greatly facilitates and improves the predictive power of AI, enabling investors to better extract the signal from the noise overload of today’s world.
To develop reliable AI models for the asset management industry, understanding what type of data to use is a first, crucial step. However, similarly to how we need a recipe to turn ingredients - even of the highest quality - into a delicious meal, we cannot expect an AI model to automatically turn data into superior investment solutions without a rigorous training process.
From this perspective, Exhibit 4 shows that, to achieve the best results, high-quality data by itself is not enough. A state-of-the-art model is obviously another necessary condition for success, and it can only be achieved through a disciplined approach in which human intelligence plays a vital role.
Indeed, ongoing human supervision is key to ensuring that AI is trained properly, starting from data collection and continuing throughout the model’s continuous learning, as shown in Exhibit 5. As a matter of fact, any flaw in the data can slow down the learning process of AI, weakening the meaningful connections among variables. Moreover, AI struggles to identify what information is correct and what is not.
In this sense, the human role is fundamental because historical market data should be continuously checked for consistency and potential mistakes. In fact, a lot of expertise is required to handle some common events in financial markets - such as stock splits and M&A activities - that can artificially distort the data.
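As an illustration of how a stock split can distort raw prices, the sketch below back-adjusts a price series for a split. The prices, the 2-for-1 ratio and the function name `back_adjust_for_split` are hypothetical; real data vendors apply more elaborate adjustment rules (dividends, spin-offs, mergers) that are not covered here.

```python
def simple_returns(prices):
    """Period-over-period simple returns from a price series."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def back_adjust_for_split(prices, split_index, ratio):
    """Divide every price before split_index by ratio.

    After adjustment, pre-split and post-split prices are expressed
    in the same per-share terms and can be compared directly.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return [p / ratio if i < split_index else p for i, p in enumerate(prices)]


# Hypothetical raw closes; a 2-for-1 split takes effect at index 2.
raw = [200.0, 204.0, 102.5, 103.0]
adjusted = back_adjust_for_split(raw, split_index=2, ratio=2.0)

raw_ret = simple_returns(raw)        # contains a spurious crash on the split day
adj_ret = simple_returns(adjusted)   # reflects the actual small move

print("raw return on split day:     ", raw_ret[1])
print("adjusted return on split day:", adj_ret[1])
```

Left unadjusted, the split day would look like a loss of roughly half the share price, which a model could easily mistake for a genuine market event.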
Furthermore, an efficient training process is crucial to reduce the risk of overfitting - that is, building overly complex models that tend to draw conclusions based on random correlations, mistaking noise for signal. Indeed, overly complex and case-specific models risk mimicking the original data too closely, thus failing to correctly generalize the underlying dynamics among variables and adapt to never-seen observations.
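A toy example can make the idea of overfitting tangible. In the sketch below, a lookup table that memorizes every training observation stands in for an overly complex model, and a straight line fitted by least squares stands in for a simpler one. The data is synthetic and deliberately small (a true relationship y = 2x plus alternating noise); it is an assumption chosen for clarity, not a financial dataset.

```python
def mse(predict, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_lookup(xs, ys):
    """'Model' that simply memorizes each training observation."""
    table = dict(zip(xs, ys))
    return lambda x: table[x]


# Synthetic data: signal is y = 2x, noise differs between the two samples.
xs = [0, 1, 2, 3, 4, 5]
train_y = [2 * x + e for x, e in zip(xs, [1, -1, 1, -1, 1, -1])]
test_y = [2 * x + e for x, e in zip(xs, [-1, 1, -1, 1, -1, 1])]

line = fit_line(xs, train_y)
lookup = fit_lookup(xs, train_y)

print("lookup train MSE:", mse(lookup, xs, train_y))
print("lookup test MSE: ", mse(lookup, xs, test_y))
print("line train MSE:  ", mse(line, xs, train_y))
print("line test MSE:   ", mse(line, xs, test_y))
```

The memorizing model reproduces the training sample perfectly because it has learned the noise along with the signal, and it pays for that on the second sample, where the simpler line has the lower error.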
Instead, well-built and well-trained financial AI is capable of producing investment strategies delivering results that are coherent with investors’ expectations and objectives.
A concrete example can help clarify this concept. Imagine we have built and trained an equity strategy with the aim of outperforming a particular benchmark (e.g. the S&P 500). If a market crash occurs and the equity risk premium is consequently negative, we cannot expect our model to achieve positive returns during that period. However, if built properly, the model should still be able to adapt to the new scenario and outperform the benchmark, thus reducing the overall drawdown.
From this example, two important considerations emerge about the true added value of AI in investments: reliability.
First, reliability means avoiding unpleasant surprises and being protected from unexpected shocks. In fact, if the machine is well trained, it can dynamically adapt to the current scenario to keep volatility under control and deliver the intended investment objective. Mitigating drawdowns means protecting the capital on which returns are calculated: a smaller loss requires a disproportionately smaller gain to recover, which substantially accelerates the appreciation of capital over time.
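The arithmetic behind drawdown mitigation can be sketched in a few lines. A loss of d must be followed by a gain of d / (1 - d) to return to the previous peak, so the required recovery grows faster than the loss itself. The two value paths below are hypothetical and only illustrate the calculation.

```python
def max_drawdown(values):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, 1.0 - v / peak)
    return worst


def gain_to_recover(drawdown):
    """Gain needed to return to the previous peak after a given drawdown."""
    if not 0.0 <= drawdown < 1.0:
        raise ValueError("drawdown must be in [0, 1)")
    return drawdown / (1.0 - drawdown)


# Hypothetical portfolio values over the same period (illustrative only).
benchmark = [100.0, 110.0, 77.0, 88.0]   # falls about 30% from its peak
strategy = [100.0, 110.0, 88.0, 96.8]    # falls about 20% from its peak

for name, path in (("benchmark", benchmark), ("strategy", strategy)):
    dd = max_drawdown(path)
    print(name, "max drawdown:", dd, "gain to recover:", gain_to_recover(dd))
```

Under these assumed numbers, limiting the drawdown from about 30% to about 20% lowers the gain needed to recover from roughly 43% to 25%.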
Second, a disciplined approach puts investors behind the wheel of these new technological tools. Even if a poorly trained AI model succeeds in delivering superior returns, that success will be only temporary without transparency and interpretability. Understanding how the model operates and why it makes a certain decision is truly vital to know whether the model is working correctly - and to immediately recognize and fix a problem that arises.
Ultimately, the full benefits of AI in investing come to light: better diversified and more efficient portfolios, protected from unexpected risks and capable of continuously adapting to and improving over each market cycle.
In conclusion, although Artificial Intelligence is successfully being used across many industries, as the driver of investment decisions it is still underutilised and partly misunderstood. It is not a turnkey technology that can magically turn data into a professional strategy. However, if built and trained properly, it can be a powerful tool to deliver investment results that are coherent with investors’ expectations.
In this sense, the first step towards unlocking the potential of AI in investing consists of using meaningful data. Indeed, we could say that “AI is only as good as its weakest data points”. If incorrect or unreliable data is given as an input, then the model risks producing inaccurate conclusions.
Due to the huge variety and volume of financial data, it is not always easy to determine what kind of data to use to achieve investment objectives. On the one hand, input data should be informative, standardized and indisputable. On the other hand, to deliver reliable results for institutional investors, it should also offer the best tradeoff between predictive horizon and predictive power. Historical market data and the metrics that can be calculated from it (variances, covariances, return distributions) offer exactly these characteristics.
However, high-quality data by itself is not enough. A robust training process is needed, in which humans still play an important role. By combining the two, investors can really take full control of this technology, turning it from a black box to a transparent and powerful ally.