Is It Possible To Replicate Renaissance Technologies' Success?

Author

Jin Won Choi

Category

Machine Learning

Date

Nov. 17, 2020

Renaissance Technologies, or ‘RenTec’ for short, is widely regarded as the most successful asset manager of all time. According to the book ‘The Man Who Solved the Market’ by Gregory Zuckerman (henceforth called ‘MWSTM’), RenTec’s Medallion fund averaged gains of 39% per year from 1988 to 2018. That track record alone would earn the fund a place in the investment hall of fame, but what’s even more remarkable is that it was achieved net of some very hefty fees.

A typical hedge fund charges 2 and 20 - that is, 2% of assets and 20% of performance. RenTec, on the other hand, charged 5 and 44 for the Medallion fund during the period. If the fund had not charged any fees at all, it would have averaged gains of 66% per year during the same time period, which many would have thought impossible to achieve.
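To make the fee arithmetic concrete, here is a minimal single-year sketch. It rests on assumptions of my own: the management fee is deducted first, the performance fee applies only to whatever gain remains, and there is no hurdle rate or high-water mark. The function name `net_return` and the 30% gross return are hypothetical, chosen only for illustration. A real fund’s fee terms, and the effect of averaging over many years, are not captured here, so the sketch is not meant to reproduce the book’s figures.

```python
def net_return(gross: float, mgmt_fee: float, perf_fee: float) -> float:
    """Net return for one year under a simplified fee model (an assumption)."""
    # Management fee is charged on assets, so it comes off the return first.
    after_mgmt = gross - mgmt_fee
    # The performance fee only applies to gains, never to losses.
    return after_mgmt - perf_fee * max(after_mgmt, 0.0)


gross = 0.30  # hypothetical gross return for one year

typical = net_return(gross, 0.02, 0.20)    # "2 and 20"
medallion = net_return(gross, 0.05, 0.44)  # "5 and 44"

print(f"2 and 20: {typical:.1%} net")
print(f"5 and 44: {medallion:.1%} net")
```

On the same hypothetical 30% gross return, the investor keeps roughly 22.4% under 2 and 20 but only about 14% under 5 and 44, which gives a sense of how much a fund has to earn before fees to deliver strong returns after them.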

Given RenTec’s huge success, it comes as no surprise that the firm guards its secrets very tightly. The firm binds its employees with very stringent non-competes, and doesn’t shy away from suing them whenever it believes those agreements have been breached. Zuckerman wrote in MWSTM that many people didn’t think he’d be able to write the book because they didn’t think anybody from RenTec would talk to him.

Yet the book did get published, and while it didn’t reveal any big secrets, it did leave clues as to how RenTec’s system works. Additionally, Jim Simons, the founder of RenTec, has been open about sharing the firm’s big picture philosophies in public interviews. Having spent a significant amount of time building financial machine learning models myself, I have formed my own reading of these clues, and have decided to share it here.

RenTec Incorporates A Wide Variety of Data

Some financial machine learning practitioners use price, and price only, to determine which securities to invest in. They don’t try to combine different types of data, such as fundamentals data, to generate their predictions.

The theoretical underpinning behind this approach is that price movements reflect the aggregate of all decision factors. In other words, price movements should, in theory, reflect the beliefs of value investors who analyze fundamentals, as well as those of quants who act on quantitative factors such as momentum. Some may therefore argue that price data is all you need.

RenTec, however, doesn’t appear to subscribe to this belief. While the firm does appear to use price-based factors heavily, the book also states the following:

“Renaissance staffers deduced that there is even more that influences investments, including forces not readily apparent or sometimes even logical. By analyzing and estimating hundreds of financial metrics, social media feeds, barometers of online traffic, and pretty much anything that can be quantified and tested, they uncovered new factors, some borderline impossible to appreciate.”

Indeed, in Jim Simons’ interview with Numberphile, he cited “huge datasets” as one of the main barriers to entry for replicating Medallion’s success. As price data is generally widely available, one can imagine that those data sets mainly consist of “alternative data”.

RenTec Models Are Not “Elaborate”

Let’s first discuss what “elaborate” means, because in the same Numberphile interview (35:40 mark), Simons describes RenTec’s system as being both “very” elaborate and “not that” elaborate.

One way to have an elaborate system is to use elaborate equations, as many cutting-edge neural networks do. ResNet, a famous family of neural networks that classify images, stacks anywhere from 18 to more than 100 layers, and even its smallest common variant, ResNet-18, has some 11 million parameters. When Simons says RenTec’s models aren’t “elaborate”, I believe he means that they don’t use such large equations.

Instead, RenTec’s system appears to be an amalgamation of a huge number of small, simple factors. A RenTec executive was quoted in MWSTM as saying, “Volume divided by price change three days earlier, yes we’d include that.”
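To show how small such a factor can be, here is a minimal sketch of one possible reading of that quote. The interpretation is my assumption, not RenTec’s definition: the factor on day t is the volume from three days earlier divided by the one-day price change on that same earlier day. The function name, the `lag` parameter, and the sample numbers are all hypothetical.

```python
from typing import List, Optional


def lagged_volume_over_price_change(
    closes: List[float], volumes: List[float], lag: int = 3
) -> List[Optional[float]]:
    """Hypothetical factor: volume / one-day price change, both from `lag` days ago."""
    if len(closes) != len(volumes):
        raise ValueError("closes and volumes must have the same length")

    factor: List[Optional[float]] = []
    for t in range(len(closes)):
        src = t - lag  # the earlier day whose data feeds today's value
        if src < 1:
            factor.append(None)  # not enough history yet
            continue
        change = closes[src] - closes[src - 1]
        # A flat day would divide by zero, so mark it as missing instead.
        factor.append(volumes[src] / change if change != 0 else None)
    return factor


# Made-up daily closes and volumes, purely for illustration.
closes = [10.0, 10.5, 10.5, 10.0, 10.2, 10.4, 10.1]
volumes = [100.0, 210.0, 150.0, 300.0, 120.0, 90.0, 80.0]

factor = lagged_volume_over_price_change(closes, volumes)
print(factor)
```

The whole factor fits in a dozen lines. Note that even this toy version has to decide what to do with missing history and flat days, which hints at where the real effort goes.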

However, the fact that RenTec uses such simple signals does not mean that the system as a whole is simple as well. RenTec has put a lot of effort into developing many different types of models, such as models of transaction costs and models that optimize portfolio allocations. A system that brings together many different models is necessarily elaborate, even if each of the individual models is simple.

RenTec Sees Infrastructure As A Big Competitive Advantage

In addition to huge data sets, Simons cited “programs that we’ve written to make it really easy to test hypotheses” as one of the main barriers to entry. After having worked on machine learning systems for some time, I feel that this is an underappreciated point.

People who are new to machine learning often imagine that data scientists spend most of their time looking at data and conjuring up new hypotheses to turn into new models. But the truth is that data scientists generally spend only a small fraction of their time on that core task. Instead, most data scientists spend most of their time processing and debugging data.

As I’ve explained in my white paper, processing large amounts of data becomes very time consuming. Processing even simple factors may take hours or even days, depending on how well the data pipelines are optimized. Then, because data scientists are humans, they often find that the data did not come out as initially intended, perhaps because of a bug or a misspecification, and they often have to repeat the process many times before they get it right.

An efficient data infrastructure has the potential to significantly boost data scientists’ productivity by reducing the time they spend wrestling with data. Let’s say, for example, that it normally takes a five-day working week for a data scientist to develop a new factor: one day spent hypothesizing and prototyping the factor, and four days processing data. If an efficient infrastructure reduces the data processing time to just one day, the data scientist would be able to create a new factor every two days instead of every five, making them 2.5 times as productive.

Creating a good infrastructure is very hard, and requires balancing a myriad of requirements. The infrastructure would have to process data very quickly, but also detect misspecifications or errors as early as possible. It would have to be user friendly to data scientists without restricting their flexibility to conduct whatever experiments they choose. It would have to handle a wide variety of data without becoming so complex that it is difficult to modify.
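As a small illustration of catching errors early, here is a sketch of the kind of sanity check a pipeline might run on a freshly computed factor before any model sees it. Everything in it is an assumption of mine: the function name `check_factor`, the particular checks, and the 20% missing-data threshold are hypothetical, and a production system would need far more than this.

```python
import math
from typing import List, Optional


def check_factor(
    values: List[Optional[float]], max_missing_frac: float = 0.2
) -> List[str]:
    """Return a list of problems found in a factor series (empty if none)."""
    if not values:
        return ["factor is empty"]

    problems: List[str] = []

    # Too many gaps often points to a bad join or a misaligned date index.
    missing = sum(1 for v in values if v is None or math.isnan(v))
    if missing / len(values) > max_missing_frac:
        problems.append(f"{missing} of {len(values)} values are missing")

    # Infinite values usually come from an unguarded division by zero.
    if any(v is not None and math.isinf(v) for v in values):
        problems.append("factor contains infinite values")

    # A factor that never changes carries no information.
    present = [v for v in values if v is not None and math.isfinite(v)]
    if present and min(present) == max(present):
        problems.append("factor is constant")

    return problems


print(check_factor([1.0, None, None, 2.0]))
```

Running checks like these at the moment a factor is produced, rather than days later when a backtest looks odd, is one way an infrastructure can shorten the repeat-until-right loop described above.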

Conclusions

Nothing I’ve read about RenTec gives me the sense that there is some overarching secret behind the asset manager’s success. Rather, its system works because it makes use of a million different factors, each allowing the system to outperform the financial markets by just a tiny fraction. RenTec’s competitive advantage is its infrastructure, which allows its data scientists to churn out useful factors more quickly than its competitors.
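A back-of-the-envelope sketch shows why many tiny edges can add up. It relies on a strong, idealized assumption that I am making purely for illustration: every factor has the same small correlation `rho` with future returns, and the noise in each factor is independent of the noise in all the others. Under that assumption, the equal-weighted average of `n` factors has correlation rho / sqrt(rho^2 + (1 - rho^2) / n) with returns. Real factors overlap heavily, so the true benefit would be smaller, and nothing here describes how RenTec actually combines its signals.

```python
import math


def combined_correlation(rho: float, n: int) -> float:
    """Correlation of an equal-weighted average of n factors with the target.

    Assumes (unrealistically) that each factor has correlation `rho` with the
    target and that the factors' noise terms are mutually independent.
    """
    return rho / math.sqrt(rho**2 + (1 - rho**2) / n)


for n in (1, 100, 10_000):
    print(n, round(combined_correlation(0.01, n), 4))
```

With a per-factor correlation of just 0.01, a single factor is nearly useless, while ten thousand such factors combine to a correlation of roughly 0.71 in this idealized setting. The point is only directional: breadth can turn individually negligible signals into a meaningful one.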

Can another firm replicate RenTec’s success? I believe the answer is yes, though it would be far from easy. With enough talent and time, I believe it’s possible to build an infrastructure for data scientists that’s on par with, or even better than, RenTec’s. Although RenTec has a very significant head start, it is probably also constrained by legacy systems, which new firms would not have to deal with. The huge data sets proprietary to RenTec would be more difficult, if not impossible, to replicate. However, the importance of such data sets should diminish over time as new data providers fill the gap.

I believe that machine learning in asset management will follow the typical path of disruption outlined in The Innovator’s Solution. New firms will enter the market, offering data and infrastructure solutions similar to those used internally at RenTec. The quality of these firms’ offerings will be poor at first, but useful nonetheless at the affordable prices offered. As these firms continue to grow, the quality of their offerings will improve, and they will capture more market share while moving further upmarket. We are working on becoming one of those firms. Drop us a line if you’d like to chat.
