Tag Archive: AI

Misplaced Pessimism about AI

It's hard to believe how much we've gained with AI and LLMs in the past couple of years, and how much more is expected to happen.

We've become so accustomed to continuous progress that anything less than another major advancement, deployed to the masses, now risks disappointing expectations.

I do understand that a lot of cheesy hype has built up, mostly because that's how business works. All sorts of businesses are rushing to capitalize on the new buzzword, but this doesn't mean that we don't already have revolutionary technology that is changing the world.

If all LLM development stopped today, we'd still have models like Llama 3.1 8B that run on a laptop, and GPT-4-level models (some of them Open Source) running remotely at costs that are dropping rapidly (if there's one thing we can predict, it's that hardware keeps getting faster and cheaper).

What we can do today is already nothing short of amazing. There is so much untapped potential in the existing models, Open Source and not, and the only reason this potential hasn't materialized is that we simply haven't had the time to work with it.

This is a good sign, because it means that things are evolving rapidly and there's little time to settle on any one AI model. But if the rate of improvement of LLMs is truly slowing down, or possibly reaching a plateau, then we, the developers/hackers/engineers, will gladly get to work making the best of what we've got.

Parallels with game console development

An example of this is game consoles, at least as they used to be, when every console had some new esoteric and quirky hardware, developed perhaps on a hunch and under heavy constraints by the hardware engineers, then let loose on the programmers to make the best of it.

New games for a new console were usually the worst games in terms of technology, as the programmers had to struggle between learning the new hardware and getting the game out the door.
Then, with the first releases out of the way, the developers would start to really hack the hardware, going lower level and starting to pull performance where possible.
As a result, games would start to look better and better, while the consoles would become less expensive to produce.

Count on hackers

There are many engineers/hackers out there that can and will make the best of what AI has to offer, even if progress in the field were to suddenly stop today. In fact, some of us may feel more at ease if everything slowed down a bit and we had a chance to build something on top of a relatively stable platform with not so many moving parts.

I don't think that AI (LLMs or whatever comes next) is about to slow down, but if it comes to that, we already have plenty to work with for years to come.

AI beyond the hype

AI has become the latest and biggest buzzword since the release of GPT-3.5.

This has inevitably created a hype that will eventually lead to some disappointment, given that most companies, both old and new, are rushing to ride the wave of AI to make a quick buck. All you need is a half-baked idea, a demo running as a wrapper around OpenAI's API, and the ear of an investor.

There is also considerable hype regarding the immediate effects of AI on the job market. Current AI is changing things, but humans are still needed in one way or another. It will take more time to reach the point where human work becomes obsolete and optional.

Putting the hype aside, there is something more important and more fundamental to consider beyond the commercial aspect of AI: the relatively simple structure of neural networks that can give rise to intelligence as we know it.

Humans have always fantasized about alternative life forms, perhaps somewhere outside our planet. However, life as we know it is extremely complex and requires a chain of fortuitous events, spanning millions or billions of years, to evolve from simple molecules to complex organisms.

We've also been assuming that intelligence is a property that derives from life, and finding intelligent life forms would mean we'd have to first find life, and then intelligent life on top of that. This notion is now challenged.

Artificial Neural Networks demonstrate that intelligence can arise from a relatively simple structure that doesn't need to be organic. Granted, today's LLMs (Large Language Models) are not self-sustaining, self-evolving, and self-replicating entities, but it's clear that those properties are likely to follow. We can already see hints of this with agent-based systems that rely on LLMs to go beyond the capacity of a single inference cycle (inputs -> model -> outputs).

Gone are all the theories about the soul, consciousness, and even quantum effects on neurons. We are now very close to creating entities that can think, learn, and evolve, and we can do so with a relatively simple structure.

In conclusion, I have no doubt that the commercial aspects of AI are being overhyped. However, it would be a waste to focus purely on that aspect, without considering the deeper significance that this holds... we have distilled intelligence, and it didn't require a magic formula or a metaphysical event to do so.

Somewhere in the universe, there may be other "engineer types" that have been forged from energy applied to structures like some form of crystals, or maybe some kind of fluid with enough mechanical properties to form and sustain a network of connections with its bits of memory and its processing units.

This would be an exciting prospect because it would mean that intelligence is less rare than we thought, given that it wouldn't require the complicated, long chain of events that life as we know it requires.

ENZO-TS 2024/Q1 update

This is a general update on the ENZO Trading System as of Q1 2024.

NOTE: We have been phasing out the trading client in favor of the ByBit Copy Trading platform. The client is still supported for those who have it and for custom licenses.

Over the past year, significant developments have occurred behind the scenes. The decision to delay updates until we had a stable system worth discussing has culminated in our latest release in mid-March 2024.

This release was focused on stability, putting preservation of capital above everything else. This translates into less trading activity and less potential for profit, but at a much reduced risk level.

The system is now able to completely withdraw from all crypto markets when conditions are unfavorable, as opposed to simply shifting allocation to the least unfavorable markets. This was essential, because crypto markets offer virtually no diversification during "bear" phases, leaving no hedge against losses other than halting trading altogether and holding cash (USDT in our case).

In practice, this required major work in rethinking algorithms, evaluation methods, and portfolio management. If one isn't careful, simply withdrawing from trading when things aren't going well can do more harm than good, as one may easily enter a cycle of racking up losses and halting trading just as the market is about to turn around.
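
To illustrate the whipsaw risk just mentioned, here's a minimal sketch (a hypothetical filter with made-up thresholds, not the actual ENZO-TS logic) of how hysteresis between the "exit" and "re-enter" conditions can keep a system from flip-flopping in and out of the market:

    def update_market_state(in_market, market_score,
                            exit_below=-0.5, reenter_above=0.0):
        # market_score: some aggregate measure of market favorability (illustrative).
        # Two different thresholds (hysteresis) avoid exiting and re-entering
        # on every small oscillation around a single cutoff.
        if in_market and market_score < exit_below:
            return False  # withdraw to cash (USDT)
        if not in_market and market_score > reenter_above:
            return True   # conditions have recovered enough to trade again
        return in_market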

The problems with trading

With sufficient GPU power, creating very capable large language models (LLMs) has become commonplace, so much so that LLMs are quickly becoming a commodity. Meanwhile, trading continues to be an unsolved problem, at least as far as public knowledge goes.

There's plenty of publicly available material on the subject of automated trading, but nothing seems to really work on its own. One can't simply download a trading model and immediately apply it to the markets for a profit. Here are a few reasons why:

  1. Trading is practically a zero-sum game: there's no incentive to share the details of a successful strategy as it could quickly become ineffective.
  2. The markets are constantly evolving: what works today may not work tomorrow.
  3. There's a high degree of randomness at play, making it difficult to distinguish between a great strategy, pure luck and anything in between.

In my experience, the vast majority of information circulating about trading is plagued by fundamental misconceptions and misunderstandings. The landscape is saturated with half-baked ideas and outright quackery. This situation is unlikely to change, as there is little financial incentive for individuals to share genuinely useful and practical information. In a field where edge is everything, those who possess valuable insights are more likely to guard them closely rather than broadcast them to the masses.

Improvements to the system

The two most significant changes for this release were:

  1. Introduction of Neural Networks for the production of new algorithms.
  2. Rethinking of the portfolio management, with extreme risk aversion in mind.

Of these, I believe portfolio management is the most critical change, though it may lack the allure of high-tech innovations.

The valuable Sharpe ratio

Assessing the performance of a trading system is more complex than it might appear. Over time, I have come to value metrics such as the Sharpe and Sortino ratios more than metrics like profit factor, win rate or profit adjusted for maximum drawdown.

The reason is that Sharpe/Sortino tend to favor stability. Stability is not only a desirable attribute for investments but also a key indicator of a system's reliability and predictive power. All recent changes were made with this in mind.

NOTE: The Sharpe ratio is not a magic formula. What's important is to understand why it's valuable. A similar metric may work as well or better. In fact, ideally one should be able to replace the Sharpe ratio with a similar metric and obtain similar results. Doing this would be a good test of the overall robustness of a system against overfitting.
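
For reference, here's a minimal sketch of how the Sharpe and Sortino ratios can be computed from a series of periodic returns (the daily periodicity, the annualization factor, and the zero risk-free rate are assumptions for illustration, not the exact settings used in ENZO-TS):

    import numpy as np

    def sharpe_ratio(returns, periods_per_year=365, risk_free=0.0):
        # Excess returns over the (assumed) per-period risk-free rate
        excess = np.asarray(returns) - risk_free
        # Annualized mean over annualized standard deviation
        return np.sqrt(periods_per_year) * excess.mean() / excess.std()

    def sortino_ratio(returns, periods_per_year=365, risk_free=0.0):
        excess = np.asarray(returns) - risk_free
        # Downside deviation: only negative excess returns count as risk
        downside_dev = np.sqrt((np.minimum(excess, 0.0) ** 2).mean())
        return np.sqrt(periods_per_year) * excess.mean() / downside_dev

    # Example: daily returns of a hypothetical strategy
    daily_returns = [0.002, -0.001, 0.0015, 0.0, 0.001]
    print(sharpe_ratio(daily_returns), sortino_ratio(daily_returns))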

About the new portfolio management

A crucial insight we gained was that effective portfolio management relies on the algorithms being as stable and predictable as possible. Paradoxically, an algorithm that brings consistent losses is preferable to one that is profitable but erratic.

This is because an algorithm that begins to underperform, and continues to do so for a prolonged period, can quickly be disabled, knowing that more losses are likely to follow.

On the other hand, an algorithm that is profitable only because of a few trades that could happen at any time is not well suited to be integrated into a portfolio: it becomes much harder for the portfolio algorithm to decide how much capital to allocate to it, due to the random nature of its signal.
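
As a simplified illustration of the idea (a hypothetical allocation rule, not the actual portfolio logic): weight capital by the stability of each algorithm's recent returns, and disable an algorithm entirely after a sustained losing streak.

    import numpy as np

    def allocation_weights(per_algo_returns, losing_streak_limit=20):
        # per_algo_returns: dict of algo name -> recent per-period returns.
        # The streak limit is an arbitrary illustrative value.
        weights = {}
        for name, rets in per_algo_returns.items():
            rets = np.asarray(rets)
            # Disable an algorithm that has been losing for a prolonged stretch
            if (rets[-losing_streak_limit:] <= 0).all():
                weights[name] = 0.0
                continue
            # Favor stable return streams: weight by inverse volatility
            vol = rets.std()
            weights[name] = 1.0 / vol if vol > 0 else 0.0
        total = sum(weights.values())
        return {k: (v / total if total > 0 else 0.0) for k, v in weights.items()}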

Implementing this required extensive research and rigorous testing. Part of our testing strategy now involves evaluating the system during the worst market periods. We assess performance in scenarios dominated by poorly performing "altcoins", without the safety net of more reliable assets like Bitcoin and Ethereum. The goal is to minimize losses and drawdowns as soon as a market shows signs of weakness.

This approach of extreme risk aversion translates into much reduced trading activity overall, to the point that the system may be out of the market for long periods of time. This is much less exciting and rewarding in the short term, but it's what is necessary for a system that aims to survive and thrive in the long term.

High stability and low risk are not directly a measure of profitability, but they are essential to optimize the allocation of capital, and they open the door to trading with higher leverage, which ultimately leads to higher returns.

Neural networks and trading

The incorporation of neural networks into our trading system has been a long-awaited development. I personally decided to approach this transition cautiously, as it required extensive internal research, given the aforementioned lack of reliable information in the public domain.

After considerable research, I opted for a familiar and promising approach: a simple deep neural network optimized with neuroevolution (i.e., trained with a genetic algorithm rather than backpropagation). The neuroevolution component is implemented in plain C++, while the neural network operates on LibTorch (the C++ API for PyTorch).

Neuroevolution is not what makes the news these days, but in cases like these, where one is optimizing for an outcome without a well-defined ground truth, it becomes important to explore the solution space extensively, and this is where neuroevolution shines.
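
As a rough sketch of the general approach (in Python/PyTorch for brevity, with a placeholder architecture, mutation scheme, and fitness call, so this is an illustration rather than the actual C++/LibTorch implementation): neuroevolution keeps a population of weight vectors, evaluates each one with the fitness function, and breeds the best performers with random mutations.

    import torch

    def make_model():
        # Small feed-forward network; the actual architecture is an assumption here
        return torch.nn.Sequential(
            torch.nn.Linear(32, 64), torch.nn.Tanh(),
            torch.nn.Linear(64, 64), torch.nn.Tanh(),
            torch.nn.Linear(64, 1),
        )

    def get_flat_weights(model):
        # Concatenate all parameters into a single 1-D tensor (the "genome")
        return torch.cat([p.detach().flatten() for p in model.parameters()])

    def set_flat_weights(model, flat):
        # Copy a flat genome back into the model's parameter tensors
        i = 0
        with torch.no_grad():
            for p in model.parameters():
                n = p.numel()
                p.copy_(flat[i:i + n].view_as(p))
                i += n

    def evolve(fitness_fn, generations=100, pop_size=64, elite=8, sigma=0.02):
        model = make_model()
        base = get_flat_weights(model)
        population = [base + sigma * torch.randn_like(base) for _ in range(pop_size)]
        best = population[0]
        for _ in range(generations):
            scores = []
            for genome in population:
                set_flat_weights(model, genome)
                scores.append(fitness_fn(model))  # e.g. stability of simulated trades
            # Rank by fitness, keep the elites, mutate them into the next generation
            order = sorted(range(pop_size), key=lambda i: scores[i], reverse=True)
            best = population[order[0]]
            elites = [population[i] for i in order[:elite]]
            population = [e + sigma * torch.randn_like(e)
                          for e in elites for _ in range(pop_size // elite)]
        set_flat_weights(model, best)
        return model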

Given more time, it would be interesting to take another look at backpropagation and transformers, but this will require a significant amount of research, as none of this is plug-and-play. The devil is absolutely in the details... potential is often hiding behind days of testing and tweaking, often with no hint of progress in sight.

Last but not least, I cannot overstate the importance of creating a suitable fitness function (or loss function) for the training. This is where a large part of the research went, and it's where a human programmer still has to spend time, essentially to make it possible for a network to be trained without going off the rails.

A good fitness function should evaluate results in a realistic way, not expecting the network to do the impossible, as it will inevitably fail to produce anything useful. At the same time, one shouldn't ask too little of the network, as the threshold for usefulness is quite high. Trading is not an incremental job. A network needs to be very good at predicting before it can be useful at all. The cost of misprediction can be very high.

NOTE: A common misconception in trading, automated or not, is that one can "start small" and "just earn $50 a day" and grow from there. This couldn't be further from the truth. If you can make $50/day, you can make $5,000/day. It's not about how much you make, it's about earning anything at all without going broke. The threshold for success is deceptively high!

In practice this means that, for example, a network that is supposed to predict the price of an asset is expected to predict on the order of hours rather than days, given that longer-term predictions are much harder to make.

For more advanced networks, which generate trading signals themselves, the fitness function should focus on the stability of trades rather than profit. In this case, however, the Sharpe ratio is not a good candidate, as it's very sensitive to nonsensical trades, which are very common in the early stages of training.
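
As an illustration of what a stability-oriented fitness function might look like (the thresholds and weights below are made up for the example, not the ones used by ENZO-TS): reward a smooth, steadily growing equity curve, and explicitly penalize degenerate behavior such as not trading at all or overtrading.

    import numpy as np

    def fitness(trade_returns, min_trades=20, max_trades=500):
        # trade_returns: per-trade returns produced by a candidate network
        # over a simulated period. Thresholds and weights are illustrative.
        n = len(trade_returns)
        # Penalize degenerate behavior: no trading, or nonsensical overtrading
        if n < min_trades or n > max_trades:
            return -1.0
        rets = np.asarray(trade_returns)
        equity = np.cumsum(rets)
        # Reward smoothness rather than raw profit: deviation of the equity
        # curve from a straight line between its first and last points
        line = np.linspace(equity[0], equity[-1], n)
        smoothness = np.sqrt(((equity - line) ** 2).mean())
        drawdown = (np.maximum.accumulate(equity) - equity).max()
        return equity[-1] - 2.0 * smoothness - 3.0 * drawdown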

Benefits of using neural networks

Neural networks are a powerful tool, but they aren't a blanket solution for all trading needs. In fact, some of the older algorithms are still an active part of the system and sometimes outperform the algos based on neural networks.

The major advantage of NNs is that, for the most part, they remove the need for human intuition. This means that once the system to effectively train a neural network is in place (substantial but fixed initial cost), it can be used to create new algorithms that can adapt to new market conditions with minimal human intervention.

From this perspective, even if an NN-based algorithm performs worse than a hand-crafted one, it's still a very valuable tool, because it produces something valuable with very little extra human effort. The ratio between the outcome and the human effort is what matters, especially when it's one's own time that is being spent.

Some technical details on the neural network implementation

The code is in C++, using LibTorch (the C++ API for PyTorch). Substantial effort went into optimizing the code, both by parallelizing the training using higher-dimensional tensors and by using async transfers to the GPU, although the latter proved to be unreliable on my MacBook Pro with an M2 chip.

It seems that to reliably use async transfers (the non_blocking parameter of the Tensor.to function), one should rely on CUDA-specific features, but it probably makes more sense to move training to Python + PyTorch, including the neuroevolution part, which is supposedly doable with PyTorch itself. This is something to consider for the future.
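
For the sake of illustration, here's a Python/PyTorch sketch of the two ideas mentioned above: stacking the whole population into higher-dimensional tensors so that one batched operation evaluates every individual at once, and using pinned memory with non_blocking transfers (which, as noted, only really pays off on CUDA). The network shapes and sizes are arbitrary placeholders.

    import torch

    def evaluate_population(w1, b1, w2, b2, inputs, device):
        # Overlapping host-to-device copies needs pinned memory + non_blocking;
        # on CUDA this can hide transfer latency, elsewhere it may not help.
        if device.type == "cuda":
            x = inputs.pin_memory().to(device, non_blocking=True)
        else:
            x = inputs.to(device)
        # w1: (pop, hidden, in), x: (batch, in) -> h: (pop, batch, hidden)
        h = torch.tanh(torch.einsum("poi,bi->pbo", w1.to(device), x)
                       + b1.to(device)[:, None, :])
        # w2: (pop, out, hidden), h: (pop, batch, hidden) -> (pop, batch, out)
        out = torch.einsum("poh,pbh->pbo", w2.to(device), h) + b2.to(device)[:, None, :]
        return out

    pop, n_in, n_hidden, n_out, batch = 64, 32, 64, 1, 256
    w1 = torch.randn(pop, n_hidden, n_in)
    b1 = torch.randn(pop, n_hidden)
    w2 = torch.randn(pop, n_out, n_hidden)
    b2 = torch.randn(pop, n_out)
    inputs = torch.randn(batch, n_in)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(evaluate_population(w1, b1, w2, b2, inputs, device).shape)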

Beyond that, backpropagation and transformers would be an even more interesting proposition, but actually making it all work will require an unclear amount of research.

I'm sure that large trading firms with the necessary resources have been doing this for years, but without any information it's hard to guess what it looks like and how valuable it is.

Conclusion

Since the start of this multi-year journey, I've felt many times like it was time to sit back, relax, and enjoy the profits. However, this never truly materialized. I've lost track of the times I had to go back and tweak things, or just scrap something completely. This includes the constant search for useful methods of evaluation, due to the fact that there is no ground truth against which to evaluate trades, and it's even more difficult to evaluate the overall goodness of a system from the trades it produces.

The hope for this release is to have, first of all, a system that is stable and resilient to the worst market conditions, including the very dangerous state after a bull run, where momentum in asset allocation can lead to rapid setbacks. The ability to decide when to withdraw from the market, in a timely manner, is as essential as anything in a trading system.

Secondly, I'm personally excited to have an NN-based model that can do the heavy lifting in trading. This not only means that we can quickly adapt to market changes and potentially look beyond crypto, but also that I will hopefully no longer have to spend so much time working as an algo-monkey, tweaking and tuning algorithms in an endless loop.

The irony of LLM hallucinations

The advent of LLMs (Large Language Models) has been nothing short of revolutionary. Building intelligence from text (and code) is something that I didn't think was likely. One may argue about the essence of it, but the result is undeniable, and it's only a start.

We now have a seed of alien intelligence, and it's something that is improving, possibly at an exponential rate. This is the real deal, but it comes with some flaws.
One common complaint about LLMs is the "hallucinations" they can produce. A hallucination in this context is generated information that is patently untrue, presented without any hesitation. It's a kind of delivery that our human brain finds uncharacteristic of an intelligent being.

This is something that I think is probably already fixable (see my ChatAI project) with some form of cross-referencing; it's just not yet deployed due to the resources required. I don't consider this to be a major issue for the future, but it's something that got me thinking...

I think it's ironic how quickly we point the finger at the flaws of these systems, while we ourselves are inherently flawed to a depth we don't yet fully realize. As AI improves, this will become more evident, and at some point we'll have to do some introspection and see if we can afford to go on as we have so far.

Humans live in a bubble of total delusion, both at the individual and at the mass level. Our delusion is not simply an existential one, which would be a noble thing, but it's lower level than that: we lie to ourselves and to others on a constant basis due to tribalism and indoctrination that we receive from the day that we're born.

School, corporations, governments, religious groups, politicians, journalists, experts, scholars, you name it. There's a constant stream of delusional, selfish, malicious or clueless people that poison the well at the higher level, constantly crippling society.
Corruption, thirst for power, idealism, anything for which "the end justifies the means" is usually a sign that something is rotten and is going to hold back progress.

Perhaps we thought that in the information age things would get better, but what we got is information overload, and most of it is biased and purposely given to us to steer us in one direction or another.
The information age clearly didn't bring the sort of enlightenment that we may have hoped for, but perhaps the AI models (especially the open sourced ones) will start to help the individual to deal with the problem of information overload that has been crippling us.

Regardless, I think that we should be more humble when we criticize the flaws of the current AI models, and take that as a jumping-off point to do a little more introspection and realize how much we can and should improve ourselves.

I know that AI will improve. The question is whether we will improve as well.

Computer code, probably instrumental for AGI

Commodore VIC-20 (1981)

While I wouldn't consider myself an AI expert, I've been working with machine learning for a few years and have formed some understanding of the subject.

My journey with computers began with a Commodore VIC-20, attempting to communicate with it in natural language, inspired by the movie WarGames (1983). The initial disappointment from the inability to extract functionality from this tool turned into a challenge that led me to learn programming. Since then, I've had ample time to ponder logic, intelligence, problem-solving, and the essence of being intelligent and sentient, and just how far we were from creating something that could pass the Turing test. We're well past that now, and talking about AGI (Artificial General Intelligence) is legitimate, if not necessary.

Computer code as foundation for AGIs

A key realization for me has been the integral role of hands-on experience in developing a true understanding of a subject, and sometimes of a mindset. I've found that deep comprehension of complex topics often demands more than study; it requires building (often more than once) the subject matter in code. While learning styles vary, the act of creation undeniably deepens understanding. Even for those adept at absorbing knowledge from reading, the kind of comprehension that forms a foundation for further intellect often comes from practical engagement, where the journey to reach a goal is more important than the goal itself.

I believe this process of intellectual growth through implementation and creation is essential in building an AGI, and software development provides the ideal playground. Large Language Models (LLMs) have reportedly improved by learning from computer code, which is more structured than human languages. This suggests that code should remain a key resource for advancing AI systems.

An AI system with experience in building software would make a much better companion than one that can simply recall and implement things it's read about. This is akin to the difference between a wiz kid who can ace tests and a seasoned engineer who can guide you through every step of the process based on personal experience, highlighting not just the methods, but also the rationale behind them, while anticipating many of the pitfalls.

Software development also has a recursive component to it. In a previous article ("Next level thinking ?"), I mentioned how, in my opinion, recursive thinking is a fundamental building block of our intelligence.

The power of recursion here comes by way of being able to leverage software to build more complex software, as well as building the testbed for virtually any simulation that reflects the physical world. The more accurate the simulation, the less we need to rely on testing in the physical world. Testing in the physical world is not scalable, can require a lot of resources, and is often destructive... imagine crash testing for the safety of a car: a fairly accurate simulation can drastically cut the requirement for physical testing. A simulation may be hard to implement, but it can allow running a large number of tests over a large number of configurations, more than would be humanly possible (see also "Simulation: from weapon systems to trading").

Don't give up on programming just yet

I also think that while the ability to deal with human language is fantastic, communicating in computer code is probably more efficient in many cases. Often I ask LLMs to give me some pseudo code rather than a long explanation or a series of bullet points. Code can be more concise, it's definitely less ambiguous, and it's more of a direct building block to further knowledge and understanding.

For this reason, I also think that programming is not necessarily dead. Logical languages, such as computer languages, may in many cases become a better global language than English. Human languages are still important at the historical level; they are a reflection of what we are, but they are never going to be as efficient and precise as a rigidly structured language that runs in a digital environment.

It should be noted that OpenAI has recently introduced "Code Interpreter", which gives the model the ability to execute the Python code it generates, as well as "function calling", which introduces code-like structured definitions as a way to obtain better-structured answers. This level of efficiency can't possibly be matched by human languages, which may be a good interface between humans, but are neither efficient nor precise when it comes to describing systems that are more analytical in nature.
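
As a small illustration of the kind of structure involved (the function name and fields below are made up, and the exact way such a definition is passed to the API varies between SDK versions), a "function calling" definition describes a callable interface as a JSON schema rather than as prose:

    # A hypothetical function definition of the general shape used by
    # OpenAI's "function calling". Names and fields are illustrative,
    # not taken from any real codebase.
    get_price_function = {
        "name": "get_asset_price",
        "description": "Return the latest price for a traded asset.",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {"type": "string", "description": "Ticker, e.g. BTCUSDT"},
                "quote_currency": {"type": "string", "enum": ["USD", "USDT"]},
            },
            "required": ["symbol"],
        },
    }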