Gain access to exclusive tools that Wall Street's Elite don't want you to have. Don't miss the next issue...
Join 11,500+ Quant Scientists learning one article at a time
Embeddings are used in neural networks to transform large, sparse data into manageable, dense formats.
They simplify complex data, making it easier to analyze.
What?
Well, our goal is to build profitable algorithmic trading strategies.
Matt's working on a killer new course that demystifies how machine learning is really used in trading.
We thought we'd give you a little sneak peek.
In today's issue of the QS Newsletter (get the code), you'll learn how to train an autoencoder to build embeddings for stock factors.
(Today's newsletter is a little longer than usual, but we're simplifying something hedge funds use!)
What You’ll Learn:
Build and train an autoencoder using PyTorch
Extract the embeddings to create clusters
Use PCA to reduce the dimensions and visualize the results
BONUS: Get the Python Code for EVERYTHING you see in this post
Disclaimer:
The information and educational material provided by Quant Science, LLC are for educational purposes only and should not be considered as financial advice or recommendations to purchase, hold, or sell any securities or other financial instruments. Before you proceed, please review our full disclaimer here.
Want exclusive access to our FULL codebase for this Quant Science tutorial plus dozens more?
Join thousands of aspiring Python quants here 👉
Since you're here, you probably want to learn how to get started developing (profitable) algorithmic trading strategies and reinvest those profits.
Here are the steps:
Find edge
Analyze risk
Backtest trading strategies
Execute trades automatically
Easy, right? Well, not exactly... Avoid the 5 biggest mistakes beginners make with our free, 5-day email course:
Click here to join our free 5-Day Algorithmic Trading Course 👉
Now on to the show...
Embeddings are compact, dense representations of high-dimensional stock data, mapped into a lower-dimensional space.
They are created using methods like autoencoders, which retain most of the information contained in features like volatility or technical indicators. These embeddings are used for clustering, anomaly detection, and predictive modeling.
Because the embeddings capture key patterns in far fewer dimensions, they are well suited to K-means clustering, which groups similar stocks based on their underlying characteristics.
We’ll use some pretty powerful libraries in this issue, including PyTorch and scikit-learn.
Next, we’ll download stock price data to construct our mock portfolio.
We’ll use the stock price data to create a few features.
Features are patterns in the data we think drive returns. In this example, we’re using log returns, a simple moving average, and volatility.
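As a rough sketch of the three features named above, the snippet below computes log returns, a simple moving average, and rolling volatility with pandas. It uses a synthetic random-walk price series instead of downloaded data, and the 22-day `window` is a hypothetical choice, not a value taken from this issue.

```python
import numpy as np
import pandas as pd

# Synthetic random-walk prices stand in for downloaded stock data (assumption).
rng = np.random.default_rng(42)
dates = pd.bdate_range("2023-01-02", periods=120)
prices = pd.Series(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(dates)))),
    index=dates,
    name="close",
)

window = 22  # hypothetical lookback, roughly one trading month

features = pd.DataFrame(
    {
        # Log return: natural log of today's price over yesterday's price
        "log_return": np.log(prices / prices.shift(1)),
        # Simple moving average of the price over the lookback window
        "sma": prices.rolling(window).mean(),
    }
)
# Volatility: rolling standard deviation of the log returns
features["volatility"] = features["log_return"].rolling(window).std()

# Drop the warm-up rows where the rolling windows are not yet full
features = features.dropna()
print(features.tail())
```

With real data, the same calculation would typically be applied per ticker before the features are normalized.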
Let’s convert the normalized feature data into PyTorch tensors and DataLoader objects.
This code converts our features data into a PyTorch tensor, wraps it in a TensorDataset for batch handling, and creates a DataLoader.
The DataLoader is used to iterate over the dataset in batches of 32 while shuffling the data to randomize the input during training.
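The steps described here can be sketched as follows. The `normalized` array is a random placeholder for the normalized feature matrix (an assumption), while the batch size of 32 and the shuffling follow the description above.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder for the normalized feature matrix: 100 rows, 3 features (assumption).
rng = np.random.default_rng(0)
normalized = rng.standard_normal((100, 3)).astype(np.float32)

# Convert to a tensor and wrap it so it can be served in batches
features_tensor = torch.from_numpy(normalized)
dataset = TensorDataset(features_tensor)

# Batches of 32, reshuffled at the start of every pass over the data
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Each item from the loader is a tuple holding one tensor of shape (batch, 3)
batch_sizes = [batch[0].shape[0] for batch in loader]
print(batch_sizes)
```

Because 100 is not a multiple of 32, the last batch is smaller than the others; passing `drop_last=True` to `DataLoader` would discard it instead.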
In the encoder, data is compressed through a series of linear layers: from the original feature dimension to 64, then 32, and finally to a 10-dimensional space.
ReLU activation functions are applied after each linear transformation to introduce non-linearity, which helps the model capture more complex patterns in the data.
The decoder reconstructs the input data from the 10-dimensional space by gradually expanding the dimensions through linear layers from 10 to 32, then 64, and finally back to the original feature size.
The forward method of the autoencoder sequentially passes an input tensor through the encoder and decoder to produce a reconstructed version of the input.
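A minimal sketch of an autoencoder with these layer sizes might look like the code below. The class name `StockAutoencoder` and the `latent_dim` parameter are illustrative. One deliberate assumption: this sketch leaves the 10-dimensional latent layer and the final output layer without a ReLU so that embeddings and reconstructions can take negative values, which may differ from the original code.

```python
import torch
from torch import nn


class StockAutoencoder(nn.Module):
    """Illustrative autoencoder: feature_dim -> 64 -> 32 -> 10 -> 32 -> 64 -> feature_dim."""

    def __init__(self, feature_dim: int, latent_dim: int = 10):
        super().__init__()
        # Encoder: compress the input down to the latent space
        self.encoder = nn.Sequential(
            nn.Linear(feature_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, latent_dim),  # no activation here (assumption)
        )
        # Decoder: expand the latent vector back to the input size
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32),
            nn.ReLU(),
            nn.Linear(32, 64),
            nn.ReLU(),
            nn.Linear(64, feature_dim),  # no activation here (assumption)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pass the input through the encoder, then the decoder
        return self.decoder(self.encoder(x))


model = StockAutoencoder(feature_dim=3)
x = torch.randn(8, 3)            # 8 hypothetical rows with 3 features each
reconstruction = model(x)        # same shape as the input
embedding = model.encoder(x)     # 10-dimensional embedding per row
print(reconstruction.shape, embedding.shape)
```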
Now we can train it.
This function manages the training of the autoencoder by iteratively adjusting its weights to minimize the reconstruction loss between its outputs and the original inputs.
The training loop passes over the entire dataset multiple times. Each pass (epoch) processes the data in batches, using each batch as both the input and the target, since an autoencoder learns to reproduce its own input.
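A training loop along these lines could be sketched as below. The function name `train_autoencoder`, the mean squared error loss, the Adam optimizer, the learning rate, and the epoch count are all assumptions made for illustration, and the data is random rather than real stock features.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_autoencoder(model, loader, epochs=20, lr=1e-2):
    """Train `model` to reconstruct its inputs; return the mean loss per epoch."""
    criterion = nn.MSELoss()                                  # assumed loss
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)   # assumed optimizer
    history = []
    model.train()
    for _ in range(epochs):
        running = 0.0
        for (batch,) in loader:
            optimizer.zero_grad()
            reconstruction = model(batch)
            # The batch serves as both input and target
            loss = criterion(reconstruction, batch)
            loss.backward()
            optimizer.step()
            running += loss.item() * batch.size(0)
        history.append(running / len(loader.dataset))
    return history


torch.manual_seed(0)
data = torch.randn(256, 3)  # random placeholder for normalized features
loader = DataLoader(TensorDataset(data), batch_size=32, shuffle=True)

# Compact stand-in model with the same layer sizes as described in the text
model = nn.Sequential(
    nn.Linear(3, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 10),
    nn.Linear(10, 32), nn.ReLU(),
    nn.Linear(32, 64), nn.ReLU(),
    nn.Linear(64, 3),
)

losses = train_autoencoder(model, loader)
print(f"first epoch loss: {losses[0]:.4f}, last epoch loss: {losses[-1]:.4f}")
```

Returning the per-epoch loss makes it easy to check that reconstruction error is actually falling before the embeddings are used for anything.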
Finally, we can extract the embeddings and use them to create clusters.
After extracting the embeddings, the function stacks them into a tensor, which is then clustered using K-means into five groups.
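The extraction and clustering step can be sketched as follows. To keep the snippet self-contained, the encoder here is an untrained stand-in and the data is random (both assumptions); in practice the encoder of the trained autoencoder would be used. The helper name `extract_embeddings` and the `n_init` and `random_state` values are illustrative.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from sklearn.cluster import KMeans

torch.manual_seed(0)

# Untrained stand-in for the encoder half of a trained autoencoder (assumption)
encoder = nn.Sequential(
    nn.Linear(3, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 10),
)

data = torch.randn(100, 3)  # random placeholder for normalized features
# shuffle=False keeps the embeddings in the same order as the input rows
loader = DataLoader(TensorDataset(data), batch_size=32, shuffle=False)


def extract_embeddings(encoder, loader):
    """Run every batch through the encoder and stack the results into one tensor."""
    encoder.eval()
    chunks = []
    with torch.no_grad():  # no gradients needed for inference
        for (batch,) in loader:
            chunks.append(encoder(batch))
    return torch.vstack(chunks)


embeddings = extract_embeddings(encoder, loader)

# Group the embeddings into five clusters with K-means
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
labels = kmeans.fit_predict(embeddings.numpy())
print(embeddings.shape, labels[:10])
```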
Principal Component Analysis (PCA) reduces the 10-dimensional embeddings to two principal components. These components capture the directions of maximum variance in the data.
The result visualizes the two-dimensional PCA-reduced embeddings of stock data. Each point represents a stock positioned according to its values on the first two principal components. The colors represent the different clusters.
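The PCA step can be sketched with scikit-learn as shown below. The `embeddings` and `labels` arrays are random placeholders for the extracted embeddings and K-means cluster labels (assumptions), so the numbers carry no financial meaning.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
embeddings = rng.standard_normal((100, 10))  # placeholder 10-dimensional embeddings
labels = rng.integers(0, 5, size=100)        # placeholder cluster labels (0 to 4)

# Project the embeddings onto the two directions of greatest variance
pca = PCA(n_components=2)
components = pca.fit_transform(embeddings)

# Share of total variance captured by each of the two components
explained = pca.explained_variance_ratio_
print(components.shape, explained)
```

Passing `components[:, 0]`, `components[:, 1]`, and `c=labels` to a scatter plot such as matplotlib's `plt.scatter` would produce a chart of the kind described above, with one color per cluster.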
You just took the first step in using machine learning in trading like the hedge funds!
But, there's more to learn in algorithmic trading:
Backtesting your portfolio construction algorithm to see how the strategy would have performed on historical data
Executing the trades automatically
Monthly rebalancing
Tracking your actual Profit and Loss
Incorporating Trading Fees
Are you interested in learning algorithmic trading strategies that maximize returns responsibly, help you manage risk, and grow your investments?
We implement 3 core trading strategies including portfolio, momentum, and spread trades that have worked in our favor in the past and continue to produce results for our students.
Join 400+ of us who are learning to apply Python to algorithmic trading to grow investments.
Leo was up 11.5% in just 13 trading days.
Alex was waiting 9 years for a course like this:
There's nothing worse than going at this alone:
❌ Learning Python is tough.
❌ Learning Trading is tough.
❌ Learning Math & Stats is tough.
It's no wonder it's easy to feel lost, make bad decisions, and lose money.
Want help?
👉 Join 10,700+ future Quant Scientists on our Python for Algorithmic Trading Course Waitlist: https://learn.quantscience.io/python-algorithmic-trading-course-waitlist