StockTwits Sentiment Analysis



By Colton Smith

===



Exploring alternative datasets to augment financial trading models is currently a hot trend in the quantitative community. With so much social media data available, its place in financial models has become a popular research topic. Surely the stock market’s performance influences the public’s reactions, but if the converse is also true, and social media sentiment can be used to predict movements in the stock market, then this would be a very valuable dataset for a variety of financial firms and institutions.



When I began this project as a consultant for QTS Capital Management, I did an extensive literature review of the social media sentiment providers and academic research. The main approach is to take the social media firehose, filter it down by source credibility, apply natural language processing (NLP), and create a variety of metrics that capture sentiment, volume, dispersion, etc. The best results have come from using Twitter or StockTwits as the source. A feature that distinguishes StockTwits from Twitter is the option, added in late 2012, to label a tweet as bullish or bearish. If these labels accurately capture sentiment and are used frequently enough, then it would be possible to avoid using NLP. As Figure 1 below shows, most tweets are not labeled, but the percentage is increasing.








Figure 1: Percentage of Labeled StockTwits Tweets by Year







This blog post compares the use of just the labeled tweets with the use of all tweets with NLP. To begin, I did some basic data analysis to better understand the nature of the data. Figure 2 below shows the number of labeled tweets per hour. As expected, there are spikes around the market open and close.







Figure 2: Number of Tweets Per Hour of the Day




The overall market sentiment can be estimated by aggregating the number of bullish and bearish labeled tweets each day. Based on the previous literature, I expected a significant bullish bias. This is confirmed in Figure 3 below, where the daily mean percentage of bullish tweets is 79%.
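As a rough illustration of this aggregation, the daily bullish share can be computed as sketched below. The tweet records, the `(date, label)` layout, and the function name `daily_bullish_share` are hypothetical and only stand in for whatever format the labeled data actually arrives in.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical labeled tweets as (date, label) pairs; not real StockTwits data.
tweets = [
    ("2016-01-04", "Bullish"),
    ("2016-01-04", "Bullish"),
    ("2016-01-04", "Bearish"),
    ("2016-01-04", "Bullish"),
    ("2016-01-05", "Bullish"),
    ("2016-01-05", "Bearish"),
]


def daily_bullish_share(labeled_tweets):
    """Return {date: fraction of that day's labeled tweets that are bullish}."""
    counts = defaultdict(lambda: [0, 0])  # date -> [bullish count, total count]
    for day, label in labeled_tweets:
        counts[day][1] += 1
        if label == "Bullish":
            counts[day][0] += 1
    return {day: bullish / total for day, (bullish, total) in counts.items()}


shares = daily_bullish_share(tweets)
# Averaging the daily shares gives the kind of "daily mean" quoted in the text.
mean_share = mean(shares.values())
print(shares, mean_share)
```

Averaging the per-day shares weights every day equally, regardless of how many tweets it had, which is one reasonable reading of "daily mean percentage".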








Figure 3: Percentage of Bullish Tweets Each Day




When writing a StockTwits tweet, users can tag multiple symbols, so it is possible that the sentiment label applies to more than one symbol. Tagging more than one symbol would likely indicate less specific sentiment and less predictive potential, so I hoped to find that most tweets tag only a single symbol. Looking at Figure 4 below, over 90% of the tweets tag a single symbol and a very small percentage tag 5+.







Figure 4: Relative Frequency Histogram of the Number of Symbols Mentioned Per Tweet




The data used in my analysis covers the period from 2012-11-01 to 2016-12-31. Figure 5 below shows the top symbols, industries, and sectors by total labeled tweet count. By far the most tweeted-about industries were biotechnology and ETFs. This makes sense given how volatile these industries are, which hopefully means that they would be the best to trade based on social media sentiment data.








Figure 5: Top Symbols, Industries, and Sectors by Total Tweet Count





Next, I needed to determine how to construct the sentiment score to best capture the predictive potential of the data. Though there are obstacles to trading an open-to-close strategy, including slippage, liquidity, and transaction costs, analyzing how well the sentiment score immediately before the market open predicts open-to-close returns is a valuable sanity check to see if it would be useful in a larger factor model. The sentiment score for each day was calculated using the tweets from the previous market day’s open until the current day’s open:





S-Score = (#Bullish - #Bearish) / (#Bullish + #Bearish)
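A minimal sketch of this calculation follows, assuming tweets are available as `(timestamp, label)` pairs and that the market opens at 9:30. The sample tweets, the open times, and the function names are illustrative assumptions. Returning 0 when there are no labeled tweets follows the convention this post uses for days without tweets.

```python
from datetime import datetime


def s_score(bullish, bearish):
    """(#Bullish - #Bearish) / (#Bullish + #Bearish); 0.0 when there are no tweets."""
    total = bullish + bearish
    if total == 0:
        return 0.0
    return (bullish - bearish) / total


def window_s_score(tweets, prev_open, curr_open):
    """S-Score over tweets with prev_open <= timestamp < curr_open."""
    in_window = [label for ts, label in tweets if prev_open <= ts < curr_open]
    bullish = sum(1 for label in in_window if label == "Bullish")
    bearish = sum(1 for label in in_window if label == "Bearish")
    return s_score(bullish, bearish)


# Hypothetical labeled tweets for one symbol.
tweets = [
    (datetime(2016, 1, 4, 10, 0), "Bullish"),
    (datetime(2016, 1, 4, 15, 45), "Bullish"),
    (datetime(2016, 1, 4, 20, 10), "Bearish"),
    (datetime(2016, 1, 5, 9, 0), "Bullish"),
    (datetime(2016, 1, 5, 9, 45), "Bearish"),  # after the current open: excluded
]

prev_open = datetime(2016, 1, 4, 9, 30)  # assumed 9:30 market open
curr_open = datetime(2016, 1, 5, 9, 30)
score = window_s_score(tweets, prev_open, curr_open)
print(score)
```

The score is bounded between -1 (all bearish) and +1 (all bullish), so a raw value says little about how unusual a day is for a given symbol.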





This S-Score then needs to be normalized to detect the significance of a specific day’s sentiment with respect to the symbol’s historic sentiment trend. To do this, a rolling z-score is applied to the series. The sensitivity can be adjusted by changing the length of the lookback window. Additionally, since the data is quite sparse, days without any tweets for a symbol are given an S-Score of 0.

At the market open each day, symbols with a normalized S-Score above the positive threshold are entered long and symbols with a normalized S-Score below the negative threshold are entered short. Equal dollar weight is applied to the long and short legs. These positions are assumed to be liquidated at the day’s market close. The first test is on the universe of equities with previous-day closing prices > $5. The performance of the resulting relatively small long-short portfolio of ~250 stocks can be seen in Figure 6 below.
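The normalization and entry rule can be sketched as follows. This is an illustration under several assumptions, not the exact procedure behind the results: the z-score window includes the current day, the sample standard deviation is used, each leg receives half of the capital split equally among its symbols, and the threshold values are placeholders because the post does not disclose the ones actually used.

```python
from statistics import mean, stdev


def rolling_zscore(series, window):
    """z-score of each value against the trailing `window` values (current day included).

    Returns None until a full window is available; requires window >= 2.
    """
    zscores = []
    for i, value in enumerate(series):
        if i + 1 < window:
            zscores.append(None)
            continue
        history = series[i + 1 - window : i + 1]
        spread = stdev(history)  # sample standard deviation (an assumption)
        zscores.append(0.0 if spread == 0 else (value - mean(history)) / spread)
    return zscores


def build_portfolio(z_by_symbol, long_threshold, short_threshold):
    """Equal dollar weight per leg: +0.5 split across longs, -0.5 across shorts."""
    longs = [s for s, z in z_by_symbol.items() if z is not None and z > long_threshold]
    shorts = [s for s, z in z_by_symbol.items() if z is not None and z < short_threshold]
    weights = {}
    for s in longs:
        weights[s] = 0.5 / len(longs)
    for s in shorts:
        weights[s] = -0.5 / len(shorts)
    return weights


# Hypothetical daily S-Scores for one symbol, with a 3-day lookback.
z = rolling_zscore([0.0, 0.2, 0.4, 0.2, 1.0], window=3)

# Hypothetical normalized scores at today's open; thresholds are placeholders.
z_today = {"AAA": 1.8, "BBB": 1.2, "CCC": 0.1, "DDD": -2.0, "EEE": None}
weights = build_portfolio(z_today, long_threshold=1.0, short_threshold=-1.5)
print(z, weights)
```

Symbols that fall between the two thresholds, or that lack enough history for a z-score, receive no weight and make up the neutral basket.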







Figure 6: Price > $5 Universe Open to Close Cumulative Returns







The thresholds were cherry-picked to show the potential of a 2.11 Sharpe ratio, but the results vary depending on the thresholds used. This sensitivity is likely due to the lack of tweet volume on most symbols. Also, the long and short thresholds are not equal, in an attempt to maintain a roughly equal number of stocks in each leg. The neutral basket contains all of the stocks in the universe that do not have an S-Score extreme enough to generate a long or short signal. Using the same thresholds as above, the test was run on a liquidity universe, defined as the top quartile of stocks by 50-day Average Dollar Volume. As seen in Figure 7 below, the Sharpe ratio drops to 1.24 but is still very encouraging.
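For readers who want to reproduce this kind of filter and metric, here is a hedged sketch. It assumes dollar volume is close price times share volume, that "top quartile" means the top quarter of symbols ranked by average dollar volume, and that the Sharpe ratio is annualized with 252 trading days and a zero risk-free rate. All data and function names are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def average_dollar_volume(closes, volumes, window=50):
    """Mean of close * volume over the last `window` days."""
    dollar = [c * v for c, v in zip(closes[-window:], volumes[-window:])]
    return sum(dollar) / len(dollar)


def liquidity_universe(adv_by_symbol):
    """Top quarter of symbols ranked by average dollar volume (at least one)."""
    ranked = sorted(adv_by_symbol, key=adv_by_symbol.get, reverse=True)
    keep = max(1, len(ranked) // 4)
    return set(ranked[:keep])


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Mean over sample standard deviation of daily returns, scaled by sqrt(252)."""
    return mean(daily_returns) / stdev(daily_returns) * sqrt(periods_per_year)


# Hypothetical average dollar volumes for eight symbols.
adv = {"A": 100, "B": 900, "C": 300, "D": 50, "E": 700, "F": 20, "G": 10, "H": 5}
universe = liquidity_universe(adv)

# Tiny example with a 2-day window instead of 50 days.
adv_example = average_dollar_volume([10, 20, 30], [100, 100, 100], window=2)

# Hypothetical daily open-to-close portfolio returns.
sharpe = annualized_sharpe([0.01, -0.005, 0.02, 0.0])
print(universe, adv_example, sharpe)
```

In a backtest, the universe would be recomputed each day from data available before the open, so that the filter does not peek at future volume.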










Figure 7: Liquidity Universe Open to Close Cumulative Returns




The sensitivity of these results needs to be inspected further by performing the analysis on separate train and test sets, but I was very pleased with the returns that could potentially be generated from just labeled StockTwits data.





In July, I began working for Social Market Analytics, the leading social media sentiment provider. Here at SMA, we run all the StockTwits tweets through our proprietary NLP engine to determine their sentiment scores. Using sentiment data as of 9:10 EST, which reflects an exponentially weighted sentiment aggregation over the last 24 hours, the open-to-close simulation can be run on the price > $5 universe. Each stock is assigned to a quintile based on its S-Score relative to the universe’s percentiles that day. A long-short portfolio is constructed in a similar fashion as before, with long positions in the top-quintile stocks and short positions in the bottom-quintile stocks. In Figure 8 below you can see that the results are much better than when using only sentiment-labeled data.
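The quintile construction can be illustrated with a small sketch. SMA's NLP engine and scoring are proprietary, so the scores below are invented. The rank-based bucketing, the equal weighting within each leg, and the function names are assumptions made for the example.

```python
from statistics import mean


def assign_quintiles(scores):
    """Map each symbol to a quintile from 1 (lowest score) to 5 (highest) by rank.

    Ties are broken by insertion order because Python's sort is stable.
    """
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    return {symbol: (i * 5) // n + 1 for i, symbol in enumerate(ranked)}


def long_short_return(quintiles, returns):
    """Long the top quintile, short the bottom, with equal dollars in each leg."""
    longs = [returns[s] for s, q in quintiles.items() if q == 5]
    shorts = [returns[s] for s, q in quintiles.items() if q == 1]
    return 0.5 * mean(longs) - 0.5 * mean(shorts)


# Hypothetical pre-open sentiment scores for a ten-stock universe.
scores = {
    "S0": -2.1, "S1": -1.4, "S2": -0.7, "S3": -0.3, "S4": 0.0,
    "S5": 0.2, "S6": 0.6, "S7": 0.9, "S8": 1.5, "S9": 2.3,
}
# Hypothetical open-to-close returns (close / open - 1) for the same day.
open_to_close = {
    "S0": -0.01, "S1": -0.02, "S2": 0.0, "S3": 0.0, "S4": 0.0,
    "S5": 0.0, "S6": 0.0, "S7": 0.0, "S8": 0.01, "S9": 0.03,
}

quintiles = assign_quintiles(scores)
day_return = long_short_return(quintiles, open_to_close)
print(quintiles, day_return)
```

Because the buckets are defined by rank within the day's universe, each leg always holds about a fifth of the stocks, which avoids the hand-tuned thresholds needed for the sparse labeled data.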








Figure 8: SMA Open to Close Cumulative Returns Using StockTwits Data




The predictive power is there, as the long-short portfolio boasts an impressive 4.5 Sharpe ratio. Because more data is available, the results are much less sensitive to the long-short portfolio construction. To avoid the high turnover of an open-to-close strategy, we have been exploring possible long-term strategies. Deutsche Bank’s Quantitative Research Team recently released a paper about strategies that solely use our SMA data, including a longer-term strategy. Additionally, I’ve recently developed a strong weekly rebalance strategy that attempts to capture weekly sentiment momentum.




Though it is just the beginning, my dive into social media sentiment data and its application in finance over the course of my time consulting for QTS has been very insightful. The labeled StockTwits tweets alone may be enough to generate predictive signals, but including all the tweets in the sentiment analysis yields a much stronger signal. If you have questions, please contact me at coltonsmith321@gmail.com.




Colton Smith is a recent graduate of the University of Washington where he majored in Industrial and Systems Engineering and minored in Applied Math. He now lives in Chicago and works for Social Market Analytics. He has a passion for data science and is excited about his developing quantitative finance career. LinkedIn: https://www.linkedin.com/in/coltonfsmith/



===


Upcoming Workshops by Dr. Ernie Chan





September 11-15: City of London workshops




These intense 8-16 hour workshops cover Algorithmic Options Strategies, Quantitative Momentum Strategies, and Intraday Trading and Market Microstructure. Typical class size is under 10. They may qualify for CFA Institute continuing education credits.






November 18 and December 2:  Cryptocurrency Trading with Python





I will be moderating this online workshop for Nick Kirk, a noted cryptocurrency trader and fund manager, who taught this widely acclaimed course here and at CQF in London.



