Alpha Trading - Part 4

Filtered trades with low volatility to increase profits per share but also reduced the number of trades.

Removed trades in which the volatility of the two legs were significantly different, arguing that such a trade presented unexpected and unreasonable risks.

We believe that low volatility is a problem, but filtering the volatility removes too many trades. These setbacks are part of the reality of system development. At each step, there is another problem to solve, but continuing with specific solutions to each problem will eventually result in overfitting the data.

The market doesn't hand profits to you. There are some alternative ways of identifying entry and exit points, but none of them will create larger profits from low volatility. Unless we choose to believe that the two pairs, LCC-CAL and LCC-AMR, will perform in the future as they have in the past, and better than the other airline pairs, we need to refocus on sectors with greater volatility. At the moment, banking is very active, but share prices range from exceptionally low for those banks that were hurt in the subprime crisis (Bank of America) to unusually high (Goldman Sachs and J. P. Morgan & Company) if they were perceived as safe. That might produce good profits, but the shares needed to equalize the volatility of these companies will be highly skewed and could lead to unexpected, unpleasant consequences. Volatility adjusting might not be enough to control risk.

Instead, we can turn to the housing construction sector. Although it has been at the center of the financial crisis, companies have not been bailed out or nationalized, and stock prices continue to respond to market forces, but with greater volatility.

HOME BUILDERS.

Five of the biggest home builders are Lennar (LEN), Pulte Homes (PHM), KB Homes (KBH), Toll Brothers (TOL), and Hovnanian (HOV). The first three are part of the S&P 500. Their stock performance is an excellent view into the recent mortgage crisis and economic recession. Five companies give us 10 pairs, with the cross-correlations shown in Table 3.14. The average correlation is .58, which is nearly the same as the airlines, but there were fewer airline stocks and the correlations were as low as .39. Figure 3.11 shows the five home builders' stocks together. The run-up in prices parallels the real estate market, peaking in 2006. From the chart, these five stocks seem to be performing the same way, so there is no reason that we shouldn't use all combinations of them as pairs.

TABLE 3.14 Cross-correlations for home builders, 10 years ending March 2009.

FIGURE 3.11 Prices of five home builder stocks. All five react in a similar manner to the economic changes.

Testing the Home Builders.

We follow exactly the same process to test the home builders as we did for the airlines. From five related stocks, we get 10 pairs. Each of these pairs is tested by calculating the individual stochastic values, then taking the difference in the two stochastic values (SD, the stochastic difference). A new trade is entered when the difference exceeds our entry threshold, which, from our previous experience with airlines, should be between 60 and 40 for shorting the pair, and -40 to -60 for buying the pair. Shorting the pair means entering a short sale in the first leg of the pair and buying the second leg, while buying the pair means buying the first leg and selling short the second leg. The stochastic difference is the stochastic of the first minus the stochastic of the second. No costs were considered in the process, but results show the profits per share, net of both legs, so that it should be simple to reduce those results by your expected costs. Table 3.15 shows the results of tests covering a range of calculation periods and a range of entry criteria.
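
The entry rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the exact test code: it assumes a raw stochastic calculated from closing prices only, treats a flat price window as neutral (50), and takes the buy threshold to be the negative mirror of the short threshold. The function names are hypothetical.

```python
from itertools import combinations

def stochastic(closes, period):
    """Raw stochastic (0 to 100) of the latest close within the last `period` closes.
    Assumption: closes only; a high/low version would use daily highs and lows."""
    window = closes[-period:]
    low, high = min(window), max(window)
    if high == low:
        return 50.0  # assumption: a flat window is treated as neutral
    return 100.0 * (closes[-1] - low) / (high - low)

def stochastic_difference(closes_leg1, closes_leg2, period):
    """SD = stochastic of the first leg minus stochastic of the second leg."""
    return stochastic(closes_leg1, period) - stochastic(closes_leg2, period)

def entry_signal(sd, threshold):
    """Return the pair trade implied by SD, or None if no threshold is crossed."""
    if sd >= threshold:
        return "short_pair"  # short leg 1, buy leg 2
    if sd <= -threshold:
        return "buy_pair"    # buy leg 1, short leg 2
    return None

# Five home builders give 10 unique pairs.
symbols = ["LEN", "PHM", "KBH", "TOL", "HOV"]
pairs = list(combinations(symbols, 2))
```

With a 4-day period and a threshold of 40, a first leg closing at its 4-day high while the second closes at its 4-day low gives SD = 100 and a signal to short the pair.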

TABLE 3.15 Average results of 10 home builders varying the momentum calculation periods from 3 to 14 days and the entry criteria from stochastic values of 40 to 60.

As with the airlines, tests cover the most recent 10 years. In this case, all the stocks traded for the full period. Our first concern is the information ratio (annualized return divided by annualized volatility) resulting from trading each pair. We want an average ratio greater than 0.25; otherwise, we know that the final performance will be very erratic, with large drawdowns.
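
As a rough sketch of the ratio just defined, the calculation below annualizes a series of daily returns. It assumes 252 trading days per year and a simple (non-compounded) annualized return; the tests in this chapter may have used a compounded AROR, so treat the function as illustrative.

```python
import math
import statistics

TRADING_DAYS = 252  # assumption: trading days per year

def information_ratio(daily_returns):
    """Annualized return divided by annualized volatility of daily returns."""
    annualized_return = statistics.mean(daily_returns) * TRADING_DAYS
    annualized_vol = statistics.stdev(daily_returns) * math.sqrt(TRADING_DAYS)
    if annualized_vol == 0:
        return 0.0  # no variation in returns, ratio is undefined; report zero
    return annualized_return / annualized_vol
```

Because both the numerator and denominator scale with position size, the ratio is unchanged by leverage, which is why it is a useful first screen for each pair.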

Table 3.15 shows the most important statistics for the 10 pairs: LEN-PHM, LEN-KBH, LEN-TOL, LEN-HOV, PHM-KBH, PHM-TOL, PHM-HOV, KBH-TOL, KBH-HOV, and TOL-HOV.

Columns 1 and 2 show the parameters. Column 1 is the momentum calculation period, ranging from 14 down to 3. The value 3 was included to show at what point the pattern fails. We must recognize that a stochastic calculation based on three days will jump from zero to 100 nearly every other day, making the results erratic. Column 2 shows the entry threshold of 60. Part a of Table 3.15 shows the results for an entry threshold of 60, and Tables 3.15b and 3.15c show the results of entry thresholds of 50 and 40, respectively.

The statistics shown in Table 3.15 are the number of trades, the annualized rate of return (AROR), the profits per trade in dollars, and the information ratio. We expected the relationship between the parameters to change in the following ways: As the calculation period decreases from 14 to 3, there would be more trades but the profits per share would get smaller. The less time you hold a trade, the less opportunity there is for gain. Similarly, we would expect the AROR to decline, but we cannot forecast the information ratio because both returns and risk will drop, but we don't know which will drop faster.

As the entry threshold decreases from 60 to 40, we will also get more trades, but we should see a smaller return per share and more risk. When a mean-reverting trade is entered sooner, we can expect prices to move against us both in magnitude and time. Both of these will affect the ratio. If there are many more opportunities at the entry thresholds of 40, those good results might offset the fewer trades in which the prices continued to diverge, but we will also not be able to tell the extent of that in advance.

If we also change the exit threshold, then, as it moves closer to the entry level (for example, entering a short at 60 and exiting at 0, 10, or 20), the size of the profits will decrease, the number of profitable trades should increase, and the total number of trades will increase. If we exit a short sale at 20, not having to wait for zero, and prices reverse to the upside, we would get another short sale that we would have missed had we needed to wait for zero to exit.

Our goal is to have very continuous results; that is, our statistics should move smoothly in one direction as the test parameter values change. We also need profits per trade that are large enough to net a profit after costs. Finally, we want enough trades to make it worthwhile to use this strategy, although that should be reflected in the rate of return. Our confidence in the results also deteriorates if there are too few trades.

It is easier to see the results as a chart. Starting with Figure 3.12, we can see that the number of trades increases as the momentum calculation period decreases. It does this uniformly for the three entry thresholds, but the number of trades also increases as the entry threshold gets smaller. The fastest trading combination, an entry threshold of 40 with calculation periods of 5 or less, generated an average of more than 100 trades in 10 years. Although that's only 10 per year for each of the 10 pairs, it is enough to give us confidence in the method.

FIGURE 3.12 Home builders, comparison of the number of trades.

The profits per share may be the most important statistic because, above all, it tells us whether we can net a profit after costs. In these tests, shown in Figure 3.13, we see that the pattern is good, but the highest average profit per share falls below 8 cents. That may be enough for a professional trader, but we would like it to be higher. When we consider that the entry threshold of 40 generated the most trades, we see that it cleared only 4 cents per share using the faster calculation periods. Because there were more trades, we can try to be selective by using a volatility filter.

FIGURE 3.13 Home builders, average profits per share.

The last statistic, the information ratio, is also important because it gives you an idea of how much risk you will take to get these returns. Figure 3.14 shows that, for the most part, the ratio continues to increase as the calculation period declines. We can explain this in hindsight by recognizing that pairs trading, like other mean-reversion strategies, flourishes in environments of market noise. Chapter 2 pointed out that a closer look at prices (that is, looking at hourly instead of daily data) accentuated the noise. Also, using shorter calculation periods focuses on more noise and less trend. The trend is emphasized by using longer calculation periods and less frequent data (weekly instead of daily). Figure 3.14 shows that ratios were above 0.25, our objective, for all calculation periods from 6 and lower.

FIGURE 3.14 Home builders, average information ratio.

The good news is that the three figures show consistency. As the calculation period declined, results changed in a very orderly fashion. Even better, they were nearly all profitable. If you remember, an important criterion of robustness is that a large number of combinations of parameters should produce profitable returns, given a reasonable test range. Our only problem, which could be insurmountable, is that we want larger profits per share.

Selecting the Threshold Levels.

The best choice will be some combination of the number of trades, profits per trade, and information ratio. We will use the average values of all pairs, even though looking at the detail of each pair would give us more information. We think that using the averages is an attempt to avoid unnecessary overfitting. However, the shape and consistency of the statistics have convinced us that this is a sound approach, so choosing any set of parameters, or more than one set, should be safe.

If we think back to the airline tests, we also saw that the performance ranges were very similar. For airlines, there were fewer pairs, so the results might be less consistent.

Because we plan to test a low-volatility filter that may remove up to 25% of the trades, we will choose the calculation period of 4 with an entry threshold of 40, one of the fastest trading combinations. We can now look at the detail of each pair, shown in Table 3.16. All pairs but one, Pulte-KB Homes, were profitable, and all were reasonably consistent. Only one pair, Lennar-KB Homes, showed a return of greater than 10 cents per share, and all had at least 100 trades. The consistency, which is good, removes the temptation of discarding one or two pairs that performed badly or selecting a few that had large profits per share. An average ratio of 0.423 is very good if only we can increase the profits per share.

TABLE 3.16 Results of home builder pairs for momentum 4 and entry threshold 40.

Low-Volatility Filter for Home Builders.

As with the airlines, we can test the low-volatility filter. If results improve, it will confirm the method that we were unable to confirm with fewer airline pairs. Using the same pairs and parameters shown in Table 3.16, we applied the low-volatility filter with values ranging from 0% to 120% and got the results in Table 3.17. Of course, 120% would be impossible except that we're using only four days to project an entire year of volatility, so an unusually volatile interval will produce a very large annualized volatility.
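
To see how a four-day window can project annualized volatility above 100%, consider the sketch below. It assumes the filter annualizes the standard deviation of four daily returns with a factor of the square root of 252; the price series are hypothetical and chosen only to illustrate the effect.

```python
import math
import statistics

def annualized_volatility(prices, trading_days=252):
    """Annualized standard deviation of the daily returns implied by `prices`."""
    returns = [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]
    return statistics.stdev(returns) * math.sqrt(trading_days)

# Hypothetical closes: five prices give four daily returns.
choppy_prices = [20.0, 21.6, 19.8, 21.4, 19.9]    # swings of 7% to 8% per day
quiet_prices = [20.0, 20.1, 20.0, 20.1, 20.05]    # swings of about 0.5% per day

choppy_vol = annualized_volatility(choppy_prices)  # well above 1.0 (100%)
quiet_vol = annualized_volatility(quiet_prices)    # well below 1.0

def passes_low_volatility_filter(prices, filter_level):
    """Keep the trade only if annualized volatility is at least `filter_level`."""
    return annualized_volatility(prices) >= filter_level
```

A few days of large alternating moves are enough to push the projected figure past 120%, even though no stock sustains that level for a year.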

TABLE 3.17 Home builder pairs using momentum 4, entry 40, and a low-volatility filter.

It is easier to see the results of Table 3.17 in Figures 3.15 and 3.16. The first shows how the number of trades drops and the profits per trade increase as we filter out more low-volatility trades. However, the highest per share returns average only about 11 cents, while the number of trades drops to about 20 for each pair over 10 years, two per year. If we can accept 7 cents per trade, then we could double the number of trades. That's still not much.

FIGURE 3.15 The low-volatility filter for home builders shows a steady drop in trades and a corresponding increase in per share returns as more trades are filtered out.

FIGURE 3.16 The low-volatility filter for home builders shows a parallel decline in both returns and the information ratio as more trades are removed.

Figure 3.16 shows a parallel decline in both the annualized rate of return and the information ratio as trades are removed using the low-volatility filter. We can explain that because fewer trades spread over a longer time period will always reduce the rate of return. Similarly, if the risk of the individual trades remained the same but the annualized return was lower, then the information ratio would decline. Those results show that there is less to earn, but the most important statistics are the profits per share and the number of trades, both of which are too low to be very interesting.

At this point, we've gained confidence in the method but still need to find pairs with more volatility or those that allow some form of leveraging.

A Sure Way to Avoid Overfitting.

Before moving forward, we must consider whether our selection and filtering process, no matter how well considered, has resulted in overfitting. We must continually return to this question if we are to create a successful trading method. If we have overfitted the data, then our expectations of profits will never become a reality. One way to avoid this problem is to reject the idea of selecting one or more parameter sets.

Throughout the development and testing of this pairs model, we showed the ratios resulting from tests of reasonable entry combinations and the entire range of possibilities for filtering trades. In almost all cases, the results were profitable, although those profits varied in a predictable pattern according to the parameter and filter values. The biggest danger still lies in selecting one set of parameters to trade. How do we know that it will be the best set, or even a good set, in the future?

To avoid that question, which probably does not have an answer, why not trade a number of different parameters, or even all of them? We don't need to decide on a 4-day stochastic when we can use 4, 5, 6, and 7 days. If you have a large enough investment, then trading all combinations with equal allocations could not be considered overfitting. Some combinations will be better than others, but we know that, over a long period of time, they all performed well. Your result should be the average performance of all pairs. Because this performance will vary, the average results should have less volatility and greater predictability than any one pair that we might choose. You will also increase your diversification and reduce your slippage because, for the same pair, trades will occur on different days and each trade will be smaller.
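
The smoothing effect of equal allocations can be illustrated with a small sketch. The daily profit and loss streams below are hypothetical values for three calculation periods, not test results; the point is only that the volatility of the equally weighted average cannot exceed the average of the individual volatilities.

```python
import statistics

# Hypothetical daily P/L for the same pair traded with three calculation periods.
streams = {
    4: [12.0, -8.0, 5.0, -3.0, 9.0, -6.0],
    5: [-4.0, 10.0, -2.0, 7.0, -5.0, 8.0],
    6: [6.0, -1.0, 3.0, 2.0, -4.0, 5.0],
}

def equal_allocation(pl_streams):
    """Average the daily P/L across all streams, day by day."""
    series = list(pl_streams.values())
    return [sum(day) / len(series) for day in zip(*series)]

combined = equal_allocation(streams)
combined_vol = statistics.stdev(combined)
average_single_vol = statistics.mean(statistics.stdev(s) for s in streams.values())
```

The combined return is the average of the individual returns, while the combined volatility is lower whenever the streams do not move in lockstep.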

Benefiting from Pseudo-Leverage.

Supposing we are prepared to go forward with small per share returns, we now have what appears to be a viable pairs trading program. Returns are high enough to support costs, and the information ratio indicates that risk is reasonable. We now need to look deeper into one of our original assumptions: that we wanted to target a portfolio of 12% volatility. That's important because our returns and our profits per share (actually all the statistics) scale up and down based on our choice of target volatility.

In the section "Target Volatility," we calculated the necessary investment size for each pair by taking the daily net profits and losses (expressed in dollars or the currency of the stocks), finding the standard deviation, annualizing it, and then getting the investment size necessary to make that annualized standard deviation equal to 12% (our choice of target volatility). We will have to look at some actual numbers to find out whether this is really possible. In the next chapter, in which we use futures, leveraging the returns will be much easier, but for stocks it is not always possible to increase leverage.

Evaluating Leverage for the Pulte-Toll Brothers Pair. If we create a series of daily dollar returns for the Pulte-Toll Brothers pair (PHM-TOL), then take the standard deviation of those returns, we get $26.89. Annualizing that value by multiplying by the square root of 252 gives $426.90; dividing by 0.12 to find the investment size that yields a 12% volatility then gives an investment of $3,557.50. The question now is: Are there any trades in which the amount of purchase plus the amount of short sale exceeds the investment of $3,557?

Unfortunately, the answer is yes. In Figure 3.17, the daily cost of shares, also called market exposure, is shown to exceed the investment size during the period from the first quarter 2004 until about August 2007, or about 3.5 years out of a little more than 9 years. The maximum cost was $6,098, 71% higher than our investment. What can be done to save our strategy?
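
The sizing arithmetic can be checked with a short sketch. It assumes an annualizing factor of the square root of 252 trading days, which reproduces the figures quoted above to within rounding; the function name is hypothetical.

```python
import math

def investment_for_target_volatility(daily_pl_std, target_vol=0.12, trading_days=252):
    """Investment at which the annualized dollar volatility equals `target_vol`."""
    annualized_std = daily_pl_std * math.sqrt(trading_days)
    return annualized_std / target_vol

investment = investment_for_target_volatility(26.89)  # about $3,557
max_exposure = 6098.0                                  # maximum cost of shares
excess = max_exposure / investment - 1.0               # about 0.71, or 71%
```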

Raise the Investment Amount. The easiest solution is to increase the investment size by 71% to $6,098. But we would still have the same profits and losses, so we would need to multiply all of our statistics by 0.58 to get the new result. That would reduce our 10-pair average profits per share from, for example, $0.11 to $0.0638 for the filtered case, and put it at a marginally profitable level.

FIGURE 3.17 PHM-TOL total cost of shares ("market exposure").

Borrow the Excess Investment. Stocks can be leveraged by as much as 50% by borrowing capital. Because the share cost exceeds the original investment by amounts varying from very small to 71%, that money can be borrowed at the current interest rate, say 5%. Because trades are not held long, funds would not be needed continuously. Estimating the cost, we have an average excess of $1,270 (the average of $3,557 and $6,098, less $3,557) for 3.5 years at 5%, or $222. That added cost represents an adjustment in total returns of 222/3557, or 6.2%. Then a per share return of $0.11 would drop to $0.103, a more manageable option.
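
The two alternatives so far, raising the investment and borrowing the excess, can be compared with the arithmetic below. It uses only the figures quoted in the text (a 5% interest rate, 3.5 years, and a filtered return of $0.11 per share) and rounds the scaling factor to 0.58 as the text does.

```python
original_investment = 3557.0
max_exposure = 6098.0
per_share = 0.11

# Raise the investment: same profits on a larger base, so all statistics shrink.
scale = round(original_investment / max_exposure, 2)  # 0.58
per_share_raised = per_share * scale                  # about $0.0638

# Borrow the excess: pay interest on the average amount over the investment.
average_excess = (original_investment + max_exposure) / 2 - original_investment
interest = average_excess * 0.05 * 3.5                # about $222
adjustment = interest / original_investment           # about 6.2%
per_share_borrowed = per_share * (1.0 - adjustment)   # about $0.103
```

Borrowing keeps most of the per share return, while raising the investment cuts it by more than 40%.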

Cap the Amount Traded. The cost of shares does not always exceed the original investment size; therefore, an alternative is to limit each trade to the amount of the investment. For example, if our investment is $3,000 and the total share value for this trade is $4,000, then the position size is reduced to 75% of the original amount. Instead of 100 shares, we trade 75 shares. The amount of this reduction will vary for each trade, depending on the ratio of actual cost to investment size. Once the trade is entered, the cost remains the same until the trade is closed out.
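
A minimal sketch of the capping rule follows. It assumes the same ratio is applied to both legs and that fractional shares are truncated; the function names are hypothetical.

```python
def capping_ratio(exposure, investment):
    """Fraction of the intended position that keeps exposure within the investment."""
    if exposure <= investment:
        return 1.0  # no reduction needed
    return investment / exposure

def capped_shares(shares, exposure, investment):
    """Scale the share count by the capping ratio, truncating to whole shares."""
    return int(shares * capping_ratio(exposure, investment))

# The example from the text: $3,000 investment, $4,000 total share value.
ratio = capping_ratio(4000.0, 3000.0)          # 0.75
shares = capped_shares(100, 4000.0, 3000.0)    # 75
```

The ratio is fixed when the trade is entered, so the position is not resized while the trade is open.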

Figure 3.18 shows the results of capping the exposure at the investment size for the pairs LEN-TOL (Figure 3.18a) and TOL-HOV (Figure 3.18b). In both cases, capping resulted in much better returns. The capping ratio, shown as bars across the top of the chart, and read on the right scale, shows the reduction in the size of the daily position needed to bring the exposure down to the investment size. Note that the largest losing period in the original profit/loss stream occurs at the same time that the capping ratio is most active in reducing the position size. This can be explained in terms of market volatility.

FIGURE 3.18 Comparison of capped and original PL for (a) LEN-TOL and (b) TOL-HOV.

In the pairs trading strategy, we not only equalize the risk of both legs but also enter a larger position when the volatility is lower. This is intended to give each trade an equal chance to contribute to the returns. However, when the volatility is low and more positions are entered, the total exposure has a greater chance of exceeding the investment size.

Because cutting the position size during periods of low volatility makes such a drastic improvement in performance, we can conclude that those periods were not good for the strategy. We thought that trades set during periods of low volatility could be removed using the low-volatility filter, but that does not seem to have been as effective as letting the exposure control the process. Table 3.18 shows the results of all 10 home builder pairs. While the number of total trades remains the same, the cumulative profits (TotPL) increased by 67%, and the information ratio jumped from 0.424 to 0.996. We could estimate the improvement in the profits per share by taking the accumulated capping ratio divided by the number of days. If the capping ratio was effective 30% of the time, and the average ratio was 0.85 (a 15% reduction), then the net impact would have been 0.30 × 0.15 = 0.045. We actually estimated the reduction in position size for LEN-TOL at about 6.5%. By itself, that would not be enough to be a major change, but combined with a 67% increase in profits, it should net a 75% increase in the profit per share.
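
The net-impact estimate can be reproduced from a daily series of capping ratios. The sketch below interprets the accumulated figure as the sum of the daily reductions (one minus the ratio), which is an assumption, and uses a hypothetical 10-day series in which capping is active on 3 days at a ratio of 0.85.

```python
def average_position_reduction(daily_capping_ratios):
    """Average daily reduction in position size implied by the capping ratios."""
    if not daily_capping_ratios:
        return 0.0
    total_reduction = sum(1.0 - r for r in daily_capping_ratios)
    return total_reduction / len(daily_capping_ratios)

# Hypothetical series: capping active 30% of the time with an average ratio of 0.85.
ratios = [1.0] * 7 + [0.85] * 3
net_impact = average_position_reduction(ratios)  # 0.30 x 0.15 = 0.045
```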

TABLE 3.18 Comparison of capped and original results for 10 home builder pairs.

Summarizing the Capping Results.

Was that just an anomaly, or does capping actually improve performance? For home builders, the improvement is very good, but we must remember that all home builders moved in the same way at the same time. For this to be a robust solution, it would need to work on other markets. But even then, there is a similarity in the price movement of all stocks on the same exchange.

Table 3.19 shows the results of applying the same capping method to the airline pairs, based on the momentum period of 7 and entry threshold of 40. As with the home builders, all the results were improved, although not quite as dramatically. Cumulative profits rose from an average of $1,356 to $1,927, and the information ratio increased from an average of 0.335 to 0.572. Overall, the results confirm our previous tests, but it is not a comprehensive study.

TABLE 3.19 Comparison of capping for airline pairs with momentum 7, entry 40.

The Problem with Zero-Value Returns.

For those who are interested in the mathematics, one of the reasons that the actual cost of buying and selling the pair exceeds our investment is that the annualized standard deviation of the returns is too low. How can that be? After all, it's just a simple statistical calculation. Of course, the number is technically correct, but it includes all of the zero returns on those days when we had no position. Therefore, the less often the pair trades, the smaller the standard deviation. The actual risk on the days that a position was held would be much higher, and we could see that if we had only used the days on which returns were nonzero, that is, when we were holding a position. But then, we would still have had the same returns, but the volatility measure would give us a bigger number. Then our target volatility would have required a larger investment. A larger investment with the same returns would have resulted in a lower rate of return!

The idea of using only nonzero returns when annualizing the volatility is important for infrequent trading. If we are using only 10 days to calculate the annualized volatility and only one of those days has activity, then the volatility is going to look unreasonably small. And if you followed 10 days of being out of the market with 10 days of trading, then the volatility will be increasing every day as the zero returns drop off. That creates an unstable volatility measure. Instead, we really want to know, When we are in the market, what is our risk? To know that, we should only calculate the volatility on days with nonzero returns.
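
The effect of the zero returns is easy to demonstrate. The sketch below uses a hypothetical 10-day profit and loss series with a position held on only 4 days, and assumes a day with a return of exactly zero means no position was held (a flat day while in a trade would also be dropped, which is a limitation of this shortcut).

```python
import math
import statistics

def annualized_volatility(daily_pl, nonzero_only=False, trading_days=252):
    """Annualized standard deviation of daily P/L, optionally ignoring zero days."""
    data = [x for x in daily_pl if x != 0] if nonzero_only else list(daily_pl)
    if len(data) < 2:
        return 0.0  # not enough observations to measure volatility
    return statistics.stdev(data) * math.sqrt(trading_days)

# Hypothetical daily P/L in dollars: out of the market for 6 days, in for 4.
daily_pl = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 30.0, -25.0, 40.0, -35.0]

vol_all_days = annualized_volatility(daily_pl)
vol_in_market = annualized_volatility(daily_pl, nonzero_only=True)

# Investment needed for a 12% target volatility under each measure.
investment_all_days = vol_all_days / 0.12
investment_in_market = vol_in_market / 0.12
```

The in-market measure is considerably larger, so it calls for a larger investment and, with the same profits, a lower rate of return.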

USING ETFS.

Exchange-traded funds (ETFs) can be used in pairs trading as either one of the legs or as a substitute for short sales; however, they are an index, not an individual stock. As one of the legs, they offer additional diversification, and as a substitute for short sales, they may make trading easier. Short sales can be executed as easily as entering longs, and they will not be affected if the government reinstates the uptick rule or even limits short sales during periods of financial stress. The ETFs can always be traded long or short in the same way as futures, without bias and without additional expense.

Cross-Margining and Counterparty Risk.

One very important advantage of using ETFs is cross-margining, a facility also available for trading futures and options. Cross-margining is the recognition that a pairs trade, or any spread between two related markets, has less risk than being long or short both markets. Then the amount of margin (good-faith deposit, not share value) required by the broker or dealer is much lower than the face value of the ETFs because the risk is offset. It is not clear whether there is cross-margining between a stock and an ETF; however, these can be negotiable items if both trades are placed at the same electronic communication network (ECN).

One note of caution: An ETF is guaranteed by the creditworthiness of the issuer. During a time when banks have been bailed out or, in the case of Lehman Brothers, collapsed, this level of counterparty risk may be unacceptable to some investors. They should carefully assess this aspect of risk before using ETFs. They are not guaranteed by the exchange clearing house as are futures markets. Although the New York Stock Exchange has a fiduciary responsibility to handle the transactions correctly, they do not guarantee the firms listed on the exchange. And with American depositary receipts (ADRs) listed, there is very little oversight as to whether any of their financial disclosure is correct.

Composition of an ETF.

A sector ETF is an average of a selection of stocks within that sector. An ETF is usually capitalization weighted (the number of outstanding shares times the price of a share), similar to the S&P or DJIA, or less often, equally weighted (the same number of shares). Because it is an average, the price of the ETF compared with a single stock will not be as volatile, nor will it reach the stochastic extremes that are used for single stock thresholds. Based on that, we would expect fewer trades with the same profit per share as individual stocks, or the same number of trades with smaller profits per share. But before we eliminate it because of purely theoretical reasons, we need to see how the numbers look.

There are two viable ETF candidates for the home builders, and both have reasonable liquidity:

1. XHB, SPDR S&P Homebuilders ETF, with data from February 6, 2006.

2. ITB, iShares DJ US Home Construction, with data from May 5, 2006.

Figure 3.19 shows that these two ETFs track very closely.

FIGURE 3.19 The home builders ETFs, XHB and ITB, both track one another closely.

Using an ETF as One Leg of a Pair. Considering the ETF as one leg of a pair, we can choose either XHB or ITB, creating five additional pairs. The data for these ETFs begin in February 2006 for XHB and May 2006 for ITB; therefore, we will create the statistics for the full 10 pairs from February 2006 using the same parameters, a momentum of 4, and an entry threshold of 40. The results are shown in Table 3.20 and are not as good as for the longer interval, beginning January 2000, but they will give us a benchmark to compare the use of an ETF.

TABLE 3.20 Home builder pairs from February 6, 2006.

The first case is to treat the ETF as another home builder stock, leg 2 in our test. Table 3.21 shows the results. In both cases, there are only two winning pairs and three losing ones. Surprisingly, they are not the same pairs, even though the chart of the ETFs seemed nearly identical. There are also many more trades than the pairs without ETFs, which was unexpected, and needs to be explained.

TABLE 3.21 Home builders.

The stochastic indicator will take on values from 0 to 100 regardless of volatility. It simply adjusts to the current volatility level and identifies the extremes within its current framework. We expected the ETF to have less volatility than a single stock because it is an average, and averages benefit from diversification (an upward move offsetting a downward move). To prove that, we applied the low-volatility filter at the arbitrary level of 100% and show the results for ITB in Table 3.22a and XHB in Table 3.22b. When we applied the low-volatility filter to the test beginning in January 2000, the filter level of 100% annualized volatility reduced trades from 137 to 20, showing a somewhat normal distribution. With the interval beginning in 2006, trades were reduced less, indicating a flatter distribution.

Capping the Pairs Using ETFs.

As further confirmation that capping benefits the performance, we applied the same capping principle to the five pairs after the 100% low-volatility filter was applied. Results are shown in Table 3.23. As with the earlier test, performance improved drastically, with the ratio jumping from 0.147 to 0.687 for ITB (Table 3.23a) and from 0.334 to 0.535 for XHB (Table 3.23b). Overall, ITB seems to be a better performer, but it is still necessary to filter trades in order to create a profile that would be profitable after transaction costs.

TABLE 3.22 Using an ETF as leg 2 with a low-volatility filter of 100% annualized.

TABLE 3.23 Using an ETF as leg 2 with a volatility filter and capping.

Using ETFs as a Substitute for Short Sales. A practical use for the ETFs is as a substitute for short sales, because there are no restrictions on trading ETFs on the short side, and new regulations are not likely to affect them. The ETF comes into play at the time of execution; the signal to buy or sell the pair is still based on the two stock legs.