Backtesting a Moving Average Crossover in Python with pandas

Author: Goodness, Created: 2019-03-27 15:11:40, Updated:

In this article we will make use of the machinery we introduced to carry out research on an actual strategy, namely the Moving Average Crossover on AAPL.

Moving Average Crossover Strategy

The Moving Average Crossover technique is an extremely well-known simplistic momentum strategy. It is often considered the “Hello World” example for quantitative trading.

The strategy as outlined here is long-only. Two separate simple moving average filters are created, with varying lookback periods, of a particular time series. Signals to purchase the asset occur when the shorter lookback moving average exceeds the longer lookback moving average. If the longer average subsequently exceeds the shorter average, the asset is sold back. The strategy works well when a time series enters a period of strong trend and then slowly reverses the trend.

For this example, I have chosen Apple, Inc. (AAPL) as the time series, with a short lookback of 100 days and a long lookback of 400 days. This is the example provided by the zipline algorithmic trading library. Thus if we wish to implement our own backtester we need to ensure that it matches the results in zipline, as a basic means of validation.

Implementation

Make sure to follow the previous tutorial here, which describes how the initial object hierarchy for the backtester is constructed, otherwise the code below will not work. For this particular implementation I have used the following libraries:

  • Python - 2.7.3
  • NumPy - 1.8.0
  • pandas - 0.12.0
  • matplotlib - 1.1.0

The implementation of ma_cross.py requires backtest.py from the previous tutorial. The first step is to import the necessary modules and objects:

# ma_cross.py

import datetime
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from pandas.io.data import DataReader
from backtest import Strategy, Portfolio

As in the previous tutorial we are going to subclass the Strategy abstract base class to produce MovingAverageCrossStrategy, which contains all of the details on how to generate the signals when the moving averages of AAPL cross over each other.

The object requires a short_window and a long_window on which to operate. The values have been set to defaults of 100 days and 400 days respectively, which are the same parameters used in the main example of zipline.

The moving averages are created by using the pandas rolling_mean function on the bars[‘Close’] closing price of the AAPL stock. Once the individual moving averages have been constructed, the signal Series is generated by setting the colum equal to 1.0 when the short moving average is greater than the long moving average, or 0.0 otherwise. From this the positions orders can be generated to represent trading signals.

# ma_cross.py

class MovingAverageCrossStrategy(Strategy):
    """    
    Requires:
    symbol - A stock symbol on which to form a strategy on.
    bars - A DataFrame of bars for the above symbol.
    short_window - Lookback period for short moving average.
    long_window - Lookback period for long moving average."""

    def __init__(self, symbol, bars, short_window=100, long_window=400):
        self.symbol = symbol
        self.bars = bars

        self.short_window = short_window
        self.long_window = long_window

    def generate_signals(self):
        """Returns the DataFrame of symbols containing the signals
        to go long, short or hold (1, -1 or 0)."""
        signals = pd.DataFrame(index=self.bars.index)
        signals['signal'] = 0.0

        # Create the set of short and long simple moving averages over the 
        # respective periods
        signals['short_mavg'] = pd.rolling_mean(bars['Close'], self.short_window, min_periods=1)
        signals['long_mavg'] = pd.rolling_mean(bars['Close'], self.long_window, min_periods=1)

        # Create a 'signal' (invested or not invested) when the short moving average crosses the long
        # moving average, but only for the period greater than the shortest moving average window
        signals['signal'][self.short_window:] = np.where(signals['short_mavg'][self.short_window:] 
            > signals['long_mavg'][self.short_window:], 1.0, 0.0)   

        # Take the difference of the signals in order to generate actual trading orders
        signals['positions'] = signals['signal'].diff()   

        return signals

The MarketOnClosePortfolio is subclassed from Portfolio, which is found in backtest.py. It is almost identical to the implementation described in the prior tutorial, with the exception that the trades are now carried out on a Close-to-Close basis, rather than an Open-to-Open basis. For details on how the Portfolio object is defined, see the previous tutorial. I’ve left the code in for completeness and to keep this tutorial self-contained:

# ma_cross.py

class MarketOnClosePortfolio(Portfolio):
    """Encapsulates the notion of a portfolio of positions based
    on a set of signals as provided by a Strategy.

    Requires:
    symbol - A stock symbol which forms the basis of the portfolio.
    bars - A DataFrame of bars for a symbol set.
    signals - A pandas DataFrame of signals (1, 0, -1) for each symbol.
    initial_capital - The amount in cash at the start of the portfolio."""

    def __init__(self, symbol, bars, signals, initial_capital=100000.0):
        self.symbol = symbol        
        self.bars = bars
        self.signals = signals
        self.initial_capital = float(initial_capital)
        self.positions = self.generate_positions()
        
    def generate_positions(self):
        positions = pd.DataFrame(index=signals.index).fillna(0.0)
        positions[self.symbol] = 100*signals['signal']   # This strategy buys 100 shares
        return positions
                    
    def backtest_portfolio(self):
        portfolio = self.positions*self.bars['Close']
        pos_diff = self.positions.diff()

        portfolio['holdings'] = (self.positions*self.bars['Close']).sum(axis=1)
        portfolio['cash'] = self.initial_capital - (pos_diff*self.bars['Close']).sum(axis=1).cumsum()

        portfolio['total'] = portfolio['cash'] + portfolio['holdings']
        portfolio['returns'] = portfolio['total'].pct_change()
        return portfolio

Now that the MovingAverageCrossStrategy and MarketOnClosePortfolio classes have been defined, a main function will be called to tie all of the functionality together. In addition the performance of the strategy will be examined via a plot of the equity curve.

The pandas DataReader object downloads OHLCV prices of AAPL stock for the period 1st Jan 1990 to 1st Jan 2002, at which point the signals DataFrame is created to generate the long-only signals. Subsequently the portfolio is generated with a 100,000 USD initial capital base and the returns are calculated on the equity curve.

The final step is to use matplotlib to plot a two-figure plot of both AAPL prices, overlaid with the moving averages and buy/sell signals, as well as the equity curve with the same buy/sell signals. The plotting code is taken (and modified) from the zipline implementation example.

# ma_cross.py

if __name__ == "__main__":
    # Obtain daily bars of AAPL from Yahoo Finance for the period
    # 1st Jan 1990 to 1st Jan 2002 - This is an example from ZipLine
    symbol = 'AAPL'
    bars = DataReader(symbol, "yahoo", datetime.datetime(1990,1,1), datetime.datetime(2002,1,1))

    # Create a Moving Average Cross Strategy instance with a short moving
    # average window of 100 days and a long window of 400 days
    mac = MovingAverageCrossStrategy(symbol, bars, short_window=100, long_window=400)
    signals = mac.generate_signals()

    # Create a portfolio of AAPL, with $100,000 initial capital
    portfolio = MarketOnClosePortfolio(symbol, bars, signals, initial_capital=100000.0)
    returns = portfolio.backtest_portfolio()

    # Plot two charts to assess trades and equity curve
    fig = plt.figure()
    fig.patch.set_facecolor('white')     # Set the outer colour to white
    ax1 = fig.add_subplot(211,  ylabel='Price in $')
    
    # Plot the AAPL closing price overlaid with the moving averages
    bars['Close'].plot(ax=ax1, color='r', lw=2.)
    signals[['short_mavg', 'long_mavg']].plot(ax=ax1, lw=2.)

    # Plot the "buy" trades against AAPL
    ax1.plot(signals.ix[signals.positions == 1.0].index, 
             signals.short_mavg[signals.positions == 1.0],
             '^', markersize=10, color='m')

    # Plot the "sell" trades against AAPL
    ax1.plot(signals.ix[signals.positions == -1.0].index, 
             signals.short_mavg[signals.positions == -1.0],
             'v', markersize=10, color='k')

    # Plot the equity curve in dollars
    ax2 = fig.add_subplot(212, ylabel='Portfolio value in $')
    returns['total'].plot(ax=ax2, lw=2.)

    # Plot the "buy" and "sell" trades against the equity curve
    ax2.plot(returns.ix[signals.positions == 1.0].index, 
             returns.total[signals.positions == 1.0],
             '^', markersize=10, color='m')
    ax2.plot(returns.ix[signals.positions == -1.0].index, 
             returns.total[signals.positions == -1.0],
             'v', markersize=10, color='k')

    # Plot the figure
    fig.show()

The graphical output of the code is as follows. I made use of the IPython %paste command to put this directly into the IPython console while in Ubuntu, so that the graphical output remained in view. The pink upticks represent purchasing the stock, while the black downticks represent selling it back: Backtesting a Moving Average Crossover in Python with pandas AAPL Moving Average Crossover Performance from 1990-01-01 to 2002-01-01

As can be seen the strategy loses money over the period, with five round-trip trades. This is not surprising given the behaviour of AAPL over the period, which was on a slight downward trend, followed by a significant upsurge beginning in 1998. The lookback period of the moving average signals is rather large and this impacted the profit of the final trade, which otherwise may have made the strategy profitable.

In subsequent articles we will create a more sophisticated means of analysing performance, as well as describing how to optimise the lookback periods of the individual moving average signals.


More