分享 Kevin Boller如何利用一些非常有用的在线资源,Yahoo Finance API
(需要解决,并且可能需要替换未来的数据源)和Jupyter notebook
,以在很大程度上自动跟踪和衡量股票投资组合相关情况 。
PME概述和个股票表现
简述背景,作者从2002年开始投资他的股票投资组合,并在几年前为其投资组合开发了一个财务模型。多年来,他会下载历史价格,并将数据加载到财务模型中-虽然在线经纪人计算已实现和未实现的回报,以及收入和股息,但在他进行自己的分析以评估时,喜欢在模型中包含历史数据。他说有一种观点/报告是从未从在线经纪人和服务中找到的,那就是类似于“公开市场”的分析。简而言之,公开市场等值(PME)是私募股权行业用来比较私募股权基金相对于行业基准表现的一组分析。具体参考https://docs.preqin.com/reports/Preqin-Special-Report-PME-July-2015.pdf。
与此相关的是,绝大多数股票投资组合经理无法选择长期表现优于大盘的股票组合,例如标准普尔500指数(约每20只积极管理的国内基金中就有1只优于指数基金)。即使一些个股表现优异,其他个股表现不佳的股票往往会超过表现较好的股票,这意味着总体而言,投资者的境况比简单地投资指数基金更糟糕。在商学院期间,作者了解了PME,在评估他当时持有的公共股票时,他在概念上也加入了类似的分析。要恰当地做到这一点,我们应该衡量每个投资组合头寸(持有期)相对于同一持有期内标准普尔500指数等值美元投资的具体投资流入时间。举个例子,如果你在2016年6月1日购买了一只股票,而且你仍然拥有它,你会想要将这只股票在那段时间的回报与2016年6月1日在标准普尔500指数(我们的基准例子)中同等美元投资的回报进行比较。在其他方面,你可能会发现,即使一只股票表现相对较好,它在同一时间段内仍可能落后于标准普尔500指数的回报。
过去,作者从雅虎财经(Yahoo Finance)下载历史价格数据,并使用EXCEL中的指数和匹配函数来计算每个头寸与标准普尔500指数的相对持有期表现。虽然这是实现这一目标的一种不错的方式,但使用Jupyter notebook
中的pandas
进行同样的操作更具伸缩性和可扩展性。无论何时下载新数据并加载到excel中,都不可避免地需要修改一些公式并验证错误。使pandas
,添加新的计算,例如累计ROI倍数(后面将介绍),几乎不需要任何时间就能实现。并且使用Plotly
的可视化效果具有很高的重现性,在预测方面也更有用。
声明:这篇文章中的任何内容都不应被视为投资建议。过去的表现不一定预示着未来的回报。这些是关于如何使用
pandas
导入不同时间间隔内的一小部分股票样本的数据,以及如何根据指数对它们的个人表现进行基准测试的一般示例。你应该把所有与投资有关的问题都告诉你的财务顾问。
除了贡献本教程之外,作者还在继续修改和构建此方法,并在本文末尾概述了进一步开发的一些注意事项。作者相信这个帖子会对面向数据科学的初学者和中级金融专业人员有帮助,特别是因为这应该延伸到许多其他类型的金融分析。这种方法在某种意义上是“类似PME的”,即它衡量的是相等持有期内的投资流入。由于公开市场投资的流动性比私募股权高得多,假设你遵循的是落后止损方法,从作者的角度来看,更重要的是专注于积极持股-通常情况下,明智的做法是剥离表现逊于基准的持股,或者你出于各种原因不再想持有的持股,作者主张着眼长远,只要他们表现优异,作者就很高兴长期持有这些份额的股票。
资源:作者现在是DataCamp的订阅者(未来的帖子将在DataCamp上发布),他说这个关于Python for Finance的社区教程非常棒。
作者已经为这篇文章创建了一个repo,包括这里的Python笔记本和这里的excel文件。
如果你想看到完整的交互式版本(因为jupyter <←>> gihub集成非常棒),你可以在这里使用nbviewer查看。https://nbviewer.jupyter.org/github/kdboller/pythonsp500/blob/b45eb79daf15d3c3032a8f398c207c9d2721ac19/Investment%20Portfolio%20Python%20Notebook_03_2018_blog%20example.ipynb
我们要完成的大纲:
导入标准普尔500指数和样本股票代码数据,使用雅虎财经API创建一个合并的投资组合‘主’文件,该文件将样本投资组合数据与历史股票代码和历史标准普尔500指数数据相结合,确定每项投资的收购日期标准普尔500指数的收盘价是多少,这使我们可以计算标准普尔500指数的等值股票position与相同美元投资的股票position计算投资组合position与标准普尔500指数在这段时间内的相对百分比和美元价值回报,计算累计投资组合回报和ROI倍数。
为了评估这个示例投资组合与市场指数相比有多好,一个更重要的项目:动态计算每个position相对于落后止损的表现,例如,如果一个position收盘比收盘高点低25%,考虑在下一个交易日卖出该position。
可视化
总回报比较-随着时间的推移,每个position相对于指数基准累计回报的百分比回报-每个position相对于基准累积投资的美元收益/(亏损)
-鉴于上述情况,总体投资回报与标准普尔500指数投资的同等权重和时间段相比如何?调整后的高收盘价百分比比较-每个position最近的收盘价相对于其自买入以来的调整后收盘价是多少?
Data Import和Dataframe操作
从导入必要的Python库开始,导入Plotly offline模块,然后读入我们的示例项目组合DataFrame。
# Import initial libraries
import pandas as pd
import numpy as np
import datetime
import matplotlib.pyplot as plt
import plotly.graph_objs as go
%matplotlib inline
# Imports in order to be able to use Plotly offline.
from plotly import __version__
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
print(__version__) # requires version >= 1.9.0
init_notebook_mode(connected=True)
# Import the Sample worksheet with acquisition dates and initial cost basis:
portfolio_df = pd.read_excel('Sample stocks acquisition dates_costs.xlsx', sheet_name='Sample')
portfolio_df.head(10)
2020410订正:sheetname 改为 sheet_name
,可能由于时间过去2年,函数改了参数。上面在.xlsx文件下载链接:https://github.com/kdboller/pythonsp500
现在您已经阅读了示例投资组合文件,您将创建一些变量来捕捉标准普尔500指数和所有投资组合的股票代码的日期范围。请注意,这是此notebook为数不多的需要每周更新的方面之一(调整日期范围以包括最近的交易周-在此,我们将此价格运行到2018年3月9日)。
# Date Ranges for SP 500 and for all tickers
# Modify these date ranges each week.
# The below will pull back stock prices from the start date until end date specified.
start_sp = datetime.datetime(2013, 1, 1)
end_sp = datetime.datetime(2018, 3, 9)
# This variable is used for YTD performance.
end_of_last_year = datetime.datetime(2017, 12, 29)
# These are separate if for some reason want different date range than SP.
stocks_start = datetime.datetime(2013, 1, 1)
stocks_end = datetime.datetime(2018, 3, 9)
正如Python Finance培训帖子中所提到的,pandas-datareader
包使我们能够读取来自Google, Yahoo! Finance and the World Bank。在这里,作者将重点介绍雅虎金融(Yahoo! Finance),虽然他已经与Quantopian
进行了非常初步的合作,也开始将quandl
作为数据源进行研究。正如DataCamp帖子中也提到的,Yahoo API端点最近发生了更改,这需要安装临时修复程序才能支持Yahoo! Finance投入到工作中去。作者已经在下面的代码中做了需要的细微调整。作者注意到一些小的数据问题,数据并不总是像预期的那样读入,或者有时会错过最后一个交易日。虽然这些问题相对较少发生,但他正在继续关注Yahoo! Finance并且认为将是未来最好和最可靠的数据来源。
# Leveraged from the helpful Datacamp Python Finance trading blog post.
from pandas_datareader import data as pdr
import fix_yahoo_finance as yf
yf.pdr_override() # <== that's all it takes :-)
sp500 = pdr.get_data_yahoo('^GSPC',
start_sp,
end_sp)
sp500.head()
订正
:
*** `fix_yahoo_finance` was renamed to `yfinance`. ***
Please install and use `yfinance` directly using `pip install yfinance -U`
More information: https://github.com/ranaroussi/yfinance
如果你带着你自己的notebook,一旦你成功地从雅虎的API读取了数据,你应该会看到类似下面的内容:
加载标准普尔500数据后,您将看到作者检查了dataframe的头部和尾部,并将数据压缩为只包括Adj Close列。Adjusted Close与Close columns之间的区别在于,Adjusted Close会反映红利(请参阅下面的未来开发区域)。当一家公司发放股息时,股价会减去每股股息的大小,因为公司正在分配公司收益的一部分。出于此分析的目的,您只需要分析此列。作者还创建了一个dataframe,其中只包括标准普尔指数在2017年最后一天(2018年初)adjusted close;这是为了运行单个股票相对于标准普尔500指数表现的YTD比较。
在下面的代码中,您将创建示例投资组合dataframe中所有自动收报机的数组。然后编写一个函数,将所有的股票代码及其相关数据读入一个新的dataframe中,这基本上与您对标准普尔500指数采取的方法相同,但适用于投资组合的所有股票代码。
# Generate a dynamic list of tickers to pull from Yahoo Finance API based on the imported file with tickers.
tickers = portfolio_df['Ticker'].unique()
tickers
# Stock comparison code
def get(tickers, startdate, enddate):
def data(ticker):
return (pdr.get_data_yahoo(ticker, start=startdate, end=enddate))
datas = map(data, tickers)
return(pd.concat(datas, keys=tickers, names=['Ticker', 'Date']))
all_data = get(tickers, stocks_start, stocks_end)
与标准普尔500dataframe一样,您将创建一个adj_close
dataframe,它只包含所有股票代码的Adj Closecolumn
。如果您查看上面作者链接到的repo中的notebook,您会发现这段代码包含的代码块比下面显示的更多。为了在这里描述这一点,作者在下面包含了通向初始merged_portfolio
dataframe的所有代码。
# Also only pulling the ticker, date and adj. close columns for our tickers.
adj_close = all_data[['Adj Close']].reset_index()
adj_close.head()
# Grabbing the ticker close from the end of last year
adj_close_start = adj_close[adj_close['Date']==end_of_last_year]
adj_close_start.head()
# Grab the latest stock close price
adj_close_latest = adj_close[adj_close['Date']==stocks_end]
adj_close_latest
adj_close_latest.set_index('Ticker', inplace=True)
adj_close_latest.head()
# Set portfolio index prior to merging with the adj close latest.
portfolio_df.set_index(['Ticker'], inplace=True)
portfolio_df.head()
# Merge the portfolio dataframe with the adj close dataframe; they are being joined by their indexes.
merged_portfolio = pd.merge(portfolio_df, adj_close_latest, left_index=True, right_index=True)
merged_portfolio.head()
# The below creates a new column which is the ticker return; takes the latest adjusted close for each position
# and divides that by the initial share cost.
merged_portfolio['ticker return'] = merged_portfolio['Adj Close'] / merged_portfolio['Unit Cost'] - 1
merged_portfolio
这取决于你对pandas的熟悉程度,这将是非常直接的到稍微令人不知所措的。下面,作者解释了这些命令行在做的事情:
- The overall approach you are taking is an example of split-apply-combine (note this downloads a PDF).
- The
all_data[['Adj Close']]
line creates a new dataframe with only the columns provided in the list; hereAdj Close
is the only item provided in the list. - Using this line of code,
adj_close[adj_close['Date']==end_of_last_year]
, you are filtering theadj_close
dataframe to only the row where the data’sDate
column equals the date which you earlier specified in theend_of_last_year
variable (2017, 12, 29). - You also set the index of the
adj_close_latest
andportfolio_df
dataframes. I did this because this is how you’ll merge the two dataframes. Themerge
function, very similar to SQL joins, is an extremely useful function which I use very often. - Within the
merge
function, you specify the left dataframe (portfolio_df
) and our right dataframe (adj_close_latest
). By specifyingleft_index
andright_index
equal True, you are stating that the two dataframes share a common index and you will join both on this. - Last, you create a new column called
'ticker return'
. This calculates the percent return for each stock position by dividing theAdj Close
by theUnit Cost
(initial purchase price for stock) and subtracting 1. This is similar to calculating a formula in excel and carrying it down, but inpandas
this is accomplished with one-line of code.
你已经获得了标准普尔500指数和个股的单个数据框,你开始开发一个‘master’ dataframe,我们将使用它来进行计算、可视化和任何进一步的分析。接下来,将继续在这个‘master’ dataframe上构建,并进一步使用pandas merge 函数。下面,将重置当前dataframe的索引,并开始将较小的dataframe与‘master’ dataframe连接起来。下面的代码块再次在Jupyter notebook中进一步展开;在这里,作者采取与前面类似的方法,将共享下面的代码,然后分解代码块下面的关键标注。
merged_portfolio.reset_index(inplace=True)
# Here we are merging the new dataframe with the sp500 adjusted closes since the sp start price based on
# each ticker's acquisition date and sp500 close date.
merged_portfolio_sp = pd.merge(merged_portfolio, sp_500_adj_close, left_on='Acquisition Date', right_on='Date')
# .set_index('Ticker')
# We will delete the additional date column which is created from this merge.
# We then rename columns to Latest Date and then reflect Ticker Adj Close and SP 500 Initial Close.
del merged_portfolio_sp['Date_y']
merged_portfolio_sp.rename(columns={'Date_x': 'Latest Date', 'Adj Close_x': 'Ticker Adj Close'
, 'Adj Close_y': 'SP 500 Initial Close'}, inplace=True)
# This new column determines what SP 500 equivalent purchase would have been at purchase date of stock.
merged_portfolio_sp['Equiv SP Shares'] = merged_portfolio_sp['Cost Basis'] / merged_portfolio_sp['SP 500 Initial Close']
merged_portfolio_sp.head()
# We are joining the developing dataframe with the sp500 closes again, this time with the latest close for SP.
merged_portfolio_sp_latest = pd.merge(merged_portfolio_sp, sp_500_adj_close, left_on='Latest Date', right_on='Date')
# Once again need to delete the new Date column added as it's redundant to Latest Date.
# Modify Adj Close from the sp dataframe to distinguish it by calling it the SP 500 Latest Close.
del merged_portfolio_sp_latest['Date']
merged_portfolio_sp_latest.rename(columns={'Adj Close': 'SP 500 Latest Close'}, inplace=True)
merged_portfolio_sp_latest.head()
- 可以在
merged_portfolio
上使用reset_index
,以便扁平化主数据框并连接较小的数据框的相关列。 - 在
merged_portfolio_sp
行中,您可以将当前主数据框(merged_portfolio)与sp_500_adj_close合并;这样做是为了使标准普尔在每个position的采购日期的收盘价-这允许您跟踪每个position持有的同一时间期(从收购日期到最近的市场收盘日期)的标准普尔业绩。 - 这里的合并与以前略有不同,因为我们在左侧数据框的
Acquisition Date
列和右侧数据框的Date
列上连接。 - 完成此合并后,您将拥有不需要的额外列-由于我们的主数据框最终将有相当数量的列用于分析,因此在此过程中删除重复和不必要的列非常重要。
- 有几种方法可以删除不必要的列并执行各种列名清理;为简单起见,作者使用
python del
,然后使用pandasrename
方法重命名几个列,通过重命名为Ticker Adj Close
来澄清股票代码的Adj Close
列;您可以使用SP 500 Initial Close
来区分标准普尔指数的初始调整收盘。 - 在计算
merged_portfolio_sp['Equiv SP Shares']
时,这样做是为了能够计算标准普尔500指数在收购每个股票position的当天收盘时的等价值:如果你花5,000美元购买一个新的股票position,你可以花5,000美元购买标准普尔500指数;继续这个例子,如果标准普尔500指数在购买时的交易价格是每股2,500美元,你就可以购买2股股票。稍后,如果标准普尔500指数的交易价格为每股3,000美元,你的股份将价值6,000美元(2股等值股票*每股3,000美元),在这段可比的时间段内,你将获得1,000美元的账面利润。 - 在代码块的其余部分,接下来执行类似的合并,这一次加入了标准普尔500指数的最新收盘价-这提供了计算标准普尔500指数相对于每个position持有期的可比回报所需的第二个部分:每个股票收购日的标准普尔500指数价格和标准普尔500指数的最新收盘价。
现在,您已经使用以下内容进一步开发了您的主数据框:
- 每个投资组合position的价格、股票和position收购日的价值,以及最新的市场收盘价。
- 每个股票的等值position收购日的等值标准普尔500指数价格、股票和价值,以及标准普尔500指数的最新收盘价。
鉴于上述情况,接下来您将执行必要的计算,以相对于标准普尔500指数的可比美元投资和持有时间,比较每个position的表现,以及这一策略/一篮子股票的整体表现。
下面是您要添加到“主”数据框中的新列的摘要。
- 在第一列
['SP Return']
中,您创建了一个列,该列计算标准普尔500指数在每个position持有期间的绝对百分比回报(注意,这是绝对回报,而不是年化回报)。在第二列(['Abs. Return Compare']
)中,您将在同一时间段内将['ticker return']
(每个位置的回报)与['SP Return']
进行比较。 - 在接下来的三栏
['Ticker Share Value'], ['SP 500 Value'] and ['Abs Value Compare']
中,我们根据我们持有的股票乘以最新调整后的收盘价来计算美元价值(市值)等值(并从股票代码中减去标准普尔回报,以计算超出/(低于)业绩)。 - 最后,
['Stock Gain / (Loss)']
和['SP 500 Gain / (Loss)']
列计算我们每个position的未实现美元收益/亏损和可比的标准普尔500收益/亏损;这使我们能够比较每个position的价值影响与简单地将这些美元投资于标准普尔500指数。
# Percent return of SP from acquisition date of position through latest trading day.
merged_portfolio_sp_latest['SP Return'] = merged_portfolio_sp_latest['SP 500 Latest Close'] / merged_portfolio_sp_latest['SP 500 Initial Close'] - 1
# This is a new column which takes the tickers return and subtracts the sp 500 equivalent range return.
merged_portfolio_sp_latest['Abs. Return Compare'] = merged_portfolio_sp_latest['ticker return'] - merged_portfolio_sp_latest['SP Return']
# This is a new column where we calculate the ticker's share value by multiplying the original quantity by the latest close.
merged_portfolio_sp_latest['Ticker Share Value'] = merged_portfolio_sp_latest['Quantity'] * merged_portfolio_sp_latest['Ticker Adj Close']
# We calculate the equivalent SP 500 Value if we take the original SP shares * the latest SP 500 share price.
merged_portfolio_sp_latest['SP 500 Value'] = merged_portfolio_sp_latest['Equiv SP Shares'] * merged_portfolio_sp_latest['SP 500 Latest Close']
# This is a new column where we take the current market value for the shares and subtract the SP 500 value.
merged_portfolio_sp_latest['Abs Value Compare'] = merged_portfolio_sp_latest['Ticker Share Value'] - merged_portfolio_sp_latest['SP 500 Value']
# This column calculates profit / loss for stock position.
merged_portfolio_sp_latest['Stock Gain / (Loss)'] = merged_portfolio_sp_latest['Ticker Share Value'] - merged_portfolio_sp_latest['Cost Basis']
# This column calculates profit / loss for SP 500.
merged_portfolio_sp_latest['SP 500 Gain / (Loss)'] = merged_portfolio_sp_latest['SP 500 Value'] - merged_portfolio_sp_latest['Cost Basis']
merged_portfolio_sp_latest.head()
现在你有了需要的东西,可以将你的投资组合的表现与同等投资于标准普尔500指数的投资组合进行比较。接下来的两个代码块部分允许您:i)比较每个position相对于标准普尔500指数的YTD表现(衡量动能和您的position的走势);ii)比较每个投资组合position的最新收盘价与其最近的收盘高点(这允许您评估position是否触发了跟踪止损,例如,收盘时比收盘高点低25%)。
下面,作者将从YTD performance代码块开始,并在下面进一步提供有关代码的详细信息。
# Merge the overall dataframe with the adj close start of year dataframe for YTD tracking of tickers.
merged_portfolio_sp_latest_YTD = pd.merge(merged_portfolio_sp_latest, adj_close_start, on='Ticker')
# , how='outer'
# Deleting date again as it's an unnecessary column. Explaining that new column is the Ticker Start of Year Close.
del merged_portfolio_sp_latest_YTD['Date']
merged_portfolio_sp_latest_YTD.rename(columns={'Adj Close': 'Ticker Start Year Close'}, inplace=True)
# Join the SP 500 start of year with current dataframe for SP 500 ytd comparisons to tickers.
merged_portfolio_sp_latest_YTD_sp = pd.merge(merged_portfolio_sp_latest_YTD, sp_500_adj_close_start
, left_on='Start of Year', right_on='Date')
# Deleting another unneeded Date column.
del merged_portfolio_sp_latest_YTD_sp['Date']
# Renaming so that it's clear this column is SP 500 start of year close.
merged_portfolio_sp_latest_YTD_sp.rename(columns={'Adj Close': 'SP Start Year Close'}, inplace=True)
# YTD return for portfolio position.
merged_portfolio_sp_latest_YTD_sp['Share YTD'] = merged_portfolio_sp_latest_YTD_sp['Ticker Adj Close'] / merged_portfolio_sp_latest_YTD_sp['Ticker Start Year Close'] - 1
# YTD return for SP to run compares.
merged_portfolio_sp_latest_YTD_sp['SP 500 YTD'] = merged_portfolio_sp_latest_YTD_sp['SP 500 Latest Close'] / merged_portfolio_sp_latest_YTD_sp['SP Start Year Close'] - 1
- 在创建
merged_portfolio_sp_latest_YTD
数据框时,您现在正在将主数据框与adj_close_start
数据框合并;作为提醒,您是通过筛选adj_close
数据框创建此数据框的,其中'Date'
列等于变量end_of_last_year
;这样做是因为YTD(年初至今)股票和指数表现是通过它来衡量的;去年的结束收盘价是下一年的结束收盘价;您这样做是因为它是衡量YTD(年初至今)股票和指数表现的方式;去年的结束收盘价就是下一年的结束收盘价;您这样做是因为它是衡量YTD(年初至今)股票和指数表现的方式;去年的结束收盘价是次年的结束日期. - 在这里,我们再次使用
del
删除不必要的列,并使用rename
方法来澄清主数据框新添加的列。 - 最后,我们提取每个Ticker(在
['Ticker Adj Close']
列中),并计算每个Ticker的YTD回报(SP 500 Latest Close
列中的每个值也有一个标准普尔500等效值)。
在下面的代码块中,您使用sort_values
方法对‘主’数据框重新排序,然后计算累计投资组合投资(持仓成本之和),以及投资组合头寸的累计价值和理论标准普尔500指数投资的累计价值。这使你能够看到你的总投资组合(在整个时期的不同时间对头寸进行的投资)与简单地投资于指数的策略进行了整体比较。稍后,您将使用['Cum Ticker ROI Mult']
来帮助您可视化每项投资对总体投资回报(ROI)的贡献或减少程度。
merged_portfolio_sp_latest_YTD_sp = merged_portfolio_sp_latest_YTD_sp.sort_values(by='Ticker', ascending=True)
# Cumulative sum of original investment
merged_portfolio_sp_latest_YTD_sp['Cum Invst'] = merged_portfolio_sp_latest_YTD_sp['Cost Basis'].cumsum()
# Cumulative sum of Ticker Share Value (latest FMV based on initial quantity purchased).
merged_portfolio_sp_latest_YTD_sp['Cum Ticker Returns'] = merged_portfolio_sp_latest_YTD_sp['Ticker Share Value'].cumsum()
# Cumulative sum of SP Share Value (latest FMV driven off of initial SP equiv purchase).
merged_portfolio_sp_latest_YTD_sp['Cum SP Returns'] = merged_portfolio_sp_latest_YTD_sp['SP 500 Value'].cumsum()
# Cumulative CoC multiple return for stock investments
merged_portfolio_sp_latest_YTD_sp['Cum Ticker ROI Mult'] = merged_portfolio_sp_latest_YTD_sp['Cum Ticker Returns'] / merged_portfolio_sp_latest_YTD_sp['Cum Invst']
merged_portfolio_sp_latest_YTD_sp.head()
您现在已经接近最后冲刺阶段,几乎准备好开始可视化您的数据,并评估您的投资组合的单个报价器和整体策略性能的优势和劣势。
与前面一样,作者已经包含了主代码块,用于确定仓位相对于最近收盘高点的交易位置;然后,并将在下面进一步解释代码。
# Need to factor in that some positions were purchased much more recently than others.
# Join adj_close dataframe with portfolio in order to have acquisition date.
portfolio_df.reset_index(inplace=True)
adj_close_acq_date = pd.merge(adj_close, portfolio_df, on='Ticker')
# delete_columns = ['Quantity', 'Unit Cost', 'Cost Basis', 'Start of Year']
del adj_close_acq_date['Quantity']
del adj_close_acq_date['Unit Cost']
del adj_close_acq_date['Cost Basis']
del adj_close_acq_date['Start of Year']
# Sort by these columns in this order in order to make it clearer where compare for each position should begin.
adj_close_acq_date.sort_values(by=['Ticker', 'Acquisition Date', 'Date'], ascending=[True, True, True], inplace=True)
# Anything less than 0 means that the stock close was prior to acquisition.
adj_close_acq_date['Date Delta'] = adj_close_acq_date['Date'] - adj_close_acq_date['Acquisition Date']
adj_close_acq_date['Date Delta'] = adj_close_acq_date[['Date Delta']].apply(pd.to_numeric)
# Modified the dataframe being evaluated to look at highest close which occurred after Acquisition Date (aka, not prior to purchase).
adj_close_acq_date_modified = adj_close_acq_date[adj_close_acq_date['Date Delta']>=0]
# This pivot table will index on the Ticker and Acquisition Date, and find the max adjusted close.
adj_close_pivot = adj_close_acq_date_modified.pivot_table(index=['Ticker', 'Acquisition Date'], values='Adj Close', aggfunc=np.max)
adj_close_pivot.reset_index(inplace=True)
# Merge the adj close pivot table with the adj_close table in order to grab the date of the Adj Close High (good to know).
adj_close_pivot_merged = pd.merge(adj_close_pivot, adj_close
, on=['Ticker', 'Adj Close'])
# Merge the Adj Close pivot table with the master dataframe to have the closing high since you have owned the stock.
merged_portfolio_sp_latest_YTD_sp_closing_high = pd.merge(merged_portfolio_sp_latest_YTD_sp, adj_close_pivot_merged
, on=['Ticker', 'Acquisition Date'])
# Renaming so that it's clear that the new columns are closing high and closing high date.
merged_portfolio_sp_latest_YTD_sp_closing_high.rename(columns={'Adj Close': 'Closing High Adj Close', 'Date': 'Closing High Adj Close Date'}, inplace=True)
merged_portfolio_sp_latest_YTD_sp_closing_high['Pct off High'] = merged_portfolio_sp_latest_YTD_sp_closing_high['Ticker Adj Close'] / merged_portfolio_sp_latest_YTD_sp_closing_high['Closing High Adj Close'] - 1
merged_portfolio_sp_latest_YTD_sp_closing_high
这是一个相当重大的提升,现在是我们期待已久的可视化的时候了。如果您在自己的笔记本中继续跟踪,您现在就有了一个非常丰富的数据框架,其中包含许多经过计算的投资组合指标,如下图所示:
可视化
# Ploty is an outstanding resource for interactive charts.
trace1 = go.Bar(
x = merged_portfolio_sp_latest_YTD_sp['Ticker'][0:10],
y = merged_portfolio_sp_latest_YTD_sp['Share YTD'][0:10],
name = 'Ticker YTD')
trace2 = go.Scatter(
x = merged_portfolio_sp_latest_YTD_sp['Ticker'][0:10],
y = merged_portfolio_sp_latest_YTD_sp['SP 500 YTD'][0:10],
name = 'SP500 YTD')
data = [trace1, trace2]
layout = go.Layout(title = 'YTD Return vs S&P 500 YTD'
, barmode = 'group'
, yaxis=dict(title='Returns', tickformat=".2%")
, xaxis=dict(title='Ticker')
, legend=dict(x=.8,y=1)
)
fig = go.Figure(data=data, layout=layout)
iplot(fig)
# Current Share Price versus Closing High Since Purchased
trace1 = go.Bar(
x = merged_portfolio_sp_latest_YTD_sp_closing_high['Ticker'][0:10],
y = merged_portfolio_sp_latest_YTD_sp_closing_high['Pct off High'][0:10],
name = 'Pct off High')
data = [trace1]
layout = go.Layout(title = 'Adj Close % off of High'
, barmode = 'group'
, yaxis=dict(title='% Below Adj Close High', tickformat=".2%")
, xaxis=dict(title='Ticker')
, legend=dict(x=.8,y=1)
)
fig = go.Figure(data=data, layout=layout)
iplot(fig)
Total Return Comparison Charts
trace1 = go.Bar(
x = merged_portfolio_sp_latest_YTD_sp_closing_high['Ticker'][0:10],
y = merged_portfolio_sp_latest_YTD_sp_closing_high['ticker return'][0:10],
name = 'Ticker Total Return')
trace2 = go.Scatter(
x = merged_portfolio_sp_latest_YTD_sp_closing_high['Ticker'][0:10],
y = merged_portfolio_sp_latest_YTD_sp_closing_high['SP Return'][0:10],
name = 'SP500 Total Return')
data = [trace1, trace2]
layout = go.Layout(title = 'Total Return vs S&P 500'
, barmode = 'group'
, yaxis=dict(title='Returns', tickformat=".2%")
, xaxis=dict(title='Ticker', tickformat=".2%")
, legend=dict(x=.8,y=1)
)
fig = go.Figure(data=data, layout=layout)
iplot(fig)
Cumulative Returns Over Time
trace1 = go.Bar(
x = merged_portfolio_sp_latest_YTD_sp_closing_high['Ticker'][0:10],
y = merged_portfolio_sp_latest_YTD_sp_closing_high['Stock Gain / (Loss)'][0:10],
name = 'Ticker Total Return ($)')
trace2 = go.Bar(
x = merged_portfolio_sp_latest_YTD_sp_closing_high['Ticker'][0:10],
y = merged_portfolio_sp_latest_YTD_sp_closing_high['SP 500 Gain / (Loss)'][0:10],
name = 'SP 500 Total Return ($)')
trace3 = go.Scatter(
x = merged_portfolio_sp_latest_YTD_sp_closing_high['Ticker'][0:10],
y = merged_portfolio_sp_latest_YTD_sp_closing_high['ticker return'][0:10],
name = 'Ticker Total Return %',
yaxis='y2')
data = [trace1, trace2, trace3]
layout = go.Layout(title = 'Gain / (Loss) Total Return vs S&P 500'
, barmode = 'group'
, yaxis=dict(title='Gain / (Loss) ($)')
, yaxis2=dict(title='Ticker Return', overlaying='y', side='right', tickformat=".2%")
, xaxis=dict(title='Ticker')
, legend=dict(x=.75,y=1)
)
fig = go.Figure(data=data, layout=layout)
iplot(fig)
trace1 = go.Bar(
x = merged_portfolio_sp_latest_YTD_sp_closing_high['Ticker'],
y = merged_portfolio_sp_latest_YTD_sp_closing_high['Cum Invst'],
# mode = 'lines+markers',
name = 'Cum Invst')
trace2 = go.Bar(
x = merged_portfolio_sp_latest_YTD_sp_closing_high['Ticker'],
y = merged_portfolio_sp_latest_YTD_sp_closing_high['Cum SP Returns'],
# mode = 'lines+markers',
name = 'Cum SP500 Returns')
trace3 = go.Bar(
x = merged_portfolio_sp_latest_YTD_sp_closing_high['Ticker'],
y = merged_portfolio_sp_latest_YTD_sp_closing_high['Cum Ticker Returns'],
# mode = 'lines+markers',
name = 'Cum Ticker Returns')
trace4 = go.Scatter(
x = merged_portfolio_sp_latest_YTD_sp_closing_high['Ticker'],
y = merged_portfolio_sp_latest_YTD_sp_closing_high['Cum Ticker ROI Mult'],
# mode = 'lines+markers',
name = 'Cum ROI Mult'
, yaxis='y2')
data = [trace1, trace2, trace3, trace4]
layout = go.Layout(title = 'Total Cumulative Investments Over Time'
, barmode = 'group'
, yaxis=dict(title='Returns')
, xaxis=dict(title='Ticker')
, legend=dict(x=.4,y=1)
, yaxis2=dict(title='Cum ROI Mult', overlaying='y', side='right')
)
fig = go.Figure(data=data, layout=layout)
iplot(fig)
References:
https://www.datacamp.com/community/tutorials/finance-python-trading
https://pypi.python.org/pypi/fix-yahoo-finance
http://www.learndatasci.com/python-finance-part-yahoo-finance-api-pandas-matplotlib/