动态规划总结2 股票买卖

在Leetcode上有一系列经典的股票买卖问题,也是典型的动态规划问题.也可以利用在每一天的状态是否买卖来解决.然而每一天的买卖关系受到了前一天操作的逻辑上的约束,看起来更像是一个买卖之间状态的转移关系.

Leetcode 121. Best Time to Buy and Sell Stock

在最基础的动态规划里,给定每一天的股票价格,只允许买卖至多一次,寻求可以获得的最大利润.
Example

Input: prices = [7,1,5,3,6,4]
Output: 5
Explanation: Buy on day 2 (price = 1) and sell on day 5 (price = 6), profit = 6-1 = 5.
Note that buying on day 2 and selling on day 1 is not allowed because you must buy before you sell.

实际上这题如果不用动态规划的思想去做,也很好理解,只要找到基于当前这一天最小的买入值,用今天的卖出值减掉,获得的利润就是当天最大的,再进行比较就可以获得所有天中最大的.

class Solution(object):
    def maxProfit(self, prices):
        """
        :type prices: List[int]
        :rtype: int
        """
        min_value = float('inf')
        res = 0
        for i in range(len(prices)):
            min_value = min(min_value, prices[i])
            res = max(prices[i]-min_value, res)
        return res

但是,如何使用动归来解决和描述呢?我们用下面的状态转移图来进行表达:

状态转移图.png

现在有两个状态A和B,显然A状态只能通过B状态卖出或者保持不买卖才能到达,同理,B状态可以保持不变,但也可以通过A状态买入才能到达.那么可以得出一个公式:

A[i] = max(A[i-1], B[i-1]+nums[i])
B[i] = max(B[i-1], A[i-1]-nums[i])
由于刚开始并没有办法卖出,所以A[0]=0
但是可以买入,因此B[0]=-nums[0]

A[i]和B[i]是第i天可能出现的所有情况的值.因为要求获得的最大利润,那必然是处于A状态下的结果.
本题中只能至多只能买卖一次,也就意味着状态应该是由B->A或者A保持不变两种情况:

B->A, B无法从A中转移过来,所以B的式子变为:

B[i] = max(B[i-1], -nums[i])

可以看出来,它所代表的意思,也是到当天为止所卖出的最大值(此处是负数).那么结合A的式子,代码就可以写成:

class Solution(object):
    def maxProfit(self, prices):
        """
        :type prices: List[int]
        :rtype: int
        """
        b = [float('-inf') for i in range(len(prices))]
        b[0] = -prices[0]
        for i in range(1, len(prices)):
            b[i] = max(-prices[i], b[i-1])
        res = 0
        for i in range(1, len(prices)):
            res = max(prices[i]+b[i-1], res)
        return res

将一维数组再优化一下,就可以得出第一种直观的答案了:

class Solution(object):
    def maxProfit(self, prices):
        """
        :type prices: List[int]
        :rtype: int
        """
        b = -prices[0]
        res = 0
        for i in range(1, len(prices)):
            b = max(b, -prices[i])
            res = max(prices[i]+b, res)
        return res

如果可以允许多次买卖的话,那么就可以直接利用上面的公式了.比如下面这题.

Leetcode 122. Best Time to Buy and Sell Stock II

Example:

Input: prices = [7,1,5,3,6,4]
Output: 7
Explanation: Buy on day 2 (price = 1) and sell on day 3 (price = 5), profit = 5-1 = 4.
Then buy on day 4 (price = 3) and sell on day 5 (price = 6), profit = 6-3 = 3.

解答如下,根据公式写的优化了一下:

class Solution(object):
    def maxProfit(self, prices):
        """
        :type prices: List[int]
        :rtype: int
        """
        # A = [0 for i in range(len(prices)]
        # B = ['_' for j in range(len(prices)]
        A = 0
        B = -prices[0]
        res = 0
        for i in range(1, len(prices)):
            prev_a, prev_b = A, B
            # A[i] = max(A[i-1], B[i-1]+nums[i])
            A = max(prev_a, prev_b+prices[i])
            # B[i] = max(B[i-1], A[i-1]-nums[i])
            B = max(prev_b, prev_a-prices[i])
            # res = max(A[i], res)
            res = max(A, res)
        return res

购买流程如下图蓝色框所示:

购买流程

利用状态转移图,可以解决股票买卖的各种变种.比如,交易需要费用:

Leetcode 714. Best Time to Buy and Sell Stock with Transaction Fee

只需要在原有的基础上扣除费用:

class Solution(object):
    def maxProfit(self, prices, fee):
        """
        :type prices: List[int]
        :type fee: int
        :rtype: int
        """
        A = 0
        B = -prices[0]
        res = 0
        for i in range(1, len(prices)):
            prev_a = A
            A = max(B+prices[i]-fee, A)
            B = max(prev_a-prices[i], B)
            res = max(res, A)
        return res

又或者如下面的题目,限定交易的次数.

Leetcode 123. Best Time to Buy and Sell Stock III

本例中,至多只能进行2次股票买卖.
Example

Input: prices = [3,3,5,0,0,3,1,4]
Output: 6
Explanation: Buy on day 4 (price = 0) and sell on day 6 (price = 3), profit = 3-0 = 3.
Then buy on day 7 (price = 1) and sell on day 8 (price = 4), profit = 4-1 = 3.

事实上,通用公式可以总结如下:
假设截止到第i天进行了k次交易,定为dp[k,i],那么(忽略边界条件):

dp[k,j] = max(dp[k-1, j-1] + prices[i]-prices[j]+dp[k-1,j-1]) 其中`0<=j<=i`

参照大佬的解法:Detail explanation of DP solution
如果尝试使用状态转移图的话,也可以解答:

状态转移图

A是最终能获得的利润,除了一次都不交易,它可以从两种状态转移而来:

# 只交易一次
A1 -> B1 -> A
# 交易两次
A1 -> B1 -> A2 -> B2 -> A

所以可以得出下面的状态转移式子(A1是初始状态,为0):

B1[i] = max(-prices[i], B1[i-1])
A2[i] = max(A2[i-1], B1[i-1]+prices[i])
B2[i] = max(B2[i-1], A2[i-1]-prices[i])
A[i] = max(max(B2[i-1], B1[i-1])+prices[i], A[i-1])

只需要置换一下顺序,就可以不用数组以及中间变量:

class Solution(object):
    def maxProfit(self, prices):
        """
        :type prices: List[int]
        :rtype: int
        """

        # b1 = [float('-inf') for i in range(len(prices))]
        # a2 = [float('-inf') for i in range(len(prices))]
        # b2 = [float('-inf') for i in range(len(prices))]
        b1 = -prices[0]
        a2 = b2 = float('-inf')
        # b1[0] = -prices[0]
        res = 0
        for i in range(1, len(prices)):
            # b1[i] = max(-prices[i], b1[i-1])
            # b2[i] = max(a2[i-1]-prices[i], b2[i-1])
            # a2[i] = max(b1[i-1]+prices[i], a2[i-1])
            # res = max(max(b1[i-1], b2[i-1]) + prices[i], res)
            res = max(max(b1, b2)+prices[i], res)
            b2 = max(b2, a2-prices[i])
            a2 = max(a2, b1+prices[i])
            b1 = max(-prices[i], b1)

        return res

Leetcode 188. Best Time to Buy and Sell Stock IV

在上一题的基础上延伸一下,如果至多可以交易k次,能获得的最大利润是多少呢?
再将状态转移图延伸一下,比如可以交易三次,那么在原有的基础上可以将公式变为:

B1[i] = max(-prices[i], B1[i-1])
A2[i] = max(A2[i-1], B1[i-1]+prices[i])
B2[i] = max(B2[i-1], A2[i-1]-prices[i])
A3[i] = max(A3[i-1], B2[i-1]+prices[i])
B3[i[ = max(B3[i-1], A3[i-1]-prices[i])
A[i] = max(max(B2[i-1], B1[i-1], B3[i-1])+prices[i], A[i-1])

所以可以总结出一个规律:

A[k][i] = max(A[k-1][i-1], B[k-1][i-1]+prices[i])
B[k][i] = max(B[k-1][i-1], A[k][i-1]-prices[i])

i这个维度可以去掉,所以只需要两个一维数组即可:

class Solution(object):
    def maxProfit(self, k, prices):
        """
        :type k: int
        :type prices: List[int]
        :rtype: int
        """
        res = 0
        if not prices or k <= 0:
            return res
        a = [float('-inf') for i in range(k)]
        b = [float('-inf') for i in range(k)]
        b[0] = -prices[0]
        b_max = b[0]
        for i in range(1, len(prices)):
            for j in range(1, k):
                b_max= max(b[j], b_max)
                b[j] = max(a[j]-prices[i], b[j])
                a[j] = max(b[j-1]+prices[i], a[j])
            b[0] = max(-prices[i], b[0])
            res = max(max(b[0],b_max)+prices[i], res)
        return res

同样地,中间增加其他状态也可以利用状态图解决.

Leetcode 309. Best Time to Buy and Sell Stock with Cooldown

当卖出股票后,至少要休息一天才能进行下一次交易.
Example

Input: prices = [1,2,3,0,2]
Output: 3
Explanation: transactions = [buy, sell, cooldown, buy, sell]

这个时候从A状态转移到B后,B无法再直接转到A,而是需要休息,因此设定休息的状态为C,状态转移图如下:

状态转移图

根据转移图又可以得到:

A[i] = max(A[i-1], C[i-1])
B[i] = max(A[i-1]-prices[i], B[i-1])
C[i] = max(B[i-1]+prices[i], C[i-1])

即除了保持自己的状态,不做任何操作外,A状态只能由C转移而来;B状态只能由A进行买入;C状态只能从B状态卖出得来.
同样地,第一天最多只能进行买入操作,因此B[0] = -prices[0],而A持有的收益最大为0, C在第一天不可能有收益,因此初始化负无穷(和前面题目A2,A3类似,可以理解为无效值用以占位).可以获得的最大收益只可能出现在A或者C上,使用数组和不是用数组的代码如下:

class Solution(object):
    def maxProfit(self, prices):
        """
        :type prices: List[int]
        :rtype: int
        """
        # A = [0 for i in range(len(prices))]
        # B = [0 for i in range(len(prices))]
        # C = [0 for i in range(len(prices))]
        A = 0
        C = float('-inf')
        B = -prices[0]
        # B[0] = -prices[0]
        res = 0
        for i in range(1, len(prices)):
            # B[i] = max(A[i-1]-prices[i], B[i-1])
            # A[i] = max(C[i-1], A[i-1])
            # C[i] = max(B[i-1]+prices[i], C[i-1])
            prev_b = B
            B = max(A-prices[i], B)
            A = max(C, A)
            C = max(prev_b+prices[i], C)
            res = max(C, A, res)
        return res

至此,就解决了所有的股票买卖问题.