2023-09-24 01 Forward propagation of an RNN

Source: https://hyunhp.tistory.com/448

1. The RNN cell and an intuitive diagram of the RNN

RNN: Recurrent Neural Network

You can think of the recurrent neural network as the repeated use of a single cell, which performs the computations for a single time step.

2. Dimensions of input x

2.1 Input with n_{x} number of units

➢   For a single time step of a single input example, x^{(i)<t>} is a one-dimensional input vector

➢   Using language as an example, a language with a 5000-word vocabulary could be one-hot encoded into a vector that has n_{x}=5000 units. So x^{(i)<t>} could have the shape (5000,)

➢  The notation n_{x} is used here to denote the number of units in a single time step of a single training example
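As a minimal sketch of the one-hot encoding described above (the helper name one_hot and the word index 42 are hypothetical, chosen only for illustration):

```python
import numpy as np

n_x = 5000  # vocabulary size, i.e. number of units in one time step


def one_hot(index, size):
    """Return a 1D vector of length `size` with a 1.0 at position `index`."""
    vec = np.zeros(size)
    vec[index] = 1.0
    return vec


# Hypothetical word index, used only as an example
x_t = one_hot(42, n_x)
print(x_t.shape)  # (5000,)
```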

2.2 Time Steps of size T_{x}

A recurrent neural network has multiple time steps, which you'll index with t.

➢ In the lessons, you saw a single training example x^{(i)} consisting of multiple time steps T_{x}. In this notebook, T_{x} will denote the number of time steps in the longest sequence.
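One common way to give every example the same number of time steps is to pad shorter sequences up to T_{x}. The sketch below assumes zero-padding and a tiny n_{x}; neither choice comes from the original notes, and the three sequences are made up for illustration.

```python
import numpy as np

n_x = 4  # small illustrative value (assumption)

# Three hypothetical sequences, each of shape (n_x, length)
sequences = [np.ones((n_x, 3)), np.ones((n_x, 5)), np.ones((n_x, 2))]

# T_x is the number of time steps in the longest sequence
T_x = max(seq.shape[1] for seq in sequences)
m = len(sequences)

# Stack into one (n_x, m, T_x) tensor, zero-padding the shorter sequences
x = np.zeros((n_x, m, T_x))
for i, seq in enumerate(sequences):
    x[:, i, :seq.shape[1]] = seq

print(T_x)      # 5
print(x.shape)  # (4, 3, 5)
```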

2.3 Batches of size m

➢  Let's say we have mini-batches, each with 20 training examples

➢  To benefit from vectorization, you'll stack 20 columns of x^{(i)} examples

➢  For example, this tensor has the shape (5000,20,10)

➢  You'll use m to denote the number of training examples

➢  So, the shape of a mini-batch is (n_{x},m,T_{x})

2.4 3D Tensor of shape (n_{x},m,T_{x})

➢  The 3-dimensional tensor x of shape (n_{x},m,T_{x}) represents the input x that is fed into the RNN

2.5 Take a 2D slice for each time step: x^{<t>}

➢  At each time step, you'll use a mini-batch of training examples (not just a single example)

➢ So, for each time step t, you'll use a 2D slice of shape (n_{x},m)

➢ This 2D slice is referred to as x^{<t>}. The variable name in the code is xt.
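The slicing described above can be sketched in NumPy as follows. The tensor is filled with zeros as placeholder data; real inputs would be one-hot vectors.

```python
import numpy as np

n_x, m, T_x = 5000, 20, 10
x = np.zeros((n_x, m, T_x))  # placeholder input tensor

for t in range(T_x):
    xt = x[:, :, t]  # 2D slice for time step t
    assert xt.shape == (n_x, m)

print(xt.shape)  # (5000, 20)
```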

3. Dimensions of the hidden state a

The activation a^{<t>} that is passed to the RNN from one time step to another is called a "hidden state".

3.1 Dimensions of hidden state a

➢ Similar to the input tensor x, the hidden state for a single training example is a vector of length n_{a}

➢  If you include a mini-batch of m training examples, the shape of a mini-batch is (n_{a},m)

➢ When you include the time step dimension, the shape of the hidden state is (n_{a},m,T_{x})

➢ You'll loop through the time steps with index t, and work with a 2D slice of the 3D tensor

➢ This 2D slice is referred to as a^{<t>}

  In the code, the variable names used are either a_prev or a_next, depending on the function being implemented

➢  The shape of this 2D slice is (n_{a},m)

4. Dimensions of prediction \hat{y}

➢ Similar to the inputs and hidden states, \hat{y} is a 3D tensor of shape (n_{y},m,T_{y})

            ■ n_{y} : number of units in the vector representing the prediction

            ■ m :    number of examples in a mini-batch

            ■ T_{y}:  number of time steps in the prediction

➢ For a single time step t, a 2D slice \hat{y}^{<t>} has shape (n_{y},m)

➢ In the code, the variable names are:

             ●  y_pred : \hat{y}

             ●  yt_pred : \hat{y} ^{<t>}

5. Building the RNN

➢  Here is how you can implement an RNN:

Steps:

            ● Implement the calculations needed for one time step of the RNN.

            ● Implement a loop over T_{x} time steps in order to process all the inputs, one at a time

➢  About the RNN cell

You can think of the recurrent neural network as the repeated use of a single cell. First, you'll implement the computations for a single time step. 

➢  RNN cell  versus RNN_cell_forward:

● Note that an RNN cell outputs the hidden state a^{<t>}

      ■ RNN cell is shown in the figure as the inner box with solid lines

● The function that you'll implement, rnn_cell_forward, also calculates the prediction \hat{y} ^{<t>}

     ■ rnn_cell_forward is shown in the figure as the outer box with dashed lines

➢ The following figure describes the operations for a single time step of an RNN cell:
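The figure is not reproduced in these notes. Written out to match the rnn_cell_forward code in this section, the two operations for one time step would be:

```latex
\begin{aligned}
a^{\langle t \rangle} &= \tanh\left(W_{ax}\, x^{\langle t \rangle} + W_{aa}\, a^{\langle t-1 \rangle} + b_a\right) \\
\hat{y}^{\langle t \rangle} &= \mathrm{softmax}\left(W_{ya}\, a^{\langle t \rangle} + b_y\right)
\end{aligned}
```

The first line produces the next hidden state (a_next in the code); the second turns that hidden state into the prediction (yt_pred).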

The code is as follows:

# UNQ_C1 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_cell_forward

import numpy as np


def softmax(x):
    # softmax is not defined anywhere in these notes; this column-wise
    # definition is an assumption added so that the code runs on its own.
    e_x = np.exp(x - np.max(x, axis=0, keepdims=True))
    return e_x / e_x.sum(axis=0, keepdims=True)


def rnn_cell_forward(xt, a_prev, parameters):
    """
    Implements a single forward step of the RNN-cell as described in Figure (2)

    Arguments:
    xt -- your input data at timestep "t", numpy array of shape (n_x, m).
    a_prev -- Hidden state at timestep "t-1", numpy array of shape (n_a, m)
    parameters -- python dictionary containing:
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        ba --  Bias, numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)

    Returns:
    a_next -- next hidden state, of shape (n_a, m)
    yt_pred -- prediction at timestep "t", numpy array of shape (n_y, m)
    cache -- tuple of values needed for the backward pass, contains (a_next, a_prev, xt, parameters)
    """
    # Retrieve parameters from "parameters"
    Wax = parameters["Wax"]
    Waa = parameters["Waa"]
    Wya = parameters["Wya"]
    ba = parameters["ba"]
    by = parameters["by"]

    ### START CODE HERE ### (≈2 lines)
    # compute next activation state: tanh of the weighted input, weighted previous state and bias
    a_next = np.tanh(np.dot(Wax, xt) + np.dot(Waa, a_prev) + ba)
    # compute output of the current cell: softmax over the weighted hidden state and bias
    yt_pred = softmax(np.dot(Wya, a_next) + by)
    ### END CODE HERE ###

    # store values you need for backward propagation in cache
    cache = (a_next, a_prev, xt, parameters)

    return a_next, yt_pred, cache

Run the code above:

def rnn_cell_forward_tests(rnn_cell_forward):
    np.random.seed(1)
    xt_tmp = np.random.randn(3, 10)
    a_prev_tmp = np.random.randn(5, 10)

    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)

    a_next_tmp, yt_pred_tmp, cache_tmp = rnn_cell_forward(xt_tmp, a_prev_tmp, parameters_tmp)

    print("a_next[4] = \n", a_next_tmp[4])
    print("a_next.shape = \n", a_next_tmp.shape)
    print("yt_pred[1] =\n", yt_pred_tmp[1])
    print("yt_pred.shape = \n", yt_pred_tmp.shape)


# UNIT TESTS
rnn_cell_forward_tests(rnn_cell_forward)
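Beyond printing values, two properties of the cell's outputs can be checked directly: tanh keeps every hidden-state entry within [-1, 1], and softmax makes each column of the prediction sum to 1. The sketch below is self-contained, so it repeats a compact version of the cell; the softmax definition and the random seed are assumptions, not taken from the original notebook.

```python
import numpy as np


def softmax(z):
    """Column-wise softmax (assumed definition)."""
    e = np.exp(z - np.max(z, axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)


def rnn_cell_forward(xt, a_prev, parameters):
    """Compact copy of the cell so that this sketch runs on its own."""
    a_next = np.tanh(parameters["Wax"] @ xt + parameters["Waa"] @ a_prev + parameters["ba"])
    yt_pred = softmax(parameters["Wya"] @ a_next + parameters["by"])
    return a_next, yt_pred, (a_next, a_prev, xt, parameters)


rng = np.random.default_rng(0)
n_x, n_a, n_y, m = 3, 5, 2, 10
params = {
    "Wax": rng.standard_normal((n_a, n_x)),
    "Waa": rng.standard_normal((n_a, n_a)),
    "Wya": rng.standard_normal((n_y, n_a)),
    "ba": rng.standard_normal((n_a, 1)),
    "by": rng.standard_normal((n_y, 1)),
}
xt = rng.standard_normal((n_x, m))
a_prev = rng.standard_normal((n_a, m))

a_next, yt_pred, _ = rnn_cell_forward(xt, a_prev, params)

print(a_next.shape, yt_pred.shape)          # (5, 10) (2, 10)
print(np.all(np.abs(a_next) <= 1.0))        # True, because tanh is bounded
print(np.allclose(yt_pred.sum(axis=0), 1))  # True, each column is a distribution
```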

6. The RNN forward pass

➢ A recurrent neural network (RNN) is a repetition of the RNN cell that you've just built.

      ● If your input sequence of data is 10 time steps long, then you will re-use the RNN cell 10 times

➢ Each cell takes two inputs at each time step:

      ● a^{<t-1>}: The hidden state from the previous cell

      ● x^{<t>} :  The current time step's input data

➢ It has two outputs at each time step:

      ●  A hidden state (a^{<t>})

      ●  A prediction (\hat{y}^{<t>})

➢ The weights and biases (W_{aa},b_{a},W_{ax},b_{x}) are reused at each time step

      ●  They are maintained between calls to rnn_cell_forward in the 'parameters' dictionary

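Note: b_{x} does not appear in the rnn_cell_forward code above; the parameters dictionary there holds only the biases ba and by.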
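Because the same parameters are reused at every time step, the number of trainable values does not depend on T_{x}. A minimal sketch, using the same small dimensions as the unit tests (n_x = 3, n_a = 5, n_y = 2) and zero-filled placeholder arrays:

```python
import numpy as np

n_x, n_a, n_y = 3, 5, 2

# Placeholder parameters with the shapes listed in the docstrings
parameters = {
    "Wax": np.zeros((n_a, n_x)),
    "Waa": np.zeros((n_a, n_a)),
    "Wya": np.zeros((n_y, n_a)),
    "ba": np.zeros((n_a, 1)),
    "by": np.zeros((n_y, 1)),
}

# Total number of scalars; T_x never enters this count
n_params = sum(p.size for p in parameters.values())
print(n_params)  # 57
```

The count is 15 + 25 + 10 + 5 + 2 = 57 whether the sequence has 4 time steps or 1000.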

# UNQ_C2 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_forward

def rnn_forward(x, a0, parameters):
    """
    Implement the forward propagation of the recurrent neural network described in Figure (3).

    Arguments:
    x -- Input data for every time-step, of shape (n_x, m, T_x).
    a0 -- Initial hidden state, of shape (n_a, m)
    parameters -- python dictionary containing:
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        ba -- Bias, numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)

    Returns:
    a -- Hidden states for every time-step, numpy array of shape (n_a, m, T_x)
    y_pred -- Predictions for every time-step, numpy array of shape (n_y, m, T_x)
    caches -- tuple of values needed for the backward pass, contains (list of caches, x)
    """
    # Initialize "caches" which will contain the list of all caches
    caches = []

    # Retrieve dimensions from shapes of x and parameters["Wya"]
    n_x, m, T_x = x.shape
    n_y, n_a = parameters["Wya"].shape

    ### START CODE HERE ###
    # initialize "a" and "y_pred" with zeros (≈2 lines)
    a = np.zeros((n_a, m, T_x))
    y_pred = np.zeros((n_y, m, T_x))

    # Initialize a_next (≈1 line)
    a_next = a0

    # loop over all time-steps
    for t in range(T_x):
        # Update next hidden state, compute the prediction, get the cache (≈1 line)
        a_next, yt_pred, cache = rnn_cell_forward(x[:, :, t], a_next, parameters)
        # Save the value of the new "next" hidden state in a (≈1 line)
        a[:, :, t] = a_next
        # Save the value of the prediction in y (≈1 line)
        y_pred[:, :, t] = yt_pred
        # Append "cache" to "caches" (≈1 line)
        caches.append(cache)
    ### END CODE HERE ###

    # store values needed for backward propagation in cache
    caches = (caches, x)

    return a, y_pred, caches

Run the code above:

def rnn_forward_test(rnn_forward):
    np.random.seed(1)
    x_tmp = np.random.randn(3, 10, 4)
    a0_tmp = np.random.randn(5, 10)

    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)

    a_tmp, y_pred_tmp, caches_tmp = rnn_forward(x_tmp, a0_tmp, parameters_tmp)

    print("a[4][1] = \n", a_tmp[4][1])
    print("a.shape = \n", a_tmp.shape)
    print("y_pred[1][3] =\n", y_pred_tmp[1][3])
    print("y_pred.shape = \n", y_pred_tmp.shape)
    print("caches[1][1][3] =\n", caches_tmp[1][1][3])
    print("len(caches) = \n", len(caches_tmp))


# UNIT TEST
rnn_forward_test(rnn_forward)
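One property of this forward pass worth checking is that the hidden state at step t depends only on the inputs up to step t. The sketch below re-implements the loop compactly, tracking hidden states only (the prediction is left out for brevity, which is a simplification of rnn_forward); the helper name rnn_hidden_states and the seed are hypothetical. It perturbs the last time step of the input and confirms that earlier hidden states do not change.

```python
import numpy as np


def rnn_hidden_states(x, a0, Wax, Waa, ba):
    """Return the hidden states for every time step, shape (n_a, m, T_x)."""
    n_a, m = a0.shape
    T_x = x.shape[2]
    a = np.zeros((n_a, m, T_x))
    a_next = a0
    for t in range(T_x):
        a_next = np.tanh(Wax @ x[:, :, t] + Waa @ a_next + ba)
        a[:, :, t] = a_next
    return a


rng = np.random.default_rng(1)
n_x, n_a, m, T_x = 3, 5, 10, 4
Wax = rng.standard_normal((n_a, n_x))
Waa = rng.standard_normal((n_a, n_a))
ba = rng.standard_normal((n_a, 1))
x = rng.standard_normal((n_x, m, T_x))
a0 = rng.standard_normal((n_a, m))

a = rnn_hidden_states(x, a0, Wax, Waa, ba)

# Perturb only the last time step of the input
x_changed = x.copy()
x_changed[:, :, -1] += 1.0
a_changed = rnn_hidden_states(x_changed, a0, Wax, Waa, ba)

print(np.array_equal(a[:, :, :-1], a_changed[:, :, :-1]))  # True: earlier states unchanged
print(np.array_equal(a[:, :, -1], a_changed[:, :, -1]))    # False: last state changed
```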

7. Summary

You've successfully built the forward propagation of a recurrent network from scratch.

➢ Situations when this RNN will perform better:

● This will work well enough for some applications, but it suffers from vanishing gradients.

● The RNN works best when each output \hat{y}^{<t>}  can be estimated using "local" context.

● "Local" context refers to information that is close to the prediction's time step t.

●  More formally, local context refers to inputs x^{<t_j>} and predictions \hat{y}^{<t>} where t_j is close to t
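The vanishing-gradient issue mentioned above can be illustrated numerically. Backpropagation through time multiplies one Jacobian per time step, and for a tanh cell each of those is diag(1 - a^2) times W_{aa}. The sketch below is not taken from the original notebook: it assumes a recurrent matrix with spectral norm 0.5 (a scaled orthogonal matrix) and omits the inputs and bias, so that the shrinkage is easy to see.

```python
import numpy as np

rng = np.random.default_rng(0)
n_a, T_x = 5, 20

# Assumption: recurrent weights with spectral norm 0.5 (scaled orthogonal matrix)
Q, _ = np.linalg.qr(rng.standard_normal((n_a, n_a)))
Waa = 0.5 * Q

a = rng.standard_normal((n_a, 1))  # initial hidden state; inputs and bias omitted
jacobian = np.eye(n_a)             # accumulates d a^{<t>} / d a^{<0>}
norms = []
for t in range(T_x):
    a = np.tanh(Waa @ a)
    # One-step Jacobian d a^{<t>} / d a^{<t-1>} = diag(1 - a^2) @ Waa
    step = (1.0 - a ** 2) * Waa
    jacobian = step @ jacobian
    norms.append(np.linalg.norm(jacobian, 2))

print(norms[0] > norms[-1])     # True: the Jacobian norm shrinks over time
print(norms[-1] <= 0.5 ** T_x)  # True: bounded by 0.5 per step
```

Under these assumptions the influence of the initial state on the state 20 steps later is below 0.5^20, roughly 1e-6, which is why distant context is hard for a basic RNN to learn from.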

➢ What you should remember:

● The recurrent neural network, or RNN , is essentially the repeated use of a single cell.

● A basic RNN reads inputs one at a time, and remembers information through the hidden layer activations (hidden states) that are passed from one step to the next

      ■ The timestep dimension determines how many times to re-use the RNN cell

● Each cell takes two inputs at each time step:

      ■  The hidden state from the previous cell

      ■   The current time step's input data

● Each cell has two outputs at each time step:

      ■   A hidden state

      ■   A prediction
