[Machine Learning] Week 4.2 - Model Representation II

Model Representation II

To re-iterate, the following is an example of a neural network:
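This is the same three-input network used as the running example: inputs $x_1, x_2, x_3$ (plus the bias unit $x_0$), one hidden layer with units $a_1^{(2)}, a_2^{(2)}, a_3^{(2)}$, and a single output unit:

$$
\begin{aligned}
a_1^{(2)} &= g\left(\Theta_{10}^{(1)}x_0 + \Theta_{11}^{(1)}x_1 + \Theta_{12}^{(1)}x_2 + \Theta_{13}^{(1)}x_3\right) \\
a_2^{(2)} &= g\left(\Theta_{20}^{(1)}x_0 + \Theta_{21}^{(1)}x_1 + \Theta_{22}^{(1)}x_2 + \Theta_{23}^{(1)}x_3\right) \\
a_3^{(2)} &= g\left(\Theta_{30}^{(1)}x_0 + \Theta_{31}^{(1)}x_1 + \Theta_{32}^{(1)}x_2 + \Theta_{33}^{(1)}x_3\right) \\
h_\Theta(x) &= a_1^{(3)} = g\left(\Theta_{10}^{(2)}a_0^{(2)} + \Theta_{11}^{(2)}a_1^{(2)} + \Theta_{12}^{(2)}a_2^{(2)} + \Theta_{13}^{(2)}a_3^{(2)}\right)
\end{aligned}
$$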

In this section we'll do a vectorized implementation of the above functions. We're going to define a new variable z_k^(j) that encompasses the parameters inside our g function. In our previous example, if we replace the argument of each g with the corresponding variable z, we would get:
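$$
\begin{aligned}
a_1^{(2)} &= g(z_1^{(2)}) \\
a_2^{(2)} &= g(z_2^{(2)}) \\
a_3^{(2)} &= g(z_3^{(2)})
\end{aligned}
$$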

In other words, for layer j=2 and node k, the variable z will be:
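$$
z_k^{(2)} = \Theta_{k,0}^{(1)}x_0 + \Theta_{k,1}^{(1)}x_1 + \cdots + \Theta_{k,n}^{(1)}x_n
$$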

The vector representation of x and z^(j) is:
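$$
x = \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{bmatrix}
\qquad
z^{(j)} = \begin{bmatrix} z_1^{(j)} \\ z_2^{(j)} \\ \vdots \\ z_n^{(j)} \end{bmatrix}
$$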

Setting x = a^(1), we can rewrite the equation as:
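$$
z^{(j)} = \Theta^{(j-1)} a^{(j-1)}
$$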

We are multiplying our matrix Θ^(j−1), with dimensions s_j × (n+1) (where s_j is the number of our activation nodes), by our vector a^(j−1) with height (n+1). This gives us our vector z^(j) with height s_j. Now we can get a vector of our activation nodes for layer j as follows:
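$$
a^{(j)} = g(z^{(j)})
$$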


Here our function g is applied element-wise to our vector z^(j).

We can then add a bias unit (equal to 1) to layer j after we have computed a^(j); this will be element a_0^(j). To compute our final hypothesis, let's first compute another z vector:
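$$
z^{(j+1)} = \Theta^{(j)} a^{(j)}
$$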

We get this final z vector by multiplying the next theta matrix after Θ^(j−1) with the values of all the activation nodes we just got. This last theta matrix Θ^(j) will have only one row which is multiplied by one column a^(j) so that our result is a single number. We then get our final result with:
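$$
h_\Theta(x) = a^{(j+1)} = g(z^{(j+1)})
$$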

Notice that in this last step, between layer j and layer j+1, we are doing exactly the same thing as we did in logistic regression. Adding all these intermediate layers in neural networks allows us to more elegantly produce interesting and more complex non-linear hypotheses.
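To see these steps end to end, here is a minimal NumPy sketch of vectorized forward propagation. It is not part of the course material; the layer sizes, the random Theta matrices, and the function names below are hypothetical placeholders.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagate(x, thetas):
    """Vectorized forward propagation: a^(1) = x, then for each layer
    z^(j+1) = Theta^(j) a^(j) and a^(j+1) = g(z^(j+1))."""
    a = np.asarray(x, dtype=float)              # a^(1) = x (without the bias unit)
    for theta in thetas:
        a = np.insert(a, 0, 1.0)                # prepend the bias unit a_0 = 1
        z = theta @ a                           # z^(j+1) = Theta^(j) a^(j)
        a = sigmoid(z)                          # a^(j+1) = g(z^(j+1))
    return a                                    # output-layer activation = h_Theta(x)

# Hypothetical example: 3 inputs -> 3 hidden units -> 1 output unit.
rng = np.random.default_rng(0)
Theta1 = rng.standard_normal((3, 4))            # layer 1 -> layer 2, shape s_2 x (n + 1)
Theta2 = rng.standard_normal((1, 4))            # layer 2 -> output, shape 1 x (s_2 + 1)
x = np.array([0.5, -1.2, 3.0])
h = forward_propagate(x, [Theta1, Theta2])      # a single number, as described above
print(h)
```

Each Theta matrix in the list maps the bias-augmented activation vector of one layer to the activations of the next, which is exactly the matrix-vector product described above.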


Source: Coursera, Stanford University, Andrew Ng's Machine Learning course
