Neural Computation of Decisions in Optimization Problems

J. J. Hopfield, D. W. Tank
Biological Cybernetics, 1985

Introduction

From the perspective of neuroscience, the authors first state their high-level goal: understanding how the structure and connectivity of neurons give rise to computational power.

One of the central goals of research in neuroscience is to understand how the biophysical properties of neurons and neural organization combine to provide such impressive computing power and speed.

They then state their view of the key principle of neural organization.

Parallel analog computation in a network of neurons is thus a natural way to organize a nervous system to solve optimization problems.

Specifically, in this paper, they demonstrate the power of the proposed analog computational networks by testing them on the traveling salesman problem (TSP).

We quantitatively demonstrate the computational power and speed of collective analog networks of neurons in solving optimization problems rapidly.

Background on Hopfield Network

The figure below illustrates a unit neuron in a Hopfield network.

According to Kirchhoff's Current Law, we have that
C_i \frac{du_i}{dt} = \sum^N_{j = 1} T_{ij} v_j - u_i / R_i + I_i
v_j = g_j (u_j)
where T_{ij} = 1 / R_{ij}, I_i is the input bias current for neuron i, g_j is the activation function for neuron j, and R_i is the parallel combination of R_{i0} and the R_{ij}:
1 / R_i = 1 / R_{i0} + \sum^N_{j = 1} 1 / R_{ij} .
For simplicity, we can take C_i, R_i and g_i to be independent of i, and rescale T_{ij} / C and I_i / C into new T_{ij} and I_i, which leads to the following equation:
du_i / dt = -u_i / \tau + \sum^N_{j = 1} T_{ij} v_j + I_i ,
where \tau = RC, and v_j = g (u_j).
Given an initial value for u_i, this equation of motion provides a full description of the time evolution of the state of the circuit.
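As a minimal sketch, this equation of motion can be integrated numerically with forward Euler. The sigmoid used for g and all parameter values here are illustrative choices of mine, not the paper's exact settings:

```python
import numpy as np

def simulate(T, I, u0, tau=1.0, gain=0.02, dt=1e-3, steps=5000):
    """Forward-Euler integration of du_i/dt = -u_i/tau + sum_j T_ij v_j + I_i."""
    u = np.asarray(u0, dtype=float).copy()
    v = 0.5 * (1.0 + np.tanh(u / gain))   # v_j = g(u_j), a sigmoid squashed into [0, 1]
    for _ in range(steps):
        u += dt * (-u / tau + T @ v + I)  # the equation of motion
        v = 0.5 * (1.0 + np.tanh(u / gain))
    return v
```

With symmetric T the trajectory settles into a stable state, per the convergence property discussed next.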

There are two key properties of the Hopfield network as shown in earlier work:

  • The equations of motion for a network with symmetric connections, i.e. T_{ij} = T_{ji}, always lead to a convergence to stable states;
  • When the width of the amplifier gain curve is narrow - the high-gain limit - the stable states of a network of N neurons are the local minima of the quantity
    E = -1/2 \sum^N_{i = 1} \sum^N_{j = 1} T_{ij} V_i V_j - \sum^N_{i = 1} V_i I_i .
    And in the high-gain limit, the minima occur only at corners of the feasible space, an N-dimensional hypercube whose corners satisfy V_i = 0 or 1.
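A small check of this energy at the hypercube corners, using a toy T and I that I chose for illustration:

```python
import numpy as np
from itertools import product

def energy(T, I, V):
    """E = -1/2 sum_ij T_ij V_i V_j - sum_i V_i I_i."""
    return -0.5 * V @ T @ V - I @ V

# Toy symmetric network: mutual excitation plus a positive bias on both neurons.
T = np.array([[0.0, 2.0], [2.0, 0.0]])
I = np.array([1.0, 1.0])
corners = [np.array(c, dtype=float) for c in product([0, 1], repeat=2)]
best = min(corners, key=lambda V: energy(T, I, V))  # lowest-energy corner
```

Here the mutual excitation and positive bias make V = (1, 1) the lowest-energy corner, with E = -4.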

Consequently,

Networks of neurons with this basic organization can be used to compute solutions to specific optimization problems by first choosing connectivities T_{ij} and input bias currents I_i which appropriately represent the function to be minimized.

Network Formulation for TSP

For a TSP with N cities, the representation scheme in this work uses O(N^2) neurons: the route is represented as a permutation matrix, in which the position of each city in the tour is encoded by the output states of N neurons in a one-hot fashion. Each neuron is labelled by two indices: the first index X identifies the city, and the second index i identifies its position in the route.
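For concreteness, here is a hypothetical 4-city tour encoded as such a permutation matrix (rows index cities X, columns index tour positions i):

```python
import numpy as np

order = [1, 0, 3, 2]        # visit city 1, then 0, then 3, then 2 (a made-up tour)
n = len(order)
V = np.zeros((n, n))        # V[X, i] = 1 iff city X is visited at position i
for i, X in enumerate(order):
    V[X, i] = 1.0
```

Each row and each column contains exactly one 1, so V is a permutation matrix.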

To solve TSP with the Hopfield network, the energy function should include two parts:

  • The first part favors solutions in the form of a permutation matrix, which guarantees the feasibility of the solution;
  • The second part favors solutions with shorter distance.

The first part of the energy function is defined as
E_1 = A / 2 \sum_X \sum_i \sum_{j \neq i} V_{Xi} V_{Xj} + B / 2 \sum_i \sum_X \sum_{Y \neq X} V_{Xi} V_{Yi} + C / 2 (\sum_X \sum_i V_{Xi} - n)^2 .
The first term and second term are zero only when each row and each column of the matrix, respectively, contains at most one non-zero element, and the third term forces the matrix to contain exactly n non-zero entries. Only a permutation matrix attains the minimum of this energy function, E_1 = 0.
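A direct transcription of E_1 in code (the A, B, C values are placeholders of mine, not the paper's tuned constants):

```python
import numpy as np

def E1(V, A=500.0, B=500.0, C=200.0):
    """Constraint energy: zero exactly when V is a permutation matrix."""
    n = V.shape[0]
    sq = np.sum(V * V)
    row = np.sum(V[:, :, None] * V[:, None, :]) - sq  # sum_X sum_i sum_{j != i} V_Xi V_Xj
    col = np.sum(V[:, None, :] * V[None, :, :]) - sq  # sum_i sum_X sum_{Y != X} V_Xi V_Yi
    tot = (V.sum() - n) ** 2                          # (sum_X sum_i V_Xi - n)^2
    return 0.5 * (A * row + B * col + C * tot)
```

The j != i and Y != X exclusions are implemented by subtracting the diagonal contribution sum V^2 from the full double sums.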

(Two questions: 1. Why not use the row sums and column sums to express the first and second terms? 2. How does this function guarantee that the solution does not decompose into several disjoint loops?)

The second part of the energy function is defined as
E_2 = D / 2 \sum_X \sum_{Y \neq X} \sum_i d_{XY} V_{Xi} (V_{Y, i + 1} + V_{Y, i - 1}) .
This function represents the total length of the tour. For notational convenience, subscripts are defined as modulo n to represent the tour loop.
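E_2 can be written compactly with circular shifts implementing the modulo-n subscripts (the value of D is again a placeholder):

```python
import numpy as np

def E2(V, d, D=500.0):
    """Tour-length energy: D/2 * sum_{X, Y != X, i} d_XY V_Xi (V_{Y,i+1} + V_{Y,i-1})."""
    Vnext = np.roll(V, -1, axis=1)  # column i holds V_{., i+1} (mod n)
    Vprev = np.roll(V, 1, axis=1)   # column i holds V_{., i-1} (mod n)
    # d has a zero diagonal, so the Y != X restriction is automatic.
    return 0.5 * D * np.einsum('xy,xi,yi->', d, V, Vnext + Vprev)
```

For a valid tour this evaluates to D times the tour length, since each edge is counted once in each direction.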

The final energy function is E = E_1 + E_2.

If A, B and C are sufficiently large, all the really low-energy states of a network described by this function will have the form of a valid tour. The total energy of such a state is proportional to the length of the tour, and the states corresponding to the shortest tours are the lowest-energy states.

Next, we map the energy function of TSP to the standard form of the energy function in the Hopfield network:
T_{Xi, Yj} = -A \delta_{XY} (1 - \delta_{ij}) - B \delta_{ij} (1 - \delta_{XY}) - C - D d_{XY} (\delta_{j, i + 1} + \delta_{j, i - 1}) ,
I_{Xi} = + C n ,
where \delta_{ij} = 1 if i = j and 0 otherwise.
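This mapping can be built explicitly with Kronecker-delta arrays, flattening the double index as Xi = X*n + i (constants are again placeholders):

```python
import numpy as np

def build_weights(d, A=500.0, B=500.0, C=200.0, D=500.0):
    """T_{Xi,Yj} and I_{Xi} for the TSP mapping; d is the symmetric distance matrix."""
    n = d.shape[0]
    eye = np.eye(n)
    dXY = eye[:, None, :, None]                        # delta_{XY} over axes (X, Y)
    dij = eye[None, :, None, :]                        # delta_{ij} over axes (i, j)
    succ = np.roll(eye, 1, axis=1)[None, :, None, :]   # delta_{j, i+1} (mod n)
    pred = np.roll(eye, -1, axis=1)[None, :, None, :]  # delta_{j, i-1} (mod n)
    T = (-A * dXY * (1 - dij)
         - B * dij * (1 - dXY)
         - C
         - D * d[:, None, :, None] * (succ + pred))
    return T.reshape(n * n, n * n), np.full(n * n, C * n)
```

Symmetry T = T^T holds because d is symmetric and the i+1 and i-1 terms swap into each other, so the convergence guarantee for symmetric connections applies.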

Simulation Results

  • The activation function is set as g(u) = \frac{1}{2} (1 + \tanh (u / u_0));
  • Each instance is generated by sampling N points uniformly in a unit square;
  • Different choices of the random initial state lead to different converged solutions;
  • The proposed method finds close-to-optimal solutions for 10 cities and good solutions for 30 cities; no larger instances are considered. The method performs much better than random search, but is not compared against other hand-designed heuristics.
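Putting the pieces together, here is a self-contained toy run. All parameter values, the initial-state scale, and the step size are my guesses; the original paper tunes these carefully, and this sketch is not guaranteed to reach a valid tour:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
pts = rng.random((n, 2))                                  # N points in the unit square
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

# Build T_{Xi,Yj} and I_{Xi} from the TSP mapping (constants are illustrative).
A = B = D = 500.0
C = 200.0
eye = np.eye(n)
T = (-A * eye[:, None, :, None] * (1 - eye[None, :, None, :])
     - B * eye[None, :, None, :] * (1 - eye[:, None, :, None])
     - C
     - D * d[:, None, :, None]
         * (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1))[None, :, None, :]
     ).reshape(n * n, n * n)
I = np.full(n * n, C * n)

# Integrate the equation of motion from a small random initial state.
u = 0.02 * (rng.random(n * n) - 0.5)
for _ in range(20000):
    v = 0.5 * (1.0 + np.tanh(u / 0.02))                   # g(u) = (1 + tanh(u/u_0)) / 2
    u += 1e-4 * (-u + T @ v + I)                          # tau = 1
v = 0.5 * (1.0 + np.tanh(u / 0.02))
tour = v.reshape(n, n).argmax(axis=0)                     # most active city per position
```

Decoding by per-column argmax mirrors reading the permutation matrix: position i is assigned to the city whose neuron output is largest in column i.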

Conclusion

I suppose the key idea of this paper is that we can relate certain kinds of optimization problems to the dynamics of a specific type of dynamical system, such as the Hopfield network. But I'm not convinced that this is a scalable approach to solving combinatorial optimization problems.

In backpropagation-trained networks, our goal is encoded in the objective function, and we update the network weights to optimize that function. In Hopfield networks, our goal is encoded in the connection weights of the network, and the solution is obtained from the system dynamics.

It is worth investigating the current status of the Hopfield network and how it can be combined with the deep learning paradigm.
