Graph的题:
- 值得思考的点和概念:树、有向图、无向图、相连性、有圈无圈
- 树是各节点之间只有一条路可走的无圈无向图
- 很多时候degree能透露不少信息
- 相连无圈图:节点数=边数+1。节点数常可以作为循环的终止条件。
- 算法:
- BFS, DFS
- Union Find (for connectivity) (weighted size, path compression)
- Strong connected: Kosaraju's Algorithm (two pass, very intuitive!)
- Eulerian graph: Hierholzer’s Algorithm (put the degree 0 node in the stack, very similar to Kosaraju)
- Shortest path: Dijkstra's Algorithm, Bellman-Ford Algorithm, Floyd–Warshall algorithm,
- Topological sort (for sequence and relative order)
Algorithm Crush
Dijkstra's Algorithm
- Purpose: find shortest distance from source to every node, not for negative edges
- Adjacency matrix for dense graph:
- initialize each node -> O(V)
- for each minimum value in unvisited node -> O(V)
- we check every neighbor of the node -> O(VE), since in a dense graph, we can imagine each vertex has V-1 neighbors, so complexity becomes O(V^2)
- Adjacency list for sparse graph:
- initialize a min heap with each node -> O(V)
- for each minimum value in unvisited node -> O(V)
- we check every neighbor of the node -> say there are N edges(neighbors) for each node, so that NlogV would be the time to find and update its neighbors(we don't need O(V) time to find a neighbor in the min heap because heap gives us logarithm time)
- total time complexity: O(V+VNlogV), since VN=E, it is O(V+ElogV) which is O(ElogV)
Bellman-Ford Algorithm
- Purpose: find shortest distance from source to every node, with negative edges; also detect negative cycles
- The algorithm performs V(# vertexes) iterations of Dijkstra's Algorithm from the same source node. But at this time, the order of the vertexes you visit doesn't matter anymore.
- The Dijkstra's algorithm can't be applied to a graph with negative edges because in that algorithm, once we visit the node, we mark that as being found the shortest path for that node(Greedy!). There is a reason for that: we visit the vertexes in ascending order, so that if we visit a node A, that means all other unvisited node has longer path than the source to A, so that it is impossible the shortest path to A goes through any other unvisited node. However, if there is a negative edge involved, it is possible that the path from the source to A is shorter than the path from the source to B, yet the path from the source to B then to A is a shorter path for A, because the edge B->A is negative.
- In Bellman-Ford Algorithm, we have to go back after one iteration to check if we fix the negative edge problem. After one iteration,
133. Clone Graph
此题用BFS遍历,用unordered_map来建立新旧node的对应关系。
310. Minimum Height Trees
For an undirected graph with tree characteristics, we can choose any node as the root. The result graph is then a rooted tree. Among all possible rooted trees, those with minimum height are called minimum height trees (MHTs). Given such a graph, write a function to find all the MHTs and return a list of their root labels.
拿到这道题,有两种大方向的思路——用某种规律直接算出MHT的根是什么,或者用dfs进行遍历每种情况,再比较。后者比较麻烦,而且得到的结果要进行比较拿出最大值。那MHT有什么规律吗?
我先想到了最简单的一种情况,就是类似单向列表的情况:1->2->3->4->5->6,很明显MST是以3或者4为根的数,因为3和4就在中间。那么思考两个问题:
- 是否一个树的“中间”节点就是MST的根?
- 这个“中间”在更复杂的情况下该如何定义并找到?
这里可以简单地证明。MHT的定义告诉我们,它要使子树中,深度最深的那个子树的深度最小。对MHT,我们可以定义一个距离,两外叶(external leave)节点之间的最长距离为D,这两个节点之间一定经过了根节点。如果这个子树的外叶到根的距离小于D/2,那么它就不是最深子树,因为还有一个子树,它的外叶到根的距离大于D/2。如果这个所要求的子树的外叶到根的距离大于D/2,有两种情况,一种情况是,D/2不是整数,最深和次深子树不可能相等,总会差一个,这时候已经达到极致平衡了;另一种情况是,我们还可以减少这个子树的深度,使次深子树深度增加,但还是比这个子树的深度小。所以我们得出结论,这样的子树,它的外叶到根的距离,是D/2,或者由于这个距离的总节点数是偶数,会比次深树多一个节点的距离。这也就是说,我们要求的根,就处在D这条路的中间。
问题就变成了,怎么求D这条路。如果我们把指针放在各外叶节点,向根节点走,毫无疑问的是,最长路径D的两个外叶节点会最晚碰头,因为他们经过的距离最长嘛。而当它们碰头的时候,就是走过了一样的距离,也就是中间节点了。
写code的时候,我考虑到下面几点:
- edge case:n == 1
- 每次指针前进,需要将这个节点从相邻那个节点的邻居中移除,可以用一个unordered_set来取代vector表示
- 需要有一个数据结构来记录当前/下一轮degree为1的节点,可以用vector
- 上面说的这点的这个vector里装的都是degree为1的节点,他们的邻居都是唯一的,所以每遍历一次其实就可以将它们从图中抹去,直到图中剩下的节点数为1或者2,作为终止条件
class Solution {
public:
vector<int> findMinHeightTrees(int n, vector<pair<int, int>>& edges)
{
if (n == 1) return {0};
vector<unordered_set<int>> make_edge(n);
for (auto &ele: edges)
{
make_edge[ele.first].insert(ele.second);
make_edge[ele.second].insert(ele.first);
}
vector<int> one_degree;
for (int i = 0; i < make_edge.size(); ++i)
{
if (make_edge[i].size() == 1) one_degree.push_back(i);
}
// n is the number of vertexes in the graph
while (n > 2)
{
vector<int> temp;
n -= one_degree.size();
for (auto num: one_degree)
{
int only_neighbor = *make_edge[num].begin();
make_edge[only_neighbor].erase(num);
if (make_edge[only_neighbor].size() == 1) temp.push_back(only_neighbor);
}
one_degree = temp;
}
return one_degree;
}
};
323. Number of Connected Components in an Undirected Graph
Given n nodes labeled from 0 to n - 1 and a list of undirected edges (each edge is a pair of nodes), write a function to find the number of connected components in an undirected graph.
与261. Graph Valid Tree很类似啊。
思路一:用bfs/dfs遍历,每次新的循环记一个component。
思路二:用union find。我们知道,对于一个相连无圈图,节点数=边数+1。而这道题不能通过节点数-边数来得出结论的原因是,各节点之间可能有圈连起来,这样一来,明明一条边就可以连起来的节点就连了不止一次。用union find的code如下:
class Solution {
private:
int root(int i, vector<int> parent)
{
while (i != parent[i]) i = parent[i] = parent[parent[i]];
return i;
}
public:
int countComponents(int n, vector<pair<int, int>>& edges)
{
vector<int> parent(n, 0);
iota(parent.begin(), parent.end(), 0);
for (auto &v: edges)
{
int i = root(v.first, parent), j = root(v.second, parent);
if (parent[i] != parent[j])
{
parent[i] = j;
n--;
}
}
return n;
}
};
这里用了path compression,没有用weighted union find。对于union find和时间复杂度(O(mlog*N))和bfs的(O(M+N))的比较,可以这么理解(from stackoverrflow):
The union-find algorithm is best suited for situations where the equivalence relationship is changing, i.e., there are "Union" operations which need to be performed on your set of partitions. Given a fixed undirected graph, you don't have the equivalence relationships changing at all - the edges are all fixed. OTOH, if you have a graph with new edges being added, DFS won't cut it. While DFS is asymptotically faster than union-find, in practice, the likely deciding factor would be the actual problem that you are trying to solve.
tl;dr - Static graph? DFS! Dynamic graph? Union-find!
399. Evaluate Division
Given a / b = 2.0, b / c = 3.0.
queries are: a / c = ?, b / a = ?, a / e = ?, a / a = ?, x / x = ? .
return [6.0, 0.5, -1.0, 1.0, -1.0 ].
An application of Floyd-Warshall algorithm. 这个算法讨论的是所有点到所有点的最短距离,需要的数据结构有
- 一个matrix来记录各点之间的最小距离
- 一个matrix来记录下一点是什么(这道题不用,如果让你找出路径的话就要)
这个算法的insight,也就是update条件是:两点之间的最短距离,要么就是现在的距离,要么就是通过一个其它点的距离。其中,两点之间的最短距离是可以通过其它节点的。
应用到这道题上,因为题目告诉我们给的input没有矛盾项,所以不存在最短路径的说法,每条路径都是唯一的。所以我们检查的时候,看对应的值如果已经是正数了的话,说明之前update过了,就不需要update了。
因为这道题的input是string,我们需要a map of map来记录信息。但是这样既费时又费空间。我preprocess了一下这个信息,将string与int相对应,之后速度也就快了。
class Solution {
public:
vector<double> calcEquation(vector<pair<string, string>> equations, vector<double>& values, vector<pair<string, string>> queries)
{
unordered_map<string, int> string_to_int;
int count = 0;
for (auto &item: equations)
{
if (string_to_int.find(item.first) == string_to_int.end())
{
string_to_int[item.first] = count++;
}
if (string_to_int.find(item.second) == string_to_int.end())
{
string_to_int[item.second] = count++;
}
}
int size = string_to_int.size();
vector<vector<double>> value(size, vector<double>(size, -1.0));
for (int i = 0; i < equations.size(); ++i)
{
int first = string_to_int[equations[i].first];
int second = string_to_int[equations[i].second];
value[first][second] = values[i];
if (values[i]) value[second][first] = 1 / values[i];
value[first][first] = values[i] ? 1 : -1;
value[second][second] = 1;
}
for (int k = 0; k < size; ++k)
{
for (int i = 0; i < size; ++i)
{
for (int j = 0; j < size; ++j)
{
if (value[i][j] < 0 && value[i][k] >= 0 && value[k][j] > 0)
{
value[i][j] = value[i][k] * value[k][j];
}
}
}
}
vector<double> result;
for (auto &q: queries)
{
if (string_to_int.find(q.first) != string_to_int.end() &&
string_to_int.find(q.second) != string_to_int.end())
{
int first = string_to_int[q.first];
int second = string_to_int[q.second];
result.push_back(value[first][second]);
}
else
{
result.push_back(-1);
}
}
return result;
}
};