- Planning procedures are often search procedures。
-
State-space planning explores the most obvious search space:
– Nodes labelled by states of the world
– Actions define successor states
– Plans are paths from the initial node to a goal node -
Search space can be explored in many ways:
– forward, backward
– using a variety of strategies (breadth-first, depth-first, A*, . . . )
– using a variety of heuristics
– The STRIPS representation enables an efficient exploration and domain- independent heuristics - the difference between Planning procedures and search procedures is the search space they consider
注解:
A planning problem can be reduced to a search problem.
But the plan expressed as a search may have a monstrously large search space. In an uninformed search technique the computer does not "know" that buy(x) results in have(x) even though this is trivially obvious to a human being. Thus, even the search space of single-action plans is huge.
Domain-independent heuristics
need to know:
- An admissible heuristic is optimistic: it gives a lower bound on the true cost of a solution to the problem.
- An admissible heuristic can be obtained by relaxing a problem P into a simpler problem P′: the cost of any optimal solution to P′ is a lower bound on the cost of the optimal solution to P.
Therefore,
we relax STRIPS problem descriptions to obtain generic planning heuristics:
• delete relaxation heuristics
• abstraction heuristics
• landmark heuristics
Delete relaxation
Let P be a planning problem and let P+ be the relaxed problem obtained by ignoring the negative effects (delete list) of every action.
P + is called the delete-relaxation of P
P+ is like P except that: eff−(a) = {} for all a
A solution for P + is called a relaxed plan.
Delete-relaxed planning: once a fact becomes true, it remains true forever.
An action does not need to be applied more than once in a relaxed plan.
Delete relaxation heuristics:The cost h+ of an optimal solution to P+ is a lower bound on the cost of an optimal solution of for P , hence an admissible heuristic
But, finding an optimal solution for P +(PLANMIN) is NP-hard —— Need to further relax the problem to get efficient admissible heuristics.
hmax heuristic [Bonet & Geffner, 1999]Finding an arbitrary solution for P+ (PLANSAT) is polynomial —— Relaxed plans are used to derive inadmissible heuristics.
hFF heuristic [Hoffmann & Nebel, AIJ 2001]
NP,NPC,NPH Introduction
Relax the problem by ignoring the negative effects eff−
Further relax the problem by ignoring interactions between subgoals
Heuristics h(s, g) estimate the minimum cost from s to g.
admissible hmax heuristic: cost to reach a set is the max of costs. looks at the critical path: h(s, g) = maxp∈g h(s, {p})
non-admissible hsum heuristic: cost to reach a set is the sum of costs.assumes subgoal independence: h(s, g) =sum( h(s,{p}) | p in g)
Delete-relaxation heuristics so far:
• h+ too hard to compute(Optimal Relaxed Plan)
• hmax not very informative
• hsum can greatly over-estimate h∗
hFF heuristic
• inadmissible
• compromise between hmax and hsum
• makes sense when actions have unit costs
• 1.relaxed reachability + 2.relaxed plan extraction
Step:
- Builds the graph of reachable propositions and actions until a fix point or the goal is reached. Returns failure if the goal isn’t reachable.(Relaxed reachability algorithm)
过程描述:
relaxed reachability:对初始state 采取一个符合前提条件的action,将新的add effect 和原effect 集建立一个新的effect集,忽略掉delete effect,然后goal测试
(只要effect 集里面存在所有的goal effect,则true )不通过就循环该过程,直到test通过 返回所有effect集和对应的action 或者no actions 返回 false。
2.Works back from the last level to the first, selecting an action to achieve each goal, and inserting its precondition as new goals to be achieved.返回采用的action的数量即为hff(An action does not need to be applied more than once in a relaxed plan)