[Theory] Static Program Analysis 读书笔记(三):function, pointer & abstract interpretation

9. Control Flow Analysis

If we introduce functions as values (and thereby higher-order functions), or objects with methods, then control flow and dataflow suddenly become intertwined.

The task of control flow analysis is to conservatively approximate the interprocedural control flow, also called the call graph, for such programs.

  • closures
  • closure analysis

The constraints for closure analysis are an instance of a general class that can be solved in cubic time.

A language with functions as values must use the kind of control flow analysis to obtain a reasonably precise CFG. For common object-oriented languages, such as Java or C#, it is also useful, but the added structure provided by the class hierarchy and the type system permits some simpler alternatives.

  • Class Hierarchy Analysis (CHA)
  • Rapid Type Analysis (RTA)
  • Variable Type Analysis (VTA)

10. Pointer Analysis

  • Interprocedural Points-To Analysis
  • Null Pointer Analysis
  • Flow-Sensitive Points-To Analysis
  • Escape Analysis

The final extension of the TIP language introduces pointers and dynamic memory allocation.

  • allocation-site abstraction
  • points-to analysis
  • address taken
  • Andersen’s algorithm
  • Steensgaard’s Algorithm

In languages with both function values and pointers, functions may be stored in the heap, which makes it difficult to perform control flow analysis before points-to analysis. But it is also difficult to perform interprocedural points-to analysis without the information from a control flow analysis.

  • points-to graphs
  • escape analysis

11. Abstract Interpretation

If an analysis is sound, the properties it infers for a given program hold in all actual executions of the program.

The theory of abstract interpretation provides a solid mathematical foundation for what it means for an analysis to be sound, by relating the analysis specification to the formal semantics of the programming language.

Another use of abstract interpretation is for understanding whether an analysis design, or a part of an analysis design, is as precise as possible relative to a choice of analysis lattice and where imprecision may arise.

What matters is that the semantics captures the meaning of programs in ordinary executions, without any approximations. The semantics specifies how a concrete program execution works, whereas our analyses can be thought of as abstract interpreters.

  • collecting semantics
  • continuous

With these definitions and observations, we can define the semantics of a given program as the least solution to the generated constraints. (A solution to a constraint system is, as usual, a valuation of the constraint variables that satisfies all the constraints – in other words, a fixed-point of cf .)

As readers familiar with theory of programming language semantics know, the fixed-point theorem also holds for infinite-height lattices provided that f is continuous. This tells us that a unique least solution always exists – even though the solution, of course, generally cannot be computed using the naive fixed-point algorithm.

To clarify the connection between concrete information and abstract information for the sign analysis example, let us consider three different abstraction functions that tell us how each element from the semantic lattices is most precisely described by an element in the analysis lattices. The functions map sets of concrete values, sets of concrete states, and n-tuples of sets of concrete states to their abstract counterparts.

  • concretization functions

Furthermore, abstraction functions and concretization functions that arise naturally when developing program analyses are closely connected.

  • extensive
  • reductive

The pair of monotone functions, α and γ, is called a Galois connection if it satisfies these two properties.

The intuition of the first property is that abstraction may lose precision but must be safe. One way to interpret the second property is that the abstraction function should always give the most precise possible abstract description for any element in the semantic lattice.

We have argued that the Galois connection property is natural for any reasonable pair of an abstraction function and a concretization function, including those that appear in our sign analysis example. The following exercise tells us that it always suffices to specify either α or γ, then the other is uniquely determined if requiring that they together form a Galois connection.

Once the analysis designer has specified the collecting semantics and the analysis lattice and constraint rules, then the relation between the semantic domain and the analysis domain may be specified using an abstraction function α (resp. a concretization function γ), and then the associated concretization function γ (resp. abstraction function α) is uniquely determined – provided that one exists such that the two functions form a Galois connection.

This raises an interesting question: Under what conditions does α (resp. γ) have a corresponding γ (resp. α) such that α and γ form a Galois connection?

The following exercise demonstrates that the Galois connection property can be used as a “sanity check” when designing analysis lattices.

Soundness means that the analysis result over-approximates the abstraction of the semantics of the program.

We often use the term soundness of analyses without mentioning specific programs. An analysis is sound if it is sound for every program.

soundness theorem
Let L_1 and L_2 be lattices where L_2 has finite height, assume \alpha :L_1\rightarrow L_2 and \gamma :L_2\rightarrow L_1 form a Galois connection, cf:L_1 \rightarrow L_1 is continuous, and af:L_2 \rightarrow L_2 is monotone. If af is a sound abstraction of cf , then

\alpha (fix(cf)) \sqsubseteq fix(af)

To summarize, a general recipe for specifying and proving soundness of an analysis consists of the following steps:

  • Specify the analysis, i.e. the analysis lattice and the constraint generation rules, and check that all the analysis constraint functions are monotone.
  • Specify the collecting semantics, and check that the semantic constraint functions are continuous.
  • Establish the connection between the semantic lattice and the analysis lattice, either by an abstraction function or by a concretization function. Then check, for example using the property from Exercise 11.20, that the function pairs form Galois connections.
  • Show that each constituent of the analysis constraints is a sound abstraction of the corresponding constituent of the semantic constraints, for all programs.

The requirements that the analysis constraint functions are monotone, the semantic constraint functions are continuous, and the abstraction and concretization functions form Galois connections are rarely restrictive in practice but can be considered as “sanity checks” that the design is meaningful.

As we have seen in Exercises 11.21 and 11.23, it is possible to design sound analyses that do not have all these nice properties, but the price is usually less precision or more complicated soundness proofs. Another restriction of the soundness theorem above is that it requires L_2 to have finite height, however, the theorem and proof can easily be extended to analyses with infinite-height lattices and widenings.

  • soundness testing
  • optimal abstraction

As usual in logics, the dual of soundness is completeness.
In Section 11.3 we defined soundness of an analysis for a program P as the property

\alpha(\{\[P\]\})\sqsubseteq \[ \[P\] \]

Consequently, it is natural to define that an analysis is complete for P if

\[ \[ P\] \] \sqsubseteq \alpha(\{\[P\]\})

An analysis is complete if it complete for all programs.

If an analysis is both sound and complete for P we have

\alpha(\{\[P\]\}) = \[ \[P\] \]
  • complete abstraction

For abstractions that are sound, completeness implies optimality (but not vice versa).

  • soundness and completeness theorem

Abstractions that are incomplete may be complete in some situations; for example, abstract addition in sign analysis is not complete in general, but it is complete in situations where, for example, both arguments are positive values. For this reason, even though few analyses are sound and complete for all programs, many analyses are sound and complete for some programs or program fragments.

  • reachable states collecting semantics
  • trace semantics

Reference

Static Program Analysis

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 218,284评论 6 506
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 93,115评论 3 395
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 164,614评论 0 354
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 58,671评论 1 293
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 67,699评论 6 392
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 51,562评论 1 305
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 40,309评论 3 418
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 39,223评论 0 276
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 45,668评论 1 314
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 37,859评论 3 336
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 39,981评论 1 348
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 35,705评论 5 347
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 41,310评论 3 330
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 31,904评论 0 22
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 33,023评论 1 270
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 48,146评论 3 370
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 44,933评论 2 355