Binary Code Instrumentation with PIN

2016-09-04

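CSysSec note: This article comes from Diting0x's personal blog. It explains how to use PIN for binary code instrumentation and is well worth reading.
When reposting this article, please credit the source, "Binary Code Instrumentation with PIN", and the author, Diting0x.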

What Is Instrumentation?

A First Look at PIN

The PIN Framework

Pintools Example

## What Is Instrumentation?

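Everyone who has written code has debugged a program. The simplest approach is to insert printf statements into the source by hand, although most people turn to a debugger such as GDB. Instrumentation is similar, except that its target is an executable binary. Put simply, instrumentation means inserting extra code into a program to analyze its run-time behavior. In the broad sense, inserting code into source code also counts as instrumentation; to keep the two apart, the term usually refers specifically to instrumenting executable binaries.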

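Instrumentation is further divided into Static Binary Instrumentation (SBI) and Dynamic Binary Instrumentation (DBI). As the names suggest, SBI operates before the program runs, while DBI operates while it is running. Compared with SBI, DBI has the following advantages: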

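No recompiling or relinking is required
Code is discovered at run time
Dynamically generated code can be handled
It can attach to an already running process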

Most current research concerns DBI. A DBI framework works somewhat like a compiler, except that the analysis routines are programmable. DBI is widely used in program analysis, for example in reverse engineering, program debugging, and malware analysis.

## A First Look at PIN

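With a rough picture of DBI in place, we come to the main topic. PIN is an instrumentation tool for dynamically analyzing binary programs, developed jointly by Intel and the University of Colorado and published at PLDI’2005, a top programming languages conference; the paper is worth reading if you are interested. Other DBI tools include Valgrind, DynamoRIO, and QEMU.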

Let's get straight to the point with a piece of code:

```
counter++;
         sub  $0xff, %edx
counter++;
         cmp  %esi, %edx
counter++;
         jle <L1>
counter++;
         mov  $0x1, %edi
counter++;
         add  $0x10, %eax
```
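This code is simple: it just adds a counter increment before every instruction.
So how would you implement this with PIN? Look at the following code: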
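```cpp
#include <iostream>
#include "pin.H"

// Running count of executed instructions
UINT64 icount = 0;

// Analysis routine: called before every executed instruction
void docount() { icount++; }

// Instrumentation routine: called by PIN for each new instruction
void Instruction(INS ins, void *v)
{
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}

// Called when the application exits
void Fini(INT32 code, void *v)
{
    std::cerr << "Count " << icount << std::endl;
}

int main(int argc, char * argv[])
{
    PIN_Init(argc, argv);
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram();  // never returns
    return 0;
}
```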

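Now let's go through this implementation piece by piece:
void docount() { icount++; } is a user-defined function that PIN calls an analysis routine; keep this term in mind. The Instruction() function is what PIN calls an instrumentation routine. Note two of the arguments it passes to INS_InsertCall: IPOINT_BEFORE means the call is placed before each instruction, and (AFUNPTR)docount is the analysis routine to inject. In main(), INS_AddInstrumentFunction(Instruction, 0) registers the instrumentation routine; PIN then invokes it for every instruction it encounters, and the injected analysis routine runs each time that instruction executes. All of these arguments can be changed to suit your needs.
With just these few lines of code, PIN can count the instructions executed by a running binary. Is it really that simple? Is that all PIN can do? Of course not! The rest of this article introduces PIN's framework and how to use it.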
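
To make the split between the two routines concrete, here is a minimal, self-contained sketch that does not use PIN at all. ToyIns and RunLoop are hypothetical stand-ins for PIN's INS type and its runtime; only the names docount and Instruction mirror the code above. In this simplified model the instrumentation routine runs once per static instruction, while the analysis routine runs on every execution.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Toy stand-in for a static instruction; NOT Pin's INS type.
struct ToyIns {
    std::vector<std::function<void()>> before;  // analysis calls attached before it
};

static std::uint64_t icount = 0;        // incremented by the analysis routine
static std::uint64_t instrumented = 0;  // how often the instrumentation routine ran

// Analysis routine: runs every time an instrumented instruction executes.
void docount() { icount++; }

// Instrumentation routine: runs once per static instruction and attaches docount.
void Instruction(ToyIns& ins) {
    instrumented++;
    ins.before.push_back(docount);
}

// Hypothetical driver standing in for the Pin runtime: instruments each
// instruction of a loop body once, then executes the body `iterations` times.
void RunLoop(std::size_t bodySize, std::size_t iterations) {
    assert(bodySize > 0 && "loop body must contain at least one instruction");
    icount = 0;
    instrumented = 0;
    std::vector<ToyIns> body(bodySize);
    for (ToyIns& ins : body) Instruction(ins);       // instrumentation time
    for (std::size_t i = 0; i < iterations; ++i)     // execution time
        for (ToyIns& ins : body)
            for (auto& call : ins.before) call();
}
```

In this model, calling RunLoop(5, 100) invokes Instruction 5 times and docount 500 times, which is why the work done inside an analysis routine dominates the overhead.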

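## The PIN Framework
You can think of PIN as a just-in-time (JIT) compiler. Its input, however, is not bytecode but an ordinary executable. PIN intercepts the first instruction of the executable and generates a new code sequence starting from that instruction, then transfers control to the generated sequence. While generating this code, PIN gives the user the chance to inject their own code; this process is instrumentation.
If you only want to use PIN to analyze your own binaries, you need not care about its internals (see PLDI’2005 if you are curious). What matters is the high-level interface PIN offers, which you use to write your own Pintools. For copyright reasons, this article will not go into how PIN is implemented.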

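Now let's talk about Pintools. A Pintool is the user's implementation of the instrumentation process; you can think of a Pintool as a plugin that modifies PIN's internal code generation. The official PIN site provides some samples that show how to write Pintools, but most of them you will have to write yourself.
In general, instrumentation consists of two parts: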

First, a mechanism that decides where code is injected and what code is injected
Second, the code to execute at the injection points

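The first part is called the instrumentation routine and the second the analysis routine. A Pintool registers an instrumentation callback with PIN, for example through INS_AddInstrumentFunction() as mentioned earlier. This callback is the instrumentation routine: it inspects the code about to be generated, examines its static properties, and decides whether and where to call analysis routines. The analysis routines then collect data about the program under analysis.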
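
The point about static properties can be illustrated with a toy model rather than the PIN API. ToyIns, its isBranch flag, and the Run driver are assumptions made purely for illustration. The instrumentation routine looks only at the static property and attaches the analysis routine to branch instructions, so all other instructions execute with no added call.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a decoded instruction; NOT Pin's INS type.
struct ToyIns {
    std::string mnemonic;
    bool isBranch;     // static property, known before execution
    bool countBefore;  // set when an analysis call is attached
};

static std::uint64_t branchCount = 0;

// Analysis routine: collects run-time data (here, executed branches).
void CountBranch() { branchCount++; }

// Instrumentation routine: inspects static properties and decides
// whether to attach the analysis routine to this instruction.
void Instruction(ToyIns& ins) {
    if (ins.isBranch) ins.countBefore = true;
}

// Hypothetical driver: instrument each instruction once, then execute
// the whole sequence `passes` times and return the collected count.
std::uint64_t Run(std::vector<ToyIns> code, unsigned passes) {
    assert(!code.empty());
    branchCount = 0;
    for (ToyIns& ins : code) Instruction(ins);
    for (unsigned p = 0; p < passes; ++p)
        for (const ToyIns& ins : code)
            if (ins.countBefore) CountBranch();
    return branchCount;
}
```

With the five instructions from the earlier snippet (sub, cmp, jle, mov, add), where only jle is a branch, ten passes give a count of 10 even though 50 instructions were executed.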

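Note that the instruction-counting example earlier calls an analysis routine for every instruction, which incurs a high overhead. PIN therefore offers several granularities of instrumentation, so users can choose according to their needs: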

Instruction
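Basic block: a sequence of instructions that ends at a control-flow-changing instruction; single entry, single exit.
Trace: a sequence of basic blocks that ends at an unconditional control-flow-changing instruction; single entry, multiple exits.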
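
As a rough illustration of why coarser granularity helps, the sketch below compares per-instruction and per-basic-block counting on a toy input. It is a standalone model, not a Pintool: ExecutedBlocks, Counters, and both functions are hypothetical names, and PIN's own trace and basic-block API (for example TRACE_AddInstrumentFunction and BBL_InsertCall) is not used here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Counters {
    std::uint64_t icount = 0;         // instructions counted
    std::uint64_t analysisCalls = 0;  // analysis routine invocations
};

// Each element is the instruction count of one executed basic block
// (toy input; in Pin this would come from the instrumented program).
using ExecutedBlocks = std::vector<std::uint32_t>;

// Instruction granularity: one analysis call per executed instruction.
Counters CountPerInstruction(const ExecutedBlocks& executed) {
    Counters c;
    for (std::uint32_t numIns : executed)
        for (std::uint32_t i = 0; i < numIns; ++i) {
            c.icount += 1;
            c.analysisCalls += 1;
        }
    return c;
}

// Basic-block granularity: one analysis call per executed block,
// adding the block's instruction count in a single step.
Counters CountPerBlock(const ExecutedBlocks& executed) {
    Counters c;
    for (std::uint32_t numIns : executed) {
        assert(numIns > 0);  // a basic block holds at least one instruction
        c.icount += numIns;
        c.analysisCalls += 1;
    }
    return c;
}
```

For executed blocks of sizes 3, 2, 3, 2, and 5, both functions report 15 instructions, but the per-block version makes 5 analysis calls instead of 15.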

## Pintools Example
Finally, we close the article with a complete Pintool example:

 
```cpp
#include <iostream>
#include <fstream>
#include "pin.H"
using std::cerr;
using std::endl;
using std::ios;
using std::ofstream;
using std::string;

ofstream OutFile;

// The running count of instructions is kept here
// make it static to help the compiler optimize docount
static UINT64 icount = 0;

// This function is called before every instruction is executed
VOID docount() { icount++; }

// Pin calls this function every time a new instruction is encountered
VOID Instruction(INS ins, VOID *v)
{
    // Insert a call to docount before every instruction, no arguments are passed
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}

KNOB<string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool",
    "o", "inscount.out", "specify output file name");

// This function is called when the application exits
VOID Fini(INT32 code, VOID *v)
{
    // Write to a file since cout and cerr may be closed by the application
    OutFile.setf(ios::showbase);
    OutFile << "Count " << icount << endl;
    OutFile.close();
}

/* ===================================================================== */
/* Print Help Message                                                    */
/* ===================================================================== */
INT32 Usage()
{
    cerr << "This tool counts the number of dynamic instructions executed" << endl;
    cerr << endl << KNOB_BASE::StringKnobSummary() << endl;
    return -1;
}

/* ===================================================================== */
/* Main                                                                  */
/* ===================================================================== */
/*   argc, argv are the entire command line: pin -t <toolname> -- ...    */
/* ===================================================================== */
int main(int argc, char * argv[])
{
    // Initialize pin
    if (PIN_Init(argc, argv)) return Usage();

    OutFile.open(KnobOutputFile.Value().c_str());

    // Register Instruction to be called to instrument instructions
    INS_AddInstrumentFunction(Instruction, 0);

    // Register Fini to be called when the application exits
    PIN_AddFiniFunction(Fini, 0);

    // Start the program, never returns
    PIN_StartProgram();
    return 0;
}
```

References

PLDI’2005

PIN User Manual

PIN Intel Tutorials
