Security Note: Translation Notes: Malware Analysis Series - How to Use Yara Rules to Detect Malware

Malware Analysis: How to use Yara rules to detect malware

Original source: peerlyst
Original author: Chiheb Chebbi

When performing malware analysis, the analyst needs to collect every piece of information that can be used to identify malicious software. One of the techniques for doing so is Yara rules. In this article, we are going to explore Yara rules and how to use them to detect malware.

The article outline is the following:
 What is malware analysis
 Static malware analysis techniques
 What is Yara and how to install it
 Detect malware with Yara
 Yara rule structure
 How to write your first Yara rule
 Yara Python

After reading this article, you can download this small document, which lists other helpful Yara rule resources:
https://static.peerlyst.com/image/upload/v1571381124/post-attachments/Yara_Rules_Resources_unj0lm

Malware Analysis
Malware is a complex and malicious piece of software. Its behavior ranges from basic actions, such as simple modifications to a computer system, to advanced behavior patterns.

By definition, malware is a malicious piece of software whose aim is to damage computer systems, for example through data and identity theft, espionage, infecting legitimate users, and giving its developer full or limited control of the machine. To have a clear understanding of malware analysis, a categorization of malware based on its behavior is a must. Sometimes a sample cannot be classified cleanly because it combines many different functionalities, but in general malware can be divided into many categories. Some of them are described below:

Trojan: malware that appears to be a legitimate application.

Virus: this type of malware copies itself and infects computers.

Botnet: a network of compromised machines, generally controlled through a command and control (C2) channel.

Ransomware: this malware encrypts the data on a computer and asks the victim for a ransom, usually paid in a cryptocurrency such as Bitcoin, in exchange for the decryption key.

Spyware: as the name suggests, malware that tracks the user's activities, including search history and installed applications.

Rootkit: enables the attacker to gain unauthorized, generally administrative, access to a system. It is designed to go unnoticed and to make its removal as hard as possible.

Malware analysis is the art of determining the functionality, origin and potential impact of a given malware sample, such as a virus, worm, trojan horse, rootkit, or backdoor. As malware analysts, our main role is to collect all the information about the malicious software and to understand what happened to the infected machines. Like any process, malware analysis typically follows a certain methodology and a number of steps. It can be carried out in three phases:

Static Malware Analysis

Dynamic Malware Analysis

Memory Malware Analysis

To understand these techniques in detail, I highly recommend that you read the first sections of my article "How to bypass Machine Learning Malware Detectors with Generative adversarial Networks",

where I discussed many aspects, including:
Malware fundamentals
Malware distribution
Classical AV evasion techniques
Malware analysis techniques
Machine learning malware detection
Machine learning threat model
Bypassing machine learning malware detectors using Generative Adversarial Networks (GANs)

Static malware analysis
Static malware analysis refers to the examination of a malware sample without executing it. Its goal is to gather as much information as possible about the malicious binary. The first steps in static analysis are determining the sample's size and file type, which gives a clear picture of the targeted machines, and computing its hash values, because cryptographic hashes like MD5 or SHA1 can serve as a unique identifier for the sample file. To dive deeper, finding strings, dissecting the binary and reverse engineering the code with a disassembler like IDA are valuable steps for exploring how the malware works by studying its program instructions. Malware authors often try to make the analyst's work harder, so they frequently use packers and cryptors to evade detection. That is why, during static analysis, it is necessary to detect them using tools like PEiD.
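
The first static-analysis steps described above (file size and cryptographic hashes) can be sketched in a few lines of Python. This is only a minimal illustration using the standard library; the function names fingerprint_bytes and fingerprint_file and the stand-in bytes are assumptions made for this example, not part of the original article.

```python
import hashlib
from pathlib import Path


def fingerprint_bytes(data):
    """Return the size and common cryptographic hashes of a byte string."""
    return {
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def fingerprint_file(path):
    """Fingerprint a sample on disk by reading it, never by running it."""
    return fingerprint_bytes(Path(path).read_bytes())


if __name__ == "__main__":
    # Harmless stand-in bytes; a real sample would go through fingerprint_file().
    for name, value in fingerprint_bytes(b"hello").items():
        print(f"{name}: {value}")
```

The resulting hashes can then be looked up in threat-intelligence sources or recorded in the metadata of a Yara rule.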

For more information, read my articles:

Getting started with IDA Pro

How to Perform Static Malware Analysis with Radare2

How to build a Linux Automated Malware Analysis Lab

In this article, we are going to explore how to use YARA rules. When performing static malware analysis there are many techniques to classify and identify malware, such as hashes. Another technique is using YARA rules. According to Wikipedia:

YARA is the name of a tool primarily used in malware research and detection. It provides a rule-based approach to create descriptions of malware families based on textual or binary patterns. Each such description is essentially a Yara rule.

Install Yara:
The first step, of course, is installing YARA. If you are using Ubuntu, for example, you can simply install the yara package with apt.

It is already installed on my machine.

Or you can download the tar file from Github and install it from source: https://github.com/VirusTotal/yara/releases

Building Yara from source requires the build tools automake, libtool, make and gcc, so make sure they are installed first.

Let's check that everything went well.

Create a dummy rule in a file named my_first_rule, then run yara with that file as both the rules file and the target.

If you get "dummy my_first_rule" then everything is okay!
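
The check above can be sketched in Python as follows. The rule body (a rule named dummy whose condition is always true) follows the getting-started example in the YARA documentation; that the article used exactly this rule is an assumption, and the helper names write_dummy_rule and run_yara are hypothetical. The sketch only calls the yara command if it is found on PATH.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Assumed rule body: always matches, whatever file is scanned.
DUMMY_RULE = "rule dummy { condition: true }\n"


def write_dummy_rule(directory):
    """Write the always-true dummy rule to <directory>/my_first_rule."""
    rule_path = Path(directory) / "my_first_rule"
    rule_path.write_text(DUMMY_RULE)
    return rule_path


def run_yara(rule_path, target_path):
    """Run the yara CLI if it is on PATH; return its stdout, else None."""
    if shutil.which("yara") is None:
        return None
    result = subprocess.run(
        ["yara", str(rule_path), str(target_path)],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


with tempfile.TemporaryDirectory() as tmp:
    rule = write_dummy_rule(tmp)
    # Scan the rule file with itself, as in the check described above.
    output = run_yara(rule, rule)
    print(output if output is not None else "yara not found on PATH")
```

Because the condition is always true, any readable target should be reported as a match, which makes the rule a convenient installation smoke test.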

For official documentation, you can also visit the YARA website.

Detect Malware with Yara rules

We already learned that Yara rules are used to detect malware. Let's see how to do that in a real-world example. For testing purposes, I am going to use malware from a dataset called "theZoo": https://thezoo.morirt.com. The project owners define the repository as follows:

theZoo is a project created to make the possibility of malware analysis open and available to the public. Since we have found out that almost all versions of malware are very hard to come by in a way which will allow analysis, we have decided to gather all of them for you in an accessible and safe way. theZoo was born by Yuval tisf Nativ and is now maintained by Shahak Shalev.

Disclaimer

Please remember that these are live and dangerous malware! They come encrypted and locked for a reason! Do NOT run them unless you are absolutely sure of what you are doing!

Isolation is a security approach provided by many computer systems. It is based on splitting the system into smaller independent pieces to make sure that a compromised sub-system cannot affect the entire entity. Using a sandbox to run untrusted binaries is a wise choice when analysing malware. There are many sandboxes in the wild, such as Cuckoo Sandbox and LIMON, an open source sandbox developed by Cisco Systems Information Security Investigator Monnappa K A as a research project. LIMON is a Python script that automatically collects, analyzes, and reports on Linux malware. It allows one to inspect Linux malware before execution, during execution, and after execution (post-mortem analysis) by performing static, dynamic and memory analysis using open source tools. To identify malware we are going to use publicly available rules as a demonstration. One of the greatest resources is https://github.com/Yara-Rules/rules

Let's clone the repository. Its maintainers describe it as follows:

This project covers the need of a group of IT Security Researchers to have a single repository where different Yara signatures are compiled, classified and kept as up to date as possible, and began as an open source community for collecting Yara rules. Our Yara ruleset is under the GNU-GPLv2 license and open to any user or organization, as long as you use it under this license.

Yara version 3 or higher is required to run the rules.

To detect malware, you generally invoke yara with a rules file followed by the target to scan (a file, a directory, or a process ID).

For example, to detect NJ-RAT, run yara with the njRAT rule file and the sample as arguments.

Yara detects the malicious file.
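
By default the yara command prints one line per match: the rule name followed by the path of the matching file. A minimal sketch for turning that output into structured data is shown below. It assumes default options (flags such as -s, -m or -g add extra fields that this parser does not handle), and both the helper name parse_yara_output and the sample output line are hypothetical.

```python
def parse_yara_output(text):
    """Turn default `yara` CLI output into (rule_name, file_path) pairs."""
    matches = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Rule names cannot contain spaces, so split on the first one only;
        # this keeps file paths that contain spaces intact.
        rule_name, _, file_path = line.partition(" ")
        matches.append((rule_name, file_path))
    return matches


# Hypothetical output line, not taken from a real scan.
SAMPLE_OUTPUT = "example_rat_rule /samples/suspicious file.bin\n"
print(parse_yara_output(SAMPLE_OUTPUT))
```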

Yara rule structure

Now let's explore the structure of a Yara rule. Yara rules usually contain:

Metadata: information about the rule (author, development date and so on).

Identifier: the name that identifies the rule.

Strings identification: the strings that YARA needs to look for in order to detect malware.

Condition: a logical expression over the identified strings and other indicators that decides whether the rule matches.

For example, this is a skeleton of a simple Yara rule:
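
A skeleton along the lines described above might look like the following. It is a sketch, not the article's original figure: every name and value in the rule is a placeholder. The rule is kept in a Python string so it can be written to a .yar file or, if the optional yara-python package happens to be installed, compiled and tried directly.

```python
# Skeleton of a YARA rule; all names and values are placeholders.
SKELETON_RULE = r'''
rule rule_name
{
    meta:
        author = "analyst name"
        description = "what the rule detects"
        date = "2020-01-01"
    strings:
        $text = "some text pattern"
        $hex = { 4D 5A 90 00 }
    condition:
        $text or $hex
}
'''

try:
    import yara  # optional: provided by the yara-python package
except ImportError:
    yara = None

if yara is not None:
    rules = yara.compile(source=SKELETON_RULE)
    print([m.rule for m in rules.match(data=b"some text pattern")])
else:
    print(SKELETON_RULE)
```

The meta section is purely descriptive; only the strings and condition sections affect whether the rule matches.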

Note that you can't use these reserved keywords as identifiers:

all, and, any, ascii, at, condition, contains, entrypoint, false, filesize, fullword, for, global, in, import, include, int8, int16, int32, int8be, int16be, int32be, matches, meta, nocase, not, or, of, private, rule, strings, them, true, uint8, uint16, uint32, uint8be, uint16be, uint32be, wide
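
A small sketch of how a rule name could be checked against this list before a rule file is generated. The keyword set is copied from the list above (newer YARA releases reserve additional words, so treat it as a snapshot). The character rules follow the YARA documentation: identifiers use letters, digits and underscores, cannot start with a digit, and cannot exceed 128 characters. The function name is_valid_rule_name is hypothetical.

```python
import re

# Reserved words as listed in the article; later YARA versions add more.
YARA_KEYWORDS = frozenset("""
all and any ascii at condition contains entrypoint false filesize fullword
for global in import include int8 int16 int32 int8be int16be int32be matches
meta nocase not or of private rule strings them true uint8 uint16 uint32
uint8be uint16be uint32be wide
""".split())

# C-style identifier: first character is not a digit, at most 128 characters.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,127}")


def is_valid_rule_name(name):
    """Return True if name is well formed and not a reserved keyword."""
    return bool(_IDENTIFIER.fullmatch(name)) and name not in YARA_KEYWORDS


for candidate in ("njRAT_detect", "wide", "1st_rule"):
    print(candidate, is_valid_rule_name(candidate))
```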

This is the Yara rule for njRAT detection:

How to create your first YARA rule

Let's suppose that we are going to create a rule that detects Ardamax Keylogger. First we need to extract the strings from the sample using the strings command.
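
For readers without the strings utility at hand, the extraction step can be approximated in Python. This is a rough sketch rather than a re-implementation of the tool: it looks for runs of printable ASCII, both as single bytes and in the two-bytes-per-character form, and the function name extract_strings, the minimum length of 4, and the sample bytes are assumptions made for the example.

```python
import re


def extract_strings(data, min_len=4):
    """Return printable ASCII and wide (UTF-16LE style) strings found in data."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found


# Stand-in bytes, not a real sample: one ASCII string and one wide string
# surrounded by non-printable filler.
sample = (
    b"\x00\x01invalid bit length repeat\x00\x02"
    + "wide text".encode("utf-16-le")
    + b"\xff\xff"
)
for text in extract_strings(sample):
    print(text)
```

Strings that look specific to the malware family, rather than to the compiler or runtime, usually make the most useful rule candidates.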

Select some of the strings to add to the rule. For demonstration purposes, I am going to select:

invalid bit length repeat

??1type_info@@UAE@XZ

.?AVtype_info@@

Open a text editor and create your rule (FirstRule.yar).

The wide modifier was added to search for strings encoded with two bytes per character.

The nocase modifier was used to turn off Yara's case sensitivity.
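
The original rule file was shown as an image, so the following is only a reconstruction under stated assumptions: the three strings are the ones selected above, while the rule name FirstRule, the placement of the modifiers and the condition any of them are guesses. The helper wide (also hypothetical) shows what the wide modifier searches for: each ASCII byte interleaved with a zero byte. The rule is compiled only if the optional yara-python package is available.

```python
try:
    import yara  # optional dependency (yara-python)
except ImportError:
    yara = None

# Reconstructed for illustration: rule name, modifiers and condition are assumptions.
FIRST_RULE = r'''
rule FirstRule
{
    strings:
        $a = "invalid bit length repeat" wide ascii nocase
        $b = "??1type_info@@UAE@XZ"
        $c = ".?AVtype_info@@"
    condition:
        any of them
}
'''


def wide(text):
    """Interleave each ASCII byte with a zero byte, as the wide modifier expects."""
    return b"".join(bytes([c, 0]) for c in text.encode("ascii"))


# Stand-in data: the first string in wide form and with different casing.
sample = b"\x00\x01" + wide("Invalid Bit Length Repeat") + b"\x02"
print(wide("AB"))
if yara is not None:
    rules = yara.compile(source=FIRST_RULE)
    print([m.rule for m in rules.match(data=sample)])
```

Using wide together with ascii asks YARA to look for both encodings; wide on its own would search only for the two-bytes-per-character form.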

Save the rule and run yara with it against the sample.

As you can see, Yara detected the malicious file based on our rule.

Yara also supports regular expressions, so you can use them when writing the strings of a rule.
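
In a rule, a regular expression is written between forward slashes instead of double quotes. The sketch below shows one such pattern (the md5 example is modeled on the YARA documentation) next to an approximately equivalent check with Python's re module, which can be handy for prototyping a pattern before putting it in a rule. YARA's regex dialect is close to, but not identical to, Python's, so the Python check is an approximation; the helper name find_md5_lines and the sample bytes are hypothetical.

```python
import re

# YARA form (regular expressions are written between forward slashes):
#     $md5 = /md5: [0-9a-fA-F]{32}/
MD5_LINE = re.compile(rb"md5: [0-9a-fA-F]{32}")


def find_md5_lines(data):
    """Return every 'md5: <32 hex digits>' occurrence found in data."""
    return [m.group().decode("ascii") for m in MD5_LINE.finditer(data)]


sample = b"header md5: d41d8cd98f00b204e9800998ecf8427e trailer"
print(find_md5_lines(sample))
```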

Yara Python

It is possible to add Yara capabilities to your Python application thanks to a library called "Yara-Python".

With this library you can use YARA from your Python programs. It covers all of YARA's features, from compiling, saving and loading rules to scanning files, strings and processes.

To install it, use pip (the package is published as yara-python).

This is an example that shows how to include Yara-python in your Python application:
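
Since the original example was an image, here is a minimal sketch of typical yara-python usage instead. It relies on yara.compile(source=...) and the match(data=...) method of the compiled rules, whose results expose the matching rule name as .rule. The rule text, the helper name scan_bytes and the scanned bytes are assumptions made for illustration, and the import is guarded so the snippet still loads where the package is not installed.

```python
try:
    import yara  # provided by the yara-python package
except ImportError:
    yara = None

# Illustrative rule, not from the article: matches "hello" in any casing.
RULE_SOURCE = 'rule contains_hello { strings: $s = "hello" nocase condition: $s }'


def scan_bytes(rule_source, data):
    """Compile rule_source and return the names of the rules matching data."""
    if yara is None:
        raise RuntimeError("yara-python is not installed")
    rules = yara.compile(source=rule_source)
    return [match.rule for match in rules.match(data=data)]


if yara is not None:
    print(scan_bytes(RULE_SOURCE, b"Say HELLO to the analyst"))
else:
    print("yara-python is not installed; nothing scanned")
```

The same compiled rules object can also scan a file by passing its path to match, and compiled rules can be written to disk with save and read back with yara.load.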

Evasion techniques

Black hat hackers are highly intelligent people, and they look every day for new methods to escape antivirus software and avoid detection. Antivirus products are not complete protection solutions: no matter how sophisticated their products are, AV vendors regularly fail to detect advanced persistent attacks. Attackers use many means and tactics to bypass antivirus protection. Below are some methods used to fool antivirus software:

Obfuscation is a technique used to make the textual structure of a malware binary as hard to read as possible. In malware development it is vital to hide what we call the strings. Strings are significant pieces of text, usually URLs, registry keys and so on. In many cases, cryptographic techniques are used to hide them.
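
To see why hidden strings matter for string-based rules, consider the simplest possible case: a single-byte XOR. This is a toy illustration chosen for brevity, not the scheme of any particular malware family; the key 0x5A, the helper name xor_bytes and the URL (which uses the reserved .test domain) are all assumptions.

```python
def xor_bytes(data, key):
    """XOR every byte of data with a single-byte key (applying it twice restores data)."""
    return bytes(b ^ key for b in data)


url = b"http://example.test/gate.php"
hidden = xor_bytes(url, 0x5A)

# A plain-text search for the URL no longer finds it in the obfuscated bytes...
print(url in hidden)
# ...although the original is trivially recovered once the key is known.
print(xor_bytes(hidden, 0x5A) == url)
```

A rule that searches only for the plain URL would miss such a sample, which is why analysts often combine string matches with other indicators in the condition.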

Binding is the operation of attaching the malware to another, legitimate application, so that the legitimate program serves as cover when the malware runs.

Crypters and packers are tools and techniques used to encrypt malware and keep the antivirus from peeking inside. Packers, sometimes called executable compression methods, are used to make reverse engineering more difficult.

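Summary

So far, after briefly describing some categories of malware, we explored the different approaches to malware analysis. We then looked at Yara rules: their structure, how to use them to detect malware, and how to create your own first rule. We also saw that Yara can be integrated with Python through Yara-Python, and gave an overview of some techniques used to evade antivirus detection.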

References:

1. Malware Analysis: How to use Yara rules to detect malware
https://www.peerlyst.com/posts/malware-analysis-how-to-use-yara-rules-to-detect-malware-chiheb-chebbi

2. YARA Rules Resources
https://static.peerlyst.com/image/upload/v1571381124/post-attachments/Yara_Rules_Resources_unj0lm

3. How to bypass Machine Learning Malware Detectors with Generative adversarial Networks
https://www.peerlyst.com/posts/how-to-bypass-machine-learning-malware-detectors-with-generative-adversarial-networks-chiheb-chebbi

4. Getting started with IDA Pro
https://www.peerlyst.com/posts/getting-started-with-ida-pro-chiheb-chebbi?trk=search_page_search_result

5. How to Perform Static Malware Analysis with Radare2
https://www.peerlyst.com/posts/how-to-perform-static-malware-analysis-with-radare2-chiheb-chebbi?trk=search_page_search_result

6. How to build a Linux Automated Malware Analysis Lab
https://www.peerlyst.com/posts/how-to-build-a-linux-automated-malware-analysis-lab-chiheb-chebbi?trk=search_page_search_result

7. Wikipedia Yara
https://www.peerlyst.com/tags/wikipedia

8. Github Yara
https://github.com/VirusTotal/yara/releases

9. YARA documentation
https://yara.readthedocs.io/en/stable/gettingstarted.html

10. theZoo
https://thezoo.morirt.com

11. Github Yara Rules
https://github.com/Yara-Rules/rules

12. Tutorial: Creating Yara Rules for Malware Detection
https://www.real0day.com/hacking-tutorials/yara

13. Tutorial: Creating Yara Signatures for Malware Detection
https://0x00sec.org/t/tutorial-creating-yara-signatures-for-malware-detection/5453

14. Github VirusTotal yara-python
https://github.com/VirusTotal/yara-python

15. How to install YARA and write basic YARA rules to identify malware
https://seanthegeek.net/257/install-yara-write-yara-rules/

16. Writing YARA rules
https://yara.readthedocs.io/en/v3.4.0/writingrules.html

Posted on Sunday, June 7, 2020