DIY YARA vs. YARA with Belkasoft X
YARA is an open-source toolset commonly used in malware research, incident response, and digital forensics. DFIR and cyber security specialists employ it as an industry-standard solution for streamlining the process of detecting and categorizing malware.
The implementation of YARA generally involves two stages:
- Creating YARA rules that describe specific textual and binary patterns found in malicious files or software
- Passing these rules as arguments to a specialized tool that searches files or directories by the criteria defined in the rule files.
There are many open-source tools and components that allow you to perform YARA scans. In this article, we collected the instructions for implementing the most popular ones to help you understand how to configure and use them. However, if you prefer to skip the implementation part and go straight to rule execution, we also explain how to run YARA rules with minimal prerequisites in Belkasoft X, a DFIR tool with advanced incident response capabilities.
We will explore the following topics:
- Configuring the open-source YARA tool on various operating systems
- YARA rule structure and specifics
- Running YARA rules in the open-source YARA tool
- Automation and validation of YARA rules
- Out-of-the-box YARA scanning in Belkasoft X
Installing YARA
You can set up and run YARA on such platforms as Windows, macOS, and Linux, either through its command-line interface or by using Python scripts with the YARA-Python extension. This section includes details on installing the open-source YARA tool on each system.
For additional details, refer to the official YARA documentation website.
Windows
Option 1: YARA tool compiled binaries
- Navigate to the VirusTotal GitHub repository
- Depending on your system configuration, download either yara-x.x.x-xxxx-win32.zip or yara-x.x.x-xxxx-win64.zip
- Extract the YARA executables (for example, yara64.exe and yarac64.exe from the 64-bit archive) anywhere on your disk
Option 2: Chocolatey
Chocolatey is another command-line installer that you can use to install YARA. You can find detailed deployment instructions in the Chocolatey guidelines for YARA.
Option 3: Windows Scoop
Alternatively, you can install the YARA tool with Scoop, a command-line installer for Windows, from a PowerShell terminal (version 5.1 or later).
- Begin with the following command:
Set-ExecutionPolicy RemoteSigned -Scope CurrentUser # Optional: Needed to run a remote script the first time
irm get.scoop.sh | iex
- For YARA installation, run:
PS C:\Users\jiosu> scoop install YARA
Installing 'YARA' (4.2.3-2029) [64bit] from main bucket
yara-4.2.3-2029-win64.zip (2.0 MB) [==========================================================================] 100%
Checking hash of yara-4.2.3-2029-win64.zip ... ok.
Extracting yara-4.2.3-2029-win64.zip ... done.
Linking ~\scoop\apps\YARA\current => ~\scoop\apps\YARA\4.2.3-2029
Creating shim for 'yara'.
Creating shim for 'yarac'.
'YARA' (4.2.3-2029) was installed successfully!
- Confirm that the yara command is available in PowerShell. Running it without arguments prints the usage message:
PS C:\Users\jiosu> yara
yara: wrong number of arguments
Usage: yara [OPTION]... [NAMESPACE:]RULES_FILE... FILE | DIR | PID
Try `--help` for more options
Linux
- Download the source tarball and get ready to compile it:
tar -zxf yara-4.2.0.tar.gz
cd yara-4.2.0
./bootstrap.sh
- Verify you have automake, libtool, make, gcc and pkg-config installed in your system. Ubuntu and Debian users can run:
sudo apt-get install automake libtool make gcc pkg-config
- If you plan to modify YARA source code you may also need to install flex and bison for generating lexers and parsers:
sudo apt-get install flex bison
- Compile and install YARA in the standard way:
./bootstrap.sh
./configure
make
sudo make install
- Run the test cases to verify everything works as expected:
make check
Some YARA features depend on the OpenSSL library. Those features are enabled only if you have the OpenSSL library installed in your system. If you do not, YARA will work fine, but you will not be able to use the disabled features. The configure script will automatically detect if OpenSSL is installed or not. If you want to enforce the OpenSSL-dependent features, you must pass --with-crypto to the configure script. Ubuntu and Debian users can use sudo apt-get install libssl-dev to install the OpenSSL library.
macOS
Launch the Terminal and install the Homebrew package manager. Then type brew install yara.
YARA-Python installation
If you plan to use YARA from your Python scripts, you need to install the yara-python extension. Please refer to the instructions on how to install it.
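As a rough illustration of what calling YARA from a Python script can look like, the sketch below compiles a rule from a string and scans an in-memory buffer. It assumes the yara-python package is installed (typically with pip install yara-python). The rule name demo_rule, the marker string, and the helper scan_bytes are invented for this example, and the helper simply returns None when the package is not available.

```python
# Minimal sketch: compile a YARA rule from source and scan a byte buffer.
# Assumes the yara-python package is installed. `demo_rule`, the marker
# string, and `scan_bytes` are hypothetical names used only for illustration.
try:
    import yara
except ImportError:  # yara-python is not installed on this machine
    yara = None

DEMO_RULE = r'''
rule demo_rule
{
    strings:
        $marker = "suspicious-marker"
    condition:
        $marker
}
'''


def scan_bytes(rule_source, data):
    """Return the names of matching rules, or None if yara-python is missing."""
    if yara is None:
        return None
    rules = yara.compile(source=rule_source)
    return [match.rule for match in rules.match(data=data)]


if __name__ == "__main__":
    print(scan_bytes(DEMO_RULE, b"header suspicious-marker trailer"))
```

The same compiled rules object can also scan a file on disk by passing a path to match() instead of a data buffer.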
Modules and integrations
YARA supports modules, add-on packages that extend its functionality and allow it to perform specific tasks. For example, modules can parse specific file types, such as Portable Executable (PE) and Executable and Linkable Format (ELF) files, or search Windows Management Instrumentation (WMI) data on Windows machines. Additionally, YARA modules can be integrated with other analysis programs, like Cuckoo, and YARA can be called by programs like OSQuery. These modules and integrations make YARA a versatile and comprehensive tool for malware analysis.
YARA rule structure
A YARA rule is a structured piece of code that describes specific strings and conditions for their search. It consists of the following components:
- Metadata includes information about the rule, such as its author, description of its purpose, creation date, and so on. It is used to identify the rule and does not affect the search.
- Strings are the core of the rule; they can include regular expressions, text, or hexadecimal strings that the rule must detect during the scan. You can apply modifiers for the string presentation and encoding to fine-tune the search and use wildcards to enlarge it.
- Conditions define the search criteria, indicating if the rule must detect the files with all the provided strings, some of them, or their specific combinations.
- Imports is an optional component. It extends the search according to the functionality of additional modules, such as PE, ELF, Cuckoo, and others.
These different types of data and search modifiers provide a wide range of options for precisely detecting suspicious files. For the specification on writing YARA rules, refer to YARA docs.
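To make the components above more concrete, here is a small Python sketch that assembles the text of a rule from its parts. The helper build_rule, the rule name example_webshell, and the sample strings are hypothetical and chosen only for illustration; the output follows the general rule layout described in the YARA documentation and is not meant as a production rule.

```python
# Sketch: assemble YARA rule text from metadata, strings, and a condition.
# `build_rule`, the rule name, and all sample values are illustrative only.
def build_rule(name, meta, strings, condition, imports=()):
    """Return the text of a YARA rule built from its components."""
    lines = [f'import "{module}"' for module in imports]  # optional imports
    lines += [f"rule {name}", "{"]
    if meta:  # metadata describes the rule and does not affect matching
        lines.append("    meta:")
        lines += [f'        {key} = "{value}"' for key, value in meta.items()]
    if strings:  # text, hexadecimal, or regular expression strings
        lines.append("    strings:")
        lines += [f"        ${ident} = {definition}"
                  for ident, definition in strings.items()]
    lines += ["    condition:", f"        {condition}", "}"]
    return "\n".join(lines)


rule_text = build_rule(
    name="example_webshell",
    meta={"author": "analyst", "description": "Illustrative rule only"},
    strings={
        "text": '"cmd.exe /c" ascii wide nocase',  # text string with modifiers
        "hex": "{ 4D 5A 90 00 }",                  # hexadecimal string
        "regex": r"/[a-z0-9]{8}\.(php|jsp)/",      # regular expression
    },
    condition="filesize < 1MB and 2 of them",
)

if __name__ == "__main__":
    print(rule_text)
```

Saving the printed text to a .yar file gives a rule file that can be passed to a YARA scanner; the condition here requires any two of the three strings in a file smaller than 1 MB.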
YARA strings
Determining which strings to include in a YARA rule requires a combination of technical knowledge, analysis of the malware or software you are targeting, and understanding the behavior and characteristics of the specific threat you are investigating. Here are some recommendations that can help you identify the strings to include in your YARA rule:
- Examine the malicious file or software sample you are investigating to understand its structure, behavior, and any known indicators of compromise (IOCs). You can use specialized tools like strings.exe to extract the file strings and look for unique patterns specific to the malware or software you are analyzing. These can include hardcoded URLs, filenames, registry keys, function names, or other identifiable text strings.
- Keep in mind that malware authors may obfuscate or encrypt strings to evade detection. Therefore, it is important to consider different variations or transformations of strings that may be used in the malware. To handle obfuscated strings, you can use wildcard characters and regular expressions or apply XOR decryption routines. You can also use tools like FLARE obfuscated string solver (FLOSS) that automatically deobfuscate strings from malware binaries.
- Consult public threat intelligence sources, security blogs, or malware analysis reports to identify known IOCs associated with the malware or software you're investigating. These sources often provide valuable information about strings commonly used by threat actors.
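The first two recommendations can be sketched in a few lines of Python. The snippet below is a simplified stand-in for tools like strings.exe or FLOSS, not a replacement for them: extract_ascii_strings pulls printable ASCII runs out of a binary blob, and find_single_byte_xor_keys checks whether a known plaintext appears under a single-byte XOR key. Both function names and the sample data are assumptions made for this example.

```python
import re


def extract_ascii_strings(data, min_length=6):
    """Return printable ASCII runs of at least `min_length` bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]


def find_single_byte_xor_keys(data, plaintext):
    """Return every single-byte key (1-255) under which `plaintext` occurs in `data`."""
    keys = []
    for key in range(1, 256):
        encoded = bytes(byte ^ key for byte in plaintext)
        if encoded in data:
            keys.append(key)
    return keys


if __name__ == "__main__":
    # Hypothetical sample: two readable strings separated by binary noise.
    blob = b"\x00\x01http://example.test/a\x00ab\x00LOADPREF_MUTEX\xff"
    print(extract_ascii_strings(blob))

    # Hypothetical sample: a URL hidden with the XOR key 0x41.
    hidden = b"\x00\x01" + bytes(b ^ 0x41 for b in b"http://evil.example/payload")
    print(find_single_byte_xor_keys(hidden, b"http://"))
```

Candidate strings found this way still need manual review before they go into a rule, since strings that also occur in legitimate software tend to produce false positives.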
Types of YARA rules
You can use YARA for various purposes, including regular threat detection, threat hunting, threat intelligence, and method detection. Because you can craft rules for specific needs and extend them with modules, YARA is a versatile tool for thorough malware analysis. The table below lists typical strings for each rule type.
Rule Type | Strings Used | Example Detection |
Regular | Program section of the PDB path | \Release\dloader.pdb |
Regular | Error messages | Target: Failed to load SAM functions. |
Regular | Persistence keywords | GoogleUpdateTaskMachineSystem |
Regular | Specific output strings | The file uploaded failed! |
Regular | Mutex values | LOADPREF_MUTEX |
Regular | Unique exports | FloodFix |
Regular | File references | C:\\ddd\\a1.txt |
Regular | Command combinations | -P /tmp && chmod +x /tmp/ |
Threat Intel Tracking | Username section of the PDB path | C:\Users\WMJI\Desktop\ |
Threat Intel Tracking | Author of malicious Office Docs | <cp:lastModifiedBy>Joohn</cp:lastModifiedBy> |
Threat Intel Tracking | Email addresses | ahmedOmed@outlook.com |
Threat Intel Tracking | C2 server addresses | link.angellroofing.com |
Threat Intel Tracking | Special keywords | Backsnarf AB25 |
Threat Intel Tracking | Attacker vices | [base64 encoded "$god = " variable] |
Threat Intel Tracking | Developer fingerprint | Coded by z668 |
Method Detection | Special form of invocation | /RunProgram=\"hidcon:[a-zA-Z]{1,16}.cmd/ |
Method Detection | Special form of obfuscation | c\" & \"r\" & \"i\" & \"p\" & \"t |
Method Detection | Special form of evasion | certutil -urlcache -split -f http |
Method Detection | Suspicious form of encoding | 4D5A90000300000004000000FFFF000088000000 |
Method Detection | Suspicious size | [big LNK files] uint16(0) == 0x004c and filesize > 200KB |
Method Detection | Suspicious combination | [Copyright is Microsoft Windows and SFX RAR] |
Method Detection | Persistence method | /\\Users\\Public\\[a-zA-Z]{1,16}.exe/ |
Method Detection | Exploit code keywords | [+] Shellcode |
Method Detection | Usual suspicious exports | ReflectiveLoader |
Source: nextron-systems.com
Running YARA in the open-source YARA tool
When using the open-source implementation, you can execute YARA rules from the command line. In this example, we use the Windows binary file implementation.
The command-line syntax requires specifying the YARA executable, options, rule file, and target (a file, directory, or process ID). The most common options are recursive directory scanning (-r), counting matches (-c), and negating matches (-n), which prints only the rules that were not satisfied. You can view the full list of options in the help menu (-h).
- recursive directory scanning (-r)
yara64.exe -r path/to/rules/file.yar path/to/target/directory
- counting matches (-c)
yara64.exe -c path/to/rules/file.yar path/to/target/file
- negating matches (-n)
yara64.exe -n path/to/rules/file.yar path/to/target/file
- help menu (-h)
yara64.exe -h
YARA 4.2.3, the pattern matching swiss army knife.
Usage: yara [OPTION]... [NAMESPACE:]RULES_FILE... FILE | DIR | PID

Mandatory arguments to long options are mandatory for short options too.

       --atom-quality-table=FILE           path to a file with the atom quality table
  -C,  --compiled-rules                    load compiled rules
  -c,  --count                             print only number of matches
  -d,  --define=VAR=VALUE                  define external variable
       --fail-on-warnings                  fail on warnings
  -f,  --fast-scan                         fast matching mode
  -h,  --help                              show this help and exit
  -i,  --identifier=IDENTIFIER             print only rules named IDENTIFIER
       --max-process-memory-chunk=NUMBER   set maximum chunk size while reading process memory (default=1073741824)
  -l,  --max-rules=NUMBER                  abort scanning after matching a NUMBER of rules
       --max-strings-per-rule=NUMBER       set maximum number of strings per rule (default=10000)
  -x,  --module-data=MODULE=FILE           pass FILE's content as extra data to MODULE
  -n,  --negate                            print only not satisfied rules (negate)
  -N,  --no-follow-symlinks                do not follow symlinks when scanning
  -w,  --no-warnings                       disable warnings
  -m,  --print-meta                        print metadata
  -D,  --print-module-data                 print module data
  -M,  --module-names                      show module names
  -e,  --print-namespace                   print rules' namespace
  -S,  --print-stats                       print rules' statistics
  -s,  --print-strings                     print matching strings
  -L,  --print-string-length               print length of matched strings
  -X,  --print-xor-key                     print xor key and plaintext of matched strings
  -g,  --print-tags                        print tags
  -r,  --recursive                         recursively search directories
       --scan-list                         scan files listed in FILE, one per line
  -z,  --skip-larger=NUMBER                skip files larger than the given size when scanning a directory
  -k,  --stack-size=SLOTS                  set maximum stack size (default=16384)
  -t,  --tag=TAG                           print only rules tagged as TAG
  -p,  --threads=NUMBER                    use the specified NUMBER of threads to scan a directory
  -a,  --timeout=SECONDS                   abort scanning after the given number of SECONDS
  -v,  --version                           show version information
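If you want to drive the command-line tool from a script, the options above map naturally onto an argument list. The following Python sketch builds such a list and, optionally, runs it with subprocess. The helper names build_yara_command and run_yara, the rules.yar file name, and the sample paths are assumptions for this example; the executable name defaults to the yara64.exe binary used in this article.

```python
import subprocess


def build_yara_command(rules_file, target, recursive=False, count=False,
                       negate=False, executable="yara64.exe"):
    """Build the argument list for a YARA command-line scan."""
    command = [executable]
    if recursive:
        command.append("-r")  # recursively search directories
    if count:
        command.append("-c")  # print only number of matches
    if negate:
        command.append("-n")  # print only not satisfied rules
    command += [rules_file, target]
    return command


def run_yara(command):
    """Run the scan and return its standard output as a list of lines."""
    completed = subprocess.run(command, capture_output=True, text=True,
                               check=False)
    return completed.stdout.splitlines()


if __name__ == "__main__":
    # Hypothetical paths; printing the command avoids requiring YARA to be
    # installed just to try the sketch.
    print(build_yara_command("rules.yar", r"C:\samples", recursive=True))
```

Passing the arguments as a list rather than a single string avoids quoting problems with paths that contain spaces.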
Automating YARA rule creation with YarGen
Incident response and malware analysis often require creating rules for a significant number of malware samples. The complexity and quantity of malware can make this a time-consuming task. To streamline the process, Florian Roth created a Python tool called YarGen that automates the creation of YARA rules. YarGen can be downloaded from the corresponding GitHub repository and run on any system that can execute Python scripts. To use YarGen, run the script against a directory of malware samples. For example:
python yarGen.py -m path/to/malware/samples
For Windows, install Scoop. Then, install git and download YarGen with the following command:
git clone https://github.com/Neo23x0/yarGen
Before installing YarGen, make sure that Python is installed on your machine. Once you have Python installed, you can proceed with installing the necessary dependencies for YarGen. To do this, run the following command:
pip install -r requirements.txt
This command will install all the required dependencies for YarGen to function correctly.
Run python yarGen.py --update to download the built-in databases automatically. The system saves them into the ./dbs subfolder.
We will consider a hypothetical scenario where we have already identified a malware sample on our system, specifically a Java Server Pages (JSP) file. The JSP file is located in a directory named samples. We will use this sample to demonstrate how to run YarGen against a directory of malware samples.
It is important to note that in real-world scenarios, the quantity of malware samples can be much larger, and this demonstration is for educational purposes only. However, the process of running YarGen against a directory of malware samples remains the same. YarGen will then scan the directory and generate YARA rules based on the malware samples it finds. These rules can then be used for detecting and classifying similar malware samples in the future.
python yarGen.py -m C:\Users\Desktop\JSP\SAMPLE\
The execution of the command will produce a YARA rule as output. However, it is crucial to keep in mind that this rule may require some refinement and adjustments. The tool does not always generate perfect matches for strings and other data. As a result, it is the responsibility of the analyst to thoroughly review the output rule and make any necessary modifications. This includes removing any unnecessary strings or conditions that may increase the rate of false positives. Proper cleanup and post-processing of the rule is essential for achieving optimal results.
Upon completion of the procedure, a yargen_rules.yar file will be generated. This file can be opened for further review using the following command: notepad yargen_rules.yar
/*
   YARA Rule Set
   Author: yarGen Rule Generator
   Date: 2023-02-02
   Identifier:
   Reference: https://github.com/Neo23x0/yarGen
*/

/* Rule Set ----------------------------------------------------------------- */

rule webshell_sample {
   meta:
      description = " - file webshell-sample.jsp"
      author = "yarGen Rule Generator"
      reference = "https://github.com/Neo23x0/yarGen"
      date = "2023-02-02"
      hash1 = "58fffca06fd551c6dd09eebee9f3958db65fcd3faf12787b825643f9d50bb695"
   strings:
      $s1 = " private void execute(HttpSession session, String cmd) throws IOException {" fullword ascii
      $s2 = "<FORM NAME=\"shell\" action=\"\" method=\"POST\" onsubmit=\"exeCommand('execute');return false;\">" fullword ascii
      $s3 = " while(processThreadSession.getProgIn()==null && processThreadSession.isAlive()){" fullword ascii
      $s4 = " session.setAttribute(\"progErrorByteArrayOutputStream\", processThreadSession.getProgError());" fullword ascii
      $s5 = " private void setupProcess(HttpSession session) {" fullword ascii
      $s6 = " execute(session, cmd);" fullword ascii
      $s7 = " System.out.println(\"Process end!!!!!!!\");" fullword ascii
      $s8 = " session.setAttribute(\"progInBufferedWriter\", processThreadSession.getProgIn());" fullword ascii
      $s9 = " session.setAttribute(\"progOutputByteArrayOutputStream\", processThreadSession.getProgOutput());" fullword ascii
      $s10 = " Thread processThreadSessionOld = (Thread) session.getAttribute(\"process\");" fullword ascii
      $s11 = " proc = runtime.exec(\"cmd\");// for Windows System use runtime.exec(\"cmd\");" fullword ascii
      $s12 = " req.setRequestHeader('User-Agent','XMLHTTP/1.0');" fullword ascii
      $s13 = " <input type=\"button\" value=\"Reset\" name=\"controlcButton\" onclick=\"exeCommand('controlc');return false;\"/>" fullword ascii
      $s14 = " function exeCommand(myFunction){" fullword ascii
      $s15 = " session.setAttribute(\"process\", processThreadSession);" fullword ascii
      $s16 = " setupProcess(session);" fullword ascii
      $s17 = " if (processThreadSessionOld != null) {" fullword ascii
      $s18 = "MUKHA MO!!!" fullword ascii
      $s19 = " processThreadSessionOld.interrupt();" fullword ascii
      $s20 = " private String getOutput(HttpSession session) {" fullword ascii
   condition:
      uint16(0) == 0x253c and filesize < 40KB and
      8 of them
}
The generated YARA rule contains 20 ASCII strings obtained from the malware sample. Its condition requires that the target file begins with the bytes "<%" (uint16(0) == 0x253c), is smaller than 40 KB, and contains at least 8 of the listed strings. After making some edits and adjustments, you can use this rule to detect the malware on other systems, in disk images, or in other file formats.
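To see what the condition of the generated rule checks, it can help to express it in plain Python. The sketch below is a simplified emulation, not how YARA evaluates rules internally: it ignores the fullword modifier and uses plain substring search. The function names and the sample data are made up for illustration.

```python
import struct

MAX_SIZE = 40 * 1024  # `40KB` in a YARA condition means 40 * 1024 bytes


def starts_with_jsp_tag(data):
    """Emulate `uint16(0) == 0x253c`: the file begins with the bytes '<%'."""
    if len(data) < 2:
        return False
    # uint16() reads an unsigned little-endian 16-bit integer.
    (value,) = struct.unpack_from("<H", data, 0)
    return value == 0x253C


def condition_holds(data, strings, minimum=8):
    """Emulate `uint16(0) == 0x253c and filesize < 40KB and 8 of them`."""
    hits = sum(1 for candidate in strings if candidate in data)
    return (starts_with_jsp_tag(data)
            and len(data) < MAX_SIZE
            and hits >= minimum)


if __name__ == "__main__":
    # Hypothetical stand-ins for the strings of the generated rule.
    demo_strings = [b"s%d;" % index for index in range(10)]
    sample = b"<%" + b"".join(demo_strings[:8])
    print(condition_holds(sample, demo_strings))
```

Lowering the number in "8 of them" makes the rule match more files at the cost of more false positives, which is one of the adjustments an analyst may consider during review.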
Validating YARA rules with YaraDBG
One more tool that can come in handy when creating and running YARA rules is YaraDBG. It is a web-based debugger that enables thorough root-cause analysis (RCA) by providing insights into why specific YARA rules matched or did not match particular files. It aids in the management of large rule sets and offers a dedicated YARA parser, regular expression (regex) engine, and evaluation engine.
To evaluate a YARA rule, drop it in the left pane, and then in the right pane, drop the file against which you want to evaluate the rule. Note that the implementation of YaraDBG ensures that data files are kept securely on your machine, and only the YARA rules are sent to the backend for parsing.
The left pane displays your YARA rules.
The right pane displays the hexadecimal representation of the binary file.
You can review and analyze the matches at the bottom of the right pane.
Out-of-the-box YARA scans in Belkasoft X
Analyzing the causes of a cyber threat is often a time-critical task. If you want to skip the elaborate installation process of the open-source YARA tool, you can use specialized products that support running YARA rules out of the box. In this section, we will show how to execute them in Belkasoft X, a comprehensive digital forensics and cyber incident response tool.
If you choose Belkasoft X to scan the files for malware, all you need to get started are the files you want to scan and the YARA rules you want to run for these files. You can select the YARA rules to use in the "Select advanced analysis" window when adding the files to analyze as a data source to your case.
Alternatively, you can scan an existing data source. From the main menu, select File System. Then, right-click the item and select Scan with YARA.
After Belkasoft X analyzes the files, go to the Artifacts → Overview tab to view the items detected according to the selected YARA rules.
In addition to viewing, you can sort, filter, bookmark the detected items, and export them into various report formats.
For more details on YARA rules and how to use them with Belkasoft X, refer to the article "Walkthrough: YARA Rules in Belkasoft X".
Conclusion
YARA serves as a convenient and effective tool for malware detection and categorization. It is based on YARA rules that describe the patterns occurring in malicious files. You can execute YARA rules using open-source tools; however, this approach may require some time and effort to set up and configure all the necessary components.
To avoid a steep learning curve and unnecessary effort, you can opt for a tool that supports YARA out of the box. One such tool is Belkasoft X, which allows you to perform this task with minimal prerequisites. Investigators can then focus on analyzing the results and gaining valuable insights from the YARA rule-based scans, accelerating their investigation process and enhancing overall efficiency.
Further reading:
- YARA: https://virustotal.github.io/yara/
- Guide to writing sound YARA rules: https://www.nextron-systems.com/2015/02/16/write-simple-sound-yara-rules/
- Compiled strings found in legitimate programs: https://github.com/Neo23x0/yarGen
- Repository of unique and useful YARA rules: https://github.com/InQuest/awesome-yara
- Google releases 165 YARA rules to detect Cobalt Strike attacks: https://www-bleepingcomputer-com.cdn.ampproject.org/c/s/www.bleepingcomputer.com/news/security/google-releases-165-yara-rules-to-detect-cobalt-strike-attacks/amp/