Tool for extracting possible IoC information from files

Writer: Hakkari Onni, Project Engineer, JAMK University of Applied Sciences

This blog post presents a tool called ioc_strings that can be used to gather relevant technical information from file strings. The tool extracts possible IoC (Indicator of Compromise) information from files, such as URLs, domains, email addresses and hashes. These IoC types are compatible with Cortex-Analyzers, so the gathered possible IoCs can be fed to Cortex-Analyzers, which evaluate whether they are actual IoCs.

Benefit of finding only relevant information

In the image below, the left side shows the output of the Linux strings command, which contains about 112k strings. The right side shows the output of ioc_strings: 27 lines that contain only the possible IoC information relevant for further analysis. Without ioc_strings, it would be a huge job to go through all of the strings output manually.



The tool uses the Linux strings command to gather all strings from a file and then loops through every one of them. It also splits each string at whitespace to increase the number of possible IoCs found. For example, a string of the form ip = <IP address>, which as a whole does not identify as any IoC type, would yield 3 strings: ip, = and the address itself. Of these 3 strings, the last one identifies as an IP address. The Python libraries iocextract and validators are used for identifying IoC types.
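The extract, split and classify flow described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: it uses only the Python standard library, so a regular expression over raw bytes stands in for the strings command, and simple checks stand in for iocextract and validators. The function names and the sample address 192.0.2.10 (from a documentation-only range) are assumptions made for the example.

```python
import ipaddress
import re

# Runs of 4 or more printable ASCII bytes, similar to the default of `strings`.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# MD5, SHA-1 or SHA-256 sized hex strings (simplified stand-in check).
HEX_HASH = re.compile(r"^(?:[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$")
# Very loose email shape (simplified stand-in check).
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$")


def printable_strings(data):
    """Yield printable ASCII runs from raw bytes (stand-in for `strings`)."""
    for match in PRINTABLE_RUN.finditer(data):
        yield match.group().decode("ascii")


def classify(token):
    """Return the IoC types a token matches (hypothetical simplified checks)."""
    types = []
    try:
        ipaddress.ip_address(token)
        types.append("ip")
    except ValueError:
        pass
    if HEX_HASH.match(token):
        types.append("hash")
    if EMAIL.match(token):
        types.append("email")
    return types


def possible_iocs(data):
    """Split every extracted string at whitespace and keep typed tokens."""
    found = {}
    for line in printable_strings(data):
        for token in line.split():
            types = classify(token)
            if types:
                found[token] = types
    return found


# Made-up binary content: two printable runs separated by non-printable bytes.
sample = b"\x00\x01ip = 192.0.2.10\x00\xffhello world\x00"
print(possible_iocs(sample))
```

Splitting at whitespace is what lets the address be recognized here: the full run "ip = 192.0.2.10" matches none of the checks, but its last token does.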

ioc_strings can also be used as a Python library to identify the IoC type of a string. Example code and output:

import ioc_strings

ioc1 = ioc_strings.IOC("")

ioc2 = ioc_strings.IOC("testing")


{'': ['ip']}
{'testing': []}

Example usage

The example files scanned are from the theZoo malware repository.

The input path can be either a file or a directory. If the input path is a directory, all file paths under it are searched recursively and extracted one by one. Example case directory structure:

├── Brain.A
│   ├── Brain.A.img
│   ├── Brain.A.txt
│   ├── nobrains
│   │   ├── BRAIN
│   │   ├── DEBRAIN.C
│   │   ├── DEBRAIN.EXE
│   │   ├── DREAD.ASM
│   │   ├── DREAD.INC
│   │   ├── DWRITE.ASM
│   │   ├── DWRITE.INC
│   │   ├── README
│   │   ├── VACCINE.COM
│   │   ├── VACCINE.PAS
│   │   └── VACCINE.TXT
│   └──
├── Brain.A.md5
├── Brain.A.pass
├── Brain.A.sha


iocstrings theZoo/malwares/Binaries/Brain.A/
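How a file-or-directory input path might be expanded into individual files can be sketched as below. This is an assumption about the general approach, not the tool's own implementation; the helper name iter_input_files and the demo file names are hypothetical.

```python
from pathlib import Path
import tempfile


def iter_input_files(input_path):
    """Yield the file itself, or every regular file below a directory."""
    path = Path(input_path)
    if path.is_file():
        yield path
    elif path.is_dir():
        # rglob("*") descends into all subdirectories; sort for a stable order.
        for candidate in sorted(path.rglob("*")):
            if candidate.is_file():
                yield candidate


# Demo on a throwaway directory tree with made-up file names.
with tempfile.TemporaryDirectory() as root:
    nested = Path(root) / "sample" / "nested"
    nested.mkdir(parents=True)
    (Path(root) / "sample" / "a.bin").write_bytes(b"\x00")
    (nested / "b.txt").write_text("hello")
    for file_path in iter_input_files(root):
        print(file_path.relative_to(root))
```

Each yielded path would then be processed on its own, which matches the "one by one" extraction described above.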



IoC types can also be included in the output with the -t option. Command:

iocstrings theZoo/malwares/Binaries/Brain.A/ -t

Output (JSONL format):

{"c56f135fdaff397ad207f61b4f2042fe": ["hash"]}
{"03f1e073761af071d373f025359da84ec39ada19": ["hash"]}
{"": ["domain"]}
{"": ["domain"]}
{"": ["email"]}
{"": ["domain"]}
{"": ["email"]}

The output can also be filtered by IoC type. Command:

iocstrings theZoo/malwares/Binaries/Brain.A/ -t --filter email


{"": ["email"]}
{"": ["email"]}
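Because the -t output is JSONL with one single-key object per line, it is straightforward to post-process in a script. The sketch below parses such output and keeps only the strings of one type, similar in spirit to --filter. The function name and the sample values (example.com, user@example.com) are made up for illustration and are not results from the scanned files.

```python
import json


def filter_by_type(jsonl_text, wanted_type):
    """Return the strings whose type list contains wanted_type."""
    matches = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)  # one {"string": ["type", ...]} object per line
        for value, types in record.items():
            if wanted_type in types:
                matches.append(value)
    return matches


# Made-up sample in the same shape as the -t output.
sample_output = """\
{"c56f135fdaff397ad207f61b4f2042fe": ["hash"]}
{"example.com": ["domain"]}
{"user@example.com": ["email"]}
"""
print(filter_by_type(sample_output, "email"))
```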


The ioc_strings tool is an alternative to the Linux strings command, and perhaps a more convenient choice, when analyzing malware files or memory dumps. If you are interested in testing the tool, see the GitHub repository: