PDF pipeline

The pipeline polls a GitLab repo for new files, analyses the documents, and writes the results to another branch of the repo.

Watch the VIDEO

Tools run in the pipeline

PDFiD / PeePDF / JSunpack-n / shellcode analysis

Pipeline workflow:

Poll for new files in the git repository
job-show-files: List samples to be analysed
job-pdfid: Analyse the samples, classify them as clean, malicious, or needing more analysis, and save logs to the repo.
job-peepdf-virustotal-check: Analyse samples, query hashes from the VirusTotal database
job-jsunpack-n: Analyse samples, extract JavaScript, and convert shellcode to binary if found. Save shellcode to the repo's shellcode folder.
job-sctest: Use peepdf's sctest to analyse the shellcode binaries converted by jsunpack-n. Push results to repo.
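
The two simplest steps in the workflow above, listing the samples and producing the hashes that a VirusTotal lookup is keyed on, can be sketched in plain shell. This is only an illustration, not the pipeline's own scripts: the function names list_samples, sample_sha256 and hash_samples are hypothetical, the default folder name pdf is taken from the sample-source layout, and SHA-256 is assumed as the hash type (VirusTotal also accepts MD5 and SHA-1).

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical and are not taken
# from the pipeline's own job scripts.

# Print every file under the samples directory (default: pdf),
# one path per line, in sorted order.
list_samples() {
    find "${1:-pdf}" -type f | sort
}

# Print the SHA-256 digest of a single file (requires sha256sum).
sample_sha256() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# Print "<digest>  <path>" for each sample in the directory.
hash_samples() {
    list_samples "$1" | while IFS= read -r sample; do
        printf '%s  %s\n' "$(sample_sha256 "$sample")" "$sample"
    done
}
```

Running hash_samples pdf inside a checkout of the sample-source branch prints one digest per sample, which can then be compared against whatever the VirusTotal check reports.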

Watch the pdf-pipeline VIDEO

The repositories

branch: master

Contains the scripts for the pipeline
The results will be written to results/

branch: sample-source

Place the samples here

How to set up the pipeline

USING THE SCRIPT

You can set up the pipeline in the pilot environment by running the setup script:

sudo ./setup-pipeline.sh

[+] Cloning the pipelines.git

Available pipelines
1) document-pipeline
2) pdf-pipeline
3) pdf-pipeline Private registry version
4) Quit
Your choice:

Or you can give the desired pipeline as an argument: sudo ./setup-pipeline.sh pdf-pipeline

The setup copies the pilot environment's SSH keys to GitLab using a private access token, and modifies the pipeline's credentials so that its resources are accessed over SSH.

MANUALLY

Set up Concourse (tutorial)
Set up a git repository with branch: master, with the files included in the "results" folder.
Set up branch: sample-source with a "pdf" folder for the samples.
Edit credentials.yml with the details of your git repository and your SSH key.
Log in to Concourse:

fly -t CONCOURSE_TARGET_NAME login -c http://127.0.0.1:8080 -u CONCOURSE_USERNAME -p CONCOURSE_PASSWORD

Set up the pipeline:

fly -t CONCOURSE_TARGET_NAME sp -c pipeline.yml -p pdfjobs -l credentials.yml

Unpause the pipeline:

fly -t CONCOURSE_TARGET_NAME unpause-pipeline -p pdfjobs

Upload your samples to the sample-source branch.
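
The manual steps above can be collected into two small shell helpers, shown here as a minimal sketch. The function names deploy_pipeline and add_samples are hypothetical, and so is the commit message. The sketch assumes that pipeline.yml and credentials.yml are in the current directory, that the local clone is already checked out on sample-source, and that the remote is called origin.

```shell
#!/bin/sh
# Sketch only: deploy_pipeline and add_samples are hypothetical helpers
# wrapping the fly and git steps from the manual setup.

# Log in to Concourse, set the pdfjobs pipeline, and unpause it.
# Stops at the first command that fails.
# Usage: deploy_pipeline TARGET URL USERNAME PASSWORD
deploy_pipeline() {
    target=$1 url=$2 user=$3 password=$4
    fly -t "$target" login -c "$url" -u "$user" -p "$password" &&
    fly -t "$target" set-pipeline -c pipeline.yml -p pdfjobs -l credentials.yml &&
    fly -t "$target" unpause-pipeline -p pdfjobs
}

# Copy sample files into the pdf folder of a local clone that is
# already on the sample-source branch, then commit and push them.
# Usage: add_samples REPO_DIR FILE...
add_samples() {
    repo=$1
    shift
    mkdir -p "$repo/pdf" &&
    cp -- "$@" "$repo/pdf/" &&
    git -C "$repo" add pdf &&
    git -C "$repo" commit -m "Add PDF samples" &&
    git -C "$repo" push origin sample-source
}
```

For example, deploy_pipeline CONCOURSE_TARGET_NAME http://127.0.0.1:8080 CONCOURSE_USERNAME CONCOURSE_PASSWORD followed by add_samples ./samples-repo suspicious.pdf would deploy the pipeline and push one sample for it to pick up on its next poll. The clone path and file name here are placeholders.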