PDF pipeline

The pipeline polls a GitLab repository for new files, analyses the documents, and writes the results to another branch of the same repository.

Tools run in the pipeline

PDFiD / PeePDF / JSunpack-n / shellcode analysis

Pipeline workflow:

  1. Poll for new files in git repository

  2. job-show-files: List samples to be analysed

  3. job-pdfid: Analyse the samples, classify each one as clean, malicious, or needing more analysis, and save the logs to the repo. job-peepdf-virustotal-check: Analyse the samples and query their hashes against the VirusTotal database.

  4. job-jsunpack-n: Analyse the samples, extract JavaScript, and convert any shellcode found to binary. Save the shellcode to the repo's shellcode folder.

  5. job-sctest: Use peepdf's sctest to analyse the shellcode binaries converted by jsunpack-n. Push results to repo.
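The three-way classification in step 3 can be sketched as a small shell function. This is only an illustration: the function name classify_pdfid and the rule it applies (script keywords combined with automatic-action keywords) are assumptions, not the logic that job-pdfid actually uses. It reads a PDFiD text report on standard input, where each keyword is followed by its count.

```shell
#!/bin/sh
# Hypothetical triage helper: reads a PDFiD text report on stdin and prints
# one of: clean, more-analysis, malicious.
# The rule is an assumption made for illustration, not the pipeline's logic.
classify_pdfid() {
  awk '
    # Count keywords that indicate embedded JavaScript.
    $1 == "/JS" || $1 == "/JavaScript" { script += $2 + 0 }
    # Count keywords that trigger an action automatically.
    $1 == "/OpenAction" || $1 == "/AA"  { auto += $2 + 0 }
    END {
      if (script > 0 && auto > 0)      print "malicious"
      else if (script > 0 || auto > 0) print "more-analysis"
      else                             print "clean"
    }'
}
```

Assuming pdfid.py is on the PATH, a report could be piped in with: pdfid.py sample.pdf | classify_pdfid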

Watch the pdf-pipeline VIDEO

The repositories

branch: master

  • Contains the scripts for the pipeline

  • The results will be written to results/

branch: sample-source

  • Place the samples here
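Placing a sample on the sample-source branch might look like the sketch below. The helper name add_sample and the commit message are hypothetical. It assumes a local clone that already has a sample-source branch, and it copies the file into the "pdf" folder used in the manual setup.

```shell
#!/bin/sh
# Hypothetical helper: commit one sample to the sample-source branch.
# Usage: add_sample REPO_DIR SAMPLE_FILE
add_sample() {
  repo_dir=$1
  sample=$2
  # Switch the working tree to the branch the pipeline polls.
  git -C "$repo_dir" checkout -q sample-source || return 1
  # Samples live in the pdf/ folder of that branch.
  mkdir -p "$repo_dir/pdf" || return 1
  cp "$sample" "$repo_dir/pdf/" || return 1
  git -C "$repo_dir" add pdf || return 1
  git -C "$repo_dir" commit -q -m "Add sample $(basename "$sample")" || return 1
  # Push only when a remote named origin is configured.
  if git -C "$repo_dir" remote get-url origin >/dev/null 2>&1; then
    git -C "$repo_dir" push -q origin sample-source
  fi
}
```

For example: add_sample ./my-clone ./suspicious.pdf (both paths are placeholders).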

How to set up the pipeline

USING THE SCRIPT

You can set up the pipeline in the pilot environment by running the setup script:

sudo ./setup-pipeline.sh

[+] Cloning the pipelines.git

Available pipelines
1) document-pipeline
2) pdf-pipeline
3) pdf-pipeline Private registry version
4) Quit
Your choice: 

Or you can give the desired pipeline as an argument: sudo ./setup-pipeline.sh pdf-pipeline

The setup copies the pilot environment's SSH keys to GitLab using a private access token and modifies the pipeline's credentials so that its resources are accessed over SSH.

MANUALLY

  1. Set up Concourse (tutorial)

  2. Set up a Git repository with a master branch that contains the files included in the "results" folder.

  3. Set up a sample-source branch with a "pdf" folder for the samples.

  4. Edit credentials.yml with the details of your Git repository and your SSH key.

  5. Log in to Concourse:

fly -t CONCOURSE_TARGET_NAME login -c http://127.0.0.1:8080 -u CONCOURSE_USERNAME -p CONCOURSE_PASSWORD

  6. Set up the pipeline:

fly -t CONCOURSE_TARGET_NAME sp -c pipeline.yml -p pdfjobs -l credentials.yml

  7. Unpause the pipeline:

fly -t CONCOURSE_TARGET_NAME unpause-pipeline -p pdfjobs

  8. Upload your samples to the sample-source branch.
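The three fly commands above (log in, set the pipeline, unpause it) can be chained in one small script. The sketch below is one possible arrangement, not part of the repository: the names run and deploy_pdfjobs and the DRY_RUN variable are hypothetical. It uses set-pipeline, the long form of the sp alias, and expects pipeline.yml and credentials.yml in the current directory.

```shell
#!/bin/sh
# Hypothetical wrapper around the three fly commands.
# With DRY_RUN=1 each command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: deploy_pdfjobs TARGET URL USERNAME PASSWORD
deploy_pdfjobs() {
  target=$1; url=$2; user=$3; password=$4
  # Log in to Concourse and save the target.
  run fly -t "$target" login -c "$url" -u "$user" -p "$password" || return 1
  # Upload the pipeline definition together with the credentials file.
  run fly -t "$target" set-pipeline -c pipeline.yml -p pdfjobs -l credentials.yml || return 1
  # New pipelines start paused, so unpause it last.
  run fly -t "$target" unpause-pipeline -p pdfjobs
}
```

Setting DRY_RUN=1 before calling deploy_pdfjobs shows the exact commands without contacting Concourse. Note that the dry run prints the password in clear text.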