[WIP] Zardoz is a small WAF which tries to learn, from the client's calls and the server's responses, the rules of what to block.
Uriel Fanelli 424a46b5d0 Update 'LICENSE' 2 years ago


Zardoz: a lightweight WAF, based on Pseudo-Bayes machine learning.

Zardoz is a small WAF that aims to filter out HTTP calls which are well known to end in some HTTP error. It behaves like a reverse proxy, running as a frontend. It intercepts the calls, forwards them when needed, and learns from the Status Code how the server reacts.
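The "forward and learn from the Status Code" idea can be sketched with Go's standard reverse proxy. This is only an illustration, not Zardoz's actual code: the `learn` callback, the `isBadStatus` rule (any 4xx or 5xx counts as bad), and the assumption that REVERSEURL holds a full URL are all hypothetical.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"os"
)

// isBadStatus is an assumed rule: any 4xx or 5xx reply counts as a "bad" call.
func isBadStatus(code int) bool {
	return code >= 400
}

// newLearningProxy forwards requests to target and reports every upstream
// outcome to the learn callback (a hypothetical hook for the classifier).
func newLearningProxy(target *url.URL, learn func(req *http.Request, bad bool)) *httputil.ReverseProxy {
	proxy := httputil.NewSingleHostReverseProxy(target)
	proxy.ModifyResponse = func(resp *http.Response) error {
		learn(resp.Request, isBadStatus(resp.StatusCode))
		return nil // only observe the response, never alter it
	}
	return proxy
}

func main() {
	raw := os.Getenv("REVERSEURL")
	if raw == "" {
		log.Println("REVERSEURL is not set, nothing to proxy")
		return
	}
	// Assumption: REVERSEURL is a full URL such as http://127.0.0.1:8080.
	target, err := url.Parse(raw)
	if err != nil {
		log.Fatal(err)
	}
	proxy := newLearningProxy(target, func(req *http.Request, bad bool) {
		log.Printf("%s %s bad=%v", req.Method, req.URL.Path, bad)
	})
	log.Fatal(http.ListenAndServe(os.Getenv("PROXYPORT"), proxy))
}
```

Here the callback only logs; a real filter would feed the request headers and the outcome to its classifier.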

After a while, the Bayes classifier is able to tell a "good" HTTP call from a "bad" one, based on the header contents.
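To give a feeling of what "learning from the header contents" can mean, here is a deliberately naive, count-based stand-in. It is not the Pseudo-Bayes classifier Zardoz uses: the `name:word` token scheme and the `counts` type are assumptions made up for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
	"strings"
)

// tokenize turns headers into "name:word" tokens (an assumed feature scheme).
func tokenize(h http.Header) []string {
	var toks []string
	for name, values := range h {
		for _, v := range values {
			for _, word := range strings.Fields(v) {
				toks = append(toks, strings.ToLower(name)+":"+word)
			}
		}
	}
	sort.Strings(toks) // map order is random, so sort for stable output
	return toks
}

// counts stores how often each token was seen in good and in bad calls.
type counts struct {
	good, bad map[string]int
}

func newCounts() *counts {
	return &counts{good: map[string]int{}, bad: map[string]int{}}
}

// learn records the tokens of one call under the class the server revealed.
func (c *counts) learn(toks []string, bad bool) {
	class := c.good
	if bad {
		class = c.bad
	}
	for _, t := range toks {
		class[t]++
	}
}

// score sums, per class, the counts of the tokens present in a new call.
func (c *counts) score(toks []string) (good, bad int) {
	for _, t := range toks {
		good += c.good[t]
		bad += c.bad[t]
	}
	return good, bad
}

func main() {
	c := newCounts()
	scanner := http.Header{"User-Agent": {"sqlmap/1.0"}}
	browser := http.Header{"User-Agent": {"Mozilla/5.0"}, "Accept": {"text/html"}}
	c.learn(tokenize(scanner), true)  // the server answered with an error
	c.learn(tokenize(browser), false) // the server answered normally
	good, bad := c.score(tokenize(scanner))
	fmt.Println("good:", good, "bad:", bad) // prints "good: 0 bad: 1"
}
```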

It is designed to use little memory and CPU, so that you don't need a powerful server to keep it running, and so that it does not add high latency in front of the web server.


This is just an experiment I'm doing with Pseudo-Bayes classifiers. It works pretty well with my blog. Run in production at your own risk.



  • golang >= 1.12.9


git clone https://git.keinpfusch.net/LowEel/zardoz 
cd zardoz
go build 


Zardoz has no configuration file; it depends entirely on environment variables.

In a Dockerfile, this looks like:

ENV DUMPFILE /somewhere/bayes.txt

Using a bash script, this means something like:

export PROXYPORT=":17000" 
export TRIGGER="0.6"
export SENIORITY="1025"
export DEBUG="true"
export DUMPFILE="/somewhere/bayes.txt"
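On the Go side, reading these variables could look roughly like the sketch below. The `config` struct and `loadConfig` function are hypothetical names, and the validation rules (TRIGGER must parse as a number between 0 and 1, SENIORITY as an integer) are assumptions drawn from the descriptions in this README, not from Zardoz's source.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// config mirrors the environment variables shown above (hypothetical type).
type config struct {
	ReverseURL string
	ProxyPort  string
	Trigger    float64
	Seniority  int
	Debug      bool
	DumpFile   string
}

// loadConfig reads the settings through getenv so it can be tested
// without touching the real process environment.
func loadConfig(getenv func(string) string) (config, error) {
	cfg := config{
		ReverseURL: getenv("REVERSEURL"),
		ProxyPort:  getenv("PROXYPORT"),
		DumpFile:   getenv("DUMPFILE"),
		Debug:      getenv("DEBUG") == "true", // "false" and unset behave the same
	}
	var err error
	if cfg.Trigger, err = strconv.ParseFloat(getenv("TRIGGER"), 64); err != nil {
		return cfg, fmt.Errorf("TRIGGER: %v", err)
	}
	if cfg.Trigger < 0 || cfg.Trigger > 1 {
		return cfg, fmt.Errorf("TRIGGER: %v is outside 0..1", cfg.Trigger)
	}
	if cfg.Seniority, err = strconv.Atoi(getenv("SENIORITY")); err != nil {
		return cfg, fmt.Errorf("SENIORITY: %v", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```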

Understanding Configuration:

REVERSEURL is the server zardoz will be a reverse proxy for. This maps to the IP and port of the server you want to protect.

PROXYPORT is the IP and port where zardoz will listen. If you want zardoz to listen on all interfaces, just write something like ":17000", meaning it will listen on all interfaces at port 17000.
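The ":17000" form follows Go's usual listen-address convention, where an empty host part means "every interface". A small check, with a helper name (`listensOnAllInterfaces`) invented for this example:

```go
package main

import (
	"fmt"
	"net"
)

// listensOnAllInterfaces reports whether a listen address has an empty host part.
func listensOnAllInterfaces(addr string) (bool, error) {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return false, err
	}
	return host == "", nil
}

func main() {
	for _, addr := range []string{":17000", "127.0.0.1:17000"} {
		all, err := listensOnAllInterfaces(addr)
		fmt.Println(addr, all, err)
	}
}
```

Note that this helper only looks at the empty-host form; an explicit wildcard such as "0.0.0.0:17000" is reported as a specific host.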

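TRIGGER: this is one of the trickiest parts. We can describe the behavior of zardoz in quadrants: one axis is whether a call looks BAD or GOOD (BLOCK or PASS), the other is whether the likelihood is above or below TRIGGER (the action alone, or the action plus LEARN).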


The value of TRIGGER can range from 0 to 1, for example "0.5" or "0.6". The difference between BLOCK without learning and BLOCK with learning is execution time. From the user's point of view nothing changes (the user is blocked either way), but in the "block+learn" case the machine will also try to learn the lesson.

Basically, if GOOD and BAD are very far apart, the "likelihood" is very high, so BLOCK and PASS are applied strictly.

If the likelihood is lower than TRIGGER, then we aren't sure the prediction is good, so zardoz still executes the PASS or BLOCK, but it waits for the response and learns from it. To summarize, "likelihood" is what makes the difference between an action and the same action + LEARN.
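The four outcomes can be written down as a small decision function. This is a sketch under stated assumptions: the README does not define how "likelihood" is computed, so here it is taken to be the absolute distance between the GOOD and BAD scores, both assumed to lie between 0 and 1. The `decision` type and `decide` function are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

// decision is the action for one request: block or pass, with or without learning.
type decision struct {
	Block bool
	Learn bool
}

// String renders the decision as PASS, BLOCK, PASS+LEARN or BLOCK+LEARN.
func (d decision) String() string {
	s := "PASS"
	if d.Block {
		s = "BLOCK"
	}
	if d.Learn {
		s += "+LEARN"
	}
	return s
}

// decide blocks when BAD outweighs GOOD, and additionally learns when the
// assumed likelihood |good-bad| falls below trigger.
func decide(good, bad, trigger float64) decision {
	likelihood := math.Abs(good - bad)
	return decision{Block: bad > good, Learn: likelihood < trigger}
}

func main() {
	fmt.Println(decide(0.9, 0.1, 0.6))   // PASS
	fmt.Println(decide(0.1, 0.9, 0.6))   // BLOCK
	fmt.Println(decide(0.55, 0.45, 0.6)) // PASS+LEARN
	fmt.Println(decide(0.45, 0.55, 0.6)) // BLOCK+LEARN
}
```

With this reading, a higher TRIGGER makes zardoz learn from more calls, because fewer predictions are confident enough to be taken strictly.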

Personally, I've got good results with the trigger at 0.6: it doesn't disturb users much, and at the same time it has filtered out tons of malicious scans.

SENIORITY: since Zardoz learns what is good for your web server, it takes time to gain seniority. Starting Zardoz empty and leaving it to decide right away will generate some terrible behavior, because of false positives and false negatives. Plus, at the beginning Zardoz is supposed to ALWAYS learn.

The parameter "SENIORITY" is therefore the number of requests Zardoz will handle in "PASS+LEARN" before the filtering starts. During this time, it learns from real traffic. It blocks no traffic until "seniority" is reached. If you set it to 1025, it will learn from 1025 requests and then start to actually filter requests. The right number depends on many factors: if you serve a lot of pages and a lot of content, I suggest increasing it.
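The warm-up period can be pictured as a request counter placed in front of the filter. The `gate` type below is a hypothetical sketch of that idea, not Zardoz's implementation; it uses an atomic counter because a proxy serves requests concurrently.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// gate counts requests and tells whether the learning-only period is still on.
type gate struct {
	seen      int64 // requests observed so far
	seniority int64 // requests to handle as PASS+LEARN before filtering
}

// warmingUp counts the current request and reports whether it still belongs
// to the initial PASS+LEARN period.
func (g *gate) warmingUp() bool {
	return atomic.AddInt64(&g.seen, 1) <= g.seniority
}

func main() {
	g := &gate{seniority: 1025}
	learned := 0
	for i := 0; i < 2000; i++ {
		if g.warmingUp() {
			learned++ // PASS+LEARN: forward the call and learn from the reply
		}
	}
	fmt.Println("requests used for warm-up:", learned) // prints 1025
}
```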


DUMPFILE: this is where you want the dumpfile to be saved. Useful with Docker volumes.


The number of collected tokens which is considered enough to do a good job. This depends on your service. It is useful for limiting memory usage if your server has very complex content, for example.


If DEBUG is set to "false" or not set, every minute Zardoz will dump the sparse matrix describing the whole Bayesian learning into a file named bayes.json. This contains the weighted matrix of calls and classes. If Zardoz is not behaving as you expected, you may take a look at this file. The format is a classic sparse matrix. WARNING: this file may contain cookies or other sensitive headers.
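As an illustration of such a dump, the sketch below writes a class/token/weight map as JSON. The `sparseMatrix` layout is an assumption (the real file format is not specified here), and the helper names are hypothetical. Because the dump may contain cookies, the example writes the file readable by its owner only.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io/ioutil"
	"os"
	"path/filepath"
)

// sparseMatrix maps class -> token -> weight; tokens never seen are simply absent.
type sparseMatrix map[string]map[string]float64

// encode serializes the matrix as compact JSON.
func encode(m sparseMatrix) ([]byte, error) {
	return json.Marshal(m)
}

// dump writes the matrix to path via a temporary file and a rename, so a
// reader never sees a half-written dump. Mode 0600 keeps other users out.
func dump(m sparseMatrix, path string) error {
	data, err := encode(m)
	if err != nil {
		return err
	}
	tmp := path + ".tmp"
	// ioutil is used instead of os.WriteFile to stay compatible with Go 1.12.
	if err := ioutil.WriteFile(tmp, data, 0600); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

func main() {
	dir, err := ioutil.TempDir("", "zardoz-example")
	if err != nil {
		fmt.Println("cannot create temp dir:", err)
		return
	}
	defer os.RemoveAll(dir)

	m := sparseMatrix{"BAD": {"user-agent:sqlmap/1.0": 3}}
	path := filepath.Join(dir, "bayes.json")
	if err := dump(m, path); err != nil {
		fmt.Println("dump failed:", err)
		return
	}
	data, _ := ioutil.ReadFile(path)
	fmt.Println(string(data))
}
```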

DEBUG: if set to "true", Zardoz will create a folder "logs" and log what happens, together with a dump of the sparse matrix. If set to "false" or not set, the sparse matrix will still be available on disk for post-mortem.


Credits for the Bayesian Implementation to Jake Brukhman : https://github.com/jbrukh/bayesian


  • Loading Bayesian data from file.
  • Better Logging
  • Configurable block message.
  • Usage Statistics/Metrics sent to influxDB/prometheus/whatever