What is netGA?

netGA is a Genetic Algorithm for generating network intrusion detection rules from training data. Currently, the training data comes from the data sets published by the MIT Lincoln Laboratory. The goal is to take audit data and use a Genetic Algorithm to generate rulesets that can identify attacks. This project is based upon the techniques described by Ren Hui Gong et al. in their paper titled "A Software Implementation of a Genetic Algorithm Based Approach to Network Intrusion Detection". Jim Hoagland's Statistical Packet Anomaly Detection Engine (SPADE) project also inspired me to do this project.



I completed my Master of Science in Computer Science at Sac State. Check out my Master's Project Report.

There are two parts to netGA. The first part is the net_ga executable, which generates the rules with the Genetic Algorithm using the DARPA audit data as training input. The second part is the netGA plug-in for nProbe. The plug-in loads the rules from a configuration file and prints matches to stdout (this is intended for testing and is not ready for production).

I got the plug-in for nProbe working to test the rules against the tcpdump capture data. The capture is included in the DARPA training data under the file name sample_data01.tcpdump.gz. Luca Deri assisted me with this integration, for which I am very grateful. You can test the plug-in using tcpreplay and the dummy0 interface in GNU/Linux. See my MS Project report for details.

Source Code ChangeLogs.

netga genetic algorithms changelog
nProbe GA plugin changelog

Download the code and try it out. The Genetic Algorithm code doesn't use autoconf and has one big Makefile, so it's a little rough! To try the nProbe plug-in, clone the code and use the usual compile options for building nProbe. The plug-in is enabled by default. The netGA code is licensed under GPLv2.


You need GLib to build the netGA executable. On GNU/Linux, you can install it with the libglib2.0-dev package (at least on Ubuntu and Debian). You need a version that supports g_slice_alloc, which was added in GLib 2.10. Then, just type make.


Who am I?

I am in the Master of Science program in Computer Science at Sac State. This is my Master's Project.


Advisor: Dr. Gordon
Secondary: Dr. Ghansah


Ruleset Format

A "-1" in a rule is a wildcard value for that field. The layout below is only meant to give an idea of the format; I haven't settled on a standard way of outputting the rules. Each element ("gene") uses a 32-bit integer. For the Duration, Source IP, and Dest IP fields, the gene is broken into four sub-genes of 1 byte each.
Duration | Service | Source Port | Dest Port | Source IP | Dest IP | Attack Type
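
As a rough illustration of the format above, here is a minimal C sketch of how a rule with wildcards could be compared against a connection record. It is not taken from the netGA source: the names rule_t, conn_t, NETGA_WILDCARD, pack_gene, sub_gene, and rule_matches are all hypothetical, and the byte order of the sub-genes (most significant byte first) is an assumption. Only the "-1 is a wildcard", "32-bit gene", and "four 1-byte sub-genes" parts come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* From the text: a gene value of -1 matches any value for that field. */
#define NETGA_WILDCARD (-1)

/* Hypothetical gene order, following the column list above. */
enum { G_DURATION, G_SERVICE, G_SRC_PORT, G_DST_PORT,
       G_SRC_IP, G_DST_IP, G_ATTACK, G_COUNT };

/* A rule holds one 32-bit gene per column, including the attack type. */
typedef struct { int32_t gene[G_COUNT]; } rule_t;

/* A connection record holds every column except the attack type. */
typedef struct { int32_t field[G_ATTACK]; } conn_t;

/* Pack four 1-byte sub-genes into one gene (assumed order: a is the most
 * significant byte). Values above INT32_MAX rely on the usual wrapping
 * conversion. Note that 255.255.255.255 packs to -1 and would therefore
 * be indistinguishable from the wildcard in this sketch. */
static int32_t pack_gene(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    uint32_t v = ((uint32_t)a << 24) | ((uint32_t)b << 16) |
                 ((uint32_t)c << 8)  |  (uint32_t)d;
    return (int32_t)v;
}

/* Extract sub-gene i (0..3, 0 = most significant byte) from a gene. */
static uint8_t sub_gene(int32_t gene, int i)
{
    assert(i >= 0 && i < 4);
    return (uint8_t)(((uint32_t)gene >> (8 * (3 - i))) & 0xFFu);
}

/* A rule matches when every non-wildcard gene equals the record's field.
 * The attack type gene is the rule's verdict, so it is not compared. */
static bool rule_matches(const rule_t *r, const conn_t *c)
{
    for (int i = 0; i < G_ATTACK; i++) {
        if (r->gene[i] != NETGA_WILDCARD && r->gene[i] != c->field[i])
            return false;
    }
    return true;
}
```

With these definitions, a rule such as { -1, 80, -1, 80, -1, pack_gene(10, 0, 0, 5), 3 } would match any connection whose Service and Dest Port fields are 80 and whose Dest IP is 10.0.0.5, regardless of duration, source port, or source IP. The service code 80 and the attack type 3 are made-up values for the example. Whether netGA also allows wildcards on individual sub-genes is not stated above, so this sketch only treats a whole gene as a wildcard.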