Extend the framework

How to add a new dataset type

If you need to use a dataset of a type that is not currently supported, you can add your own type to the framework. First, add a new entry to the DB. This can be done while the framework is running or by adding the necessary SQL statements to ./database/bicep.sql. Each entry should look like this:

INSERT INTO dataset_type (name, description, function_prefix) VALUES ('Network Analysis Data', 'Network traffic data in form of PCAPs. The labels are in CSV file format', 'network_traffic_data');

The function_prefix determines both the name of your Python file and the prefix of the functions you will need to implement.
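As a quick illustration of this naming rule, the sketch below derives the expected file name and function names from a given function_prefix. The helper expected_names is hypothetical and not part of BICEP. The two suffixes are taken from the naming convention for dataset types described in this section.

```python
# Hypothetical helper, not part of BICEP: shows how a function_prefix
# maps to the file name and function names the framework looks for.
FUNCTION_SUFFIXES = (
    "get_benign_and_malicious_counts_of_labels_file",
    "get_positives_and_negatives_from_dataset",
)


def expected_names(function_prefix: str) -> dict:
    """Return the file name and function names implied by a function_prefix."""
    return {
        "file": f"{function_prefix}.py",
        "functions": [f"{function_prefix}_{suffix}" for suffix in FUNCTION_SUFFIXES],
    }


# Using the function_prefix from the example INSERT statement.
names = expected_names("network_traffic_data")
print(names["file"])
for function_name in names["functions"]:
    print(function_name)
```

For the example entry, the file would therefore be network_traffic_data.py, and both function names would start with network_traffic_data_.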

BICEP automatically looks up the dataset types listed in the DB. The code that handles the new type needs to be located in ./backend/core/app/models/dataset_types_implementation/<your_dataset_type_name>.py, where <your_dataset_type_name> is the function_prefix from the DB entry. You then need to implement two functions whose names must adhere to the following naming convention:

<your_dataset_type_name>_get_benign_and_malicious_counts_of_labels_file

<your_dataset_type_name>_get_positives_and_negatives_from_dataset

This ensures that the system can find these. The documentation for these methods can be found in Dataset Types Implementation.

Afterwards, feel free to create a pull request to contribute to the framework.
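To make the naming convention concrete, here is a minimal sketch of what such a module might look like for a hypothetical function_prefix of my_csv_data. Only the two function names follow from the convention above. The parameters, return values, the CSV layout (an "id" and a "label" column), and the idea that the second function compares alerted record ids against the labels are all assumptions for illustration. The authoritative signatures are the ones documented in Dataset Types Implementation.

```python
import csv
import io

# Sketch of a hypothetical my_csv_data.py module.
# ASSUMPTION: the labels file is a CSV with "id" and "label" columns, and
# every label other than "benign" counts as malicious. The parameters and
# return types below are illustrative, not BICEP's documented signatures.


def my_csv_data_get_benign_and_malicious_counts_of_labels_file(labels_file):
    """Count benign and malicious rows in an open labels file."""
    benign = 0
    malicious = 0
    for row in csv.DictReader(labels_file):
        if row["label"].strip().lower() == "benign":
            benign += 1
        else:
            malicious += 1
    return benign, malicious


def my_csv_data_get_positives_and_negatives_from_dataset(labels_file, alerted_ids):
    """Compare the ids an IDS alerted on against the labels (assumed semantics)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for row in csv.DictReader(labels_file):
        is_malicious = row["label"].strip().lower() != "benign"
        was_alerted = row["id"] in alerted_ids
        if is_malicious and was_alerted:
            counts["TP"] += 1
        elif is_malicious:
            counts["FN"] += 1
        elif was_alerted:
            counts["FP"] += 1
        else:
            counts["TN"] += 1
    return counts


# Small in-memory labels file used only for this example.
SAMPLE_LABELS = "id,label\n1,benign\n2,malicious\n3,benign\n4,malicious\n"

label_counts = my_csv_data_get_benign_and_malicious_counts_of_labels_file(
    io.StringIO(SAMPLE_LABELS)
)
confusion = my_csv_data_get_positives_and_negatives_from_dataset(
    io.StringIO(SAMPLE_LABELS), alerted_ids={"2", "3"}
)
print(label_counts)
print(confusion)
```

The sample data at the bottom is only there to make the sketch runnable on its own. A real module would contain just the two functions.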

How to add a new ensembling technique

Currently, only the simple majority vote algorithm is supported for constructing an ensemble. If you need a more sophisticated technique, you will need to implement it yourself. To do so, first add a new entry to the DB. This can be done while the framework is running or by adding the necessary SQL statements to ./database/bicep.sql. Each entry should look like this:

INSERT INTO ensemble_technique (name, description, function_name) VALUES ('Majority Vote', 'A simple majority vote approach where all IDS in the ensemble have the same weight', 'majority_vote');

Here, name is the display name and function_name is the name of both your Python file and your function. BICEP automatically looks up the ensembling techniques listed in the DB. The code that handles the new technique needs to be located in ./backend/core/app/models/ensemble_techniques_implementation/<your_ensemble_technique_name>.py. You then need to implement two functions whose names must adhere to the following naming convention:

<your_dataset_type_name>_get_benign_and_malicious_counts_of_labels_file

<your_dataset_type_name>_get_positives_and_negatives_from_dataset

This ensures that the system can find these. The documentation for these methods can be found in Ensembling Technique Implementation.

Afterwards, feel free to create a pull request to contribute to the framework.
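For orientation, the following is a minimal sketch of an equal-weight majority vote, the technique named in the example DB entry. It is not BICEP's actual implementation. The function name majority_vote mirrors the function_name of that entry, while the input format (one list of boolean verdicts per IDS, True meaning malicious) and the tie rule (a tie counts as benign) are assumptions made for illustration.

```python
# Sketch of an equal-weight majority vote, not BICEP's actual code.
# ASSUMPTION: every IDS supplies one boolean verdict per record
# (True = malicious), and all verdict lists have the same length.


def majority_vote(verdicts_per_ids):
    """Return one ensemble verdict per record.

    A record is flagged as malicious only if strictly more than half of
    the IDS flagged it, so a tie is treated as benign.
    """
    if not verdicts_per_ids:
        return []
    ids_count = len(verdicts_per_ids)
    ensemble_verdicts = []
    # zip(*...) groups the verdicts of all IDS for the same record.
    for votes in zip(*verdicts_per_ids):
        ensemble_verdicts.append(sum(votes) * 2 > ids_count)
    return ensemble_verdicts


# Three hypothetical IDS, four records each.
ids_a = [True, False, True, False]
ids_b = [True, True, False, False]
ids_c = [False, False, True, False]

print(majority_vote([ids_a, ids_b, ids_c]))
```

A more sophisticated technique, for example one that weights each IDS differently, could follow the same shape and replace only the counting rule inside the loop.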