Dataset of Publication "Malware Communication in Smart Factories: A Network Traffic Data Set"

Machine learning-based intrusion detection requires suitable and realistic data sets for training and testing. However, data sets that originate from real networks are rare: network data is considered privacy sensitive, and the purposeful introduction of malicious traffic is usually not possible. In this paper we introduce a labeled data set captured at a smart factory located in Vienna, Austria, during normal operation and during penetration tests with different attack types. The data set contains 173 GB of PCAP files, which represent 16 days (395 hours) of factory operation. It includes MQTT, OPC UA, and Modbus/TCP traffic. The captured malicious traffic was generated by a professional penetration tester who performed two types of attacks: (a) aggressive attacks that are easier to detect and (b) stealthy attacks that are harder to detect. Our data set includes the raw PCAP files and extracted flow data. Labels for packets and flows indicate whether they originated from a specific attack or from benign communication. We describe the methodology for creating the data set, conduct an analysis of the data, and provide detailed information about the recorded traffic itself. The data set is freely available to support reproducible research and the comparability of results in the area of intrusion detection in industrial networks.

File description:

a_day1, a_day2, s_day1, s_day2, tf_a and tf_s: Main data set. Files starting with "tf" are training files containing only benign, operational data; all other files are attack files containing both operational data and attack data.

images.zip: Contains descriptive images about the data.

extractions.zip: Contains extracted packets and flows, in both labeled and unlabeled form.

a_day_tuesday_dos.zip: Additional day of attack traffic containing benign and attack data, including a DoS attack. This day is not labeled.
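
The naming convention in the file description above can be turned into a small helper for sorting downloads into training and attack captures. This is a minimal sketch, not part of the published data set: the function name `describe_capture` and the returned keys are hypothetical, and the only rules it encodes are the ones stated above (names starting with "tf" hold benign traffic only, all other captures also hold attack traffic, and the additional DoS day is not labeled).

```python
def describe_capture(filename: str) -> dict:
    """Classify a capture by the naming convention in the file description.

    Hypothetical helper: the keys below are illustrative, not an official schema.
    """
    # Drop any extension such as ".zip" so "tf_a" and "tf_a.zip" behave alike.
    stem = filename.split(".")[0]

    # Files starting with "tf" are training files with benign traffic only.
    is_training = stem.startswith("tf")

    return {
        "name": stem,
        "role": "training" if is_training else "attack",
        "contains_attack_traffic": not is_training,
        # The description states that the additional DoS day is not labeled.
        "labeled": stem != "a_day_tuesday_dos",
    }


if __name__ == "__main__":
    names = ["a_day1", "a_day2", "s_day1", "s_day2", "tf_a", "tf_s",
             "a_day_tuesday_dos.zip"]
    for name in names:
        print(describe_capture(name))
```

A training pipeline could, for example, fit an anomaly detector only on the captures whose role is "training" and evaluate on the labeled attack captures.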

Identifier
DOI https://doi.org/10.48436/vs6hv-1vs74
Related Identifier IsVersionOf https://doi.org/10.48436/b0k3b-77a55
Metadata Access https://researchdata.tuwien.ac.at/oai2d?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:researchdata.tuwien.ac.at:vs6hv-1vs74
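
The Metadata Access URL above is an OAI-PMH GetRecord request. As a rough sketch, it can be assembled from its parts with the Python standard library; the function name `build_metadata_url` is hypothetical, and the base URL, metadata prefix, and identifier are copied from the record above. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Values copied from the Metadata Access entry of this record.
OAI_BASE_URL = "https://researchdata.tuwien.ac.at/oai2d"
RECORD_ID = "oai:researchdata.tuwien.ac.at:vs6hv-1vs74"


def build_metadata_url(identifier: str = RECORD_ID,
                       metadata_prefix: str = "oai_datacite") -> str:
    """Build the OAI-PMH GetRecord URL for one record (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": identifier,
        },
        safe=":",  # keep the colons in the OAI identifier unescaped
    )
    return f"{OAI_BASE_URL}?{query}"


if __name__ == "__main__":
    print(build_metadata_url())
```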
Provenance
Creator Brenner, Bernhard; Fabini, Joachim; Offermanns, Magnus; Semper, Sabrina; Zseby, Tanja (ORCID: 0000-0002-5391-467X)
Publisher TU Wien
Publication Year 2024
Rights Creative Commons Attribution 4.0 International (CC BY 4.0); https://creativecommons.org/licenses/by/4.0/legalcode
OpenAccess true
Contact tudata(at)tuwien.ac.at
Representation
Language English
Resource Type Dataset
Version 1.0.0
Discipline Other