Goodware Dataset used in "Disarming Visualization-Based Approaches in Malware Detection Systems"

This is one of the datasets used in the experiments of the paper: Fascí, L. S., Fisichella, M., Lax, G., & Qian, C. (2023). Disarming visualization-based approaches in malware detection systems. Computers & Security, 126, 103062. https://www.sciencedirect.com/science/article/pii/S0167404822004540

It contains 2,000 goodware files. We collected .exe files both by web scraping on different platforms (DriverPack Solution, Filehippo, Major Geeks, Portable Freeware, Softonic) and by taking the executables of two virtual machines right after installing the 32-bit versions of Windows 8 and Windows 10, respectively. To ensure that the collected .exe files are not malware, we scanned each file with VirusTotal.

The password to open the archive is Ben1gN@D$!?

Excerpt from the paper: [...] consists of both malware and goodware. The malware samples are from the MalImg dataset (Nataraj, Karthikeyan, Jacob, & Manjunath, 2011). As for goodware, no datasets and no direct reliable sources of safe software were found. Therefore, we collected .exe files both by web scraping on different platforms (DriverPack Solution, Filehippo, Major Geeks, Portable Freeware, Softonic) and by using executables of two virtual machines right after installing the 32-bit version of Windows 8 and 10, respectively. To ensure that the collected .exe files are not malware, we scanned each file with VirusTotal software, as done by Pinhero et al. (2021). In total, we collected about 2000 samples, which are available at Repository (2022).
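
The description does not say how the files were submitted to VirusTotal or how the collection was inventoried. As a minimal sketch of one possible preparation step, the Python below walks a folder of collected files, keeps those that start with the "MZ" magic bytes of a Windows executable, and computes a SHA-256 digest for each, which is a common way to identify a file when looking it up in a scanning service. The function names (`is_pe_candidate`, `sha256_of`, `inventory`) and the folder layout are hypothetical and are not taken from the paper; no network request is made.

```python
import hashlib
from pathlib import Path


def is_pe_candidate(path):
    """Return True if the file starts with the 'MZ' magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == b"MZ"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(root):
    """Map each .exe file under root (relative path) to its SHA-256 digest.

    Files without the 'MZ' header are skipped. The '*.exe' pattern is
    case-sensitive on case-sensitive file systems.
    """
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*.exe")):
        if path.is_file() and is_pe_candidate(path):
            result[path.relative_to(root).as_posix()] = sha256_of(path)
    return result
```

Calling `inventory("goodware/")` on an extracted copy of the archive (the folder name is an assumption) would return a dictionary of relative paths and digests that can be stored next to the scan results.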

THIS DATASET IS ARCHIVED AT DANS/EASY, BUT NOT ACCESSIBLE HERE. TO VIEW A LIST OF FILES AND ACCESS THE FILES IN THIS DATASET, CLICK ON THE DOI LINK LISTED UNDER IDENTIFIER.

Identifier
DOI https://doi.org/10.17632/cv3v9szdn7.1
PID https://nbn-resolving.org/urn:nbn:nl:ui:13-gv-axls
Metadata Access https://easy.dans.knaw.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:easy.dans.knaw.nl:easy-dataset:339672
Provenance
Creator Lax, G
Publisher Data Archiving and Networked Services (DANS)
Contributor Gianluca Lax
Publication Year 2024
Rights info:eu-repo/semantics/openAccess; License: http://creativecommons.org/licenses/by/4.0
OpenAccess true
Representation
Resource Type Dataset
Discipline Other
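
The Metadata Access URL listed under Identifier follows the OAI-PMH `GetRecord` request pattern: a base endpoint plus the `verb`, `metadataPrefix`, and `identifier` parameters. The sketch below only rebuilds and decomposes that URL with the Python standard library; the helper names (`build_get_record_url`, `describe_request`) are hypothetical, and nothing here checks whether the endpoint still answers requests.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base endpoint taken from the Metadata Access URL in this record.
BASE_URL = "https://easy.dans.knaw.nl/oai"


def build_get_record_url(identifier, metadata_prefix="oai_datacite",
                         base_url=BASE_URL):
    """Assemble an OAI-PMH GetRecord URL for one record identifier."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": identifier,
        },
        safe=":",  # keep the colons of the OAI identifier unescaped
    )
    return f"{base_url}?{query}"


def describe_request(url):
    """Split an OAI-PMH URL into its endpoint and single-valued parameters."""
    parts = urlsplit(url)
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return endpoint, params
```

For this record, `build_get_record_url("oai:easy.dans.knaw.nl:easy-dataset:339672")` reproduces the Metadata Access URL character for character.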