Currently, no dataset is available that contains sounds specific to manufacturing and/or assembly environments. Within the Master's thesis "Sound is Context: Acoustic Work Step Classification using Deep Learning", a suitable dataset was created and made publicly available for research. The dataset contains a selection of typical production activities that can be classified based solely on their characteristic sounds. The recordings were performed in the Pilot Factory of the Vienna University of Technology.
The chosen production activities/sounds are:
Grabbing screws from a box
Sanding
Filing
Hammer
Cordless screwdriver
Press drill
Bench grinder
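For classification experiments, the seven activity classes above can be mapped to integer labels. A minimal sketch in Python; the snake_case class names are assumptions derived from the list above, and the actual label names in the dataset may differ:

```python
# Hypothetical label names for the seven activity classes listed above;
# the actual names used in the dataset may differ.
CLASSES = [
    "grabbing_screws",
    "sanding",
    "filing",
    "hammer",
    "cordless_screwdriver",
    "press_drill",
    "bench_grinder",
]

# Map each class name to a stable integer index for model training.
LABEL_TO_INDEX = {name: i for i, name in enumerate(CLASSES)}

def encode(label: str) -> int:
    """Return the integer index for a class label."""
    return LABEL_TO_INDEX[label]
```

A fixed, documented ordering like this keeps label indices consistent across training and evaluation runs.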
The recording devices were the built-in microphone of an iPhone 13 Mini smartphone and the Apple AUX headset microphone connected to an ASUS ZenBook 14 UM433I laptop. On the smartphone, the standard Voice Memos app from Apple and the Voice Recorder & Memos Pro app from Linfei Ltd. were used; on the laptop, the standard Windows Voice Recorder was used. The recording positions were: 80 cm away from the noise source, close (0-30 cm), at the chest pocket, on a shelf above, reverse (the microphone facing away or with an obstacle in between), or mixed (a variety of the aforementioned positions). To vary the produced sounds, the tempo, intensity, movement pattern, rhythm, hand used, etc. were changed regularly. Furthermore, due to ongoing work operations in the Pilot Factory, a wide variety of background noises was also captured, which further increases the variance.
The recordings are provided raw, without any pre-processing. Please refer to the thesis for information about the pre-processing and the classification model.
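Since the recordings are distributed raw, any processing pipeline starts by loading the audio files. A minimal sketch using only Python's standard library, assuming 16-bit mono WAV files (the dataset's actual file format and naming are not specified here; a short synthetic tone stands in for a real recording):

```python
import math
import struct
import wave

def write_demo_wav(path: str, seconds: float = 0.1, rate: int = 44100) -> None:
    """Write a short 440 Hz sine tone as a 16-bit mono WAV.

    This is only a stand-in for a real recording from the dataset.
    """
    n = int(seconds * rate)
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * 440 * t / rate)))
        for t in range(n)
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(rate)
        w.writeframes(frames)

def load_wav(path: str):
    """Load a 16-bit mono WAV file into a list of int16 samples plus its sample rate."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    samples = list(struct.unpack("<" + "h" * (len(raw) // 2), raw))
    return samples, rate
```

In practice, libraries such as librosa or torchaudio would typically replace `load_wav` and also handle resampling; the stdlib version is shown only to keep the sketch dependency-free.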