A Primer: Predictive Maintenance Using Sound

Mar 9, 2020

Hidden in plain sight, motors are an integral part of human civilization. Motors can be found in your laptop, your car, your house, your school, and your watch. Without motors, your beloved products could not be made. According to the U.S. Department of Energy, industrial motor use accounts for 25 percent of all electricity usage nationwide.

Like motors, sound is everywhere, and just as many motors go unnoticed, some sounds cannot be heard by the human ear. With sensors, these sounds can be analyzed to detect early signs of failure, so that faults are fixed before they turn into major breakdowns.

Traditional methods of predictive maintenance fall into three categories: 1) recognizing abnormal sounds with conventional hardware tools such as an ultrasonic microphone before machines start to fail; 2) detecting discharge or leakage with infrared cameras; 3) using sensors that recognize changes in pressure, temperature, or vibration. Recent advances in Artificial Intelligence (AI) have changed how predictive maintenance can be optimized.

In this blog, we will design a deep learning architecture, a Convolutional Recurrent Neural Network (CRNN), to deal with high-frequency data that has been converted into spectrograms. A CRNN model learns from both the spatial and the recurrent (temporal) structure of our data simultaneously.

Collect Data using Sensors

With thousands and thousands of high-frequency sound clips, machine learning can be used to predict when a motor will break down. A CRNN model is used in conjunction with signal processing techniques to extract value from this high-frequency data. We first collect data using sensors and label each recording with one of four classes: optimal pressure, slightly reduced pressure, severely reduced pressure, and close to total failure.
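As a minimal sketch of how the four classes could be turned into training targets, the snippet below maps class names to one-hot vectors, which is the label format a softmax classifier trained with categorical cross-entropy expects. The `CLASSES` list, its ordering, and the `one_hot_labels` helper are assumptions made for illustration; they are not taken from the original pipeline.

```python
import numpy as np

# The four states named in the text; the integer order is an assumption.
CLASSES = [
    "optimal pressure",
    "slightly reduced pressure",
    "severely reduced pressure",
    "close to total failure",
]


def one_hot_labels(names):
    """Map a list of class names to a (n_samples, n_classes) one-hot array."""
    index = {name: i for i, name in enumerate(CLASSES)}
    ids = np.array([index[name] for name in names])
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(len(CLASSES), dtype=np.float32)[ids]


labels = one_hot_labels(["optimal pressure", "close to total failure"])
print(labels.shape)  # (2, 4)
```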

Process Data using Spectrograms

We handle the raw data by splitting it into labeled sequences of fixed, equal length. We then use spectrograms to preprocess these signal pieces and feed them into the CRNN model. Spectrograms are time-frequency portraits of signals: a plot of how the intensity of each frequency in the signal changes as time progresses. In other words, we are building an AI-powered time-series model that can map the behavior of the data in the high-frequency domain. To standardize our sequences, we need to reduce the magnitude of each series; we do this with a simple differencing step (subtracting each value from the next) followed by a clip that limits extreme variations.
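The splitting and standardization steps can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the original preprocessing code: the standardization is read as a first-order difference followed by clipping, and the window size, the clip threshold, and the helper names `split_into_windows` and `standardize` are all hypothetical.

```python
import numpy as np


def split_into_windows(signal, window_size):
    """Cut a 1-D signal into non-overlapping windows of equal length.

    Trailing samples that do not fill a whole window are dropped.
    """
    n_windows = len(signal) // window_size
    return signal[: n_windows * window_size].reshape(n_windows, window_size)


def standardize(windows, clip_value=1.0):
    """First-order difference along time, then clip extreme jumps."""
    diffed = np.diff(windows, axis=-1)  # each window loses one sample
    return np.clip(diffed, -clip_value, clip_value)


# Synthetic random walk standing in for one raw pressure signal.
rng = np.random.default_rng(0)
raw = np.cumsum(rng.normal(size=1050))

windows = split_into_windows(raw, window_size=100)
standardized = standardize(windows, clip_value=1.0)
print(windows.shape, standardized.shape)  # (10, 100) (10, 99)
```

In practice the clip threshold would be chosen by looking at the distribution of differences in the real data rather than fixed at 1.0.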

Below is a sample graph of a raw and a standardized sequence composed of 6 different pressure signals:

To compute spectrograms in Python, we use the Librosa library. The transformation is applied to every pressure series at our disposal, so we end up with a sequence of 6 spectrograms for every sample instead of a sequence of raw signals.

The AI CRNN Model

CNNs (Convolutional Neural Networks) and RNNs (Recurrent Neural Networks) are not mutually exclusive: both can classify image and text inputs, which creates an opportunity to combine the two network types for increased effectiveness. Where a CNN alone struggles with visually complex inputs that also have temporal characteristics, an RNN can step in to model the temporal part.

Below is the architecture of a CRNN:

The combination of a CNN and an RNN is sometimes called a CRNN. Inputs are first processed by CNN layers, whose outputs are then fed to RNN layers. Optical Character Recognition and audio classification often use this type of hybrid model.

In this exercise, we feed the CRNN with the spectrograms we generated previously to detect the working status of a motor. Each observation of the motor is now composed of stacked spectrograms (6 in total, one for each pressure signal). The model is defined in Keras as follows:

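# Imports assume the Keras API bundled with TensorFlow
from tensorflow.keras.layers import (Activation, BatchNormalization, Bidirectional,
                                     Conv2D, Dense, Dropout, GRU, Input,
                                     MaxPooling2D, Permute, Reshape)
from tensorflow.keras.models import Model


def get_model(data, n_classes):
    # data: (samples, frequency, time, n_features)
    # n_classes replaces the original reference to the global y_train.shape[1]
    inp = Input(shape=(data.shape[1], data.shape[2], data.shape[3]))

    # Convolutional block: extract local time-frequency features
    x = Conv2D(filters=64, kernel_size=(2, 2), padding='same')(inp)
    x = BatchNormalization(axis=1)(x)
    x = Activation('relu')(x)
    x = MaxPooling2D(pool_size=(2, 1))(x)  # halve frequency, keep time
    x = Dropout(0.2)(x)

    # Move time to the front, then flatten features and frequency together
    x = Permute((2, 3, 1))(x)
    x = Reshape((data.shape[2], -1))(x)

    # Recurrent block: summarize the sequence of time steps
    x = Bidirectional(GRU(64, activation='relu',
                          return_sequences=False))(x)
    x = Dense(32, activation='relu')(x)
    x = Dropout(0.2)(x)
    out = Dense(n_classes, activation='softmax')(x)

    model = Model(inputs=inp, outputs=out)
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam', metrics=['accuracy'])

    return model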

In the first stage, the network extracts convolutional features from the spectrograms, which have the structure frequency x time x n_features. To pass them through the recurrent part, we need to reshape the data into the format time x n_features. The new n_features is the result of flattening the convolutional n_features and frequency dimensions together. With this setup, our model achieved around 86% accuracy.
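The reshape between the convolutional and recurrent parts can be traced with plain NumPy. The sketch below mirrors what the `Permute((2, 3, 1))` and `Reshape` layers do to the array shapes; the sizes are illustrative assumptions, not the dimensions used in the original experiment.

```python
import numpy as np

# Illustrative sizes: frequency bins after pooling, time steps, conv filters.
batch, n_freq, n_time, n_feats = 2, 16, 33, 64
conv_out = np.zeros((batch, n_freq, n_time, n_feats))

# Equivalent of Permute((2, 3, 1)): time first, frequency last.
# Keras counts axes from 1 and skips the batch axis, NumPy counts from 0.
permuted = conv_out.transpose(0, 2, 3, 1)  # (batch, time, feats, freq)

# Equivalent of Reshape((time, -1)): flatten feats and freq into one axis.
sequence = permuted.reshape(batch, n_time, n_feats * n_freq)
print(sequence.shape)  # (2, 33, 1024)
```

Each of the 33 time steps now carries a single 1024-dimensional vector, which is the input format the GRU expects.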

Summary

In this blog, we learned about traditional methods of predictive maintenance and touched briefly on AI applications in this field. As AI advances, we expect reliance on hardware sensors to decrease.

Want to learn more about predictive maintenance? Be sure to send a message to Growthbotics!

Written by Growthbotics