4. Methodology

4.1 Architecture

Artificial Neural Networks are organized in layers, with neurons in each layer; the connection patterns and information flow can vary (6). More specifically, there is an input layer, a hidden layer and an output layer, and a network can contain multiple hidden layers.

4.2 Algorithms

4.2.1 Decision Tree (DT) Regression

Decision Tree is a non-parametric method that can be applied to both regression and classification tasks. It builds a hierarchical tree structure to construct a model for predicting a target variable (4). An ensemble of boosted DT learners can also be used as base-level predictors (7).

4.2.2 Backpropagation

This supervised learning algorithm is based on the backward propagation of the error. Backpropagation is a generalization of the delta rule to multi-layered feedforward networks, made possible by the chain rule, which allows gradients to be computed iteratively for each layer (6). This is one of the reasons ANNs are widely regarded as among the most successful machine learning techniques: they provide a flexible mathematical structure for identifying relationships between input and output data, especially in fields that lack categorical and definitive data, as is the case with natural disasters.

4.2.3 K-means Clustering

K-means is a fast, robust and simple unsupervised learning algorithm well suited to the clustering problem. In this case it was therefore used to cluster locations with similar chances of a disaster striking. Since the K-means problem is usually solved with Lloyd's algorithm, it is most directly applied in the Euclidean plane, although it can be applied to higher-dimensional spaces.

4.2.4 Random Forest

Random Forest is an ensemble learning method for classification, regression and other tasks, in which the model averages over a set of decision trees. In earthquake prediction this technique has been applied to continuous acoustic time series data recorded from the fault (3).
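A minimal sketch of this kind of regression setup, using scikit-learn on synthetic stand-in data (the feature and target definitions below are illustrative placeholders, not the laboratory acoustic data of (3)):

```python
# Sketch: Random Forest regression of "time to failure" from summary
# features of an acoustic signal. All data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Stand-in features (in the real setting: statistics of acoustic windows).
X = rng.normal(size=(500, 4))
# Stand-in target: noisy nonlinear function playing the role of
# "time remaining before the next failure".
y = 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.1 * rng.normal(size=500)

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Accuracy quantified with the coefficient of determination (R^2),
# as in the cited study.
print(round(r2_score(y_test, model.predict(X_test)), 2))
```

In the actual study, each feature would summarize a window of the continuous acoustic record and the target would be the measured time to the next laboratory failure.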
Every DT predicts the time remaining before the next failure using a sequence of decisions based on the features. Although the data came from laboratory earthquakes, the accuracy of the model is quantified using R², the coefficient of determination. Furthermore, the ML analysis provided new insight into slip physics: the acoustic emissions that occur long before failure had previously been assumed to be noise and were hence neglected.

4.2.5 Support Vector Machines and Support Vector Regression

Support Vector Machines construct a maximum-margin hyperplane that separates classes, and Support Vector Regression extends the same principle to continuous targets.

4.3 Data Mining Techniques

The world's continuous immersion in the latest social media sites and the hottest apps creates an abundance of data that is hard to sift through, even with more computing power in the ML arsenal than ever before. Neural Networks, Support Vector Machines and Decision Trees have all been used extensively as data mining models. The popular microblogging service Twitter has become a valuable source for real-time event detection and for understanding the needs of affected people (2). This is because information can easily be retrieved from news accounts via APIs, feeds or even web scraping. Furthermore, each user can be viewed as a sensor, and both Kalman and Particle Filtering can be applied. These are common techniques in location-based computing and can produce anything from typhoon trajectory predictions to estimates of earthquake epicenters (9).

4.4 Deep Hybrid Model

Technology can be blindsided by the laws of nature: phenomena such as isolated thunderstorms and jet streams are extremely volatile and powerful. The hybrid model has three main components. The first is a set of individual predictors trained on historical data; the machine learning algorithms mentioned in the previous section can be applied to the data in order to build these predictors.
Second, the inferences that emerge from the individual predictors can be refined by constraining their outputs. Last, a deep belief network produces a solution based on these variables (7).

4.5 Convolutional Neural Network

Convolutional Neural Networks are very similar to typical neural networks, except that they make the explicit assumption that the inputs are images, which makes it possible to encode certain properties into the architecture. This recent advancement in the field of AI is what makes a highly scalable CNN for earthquake detection from a single waveform possible. The algorithm is also orders of magnitude faster than existing methods.

Earthquake detection is treated as a supervised classification problem, while the risk profiles of countries were created with the K-means clustering algorithm. The dataset can then be split into a training set and a test set. However, deep classifiers such as this one have many trainable parameters and require plentiful examples of each class in order to avoid overfitting and to generalize correctly in real-world scenarios. The network then creates a probabilistic map of the earthquake location by computing a probability distribution over the classes (13).

5. Results

In this section, the outcomes of the methodology outlined in this research are discussed. For a prediction algorithm to be useful in an operational environment, it needs to provide skillful predictions consistently. However, using social media data to predict natural disasters can give unreliable results, because multiple earthquakes may take place within the same time frame, with millions of users tweeting about them. Nevertheless, the experiments conducted by (9) yield a 96% probability of detecting an earthquake of seismic intensity scale 3 or higher.

For natural disasters, architectures that can automatically learn representations from raw data should be considered.
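As a concrete illustration of such representation-learning architectures, stacked Restricted Boltzmann Machines can be trained greedily, layer by layer. The sketch below uses scikit-learn's BernoulliRBM with illustrative data and layer sizes, not the configuration of the cited work:

```python
# Sketch: greedy layer-wise training of stacked RBMs (the building
# blocks of a Deep Belief Network). Data and sizes are illustrative.
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.default_rng(1)
X = (rng.random((200, 32)) > 0.5).astype(float)  # binary "raw" inputs

# Each RBM learns features of the layer below it.
rbm1 = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=10,
                    random_state=0)
h1 = rbm1.fit_transform(X)
rbm2 = BernoulliRBM(n_components=8, learning_rate=0.05, n_iter=10,
                    random_state=0)
h2 = rbm2.fit_transform(h1)

print(h2.shape)  # learned 8-dimensional representation per example
```

In a full DBN the stacked layers would afterwards be fine-tuned jointly, for example with backpropagation, as described in Section 4.2.2.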
Therefore, the Deep Belief Network (DBN), consisting of stacked Restricted Boltzmann Machines, is ideal. The deep belief network reduces the error by a further 1-2%, and when its accuracy was compared with the baseline model, the hybrid clearly outperformed the baseline (7). When evaluating the results for the Convolutional Neural Network, the probability distribution prediction is made with 99% confidence. The method's only real limitation is that it requires a colossal amount of historical earthquake data. One way to curtail the computation is to choose a small set of short representative waveforms as templates, which can then be correlated with the full-length continuous time series, as suggested by (13).
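The template-matching idea can be sketched as a normalized cross-correlation in NumPy; the waveform, noise level and event position below are synthetic illustrations, not data from (13):

```python
# Sketch: cross-correlate a short representative waveform (template)
# against a continuous record; peaks flag candidate event windows.
import numpy as np

rng = np.random.default_rng(2)
template = np.sin(np.linspace(0, 4 * np.pi, 50))  # short template waveform
record = rng.normal(scale=0.3, size=1000)         # continuous noisy record
record[600:650] += template                       # bury one event at t=600

# Normalized cross-correlation of the template with every sliding window.
corr = np.correlate(record, template, mode="valid")
corr /= np.linalg.norm(template) * np.sqrt(
    np.convolve(record ** 2, np.ones(len(template)), mode="valid")
)

print(int(np.argmax(corr)))  # index of the best-matching window (near 600)
```

Because only one short template is correlated against the record, the cost grows linearly with the record length rather than with the size of the full training corpus.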