CTSM: Contextual Text Summarization Using Neural Networks

Arjun Sridharkumar — [email protected]
Harshad Kulkarni — [email protected]
Anish Philip — [email protected]
Kirtana Nair — [email protected]
Hari Chavan — [email protected]

Department of Information Technology
Fr. C. Rodrigues Institute of Technology, Vashi, Navi Mumbai

Abstract – In today's information era, a great deal of information is relayed through email for faster, better, and more secure communication. The result is a copious volume of email that can hinder the understanding of information, which therefore needs to be relayed to the user concisely. A neural network can be used to retrieve the pertinent information by performing summarisation, shortening the data to its relevant content. However, merely summarising the text won't capture the writer's intent. To do so, the neural network's context-based understanding improves the precision of the results, as the network is trained on past email correspondence between the two entities.

Keywords – Summarization, Neural Networks, Sentence Similarity, Semantic Similarity, Classification, Clustering, Word2Vec.

Introduction

Information has become the new currency, so it is pertinent that we derive proper meaning from it. Text summarization deals primarily with deriving meaning from large amounts of data. Abstractive and extractive summarization are the two main approaches: the former generates an abstract that represents the whole text, while the latter selects sentences that encapsulate the essence of the input text. The CTSM employs the extractive technique, using sentence similarity as the measure for selecting sentences from the text. The CTSM consists of a single-hidden-layer neural network, as shown in Fig ().
It calculates the sentence similarity between each sentence of the input text and all sentences in the last 20 mails; the sentences with the highest similarity are included in the summarized output.

Motivation

The information age has presented us with the catastrophic problem of deriving meaning from large amounts of information. The amount of information produced is increasing exponentially and could soon result in an information explosion. Such an explosion yields no insight; instead it strains storage and processing capacity. Summarization offers a solution: it removes unnecessary information by decreasing the amount of text without loss of meaning. Email has become a primary mode of communication, and long mails take considerable time to read. The CTSM reduces the content without changing its meaning, saving the end user's time. Storing long mails is also expensive, and a summarized version reduces that cost. Further, spam-filtering techniques can be improved when applied to a summarized version of the email. The proposed summarizer thus improves system performance by reducing both time and space consumption.

TECHNOLOGIES

A. Working
1) A training dataset of news headlines and articles is used to train the neural network.
2) To relate to the context, different methods are used; each method is implemented as a separate neural network.

TensorFlow [5]
TensorFlow is a symbolic math library used for machine-learning applications. The CTSM uses this framework to train the neural network because of its user support, available documentation, and flexibility. Incremental training of the neural network is feasible using TensorFlow's partial subgraph computation.
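The extractive selection step described earlier (scoring each sentence of a new mail against the sentences of recent mails from the same sender and keeping the top scorers) can be sketched as follows. This is an illustrative stand-in, not the CTSM's actual implementation: the similarity function here is simple word overlap (Jaccard) rather than the neural sentence similarity the paper proposes, and the example sentences are hypothetical.

```python
# Sketch of extractive selection: score each sentence of a new mail
# against sentences from the sender's recent mails; the highest-scoring
# sentences form the summary. Jaccard word overlap stands in for the
# neural sentence-similarity measure described in the paper.

def jaccard(a, b):
    """Word-overlap similarity between two sentences (stand-in metric)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def summarize(mail_sentences, past_sentences, k=2):
    """Keep the k sentences most similar to any sentence in past mails."""
    scored = [(max(jaccard(s, p) for p in past_sentences), s)
              for s in mail_sentences]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [s for _, s in scored[:k]]

past = ["please send the quarterly report by friday",
        "the quarterly report is attached"]
mail = ["hope you are doing well",
        "the quarterly report draft is attached for review",
        "let me know if friday works for a call"]
print(summarize(mail, past, k=1))
```

In this toy run the report-related sentence scores highest against the past correspondence, so it alone survives into the one-sentence summary.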
This feature gives an estimate of the performance and implementation of different neural networks. These notable features make TensorFlow a suitable choice as the deep-learning framework for the CTSM.

B. Sentence Similarity
Sentence similarity is a measure of the alikeness of two sentences. It can be calculated from individual word similarity using distance metrics such as Euclidean or cosine distance. For example, consider the two sentences:
1) The water started receding towards the bank.
2) The bank remains open on Sundays.
A distance-based metric may indicate some similarity between these sentences, yet the first refers to a riverbank while the second refers to the bank that handles money. Distinguishing the two usages of the word "bank" requires semantic similarity, which is achieved using Word2Vec.

Word2Vec [6]
Word2Vec is a shallow, two-layer neural network that is pre-trained and used to determine linguistic and semantic relationships. It produces word embeddings: mappings of words to real-number vectors. For example, king − man + woman ≈ queen — a king whose gender is that of a woman is a queen. Word2Vec can be used to obtain relations as in this example.

CTSM
The CTSM aims to implement contextual similarity by taking into consideration the last 20 mails from the same sender.

The System Design
The neural network will be trained on the news-headlines dataset by (). This dataset contains the articles along with their headlines, which act as summarized versions of the articles. Training on such a dataset not only provides performance insights but also enables the network to perform general summarization.
3) A single-layer neural network can be depicted with each neuron representing one of the methods used.
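The cosine-distance metric and the Word2Vec analogy discussed above can be illustrated with a minimal sketch. The vectors here are hand-picked toy embeddings, not output of a trained Word2Vec model (a real model would be trained on a large corpus and have hundreds of dimensions); they are chosen only so that the king − man + woman analogy resolves to queen.

```python
import numpy as np

# Toy 3-dimensional "embeddings" standing in for real Word2Vec vectors.
emb = {
    "king":   np.array([0.9, 0.8, 0.1]),
    "queen":  np.array([0.9, 0.1, 0.8]),
    "prince": np.array([0.8, 0.7, 0.2]),
    "man":    np.array([0.1, 0.9, 0.1]),
    "woman":  np.array([0.1, 0.1, 0.9]),
    "girl":   np.array([0.05, 0.1, 0.85]),
}

def cosine(u, v):
    """Cosine similarity, one of the distance metrics mentioned above."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Analogy: king - man + woman should land closest to queen.
target = emb["king"] - emb["man"] + emb["woman"]
best = max((w for w in emb if w not in ("king", "man", "woman")),
           key=lambda w: cosine(emb[w], target))
print(best)  # queen
```

Sentence vectors can then be formed (e.g. by averaging word vectors) and compared with the same cosine measure, which is how a semantic sentence-similarity score distinguishes the two senses of "bank" above.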
The final output node uses sentence similarity between the outputs of the neurons in the previous layer.

System Design

Architecture
Figure 2. Architecture of the CTSM (steady optimized airflow system)
The following are the notations for the above diagram:
1)
2)
3)
Figure 2 shows the architecture of the CTSM, which has two major phases:
1) Parallel neural network learning
2) Sequential simulation of the test dataset

1) Parallel Neural Network Learning
The following modules will be implemented:
a) Convolutional Neural Network
This module takes the training dataset of boundary parameters as input to a deep-learning framework, proposed here as TensorFlow. The module trains a neural network using the feed-forward learning approach. Gradient flow is shown in Figure 2 with orange arrows; the gradient is the error that is backpropagated through the layers of the neural network, updating the network until the error is minimized. The output of this module is a trained model.

2) Sequential Simulation of the Test Dataset
The following modules comprise the sequential phase:
a) Trained Model
This module is the trained neural network obtained as output of the training module. The trained model is a dynamic system that takes new data, other than the training data but with the same parameters, as input. The external test dataset consists of car boundary parameters such as the weight, temperature, speed, and pressure requirements of a new car to be designed. Using the learned model, the module smooths the design of the car so as to minimize the values of drag and lift. The output of this module is the new optimized car design parameters for the test-dataset cars.
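The feed-forward pass and backpropagated-gradient update described above can be sketched with a single hidden layer in plain numpy. This is a hypothetical toy example, not the CTSM's TensorFlow implementation: the data are random stand-ins for boundary parameters, and the loop shows the error shrinking as gradients flow backward through the layers.

```python
import numpy as np

# Minimal feed-forward / backpropagation sketch: one hidden layer,
# toy regression data standing in for car boundary parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))              # 32 samples, 4 parameters
true_w = np.array([0.5, -1.0, 2.0, 0.3])
y = (X @ true_w).reshape(-1, 1)           # toy target values

W1 = rng.normal(scale=0.5, size=(4, 8))   # input -> hidden weights
b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1))   # hidden -> output weights
b2 = np.zeros(1)
lr = 0.05
losses = []

for step in range(500):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y                        # error to backpropagate
    losses.append(float(np.mean(err ** 2)))
    # backward pass: gradients flow from the output back to the input
    g_pred = 2 * err / len(X)
    g_W2 = h.T @ g_pred
    g_b2 = g_pred.sum(axis=0)
    g_h = g_pred @ W2.T
    g_z = g_h * (1 - h ** 2)              # tanh derivative
    g_W1 = X.T @ g_z
    g_b1 = g_z.sum(axis=0)
    # gradient-descent update keeps reducing the error
    W2 -= lr * g_W2; b2 -= lr * g_b2
    W1 -= lr * g_W1; b1 -= lr * g_b1

print(losses[0], "->", losses[-1])
```

Each iteration is one feed-forward pass followed by one backward pass, mirroring the orange gradient arrows in Figure 2; the loss falls over the 500 updates rather than reaching exactly zero, which is why "until the error is minimized" is the right reading.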
This output is sent as input to the fluid simulation module.
b) Output Simulation Module
The optimized car design parameters of the test-dataset cars are the input to this module, which consists of fluid-simulation software. The software accepts the boundary parameters of the car as input, processes them, and produces a simulation of steady optimized airflow over the car as the final output.

Objectives and Goals
The objective of this system is to train a neural network to minimize horizontal drag and vertical lift and to optimize the airfoil angle values for steady aerodynamic flow in car model designs. The system aims to:
1) Modify car designs for smooth frontal airflow.
2) Reduce drag and lift in cars.
3) Reduce the time required for the neural network to run.
4) Visually illustrate the improved airflow over cars.

RELATED WORK
There have been several approaches to solving the car-aerodynamics problem by analysing steady-state flow using neural networks:
i) Steady State Flow with Neural Nets
ii) Neural networks for BEM analysis of steady viscous flows
iii) Parallelization of Steady State Flow with Neural Networks

i) Steady State Flow with Neural Nets [8]
The project, named "Steady-state-flow-neural-nets," was executed by Oliver Hennigh. Its basic premise is to utilize the steady-state fluid flow over the boundary conditions of a car, where the boundary conditions are defined as the car's design parameters. Steady-state flow refers to the condition in which the fluid properties — temperature, pressure, and velocity — at any single point in the system do not change over time. The project uses the deep-learning framework Caffe, which optimizes the self-learning capabilities of the algorithm. Initially the algorithm is given a 700 MB sample car dataset of unstructured data, from which it learns and establishes basic relationships between car design and drag/lift.
After several iterations of learning on the provided dataset, this relationship becomes more generalized, so when a new model is given as input the algorithm's output is determined by comparison with what it has learnt.

ii) Neural networks for BEM analysis of steady viscous flows [9]
This paper by Nam Mai-Duy and Thanh Tran-Cong presents a new neural-network/boundary-integral approach for the analysis of steady viscous fluid flows. Indirect radial basis function networks (IRBFNs), which perform better than element-based methods for function interpolation, are introduced into the BEM scheme. IRBFNs recursively reduce the error function and are used to represent the variations of velocity and traction along the boundary.

iii) Parallelization of Steady State Flow with Neural Networks [10]
The goal of this system is to train a neural network to minimize horizontal drag and vertical lift and to optimize the airfoil angle values for steady aerodynamic flow in car model designs. The system proposed by Oliver Hennigh uses Caffe to implement steady-state flow, but Caffe comes with certain drawbacks. In Caffe, each node of the neural network is a layer; in TensorFlow, each node is a tensor operation (e.g., matrix add/multiply, convolution), and a layer is defined as a composition of those operations. The building brick in TensorFlow is therefore smaller than the building brick in Caffe, which is why Caffe is considered inflexible: for new layer types, the full forward, backward, and gradient-update passes must be defined. The system proposed by M. M. Grigoriev and A. V. Fafurin uses sequential computation instead of neural networks, so the efficiency of the system is not strictly increasing.
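The granularity difference noted above — a Caffe-style layer versus TensorFlow-style tensor operations — can be illustrated with a short sketch. The function names are hypothetical stand-ins for graph nodes, not real framework APIs; the point is only that a "layer" decomposes into smaller reusable operations.

```python
import numpy as np

# Each small function stands for one tensor-operation node in a
# TensorFlow-style graph; a Caffe-style "layer" is their composition
# (matmul -> add -> relu).
def matmul(x, w):      # graph node: matrix multiply
    return x @ w

def add(x, b):         # graph node: bias addition
    return x + b

def relu(x):           # graph node: nonlinearity
    return np.maximum(x, 0.0)

def dense_layer(x, w, b):
    """A 'layer' expressed as a composition of smaller operations."""
    return relu(add(matmul(x, w), b))

x = np.array([[1.0, -2.0]])
w = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([0.5, 0.5])
print(dense_layer(x, w, b))  # [[1.5 0. ]]
```

Because the bricks are this small, a new layer type is built by recomposing existing operations, whereas a framework whose smallest brick is the whole layer requires the full forward, backward, and gradient-update code to be written for each new layer.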
The system proposed by Nam Mai-Duy and Thanh Tran-Cong uses a neural-network implementation but will be much slower than the CTSM, which runs on a parallel architecture.

Conclusion
The CTSM consists of a neural network that aids information retention by summarizing emails. Summarization is performed by retrieving meaning from the contents of the email, with that meaning derived on the basis of context; the output is a short paragraph that summarizes the email contents.

The CTSM also consists of a neural network trained to minimize drag and lift for steady aerodynamic flow in cars. The output is a two-dimensional fluid simulation demonstrating steady airflow with the optimized drag and lift values. The CTSM further uses a parallel architecture to increase computational speed, thereby saving time.

References
[1] U.S. Dept. of Transportation, "Federal Motor Vehicle Safety Standards and Regulations." National Highway Traffic Safety Administration, Office of Vehicle Safety Compliance, Washington, DC, 2016.
[2] Pouster, J., "Percentage of households that have a car." Pew Research Center, Washington, DC, 23 Oct. 2017.
[3] Poese, Smith, Garrett, Gerwen & Gosselin, "Annual Global Road Crash Statistics," Association for Safe International Road Travel (ASIRT).
[4] OpenCFD homepage.
[9] Mai-Duy, N. and Tran-Cong, T. (2003), "Neural networks for BEM analysis of steady viscous flows," Int. J. Numer. Meth. Fluids, 41: 743–763. doi:10.1002/fld.469
[10] Grigoriev, M. M. and Fafurin, A. V. (1997), "A boundary element method for steady viscous fluid flow using penalty function formulation," Int. J. Numer. Meth. Fluids, 25: 907–929. doi:10.1002/(SICI)1097-0363(19971030)25:8<907::AID-FLD592>3.0.CO;2-T