I. Introduction

An artificial neural network (ANN) is a computational model inspired by the structure and function of biological neural networks. As a neural network adapts to its inputs and outputs, the information flowing through the network changes the structure of the ANN. ANNs are regarded as nonlinear statistical data-modelling tools that can model complex relationships between input and output patterns. An ANN is also known simply as a neural network.

A statistical model is called a neural network if it has the following characteristics:

1. The model contains sets of adaptive weights, i.e. numerical parameters that are tuned by a learning algorithm, and

2. It is capable of approximating non-linear functions of its inputs.

A familiar instance of a neural network is the human brain, which can be described as a biological neural network: an interconnected web of neurons transmitting highly structured patterns of electrical signals. Dendrites receive input signals and, based on those inputs, fire an output signal via an axon. How the human brain actually works remains an elaborate and complex mystery.

Fig. 1 shows the human nerve cell, or neuron, (a) and its artificial equivalent (b). Fig. 2 depicts the representation of a neural set of an ANN. The neuron receives a set of input signals through a number of dendrites. At the edge of each dendrite the input signal is weighted with a factor ‘w’, which can be positive or negative.

All the signals from the dendrites are added in the cell body to form a weighted sum of the neuron's inputs. If a weight is positive, the corresponding input has an excitatory influence on the weighted sum. If a weight is negative, the input reduces the weighted sum and is inhibitory. In the cell body, the weighted sum of inputs is compared to a threshold value. If the weighted sum exceeds the threshold, the neuron sends a signal through its output to all connected neurons.
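The weighted-sum-and-threshold behaviour described above can be sketched in a few lines. This is a minimal illustrative model (in the style of a McCulloch–Pitts threshold unit); the function name and values are assumptions, not part of the original text.

```python
def neuron_output(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of inputs exceeds the threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0

# Positive weights are excitatory, negative weights inhibitory:
print(neuron_output([1, 1, 1], [0.5, 0.5, -0.3], 0.6))   # 0.7 > 0.6, so fires: 1
print(neuron_output([1, 1, 1], [0.5, -0.5, -0.3], 0.6))  # -0.3 <= 0.6, silent: 0
```

Flipping the sign of a single weight is enough to turn an excitatory input into an inhibitory one, which keeps the weighted sum below the threshold in the second call.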

II. Strategies for Learning in Artificial Neural Networks

• Supervised Learning: This strategy involves a teacher that is, in essence, smarter than the network itself. An example of supervised learning is shown in fig. (3). Consider a facial recognition system: the teacher presents the network with a set of faces and already knows the name associated with each face. The network makes its guesses, and the teacher then provides the answers. The network can compare its answers to the known “correct” answers and adjust itself according to its errors.

• Unsupervised Learning: This is required when there is no example data set with known answers. Imagine searching for a hidden pattern in a data set. One application is clustering, i.e. partitioning a set of elements into groups according to some unknown pattern.

• Reinforcement Learning: This strategy is based on observation. Consider a little mouse running through a maze: if it turns left, it gets a piece of cheese; if it turns right, it receives a small shock. Presumably, the mouse will learn to turn left over time. The network makes a decision, observes the outcome in its environment, and, if the observation is negative, adjusts its weights so as to make a different decision the next time. Reinforcement learning is popular in robotics: at time ‘t’, the robot performs a task and observes the results. One can also view reinforcement learning in the context of simulated steering vehicles.
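The supervised strategy above, where the teacher's known answer is compared with the network's guess and the weights are corrected by the error, can be sketched with the classic perceptron learning rule. The rule, learning rate, and AND-function data are illustrative assumptions; the text does not prescribe a particular update rule.

```python
def train_perceptron(samples, labels, lr=0.1, epochs=20):
    """Supervised learning sketch: adjust weights in proportion to
    (teacher's answer - network's guess) for each training example."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            guess = 1 if sum(xi * wi for xi, wi in zip(x, w)) + b > 0 else 0
            error = target - guess  # teacher provides the correct answer
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

# Learn the logical AND function from labelled examples:
samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 0, 0, 1]
w, b = train_perceptron(samples, labels)
preds = [1 if sum(xi * wi for xi, wi in zip(x, w)) + b > 0 else 0
         for x in samples]
print(preds)  # [0, 0, 0, 1]
```

After training, the network reproduces the teacher's labels: its guesses have converged because each wrong answer nudged the weights toward the correct one.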

III. Neural Networks In Data Mining

Neural networks are non-linear statistical data-modelling tools. They can be used to find various patterns in data. With neural networks as a tool, data-warehousing firms gather information from data sets; this process is known as data mining. The major difference between data warehouses and ordinary databases is the extensive manipulation and cross-fertilization of the data, which helps users make more informed decisions.

Neural networks mainly consist of three components: the architecture or model; the learning algorithm; and the activation functions. Neural networks are programmed or “trained” to “. . . store, recognize, and associatively retrieve patterns or database entries; to solve combinatorial optimization problems; to filter noise from measurement data; to control ill-defined problems; in summary, to estimate sampled functions when format of the functions are not known.” It is precisely these two abilities (pattern recognition and function estimation) that make Artificial Neural Networks (ANNs) so prevalent a tool in data mining.

3.1. Feed-forward Neural Network (FFNN): This is one of the simplest neural networks. An FFNN consists of three layers: an input layer, a hidden layer, and an output layer, as shown in Fig. (4). Each layer contains one or more processing elements, which simulate the neurons in the brain; for this reason they are often referred to as neurons or nodes.
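The three-layer structure can be sketched as a forward pass in which each node forms a weighted sum of the previous layer's outputs and applies an activation function. The sigmoid activation and all weight values below are illustrative assumptions; a real network would learn its weights.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """Each node: weighted sum of inputs plus a bias, then activation."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weights, biases)]

def feed_forward(x, hidden_w, hidden_b, out_w, out_b):
    hidden = layer(x, hidden_w, hidden_b)   # input layer -> hidden layer
    return layer(hidden, out_w, out_b)      # hidden layer -> output layer

# 2 inputs, 3 hidden nodes, 1 output node (illustrative weights):
hidden_w = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
hidden_b = [0.0, 0.1, -0.1]
out_w = [[0.6, -0.4, 0.9]]
out_b = [0.05]
y = feed_forward([1.0, 0.5], hidden_w, hidden_b, out_w, out_b)
print(y)  # a single activation value in (0, 1)
```

Information flows strictly forward, layer by layer, with no loops; this is what distinguishes the FFNN from the recurrent architecture of the next subsection.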

3.2. Recurrent Neural Network: A Recurrent Neural Network (RNN), represented in fig. (5), contains at least one feed-back connection, so activations can flow around in a loop. This enables the network to do temporal processing and learn sequences, e.g., to perform sequence recognition/reproduction or temporal association/prediction. Recurrent neural network architectures can take many different forms. One common form consists of a standard Multi-Layer Perceptron (MLP) with added loops; this exploits the powerful non-linear mapping capabilities of the MLP while also providing some form of memory. Other architectures have more uniform structures, potentially with every neuron connected to all the others, and may also have stochastic activation functions.

For simple architectures and deterministic activation functions, learning can be achieved using gradient descent procedures similar to those that lead to the back-propagation algorithm for feed-forward networks.
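The feed-back connection described above can be sketched as a hidden state that is fed back into the network at each time step. This is a deliberately simplified recurrence (one feedback weight per unit rather than a full recurrent weight matrix), and all names and values are illustrative assumptions.

```python
import math

def rnn_step(x, h_prev, w_in, w_rec, b):
    """One time step: the new hidden state depends on the current input
    and, via the feed-back connection, on the previous hidden state."""
    return [math.tanh(w_in[i] * x + w_rec[i] * h_prev[i] + b[i])
            for i in range(len(h_prev))]

# Process a short sequence; the hidden state carries information forward,
# giving the network a simple form of memory across time steps.
w_in, w_rec, b = [0.5, -0.3], [0.8, 0.4], [0.0, 0.1]
h = [0.0, 0.0]
for x in [1.0, 0.5, -1.0]:
    h = rnn_step(x, h, w_in, w_rec, b)
print(h)  # final hidden state summarises the whole sequence
```

Because each step reuses the previous hidden state, the final state depends on the entire input sequence, which is what enables the sequence recognition and temporal prediction tasks mentioned above.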