
Introduction to Deep Learning with ANN & Perceptron

 

Introduction


  • To understand what deep learning is, we first need to understand its relationship with machine learning, neural networks, and artificial intelligence. The best way to think of this relationship is to visualize them as concentric circles: artificial intelligence is the outermost circle, machine learning sits inside it, and deep learning sits innermost.

  • Neural networks are inspired by the structure of the cerebral cortex. At the basic level is the perceptron, the mathematical representation of a biological neuron. Like in the cerebral cortex, there can be several layers of interconnected perceptrons.

  • Machine learning is considered a branch or approach of artificial intelligence, whereas deep learning is a specialized type of machine learning.


Why Is Deep Learning Important?


  • Computers have long had techniques for recognizing features inside images, but the results weren’t always great. Computer vision has been a main beneficiary of deep learning, and computer vision using deep learning now rivals humans on many image recognition tasks.

  • Facebook has had great success with identifying faces in photographs by using deep learning. It’s not just a marginal improvement, but a game changer: “Asked whether two unfamiliar photos of faces show the same person, a human being will get it right 97.53 percent of the time. New software developed by researchers at Facebook can score 97.25 percent on the same challenge, regardless of variations in lighting or whether the person in the picture is directly facing the camera.”

  • Speech recognition is another area that has felt deep learning’s impact. Spoken language is vast and ambiguous. Baidu – one of the leading search engines of China – has developed a voice recognition system that is faster and more accurate than humans at producing text on a mobile phone, in both English and Mandarin.

  • What is particularly fascinating is that generalizing across the two languages didn’t require much additional design effort: “Historically, people viewed Chinese and English as two vastly different languages, and so there was a need to design very different features,” says Andrew Ng, chief scientist at Baidu. “The learning algorithms are now so general that you can just learn.”

  • Google is now using deep learning to manage the energy at the company’s data centers. They’ve cut their energy needs for cooling by 40%. That translates to about a 15% improvement in power usage efficiency for the company and hundreds of millions of dollars in savings.


ANN: Artificial Neural Networks

  • ANNs are inspired by the biological neurons found in the cerebral cortex of our brain.

    The cerebral cortex (plural: cortices), also known as the cerebral mantle, is the outer layer of neural tissue of the cerebrum of the brain in humans and other mammals. - Wikipedia



    read more about cerebral cortex at this link

    NCERT reference

  • ANNs are the core of deep learning, and hence one of the most important topics to understand.

  • ANNs are versatile, scalable, and powerful, so they can tackle highly complex ML tasks such as classifying images, identifying objects, and recognizing speech.


Biological Neuron

Image source

  • Biological neurons produce short electrical impulses known as action potentials, which travel through the axons to the synapses, where chemical signals called neurotransmitters are released.

  • When a connected neuron receives a sufficient amount of these neurotransmitters within a few milliseconds, it fires (or does not fire; think of a NOT gate here) its own action potential, or electrical impulse.

  • These simple units form a powerful network known as a Biological Neural Network (BNN), able to perform very complex computational tasks.


The first artificial neuron

  • The artificial neuron was introduced in 1943 by:

    • Neurophysiologist Warren McCulloch and
    • Mathematician Walter Pitts
  • They published their work as McCulloch, W.S., Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5, 115–133 (1943). https://doi.org/10.1007/BF02478259. Read the full paper at this link.

  • They showed that these simple neurons can perform small logical operations such as OR, NOT, and AND.

  • The following figure represents these artificial neurons, which can perform (a) buffer, (b) OR, (c) AND, and (d) A-B operations.


  • In the figure, a neuron fires only when at least two of its inputs are active.
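
A McCulloch-Pitts unit like the ones in this figure can be sketched in a few lines of Python; the function name and threshold values below are illustrative assumptions, not taken from the 1943 paper.

```python
# Minimal sketch of a McCulloch-Pitts unit: binary inputs, a fixed
# threshold, and a binary output. Names and thresholds are illustrative.

def mcp_neuron(inputs, threshold=2):
    """Fire (return 1) only when enough inputs are active."""
    return 1 if sum(inputs) >= threshold else 0

# With the default threshold of 2, two inputs behave like an AND gate:
print(mcp_neuron([1, 1]))  # -> 1
print(mcp_neuron([1, 0]))  # -> 0

# Lowering the threshold to 1 turns the same unit into an OR gate:
print(mcp_neuron([0, 1], threshold=1))  # -> 1
```

Note how the same unit implements different gates purely by changing its threshold, which is exactly the "logical calculus" idea of the paper.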


The Perceptron

  • The Perceptron is the simplest ANN architecture. It was invented by Frank Rosenblatt in 1957 and published as Rosenblatt, Frank (1958), The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain, Cornell Aeronautical Laboratory, Psychological Review, v65, No. 6, pp. 386–408. doi:10.1037/h0042519

  • It has a different architecture than the first neuron we saw above; it is known as a threshold logic unit (TLU) or linear threshold unit (LTU).

  • Here the inputs are not just binary; they can be any numbers.

  • Let's look at the architecture shown below:

    (figure: artificial neuron model)


  • A common activation function used for Perceptrons is (with the threshold at 0):

    • $$\mathrm{step}(z)\ \text{or}\ \mathrm{heaviside}(z) = \begin{cases} 0 & \text{if } z < 0 \\ 1 & \text{if } z \ge 0 \end{cases}$$
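
As a small sketch, the step (heaviside) activation with its threshold at 0 can be written directly in NumPy; `step` is an illustrative name, not a standard library function.

```python
import numpy as np

# Step (Heaviside) activation with the threshold at 0:
# outputs 0 for z < 0 and 1 for z >= 0.
def step(z):
    return np.where(z >= 0, 1, 0)

print(step(np.array([-2.0, 0.0, 3.5])))  # -> [0 1 1]
```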


  • Single TLUs are simple linear binary classifiers, hence not suitable for non-linear operations. This is demonstrated in the implementation notebook by coding simple logic gates.
  • Rosenblatt proved that this algorithm is guaranteed to converge if the training data is linearly separable; this result is known as the Perceptron convergence theorem.

  • Some serious weaknesses of Perceptrons were revealed in 1969 by Marvin Minsky and Seymour Papert: a single Perceptron is unable to solve some simple problems, such as the XOR (exclusive OR) classification problem.

  • But the above-mentioned problem was later solved by implementing the multilayer perceptron (MLP).
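
The Perceptron learning rule can be sketched in a few lines; the learning rate, epoch count, and variable names below are illustrative assumptions. On the linearly separable AND problem it converges, while XOR targets keep it making errors forever:

```python
import numpy as np

# Illustrative sketch of the Perceptron learning rule on the linearly
# separable AND problem (learning rate and epoch count are assumptions).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])  # AND targets

w = np.zeros(2)  # weights
b = 0.0          # bias
eta = 0.1        # learning rate

for _ in range(20):  # a few passes over the training data
    for xi, target in zip(X, y):
        pred = 1 if xi @ w + b >= 0 else 0
        update = eta * (target - pred)  # zero when the prediction is correct
        w += update * xi
        b += update

preds = [1 if xi @ w + b >= 0 else 0 for xi in X]
print(preds)  # -> [0, 0, 0, 1]

# With XOR targets y = [0, 1, 1, 0], the same loop never reaches zero
# errors, illustrating the limitation Minsky and Papert pointed out.
```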


Derivation


Let's assume that you are doing a binary classification with classes +1 and -1.

Let there be a decision function $\phi(z)$ which takes a linear combination of certain inputs "x" and corresponding weights "w", with net input

$$z = w_1x_1 + w_2x_2 + \dots + w_nx_n$$

So in vector form we have

$$z = \mathbf{w}^T\mathbf{x}$$


Now, for a sample $x$:

$$\phi(z) = \begin{cases} +1 & \text{if } z \ge \theta \\ -1 & \text{if } z < \theta \end{cases}$$

Let's simplify the above equation by moving $\theta$ to the left-hand side:

$$\phi(z) = \begin{cases} +1 & \text{if } z - \theta \ge 0 \\ -1 & \text{if } z - \theta < 0 \end{cases}$$

Suppose $w_0 = -\theta$ and $x_0 = 1$.

Then,

$$z = w_0x_0 + w_1x_1 + w_2x_2 + \dots + w_nx_n$$

and

$$\phi(z) = \begin{cases} +1 & \text{if } z \ge 0 \\ -1 & \text{if } z < 0 \end{cases}$$


Here $w_0x_0$ is usually known as the bias unit.
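
This bias trick (folding the threshold into the weight vector as $w_0 = -\theta$ with $x_0 = 1$) can be checked numerically; the weights and inputs below are made-up illustrative values:

```python
import numpy as np

theta = 0.5
w = np.array([1.0, 1.0])   # w1, w2 (illustrative values)
x = np.array([0.25, 0.5])  # x1, x2

# Original form: compare w^T x against the threshold theta.
z_plain = w @ x  # 0.75
phi_plain = 1 if z_plain >= theta else -1

# Bias-trick form: prepend w0 = -theta and x0 = 1, compare against 0.
w_aug = np.concatenate(([-theta], w))  # [-0.5, 1.0, 1.0]
x_aug = np.concatenate(([1.0], x))     # [ 1.0, 0.25, 0.5]
z_aug = w_aug @ x_aug                  # z - theta = 0.25
phi_aug = 1 if z_aug >= 0 else -1

print(phi_plain, phi_aug)  # -> 1 1 (the two forms agree)
```

Both forms make the same decision; the augmented form is just more convenient, because the threshold becomes one more weight to learn.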





