SRNs and some other learning models

Discussion of
Stoianov & Zorzi’s
Numerosity Model
Psych 209 – 2017
Feb 16, 2017
Stoianov and Zorzi -- Questions
• What is the learning algorithm used here? Can you explain
and/or do you have questions about how training works in this
network?
• Consider the learning algorithm -- do you think it is biologically
plausible? Computationally interesting?
• The authors claim numerosity detectors emerge in the second
layer of the network. How do they support this claim? What
comments or questions do you have about this?
• The authors claim the model accounts for aspects of human
numerosity judgments. How do they support this claim?
• Consider the training set the authors used. Do you think their
choices of the nature of the training and testing stimuli
influence their results? Do their choices influence your
judgment of whether their model explains how numerosity
sensitivity might arise in humans and non-human animals?
S&Z’s network
• Greedy layer-wise training of RBMs
using the contrastive divergence
weight-update rule:
Δw_ij = ε(⟨v_i h_j⟩_data − ⟨v_i h_j⟩_recon)
• Large training set in which both the
number of blobs and the blob sizes
vary across items in the set
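As a rough sketch of how such a training set might be generated (this is a hypothetical generator, not the authors' code; the image size, square blob shape, and non-overlap rule are assumptions for illustration):

```python
import numpy as np

def make_blob_image(n_blobs, blob_size, side=30, rng=None):
    """Binary image containing n_blobs non-overlapping square blobs.

    Hypothetical stimulus generator loosely modeled on the paper's
    description (numerosity and blob size vary across items).
    """
    rng = np.random.default_rng() if rng is None else rng
    img = np.zeros((side, side), dtype=np.uint8)
    placed = 0
    while placed < n_blobs:
        r = rng.integers(0, side - blob_size)
        c = rng.integers(0, side - blob_size)
        patch = img[r:r + blob_size, c:c + blob_size]
        if patch.sum() == 0:  # keep blobs from overlapping
            patch[:] = 1
            placed += 1
    return img

# A training set would pair many such images, with numerosity and
# blob size drawn at random for each item.
```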
Why a Deep Network?
• We need multiple layers to capture
conjunctions of features effectively
• And to pull apart inputs that should be
distinguished, while merging inputs that
should be treated as the same
• How can we learn a deep network?
• Supervised learning – CNNs
• Unsupervised learning – DBNs
The deep belief network vision
(Hinton)
• Consider some sense data D
• We imagine our goal is to
understand what generated it
• We use a generative model
• Search for the most probable
‘cause’ C of the data
– The one for which p(D|C)p(C) is greatest
• How do we find C?
• Minimize contrastive divergence
or KL divergence between
generated and observed states.
• The KL divergence of Q from P is
KL(P‖Q) = Σ_i P(i) log( P(i) / Q(i) )
• For us, P(i) indexes the actual
probabilities of states of the world,
Q(i) indexes our estimates of the
probabilities of these states.
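The definition above can be computed directly for discrete distributions. A minimal sketch (the function name is our own; it assumes Q(i) > 0 wherever P(i) > 0, so every term is defined):

```python
import numpy as np

def kl_divergence(p, q):
    """KL(P || Q) = sum_i P(i) * log(P(i) / Q(i)).

    p: the actual probabilities of states of the world.
    q: our estimated probabilities of those states.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms where P(i) = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))
```

Note that KL divergence is not symmetric: KL(P‖Q) generally differs from KL(Q‖P), which is why it matters which distribution plays the role of the world and which the model.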
[Slide figure: generative model in which a Cause produces the observed Data]
Stacking RBMs
• ‘Greedy’ layer-wise
learning of RBMs
– First learn H0 based
on input.
– Then learn H1 based
on H0
– Etc…
– Then ‘fine tune’, says
Hinton – but maybe
the fine tuning is
unnecessary?
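The greedy layer-wise scheme above can be sketched as follows. This is a minimal illustration, not the Stoianov and Zorzi implementation: biases, momentum, and mini-batching are omitted, and CD-1 (one-step contrastive divergence) stands in for the full training procedure.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Minimal binary RBM trained with one-step contrastive divergence."""

    def __init__(self, n_vis, n_hid, rng):
        self.W = rng.normal(0.0, 0.01, size=(n_vis, n_hid))
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W)

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden activity driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: reconstruct the visible layer, then the hiddens.
        v1 = sigmoid(h0_sample @ self.W.T)
        h1 = self.hidden_probs(v1)
        # Delta W proportional to <v h>_data - <v h>_recon.
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / v0.shape[0]

def train_stack(data, layer_sizes, epochs=5, rng=None):
    """Greedy layer-wise training: first learn H0 on the input, then
    learn H1 on H0's activations, etc. (no fine-tuning stage here)."""
    rng = np.random.default_rng(0) if rng is None else rng
    rbms, x = [], data
    for n_hid in layer_sizes:
        rbm = RBM(x.shape[1], n_hid, rng)
        for _ in range(epochs):
            rbm.cd1_update(x)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)  # feed this layer's activations upward
    return rbms
```

Each RBM only ever sees the layer below it, which is what makes the procedure ‘greedy’: no layer's weights are revisited once the next layer starts learning.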
Digit Recognition
Movie
http://www.cs.toronto.edu/~hinton/adi/index.htm
Stoianov and Zorzi Model,
training data, and unit analysis
Basis Functions and Numerosity
Detectors
Results of simulation of
numerosity judgment task