Tuesday, November 29, 2011

Mental rotation task - history and my experiment

The other part of my research is cognitive neuroscience. I focus on mathematically gifted people and their performance in various cognitive tasks.

The first task I have chosen is mental rotation. Why? Because it has been very well described and I thought it would be a good starting point for my research.

Description of the task

The mental rotation task is a task where we have to represent an image and generate a mental representation of it. The subject sees two forms rotated relative to each other by some angle and has to decide whether the forms are the same or mirror images.

Previous research

Shepard and Metzler (1971) introduced the concept of mental rotation and confirmed that there is an increasing linear relation between reaction time and the angular disparity between the two stimuli. These findings supported Shepard's hypothesis that mental rotation is represented in an analog format (as we see the image) and is rotated by a unitary process (the holistic theory). The second analog theory is the "piecemeal" theory presented by Kosslyn (1981), which proposes that the image is first divided into parts and each part is then rotated sequentially. Besides these analog theories there is also the propositional theory, which supposes that the image is represented in an abstract propositional format. Both the piecemeal theory and the propositional theory predict that reaction time will increase as stimulus difficulty increases.

Pic 1: Mental rotation task (Shepard and Metzler)

Five stages of the mental rotation process have been described:
- stimulus coding
- generation of the mental image
- mental rotation and matching
- decision whether the stimuli and mental image are matching or not

Afterwards, a huge amount of research was done comparing reaction times and error rates between different groups of people (women vs. men, different nationalities, artists vs. non-artists, whether mental rotation ability predicts the choice of surgical specialty, etc.).

Experiments with EEG (Gill et al., 1998) confirmed the componential aspects of mental rotation and localized the corresponding brain regions. Rotation of an internal image may be mediated by the left temporal region.

Neuroimaging studies have shown that mental rotation is mediated primarily by the parietal lobes. O'Boyle et al. (2005) demonstrated in their fMRI study that mathematically gifted male adolescents engage different brain structures than their averagely gifted peers when performing 3D mental rotation.

Pic 2: Brain activity during the mental rotation task

My stimuli

A set of special stimuli of varying complexity was created. Each stimulus consists of basic units (squares or cubes). We presented 228 pairs of 2D stimuli and 140 pairs of 3D stimuli (rotated by multiples of 60°) of increasing complexity. The 3D stimuli were composed so that rotation around different axes was needed.

Pic 3: My 2D and 3D stimuli used for MR task

After reaching the highest complexity (2D stimuli), we added stimuli with modifications and observed whether the error rate increases. (By a modification I mean that the minor sign was mirrored while the major sign was normal - this could confuse subjects into answering that the stimuli are the same.)

Pic 4: Example of the mental rotation task - a pair of 2D stimuli without modifications (A) and with a modification in the minor sign (B)

Why did I decide to create this new set of stimuli? Because I needed stimuli that are very well describable and allow simple modifications. I also wanted to have the same objects in both 2D and 3D.

Experiment

1. During the whole experiment, EEG activity was recorded.
2. Mental rotation task
- First, baseline activity was measured with eyes closed and eyes open, and blind stimuli were presented
- Example stimuli for 2D
- Mental rotation task with 2D stimuli of increasing complexity
- Example stimuli for 3D
- Mental rotation task with 3D stimuli
3. Questionnaire asking subjects what strategies they had used, about their mathematical education and abilities, sex, health, and which food additives, nootropics or psychopharmaceuticals they take
4. Intelligence test - Raven (WAIS-IV would be better, but Raven was used because it is easier to administer and it is the entrance test for Mensa)
5. Analysis of reaction times and error rates (a minimal sketch of this step follows the list)
6. Processing and analysis of the EEG signal
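
Here is a hypothetical Matlab sketch of what step 5 could look like. The per-trial variables ang, rt and ok (angular disparity, reaction time and correctness) are placeholders simulated only to make the snippet runnable; the point is simply to check the classic linear relation between reaction time and rotation angle.

```matlab
% Hypothetical sketch: mean RT and error rate per rotation angle, plus a
% linear fit of RT against angular disparity.
rng(3);
ang = 60*randi([0 3], 500, 1);              % simulated angular disparities [deg]
rt  = 0.9 + 0.004*ang + 0.2*randn(500,1);   % simulated reaction times [s]
ok  = rand(500,1) > 0.1;                    % simulated correctness (~90 % correct)

angles  = unique(ang);
meanRT  = zeros(size(angles));
errRate = zeros(size(angles));
for i = 1:numel(angles)
    sel        = (ang == angles(i));
    meanRT(i)  = mean(rt(sel & ok));        % mean RT of correct trials only
    errRate(i) = mean(~ok(sel));            % error rate for this angle
end

p = polyfit(angles, meanRT, 1);             % p(1) = extra seconds per degree
plot(angles, meanRT, 'o', angles, polyval(p, angles), '-');
xlabel('Angular disparity [deg]'); ylabel('Mean RT [s]');
```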

Tuesday, November 22, 2011

Episodic-like memory and how birds store their food

Today we visited the Department of Neurophysiology at the Academy of Sciences, where we listened to two presentations; one was about signal transfer and memory storage and processing.

When speaking about memory in humans, we can distinguish two distinct categories:
Implicit memory (perceptual, procedural, priming)
Explicit memory (autobiographical or episodic)

Everything about memory is very interesting, beginning with patient H.M., who lived without a hippocampus and, from the operation on, wasn't able to transfer things from short-term to long-term explicit memory, even though he was able to learn some skills (implicit memory). But these things, as well as signal transfer in neurons and the changes in synapses and neurotransmitters, aren't what I'd like to talk about right now.


One thing which was particularly interesting for me was episodic-like memory in animals, especially in birds. Surely we can't ask them what and how they remember, but there are a few experiments that can tell us more about their memory and thoughts. Before winter they store their food in specific places (hundreds of places), and during the winter they recall where the food is located and search for it - so they can remember where. Further, they store not only seeds (peanuts) but also waxworms, and when they can't go for the food for a few days, they then go directly to the places with seeds, because they know the worms won't be fresh by that time - so they can remember what and when.

There is another very interesting thing about birds: they sometimes steal food from each other. A bird that has already been robbed behaves differently from one that hasn't encountered a thief yet. So they can remember the episodes... This research was done by Clayton and Dickinson (1998) on western scrub jays. It is called episodic-like because there is no proof of autonoetic consciousness.

Subsequent research tried to find such behavior in mammals. Nowadays there are many tests with rats which have to remember where in a maze the food is located or where an aversive stimulus will appear. They can remember this based on an egocentric frame or on an allocentric frame defined by landmarks around the maze.

One problem with these experiments is that the behavior is not spontaneous but based on repetition. And speaking about repetition, we should not forget about habituation.

Further info:
http://www.dandydesigns.org/id21.html
http://en.wikipedia.org/wiki/Hoarding_%28animal_behavior%29
Episodic-like memory

Thursday, November 17, 2011

Video in Matlab

This will be a very short post, just to describe how to make a simple video in Matlab. Are you able to produce figures? Then there is almost no extra work.

We will save the frames (figures) into the variable F. When all frames are captured, we can play the movie with the function movie. It can also be exported to AVI format.
And just one little note about the very useful function cat - it joins two variables together. They can be vectors, matrices or structures, such as the structures F and F1 containing frames. Very important is the dimension number passed to cat, which says along which dimension the variables will be joined.
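
A minimal sketch of what I mean (frames captured with getframe, playback with movie, joining with cat); the AVI export shown here uses VideoWriter, which may not be available in older Matlab versions, where movie2avi was used instead:

```matlab
% Build a short movie from figures, play it, and export it to AVI.
nFrames = 50;
x = linspace(0, 2*pi, 200);
F(nFrames) = struct('cdata', [], 'colormap', []);   % preallocate frame struct
for k = 1:nFrames
    plot(x, sin(x + 2*pi*k/nFrames));               % any figure you can produce
    axis([0 2*pi -1.2 1.2]);
    F(k) = getframe(gcf);                           % grab the figure as a frame
end
movie(F, 2)                                         % play the movie twice

% Export to AVI (VideoWriter; older releases used movie2avi instead)
v = VideoWriter('myMovie.avi');
open(v); writeVideo(v, F); close(v);

% Join two frame structures along dimension 2 (the frame dimension)
% F1 = ...;                     % another set of frames captured the same way
% Fall = cat(2, F, F1);
```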

Some useful Matlab functions for plotting

Because there are some conferences where I need to present my results, I had to learn how to create nice graphs and plots in Matlab. I haven't learnt all the possibilities yet, but I decided to share what I've explored.

So first we'll play with boxplots.
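
As a minimal sketch, here is the kind of grouped boxplot I have in mind, on made-up reaction-time data (boxplot comes from the Statistics Toolbox; all names and numbers here are just placeholders):

```matlab
% Grouped boxplot of (simulated) reaction times for 2D vs. 3D stimuli.
rng(0);
rt2D  = 0.8 + 0.3*randn(100,1);                 % hypothetical 2D-task RTs [s]
rt3D  = 1.4 + 0.5*randn(100,1);                 % hypothetical 3D-task RTs [s]
data  = [rt2D; rt3D];
group = [repmat({'2D'},100,1); repmat({'3D'},100,1)];

figure;
boxplot(data, group);                           % one box per group
ylabel('Reaction time [s]');
title('Reaction times by stimulus type');
```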

Saturday, November 12, 2011

How to initialize the centers of models in GMM

Today I have been dealing with a big problem: the initialization of the centers of new clusters in a GMM. A related topic is how to find the optimal number of clusters.

Many approaches to these problems have been proposed, but none of them is clearly the best. Like others, I have been thinking for a while about what could be the best way to find the right number of clusters. I have developed an algorithm based on the likelihood function, which I'd like to share with you. I know it isn't optimal, but it could be a direction, if not the way.

At this address you can see a video of one of my first trials.
http://www.youtube.com/watch?v=ZBKbTAmVT2k

A more difficult example:
http://www.youtube.com/watch?v=UuHHpJjyPIE


This video shows a few of the problems we face when programming an automatic algorithm for GMM. It tends to overfit the data, which means that more clusters are found than we would like. Another problem is that when the data are not Gaussian and we have only a few points, the algorithm obviously won't work as well as we would like.
So there is a task for every researcher: find a trade-off between the log-likelihood, which keeps improving as models are added, and the number of models - we prefer fewer models.

The algorithm is the following (a simplified sketch of the seeding steps is shown after the list):
1. initialize the first cluster at the mean of the data
2. run the EM algorithm (GMM) to find the optimal means and covariance matrices of the model(s)
3. for each datapoint i, look for the highest likelihood among the models and save it to the vector lm(i)
4. make a histogram of lm (there is the problem of how to find the best partitioning of the histogram)
5. find local maxima in the histogram
6. exclude the maxima which correspond to the already placed models
7. from the remaining maxima, find the one with the largest number of datapoints
8. from these datapoints, find the one with the densest neighbourhood and set this point as the center of the new cluster
9. repeat 2-8 until the dependency between the new cluster and some other cluster is higher than some threshold
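
Here is a much-simplified, hypothetical sketch of the seeding idea in Matlab, assuming a single existing component. It collapses the histogram machinery of steps 4-7 into simply taking the poorly explained points as candidates, so it is not the exact code behind the videos above.

```matlab
% Simplified seeding: put the next center at the densest point among the
% data that the current (single) model explains poorly.
rng(1);
X = [randn(300,2); bsxfun(@plus, randn(100,2), [8 8])];   % toy 2-cluster data
[N,d] = size(X);

mu    = mean(X,1);                          % step 1: first center = data mean
Sigma = cov(X);

% step 3: likelihood of every point under the current model
Xc = bsxfun(@minus, X, mu);
lm = exp(-0.5*sum((Xc/Sigma).*Xc, 2)) / sqrt((2*pi)^d * det(Sigma));

% steps 4-7, collapsed: take the 30 % of points with the lowest likelihood
s    = sort(lm);
cand = X(lm <= s(round(0.3*N)), :);

% step 8: among the candidates, pick the point with the densest neighbourhood
D = sqrt(max(0, bsxfun(@plus, sum(cand.^2,2), sum(cand.^2,2)') - 2*(cand*cand')));
r = 0.1*max(D(:));                          % heuristic neighbourhood radius
[~,idx]   = max(sum(D < r, 2));
newCenter = cand(idx,:)                     % seed for the next EM run (step 2)
```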

Further reading:
Initialization of GMM models:
Very helpful was the article by Lee, Lee & Lee: The estimating optimal number of GMM based on incremental k-means for speaker identification

Other articles about the optimal number of clusters and initialization centers of models:
Selecting the optimal number of components for GMM
An algorithm for estimating number of components of GMM based on penalized distance
Number of components and initialization in Gaussian Mixture Model

Saturday, November 5, 2011

Gaussian mixture models

In the last few days I have had big trouble implementing Gaussian mixture models in Matlab. Of course, I know that it has already been done many times, but I need my own script so that I can work with it and understand each step.

As I mentioned in the last post, I was terribly surprised (and it wasn't a good surprise) when I realised that the Neural modeling fields algorithm by Perlovsky is essentially just an extension of the Gaussian mixture model algorithm. So I decided to start with Gaussian mixture models and then continue with dynamic fuzzy logic, adding models and changing parameters.

What is it all about? You have some data and there could be some patterns (clusters) in it which you'd like to find. There are several techniques for this, and among them the most popular is k-means clustering. A Gaussian mixture model supposes that the data are the result of a mixture of several patterns (models, clusters) and that each of these models can be described by the Gaussian probability density function:
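
In the usual notation, where each model h has a center μ_h, a covariance matrix Σ_h, and d is the dimension of the data:

$$\mathcal{N}(x \mid \mu_h, \Sigma_h) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_h|^{1/2}} \exp\!\left(-\tfrac{1}{2}(x-\mu_h)^{T}\Sigma_h^{-1}(x-\mu_h)\right)$$

The whole dataset is then modelled as the mixture $p(x) = \sum_{h=1}^{K} r(h)\,\mathcal{N}(x \mid \mu_h, \Sigma_h)$, with mixing weights $r(h)$ that sum to one.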


There is the same problem as in k-means clustering: we don't know how many clusters we should look for. So we try different numbers, and we must be very careful not to overfit the data (this could end up in a situation with one model for each data point).

Let's say that we know how many clusters to expect (an a priori given number of components). Now we can initialize the centers of the Gaussians randomly, or we can use the k-means clustering algorithm to ease the work.

But now comes the hard part. We have to determine the parameters of the mixture. The most widely used technique is apparently the Expectation-Maximization (EM) algorithm. EM is of particular appeal for finite normal mixtures, where closed-form expressions are possible, such as in the iterative algorithm by Dempster et al. (1977). All the work is about maximizing the log-likelihood function for this problem (the similarity between models and data).
The log-likelihood can be expressed as:
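
In the usual notation, with K models, mixing weights r(h) and ℓ(x_i | h) denoting the likelihood of datapoint i under model h:

$$\ln L(\theta) = \sum_{i=1}^{n} \ln \sum_{h=1}^{K} r(h)\,\ell(x_i \mid h)$$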

Here θ is the set of model parameters (the centers, covariance matrices and weights of the models), and n is the number of datapoints.

Using the covariance matrix, we compute for each datapoint the likelihood ℓ that it corresponds to the given model.
Further, f will denote the probability that the signal X was generated by the model M (so, for each datapoint, it should sum to 1 over the models). I will use some explanations and equations from Perlovsky's work:
r(h) says what fraction of the signals is associated with the given model, r(h) = sum(f(h))/N.
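
In symbols, using the notation above:

$$f(h \mid i) = \frac{r(h)\,\ell(x_i \mid h)}{\sum_{h'} r(h')\,\ell(x_i \mid h')}, \qquad r(h) = \frac{1}{N}\sum_{i=1}^{N} f(h \mid i)$$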

Now we have to dynamically change the parameters - update them so that they maximize the likelihood, given the values of f we have just computed.
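
The standard update equations, written in the notation above, are weighted averages over the datapoints with the values f(h | i) as weights:

$$\mu_h = \frac{\sum_{i} f(h \mid i)\,x_i}{\sum_{i} f(h \mid i)}, \qquad \Sigma_h = \frac{\sum_{i} f(h \mid i)\,(x_i-\mu_h)(x_i-\mu_h)^{T}}{\sum_{i} f(h \mid i)}, \qquad r(h) = \frac{1}{N}\sum_{i} f(h \mid i)$$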


The same dynamic update can be described in words, only with different notation:

Thus, on the basis of the current estimate of the parameters, the conditional probability of a given observation x(t) being generated from state s is determined for each t = 1, …, m, with m being the sample size. The parameters are then updated so that the new component weights correspond to the average conditional probability, and each component mean and covariance is the component-specific weighted average of the mean and covariance of the entire sample.

Now we can use these new parameters, iteratively compute the likelihoods for each datapoint and model, compute new parameters again, and so on.
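
Putting it together, here is a minimal toy sketch of the EM loop in Matlab, in the notation used above (l = per-model likelihoods, f = responsibilities, r = mixing weights). It works on simulated data with a fixed number of models K and is only an illustration, not my full script:

```matlab
% Minimal EM for a GMM with full covariances and a fixed number of models K.
rng(2);
X = [randn(200,2); bsxfun(@plus, 0.5*randn(150,2), [4 0])];   % toy data
[N,d] = size(X);
K = 2;

perm  = randperm(N);
mu    = X(perm(1:K), :);                  % random datapoints as initial centers
Sigma = repmat(cov(X), [1 1 K]);          % start from the global covariance
r     = ones(1,K)/K;                      % equal mixing weights
loglik = zeros(1,100);

for iter = 1:100
    % E-step: likelihood of each point under each model, then responsibilities
    l = zeros(N,K);
    for h = 1:K
        Xc     = bsxfun(@minus, X, mu(h,:));
        l(:,h) = exp(-0.5*sum((Xc/Sigma(:,:,h)).*Xc, 2)) ...
                 / sqrt((2*pi)^d * det(Sigma(:,:,h)));
    end
    loglik(iter) = sum(log(l*r'));        % log-likelihood with current parameters
    f = bsxfun(@times, l, r);
    f = bsxfun(@rdivide, f, sum(f,2));    % normalize: rows sum to one

    % M-step: update weights, means and covariances from the responsibilities
    Nh = sum(f,1);                        % effective number of points per model
    r  = Nh/N;
    for h = 1:K
        mu(h,:) = f(:,h)'*X / Nh(h);
        Xc      = bsxfun(@minus, X, mu(h,:));
        Sigma(:,:,h) = (bsxfun(@times, Xc, f(:,h))'*Xc) / Nh(h) + 1e-6*eye(d);
    end
end

plot(loglik); xlabel('iteration'); ylabel('log-likelihood');
```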

Main problems which occurred during the implementation:

1. I know I should know how to multiply matrices, but when you are in an n-dimensional space with many different matrices, it can be quite tricky...
2. Covariance matrices - there should be one for each model (because you want to change the parameters of each model separately). You can have a scalar (isotropic) covariance, a diagonal covariance matrix (numbers only on the diagonal), or a full covariance matrix - I recommend starting with the simpler forms and then continuing to the full covariance matrix. And keep in mind that the covariance matrix is the multidimensional generalization of the variance (the squared standard deviation) of the data.
3. It isn't mentioned everywhere, but you have to normalize everything that should be normalized (for example, for each datapoint the values f must sum to one over the models).

Ok, ok, but where is Perlovsky's idea? I will surely describe it more precisely in a further post. But the main idea is to start from a highly fuzzy Gaussian model, make it more and more crisp, then add parameters, and fit the data as well as possible. So in the next posts I'll discuss some problems with adding models, changing parameters, etc.

Further reading:
Nice explanations of Neural modeling fields, the EM algorithm, mixture models and the Gaussian probability density function can be found on Wikipedia.

About Gaussian mixture models, for example:
http://www.ll.mit.edu/mission/communications/ist/publications/0802_Reynolds_Biometrics-GMM.pdf

Very nice master thesis:
http://phd.arngren.com/DL/imm5217.pdf

Perlovsky's work:
book: Perlovsky, L.I. (2001). Neural Networks and Intellect: Using Model-Based Concepts. Oxford University Press, New York, NY (3rd printing).
some useful articles:
http://www.leonid-perlovsky.com/11%20-%20FDL%20NMNC.pdf
http://www.leonid-perlovsky.com/papers-online.html