The LSVM (SVM with latent variables) is mostly used for human figure detection. It is very effective because it takes the human figure's structure into consideration: a human figure has a head, hands, and legs. The LSVM models the human figure with 6 parts, and the positions of these 6 parts are latent variables.

The basic logic is to slide a window over the image: at every position we get a small image patch, and by scoring this patch we can predict whether it contains a human figure or not.
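The sliding-window loop itself can be sketched as below. The image is simplified to a 2D grid of values, and `scoreWindow` is a hypothetical stand-in (here just a sum) for the real part-based score function described later; names and threshold are illustrative assumptions, not the actual LSVM implementation.

```cpp
#include <vector>

struct Detection { int x, y; double score; };

// Hypothetical score function: here simply the sum of the window's values.
// In the real detector this would be the learned part-based score.
double scoreWindow(const std::vector<std::vector<double>>& img,
                   int x0, int y0, int w, int h) {
    double s = 0.0;
    for (int y = y0; y < y0 + h; ++y)
        for (int x = x0; x < x0 + w; ++x)
            s += img[y][x];
    return s;
}

// Slide a w-by-h window over the image and keep positions whose
// score exceeds the threshold.
std::vector<Detection> slideWindow(const std::vector<std::vector<double>>& img,
                                   int w, int h, double threshold) {
    std::vector<Detection> hits;
    int H = static_cast<int>(img.size());
    int W = static_cast<int>(img[0].size());
    for (int y = 0; y + h <= H; ++y)
        for (int x = 0; x + w <= W; ++x) {
            double s = scoreWindow(img, x, y, w, h);
            if (s > threshold) hits.push_back({x, y, s});
        }
    return hits;
}
```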

Defining the score function

Anyway, the first thing to do is to define a score function:

Structural SVM is a variation of SVM, hereafter referred to as SSVM.

Special prediction function of SSVM

Firstly let's recall the normal SVM's prediction function:

$f(x) = \operatorname{sgn}(\omega \cdot x + b)$

ω is the weight vector, x is the input, b is the bias, sgn is the sign function, and f(x) is the prediction result.
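As a minimal sketch, the linear SVM prediction can be written as a dot product plus bias followed by a sign; the function name `predict` and the convention sgn(0) = +1 are my own illustrative choices:

```cpp
#include <vector>
#include <cstddef>

// Linear SVM prediction: f(x) = sgn(w·x + b).
// w and b are assumed to come from an already-trained model.
int predict(const std::vector<double>& w, const std::vector<double>& x, double b) {
    double s = b;
    for (std::size_t i = 0; i < w.size(); ++i)
        s += w[i] * x[i];          // accumulate the dot product w·x
    return s >= 0 ? 1 : -1;       // sign function (0 mapped to +1 here)
}
```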

One of SSVM's specialties is its prediction function:

$\hat{y} = \operatorname{argmax}_{y \in \Upsilon} \; \omega \cdot \Phi(x, y)$

y is a possible prediction result, Υ is the search space of y, and Φ is a function of x and y. Φ is a joint feature vector that describes the relationship between x and y.

Then for some given ω, different predictions will be made according to different x.
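The argmax prediction can be sketched for a toy multiclass case. Here I assume (purely for illustration) that Φ(x, y) places x into the block of the feature vector belonging to class y, so ω·Φ(x, y) reduces to a dot product of x with the y-th block of ω; the function name `predictSSVM` is hypothetical:

```cpp
#include <vector>

// SSVM prediction ŷ = argmax_{y ∈ Y} ω·Φ(x, y) for a toy setting where
// the label space Y is {0, …, numClasses-1} and Φ(x, y) copies x into
// the y-th block of a long feature vector.
int predictSSVM(const std::vector<double>& w,   // length = numClasses * dim
                const std::vector<double>& x,   // length = dim
                int numClasses) {
    int dim = static_cast<int>(x.size());
    int best = 0;
    double bestScore = -1e300;
    for (int y = 0; y < numClasses; ++y) {
        // ω·Φ(x, y) = dot product of x with w's y-th block.
        double s = 0.0;
        for (int i = 0; i < dim; ++i)
            s += w[y * dim + i] * x[i];
        if (s > bestScore) { bestScore = s; best = y; }
    }
    return best;   // the argmax over the label space
}
```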

Sepp Hochreiter graduated from Technische Universität München; LSTM was invented while he was at TUM, and he is now the head of the Institute of Bioinformatics at Johannes Kepler University Linz.

Today he came by Munich and gave a lecture at the Fakultät für Informatik.

At first, Hochreiter praised how hot Deep Learning (hereafter referred to as DL) is around the world these days, especially LSTM, which is now used in the new version of Google Translate published a few days ago. The improvements DL has made in the fields of vision and NLP are very impressive.

Then he started to explain the magic of DL, taking face recognition as an example, using the so-called CNN (Convolutional Neural Network):


Centering and Scaling



Simple Neuron


The above diagram shows a neuron in a neural network; it simulates a biological neuron:

it has inputs $x_1, \dots, x_n$

it has a weight for each input: $w_1, \dots, w_n$, forming the weight vector $w$

it has a bias $b$

it has a threshold for the “activation function”
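Putting these pieces together, a single neuron computes an activation of the weighted sum of its inputs plus the bias. A minimal sketch, assuming the logistic sigmoid as the activation function (other activations work the same way):

```cpp
#include <cmath>
#include <vector>
#include <cstddef>

// Logistic sigmoid activation: squashes any real z into (0, 1).
double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// A single neuron: output = activation(w·x + b).
double neuron(const std::vector<double>& w, const std::vector<double>& x, double b) {
    double z = b;
    for (std::size_t i = 0; i < w.size(); ++i)
        z += w[i] * x[i];          // weighted sum of the inputs
    return sigmoid(z);             // apply the activation function
}
```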


For the i-th sample $(x_i, y_i)$ in the training set, we have the following loss function:

$L_i = \sum_{j \neq y_i} \max(0,\, s_j - s_{y_i} + \Delta)$

$s_j = \omega_j \cdot x_i$ is the score for classifying $x_i$ to class j, $s_{y_i}$ is the score for classifying it correctly (to class $y_i$), and $\omega_j$ is the j-th row of the weight matrix $W$.


Problem 1:

Considering the geometrical meaning of the weight matrix $W$, it is easy to find out that $W$ is not unique: $W$ can change within a small region and still result in the same loss $L_i$.

Problem 2:

If the values in $W$ are scaled, the computed loss will also be scaled by the same ratio. Consider a loss of 15: if we scale all the weights in $W$ by 2, the loss is scaled to 30. But this kind of scaling is meaningless; it doesn’t really represent the loss.

Traditional Thread

C++11 supports multi-threading, for which we previously used pthread or boost. Standard C++11 threading is modeled on boost::thread; now it is cross-platform and needs no external dependency.

#include <thread>
#include <iostream>

void testThread()
{
	std::cout << "I am a separate thread...\n";
}

int main()
{
	std::thread newThread(testThread); // now it's running
	// do something else ...
	newThread.join(); // newThread must be joined before main ends
	return 0;
}

The code is simple: the thread function starts running as soon as the std::thread object is constructed.
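A further sketch: std::thread also accepts lambdas, including lambdas that capture local variables. The helper below (`addInThread` is a name I made up for illustration) runs an addition on a separate thread and joins it before reading the result:

```cpp
#include <thread>

// Run a + b on a separate thread; join before returning so the
// captured result is safe to read.
int addInThread(int a, int b) {
    int result = 0;
    std::thread worker([&result, a, b]() { result = a + b; });
    worker.join();   // wait for the worker before reading result
    return result;
}
```

Note that `result` is captured by reference, which is only safe here because `join()` guarantees the worker has finished before `result` is read.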