
In his latest entry Steffen (Mit dem Kopf voran) posted the following quote from Enrico Fermi:
"I remember my friend Johnny von Neumann used to say, 'with four parameters I can fit an elephant and with five I can make him wiggle his trunk.'" A meeting with Enrico Fermi, Nature 427, 297; 2004.
I know that there exists a fool-proof method for sculpting an elephant (get a huge block of marble and chip away everything that doesn't look like an elephant), but fitting an elephant with four parameters is an impossible task. Actually, the question "How many parameters does it take to fit an elephant?" was already answered by Wei in 1975: he started with an idealized drawing (A) defined by 36 points and showed that one could be satisfied with the fit of a 30-term model:
[Figure: elephant]
The point of the elephant story* is what every student is told in their first econometrics lecture: the residual sum of squares can't increase when more explanatory variables (and therefore more parameters) are added to a linear regression model. As a rule: the more explanatory variables, the better the in-sample fit. The problem that arises when one keeps adding regressors is that the estimators become very imprecise, i.e. overfitted models (models with too many parameters) have estimated (and actual) sampling variances that are needlessly large, so the model will perform poorly on new data.
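The claim that the residual sum of squares cannot increase is easy to check numerically. The following is a minimal sketch, assuming NumPy is available; the data are simulated, and the names `rss`, `noise_regressors`, and `rss_path` are illustrative only. The response is pure noise, so every added regressor is useless by construction, yet the in-sample fit still never gets worse.

```python
import numpy as np

rng = np.random.default_rng(0)                # fixed seed, arbitrary choice
n = 50
y = rng.normal(size=n)                        # response: pure noise
noise_regressors = rng.normal(size=(n, 10))   # regressors unrelated to y


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


# Fit nested models: intercept only, then one more regressor at a time.
rss_path = []
for k in range(noise_regressors.shape[1] + 1):
    X = np.column_stack([np.ones(n), noise_regressors[:, :k]])
    rss_path.append(rss(X, y))

for k, value in enumerate(rss_path):
    print(f"{k:2d} regressors: RSS = {value:.3f}")
```

Because each model is nested in the next one, the printed sequence is non-increasing even though none of the regressors has anything to do with y.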

For a linear model containing predictors x_1, x_2, ..., x_k with coefficient estimators b_0, b_1, ..., b_k: if the regressor x_{k+1} is added to the model, producing new coefficient estimators b*_0, b*_1, ..., b*_{k+1}, then for i = 0, ..., k

Var(b*_i) ≥ Var(b_i)

Ergo: If a model is made more complex, the variance of model predictions will increase (since the parameters are estimated less precisely).
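The inequality can be illustrated with the textbook OLS formula Var(b) = σ² (X'X)^(-1). This is a sketch under assumptions that are not in the text above: a fixed design, a known noise variance `sigma2`, and simulated regressors `x1` and `x2` that are deliberately correlated. The helper `coef_variances` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(1)             # fixed seed, arbitrary choice
n = 100
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # added regressor, correlated with x1
sigma2 = 1.0                               # noise variance, assumed known


def coef_variances(X, sigma2):
    """Diagonal of sigma^2 (X'X)^{-1}: the sampling variances of the OLS coefficients."""
    return sigma2 * np.diag(np.linalg.inv(X.T @ X))


X_small = np.column_stack([np.ones(n), x1])       # model with x1 only
X_large = np.column_stack([np.ones(n), x1, x2])   # model with x1 and x2

var_small = coef_variances(X_small, sigma2)
var_large = coef_variances(X_large, sigma2)[:2]   # same coefficients, larger model

print("Var(b_0), Var(b_1) without x2:", var_small)
print("Var(b_0), Var(b_1) with x2:   ", var_large)
```

The stronger the correlation between the new regressor and the existing ones, the larger the inflation; with an exactly orthogonal regressor the variances would stay the same.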

On the other hand, if important regressors are excluded, the predicted values of the dependent variable ("y hat") will systematically deviate from the observed values (y), and the residuals (which contain the omitted variable) will be unnecessarily large. If the assumed model is incorrect because an important predictor has been excluded, the bias can be reduced by including that predictor.
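A small Monte Carlo experiment makes the bias from an excluded regressor visible. This is a sketch with made-up numbers: the true coefficients `beta1 = 2` and `beta2 = 3`, the 0.5 link between `x1` and `x2`, and the sample sizes are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)    # fixed seed, arbitrary choice
n, reps = 200, 500                # sample size and number of replications
beta1, beta2 = 2.0, 3.0           # hypothetical true coefficients

slopes_short, slopes_full = [], []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.5 * x1 + rng.normal(size=n)               # x2 is correlated with x1
    y = 1.0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)

    X_short = np.column_stack([np.ones(n), x1])      # x2 wrongly omitted
    X_full = np.column_stack([np.ones(n), x1, x2])   # correctly specified
    b_short, *_ = np.linalg.lstsq(X_short, y, rcond=None)
    b_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
    slopes_short.append(b_short[1])
    slopes_full.append(b_full[1])

mean_short = float(np.mean(slopes_short))
mean_full = float(np.mean(slopes_full))
print(f"average slope on x1, x2 omitted:  {mean_short:.3f}")
print(f"average slope on x1, x2 included: {mean_full:.3f}")
```

In this setup the short model's slope settles near beta1 + 0.5 * beta2 = 3.5 rather than 2, because x1 picks up the effect of the omitted x2; including x2 removes that bias.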

The figure below shows the trade-off between squared bias (solid line) and variance as a function of the number of estimable parameters in the model. All model selection methods implicitly employ some notion of this trade-off:
[Figure: bias-variance trade-off]
Keep in mind that the best approximating model need not occur exactly where the two curves intersect. Conclusion: Whether a predictor is worth adding to a model (for a fixed vector of regressors, x) depends on whether the reduction in squared bias is greater than the increase in variance.

Model too simple ⇔ high bias/low variance

Model too complex ⇔ low bias/high variance
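The two rules of thumb above can be reproduced in a toy simulation. This is a sketch, not the setup behind the figure: the true function sin(2πx), the noise level `sigma`, the evaluation point `x0`, and the polynomial degrees are all assumptions made for illustration. Polynomials of increasing degree are fitted to repeated noisy samples, and squared bias and variance of the prediction at the fixed point `x0` are estimated across replications.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(3)     # fixed seed, arbitrary choice
n, reps, sigma = 30, 500, 0.3      # sample size, replications, noise std. dev.
x = np.linspace(0.0, 1.0, n)       # fixed design points
x0 = 0.25                          # fixed point at which predictions are compared


def true_f(t):
    """Hypothetical true regression function."""
    return np.sin(2.0 * np.pi * t)


degrees = [1, 3, 9]                # a degree-d polynomial has d + 1 parameters
predictions = {d: [] for d in degrees}
for _ in range(reps):
    y = true_f(x) + rng.normal(scale=sigma, size=n)
    for d in degrees:
        fit = Polynomial.fit(x, y, d)       # least-squares polynomial fit
        predictions[d].append(fit(x0))

bias2, variance = {}, {}
for d in degrees:
    p = np.asarray(predictions[d])
    bias2[d] = float((p.mean() - true_f(x0)) ** 2)   # squared bias at x0
    variance[d] = float(p.var())                     # variance at x0
    print(f"degree {d}: bias^2 = {bias2[d]:.4f}, variance = {variance[d]:.4f}")
```

The straight line (degree 1) is too simple for a sine curve and shows a large squared bias with a small variance; the degree-9 polynomial has almost no bias but a clearly larger variance.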


Note for Advanced Readers: Mean Squared Error (MSE):

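Assume z = h(x,θ)+v, where z is the dependent variable, h(.) is some function, x is the vector of regressors, v is noise and θ is a vector of model parameters. The MSE of the model at a fixed x can be decomposed into squared bias and variance (plus the noise variance, if the error is measured against z rather than against h(x,θ)).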
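A standard way to write this decomposition is sketched below. It rests on assumptions that are not stated in the text above: the noise satisfies E[v] = 0 and Var(v) = σ², v is independent of the parameter estimate, and the expectation is taken over repeated samples. The symbols θ̂ (the estimate of θ) and σ² are introduced here for illustration, and the notation may differ from that of Burnham & Anderson.

```latex
\begin{aligned}
\mathrm{MSE}(x)
  &= \mathrm{E}\!\left[\bigl(z - h(x,\hat\theta)\bigr)^{2}\right] \\
  &= \underbrace{\sigma^{2}}_{\text{noise}}
   + \underbrace{\bigl(\mathrm{E}[h(x,\hat\theta)] - h(x,\theta)\bigr)^{2}}_{\text{squared bias}}
   + \underbrace{\mathrm{E}\!\left[\bigl(h(x,\hat\theta) - \mathrm{E}[h(x,\hat\theta)]\bigr)^{2}\right]}_{\text{variance}}
\end{aligned}
```

The noise term does not depend on the model, so comparing models at a fixed x comes down to the sum of squared bias and variance.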

*If one shows up with a model with lots of parameters, opponents will argue that one has used enough parameters to fit an elephant.

Graphics taken from Model Selection and Multi-Model Inference,
Kenneth Burnham & David Anderson