mathstat

Recently, I asked myself the question: "Why is the expected value of e^(sX) (X ~ N(0,1)) equal to e^(0.5*s^2)?" You know that compounding a sum of money A at a continuously compounded rate X for one period means multiplying it by e^X. If X is non-random, the expected value is A*e^X. If X is random, calculating the expected value becomes quite difficult. The solution for a standard normally distributed X can be derived as follows:

1. Write down the equation (μ = E[e^(sX)] where X ~ N(0,1)):

μ = ∫ e^(sx) · (1/√(2π)) · e^(-x^2/2) dx = e^(0.5*s^2) · ∫ (1/√(2π)) · e^(-(x-s)^2/2) dx

2. Substitute y = x - s and you get:

μ = e^(0.5*s^2) · ∫ (1/√(2π)) · e^(-y^2/2) dy = e^(0.5*s^2),

since the remaining integrand is the standard normal density and integrates to 1.

More generally, if X ~ N(μ*, σ^2), then X = μ* + σZ with Z ~ N(0,1). It follows that

E[e^(sX)] = e^(s·μ*) · E[e^(s·σ·Z)] = e^(s·μ* + 0.5*s^2·σ^2).

Of course, practitioners often know the parameters they are interested in (e.g. s = 0.3), so they can take a shortcut and run a Monte Carlo analysis:

s <- 0.3
n <- 10000
x <- rnorm(n)                 # n draws from the standard normal
answer <- mean(exp(s*x))      # Monte Carlo estimate of E[e^(sX)]

Or a bit more sophisticated and less arbitrary:

s <- 0.3
n <- 10000
i <- (1:n)/(n+1)              # equally spaced probabilities in (0,1)
x <- qnorm(i)                 # corresponding standard normal quantiles
answer <- mean(exp(s*x))      # deterministic (quantile-grid) estimate of E[e^(sX)]
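
For s = 0.3, both estimates can be checked against the closed-form value derived above:

exp(0.5 * 0.3^2)              # exact value of E[e^(0.3*X)], approximately 1.046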

NB: The expected value can be very misleading.
Source: Stochastische Grundlagen der Finanzmathematik, Klaus Pötzelberger

Ernest Chan (blog) writes in his book Quantitative Trading - How to Build Your Own Algorithmic Trading Business:
Here is a little puzzle that may stymie many a professional trader. Suppose a certain stock exhibits a true (geometric) random walk, by which I mean there is a 50-50 chance that the stock is going up 1 percent or down 1 percent every [day]. If you buy this stock, are you most likely--in the long run and ignoring financing costs--to make money, lose money, or be flat?
Most traders will blurt out the answer "Flat!," and that is wrong. The correct answer is that you will lose money, at the rate of 0.005 percent (or 0.5 basis points) every [day]! This is because for a geometric random walk, the average compounded rate of return is not the return μ, but g = μ - σ^2/2.
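
Chan's 0.5-basis-point figure is easy to check directly in R:

up <- log(1.01); down <- log(0.99)     # log returns of the two equally likely moves
g <- 0.5*up + 0.5*down                 # expected compounded (log) return per day
g                                      # about -0.00005, i.e. -0.005% or -0.5 basis points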
For this very reason, geometric Brownian Motion is often written as

d(ln S_t) = (μ - σ^2/2) dt + σ dW_t,

where "μ - σ^2/2" is the expected (compounded) return and "μ" is the return of the expected prices, i.e. ln(E[S_t]/E[S_(t-1)]).

If you lose 50 percent of your portfolio, you have to make 100 percent to get back to even... that's what everybody knows. But it's also interesting to see how mild volatility hurts over time. Here are ten (random, no cherry-picking) realizations of a geometric Brownian Motion with a daily volatility of 1% (i.e. a yearly volatility of roughly 16%, assuming 252 trading days) over a period of 100 years:
[Figures: ten simulated realizations of the geometric Brownian Motion]
R Development Core Team (2008). R: A language and environment for statistical computing.
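
For reference, a minimal sketch of how one such path can be simulated, assuming zero price drift (μ = 0), so that the compounded growth rate is -σ^2/2 per day:

set.seed(1)                                                  # any seed; the paths above were not cherry-picked either
sigma <- 0.01                                                # 1% daily volatility
days <- 252 * 100                                            # 100 years of trading days
logS <- cumsum(rnorm(days, mean = -sigma^2/2, sd = sigma))   # log-price under zero drift (mu = 0)
plot(exp(logS), type = "l", xlab = "trading day", ylab = "price (starting at 1)")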

PS: Hey, this looks promising:
[Figure]

A while ago, we addressed the following question: A portfolio manager knows that his strategy can, on average, outperform the benchmark index by 3% annually. His portfolio has an annual volatility (standard deviation) of 25% against the index's 15%. Assuming that the correlation between the returns of the portfolio and the returns of the index is 0.9, how many years would it take to outperform the index with 90% probability?

The correct answer is a whopping 300 years! (apply the Itô-Döblin formula)
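
A sketch of how the order of magnitude can be reproduced in R, under my reading of the hint: the 3% is an arithmetic excess return, so the Itô-Döblin correction -σ^2/2 is applied to both assets, and outperformance is measured as log(Portfolio/Index) > 0:

alpha <- 0.03                              # annual outperformance (arithmetic)
sp <- 0.25; si <- 0.15; rho <- 0.9         # volatilities and correlation
drift <- alpha - (sp^2 - si^2)/2           # drift of log(Portfolio/Index) after the Ito correction
te <- sqrt(sp^2 + si^2 - 2*rho*sp*si)      # volatility of log(Portfolio/Index)
years <- (qnorm(0.9) * te / drift)^2       # solve drift*T = qnorm(0.9)*te*sqrt(T) for T
years                                      # roughly 290 years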

Today somebody asked me if I could run a couple of simulations to get a better understanding of the result. What I did was plot 20 simulations of log(Portfolio/Index) for varying correlations. For ρ = 0.9, 2 out of 20 (10%) end up, as expected, below zero:
[Figure: 20 simulated paths of log(Portfolio/Index) for varying correlations]
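
A minimal sketch of such a simulation for the ρ = 0.9 case only, reusing the drift and tracking-error volatility from the calculation above:

set.seed(2)
nyears <- 300; npaths <- 20
drift <- 0.01                                      # annual drift of log(Portfolio/Index), see above
te <- sqrt(0.25^2 + 0.15^2 - 2*0.9*0.25*0.15)      # tracking-error volatility for rho = 0.9
paths <- replicate(npaths, cumsum(rnorm(nyears, mean = drift, sd = te)))
matplot(paths, type = "l", lty = 1, xlab = "year", ylab = "log(Portfolio/Index)")
mean(paths[nyears, ] < 0)                          # fraction below zero after 300 years, roughly 10% in expectation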

While strolling through the library, I figured out that the number of pages of my master's thesis is five standard deviations below the average number of pages my friends wrote to graduate from the Vienna University of Economics and Business Administration:

Author Pages Title
Michael Sigmund 139 Anwendungsgebiete der Spieltheorie in den Sozialwissenschaften
Robert Ferstl 127 Werkzeuge zur Analyse räumlicher Daten - eine Softwareimplementation in EViews und MATLAB
Karin Doppelbauer 122 Analiz rʹinka mjasa kur v Rossii - Eine Analyse des russischen Geflügelfleischmarktes
Markus Pock 107 Untersuchungen zu Wachstumseffekten der WWU mittels Zeitreihenanalyse
Christian Kraxner 105 Using credit derivatives for managing corporate bond portfolios
Christian Balbier 104 Föderale Strukturen in den neuen Mitgliedsstaaten der Europäischen Union
Stefan Woytech 103 Harmonisierung von internem und externem Rechnungswesen auf Basis der IAS/IFRS
Anton Burger 92 Reasons for the U.S. Growth Experience in the Nineties: Non Keynesian Effects, the Capital Market and Technology
Michael Stastny 31 Economic Growth and Output Variability: An Empirical Analysis (pdf)

The outlier status of my thesis is confirmed by conventional tests for outliers:
[Output of the outlier tests]
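
Both the five-standard-deviation claim and a conventional test can be reproduced in R from the page counts in the table above; the choice of Grubbs' test (from the outliers package) is my guess at what a "conventional test" looks like here:

pages <- c(139, 127, 122, 107, 105, 104, 103, 92, 31)   # page counts from the table above
friends <- pages[pages != 31]                           # everyone except my thesis
(mean(friends) - 31) / sd(friends)                      # about 5.3 standard deviations below their mean

library(outliers)                                       # assumption: Grubbs' test as the "conventional test"
grubbs.test(pages)                                      # tests whether the lowest value (31) is an outlier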
How cool is that?

The difference between an event being almost sure and sure is the same as the subtle difference between something happening with probability 1 and happening always.

If an event is sure, then it will always happen. No other event (even events with probability 0) can possibly occur. If an event is almost sure, then there are other events that could happen, but they happen almost never, that is with probability 0.

Cool Example: Throwing a dart
For example, imagine throwing a dart at a square, and imagine that this square is the only thing in the universe. There is physically nowhere else for the dart to land. Then, the event that "the dart hits the square" is a sure event. No other alternative is imaginable.

Next, consider the event that "the dart hits the diagonal of the square exactly". The probability that the dart lands on any subregion of the square is equal to the area of that subregion. But, since the area of the diagonal of the square is zero, the probability that the dart lands exactly on the diagonal is zero. So, the dart will almost surely not land on the diagonal, or indeed any other given line or point. Notice that even though there is zero probability that it will happen, it is still possible.

Source: Wikipedia

There is an old conundrum in queueing theory that goes like this:
  • A passenger arrives at a bus-stop at some arbitrary point in time
  • Buses arrive according to a Poisson process
  • The mean interval between the buses is 10 min.
What is the mean waiting time until the next bus?
Answer: 10 min. This is an example of length-biased sampling. The explanation of the paradox is that a passenger is more likely to arrive during a long interarrival interval than during a short one. Here is a neat non-technical explanation (taken from this book):

Given the interarrival interval, the arrival instant of the passenger is uniformly distributed within that interval, so the expected waiting time is one half of the total duration of the interval. The point is that selection by a random instant represents long intervals more frequently than short ones (with a weight proportional to the length of the interval).

Consider a long period of time t. The waiting time to the next bus arrival W(τ) as a function of the arrival instant τ of the passenger is represented by:
W(τ) = T_i - τ   for T_(i-1) ≤ τ < T_i,

where T_i = X_1 + ... + X_i are the bus arrival times and the X_i are the interarrival intervals. The mean waiting time, W_bar, is the average value of this sawtooth curve:

W_bar = (1/t) · ∫ W(τ) dτ (from 0 to t) ≈ (1/t) · Σ X_i^2/2   (i = 1, ..., n)

Note that long interarrival intervals contribute much more than short ones to the average waiting time. As t grows, t/n -> X_bar, hence

W_bar ≈ (n/t) · (1/n) · Σ X_i^2/2 -> E[X^2] / (2·E[X]).

For the exponential distribution (as the X_i are distributed),

E[X^2] = Var(X) + (E[X])^2 = 2·(E[X])^2.

Therefore,

W_bar = 2·(E[X])^2 / (2·E[X]) = E[X].

Altogether,

W_bar = E[X] = 10 min.
Q.E.D.
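
The result is also easy to confirm by simulation; a minimal sketch in R (exponential interarrival times with mean 10 minutes, passengers arriving at uniformly random instants):

set.seed(3)
gaps <- rexp(1e6, rate = 1/10)                   # interarrival times, mean 10 minutes
buses <- cumsum(gaps)                            # bus arrival times
arrive <- runif(1e5, 0, max(buses) - 100)        # random passenger arrival instants
wait <- buses[findInterval(arrive, buses) + 1] - arrive   # time from each passenger to the next bus
mean(wait)                                       # about 10, not 5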

Sources:
Advanced Course in Operating Systems (University of Haifa), Lecture 1 & 2

New Economist writes: "In a new post on the Statistical Modeling, Causal Inference, and Social Science blog, Aleks Jakulin at Columbia University points us to a great online tool, ZunZun. It lets you use 2 and 3 dimensional 'Function Finders' to 'help determine the best curve fit for your data'."

On William Greene's site I found a neat data set (Data Tables :: Table F6.1) for estimating a Cobb-Douglas production function. ZunZun comes up with the following suggestion:

Y = β1·(L^0.5 · K^0.5) + β2·(cos(L) · K^1.5)

Surface Plot: [figure]
Contour Plot: [figure]
The R^2 reaches an unrealistic 0.968 (0.94 for the Cobb-Douglas specification). Textbook data... Here is a scatterplot of the log-transformed data (created with R):
[Figure: 3D scatterplot of the log-transformed data]
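
For comparison, the Cobb-Douglas fit itself is a one-line regression in R once Greene's Table F6.1 has been read into a data frame; the object and column names below (dat with Y, L and K) are placeholders for however the data is imported:

fit <- lm(log(Y) ~ log(L) + log(K), data = dat)   # Cobb-Douglas: log-linear in labor and capital
summary(fit)$r.squared                            # R-squared of the log-linear fit (the post quotes 0.94 for Cobb-Douglas)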

[Two figures reproduced from the article cited below]

The Viewpoints 2000 Group
Mathematics Magazine, Vol. 74, No. 4. (Oct., 2001), p. 320.

related items:
Neat Proofs, Mahalanobis

Taken from the bionet.info-theory FAQ: If someone says that information = uncertainty = entropy, then they are confused, or something was not stated that should have been. Those equalities lead to a contradiction, since entropy of a system increases as the system becomes more disordered. So information corresponds to disorder according to this confusion.

If you always take information to be a decrease in uncertainty at the receiver, you will get straightened out:
I = H_before - H_after

where H is the Shannon uncertainty:

H = - Σ_i p_i · log2(p_i)

and p_i is the probability of the i-th symbol. If you don't understand this, read the short Information Theory Primer.

Imagine that we are in communication and that we have agreed on an alphabet. Before I send you a bunch of characters, you are uncertain (H_before) as to what I'm about to send. After you receive a character, your uncertainty goes down (to H_after). H_after is never zero because of noise in the communication system. Your decrease in uncertainty is the information (I) that you gain.

Since H_before and H_after are state functions, this makes I a function of state. It allows you to lose information (it's called forgetting).

Many of the statements in the early literature assumed a noiseless channel, so the uncertainty after receipt is zero (H_after = 0). This leads to the SPECIAL CASE where I = H_before. But H_before is NOT "the uncertainty", it is the uncertainty of the receiver BEFORE RECEIVING THE MESSAGE.
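
A tiny numerical illustration of these two formulas (my own example, not from the FAQ): four equally likely symbols before receipt, and a noisy channel that leaves an ambiguity between two symbols after receipt.

H <- function(p) -sum(p[p > 0] * log2(p[p > 0]))   # Shannon uncertainty in bits

Hbefore <- H(rep(1/4, 4))         # 2 bits: four equally likely symbols
Hafter <- H(c(0.5, 0.5, 0, 0))    # 1 bit of ambiguity left by the noise
info <- Hbefore - Hafter          # information gained = 1 bit
info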

MIT news office: An international team of 18 mathematicians has mapped one of the largest and most complicated structures in mathematics. If written out on paper, the calculation describing this structure, known as E8, would cover an area the size of Manhattan.

The work is important because it could lead to new discoveries in mathematics, physics and other fields. In addition, the innovative large-scale computing that was key to the work likely spells the future for how longstanding math problems will be solved in the 21st century. Click here (or here ) to continue.

related items:
American Institute of Mathematics: E8
Lecture Slides: The Character Table for E8, or How We Wrote Down a 453,060 x 453,060 Matrix and Found Happiness, David Vogan, MIT