Recently, I asked myself: why is the expected value of e^(sX) (with X ~ N(0,1)) equal to e^(0.5·s^2)? Compounding a sum of money A at a continuously compounded rate X for one period means multiplying it by e^X. If X is non-random, the expected value is simply A·e^X. If X is random, calculating the expected value is less straightforward. For a standard normally distributed X it can be derived as follows:
1. Write down the expectation (μ = E[e^(sX)] where X ~ N(0,1)) and complete the square in the exponent:

μ = ∫ e^(sx) · (1/√(2π)) · e^(-x^2/2) dx = e^(0.5·s^2) · ∫ (1/√(2π)) · e^(-(x-s)^2/2) dx

2. Substitute y = x - s; the remaining integral is that of the N(0,1) density and equals 1, so you get:

μ = e^(0.5·s^2)

More generally, if X ~ N(μ*, σ^{2}), then X = μ* + σZ with Z ~ N(0,1). It follows that

E[e^(sX)] = e^(s·μ*) · E[e^(s·σ·Z)] = e^(s·μ* + 0.5·s^2·σ^2)

Of course, practitioners often know the parameters they are interested in (e.g. s = 0.3), so they can take a shortcut and run a Monte Carlo analysis:
s <- 0.3
n <- 10000
x <- rnorm(n)               # n pseudo-random draws from N(0,1)
answer <- mean(exp(s*x))    # Monte Carlo estimate of E[e^(sX)]
Or a bit more sophisticated and less arbitrary:
s <- 0.3
n <- 10000
i <- (1:n)/(n+1)            # equally spaced probabilities in (0,1)
x <- qnorm(i)               # the corresponding N(0,1) quantiles
answer <- mean(exp(s*x))    # deterministic (quasi-Monte Carlo) estimate
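Either estimate can be sanity-checked against the closed form e^(0.5·s^2) (a quick sketch using the deterministic quantile grid from above):

```r
s <- 0.3
n <- 10000
x <- qnorm((1:n)/(n + 1))        # deterministic N(0,1) quantile grid
estimate <- mean(exp(s * x))     # estimate of E[e^(sX)]
exact <- exp(0.5 * s^2)          # closed-form value, ~1.046
abs(estimate - exact)            # small discrepancy
```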
NB: The expected value can be very misleading.
Source: Stochastische Grundlagen der Finanzmathematik, Klaus Pötzelberger
Mahalanobis - am 2008-12-27 22:58 - Rubrik: mathstat
Ernest Chan (blog) writes in his book Quantitative Trading - How to Build Your Own Algorithmic Trading Business:
Here is a little puzzle that may stymie many a professional trader. Suppose a certain stock exhibits a true (geometric) random walk, by which I mean there is a 50-50 chance that the stock is going up 1 percent or down 1 percent every [day]. If you buy this stock, are you most likely--in the long run and ignoring financing costs--to make money, lose money, or be flat?
Most traders will blurt out the answer "Flat!," and that is wrong. The correct answer is that you will lose money, at the rate of 0.005 percent (or 0.5 basis points) every [day]! This is because for a geometric random walk, the average compounded rate of return is not the return μ, but is g = μ - σ^2/2.
For this very reason, geometric Brownian Motion is often written as

S_{t} = S_{0} · e^((μ - σ^2/2)t + σW_{t})
where "μ - σ^2/2" is the expected return and "μ" is the return of the expected prices, i.e. ln(E[S_{t}]/E[S_{t-1}]).
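The 0.5-basis-point figure is pure arithmetic and can be checked in one line (a quick sketch of the calculation, nothing more):

```r
u <- log(1.01)      # log return of an up day (+1%)
d <- log(0.99)      # log return of a down day (-1%)
g <- 0.5 * (u + d)  # expected compounded (log) growth rate per day
g * 100             # about -0.005 (percent per day)
```

This matches the formula above: with μ = 0 and σ = 0.01, g = μ - σ^2/2 = -0.00005 per day.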
If you lose 50 percent of your portfolio, you have to make 100 percent to get back to even... that's what everybody knows. But it's also interesting to see how even mild volatility hurts over time. Here are ten (random - no cherry-picking) realizations of a geometric Brownian Motion with a daily volatility of 1% (i.e. a yearly volatility of 16% assuming 252 trading days) over a period of 100 years:
R Development Core Team (2008). R: A language and environment for statistical computing.
PS: Hey, this looks promising:
Mahalanobis - am 2008-12-22 09:07 - Rubrik: mathstat
A while ago, we addressed the following question: A portfolio manager knows that his strategy can, on average, outperform the benchmark index by 3% annually. His portfolio has an annual volatility (standard deviation) of 25% against the index's 15%. Assuming that the correlation between the returns of the portfolio and the returns of the index is 0.9, how many years would it take to outperform the index with 90% probability?
The correct answer is a whopping 300 years! (apply the Itô-Döblin formula)
Today somebody asked me if I could run a couple of simulations to get a better understanding of the result. What I did was plot 20 simulations of log(Portfolio/Index) for varying correlations. For ρ = 0.9, 2 out of 20 (10%) are--as expected--below zero:
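For reference, the 300-year figure can be reproduced with a back-of-the-envelope calculation. This is a sketch under the assumption that the 3% outperformance applies to arithmetic returns, so the drift of log(Portfolio/Index) is reduced by the Itô correction (σ_P^2 - σ_I^2)/2:

```r
alpha <- 0.03                  # annual outperformance (arithmetic return)
s_p <- 0.25; s_i <- 0.15       # portfolio and index volatility
rho <- 0.9                     # correlation between the two return series
drift <- alpha - (s_p^2 - s_i^2)/2          # drift of log(P/I): 0.01
te <- sqrt(s_p^2 + s_i^2 - 2*rho*s_p*s_i)   # tracking error: ~0.132
years <- (qnorm(0.9) * te / drift)^2        # ~287, i.e. roughly 300 years
```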
Mahalanobis - am 2007-05-15 21:50 - Rubrik: mathstat
While strolling through the library I figured out that the number of pages of my master's thesis is five standard deviations below the average number of pages my friends wrote to graduate from the Vienna University of Economics and Business Administration:
Author | Pages | Title |
Michael Sigmund | 139 | Anwendungsgebiete der Spieltheorie in den Sozialwissenschaften |
Robert Ferstl | 127 | Werkzeuge zur Analyse räumlicher Daten - eine Softwareimplementation in EViews und MATLAB |
Karin Doppelbauer | 122 | Analiz rʹinka mjasa kur v Rossii - Eine Analyse des russischen Geflügelfleischmarktes |
Markus Pock | 107 | Untersuchungen zu Wachstumseffekten der WWU mittels Zeitreihenanalyse |
Christian Kraxner | 105 | Using credit derivatives for managing corporate bond portfolios |
Christian Balbier | 104 | Föderale Strukturen in den neuen Mitgliedsstaaten der Europäischen Union |
Stefan Woytech | 103 | Harmonisierung von internem und externem Rechnungswesen auf Basis der IAS/IFRS |
Anton Burger | 92 | Reasons for the U.S. Growth Experience in the Nineties: Non Keynesian Effects, the Capital Market and Technology |
Michael Stastny | 31 | Economic Growth and Output Variability: An Empirical Analysis (pdf) |
The outlier status of my thesis is confirmed by conventional tests for outliers:
How cool is that?
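The five-standard-deviation claim is easy to reproduce from the table (a sketch; the battery of "conventional tests" is reduced here to a simple z-score against the friends' theses):

```r
friends <- c(139, 127, 122, 107, 105, 104, 103, 92)  # friends' page counts
mine <- 31                                           # my thesis
z <- (mean(friends) - mine) / sd(friends)            # standard deviations below the mean
z                                                    # slightly above 5
```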
Mahalanobis - am 2007-04-12 18:11 - Rubrik: mathstat
The difference between an event being almost sure and sure is the same as the subtle difference between something happening with probability 1 and happening always.
If an event is sure, then it will always happen. No other event (even events with probability 0) can possibly occur. If an event is almost sure, then there are other events that could happen, but they happen almost never, that is, with probability 0.
Cool Example: Throwing a dart
For example, imagine throwing a dart at a square, and imagine that this square is the only thing in the universe. There is physically nowhere else for the dart to land. Then, the event that "the dart hits the square" is a sure event. No other alternative is imaginable.
Next, consider the event that "the dart hits the diagonal of the square exactly". The probability that the dart lands on any subregion of the square is equal to the area of that subregion. But, since the area of the diagonal of the square is zero, the probability that the dart lands exactly on the diagonal is zero. So, the dart will almost surely not land on the diagonal, or indeed any other given line or point. Notice that even though there is zero probability that it will happen, it is still possible.
Source: Wikipedia
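A quick simulation makes the measure-zero point concrete (a sketch; with continuous uniform coordinates, an exact hit on the diagonal is possible but essentially never observed):

```r
set.seed(42)
n <- 1e5
x <- runif(n); y <- runif(n)    # n dart throws at the unit square
hits <- sum(x == y)             # exact hits on the diagonal x = y
hits                            # almost surely 0
```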
Mahalanobis - am 2007-04-02 22:37 - Rubrik: mathstat
There is an old conundrum in queueing theory that goes like this:
- A passenger arrives at a bus-stop at some arbitrary point in time
- Buses arrive according to a Poisson process
- The mean interval between the buses is 10 min.
How long, on average, does the passenger have to wait for the next bus?
Answer: 10 min. This is an example of length-biased sampling. The paradox arises because the passenger is more likely to arrive during a long interarrival interval than during a short one. Here is a neat non-technical explanation (taken from this book):
Given the interarrival interval, the arrival instant of the passenger is uniformly distributed within that interval, and the expected waiting time is one half of the total duration of the interval. The point is that selection by a random instant represents long intervals more frequently than short ones (with a weight proportional to the length of the interval).
Consider a long period of time t. The waiting time to the next bus arrival, W(τ), as a function of the arrival instant τ of the passenger, is a sawtooth curve: it drops to zero at each bus arrival and grows linearly in between, so over the i-th interval it sweeps out an area of X_{i}^2/2, where the X_{i} are the interarrival intervals. The mean waiting time, W_bar, is the average value of this sawtooth curve:

W_bar = (1/t) · ∫_{0}^{t} W(τ) dτ ≈ (1/t) · Σ_{i=1..n} X_{i}^2/2 = (n/t) · (1/n) · Σ_{i=1..n} X_{i}^2/2

Note that long interarrival intervals contribute much more than short ones to the average waiting time. As t grows, t/n -> X_bar, hence,

W_bar -> E[X^2] / (2·X_bar)

For the exponential distribution (as the X_{i} are distributed),

E[X^2] = 2·X_bar^2

Therefore,

E[X^2] / (2·X_bar) = X_bar

Altogether,

W_bar = X_bar = 10 min
Q.E.D.
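The length-biasing argument is also easy to verify numerically (a sketch, simulating Poisson bus arrivals with a mean headway of 10 min):

```r
set.seed(1)
n_gaps <- 1e6
gaps <- rexp(n_gaps, rate = 1/10)   # interarrival times, mean 10 min
# A passenger arriving at a random instant falls into a gap with
# probability proportional to the gap's length (length-biased sampling)
pick <- sample.int(n_gaps, 1e5, replace = TRUE, prob = gaps)
# Within the chosen gap the arrival instant is uniform, so the
# remaining wait is a uniform fraction of that gap
wait <- runif(1e5) * gaps[pick]
mean(wait)    # close to 10 minutes, not 5
```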
Sources:
Advanced Course in Operating Systems (University of Haifa), Lecture 1 & 2
Mahalanobis - am 2007-03-28 03:42 - Rubrik: mathstat
New Economist writes: "In a new post on the Statistical Modeling, Causal Inference, and Social Science blog, Aleks Jakulin at Columbia University points us to a great online tool, ZunZun. It lets you use 2 and 3 dimensional 'Function Finders' to 'help determine the best curve fit for your data'."
On William Greene's site I found a neat data set (Data Tables :: Table F6.1) for estimating a Cobb-Douglas production function. ZunZun comes up with the following suggestion:
Y = β_{1}( L^{0.5}K^{0.5}) + β_{2}(cos(L)K^{1.5})
Surface Plot:
Contour Plot:
The R^{2} reaches an unrealistic 0.968 (0.94 for the Cobb-Douglas specification). Textbook data... Here is a scatterplot of the log-transformed data (created with R):
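For comparison, the textbook Cobb-Douglas fit is a one-liner in R. This is a sketch on simulated data, since Greene's Table F6.1 is not reproduced here:

```r
set.seed(7)
n <- 100
L <- exp(rnorm(n)); K <- exp(rnorm(n))             # hypothetical input data
Y <- 2 * L^0.6 * K^0.4 * exp(rnorm(n, sd = 0.1))   # Cobb-Douglas plus noise
fit <- lm(log(Y) ~ log(L) + log(K))                # log-linearized regression
coef(fit)    # intercept near log(2), elasticities near 0.6 and 0.4
```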
Mahalanobis - am 2007-03-27 04:50 - Rubrik: mathstat
The Viewpoints 2000 Group
Mathematics Magazine, Vol. 74, No. 4. (Oct., 2001), p. 320.
related items:
Neat Proofs, Mahalanobis
Mahalanobis - am 2007-03-24 03:16 - Rubrik: mathstat
Taken from the bionet.info-theory FAQ: If someone says that information = uncertainty = entropy, then they are confused, or something was not stated that should have been. Those equalities lead to a contradiction, since entropy of a system increases as the system becomes more disordered. So information corresponds to disorder according to this confusion.
If you always take information to be a decrease in uncertainty at the receiver, you will get straightened out:

I = H_{before} - H_{after}

where H is the Shannon uncertainty:

H = -Σ_{i} p_{i} log2(p_{i})
and p_{i} is the probability of the ith symbol. If you don't understand this, read the short Information Theory Primer.
Imagine that we are in communication and that we have agreed on an alphabet. Before I send you a bunch of characters, you are uncertain (H_{before}) as to what I'm about to send. After you receive a character, your uncertainty goes down (to H_{after}). H_{after} is never zero because of noise in the communication system. Your decrease in uncertainty is the information (I) that you gain.
Since H_{before} and H_{after} are state functions, this makes I a function of state. It allows you to lose information (it's called forgetting).
Many of the statements in the early literature assumed a noiseless channel, so the uncertainty after receipt is zero (H_{after}=0). This leads to the SPECIAL CASE where I = H_{before}. But H_{before} is NOT "the uncertainty", it is the uncertainty of the receiver BEFORE RECEIVING THE MESSAGE.
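The definitions are easy to play with (a toy sketch with a hypothetical four-symbol alphabet; the assumption here is that noise leaves two equally likely candidates after receipt):

```r
H <- function(p) -sum(p * log2(p))    # Shannon uncertainty in bits
H_before <- H(rep(1/4, 4))            # four equally likely symbols: 2 bits
H_after  <- H(rep(1/2, 2))            # noise leaves two candidates: 1 bit
I <- H_before - H_after               # information gained: 1 bit
```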
Mahalanobis - am 2007-03-22 02:14 - Rubrik: mathstat
MIT news office: An international team of 18 mathematicians has mapped one of the largest and most complicated structures in mathematics. If written out on paper, the calculation describing this structure, known as E_{8}, would cover an area the size of Manhattan.
The work is important because it could lead to new discoveries in mathematics, physics and other fields. In addition, the innovative large-scale computing that was key to the work likely spells the future for how longstanding math problems will be solved in the 21st century.
related items:
American Institute of Mathematics: E_{8}
Lecture Slides: The Character Table for E8, or How We Wrote Down a 453,060 x 453,060 Matrix and Found Happiness, David Vogan, MIT
Mahalanobis - am 2007-03-20 02:35 - Rubrik: mathstat