


"Multivariate Adaptive Regression Splines (MARS) is an implementation of techniques popularized by Friedman (1991) for solving regression-type problems.

MARS is a nonparametric regression procedure that makes no assumption about the underlying functional relationship between the dependent and [explanatory] variables. Instead, MARS constructs this relation from a set of coefficients and basis functions that are entirely "driven" from the regression data. In a sense, the method is based on the "divide and conquer" strategy, which partitions the input space into regions, each with its own regression equation. This makes MARS particularly suitable for problems with higher input dimensions (i.e., with more than 2 variables), where the curse of dimensionality [see also blessing of dimensionality] would likely create problems for other techniques.

The MARSplines technique has become particularly popular in the area of data mining because it does not assume or impose any particular type or class of relationship (e.g., linear, logistic, etc.) between the predictor variables and the dependent (outcome) variable of interest. Instead, useful models (i.e., models that yield accurate predictions) can be derived even in situations where the relationship between the predictors and the dependent variables is non-monotone and difficult to approximate with parametric models." [Continue]
Note to self: Read. Hat tip to Diethelm Würtz.
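The quoted description does not spell out what the basis functions look like. In the usual MARS formulation they are hinge functions of the form max(0, x - t) and max(0, t - x), so the fitted model is piecewise linear with a separate slope on each side of a knot t. The sketch below only evaluates such a model for one predictor; the knot and coefficients are made-up illustrative values, not the output of any fitting routine.

```python
def hinge(x, knot, direction=1):
    """Hinge basis: max(0, x - knot) for direction=+1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


# Hypothetical "fitted" model with a single knot (illustrative numbers only).
KNOT = 2.0
INTERCEPT = 1.0
COEF_RIGHT = 0.5   # slope applied where x > KNOT
COEF_LEFT = -1.5   # weight on (KNOT - x) where x < KNOT


def mars_predict(x):
    """Evaluate intercept + sum of coefficient * hinge basis function."""
    return (INTERCEPT
            + COEF_RIGHT * hinge(x, KNOT, +1)
            + COEF_LEFT * hinge(x, KNOT, -1))


for x in (0.0, 2.0, 4.0):
    print(x, mars_predict(x))
```

Each side of the knot gets its own linear equation, which is the "divide and conquer" partitioning the quote refers to. A real MARS fit would also select the knots and prune terms from the data.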

Taylor Effect: It is by now well established in the financial econometrics literature that high-frequency time series of financial returns are often uncorrelated but not independent, because some non-linear transformations of the returns are positively autocorrelated. In 1986 Taylor observed that the empirical sample autocorrelations of absolute returns, |r|, are usually larger than those of squared returns, |r|^2. A similar phenomenon was observed by Ding et al. (1993), who examined daily returns of the S&P 500 index and concluded that, for this particular series, the autocorrelations of absolute returns raised to the power θ are maximized when θ is around 1; that is, the largest autocorrelations are found in the absolute returns themselves. Granger and Ding (1995) call this empirical property of financial returns the Taylor effect. Therefore, if r_t, t = 1, ..., T, is the series of returns and ρ_θ(k) denotes the sample autocorrelation of order k of |r_t|^θ, θ > 0, the Taylor effect can be defined as follows:

ρ_1(k) > ρ_θ(k) for any θ ≠ 1.

However, Granger and Ding (1994, 1996) analyze several series of daily exchange rates and individual stock prices, and conclude that the maximum autocorrelation is not always obtained when θ = 1 but for smaller values of θ. Nevertheless, they point out that the autocorrelations of absolute returns are always larger than the autocorrelations of squares. [1] This can also be observed when looking at USDCHF High Frequency FX rates (1996-04-01 00:00:00 to 2001-03-30 23:30:00; 62,496 observations):
[Figure: teffectPlot output for lags k = 1, ..., 10]
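For readers without Rmetrics at hand, the quantity behind the Taylor effect, the lag-k sample autocorrelation of |r_t|^θ, can be computed in a few lines. This is a minimal sketch: the USDCHF data are not reproduced here, so a simulated GARCH(1,1) series with made-up parameters stands in for them, and the function names are my own. Whether the simulated series shows the Taylor effect is not guaranteed.

```python
import random


def autocorr(x, k):
    """Sample autocorrelation of the sequence x at lag k."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return cov / var


def rho_theta(returns, theta, k):
    """Lag-k sample autocorrelation of |r_t|^theta."""
    return autocorr([abs(r) ** theta for r in returns], k)


def simulate_garch(n, omega=1e-5, alpha=0.1, beta=0.85, seed=1):
    """Stand-in data: GARCH(1,1) returns with illustrative parameters."""
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    out = []
    for _ in range(n):
        r = var ** 0.5 * rng.gauss(0.0, 1.0)
        out.append(r)
        var = omega + alpha * r * r + beta * var
    return out


returns = simulate_garch(5000)
for k in (1, 5, 10):
    row = [round(rho_theta(returns, theta, k), 3) for theta in (0.5, 1.0, 2.0)]
    print("lag", k, "theta = 0.5, 1, 2:", row)
```

Comparing the θ = 1 column with the others at each lag is the check that the definition above describes.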
Scaling Law: Some financial time series show self-similar behavior under temporal aggregation. The 'empirical scaling law' relates the average unconditional volatility, measured as the mean absolute return |r(t_i)| over a time interval of size Δt, to the size of that interval:

$$\overline{|r(t_i)|} = \left(\frac{\Delta t}{\Delta T}\right)^{1/E}$$

where the drift exponent 1/E is an estimated constant that Müller et al. (1990) find to be similar across different currencies and ΔT is a time constant that depends on the currency [2]. The Wiener process, a continuous Gaussian random walk, exhibits a scaling law with a drift exponent of 0.5 (slope of green line). The estimated drift exponent for the USDCHF series is 0.52, which is not statistically different from 0.5:
For more information see Fractals and Intrinsic Time - A Challenge to Econometricians.
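The drift exponent can be estimated as the slope of a regression of log mean absolute return on log interval size. The sketch below does this for a simulated Gaussian random walk, for which the slope should come out close to 0.5. The interval sizes, sample length and function names are assumptions chosen for illustration, not the Rmetrics implementation.

```python
import math
import random


def mean_abs_return(increments, dt):
    """Mean absolute return over non-overlapping blocks of dt increments."""
    blocks = len(increments) // dt
    total = 0.0
    for b in range(blocks):
        total += abs(sum(increments[b * dt:(b + 1) * dt]))
    return total / blocks


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


def drift_exponent(increments, intervals):
    """Estimate 1/E as the slope of log(mean |r|) against log(interval size)."""
    xs = [math.log(dt) for dt in intervals]
    ys = [math.log(mean_abs_return(increments, dt)) for dt in intervals]
    return ols_slope(xs, ys)


rng = random.Random(42)
increments = [rng.gauss(0.0, 1.0) for _ in range(2 ** 16)]  # random-walk steps
slope = drift_exponent(increments, [1, 2, 4, 8, 16, 32, 64])
print("estimated drift exponent:", round(slope, 3))
```

Running the same function on real return increments would give the empirical counterpart of the 0.52 reported above.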

Here is the official Rmetrics site.

[1]: see Stochastic Volatility Models and the Taylor Effect, Alberto Mora-Galán and Ana Pérez and Esther Ruiz
[2] see The Impact of News on Foreign Exchange Rates: Evidence from High Frequency Data, Dirk Eddelbuettel and Thomas H. McCurdy

by Rick Mabry
Mathematics Magazine, Vol. 72, No. 1. (Feb., 1999), p. 63.

Finance theory suggests that an asset with a higher perceived risk would pay a higher return on average [Caution]. For example, let r_t denote the ex post rate of return on some asset minus the return on a safe alternative asset. Suppose that r_t is decomposed into a component anticipated by investors at date t-1 (denoted μ_t) and a component that was unanticipated (denoted u_t):

$$r_t = \mu_t + u_t$$

Then the theory suggests that the mean return (μ_t) would be related to the variance of the return. The GARCH(1,1)-in-mean, or GARCH(1,1)-M, regression model is characterized by

$$r_t = \kappa + \gamma \sigma_t^2 + u_t, \qquad u_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \omega + \alpha u_{t-1}^2 + \beta \sigma_{t-1}^2$$

for ε_t i.i.d. with zero mean and unit variance. The effect that higher perceived variability of u_t has on the level of r_t is captured by the parameter γ (see Hamilton's TSA Bible).

A realization (n = 500) of a GARCH(1,1)-M process with κ = 0.005, γ = 2, ω = 0.0001, α = 0.1, β = 0.8, and ε ~ N(0,1) looks as follows:
Unfortunately, I got a GARCH-in-mean effect (γ) of -34 instead of +2 when I estimated the process given above (n = 500) in R. (R uses the Ox package, via the "garchOxFit" command, to estimate GARCH models; see here for "garchOxFit" installation instructions for Windows.) Estimating the GARCH(1,1)-M coefficient for n = 500, 1000, 2000, 3000, ..., 10000 yields the following result:
This would mean that you shouldn't even think of estimating such a model unless you have at least 4000 observations. So far I haven't seen any applied work that used more than 1000 observations. How cool is that? Suggestions welcome.
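A realization like the one described above can be simulated as follows. This is a sketch under an assumption: it takes the mean equation to be r_t = κ + γ σ_t^2 + u_t, with the conditional variance (not the standard deviation) entering the mean, as in Hamilton's presentation. The function name and the choice to start the recursion at the unconditional variance are mine.

```python
import random


def simulate_garch_m(n, kappa, gamma, omega, alpha, beta, seed=None):
    """Simulate n returns from an assumed GARCH(1,1)-M with variance in the mean."""
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # assumption: start at unconditional variance
    returns, variances = [], []
    for _ in range(n):
        u = var ** 0.5 * rng.gauss(0.0, 1.0)     # u_t = sigma_t * eps_t
        returns.append(kappa + gamma * var + u)  # r_t = kappa + gamma*sigma_t^2 + u_t
        variances.append(var)
        var = omega + alpha * u * u + beta * var  # next period's sigma^2
    return returns, variances


returns, variances = simulate_garch_m(
    500, kappa=0.005, gamma=2.0, omega=0.0001, alpha=0.1, beta=0.8, seed=0
)
print("mean return:", sum(returns) / len(returns))
print("mean conditional variance:", sum(variances) / len(variances))
```

Under this assumed form the unconditional variance is ω / (1 - α - β) = 0.001, so the in-mean term γ σ_t^2 is of order 0.002 while the shocks u_t have a standard deviation of about 0.03. That low signal-to-noise ratio may be part of why γ is so hard to pin down in short samples.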

timesonline: Cyclists who wear helmets are more likely to be knocked off their bicycles than those who do not, according to research.

Motorists give helmeted cyclists less leeway than bare-headed riders because they assume that they are more proficient. They give a wider berth to those they think do not look like “proper” cyclists, including women, than to kitted-out “lycra-clad warriors”.

Ian Walker, a traffic psychologist, was hit by a bus and a truck while recording 2,500 overtaking manoeuvres. On both occasions he was wearing a helmet.

During his research he measured the exact distance of passing traffic using a computer and sensor fitted to his bicycle. Half the time Dr Walker, of the University of Bath, was bare-headed. For the other half he wore a helmet and has the bruises to prove it.

He even wore a wig on some of his trips to see if drivers gave him more room if they thought he was a woman. They did.

He was unsure whether the protection of a helmet justified the higher risk of having a collision. “We know helmets are useful in low-speed falls, and so are definitely good for children.”

On average, drivers steered an extra 3.3 in away from those without helmets to those wearing the safety hats. Motorists were twice as likely to pass “very close” to the cyclist if he was wearing a helmet.

For an excellent discussion, see here. Here is a nice picture of a bike.

Greg Mankiw reports that in a new paper entitled What Has Mattered to Economics Since 1970, Kim, Morse, and Zingales identified the most cited articles published in economics journals since 1970. The winner is Hal White's A Heteroskedasticity-Consistent Covariance-Matrix Estimator and a Direct Test for Heteroskedasticity, with 4318 citations in the Social Science Citation Index. Here is a sloppy reminder¹:

Assume the data are generated by

$$y = X\beta + u, \qquad E(u) = 0, \qquad E(uu') = \Omega = \mathrm{diag}(\omega_1^2, \dots, \omega_n^2).$$

OLS estimation is unbiased and consistent, but it's inefficient. The correct covariance matrix for the OLS coefficient vector is

$$\mathrm{Var}(\hat\beta) = (X'X)^{-1} X'\Omega X (X'X)^{-1}.$$

There are now n + k unknown parameters: n unknown variances and k elements in the β vector. Without some additional assumptions, estimation from n sample points is clearly impossible. (If we knew Ω, we could use GLS.) The influential paper by White showed that it is in fact possible to obtain an estimator of the covariance matrix of OLS estimates that is asymptotically valid when there is heteroskedasticity of unknown form. The key to obtaining a heteroskedasticity-consistent covariance matrix estimator is to recognise that we do not have to estimate Ω consistently (which would be impossible). The asymptotic covariance matrix of a vector of estimates (scaled by √n), under heteroskedasticity, can be written as

$$\left(\operatorname{plim} \frac{X'X}{n}\right)^{-1} \left(\operatorname{plim} \frac{X'\Omega X}{n}\right) \left(\operatorname{plim} \frac{X'X}{n}\right)^{-1}.$$

The only tricky thing is to estimate the second factor. White showed that this second factor can be estimated consistently by

$$\frac{1}{n} X'\hat\Omega X,$$

where "Ω hat" may be any of several different inconsistent estimators of Ω. Unlike Ω, the second factor has only (k^2 + k)/2 distinct elements, whatever the sample size. That is why it is possible to estimate it consistently. A typical element of plim(X'ΩX/n) is

$$\operatorname{plim} \frac{1}{n} \sum_{i=1}^{n} \omega_i^2 X_{ij} X_{il}.$$

The White estimator replaces the unknown ω_i^2 by û_i^2, i.e. by the squared OLS residuals. This provides a consistent estimator of the variance matrix for the OLS coefficient vector and is particularly useful because it does not require any specific assumptions about the form of the heteroskedasticity.
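To make the sandwich form concrete, here is a minimal sketch of the basic White estimator (often labelled HC0, with no small-sample correction) for a regression of y on a constant and one regressor, so that every matrix is 2x2 and can be handled without a linear algebra library. The simulated data and all names are illustrative assumptions.

```python
import random


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def ols_white(x, y):
    """OLS of y on [1, x] and the HC0 covariance (X'X)^-1 X'diag(u^2)X (X'X)^-1."""
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    det = n * sxx - sx * sx
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]  # (X'X)^-1
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    b0 = bread[0][0] * sy + bread[0][1] * sxy
    b1 = bread[1][0] * sy + bread[1][1] * sxy
    u2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]  # squared residuals
    m01 = sum(w * xi for w, xi in zip(u2, x))
    meat = [[sum(u2), m01],
            [m01, sum(w * xi * xi for w, xi in zip(u2, x))]]  # X' diag(u^2) X
    cov = matmul2(matmul2(bread, meat), bread)
    return (b0, b1), cov


# Illustrative data: the error standard deviation grows with x.
rng = random.Random(7)
x = [rng.uniform(1.0, 5.0) for _ in range(200)]
y = [1.0 + 2.0 * xi + xi * rng.gauss(0.0, 1.0) for xi in x]
(b0, b1), cov = ols_white(x, y)
print("slope:", round(b1, 3), "White s.e.:", round(cov[1][1] ** 0.5, 3))
```

The factors of 1/n in the asymptotic expression cancel in finite samples, which is why the code works directly with X'X and X' diag(û²) X.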

¹ For more information, see Davidson and MacKinnon or Hamilton.

Glek Pelkabo writes (via email): About a year ago you had a post entitled "Little Correlation Between IQ and Happiness". I performed my own study using data from the Wisconsin Longitudinal Study and got a different result. I have posted something about this on my blog.

My take: By just looking at a sunflower plot of the data one wouldn't reject the hypothesis of zero correlation due to the symmetry (~ inverted triangle):

In this sample there's something happening around the mean IQ (blue line: lowess smoother), but I am not sure if it's worth writing a paper based on this observation...

NB: In your regression, t_slope^2 = F-statistic (with a single regressor, the squared t-statistic of the slope equals the F-statistic).
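The identity in the NB is easy to check numerically for a regression with one regressor and an intercept. The numbers below are toy values made up for the check; they are not the Wisconsin Longitudinal Study data.

```python
def slope_t_and_f(x, y):
    """Return the slope t-statistic and the regression F-statistic for y on [1, x]."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    fitted = [b0 + b1 * xi for xi in x]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
    ess = sum((fi - my) ** 2 for fi in fitted)              # explained sum of squares
    s2 = rss / (n - 2)                                      # residual variance
    t = b1 / (s2 / sxx) ** 0.5
    f = (ess / 1) / s2
    return t, f


t, f = slope_t_and_f([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1])
print("t^2 =", t * t, " F =", f)
```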

stolen from Jem N. Corcoran

related items (red-hot):
Efficient tests of stock return predictability, John Y. Campbell and Motohiro Yogo, Journal of Financial Economics, Volume 81, Issue 1, July 2006, Pages 27-60

Aleks writes: "The excellent Infosthetics blog, which usually focuses at the intersection between information and art, today linked to this Timeline of Trends and Events ranging from 18th century to now and includes projections of the future for a number of variables: political power, economic development, wars, ecology and environment. I have always been impressed by such integrative attempts, for example the synchronoptical summaries of world history, or Karl Hartig's charts.
Although you might find the display cluttered, the wealth of information is worth it. Of course, it would be nice to have an interactive system for zooming into such time series, aligning them, reordering them, superimposing different time series, and so on. It is just a question of time. Moreover, with the abundance of data, our ability to model many kinds of it, the future of history may lie in statistical-graphical summarization."

via Statistical Modeling, Causal Inference, and Social Science

Sometimes a picture says it all. ;-D