**Statistics:**

I'm glad this thread has brought up the topic of Statistics. As I stated in response to fanfanclubclub above, I actually have a degree in Statistics (probably the only one on this board that does).

The use of Statistics by people can usually be divided into a few categories:

1) Statistics! Ugh, I hate statistics and anything having to do with math. I don't trust them. Most people use Statistics to LIE!!!!

2) I took a class in Statistics (or I use Statistics at my job). I know all about Statistics. I'm really good at using Statistics. (This category is often referred to as "knowing just enough to be dangerous").

3) I've studied a lot of Statistical and Mathematical theory, and know when statistics can (and should) be used, but I also understand the limitations of statistics (and mathematics), especially when they are not used correctly.

4) I have a freak mind for mathematics, obtained a PhD in Math and Statistics, and can do crazy probability equations in my head (I've only met a few of these guys, so they are few and far between).

Most people on this board fall squarely into category 1 or 2.

What Statistics really boils down to is the study of Probability, Variance and Bias. At the core of Statistics is typically the "probability distribution" of the data. Is it normal? Poisson? Is it discrete or continuous? If it doesn't have a defined distribution, should you use Non-Parametric statistics?
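For anyone who wants to poke at those questions with their own numbers, here is a rough sketch of a first look at the shape of a sample. It is not a formal test. The function names and the simulated data are my own placeholders, and it uses only the Python standard library. Skewness and excess kurtosis should both be near 0 for normal data, and the variance-to-mean ratio should be near 1 for Poisson counts.

```python
import random
import statistics

def skewness(xs):
    """Skewness (population form): near 0 for symmetric data."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

def excess_kurtosis(xs):
    """Kurtosis minus 3: near 0 for normal data, positive for heavy tails."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 4 for x in xs) / len(xs) - 3.0

def dispersion_index(counts):
    """Variance divided by mean: near 1 for Poisson-distributed counts."""
    return statistics.pvariance(counts) / statistics.fmean(counts)

# Simulated placeholder data, not real sports numbers.
rng = random.Random(0)
normal_like = [rng.gauss(10.0, 2.0) for _ in range(5000)]
skewed = [rng.expovariate(1.0) for _ in range(5000)]

print("normal-like:", round(skewness(normal_like), 2), round(excess_kurtosis(normal_like), 2))
print("skewed:     ", round(skewness(skewed), 2), round(excess_kurtosis(skewed), 2))
```

Numbers far from those benchmarks are a hint, not proof, that a normal (or Poisson) assumption deserves a closer look.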

Once you answer those questions, which Statistical Methods are you going to use to draw inferences?

It gets rather complicated from there on out.

Now I'm not sure what type of analysis fanfanclubclub was doing. He claims he uses "linear regression". There are several different types of regression methods, the simplest being linear. I actually like Multivariate Regression (regression with several predictors, more strictly called multiple regression), and have used it to build very accurate models. Of course, it depends on what you are trying to model (what is your response variable?).

What most people don't do when they are attempting to perform regression is look at the residual plots (the residuals plotted against the fitted values and against each variable). This is really one of the most important things to do when building a model. A pattern in the residuals will tell you if you need to transform your variables. I'm not sure if fanfanclubclub is doing that. On top of that, is his data "normal"? (Strictly speaking, it is the model's errors that need to be roughly normal for the usual inference to hold.) I'm not so sure that your typical sports statistics (like shooting efficiency) are normal.
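To make the residual point concrete, here is a minimal sketch using made-up data where the response grows with the square of the predictor. Everything in it (the helper names, the data, the choice of a square-root transform) is my own illustration, not anything from fanfanclubclub's analysis. It prints the residuals instead of plotting them, so it needs nothing beyond the standard library.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def residuals(xs, ys):
    """Observed minus fitted values from the straight-line fit."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Made-up data: the response is curved (quadratic) in x.
xs = [float(i) for i in range(1, 11)]
ys_curved = [x ** 2 for x in xs]

# Straight line on the raw response: residuals are U-shaped
# (positive at both ends, negative in the middle).
res_raw = residuals(xs, ys_curved)

# Same fit after a square-root transform of the response:
# the pattern disappears.
res_sqrt = residuals(xs, [math.sqrt(y) for y in ys_curved])

print([round(r, 2) for r in res_raw])
print([round(r, 2) for r in res_sqrt])
```

With real data the residuals will never be exactly zero after a transform. What you are looking for is the absence of a systematic shape like that U.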

If the data is not normal, Non-parametric methods should probably be used. But I haven't studied Non-parametrics too much.
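As one small example of what a non-parametric approach looks like, here is a sketch of Spearman's rank correlation, which works on the ranks of the data instead of the raw values and so does not lean on normality. This is just one common rank-based technique that I picked for illustration. The helper names and the toy numbers are mine.

```python
import math

def ranks(values):
    """Rank from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(xs, ys):
    """Ordinary (Pearson) correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Toy data: y always increases with x, but not in a straight line.
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]

print("pearson: ", round(pearson(xs, ys), 3))
print("spearman:", round(spearman(xs, ys), 3))
```

The rank version picks up the perfectly monotone relationship, while the ordinary correlation is pulled below 1 by the curvature.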

Does his model exhibit multicollinearity? I'm sure it does, given that variables within sports are typically tied together. You have to account for that.
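A quick way to check for that with two predictors is the variance inflation factor, which for the two-predictor case is just 1 / (1 - r^2), where r is the correlation between them. The sketch below uses invented numbers (minutes played, shot attempts, and an unrelated column) purely for illustration. A common rule of thumb treats a VIF above roughly 5 to 10 as a warning sign, but that cutoff is a convention, not a law.

```python
import math

def pearson(xs, ys):
    """Ordinary (Pearson) correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Invented numbers, for illustration only.
minutes = [10, 15, 20, 25, 30, 35]
attempts = [4, 6, 9, 10, 13, 14]   # rises almost in lockstep with minutes
unrelated = [3, 1, 2, 2, 1, 3]     # no linear relationship with minutes

vif_tied = vif_two_predictors(minutes, attempts)
vif_free = vif_two_predictors(minutes, unrelated)

print("minutes vs attempts: ", round(vif_tied, 1))
print("minutes vs unrelated:", round(vif_free, 1))
```

With more than two predictors you would regress each predictor on all the others and use that R^2 in place of r^2, which is where a proper stats package earns its keep.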

Also, Time is one of the most difficult things to model. Obviously Time is a variable that should be included in any model dealing with players who are assumed to grow and improve over the various seasons. Time Series models can become extremely difficult.

Bottom line: Statistics can be great and extremely useful. Just know what you are doing before you embark on your analysis.