can we stop posting a new thread every time hollinger stats gets updated?
First I was curious about why it's tested "5000 times"
Odds are just that - odds. Testing them doesn't change the odds.
A fair coin flip is 50%. If I flip it 5000 times, the outcome of those 5000 flips doesn't change what the odds were.
Quote: it makes a random adjustment up or down to allow for the possibility that a team will play better or worse
Well, that explains why it would need to be tested - but now I question why one would need to make 'random' adjustments.
At this point aren't we just admitting our initial 'odds' (i.e. power ranking) aren't much better than a pure guess, so let's add a bunch of stuff to it 5000 times, average the results, and now call it a fair prediction?
I get that the equation itself is unbiased, but adding 'random' variables can just as easily create a bias that otherwise didn't exist.
The statistics which the simulations are based on factor in this year's performance by the team. Meaning, if dwade sat out 10 games this year, we don't have a sample of Miami if they played at full health for 35+ games.
I guess I'm making this more complicated. Basically, if Miami was healthy the whole year, they would be better, so our percentage would go down because the simulation would have us playing a healthy Miami 1000 times instead of one who sits bosh and wade all the time.
I didn't mean to say that it simulates injuries
The "odds" that he's using are based on those 5000 simulations. I.e. if he runs his simulation 5000 times and the raptors win the championship 500 times, the "odds" of them winning are 500/5000 or 10%.
The algorithm that simulates the season bases the strength of teams off his power rankings. So taking each team's power ranking number into account, it simulates the season more efficiently - it's how he measures performance, etc.
I agree though, that there are certain things missing from his algorithm and that the raptors chance of winning the championship should be (and in reality is) less than the heat's but the "odds" he's using are based on his simulations.
Sure, the odds of getting heads is 50% when you flip a coin, but if you flip a coin 5000 times, you might get heads 4500 times. I think he's just using odds because it's better than saying "percentage of times my algorithm predicted these events would happen"
A key that opens many locks is a master key, but a lock that gets open by many keys is just a shitty lock
Well, there's something called LLN (Law of Large Numbers) that shows that even if it's theoretically possible to get heads 4500 times, it does not happen. You could get 8 heads out of 10 flips but never ever would you get 4000 heads out of 5000 flips.
http://en.wikipedia.org/wiki/Law_of_large_numbers
That's not what the law of large numbers is saying.
It's saying the larger the # of samples, the more likely the result will trend towards the expected odds. As such the chances of getting 4500 heads out of 5000 are very slim, BUT if one does get that (possible, but not very probable), then as the sample size increases (to 10k or 50k or 100k) the results will normalize closer to 50%, until infinity when it will be 50%.
A simple but maybe poor explanation is: the odds matter, not the result of the test. If the test doesn't give us the expected odds, it's because the test just isn't big enough yet.
Which of course is a bit of a paradox.
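For anyone who wants to check the coin-flip numbers rather than argue them, here is a rough Python sketch (my own illustration, nothing to do with Hollinger's code). `log10_prob_at_least` and `heads_fraction` are made-up helper names. The first computes the exact chance of getting at least that many heads with a fair coin, reported as a base-10 log because the 4000-of-5000 case is too small for a normal float. The second just flips a simulated coin and reports the share of heads, which should drift toward 0.5 as the number of flips grows.

```python
import math
import random


def log10_prob_at_least(heads, flips):
    """log10 of P(at least `heads` heads in `flips` fair coin flips).

    Counts the favourable outcomes exactly with big integers, then
    divides by 2**flips in log space to avoid float underflow.
    """
    favourable = sum(math.comb(flips, k) for k in range(heads, flips + 1))
    return math.log10(favourable) - flips * math.log10(2)


def heads_fraction(flips, rng):
    """Flip a simulated fair coin `flips` times, return the share of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips


if __name__ == "__main__":
    rng = random.Random(2014)  # fixed seed so reruns give the same flips
    for n in (10, 100, 5000, 100000):
        print(n, "flips -> share of heads:", heads_fraction(n, rng))

    print("log10 P(>= 8 of 10):", log10_prob_at_least(8, 10))
    print("log10 P(>= 4000 of 5000):", log10_prob_at_least(4000, 5000))
```

8 or more heads in 10 flips comes out around 5%, while 4000 or more in 5000 is possible in principle but has a probability hundreds of orders of magnitude smaller than that.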
Quote: The "odds" that he's using are based on those 5000 simulations. I.e. if he runs his simulation 5000 times and the raptors win the championship 500 times, the "odds" of them winning are 500/5000 or 10%
I get that, but that doesn't make sense under probabilities. The odds are X, the expected result should also be X; if it's not X there is a problem with our odds to start with OR with large # theory (see above).
Of course you throw additional variables into that (i.e. a team plays better or worse in the future) that will change the expected odds, but if we don't know what those variables are or will be, then we are just guessing at them and their impact. And of course random is random, we shouldn't plan for/expect random. If we know what they are or can reasonably expect them to be, why not just put them in the initial equation?
Last edited by Craiger; Thu Jan 16th, 2014 at 07:18 PM.
It's like I'm taking advanced stats all over again... I'd rather not
Disagree. The expected result sure, but not necessarily the actual result.
For argument's sake, definitions of Odds:
1. A certain number of points given beforehand to a weaker side in a contest to equalize the chances of all participants.
2. The ratio of the probability of an event's occurring to the probability of its not occurring.
3. The likelihood of the occurrence of one thing rather than the occurrence of another thing, as in a contest.
and Definition of Probability:
1. The quality or condition of being probable; likelihood.
2. A probable situation, condition, or event.
3. The likelihood that a given event will occur.
Probabilities don't say something WILL happen. They say something SHOULD, or COULD happen.
"I have self-doubt. I have insecurity. I have fear of failure. We all have self-doubt. You don't deny it, but you also don't capitulate to it. You embrace it. You rise above it." -Kobe Bryant
What you are missing here, is he doesn't start with odds for each team to make the playoffs, win games, etc. He starts with a simple net rating projection. Then looks at each of the 1230 games determining the odds of one team beating the other for each game based on those net ratings. Then he runs a random number generator to determine if the better team wins. Most often it does. But there will be a great many games where the underdog wins. There appear to also be other random factors thrown in, but on the whole they should average out so long as they are applied equally to all the teams at some point.
Anyway, he applies that strategy to each of the 1230 games, getting a win-loss record for each team. He does the same for the playoff matchups, and runs the lottery, but that's all an extension of the same thing. He then does all of that again. It yields different results due to the random decision of which team wins each game (not truly random, probabilistic). He does this 5000 times (he being the computer program I guess). Then at the end you have the best case, worst case, and average record for each team, the number of times they made the playoffs, the number of times they won the championship, the lotto, etc. That's where his odds come from.
He has no "odds" to start with - only the odds a team with a particular net rating will beat a team with another net rating in one game. He has to run the simulation to be able to project that to the odds you are talking about, the odds a team makes the playoffs, etc.
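The process described above can be sketched in a few lines of Python. To be clear, this is a toy version under loud assumptions, not Hollinger's actual program: the four teams and their net ratings are invented, the logistic formula in `win_probability` (and its `scale` value) is just one plausible way to turn a rating gap into single-game odds, every pair plays the same number of games, and "best record" stands in for the playoffs and title. All function names are hypothetical.

```python
import random
from itertools import combinations

# Hypothetical net ratings, NOT real team numbers.
NET_RATINGS = {"A": 6.0, "B": 2.5, "C": 0.0, "D": -3.0}


def win_probability(rating_a, rating_b, scale=12.0):
    """Assumed logistic link: chance team A beats team B in one game."""
    return 1.0 / (1.0 + 10 ** (-(rating_a - rating_b) / scale))


def simulate_season(ratings, games_per_pair, rng):
    """Play every matchup once per scheduled game; return wins per team."""
    wins = dict.fromkeys(ratings, 0)
    for a, b in combinations(ratings, 2):
        p = win_probability(ratings[a], ratings[b])
        for _ in range(games_per_pair):
            # Random draw decides the game; the better team wins more often.
            winner = a if rng.random() < p else b
            wins[winner] += 1
    return wins


def run_simulations(ratings, runs=5000, games_per_pair=4, seed=0):
    """Repeat the season `runs` times and tally the outcomes."""
    rng = random.Random(seed)
    best_record = dict.fromkeys(ratings, 0)
    total_wins = dict.fromkeys(ratings, 0)
    for _ in range(runs):
        wins = simulate_season(ratings, games_per_pair, rng)
        for team, w in wins.items():
            total_wins[team] += w
        top = max(wins.values())
        leaders = [t for t, w in wins.items() if w == top]
        best_record[rng.choice(leaders)] += 1  # ties broken at random
    odds = {t: n / runs for t, n in best_record.items()}
    avg_wins = {t: w / runs for t, w in total_wins.items()}
    return odds, avg_wins


if __name__ == "__main__":
    odds, avg_wins = run_simulations(NET_RATINGS)
    for team in NET_RATINGS:
        print(team, "best-record odds:", odds[team], "avg wins:", avg_wins[team])
```

The only inputs are the ratings and the single-game formula. The season-level "odds" are just the share of simulated seasons in which something happened, which is why they only exist after the simulation has been run.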