OttoinGrotto
Ah, Monte Carlo. Yes, with one caveat: the low frequency of encountering zones injects some risk into predicting future performance against a zone using the entire body of data. The easy way to counter this: model using only the data against zones. I imagine you'll find enough data to model, and if not: run a Monte Carlo with - I think - some Gibbs sampling (I could be wrong here, it's been a while since I built a model myself). This should yield a 'cloud' of simulated data that dampens the influence of outliers and increases the effective sample size to better 'feed' the predicted range. The Gibbs sampling lets you test each variable / input for fit with the hypothesis you're testing - in essence, asking 'Is this data point valid for use in my against-the-zone D performance prediction?'
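For anyone who wants to try the Monte Carlo idea, here is a rough sketch of one way it could look, not a definitive build. It assumes a simple normal model for a per-game metric against zones and uses a Gibbs sampler to draw the mean and variance in turn from their conditional distributions, then simulates a 'cloud' of predicted games from those draws. Note that a Gibbs sampler of this kind samples model parameters rather than accepting or rejecting individual data points. The metric (points per possession), the numbers in `zone_ppp`, the priors (`mu0`, `tau0_sq`, `a0`, `b0`), and the function names are all made up for illustration.

```python
import random
import statistics


def gibbs_normal(y, n_iter=5000, burn_in=1000,
                 mu0=1.0, tau0_sq=0.25, a0=2.0, b0=0.1, seed=0):
    """Gibbs sampler for a normal model with unknown mean and variance.

    Priors (assumed for illustration): mu ~ Normal(mu0, tau0_sq),
    sigma_sq ~ Inverse-Gamma(a0, b0).
    Returns a list of (mu, sigma_sq) draws taken after burn-in.
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = statistics.fmean(y)
    mu, sigma_sq = ybar, statistics.variance(y)  # starting values
    draws = []
    for i in range(n_iter):
        # Step 1: draw mu from its conditional given sigma_sq and the data.
        precision = 1.0 / tau0_sq + n / sigma_sq
        cond_mean = (mu0 / tau0_sq + n * ybar / sigma_sq) / precision
        mu = rng.gauss(cond_mean, (1.0 / precision) ** 0.5)
        # Step 2: draw sigma_sq from its conditional given mu and the data.
        shape = a0 + n / 2.0
        rate = b0 + 0.5 * sum((v - mu) ** 2 for v in y)
        # gammavariate takes (shape, scale); inverting gives inverse-gamma.
        sigma_sq = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        if i >= burn_in:
            draws.append((mu, sigma_sq))
    return draws


def predictive_cloud(draws, seed=1):
    """Simulate one future game per posterior draw (the 'cloud')."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma_sq ** 0.5) for mu, sigma_sq in draws]


# Hypothetical points-per-possession values from games against zones only.
zone_ppp = [0.92, 1.05, 0.88, 1.10, 0.97, 1.01, 0.85, 1.08]

draws = gibbs_normal(zone_ppp)
cloud = sorted(predictive_cloud(draws))
lo = cloud[int(0.05 * len(cloud))]
hi = cloud[int(0.95 * len(cloud))]
print(f"90% predicted range against zones: {lo:.2f} to {hi:.2f}")
```

With only a handful of zone games, the priors do a lot of the work, so the predicted range is only as trustworthy as those assumptions.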
Alternatively, you could smooth the general dataset out using a zone-based normalizing algorithm.
I'm so glad I'm not doing that stuff anymore.