Are independent possessions a fundamental flaw in data-ball?

  • Are independent possessions a fundamental flaw in data-ball?

    I was thinking of this the other day. The vast, vast majority of data analysis and advanced metrics depend on the assumption that each possession is a unique and independent event.

    I would argue that this is not the case. Basketball, like all sports, comes down to one group of people competing against another. These people make an endless series of split-second decisions while playing. Each decision is influenced by a multitude of tiny factors, and among the more important ones are previous decisions and their consequences.

    To bring this back to basketball: I think that a possession is defined by more than the number of points that it produces.

    If a pick and pop is run one play that results in a missed mid range jumper, but the next play, the same pick-setter rolls to the basket, is left open and scores, what effect did that pick and pop have on producing the 2 points in the later possession?



    I'm going to veer off course a little here, and discuss tennis. As an individual sport with many distinct 'units' (rallies, games, sets), there should be a huge amount of data mining going on. But this isn't the case, and I don't think it's simply because no one's thought of trying.

    I think it's because not only does each rally affect subsequent rallies, but effective strategies are largely dependent on the opponent on the other side of the court. Given this, the strategies that are most effective in a given time period likely depend on the 'meta'-sport. If 7 of the top 10 players have weaker forehands, shots to that side will be analytically superior. But does this define the sport? Is it a law of tennis that it is necessarily advantageous to hit to the forehand side? Or do these things change over time independently of the rules?
    Last edited by stooley; Fri Mar 7, 2014, 12:44 PM.
    "Bruno?
    Heh, if he is in the D-league still in a few years I will be surprised.
    He's terrible."

    -Superjudge, 7/23

    Hope you're wrong.

  • #2
    To add to the tennis analogy, I'd say the fact that 3 point attempts and shots at the rim generally produce more points is similar to the widely prevalent fact in tennis that the backhand is a weaker side. Yet good players will still hit plenty of shots to the forehand side.

    Can this analogy be stretched to the mid range jumper as a necessary part of basketball, which enables those rim shots and 3 point shots?


    • #3
      Most advanced metrics fail to capture the dozens and dozens of small nuances that occur during any given sport, especially in the game of basketball where the action is so fluid and there's so much interaction between the 15-16 players that get in a game on average.

      It's kind of why I want to bang my head against a wall when I see people using analytics as their be-all, end-all argument for proving a certain point. They're definitely useful in certain contexts, but for the most part, they should be used as part of a bigger formula incorporating a lot of other data as well.



      • #4
        Following this line of thinking to its logical end, there's only one valid stat:

        The final score.

        Which is true

        Short of that though, you have to decide on some arbitrary limitations on things to begin parsing that final score up.
        "Stop eating your sushi."
        "I do actually have a pair of Uggs."
        "I've had three cups of green tea tonight. I'm wired. I'm absolutely wired."
        - Jack Armstrong



        • #5
          JimiCliff wrote:
          Following this line of thinking to its logical end, there's only one valid stat:

          The final score.

          Which is true

          Short of that though, you have to decide on some arbitrary limitations on things to begin parsing that final score up.
          Even final score leaves room for interpretation, based on the strength of the opponent, whether a back-to-back is involved, injuries, etc.



          • #6
            Okay, but the same level of complexity is true in chess (even moreso). However, you can still evaluate positions and determine the "best move" in the position. In making that analysis, you can take into account an enormous amount of data. Sometimes, the nature of your opponent and other external factors will influence the position but it doesn't mean you can't objectively evaluate a position and a move.

            I'm reminded also of the "it depends what the judge ate for breakfast" joke among lawyers. You can only control what you can control. Using the data to make the best decisions based on all the knowledge and data you have. You can't discount that preparation and decision-making based on the fact that some guy had a fight with his wife three weeks ago....



            • #7
              slaw wrote:
              Okay, but the same level of complexity is true in chess (even moreso). However, you can still evaluate positions and determine the "best move" in the position. In making that analysis, you can take into account an enormous amount of data. Sometimes, the nature of your opponent and other external factors will influence the position but it doesn't mean you can't objectively evaluate a position and a move.

              I'm reminded also of the "it depends what the judge ate for breakfast" joke among lawyers. You can only control what you can control. Using the data to make the best decisions based on all the knowledge and data you have. You can't discount that preparation and decision-making based on the fact that some guy had a fight with his wife three weeks ago....
              But I'm saying that the data may actually be misleading. Chess is actually far, far less complex than basketball. I know that sounds crazy, but there are far fewer possible moves, and objectively good positions are calculated assuming an optimal move in response.

              Data-driven analysis in basketball often challenges long-held beliefs in the basketball community; the same cannot be said for chess.

              These long held beliefs were based on reality and were relevant for actions that could be controlled. So are these findings misleading? How closely should they be followed?

              For example, efficiency has certainly been shown to be a critical element of basketball, which was a nice revelation. But Houston's D-League team is shooting exclusively 3s and shots in the paint, to an extreme. Is this a reasonable strategy given the information provided on efficiency? Or does it disregard other, equally important aspects of basketball?
              Last edited by stooley; Fri Mar 7, 2014, 02:12 PM.


              • #8
                slaw wrote:
                Okay, but the same level of complexity is true in chess (even moreso). However, you can still evaluate positions and determine the "best move" in the position. In making that analysis, you can take into account an enormous amount of data. Sometimes, the nature of your opponent and other external factors will influence the position but it doesn't mean you can't objectively evaluate a position and a move.

                I'm reminded also of the "it depends what the judge ate for breakfast" joke among lawyers. You can only control what you can control. Using the data to make the best decisions based on all the knowledge and data you have. You can't discount that preparation and decision-making based on the fact that some guy had a fight with his wife three weeks ago....
                But the other team is looking at the same data and preparing against your "best decisions". It's easy to fall back on your "best decisions", statistically speaking, but that doesn't always mean that the best statistical decisions are the most advantageous within any given game and/or possession.

                The way I see it, you can use statistics as part of your game-planning, but you need to allow the players the freedom to execute within the flow of the game.

                For example, if the stats say that one of the team's most efficient shots is Player-A rolling behind a screen set by Player-B, to execute a catch-and-shoot play off a pass received from Player-B, then it should be exploited. However, if Player-B is off the court in foul trouble, is that play still as efficient? Or if the other team knows the efficiency of that play and plans their game to combat it (e.g., cheating on switches, hedging on picks, etc.), do you continue to run it as your #1 play? Stats alone would say yes, but within the flow of that particular game (or quarter, or due to a certain lineup matchup), your head and gut (i.e., experience) might tell you otherwise. A good coach and seasoned players will be able to adjust in-game, even if it runs counter to what the stats say in a vacuum.
                Last edited by CalgaryRapsFan; Fri Mar 7, 2014, 02:38 PM.



                • #9
                  To add: especially in basketball, we're looking at very small differences in percentages. 1.15 PPS vs. 1.22 PPS.

                  If that 1.15 PPS play leads to 0.10 PPS more on a future possession, then it is the better play.


                  • #10
                    stooley wrote:
                    To add: especially in basketball, we're looking at very small differences in percentages. 1.15 PPS vs. 1.22 PPS.

                    If that 1.15 PPS play leads to 0.10 PPS more on a future possession, then it is the better play.
                    Well, kind of. But not when comparing a drive or a 3 pointer to a mid-range shot (which is the primary efficiency discussion amongst stats guys offensively lately). For mid-range shots in general you're looking at about 0.8 PPS. The other shots are more like 1.1 PPS. You need a lot of trickle-down efficiency later to make up for that difference. Especially when dealing with players like DeMar who take 10+ mid-range (8-23 feet) jumpers a game. That's more than 10% of your possessions used where you (in theory) sacrifice 0.3 PPS, or 3 points overall. You then need a .033 PPS increase (or a 3.3% increase in FG%) for every other possession in the game to break even. Just off DD's midrange jumpers.

                    Now, I'm not personally of the opinion that a team should only shoot 3's or layups, or that that is even possible with the whole, you know, other team in the way, but let's not underestimate the inefficiency of a midrange shot. And this is far too simplistic an approach, but the reality remains that IF you view a midrange shot (for example) now as a delayed reward shot, then the delayed reward does need to be substantial to make up for the present loss.
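
The break-even arithmetic above can be sketched in a few lines. This is only an illustration: the 100-possession game length is an assumption (the post only implies "a bit under 100"), and the function name is made up for the example.

```python
def break_even_spillover(midrange_attempts, pps_gap, possessions):
    """Extra points per possession needed on every *other* possession
    to offset the points given up on the mid-range attempts."""
    points_lost = midrange_attempts * pps_gap
    other_possessions = possessions - midrange_attempts
    return points_lost / other_possessions

# Inputs from the post: 10 mid-range attempts, 0.3 PPS gap (1.1 vs. 0.8).
# Assumption (not from the post): a 100-possession game.
needed = break_even_spillover(midrange_attempts=10, pps_gap=0.3, possessions=100)
print(f"Break-even boost: {needed:.3f} PPS on each of the other possessions")
```

Changing `possessions` or `pps_gap` shows how sensitive the "delayed reward" requirement is to those inputs.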
                    twitter.com/dhackett1565



                    • #11
                      Take a look at the shot chart for the Rio Grande Valley Vipers (Houston's D-League affiliate):



                      Apparently it's working for them....


                      • #12
                        Let's take the example you used to start off.

                        Question: does a missed, inefficient mid-range shot have additional value not captured in the miss, because it causes the defence to shift and so makes a later-in-the-game 3-point shot more likely to go in?

                        Answer: I don't think so. It would only make a difference if the defence shifts after seeing a mid-range shot miss. Why would they? Why does a miss cause them to change up their defence to defend against a strategy that is a proven failure? They might shift if they see that mid-range shot going in reliably. But then the fact that shot is going in will be reflected in the stats. So against a well-coached NBA team, I don't think missed shots create value.


                        The Vipers diagram is great ... and the color coding is incredibly misleading. I saw it before and didn't notice. This time I really looked at it and saw the difference.

                        The top of the key, where they shot 14/33 or 42% is in green. Around the basket, where they shot 825/1485 or 55% is in yellow. So visually, shooting 55% looks like it is worse than shooting 42%.

                        What's stunning about the diagram, is that:
                        - they took ~1500 or 47% of their shots at the rim, shooting 55%
                        - they took ~1500 shots or 47% of their shots from 3 point range, shooting ~52% (35% from 3 = 52% from 2 all else being equal)
                        - they took ~200 shots or 6% of their shots from the rest of the court combined, shooting ~35%

                        I wish the color coding reflected this: showing the success of their shots vs. all shots taken, not vs. all shots other teams took from the same distance.
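
To make the "35% from 3 = 52% from 2" comparison concrete, here is a rough sketch that puts the three zones on a common points-per-shot scale. The attempt counts are the approximate figures quoted above (only the rim numbers are exact), free throws are ignored, and the zone labels and function names are just illustrative.

```python
# Approximate shot data taken from the figures quoted in the post.
zones = {
    "rim":       {"attempts": 1485, "fg_pct": 825 / 1485, "points": 2},
    "three":     {"attempts": 1500, "fg_pct": 0.35,       "points": 3},
    "mid_range": {"attempts": 200,  "fg_pct": 0.35,       "points": 2},
}

def points_per_shot(zone):
    """Expected points from one attempt in this zone (no free throws)."""
    return zone["fg_pct"] * zone["points"]

def two_point_equivalent(zone):
    """FG% a 2-point shot would need to match this zone's points per shot."""
    return points_per_shot(zone) / 2

total_attempts = sum(z["attempts"] for z in zones.values())
for name, z in zones.items():
    share = z["attempts"] / total_attempts
    print(f"{name:10s} share={share:.0%}  PPS={points_per_shot(z):.2f}  "
          f"2pt-equivalent FG%={two_point_equivalent(z):.1%}")
```

Coloring a chart by the last column, instead of by comparison with league averages from the same distance, would give the kind of picture described above.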



                        • #13
                          Stooley, looks like someone's working on answering your question:

                          http://espn.go.com/video/clip?id=esp...tartTime=02:14

                          And with respect to your thoughts on possessions, I've been thinking the exact same thing about FGs for a long time. It's insane and stupid that all shots are treated equally. The idea of a field goal should just be destroyed.


                          • #14
                            I do actually try to keep my posts short... honest I do!!!!

                            With SportVU these types of questions can be easily answered (as long as someone with access to the data takes the time to look at it, which in reality makes it not that easy at all).

                            All you have to do is track the players on every possession (which SportVU does). Then you just see if any patterns emerge. Just because a player makes or misses a basket DOESN'T necessarily mean a team will change how it defends that possession on the very next play. I'd guess that there are teams that make small adjustments from play to play, and teams that (relatively speaking) don't. I'd also guess that there are players that make small adjustments from play to play and those who are less likely to do so. I'd also guess that there are players who don't make adjustments but are effective defenders, and players who do make small adjustments virtually every play and are also effective. I'd also guess there are players who make adjustments when they shouldn't, and players who don't, but should. But I don't have any evidence for that opinion...

                            Think back to the infamous Zach Lowe article last year on the Raps' in-house analytics. They had a program that computed the "optimum" position of every defensive player at any given point, based on the positions of the offensive players and the ball. The theory they'd be working under is that you shouldn't adjust: if the guy makes the basket, good for him; guard him the same way and let him take the same shot all night if he wants.

                            The thing to remember though is that analytics is predictive modeling. The models will prescribe a certain set of actions as being optimum. However, with more data, more teams, and more players you can get more specific. Playing an "optimum" defense against league-wide numbers certainly won't be optimum against each and every team. Even more specifically, playing the "optimum" defense against a particular team won't necessarily be optimum in reality, as those numbers would be skewed towards minutes/usage-heavy players (predominantly starters); there might be a different "optimum" defense for each possible 5-player unit. You can even do the same with the players on defense. The optimum defense with JV playing center could very well be different from the one with Hayes in (based on their different wingspans and speeds; I don't know if the Raptors' model was player-specific or even position-specific). To come full circle, eventually you could even break the numbers down to the point where patterns emerge suggesting you should do something different based on what happened on the previous play, and produce a theory that says how you should adjust depending on the offensive player, the defensive player, and what happened on the last play (or the last play where this particular set was run).

                            The effectiveness of the emerging analytics is predicated on a number of things: first, the time and resources to mine the VAST amounts of data; then the ability to discern patterns in it and make predictions about how the game should be played; and last but most importantly, the ability to communicate that to coaches and players, and the coaches' and players' ability to conform to the model. Then the model can be tested, and if the results differ from the predicted results (and there should probably be sizable variance), the model can be revised, if not outright disproved. Eventually you should have success; however, as more teams adjust to what analytics says is the optimum, that optimum will change and shift, and new optimums will emerge.

                            I'm sure this post seems largely nonsensical, but to boil it down: the best analytics can ever do (but will probably never achieve) is to prove mathematically how a game should have been played, or what the "best" decision would have been given certain constraints. It will never be able to prove what teams should do in the future. Analytics will never be able to "solve" basketball. Just like the act of observation changes an experiment, the act of playing optimum basketball according to a model will change that model's optimum.

                            That said, there's a wealth of data just waiting to be mined, and looking at possessions in isolation isn't flawed; it's just that it's been the only way we've been able to break data down to examine it, in order to find patterns and form hypotheses. It's not a flaw, it's an opportunity! A team that takes a holistic approach to their analysis (again, largely impossible before SportVU) may come up with some revolutionary stuff! Stuff in this vein of thought that would be interesting to know...

                            Do players run back on defense faster, if they've touched the basketball during the offensive possession? Do they get progressively slower the more possessions occur without them touching it?

                            How frequent and varied are the adjustments that players make from possession to possession? Are players who make adjustments more effective than players who don't?

                            I'm kinda in love with SportVU and I can't WAIT for it to be ten years in the future where a bunch of amateurs have the money to access this type of data, because on top of all the great theory that will come out of it, there will just be really cool anecdotal stuff. Like who is the fastest jumper in the league. When you isolate for height and wingspan, who dribbles closest to the floor? Wouldn't it be cool to know that!?!!?!?

                            So anyway, to try to tie this randomness all back together... YES, looking at plays in isolation, and making decisions about how you are going to play in the future based on isolated play analysis, may not lead to the most optimal outcome (although at this point I'm not convinced you'd find sufficient causal evidence that teams DO and/or SHOULD make changes in their scheme from play to play). Looking at plays holistically, determining to what extent previous plays impacted future plays and decision making, is something that is only starting to be possible. If I were you I would send your concerns to Kirk Goldsberry, Zach Lowe, or anyone else in sports who is interested in analytics, because they may just find the answer for you! Well, start to answer it for you anyway.

                            Another ridiculously long post brought to you by ezz_bee.
                            "They're going to have to rename the whole conference after us: Toronto Raptors 2014-2015 Northern Conference Champions" ~ ezzbee Dec. 2014

                            "I guess I got a little carried away there" ~ ezzbee Apr. 2015

                            "We only have one rule on this team. What is that rule? E.L.E. That's right's, E.L.E, and what does E.L.E. stand for? EVERYBODY LOVE EVERYBODY. Right there up on the wall, because this isn't just a basketball team, this is a lifestyle. ~ Jackie Moon



                            • #15
                              Very thoughtful post, ezz_bee.

                              I think we're going to find an interesting amount of the data we get focuses on the mental side of basketball (your 'does a player run back faster when they've touched the ball on the offensive possession').

                              In straight analytic terms, it should make no difference. In human psychology terms, it may make a big difference. (And then we start asking if the human side can be coached out of you.)

                              Returning to the example that started the thread, I should give a more nuanced answer:
                              - there is no sports reason why a missed mid-range shot should make a later shot more successful
                              - there may be a human psychology reason behind taking certain inefficient shots to 'set up' later efficient shots

                              It will be interesting to analyze the data to find out.


                              P.S. There is also the human tendency to find patterns and assign meaning where none exists. It may take a lot of data to give statistically meaningful results, when we want to believe otherwise.

                              In their last 3 games, the Raptors shot 14/24, 10/21 and 8/22 from 3-point range. That's 58%, 48% and 36% respectively. Is there a meaning behind those results? Or is it just the pure variability of doing an activity that works on average 40% of the time?

                              Put otherwise: if Amir Johnson sinks a 3 (or misses a 3), should you guard him tighter, or should you just let him shoot those all day? What would you advise an opposing coach?
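
One way to get a feel for the "pure variability" question is to ask how often a 40% shooter would post lines like those three games by chance alone. The sketch below uses an exact binomial calculation and assumes every attempt is independent with the same 40% make rate, which is of course the very assumption this thread is questioning; the function names are just for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k makes in n independent attempts."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least(k, n, p):
    """Probability of k or more makes in n attempts."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def prob_at_most(k, n, p):
    """Probability of k or fewer makes in n attempts."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

P = 0.40  # assumed true make rate, identical and independent on every attempt

print(f"P(at least 14 of 24) = {prob_at_least(14, 24, P):.3f}")
print(f"P(at least 10 of 21) = {prob_at_least(10, 21, P):.3f}")
print(f"P(at most   8 of 22) = {prob_at_most(8, 22, P):.3f}")
```

If the probabilities come out far from tiny, three games like these are not by themselves evidence of anything beyond ordinary shooting variance.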
