Saturday, April 15, 2017
2016-2017 Regular Season NBA Tracking Data
A few years ago the NBA installed special SportVU cameras in each arena that allow teams to track the movements of every player on the court. The new data points that these cameras provide offer insights into whether or not a player is getting enough touches, what types of shots a player tends to take, and how much energy is being used during the course of a game. I was able to pull some of the player tracking data for the 2016-2017 season and created a quick Tableau scatter plot to show how some of these data points may be correlated. I also highlighted the four main contenders for this season's MVP award (in my opinion, there is only one contender). Most of the data points listed in this dashboard should have little to no influence on whether a player should win the MVP award this season, but they do give a reference as to where each contender stands among his peers.
Tuesday, April 11, 2017
My Experience Hiking Through the Grand Canyon to Havasu Falls
Havasu Falls, taken with my GoPro
This post falls in the "and Other Things" category...
A couple of years ago, while on a mindless YouTube binge, I came across a video of a group of friends on a road trip out west. I had seen many of the places they visited before, but one place in particular stood out that I was not familiar with. I messaged the creator of the video to ask where they were and he told me it was a place called Havasu Falls, located in the Grand Canyon. I couldn't believe a place like that could exist in the middle of a desert. Immediately, I began looking into details for a possible trip. Luckily, my friends and I had a weekend trip to Las Vegas planned for the following year to celebrate the end of our 20s. That was enough for me to plan a two-day detour to visit the most remote town in the lower 48.
Somewhere near Kingman, AZ
The morning of our hike, we woke up at around 5am and got the car packed. We left right away so that we could begin our hike early in the morning before the heat set in. The drive along Indian Road 18 is not particularly difficult. Although there is no street view on Google Maps, there is no need to worry. The road is in good shape with a lot of straightaways. Use caution though. We began our drive at dawn, when there was plenty of wildlife activity along the road. Not only were we battling the sun at times, but we also had to brake suddenly for deer, goats, and free-range cows with no fencing separating their pasture from the street. We didn't experience any trouble getting there or back, but I imagine it would have been a long time before we got any help if we had had car trouble.
We arrived at Hualapai Hilltop a little after 6am and began our hike by 6:30am. The hike in is not terribly difficult in terms of endurance. We stopped a couple of times along the way to stretch, sit for a couple of minutes, and get the rocks out of our shoes. At no point did we feel like we wouldn't be able to make it. The terrain, however, is very rough. I was legitimately concerned about taking a wrong step and breaking an ankle. The majority of the hike is over very fine dirt that is riddled with rocks. Some of the rocks you can see very easily; others you notice only when you step and sink a little into the dirt. Speaking later with other campers, we learned that someone actually did break their ankle that day on the hike. This would be a nightmare, as you would either have to tough it out or wait for a caravan of horses to pick you up, which could potentially take hours. I recommend getting a decent pair of hiking shoes. My friend and I both wore standard Nikes thinking we would be OK and we regretted it. Nevertheless, we kept a strong and steady pace and made it in just under 3 hours. I've posted the data from my Fitbit for the day to give you a better idea of what you're looking at.
The closest thing to a full marathon that my Fitbit will ever see.
I broke the Fitbit data down into four sections. The first is the long hike into Supai. The second is our hike from the town of Supai to Havasu Falls, with a short stop at Lower Navajo Falls to take a dip in the water. The third is our hike from Havasu Falls to Mooney Falls, the climb down, the climb back up, and then our hike back to Havasu Falls. The fourth is our journey from Havasu Falls back to Havasupai Lodge.
If I knew then what I know now, there are a few things that I would have done differently. First of all, I would go for more than one day. We tried to squeeze this trip in real quick before getting back to Vegas with the rest of our crew. I wish we could have stayed two nights and left the third morning. I really wanted to make it back to Beaver Falls but we couldn't. We started hiking back there but started getting lost and decided to turn back rather than go any further into the unknown. If we had had another day, we could have taken our time rather than rushing from site to site trying to see it all in one day, especially after that long hike in. When we made our final hike back to the lodge at the end of the day I was completely spent. You will notice from my Fitbit steps that I was asleep by 8pm.
The second thing I would do differently is actually camp out, at least for a night. We decided to stay in the lodge since we figured we would be exhausted from a full day on our feet and we had a looming Vegas trip in store. Camping would also have meant carrying more supplies in during the hike, which we did not want to do. Unbeknownst to us, however, this lodge didn't operate like your typical hotel. We arrived in town around 9:30am and the lobby of the lodge didn't open until 1pm. This forced us to carry our bags with us throughout the entire day. Luckily, we packed very light.
The last thing I would have done differently is drink more water. Every blog or article you read about this hike warns you to drink plenty of fluids. I thought I did but I was wrong. I was good the entire day until the final hike back from Havasu Falls to the lodge. It was hot, we had been on our feet all day, and this time we were walking uphill. The dirt is particularly hard to walk in. With every step you sink in a little bit, as if you are walking on a sandy beach, and it really does a number on your legs. Drink fluids on the drive in, continue on the hike, refuel once you're in town, and grab some backups to have throughout the day. There is a water source between Havasu Falls and Mooney Falls to refill water bottles if you need to. Keep hydrating even if you think you don't need to.
All in all I'm very glad I got this trip under my belt. It's a place that not many people will ever see due to the difficult journey to get there. I'd love to do it again if I can before I get too old. Below I've posted a few more helpful pieces of information that I wish I would have known going in as well as a video I put together from our trip.
A few more tips...
- The diner is actually pretty decent. Once we checked in around 9:30am we went over and grabbed a hot meal. It's the only semblance of a restaurant in the tiny town. I had an egg and cheese bagel and it was exactly what I needed to refuel after the long hike.
- Bring some water shoes. I bought a pair on Amazon for $20 and ended up just leaving them at the lodge. It was worth it for the day.
- If you plan on taking the helicopter ride back to your car on the hilltop, be sure to ask around the night before to find out when and where you need to be. We were leaving on a Friday morning and we lined up for the helicopter around 6am. There were already 5 people ahead of us. Usually the helicopter doesn't start making trips until 10am but, due to an Indian celebration in the village that day, it began making rounds at 8am. You have to stand and wait near the trash dumpsters and there were flies all over the place. It was a pretty miserable wait but it was worth the $95 to not have to make the 3-hour hike uphill. Not to mention the pretty incredible view once we were in the air.
- This website did not exist when we booked but it looks like everything you will need to book can be found here.
Location:
Supai, AZ 86435, USA
Wednesday, April 5, 2017
Final NCAA Tournament Results
Well, as usual, this year's NCAA Tournament came and went in the blink of an eye. There were plenty of close games and viewership was at an all-time high, but for some reason the excitement just didn't seem to be there. I think we can probably blame this on a lack of first round upsets and buzzer beaters. South Carolina in the Final Four was probably the biggest surprise of the tournament but, if you've paid attention to the style that they play, it shouldn't really be that big of a surprise that they made a run. And to top it all off, the championship game was a dud.
As far as my machine learning experiment goes, I don't believe it was a complete failure. There were a couple of major upsets in the second round that I knew would be an issue and that turned out to be pretty detrimental to the final result. I've already found ways that I can improve for next year and look forward to working on them this offseason so that I can test throughout the 2017-2018 season. Below are the results showing how many teams were predicted correctly at the end of each round.
Final Results...65% accuracy
Round of 32: 30/32
Round of 16: 9/16
Round of 8: 2/8
Final Four: 0/4
Finals: 0/2
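As a rough check on the arithmetic, here is a small Python sketch that rolls up the per-round tallies above. The `rounds` dictionary simply copies the numbers from the list. The list stops at the Finals, so treating the champion pick as a missed 63rd slot is an assumption, and it is only one reading that lands on roughly 65%.

```python
# Correct picks and total slots per round, copied from the list above.
rounds = {
    "Round of 32": (30, 32),
    "Round of 16": (9, 16),
    "Round of 8": (2, 8),
    "Final Four": (0, 4),
    "Finals": (0, 2),
}

correct = sum(hit for hit, _ in rounds.values())
slots = sum(total for _, total in rounds.values())
listed_accuracy = correct / slots

# Assumption: the champion pick counts as a 63rd slot that is not listed
# above and was also missed.
with_champion = correct / (slots + 1)

print(f"{correct}/{slots} = {listed_accuracy:.1%}")
print(f"{correct}/{slots + 1} = {with_champion:.1%}")
```

Counting only the listed rounds gives 41 of 62, and adding the assumed champion slot gives 41 of 63.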
As B Rabbit said, it's back to the lab again.
Monday, March 20, 2017
Sweet 16 Update
So, let's take a look at how well my bracket performed after the opening weekend...
First Round (30-2)
Thursday: 16-0
Friday: 14-2
Obviously I was feeling pretty good after the opening round. I expected to do well but this still shocked me a little bit. There were plenty of close games but no real crazy upsets this year. Even the "upsets" that we did have were not truly upsets. Doing well in the first round was very important for having a chance in my pool but as it turns out, it didn't matter.
Second Round (9-7)
Saturday: 5-3
Sunday: 4-4
This is what I was worried about. Having St. Mary's, Cincinnati, and Michigan in the Final Four was a huge stretch and things obviously didn't go as planned. Thankfully, Michigan was able to hang on to do a little damage control. St. Mary's and Wichita State both had chances to win which would have helped a bit but all in all this round destroyed me. I would assume that most of America's 'East' region is a dumpster fire like mine with popular pick SMU(6) losing in the first round and Villanova(1) and Duke(2) losing in the second round.
All things considered, I do have the winning team advancing in 39 out of 48 games to this point (81%) which is better than expected. However, when the smoke cleared after last night's Cincinnati loss, only 1 of my Final Four teams was left standing.
Thursday, March 16, 2017
My 2017 NCAA Tournament Bracket
So, here it is...
Now, before I lose all credibility... Yes, I am a University of Cincinnati alum and season ticket holder. That being said, I am just as shocked as you are that the Bearcats are in my Final Four. There's no way I would have picked them to go that far on my own but, as I said in the previous post, I'm going 100% with the results of my model and these are the results of my model.
Overall, I don't hate it. Crazier things have happened.
What stands out…
- As of this post my first round picks match the Vegas money lines with the exception of two games (Xavier +120, Northwestern +115)
- Only 4 first round upsets
- 6 second round upsets
- 4 third round upsets
- No true Cinderella
- Only one #1 seed in the Final Four (Villanova)
- Three #1 seeds in the Elite Eight (sorry, UNC)
- #2 seeds had it rough with just one team (Duke) advancing to the Sweet Sixteen
- Saint Mary’s beats Gonzaga in the Elite Eight after losing to them three times already this season
- Michigan stays hot
- The #1 overall seed in the tournament wins the championship
- Gonzaga continues to be a post-season pretender
- SMU and Cincinnati in the Elite Eight would be big for the AAC
Now it's time to watch some basketball and hope my Bearcats prove me right.
Sunday, March 12, 2017
Using Machine Learning to Predict the NCAA Tournament
Every year around the middle of March, college basketball owns our country’s attention for the greatest event in all of sports, March Madness. Let’s be honest, the main reason March Madness garners so much attention beyond die-hard college basketball fans is the office bracket pool. People who otherwise wouldn’t have even known it was basketball season fill out their brackets based on team color, mascot, or favorite cities they’ve visited. Others take a more analytical approach. This year, in an effort to gain an advantage in my office pool, I decided to combine my interest in college basketball with what I do in my profession and use machine learning to fill out my bracket.
I hate when I hear somebody talking about how they predicted some crazy upset but fail to mention that they filled out seven different brackets. You didn’t predict anything, bud. My strategy has always been 1 bracket a year and I usually fill it out in about 10 minutes so that I don’t overthink anything. This season though, I’m going 100% with math and machine learning. No picking the hot teams, sentimental picks, or local favorites. Simply plugging in the numbers and seeing what comes out. This strategy has been used by statisticians for a while now, most notably by Ken Pomeroy, whose website KenPom.com has been ranking teams and predicting games as far back as 2002. For me, this is completely for learning purposes. If it were a foolproof strategy I'd be on my way to Vegas. After all, the odds of predicting a perfect bracket are somewhere in the ‘1 in 9.2 quintillion’ range. Famous investor Warren Buffett, who previously offered $1 million to anyone who could predict a perfect bracket, has even upped his offer to $1 million a year for life to anyone who could predict the Sweet Sixteen (Berkshire Hathaway employees only). If I can predict somewhere in the range of 70-75% of games correctly, I believe it will be a success. In testing throughout the season my model had ~72% accuracy.
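The ‘1 in 9.2 quintillion’ figure mentioned above can be reproduced with a couple of lines of Python. This is a back-of-the-envelope sketch that assumes a 64-team bracket, ignores the play-in games, and treats every game as a coin flip, which real games are not.

```python
# A 64-team single-elimination bracket has 63 games,
# and each game has two possible winners.
games = 64 - 1
possible_brackets = 2 ** games

print(f"{possible_brackets:,}")  # 9,223,372,036,854,775,808
```

Picking one bracket at random out of that many is where the quoted odds come from; anyone with some basketball knowledge should beat a coin flip.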
The two technologies that I used for this project were R and Microsoft Azure Machine Learning. For those who don’t know, R is an open-source programming language that focuses on statistics. Azure Machine Learning is a cloud-based solution that uses algorithms to learn from the data and make predictions based on what it learns. In this experiment I used R to pull the data from the web, uploaded that data to Azure ML, and ran my machine learning experiment to predict the outcome of NCAA Tournament games. Going into this project I had virtually no hands-on experience with either of these tools. This was simply a way for me to learn some new technologies and have a little fun.
Now, let me tell you how I did it...
With any machine learning project, the data collection and cleansing stage will take up the bulk of your time. I wanted to automate the process of data collection so that I could run my R code each morning and have the latest game logs for every Division I game of the season in one unified file. For this experiment I pulled all of the data from www.sports-reference.com. I ultimately wanted to predict whether a team would win or lose a game, and this site provides all of the data that I needed to make that prediction. The data is also in HTML table format and easily extracted using the ‘XML’ library in R. This is the only source of data that I used for this experiment. I could have blended in other data sources but for this first run I chose to keep it simple.
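To give a feel for what pulling data out of an HTML table involves, here is a minimal sketch in Python rather than R, using only the standard library. It is not the code used for this project: the `TableRows` class, the sample markup, and the column names are all made up for illustration, and a real run would download the page first.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical sample markup standing in for a downloaded game log page.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Opponent</th><th>Result</th></tr>
  <tr><td>2017-01-05</td><td>Team A</td><td>W</td></tr>
  <tr><td>2017-01-08</td><td>Team B</td><td>L</td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE_HTML)

# First row is the header; the rest are games keyed by column name.
header, *games = parser.rows
game_logs = [dict(zip(header, game)) for game in games]
print(game_logs)
```

Each game ends up as a dictionary keyed by column name, which is a convenient shape for writing out one unified file.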
Once I was able to pull all of the data and get it into a format that I could work with in Azure ML, my next task was to choose which data points to use in my machine learning experiment. This task seems pretty straightforward but there is a lot more that goes into it than you may think. As I said earlier, I wanted to be able to predict whether a team would win or lose a game. In Azure ML this is known as a two-class classification since our outcome is binary (W or L). When looking at the dataset I needed to look for data points that would help me most accurately predict whether a team would win or lose. This is where some basketball knowledge helps.
If I handed you a box score at the end of a game without telling you which team won the game, you could probably determine the outcome for yourself by looking at a few key statistics. Perhaps “Which team made more shots?” or “Which team got more rebounds?” etc. The problem with this is that these are raw statistics that can vary greatly from game to game. A team may score 75 points in two separate games but it may take them 68 shots to reach that mark in one game and 52 shots in the next. For this reason it is better to use statistics that have been normalized per possession or per opportunity rather than your standard box score statistic. This way we can determine how efficient a team was in a particular facet of the game.
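The 75-point example above works out like this in Python. Points per shot is used here only because it follows directly from the numbers in the example; it is an illustrative stand-in, not necessarily one of the data points that went into the model, and a true per-possession figure would also need to account for turnovers, free throws, and offensive rebounds.

```python
def points_per_shot(points, shot_attempts):
    """Return points scored per field goal attempt."""
    return points / shot_attempts


# Same raw point total, very different efficiency.
game_one = points_per_shot(75, 68)
game_two = points_per_shot(75, 52)

print(f"{game_one:.2f} vs {game_two:.2f}")  # 1.10 vs 1.44
```

The raw box score says 75 points both times, while the normalized number shows the second game was far more efficient.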
For now, I’m not going to share the data points that I used. Maybe in the future.
After selecting my data points I was finally ready to run my experiment in Azure ML. As I said earlier, I used a two-class classifier to predict whether a team would win or lose. There are many algorithms to choose from when working with two-class classification. I tested a few but ultimately chose to use the Two-Class Neural Network. Artificial neural networks are designed to work much like the neural network in the human brain. The data points are passed in as input and progress through a hidden layer of interconnected nodes, where they are weighted and combined, and the result is passed through as output. Once the model is trained, it is scored and can be evaluated. This score shows how well the model predicted the outcome of past games based on the training data that we fed it. It also gives us a score between 0 and 1 for each game to show how confident the model is in the predicted outcome.
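The input, hidden layer, and output flow described above can be sketched in a few lines of plain Python. This is a toy forward pass, not Azure ML's implementation: the `forward` function, the three input values, and every weight are made up for illustration, and in a real experiment the weights would be learned during training.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One pass through a network with a single hidden layer."""
    # Each hidden node takes a weighted sum of all inputs.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output node takes a weighted sum of the hidden nodes.
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Made-up inputs and weights, for illustration only.
features = [0.6, -0.2, 0.1]
hidden_weights = [[0.5, -1.0, 0.3], [-0.4, 0.8, 0.2]]
hidden_biases = [0.0, 0.1]
output_weights = [1.2, -0.7]
output_bias = 0.05

win_probability = forward(
    features, hidden_weights, hidden_biases, output_weights, output_bias
)
predicted = "W" if win_probability >= 0.5 else "L"
print(predicted, round(win_probability, 3))
```

The final sigmoid is what keeps the output between 0 and 1, and the 0.5 cutoff used to turn it into a W or L is just a common default.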
This has been a trial-and-error process over the past couple of months. It took me a while to define the data points that gave me the most accurate predictions. The good thing is that with only a couple of minor changes I’ll be able to use this code again next season to refine the process and hopefully predict games more accurately. As for the final product, I’ll be posting that on Thursday after the first game tips. Maybe it will help me win my office pool, maybe I’ll lose to Sue in accounting. Either way, I’ve learned a lot that will help me in my job and I already have a head start for next year’s tournament.