Continuing from my previous post, I now focus on detailed match statistics rather than the readily available aggregate data. By scraping detailed data from each match of the 2018/2019 Norwegian hockey season, I aim to present aggregate statistics that are not available on the source webpage. The data are scraped from Hockey live.
The code
I started by simply downloading the main HTML file manually from the web browser.
I wanted to visualize the individual statistics of the Stavanger Oilers hockey players for the 2018/2019 season.
The data are scraped from both Elite Prospects and Hockey live (regular season and playoffs), using the R package rvest, as described in this blog post.
The code
Scraping the data from Elite Prospects was straightforward, as it is stored as an HTML table. To scrape a table with rvest, you only need to specify an integer index that picks out the table you want.
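As a minimal sketch of selecting a table by index with rvest: the snippet below parses a small inline HTML string, collects all `<table>` elements, and converts the second one to a data frame. The HTML, the player names, and the index 2 are made up for illustration and are not taken from Elite Prospects; on the real page you would have to check which index holds the statistics table. It assumes rvest 1.0 or later (older versions use `html_nodes()` in place of `html_elements()`).

```r
library(rvest)

# Hypothetical stand-in for a downloaded page containing two tables
html <- '
<html><body>
  <table>
    <tr><th>Team</th><th>Points</th></tr>
    <tr><td>Team A</td><td>10</td></tr>
  </table>
  <table>
    <tr><th>Player</th><th>Goals</th></tr>
    <tr><td>Player One</td><td>12</td></tr>
    <tr><td>Player Two</td><td>7</td></tr>
  </table>
</body></html>'

page <- read_html(html)

# All tables on the page, in document order
tables <- html_elements(page, "table")

# The integer index selects which table to parse (here: the second one)
stats <- html_table(tables[[2]])

print(stats)
```

For a live page, `read_html()` would be given the URL or the path of a saved HTML file instead of the inline string.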
This post concerns my MS_VAR GitHub repository, which contains the code used in the following paper:
Osmundsen, Kjartan Kloster, Tore Selland Kleppe, and Atle Oglend. “MCMC for Markov-switching models—Gibbs sampling vs. marginalized likelihood.” Communications in Statistics-Simulation and Computation (2019): 1-22.
A Markov-switching vector autoregressive (MS-VAR) model is an autoregressive mixture model governed by a (hidden) finite state Markov chain. In the mentioned paper, the MS-VAR model is expressed as: