Difference in Difference in R: A Complete & Easy Tutorial

Conducting a difference in difference analysis in R allows researchers to gain insight into the impact of a policy, or other outside factor, by taking two things into consideration. First, how a group mean changes before and after the policy or other outside factor was implemented; this group is considered the treatment group. Second, compare this change…
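As a rough illustration of the idea in this excerpt, the sketch below computes a difference in difference estimate in base R, first from the four group means and then as the interaction term of a linear model. The data frame `df`, its columns (`treated`, `post`, `y`), and the helper `cell_mean` are hypothetical names made up for this example and are not taken from the tutorial.

```r
# Hypothetical toy data (not from the tutorial): two groups observed
# before (post = 0) and after (post = 1) a policy takes effect.
df <- data.frame(
  treated = c(0, 0, 0, 0, 1, 1, 1, 1),
  post    = c(0, 0, 1, 1, 0, 0, 1, 1),
  y       = c(10, 12, 13, 15, 20, 22, 28, 30)
)

# Mean outcome for one group/period cell
cell_mean <- function(t, p) mean(df$y[df$treated == t & df$post == p])

# Change in the treatment group minus change in the control group
did_manual <- (cell_mean(1, 1) - cell_mean(1, 0)) -
  (cell_mean(0, 1) - cell_mean(0, 0))

# The same estimate as the interaction coefficient of a linear model
fit <- lm(y ~ treated * post, data = df)
did_lm <- unname(coef(fit)["treated:post"])

print(did_manual)  # 5
print(did_lm)
```

In this toy data the treatment group rises by 8 and the control group by 3, so both approaches attribute a change of 5 to the policy. Real data would also call for standard errors and a check of the parallel trends assumption, which this sketch leaves out.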

Is Data Science a Fad?

As a university professor intimately involved in the field, I am often asked, “Is data science a fad?” Let me be abundantly clear: data science is not a fad. In fact, I would argue that it is one of the hottest and most…

How To Clear The Environment in R: Keep RStudio from Running Slow

Knowing how to clear the environment in R is one of the easiest ways to fix an RStudio session that is running far too slowly. RStudio can slow to a crawl for many reasons, including a large amount of information being stored in memory, settings being set in…
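A minimal sketch of the underlying base R calls, assuming the goal is simply to drop every object from an environment and release the memory. To keep the example from wiping a live session, it works on a hypothetical throwaway environment named `scratch`; in an interactive session, the usual one-liner for the global environment is `rm(list = ls())`.

```r
# Hypothetical scratch environment standing in for the global environment
scratch <- new.env()
assign("big_vector", numeric(1e6), envir = scratch)
assign("label", "temporary", envir = scratch)

# List what is currently stored
print(ls(envir = scratch))

# Remove every listed object from that environment
rm(list = ls(envir = scratch), envir = scratch)
print(length(ls(envir = scratch)))  # 0

# Ask R to run garbage collection so the freed memory can be reclaimed
invisible(gc())
```

Note that `ls()` skips objects whose names start with a dot unless `all.names = TRUE` is passed, so hidden objects survive the plain version of this call.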

A Beginner’s Guide to NFL Analytics: Getting Started with nflfastR and RStudio

Thanks to the work of a handful of people (@mrcaseb, @benbbaldwin, @_TanHo, @LeeSharpeNFL, and @thomas_mock … to name a few), getting started with advanced analytics using NFL data is now easier than ever. Without getting too far into the weeds of the history behind all this, the above-mentioned people are responsible for the creation of…

Computing Player Performance Percentiles Using Scraped Data

There is no doubt about it: we are currently in the golden age of big data when it comes to the NFL, MLB, and many other leagues. In this case, the nflfastR project (which is the “child” of Ron Yurko’s nflscrapR) allows for fast and easy access to deeply detailed and rich statistics dating back…

A Better Way To Work With Zillow ZTRAX Data: A Guide To Wrangling the Data in R

For researchers and academics who have any interest in working with housing data, Zillow’s ZTRAX database is a must. The ZTRAX database, short for Zillow Transaction and Assessment Dataset, is unquestionably the largest real estate database that has ever been made available, free of charge, to qualified academic, nonprofit, and governmental researchers. Previously…