How Fresh is that Code?
Andrew Elliott
One of the beauties of the R programming language is the vitality of its user community. Users are continuously uploading newly developed or revised extension packages. Looking at the range of packages available on CRAN, the "Comprehensive R Archive Network", I was struck by how many of them had recent versions registered. So I decided to dig a little, and at the same time give you a little flavour of quick-and-dirty data exploration with R. Some highlights:
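Load the package list from CRAN (getRPackages is a helper whose definition is not shown here; judging by the code that follows, it returns a data frame with at least the columns name and dt):
packages <- getRPackages("http://cran.r-project.org/web/packages/available_packages_by_date.html")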
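Since the helper itself is not part of the post, here is a rough sketch of how such a loader might be written. It is not the author's code: the names tidyPackageTable and getRPackagesSketch are hypothetical, and the sketch assumes that the rvest package is installed and that the first table on the page has Date and Package columns.

```r
# Hypothetical stand-in for getRPackages(); the original helper is not shown.

# Turn a raw table with Date and Package columns into the name/dt layout
# that the rest of the walkthrough relies on.
tidyPackageTable <- function(raw) {
  data.frame(
    name = as.character(raw$Package),
    dt   = as.POSIXct(as.character(raw$Date), tz = "UTC"),
    stringsAsFactors = FALSE
  )
}

# Download the page and tidy its first table.
# Assumption: rvest is installed and the page contains a single package table.
getRPackagesSketch <- function(url) {
  page <- rvest::read_html(url)
  raw  <- rvest::html_table(page)[[1]]
  tidyPackageTable(raw)
}

# Usage (needs network access, so left commented out):
# packages <- getRPackagesSketch(
#   "http://cran.r-project.org/web/packages/available_packages_by_date.html")
```

Keeping the download separate from the tidying step means the tidying can be checked on a small hand-built data frame without touching the network.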
How many packages are in the archive?
dim(packages)[1]
## [1] 7422
Date of stalest package?
min(packages$dt)
## [1] "2005-10-29 UTC"
Date of freshest package?
max(packages$dt)
## [1] "2015-11-03 UTC"
Ooh, that's today! How many packages are fresh today?
nrow(packages[packages$dt==max(packages$dt),])
## [1] 5
And just for interest, which are they?
packages[packages$dt==max(packages$dt),c("name", "dt")]
## name dt
## 1 DLMtool 2015-11-03
## 2 epiDisplay 2015-11-03
## 3 MM2S 2015-11-03
## 4 quickmapr 2015-11-03
## 5 SALTSampler 2015-11-03
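OK, so let's compute the ages of the packages (in weeks), measured back from the freshest date. How many packages are no more than 4 weeks old?
library(lubridate)  # provides interval() and ddays()
today <- max(packages$dt)
# ddays() is the current lubridate name for the older edays()
packages$age <- interval(packages$dt, today) / ddays(7)
sum(packages$age <= 4)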
## [1] 587
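The same age calculation can be done in base R, which also makes the share of the archive explicit. A minimal sketch, assuming dt is a POSIXct vector like packages$dt; ageInWeeks and shareWithin are hypothetical helper names, and the two dates are simply the stalest and freshest dates reported above:

```r
# Age in weeks relative to the newest date, using base R only.
ageInWeeks <- function(dt) {
  today <- max(dt)
  as.numeric(difftime(today, dt, units = "weeks"))
}

# Fraction of entries whose age is at most `weeks` weeks.
shareWithin <- function(age, weeks = 4) {
  mean(age <= weeks)
}

# Small check using the stalest and freshest dates from the output above.
dt  <- as.POSIXct(c("2005-10-29", "2015-11-03"), tz = "UTC")
age <- ageInWeeks(dt)
shareWithin(age)

# The count above expressed as a percentage of the 7422 packages.
round(100 * 587 / 7422, 1)
```

On the full data frame, shareWithin(ageInWeeks(packages$dt)) would give the proportion directly instead of a count.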
Around 8% of the archive (587 of 7,422)! Let's look at the distribution by age; for convenience, convert weeks to approximate years:
ageInYears <- packages$age / 52
hist(ageInYears, breaks=20)
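A histogram is easier to read alongside a number. As a small sketch, the share of packages below a given age can be computed directly; shareYoungerThan is a hypothetical helper, and the example ages are made up for illustration and are not CRAN data:

```r
# Hypothetical helper: fraction of packages younger than `years` years.
shareYoungerThan <- function(ageInYears, years = 1) {
  mean(ageInYears < years)
}

# Made-up ages in years, for illustration only.
exampleAges <- c(0.1, 0.5, 2, 4)
shareYoungerThan(exampleAges)
```

Calling shareYoungerThan(ageInYears) on the real vector would give the exact proportion of packages under one year old.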
More than half the packages are fresher than 1 year old, and it's easy to see that the growth took off about 4 years ago after several years of slow burn. Let's zoom in on the current calendar year so far (roughly 44 weeks up to 3 November):
freshThisYear<-packages[packages$age<=44,]$age
hist(freshThisYear, breaks=44)
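One caveat: a single number passed as breaks is only a suggestion to hist, which may settle on a different number of bins. A sketch that forces exactly one bin per week and returns the counts rather than a plot; weeklyCounts is a hypothetical helper and the input values are illustrative, not CRAN data:

```r
# Count packages per week of age, with one bin per week from 0 to maxWeeks.
weeklyCounts <- function(ageWeeks, maxWeeks = 44) {
  recent <- ageWeeks[ageWeeks <= maxWeeks]
  # An explicit vector of break points is used exactly as given.
  hist(recent, breaks = 0:maxWeeks, plot = FALSE)$counts
}

# Made-up ages in weeks, for illustration only.
exampleWeeks <- c(0, 0.5, 1.5, 43.2)
weeklyCounts(exampleWeeks)
```

With the real data, weeklyCounts(packages$age) would return 44 weekly counts, the first covering the most recent week.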
I think it's clear that the take-up of R continues to accelerate, if the freshness of the user-contributed archive is any sort of guide.