A question came up recently at work about how to pass a filter condition stored as a complete string to dplyr’s filter() function, for example dplyr::filter(my_data, "var1 == 'a'"). There does not seem to be much written about this and I was not sure how to do it either, but luckily jakeybob had a neat solution that seems to work well.
some_data %>% filter(eval(rlang::parse_expr(selection_statement)))
Let’s see it in action using the iris flowers dataset.
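As a minimal sketch of the approach above on iris: the condition string below is a made-up example, and the names selection_statement, filtered and filtered_alt are only illustrative. The second variant, which unquotes the parsed expression with !!, is an alternative that is not from the original solution.

```r
library(dplyr)

# Hypothetical filter condition, supplied as one complete string
selection_statement <- "Species == 'setosa' & Sepal.Length > 5.5"

# rlang::parse_expr() turns the string into an unevaluated expression;
# eval() then evaluates it inside filter(), where the columns of iris are visible
filtered <- iris %>%
  filter(eval(rlang::parse_expr(selection_statement)))

# Alternative (an assumption, not from the original solution):
# unquote the parsed expression directly with !!
filtered_alt <- iris %>%
  filter(!!rlang::parse_expr(selection_statement))

print(filtered)
```

Both variants parse and run whatever the string contains, so this is probably best kept for strings you control rather than untrusted input.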
This is just a small note (mainly for myself but hopefully of some use to a few others!) to remind me how to update a package on a drat repo.
Create the source file for the package you want to host on the drat repo using devtools::build().
Clone the drat repo hosting the package (in my case https://github.com/alan-y/drat).
Use drat::insertPackage("package-source.tar.gz", getwd()) to add the package to the drat repo (getwd() works for me if my working directory is at the top level of the drat repo).
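The build and insert steps above could be wrapped up roughly as follows. This is only a sketch: update_drat_repo() is a hypothetical helper (it is not part of drat or devtools), and pkg_dir and drat_dir are assumed paths to the package source and to the local clone of the drat repo.

```r
# Hypothetical helper, not part of drat or devtools.
# pkg_dir:  path to the package source directory (assumed)
# drat_dir: path to the top level of the cloned drat repo (assumed)
update_drat_repo <- function(pkg_dir, drat_dir) {
  # devtools::build() creates the source .tar.gz and returns its path
  tarball <- devtools::build(pkg_dir)

  # Add the built package to the local drat repo
  drat::insertPackage(tarball, repodir = drat_dir)

  # Return the tarball path invisibly for reference
  invisible(tarball)
}

# Example call (paths are placeholders):
# update_drat_repo("path/to/mypackage", "path/to/drat")
```

With its default arguments, insertPackage() only changes the local clone, so the sketch stops there.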
Downloading the data
I recently saw this great post on Nathan Yau’s FlowingData website which guesses a person’s name based on what the name starts with. You also need to select a gender and the decade you were born in before it can guess. Of course, it isn’t really a guess: it is just based on proportions calculated after restricting the data to what has been selected.
How resources are grouped in CKAN
Initialising ckanr and exploring groups of resources
Connect to CKAN with dplyr and download from one resource
Downloading all resources from a dataset
In previous blog posts (Hacking dbplyr for CKAN, Getting Open Data into R from CKAN) I have been exploring how to download data from the NHS Scotland open data platform into R. I’ve recently discovered that rOpenSci has a package called ckanr to help with just this, and I wish I’d known about it earlier as it is really pretty handy!
A Learning Exercise
I created my first ever R package and got it released onto CRAN in March 2019. It’s taken me a while to get round to actually writing about this, which tells me that despite many years of trying to overcome procrastination, I’m obviously still not there! The package is actually an RStudio addin called objectremover that helps you to quickly remove objects stored in memory (specifically objects saved in the Global environment) within an R session.
Create a dummy database
Test dbplyr’s SQL translation
Modify dbplyr’s SQL translation
Testing the dbplyr hack
At the end of my first post on CKAN discussing how to use the CKAN API to extract data from the NHS open data platform directly into R, I talked about how it would be neat to write some wrapper functions to make this process a little simpler.
Open Data in Scotland
Querying with Custom JSON
Querying with SQL
Conclusions and Further Ideas
I’ve got lots of rough pieces of R code that I wrote while exploring and testing various things in the past. A lot of this is currently stored in a pretty disorganised fashion so I thought it would be a good idea to start writing some of these up into blog posts – at the very least, this should make it easier for me to find things later!
I am very excited to hear that there are attempts to create a brand new R user group in Glasgow! I had just talked in Post Number One about my guilt at not having been able to attend EdinbR as often as I wished but it should be much easier for me to find time to attend a group based in Glasgow. If you are based in (or near) Glasgow and would like to join the R community, this sounds like the place to be – I hope this idea takes off!