July Meetup: Thermal images and object-oriented systems in R
We've got two great talks lined up for our last meeting before the summer break.
Rebecca Senior is a third-year PhD student in the Department of Animal and Plant Sciences at the University of Sheffield, and an avid lover of using R to do just about everything! She studies interactions between land-use change and climate change in the tropics. She’s going to talk about her work analysing thermal image data collected in the forests of Borneo.
Chris Hopkinson is a Sheffield R group regular, analyst and data lover. He’ll be giving us the low-down on object-oriented systems in R.
Thermal images in R
Rebecca will introduce the Thermimage package and demonstrate its use in extracting and processing thermal images in R. She'll talk about the challenges posed by the structure of thermal data and the kinds of questions that ecologists can ask with such data, such as the extent to which animals can move locally to track their preferred temperatures. A single thermal image produces 19,200 distinct temperature measurements, so optimising the extraction -> processing -> plotting -> analysis workflow is essential.
Object-oriented systems in R
Object-oriented programming is commonplace in many languages and is a useful way to organise code. R has several object-oriented systems of its own, including S3, S4 and Reference Classes. Understanding these will help you understand R better and write better software.
June Meetup: Literate programming and Writing a decision tree package for R
This month Anna Krystalli will present an exploration of literate programming, following up on a discussion of literate programming techniques at our meetup in April, and Pete Dodd will tell us about building an open-source decision tree package in R.
Literate programming with Rmarkdown
In this session we'll explore the various options and strategies for literate programming and dynamic report generation in R and Rmarkdown. This session will make use of the literate programming packages knitr and rmarkdown.
Writing a decision tree package for R
Decision trees are a class of model widely used in health economic modelling. A tree (in the mathematical sense) has probabilities associated with branches leaving a node to describe the chances of a set of potential outcomes. In health economics, costs and health consequences are also associated with outcomes. By way of example, clinical diagnostic algorithms based on a sequence of symptom screens and diagnostic tests map particularly naturally into this framework. There are proprietary packages to develop, visualize and analyse such models, but nothing open-source and user-friendly. We describe an attempt to develop a package with a simple syntax for decision tree models in R based on the DOT language, which allows visualization, as well as considerable flexibility in analysis approach.
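To make the idea concrete, here is a minimal sketch of how such a tree can be "rolled back" to an expected cost, and how it can be written out in DOT for visualisation. It is written in Python purely for illustration and is not the package described in the talk; the `Node` class, the function names and all the numbers in the example are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node: a cost incurred on arrival plus probabilistic branches."""
    name: str
    cost: float = 0.0
    # Each branch is a (probability, child Node) pair; leaves have none.
    branches: list = field(default_factory=list)


def expected_cost(node):
    """Expected total cost from this node downwards (recursive roll-back)."""
    if not node.branches:
        return node.cost
    total_p = sum(p for p, _ in node.branches)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError(f"branch probabilities at {node.name!r} sum to {total_p}")
    return node.cost + sum(p * expected_cost(child) for p, child in node.branches)


def to_dot(root):
    """Render the tree as a DOT digraph; node names are assumed unique."""
    lines = ["digraph tree {"]
    stack = [root]
    while stack:
        node = stack.pop()
        for p, child in node.branches:
            lines.append(f'  "{node.name}" -> "{child.name}" [label="{p}"];')
            stack.append(child)
    lines.append("}")
    return "\n".join(lines)


# Hypothetical diagnostic algorithm: a symptom screen followed by a test.
treat = Node("treat", cost=200.0)
no_treat = Node("no treatment")
discharge = Node("discharge")
test = Node("test", cost=50.0, branches=[(0.5, treat), (0.5, no_treat)])
screen = Node("screen", cost=5.0, branches=[(0.2, test), (0.8, discharge)])

result = expected_cost(screen)  # 5 + 0.2 * (50 + 0.5 * 200)
dot_text = to_dot(screen)
```

Health consequences (for example, quality-adjusted life years) could be handled the same way by adding a second quantity alongside `cost`.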
This month we'll be learning how to make R go faster using a couple of techniques.
Théo Michelot will speak about Using C++ to speed up R code
R is a very handy tool for data analysis, but it can be computationally quite slow. When analysing large data sets, or using complex models, it can be desirable to speed things up. For this reason, it has become increasingly common to combine R code with C++ code. C++ is a compiled programming language, which makes it much faster than R in some cases. I will first talk about Rcpp, an R package which makes it very simple to call C++ functions from an R script. Then, I will talk about the Template Model Builder (TMB), an R package which uses automatic differentiation in C++ to speed up the numerical evaluation of a function (typically, a likelihood function), thus making techniques such as maximum likelihood estimation very fast. I will show that these do not require a strong programming background, and that it is quite easy for someone familiar with R to use Rcpp and TMB. I will compare the three approaches (plain R, Rcpp and TMB) with examples of code.
We're waiting for confirmation for our second speaker so watch this space!
April Sheffield R Meetup: Using R Notebooks and Ensembles in Machine Learning
This month's talks cover ensembles in machine learning and using R notebooks as a learning tool.
"Wisdom of the crowd": Ensembles in machine learning, and the factors that influence them
Pieter Wessels will present a brief discussion of ensembles, covering popular algorithms for creating same-engine ensembles (bagging, boosting, etc.), multiple-engine ensembles, and the factors that influence ensembles, including diversity between members. He will also give a brief introduction to one measure of diversity.
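As a rough illustration of why diversity between members matters, the sketch below combines three imperfect classifiers by majority vote and computes a simple pairwise disagreement rate. It is written in Python for illustration only; the predictions are made-up data, and the disagreement rate used here is just one possible diversity measure, not necessarily the one covered in the talk.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Combine per-model prediction lists (all the same length) by majority vote."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


def disagreement(a, b):
    """Fraction of examples on which two models give different predictions."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_disagreement(predictions):
    """Average disagreement over all pairs of ensemble members."""
    pairs = list(combinations(predictions, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


def accuracy(predicted, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Made-up binary predictions from three hypothetical models on five examples.
truth = [1, 0, 1, 1, 0]
model_a = [1, 0, 1, 1, 1]
model_b = [1, 1, 1, 0, 0]
model_c = [0, 0, 1, 1, 0]
members = [model_a, model_b, model_c]

ensemble = majority_vote(members)
diversity = mean_pairwise_disagreement(members)
member_accuracies = [accuracy(m, truth) for m in members]
ensemble_accuracy = accuracy(ensemble, truth)
```

In this toy data each member makes its mistakes on different examples, so the vote corrects all of them; with an even number of members or more than two classes, ties would need an explicit rule.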
Teaching with R Notebooks
Duncan Gillespie will speak about the pros and cons of using R notebooks as a learning tool in an introductory R course.
March Sheffield R Meetup: R coding surgery and social
This month we’re inviting Sheffield R users to come together to discuss individual projects and problems at our first R coding surgery and social. Come along for a chance to meet, greet, learn from and collaborate with fellow R users. There will be space to present problems, ideas and projects, and to get help, advice and feedback.
We’ll start with a round of introductions, allowing people briefly to introduce themselves and to propose a topic for discussion. This could be:
• an R project you’ve been working on
• a data science problem that you want to solve
• something you want to learn about.
After going around the room, we’ll have a more detailed discussion covering the topics raised and anything else that comes up along the way. There will be the opportunity for individuals to share code they've been working on, whether to get help, ask for feedback or simply present it. Make sure to have this ready before the meeting.
We’re hoping for a varied and lively discussion so please come along with any topic you’d like to discuss, whether really small or really big!
Be prepared!
• Come with a topic you want to discuss or with something you’ve done that you want to tell everyone about, or just come to listen and contribute as much or as little as you want.
• Bring your computer (if you have one) in case you want to work on something. We'll have a laptop/projector available for presentations.
• Feel free to contact us via meetup or twitter if you want to propose a topic or theme for discussion ahead of the meeting.
February Sheffield R Meetup: Survival analysis and non-standard evaluation in R
In this month's meeting we'll explore survival analysis, a means of analysing data in which the outcome is the time to the occurrence of an event of interest. We'll also be finding out about one of the more esoteric features of the R language, namely non-standard evaluation.
Survival analysis with coxme
Martin Garlovsky, a PhD student studying sexual selection and speciation, will talk about using Cox proportional hazards models with random effects to perform survival analysis. Survival analysis can be used for a wide range of "time to event" data, e.g. deaths, divorce, maturation, relapse, disease, etc., and is applicable in a wide range of biological, medical and social research. The R packages survival (which provides the coxph function) and coxme provide methods for survival analysis.
Non-Standard Evaluation, or why R is a little unusual
Analyst and data lover, Christopher Hopkinson, will be delving into the bowels of the R language to try to understand how R evaluates our code and when this may lead to unexpected results.
If this topic is of interest, you might like to have a read of Hadley Wickham's Advanced R book.
December Sheffield R Meetup: Wrangling Spatial Data and Juggling Gaming Data
This month we'll be learning about R's capabilities for wrangling and handling spatial data and how Sky Betting and Gaming does data science.
Dan Olner from the Sheffield Methods Institute will be sharing his experience from some recent work on spatial analysis: Doing spatial stuff with R: the magic and the messiness (with an example applying it to windfarm/house-price interaction)
R has grown into an amazing tool for wrangling and analysing spatial data. It can do some awesome things with borderline magical ease. I'll cover some of the introductory essentials of doing spatial stuff before looking at a project using R (and other tools including QGIS) that analysed the impact of wind turbines on house prices in Scotland. (Spoiler: couldn't find anything significant; see link below.) This project's story is more about the real-world messiness of getting R spatial analysis to do what you want it to.
We will also be joined by James Waterhouse (Head of Data Science) and Darrell Taylor (Principal Data Engineer) from Sky Betting and Gaming who will talk about Data Juggling at SkyBet: A brief look inside the data science toolbox at Sky Betting and Gaming.
November Sheffield R Meetup: Modelling, Inference, and Prediction
At this month's meeting we'll be talking about modelling, inference, and prediction.
Shaun Coutts joins us to talk about his experience with R and JAGS:
Getting by with help from family and neighbours: spatially and phylogenetically lagged models in R (with a bit of JAGS).
I am a quantitative ecologist who uses statistical, analytical and simulation models to look at pressing environmental issues such as invasive species and wildlife harvest. I am currently a post doc at the University of Sheffield modelling the evolution of herbicide resistance. I also use these approaches to address a wide range of topics such as population dynamics, community ecology, dispersal ecology, optimal population management, and social-economic systems. I am also interested in how far population models for one species or area can be extrapolated to others, and if so, how it should be done.
Lukas Drapal will introduce us to the online machine learning competition site Kaggle:
Compete (and win) on Kaggle.com
Kaggle is a great place to learn data science by doing. Kaggle hosts machine learning competitions and has a very active community that share their techniques and tips. In this talk I describe how Kaggle works, share best practices and dive into the challenges of a competition that we won among more than 1500 teams: the Allstate Purchase Prediction Challenge.
Sheffield R Meetup - October - [Updated] Exploring R interfaces
Update: Due to a change in availability we have had to postpone the talk by James Waterhouse from Sky Betting and Gaming. If you were hoping to catch James then stay tuned: we're hoping that he will be able to join us at a later date.
At this month's meeting we'll be talking about how R interacts with different languages and technologies.
Our very own Mat Hall will give us a crash course in using R with Jupyter notebooks for interactive, reproducible and collaborative research.
We'll also hear from Mike Croucher, who is an EPSRC Research Software Engineering Fellow working at The University of Sheffield. Mike specialises in assisting researchers to produce better quality software. He is co-founder of The University of Sheffield's Research Software Engineering group and plays a major role in the Sheffield Open Data Science Initiative. In September Mike was involved in organising the first ever Research Software Engineering conference.
September Meetup - Expert elicitation and unit testing in R
After the summer break, Sheffield R group is back with a couple of great talks to kick off the new season.
Our beginner's talk this month, presented by Alison Parton, is:
Can you trust the humble statistician? An intro to unit testing in R
"You've slaved away converting the awesome methods you've concocted into R code. It runs without any nasty red errors so you've not missed an end bracket, but are they all really in the right place? I'll be telling beginners what I've learnt after dipping my toes into the world of unit testing in R, and (hopefully) getting the veterans to assist in my quest for reliable code."
SHELF: an R package for expert elicitation
For our expert talk we welcome Professor Jeremy Oakley who will introduce SHELF, an R package containing tools to support the Sheffield Elicitation Framework (SHELF). This is a toolkit for eliciting probability distributions from experts, translating an expert's judgements about the characteristics of a distribution for a parameter of interest into a range of parametric distributions and providing visual and summary feedback. SHELF also provides methods for interactive plotting and elicitation, weighting for multiple expert elicitation and a graphical interface for the roulette elicitation method.
We are delighted to announce the first Hack event for the SheffieldR Users Group! This event aims to bring together R users and learners of all levels for a hands-on session with some real-world open data. We will be working with the National Biodiversity Network (NBN) dataset, which records sightings of different species, along with their location, taken from numerous wildlife surveys. This event is supported by the Sheffield Wildlife Trust, and we will be sharing our outputs with them as we go.
This is the first of three hack evenings based at the Sheffield Methods Institute at the University of Sheffield (see map). You are welcome to attend as many or as few sessions as you wish - so don't worry if you can't commit to all three at this stage. We kick off at 4pm but feel free to come along later if you can't make that time. Feel free to bring your own laptops if you wish but this is not required.
National Biodiversity Network data hack kickoff event
We are delighted to announce the first Hack event for the SheffieldR Users Group! This event aims to bring together R users and learners of all levels for a hands-on session with some real-world open data. We will be working with the National Biodiversity Network (NBN) dataset, which records sightings of different species, along with their location, taken from numerous wildlife surveys. This event is supported by the Sheffield Wildlife Trust, and we will be sharing our outputs with them as we go.
This is the kickoff of the hack that will introduce the data set and set the scene for the hack events. The hack will take place over three consecutive evenings. You are welcome to attend as many or as few sessions as you wish - so don't worry if you can't commit to all three at this stage.
All are welcome - there will be a mix of experienced R users as well as beginners, and no knowledge of biodiversity is required. An introductory session will be held on Feb 23rd at the February SheffieldR Users Group meeting at the Red Deer - you can sign up to this and any future SheffieldR events at our meetup site (http://www.meetup.com/SheffieldR-Sheffield-R-Users-Group/).