Malith's Perspective

Friday, August 19, 2016

postCP change point detection with GSOC

Posted by Unknown at 11:34:00 AM Labels: #GSOC , #postCP , #R , #Tech

Introduction

The project aimed to improve the postCP package and make it available on CRAN again. The snag that prevented the package from being updated was the recent requirement that .C() calls in R code use DUP=TRUE, with .Call() suggested instead of .C(). The package therefore had to be reimplemented in the most R-compliant way, separating the model-specific implementation from the core implementation. The core part is implemented in C++ for computational speed. The project page can be found here.

Implementation

To improve usability, a glm-style "formula" and "family" specification was added. These commits are in the feature-glmsyntax branch. The model-specific implementation covers four models (Gaussian, Poisson, Binomial and Gamma), whereas the previous package included only three (Gaussian, Poisson and Binomial). These commits are found here.
https://github.com/malithj/postCP_Improvement/commits/feature-glmsyntax?author=malithj

The model-specific part was integrated with the core part (the C++ forward-backward algorithm) to give the required results. Following the change point model, a separate section was added for parameter calculations and for parameter updates based on the updated log evidences. Standard error checks were included to catch incorrect user inputs as far as possible and to give meaningful error messages. Vignettes have been included to provide long-form documentation. The commits are found here.
https://github.com/malithj/postCP_Improvement/commits/feature-model?author=malithj
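
To make the forward-backward step more concrete, here is a minimal sketch in C++. It is an illustration only, not the code from the package: it assumes a constrained model in which each observation either stays in the current segment or moves on to the next one with equal weight, and it takes a precomputed matrix of log evidences (one row per observation, one column per segment) as input. The names posterior_segment_probs and log_sum_exp are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// log(exp(a) + exp(b)), computed without overflow.
double log_sum_exp(double a, double b) {
    const double neg_inf = -std::numeric_limits<double>::infinity();
    if (a == neg_inf) return b;
    if (b == neg_inf) return a;
    const double m = std::max(a, b);
    return m + std::log(std::exp(a - m) + std::exp(b - m));
}

// Assumption: log_evidence[i][k] is the log density of observation i under
// segment k. The path starts in segment 0, ends in segment K - 1, and at each
// step either stays in its segment or advances by one (equal weights).
// Returns post[i][k] = P(observation i belongs to segment k | data).
Matrix posterior_segment_probs(const Matrix& log_evidence) {
    const std::size_t n = log_evidence.size();
    assert(n > 0);
    const std::size_t K = log_evidence[0].size();
    assert(K > 0 && K <= n);
    const double neg_inf = -std::numeric_limits<double>::infinity();

    Matrix fwd(n, std::vector<double>(K, neg_inf));
    Matrix bwd(n, std::vector<double>(K, neg_inf));

    // Forward pass: log weight of all paths that reach segment k at position i.
    fwd[0][0] = log_evidence[0][0];
    for (std::size_t i = 1; i < n; ++i) {
        for (std::size_t k = 0; k < K; ++k) {
            double prev = fwd[i - 1][k];
            if (k > 0) prev = log_sum_exp(prev, fwd[i - 1][k - 1]);
            fwd[i][k] = prev + log_evidence[i][k];
        }
    }

    // Backward pass: log weight of all continuations after position i.
    bwd[n - 1][K - 1] = 0.0;
    for (std::size_t i = n - 1; i-- > 0;) {
        for (std::size_t k = 0; k < K; ++k) {
            double next = bwd[i + 1][k] + log_evidence[i + 1][k];
            if (k + 1 < K) {
                next = log_sum_exp(
                    next, bwd[i + 1][k + 1] + log_evidence[i + 1][k + 1]);
            }
            bwd[i][k] = next;
        }
    }

    // Combine both passes and normalise by the total log evidence.
    const double log_z = fwd[n - 1][K - 1];
    Matrix post(n, std::vector<double>(K, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < K; ++k) {
            post[i][k] = std::exp(fwd[i][k] + bwd[i][k] - log_z);
        }
    }
    return post;
}
```

Calling posterior_segment_probs with, for example, Gaussian log densities for each candidate segment mean gives one row of segment probabilities per observation, and each row sums to 1. Both passes visit every observation once per segment, so the cost of this sketch is proportional to n times K.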

The development has been completed as agreed in the project proposal. All branches have been merged into the master branch, which reflects the latest version of the package.
https://github.com/malithj/postCP_Improvement

The completed source code of the package has been included in the following public Google folder for viewing. 
https://drive.google.com/drive/folders/0B8xO3Cc0h6rIbXNDTUM1TDlMbDQ?usp=sharing

Run time analysis 

The following table shows a run time analysis of the postCP package on a 2.7 GHz machine with 4 GB RAM running Ubuntu 14.04. The timings are consistent with roughly linear time complexity, O(N).

CPU time (in seconds)

n (iterations)   Mean Model   Slope Model
1000             0.104        0.056
10000            0.536        0.628
100000           2.636        2.3
1000000          16.076       13.776
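
As a rough check of the scaling, one can fit a straight line to log(time) against log(n): the slope estimates the exponent p in time ~ n^p, and a value at or below 1 is consistent with O(N). The sketch below does this with an ordinary least-squares fit. The function name loglog_slope and the constant names are hypothetical; the numbers are copied from the table above. For these timings the fitted exponent comes out between roughly 0.7 and 0.8 for both models, which probably reflects fixed overhead at the smaller sizes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Least-squares slope of log(t) against log(n).
// For t proportional to n^p, the returned value is p.
double loglog_slope(const std::vector<double>& n, const std::vector<double>& t) {
    assert(n.size() == t.size() && n.size() >= 2);
    const std::size_t m = n.size();

    double mean_x = 0.0;
    double mean_y = 0.0;
    for (std::size_t i = 0; i < m; ++i) {
        mean_x += std::log(n[i]);
        mean_y += std::log(t[i]);
    }
    mean_x /= static_cast<double>(m);
    mean_y /= static_cast<double>(m);

    double sxy = 0.0;
    double sxx = 0.0;
    for (std::size_t i = 0; i < m; ++i) {
        const double dx = std::log(n[i]) - mean_x;
        sxy += dx * (std::log(t[i]) - mean_y);
        sxx += dx * dx;
    }
    return sxy / sxx;
}

// Timings copied from the run time table (CPU time in seconds).
const std::vector<double> kSizes = {1e3, 1e4, 1e5, 1e6};
const std::vector<double> kMeanModelSeconds = {0.104, 0.536, 2.636, 16.076};
const std::vector<double> kSlopeModelSeconds = {0.056, 0.628, 2.3, 13.776};
```

Calling loglog_slope(kSizes, kMeanModelSeconds) and loglog_slope(kSizes, kSlopeModelSeconds) gives the two estimated exponents. With only four measurements per model this is an indication, not a proof of the complexity.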

The mean model and the slope model are given below. 

Building the Package

1. Clone my repository https://github.com/malithj/postCP_Improvement

2. Navigate into the postCP folder 

3. Build using R or RStudio (in RStudio, use Ctrl + Shift + B).

Coding Guidelines 

I referred to "Hadley Wickham R style" and "R style. An Rchaeological Commentary" for coding style guidelines, because I saw that some existing R packages have issues that come from not following best practices.

Final remarks 

I would like to thank my mentors Gregory Nuel and Guillem Rigaill for all their support and guidance, and for bearing with me for the duration of the project. :D I would also like to thank Minh Luong (original creator of the package) for the additional documentation provided while I was trying to understand the previous implementation.

Finally, thank you to GSoC for the great learning opportunity! :D

The official GSoC Project page can be found here.

Linked Blog posts

The following blog post summarizes my GSoC experience and the challenges I had to face.
http://malithjayaweera.blogspot.com/2016/08/my-experience-with-gsoc-and-r.html




