Articles Written by Michele Usuelli

Michele works for Microsoft as a Lead Data Scientist and is the author of two books about Machine Learning with R. He worked for Revolution Analytics, a startup that developed a big data extension for R; the company was acquired by Microsoft in 2015.

R AND OOP - defining new classes

R blog, March 12, 2014

My previous article showed an example in which data analysis requires a structured framework built with R and OOP. This article explains in more detail how to build that framework.

Using OOP means creating new data structures and defining their methods, i.e. functions that perform specific tasks on the object. Defining a new data structure requires creating a new class, and this article shows how to do that through R's S4 classes.
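As a taste of what the article covers, here is a minimal sketch of an S4 class with one method. The class name `Sample`, its slots, and the `describe` generic are illustrative choices, not taken from the article:

```r
# Define a new S4 class with two slots (typed fields)
setClass(
  "Sample",
  representation(values = "numeric", label = "character")
)

# Define a generic, then a method implementing it for the new class
setGeneric("describe", function(object) standardGeneric("describe"))
setMethod("describe", "Sample", function(object) {
  paste0(object@label, ": mean = ", mean(object@values))
})

# Instantiate the class and call the method
s <- new("Sample", values = c(1, 2, 3), label = "toy data")
describe(s)  # "toy data: mean = 2"
```

Slots are accessed with `@`, and method dispatch is driven by the class of the argument, which is what makes the framework extensible to new data structures.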


A possibility for using R and Hadoop together

R blog, July 9, 2013

As mentioned in the previous article, one possibility for dealing with some Big Data problems is to integrate R within the Hadoop ecosystem, so a bridge between the two environments is necessary. R should be capable of handling data that are stored through the Hadoop Distributed File System (HDFS). In order to process the distributed data, all the algorithms must follow the MapReduce model, which allows the data to be handled and the jobs to be parallelized. Another requirement is a unified analysis procedure, so there must be a connection between in-memory data and data stored on HDFS.
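The MapReduce model mentioned above can be illustrated with a word count in plain R. This is an in-memory sketch of the pattern only; on a real cluster a bridge package (such as those from the RHadoop project) would run the map and reduce steps over data stored in HDFS:

```r
# Map step: for each line, emit (word, 1) pairs
map_step <- function(line) {
  words <- strsplit(tolower(line), "\\s+")[[1]]
  setNames(rep(1, length(words)), words)
}

# Shuffle step: group all emitted counts by key (the word)
shuffle <- function(pairs) split(unlist(pairs), names(unlist(pairs)))

# Reduce step: sum the counts within each group
reduce_step <- function(grouped) vapply(grouped, sum, numeric(1))

lines <- c("big data with R", "R and Hadoop", "big data")
counts <- reduce_step(shuffle(lapply(lines, map_step)))
counts["big"]  # 2
```

Because each map call sees only one record and each reduce call sees only one key's group, the two steps can run independently on different nodes, which is what makes the model parallelizable.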


A Big Data introduction

R blog, June 5, 2013

Since R keeps its data in the computer's RAM, it can handle only rather small data sets. Nevertheless, there are some packages that allow larger volumes to be processed, and the best solution is to connect R with a Big Data environment. This post introduces some Big Data concepts that are fundamental to understanding how R can work in this environment. Afterwards, some other posts will explain in detail how R can be connected with Hadoop.
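One simple way to work around the RAM limit from plain R, before moving to a full Big Data environment, is to process a file chunk by chunk so that only one chunk is ever in memory. A minimal sketch, with a hypothetical helper name:

```r
# Read a text file in chunks of `chunk_size` lines, processing each
# chunk in memory; here the "processing" is just counting the lines
process_in_chunks <- function(path, chunk_size = 10000) {
  con <- file(path, open = "r")
  on.exit(close(con))
  total <- 0
  repeat {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0) break
    total <- total + length(chunk)
  }
  total
}
```

The same loop structure works for any per-record computation (sums, filters, partial aggregates), which is also the intuition behind the MapReduce approach covered in the later posts.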
