Browsing Category

R blog

A possibility for using R and Hadoop together

R blog | July 9, 2013 | 1 Comment

As mentioned in the previous article, one way of dealing with some Big Data problems is to integrate R into the Hadoop ecosystem. This requires a bridge between the two environments: R must be able to handle data that are stored in the Hadoop Distributed File System (HDFS). To process the distributed data, all the algorithms must follow the MapReduce model, which makes it possible to handle the data and to parallelize the jobs. Another requirement is a single, unified analysis procedure, so there must be a connection between data held in memory and data stored in HDFS.
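
To make the MapReduce model mentioned above more concrete, here is a minimal sketch of a word count written in that style using only base R. It runs entirely in memory and does not connect to Hadoop or HDFS; the `chunks` list and the `map_chunk` helper are hypothetical names chosen for illustration, standing in for blocks of a distributed file and for a user-defined map function.

```r
# Illustrative only: an in-memory word count written in the MapReduce style.
# Nothing here talks to Hadoop or HDFS; `chunks` is a hypothetical stand-in
# for blocks of a distributed file.
chunks <- list(
  c("r and hadoop", "hadoop stores data"),
  c("r reads data")
)

# Map step: each chunk independently emits (word, 1) pairs as a named vector.
map_chunk <- function(lines) {
  words <- unlist(strsplit(lines, " ", fixed = TRUE))
  setNames(rep(1L, length(words)), words)
}
mapped <- lapply(chunks, map_chunk)

# Shuffle step: group the emitted values by key (the word).
pairs <- unlist(mapped)
grouped <- split(unname(pairs), names(pairs))

# Reduce step: combine the values of each key into a single count.
word_counts <- vapply(grouped, sum, integer(1))
print(word_counts)
```

Because each chunk is mapped independently, the map step is the part that could in principle run in parallel on different nodes; only the grouped values need to be brought together for the reduce step.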

A Big Data introduction

R blog | June 5, 2013 | No Comments

Since R keeps its data in the computer's RAM, it can handle only rather small data sets. Nevertheless, some packages make it possible to work with larger volumes, and the best solution is to connect R with a Big Data environment. This post introduces some Big Data concepts that are fundamental to understanding how R can work in this environment. Later posts will explain in detail how R can be connected with Hadoop.

Operating on files with R: copy and rename

R blog | May 22, 2013 | 3 Comments

Nowadays, routine operations on files, such as renaming or copying, are performed with a few mouse clicks. Sometimes it is useful to perform these operations in batch. Linux users perform them through the shell. Windows users can also use the shell, but there are also many utilities that simplify these operations.

Why should someone use R to copy or rename a (lot of) file(s)?
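
As a rough illustration of the batch operations described above, the following sketch copies and renames a handful of files with base R functions (`list.files`, `file.copy`, `file.rename`). The directory layout and the `report_`/`summary_` file names are hypothetical, and everything happens inside a temporary folder so that no real files are touched.

```r
# Illustrative only: work in a throwaway directory so no real files are touched.
work_dir <- file.path(tempdir(), "batch_demo")
unlink(work_dir, recursive = TRUE)  # start from a clean slate on re-runs
backup_dir <- file.path(work_dir, "backup")
dir.create(backup_dir, recursive = TRUE, showWarnings = FALSE)

# Create three hypothetical input files to operate on.
old_names <- file.path(work_dir, sprintf("report_%d.txt", 1:3))
for (f in old_names) writeLines("example content", f)

# Copy every .txt file into the backup folder in one vectorized call.
txt_files <- list.files(work_dir, pattern = "\\.txt$", full.names = TRUE)
copied <- file.copy(txt_files, backup_dir)

# Rename the originals in batch: report_1.txt becomes summary_1.txt, and so on.
new_names <- file.path(work_dir, sub("^report_", "summary_", basename(txt_files)))
renamed <- file.rename(txt_files, new_names)

print(list.files(work_dir, recursive = TRUE))
```

Both `file.copy` and `file.rename` return a logical vector with one element per file, so a script can check which operations succeeded before going on.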

Learning RStudio for R Statistical Computing

R blog | February 6, 2013 | 1 Comment

"Learning RStudio for R Statistical Computing" will teach you how to quickly and efficiently create and manage statistical analysis projects, import data, develop R scripts, and generate reports and graphics. R developers will learn about package development, coding principles, and version control with RStudio.

This book will help you to learn and understand RStudio features to effectively perform statistical analysis and reporting, code editing, and R development.

Maps in R: choropleth maps

R blog | January 24, 2013 | 10 Comments

This is the third article in the Maps in R series. The earlier installments showed how to draw a map without placing data on it and how to plot point data on a map; this one presents the creation of a choropleth map.

A choropleth map is a thematic map featuring regions colored or shaded according to the value assumed by the variable of interest in that particular region.

Maps in R: Introduction - Drawing the map of Europe

R blog | December 19, 2012 | 18 Comments

This post is a brief follow-up to a question that appeared some time ago on the “The R Project for Statistical Computing” LinkedIn group, which I’m reporting here:

How can I draw a map of MODERN Europe?

Hi, I'm trying to draw a map of modern Europe but I've found only maps of twenty years ago, with Yugoslavia and Czechoslovakia still united!!!
Does anyone know where I can get a more recent map to be employed with packages such as 'sp' or 'maps'?
Thank you very much!

Two different solutions to the above question will be provided here, using two different R packages.

Genome annotation with NCBI2R

R blog | November 19, 2012 | No Comments

It's very convenient to manage data with R: you can import your dataset, find many packages that meet your needs, and then plot your results.
However, retrieving data from online databases can be very bothersome. You need to use the specific API and perhaps write your scripts in a new programming language, then convert your data to a table format and finally import them into R.
