R User Group of Milano (Italy)

Operating on files with R: copy and rename

Nowadays, routine operations on files, such as renaming or copying, are usually performed with a few mouse clicks. Sometimes, though, it is useful to perform these operations in batch. Linux users typically do this through the shell. Windows users can use the shell too, and there are also many utilities that simplify these operations.

Why should someone use R to copy or rename one or many files?

For an R user, R can be more intuitive than the operating system shell.

I found another good reason to use R for these operations: I needed to operate on files as a preliminary step to my statistical analyses.

I received a lot of files (about 20,000). The files were spread across many directories, structured as follows. Each directory refers to one day and contains some useless files, which I ignored, and a subdirectory with the txt files I need. The main directory has a name like "2012_09_21_Fri", while the subdirectory has a name like "Fri 21 sep 2012". So, I need to copy the relevant files into a directory with a name like "2012-09-21".

The first step is to list all the directories I have. I saved the full path and the bare name of each directory in two separate R vectors.
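As a minimal sketch of this step: the temporary folder and the vector name d are assumptions made for illustration, while dn is the name the text uses for the vector of directory names.

```r
# Sandbox: a temporary folder standing in for the real data location (assumption).
root <- tempfile("list_demo_")
dir.create(file.path(root, "2012_09_21_Fri"), recursive = TRUE)
dir.create(file.path(root, "2012_09_22_Sat"))

# Two parallel vectors: full paths (d) and bare directory names (dn).
d  <- list.dirs(root, full.names = TRUE, recursive = FALSE)
dn <- basename(d)
print(dn)
```

Calling dir() or list.files() with and without full.names = TRUE would give the same two vectors, but those functions also return plain files, not only directories.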

At this point, for every directory (so this step runs inside a for loop), I look for the subdirectory (it is the first entry in the directory) and list all the files it contains.
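The loop could look roughly like the sketch below. The sandbox, the file names and the txt_files list are hypothetical. The text takes the first entry of each day directory; this sketch instead picks the first entry that is itself a directory, so it does not depend on how the entries happen to sort.

```r
# Sandbox mimicking one day directory (all names here are illustrative).
root   <- tempfile("loop_demo_")
dayDir <- file.path(root, "2012_09_21_Fri")
dir.create(file.path(dayDir, "Fri 21 sep 2012"), recursive = TRUE)
file.create(file.path(dayDir, "Fri 21 sep 2012", c("a.txt", "b.txt")))
file.create(file.path(dayDir, "notes.log"))  # a useless file to be ignored

d <- list.dirs(root, recursive = FALSE)
txt_files <- list()  # hypothetical container: one entry per day directory

for (index in seq_along(d)) {
  # First subdirectory found inside the current day directory.
  subdir <- list.dirs(d[index], recursive = FALSE)[1]
  # Full paths of the txt files inside that subdirectory.
  txt_files[[index]] <- list.files(subdir, pattern = "\\.txt$", full.names = TRUE)
}
```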

Now, I need to create a new directory with a name like "2012-09-21". As seen above, the day, month and year are all available in the directory name, but not in the format I want. So, I can use the paste() and substr() functions to build the name. Please note that cdn contains only one element of dn. For example, cdn = dn[index], where index is the loop counter.
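A small sketch of how the name can be assembled, assuming the directory names always follow the "2012_09_21_Fri" pattern, so that year, month and day sit at fixed character positions. The variable name subdirName is the one mentioned later in the text.

```r
cdn <- "2012_09_21_Fri"   # one element of dn, e.g. dn[index]

# Characters 1-4 are the year, 6-7 the month, 9-10 the day.
subdirName <- paste(substr(cdn, 1, 4),
                    substr(cdn, 6, 7),
                    substr(cdn, 9, 10),
                    sep = "-")
print(subdirName)  # "2012-09-21"
```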

Now, I can create my directory, using the dir.create() function.
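For example, assuming the new directories are collected under a single destination folder (here a temporary one, purely for illustration):

```r
root       <- tempfile("create_demo_")  # hypothetical destination folder
subdirName <- "2012-09-21"
newDir     <- file.path(root, subdirName)

# recursive = TRUE also creates the parent folder if it is missing;
# the dir.exists() check avoids a warning when the script is run again.
if (!dir.exists(newDir)) {
  dir.create(newDir, recursive = TRUE)
}
```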

Now, I need to copy all the txt files from their old subdirectories into the new directories I created above.
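The copy itself can be done with file.copy(), which accepts a vector of source files and a single existing destination directory. The sketch below builds a throwaway sandbox first; the paths and file names are assumptions.

```r
# Sandbox: one old subdirectory with two txt files, and one new directory.
root   <- tempfile("copy_demo_")
oldDir <- file.path(root, "2012_09_21_Fri", "Fri 21 sep 2012")
newDir <- file.path(root, "2012-09-21")
dir.create(oldDir, recursive = TRUE)
dir.create(newDir)
writeLines("first",  file.path(oldDir, "a.txt"))
writeLines("second", file.path(oldDir, "b.txt"))

# Copy every txt file; file.copy() returns one TRUE/FALSE per file.
files  <- list.files(oldDir, pattern = "\\.txt$", full.names = TRUE)
copied <- file.copy(from = files, to = newDir)
```

Checking all(copied) after each directory is a cheap way to notice files that failed to copy, which matters with thousands of files.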

Finally, the names of the txt files are also hard to interpret, so I need to rename these files. First, I list the files by name only, without the full path.
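One way to get the bare file names, sketched on a temporary directory (the directory and the names oldNames and fromPaths are illustrative): list.files() returns names without the path by default, and basename() strips the path from full paths that are already at hand.

```r
newDir <- tempfile("names_demo_")
dir.create(newDir)
file.create(file.path(newDir, c("a.txt", "b.txt")))

# full.names defaults to FALSE, so only the file names are returned.
oldNames <- list.files(newDir, pattern = "\\.txt$")

# Equivalent result starting from full paths.
fromPaths <- basename(list.files(newDir, pattern = "\\.txt$", full.names = TRUE))
```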

And now, I can rename my files. newNames is a character vector containing the new file names; it is built in the same way as subdirName.
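A sketch of the renaming step with file.rename(). The original file names and the naming scheme for newNames are not given in the text, so the "rec_1.txt" names and the date prefix below are purely hypothetical.

```r
newDir <- tempfile("rename_demo_")
dir.create(newDir)
file.create(file.path(newDir, c("rec_1.txt", "rec_2.txt")))  # hypothetical names

oldNames   <- list.files(newDir, pattern = "\\.txt$")
subdirName <- "2012-09-21"

# Hypothetical scheme: drop the "rec_" prefix and put the date in front.
newNames <- paste(subdirName, "_", substr(oldNames, 5, nchar(oldNames)), sep = "")

# file.rename() needs full paths and returns one TRUE/FALSE per file.
renamed <- file.rename(from = file.path(newDir, oldNames),
                       to   = file.path(newDir, newNames))
```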
