Functions for Working with Characters

1  Sizes of Objects

Before we start looking at character manipulation, this is a good time to review the different functions that give us the size of an object.
  1. length - returns the length of a vector, or the total number of elements in a matrix (number of rows times number of columns). For a data frame, returns the number of columns.
  2. dim - for matrices and data frames, returns a vector of length 2 containing the number of rows and the number of columns. For a vector, returns NULL. The convenience functions nrow and ncol return the individual values that would be returned by dim.
  3. nchar - for a character string, returns the number of characters in the string. Returns a vector of values when applied to a vector of character strings. For a numeric value, nchar returns the number of characters in the printed representation of the number.
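The three functions above can be compared side by side with a short sketch. The objects v, m and df are made-up examples for illustration, not data from these notes.

```r
# Illustrative objects (hypothetical): a vector, a matrix, and a data frame
v  <- c(10, 20, 30)
m  <- matrix(1:6, nrow = 2, ncol = 3)
df <- data.frame(id = 1:4, score = c(90, 85, 77, 60))

length(v)    # 3: number of elements in the vector
length(m)    # 6: rows times columns
length(df)   # 2: number of columns in the data frame

dim(v)       # NULL: a plain vector has no dimensions
dim(m)       # 2 3
dim(df)      # 4 2
nrow(m)      # 2
ncol(m)      # 3

nchar("hello")              # 5
nchar(c("a", "bb", "ccc"))  # 1 2 3
nchar(12345)                # 5: the number is converted to a string first
```

Note that length and dim answer different questions for a data frame: length counts columns only, while dim reports both rows and columns.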

2  Character Manipulation

While it's quite natural to think of data as being numbers, manipulating character strings is also an important skill when working with data. We've already seen a few simple examples, such as choosing the right format for a character variable that represents a date, or using table to tabulate the occurrences of different character values for a variable. Now we're going to look at some functions in R that let us break apart, rearrange and put together character data.
One of the most important uses of character manipulation is "massaging" data into shape. Many times the data that is available to us, for example on a web page or as output from another program, isn't in a form that a program like R can easily interpret. In cases like that, we'll need to remove the parts that R can't understand, and organize the remaining parts so that R can read them efficiently.
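As a minimal sketch of this kind of "massaging", suppose some dollar amounts arrive as text with currency symbols and thousands separators. The values in raw are invented for illustration; gsub and as.numeric are standard base R functions, though this is only one of several ways to do the cleanup.

```r
# Hypothetical raw values, e.g. copied from a web page
raw <- c("$1,234.50", "$87.00", "$12,000")

# Remove dollar signs and commas; [$,] is a regular-expression
# character class matching either character
cleaned <- gsub("[$,]", "", raw)

# Convert the cleaned strings to numbers that R can compute with
values <- as.numeric(cleaned)
values
```

Passing the uncleaned strings to as.numeric directly would give NA values with a warning, which is why the unwanted characters are removed first.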
Let's take a look at some of the functions that R offers for working with character variables:


