Since strings are the first true objects we've encountered, a brief description of methods is in order. As mentioned earlier (Section 1.4.3), when dealing with objects, functions are known as methods. Besides the terminology, methods are invoked slightly differently than functions. When you call a function like len, you pass the arguments in a comma-separated list surrounded by parentheses after the function name. When you invoke a method, you provide the name of the object the method is to act upon, followed by a period, followed by the method name and the parenthesized list of additional arguments. Remember to provide empty parentheses if the method does not take any arguments, so that python can distinguish a method call with no arguments from a reference to a variable stored within the object.
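As a small illustration of the difference in calling syntax, the sketch below applies the function len and a few string methods to the same object. The variable names are invented for this example, and print is written with parentheses so that the code should also run under newer versions of python.

```python
# Hypothetical variable names, chosen only for this illustration.
greeting = 'hello'

# A function call: the object is passed as an argument.
length = len(greeting)

# A method call: object, period, method name, parenthesized arguments.
shouted = greeting.upper()          # no additional arguments
count_l = greeting.count('l')       # one additional argument

# Without parentheses nothing is called; the name simply refers to
# the method stored within the string object.
method_ref = greeting.upper

print(length)
print(shouted)
print(count_l)
print(callable(method_ref))
```

Calling method_ref() later gives the same result as greeting.upper(), which shows that the parentheses, not the name, trigger the call.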
Strings in python are immutable objects; this means that you can't change the value of a string in place. If you do want to change the value of a string, you need to invoke a method on the variable containing the string you wish to change, and then assign the value returned by that method back to the same variable, as some of the examples below will show.
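A minimal sketch of what immutability means in practice follows; the variable names are made up for the illustration. Assigning to a single character fails with a TypeError, while rebinding the name to the value returned by a method achieves the intended change.

```python
word = 'parrot'

# Strings are immutable, so assigning to one character fails.
try:
    word[0] = 'c'
    changed_in_place = True
except TypeError:
    changed_in_place = False

# The method returns a new string; the original is untouched
# until the name is rebound to the returned value.
capitalized = word.capitalize()
unchanged = word                    # still the original value
word = word.capitalize()            # the name now refers to the new string

print(changed_in_place)
print(unchanged)
print(word)
```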
Many of the string methods provided by python are listed in Table 2.2. Among the most useful are the methods split and join. The split method operates on a string, and returns a list, each of whose elements is a word in the original string, where a word is defined by default as a group of non-whitespace characters, separated from its neighbors by one or more whitespace characters. If you provide one optional argument to the split method, it is used as the separator instead of runs of whitespace. Note the subtle difference between invoking split with no arguments, and with an argument consisting of a single blank space:
>>> phrase = 'This parrot  is dead'
>>> phrase.split()
['This', 'parrot', 'is', 'dead']
>>> phrase.split(' ')
['This', 'parrot', '', 'is', 'dead']
Note that the string contains two spaces between 'parrot' and 'is'. When more than one space is encountered in the string, the default behavior treats the run as if it were just a single space, but when we explicitly set the separator to a single space, multiple spaces in the string result in extra (empty) elements in the resultant list. You can also obtain the default behavior for split by specifying None for the sep argument.
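The three ways of calling split discussed above can be compared side by side. This is only a sketch with invented variable names; the last line assumes a comma-separated string purely to show that any separator string behaves like the explicit single space.

```python
text = 'This parrot  is dead'           # two spaces after 'parrot'

default_words = text.split()            # runs of whitespace act as one separator
none_words = text.split(None)           # None requests the same default behavior
space_words = text.split(' ')           # every single space is a separator

# With an explicit separator, adjacent separators yield empty strings.
fields = 'spam,eggs,,bacon'.split(',')

print(default_words)
print(none_words)
print(space_words)
print(fields)
```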
When the maxsplit argument is given, the split method performs at most maxsplit splits, so the returned list has at most maxsplit + 1 elements. This can be very useful when you only need to split part of a string, since the remaining pieces will be put into a single element of the list which is returned. For example, suppose you had a file containing definitions of words, with the word being the first string and the definition consisting of the remainder of the line. By setting maxsplit to 1, the word would become the first element of the returned list, and the definition would become the second element of the list, as the following example shows:
>>> line = 'Ni a sound that a knight makes'
>>> line.split(maxsplit=1)
['Ni', 'a sound that a knight makes']
In some versions of python, the split method will not accept a named argument for maxsplit. In that case, you would need to explicitly specify the separator, using None to obtain the default behavior.
>>> line.split(None, 1)
['Ni', 'a sound that a knight makes']
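To show how this might be used for the definitions file described above, here is a sketch that builds a dictionary from a list of lines. The list stands in for the contents of a file and its entries are invented for the illustration; every line is assumed to contain at least two words, since otherwise the two-name assignment would raise a ValueError.

```python
# Hypothetical lines, standing in for the contents of a definitions file.
lines = [
    'Ni a sound that a knight makes',
    'spam a canned meat product',
]

definitions = {}
for line in lines:
    # Positional arguments (None, 1) work whether or not the installed
    # version of python accepts maxsplit as a named argument.
    word, meaning = line.split(None, 1)
    definitions[word] = meaning

print(definitions['Ni'])
print(definitions['spam'])
```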
When using the join method for strings, remember that the method operates on the string that will be placed between the elements of the joined list, not on the list itself. This may result in some unusual-looking statements:
>>> words = ['spam', 'spam', 'bacon', 'spam']
>>> ' '.join(words)
'spam spam bacon spam'
Of course, you could assign the value of ' ' to a variable to improve the appearance of such a statement.
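The suggestion of naming the separator might look like the following sketch; the variable names are invented for the example. The last statement also illustrates that join expects every element to be a string, so other values need to be converted first (passing the numbers directly would raise a TypeError).

```python
words = ['spam', 'spam', 'bacon', 'spam']

space = ' '                         # naming the separator reads more naturally
sentence = space.join(words)

comma = ', '
menu = comma.join(words)

# join works only on strings, so convert other values before joining.
numbers = [1, 2, 3]
joined_numbers = '-'.join([str(n) for n in numbers])

print(sentence)
print(menu)
print(joined_numbers)
```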
The index and find methods can be useful when trying to extract substrings, although techniques using the re module (Section 8.5) will generally be more powerful. As an example of the use of these methods, suppose we have a string with a parenthesized substring, and we wish to extract just that substring. Using the slicing techniques explained in Section 2.4.3, and locating the substring using, for example, index and rindex, here's one way to solve the problem:
>>> model = 'Turbo Accelerated Widget (MMX-42b) Press'
>>> try:
...     model[model.index('(') + 1 : model.rindex(')')]
... except ValueError:
...     print('No parentheses found')
...
'MMX-42b'
When you use these methods, make sure to handle the case where the substring is not found: the index methods raise a ValueError, while the find methods return -1.
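For comparison, the same extraction can be sketched with find and rfind, testing for the returned value of -1 instead of catching an exception. The function name parenthesized is hypothetical, and returning None when nothing is found is just one possible convention.

```python
def parenthesized(text):
    """Return the text between the first '(' and the last ')', or None."""
    start = text.find('(')
    end = text.rfind(')')
    # find and rfind return -1 rather than raising an exception,
    # so the "not found" case has to be tested explicitly.
    if start == -1 or end == -1 or end < start:
        return None
    return text[start + 1:end]

found = parenthesized('Turbo Accelerated Widget (MMX-42b) Press')
missing = parenthesized('Turbo Accelerated Widget')

print(found)
print(missing)
```

Forgetting the test is a quiet failure with find: since -1 is a valid index, the slice would still succeed and silently return the wrong substring.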
Remember that the string methods will not change the value of the string they are acting on, but you can achieve the same effect by overwriting the string with the returned value of the method. For example, to replace a string with an equivalent version consisting of all upper-case characters, statements like the following could be used:
>>> language = 'python'
>>> language = language.upper()
>>> language
'PYTHON'
Finally, python offers a variety of so-called predicate methods, which take no arguments, and return a true value (True, or 1 in older versions of python) if all the characters in a string are of a particular type, and a false value (False, or 0) otherwise. These methods, whose use should be obvious from their names, include isalnum, isalpha, isdigit, islower, isspace, istitle, and isupper.
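A few of the predicate methods are exercised in the sketch below; the strings and variable names are chosen only for illustration. Note that an empty string yields a false value, since there are no characters of the required type.

```python
is_digits = '42'.isdigit()              # every character is a digit
not_digits = '4.2'.isdigit()            # '.' is not a digit
is_alpha = 'parrot'.isalpha()           # letters only
is_alnum = 'MMX42b'.isalnum()           # letters and digits only
has_hyphen = 'MMX-42b'.isalnum()        # '-' is neither a letter nor a digit
is_blank = '  \t'.isspace()             # spaces and tabs are whitespace
is_title = 'Dead Parrot'.istitle()      # each word starts with a capital
empty = ''.isdigit()                    # the empty string gives a false value

# A typical use: check the input before converting it.
entry = '42'
if entry.isdigit():
    value = int(entry)
else:
    value = None
print(value)
```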
Related modules: string, re, StringIO.
Related exceptions: TypeError, IndexError.