

3. An Informal Introduction to Python

In the following examples, input and output are distinguished by the presence or absence of prompts (">>> " and "... "): to repeat an example, type everything after the prompt when the prompt appears; lines that do not begin with a prompt are output from the interpreter. A secondary prompt on a line by itself means you must type a blank line; this is used to end a multi-line command.

Many of the examples in this manual, even those entered at the interactive prompt, include comments. Comments in Python start with the hash character, "#", and extend to the end of the physical line. A comment may appear at the start of a line or following whitespace or code, but not within a string literal. A hash character within a string literal is just a hash character.

Some examples:


# this is the first comment
SPAM = 1                 # and this is the second comment
                         # ... and now a third!
STRING = "# This is not a comment."

3.1 Using Python as a Calculator

Let's try some simple Python commands. Start the interpreter and wait for the primary prompt, ">>> ". (It shouldn't take long.)

3.1.1 Numbers

The interpreter acts as a simple calculator: you can type an expression at it and it will write the value. Expression syntax is straightforward: the operators +, -, * and / work just like in most other languages (for example, Pascal or C); parentheses can be used for grouping. For example:


>>> 2+2
4
>>> # This is a comment
... 2+2
4
>>> 2+2  # and a comment on the same line as code
4
>>> (50-5*6)/4
5
>>> # Integer division returns the floor:
... 7/3
2
>>> 7/-3
-3
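
The flooring rule in the last two lines can be sketched as follows. This is a minimal illustration, not part of the original text: it uses the `//` operator, which floors on both Python 2 and Python 3, whereas plain `/` between integers floors only on the Python 2 interpreter this tutorial describes. The variable names are illustrative.

```python
# Floor division rounds toward negative infinity, not toward zero.
quotient = 7 // 3             # 2
negative_quotient = 7 // -3   # -3, because the exact result -2.33... is floored

# divmod() returns the floored quotient together with the remainder;
# the remainder takes the sign of the divisor.
q, r = divmod(7, -3)

print(quotient)
print(negative_quotient)
print(q * -3 + r)             # reconstructs the dividend
```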

Like in C, the equal sign ("=") is used to assign a value to a variable. The value of an assignment is not written:


>>> width = 20
>>> height = 5*9
>>> width * height
900

A value can be assigned to several variables simultaneously:


>>> x = y = z = 0  # Zero x, y and z
>>> x
0
>>> y
0
>>> z
0

There is full support for floating point; operators with mixed type operands convert the integer operand to floating point:


>>> 3 * 3.75 / 1.5
7.5
>>> 7.0 / 2
3.5

Complex numbers are also supported; imaginary numbers are written with a suffix of "j" or "J". Complex numbers with a nonzero real component are written as "(real+imagj)", or can be created with the "complex(real, imag)" function.

>>> 1j * 1J
(-1+0j)
>>> 1j * complex(0,1)
(-1+0j)
>>> 3+1j*3
(3+3j)
>>> (3+1j)*3
(9+3j)
>>> (1+2j)/(1+1j)
(1.5+0.5j)

Complex numbers are always represented as two floating point numbers, the real and imaginary part. To extract these parts from a complex number z, use z.real and z.imag.

>>> a=1.5+0.5j
>>> a.real
1.5
>>> a.imag
0.5

The conversion functions to floating point and integer (float(), int() and long()) don't work for complex numbers: there is no one correct way to convert a complex number to a real number. Use abs(z) to get its magnitude (as a float) or z.real to get its real part.

>>> a=3.0+4.0j
>>> float(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: can't convert complex to float; use abs(z)
>>> a.real
3.0
>>> a.imag
4.0
>>> abs(a)  # sqrt(a.real**2 + a.imag**2)
5.0
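
The same behaviour can be checked in a script. The following is a small sketch with illustrative names; it catches the TypeError instead of letting the traceback appear, and uses print() calls so that it also runs on Python 3.

```python
z = complex(3.0, 4.0)      # same value as the literal 3.0+4.0j

# float(z) is rejected, so choose the projection you actually want.
try:
    float(z)
    conversion_failed = False
except TypeError:
    conversion_failed = True

magnitude = abs(z)         # sqrt(z.real**2 + z.imag**2)
real_part = z.real
imag_part = z.imag

print(conversion_failed)
print(magnitude)
```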

In interactive mode, the last printed expression is assigned to the variable _. This makes it somewhat easier to continue calculations when you are using Python as a desk calculator, for example:

>>> tax = 12.5 / 100
>>> price = 100.50
>>> price * tax
12.5625
>>> price + _
113.0625
>>> round(_, 2)
113.06

This variable should be treated as read-only by the user. Don't explicitly assign a value to it: you would create an independent local variable with the same name, masking the built-in variable and its magic behavior.

3.1.2 Strings

Besides numbers, Python can also manipulate strings, which can be expressed in several ways. They can be enclosed in single quotes or double quotes:

>>> 'spam eggs'
'spam eggs'
>>> 'doesn\'t'
"doesn't"
>>> "doesn't"
"doesn't"
>>> '"Yes," he said.'
'"Yes," he said.'
>>> "\"Yes,\" he said."
'"Yes," he said.'
>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'

String literals can span multiple lines in several ways. Continuation lines can be used, with a backslash as the last character on the line indicating that the next line is a logical continuation of the line:


hello = "This is a rather long string containing\n\
several lines of text just as you would do in C.\n\
    Note that whitespace at the beginning of the line is\
 significant."

print hello

Note that newlines still need to be embedded in the string using \n; the newline following the trailing backslash is discarded. This example would print the following:

This is a rather long string containing
several lines of text just as you would do in C.
    Note that whitespace at the beginning of the line is significant.

If we make the string literal a "raw" string, however, the \n sequences are not converted to newlines; instead, the backslash at the end of the line and the newline character in the source are both included in the string as data. Thus, the example:

hello = r"This is a rather long string containing\n\
several lines of text much as you would do in C."

print hello

would print:


This is a rather long string containing\n\
several lines of text much as you would do in C.
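
The difference between the two kinds of literal is easiest to see by counting characters. A minimal sketch (the names are illustrative, and print() is used so the code also runs on Python 3):

```python
regular = "a\nb"    # \n is converted to a single newline character
raw = r"a\nb"       # the backslash and the 'n' stay as two characters

print(len(regular))     # 3
print(len(raw))         # 4
print(raw == "a\\nb")   # a raw \n equals an escaped backslash followed by n
```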

Or, strings can be surrounded in a pair of matching triple-quotes: """ or '''. End of lines do not need to be escaped when using triple-quotes, but they will be included in the string.


print """
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
"""

produces the following output:


Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
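
As a small sketch of the point that line ends inside triple quotes become part of the string, the following compares a triple-quoted literal with its single-line equivalent. The variable name is illustrative.

```python
text = """first
second
"""

# Each line end typed inside the triple quotes is stored as "\n".
print(text.count("\n"))
print(text == "first\nsecond\n")
```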

The interpreter prints the result of string operations in the same way as they are typed for input: inside quotes, and with quotes and other funny characters escaped by backslashes, to show the precise value. The string is enclosed in double quotes if it contains a single quote and no double quotes; otherwise it is enclosed in single quotes. (The print statement, described later, can be used to write strings without quotes or escapes.)
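
The quoting rule can be observed outside the interactive prompt with the built-in repr(), which produces the same form the interpreter echoes. This is a sketch with illustrative names; print() is used so it also runs on Python 3.

```python
samples = ["spam eggs", "doesn't", '"Yes," he said.', '"Isn\'t," she said.']

# repr() picks double quotes only when the string holds a single quote
# and no double quote; otherwise it uses single quotes and escapes as needed.
shown = [repr(s) for s in samples]

for original, quoted in zip(samples, shown):
    print(quoted)      # what the interpreter would echo
    print(original)    # what print writes: no quotes, no escapes
```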

Strings can be concatenated (glued together) with the + operator, and repeated with *:

>>> word = 'Help' + 'A'
>>> word
'HelpA'
>>> '<' + word*5 + '>'
'<HelpAHelpAHelpAHelpAHelpA>'

Two string literals next to each other are automatically concatenated; the first line above could also have been written "word = 'Help' 'A'". This only works with two literals, not with arbitrary string expressions:

>>> 'str' 'ing'                   #  <-  This is ok
'string'
>>> 'str'.strip() + 'ing'   #  <-  This is ok
'string'
>>> 'str'.strip() 'ing'     #  <-  This is invalid
  File "<stdin>", line 1, in ?
    'str'.strip() 'ing'
                      ^
SyntaxError: invalid syntax
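
One common use of adjacent-literal concatenation, shown here as a sketch rather than as something the original text prescribes, is wrapping a long string across several source lines inside parentheses. The names are illustrative.

```python
# Adjacent literals are joined when the source is compiled, so no + is needed.
message = ("Two string literals next to each other "
           "are joined automatically, "
           "which helps when wrapping long text.")

joined = 'str' 'ing'

print(joined)
print(message)
```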

Strings can be subscripted (indexed); like in C, the first character of a string has subscript (index) 0. There is no separate character type; a character is simply a string of size one. Like in Icon, substrings can be specified with the slice notation: two indices separated by a colon.

>>> word[4]
'A'
>>> word[0:2]
'He'
>>> word[2:4]
'lp'

Slice indices have useful defaults; an omitted first index defaults to zero, an omitted second index defaults to the size of the string being sliced.

>>> word[:2]    # The first two characters
'He'
>>> word[2:]    # Everything except the first two characters
'lpA'

Unlike C strings, Python strings cannot be changed. Assigning to an indexed position in the string results in an error:

>>> word[0] = 'x'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object doesn't support item assignment
>>> word[:1] = 'Splat'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object doesn't support slice assignment

However, creating a new string with the combined content is easy and efficient:


>>> 'x' + word[1:]
'xelpA'
>>> 'Splat' + word[4]
'SplatA'
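
A script-level sketch of the same idea: the failed assignment is caught, and a new string is built from slices while the original stays untouched. The names are illustrative, and print() is used so the code also runs on Python 3.

```python
word = 'Help' + 'A'

try:
    word[0] = 'x'            # strings do not support item assignment
    assignment_failed = False
except TypeError:
    assignment_failed = True

patched = 'x' + word[1:]     # build a new string instead

print(assignment_failed)
print(patched)
print(word)                  # the original string is unchanged
```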

Here's a useful invariant of slice operations: s[:i] + s[i:] equals s.


>>> word[:2] + word[2:]
'HelpA'
>>> word[:3] + word[3:]
'HelpA'
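
The invariant can be checked for a whole range of split points at once, including negative and out-of-range values. This is a minimal sketch; the range of -10 to 10 is an arbitrary choice for illustration.

```python
word = 'HelpA'

# s[:i] + s[i:] == s for every integer i, in range or not.
holds = all(word[:i] + word[i:] == word for i in range(-10, 11))

print(holds)
```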

Degenerate slice indices are handled gracefully: an index that is too large is replaced by the string size, an upper bound smaller than the lower bound returns an empty string.


>>> word[1:100]
'elpA'
>>> word[10:]
''
>>> word[2:1]
''

Indices may be negative numbers, to start counting from the right. For example:


>>> word[-1]     # The last character
'A'
>>> word[-2]     # The last-but-one character
'p'
>>> word[-2:]    # The last two characters
'pA'
>>> word[:-2]    # Everything except the last two characters
'Hel'

But note that -0 is really the same as 0, so it does not count from the right!


>>> word[-0]     # (since -0 equals 0)
'H'

Out-of-range negative slice indices are truncated, but don't try this for single-element (non-slice) indices:


>>> word[-100:]
'HelpA'
>>> word[-10]    # error
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: string index out of range
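
The contrast between slicing and indexing can be sketched in a script as follows. The names are illustrative, and print() is used so the code also runs on Python 3.

```python
word = 'HelpA'

clamped = word[-100:]    # out-of-range slice bounds are truncated

try:
    word[-10]            # a single out-of-range index is an error
    index_failed = False
except IndexError:
    index_failed = True

print(clamped)
print(index_failed)
```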

The best way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of n characters has index n, for example:


 +---+---+---+---+---+
 | H | e | l | p | A |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1

The first row of numbers gives the position of the indices 0...5 in the string; the second row gives the corresponding negative indices. The slice from i to j consists of all characters between the edges labeled i and j, respectively.


For non-negative indices, the length of a slice is the difference of the indices, if both are within bounds. For example, the length of word[1:3] is 2.
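
The length rule, and what happens when an index falls outside the string, can be sketched with a small helper. The function slice_length is hypothetical (it is not part of Python or of the original text) and assumes non-negative indices only.

```python
def slice_length(i, j, n):
    """Length of s[i:j] for a string of n characters (hypothetical helper).

    Assumes i and j are non-negative. Indices beyond the right edge are
    moved back to n, and an upper bound below the lower bound gives 0.
    """
    return max(0, min(j, n) - min(i, n))


word = 'HelpA'
n = len(word)

print(slice_length(1, 3, n))      # both in bounds: the difference, 2
print(slice_length(1, 100, n))    # upper index clamped to 5
print(slice_length(2, 1, n))      # empty slice
```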


The built-in function len() returns the length of a string:

>>> s = 'supercalifragilisticexpialidocious'
>>> len(s)
34

See Also:

Sequence Types
Strings, and the Unicode strings described in the next section, are examples of sequence types, and support the common operations supported by such types.
String Methods
Both strings and Unicode strings support a large number of methods for basic transformations and searching.
String Formatting Operations
The formatting operations invoked when strings and Unicode strings are the left operand of the % operator are described in more detail here.

3.1.3 Unicode Strings

Starting with Python 2.0, a new data type for storing text data is available to the programmer: the Unicode object. It can be used to store and manipulate Unicode data (see http://www.unicode.org/) and integrates well with the existing string objects, providing auto-conversions where necessary.

Unicode has the advantage of providing one ordinal for every character in every script used in modern and ancient texts. Previously, there were only 256 possible ordinals for script characters, and texts were typically bound to a code page that mapped the ordinals to script characters. This led to a great deal of confusion, especially with respect to internationalization of software (usually written as "i18n": "i" + 18 characters + "n"). Unicode solves these problems by defining one code page for all scripts.

Creating Unicode strings in Python is just as simple as creating normal strings:

>>> u'Hello World !'
u'Hello World !'

The small "u" in front of the quote indicates that a Unicode string is to be created. If you want to include special characters in the string, you can do so by using the Python Unicode-Escape encoding. The following example shows how:

>>> u'Hello\u0020World !'
u'Hello World !'

The escape sequence \u0020 inserts the Unicode character with the ordinal value 0x0020 (the space character) at the given position.

Other characters are interpreted by using their respective ordinal values directly as Unicode ordinals. If you have literal strings in the standard Latin-1 encoding that is used in many Western countries, you will find it convenient that the lower 256 characters of Unicode are the same as the 256 characters of Latin-1.

For experts, there is also a raw mode just like the one for normal strings. Prefix the opening quote with 'ur' to have Python use the Raw-Unicode-Escape encoding. It applies the \uXXXX conversion described above only if there is an odd number of backslashes in front of the small 'u'.

>>> ur'Hello\u0020World !'
u'Hello World !'
>>> ur'Hello\\u0020World !'
u'Hello\\\\u0020World !'

The raw mode is most useful when you have to enter lots of backslashes, as can be necessary in regular expressions.


Apart from these standard encodings, Python provides a whole set of other ways of creating Unicode strings on the basis of a known encoding.

The built-in function unicode() provides access to all registered Unicode codecs (COders and DECoders). Some of the more well-known encodings that these codecs can convert are Latin-1, ASCII, UTF-8, and UTF-16. The latter two are variable-length encodings that store each Unicode character in one or more bytes. The default encoding is normally set to ASCII, which passes through characters in the range 0 to 127 and rejects any other characters with an error. When a Unicode string is printed, written to a file, or converted with str(), conversion takes place using this default encoding.

>>> u"abc"
u'abc'
>>> str(u"abc")
'abc'
>>> u"äöü"
u'\xe4\xf6\xfc'
>>> str(u"äöü")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

To convert a Unicode string into an 8-bit string using a specific encoding, Unicode objects provide an encode() method that takes one argument, the name of the encoding. Lowercase names for encodings are preferred.

>>> u"äöü".encode('utf-8')
'\xc3\xa4\xc3\xb6\xc3\xbc'

If you have data in a specific encoding and want to produce a corresponding Unicode string from it, you can use the unicode() function with the encoding name as the second argument.

>>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
u'\xe4\xf6\xfc'
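
The round trip between a Unicode string and its encoded bytes can be sketched as below. This is an assumption-laden illustration rather than the tutorial's own code: it is written to run on Python 3 (3.3 or later), where the unicode() function no longer exists and the decode() method of a byte string plays the equivalent role. The variable names are illustrative.

```python
text = u"\xe4\xf6\xfc"                # a-umlaut, o-umlaut, u-umlaut

encoded = text.encode("utf-8")        # two bytes per character here
decoded = encoded.decode("utf-8")     # back to the Unicode string

# ASCII only covers ordinals 0 to 127, so encoding to it fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(len(text))
print(len(encoded))
print(decoded == text)
print(ascii_ok)
```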

3.1.4 Lists

Python knows a number of compound data types, used to group together other values. The most versatile is the list, which can be written as a list of comma-separated values (items) between square brackets. List items need not all have the same type.

>>> a = ['spam', 'eggs', 100, 1234]
>>> a
['spam', 'eggs', 100, 1234]

Like string indices, list indices start at 0, and lists can be sliced, concatenated and so on:


>>> a[0]
'spam'
>>> a[3]
1234
>>> a[-2]
100
>>> a[1:-1]
['eggs', 100]
>>> a[:2] + ['bacon', 2*2]
['spam', 'eggs', 'bacon', 4]
>>> 3*a[:3] + ['Boe!']
['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boe!']

Unlike strings, which are immutable, it is possible to change individual elements of a list:


>>> a
['spam', 'eggs', 100, 1234]
>>> a[2] = a[2] + 23
>>> a
['spam', 'eggs', 123, 1234]

Assignment to slices is also possible, and this can even change the size of the list:


>>> # Replace some items:
... a[0:2] = [1, 12]
>>> a
[1, 12, 123, 1234]
>>> # Remove some:
... a[0:2] = []
>>> a
[123, 1234]
>>> # Insert some:
... a[1:1] = ['bletch', 'xyzzy']
>>> a
[123, 'bletch', 'xyzzy', 1234]
>>> a[:0] = a     # Insert (a copy of) itself at the beginning
>>> a
[123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234]
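
One consequence worth spelling out, offered here as a sketch and not as part of the original text: slice assignment modifies the existing list object, so every name bound to that list sees the change, whereas concatenation builds a new list. The names alias and b are illustrative.

```python
a = [123, 1234]
alias = a                          # a second name for the same list

a[1:1] = ['bletch', 'xyzzy']       # in-place insert through slice assignment
same_object = alias is a           # still one list, seen through both names

b = a[:1] + ['new'] + a[1:]        # concatenation creates a separate list

print(alias)
print(same_object)
print(b is a)
```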

The built-in function len() also applies to lists:


>>> len(a)
8

It is possible to nest lists (create lists containing other lists), for example:


>>> q = [2, 3]
>>> p = [1, q, 4]
>>> len(p)
3
>>> p[1]
[2, 3]
>>> p[1][0]
2
>>> p[1].append('xtra')     # See section 5.1
>>> p
[1, [2, 3, 'xtra'], 4]
>>> q
[2, 3, 'xtra']

Note that in the last example, p[1] and q really refer to the same object! We'll come back to object semantics later.
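
The sharing can be confirmed with the `is` operator, and avoided by copying the inner list first. This is a minimal sketch; p_copy is an illustrative name, and list(q) makes a shallow copy.

```python
q = [2, 3]
p = [1, q, 4]              # p[1] is the very same object as q
p_copy = [1, list(q), 4]   # list(q) creates an independent copy of q

shared = p[1] is q

q.append('xtra')           # visible through p, but not through p_copy

print(shared)
print(p)
print(p_copy)
```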

3.2 First Steps Towards Programming

Of course, we can use Python for more complicated tasks than adding two and two together. For instance, we can write an initial sub-sequence of the Fibonacci series as follows:

>>> # Fibonacci series:
... # the sum of two elements defines the next
... a, b = 0, 1
>>> while b < 10:
...       print b
...       a, b = b, a+b

This example introduces several new features.
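
For use outside the interactive prompt, the same loop can be wrapped in a function that collects the values instead of printing them. This is a sketch: fib_below is a hypothetical helper name, and print() is used so the code also runs on Python 3.

```python
def fib_below(limit):
    """Return the Fibonacci numbers smaller than limit (hypothetical helper)."""
    values = []
    a, b = 0, 1
    while b < limit:
        values.append(b)
        # Both right-hand expressions are evaluated before either
        # assignment takes place, so a+b uses the old value of a.
        a, b = b, a + b
    return values


print(fib_below(10))
```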


Translator: Liu Xin (刘鑫, march.liu AT gmail DOT com); reposted by limodou (limodou AT gmail DOT com)