Introduction:
Python is a popular, free, cross-platform,
open-source computer programming language that is in wide use. It has no
licensing restrictions that would prevent its use in commercial projects. It
has a rich set of libraries for scientific and technical applications. Support,
tutorials and documentation are widely available.
Python is one of a class of languages that
began as simple scripting languages but evolved over time to become more
powerful languages with compilers and libraries — Perl and Ruby are other
examples. To indicate the informal way they evolved, I call these
"bottom-up" languages. Each of them has advantages and drawbacks, but
like many technical issues, there is a snowball effect — as a language's
popularity increases, this motivates people to write application libraries and
dedicated development environments, which increases its popularity further.
Some will argue that this is a rather
uncivilized way to create computer languages, and a more formal,
"top-down" approach should produce better results. There are examples
of language designs like C, C++ and Java that have advantages in structure and
syntax because architecture and consistent design were (to some extent) considered
in advance of practical embodiments.
The advantages of "top-down"
language design include a more formal way to manage things like classes, a
structure conducive to compilation and rational memory management, and a
consistent, reliable way to link with other modules. The advantages of
"bottom-up" languages like Python include the existence of an
accessible, interpreted form for experimentation, and very quick development
times.
This is not to say that languages like
Python don't have drawbacks when used for serious development work. Python's
management of classes is an example: every function within a class needs to
identify itself as belonging to the class that encloses it in a rather peculiar
way, and each function call within a class must also be written in a way that
only reveals the degree to which classes were an afterthought (more on this
topic later).
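As a minimal sketch of the convention being described (the class name "Counter" and its methods are invented for illustration), each method names the instance explicitly as its first parameter, conventionally "self", and calls to sibling methods must go through that same name:

```python
class Counter(object):
    """A tiny illustrative class; the name and methods are hypothetical."""

    def __init__(self, start=0):
        # 'self' is the instance; it is declared explicitly as the first parameter.
        self.count = start

    def increment(self):
        self.count += 1
        return self.count

    def increment_twice(self):
        # A call to another method of the same class must be written self.method().
        self.increment()
        return self.increment()


counter = Counter()
print(counter.increment_twice())
```

Writing a bare "increment()" inside "increment_twice" would raise a NameError, which is the peculiarity in question.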
There is one aspect of Python that, all by
itself, prevented me from adopting it for many years — the absence of block
tokens, an unambiguous way for a programmer (or a syntax checking algorithm) to
identify a logical block. In most languages, there is a clear, unambiguous way
to structure a program's logical blocks:
C/C++/Java:
if (condition == x) {
    result = option(y);
    if (result == z) {
        process(a);
    }
    process(b);
}
process(c);
Ruby:
if (condition == x)
    result = option(y)
    if (result == z)
        process(a)
    end
    process(b)
end
process(c)
Python:
if (condition == x):
    result = option(y)
    if (result == z):
        process(a)
    process(b)
process(c)
Yes, the final example means just what it
appears to — in Python, because of the absence of block delimiting tokens,
whitespace is syntactically significant, and if I change the indentation of any
line, the program's meaning changes. This means creating a syntax checker /
beautifier is difficult and of limited usefulness. Indeed, in Python, there are
some things a syntax checker must never do. Because a block or line with
different indentation is a different program, source code editors and
beautifiers must never change indentation on their own.
From time to time I have a nightmare in
which someone applies a filter to a directory of Python source files, removing
all the leading white space — I wake up in a cold sweat. A group of Python
source files filtered in that way might as well be thrown away, but source
files for C, C++, Java, JavaScript, Ruby and other languages can be quickly and
unambiguously reformatted.
This issue kept me from using Python for many
years, but as time passed, I've gotten involved with a number of projects that
required some knowledge of Python — Sage and Blender among others — as a result
of which Python was more or less forced on me. For example, in order to write
my Sage tutorial, I had to create quite a lot of Python code, and now that I'm
starting to use Blender (a ray-tracing and graphic modeling environment), I've
discovered that it also uses Python.
I personally won't expend energy on a
computer language unless I can write a "beautifier" for it, a way to
automatically clean up source code files. My Python beautifier can't really do
much compared to its predecessors, but it has its uses (explanation later) and
it works within the limitations of the language — primarily the fact that the
program's meaning is determined by the indentation of the lines. For example,
in the Python code snippet above, changing the indentation of the three final
lines would change the meaning of the program, not just its appearance.
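To make the point concrete, here is a minimal sketch in which two functions differ only in the indentation of one line. The function names are invented for illustration, and a list of strings stands in for the process() calls:

```python
def nested(condition, result):
    calls = []
    if condition:
        if result:
            calls.append("a")
        calls.append("b")      # belongs to the outer "if" block
    calls.append("c")
    return calls


def shifted(condition, result):
    calls = []
    if condition:
        if result:
            calls.append("a")
    calls.append("b")          # one level to the left: now runs unconditionally
    calls.append("c")
    return calls


print(nested(False, True))
print(shifted(False, True))
```

Both functions are valid Python, so no tool can tell from the text alone which indentation the programmer intended.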
Here are some of Python's features:
Accessible interactively on the command line, by way of interpreted scripts, and in some compiled forms.
Easy learning curve to the level of useful programs, harder after that (typical for modern languages).
Support for all the expected properties and libraries of a modern language: regular expressions, classes, graphics, scientific and technical libraries, graphical user interfaces, portability between platforms.
Interpreted scripts compiled into bytecode before execution to improve speed and efficiency.
Some Python development environments for serious work and GUI design, as well as syntax/indentation support in most programming editors.
Plenty of readily available documentation.
Quick Tour
Readers may simply browse this section, but
I recommend that people download and install Python to be able to run the
examples firsthand.
Open a shell session (indicated by a '$'
prompt) and type "python" to start an interactive session:
$ python
Python 2.6.4 (r264:75706, Jun 4 2010, 18:20:16)
[GCC 4.4.4 20100503 (Red Hat 4.4.4-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
The ">>>" string is
Python's prompt for interactive user entries.
Long Integers
Now type this:
>>> 111111111**2
12345678987654321L
>>>
What does this tell us? We typed nine
"1"s followed by "**2" which is how you tell Python to
square the number to its left. The result, "12345678987654321L",
means that Python automatically switched to a numerical mode that isn't limited
by your computer's native integer data size. In fact, the size of integers is
only limited by your computer's memory:
>>> 111111111**11
31866355102719439692709575611832245125767178743323754858490688959195755275492295598602711L
>>>
>>> 111111111**111
11997241580139700753317722306541179696019537945276677076620236371271367738530816272427293245
36044189785106590881907871589130011785279768267350151806505206291147217916354548236760857171
58345945071801061169908649699656875046803888091118822420445243239281252464917711608768124665
00043506858289314436574268356926519770457593308469952173691331994280973613468226937436046635
29154170932905626164217324294276789230868612486289475029336445992904460558155636629618461983
93063655793401359040660390618941657378193647262405747743293028881358033403212918442648709014
92567835554908020509498870115766477236087661882191422576442109159919614878846924042427291286
52340825486056145043201441623544527642732118816973915240715987481283448945380735612004616582
51609207270501435418753039853438608270414558491772799555462255286036384211202553997802177952
587642973236246989126832243739532931275812149042894103594586318711L
>>>
I think my readers can see where this is
going, and readers should feel free to create absurdly long integers.
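As a small sketch of the same behavior in script form, the code below squares the number and then counts the digits of the larger power rather than printing it. It is written so that it also runs under Python 3, where the separate "long" type and the trailing "L" no longer exist and ordinary integers grow as needed:

```python
# Nine ones, squared: the digits count up to 9 and back down again.
square = 111111111 ** 2
print(square)

# The 111th power is far larger than any native machine integer.
big = 111111111 ** 111
digit_count = len(str(big))
print(digit_count)
```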
Big Floating-point Numbers
There is a Python module that allows the
above sort of extended precision for floating-point numbers, called
"mpmath". It is a separate download, and is also available by way of the usual
package management utilities under the name "python-mpmath". Here is an
example of its use:
>>> from mpmath import *
>>> mp.dps = 200
>>> print pi
3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825
34211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964
4622948954930382
>>> print e
2.718281828459045235360287471352662497757247093699959574966967627724076630353547594571382178
52516642742746639193200305992181741359662904357290033429526059563073813232862794349076323382
9880753195251019
>>> print sqrt(2)
1.414213562373095048801688724209698078569671875376948073176679737990732478462107038850387534
32764157273501384623091229702492483605585073721264412149709993583141322266592750559275579995
05011527820605715
>>> print 80.0/81.0
0.987654320988
>>> print mpf(80)/81
0.987654320987654320987654320987654320987654320987654320987654320987654320987654320987654320
98765432098765432098765432098765432098765432098765432098765432098765432098765432098765432098
765432098765432099
>>>
The command "mp.dps = 200" sets
the number of decimal places in the mantissa (and it can be anything within
your computer's memory capacity). The last two examples show that, unlike
mathematical constants and numbers submitted to mpmath's special set of functions,
bare numbers need to be created as mpmath floats (mpf(n)) in order to show
extended precision.
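For readers who don't have mpmath installed, a rough analogue of the 80/81 comparison can be sketched with the standard library's "decimal" module. This is a different tool from mpmath, swapped in here only because it ships with Python, and the precision of 50 digits is an arbitrary choice:

```python
from decimal import Decimal, localcontext

# An ordinary float division, limited to native double precision.
plain = 80.0 / 81.0

# The same division carried out with 50 significant digits.
with localcontext() as ctx:
    ctx.prec = 50
    quotient = Decimal(80) / Decimal(81)

print(plain)
print(quotient)
```

As with mpf(n), the operands must be created as Decimal objects for the extended precision to take effect.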
I won't display this next result, but I
just set mp.dps = 100,000 and printed the value of pi — my system took about
seven seconds to produce this result. Pretty impressive.
Plotting
In this next example, we will plot some
functions using the mpmath module we loaded above. I emphasize there is no
shortage of Python modules that support plotting, and I strongly recommend
SciPy and related modules in this connection, as well as for its mathematical
content.
>>> plot([cos, sin], [-4, 4])
>>> plot([fresnels, fresnelc], [-4, 4])
>>> plot([lambda x:exp(-(x*x)), lambda x:exp(-(x*x)) * sin(x*3*pi)**2], [-2,2])
Python Objects
It's time to talk about objects — in
Python, everything is an object, and each object has a type. This is both
important and useful. (The following interactive Python session isn't a
continuation of the above session — there is no mpmath module loaded.)
>>> type(1)
<type 'int'>
>>> type (1.0)
<type 'float'>
>>> type(111111111**2)
<type 'long'>
>>> type (pi)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'pi' is not defined
>>> import math
>>> type(math)
<type 'module'>
>>> type(math.pi)
<type 'float'>
>>> math.pi
3.1415926535897931
>>>
We examined the type of a few objects, then
we "imported" the math module and examined its type. Remember that
importing modules is how one extends Python's abilities, and any but the
simplest math functions are part of the math module. If you try to use a common
math function and Python tells you it cannot find it, chances are it's because
you haven't yet imported the math module: "import math".
There are two primary ways to import a
module's contents — we can say "import (module name)", or we can say
"from (module name) import (names)". Here is an example of the
difference:
>>> sqrt(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'sqrt' is not defined
>>> math.sqrt(2)
1.4142135623730951
>>>
It seems we must prefix the module name to
each math function. But this is because we earlier said "import
math". If we say "from math import *", the outcome is different:
>>> from math import *
>>> sqrt(2)
1.4142135623730951
>>> pi
3.1415926535897931
>>>
Each approach has advantages and drawbacks.
If we say "import math" and always prefix the math functions with the
module name "math", then we will always know which function we're
calling — what module it comes from. In larger, more complex programs, or in a
case where we have both "math" and "mpmath" modules loaded,
this may be very important. The convenience of being able to type
"sqrt(2)" instead of "math.sqrt(2)" may be undermined by a
confusion about names and their origins as programs become more complex.
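Here is a small sketch of why the prefix can matter. It uses the standard library's "cmath" module (complex-number math) as a stand-in for a second module that defines a function with the same name; that substitution is mine, made so the example runs without mpmath:

```python
import math
import cmath   # complex-number counterpart of math, also in the standard library

# With prefixes, it is always clear which sqrt is being called.
real_root = math.sqrt(4)
complex_root = cmath.sqrt(-1)
print(real_root)
print(complex_root)

# After this import, the bare name refers to math's version only.
from math import sqrt
print(sqrt is math.sqrt)
```

Had the last import been "from cmath import sqrt" instead, the same bare call would silently start returning complex numbers.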
Lists
Now let's play with lists. It turns out
that lists are a very powerful part of Python (and most modern languages). If
you are accustomed to thinking of lists as arrays, that's fine, but
"list" is the preferred term and ... it's one syllable easier to say.
Imagine the energy saved over a period of years!
Let's make a simple list. Enter:
>>> a = range(1,13)
>>> a
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
>>> type(a)
<type 'list'>
>>>
We used the "range" function
to create a list of 12 members with values 1 through 12.
Then we looked at it by entering the
list's name: "a".
Then we asked Python what the type of
"a" is, and Python identified it as a list.
Here are some examples of things we can do
with lists:
>>> a = range(1,13)
>>> a
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
>>> for n in a: print "%2d: %3d" % (n,n*n)
...
 1:   1
 2:   4
 3:   9
 4:  16
 5:  25
 6:  36
 7:  49
 8:  64
 9:  81
10: 100
11: 121
12: 144
>>>
We accessed each of the list's members
in a way that avoids a numerical index: "for n in a:".
We printed each member in two ways: as
the original number and as that number multiplied by itself.
Those who have used other computer
languages may recognize the formatting string for "print" — it
contains a specification for the style of each item to be printed: "%2d:
%3d" (the syntax originates in a C/C++ function called
"printf()").
The specification is followed by a
percent sign, then the variables to be printed enclosed in parentheses:
"(n,n*n)".
The variables enclosed in parentheses
have a special name. What is it?
>>> type((n,n*n))
<type 'tuple'>
>>>
A "tuple". Okay. When I first
heard this name, I thought there would be a different name for each grouping
based on size — two members would be called a "tuple," three members
would be a "triple," four members would be a "quadruple."
But no — regardless of their size, all these parenthesized entities are called
"tuples".
Tuples — "(1,2,3,4)" — are much
like lists — "[1,2,3,4]" — but tuples can't be changed once they're
created. This makes them a good choice for static data, or for a case where we
need to be sure the content won't change after we create it.
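The difference can be seen in a short sketch (the variable names are arbitrary): assigning to a list element succeeds, while the same assignment on a tuple raises a TypeError.

```python
numbers_list = [1, 2, 3, 4]
numbers_tuple = (1, 2, 3, 4)

numbers_list[0] = 99            # lists can be modified in place

try:
    numbers_tuple[0] = 99       # tuples cannot
    outcome = "changed"
except TypeError:
    outcome = "TypeError"

print(numbers_list)
print(numbers_tuple)
print(outcome)
```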
List Indexing
There are some subtle ways to access the
contents of lists — to slice them up and take only the parts we want. Let's
create a list and access its contents:
>>> a = ["dog","cat","bird","penguin"]
>>> a
['dog', 'cat', 'bird', 'penguin']
>>> a[0]
'dog'
>>> a[3]
'penguin'
>>>
The above example shows that list indexing
is zero-based, meaning the indices for our four-element list are 0,1,2,3.
>>> a[2:2]
[]
>>> a[2:3]
['bird']
>>> a[1:3]
['cat', 'bird']
>>>
For this two-argument access method, the
first value is the index of the desired first element, and the second value is
the index for the last desired element plus one.
>>> a[:2]
['dog', 'cat']
>>> a[2:]
['bird', 'penguin']
>>>
By leaving off one argument, we say we want
all the members in that direction — "[:n]" means "all from the
beginning to n-1" and "[n:]" means "all from n to the
end".
>>> a[-1]
'penguin'
>>>
Surprised? The idea of negative arguments
is that we won't necessarily know how long a list is, but we know we want an
element near the end. To do this, we provide a negative number, meaning
"the end of the list - n".
>>> a[1:-1]
['cat', 'bird']
>>>
The above is how one accesses all the
list's members except the first and the last.
>>> a[::-1]
['penguin', 'bird', 'cat', 'dog']
>>>
The above is a simple way to make a copy of
a list with the elements in reverse order.
There are many similar operations — the
reader should feel free to experiment with different indexing methods. Errors
are harmless, and there are any number of variations on the above examples.
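A few more variations, as a sketch to experiment with. The third slice argument is a step, which is what made the reversed copy above work:

```python
a = ["dog", "cat", "bird", "penguin"]

every_other = a[::2]     # step of 2: elements 0 and 2
last_two = a[-2:]        # from two before the end, to the end
all_but_last = a[:-1]    # from the beginning up to, not including, the last
copy = a[:]              # both arguments omitted: a copy of the whole list

print(every_other)
print(last_two)
print(all_but_last)
print(copy is a)         # the copy is a separate list object
```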
List Comprehensions
It may seem that I'm dwelling a long time
on lists, but they're very important in program design, so it's time well
spent. List comprehensions are operations on lists that create other lists, in
various useful ways:
>>> a = range(1,9)
>>> a
[1, 2, 3, 4, 5, 6, 7, 8]
>>> [(x,x*x) for x in a]
[(1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49), (8, 64)]
>>> [2**n for n in a]
[2, 4, 8, 16, 32, 64, 128, 256]
>>>
That's the basic idea of a list
comprehension — you create a new list out of an old one, or from the result of
a sequence or range, plus any transformations you care to create. Here's a more
exotic example:
>>> [(c,ord(c)) for c in list("zygote")]
[('z', 122), ('y', 121), ('g', 103), ('o', 111), ('t', 116), ('e', 101)]
>>>
Here is an example that nests list
comprehensions to create a two-dimensional list:
>>> [[x*y for x in range(1,13)] for y in range(1,13)]
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24],
[3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36],
[4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48],
[5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60],
[6, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66, 72],
[7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84],
[8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 88, 96],
[9, 18, 27, 36, 45, 54, 63, 72, 81, 90, 99, 108],
[10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120],
[11, 22, 33, 44, 55, 66, 77, 88, 99, 110, 121, 132],
[12, 24, 36, 48, 60, 72, 84, 96, 108, 120, 132, 144]]
>>>
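One variation not shown above, offered as a sketch: a comprehension can also carry an "if" clause that filters the source list before the transformation is applied. The list() call around range() is there so the snippet behaves the same under Python 3, where range() no longer returns a list by itself:

```python
a = list(range(1, 9))

# Keep only the even members, then square them.
even_squares = [x * x for x in a if x % 2 == 0]

# Pair each letter with its character code, skipping the vowels.
consonant_codes = [(c, ord(c)) for c in "zygote" if c not in "aeiou"]

print(even_squares)
print(consonant_codes)
```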
Python Scripts
To create a Python script, first create a
plain-text file, give it a suffix of ".py" and enter these lines:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
print "Hello World!"
The above example script includes the
standard heading for Python scripts. If you want your Python programs to run
anywhere, use this heading. Strictly speaking, the first line is not needed on
Windows, but the script won't necessarily run on Unix/Linux platforms without
it, so it's a good idea to keep it. The second line is also a good idea: it
declares the source file's encoding as UTF-8, so that non-ASCII characters in
the source are handled properly.
On Unix/Linux platforms, remember to make
your Python scripts executable before trying to run them. Let's say the above
script file has been named "first_program.py":
$ chmod +x *.py
$ ./first_program.py
Hello World!
$
Common Program Operations
Here are some brief script examples
(without the above needed header lines) showing how to do useful things.
A keyboard input example:
line = ""
while (line != "q"):
    line = raw_input("Type something (q = quit): ")
    print "You typed \"%s\"." % line
Here are function definitions to read and
write text files:
def write_entire_file(path,data):
    with open(path,'w') as f:
        f.write(data)

def read_entire_file(path):
    with open(path) as f:
        return f.read()

def read_file_lines(path):
    with open(path) as f:
        for n, line in enumerate(f.readlines()):
            print "Line %3d: %s" % (n+1,line.strip())
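Here is a sketch of how the first two functions might be exercised. They are repeated so the snippet runs on its own, and the scratch directory and the file name "example.txt" are placeholders of my choosing:

```python
import os
import tempfile


def write_entire_file(path, data):
    with open(path, "w") as f:
        f.write(data)


def read_entire_file(path):
    with open(path) as f:
        return f.read()


# Hypothetical scratch file in a fresh temporary directory.
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, "example.txt")

write_entire_file(path, "first line\nsecond line\n")
text = read_entire_file(path)
lines = text.splitlines()
print(lines)

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(scratch_dir)
```

The "with" statement closes the file when its block ends, even if an error occurs inside it, which is why none of these functions calls close() explicitly.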
Here is a function that accepts a list as
an argument and prints some formatted results:
import math

def display_list(data):
    for n in data:
        print "The square root of %2.0f is %3.16f" % (n,math.sqrt(n))

display_list(range(2,11))
Here is the output:
The square root of  2 is 1.4142135623730951
The square root of  3 is 1.7320508075688772
The square root of  4 is 2.0000000000000000
The square root of  5 is 2.2360679774997898
The square root of  6 is 2.4494897427831779
The square root of  7 is 2.6457513110645907
The square root of  8 is 2.8284271247461903
The square root of  9 is 3.0000000000000000
The square root of 10 is 3.1622776601683795
Here is a nested loop, a loop within a
loop:
def show_matrix(mat):
    for y in mat:
        for x in mat:
            print "%3d" % (x * y),
        print

show_matrix(range(1,13))
Here is the output:
  1   2   3   4   5   6   7   8   9  10  11  12
  2   4   6   8  10  12  14  16  18  20  22  24
  3   6   9  12  15  18  21  24  27  30  33  36
  4   8  12  16  20  24  28  32  36  40  44  48
  5  10  15  20  25  30  35  40  45  50  55  60
  6  12  18  24  30  36  42  48  54  60  66  72
  7  14  21  28  35  42  49  56  63  70  77  84
  8  16  24  32  40  48  56  64  72  80  88  96
  9  18  27  36  45  54  63  72  81  90  99 108
 10  20  30  40  50  60  70  80  90 100 110 120
 11  22  33  44  55  66  77  88  99 110 121 132
 12  24  36  48  60  72  84  96 108 120 132 144
Notice in the above example that the first print statement doesn't emit a linefeed
because of the appended ",", and the second print statement's sole
purpose is to emit a linefeed at the end of each line.
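The trailing-comma form is specific to the Python 2 print statement; in Python 3, print is a function and the comma trick does not apply. One way to sidestep the difference, sketched here with a function name of my own invention, is to build each row as a single string and print it whole:

```python
def matrix_rows(values):
    """Return one formatted string per row of the multiplication table."""
    return [" ".join("%3d" % (x * y) for x in values) for y in values]


# A 3 x 3 table keeps the output short; range(1, 13) gives the full table.
for row in matrix_rows(range(1, 4)):
    print(row)
```

Returning the rows instead of printing inside the nested loop also makes the formatting easy to check separately from the output.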
Indentation, logical blocks
Newcomers to Python should be aware that whitespace is syntactically significant and that
the Python interpreter will accept any consistent indentation, even if a
particular block's indentation is at odds with others in the same source file.
The only rule is that the indentation of a logical block be consistent with
itself:
for c in list("This is accepted."):
        print c,
print
for c in list("And so is this."):
    print c,
print
The first code example above indents by eight spaces, the next by four. Python
accepts both. Most programming editors will try to create consistent
indentations by automatically indenting new lines in a consistent way after a
line ending with ":", but as a programming project moves forward and
logical blocks are manually moved, this discipline can erode. My point is that
Python won't reject valid code based on inconsistent indentation (unless a line
is indented in a way that doesn't agree with its adjacent lines), but for the
sake of consistency and readable code, programmers might want to pay attention
to this issue.
My Python "beautifier" script PyBeautify will enforce consistent
indentation by flagging lines that aren't consistent (it won't try to correct
errors in indentation, only flag them). There is a Python module called
"tabnanny" that does much the same thing, but it won't enforce
overall consistency (i.e. the idea that all indentations should be multiples of
the same basic unit).
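The idea of checking that every indentation is a multiple of one basic unit can be sketched in a few lines. This is not PyBeautify's code or tabnanny's algorithm, only a naive illustration with a made-up function name and an assumed unit of four spaces. It only flags lines and never rewrites them, and it would wrongly flag continuation lines and the contents of multi-line strings:

```python
def flag_inconsistent_indents(source, unit=4):
    """Return (line_number, line) pairs whose indentation is not a multiple of unit.

    Hypothetical helper for illustration. Blank lines are skipped, and any
    tab in the leading whitespace is flagged as well.
    """
    flagged = []
    for number, line in enumerate(source.splitlines(), 1):
        if not line.strip():
            continue
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent or len(indent) % unit != 0:
            flagged.append((number, line))
    return flagged


sample = (
    "for c in 'abc':\n"
    "        print(c)\n"      # eight spaces: a multiple of four, so accepted
    "for c in 'abc':\n"
    "   print(c)\n"           # three spaces: flagged
)
print(flag_inconsistent_indents(sample))
```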