Learning Python

I'd like to apologize right now for the length of this post, but there's something I really enjoy about watching someone learn a new tool that immediately helps them get something done.

Amy:	I think I have a python problem.

Blake:	Ooh, I should be able to help. 

Amy:	Well, it's a problem that could be fixed by python.

Blake:	Close enough. 

Amy:	Ah, it goes beyond help.  I still have to figure out where to
start.  Like, do I even have python on this machine?  And how do you
read in something from a file?

Blake:	"python -v" 

Blake:	and:
 myFile = open( "filename.txt" )
 for line in myFile:
   print line 
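
(An aside for anyone following along on a current Python 3, where print is a function: here is a minimal sketch of the same read-a-file loop. The file name and its two lines are made up so the example can run on its own.)

```python
import os
import tempfile

# Hypothetical sample file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "filename.txt")
with open(path, "w") as out:
    out.write("first line\nsecond line\n")

lines = []
# "with" closes the file automatically when the block ends.
with open(path) as my_file:
    for line in my_file:
        # Each line keeps its trailing newline, so strip it before printing.
        lines.append(line.rstrip("\n"))
        print(line.rstrip("\n"))
```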
 
Amy:	Holy, it comes up with a million lines of... stuff.
Filenames?

Amy:	But then it seems to be 2.3.

Blake:	Perhaps "python -V" 

Amy:	That's better!

Blake:	What did you want to do with the lines in a file? 

Amy:	Well, I want to take a bunch of pieces of data, like company
names and phone numbers and stuff, and stick it into a specific HTML
format.

Amy:	So I want to take in a file of data and output HTML.

Amy:	Do I want the HTML format hardcoded into the python or should
that be another file?

Blake:	Do you have a python prompt up? 

Blake:	Try typing :
 x = "Amy"
 print "Hello %s" % x 

Amy:	Ah hah.

Amy:	That's nice.

Blake:	So, I would do something like :
 myBigTemplate = """abc %s
 def %s
 ghi %s"""
 print myBigTemplate % ('1','2','3') 

Blake:	(triple-quoted strings can span more than one line.) 

Amy:	Oh, I see.  So I set up the formatting and then use 'print' to
spit out the HTML.

Blake:	Yup.  Oh, the other thing you can do is name the variables
you're replacing.  So this works:
 x = { 'name':'Amy', 'food':'apple'}
 print  "Hi %(name)s, do you want a %(food)s" % x 

Blake:	(Just don't forget the 's' after the closing bracket.) 
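
(An aside: the two styles of % formatting from the last few messages, side by side, as a small sketch in Python 3 syntax. The variable names are just for illustration.)

```python
# Positional: values are filled in left to right from a tuple.
my_big_template = """abc %s
def %s
ghi %s"""
positional = my_big_template % ("1", "2", "3")
print(positional)

# Named: each %(key)s looks up that key in the dictionary.
x = {"name": "Amy", "food": "apple"}
named = "Hi %(name)s, do you want a %(food)s" % x
print(named)
```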

Amy:	So you think I should define the HTML format in the python
script itself?  It seems easier but somehow less clean.

Blake:	Yeah, for now.  You can always change it later.  :) 

Amy:	True.

Amy:	What's wrong with this:

Amy:	myTable = " %(coname)s "

coname = "Big Developer"

print myTable % coname

Blake:	You need to pass a dictionary in if you use the (name)
feature.  So, make a dictionary of variables, like x = {'coname':"Big
Developer"} 

Blake:	Then pass that in.  You could call your dictionary "values",
or something meaningful. 
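
(An aside: a small Python 3 sketch of what goes wrong when a plain string is passed to a %(name)s template, and what the dictionary version looks like. The error handling is only there so the example runs to the end.)

```python
my_table = " %(coname)s "
coname = "Big Developer"

# Passing a plain string fails: %(coname)s needs a mapping to look "coname" up in.
try:
    my_table % coname
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# Wrapping the value in a dictionary gives the template a key to find.
values = {"coname": coname}
result = my_table % values
print(result)
```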

Amy:	Ooh, that worked.

Amy:	Now I have to figure out how to get the dictionary from a
different file.

Blake:	What's the format of the file? 

Amy:	Well, I guess it could be something like "Company: Big Developer"

Blake:	It could be? 

Amy:	Well, it's going to be exported from Access so I guess I could
define the format?

Amy:	It sounds like you can anyway.

Amy:	I'm going to work on the assumption you can export text in a
format like that, for now.

Blake:	Okay, although it might be easier to just assume that the
first thing on the line is the company... 

Amy:	The problem is that each company could have a varying number
of employees.

Blake:	What do you want to do in that case? 

Amy:	I want to iterate through all the names, adding a new row to
my table for each one.  I wonder if it would be possible to include
the number of names in the dataset.

Amy:	Although if they are the last thing on the line I guess you
could just go through them until you get to the EOL.

Blake:	You could, or you could repeat the company name for each employee. 

Amy:	But I don't want to repeat the company name in my HTML.

Blake:	Ahhh...  Okay, I understand.  How about:
 Big Co, "employee1, employee2, employee3", BooYeah 

Amy:	I'm not clear on the function of the "BooYeah".

Blake:	Neither am I.  It's just whatever other data you need in
there. 

Amy:	Ah hah.  The confusion is because the employees are the end of
the data.

Blake:	Oh, okay. 

Amy:	Yeah, I think if I could get a comma delimited file with
company information and employees on each line, I could parse it.
Assuming it was always correctly formatted. :P

Blake:	But if you're generating it from another program, it should be
correctly formatted. 

Amy:	It should.

Amy:	"In theory..."  but let's assume it will be.

Blake:	So try "import csv" at the Python prompt. 

Amy:	It didn't do anything.

Blake:	Sure it did.  Type "dir( csv )" or "help( csv )" to see what
it did. 

Blake:	(There's a webpage at
http://www.python.org/doc/2.3.2/lib/csv-contents.html that has more
readable contents of the help. ) 

Blake:	(And as another hint, you probably want to use the DictReader
class with a restkey of 'employee') 

Amy:	I will copy that and paste it somewhere and hopefully soon it
will mean something.

Blake:	Feel free to ask me questions about whatever doesn't make
sense. 

Amy:	Hah.  Part of the problem is that you're not working for me,
and part of the problem is that I don't even know how to begin asking
the questions.

Amy:	What's a sequence?

Amy:	As in "remaining data is added as a sequence keyed by the
value of restkey"?

Blake:	It's just a list. 

Amy:	Okay.

Blake:	x = [1,2,3] is a sequence. 

Amy:	Alright.  So I can iterate through it pretty easily?

Blake:	Yup. 

Blake:	I think I was wrong in my last explanation. 

Blake:	I think what they mean there is that you'll have a dictionary
with keys of "employee1", "employee2", etc... 

Amy:	Hm.

Amy:	I guess I could work with that.

Blake:	But a good way to find out would be to try running it on a
file, and printing it out. 

Amy:	How do I call DictReader?

Blake:	No, I take it back again.  I think my first explanation is
correct.  You'd have an entry in your dictionary with a key of
'employees', and a value of ['Bill', 'Jane', 'Ted']. 

Amy:	Do I have to define something else to be a DictReader?

Blake:	First, you create one:
 myReader = csv.DictReader( filename, ['company','whateverelse'], 'employees' ) 
 
Blake:	Then, you use it :
 for values in myReader:
     print template % values 
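
(An aside: a minimal Python 3 sketch of how restkey collects the leftover fields into a list. The sample rows are made up, and io.StringIO is only a stand-in for an open file so the example can run without anything on disk.)

```python
import csv
import io

# Made-up sample data: company, address, phone, then any number of employee fields.
sample = io.StringIO(
    "Huge Builder,2002 Yonge St,416-574-8372,Bob Smith,President,Joan Simpson,VP\n"
    "Big Buildco,19 King St,416-938-2837,Joanne Jones,CEO\n"
)

reader = csv.DictReader(
    sample,
    fieldnames=["company", "address", "phone"],
    restkey="employees",  # fields beyond the named ones are collected here, as a list
)
rows = list(reader)
for values in rows:
    print(values["company"], values["employees"])
```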
 
Amy:	It's too easy!

Blake:	http://www.python.org/doc/2.3.2/lib/node549.html 

Amy:	I give it the fishy eye.

Blake:	That's the beauty of Python.  If you think it's too easy,
you're on the right track.  :) 

Amy:	Ah hah, it's giving me an error!

Blake:	What's the error? 

Amy:	NameError: name 'data' is not defined

Amy:	Where 'data.csv' is the name of my file.

Blake:	What's the line you used? 

Amy:	myReader = csv.DictReader( data.csv, ['company','address','phone'], 'employees' )

Blake:	You need to put data.csv in quotes, too. 

Amy:	It doesn't say that in the manual!

Blake:	No, that's a syntax thing.

Blake:	Hey, can I post this to the weblog? 

Amy:	Uh, sure.

Amy:	How do I get the values in myReader to just output willy
nilly?  (I don't have a template yet, I just want to see if they're
reading in right).

Blake:	print values 

Amy:	I did that but it gave me another ... prompt.  I guess my
question is actually how do I end a for?

Blake:	Just hit return. 

Amy:	Oh, that really didn't work at all!

Amy:	Here is what I got:

Amy:	{'phone': None, 'company': 'd', 'address': None}
{'phone': None, 'company': 'a', 'address': None}
{'phone': None, 'company': 't', 'address': None}
{'phone': None, 'company': 'a', 'address': None}
{'phone': None, 'company': '.', 'address': None}
{'phone': None, 'company': 'c', 'address': None}
{'phone': None, 'company': 's', 'address': None}
{'phone': None, 'company': 'v', 'address': None}

Amy:	It's kind of funny how wrong it is.

Blake:	Oh, hah!  Yes.  Read the examples, and see what's different. 

Amy:	Yes, sensei.

Blake:	(Alternately, see what the companies spell if you read them
going down.) 

Amy:	Yeah, the filename.  That's the funny part.

Blake:	So you need to get it to read your file, instead of reading
the name of your file. 

Blake:	You can do that one of two ways.  Either use "open( filename
)", or "file( filename )".  They're the same, under the hood. 
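
(An aside: a short Python 3 sketch of why the file name itself got "read". The reader accepts anything it can loop over, and looping over a string gives one character at a time. Note that file() no longer exists in Python 3, so open() is the one to use there.)

```python
import csv

fields = ["company", "address", "phone"]

# A string is iterable one character at a time, so the reader treats
# each character of the file name as its own one-field "line".
rows = list(csv.DictReader("data.csv", fields, "employees"))
spelled = "".join(row["company"] for row in rows)
print(spelled)

# The fix is to hand it an open file instead, for example:
#   csv.DictReader(open("data.csv", newline=""), fields, "employees")
```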

Amy:	Oh, it worked!

Blake:	It did? 

Amy:	Yeah, when I asked it to print values it gave me this: 

Amy:	{'phone': ' 416-574-8372', 'company': 'Huge Builder',
'employees': [' Bob Smith', ' President', ' Joan Simpson', '
Vice-President Public Relations', ' Huw Thompson', ' Vice President
Technology'], 'address': ' 2002 Yonge St'} {'phone': ' 416-938-2837',
'company': 'Big Buildco', 'employees': [' Joanne Jones', ' CEO'],
'address': ' 19 King St'}


Amy:	Except for some reason it reordered the variables, but I don't
think that matters.

Blake:	No, cause you'll use them in whatever order you want in your
HTML template. 

Amy:	Yup.

Amy:	Cool!

Amy:	I could get this working before your dad gets back from his
golf game!

Blake:	The only other trick will be to get the employee data out.
For that I'd use a separate template. 

Blake:	i.e. format the employees into a table first, and then add the
'employeeTable' to your dictionary. 

Amy:	A table?

Blake:	(To do that, assuming you've got the employees formatted into
the variable "temp", you would write:
 values[ "employeeTable" ] = temp
 ) 

Blake:	An html table.  Or however else you want to add the employees. 

Amy:	Couldn't I format them after I format the rest of the stuff?

Blake:	You could, but formatting them before makes it easier to
insert them into the rest of the stuff. 

Amy:	Okay...

Amy:	The whole thing is going to have to be inside the "for values
in myReader", right?

Blake:	Mostly.  You could define your templates outside, but the
rest, yeah. 

Amy:	Okay.

Amy:	Why didn't this work:
for values in myReader:
...   print " %(company)s "


Amy:	It didn't return anything.

Blake:	Because you didn't tell it where to get the company from.
(You need the " % values" at the end of the print.) 

Amy:	Oh.  So "values" is a real thing.

Blake:	At this point, I think you want to switch to a script. 

Amy:	Yeah, just a second. :)

Blake:	So that you can run it over and over again. 

Blake:	Yup, everything is a real thing.  There's very little magic in
Python. 

Amy:	That's going to take some getting used to.

Blake:	Hopefully it won't be too bad. 

Amy:	Oh, I can tell I'm serious now, I have two shell windows open.
:P

Blake:	Heh. 

Amy:	How do I do comments?

Blake:	# Like this. 

Amy:	Can I do line breaks wherever?

Blake:	Almost. 

Blake:	For now, let's say "Yes", and if you run into a problem,
you'll find out. 

Amy:	Okay.

Blake:	(And I can help you figure out where to put the break
instead.) 

Amy:	Oh my god.

Amy:	It worked.

Amy:	Just like that.

Blake:	Heh.  Now I'm definitely posting this to the weblog.  :) 

Blake:	What did you do for the employee names and titles? 

Amy:	I didn't do that part yet. :P

Blake:	Oh, okay. 

Amy:	I'm just excited I got the company to work.

Amy:	Now I must eat more.

Amy:	I'm running out of food.

Blake:	Heh.  I'll have you pulling your data from the live database
any second now. 

Amy:	Aiy!  Don't even say that!  Your dad would be so excited.

Blake:	It's really quite easy...  :) 

Amy:	Okay, now I'm stuck on the employees thing.  It's a list
called "employees"...  can I just do "for values in employees"?

Blake:	You can, but it wouldn't be quite what you wanted. 

Amy:	Ah.

Blake:	The quickest way I've found to get a useful list out of it is
the following line (assuming you've put the employees list into a
variable named "x"):
 zip( [y for (i,y) in enumerate(x) if i%2==0], [y for (i,y) in enumerate(x) if i%2==1] )

Blake:	But that's nigh-unreadable, so perhaps we should try to do it
an easier way, huh? 

Amy:	Holy wha?!

Blake:	See what I mean? 

Amy:	Yeah.  

Blake:	Ooh, how about this:
 names = [y for (i,y) in enumerate(x) if i%2==0]
 titles = [y for (i,y) in enumerate(x) if i%2==1] 

Amy:	First, isn't my employees list in a variable called
"employees"?

Blake:	Just a sec. 

Blake:	Yes, so replace 'x' with "values['employees']" 

Blake:	Or add the line:
 x = values['employees']
 before those other two bits of code. 

Amy:	And then what do "names" and "titles" end up as?  Lists?

Blake:	Yup. 

Amy:	Hm.  That's not really useful because I want to use them in
pairs, the name then the title.

Amy:	I guess I can use an index to refer to the nth item in each
list, and they should match up.

Blake:	Yes, but you could then write something like:
 for name,title in zip( names, titles ):
   print name, title 

Amy:	Should I look up zip or just ask you what it is?

Blake:	(zip takes two lists "[a1,a2,a3]" and "[b1,b2,b3]", and makes
a new list with both "[ (a1,b1), (a2,b2), (a3,b3) ]".) 
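
(An aside: the split-then-zip idea as a small Python 3 sketch. The employee list is made up. One difference from the Python in this chat: in Python 3, zip hands back its pairs lazily, so list() is needed to see them all at once.)

```python
x = ["Bob Smith", "President", "Joan Simpson", "Vice-President"]

# Even positions hold names, odd positions hold titles.
names = [y for (i, y) in enumerate(x) if i % 2 == 0]
titles = [y for (i, y) in enumerate(x) if i % 2 == 1]

# In Python 3, zip is lazy, so wrap it in list() to see the pairs.
pairs = list(zip(names, titles))
for name, title in pairs:
    print(name, title)
```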

Amy:	Oh, okay.

Blake:	enumerate (while I'm here), returns the items in a list, along
with their indices.  So you could have written:
 for i, name in enumerate( names ):
   print name, names[i], titles[i] 
 
Blake:	and "name" and "names[i]" should have the same value. 
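
(An aside: a tiny Python 3 sketch of enumerate on its own, with made-up lists, to show the index and the item arriving together.)

```python
names = ["Bob Smith", "Joan Simpson"]
titles = ["President", "Vice-President"]

seen = []
for i, name in enumerate(names):
    # i counts from 0, so names[i] is the same item as name.
    seen.append((i, name, titles[i]))
    print(i, name, titles[i])
```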

Amy:	So basically I'm taking the original employee list, stripping
it into two lists, and then folding it back into a new list with a
slightly different format.

Blake:	Yeah.  An easier to use format. 

Blake:	I suppose you could do it all in one go, if you wanted...
Something like:
 for i,name in enumerate( values['employees'] ):
   if i%2 == 1:
     continue
   print "name =", values['employees'][i], " title =", values['employees'][i+1] 

Blake:	Which makes more sense to you? 

Amy:	No, I don't like doing things all in one go!

Amy:	I like doing things slowly and methodically.

Amy:	Hm.  It doesn't like "x = values['employees']"  

Amy:	It says values is not defined.

Blake:	What's your whole script look like? 

Blake:	(That line, in specific, should be in the :
 for values in myReader:
 block.) 

Amy:	Right.

Amy:	Well, it did something that time!

Blake:	Excellent.  Not what you wanted, I'm guessing. 

Amy:	Nope.

Amy:	But it did what I told it to do.

Blake:	Heh. 

Amy:	I have this: 
employeeRows = " %(name)s  %(title)s "

and then
  for name, title in zip( names, titles ):
    print employeeRows % name, title

Amy:	But I'm not passing in the name, title values right.

Blake:	Yes, since you're not using a dictionary, you can't use the
%(name)s format. 

Amy:	Do I just use %s>

Amy:	?

Blake:	So, you can do one of two things.  Stick with the %(name)s
format and switch to a dictionary, or switch to %s and pass them in in
the correct order. 

Blake:	Switching to a dictionary, by the way, is as easy as changing
the "% name, title" to "% locals()" 

Amy:	locals()?

Blake:	It's a link to the local variables. 

Blake:	Try putting a "print locals()" at various points in your
script. 

Amy:	So the local variables are just whatever it's working with
right now?

Blake:	Pretty much, yeah. 
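
(An aside: a minimal Python 3 sketch of the locals() trick. The function name format_rows is hypothetical, and the template is the spaces-only one from the chat with a newline added.)

```python
def format_rows(names, titles):
    employee_rows = " %(name)s  %(title)s \n"
    output = ""
    for name, title in zip(names, titles):
        # locals() is a dictionary of the current local variables, so it
        # contains "name" and "title" (plus everything else in scope).
        output += employee_rows % locals()
    return output

result = format_rows(["Bob Smith"], ["President"])
print(result)
```

Handy for quick scripts, though an explicit dictionary makes it more obvious which values the template depends on.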

Amy:	Hm.

Amy:	Now I have to figure out how to stick the employee HTML into a
variable so I can put it in the rest of the HTML later.

Blake:	What's the format of the html you want to stick it into? 

Blake:	(As a hint, instead of printing it, use += to append it to a
string...) 

Amy:	Pretty much what I had there, rows in a table.

Blake:	Let me know if you need any help with that, m'kay? 

Amy:	Do I have to define variables?

Blake:	Nope. 

Amy:	Hah, that was a trick question.

Blake:	(Well, kinda nope.) 

Amy:	Traceback (most recent call last):
  File "first.py", line 26, in ?
    employeeTable += employeeRows % locals()
NameError: name 'employeeTable' is not defined

Blake:	You can't just append to something that isn't there. 

Blake:	So start it with:
 employeeTable = "<table>" 
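
(An aside: a sketch, in Python 3 syntax, of building the employee table up with += and then tucking it into the dictionary for the main template. The row markup and the sample employees are assumptions for illustration.)

```python
employee_rows = "<tr><td>%(name)s</td><td>%(title)s</td></tr>\n"  # hypothetical row markup
employees = [("Bob Smith", "President"), ("Joanne Jones", "CEO")]

# The variable has to exist before += can add to it.
employee_table = "<table>\n"
for name, title in employees:
    employee_table += employee_rows % {"name": name, "title": title}
employee_table += "</table>"

# Store the finished table so the main template can use %(employeeTable)s.
values = {"company": "Huge Builder"}
values["employeeTable"] = employee_table
print("%(company)s\n%(employeeTable)s" % values)
```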

Amy:	That's better.

Amy:	I wonder what is wrong with my brain that I never remember to
put the close quote in.

Amy:	How do I tell it to put in a newline?

Blake:	"\n" 

Amy:	Or should I just triple-quote and put it in myself?

Blake:	That would work too. 

Blake:	Whatever looks nicer to you. 

Amy:	\n looks nicer

Amy:	Okay, I think I have the bones of it working.  Now I need to
put in the real formatting.

Blake:	Cool.  Could you show me some sample output before you do? 

Amy:	Sure.

Amy:	 Huge Builder 
  Bob Smith   President 
  Joan Simpson   Vice-President Public Relations 
  Huw Thompson   Vice President Technology 

 Big Buildco 
  Joanne Jones   CEO 

Blake:	No phone number? 

Amy:	I didn't do that yet.  I just assumed it would be about the
same as the company.

Blake:	(Just making sure it's not being overwritten by something
else...) 

Blake:	Yup.  It will be. 

Amy:	Actually I think I will do the """ thing for the HTML
templates, so it looks like regular HTML.

Amy:	Uhoh.

Blake:	What? 

Amy:	If a value is empty I want to leave out a row in my table.

Amy:	I will have to do that in an if in my "for values in
myReader", right?

Blake:	What do you mean by "if a value is empty"? 

Blake:	Oh, if you don't have the title for someone? 

Amy:	Well, more specifically, if the company doesn't have a suite
number.

Amy:	If it does I want a row with the suite number, if it doesn't I
don't want that row at all.

Blake:	Yeah.  Or you could build up a sub-template, like the
employees. 

Blake:	Have a line that looks like:
 values['suiteNumber'] = "<tr><td>%(suiteNumber)s</td></tr>" % values 

Amy:	Either way I will have to break everything else up into
"before Suite" and "after Suite" templates, though.

Blake:	Not really.  If you added the above line, then you could just
use "%(suiteNumber)s", and it would output the whole <tr><td> for you. 

Amy:	Oh, I see.

Amy:	What if suiteNumber is empty, though?

Blake:	Ah, yes, so you would have something like:
 if values['suiteNumber']:
   values['suiteNumber'] = "<tr><td>%(suiteNumber)s</td></tr>" % values  

Blake:	So, if it was empty, there would be no row, but if it wasn't
empty, it would get a row of its own. 
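
(An aside: the optional-row idea as a small Python 3 sketch. The helper name add_suite_row and the surrounding table markup are hypothetical. It assumes a missing suite number arrives as an empty string.)

```python
def add_suite_row(values):
    # An empty string is false, so the row is only built when there is a suite number.
    if values["suiteNumber"]:
        values["suiteNumber"] = "<tr><td>%(suiteNumber)s</td></tr>" % values
    return values

template = "<table><tr><td>%(company)s</td></tr>%(suiteNumber)s</table>"  # hypothetical markup

with_suite = template % add_suite_row({"company": "Huge Builder", "suiteNumber": "Suite 5"})
without_suite = template % add_suite_row({"company": "Big Buildco", "suiteNumber": ""})
print(with_suite)
print(without_suite)
```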

Amy:	Ah.  Okay.

Amy:	This is going to be really swell if it works.

Blake:	It will.  One way or another. 

Amy:	Uh oh.

Amy:	One of my data fields has commas in it.

Blake:	A-ha!  Did it mess up? 

Amy:	I didn't try it yet.  Should I quote the data with commas?

Blake:	You shouldn't have to.  The export thing should do it for you. 

Amy:	Shut up!

Amy:	Wait, what export thing?

Amy:	From access or whatever?

Blake:	Yeah. 

Amy:	Okay.  I'm not using real data yet, I'm just making up
fake(ish) data.

Blake:	Ah, right.  I would just assume that your real data is
correctly formatted. 

Blake:	(The rules for CSV quoting are kind of odd.) 

Amy:	If I have single quotes within a triple-quoted section, is
that okay?

Amy:	Or do I have to escape them or something?

Blake:	Yup. 

Blake:	You can also have single-quotes in a double-quoted section, or
double-quotes in a single-quoted section. 

Blake:	And you can triple-single or triple-double quote stuff, if you
needed a triple-whatever-the-other-quote-was in it. 

Amy:	Ah, wait.  I meant single-double-quote, not single quote.

Blake:	Whatever. 

Blake:	It all works. 
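
(An aside: the quoting combinations above, as a quick Python 3 sketch with made-up strings.)

```python
a = 'She said "hi"'  # double quotes inside single quotes
b = "It's fine"      # single quote inside double quotes
c = """<td class="name">It's all fine</td>"""  # both kinds inside triple quotes
print(a, b, c, sep="\n")
```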

Amy:	Hm.

Amy:	It's whining about something.

Blake:	What's the complaint? 

Amy:	Traceback (most recent call last):
  File "second.py", line 71, in ?
    print myTable % values
ValueError: unsupported format character '"' (0x22) at index 29

Amy:	Perhaps it is the %?

Blake:	It's probably the %.  To get a % in the output, you need to
type %%. 
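
(An aside: a minimal Python 3 sketch of the stray-% problem. The table markup is a guess at the kind of HTML that would trigger it, not the actual template from the script.)

```python
values = {"company": "Huge Builder"}

# A lone % followed by '"' is read as the start of a format code and fails.
broken = '<table width="100%">%(company)s</table>'
try:
    broken % values
    error = None
except ValueError as exc:
    error = str(exc)
print(error)

# Doubling it to %% produces a single literal % in the output.
fixed = '<table width="100%%">%(company)s</table>'
result = fixed % values
print(result)
```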

Amy:	Hah!

Amy:	Ta da!

Blake:	It all works? 

Amy:	Kind of, except some values aren't right. 

Amy:	But it's formatting mostly right.

Blake:	Hmm.  Cool. 

Blake:	Back in a sec. 

Amy:	It's not reading the CSV properly -- it's the problem with
commas inside fields I was talking about before.

Amy:	According to this it should work.
http://www.python.org/doc/2.3.2/lib/csv-fmt-params.html#csv-fmt-params

Blake:	No? 

Blake:	What's the line it's failing on? 

Blake:	And is this actual data, or hand-created data? 

Amy:	It's my fake data.

Amy:	Ah.  It didn't like my spaces after my commas.

Amy:	When I got rid of them it worked.
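
(An aside: removing the spaces is what was done here. An alternative, not the one used in this chat, is the csv module's skipinitialspace option, sketched below in Python 3 with a made-up row.)

```python
import csv

line = 'Huge Builder, "2002 Yonge St, Suite 5", 416-574-8372'  # made-up row

# By default the space before the opening quote makes the quote an ordinary
# character, so the comma inside the address splits the field in two.
default_fields = next(csv.reader([line]))

# skipinitialspace=True ignores whitespace right after each delimiter,
# so the quoted address is recognised as one field.
tidy_fields = next(csv.reader([line], skipinitialspace=True))
print(default_fields)
print(tidy_fields)
```

The same keyword can be passed to csv.DictReader, which forwards it to the underlying reader.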

Amy:	Whoo!

Blake:	Hurray! 

Amy:	I CAN'T BELIEVE IT WAS SO EASY.

Amy:	You can put that in the blog.

Blake:	Oh, I will. 

Amy:	I'm sure I would have spent way more time looking for and
downloading and installing and testing a million graphical things, if
they even exist.

Amy:	Scripting is the shit.