I'ld like to apologize right now for the length of this post, but
there's something about someone learning a new tool that immediately
helps them do something that I really enjoy.
Amy: I think I have a python problem.
Blake: Ooh, I should be able to help.
Amy: Well, it's a problem that could be fixed by python.
Blake: Close enough.
Amy: Ah, it goes beyond help. I still have to figure out where to
start. Like, do I even have python on this machine? And how do you
read in something from a file?
Blake: "python -v"
Blake: and:
myFile = open( "filename.txt" )
for line in myFile:
print line
Amy: Holy, if comes up with a million lines of... stuff.
Filenames?
Amy: But then it seems to be 2.3.
Blake: Perhaps "python -V"
Amy: That's better!
Blake: What did you want to do with the lines in a file?
Amy: Well, I want to take a bunch of pieces of data, like company
names and phone numbers and stuff, and stick it into a specific HTML
format.
Amy: So I want to take in a file of data and output HTML.
Amy: Do I want the HTML format hardcoded into the python or should
that be another file?
Blake: Do you have a python prompt up?
Blake: Try typing :
x = "Amy"
print "Hello %s" % x
Amy: Ah hah.
Amy: That's nice.
Blake: So, I would do something like :
myBigTemplate = """abc %s
def %s
ghi %s"""
print myBigTemplate % ('1','2','3')
Blake: (triple-quoted strings can span more than one line.)
Amy: Oh, I see. So I set up the formatting and then use 'print' to
spit out the HTML.
Blake: Yup. Oh, the other thing you can do is name the variables
you're replacing. So this works:
x = { 'name':'Amy', 'food':'apple'}
print "Hi %(name)s, do you want a %(food)s" % x
Blake: (Just don't forget the 's' after the closing bracket.)
Amy: So you think I should define the HTML format in the python
script itself? It seems easier but somehow less clean.
Blake: Yeah, for now. You can always change it later. :)
Amy: True.
Amy: What's wrong with this:
Amy: myTable = " %(coname)s "
coname = "Big Developer"
print myTable % coname
Blake: You need to pass a dictionary in if you use the (name)
feature. So, make a dictionary of variables, like x = {'coname':"Big
Developer"}
Blake: Then pass that in. You could call your dictionary "values",
or something meaningful.
Amy: Ooh, that worked.
Amy: Now I have to figure out how to get the dictionary from a
different file.
Blake: What's the format of the file?
Amy: Well, I guess it could be something like "Company: Big Developer
Blake: It could be?
Amy: Well, it's going to be exported from Access so I guess I could
define the format?
Amy: It sounds like you can anyway.
Amy: I'm going to work on the assumption you can export text in a
format like that, for now.
Blake: Okay, although it might be easier to just assume that the
first thing on the line is the company...
Amy: The problem is that each company could have a varying number
of employees.
Blake: What do you want to do in that case?
Amy: I want to iterate through all the names, adding a new row to
my table for each one. I wonder if it would be possible to include
the number of names in the dataset.
Amy: Although if they are the last thing on the line I guess you
could just go through them until you get to the EOL.
Blake: You could, or you could repeat the company name for each employee.
Amy: But I don't want to repeat the company name in my HTML.
Blake: Ahhh... Okay, I understand. How about:
Big Co, "employee1, employee2, employee3", BooYeah
Amy: I'm not clear on the function of the "BooYeah".
Blake: Neither am I. It's just whatever other data you need in
there.
Amy: Ah hah. The confusion is because the employees are the end of
the data.
Blake: Oh, okay.
Amy: Yeah, I think if I could get a comma delimited file with
company information and employees on each line, I could parse it.
Assuming it was always correctly formatted. :P
Blake: But if you're generating it from another program, it should be
correctly formatted.
Amy: It should.
Amy: "In theory..." but let's assume it will be.
Blake: So try "import csv" at the Python prompt.
Amy: It didn't do anything.
Blake: Sure it did. Type "dir( csv )" or "help( csv )" to see what
it did.
Blake: (There's a webpage at
http://www.python.org/doc/2.3.2/lib/csv-contents.html that has more
readable contents of the help. )
Blake: (And as another hint, you probably want to use the DictReader
class with a restkey of 'employee')
Amy: I will copy that and paste it somewhere and hopefully soon it
will mean something.
Blake: Feel free to ask me questions about whatever doesn't make
sense.
Amy: Hah. Part of the problem is that you're not working for me,
and part of the problem is that I don't even know how to begin asking
the questions.
Amy: What's a sequence?
Amy: As in "remaining data is added as a sequence keyed by the
value of restkey"?
Blake: It's just a list.
Amy: Okay.
Blake: x = [1,2,3] is a sequence.
Amy: Alright. So I can iterate through it pretty easily?
Blake: Yup.
Blake: I think I was wrong in my last explanation.
Blake: I think what they mean there is that you'll have a dictionary
with keys of "employee1", "employee2", etc...
Amy: Hm.
Amy: I guess I could work with that.
Blake: But a good way to find out would be to try running it on a
file, and printing it out.
Amy: How do I call DictReader?
Blake: No, I take it back again. I think my first explanation is
correct. You'ld have an entry in your dictionary with a key of
'employees', and a value of ['Bill', 'Jane', 'Ted'].
Amy: Do I have to define something else to be a DictReader?
Blake: First, you create one.:
myReader = csv.DictReader( filename, ['company','whateverelse'], 'employees' )
Blake: Then, you use it :
for values in myReader:
print template % values
Amy: It's too easy!
Blake: http://www.python.org/doc/2.3.2/lib/node549.html
Amy: I give it the fishy eye.
Blake: That's the beauty of Python. If you think it's too easy,
you're on the right track. :)
Amy: Ah hah, it's giving me an error!
Blake: What's the error?
Amy: NameError: name 'data' is not defined
Amy: Where 'data.csv' is the name of my file.
Blake: What's the line you used?
Amy: myReader = csv.DictReader( data.csv, ['company','address','phone'], 'employees' )
Blake: You need to put data.csv in quotes, too.
Amy: It doesn't say that in the manual!
Blake: No, that's a syntax thing.
Blake: Hey, can I post this to the weblog?
Amy: Uh, sure.
Amy: How do I get the values in myReader to just output willy
nilly? (I don't have a template yet, I just want to see if they're
reading in right).
Blake: print values
Amy: I did that but it gave me another ... prompt. I guess my
question is actually how do I end a for?
Blake: Just hit return.
Amy: Oh, that really didn't work at all!
Amy: Here is what I got:
Amy: {'phone': None, 'company': 'd', 'address': None}
{'phone': None, 'company': 'a', 'address': None}
{'phone': None, 'company': 't', 'address': None}
{'phone': None, 'company': 'a', 'address': None}
{'phone': None, 'company': '.', 'address': None}
{'phone': None, 'company': 'c', 'address': None}
{'phone': None, 'company': 's', 'address': None}
{'phone': None, 'company': 'v', 'address': None}
Amy: It's kind of funny how wrong it is.
Blake: Oh, hah! Yes. Read the examples, and see what's different.
Amy: Yes, sensai.
Blake: (Alternately, see what the companies spell if you read them
going down.)
Amy: Yeah, the filename. That's the funny part.
Blake: So you need to get it to read your file, instead of reading
the name of your file.
Blake: You can do that one of two ways. Either use "open( filename
)", or "file( filename )". They're the same, under the hood.
Amy: Oh, it worked!
Blake: It did?
Amy: Yeah, when I asked it to print values it gave me this:
Amy: {'phone': ' 416-574-8372', 'company': 'Huge Builder',
'employees': [' Bob Smith', ' President', ' Joan Simpson', '
Vice-President Public Relations', ' Huw Thompson', ' Vice President
Technology'], 'address': ' 2002 Yonge St'} {'phone': ' 416-938-2837',
'company': 'Big Buildco', 'employees': [' Joanne Jones', ' CEO'],
'address': ' 19 King St'}
Amy: Except for some reason it reordered the variables, but I don't
think that matters.
Blake: No, cause you'll use them in whatever order you want in your
HTML template.
Amy: Yup.
Amy: Cool!
Amy: I could get this working before your dad gets back from his
golf game!
Blake: The only other trick will be to get the employee data out.
For that I'ld use a separate template.
Blake: i.e. format the employees into a table first, and then add the
'employeeTable' to your dictionary.
Amy: A table?
Blake: (To do that, assuming you've got the employee's formatted into
the variable "temp", you would write:
values[ "employeeTable" ] = temp
)
Blake: An html table. Or however else you want to add the employees.
Amy: Couldn't I format them after I format the rest of the stuff?
Blake: You could, but formatting them before makes it easier to
insert them into the rest of the stuff.
Amy: Okay...
Amy: The whole thing is going to have to be inside the "for values
in myReader", right?
Blake: Mostly. You could define your templates outside, but the
rest, yeah.
Amy: Okay.
Amy: Why didn't this work:
for values in myReader:
... print " %(company)s "
Amy: It didn't return anything.
Blake: Because you didn't tell it where to get the company from.
(You need the " % values" at the end of the print.
Amy: Oh. So "values" is a real thing.
Blake: At this point, I think you want to switch to a script.
Amy: Yeah, just a second. :)
Blake: So that you can run it over and over again.
Blake: Yup, everything is a real thing. There's very little magic in
Python.
Amy: That's going to take some getting used to.
Blake: Hopefully it won't be too bad.
Amy: Oh, I can tell I'm serious now, i have two shell windows open.
:P
Blake: Heh.
Amy: How do I do comments?
Blake: # Like this.
Amy: Can I do line breaks wherever?
Blake: Almost.
Blake: For now, let's say "Yes", and if you run into a problem,
you'll find out.
Amy: Okay.
Blake: (And I can help you figure out where to put the break
instead.)
Amy: Oh my god.
Amy: It worked.
Amy: Just like that.
Blake: Heh. Now I'm definitely posting this to the weblog. :)
Blake: What did you do for the employee names and titles?
Amy: I didn't do that part yet. :P
Blake: Oh, okay.
Amy: I'm just excited I got the company to work.
Amy: Now I must eat more.
Amy: I'm running out of food.
Blake: Heh. I'll have you pulling your data from the live database
any second now.
Amy: Aiy! Don't even say that! Your dad would be so excited.
Blake: It's really quite easy... :)
Amy: Okay, now I'm stuck on the employees thing. It's a list
called "employees"... can I just do "for values in employees"?
Blake: You can, but it wouldn't be quite what you wanted.
Amy: Ah.
Blake: The quickest way I've found to get a useful list out of it is
the following line (assuming you've put the employees list into a
variable named "x": zip( [y for (i,y) in enumerate(x) if i%2==0], [y
for (i,y) in enumerate(x) if i%2==1] )
Blake: But that's nigh-unreadable, so perhaps we should try to do it
an easier way, huh?
Amy: Holy wha?!
Blake: See what I mean?
Amy: Yeah.
Blake: Ooh, how about this:
names = [y for (i,y) in enumerate(x) if i%2==0]
titles = [y for (i,y) in enumerate(x) if i%2==1]
Amy: First, isn't my employees list in a variable called
"employees"?
Blake: Just a sec.
Blake: Yes, so replace 'x' with "values['employees']"
Blake: Or add the line:
x = values['employees']
before those other two bits of code.
Amy: And then what do "names" and "titles" end up as? Lists?
Blake: Yup.
Amy: Hm. That's not really useful because I want to use them in
pairs, the name then the title.
Amy: I guess I can use an index to refer to the nth item in each
list, and they shouls match up.
Blake: Yes, but you could then write something like:
for name,title in zip( names, titles ):
print name, title
Amy: Should I look up zip or just ask you what it is?
Blake: (zip takes two lists "[a1,a2,a3]" and "[b1,b2,b3]", and makes
a new list with both "[ (a1,b1), (a2,b2), (a3,b3) ]"
Amy: Oh, okay.
Blake: enumerate (while I'm here), returns the items in a list, along
with their indices. So you could have written:
for i, name in enumerate( names ):
print name, names[i], titles[i]
Blake: and "name" and "names[i]" should have the same value.
Amy: So basically I'm taking the original employee list, stripping
it into two lists, and then folding it back into a new list with a
slightly different format.
Blake: Yeah. An easier to use format.
Blake: I suppose you could do it all in one go, if you wanted...
Something like:
for i,name in enumerate( values['employees'] ):
if i%2 == 1:
continue
print "name =", values['employees'][i], " title =", values['employees'][i+1]
Blake: Which makes more sense to you?
Amy: No, I don't like doing things all in one go!
Amy: I like doing things slowly and methodically.
Amy: Hm. It doesn't like "x = values['employees']"
Amy: It says values is not defined.
Blake: What's your whole script look like?
Blake: (That line, in specific, should be in the :
for values in myReader:
block.)
Amy: Right.
Amy: Well, it did something that time!
Blake: Excellent. Not what you wanted, I'm guessing.
Amy: Nope.
Amy: But it did what I told it to do.
Blake: Heh.
Amy: I have this:
employeeRows = " %(name)s %(title)s "
and then
for name, title in zip( names, titles ):
print employeeRows % name, title
Amy: But I'm not passing in the name, title values right.
Blake: Yes, since you're not using a dictionary, you can't use the
%(name)s format.
Amy: Do I just use %s>
Amy: ?
Blake: So, you can do one of two things. Stick with the %(name)s
format and switch to a dictionary, or switch to %s and pass them in in
the correct order.
Blake: Switching to a dictionary, by the way, is as easy as changing
the "% name, title" to "% locals()"
Amy: locals()?
Blake: It's a link to the local variables.
Blake: Try putting a "print locals()" at various points in your
script.
Amy: So the local variables are just whatever it's working with
right now?
Blake: Pretty much, yeah.
Amy: Hm.
Amy: Now I have to figure out how to stick the employee HTML into a
variable so I can put it in the rest of the HTML later.
Blake: What's the format of the html you want to stick it into?
Blake: (As a hint, instead of printing it, use += to append it to a
string...)
Amy: Pretty much what I had there, rows in a table.
Blake: Let me know if you need any help with that, m'kay?
Amy: Do I have to define variables?
Blake: Nope.
Amy: Hah, that was a trick question.
Blake: (Well, kinda nope.)
Amy: Traceback (most recent call last):
File "first.py", line 26, in ?
employeeTable += employeeRows % locals()
NameError: name 'employeeTable' is not defined
Blake: You can't just append to something that isn't there.
Blake: So start it with:
employeeTable = "<table>"
Amy: That's better.
Amy: I wonder what is wrong with my brain that I never remember to
put the close quote in.
Amy: How do I tell it to put in a newline?
Blake: "\n"
Amy: Or should I just triple-quote and put it in myself?
Blake: That would work too.
Blake: Whatever looks nicer to you.
Amy: \n looks nicer
Amy: Okay, I think I have the bones of it working. Now I need to
put in the real formatting.
Blake: Cool. Could you show me some sample output before you do?
Amy: Sure.
Amy: Huge Builder
Bob Smith President
Joan Simpson Vice-President Public Relations
Huw Thompson Vice President Technology
Big Buildco
Joanne Jones CEO
Blake: No phone number?
Amy: I didn't do that yet. I just assumed it would be about the
same as the company.
Blake: (Just making sure it's not being overwritten by something
else...)
Blake: Yup. It will be.
Amy: Actually I think I will do the """ thing for the HTML
templates, so it looks like regular HTML.
Amy: Uhoh.
Blake: What?
Amy: If a value is empty I want to leave out a row in my table.
Amy: I will have to do that in an if in my "for values in
myReader", right?
Blake: What do you mean by "if a value is empty"?
Blake: Oh, if you don't have the title for someone?
Amy: Well, more specifically, if the company doesn't have a suite
number.
Amy: If it does I want a row with the suite number, if it doesn't I
don't want that row at all.
Blake: Yeah. Or you could build up a sub-template, like the
employees.
Blake: Have a line that looks like:
values['suiteNumber'] = "<tr><td>%(suiteNumber)s<td><tr>" % values
Amy: Either way I will have to break everything else up into
"before Suite" and "after Suite" templates, though.
Blake: Not really. If you added the above line, then you could just
use "%(suiteNumber)s", and it would output the whole <tr><td> for you.
Amy: Oh, I see.
Amy: What if suiteNumber is empty, though?
Blake: Ah, yes, so you would have something like:
if values['suiteNumber']:
values['suiteNumber'] = "<tr><td>%(suiteNumber)s<td><tr>" % values
Blake: So, if it was empty, there would be no row, but if it wasn't
empty, it would get a row of its own.
Amy: Ah. Okay.
Amy: This is going to be really swell if it works.
Blake: It will. One way or another.
Amy: Uh oh.
Amy: One of my data fields has commas in it.
Blake: A-ha! Did it mess up?
Amy: I didn't try it yet. Should I quote the data with commas?
Blake: You shouldn't have to. The export thing should do it for you.
Amy: Shut up!
Amy: Wait, what export thing?
Amy: From access or whatever?
Blake: Yeah.
Amy: Okay. I'm not using real data yet, I'm just making up
fake(ish) data.
Blake: Ah, right. I would just assume that your real data is
correctly formatted.
Blake: (The rules for CSV quoting are kind of odd.)
Amy: If I have single quotes within a triple-quoted section, is
that okay?
Amy: Or do I have to escape them or something?
Blake: Yup.
Blake: You can also have single-quotes in a double-quoted section, or
double-quotes in a single-quoted section.
Blake: And you can triple-single or triple-double quote stuff, if you
needed a triple-whatever-the-other-quote-was in it.
Amy: Ah, wait. I meant single-double-quote, not single quote.
Blake: Whatever.
Blake: It all works.
Amy: Hm.
Amy: It's whining about something.
Blake: What's the complaint?
Amy: Traceback (most recent call last):
File "second.py", line 71, in ?
print myTable % values
ValueError: unsupported format character '"' (0x22) at index 29
Amy: Perhaps it is the %?
Blake: It's probably the %. To get a % in the output, you need to
type %%.
Amy: Hah!
Amy: Ta da!
Blake: It all works?
Amy: Kind of, except some values aren't right.
Amy: But it's formatting mostly right.
Blake: Hmm. Cool.
Blake: Back in a sec.
Amy: It's not reading the CSV properly -- it's the problem with
commas inside fields I was talking about before.
Amy: According to this it should work.
http://www.python.org/doc/2.3.2/lib/csv-fmt-params.html#csv-fmt-params
Blake: No?
Blake: What's the line it's failing on?
Blake: And is this actual data, or hand-created data?
Amy: It's my fake data.
Amy: Ah. It didn't like my spaces after my commas.
Amy: When I got rid of them it worked.
Amy: Whoo!
Blake: Hurray!
Amy: I CAN'T BELIEVE IT WAS SO EASY.
Amy: You can put that in the blog.
Blake: Oh, I will.
Amy: I'm sure I would have spent way more time looking for and
downloading and installing and testing a million graphical things, if
they even exist.
Amy: Scripting is the shit.