"""HTML40 -- generate HTML conformant to the 4.0 standard. See:

http://www.w3.org/TR/REC-html40/

All HTML 4.0 elements are implemented except for a few which are
deprecated. All attributes should be implemented. HTML is generally
case-insensitive, whereas Python is not. All elements have been coded
in UPPER CASE, with attributes in lower case. General usage:

e = ELEMENT(*content, **attr)

i.e., the positional arguments become the content of the element, and
the keyword arguments set element attributes. All attributes MUST be
specified with keyword arguments, and the content MUST be a series of
positional arguments; if you use content="spam", it will set this as
the attribute content, not as the element content. Multiple content
arguments are simply joined with no separator. Example:

>>> t = TABLE(TR(TH('SPAM'), TH('EGGS')), TR(TD('foo','bar', colspan=2)) )
<TD colspan="2">foobar</TD></TR></TABLE>
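
The calling convention described above can be illustrated with a
minimal sketch in modern Python 3. The Tag and TD classes here are
hypothetical stand-ins written for this example, not this module's
classes; they only show how positional arguments become content and
keyword arguments become attributes.

```python
class Tag:
    # Hypothetical stand-in for an element class (assumption, not module code).
    def __init__(self, *content, **attr):
        self.content = list(content)   # positional arguments -> content
        self.attr = dict(attr)         # keyword arguments -> attributes

    def __str__(self):
        name = type(self).__name__
        attrs = "".join(' %s="%s"' % (k, v) for k, v in self.attr.items())
        body = "".join(str(c) for c in self.content)  # joined, no separator
        return "<%s%s>%s</%s>" % (name, attrs, body, name)


class TD(Tag):
    pass


cell = TD("foo", "bar", colspan=2)
print(cell)

# content= is a keyword argument, so it becomes an attribute, not content.
wrong = TD(content="spam")
print(wrong)
```

Note that the second call produces an element with an attribute named
content and an empty body, which is the pitfall the text warns about.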

As with HTMLgen and other HTML generators, you can print the object
and it makes one monster string and writes that to stdout. Unlike
HTMLgen, these objects all have a writeto(fp=stdout, indent=0,
perlevel=2) method. This method may save memory and may be faster
(many fewer string joins), plus you get progressive output. If you
want to alter the indentation on the string output, call __str__
directly:

>>> print t.__str__(indent=5, perlevel=6)
<TD colspan="2">foobar</TD></TR></TABLE>

The output from either method (__str__ or writeto) SHOULD be
lexically equivalent, regardless of indentation; not all elements are
made pretty, only those which are insensitive to the addition of
whitespace before or after the start/end tags. If you don't like the
indenting, use writeto(perlevel=0) (or __str__(perlevel=0)).
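
The difference between streaming and building one string can be
sketched as follows, in modern Python 3. Node and its render() helper
are hypothetical names for this illustration and are not part of this
module; the sketch only assumes a file-like object with a write()
method.

```python
import io
import sys


class Node:
    # Hypothetical stand-in for an element; not the module's own class.
    def __init__(self, name, *content):
        self.name = name
        self.content = list(content)

    def writeto(self, fp=sys.stdout, indent=0, perlevel=2):
        # Write each piece as soon as it is known instead of joining
        # everything into one large string first.
        if perlevel:
            print(file=fp)              # start the element on a new line
        fp.write(" " * indent + "<%s>" % self.name)
        for c in self.content:
            if hasattr(c, "writeto"):
                c.writeto(fp, indent + perlevel, perlevel)
            else:
                fp.write(str(c))
        fp.write("</%s>" % self.name)

    def render(self, indent=0, perlevel=2):
        # Same output as writeto(), collected in memory.
        buf = io.StringIO()
        self.writeto(buf, indent, perlevel)
        return buf.getvalue()


row = Node("TR", Node("TD", "foo"), Node("TD", "bar"))
row.writeto(sys.stdout)           # progressive output
flat = row.render(perlevel=0)     # no indentation at all
pretty = row.render()             # indented, same markup
```

In this sketch, removing the whitespace from the indented form gives
the flat form back, which is the sense in which the two outputs are
meant to be equivalent.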

Element attributes can be set through the normal Python dictionary
operations on the object (they are not Python attributes).

Note: There are a few HTML attributes with a dash in them. In these
cases, substitute an underscore and the output will be corrected. HTML
4.0 also defines a class attribute, which conflicts with Python's
class statement; use klass instead. The new LABEL element has a for
attribute which also conflicts; use label_for instead.

>>> print META(http_equiv='refresh',content='60;/index2.html')
<META http-equiv="refresh" content="60;/index2.html">
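
One way to picture the name translation is the sketch below, written
in modern Python 3. The TRANSLATIONS table and both helper functions
are hypothetical and exist only for this example; in particular, the
blanket underscore-to-dash rule is an assumption of the sketch.

```python
# Hypothetical translation table modelled on the note above.
TRANSLATIONS = {"klass": "class", "label_for": "for"}


def html_attr_name(py_name):
    # Explicit translations win; otherwise underscores become dashes.
    return TRANSLATIONS.get(py_name, py_name.replace("_", "-"))


def start_tag(name, **attr):
    # Keyword order is preserved in Python 3.7 and later.
    parts = ['%s="%s"' % (html_attr_name(k), v) for k, v in attr.items()]
    return "<%s>" % " ".join([name] + parts)


meta = start_tag("META", http_equiv="refresh", content="60;/index2.html")
label = start_tag("LABEL", label_for="email", klass="field")
print(meta)
print(label)
```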

The output order of attributes is indeterminate (based on hash order),
but this is of no particular importance. The extent of attribute
checking is limited to checking that the attribute is legal for that
element; the values themselves are not checked, but must be
convertible to a string. The content items must be convertible to
strings and/or have a writeto() method. Some elements may have a few
attributes they shouldn't, particularly those which use intrinsic
events.

Valid attributes are defined for each element with dictionaries, with
the keys being the attributes. If the dictionary value is false, the
attribute is a boolean and is printed as a bare name when set;
otherwise it is printed as name="value".
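
The dictionary-driven checking can be sketched like this, in modern
Python 3. INPUT_ATTLIST, set_attribute and render_attribute are
hypothetical names made up for the illustration; they mimic the idea
rather than reproduce this module's methods.

```python
# Hypothetical attribute table: a true value means name="value" output,
# a false value marks a boolean attribute printed as a bare name.
INPUT_ATTLIST = {"type": 1, "name": 1, "checked": 0, "disabled": 0}


def set_attribute(store, attlist, key, value):
    # Only the attribute name is checked, never the value.
    if key not in attlist:
        raise KeyError("Invalid attribute for this element: %s" % key)
    store[key] = value


def render_attribute(store, attlist, key):
    if attlist[key]:
        return '%s="%s"' % (key, store[key])
    return key if store[key] else ""    # boolean: present only when true


store = {}
set_attribute(store, INPUT_ATTLIST, "type", "checkbox")
set_attribute(store, INPUT_ATTLIST, "checked", 1)
rendered = " ".join(render_attribute(store, INPUT_ATTLIST, k) for k in store)
print(rendered)
```

Setting a key that is missing from the table raises KeyError in this
sketch, which corresponds to the "legal for that element" check.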

Subclassing: If all you need to do is have some defaults, override the
defaults dictionary. You will also need to set name to the correct
element name, since it otherwise defaults to the class name. Example:

>>> class Refresh(META): defaults = {'http_equiv': 'refresh'}; name = 'META'

>>> print Refresh(content='10; /index2.html')
<META http-equiv="refresh" content="10; /index2.html">
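
The defaults mechanism can be pictured with the following sketch in
modern Python 3. The Element and META classes below are simplified,
hypothetical stand-ins for this example (no attribute checking, no
content); only the defaults and name handling are modelled.

```python
class Element:
    # Hypothetical base class; an assumption made for this illustration.
    defaults = {}
    name = None   # falls back to the class name when left unset

    def __init__(self, **attr):
        self.attr = dict(self.defaults)   # class-level defaults first
        self.attr.update(attr)            # explicit keywords override them

    def __str__(self):
        tag = self.name or type(self).__name__
        parts = ['%s="%s"' % (k.replace("_", "-"), v)
                 for k, v in self.attr.items()]
        return "<%s>" % " ".join([tag] + parts)


class META(Element):
    pass


class Refresh(META):
    defaults = {"http_equiv": "refresh"}
    name = "META"   # without this the tag would be printed as Refresh


tag = Refresh(content="10; /index2.html")
print(tag)
```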

Weirdness with Netscape 4.x: It recognizes a border attribute for the
FRAMESET element, though it is not defined in the HTML 4.0 spec. It
seems to recognize the frameborder attribute for FRAME, but border
only changes from a 3D shaded border to a flat, unresizable grey
border. Because of this, there is a border attribute defined for
FRAMESET. Similarly, HTML 4.0 does not define a border attribute for
INPUT (for use with type="image"), but one has been added anyway.

Historical notes: My first experience with an HTML generator was with
the one which comes with "Internet Programming with Python" by Aaron
Watters, Guido van Rossum, and James C. Ahlstrom. I hate to dis it,
but the thing really drove me nuts after a while. It was horrible to
debug anything, but maybe my understanding of it was incomplete. I
then discovered HTMLgen by Robin Friedrich:

http://starship.skyport.net/crew/friedrich/HTMLgen/html/main.html

It worked much better, for me at least, good enough for a major
project. There were, however, some frustrations: Subclassing could
sometimes be difficult (in fairness, I think that was by design), and
there were some missing features I wanted. Plus the thing's huge, as
Python modules go. These are relatively minor gripes, and if you don't
like this module, definitely use HTMLgen.

Mainly I did this because the methodology to do it just sorta dawned
on me. The result is, I think, some pretty clean code. Really, there's
hardly any actual code at all. Hey, and when was the last time you saw
a subclass inherit from only one parent class with only a pass
statement and no attributes defined? There are 27 of them here.
There's almost no logic to it at all; it's pretty much all driven by
dictionaries.

Yes, there are a number of features missing which are present in
HTMLgen, namely the document classes. All the high-level abstractions
are going in another module or two.
"""
123 __version__ = "$Revision: 1.8 $"[11:-4]
import string
from string import lower, join, replace
from sys import stdout

coreattrs = {'id': 1, 'klass': 1, 'style': 1, 'title': 1}
i18n = {'lang': 1, 'dir': 1}
intrinsic_events = {'onload': 1, 'onunload': 1, 'onclick': 1,
                    'ondblclick': 1, 'onmousedown': 1, 'onmouseup': 1,
                    'onmouseover': 1, 'onmousemove': 1, 'onmouseout': 1,
                    'onfocus': 1, 'onblur': 1, 'onkeypress': 1,
                    'onkeydown': 1, 'onkeyup': 1, 'onsubmit': 1,
                    'onreset': 1, 'onselect': 1, 'onchange': 1 }
138 attrs = coreattrs.copy()
140 attrs.update(intrinsic_events)
142 alternate_text = {'alt': 1}
143 image_maps = {'shape': 1, 'coords': 1}
144 anchor_reference = {'href': 1}
145 target_frame_info = {'target': 1}
146 tabbing_navigation = {'tabindex': 1}
147 access_keys = {'accesskey': 1}
149 tabbing_and_access = tabbing_navigation.copy()
150 tabbing_and_access.update(access_keys)
152 visual_presentation = {'height': 1, 'width': 1, 'border': 1, 'align': 1,
153 'hspace': 1, 'vspace': 1}
155 cellhalign = {'align': 1, 'char': 1, 'charoff': 1}
156 cellvalign = {'valign': 1}
158 font_modifiers = {'size': 1, 'color': 1, 'face': 1}
160 links_and_anchors = {'href': 1, 'hreflang': 1, 'type': 1, 'rel': 1, 'rev': 1}
161 borders_and_rules = {'frame': 1, 'rules': 1, 'border': 1}
163 from SGML import Markup, Comment
164 from XML import XMLPI
DOCTYPE = Markup("DOCTYPE",
                 'HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" ' \
                 '"http://www.w3.org/TR/REC-html40/loose.dtd"')
DOCTYPE_frameset = Markup("DOCTYPE",
                          'HTML PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN" ' \
                          '"http://www.w3.org/TR/REC-html40/frameset.dtd"')
173 class Element(XMLPI):
176 attr_translations = {'klass': 'class',
178 'http_equiv': 'http-equiv',
179 'accept_charset': 'accept-charset'}
181 def __init__(self, *content, **attr):
183 if not hasattr(self, 'name'): self.name = self.__class__.__name__
184 if self.defaults: self.update(self.defaults)
186 if not self.content_model and content:
187 raise TypeError, "No content for this element"
188 self.content = list(content)
190 def update(self, d2):
191 for k, v in d2.items(): self[k] = v
193 def __setitem__(self, k, v):
195 if self.attlist.has_key(kl): self.dict[kl] = v
196 else: raise KeyError, "Invalid attribute for this element"
198 start_tag_string = "<%s %s>"
199 start_tag_no_attr_string = "<%s>"
200 end_tag_string = "</%s>"
202 def str_attribute(self, k):
203 return self.attlist.get(k, 1) and '%s="%s"' % \
204 (self.attr_translations.get(k, k), str(self[k])) \
205 or self[k] and k or ''
208 a = self.str_attribute_list()
209 return a and self.start_tag_string % (self.name, a) \
210 or self.start_tag_no_attr_string % self.name
213 return self.content_model and self.end_tag_string % self.name or ''
216 class PrettyTagsMixIn:
218 def writeto(self, fp=stdout, indent=0, perlevel=2):
219 myindent = '\n' + " "*indent
220 fp.write(myindent+self.start_tag())
221 for c in self.content:
222 if hasattr(c, 'writeto'):
223 getattr(c, 'writeto')(fp, indent+perlevel, perlevel)
226 fp.write(self.end_tag())
228 def __str__(self, indent=0, perlevel=2):
229 myindent = (perlevel and '\n' or '') + " "*indent
230 s = [myindent, self.start_tag()]
231 for c in self.content:
232 try: s.append(apply(c.__str__, (indent+perlevel, perlevel)))
233 except: s.append(str(c))
234 s.append(self.end_tag())
237 class CommonElement(Element): attlist = attrs
239 class PCElement(PrettyTagsMixIn, CommonElement): pass
241 class A(CommonElement):
243 attlist = {'name': 1, 'charset': 1}
244 attlist.update(CommonElement.attlist)
245 attlist.update(links_and_anchors)
246 attlist.update(image_maps)
247 attlist.update(target_frame_info)
248 attlist.update(tabbing_and_access)
class ABBR(CommonElement): pass
class ACRONYM(CommonElement): pass
class CITE(CommonElement): pass
class CODE(CommonElement): pass
class DFN(CommonElement): pass
class EM(CommonElement): pass
class KBD(CommonElement): pass
class PRE(CommonElement): pass
class SAMP(CommonElement): pass
class STRONG(CommonElement): pass
class VAR(CommonElement): pass
class ADDRESS(CommonElement): pass
class B(CommonElement): pass
class BIG(CommonElement): pass
class I(CommonElement): pass
class S(CommonElement): pass
class SMALL(CommonElement): pass
class STRIKE(CommonElement): pass
class TT(CommonElement): pass
class U(CommonElement): pass
class SUB(CommonElement): pass
class SUP(CommonElement): pass

class DD(PCElement): pass
class DL(PCElement): pass
class DT(PCElement): pass
class NOFRAMES(PCElement): pass
class NOSCRIPT(PCElement): pass
class P(PCElement): pass
281 class AREA(PCElement):
283 attlist = {'name': 1, 'nohref': 0}
284 attlist.update(PCElement.attlist)
285 attlist.update(image_maps)
286 attlist.update(anchor_reference)
287 attlist.update(tabbing_and_access)
288 attlist.update(alternate_text)
290 class MAP(AREA): pass
292 class BASE(PrettyTagsMixIn, Element):
294 attlist = anchor_reference.copy()
295 attlist.update(target_frame_info)
300 attlist = coreattrs.copy()
303 class BLOCKQUOTE(CommonElement):
305 attlist = {'cite': 1}
306 attlist.update(CommonElement.attlist)
308 class Q(BLOCKQUOTE): pass
310 class BR(PrettyTagsMixIn, Element):
315 class BUTTON(CommonElement):
317 attlist = {'name': 1, 'value': 1, 'type': 1, 'disabled': 0}
318 attlist.update(CommonElement.attlist)
319 attlist.update(tabbing_and_access)
321 class CAPTION(Element):
323 attlist = {'align': 1}
324 attlist.update(attrs)
326 class COLGROUP(PCElement):
328 attlist = {'span': 1, 'width': 1}
329 attlist.update(PCElement.attlist)
330 attlist.update(cellhalign)
331 attlist.update(cellvalign)
333 class COL(COLGROUP): content_model = None
337 attlist = {'cite': 1, 'datetime': 1}
338 attlist.update(attrs)
342 class FIELDSET(PCElement): pass
344 class LEGEND(PCElement):
346 attlist = {'align': 1}
347 attlist.update(PCElement.attlist)
348 attlist.update(access_keys)
350 class BASEFONT(Element):
353 attlist.update(font_modifiers)
358 attlist = font_modifiers.copy()
359 attlist.update(coreattrs)
362 class FORM(PCElement):
364 attlist = {'action': 1, 'method': 1, 'enctype': 1, 'accept_charset': 1,
366 attlist.update(PCElement.attlist)
368 class FRAME(PrettyTagsMixIn, Element):
370 attlist = {'longdesc': 1, 'name': 1, 'src': 1, 'frameborder': 1,
371 'marginwidth': 1, 'marginheight': 1, 'noresize': 0,
373 attlist.update(coreattrs)
376 class FRAMESET(PrettyTagsMixIn, Element):
378 attlist = {'rows': 1, 'cols': 1, 'border': 1}
379 attlist.update(coreattrs)
380 attlist.update(intrinsic_events)
382 class Heading(PCElement):
384 attlist = {'align': 1}
385 attlist.update(attrs)
387 def __init__(self, level, *content, **attr):
389 apply(PCElement.__init__, (self,)+content, attr)
392 a = self.str_attribute_list()
393 return a and "<H%d %s>" % (self.level, a) or "<H%d>" % self.level
396 return self.content_model and "</H%d>\n" % self.level or ''
398 class HEAD(PrettyTagsMixIn, Element):
400 attlist = {'profile': 1}
405 attlist = {'align': 1, 'noshade': 0, 'size': 1, 'width': 1}
406 attlist.update(coreattrs)
407 attlist.update(intrinsic_events)
410 class HTML(PrettyTagsMixIn, Element):
414 class TITLE(HTML): pass
416 class BODY(PCElement):
418 attlist = {'background': 1, 'text': 1, 'link': 1, 'vlink': 1, 'alink': 1,
420 attlist.update(PCElement.attlist)
422 class IFRAME(PrettyTagsMixIn, Element):
424 attlist = {'longdesc': 1, 'name': 1, 'src': 1, 'frameborder': 1,
425 'marginwidth': 1, 'marginheight': 1, 'scrolling': 1,
426 'align': 1, 'height': 1, 'width': 1}
427 attlist.update(coreattrs)
429 class IMG(CommonElement):
431 attlist = {'src': 1, 'longdesc': 1, 'usemap': 1, 'ismap': 0}
432 attlist.update(PCElement.attlist)
433 attlist.update(visual_presentation)
434 attlist.update(alternate_text)
437 class INPUT(CommonElement):
439 attlist = {'type': 1, 'name': 1, 'value': 1, 'checked': 0, 'disabled': 0,
440 'readonly': 0, 'size': 1, 'maxlength': 1, 'src': 1,
441 'usemap': 1, 'accept': 1, 'border': 1}
442 attlist.update(CommonElement.attlist)
443 attlist.update(tabbing_and_access)
444 attlist.update(alternate_text)
447 class LABEL(CommonElement):
449 attlist = {'label_for': 1}
450 attlist.update(CommonElement.attlist)
451 attlist.update(access_keys)
455 attlist = {'compact': 0}
456 attlist.update(CommonElement.attlist)
460 attlist = {'start': 1}
461 attlist.update(UL.attlist)
465 attlist = {'value': 1, 'type': 1}
466 attlist.update(UL.attlist)
468 class LINK(PCElement):
470 attlist = {'charset': 1, 'media': 1}
471 attlist.update(PCElement.attlist)
472 attlist.update(links_and_anchors)
475 class META(PrettyTagsMixIn, Element):
477 attlist = {'http_equiv': 1, 'name': 1, 'content': 1, 'scheme': 1}
481 class OBJECT(PCElement):
483 attlist = {'declare': 0, 'classid': 1, 'codebase': 1, 'data': 1,
484 'type': 1, 'codetype': 1, 'archive': 1, 'standby': 1,
485 'height': 1, 'width': 1, 'usemap': 1}
486 attlist.update(PCElement.attlist)
487 attlist.update(tabbing_navigation)
489 class SELECT(PCElement):
491 attlist = {'name': 1, 'size': 1, 'multiple': 0, 'disabled': 0}
492 attlist.update(CommonElement.attlist)
493 attlist.update(tabbing_navigation)
495 class OPTGROUP(PCElement):
497 attlist = {'disabled': 0, 'label': 1}
498 attlist.update(CommonElement.attlist)
500 class OPTION(OPTGROUP):
502 attlist = {'value': 1, 'selected': 0}
503 attlist.update(OPTGROUP.attlist)
505 class PARAM(Element):
507 attlist = {'id': 1, 'name': 1, 'value': 1, 'valuetype': 1, 'type': 1}
509 class SCRIPT(Element):
511 attlist = {'charset': 1, 'type': 1, 'src': 1, 'defer': 0}
513 class SPAN(CommonElement):
515 attlist = {'align': 1}
516 attlist.update(CommonElement.attlist)
518 class DIV(PrettyTagsMixIn, SPAN): pass
520 class STYLE(PrettyTagsMixIn, Element):
522 attlist = {'type': 1, 'media': 1, 'title': 1}
525 class TABLE(PCElement):
527 attlist = {'cellspacing': 1, 'cellpadding': 1, 'summary': 1, 'align': 1,
528 'bgcolor': 1, 'width': 1}
529 attlist.update(CommonElement.attlist)
530 attlist.update(borders_and_rules)
532 class TBODY(PCElement):
534 attlist = CommonElement.attlist.copy()
535 attlist.update(cellhalign)
536 attlist.update(cellvalign)
538 class THEAD(TBODY): pass
539 class TFOOT(TBODY): pass
540 class TR(TBODY): pass
    attlist = {'abbr': 1, 'axis': 1, 'headers': 1, 'scope': 1,
545 'rowspan': 1, 'colspan': 1, 'nowrap': 0, 'width': 1,
547 attlist.update(TBODY.attlist)
551 class TEXTAREA(CommonElement):
553 attlist = {'name': 1, 'rows': 1, 'cols': 1, 'disabled': 0, 'readonly': 0}
554 attlist.update(CommonElement.attlist)
555 attlist.update(tabbing_and_access)
def CENTER(*content, **attr):
    c = apply(DIV, content, attr)
    c['align'] = 'center'
    return c

# Accept content the same way as every other element: as positional arguments.
def H1(*content, **attr): return apply(Heading, (1,)+content, attr)
def H2(*content, **attr): return apply(Heading, (2,)+content, attr)
def H3(*content, **attr): return apply(Heading, (3,)+content, attr)
def H4(*content, **attr): return apply(Heading, (4,)+content, attr)
def H5(*content, **attr): return apply(Heading, (5,)+content, attr)
def H6(*content, **attr): return apply(Heading, (6,)+content, attr)
569 class CSSRule(PrettyTagsMixIn, Element):
571 attlist = {'font': 1, 'font_family': 1, 'font_face': 1, 'font_size': 1,
572 'border': 1, 'border_width': 1, 'color': 1,
573 'background': 1, 'background_color': 1, 'background_image': 1,
574 'text_align': 1, 'text_decoration': 1, 'text_indent': 1,
575 'line_height': 1, 'margin_left': 1, 'margin_right': 1,
576 'clear': 1, 'list_style_type': 1}
580 def __init__(self, selector, **decl):
585 start_tag_string = "%s { %s }"
587 def end_tag(self): return ''
589 def str_attribute(self, k):
590 kt = replace(k, '_', '-')
591 if self.attlist[k]: return '%s: %s' % (kt, str(self[k]))
592 else: return self[k] and kt or ''
594 def str_attribute_list(self):
595 return join(map(self.str_attribute, self.dict.keys()), '; ')
    r=replace; return r(r(r(s, '&', '&amp;'), '<', '&lt;'), '>', '&gt;')
602 safe = string.letters + string.digits + '_,.-'
607 if c in safe: l.append(c)
608 elif c == ' ': l.append('+')
609 else: l.append("%%%02x" % ord(c))
612 def URL(*args, **kwargs):
613 url_path = join(args, '/')
615 for k, v in kwargs.items():
616 a.append("%s=%s" % (url_encode(k), url_encode(v)))
617 url_vals = join(a, '&')
618 return url_vals and join([url_path, url_vals],'?') or url_path
620 def Options(options, selected=[], **attrs):
623 opt = apply(OPTION, (o,), attrs)
625 if v in selected: opt['selected'] = 1
629 def Select(options, selected=[], **attrs):
630 return apply(SELECT, tuple(apply(Options, (options, selected))), attrs)
632 def Href(url, text, **attrs):
633 h = apply(A, (text,), attrs)
637 def Mailto(address, text, subject='', **attrs):
639 url = "mailto:%s?subject=%s" % (address, subject)
641 url = "mailto:%s" % address
642 return apply(Href, (url, text), attrs)
644 def Image(src, **attrs):
645 i = apply(IMG, (), a)
649 def StyledTR(element, row, klasses):
651 for i in range(len(row)):
652 r.append(klasses[i] and element(row[i], klass=klasses[i]) \
656 def StyledVTable(klasses, *rows, **attrs):
657 t = apply(TABLE, (), attrs)
658 t.append(COL(span=len(klasses)))
660 r = StyledTR(TD, row[1:], klasses[1:])
661 h = klasses[0] and TH(row[0], klass=klasses[0]) \
663 r.content.insert(0,h)
667 def VTable(*rows, **attrs):
668 t = apply(TABLE, (), attrs)
669 t.append(COL(span=len(rows[0])))
671 r = apply(TR, tuple(map(TD, row[1:])))
672 r.content.insert(0, TH(row[0]))
676 def StyledHTable(klasses, headers, *rows, **attrs):
677 t = apply(TABLE, (), attrs)
678 t.append(COL(span=len(headers)))
679 t.append(StyledTR(TH, headers, klasses))
680 for row in rows: t.append(StyledTR(TD, row, klasses))
def HTable(headers, *rows, **attrs):
    t = apply(TABLE, (), attrs)
    t.append(COL(span=len(headers)))
    # One TR holding a TH per header, then one TR per row with a TD per cell.
    t.append(apply(TR, tuple(map(TH, headers))))
    for row in rows: t.append(apply(TR, tuple(map(TD, row))))
    return t
690 def DefinitionList(*items, **attrs):
691 dl = apply(DL, (), attrs)
692 for dt, dd in items: dl.append(DT(dt), DD(dd))