Number: 695
Title: HTML labels don't accept latin characters
Submitter: Paulo Pinto
Date: Mon Apr 25 04:42:17 2005
Subsys: Dot
Version: 2.0
System: x86-Windows-XP SP2
Severity: major
Problem:
Nodes with HTML labels display the node name instead of the label when the label contains accented Latin characters. See the attached files for a reproduction. Command line:
  dot -Tpng bug_latin_html.txt -o bug_latin_html.png

Error output:
  Error: not well-formed (invalid token) in line 2
  ... AndrÚ</td> ...
  in label of node node1

I would expect this to work, since the exact same character is accepted in a normal (non-HTML) label, as the example file also demonstrates.
Input file: b695.dot
Output file: b695.png
Comments:
[erg] This is basically a request to officially support character sets other than UTF-8. For now, the user can either save the input as UTF-8 or write the character as an HTML entity (for example, &#233; for é). Note that if we allow a charset attribute, its value should be passed to libexpat.
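The two workarounds above can be sketched as a small preprocessing step run before dot. This is only an illustration: it assumes the input file was saved as Latin-1 (ISO-8859-1), which the report does not state explicitly, and the helper names latin1_to_utf8 and to_char_refs are hypothetical, not part of Graphviz.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode raw file contents from Latin-1 to UTF-8.

    Assumption: the bytes really are Latin-1. Every byte value is valid
    Latin-1, so decoding never fails, but the result is wrong if the
    file was actually written in another encoding.
    """
    return data.decode("latin-1").encode("utf-8")


def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference.

    The result is pure ASCII, so it no longer depends on the character
    set the XML parser assumes for the HTML label.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # Hypothetical label text; "\u00e9" is e with an acute accent.
    label = "<td>Andr\u00e9</td>"
    print(to_char_refs(label))
```

The first helper would be applied to the whole file before running dot. The second is meant for the text inside an HTML label only, since numeric character references are interpreted there by the XML parser.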
Owner: erg
Status: Request