Graphviz Issue Tracker
Mantis Bug Tracker

ID: 0002260
Project: graphviz
Category: Graph Libraries
View Status: public
Date Submitted: 2013-02-17 04:04
Last Update: 2013-06-28 13:38
Assigned To: erg
Platform:
OS:
OS Version:
Summary: 0002260: agwrite sometimes breaks UTF-8 strings
Description: You can mostly use UTF-8 with Graphviz (and the Python module pygraphviz relies on this), but agwrite sometimes breaks UTF-8 strings when writing out a graph. This happens because _agstrcanon inserts a backslash and an LF when the current line is longer than 80 characters; if the break falls in the middle of a multi-byte UTF-8 sequence, the output is invalid.
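
The failure mode can be illustrated without Graphviz. The sketch below is a simplified stand-in for the wrapping step, not the actual _agstrcanon code: the function name `naive_wrap` is hypothetical, and breaking at exactly every 80 bytes is an assumption made for illustration. It inserts a backslash and an LF with no regard for character boundaries.

```python
def naive_wrap(data: bytes, width: int = 80) -> bytes:
    """Insert backslash + LF after every `width` bytes, ignoring UTF-8 boundaries."""
    chunks = [data[i:i + width] for i in range(0, len(data), width)]
    return b"\\\n".join(chunks)


# 79 ASCII bytes followed by "é" (0xC3 0xA9): the break lands between its two bytes.
label = ("a" * 79 + "é").encode("utf-8")
wrapped = naive_wrap(label)

try:
    wrapped.decode("utf-8")
    is_valid = True
except UnicodeDecodeError:
    # The lead byte 0xC3 is now followed by a backslash instead of 0xA9.
    is_valid = False

print(is_valid)  # prints False
```

The wrapped output contains the lead byte 0xC3, then the backslash and LF, then the orphaned continuation byte 0xA9, which no UTF-8 decoder will accept.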
Steps To Reproduce: Use pygraphviz to create a graph, give it a Unicode label containing several non-ASCII characters, then write out the graph. Pad the string if necessary so that the 80-character point falls in the middle of a multi-byte sequence.
Additional Information: I am attaching a patch that fixes this issue by ensuring that multi-byte UTF-8 sequences are not broken. There should be no impact on plain ASCII, and minimal impact on other encodings (e.g. ISO Latin-1, assuming it is even supported) as long as there are no huge blocks of bytes with values > 127. Even then, breaking long lines is mostly a cosmetic issue.
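
The attached diff is not reproduced on this page, so the following is only a sketch of the idea described above, not the patch itself. The name `utf8_safe_wrap` is hypothetical. It postpones the line break while the next byte is a UTF-8 continuation byte (bit pattern 10xxxxxx), so a multi-byte sequence is never split. The remark about bytes with values > 127 suggests the real patch may use a broader test; this sketch checks continuation bytes only.

```python
def utf8_safe_wrap(data: bytes, width: int = 80) -> bytes:
    """Insert backslash + LF once a line reaches `width` bytes,
    but never immediately before a UTF-8 continuation byte."""
    out = bytearray()
    line_len = 0
    for byte in data:
        # Continuation bytes have the form 10xxxxxx.
        is_continuation = (byte & 0xC0) == 0x80
        if line_len >= width and not is_continuation:
            out += b"\\\n"
            line_len = 0
        out.append(byte)
        line_len += 1
    return bytes(out)


# Hypothetical label: the 80-byte point falls inside "é" (0xC3 0xA9).
label = ("a" * 79 + "é" + "bbbbb").encode("utf-8")
wrapped = utf8_safe_wrap(label)
wrapped.decode("utf-8")  # does not raise: no sequence was split
```

A line may run a few bytes past the limit when the limit falls inside a multi-byte character, which fits the point that line breaking here is mostly cosmetic. Plain ASCII input is wrapped at exactly 80 bytes, as before.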
Tags: No tags attached.
Version: hg head
Attached Files: graphviz-utf8-write.diff (1,058 bytes) 2013-02-17 04:04

- Relationships

-  Notes
There are no notes attached to this issue.

- Issue History
Date Modified | Username | Field | Change
2013-02-17 04:04 | camillo | New Issue |
2013-02-17 04:04 | camillo | File Added: graphviz-utf8-write.diff |
2013-02-18 15:27 | erg | Assigned To | => erg
2013-02-18 15:27 | erg | Status | new => resolved
2013-02-18 15:27 | erg | Resolution | open => fixed
