Handling of special characters

I normally use Graphviz through the tool 'gvedit 1.01', which is part of the Windows installation of Graphviz. Because of a requirement described in the forum topic "Different spline-styles on different levels not possible?", I now had to use the command-line version of 'dot'. When I processed my gv file with 'dot', I got this warning:

(dot.exe:5472): Pango-WARNING **: Invalid UTF-8 string passed to pango_layout_set_text()

In my input file I use the extended ASCII character 133 (three dots in a row) in some labels.
So the gv file looks like this (I hope the character "..." between A and B will still be visible after saving this post):

Digraph G
{
C [label = "A…B"];
A -> C [splines = "true"];
B -> C [splines = "false"];
}

But the output from the command-line call of 'dot' shows a different character (a box with a cross). Only when I call dot from 'gvedit' do I get the desired result. In the past I also created Graphviz files with Cyrillic Unicode characters in the node labels. This was no problem when I opened the files in 'gvedit'. But I just ran a test with command-line dot and got this message:

Warning: ju.gv:1: syntax error in line 1 near '´╗┐Digraph'

Why do I get these different results?
Is there any way to influence how 'dot' handles special characters and Unicode files?

Dot only accepts UTF-8 and Latin-1 input, the former being the default. The extended ascii codes 128-159 are non-standard, so if you are using them in your input, it's not going to work. (I don't know why it appears to work in gvedit, since gvedit just calls dot. If you want to save your gvedit graph and send it, I can take a look.)
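A note on the encodings involved: in Windows-1252, byte 133 (0x85) is the ellipsis, but on its own it is not a valid UTF-8 sequence, which is consistent with the Pango warning above. The '´╗┐' in front of 'Digraph' in the syntax error is what a UTF-8 byte order mark (bytes EF BB BF) looks like in a cp850 console, so the Cyrillic file was probably saved as UTF-8 with a BOM. The Python sketch below is one way to normalize a gv file before handing it to dot. The function name normalize_to_utf8 and the fallback to Windows-1252 are assumptions made for illustration and are not part of Graphviz.

```python
import codecs


def normalize_to_utf8(raw: bytes) -> bytes:
    """Return the file content re-encoded as UTF-8 without a byte order mark."""
    # Drop a leading UTF-8 BOM (EF BB BF) if the editor wrote one.
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    try:
        # Input that already is valid UTF-8 passes through unchanged.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 was saved as
        # Windows-1252, where byte 0x85 is the ellipsis. Bytes that
        # Windows-1252 leaves undefined (such as 0x81) still raise an error.
        text = raw.decode("cp1252")
    return text.encode("utf-8")
```

Reading the gv file in binary mode, passing the bytes through this function, and writing the result back should give dot plain UTF-8 input.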

There is a Unicode code point for the ellipsis (8230), which you can enter as UTF-8, but you may find it easier to use the HTML versions: &hellip; or &#8230;
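If the file should stay pure ASCII, the numeric entities can be generated rather than typed by hand. A minimal Python sketch, relying on the standard 'xmlcharrefreplace' error handler; the helper name ascii_label is hypothetical:

```python
def ascii_label(text: str) -> str:
    """Replace every non-ASCII character with a decimal HTML entity."""
    # 'xmlcharrefreplace' turns each character that ASCII cannot encode
    # into "&#NNNN;", where NNNN is its decimal Unicode code point.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# The ellipsis has code point 8230, so "A…B" becomes "A&#8230;B".
print(ascii_label("A\u2026B"))
```

The same call also covers Cyrillic text, at the cost of labels that are hard to read in the source file.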

Missing screenshot

Yesterday I tried to add a screenshot to my post simply by copying and pasting it into the rich text editor of this forum.

There is an embedded image in the source code of the post, but the picture is not visible.

I can't see how to attach a picture to a post in this forum, even though I think I have done it before.

Yes, I don't think it is easy, which is not ideal for a visualization forum.

I also thought that gvedit simply calls 'dot', but somehow it behaves differently. I'm not sure what you mean by sending the gvedit graph, but here is a simple demonstration that gvedit handles such characters without trouble:

Digraph G
{
C [label = "A…B"];
A -> C [splines = "true"];
B -> C [splines = "false"];
D [label = "Пётр Ильич Чайковский"];
C -> D;
}

This looks the way I want it to.

screenshot

But when I save the file and process it with the command line dot I get the message:

Error: Invalid 3-byte UTF8 found in input. Perhaps "-Gcharset=latin1" is needed?

It's also not possible to load this file into gvedit again. It then looks different and no longer works:

Digraph G
{
C [label = "Aâ?¦B"];
A -> C [splines = "true"];
B -> C [splines = "false"];
D [label = "Ð?Ñ?Ñ?Ñ? Ð?лÑ?иÑ? Чайковский"];
C -> D;
}

Normally I write my gv files with another editor and then copy and paste the code into 'gvedit'. Working this way, I never had problems with special characters.

There was a bug in gvedit that caused it not to produce valid UTF-8 on output, which explains the behavior you note above. This has now been fixed.

If you have a gv file produced by your other editor which works correctly in gvedit, it should also work correctly directly with dot. If not, please submit a bug report or email the file to me.
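For anyone wondering where labels like "Aâ?¦B" come from: they are consistent with the UTF-8 bytes of the original text being read one byte at a time as Latin-1, with the bytes 128-159 shown as '?'. The Python sketch below only reproduces that pattern. It is a guess at the mechanism, not the actual gvedit code, and the function name is hypothetical.

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Show what UTF-8 text looks like when each byte is taken as Latin-1."""
    chars = []
    for byte in text.encode("utf-8"):
        if 0x80 <= byte <= 0x9F:
            # Latin-1 has no printable character here; assume a '?' is shown.
            chars.append("?")
        else:
            # Every other byte maps to the character with the same code.
            chars.append(chr(byte))
    return "".join(chars)


# The ellipsis is E2 80 A6 in UTF-8, which comes out as "â", "?", "¦".
print(misread_utf8_as_latin1("A\u2026B"))
```

Applied to "Пётр", the same function gives "Ð?Ñ?Ñ?Ñ?", which matches the start of the garbled label on node D.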