Creating website link chart

Trying to create a visualization of the links on my website. Not having a lot of success creating a readable graph.

So far I've tried dot and neato, and now twopi.

Any thoughts on how best to do this? Needless to say, the data is HUGE (5 million or so records).

Am I just out of luck or is this just going to take a lot of processing time/power?

With that many nodes, you will need to use either twopi or sfdp, and it will still take some time. You will probably also want to either shrink the output size, or use pdf/ps/svg for output. If your graph is highly connected, it may still end up being a mess.

I've been trying pdf. Is there any benefit to one of the three types over the others?

It's highly connected. I'm trying to pare it down to the minimum connections, but that's tough since it's a website, which leads to a lot of connectivity among the pages.

I could use some help figuring out how exactly to configure things so the nodes are more readable and spaced out. I'm not sure if I'm taking the right approach.

Trying something like this:

digraph G {
node [pad=".05",nodesep="2.2",margin=".05",textsize="8",size="2,1", shape=box,style=filled,color=".7 .3 1.0"];

 

A bitmap version large enough to be legible would probably be too big for the bitmap formats. You can use the size attribute to scale the image down, but then it probably won't be legible anymore. Thus, you should use a vector graphics format such as pdf or svg. Between those, the only real difference is the quality of the viewers you have for each.

To remove overlap, use overlap=false; you should also use outputorder=edgesfirst so the edges will be drawn behind the nodes, and use a semi-transparent edge color. If you actually use nodes with text, the drawing will be even bigger. You might consider drawing the nodes as points (colored disks) and storing the label info as a tooltip, which will appear on mouse-over.
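
As a rough sketch of those settings, assuming the graph is laid out with sfdp and written as SVG (the file name, page names, sizes, and edge color here are placeholders, not taken from the real data):

```dot
// Hypothetical example; render with something like:
//   sfdp -Tsvg links.dot -o links.svg
digraph G {
    // Graph-level settings: remove node overlap and draw edges first,
    // so the nodes end up on top of them.
    graph [overlap=false, outputorder=edgesfirst];

    // Draw nodes as small filled disks instead of labeled boxes.
    node [shape=point, width=0.1, color=".7 .3 1.0"];

    // Semi-transparent edges: the last two hex digits are the alpha channel.
    edge [color="#00000040", arrowsize=0.5];

    // Placeholder pages; the tooltip carries the text a label would have shown.
    home  [tooltip="/index.html"];
    about [tooltip="/about.html"];
    news  [tooltip="/news.html"];

    home -> about;
    home -> news;
    news -> about;
}
```

The tooltips should show up when the SVG is viewed in a browser. Note also that nodesep, pad, and size are graph attributes rather than node attributes, and the font size attribute is called fontsize, so those belong in a graph [...] statement as above rather than in node [...].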

Getting a better layout for graphs of low diameter is still an area of research and depends a lot on the type of graph. It can be very helpful if you can prune as many edges (and nodes) as possible. For example, if you had some notion of how often a link from one page to another is traversed, you could delete edges that are rarely used for layout purposes, perhaps adding them later during drawing.

A website graph is inherently directed, but supporting edges in both directions can really clutter the layout. Consider using a single edge to indicate that there is any edge a -> b or b -> a, then use the dir attribute to indicate which edges are actually there.
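
For instance, a minimal sketch with made-up page names, where each pair of pages gets exactly one edge and dir records which links exist:

```dot
digraph G {
    // One edge per pair of pages, whatever the link direction(s).
    a -> b [dir=forward];  // only a links to b
    a -> c [dir=both];     // a links to c and c links to a
    b -> c [dir=back];     // only c links to b (arrowhead drawn at b)
}
```

This keeps the number of edges the layout has to handle at one per linked pair, while the arrowheads still show the direction of the original links.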
