Mapping and Browsing the Web in a 2D Space ¹ School of Computing Sciences, University of Technology, Sydney, NSW 2007, Australia ² Department of Mathematics & Computing, The University of Southern Queensland, QLD 4350, Australia Mao Lin Huang¹, Wei Lai² and Yanchun Zhang²
Question? What would the next-generation Web browser look like? Why shouldn’t we have a map for our Web journey? Is it possible to map the entire Cyberspace and use the map to guide the overall Web journey?
Introduction The current generation of Web browsers, such as Netscape Navigator and Microsoft Internet Explorer, provides users with an effective and convenient way to move through cyberspace. This is done by clicking on a series of hyper-links embedded in Web pages. However, this arrangement does not give users a visual map to guide them on their Web journey. It does not provide a sense of space while the user is exploring the (cyber) space; instead it only gives a series of linear lists.
Introduction This is mainly because of the difficulty of constructing such a huge, complex, and dynamic map with a (virtually) unlimited number of hyper-documents (nodes) and hyper-links (edges). Most existing visualization techniques and current research interests emphasize “site mapping”. That is, they try to find an effective way of constructing a structured geometrical map for one Web site. This can only guide the user through a very limited region of cyberspace, and does not help users in their overall journey through cyberspace.
The overview of web site-mapping techniques * The classic approach to web site-mapping uses a linear listing of the contents of a web site. * More recently, webmasters have been adding site-maps to their servers, using existing information visualisation techniques to build graphic maps that show the structure of a single web site. Examples are Lotus' WebCutter, Microsoft's WebMapper and IBM's Mapuccino. These are simple diagrams showing the hierarchical structure of the documents in a web site. * They limit their information to one individual web site; they provide no map of the world beyond it (the cyberspace).
A screen dump from Mapuccino (collected from IBM home page). Using 1D listing to map web sites. This is the common solution to site-mapping.
An overview diagram using virtual page technique This is the overview of a large file system produced by Navigational View Builder (collected from the paper presented by Sougata Mukherjea at WWW3). It is a 2D site mapping.
Using a very large virtual page The visualisation technique behind Navigational View Builder is the virtual page technique, which predefines the drawing of the whole graph and then provides a small window and scroll bar that allow the user to navigate through it (by changing the viewing area).
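The virtual page idea can be illustrated with a minimal sketch: the layout is computed once, and scrolling only changes which part of it falls inside the window. The function name `visible_nodes`, the `(x0, y0, width, height)` window tuple, and the sample coordinates are all hypothetical and are not taken from Navigational View Builder.

```python
def visible_nodes(layout, window):
    """Return the nodes whose fixed positions fall inside the viewing window.

    layout: dict mapping node -> (x, y) on the precomputed virtual page.
    window: (x0, y0, width, height) of the current viewing area (assumed shape).
    """
    x0, y0, width, height = window
    return {
        node
        for node, (x, y) in layout.items()
        if x0 <= x < x0 + width and y0 <= y < y0 + height
    }


# The layout never changes; scrolling only moves the window.
layout = {"a": (10, 10), "b": (500, 10), "c": (90, 90)}
first_view = visible_nodes(layout, (0, 0, 100, 100))
scrolled_view = visible_nodes(layout, (450, 0, 100, 100))
```

Because the positions are fixed in advance, the whole graph must be known and laid out before the first view can be shown.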
A web overview diagram (HT-Live) of J. Kennedy's family tree using hyperbolic tree technique (collected from Inxight Software Co.).
Fish-eye views The technique behind HT-Live is the fish-eye viewing technique, which keeps a detailed picture of part of a graph together with the global context of the graph. It changes the zoomed focus point (collected from CS-93-40, technical reports, Dept. of Computer Science at Brown University).
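As a rough illustration of fish-eye viewing, the sketch below applies a commonly used distortion function, G(t) = (d + 1)t / (dt + 1), independently to each coordinate, where t is the normalised distance from the focus to the screen edge and d is a distortion factor. The function names, the rectangular screen starting at (0, 0), and the default d = 3 are assumptions for the example; the slide does not say which variant HT-Live uses.

```python
def fisheye_point(point, focus, width, height, d=3.0):
    """Move a point away from the focus so the focus area is magnified.

    The focus and the screen edges stay fixed; d = 0 gives no distortion.
    """
    def warp(c, f, lo, hi):
        edge = hi if c >= f else lo      # screen edge on the same side as c
        span = edge - f                  # signed distance from focus to that edge
        if span == 0:
            return c
        t = (c - f) / span               # normalised distance in [0, 1]
        return f + span * (d + 1) * t / (d * t + 1)

    return (
        warp(point[0], focus[0], 0.0, width),
        warp(point[1], focus[1], 0.0, height),
    )


# A point halfway between the focus and the right edge is pushed outwards.
moved = fisheye_point((75.0, 50.0), (50.0, 50.0), 100.0, 100.0)
```

Points near the focus spread apart while points near the edge are squeezed together, which is how the detail and the global context fit on one screen.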
Hyperbolic tree The hyperbolic browser technique performs fish-eye viewing with animated transitions to preserve the user’s mental map. It changes both the viewing area and the zoomed focus point (collected from Xerox PARC and Inxight Software Co.).
A web overview diagram of CSSE at Swinburne University of Technology, using bifocal display technique (collected from the paper written by Chris Pilgrim and Ying Leung presented at AusWeb96).
2D bifocal display divides the viewing area into nine distinct regions. It allows the whole data structure to be displayed while saving screen space. The focused region provides a detailed view of part of the data space, and the surrounding regions show the overall structure of the data space.
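The nine regions arise from splitting each axis into three bands (before, inside, and after the focus band). A minimal sketch, assuming coordinates normalised to [0, 1] and a piecewise-linear mapping per axis; the function names and the band parameters are hypothetical.

```python
def bifocal_axis(x, a, b, A, B):
    """Map data coordinate x in [0, 1] to a screen coordinate in [0, 1].

    The data band [a, b] is stretched onto the screen band [A, B];
    the two outer bands are compressed into the remaining space.
    """
    if not (0.0 < a < b < 1.0 and 0.0 < A < B < 1.0):
        raise ValueError("bands must satisfy 0 < a < b < 1 and 0 < A < B < 1")
    if x < a:
        return x * A / a
    if x <= b:
        return A + (x - a) * (B - A) / (b - a)
    return B + (x - b) * (1.0 - B) / (1.0 - b)


def region_index(point, focus_x, focus_y):
    """Return 0..8: which of the nine regions a data point falls in (4 = focus)."""
    def band(c, lo, hi):
        return 0 if c < lo else (1 if c <= hi else 2)

    return 3 * band(point[1], *focus_y) + band(point[0], *focus_x)


# A focus band covering 20% of the data is shown on 60% of the screen.
centre_on_screen = bifocal_axis(0.5, 0.4, 0.6, 0.2, 0.8)
```

Applying `bifocal_axis` to x and y separately gives the central detailed region, four edge regions distorted along one axis, and four corner regions distorted along both.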
A screen dump from Lotus' WebCutter which provides a Star-Like view of web site-maps.
Force-directed layout methods for visualization The technique behind WebCutter is a force-directed layout method. One typical model is the Spring Physical Model, in which each node is replaced by a steel ring and each edge by a Hooke’s law spring. The rings have a gravitational repulsion acting between them, and the drawing is a configuration that minimizes the energy of the system.
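A minimal sketch of such a spring model follows. It assumes Hooke's law attraction on edges, inverse-square repulsion between every pair of nodes, and a simple fixed-step iteration; the parameter names (`rest`, `k`, `repulsion`, `step`) and their defaults are illustrative and are not WebCutter's actual settings.

```python
import math


def spring_layout(pos, edges, rest=1.0, k=1.0, repulsion=1.0, step=0.1, iterations=200):
    """Relax node positions under spring and repulsion forces.

    pos:   dict node -> (x, y) initial positions.
    edges: iterable of (u, v) pairs.
    """
    pos = {n: list(p) for n, p in pos.items()}
    nodes = list(pos)
    edge_set = {frozenset(e) for e in edges}
    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                dist = math.hypot(dx, dy) or 1e-9
                # Positive magnitude pulls u and v together, negative pushes apart.
                mag = -repulsion / dist ** 2
                if frozenset((u, v)) in edge_set:
                    mag += k * (dist - rest)      # Hooke's law spring on edges
                fx, fy = mag * dx / dist, mag * dy / dist
                force[u][0] += fx
                force[u][1] += fy
                force[v][0] -= fx
                force[v][1] -= fy
        for n in nodes:
            pos[n][0] += step * force[n][0]
            pos[n][1] += step * force[n][1]
    return {n: tuple(p) for n, p in pos.items()}


layout = spring_layout({"a": (0.0, 0.0), "b": (3.0, 0.0)}, [("a", "b")])
```

Every iteration examines all pairs of nodes, so the cost grows quadratically with the size of the graph being drawn.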
Ptolomaeus - a web cartographer developed at the University of Rome III, using the Sugiyama layout technique to draw web site-maps.
The layout technique behind Ptolomaeus is the Sugiyama layout algorithm. It assigns nodes to several layers, then orders the nodes within each layer to reduce the number of edge crossings.
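The two steps named above can be sketched as follows. This is a simplified illustration that assumes an acyclic input, uses longest-path layering, and performs a single barycenter pass; the function names are hypothetical, and the slide does not say which heuristics Ptolomaeus uses.

```python
from collections import defaultdict


def assign_layers(nodes, edges):
    """Longest-path layering of a DAG: sources go to layer 0."""
    preds = defaultdict(list)
    for u, v in edges:
        preds[v].append(u)
    layer = {}

    def depth(v):
        if v not in layer:
            layer[v] = 1 + max((depth(u) for u in preds[v]), default=-1)
        return layer[v]

    for v in nodes:
        depth(v)
    return layer


def order_layer(layer_nodes, upper_order, edges):
    """Order one layer by the mean position of each node's neighbours above it."""
    position = {u: i for i, u in enumerate(upper_order)}

    def barycenter(v):
        ps = [position[u] for u, w in edges if w == v and u in position]
        return sum(ps) / len(ps) if ps else 0.0

    return sorted(layer_nodes, key=barycenter)


edges = [("a", "c"), ("b", "d"), ("c", "e")]
layers = assign_layers(["a", "b", "c", "d", "e"], edges)
# With the upper layer fixed as [a, b], placing c before d removes the crossing.
lower = order_layer(["d", "c"], ["a", "b"], edges)
```

A full implementation would repeat the ordering pass up and down the layers several times and would first remove cycles from the input graph.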
Static layout + dynamic viewing The classical visualisation techniques described above for creating web site maps are only suitable for mapping one individual web site, which contains a moderate amount of data (hundreds or up to thousands of nodes). We define this type of method as the “static layout + dynamic viewing” approach: it builds a static global context of the graph, and then allows the user to navigate through it. Since the amount of data that can be effectively displayed at one time is limited, and the whole global context may not be displayed in detail at one time, these methods always involve a mechanism to change the view (dynamic viewing). This allows the user to view a small area of the whole visualization at a time by changing the viewing area, the zoomed focus point, or the view point of the visualization.
The “static layout + dynamic viewing” method is the traditional solution to the “small window” problem.
A summary of previous visualization techniques While these techniques deal with graphs of moderate size, they do not handle huge graphs (with millions or perhaps billions of nodes). The major problems are outlined below: * These techniques predefine the layout, but in most cases the whole graph may not be known. In some cases, a local node in a distributed system may know only a small subgraph of the graph, so it may be impossible to pre-compute the layout of the whole graph. * Pre-computing the overall geometrical structure of a huge graph is computationally very expensive. Most layout algorithms have super-linear time complexity, and in practice are too slow for interactive graphics if the number of nodes is larger than a few hundred. * The layout is predefined and views are extracted from this layout. The user is unable to navigate logically through the graph, although users naturally think in terms of logical relations, not in terms of the synthetic geometrical mapping onto the screen.
The huge web graph Local area information systems are merging into a huge shared system (distributed information systems) in which a vast amount of data is available over the Internet. We want to visualize the structure of this information. As the amount of data that we want to visualize becomes larger and the relations become more complicated, classical layout methods and site-mapping techniques tend to be inadequate. Web graphs are very large; even a small organization (such as a university) has many thousands of web documents. When the graph represents web data, the graph is not only huge but also partially unknown.
The complexity of web graph Real-world web hyper-links are very complex. The structure of the web could be far more complicated than those which can be comfortably dealt with by any existing drawing algorithm or visualisation method, and it could be too complex to be read and understood by the viewer. However, in most cases, a particular user may only be interested in a part of the information with certain properties. Therefore, some filtering rules can be used to pick out only the essential hyper-links for the purpose of the visualization, and to make the structure to be drawn as simple as possible.
Objective Visual Web Browser: mapping and browsing the entire Cyberspace in a 2D space, not only one individual web site. We look at the whole of Cyberspace as one graph: a huge and partially unknown graph. We use an on-line visualization technique to maintain and display a subset of this huge graph incrementally.
Objective Visual Web Browser consists of three major components: a fast accessible linkage server, a filtering mechanism, and an on-line visualizer.
Information filtering To reduce the complexity of the web graph, we remove unwanted information (links and nodes) from the returned neighborhood and retain only the essential part of the real Web graph. We try to convert a real-world web graph G into a simplified visual graph T which is clearer and more comprehensible. This simplified visualization is a tree structure, and the user can navigate through it.
This task can be done by defining a list of filtering rules: Rule 1: Graph structure based rule: We provide two mechanisms to eliminate the closed regions and ensure a tree-structured visualisation for navigation. Rule 2: Web context structure based rule: Remove all nodes which represent internal document anchors, since they do not provide much help for global Web exploration. Rule 3: Data type based rule: A particular user may only be interested in a certain type of information, thus we may choose only those documents with certain properties to add to the visualisation (e.g. html, gif, jpg …). Rule 4: Document structure based rule: Some organisations place Logo images in their home pages. Rule 5: Link number based rule: We may choose only the 20 highest-ranked links from a list.
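A minimal sketch of how some of these rules might be applied to one returned neighborhood. The function `filter_links`, its parameters, and the example URLs are hypothetical. The sketch assumes absolute URLs that are already ranked, implements Rule 1 simply by dropping targets that are already in the visualisation (the two mechanisms mentioned above are not specified here), and leaves Rule 4 out.

```python
import posixpath
from urllib.parse import urldefrag, urlparse


def filter_links(page_url, links, visited,
                 allowed_ext=(".html", ".gif", ".jpg"), max_links=20):
    """Reduce the outgoing links of page_url to a small, tree-friendly list."""
    base = urldefrag(page_url).url
    kept = []
    for link in links:
        target = urldefrag(link).url          # drop any "#anchor" part
        if target == base:
            continue                          # Rule 2: internal document anchor
        if target in visited or target in kept:
            continue                          # Rule 1 (assumed): keep T a tree
        ext = posixpath.splitext(urlparse(target).path)[1].lower()
        if ext not in allowed_ext:
            continue                          # Rule 3: unwanted data type
        kept.append(target)
        if len(kept) == max_links:
            break                             # Rule 5: limit the link count
    return kept


links = [
    "http://example.org/index.html#top",
    "http://example.org/a.html",
    "http://example.org/a.html#sec",
    "http://example.org/b.pdf",
    "http://example.org/logo.gif",
    "http://example.org/seen.html",
]
children = filter_links("http://example.org/index.html", links,
                        visited={"http://example.org/seen.html"})
```

Dropping already-visited targets loses some cross links, but each page then appears at most once, so the result can be drawn as a tree.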
On-line visualizer We now discuss the visualization and graph drawing methods used to draw and display the simplified graph T. We use on-line visualization to maintain the user's orientation during their Web journey. We use a force-directed algorithm to draw T. We display a succession of logical frames of the Web graph, with multiple animations to reduce the cognitive effort the user needs to recognize a change of view and to preserve their “mental map” of the view.
Online Navigational Visualisation (OFDAV) OFDAV provides a major departure from traditional methods. We visualise a tiny part (a frame F_i) of a huge graph at time t. We change from F_i to F_{i+1} by user interaction. OFDAV does not need to know the whole graph, it does not predefine the geometry (the user can navigate logically), and it is user-oriented.
Online Navigational Visualization In OFDAV, the view of the user focuses on a small subgraph of a large graph G at any point in time. The subgraph is defined by its focus nodes. Conceptually, the focus nodes form a FIFO queue. We allow the user to change the set of focus nodes by selecting another node on the screen. We use a force-directed graph drawing algorithm to draw the subgraph of G and a logical neighbourhood of this subgraph. We use animation to guide the user between views, reduce the cognitive effort and preserve the mental map. We also keep a history that traces the subgraphs that the user has visited. This assists in backtracking through the graph.
A sequence of overview diagrams A number of researchers have noted that overview diagrams provide a reasonable solution to the “lost in hyperspace” problem, and some overview diagram systems have been proposed. Our system can dynamically generate a sequence of such diagrams.
Focus+Context views Other researchers have developed new dynamic methods to visualize the results of web search queries. Mukherjea proposes a dynamic focus+context view technique that shows the focus node, the immediate neighbourhood of the node and some landmark nodes in a web site. This helps users to quickly understand where they are.
Focus+Context views However, from the visualization and navigation points of view, this technique has a number of weaknesses: The mental map is broken when jumping from one view to another. (OFDAV adopts three types of animation to smooth the transition from one view to another.) The user understands where they are, but has no guide for returning to where they have been. (OFDAV adopts a “history” tail that traces the previous focus nodes the user has visited. This assists the user in backtracking through the graph.)
The Online Graph Model The exploration of the huge graph G uses a sequence of sub-graphs F_1, F_2, ...; each F_i is a logical frame. The logical frame is the sub-graph which is currently being viewed on the screen. Each logical frame is defined by a focus node v_i of F_i. A FIFO queue Q_i of focus nodes of F_i is maintained. To change from F_i to F_{i+1}, a node in F_i is selected and becomes a new focus node; a node is then deleted from Q_i.
The Graph is Partially Unknown The graph is supplied to the system by a series of requests for neighborhoods of focus nodes. (Figure labels: huge graph; new focus node v; neighborhood of v.)
The Online Graph Model The logical key frame F_i is the graph induced by nodes near Q_i (in graph-theoretic distance).
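A minimal sketch of building such a frame, assuming "near" means within a fixed number of hops of some focus node. The `neighbours` callback is a hypothetical stand-in for a neighborhood request; the `radius` parameter and the sample graph are also assumptions. Note that this sketch also queries non-focus nodes in order to collect every edge of the induced subgraph.

```python
from collections import deque


def logical_frame(neighbours, focus_nodes, radius=1):
    """Return (nodes, edges) of the subgraph induced by nodes near the focus nodes.

    neighbours:  callable node -> iterable of adjacent nodes (assumed interface).
    focus_nodes: the current queue of focus nodes.
    """
    dist = {v: 0 for v in focus_nodes}
    frontier = deque(focus_nodes)
    while frontier:                      # multi-source breadth-first search
        u = frontier.popleft()
        if dist[u] == radius:
            continue                     # do not expand beyond the radius
        for w in neighbours(u):
            if w not in dist:
                dist[w] = dist[u] + 1
                frontier.append(w)
    nodes = set(dist)
    edges = {
        frozenset((u, w))
        for u in nodes
        for w in neighbours(u)
        if w in nodes and w != u
    }
    return nodes, edges


# A path a - b - c - d standing in for a tiny piece of the huge graph.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
nodes, edges = logical_frame(lambda v: graph[v], ["a"])
```

Only the part of the graph around the focus nodes is ever requested, so the frame can be built without knowing the whole graph.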
Transitions To change from one logical key frame F_i to the next, F_{i+1}, the user selects a node v_{i+1} in F_i with a mouse click. The node v_{i+1} is appended to the queue, and a node is deleted from the queue in a FIFO manner.
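The queue update in a transition can be sketched with a double-ended queue. The function name `select_focus` and the fixed `capacity` are assumptions, as is the choice to ignore a click on a node that is already a focus node.

```python
from collections import deque


def select_focus(queue, node, capacity):
    """Append a newly selected focus node; return the evicted focus node, if any."""
    if node in queue:
        return None                # assumption: re-selecting a focus node is a no-op
    queue.append(node)
    if len(queue) > capacity:
        return queue.popleft()     # FIFO: the oldest focus node leaves the frame
    return None


focus_queue = deque()
evicted = [select_focus(focus_queue, v, capacity=2) for v in ["a", "b", "c"]]
```

The returned node tells the visualizer which neighbourhood to remove from the frame after the new one has been added.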
* A fast accessible linkage server: it provides linkage information for all pages indexed by a particular search engine. It can quickly produce and return the entire neighborhood of a new focus node, including information about the neighborhood. * An information filter: a mechanism that reduces the complexity of the web sub-graph. It removes the unwanted information and retains only the essential part of the neighborhood for visualization. * An on-line visualizer: a visualization technique that uses the on-line exploratory concept to dynamically display the web sub-graphs. It provides users with a dynamic map during their web journeys.
Visual Web Browser architecture At run-time, the user interactively clicks on a graphic node v in the logical key frame F_i. This node becomes a new focus node and is added to the queue Q_i, and the corresponding URL is sent to the Linkage Server. The Linkage Server quickly produces the entire neighborhood (a list of URLs) of the focus node v and sends it back to the Information Filter. The Filter selects the essential part of the neighborhood and sends it to the On-line Visualizer for display. As soon as the neighborhood arrives, the On-line Visualizer creates a neighborhood tree T(v) and adds it to the logical key frame. To save screen space, the system deletes an old focus node u from the queue Q_i, and the corresponding neighborhood tree T(u) is also deleted from the frame F_i. Now a transition has been made: the old frame has changed to a new frame F_{i+1} and the queue Q_i has changed to Q_{i+1}.
Animated graph drawings For each logical key frame F_i there is an animated graph drawing D(F_i), which consists of a sequence D_1, D_2, ..., D_k of drawings of F_i; each is a screen of F_i. The nodes common to F_i and F_{i-1} stay on the first screen of F_i as they were in the final screen of F_{i-1}. The screens of this animated drawing are computed using a Modified Spring Algorithm. This is based on Hooke’s law springs, but the strength of the springs varies. The change from one screen D_j to the next screen D_{j+1} is computed by a numerical method which converges to a stable configuration of the force system.
Animated graph drawings The Modified Spring Algorithm has many forces, including: * Hooke’s law springs for all edges, with varying strengths depending on whether the endpoints are focus nodes or not. * Gravitational repulsion forces for all non-edges. * Special gravitational forces between nodes in each neighbourhood. * Some further forces. The effect of these forces is to: * try to keep the queue of focus nodes in a left-right line; * keep node images disjoint; * radially display neighbourhoods around each focus node.
Spring model In the spring model, each node is replaced by a steel ring, and each edge is replaced by a Hooke’s law spring. The rings have a gravitational repulsion acting between them, and we can find a drawing which minimizes the energy.
Modified spring algorithm In order to address the specific criteria of on-line drawing, we add extra forces among the neighbourhoods N(v_i), N(v_{i+1}), …, N(v_{i+B-1}) of the focus nodes. These extra forces are used to separate the neighbourhoods so that the user can visually identify the changes. This extra force is also a Newtonian gravitational force.
The force model Suppose that F_i = (G_i, Q_i) is the logical frame which is currently being viewed on the screen, and G_i = (V_i, E_i). The total force applied on node v is given by equation (1) (the formula itself is not preserved in this transcript), where f_uv is the force exerted on v by the spring between u and v, and g_uv and h_uv are the gravitational repulsions exerted on v by another node u in F_i. (1)
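One plausible general form of the total force, written only from the verbal definitions on this slide, is sketched below. It is an assumption rather than the authors' formula: the spring term is summed over the edges at v, and the two repulsion terms are summed over the other nodes of the frame, with g_uv or h_uv taken to be zero for any pair on which that force does not act.

```latex
f(v) \;=\; \sum_{u \,:\, \{u,v\} \in E_i} f_{uv}
      \;+\; \sum_{u \in V_i,\; u \neq v} g_{uv}
      \;+\; \sum_{u \in V_i,\; u \neq v} h_{uv}
```

Under this reading, the drawing of a frame is stable when f(v) is zero (or close to zero) for every node v in V_i.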
An example of modified spring algorithm. In this frame, there are two focus nodes, x and y. The total force on node v is:
The mental map Our goal is to preserve the user’s mental map, while taking best advantage of the view screen. In OFDAV, we use three types of animation to assist the user in understanding the change in view. Fade Animation: We use shrinking/growing to help the user identify nodes that are disappearing/appearing. Camera Animation: This moves the whole drawing so that the new focus node moves toward the centre of the screen. Layout Animation: We use a complex system of forces based on Hooke’s law springs to adjust the layout between logical key frames.
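Camera animation can be sketched as a sequence of translations of the whole drawing. The example below assumes linear interpolation over a fixed number of steps; the function name `camera_frames` and the parameters are hypothetical, and the slides do not specify the timing OFDAV uses.

```python
def camera_frames(positions, focus, centre, steps):
    """Yield drawings that slide every node so the focus node ends at the centre.

    positions: dict node -> (x, y) before the animation.
    focus:     the new focus node.
    centre:    (x, y) of the centre of the screen.
    """
    fx, fy = positions[focus]
    dx, dy = centre[0] - fx, centre[1] - fy
    for s in range(1, steps + 1):
        t = s / steps                 # fraction of the move completed
        yield {n: (x + t * dx, y + t * dy) for n, (x, y) in positions.items()}


frames = list(camera_frames({"a": (0.0, 0.0), "b": (10.0, 0.0)}, "a", (50.0, 50.0), 4))
```

Because every node is shifted by the same offset in each step, the relative positions of the nodes are unchanged, which helps preserve the mental map while the view moves.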
Conclusion In the future, we need to implement the fast accessible linkage server that provides linkage information for all pages indexed by a particular search engine. This server can quickly produce and return the entire neighbourhood (in the graph theory sense) of the focus node (a user focused URL), including information about the neighbourhood.
Conclusion More sophisticated filtering strategies and rules should be created, since the existing filtering rules may sometimes cause useful information to be lost. The labelling problem has not been completely solved yet. If we put the entire long URL string into a box as its label, the boxes are enlarged and take up more display space. The issues are: 1) how to shorten the labels, and 2) how to keep these short labels unique. The investigation of these issues is proceeding.