Introduction
Pajek is a Windows program for the analysis and visualization of large networks with thousands or even millions of vertices. In Slovenian, the word pajek means spider.
Application
Pajek provides tools for the analysis and visualization of networks such as:
- collaboration networks,
- organic molecules in chemistry,
- protein-receptor interaction networks,
- genealogies,
- Internet networks,
- citation networks,
- diffusion (AIDS, news, innovations) networks,
- data-mining (2-mode) networks, etc.
See also the collection of large networks at:
Main goals
- to support abstraction by (recursive) decomposition of a large network into several smaller networks that can be treated further using more sophisticated methods;
- to provide the user with some powerful visualization tools;
- to implement a selection of efficient (subquadratic) algorithms for the analysis of large networks.
Six data structures in Pajek
- network: the main object (vertices and lines: arcs, edges); a graph, valued network, 2-mode or temporal network. Default extension: .net
- partition: nominal property of vertices. Default extension: .clu
- vector: numerical property of vertices. Default extension: .vec
- permutation: reordering of vertices. Default extension: .per
- cluster: subset of vertices (e.g. a class from a partition). Default extension: .cls
- hierarchy: hierarchically ordered clusters and vertices. Default extension: .hie
Network – .net
A network can be defined in an input file in different ways. Here are three of them.
1. List of neighbours (Arcslist / Edgeslist) (see test 1.net)
    *Vertices 5
    1 "a"
    2 "b"
    3 "c"
    4 "d"
    5 "e"
    *Arcslist
    1 2 4
    2 3
    3 1 4
    4 5
    *Edgeslist
    1 5
Explanation
- Data must be prepared in an input (ASCII) file. NotePad can be used for editing; the shareware editor TextPad is a much better choice.
- Words starting with * must always be written in the first column of the line. They indicate the start of a definition of vertices or lines.
- *Vertices 5 defines a network with 5 vertices. This must always be the first statement in the definition of a network.
- The definition of vertices follows: each vertex is given a label, written between quotation marks.
- *Arcslist declares a list of directed lines from a selected vertex (1 2 4 means that there are two lines from vertex 1, one to vertex 2 and another to vertex 4).
- Similarly, *Edgeslist declares a list of undirected lines from a selected vertex.
- No empty lines are allowed in the file: an empty line means the end of the network.
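The rules above can be sketched as a small reader. This is only an illustration, not Pajek's own parser: the function name parse_neighbour_lists is made up here, and it assumes labels contain no spaces.

```python
def parse_neighbour_lists(text):
    """Read a .net file given in *Arcslist / *Edgeslist form (illustrative sketch)."""
    labels, arcs, edges = {}, [], []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:                 # an empty line means the end of the network
            break
        if line.startswith("*"):     # *Vertices, *Arcslist, *Edgeslist
            section = line.split()[0].lower()
            continue
        parts = line.split()
        if section == "*vertices":
            labels[int(parts[0])] = parts[1].strip('"')
        elif section == "*arcslist":
            source = int(parts[0])
            arcs.extend((source, int(t)) for t in parts[1:])
        elif section == "*edgeslist":
            source = int(parts[0])
            edges.extend((source, int(t)) for t in parts[1:])
    return labels, arcs, edges


SAMPLE = '''*Vertices 5
1 "a"
2 "b"
3 "c"
4 "d"
5 "e"
*Arcslist
1 2 4
2 3
3 1 4
4 5
*Edgeslist
1 5
'''

labels, arcs, edges = parse_neighbour_lists(SAMPLE)
print(arcs)   # directed lines as (from, to) pairs
print(edges)  # undirected lines as (u, v) pairs
```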
Explanation
- Directed lines are defined using *Arcs, undirected lines are defined using *Edges. The third number in the rows defining arcs/edges gives the value/weight of the arc/edge.
- In the previous format (Arcslist / Edgeslist) values of lines are not defined, so that format is suitable only if all values of lines are 1.
- If values of lines are not important, the third number can be omitted (all lines get value 1).
- No empty lines are allowed in the file: an empty line means the end of the network.
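A minimal sketch of how the *Arcs / *Edges rows could be read, including the default value 1 when the third number is omitted. The function name parse_lines and the sample file are made up for illustration.

```python
def parse_lines(text):
    """Read *Arcs and *Edges rows as (u, v, value) triples (illustrative sketch)."""
    arcs, edges = [], []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:                 # an empty line means the end of the network
            break
        if line.startswith("*"):
            section = line.split()[0].lower()
            continue
        if section not in ("*arcs", "*edges"):
            continue                 # vertex definitions are skipped here
        parts = line.split()
        u, v = int(parts[0]), int(parts[1])
        value = float(parts[2]) if len(parts) > 2 else 1.0  # omitted value means 1
        (arcs if section == "*arcs" else edges).append((u, v, value))
    return arcs, edges


# Hypothetical file, not one of the test files from the slides.
SAMPLE = '''*Vertices 3
1 "a"
2 "b"
3 "c"
*Arcs
1 2 2.5
2 3
*Edges
1 3 4
'''

arcs, edges = parse_lines(SAMPLE)
print(arcs, edges)
```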
Network – .net
3. Matrix (see test 3.net)
    *Vertices 5
    1 "a"
    2 "b"
    3 "c"
    4 "d"
    5 "e"
    *Matrix
Explanation
In this format directed lines (arcs) are given in matrix form (*Matrix). To transform bidirected arcs to edges, use “Network>Create New Network>Transform>Arcs to Edges>Bidirected only”.
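The idea behind the matrix form and the "Bidirected only" transformation can be sketched as follows. The matrix is made up, and summing the two arc values when a pair is merged into an edge is an assumption of this sketch, not a statement about what Pajek does by default.

```python
def matrix_to_arcs(matrix):
    """Every nonzero entry in row i, column j is an arc i -> j with that value."""
    n = len(matrix)
    return {(i + 1, j + 1): matrix[i][j]
            for i in range(n) for j in range(n) if matrix[i][j] != 0}


def bidirected_to_edges(arcs):
    """Replace each pair of opposite arcs by one edge; other arcs stay arcs."""
    remaining, edges = {}, {}
    for (u, v), value in arcs.items():
        if u != v and (v, u) in arcs:
            if u < v:  # handle each bidirected pair once
                edges[(u, v)] = value + arcs[(v, u)]  # assumption: values are summed
        else:
            remaining[(u, v)] = value
    return remaining, edges


# Hypothetical 3 x 3 matrix: 1 <-> 2 is bidirected, 2 -> 3 is not.
M = [[0, 1, 0],
     [1, 0, 2],
     [0, 0, 0]]

remaining, edges = bidirected_to_edges(matrix_to_arcs(M))
print(remaining, edges)
```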
Additional definition of network
In addition, Pajek allows a precise definition of the elements used for drawing networks (coordinates of vertices, shapes and colors of vertices and lines, ...).
Example (see test 4.net):
    *Vertices 5
    1 "a" box
    2 "b" ellipse
    3 "c" diamond
    4 "d" triangle
    5 "e" empty
    ...
Draw: Layout of networks
- Energy: the network is treated as a physical system, and we search for the state with minimal energy.
  - Kamada-Kawai: with separate components, connected components can be tiled in a plane.
  - Fruchterman-Reingold: draws in a plane or in space, with a selectable repulsion factor.
- Eigenvalues: 2 or 3 selected eigenvectors become the coordinates of vertices. This can produce nice pictures.
Partition – .clu
Partitions are used to describe nominal properties of vertices, e.g., 1 for men, 2 for women.
Definition in input file (see test.clu):
    *Vertices 5
    1
    2
    ...
Vector – .vec
Vectors are used to describe numerical properties of vertices (e.g., centralities).
Definition in input file (see test.vec):
    *Vertices 5
    0.58
    0.25
    0.08
    ...
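Partition and vector files share the same shape: a *Vertices header followed by one value per vertex. A minimal reader might look like this; the function name read_vertex_values and the sample values are made up for illustration.

```python
def read_vertex_values(text, cast):
    """Read a .clu (cast=int) or .vec (cast=float) file into {vertex: value}."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    if header[0].lower() != "*vertices":
        raise ValueError("file must start with *Vertices n")
    n = int(header[1])
    values = [cast(v) for v in lines[1:]]
    if len(values) != n:
        raise ValueError(f"expected {n} values, found {len(values)}")
    # The k-th value belongs to vertex k (vertices are numbered from 1).
    return {k + 1: v for k, v in enumerate(values)}


# Hypothetical contents, not the real test.clu / test.vec.
CLU = "*Vertices 5\n1\n2\n2\n1\n2\n"
VEC = "*Vertices 5\n0.58\n0.25\n0.08\n0.10\n0.20\n"

partition = read_vertex_values(CLU, int)
vector = read_vertex_values(VEC, float)
print(partition, vector)
```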
Pajek project files
Loading objects one by one is time consuming, so it is convenient to store all data in one file, called a Pajek project file (.paj) (see test.paj).
- Project files can be produced manually using “File>Pajek Project File>Save”.
- To load objects stored in a Pajek project file, select “File>Pajek Project File>Read”.
Menu structure
Commands are placed in menus according to the following criterion:
- commands that need only a network as input are available in menu Net,
- commands that need two networks as input are available in menu Networks,
- commands that need two objects as input (e.g., a network and a partition) are available in menu Operations,
- commands that need only a partition as input are available in menu Partition,
- . . .
Global and local views on network
- A local view is obtained by extracting the sub-network induced by a selected cluster of vertices.
- A global view is obtained by shrinking the vertices in the same cluster to a new (compound) vertex. In this way relations among clusters of vertices are shown.
- A combination of local and global view is the contextual view: relations among clusters of vertices and selected vertices are shown.
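The local and global views can be sketched on plain Python data. The function names and the sample network are made up; dropping lines inside a cluster and summing values between clusters when shrinking are assumptions of this sketch.

```python
def extract_subnetwork(lines, partition, cluster):
    """Local view: keep only lines whose both endpoints lie in the cluster."""
    keep = {v for v, c in partition.items() if c == cluster}
    return [(u, v, w) for u, v, w in lines if u in keep and v in keep]


def shrink_network(lines, partition):
    """Global view: one compound vertex per cluster, lines between clusters."""
    shrunk = {}
    for u, v, w in lines:
        cu, cv = partition[u], partition[v]
        if cu != cv:  # assumption: lines inside a cluster are dropped
            shrunk[(cu, cv)] = shrunk.get((cu, cv), 0) + w  # assumption: values summed
    return shrunk


# Hypothetical valued arcs (u, v, value) and a partition into two clusters.
lines = [(1, 2, 5), (2, 3, 7), (3, 4, 1), (1, 4, 2)]
partition = {1: 1, 2: 1, 3: 2, 4: 2}

local_view = extract_subnetwork(lines, partition, cluster=1)
global_view = shrink_network(lines, partition)
print(local_view, global_view)
```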
Example
Imports and exports among 80 countries in 1994 are given, in units of 1000$ (see Country_Imports.net).
Partition according to continents (see Country_Continent.clu): 1 – Africa, 2 – Asia, 3 – Europe, 4 – N. America, 5 – Oceania, 6 – S. America.
- Local view: Operations>Extract from Network>Partition
- Global view: Operations>Shrink Network>Partition
Extracting Subnetwork
Operations>Extract from Network>Partition
Removing lines with low values
- Network>Info>Line Values
- Network>Create New Network>Transform>Remove>Lines with value>lower than (340000)
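The effect of removing lines with a value lower than a threshold is a simple filter. A minimal sketch, with made-up trade values and the threshold 340000 from the slide:

```python
def remove_lines_lower_than(lines, threshold):
    """Keep only lines whose value is at least the threshold."""
    return [(u, v, value) for u, v, value in lines if value >= threshold]


# Hypothetical (from, to, value) arcs; not taken from Country_Imports.net.
lines = [(1, 2, 500000), (2, 3, 340000), (3, 1, 1200)]

kept = remove_lines_lower_than(lines, 340000)
print(kept)
```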
Resources
The latest version of Pajek is freely available, for non-commercial use, at its home page:
- Download
- Text file into Pajek
- WoS to Pajek
- Tutorial
- Exploratory Social Network Analysis with Pajek
Visit the Pajek wiki for more information.
WoS2Pajek
The download link:
The new tutorial slides:
MontyLingua
- Download from: http://web.media.mit.edu/~hugo/montylingua/
- Unpack it and copy ‘montylingua-2.1’ to C:\Python26\Lib\site-packages
- Set up a new environment variable named ‘MONTYLINGUA’ with the value c:\Python26\Lib\site-packages\MontyLingua-2.1\Python
WoS2Pajek
Download the latest version of WoS2Pajek. Unpack it and double-click WoS2Pajek.py to open the main interface of the program.
WoS2Pajek Program
The current version of WoS2Pajek requires the following parameters to be given by the user:
- MontyLingua directory: path to the directory in which the MontyLingua package is installed;
- project directory: where the output files are saved;
- WoS file;
- maxnum: estimate of the number of all vertices (number of records + number of cited works), roughly 30*number of records;
- step: prints info about each k*step record as a trace; step = 0 means no trace;
- use ISI name / short name;
- make a clean WoS file without duplicates;
- boolean list [DE, ID, TI, AB] specifying which fields are sources of keywords.
Co-author network
- Read WA.net
- Network/2-mode network/2-mode to 1-mode/Columns
- Network/Create Partition/Components/Weak
- Operations/Network+Partition/Extract SubNetwork [1-*]
- Network/Create New Network/Transform/Remove/Loops
The result is WANew.net, which is a co-author network.
Question: which author has the most co-authors?
Bibliographic coupling network
- [Read Cite.net]
- Network/Create New Network/Transform/1-mode to 2-mode
- Network/2-mode Network/2-mode to 1-mode/Rows
- Network/Create Partition/Components/Weak
- Operations/Network+Partition/Extract SubNetwork [1-*]
Co-citation network
- [Read Cite.net]
- Network/Create Partitions/Degree/Output
- Operations/Network+Partition/Extract SubNetwork [1-*]
- Network/Create New Network/Transform/1-mode to 2-mode
- Network/2-mode network/2-mode to 1-mode/Columns
- Network/Create Partition/Components/Weak
- Operations/Network+Partition/Extract SubNetwork [1-*]
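What the Rows and Columns projections of a citation network compute can be sketched directly. In this illustration two citing works are coupled by each reference they share, and two cited works are co-cited by each work that cites both. The citation arcs and function names are made up.

```python
from collections import Counter
from itertools import combinations


def _shared_counts(groups):
    """Value of a pair = number of neighbours the two items have in common."""
    counts = Counter()
    for a, b in combinations(sorted(groups), 2):
        shared = len(groups[a] & groups[b])
        if shared:
            counts[(a, b)] = shared
    return counts


def bibliographic_coupling(cites):
    """cites: (citing, cited) arcs. Pairs of citing works with shared references."""
    references = {}
    for citing, cited in cites:
        references.setdefault(citing, set()).add(cited)
    return _shared_counts(references)


def co_citation(cites):
    """Pairs of cited works that are cited together by the same work."""
    citers = {}
    for citing, cited in cites:
        citers.setdefault(cited, set()).add(citing)
    return _shared_counts(citers)


# Hypothetical citation arcs: A and B both cite X and Y, C cites only Y.
cites = [("A", "X"), ("A", "Y"), ("B", "X"), ("B", "Y"), ("C", "Y")]

coupling = bibliographic_coupling(cites)
cocited = co_citation(cites)
print(dict(coupling), dict(cocited))
```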
Two-mode network
- One-mode network: each vertex can be related to each other vertex.
- Two-mode network: vertices are divided into two sets, and vertices can only be related to vertices in the other set.
Example
Suppose we have data as below:
    P1: Au1, Au2, Au5
    P2: Au2, Au4, Au5
    P3: Au4
    P4: Au1, Au5
    P5: Au2, Au3
    P6: Au3
    P7: Au1, Au5
    P8: Au1, Au2, Au4
    P9: Au1, Au2, Au3, Au4, Au5
    P10: Au1, Au2, Au5
The corresponding 2-mode network file (see two_mode.net):
    *vertices 15 10
    1 "P1"
    2 "P2"
    3 "P3"
    4 "P4"
    5 "P5"
    6 "P6"
    7 "P7"
    8 "P8"
    9 "P9"
    10 "P10"
    11 "Au1"
    12 "Au2"
    13 "Au3"
    14 "Au4"
    15 "Au5"
    *edgeslist
    1 11 12 15
    2 12 14 15
    3 14
    4 11 15
    5 12 13
    6 13
    7 11 15
    8 11 12 14
    9 11 12 13 14 15
    10 11 12 15
Transforming to valued networks
- The 2-mode network is transformed into an ordinary network whose vertices are the elements of the first subset using “Network>2-Mode Network>2-Mode to 1-Mode>Rows”.
- To get a network on the elements of the second subset, use “Network>2-Mode Network>2-Mode to 1-Mode>Columns”.
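A sketch of the two projections on the paper/author example data, with papers as rows and authors as columns. It assumes the value of a line is the number of shared neighbours (shared authors for two papers, shared papers for two authors) and omits loops; the function names are made up.

```python
from collections import Counter
from itertools import combinations

PAPERS = {
    "P1": ["Au1", "Au2", "Au5"],
    "P2": ["Au2", "Au4", "Au5"],
    "P3": ["Au4"],
    "P4": ["Au1", "Au5"],
    "P5": ["Au2", "Au3"],
    "P6": ["Au3"],
    "P7": ["Au1", "Au5"],
    "P8": ["Au1", "Au2", "Au4"],
    "P9": ["Au1", "Au2", "Au3", "Au4", "Au5"],
    "P10": ["Au1", "Au2", "Au5"],
}


def project_rows(two_mode):
    """Papers x papers: value = number of authors two papers share."""
    values = Counter()
    for (p, p_authors), (q, q_authors) in combinations(two_mode.items(), 2):
        shared = len(set(p_authors) & set(q_authors))
        if shared:
            values[(p, q)] = shared
    return values


def project_columns(two_mode):
    """Authors x authors: value = number of papers two authors wrote together."""
    values = Counter()
    for authors in two_mode.values():
        for a, b in combinations(sorted(set(authors)), 2):
            values[(a, b)] += 1
    return values


papers_net = project_rows(PAPERS)
authors_net = project_columns(PAPERS)
print(authors_net[("Au1", "Au5")])  # papers co-written by Au1 and Au5
```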
Basic information about a network
Basic information can be obtained with “Network>Info>General”, which is available in the main window of the program. We get:
- the number of vertices,
- the number of arcs and the number of directed loops,
- the number of edges and the number of undirected loops,
- the density of lines.
Additionally we must answer the question:
    Input 1 or 2 numbers: +/highest, -/lowest
where we enter the number of lines with the highest/lowest value, or the interval of values, that we want to output.
- If we enter 10, the 10 lines with the highest value will be displayed.
- If we enter -10, the 10 lines with the lowest value will be displayed.
- If we enter 3 10, the lines with the highest values from rank 3 to 10 will be displayed.
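The three kinds of answers can be mimicked on a list of valued lines. This is only a sketch of the rule as stated above; the function name select_lines and the sample lines are made up, and a two-number answer is assumed to consist of positive ranks.

```python
def select_lines(lines, answer):
    """Return the lines selected by an answer such as '10', '-10' or '3 10'."""
    numbers = [int(x) for x in answer.split()]
    ranked = sorted(lines, key=lambda line: line[2], reverse=True)  # highest first
    if len(numbers) == 1:
        k = numbers[0]
        if k >= 0:
            return ranked[:k]            # k lines with the highest value
        return ranked[::-1][:-k]         # |k| lines with the lowest value
    first, last = numbers
    return ranked[first - 1:last]        # ranks first..last among the highest


# Hypothetical (from, to, value) lines.
lines = [(1, 2, 5), (2, 3, 9), (3, 4, 1), (4, 5, 7), (5, 1, 3)]

print(select_lines(lines, "2"))
print(select_lines(lines, "-2"))
print(select_lines(lines, "2 3"))
```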
Metformin Network
Load the metformin network into Pajek.
Entitymetrics
Entitymetrics is defined as using entities (i.e., evaluative entities or knowledge entities) in the measurement of impact, knowledge usage, and knowledge transfer, to facilitate knowledge discovery.
Ding, Y., Song, M., Han, J., Yu, Q., Yan, E., Lin, L., & Chambers, T. (2013). Entitymetrics: Measuring the impact of entities. PLoS One, 8(8): 1-14.
Diameter of the network
Network/Create New Network/SubNetwork with Paths/Info on Diameter
Pajek returns only the two vertices that are the furthest apart.
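For a small unweighted network, the diameter and one pair of furthest vertices can be found with a breadth-first search from every vertex. This brute-force sketch is for illustration only (it is not claimed to be Pajek's method); unreachable pairs are ignored and the sample graph is made up.

```python
from collections import deque


def bfs_distances(adj, source):
    """Number of lines on a shortest path from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def diameter(adj):
    """Return (length, u, v) for one pair of vertices that are furthest apart."""
    best = (0, None, None)
    for s in adj:
        dist = bfs_distances(adj, s)
        t = max(dist, key=dist.get)
        if dist[t] > best[0]:
            best = (dist[t], s, t)
    return best


# Hypothetical path 1 - 2 - 3 - 4; every vertex must appear as a key.
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(diameter(adj))
```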
Components
- Strongly connected components: Network>Create Partition>Components>Strong
- Weakly connected components: Network>Create Partition>Components>Weak
The result is represented by a partition: vertices that belong to the same component have the same number in the partition.
Example: component.net
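The weak-components result can be sketched as a partition computed with union-find, ignoring the direction of lines. The function name and the sample network are made up, and the numbering of components is chosen arbitrarily here (in order of first appearance).

```python
def weak_components(n, lines):
    """Return {vertex: component number} for vertices 1..n (direction ignored)."""
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in lines:
        parent[find(u)] = find(v)

    numbering, partition = {}, {}
    for v in range(1, n + 1):
        root = find(v)
        numbering.setdefault(root, len(numbering) + 1)
        partition[v] = numbering[root]
    return partition


# Hypothetical network with two weak components: {1, 2, 3} and {4, 5}.
partition = weak_components(5, [(1, 2), (2, 3), (4, 5)])
print(partition)
```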
Bicomponents
- Network/Create New Network/......with Bi-Connected Components stored as Relation Numbers
- Bicomponents are stored in a hierarchy.
- Load USAir97.net and get the bicomponents with component size >3 (14 of them).
- The largest component contains 244 airports.
- Hierarchy>Extract Cluster (13); the result is stored in a cluster.
- Draw the cluster.
K-Cores
- A subset of vertices is called a k-core if every vertex from the subset is connected to at least k vertices from the same subset.
- K-cores can be computed using “Network>Create Partitions>K-Core” and selecting Input, Output or All core.
- The result is a partition: for every vertex its core number is given.
- In most cases we are interested in the highest core(s) only. The corresponding subnetwork can be extracted using “Operations>Extract from Network>Partition” and typing the lower and upper limit for the core number.
Example: see k_core.net
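Core numbers can be computed by repeatedly removing a vertex of minimum remaining degree. The sketch below covers the undirected ("All") case on a made-up graph; it uses a simple quadratic scan rather than an efficient bucket structure.

```python
def core_numbers(adj):
    """Return {vertex: core number} for an undirected graph given as adjacency sets."""
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)  # vertex of minimum remaining degree
        k = max(k, degree[v])               # core numbers never decrease while peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Hypothetical graph: triangle 1-2-3 with a pendant vertex 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
cores = core_numbers(adj)
print(cores)
```

Extracting the highest core then amounts to keeping the vertices whose core number equals the maximum.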
Betweenness Centrality
- Shows how nodes connect different clusters.
- Network>Create Vector>Centrality>Betweenness
- The result is a vector with the betweenness centrality value of each node.
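For reference, betweenness on a small unweighted, undirected graph can be computed with Brandes' algorithm. The sketch returns raw counts of shortest paths passing through each node; any normalization Pajek applies to the vector is not reproduced here, and the sample graph is made up.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness for an undirected graph given as adjacency sets."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the furthest vertices back to s.
        delta = dict.fromkeys(adj, 0.0)
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair was counted from both ends.
    return {v: value / 2 for v, value in bc.items()}


# Hypothetical path 1 - 2 - 3 - 4.
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(betweenness(adj))
```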
Closeness Centrality
- Network>Create Vector>Centrality>Closeness
- Shows how close one node is to all other nodes in the network.
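A minimal sketch of closeness for an unweighted graph: the number of other reachable nodes divided by the sum of shortest-path distances to them. Restricting the computation to reachable nodes is an assumption of this sketch, and the sample graph is made up.

```python
from collections import deque


def closeness(adj):
    """Closeness of every node in a graph given as adjacency sets."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        # Isolated nodes reach nobody; give them closeness 0.
        result[s] = (len(dist) - 1) / total if total else 0.0
    return result


# Hypothetical path 1 - 2 - 3: the middle node is closest to the others.
adj = {1: {2}, 2: {1, 3}, 3: {2}}
values = closeness(adj)
print(values)
```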
Shortest Path
Network/Create New Network/SubNetwork with Paths/... One Shortest Path between Two Vertices
- Enter the two vertices.
- Forget values on lines: Yes, if the search for the shortest path is based on lengths; No, if the search is based on the values of lines.
- Identify vertices in source network: No.
- The result is a new subnetwork containing the two selected vertices.
- Layout>Energy>Kamada-Kawai>Fix first and last
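The effect of the "Forget values on lines" choice can be sketched with Dijkstra's algorithm: with values forgotten every line counts as 1, otherwise the line values are added up. The function name, the directed reading of lines, and the sample data are assumptions of this illustration.

```python
import heapq


def shortest_path(lines, source, target, forget_values=True):
    """Return one shortest path as a list of vertices, or None if unreachable."""
    adj = {}
    for u, v, value in lines:
        adj.setdefault(u, []).append((v, 1 if forget_values else value))
        adj.setdefault(v, [])
    dist, prev = {source: 0}, {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical arcs (from, to, value): the direct line 1 -> 3 is short but "expensive".
lines = [(1, 2, 1), (2, 3, 1), (1, 3, 5)]
print(shortest_path(lines, 1, 3, forget_values=True))
print(shortest_path(lines, 1, 3, forget_values=False))
```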