
NETWORKS

Introduction

We collected more than 160 undirected, unweighted networks, chosen to cover as wide a range of network typologies as possible: as far as we know, this is the largest dataset of real-world graphs examined so far. For several of these graphs, the exact value of the diameter was previously unknown or only approximated. We also considered almost 20 synthetic graphs obtained from well-known generative models. For all the real-world networks in our dataset (which are available in the NDE format), we report the file name (Name) and an acronym (Acronym, used to identify the network in our papers), the number of vertices (n) and the number of edges (m) of the whole graph, the same two quantities for the maximum, that is, largest, connected component (nmcc and mmcc), the average number of breadth-first searches required by the iFUB algorithm to compute the diameter (runs), the diameter (Δ), and a reference to the original source of the data file (Source). Note that some graphs used in our papers are not publicly available because of a disclosure agreement with the original source: for these files, the corresponding paper provides a link to the website with information on how to obtain them.
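To make the meaning of the table columns concrete, the following is a minimal Python sketch that recomputes n, m, nmcc, mmcc and the diameter of the largest connected component from an edge list. It is not the iFUB algorithm: it runs one breadth-first search from every vertex of the component, which is only practical for small graphs, whereas the "runs" column reports how few searches iFUB needs. The function names (`build_adjacency`, `bfs_distances`, `largest_component`, `table_columns`) are illustrative, and the sketch assumes the graph is given as a list of vertex pairs, so isolated vertices that appear in no edge are not counted.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected, unweighted adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Return the hop distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def largest_component(adj):
    """Return the vertex set of a largest connected component."""
    seen, best = set(), set()
    for v in adj:
        if v in seen:
            continue
        component = set(bfs_distances(adj, v))
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def table_columns(edges):
    """Recompute n, m, nmcc, mmcc and the diameter by brute force."""
    adj = build_adjacency(edges)
    mcc = largest_component(adj)
    return {
        "n": len(adj),
        "m": sum(len(nb) for nb in adj.values()) // 2,
        "nmcc": len(mcc),
        "mmcc": sum(len(adj[v]) for v in mcc) // 2,
        # Diameter = largest eccentricity over the component (one BFS per vertex).
        "diameter": max(max(bfs_distances(adj, v).values()) for v in mcc),
    }
```

For example, `table_columns([(1, 2), (2, 3), (3, 4), (10, 11)])` describes a path on four vertices plus one separate edge.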

Autonomous Systems

These networks are induced by tracing routes on the Internet, typically referring to connections among Internet Service Providers.
Name Acronym n m nmcc mmcc runs Δ Source
as-skitter ASSK 1696415 11095298 1694616 11094209 8.8 31 snap
itdk0304_rlinks ITDK 192244 609066 190914 607610 9.0 26 sommer

Biological networks

These graphs refer to databases of physical, genetic and biological interactions.
Name Acronym n m nmcc mmcc runs Δ Source
Brady2 BRA2 1124 1321 789 1115 5.2 28 metexplore
Brady BRAD 1116 1330 763 1101 8.9 20 metexplore
Burk BURK 1028 1228 741 1058 5.0 22 metexplore
Caenorhabditis_elegans CAEN 4723 9842 4428 9659 16.7 13 metexplore
celegans_metabolic CELE 354 1501 346 1493 6.2 7 aarenas
Chla2 CHLA 1202 1413 809 1160 33.4 21 metexplore
coli1_1Inter_st COLI 418 519 328 456 11.0 13 urialon
Cupri CUPR 1060 1270 767 1094 5.0 24 metexplore
dip20090126_MAX DIP2 19928 41202 19928 41202 14.2 30 sommer
Drosophila_melanogaster DROS 10625 40781 10424 40660 19.2 12 interactome
Erw ERW0 969 1224 740 1098 5.0 18 metexplore
Esche2 ESC2 943 1314 810 1239 15.8 16 metexplore
Esche3 ESC3 997 1331 821 1233 6.9 19 metexplore
HC-BIOGRID HCBI 4039 10321 4039 10321 13.7 23 sommer
Homo_sapiens HOMO 1027 1166 707 968 16.1 20 metexplore
Homo HOM1 13690 61130 13478 61006 7.0 15 interactome
hprd_pp HPRD 9465 37039 9219 36900 7.0 14 hprd
interdom INTE 1706 78983 1654 78916 6.6 8 jena
ipfam IPFA 1334 12002 513 9370 5.0 12 ipfam
Mes MES0 1116 1348 771 1129 5.0 24 metexplore
Meta META 3648 5049 3078 4667 6.0 39 metexplore
Meth2 MET2 952 1155 699 1008 11.3 25 metexplore
Meth3 MET3 930 1142 704 1011 9.0 25 metexplore
Meth4 MET4 936 1153 710 1020 13.2 25 metexplore
Meth5 MET5 1001 1208 725 1046 9.8 25 metexplore
Meth6 MET6 1051 1278 781 1118 13.3 25 metexplore
Meth METH 956 1157 711 1016 9.0 25 metexplore
Mus_musculus MUSM 4610 5747 3745 5170 23.9 20 interactome
Mus MUS0 1187 1378 819 1142 7.3 22 metexplore
Mic MYC0 1340 1513 874 1226 14.4 21 metexplore
Plant PLAN 1762 2198 1412 1941 6.0 37 metexplore
ppi_dip_swiss PPID 3834 11958 3766 11922 7.7 12 lasagne
ppi_gcc PPIG 37333 135618 37333 135618 17.6 27 lasagne
Pseudo2 PSE2 977 1206 711 1035 5.0 24 metexplore
Pseudo4 PSE4 1082 1307 800 1143 7.4 22 metexplore
psimap PSIM 1028 11615 526 9524 8.8 11 synechonet
Ral RAL0 1077 1276 769 1091 7.0 21 metexplore
Rattus_norvegicus RATT 1914 2110 1415 1785 39.0 19 interactome
Rhizo2 RHI2 1138 1345 762 1102 5.0 24 metexplore
Rhizo RHIZ 1071 1323 777 1142 5.0 20 metexplore
Rhodo RHOD 957 1183 707 1034 6.0 25 metexplore
Salmo SALM 1006 1323 801 1203 10.3 20 metexplore
Shigi SHIG 982 1299 795 1193 9.6 19 metexplore
Sino SINO 986 1187 687 1001 7.0 23 metexplore
string STRI 2658 26805 2575 26757 68.9 9 string
yeast_bo YEAS 1846 2203 1458 1948 43.8 19 lasagne
yeastInter_st YEA1 688 1078 662 1062 7.5 15 metexplore
Yer2 YER2 956 1147 708 1006 19.0 19 metexplore

Citation networks

In these networks, nodes represent published papers or books, and edges represent citations.
Name Acronym n m nmcc mmcc runs Δ Source
cit-HepPh CIT1 34546 420920 34401 420827 18.6 14 snap
cit-HepTh CIT2 27770 352323 27400 352058 11.4 15 snap
citeseer CITE 259217 532040 220997 505327 5.0 52 citeseer
cit-Patents CITP 3774768 16518947 3764117 16511740 139.0 26 snap
cora CORA 2708 5278 2485 5069 33.0 19 sen
hep-th-citations HEPT 27400 352021 27400 352021 9.0 15 sommer

Collaboration networks

In these networks, nodes represent people (such as authors) and edges represent collaborations between them (such as co-authorship of a paper).
Name Acronym n m nmcc mmcc runs Δ Source
advogato ADVO 7418 48037 5272 45903 11.4 9 trustlet
ca-AstroPh CAAS 18771 198050 17903 196972 11.0 14 snap
ca-CondMat CACO 23133 93439 21363 91286 45.4 15 snap
ca-GrQc CAGR 5241 14484 4158 13422 28.6 17 snap
ca-HepPh CAH1 12006 118489 11204 117619 37.5 13 snap
ca-HepTh CAH2 9875 25973 8638 24806 15.2 18 snap
Cond_mat_95-99 COND 22015 58578 22015 58578 1368.8 12 tnet
dblp20080824_MAX DBLP 511163 1871070 511163 1871070 14.9 22 sommer
eva EVA0 7253 6724 4475 4662 18.4 18 lasagne
geom GEOM 6158 11898 3621 9461 18.6 14 pajek
imdb IMDB 908830 37588613 880455 37494636 20.0 14 imdb
jazz JAZZ 198 2742 198 2742 6.1 6 aarenas
MathSciNet MATH 391529 873775 332689 820644 11.6 24 cfinder
PGPgiantcompo PGPG 10680 24316 10680 24316 11.0 24 pajek

Communication networks

In these networks, nodes represent people and edges represent communication among them (such as emails and phone calls).
Name Acronym n m nmcc mmcc runs Δ Source
email-Enron EMA1 36691 183830 33695 180810 7.0 13 snap
email-EuAll EMA2 265214 365569 224832 340794 5.0 14 snap
email EMA3 1133 5451 1133 5451 25.5 8
wiki-Talk WIK1 2394385 4659564 2388953 4656681 11.8 11 snap

Electronic networks

These graphs correspond to adjacency matrices derived from finite element meshes or to stiffness matrices, or they arise in simulations for path optimization in digital electronic circuit design (as with road networks, almost all of these graphs have a narrow degree distribution).
Name Acronym n m nmcc mmcc runs Δ Source
144 144A 144649 1074393 144649 1074393 24586.7 40 partition
3elt 3ELT 4720 13722 4720 13722 882.2 65 partition
4elt 4ELT 15606 45878 15606 45878 986.2 102 partition
598a 598A 110971 741934 110971 741934 480.8 47 partition
add20 ADD2 2395 7462 2395 7462 6.6 15 partition
add32 ADD3 4960 9462 4960 9462 5.0 28 partition
auto AUTO 448695 3314611 448695 3314611 60121.8 82 partition
bcsstk29 BCS1 13992 302748 13830 302424 6758.0 32 partition
bcsstk30 BCS2 28924 1007284 28924 1007284 90.0 33 partition
bcsstk31 BCS3 35588 572914 35586 572913 3474.2 56 partition
bcsstk32 BCS4 44609 985046 44609 985046 34.7 79 partition
bcsstk33 BCS5 8738 291583 8738 291583 137.4 25 partition
brack2 BRAC 62631 366559 62631 366559 30.8 73 partition
crack CRAC 10240 30380 10240 30380 6.0 107 partition
cs4 CS4A 22499 43858 22499 43858 3273.9 75 partition
cti CTI0 16840 48232 16840 48232 4602.5 64 partition
data DATA 2851 15093 2851 15093 86.9 79 partition
fe_4elt2 FE4E 11143 32818 11143 32818 4150.8 121 partition
fe_body FEBO 44775 163734 30581 113424 5349.0 103 partition
fe_ocean FEOC 143437 409593 143437 409593 18184.4 229 partition
fe_pwt FEPW 36463 144794 36463 144794 16722.6 272 partition
fe_rotor FERO 99617 662431 99617 662431 1095.0 62 partition
fe_sphere FESP 16386 49152 16386 49152 7377.6 128 partition
fe_tooth FETO 78136 452591 78136 452591 492.0 48 partition
finan512 FINA 74752 261120 74752 261120 29670.8 87 partition
m14b M14B 214765 1679018 214765 1679018 265.2 51 partition
memplus MEMP 17758 54196 17758 54196 5.0 12 partition
s838_st S838 512 819 512 819 69.9 15 urialon
t60k T60K 60005 89440 60005 89440 249.2 649 partition
uk UK00 4824 6837 4824 6837 5.0 214 partition
vibrobox VIBR 12328 165250 12328 165250 607.5 10 partition
wave WAVE 156317 1059331 156317 1059331 3575.1 56 partition
whitaker3 WHIT 9800 28989 9800 28989 8.6 161 partition
wing_nodal WIN1 62032 121544 62032 121544 8257.6 92 partition
wing WIN2 10937 75488 10937 75488 931.2 26 partition

P2P networks

In these networks, nodes represent computers and edges represent an established connection among them.
Name Acronym n m nmcc mmcc runs Δ Source
p2p-Gnutella31 P2PG 62586 147891 62561 147877 21664.8 11 snap
p2p P2PZ 5380578 142038401 5380491 142038351 3588.0 9 lasagne

Product co-purchasing networks

In these networks, nodes represent products and edges link commonly co-purchased products.
Name Acronym n m nmcc mmcc runs Δ Source
amazon0302 AMA1 262111 899791 262111 899791 5.0 38 snap
amazon0312 AMA2 400727 2349868 400727 2349868 273.5 20 snap
Amazon0505 AMA3 410236 2439436 410236 2439436 5.0 22 snap
amazon0601 AMA4 403394 2443408 403364 2443311 18.8 25 snap

Road networks

In these networks, nodes represent intersections and road endpoints, and edges represent the roads connecting them (in terms of the range of variation of the node degrees, these networks are more similar to random graphs).
Name Acronym n m nmcc mmcc runs Δ Source
roadNet-CA ROA1 1965206 2766607 1957027 2760388 79035.8 865 snap
roadNet-PA ROA2 1088092 1541898 1087562 1541514 626.0 794 snap
roadNet-TX ROA3 1379917 1921660 1351137 1879201 40246.3 1064 snap

Social networks

In these networks, nodes represent people and edges represent interactions between them.
Name Acronym n m nmcc mmcc runs Δ Source
soc-sign-epinions SOC1 131827 711782 119130 704572 5.0 16 snap
soc-sign-Slashdot090221 SOC2 82140 500480 82140 500480 16.8 13 snap
soc-Slashdot0811 SOC3 77360 546486 77360 546486 6.3 12 snap
soc-Slashdot0902 SOC4 82168 582532 82168 582532 16.1 13 snap
soc-Epinions1 SOCE 75879 405739 75877 405738 13.5 15 snap
trust TRUST 49288 381217 49288 381217 5.0 14 trustlet
wiki-Vote WIK2 7115 100761 1637868 15205016 112.6 7 snap
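
Because the maximum connected component is a subgraph of the whole graph, every table row should satisfy nmcc ≤ n and mmcc ≤ m. The following is a minimal Python sketch of a row parser with such a consistency check. It assumes that the fields of a row are separated by whitespace and that the Source field may be missing; the names `Row`, `parse_row` and `is_consistent` are illustrative and not part of any tool distributed with the dataset.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    name: str
    acronym: str
    n: int
    m: int
    n_mcc: int
    m_mcc: int
    runs: float
    diameter: int
    source: Optional[str]


def parse_row(line):
    """Parse one whitespace-separated table row; Source is optional."""
    fields = line.split()
    if len(fields) not in (8, 9):
        raise ValueError(f"expected 8 or 9 fields, got {len(fields)}")
    source = fields[8] if len(fields) == 9 else None
    return Row(
        fields[0], fields[1],
        int(fields[2]), int(fields[3]),
        int(fields[4]), int(fields[5]),
        float(fields[6]), int(fields[7]),
        source,
    )


def is_consistent(row):
    """The largest component cannot exceed the whole graph in size."""
    return row.n_mcc <= row.n and row.m_mcc <= row.m
```

For example, the check flags the wiki-Vote row as listed above, since its nmcc and mmcc values exceed n and m.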

Synthetic networks

These graphs are generated according to the following models: Erdős-Rényi, geometric random (unit-disk and unit-square), forest-fire and Kronecker.
Name Acronym n m nmcc mmcc runs Δ Source
forest1e4_2 FOR1 10000 153925 10000 153925 11.5 10 snap
forest1e4 FOR2 10000 49354 10000 49354 6.6 18 snap
forest5e4_2 FOR3 50000 1095697 50000 1095697 8.0 12 snap
forest5e4 FOR4 50000 243441 50000 243441 12.9 21 snap
Gnp_1e3 GNP1 1000 3854 1000 3854 534.7 6 lasagne
Gnp_1e4 GNP2 10000 59849 10000 59849 7753.2 6 lasagne
Gnp_2e3 GNP3 2000 8994 2000 8994 1308.0 7 lasagne
Gnp_5e3 GNP4 5000 24809 5000 24809 3789.0 7 lasagne
kron14 KRO1 8156 24506 8156 24506 7004.9 4 snap
kron16 KRO2 30429 65534 29722 65160 46.8 12 snap
ud_1e3 UD13 1000 16727 1000 16727 22.6 22 lasagne
ud_1e4 UD14 10000 313726 10000 313726 25.8 48 lasagne
ud_2e3 UD23 1999 35697 1999 35697 89.4 29 lasagne
ud_5e3 UD53 4998 97027 4998 97027 29.2 43 lasagne
us_1e3 US13 1000 14334 1000 14334 17.5 17 lasagne
us_1e4 US14 10000 242533 10000 242533 48.7 38 lasagne
us_2e3 US23 2000 37928 2000 37928 22.3 20 lasagne
us_5e3 US53 5000 135833 5000 135833 50.5 25 lasagne
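
As an illustration of the simplest of these models, the following is a minimal Python sketch of an Erdős-Rényi G(n, p) generator, in which each of the n(n-1)/2 possible edges is included independently with probability p. The generator and the parameters actually used to produce the Gnp files are not stated here, so the function name `gnp_edges` and its `seed` argument are assumptions made for illustration only.

```python
import random
from itertools import combinations


def gnp_edges(n, p, seed=None):
    """Sample an undirected G(n, p) graph as a list of (u, v) pairs, u < v."""
    rng = random.Random(seed)  # local generator, reproducible when seeded
    return [
        (u, v)
        for u, v in combinations(range(n), 2)
        if rng.random() < p  # keep each possible edge with probability p
    ]
```

The expected number of edges is p·n(n-1)/2, so for n = 1000 a value of p near 0.0077 gives an expected edge count close to the 3854 edges listed for Gnp_1e3. The quadratic loop over all vertex pairs is adequate for graphs of this size but not for very large n.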

Web networks

In these graphs, nodes represent web pages and edges represent hyperlinks.
Name Acronym n m nmcc mmcc runs Δ Source
arabic_2005 ARAB 22743881 553903073 22634275 552231867 11.0 47 webgraph
cnr_2000 CNR2 325557 2738969 325557 2738969 5.0 34 webgraph
enwiki-20071018 ENWI 2070486 42336692 2070367 42336614 497.1 9 cfinder
eu_2005 EU20 862664 16138468 862664 16138468 15.6 21 webgraph
GoogleNw GOOG 15763 148585 15763 148585 78.8 7 cfinder
in_2004 IN20 1353703 13126172 1353703 13126172 16.0 43 webgraph
web-BerkStan WEBB 685230 7600595 654782 7499425 5.0 208 snap
web-Google WEBG 875713 5105039 855802 5066842 23.4 24 snap
web-NotreDame WEBN 325729 1090108 325729 1090108 5.0 46 snap
web-Stanford WEBS 281903 1992636 255265 1941926 5.0 164 snap
web WEB0 39454463 783027125 39252879 781439892 90.5 32 complexnetworks

Word networks

In these networks, nodes represent words and edges represent their adjacency in a text.
Name Acronym n m nmcc mmcc runs Δ Source
darwinBookInter_st DARW 7381 46281 7377 46279 9.0 8 urialon
eatRS EATR 23219 304937 23219 304937 3211.8 6 pajek
eatSR EATS 23218 304934 23218 304934 498.3 6 pajek
frenchBookInter_st FREN 8325 23841 8308 23832 22.4 9 urialon
japaneseBookInter_st JAPA 2704 8300 2698 8297 8.0 8 urialon
spanishBookInter_st SPAN 11586 45129 11558 45114 5.0 10 urialon
ydata-ysm-advertiser YADV 653260 2278448 653260 2278448 5.0 24 yahoo

Sources

  1. Sandbox (Webscope from Yahoo! Labs), 2010.
  2. WebGraph, 2001.
  3. cfinder (Clusters and Communities, overlapping dense groups in networks), 2005.
  4. M. Latapy (FTP public directory), 2007.
  5. Chris Walshaw (University of Greenwich Graph Partitioning Archive), 2000.
  6. J. Duch and A. Arenas. Community identification using Extremal Optimization. Physical Review E, 72:027104, 2005.
  7. IMDB (Internet Movie DataBase), 1990.
  8. Pajek dataset, 2006.
  9. tNet.
  10. TrustLet, 2007.
  11. CiteSeer, 1997.
  12. The Cora dataset, 2007.
  13. SNAP (Stanford Network Analysis Package), 2009.
  14. MetExplore: a web server to link metabolomic experiments and genome-scale metabolic networks, 2010.
  15. Uri Alon Lab, 2002.
  16. Christian Sommer’s homepage, 2009.
  17. Interactome, 2003.
  18. HPRD (Human Protein Reference Database), 2003.
  19. The Jena Protein-Protein Interaction Website, 2009.
  20. iPfam (Protein Domain Interactions Database), 2009.
  21. LASAGNE (Laboratory of Algorithms, modelS and Analysis of Graphs and NEtworks), 2011.
  22. SynechoNET (Integrated protein-protein interaction database of Synechocystis sp. PC6803), 2007.
  23. STRING (functional protein association networks), 2000.