Do we live in a small world? Measuring the
Spanish-speaking Blogosphere
Fernando Tricas
(Departamento de Informática e Ingeniería de Sistemas, U. Zaragoza, Spain)
Víctor Ruiz (Blogalia.com)
Juan J. Merelo
(Depto. Arquitectura y Tecnología de Computadores, U. Granada, Spain)
30th April 2003
Abstract
The blogosphere is the community of bloggers, people or collectives who share
information and opinions ordered chronologically. The Spanish-speaking blogosphere contains several thousand blogs; despite its small size compared to the
English-speaking (or maybe global) blogosphere, its characteristics are somewhat different.
In general, it could be said that the Spanish blogosphere has not reached critical mass yet. Moreover, the main reference of the Spanish-speaking blogosphere
is still the English-speaking web; most links found point outside the Spanish-speaking web.
In particular, it is still quite uncommon for news items seen or generated in the
Spanish blogosphere to become popular throughout it; when this happens, most of the
time it is because they echo the English blogosphere. There is also an “increasing returns” phenomenon: most bloggers (and readers of blogs) concentrate
on a few blogging sites (such as Blogalia or BarraPunto), which therefore dominate
the link space of the whole blogosphere. Finally, there is a third characteristic:
the Spanish-speaking blogosphere is slower than the English-speaking one; ideas,
topics and links spread more slowly.
This paper will show our experience in developing blogging tools, in particular, the “Blogómetro” (http://blogometro.blogalia.com), which is an open source
program that checks on a daily basis the link space in the Spanish-speaking blogosphere, in a similar way to BlogDex or Daypop, which check the English-speaking
blogosphere (and a small part of the global one). We will show and analyze data
gathered from the end of the year 2002 to the beginning of 2003.
Fernando Tricas's work is partially supported by the Spanish research project CICYT TIC2001-1819.
The authors want to thank Adela Torres for her English revision. Any remaining language faults
were added later by us.
1 Introduction
The Spanish blogosphere is, obviously, part of the global blogosphere, and is roughly
defined as the set of blogs (or, sometimes, blog-looking web pages) that are written
in Spanish (in any part of the world) or in any other of the official languages in Spain
(Catalan, Basque, Galician). We have also considered blogs written in other languages,
if they are written by people living in countries that naturally fit in the Spanish blogosphere's area of influence (for example, we heard recently about some Venezuelan
bloggers who write in English to draw mainstream attention to their country, or the
‘Trilingual blog’, http://trilingual.blogspot.com/).
It is quite an active group, albeit often disregarded by the Spanish¹ traditional media,
which, when they write a general article on blogs, very often mention a popular
English-speaking blog (for instance Megnut, Meg Hourihan’s blog) instead of the
most popular Spanish blogs (for which there are several candidates, depending on the
measure we use, as we will see).
Why would anyone want to undertake measurements on this? First of all, because
we can. That is, the Spanish blogosphere has a size that is barely manageable by
the analysis software we use; any bigger size would probably crash it, or the study
would require other strategies. Second, because we want to understand it, understand
its mechanisms, and see the position of each blog (notably our own) within this blogosphere. Third, because we want to look at its possible problems (fragmentation, disregard of itself, a strong orientation to the English blogosphere²), and point out possible solutions. Fourth, because we believe that this work will help to improve self-awareness,
bring more people to the activity, and help to establish relations. Furthermore, we hope that this knowledge will enable us to suggest a mechanism to improve
communication within the Spanish blogosphere, and make it more visible to the non-blogging
community.
The rest of the paper is organized as follows: the next section explains the
framework we have used for this analysis; section 3 is devoted to big numbers,
overall macroscopic measures of the Spanish blogosphere; it is followed by section 4, which studies the Spanish blogosphere as a social network. Finally, in section 5 we
present our conclusions and some hints on how this work will continue.
2 Methods and Context
In this section we concentrate on our tools, make some comments
about the Spanish blogosphere, and give some pointers to other projects of
interest.
¹ In the context of this work we will often use this term not only in the sense of ‘from Spain’ or ‘speaking
in Spanish’ but also to refer to things related to the blogs considered, as described above.
² Again, we are talking about blogs written in English, independently of location.
2.1 The tools
Our project started at the beginning of the Summer of 2002, and it is in a ‘beta’ stage.
All data in this study have been taken from the Blogómetro, a suite of tools whose
main visible aspect is its blog (http://blogometro.blogalia.com), hosted
in Blogalia (http://www.blogalia.com/). There, a list of fresh links taken
from our list of blogs (ranked by the number of sites pointing to them) is published
daily. The Blogómetro is an open–source collaborative project, offering our research
to the community. It is open to the participation of interested people. In this sense, not
only its source code is available at the project page (http://sourceforge.net/projects/blogometro),
but also the list of sites scanned daily.
The bot that crawls the sites is written in Python; every day, early in the morning,
it checks all blogs in the list. From each raw HTML file, it scrapes the links and stores
them in a database if they have not been included before. Consequently, a link is
considered new or fresh if its URL has not been seen before in that particular blog; that
means that if a blog refers to another several times by the same URL, it counts as a single
reference; it also means that links included in the blogroll are counted only
once (during the lifetime of the data).
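The freshness rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the Blogómetro code: the names `LinkExtractor` and `fresh_links` are hypothetical, and an in-memory set of (blog, URL) pairs stands in for the database table of URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against the blog URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def fresh_links(blog_url, html, seen):
    """Return the links not seen before in this blog, recording them in `seen`.

    `seen` is a set of (blog_url, link_url) pairs; it is an assumed stand-in
    for the real database, whose schema is not reproduced here.
    """
    parser = LinkExtractor(blog_url)
    parser.feed(html)
    fresh = []
    for link in parser.links:
        key = (blog_url, link)
        if key not in seen:
            seen.add(key)
            fresh.append(link)
    return fresh
```

Because the key includes the blog, the same URL counts once per blog: a second crawl of the same page yields no fresh links, while the same URL found in a different blog is fresh there.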
The DBMS is PostgreSQL, a free, open source program that is easily interfaced
via languages such as Perl or Python. The database contains only two tables, one for
the blogs themselves, with URL and description, and another for the URLs. Data has
been stored for approximately 4 months, from November 15, 2002, to April 15, 2003.
The database was purged in two ways:
First of all, as the amount of data grows very fast, we need to delete from time
to time what we consider ‘irrelevant links’; that is, links that only appear once in
the database, and are old enough (two months at this moment), in order to keep
the database manageable with our hardware.
Second, and only for the analysis of the social network aspects, self–links have
been excluded when possible. This can only be done when a link to a blog
includes the blog’s own URL: people use their own customized
publishing systems (frames, iframes, a skeleton in one domain and postings in
a different one, ...), which makes this part of the work difficult.
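The two purging rules can be sketched as follows. This is only an illustration under stated assumptions: records are plain tuples rather than database rows, "two months" is approximated as 60 days, and the function names `purge_irrelevant` and `is_self_link` are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def purge_irrelevant(records, today, max_age_days=60):
    """Drop links that appear only once in the data and are older than the cutoff.

    `records` is a list of (blog_url, link_url, first_seen) tuples, where
    `first_seen` is a datetime.date. The 60-day default approximates the
    two-month threshold mentioned in the text.
    """
    counts = Counter(link for _, link, _ in records)
    cutoff = today - timedelta(days=max_age_days)
    return [
        (blog, link, seen_on)
        for blog, link, seen_on in records
        if counts[link] > 1 or seen_on >= cutoff
    ]


def is_self_link(blog_url, link_url):
    """Heuristic: treat a link as a self-link when it starts with the blog's URL."""
    return link_url.startswith(blog_url)
```

The prefix test is deliberately naive; as noted above, it misses blogs whose postings live under a different domain than their front page.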
The addition of new blogs to our list is done by hand and, even though we have
found moderate interest in the project, we have not received many submissions of
sites from other people, so the list is mainly ours (with the pros and cons this may
have).
2.2 About the Spanish blogosphere
As far as we know, the weblogging phenomenon in the Spanish blogosphere started with BarraPunto
(http://barrapunto.com/), a collective weblog devoted to news and discussion about free and open source software. In fact, it started as a Spanish clone of Slashdot (http://slashdot.org/). Some features make BarraPunto different from
Slashdot (and more interesting from the point of view of blogging and, in particular,
personal blogging); let us point out two of them.
Table 1: Number of blogs hosted in popular sites in the Spanish blogosphere

Site                      Number of blogs
Blogging sites
  BarraPunto              648
  Blogspot                315
  Blogalia                164
  Pitas.com               24
  Antville                16
General hosting sites
  i! (España)             34
  geocities.com           22
  cjb.net                 19

The ‘miBarrapuntos’: an adaptation of other ‘my’ services that allows people to
create their own BarraPunto-like sites, with support for comments, sections, topics, and even collaborative weblogs. It is a more advanced tool than the ‘Diaries’
found in the standard Slash code.
Ecolutions: a mechanism to take any story posted anywhere on the site and
copy it (maybe with some editing, and with a link to the original story) to one's
own blog.
In parallel, many people started blogging using some of the ‘standard’ sites, mainly
BlogSpot (http://blogspot.com/) and others, but also home-made blogs (using free or
paid hosting, and free or commercial tools). Finally, let us comment on the site that hosts our tools. Blogalia (http://www.blogalia.com/) was created
at the beginning of last year, and it has become a populous site with a very strong
sense of community. It started as a personal project of one of us, Víctor Ruiz, to provide a simple and usable tool to Spanish–speaking people interested in starting a blog.
It was also planned as a site where established bloggers could find a friendlier tool.
In Table 1 we can see the sites where most of the Spanish blogs are hosted. They
account for approximately half of the total number of blogs in our list, so there is a wide
range of blogs made with very different tools and hosting services. They have been separated into
sites specifically oriented to the hosting of blogs and sites oriented to general
free hosting. It is interesting to note that we are not aware of blogs hosted in some of
the other popular blog–hosting sites. Further research will be needed in this area; we
hope that people will start sending sites when our project is more widely known.
As a final remark, let us note that some bloggers use these same hosting
sites hidden behind their own domain, so this table may show a slightly different picture
from reality.
2.3 Related initiatives
We want to talk here about other initiatives born in the Spanish blogosphere that partially overlap with our work, either on the community-building side or in the
measuring aspects.
First, let us concentrate on the community aspects:
During last Christmas, a game that many of you may have played as children
was proposed in the Spanish blogosphere: the ‘Ciberamigo Invisible’
(http://www.awacate.com/amigo/), a blogging version of this well-known game,
also known as ‘Secret Santa’ in some places. The idea is to exchange gifts within a
group of people in such a way that each person has to give something to another
member of the group, with pairs selected at random. In our case the gifts
were virtual things such as graphics, banners, texts, ...
It was an important social success, with around one hundred participants. (Anecdotally, a search in Google (http://www.google.com/) for ‘Amigo invisible’ gives the page of the event as the first result at this moment.)
Concentrating on more technical aspects, and as an implementation of ‘Ridiculously Easy Group Forming’
(http://www.myelin.co.nz/cgi-bin/wcswiki.pl?RidiculouslyEasyGroupForming), Philip Pearson created ‘The internet Topic Exchange’ (http://topicexchange.com/). At
this site, anybody can create a ‘channel’ dedicated to any topic of interest. Then, anybody blogging about this topic can send an adequately crafted
request and get their contribution listed on the channel page. Surprisingly, the
most active topics at this moment belong to the Spanish blogosphere: the ‘bitacoras’ channel (http://topicexchange.com/t/bitacoras/, whose
main topic is blogging about blogs) and ‘directorio de blogs hispanos’
(http://topicexchange.com/t/directorio_blogs_hispanos/, whose
objective is to be a way to publicize newborn blogs).
Finally, there are a number of directories that try to offer comprehensive listings
of blogs with different classifications but, as far as we know, none of them has a
list longer than ours.
Now, let us talk about some more technical approaches.
The ‘vecindario’ (http://www.pisotrece.com/vecindario/) is an
implementation of the ‘Blogging Ecosystem’ (http://www.myelin.co.nz/ecosystem/)
that uses a list of blogs smaller than ours (around 700
blogs) but shows interesting results.
Very recently, in the last few days, another project has been presented: the ‘Mapa de la
Blogosfera hispana’ (http://www.hiperespacio.com/blogosfera/),
which is a hand–made graphic representation of the relations within the Spanish blogosphere as perceived by its author.
From some of these projects, and also from our own experience, the conclusion is that it is very difficult to get an idea of the size and extent of our
blogosphere. One of the problems detected is that not many blogs ping sites like Weblogs.com (http://www.weblogs.com/), which would allow us (and others) to try
some kind of auto–discovery. Most of the other blog–related tools rely on this site, so
we are out of luck with this kind of automation.
Figure 1: Evolution of the number of fresh links per day (x axis: date, with a weekly grid from 01/06 to 03/31; y axis: number of links, from 0 to 4000).
3 Big Numbers
During the period of study 281648 links were observed (109687 excluding self-links
and purged entries), which yields an average of 1573 links per day; considering the
number of blogs, this makes an average of 1.14 links per blog per day. If we
assume that each new story posts a new link, we will have around 1500 new stories
a day in the Spanish blogosphere. Not a big deal, but at least we have a ballpark
estimate. Another measure that can be of interest is the activity of the blogs in our
list: during the last month 1160 have posted at least one story with a link. We can
compare these data with the activity in January (1299 blogs) and in November (943
blogs). We can see the number of daily links in Figure 1. In this figure the grid shows
weekly periods, allowing us to see some periodic behavior. There are some spurious
peaks (too high, or zero) due to problems with our spider (network failure, bugs in the
program, accidental deletion of some data that is recovered in the same scraping, ...).
From the point of view of blogging, the week clearly starts on Mondays and runs
until the weekend: at the end of each weekly period the number of links always shows a decreasing pattern; there is almost no life during the weekends. Our guess is that there are
many techies blogging from work (even as part of their work, as a way to exchange
information with others) and that during weekends they tend to be disconnected. If we
pay more attention, we can discover a curious dip in most of the weeks,
corresponding to Wednesdays (in some weeks, Tuesdays): it seems that people start the
week wanting to blog, and need to relax in the middle of the week before resuming
the activity. Finally, let us point out the difference in size in the weeks before
the one starting on Jan 27, which corresponds to the last purge of the database. In any case (and supporting our definition of uninteresting links), the weekly structure is also
observable in the sets of data that have been purged.
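The weekly pattern discussed above can be checked numerically by averaging the daily counts per weekday. The sketch below assumes the daily totals are available as a mapping from dates to link counts; the function name `weekday_profile` and the input format are illustrative assumptions, not part of the Blogómetro.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_profile(daily_counts):
    """Average number of fresh links per weekday.

    `daily_counts` maps a datetime.date to the number of fresh links
    observed on that day (an assumed input format).
    """
    totals = defaultdict(int)
    days = defaultdict(int)
    for day, count in daily_counts.items():
        name = WEEKDAYS[day.weekday()]  # weekday() returns 0 for Monday
        totals[name] += count
        days[name] += 1
    # Only report weekdays that actually occur in the data.
    return {name: totals[name] / days[name] for name in WEEKDAYS if days[name]}
```

With real data, a weekend slump would show up as low "Sat" and "Sun" averages, and the mid-week dip as a "Wed" average below its neighbours. Spurious days (spider failures) should probably be removed before averaging.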
What was the most popular link during that period? Unsurprisingly, the first 20
links or so are taken by banners that have appeared in weblogs, by popular weblogs such as Slashdot (number 2), and by blog hosting sites and software such as Slash,
Blogger, Blogspot and Blogalia. BarraPunto (http://www.barrapunto.com)
starts to show up here, first by itself, and then by having its banners among the most
pointed-to links. The first real link is http://www.librodenotas.com/mt/prestige.html
(75 links), a (quite critical) page on the Prestige wreck, which
was part of a campaign to Google–bomb the word prestige. It obviously succeeded.
The daily newspapers “El País” and “El Mundo” show up roughly the same number of times, 48 and 45, with El País having a slight edge. However, this counts
only references to the main page, not to particular news items. This edge
is lost when we take into account all references: “El Mundo” is more than three times as popular as “El País”, 887 vs 278; this is not surprising, since El País recently switched
to a pay-per-content model. That is why it is almost reached by “Periodista Digital” (http://www.periodistadigital.com, 204 links), which usually makes
some of El País's content publicly available under the “fair quoting” provision of the
copyright act. Looking at individual blogs, Mini-D (http://www.minid.net),
a popular weblog on design and current events, is the most popular one with 151
links, and Cuaderno de Bitácora (http://rvr.blogalia.com/), one of our own
(Ruiz’s), is second with 130 links. The links from one point of the blogosphere
to another represent roughly one tenth of the total, namely 12533 for the studied
period. Within these links, once again, a popularity contest takes place, with Libertonia (http://libertonia.escomposlinux.org) winning hands down. However, many of the links to Libertonia come from its own blog-like diaries, so maybe we
should consider Libro de Notas (http://www.librodenotas.com) the winner³.
4 Spanish blogosphere as a social network
The data obtained from the Blogómetro have been analyzed using a number of software tools, most of which have crashed under the load (especially GraphViz and
UCINET, several times). However, we have managed to obtain some useful information from them. The first analysis performed was to check
whether the Spanish blogosphere follows a power law. We fitted a power law using the
open-source tool GnuPlot, resulting in [...] and [...]. However, the chi-square value was around 6, much bigger than one, which means that the model does
not fit the data well. Data and function are shown in Figure 2. Our data thus
differ from those published by Kottke in
http://www.kottke.org/03/02/030212screw_the_po.html, where he finds a
perfect fit for the (global? English–speaking?) blogosphere, with an exponent of -0.83.
³ We also live in the blogosphere, so let us share with you our ranks: #7 (Ruiz’s), #13 (Merelo’s) and #14
(Tricas’s), just in case you wanted to know.
Figure 2: “Powerlaw” distribution of links per blog (plot title: Link vs blog distribution; series: data, fitted curve); the x axis represents the blogs ordered
by number of incoming links (rank, from 0 to 1200), and the y axis the number of links (from 0 to 350). Data points have been
fitted to [...]; however, the fit is not good enough.
Our working hypothesis here is that there is some fundamental critical mass that makes
a certain community behave like a power law; unfortunately (or maybe fortunately), the
community we belong to does not seem to have reached that level yet.
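As an illustration of the kind of check described in this section, the sketch below fits links = a * rank**b to rank-ordered data and computes a goodness-of-fit figure. It is not a reproduction of the GnuPlot run: GnuPlot's fit uses nonlinear least squares (Marquardt-Levenberg), whereas this sketch swaps in ordinary least squares on log-log axes. It also assumes that the reported value is the sum of squared residuals divided by the degrees of freedom. The function names are hypothetical.

```python
import math


def fit_power_law(ranks, links):
    """Least-squares fit of links = a * rank**b on log-log axes.

    Returns (a, b). Points with non-positive values are skipped because
    their logarithm is undefined.
    """
    pts = [(math.log(r), math.log(y)) for r, y in zip(ranks, links) if r > 0 and y > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    b = sxy / sxx                       # slope on log-log axes = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept on log-log axes = log(a)
    return a, b


def reduced_chi_square(ranks, links, a, b, n_params=2):
    """Sum of squared residuals (unit weights) divided by the degrees of freedom."""
    residuals = [(y - a * r ** b) ** 2 for r, y in zip(ranks, links)]
    return sum(residuals) / (len(residuals) - n_params)
```

A value near or below one would suggest the curve describes the data about as well as the assumed unit errors allow; a value well above one, as reported here, suggests it does not.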
Other superstructures on the blogosphere were analyzed using Visone (available
from http://www.visone.de). Visone is a tool for plotting maps of the
social network under analysis, as well as performing some measurements on it. One of
these measures is betweenness, which roughly measures how often a particular
blog is found when traveling along links from one blog to another. In that particular
sense, as can be seen in Figure 3, eCuaderno (the blog of another BlogTalk speaker,
José Luis Orihuela, #208, http://orihuela.blogspot.com) takes the central
place among all Spanish blogs (with almost 6% centrality). Several others, including
http://bitacoras.net (#8) and http://www.gistain.net/ (#14), also
occupy prominent places.
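For readers unfamiliar with betweenness, the following is a minimal sketch of how it can be computed on a directed, unweighted link graph using Brandes' algorithm. It is not Visone's implementation; the function name `betweenness` and the dictionary-based graph format are assumptions made for illustration.

```python
from collections import deque


def betweenness(graph):
    """Betweenness centrality for a directed, unweighted graph (Brandes' algorithm).

    `graph` maps each node to a list of the nodes it links to (no duplicates).
    Returns raw scores: for each node, the number of shortest paths between
    other pairs of nodes that pass through it (fractional when paths tie).
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    score = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph.get(v, ()):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies in reverse BFS order.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score
```

Dividing each score by the sum of all scores gives a percentage share; whether the 6% figure quoted above was normalized this way is an assumption.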
These “central” blogs play a prominent role within the blogosphere: they register memes in the community and expose them so that they can be picked up by
other blogs, and thus act as veritable “meme mills” that spread memes throughout
the Spanish blogosphere. This fact is also reflected in two other measures: “hub”
and “authority”. The first hub is, once again, Libertonia, but taking into account that
most of its links are probably automatically generated, we will consider the second, ‘Fernand0’s barrapunto’, a journal hosted by BarraPunto and authored by one of us
(Tricas), the hub of the Spanish community; curiously enough, our other two BarraPuntos (Víctor’s and Merelo’s) are placed 4th and 5th. This might be mainly due to
the fact that we are editors within BarraPunto, and get automatic references when
a person places one of the news items edited by us in their own journal. Maybe this
only emphasizes the role that collective blogs such as BarraPunto, EsCompOsLinux
or PuntBarra (the Catalan-language equivalent) play as hubs in the community. Actually, they take 9 out of the first ten places, the other corresponding, you guessed it,
to one of us: Tricas, whose manually edited blog is #9. The main feature of hubs is how
often they quote other blogs; the opposite, being quoted, is measured by the authority
quotient, which was also measured using Visone. Authorities are “quotables”, in the
sense that other blogs mention their stories very often. Most of the first 10 places
are taken by collective blogs, notably Libertonia, but some others show up: PJorge
(http://www.pjorge.com), which thus seems widely respected within the community (#7); simbiosis (http://simbiosis.blogalia.com), a blog that comments
on other blogs daily; and our own (#9, 10 and 11). Linux and open source-related
blogs are the most prominent in these places, which shows their importance within the
Spanish community, since blogging in Spanish may have had its origin in them. Authority
measures take into account not only the number of incoming links, but also their “quality”, that is, the authority of the blog that posts them. Finally, another macroscopic
measure is the “degree of separation”, that is, the average number of links needed to
reach one blog from another. It has been measured using UCINET, yielding a value of
3.761; that is, on average, approximately 4 links separate each Spanish blog from any
other. That means that, quite matter-of-factly, the Spanish blogosphere is a small world.
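The measures in this paragraph can be sketched on a small link graph. The code below is an illustration under stated assumptions, not what Visone or UCINET run internally. It assumes that hub and authority scores follow Kleinberg's HITS iteration, and that the degree of separation is the mean shortest-path length over connected ordered pairs of a directed graph. The function names and the graph format are hypothetical.

```python
import math
from collections import deque


def hits(graph, iterations=50):
    """Hub and authority scores by power iteration (Kleinberg's HITS scheme)."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    hub = {v: 1.0 for v in nodes}
    auth = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the blogs linking in.
        auth = {v: 0.0 for v in nodes}
        for v, targets in graph.items():
            for w in targets:
                auth[w] += hub[v]
        # Hub: sum of the authority scores of the blogs linked to.
        hub = {v: sum(auth[w] for w in graph.get(v, ())) for v in nodes}
        # Normalize both vectors to unit length so the scores stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(x * x for x in scores.values())) or 1.0
            for v in scores:
                scores[v] /= norm
    return hub, auth


def average_separation(graph):
    """Mean shortest-path length over all ordered pairs that are connected."""
    total = pairs = 0
    for s in graph:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in graph.get(v, ()):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0
```

Pairs with no connecting path are left out of the average, which is one common convention; a different treatment of unreachable blogs would change the resulting figure.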
[Figure 3: map of the Spanish blogosphere link network, plotted with Visone]
946
111 3
590
3
1 1111 1 1 11 1
10411
1 1 11 1 1
1
11 11 1 21111141 1764
1 111
1
2
1035
1211 1
2 1
2111 1 1
1 17
21 1 111 111
1 1 11111
1 131
21 2 1 1 131 1 2 1 1 1
1211
1
775
1 12 1111 11 2536
11 11 121
965
473
1
795
1055
1
915
1 11 11 1 111311111 21 11 1 1
1
11 1 11 1
1212 1
21
21 1
4451 1
1 1 11
1 1 21 1 1 117441 1641
29
457
348
1
1
1
241
1
840
2
1 111 11 41
11 21 1 1
1 1
7911 1
11 1 1 11 1 1 1 1 2 1 1
716
1097
236
11 1
996
12 11112 2111 231121 3 11 11 21 1 1 11 1111 11211 111 1 1
281
1 11
126 11 1 1970
1
95
1 12
2
1 11 1 121 1
12228
1 11 2 13
1 1
2
1
497
1
225
702
11 1 11 1 22 3 1 11 1
2 1 1 1131125211
1279
753
2 1
1 1 11
11 1 1
1236
1 21 2 1 1 112211 1
2 1 111 3 1
1
1088
36
1
248
1302
1 1 1 11112
22 111
2
1
1
1157
793
327
634
11699
1 2
974
1 1 1 11 1 1 4
1
87711111 1111 21 11111 76
11 1
11 1 11 11 1211 1 11 1 21 111 11 1112
1 111 1 1402 11 1
762
1143
1181
1011
11 1
1
1
2
1
1067
251
2
2
264
1
1
1
1
388 1
1
924
1
22502
1
1
1
1249
2
1
566
1
3
1
489
1
1
1
1
1
1 1 11
2
932 220
1
1
12
1
1
11
21 1
721
1 1
1 11861
1 1
1
1099
1 11 1
932
11 311
12
11168
1 11 1 1361 1 1
1137
1
1292
1 1 114 111111 111211 111 43 1 1 1 1 1 11
1
11
32 1 3 1 1
1288
168
1
1 2
1197
11111133
1
310
111 9873
749
11
1
11
11
1
2 906
1 13 11 111 11
1
1
21 1 1 4 1
319
2 11
1
206
111 1
1 1 1 1
32
1214
494
2 111 11 111914111
1
1
580
1 1 1 834
1
11
1 1 1211 21 111 21 51 1 3
1 1 2111 1 1 11
1096
1
291 11 41 138
3111 1 1 1 21 1178
11 1
1255
1 1111 1
1259
1 1 11 1 1 1
1196
11 1 1 2
1 1
1
1 1111 1 11 11 1 1
1 11 1 3 1 11
458
110
945
1 21
11 3
1277
11
1 11
548
372
1 11 1 1 11
1
269
11 1 221 1 211 1111 111
11 1 131 1 11
1231
41
714
161 805
1 1 11111
12 1 1 1122
938
1 1 1 1
17356
1
635
11 1 1
1
1
2 1 11 1 1 1 134
1 1 44
2
416
927
385
1
1 1 1 2 11
11
111226
111139
11
1
111 11111 11 1 11 1 11 11 1 1111 1
131
2
1 12111371 11
2
13
11 1 706
1 11
1
112 1 1 1 3 1
1
1
1
2
1
1
1
1
441
1
1
1
1
1
1
1
1
1
1
1
1
659
9
1
672
1
1
1
1
1
2
871
1
1015
1
11
1
21111173 1
111
1
1468
1
2 1 1 11 1 11240
1 11 21
1 4 1 1 1509
22 1 1 11 547
11 46 1 2 1 12 1
1
1 11 1
591
1 1 1
4
461
1 11
11455
842
1 1639
1 21 122 1
1 1
11
11
1736
1 386
272
11
882
1
579
91
1
11 1 1 1 1 1
990
1
966
1022
1
1
111 21 2 11 1 1 111 1
11
1 1 1 1 11
1235
1 1
1066
1 1
887
615
2 1 1 1 1 15
11
11 1111
1136
541
1
2
221
1019
1
780
898
11
1
1 934
243
12 1
275
1
1 1
11
522
11 1
51460
312
1
3 111 612
227
499
769
1 2 1 1 11 11 1 11 1 1 3
633
1 12 1 1
1
1 1009
1
43
1 11 1 1
1
493
2 1
1
1 665322
2 1
13 1
1
1
813
888
192
94
1 11 1
1
723
451
1
1107
11 1 1 1 2
1 21
111
1 111 1 1 11 1
189
11 1
2 1127
1 336
2
132
3
1
1
1
1
1
2 202
1
1
1
939
258
1
411
686
125
119
3
1
1 1 1
1
1
739
11
1
1005
1
652
1
11
2
1
113425
1
1
1
1273
1
1
1
1
1
77
1
1
1
1
1
224
1303
691
1 2
1
1 313
1124
2
1
708
13
1
1 2 1 1 11
1112
2
789
11 2111
488
456
1
169 2 1
1013
889
1
1206
1 1
32
1
704
1
2
776
540
150
958
381
742
1 1227
608
111 1 1
1 12 1 111 1 384
574
140
1
1062
972
1 1 1
1
596
1217
328
41 1
648 1
823
1
398
879
1
525 187
1050
1 481
11
1 11
161
1139
1
959
1
1
512
569
852
1070
550
1
1029
2 13
1054
1282
353
514
1815
3 1
22 1
959
1 21248
643
163 1 545
1 892
1198
979
309
237
523
277
434
1
18
3
364
919
1044
1018
1
998
34
1115
1
11 1 1
3
504
1 624
340
1 1 2
969
1286 1 11
430
400
1617 1 1
654
720
1263 626
9612971 60
588
510
925
21
1819
1169
1
1
145
2
1
1 2
719
1 1193
20405
1036
274
71
185
417
1092
429
1
690
670
2
1223
705
200
1305 1144
1020
203 503
261
262
490
1284 1265
1200
604446
731
1100
2696
117
1215
240
609
761 5811106
1293
111
Figure 3: Betweenness plot for the Spanish blogosphere. Each blog is placed on a
filled circle according to its betweenness measure. The “most betweened” blog, that is, the
one with the highest betweenness value, is placed at the center.
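As a reminder of what the figure plots, the betweenness of a blog counts how often it lies on shortest link paths between other pairs of blogs. The following is a minimal sketch of Brandes' algorithm for an unweighted, directed link graph. It is not the code used to produce Figure 3; the function name, the toy graph and the blog names are hypothetical.

```python
from collections import deque


def betweenness(graph):
    """Unnormalized betweenness for a directed, unweighted graph.

    graph maps each node to the nodes it links to (no repeated targets).
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    score = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph.get(v, ()):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Hypothetical toy graph: every path between the outer blogs crosses "hub".
toy = {"a": ["hub"], "b": ["hub"], "hub": ["c", "d"]}
scores = betweenness(toy)
center = max(scores, key=scores.get)  # the blog that would sit at the center
```

In this toy graph the four shortest paths a→c, a→d, b→c and b→d all pass through "hub", so it is the node that a layout like the one in Figure 3 would place at the center.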
5 Conclusions and Future Work
This paper shows the still immature state of a community, the Spanish-speaking blogosphere, which is growing very quickly. Its immaturity is shown by the fact that it has
not yet reached the state where incoming links follow a power law, but it is probably
on the right path, since it is already well connected and shows “small world” features.
These features are not tied to the power-law distribution, as we had initially expected;
they probably depend only on size.
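One rough way to look for a power law in incoming links is to plot the number of blogs against the number of incoming links on log-log axes and check whether the points fall near a straight line. The sketch below estimates the slope of that line by least squares. It is only a crude diagnostic, not a statistical test and not necessarily the procedure used in this paper; the function name and the synthetic data are assumptions made for illustration.

```python
import math
from collections import Counter


def loglog_slope(in_degrees):
    """Least-squares slope of log(count) against log(in-degree).

    A roughly straight line with negative slope is consistent with a
    power law, but this fit alone does not prove one.
    """
    counts = Counter(d for d in in_degrees if d > 0)
    xs = [math.log(d) for d in counts]
    ys = [math.log(c) for c in counts.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct positive degrees")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic (invented) data: 1024 / d**2 blogs with in-degree d.
synthetic = [d for d in (1, 2, 4, 8, 16) for _ in range(1024 // d ** 2)]
slope = loglog_slope(synthetic)  # close to -2 for this constructed sample
```

On real link counts the tail is usually noisy, so the slope should be read together with the plot rather than on its own.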
The Blogómetro is an ongoing project; its measurements will be taken periodically
to show the state of the Spanish blogosphere and track its evolution. We will try
to improve the software in several ways, including public access to the database through
web interfaces, the addition of self-discovery features, and improvements to other technical
details (among others, detecting links that are equal even if they look different). Until
now, we have concentrated on having a tool for studying the blogosphere and helping
others to discover it. Maybe we should do more work on the audience aspects, to make
the tool known and useful to others. We expect that by this time next year (or the
year after) we will be able to show a power law.
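Detecting links that are equal even though they look different amounts to mapping each URL to a canonical form before counting it. The following is a minimal sketch using Python's standard urllib.parse module. The normalization rules chosen here (lower-casing the host, dropping default ports and fragments, stripping a trailing slash, ignoring any user information) are assumptions for illustration, not the rules implemented in the Blogómetro.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def canonical_url(url):
    """Return a normalized form of url for comparing links."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it is not the default for the scheme.
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{port}"
    path = parts.path or "/"
    # Assumption: a trailing slash does not change the target page.
    if len(path) > 1 and path.endswith("/"):
        path = path.rstrip("/")
    # The fragment is dropped; the query string is kept as written.
    return urlunsplit((scheme, host, path, parts.query, ""))


same_site = (
    canonical_url("HTTP://Blogometro.Blogalia.com:80/")
    == canonical_url("http://blogometro.blogalia.com")
)
same_page = (
    canonical_url("http://www.daypop.com/burst/#top")
    == canonical_url("http://www.daypop.com/burst")
)
```

Rules like the trailing-slash one can merge pages that a server treats as distinct, so any real rule set would need to be checked against the links actually collected.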
We have the feeling that Spanish bloggers do not link as often as
they should and that, maybe, we are missing topics because no links are provided. We are
thinking about trying to measure words or phrases, in order to detect topics without
links, in a similar way to the recently implemented ‘Word Burst’ of DayPop
(http://www.daypop.com/burst/) or ‘Memeufacture’ (http://memeufacture.com/).
A more refined analysis, separating links to blog items from links to other general media, will be the subject of our research. Another phenomenon that we would like
to measure and detect is what we could call ‘background stories’; that is, stories that appear in the blogosphere and are linked slowly over long periods of time,
accumulating a significant number of links, but that do not appear in daily rankings
because of this slowness. Surely some of them will be interesting topics to read about.
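The slowly linked items described above could be found by looking, for each URL, at the total number of links, the busiest single day, and the time span over which links arrived. The sketch below flags URLs with many links overall but few on any given day. The function name, the input format and all thresholds are hypothetical and would have to be tuned on real data.

```python
from collections import defaultdict
from datetime import date, timedelta


def background_stories(links, min_total=10, max_per_day=2, min_span_days=14):
    """Return (url, total_links, span_in_days) for slowly linked URLs.

    links is an iterable of (day, url) pairs, one per link observed.
    """
    per_day = defaultdict(lambda: defaultdict(int))
    for day, url in links:
        per_day[url][day] += 1
    found = []
    for url, days in per_day.items():
        total = sum(days.values())
        peak = max(days.values())  # busiest single day
        span = (max(days) - min(days)).days + 1
        if total >= min_total and peak <= max_per_day and span >= min_span_days:
            found.append((url, total, span))
    # Most linked first.
    return sorted(found, key=lambda item: -item[1])


# Invented example: one URL linked every other day, one linked in a burst.
start = date(2003, 1, 1)
slow = [(start + timedelta(days=2 * i), "http://example.org/slow") for i in range(12)]
burst = [(date(2003, 1, 5), "http://example.org/burst")] * 12
result = background_stories(slow + burst)
```

Here both URLs collect twelve links, but only the one linked every other day is reported, because the other would already show up in a daily ranking.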
Other projects we intend to undertake in the future include cluster formation in the
blogosphere, interactive visualization and, if data is available, measurements of other
blog communities (such as, for instance, the Portuguese-speaking one, which is probably
very similar to ours).
Finally, it would be interesting to compare these results with those of the global
blogosphere, to detect similarities and differences.