
This work is licensed under an Attribution-NonCommercial-ShareAlike 2.5 Creative Commons Licence

An overview on structure and evolution of


Free/Libre and Open Source Software development

Davide Tarasconi
info@davidetarasconi.net

Introduction

Open Source/Free Software has become increasingly popular over the years, for different reasons.
This dramatic rise in popularity has a series of sometimes poorly understood causes that have led to
research studies, surveys and market analyses from proprietary software firms, which see Open and Free
software as a real competitor: some rough facts are clear, as the wide market share of Apache and Linux
based servers1 confirms, while others are more difficult to identify.
For example, finding valuable indicators for comparing the performance of proprietary and open/free
software is a hard job, and not always possible, because of the stark structural differences between
proprietary and Open Source software development2.

The increasing market share of some projects, and their claimed (and often confirmed) technological
superiority over proprietary software products, coupled with free-of-charge or at least cheaper offers,
have led government bodies (from local to international) to take part in this "Open Source race": not only
is the adoption of Open Source software strongly supported by institutions, but Open Source
development is now a fresh field of research, since there is a belief that the typical practices of
developer/user communities are an example of optimal distributed coordination of knowledge
management and creation, especially considering the "Information Society" development that the EU has
been supporting through research and ad hoc policies in recent years.

What I will present in this paper is a series of case studies on Open Source software development,
examining some uncommon structural features that make this special kind of distributed, technology-enabled
software development team an interesting object of research: from economic, social and technical
standpoints, it seems almost impossible that it should work at all.

This is not the place to discuss the philosophical and ideological differences between the two main
"parties"3 of the so-called Open Source/Free Software movement: the differences between the Free
Software Foundation, headed by Richard Stallman (also leader of GNU4), and the Open Source Initiative,
1 Monthly survey by Netcraft
(http://news.netcraft.com/archives/2007/02/02/february_2007_web_server_survey.html)

2 For an in-depth analysis of the economic impact and value of Open Source software see David A. Wheeler, "Why Open
Source Software / Free Software (OSS/FS, FLOSS, or FOSS)? Look at the Numbers!"
(http://www.dwheeler.com/oss_fs_why.html) (see also References)

3 “At the heart of FSF is the freedom to cooperate. Because non-free (free as in freedom, not price) software
restricts the freedom to cooperate, FSF considers non-free software unethical. FSF is also opposed to software
patents and additional restrictions to existing copyright laws. [...] The OSI is focused on the technical values of making
powerful, reliable software, and is more business-friendly than the FSF. It is less focused on the moral issues of Free
Software and more on the practical advantages of the FOSS distributed development method. While the fundamental
philosophy of the two movements are different, both FSF and OSI share the same space and cooperate on practical
grounds like software development, efforts against proprietary software, software patents, and the like. As Richard
Stallman says, the Free Software Movement and the Open Source Movement are two political parties in the same
community.” (Free/Open Source Software, a general introduction, see References)
4 GNU stands for GNU's Not Unix, a non-profit project born in 1984 that aims to develop a complete operating system made entirely of free software.
which has Eric S. Raymond as its main figure, are not fundamental to the type of studies that follow: more
important is to understand some basic, common structural features that characterize Open Source/Free
Software production.

Common features of Open Source/Free Software development

The nature of Open Source/Free Software is, simply put, a collaborative effort toward software
development by distributed teams using web-based communication and development tools: the
software products are shared, often free of charge5, and distributed with their source code – which
means everyone can modify and redistribute them.

Apart from distributed development, the other hallmark of Open Source/Free Software is the use of
particular licences that permit software sharing and modification: the most famous and most widely used
licence (actually, halfway between a political manifesto and a legal document) is the GNU General Public
Licence (GPL), which states, briefly, that "the GNU General Public License is intended to guarantee your
freedom to share and change free software--to make sure the software is free for all its users."6.

There is a great number of "Open Source licenses", but the GNU GPL is the most widely adopted and the
one with the strongest legal standing and support: the differences between the other licences are often
minimal, and this is not the place for legal discussions and litigation.

Illustration 1: A simple scheme for the common concept of the "developer/user" community (adapted from a David A. Wheeler presentation, see References)

"Release early and often" is an imperative within Open Source/Free Software communities: since
developers/users have the right to modify and redistribute the code, everyone is constantly reusing
and building upon the work of others.

The reduced duplication of effort allows this kind of distributed development to scale to massive,
unprecedented levels, involving thousands of developers around the world, with lightning-fast development
speed and feature implementation. Another characteristic of Open Source/Free Software is that the
communities are formed by developers/users: it is a special case where, very often, the person who uses
the software is also the person who develops it7.

5 Many Open Source/Free Software products, however, are commercial and supported by for-profit organizations.

6 GNU General Public Licence preamble (http://www.gnu.org/licenses/gpl.txt): basically, the GPL creates a kind of
“consortium”; anyone can use and modify the program, but anyone who releases the program (modified or not)
must satisfy the restrictions in the GPL that prevent the program and its derivatives from becoming proprietary.

7 The developer/user role can be "direct" (the user is a developer, and so has the technical knowledge of how to modify the
source code) or "indirect" (the user has no technical capabilities, but is involved in bug-reporting processes and
uses communication tools to keep in touch with developers)
Over time this uncommon feature (in comparison with proprietary software, where developers and
users are very often different people) created a kind of "Open Source/Free Software ecosystem", with
communities building all the tools they need for their work: these communities are, thus, communities of
software developers, building development tools as well as "end user" applications.

The openness of these projects and the massive testing activity lead to an incredible rate of bug
discovery and, consequently, of bug fixing: the members of these communities have full access to
technical knowledge that in proprietary software is kept under a veil of secrecy, protected by
patent and copyright systems.

The distributed nature of Open Source/Free Software leads to another point that distinguishes it from
proprietary software: costs. Even when not free of charge, commercial Open Source/Free Software costs
much less than proprietary solutions.

Cost comparison between Microsoft and FOSS Solutions


Microsoft Solution Linux/FOSS Solution Savings
Company A: 50 Users $87,988 $80 $87,908
Company B: 100 Users $136,734 $80 $136,654
Company C: 250 Users $282,974 $80 $282,894

Table 1: A comparison8 between Microsoft products and the Open Source/Free Software alternative
(offering the same functionalities): price differences between the solutions increase with the number of
users, making Open Source/Free Software a valuable alternative especially for public sector organizations

Again, since the cost of Open Source/Free Software remains the same even with a growing number of users
(proprietary software vendors usually charge fees for extra licenses when users are added), other significant
cost reductions appear when maintenance costs are also taken into account.

Apart from economic and structural development features, some proven technical features outscore
the performance of proprietary software products: high security and stability make Open Source/Free Software
perfect for those applications, especially web-based ones, where robustness is a key feature.
Open standards, vendor independence and the open source code are fundamental to those organizations
that require a lot of flexibility: a special case is developing countries, which by using this low-cost
(when not free) software can improve their ability to learn software development practices and become
independent from foreign proprietary software firms.

Free/Libre and Open Source Software

Since the increasing interest in Open Source/Free Software development is leading to a growing amount
of research on the movement itself, it is good practice to build and use a conceptual
framework that keeps us apart from the philosophical and ideological ambiguities I briefly discussed above.
The term "Free/Libre and Open Source Software" (FLOSS from now on) was born in 2001, as a consequence
of the very first EU research on Open Source: the term is relatively new and used mainly at an
academic level, but it is a strictly analytical one and is quickly becoming common9.
What distinguishes this definition from previous ones is the French/Spanish word libre, used to avoid
the ambiguity of the word "free"10: the use of this European word stands for a new conception of what FLOSS is,
or could be, for EU policies from innovation and knowledge management standpoints.

8 Reported in "Free/Open Source Software, a general introduction" (see References): data is taken from the
original study "Linux vs. Windows: The Bottom Line" by the Cybersource consulting company
(http://www.cyber.com.au/)

9 The rise and success of this term has been compared with the epidemic-like spreading of Open Source software.
10 Free Software Foundation uses the “free as in beer” and “free as in speech” disambiguation.
During the last five years EU studies on FLOSS have contributed to the formalization of a conceptual framework
covering different fields of interest: the principal studies concern the worldwide impact of FLOSS, government
policies, gender issues, FLOSS as a problem-solving system and FLOSS as a new
economic/technological paradigm11.
The main advantage that stems from this research is the availability of empirical data and
observations that, unlike the so-called "practitioner-advocate literature"12, help researchers and
institutional organizations in a process of analytical understanding that can lead to successful policy
making and research programmes.

A matter of structure

The first, biggest challenge of any kind of study on FLOSS development starts with the question: how can
these people successfully cooperate on software development projects, without face-to-face meetings,
without the presence of central authorities, even without any sort of activity planning?
Apart from considerations of shared values, reputation and gift economy13, what are the real structural
features of these collaboration networks?
While investigating a social structure, we are focusing our attention on individuals, their actions and the
interactions in which they are involved.
We can therefore consider some measures to assess basic features of a social structure: one of
them is called centrality, an indicator of individual "activity" that measures an individual's involvement
in the structure's interactions.
There is an evident dichotomy, which emerged from early studies on FLOSS, between what is called
"development centralization" and "communication centralization": the former is far more studied than the
latter, and is typically a measure of action, while communication centralization is a measure of interaction.
An action measure means that the objects of the analysis are the actions of the individuals in the social
structure: in this special case we are talking about the process of writing code, but there is more.
In fact, although writing code is the fundamental activity of the social structure we are trying to analyze, there
are a lot of coordination and collaboration practices and issues that can only be understood with communication
and interaction measures.
Given their characterization as technologically-enabled communities, the communications of FLOSS
development teams can be found in mailing-list emails, bug-tracking systems and instant messaging
chats.
The seminal essay "The Cathedral and the Bazaar" by Eric Raymond describes some common features that
characterize Open Source projects: he makes a series of arguments to sustain the thesis of the
superiority of "bazaar-like" software development over the proprietary, "cathedral-like" kind.
The great number of people usually involved in and collaborating on an open project makes possible what
Raymond calls "Linus' Law"14: the more people, the higher the chances of finding bugs, and the higher the
chances of fixing them.
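Raymond's claim can be made concrete with a toy probability model (my own illustration, not from Raymond's essay): if each of n independent reviewers spots a given bug with probability p, the chance that at least one of them finds it is 1 - (1 - p)^n, which approaches 1 as n grows.

```python
# Toy model of "Linus' Law": independent reviewers, each with a small chance
# of spotting a given bug. The probability that at least one of n reviewers
# finds it is the complement of everyone missing it.

def p_bug_found(p: float, n: int) -> float:
    """Probability that at least one of n independent reviewers finds the bug."""
    return 1.0 - (1.0 - p) ** n

# even with a 1% individual chance, a large enough crowd makes the bug "shallow"
for n in (1, 10, 100, 1000):
    print(n, round(p_bug_found(0.01, n), 3))
```

With p = 0.01, the probability climbs from 1% for a single reviewer toward certainty for a thousand, which is the intuition behind "given enough eyeballs, all bugs are shallow".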
From the thoughts of the Open Source movement pioneers, it seems evident that the communication
structures are strongly decentralized: the absence of a central authority deciding the roles of the
participants could lead us to the idea that a formal control structure is useless, if not harmful to the
stability of the structure itself.
That is the case of the Linux 8086 project, where core developers used a "kill-file" application to exclude
email coming from outside the developers' "inner circle": this method was a complete failure, in the
words of Alan Cox (a former member of that project and stable Linux kernel maintainer since 1991), and
created a "clique" instead of a "bazaar".
Some views voiced by Open Source pioneers seem confused and even contradictory, such as the

11 http://www.flossworld.org, http://www.flosspols.org, http://www.infonomics.nl/FLOSS

12 This kind of "literature" is usually formed of developer interviews, single-project case studies and anecdotal
stories that obviously lack objectiveness and scientific, analytical value.

13 A lot of open questions remain on shared values: Raymond talks about a gift economy based on reputation, while
Iannacci criticizes Raymond's view (see References), stating that Open Source software production is an
extreme form of market economy.

14 “Given enough eyeballs, all bugs are shallow”.


concept of ownership: Raymond describes it as a "benevolent dictatorship", but also states that no one is
forced to work on specific modules and that the strength of Open Source is that everyone can code on
whatever is most interesting to them.
These misleading and confusing concepts strongly need to be analytically investigated, formally
recognized and put into a framework that could lead to systematic analysis and evaluation of FLOSS
processes.

Academical studies on FLOSS development projects

We are in the early years of studies on FLOSS development: while most research and case
studies concern development analysis, often of a single project, a lot of research has started to shed light on
the interactions and communications between team members.
The first multi-project study, by Sandeep Krishnamurthy, found, surprisingly, that most of the first 100
projects hosted on SourceForge15 were "caves" instead of communities: in other words, one-developer,
almost personal, projects.
Starting from this surprising result, the above-mentioned importance of communities in FLOSS
development could present a problem: I find it a kind of incitement to improve research in this field and
a warning that we are just at the beginning of a new field of study on this subject.

As I said before, most of the research has focused on development centralization: the basic
structure of a development team is an onion-like one, with core developers surrounded by "layers" of co-
developers, active users who contribute to the bug-reporting process (no code submission) and passive
users who only use the project's output without joining the community.
I will then consider and illustrate some multi-project research conducted by a group of researchers at
Syracuse University who use statistical and social network analysis techniques: after an overview of
some structural studies, I will summarize some studies considering FLOSS as a complex evolving system.

Illustration 2: Onion-like structure of FLOSS development team

Investigating the communications' structure: centralization

The study by Kevin Crowston and James Howison (see References) is one of the first efforts toward an
analytical comprehension of common features of FLOSS projects: they decided to set aside the
development aspect16 and focus on the communication one.
15 SourceForge (http://sourceforge.net) is a web-based platform that hosts about 140000 projects, with nearly
1500000 registered users: it provides tools enabling distributed, collaborative development and bug fixing/reporting.

16 Krishnamurthy's and Mockus's findings about development collaboration demonstrate that in FLOSS teams few
developers contribute most of the code, so there is a common, strong centralization feature that is
not interesting for the scope of my overview.
They used data from SourceForge, where and when available: because of their interest in team
interactions, they restricted their study to projects with at least 7 developers and 100 messages on the
project's bug-tracking system, to ensure that the data came only from active projects17.
The objects of the analysis, as said before, are bug reports: a bug report on SourceForge contains a
schematic description of the bug (Illustration 3) followed by a series of messages that users/developers
can post to discuss the nature of the bug itself.
The interactions between who starts a discussion and who responds were collected by Crowston
and Howison using ad hoc web spiders: they then analysed the statistics of 61068 bugs coming from 120
projects.
The results regarding the distributions of the number of bug threads per project and the number of
unique posters18 per project show that the larger the community, the larger the range of
communications.

Illustration 3: Structure of a bug report: a description followed by comments, put in reverse chronological order so the oldest is at the bottom. An interaction (the black arrow) is the response to the previous comment.

Social network analysis is a tool to measure interactions between actors: in this particular case an actor is
a SourceForge registered user, with a unique ID, and an interaction between two actors/users occurs every
time there is a response19 to a message on a bug-tracking page (see Illustration 3).
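The interaction-extraction convention just described can be sketched in a few lines (a simplified illustration with hypothetical data, not the researchers' actual spider code): each message in a thread is counted as a response to the one immediately before it.

```python
# Sketch of extracting directed interactions from a bug-report thread.
# Convention (as described in the text): every message after the first is
# treated as a response to the immediately preceding message.

def thread_interactions(posters):
    """posters: list of user IDs in chronological order, first = bug reporter.
    Returns directed (replier, replied_to) pairs."""
    return [(posters[i], posters[i - 1]) for i in range(1, len(posters))]

# hypothetical thread: a user reports a bug, two developers join the discussion
thread = ["userA", "dev1", "userA", "dev2"]
print(thread_interactions(thread))
# [('dev1', 'userA'), ('userA', 'dev1'), ('dev2', 'userA')]
```

Note that this convention embeds the methodological ambiguity mentioned in footnote 19: a reply aimed at the original bug poster is still recorded as directed at the previous commenter.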

17 At the time of the first data collection (April 2002), SourceForge hosted more than 50000 projects: only 140 were
suitable for the analysis, according to Crowston and Howison's parameters.

18 Anonymous posters' bug report messages were simply not counted here. The researchers also tried to count
anonymous users as distinct users (e.g. anonymous001, anonymous002, etc.), but they decided to drop this
missing data. For further details on the methodology used by the researchers see References.

19 Here lies another methodological problem: the researchers decided to count as an interaction the response to the
previous message, although some responses are to the previous message and others to the original bug poster.
The interaction data collected was used to plot interaction graphs, using NetMiner and Pajek; below is an
example of the graph for the OpenRPG20 project.

Illustration 4: From this plot it is clear that there is a strong centrality of three actors, namely Dev1, Dev2 and Dev3.

What can be seen from the graph, a strong centrality of a few individuals, can be calculated as a
centralization score to assess whether or not the collaboration network has a "consistent" social structure.
The classical centrality notion is based on a measure of each individual's degree: in our case, whoever
receives or sends more messages is more "central" inside the community than someone who receives or
sends fewer, or no messages at all.
Crowston and Howison decided to use the out-degree centrality21 measure, given their interest in finding
who contributes to a high number of reports.
In other words: in a network characterized by a high centralization score, one or a few individuals have high
centrality values, while many others have low values. In a decentralized network the centrality values are
more or less the same for every individual.
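The group-level score just described can be sketched as follows (a minimal pure-Python illustration; I assume the common Freeman-style variant in which individual out-degrees are normalized by n - 1 and the group score is the summed deviation from the most central actor, divided by its theoretical maximum; the study's exact formula may differ):

```python
# Sketch of Freeman-style out-degree centralization: 1.0 for a "star"
# (one actor sends to everyone), 0.0 when all actors are equally active.

def out_degree_centralization(edges, nodes):
    """edges: iterable of (sender, receiver) pairs; nodes: all actor IDs."""
    n = len(nodes)
    out = {v: 0 for v in nodes}
    for sender, _receiver in edges:
        out[sender] += 1
    cent = {v: out[v] / (n - 1) for v in nodes}   # normalized out-degree
    cmax = max(cent.values())
    # summed deviation from the most central actor, over its theoretical max
    return sum(cmax - c for c in cent.values()) / (n - 1)

# hypothetical "star": one developer answers everyone -> maximally centralized
star = [("dev1", u) for u in ("u1", "u2", "u3", "u4")]
print(out_degree_centralization(star, ["dev1", "u1", "u2", "u3", "u4"]))  # 1.0
```

A symmetric ring, where every actor sends exactly one message, scores 0.0 under the same formula, matching the decentralized extreme described above.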

The results of the calculation of the projects' network centralization are shown above: there is no uniform
centralization nor decentralization; the out-degree values span a wide range, from 0.13 to 0.99 (mean
value 0.58).
This unskewed distribution can be visually rendered using social network analysis plotting software (Pajek,
in this case): we can see the structural differences between a highly centralized project such as curl, and a
20 OpenRPG is an Internet-based, real-time role-playing game.

21 The other centrality measures are in-degree and the so-called Freeman (or global) centrality: the former is the
measure of incoming interactions, while the latter sums incoming and outgoing ones.
decentralized project like squirrelmail.

Illustration 5: Interactions plot, curl project, highly centralized (0.922)
Illustration 6: Interactions plot, squirrelmail project, decentralized (0.377)

The different values of centralization tell us that this measure could be a characterizing one: it can be
useful to compare a project's centralization to its size, in order to discover to what extent an increasing
number of participants affects project structure.
Coordination and learning are characteristics of team-based collaboration that are largely dependent on
group size: the correlation between the number of IDs (log-transformed) and out-degree
centralization is significant; an obvious interpretation could be that a large project cannot be headed by
only one or a few developers.
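The size/centralization comparison can be sketched like this (with toy, entirely hypothetical numbers, not the study's data): compute Pearson's r between log-transformed team size and the centralization score.

```python
# Sketch of the size/centralization check: Pearson correlation between
# log-transformed project size and out-degree centralization.
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# hypothetical projects: (number of IDs, centralization score)
sizes = [7, 12, 25, 60, 150]
centralizations = [0.95, 0.80, 0.65, 0.45, 0.20]
print(pearson_r([math.log(s) for s in sizes], centralizations))
```

A strongly negative r on such data would match the interpretation given above: larger projects tend to be less dominated by a single central figure.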

Investigating the communications' structure: hierarchy

Following the same methodology as the latter study, Crowston and Howison also investigated
another network feature: hierarchy measures.
The study on centralization and hierarchy also includes data from the Apache Foundation22 and GNU's
Savannah23: since some claim that FLOSS development is characterized by low hierarchy and
centralization while others claim the very opposite, the researchers felt the need for a broader range
of projects to analyse.
The concept of what communication centrality means in the case of distributed team development, and
its correlation with project size, was investigated in the previous case study: from that first
insight it seems that when a project grows in size, it becomes more "modular" because of the difficulties
that emerge in coordinating larger groups.
While, as I reported above, most of the "practitioner-advocate literature" (as I said shortly before,
anecdotal experiences of FLOSS developers, users and pioneers) stresses the importance of non-hierarchical,
decentralized communications, Raymond himself spends a lot of time on the concept of ownership, saying
that, despite the "clamour of the Bazaar", community members always refer to one or a few
developers with special rights for code releases and modifications.
There is evidence of this counter-intuitive feature in most of the studies on development: the
major part of code production/submission is strictly managed by one developer or a small pool of developers,
even in larger projects.
The onion-like structure illustrating this hierarchical organization may also be valid for communication
patterns: given what studies on the motivations of FLOSS developers say about the importance of prestige
and status inside the community, we could believe that there is a structured hierarchy for communication
processes as well.
During the investigation of hierarchy structure, researchers adopted the following hierarchy measures:
connectedness, group hierarchy, efficiency and “lubness” (short for least upper boundedness).
The connectedness value goes from 1, a connected graph (every node in the network is reachable from
every other node in the network), to 0, an unconnected graph where there are no connections between
nodes: connectedness values can be useful to understand whether the group is homogeneous or, rather, there
are isolated developers working on their own or in smaller, unconnected teams24.

Illustration 7: Connectedness distribution for the different projects' repositories

The group hierarchy value also goes from 1 to 0, where the former extreme value means that the network
has no loops, or, in other words, that it is a pure superior/subordinate graph.
On the other hand, if group hierarchy is equal to 0, we are facing a totally symmetrical
network/organization, characterized by loops between every node, and typical of networks of informal
22 Incubator is a platform maintained by the Apache Foundation that hosts open source projects and ensures their
quality.

23 Savannah is an open source version of SourceForge: ironically, SourceForge became proprietary software in
2001.
24 Team fragmentation could be an indicator warning for possible project forking.
relations.

Illustration 8: Group hierarchy distribution for the different projects' repositories

Closely related to density, the network efficiency measure concerns how many paths there are between
nodes: in a perfectly efficient network, removing a single link results in a disconnected network, so
high efficiency also comes at the price of fragility.
A high-density network, with multiple paths linking nodes, thus has low efficiency in exchange for
great network robustness.

Illustration 9: Efficiency distribution for the different projects' repositories

Lubness (which stands for least upper boundedness) measures whether pairs of individuals in the network interact
with a common individual: if lubness is equal to 1, all pairs of individuals have a common superior.
Hence, lubness is strongly associated with the typical community situation where there are many who
ask questions (most of the members, as I report in the next section) and a few who answer
everyone's questions.
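Two of these measures can be sketched in code (a simplified illustration under the standard Krackhardt definitions, which I assume the study follows: connectedness as the fraction of node pairs joined by some path when direction is ignored, hierarchy as the fraction of reachable ordered pairs whose reachability is not reciprocated):

```python
# Sketch of Krackhardt-style connectedness and hierarchy on a directed graph.
from collections import defaultdict

def reachable(adj, start):
    """Set of nodes reachable from start via depth-first search."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def connectedness_and_hierarchy(edges, nodes):
    directed, undirected = defaultdict(set), defaultdict(set)
    for a, b in edges:
        directed[a].add(b)
        undirected[a].add(b)
        undirected[b].add(a)
    pairs = [(i, j) for i in nodes for j in nodes if i != j]
    # connectedness: share of pairs joined when direction is ignored
    weakly = sum((j in reachable(undirected, i)) for i, j in pairs)
    connectedness = weakly / len(pairs)
    # hierarchy: share of reachable ordered pairs that are one-way only
    reach = {i: reachable(directed, i) for i in nodes}
    ordered = [(i, j) for i, j in pairs if j in reach[i]]
    asym = sum(i not in reach[j] for i, j in ordered)
    hierarchy = asym / len(ordered) if ordered else 0.0
    return connectedness, hierarchy

# hypothetical pure "tree" of answers: fully connected and fully hierarchical
edges = [("maint", "dev1"), ("maint", "dev2"), ("dev1", "user1")]
print(connectedness_and_hierarchy(edges, ["maint", "dev1", "dev2", "user1"]))  # (1.0, 1.0)
```

On a strict superior/subordinate tree both values are 1.0; adding loops (mutual replies) pushes the hierarchy value toward 0 while leaving connectedness unchanged, which is exactly the contrast the two measures are meant to capture.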

The distributions of the hierarchical measures indicate that, as in the case of centralization, there is no
uniform hierarchical feature of FLOSS projects: yet the high connectedness scores of most of
the projects demonstrate that the communication structure, like the development one, is strongly
hierarchical, a finding that is completely opposite to the decentralized "bazaar-like" view.

Knowledge brokers

During the early years of Open Source projects every participant was supposed to have the technical skills
required to review and contribute to code submissions, and a deep understanding of the entire
technological structure of the software: the huge success of many projects (Linux, the Apache web server,
the Python and Perl scripting and programming languages, the MySQL and PostgreSQL database applications,
the Mozilla Firefox browser) increased their user base, introducing a great number of non-technical,
inexperienced users.
In such a growing environment this "non-technical side of the coin" becomes more and more important: we
can see it as a model for self-learning and self-organizing communities.
The study by Sowe, Stamelos and Angelis (see References) focuses on non-developer mailing lists: non-
development tasks are typically testing, bug reporting, documentation writing and software translation.
The improvement and adoption rate of Open Source software are connected with the coordination
between those who seek knowledge and the so-called "knowledge brokers", individuals who literally bridge the
gap between the "newbies" and the experienced developers.
The researchers decided to extract email data from three Debian mailing lists: they created a parsing
script to extract identifiers like "message-id", "to:", "from:", "subject:", "in-reply-to:", etc.
They decided to identify as a "poster" someone who writes a message without a "Re:" in the
subject; otherwise, obviously, we can talk of a "replier". The name and email identifiers of
senders were stored in a database table, and a check of name/email correspondence was made to
identify emails belonging to the same poster.
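The poster/replier classification described above can be sketched as follows (hypothetical message records, not the researchers' actual parsing script):

```python
# Sketch of the poster/replier split: a message whose subject starts with
# "Re:" counts as a reply, any other message counts as a new posting.

def classify(messages):
    """messages: list of dicts with 'from' and 'subject' keys.
    Returns (posters, repliers) as sets of sender identifiers."""
    posters, repliers = set(), set()
    for msg in messages:
        if msg["subject"].lower().startswith("re:"):
            repliers.add(msg["from"])
        else:
            posters.add(msg["from"])
    return posters, repliers

# hypothetical list traffic
msgs = [
    {"from": "alice@example.org", "subject": "apt-get fails on upgrade"},
    {"from": "bob@example.org", "subject": "Re: apt-get fails on upgrade"},
]
print(classify(msgs))  # ({'alice@example.org'}, {'bob@example.org'})
```

The same sender can of course end up in both sets, which is exactly what happens with the cross-list participants analysed below.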
First of all, to better understand the following tables, I give a brief explanation of the three Debian25 mailing
lists from which the data is gathered:

25 Available from http://lists.debian.org/ : the researchers selected the three lists using lifespan, activity and
openness criteria.
– KDE: discussion list dedicated to KDE26 on Debian;
– Mentors: list dedicated to beginners and new package maintainers;
– User: this is a “high volume list”, general discussion about Debian (for English speakers)

Illustration 10: Cumulative statistics: the variables Pkde, Pmentor, Puser and Rkde, Rmentor, Ruser represent the postings and replies to each list.

The researchers then restricted the analysis to those posters who replied or posted information on all
three lists: they found a group of 136 individuals active across all three lists, surely individuals
with a lot of knowledge and experience to share.

Illustration 11: Most active users statistics

The descriptive statistics of these 136 posters are similar to the “cumulative” statistics shown above.
Social network analysis helped the researchers in the crucial task of visualizing the mailing lists' affiliation
network: in this network "knowledge seekers" share a common space (the lists, the squares of the network
in Illustration 12) with "knowledge providers". Visualizing a network of those 136 posters could be a
problem, so the researchers decided to set a cut-off value to exclude posters with fewer than 10 messages
(the ones in the "excludes" box).
From the network visualization we can see that there are groups of posters strictly connected to
one list, while others contribute to several lists; more precisely, we can observe three distinct groups.

1. Posters (both seekers and providers) active in one single list;

26 KDE is a desktop environment for GNU/Linux operating systems.


2. Knowledge providers who help seekers across two lists;

3. Knowledge "brokers": the object of this study, the ones linked to all three nodes/lists
of the network (the black circles in the middle of the figure)
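The core of the broker-identification step, finding the senders active on every list, can be sketched as a simple set intersection (list contents here are hypothetical; the actual study also applied the 10-message cut-off described above):

```python
# Sketch of isolating cross-list participants: the candidate "brokers" are
# the senders who appear on every mailing list.

def cross_list_participants(lists):
    """lists: dict mapping list name -> set of sender IDs.
    Returns the senders present on all lists."""
    return set.intersection(*lists.values())

# hypothetical per-list sender sets
activity = {
    "kde": {"ann", "bob", "carl", "dora"},
    "mentors": {"bob", "carl", "eve"},
    "user": {"bob", "carl", "frank"},
}
print(sorted(cross_list_participants(activity)))  # ['bob', 'carl']
```

In the study this intersection yielded the 136 cross-list individuals, and the visual cut-off then narrowed them down to the 15 brokers discussed next.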

Illustration 12: Mailing lists and users network

The importance and uniqueness of these 15 knowledge brokers led the researchers to contact them for an
email survey: the results tell us that these linchpin figures are long-time list participants (since 2001,
93.3%), are package maintainers (86.7% of them) and are completely aware of their central position in
the lists' communities.
The social network analysis techniques applied in this case study shed light on communication structure of
non-development process: in particular, it helped to visualize and the recognize three main actors, the
knowledge seekers, the knowledge providers and, central figures, the knowledge brokers.
Without these visualization techniques it could have been nearly impossible to extract and find data about
these “brokers” only by email discussions analysis.
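The filtering and classification steps of this case study can be sketched in a few lines of code. The poster names, message counts and list names below are invented for illustration; only the 10-message cut-off follows the value chosen by the researchers.

```python
from collections import defaultdict

# (poster, mailing_list, n_messages) -- invented sample data, not the real survey
posts = [
    ("alice", "debian-user", 42), ("alice", "debian-devel", 17),
    ("alice", "debian-mentors", 9),
    ("bob", "debian-user", 5),
    ("carol", "debian-user", 30), ("carol", "debian-devel", 12),
    ("dave", "debian-mentors", 3),
]

CUT_OFF = 10  # exclude posters with fewer than 10 messages overall

totals = defaultdict(int)       # total messages per poster
lists_of = defaultdict(set)     # which lists each poster appears in
for poster, mlist, n in posts:
    totals[poster] += n
    lists_of[poster].add(mlist)

kept = {p for p, t in totals.items() if t >= CUT_OFF}

# classify kept posters by how many lists they bridge:
# 1 list = active in a single list, 2 = cross-list provider, 3 = broker
groups = {1: set(), 2: set(), 3: set()}
for p in kept:
    groups[len(lists_of[p])].add(p)

brokers = groups[3]  # linked to all three lists
```

Running the real analysis would only differ in the input data: the same cut-off-then-classify logic yields the three groups visible in the network figure.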

Agents and artifacts in the space of FLOSS projects

From the results of the studies summarized above we can try to analyze some features of FLOSS
development from an agent/artifact point of view: it is clear that a typical FLOSS project has agents with
specific properties such as resources, permissions and directedness27, agents that are able to make
attributions and interpretations of other agents and artifacts.
These agents interact with each other, giving rise to an interaction stream of events involving two or more
agents: FLOSS communities certainly contain strong communication networks, as we saw before, where
nodes (developers/users/maintainers) are connected by edges representing streams of discourse events.
Moreover, the communication network within a FLOSS project also overlaps with a competence

27 Resources can hence be seen as technical capabilities and programming abilities; permissions concern software
modification and redistribution and have to do with a reputation game issue (see the continuation of the paper);
directedness and its “no-goal” characteristic are stressed also by Linus Torvalds, who, when asked about goal
setting, answers that there is no need to set goals.
network, in which the transformations passed along the edges are code submissions or changes, made
possible by the resources (and permissions) that agents bring into the network.
The notion of system can also be brought into the comparison; better, the FLOSS project
itself can be seen as a system: a system is defined by a series of scaffolding competences that, through
competence networks, allow the efficient allocation of resources for the activation of these networks and
maintain the embodiment of competences28.
While the definition of an agent for a FLOSS network is relatively easy, finding a good definition of what an
artifact is in our case can be troublesome.
We can certainly talk about an “information artifact”, and it is clear that if we are dealing with a piece of
software, or a programming language, the code can be treated as an artifact, with agents contributing
code submissions.
Two different points of view can be used: on one hand, under a closer analysis, the competence
networks involving code submissions can be seen as a system that produces a “software artifact”; on the
other hand, the communication networks share knowledge that can be exploited as a set of competences.
However, if we focus our attention on communication interactions, the definition of what an artifact is
becomes a problem: can the “communication unit” (an email, a bug report, a forum post) be considered
an artifact?
Maybe we need a less “atomistic” view.
If we look at the communication interaction stream typical of FLOSS projects and consider the
systems that allow the creation and distribution of knowledge inside (and sometimes outside) the
communities, we can define as artifacts all the systems that enable communication between our
agents.
Since, as shown by much research, code development and the code itself (the artifact of the development
process) are highly hierarchical and centralized29, we need a broader definition covering the artifacts
(forums, mailing lists, bug-tracking systems) involved in the less hierarchical, decentralized communication
process.
The mailing list network visualization of the last case study, on the Debian project, helps us with an empirical
demonstration of this concept: we can easily see the interactions between different types of agents
(seekers, providers, brokers) and the artifacts that enable these interactions (the three lists).
Although the interactions can also be seen without artifacts as hubs, purely as interactions between
agents (as in Crowston and Howison's studies), several studies on learning and knowledge
creation processes stress the importance of the tools that enable interaction between communities'
members30.
If we want to follow agent-artifact theory more faithfully, it is better to analyse the competence
structures as networks of individuals who share code as an artifact: the “system building process” can thus
be analysed at different levels.
Developers and users, seen as economic agents at the micro-micro level, use attribution to better
understand other developers'/users' roles and, in the learning environment typical of FLOSS projects, to
better understand the features of the code.
The problem with agents' directedness is that agents in a FLOSS project are virtually free to do anything
they want; there is no planning nor a “role setting procedure”, and this could lead to the misleading belief that
these agents are all equal.
The previous case studies showed that FLOSS development has a certain degree of freedom, but that it is very far
from being anarchic: the members of a community know who the others are, what they do, and that
there are roles.
Hence, considerations about the directedness of agents can be made even without “central planning” or
goal setting, as Linus Torvalds argues: the orientation of a FLOSS community is based on its shared
values, ethical rules and other aspects related to reputation that will be discussed in the second part

28 The modularity of the code makes this example clearer: the dependencies between different modules of a piece
of software lead to interactions about the intertwined competences needed to perfect subsystem compatibility,
leading to overall system improvement.

29 Code is defined by syntax and semantics: the development process obviously has to be structured in terms of
centralization and hierarchy; the very nature of code characterizes its development.

30 Tuomi says that “Knowledge is embedded in social practices, conceptual systems, and material artifacts that are
used in social practices” (see References)
of the paper.
The cognitive structure proposed for the network's agents fits the “build upon others' work” paradigm perfectly:
code bugs are interpreted by comparing every problem the agents encounter with a sort of
“shared knowledge database”31, and then applying cognitive operators (COMPARE, COMBINE and
TRANSFER) directly to the code/artifact.
The problem of competence generation, at the micro-level, faces the twofold challenge of generating
new interactions (and, thus, new competences) on one hand, and of using the newly
generated scaffolding competences to build a stable system on the other.
The tension characterizing agents' behaviour at this level of interaction is due to the effort towards
building closed-loop, self-reinforcing, stabilized structures and the obvious need for new generative
relationships, which can lead to system instability.
In the comparison with FLOSS development this process can be seen as a problem of authority and
coordination: there is a strong will, especially among experienced developers, to consolidate knowledge and
practices about single software modules and, in general, not to violate certain “guidelines”, which may be
technical rules or simply shared values.
On the other hand, in such a lively and active environment it is simply impossible to avoid “new ideas”,
new solutions and new feature proposals: project forking32 can be seen as the extreme result of the
destabilizing function created by new generative relationships.
The stability of a FLOSS project results from a mutually perceived aligned directedness, a set of
permissions given to the community members to continue their actions/communications, some sort of
communication network (in our case a web-based one) that enables contact between actors, and
resource collection abilities.
The case of the Linux 8086 project is a shining example of denied permission: the “kill-file” application
filtering the project's mailing list excluded the core developers from the users' group, “drying up” the
interaction stream.
Again, preliminary studies on FLOSS demonstrated that most projects do not pass their preliminary
phases (normally coded as pre-Alpha, Alpha and Beta); the main reason could be that they are
personal projects maintained by a single individual, so that the community is not large and strong enough to
carry a sufficient level of generative potential.
The problem of competence creation in agent-artifact theory has some aspects that can hardly be
mapped onto FLOSS projects: for example, there is no formal top-down structure that can direct some
sort of competence creation process, so there is no mixing of top-down and bottom-up processes.
There is surely a delegation process for competences that can sometimes lead to the creation of a new
agent or, better, to a new attribution of identity to an agent: an example could be a user who starts working on
a particular feature and then becomes that feature's/module's recognized maintainer.
The second problem concerning competences is their two-step stabilization, agentizing and then
scaffolding: the agentizing process occurs when a particular agent has the permission to “speak for” a certain
building block of a specified competence and to make changes in a specific competence network.
This process can be compared to the nested structures of “trusted lieutenants” and “credited maintainers”
(investigated in some studies presented just after the next paragraph) who filter code submissions and
release the different patches for every single module a maintainer oversees.
The concept of a market system lying at the meso-level is probably the most difficult to compare
with FLOSS development: a community of developers can be considered a gift economy based on
shared values, with reputation as a reward, but there are criticisms, and some authors argue that it is a
special, extreme case of some kind of “pseudo-economical” market system.
Roles and rules are not formally shared among FLOSS developers; or, better, some rules become formal
guidelines, scaffolding principles of the communities, while other norms are transmitted by word of mouth:
the frequent use of communication networks enables the diffusion of these informal practices.
Despite the non-monetary rewards, we can observe that many FLOSS projects are in competition: indicators
such as the number of downloads or the size of the community have a great influence on the “market
share” of a project.
Developers can thus be motivated by the will to contribute to better designed and written software
31 This can be compared with a “memory artifact”, since we are talking about computer systems, but it can also
refer to individual, personal experience.

32 A project fork happens when the original project is split into another project by a group of developers who, for
different reasons, decide to build a new project with different technical features or based on other shared values.
modules (and by the reputation reward they can gain from their experience), improving the internal
performance of a project and thus making it more competitive in the supposed FLOSS “market
system”.
Rule-setting inside and outside a system is an important issue that is difficult to analyse for FLOSS
development, while it is a main point of interest for organizations like the EU, which are studying FLOSS
to gather hints for policy making and to foster competitive and innovative environments for developing
or struggling economies.

Some conclusions about FLOSS structure and ideas for future researches

What I presented in this first part of the paper is only an overview of some of the growing number of
studies on FLOSS development from the last few years: they are preliminary, pioneering researches
on a software development method (one that also involves broader features apart from coding) that was
first defined by Eric Raymond only in 1998.
These early studies on the structure of these collaborating, distributed teams confirm a two-sided reality of
hierarchy and centralization: on one hand, looking at the coding, development processes show
a wide range of centralization and hierarchy levels; on the other hand, the communication
processes of larger projects tend to be less centralized than those of smaller ones.
The problem is to analyse data from different sources in order to extend the number of
projects involved: the aim of these researches is to find common features, so the greater the number of
projects analysed, the greater the quantity of valuable data collected.
The first study about project centralization is an example of why analytical case studies on FLOSS are
needed: the so-called “practitioner-advocate literature” is in fact a collection of interviews and anecdotal
stories that describe the FLOSS movement in a not very objective fashion, while the differences in projects'
centralization reveal that there is no common pattern.
A weak point of these researches is that the projects taken into account are successful ones: future
studies should investigate and compare successful FLOSS projects with unsuccessful ones, considering
the well-known fact that it is far more probable that a project of this kind will fail or will not pass its
preliminary phase33.
Another interesting tip for future research could be the comparison between the communication patterns
observed in FLOSS teams and in proprietary software development teams: again, the static view these
studies offer of such communication structures is interesting, but a study of the dynamic changes that
surely characterize team and community evolution over time would be even more so.
Speaking of evolution, there is also strong interest in the evolutionary features and complex-systems
comparisons that some researchers in complexity theory have tried to apply to the Open Source world.

The remarkable evolution of the Bazaar

One of the keenest reviews of and responses to Raymond's “The Cathedral and the Bazaar” is a
thesis/paper by sociologist Ko Kuwabara: in his work he describes the “bazaar-like” model using
complexity theories to explain the emergent patterns and evolution of Linux operating system34
development.
I will follow the outline of Kuwabara's paper, but first we have to understand why the Linux case is so
important for studies on Open Source.
First of all, Linux is considered a first-class operating system, a shining example of nearly perfect
computer system design: despite a huge, geographically dispersed, volunteer-based community without a
formal planning program, Linux is considered35 the best operating system available, far better than

33 Projects' statistics on SourceForge are clear: only a very small percentage of projects reaches the
Production/Stable and Mature status, while the majority rests in a Pre-Alpha or Alpha phase.

34 Kuwabara wrongly talks about the “Linux Operating System”: his analysis focuses, instead, mainly on Linux
kernel development. The Linux kernel is a Unix-like operating system kernel, created by Linus Torvalds in
1991. At the time, the GNU Project had created many of the components required for a free software operating
system, but its own kernel, GNU Hurd, was incomplete and unavailable.

35 This is not the place where I want to discuss the comparison between Linux and other operating systems: some
proprietary products.
Second, size matters: the Linux project involved, and continues to involve, an unprecedented number of
developers from all around the world: we are talking about (by a 2000 estimate) 40,000 volunteers working
with no formal organization36.
Linux is suitable for an analysis from the perspective of complexity theories because of its nature as an
“improbable” artifact (or “impossible good”): a project of that size, without central control, involving
thousands of volunteering individuals from around the world, makes no sense in terms of coordination,
motivation, planning and economic action.
Despite this chaotic picture, the Linux project survived and created a sort of order out of disorder,
self-organizing its development and communication system and evolving into something better and different over time.

Linux as a complex system

The Linux project can be seen as a complex system: it is a hierarchical system with two different
interaction levels, one regarding the code, its hierarchical, logical structure and development, and another
regarding the community of users and developers, with a decentralized structure based on individuals'
motivation and coordination.
These two “local interaction layers”, one for the code and one for the community, stress the above-
mentioned characteristic of Linux as an “improbable artifact”: doubly improbable, as Kuwabara states, once
for the technical complexity of the code and again for the complexity of the social network involved.
We have to remember that Linux was born as the personal project of a twenty-something computer science
student, a bunch of lines of code that at the very beginning were not even usable for any purpose other than
testing and experimenting: the complexity of this system is an emergent property that grew over the years,
as the size of the supporting community around Linus Torvalds increased.
“Around Linus Torvalds”, I said: here lies another common issue, the issue of authority and, conversely,
the lack of it.
Torvalds is recognized worldwide as the leader of the Linux kernel project: but what does this role imply,
if research, studies and interviews on Open Source and its practitioners tell us that there is no one
establishing rules and plans?
Certainly, over the years Torvalds has kept Linux kernel development under control; he has the final word
on what the update patches must or must not contain: in the early years he also contributed strict
development guidelines37 because of his commitment to a unified, high-quality standard of coding.
Obviously, one man alone cannot review the thousands of lines of code that are changed every week in this
huge community: Torvalds works closely with a kind of “inner circle” of core developers, who gained
respect and technical expertise from their early collaboration on the project.
This aspect reflects the general onion-like structure discussed above, but, again, there is an emergent trait
even here: no one knows who these “lieutenants” are, nor how many of them there are38.
The “inner circle” is then only a general categorization for those who act as maintainers of specific
modules: they are a sort of self-created39 middle layer between Torvalds and the other developers/users, a sort
of delegation/entrustment act by the Linux founder, due to the continuing growth of his project.
The benefits of decentralization are thus efficient human resource allocation and the enabling of parallel
development.
From the point of view of complexity theories, Torvalds, in his role of “final-word-setter”, only selects
facts, such as its security, stability and the rate and speed of its bug-fixing process, are unquestionably better than
in proprietary operating systems. Other supposed points of superiority may be less clear and less objective.

36 As Mockus and Herbsleb state, “Work is not assigned; individuals choose what work they will do. The choices are
constrained, however, by various motivations that are not fully understood. For example, it can be assumed that
developers try to maximize the chance that their code will be included in a release, and will enhance their
reputation.” (see References)

37 There are many interviews confirming that Linus repeatedly rejected patches that were not written in a clear and
simple way.

38 Their number ranges from “6/12” to “100 very active kernel folks”, according to some developers' interviews.

39 Again, as said before, developers themselves decide what to develop, and some became primary maintainers
because of their own experience, not because Linus or anyone else decided their role.
among patches and code submissions those with a high fitness: the quality of the project can be seen as an
emergent property of an evolutionary process, based on local interactions at the code and discussion levels.
Evolution is a natural element of any complex system, insofar as local interactions between actors are
responsible for the mechanisms of adaptation that fuel the evolutionary process itself: John Holland's idea of a
complex adaptive system could be suitable for a comparison with the Linux project.
Some brief concepts from biological evolution can help us with the comparison with the Linux project's evolution.
Biological evolution is mediated by the genetic system: the gene provides the basic instructions for the
protein-building process that will define the physical and behavioural characteristics of a certain
organism.
The second task performed by the gene is to provide a physical medium for passing genetic
information from one generation to the next: the gene replicates itself through a process of cell division.
Variation in genes is caused by two processes: mutation, which introduces variation through
spontaneous, random changes in a particular gene, and crossover, which introduces variation through the
sexual exchange of genetic material between organisms.
Mutation and crossover are both random processes that account for endless variability between
generations.
The concept of “blind evolution” is too strong if we want to compare evolution in the biological
realm with the Linux project: the very existence of experienced developers implies that a complete lack of
foresight is impossible, even if there are no blueprints for software design; rather, we can say that in this
particular case evolution is “short-sighted”.
Evolution theory helps us in that it permits an “elegant” way to describe how the Linux project can
produce such a complex, high-quality product without a strong foresight capacity or a top-down
development model.
The idea of cumulative selection gives us the notion that changes happen gradually, step by step: the
comparison with code editing is obvious and clear.
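The power of cumulative selection can be illustrated with a toy simulation, a variant of Dawkins' well-known “weasel” demonstration (the target string and alphabet here are arbitrary choices, not anything from the Linux project): keeping each small change that does not reduce fitness reaches the target step by step, where a one-shot blind guess over the same space would be hopeless.

```python
import random

random.seed(0)
TARGET = "stable kernel"
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def fitness(candidate):
    """Number of positions that already match the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))

# start from a completely random string
current = "".join(random.choice(ALPHABET) for _ in TARGET)
steps = 0
while current != TARGET:
    steps += 1
    # mutate a single random position...
    i = random.randrange(len(TARGET))
    mutant = current[:i] + random.choice(ALPHABET) + current[i + 1:]
    # ...and keep the mutant only if it is at least as fit (selection)
    if fitness(mutant) >= fitness(current):
        current = mutant
```

By contrast, the chance of producing the same 13-character string in one random draw is (1/27)^13, which is why gradual, selected steps, rather than blind luck, can account for a complex result.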
But the process of code editing is the same for Open Source projects and proprietary software production:
the difference lies in parallel code editing, typical of a community with a great number of actors
involved in a process of selecting the best code features.
In the economic and business environment in which proprietary software development takes place,
parallel editing is avoided and software specifications are established top-down by management:
developers are paid for this.
These constraints are totally absent in an Open Source project like Linux.
Again, these “parallel efforts” should not be overestimated: in the Linux project's case, maintainers oversee
the development, preventing excessive, redundant solutions, so project forks are usually avoided.

The meme, the gene and the code

Since we are talking about the evolution of a system that also allows the creation and diffusion of knowledge,
we can also investigate some aspects of cultural evolution.
The concept of the meme was introduced by ethologist Richard Dawkins in his seminal work “The Selfish
Gene”: a meme is, simply put, “a unit of cultural transmission, or a unit of imitation”40; like a gene, a meme
is a replicator that is able to propagate itself across populations.
The role of the meme in cultural evolution is similar to the role of the gene in biological evolution: the main
difference is that a meme can flow laterally and vertically, across time and across populations, while a gene can only
be transmitted down through generations.
A meme spreads through a process reminiscent of Lamarckian evolution: classical Darwinian
evolution is blind and dumb, while in the Linux project's case the evolutionary process is not-so-blind and not-so-
dumb, because the capacity to learn, coupled with a little foresight, means that variation is less random
and more self-directed.
Genetic evolution introduces changes in the frequency distribution of certain traits in populations,
while memetic evolution adds a new dimension to population change by also allowing individual learning: it
introduces changes in the probability distribution of a behavioural or cognitive pattern in the actor's

40 The science of memetics is quite a borderline one: I do not want to expand on this argument here, since different
definitions and applications of meme theory have led to different fields of study, sometimes with some conflict
between different schools of thought.
repertoire.
Just as there is no single gene for, say, the blue-eye trait, memes too rarely occur in isolation:
more frequently, memes come in organized groups called memeplexes, which are self-reinforcing
structures composed of a certain number of coherent memes.
Linux can be seen as a memeplex whose source code is composed of memes that satisfy almost all of
the following attributes influencing memes' proliferation: coherence, novelty, simplicity, individual utility,
salience, expressivity, formality, infectiveness, conformism and collective utility41.
The “selective pressure” inside the Linux process tends towards the submission and implementation of
patches that satisfy the attributes above almost completely: new code must be coherent, bug-free,
contain new and superior functionality, and be simple and easily understandable by other community
members.
The quality of the code is then assured through feedback processes: when new features are proposed,
they are tested by several developers in many different environments to evaluate their stability, and
discussions about what can be included involve the entire community.
Feedback processes in the case of Linux project are enabled by a series of factors:

1. the use of Internet for communications (no need for further explanations here);

2. the Linux kernel code is written in C, a digital language: comparing this with genetic information,
coded as specific sequences of molecules, we can assume that the source code works as a
replicator42;

3. opening the code exposes more bugs to more eyes, increasing the probability of detection
and, subsequently, of fixing.
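The “more eyes” argument can be put in simple probabilistic terms (the per-reviewer probability below is an invented figure, not a measured one): if each of n independent reviewers spots a given bug with probability p, the bug is found with probability 1 - (1 - p)^n, which grows quickly with n.

```python
def detection_probability(p, n):
    """Chance that at least one of n independent reviewers finds the bug,
    assuming each reviewer spots it independently with probability p."""
    return 1 - (1 - p) ** n

few_eyes = detection_probability(0.05, 5)     # a small closed team
many_eyes = detection_probability(0.05, 200)  # a large open community
```

With these illustrative numbers, five reviewers find the bug roughly one time in four, while two hundred reviewers find it almost certainly, which is the intuition behind “given enough eyeballs, all bugs are shallow”.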

The evolution of this project has some features that explain matters of quality and speed of development:
moving from the evolution of a biological system to that of a non-biological one could be seen as a
reckless step on one hand, but on the other hand the similarities in many aspects are too strong to be
ignored.
Variation within the Linux project, as I reported above, is much less random than in biological environments:
it is more self-guided, by virtue of the “short-sighted” learning of human actors, which is facilitated by the
opening of the source code, a condition that, coupled with the use of ICT systems, enables memes' exchange and spreading.
The reasons explaining the quality and speed of code development, in terms of evolutionary trends of
search and exploration of new solutions, do not explain why this complex community does not crash down
under the overwhelming number of actors involved: why doesn't chaos win it all?

Self-organization and Linux as an impossible good

The idea of a chaotic process of searching and exploring, supported by the parallel development and bug-
fixing processes, conveys an impression of disorder and disorganization: but the Linux project is known for its
quality and speed of development, so why doesn't this chaos disrupt the collaborating structure?
The other fundamental concept, if we want to look at Linux as a complex system, is self-organization: self-
organization is the counterpart of chaos, and it keeps the system in a relatively stable state despite its
bazaar-like activity.
Positive feedback works as a reinforcing element for self-organization, creating loops that sustain a cycle,
as Brian Arthur's case study on VHS/Betamax has shown: the idea that small changes applied to networks of
interconnected agents lead to greater changes is known under several names, such as positive
feedback, increasing returns and non-linearity.
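A Polya-urn-style toy model (all numbers invented) captures this positive-feedback dynamic: each new adopter picks a technology with probability proportional to its current share, so early random fluctuations are frozen in rather than averaged out, the path dependency Arthur describes.

```python
import random

random.seed(42)

def simulate_lock_in(adopters=10_000):
    """Each adopter joins a camp with probability proportional to its current share."""
    shares = {"VHS": 1, "Betamax": 1}  # both technologies start with one adopter
    for _ in range(adopters):
        total = shares["VHS"] + shares["Betamax"]
        pick = "VHS" if random.random() < shares["VHS"] / total else "Betamax"
        shares[pick] += 1  # success breeds success: the chosen camp grows
    return shares

final = simulate_lock_in()
leader_share = max(final.values()) / sum(final.values())
```

Re-running with different seeds yields very different final splits, driven by early chance rather than intrinsic merit; making the choice probability more than proportional to the share (a superlinear urn) turns this persistence into full lock-in on a single winner.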
Looking for a comparison with the Linux case, we have to face a contradictory element in developers'
behaviour: formally, the motivation that inspires developers is a sort of hacker ethic, establishing the main
idea that anyone can modify open source code.
There is, indeed, a strong concept of “ownership”, implying that it is well known who has the rights to

41 From the paper on memetics “Evolution of Memes on the Network: from chain-letters to the global brain” by
Francis Heylighen (http://pespmc1.vub.ac.be/papers/Memesis.html).
42 Fecundity, copying-fidelity and longevity are the essential qualities of a replicator, according to Dawkins: it is clear
that digital information satisfies all these criteria.
modify and redistribute a certain piece of software43.
This is a natural consequence of what Raymond recalls in his essay “Homesteading the Noosphere”:
people who start a project are seen as important figures by those who follow; they are considered the
“owners” of the project.
An emergent ownership system like this, born with Linus Torvalds' “final-word-setter” role in the early
stages of Linux development, is a feature that has locked the community into a series of stable patterns
of interaction around maintainers over time, resisting the growing numbers and complexity of the
community itself.
While we were previously focusing on parallel development features, we now shift to evolutionary
processes at the community level: what are the motivations and incentives that make Linux possible?
Questions about the production of a “public good” face a huge dilemma: once produced, by definition, a
public good is available both to those who contributed and to those who did not.
This implies that if everyone “free rides”, enjoying the good without supporting its production, the good itself
is never produced: individuals need to be motivated to overcome both free riding and the efficacy problem, the
perception that one person cannot make a visible, recognized difference to the community.
The larger the community, the greater the probability that members free ride without affecting
production or getting caught: it is also all the more difficult to face the problem of efficacy and to coordinate all
those individuals.
Linux, as a digital public good produced by a community of volunteers with no formal coordination, should be
impossible: since, as a public good, Linux is also characterized by perfect “jointness of supply”44 (meaning
the good cannot be depleted) and “non-excludability” (no one can be prevented from consuming the good),
the incentives for its production should not be enough; hence it is not only impossible but even paradoxical.
The problem of the theoretical impossibility of Linux development (not the free-riding problem but the
efficacy one) is partly solved if we think of it not as a problem of individual efficacy (enabling collective action)
but as one of group efficacy: group efficacy that motivates individual contribution.
In short, individuals are more prone to join successful or promising groups that facilitate social learning
through positive, reinforcing feedback loops.
Larger groups are, from the efficacy point of view, more successful: in a population as heterogeneous and
large as the Linux community it is more likely to find “outlier individuals” who spend more time, skills
and resources on some projects.
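This size effect can be made concrete with a small sampling sketch (skill levels are drawn from an arbitrary uniform distribution, purely for illustration): the expected best skill level found in a group grows with group size, so a very large community almost surely contains outliers.

```python
import random

random.seed(1)

def expected_best_skill(group_size, trials=2000):
    """Average, over many trials, of the best skill level in a random group;
    skills are i.i.d. uniform on [0, 1) for illustration only."""
    return sum(
        max(random.random() for _ in range(group_size))
        for _ in range(trials)
    ) / trials

small_group = expected_best_skill(10)
large_group = expected_best_skill(1000)
```

For uniform skills the expected maximum of n draws is n/(n+1), so a group of 10 tops out around 0.91 on average while a group of 1000 is essentially certain to contain a near-maximal individual.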
These pivotal individuals thus propel the process of team creation: but if we have started to figure out
“how” this happens, we are still looking for the “why”. A number of surveys have been collected and,
despite their heterogeneity, three main motivations emerge clearly: enjoyment, reputation and
attachment to the community.

Illustration 13: Rationalist model for reputation

The “reputation game” seems to be a strong motivation: researchers, and Raymond himself, called it the
“reputation game” because, again, it can be seen as a self-reinforcing cycle.

43 As a common practice, project forking is generally avoided and new code is not accepted without the consent of
moderators.

44 Also known as “non-rivalry”.


Illustration 14: Evolutionary model for reputation game

It is easy to notice that this second “version” of reputation has a strongly reinforcing nature: the
“evolutionary” model offers a richer, two-sided view of reputation, as a motivation for programming
and sharing on one side, and as an element that “locks” people into patterns of collaboration and
interaction on the other.
These two aspects of reputation are not further investigated here, but the author suggests that they could
be tested with computer simulations after a proper data collection.
As Axelrod claims, in crowded communities with frequent and repeated interactions reputation both
opens new opportunities and reinforces existing patterns: the norm of sharing code emerges to the
extent that reputation “suggests” that the developer stay with the project, while the norm of ownership
emerges as reputation signals to other community members that a certain project already has a capable
maintainer.
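To make the reinforcing dynamic concrete, the reputation game can be sketched as a toy simulation; the parameters and the proportional-selection rule are my own illustrative assumptions, not taken from the studies discussed:

```python
import random

def simulate_reputation(n_devs=20, rounds=500, seed=42):
    """Toy 'reputation game': each round one developer contributes; the
    chance of being the contributor grows with reputation already earned,
    so early success is self-reinforcing (a Polya-urn-style dynamic)."""
    rng = random.Random(seed)
    reputation = [1.0] * n_devs          # everyone starts equal
    for _ in range(rounds):
        total = sum(reputation)
        # pick a contributor with probability proportional to reputation
        pick = rng.uniform(0, total)
        acc = 0.0
        for i, r in enumerate(reputation):
            acc += r
            if pick <= acc:
                reputation[i] += 1.0     # each contribution earns reputation
                break
    return reputation

if __name__ == "__main__":
    rep = simulate_reputation()
    share_of_top = max(rep) / sum(rep)
    print(f"top developer holds {share_of_top:.0%} of total reputation")
```

Even with identical starting conditions, the feedback loop concentrates reputation on a few developers, which is the “lock-in” face of the evolutionary model.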

Adaptive coordination evolution in Linux

A complex evolving system45 is a system that co-evolves with its environment, so that the evolution of
one system partially depends on the evolution of other systems.
A complex evolving system has the following characteristics:

1. Connectivity or interdependence
Connectivity concerns both the elements within a system and the relations between systems.
A low degree of connectivity leads to static systems, while strongly connected ones are typically
unstable.

2. Self-organization or emergence
Emerging structures and orders of hierarchies depend upon the level of interconnection between
the elements and the systems.

3. Exploration of the space of possibilities
When systems become unstable they are pushed to search for new possibilities and a new order
through innovation and diversification.

4. Feedback processes
Self-reinforcing feedback, path-dependency.

45 “Complex evolving system” is used instead of “complex adaptive system”: the CES notion is more appropriate
since we are considering a human complex system, which has peculiar features that distinguish it from other
(biological, physical, chemical) complex systems.
Illustration 15: Heterarchies as loosely coupled systems – T
(Torvalds), TL (Trusted Lieutenant), CM (Credited Maintainer), D/U
(Developer/User)

Heterarchies can be conceptualized as loosely coupled systems, with a low degree of connectivity
between their subsystems or elements: compared with hierarchies, heterarchies show more decentralized
interactions, with actors who adapt to each other in a “parametric fashion”.
By parametric adaptation researchers mean that every decision maker in the network considers its own
prior decisions and those of the others in order to adapt its own decision process.
The figures below are two examples of decision-making networks: Illustration 16 is an example of
supervisor/subordinate hierarchical decision making, while Illustration 17 shows a decentralized,
heterarchical decision-making process.

Illustration 16: Centrally regulated complex decision making
Illustration 17: Self-reinforcing, adaptive decision making

In a “centrally regulated” system there are decisions to be coordinated on one side and coordinating
decisions on the other, so coordination processes and decision processes are separate; in the mutual-
adjustment, adaptive system every decision is part of the coordinating process.
The process of new kernel releases, where developers use Torvalds' releases as a base for
experimentation and the implementation of new features, is an example of parametric adaptation: the
freedom to explore, use and modify the source code becomes a way to explore the space of possibilities.
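A minimal sketch of parametric adaptation, under the simplifying assumption that “decisions” are numeric values and that each agent moves toward the average of the others' previous choices, might look like this:

```python
def parametric_adaptation(initial, rounds=50, weight=0.5):
    """Each agent repeatedly adjusts its own decision toward the mean of
    the others' previous decisions; no agent ever receives a central
    directive, yet the group converges on a shared outcome."""
    decisions = list(initial)
    for _ in range(rounds):
        prev = list(decisions)               # everyone observes the last round
        for i in range(len(decisions)):
            others = [d for j, d in enumerate(prev) if j != i]
            target = sum(others) / len(others)
            decisions[i] = (1 - weight) * prev[i] + weight * target
    return decisions

if __name__ == "__main__":
    final = parametric_adaptation([0.0, 1.0, 4.0, 9.0])
    print(final)  # all agents end up near the group consensus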
Again, as the previously mentioned studies on structure tried to demonstrate, this adaptation leads to a
modular structure for code development, especially in the case of bigger communities: Torvalds
obviously cannot check all the code submitted by thousands of developers, but through an emergent
process layers of trusted individuals filter the patches so that only the best get through.
In this way the code that reaches Linus Torvalds is nearly perfect, and the “only” decision he has to take
is what to include in new releases. It took years to achieve such an elegant and effective structure, but
we have to remember that it is an entirely emergent trait that appeared during the evolution of the
developers' community.
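The layered filtering just described can be sketched as a pipeline of successively stricter review bars; the thresholds and the quality scores are hypothetical, purely for illustration:

```python
def filter_patches(patches, thresholds=(0.3, 0.6, 0.8)):
    """Patches (quality scores in [0, 1]) pass through successive layers
    of reviewers, e.g. credited maintainers then trusted lieutenants,
    each with a stricter bar, so only near-perfect code reaches the
    final integrator."""
    surviving = list(patches)
    per_layer = []
    for bar in thresholds:
        surviving = [q for q in surviving if q >= bar]
        per_layer.append(len(surviving))
    return surviving, per_layer

if __name__ == "__main__":
    patches = [i / 100 for i in range(100)]   # 100 patches, quality 0.00-0.99
    top, counts = filter_patches(patches)
    print(counts)        # shrinking pipeline: [70, 40, 20]
```

Each layer sees only what the layer below let through, which is why the volume reaching the top stays manageable no matter how many patches enter the pipeline.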

Emergent decision-making patterns

Decision making and leadership have also been studied by the Syracuse University research group: early
studies demonstrate that there are different styles of participation in decision making.
At one extreme there are projects like Linux, where a small group, or even a single person, takes the
crucial decisions; at the other extreme there are very decentralized and democratic structures that use
voting systems to take decisions.
Researchers selected six projects from two very different software categories: three instant messenger
products and three ERP (Enterprise Resource Planning) platforms.
First, they analysed these different products to investigate possible differences between single-user
“desktop” products and business-oriented, multi-user products.
Second, all these projects are in stable/production status and five of them are hosted on SourceForge, so
data are available to track their development processes.
Finally, these are considered successful projects46: the increasing numbers of downloads and of
developers and users attracted are indicators of members' satisfaction.
The analysis was then conducted on emails involving decision episodes: these messages were taken into
account because they contain a trigger that allows mailing list users to choose among different possible
options.
In order to observe changes in the decision-making process, the email sampling was carried out over
three different periods: researchers took the first 20 decision episodes of each project (Beginning
Period), the last 20 (Ending Period) and another 20 decision episodes around a major code release,
roughly halfway between the Beginning and Ending periods.
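The sampling scheme can be sketched as follows; note that the actual study anchored the middle sample on a major code release, while this sketch simply uses the chronological midpoint as a stand-in:

```python
def sample_episodes(episodes, k=20):
    """Select the first k, last k, and k around the midpoint of a
    chronologically ordered list of decision episodes."""
    if len(episodes) < 3 * k:
        raise ValueError("not enough episodes for three disjoint samples")
    mid = len(episodes) // 2
    return {
        "beginning": episodes[:k],
        "middle": episodes[mid - k // 2 : mid + k // 2],
        "ending": episodes[-k:],
    }

if __name__ == "__main__":
    periods = sample_episodes(list(range(200)))
    print([len(v) for v in periods.values()])   # [20, 20, 20]
```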

The decision episodes were then coded, taking into account the number of messages per episode, the
duration of the episode, the total number of participants and the role of each message's sender.
After a first analysis, five additional variables were added: decision type, decision trigger type, decision
process complexity, decision announcement and decision style47.

46 Crowston, Howison and Annabi elaborated a multidimensional evaluation method to assess whether a FLOSS
project is successful or not (see References).
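The coding scheme just described can be rendered as a simple record type; the field names are my own paraphrase of the variables, not the researchers' actual codebook:

```python
from dataclasses import dataclass

@dataclass
class DecisionEpisode:
    """One coded decision episode from a project mailing list."""
    n_messages: int        # messages in the episode
    duration_days: float   # elapsed time of the episode
    participants: int      # distinct senders involved
    sender_roles: list     # role of each message's sender
    decision_type: str     # "CODE" or "Non-CODE"
    trigger: str           # e.g. "bug report", "patch submission"
    complexity: str        # "Single", "Multi-Simple", "Multi-Complex"
    announced: bool = False  # was the decision explicitly announced?

if __name__ == "__main__":
    ep = DecisionEpisode(
        n_messages=7, duration_days=3.5, participants=4,
        sender_roles=["developer", "admin", "user", "developer"],
        decision_type="CODE", trigger="patch submission",
        complexity="Multi-Complex", announced=True,
    )
    print(ep.complexity)
```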

The comparison between the decision processes of two of these projects, Fire and aMSN, shows
different trends in user, developer and administrator involvement over time: aMSN is clearly a
successful project that keeps attracting and involving a growing number of people, while Fire
development is surely slowing down.
Comparing these projects from a core/periphery point of view confirms the previous trends: it is evident
from Tables 4 and 5 that in the Fire project core developers lost interest over time, while their
counterparts in the aMSN project kept a certain control over decision-making practices.

Tables 4 and 5 – These data make clear that the Fire project suffers from a lack of continuity among
administrators while the importance of developers grows; on the other hand, the aMSN project kept a
more stable group of decision makers.
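The kind of role-continuity comparison summarised in Tables 4 and 5 amounts to tallying message senders by role within each sampling period; a sketch, with invented placeholder data rather than the study's real counts:

```python
from collections import Counter

def role_trends(episodes_by_period):
    """For each sampling period, count how many decision-episode messages
    came from each role (admin, developer, user)."""
    return {period: Counter(roles)
            for period, roles in episodes_by_period.items()}

if __name__ == "__main__":
    data = {  # invented placeholder data
        "beginning": ["admin"] * 12 + ["developer"] * 5 + ["user"] * 3,
        "ending":    ["admin"] * 2 + ["developer"] * 11 + ["user"] * 7,
    }
    trends = role_trends(data)
    print(trends["beginning"]["admin"], trends["ending"]["admin"])  # 12 2
```

A falling admin count between periods, as in this made-up example, is the signature of the continuity loss observed in the Fire project.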

Since all these statistics have a rather “linear” nature, researchers then faced the problem of evaluating
decision process complexity: Fire project decisions are increasingly based on single-choice triggers,
while aMSN decision making involves a growing number of multiple-complex episodes.
These multiple-complex decision episodes, according to the authors, can be compared with the “garbage
can” decision theory of Michael D. Cohen: in this type of decision making, some problems are solved
by solutions that were already “in the can” before the problem itself became evident, or the search for
the solution to a specific problem leads to the solution of another one, leaving the former unsolved.

47 Decision type: CODE or Non-CODE decisions. Decision trigger type, for CODE decisions: (1) bug reports, (2)
feature requests, (3) problem reports, (4) patch submissions, (5) release to-do lists and (6) mixed bug and feature
lists not associated with releases. Decision process complexity: “Single” for single choices, “Multi-Simple”
(multiple choices, e.g. a to-do list) and “Multi-Complex” for complex decisions that involve patterns leading to new
choices and problems. Decision announcement: a confirmation that a decision has been taken. Decision style: since
the analysis is on-going, this variable is not used yet.
In other words, garbage can theory disconnects problems, solutions and decision makers from each other,
unlike traditional decision theory.
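A minimal sketch of a garbage-can process, loosely inspired by the simulation idea of Cohen and colleagues (the arrival rates and the matching rule here are arbitrary assumptions of mine):

```python
import random

def garbage_can(n_problems=10, n_solutions=10, rounds=30, seed=7):
    """Toy garbage-can process: problems and solutions float around
    independently; a problem is 'solved' only when it happens to meet a
    compatible solution already waiting in the can, regardless of when
    either of them arrived."""
    rng = random.Random(seed)
    problems = set(range(n_problems))
    can = set()                                  # solutions waiting in the can
    solved = []
    for _ in range(rounds):
        can.add(rng.randrange(n_solutions))      # a new solution drifts in
        if problems:
            p = rng.choice(sorted(problems))     # a problem surfaces
            if p in can:                         # a matching solution was waiting
                problems.discard(p)
                solved.append(p)
    return solved, sorted(problems)

if __name__ == "__main__":
    solved, unsolved = garbage_can()
    print(f"solved {len(solved)} problems; {len(unsolved)} still open")
```

The sketch captures the decoupling: which problems get solved depends on what happened to be in the can, not on any orderly pairing of problems with decision makers.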
The correlation between the changing trends of decision complexity types and the other variables needs
further analysis: what this case study presents are only the preliminary, incomplete results of a long-
term research project.
Moreover, the rising importance of an anarchical decision-making type like the one described by
garbage can theory invites some considerations about the emergence of such decisions in certain
projects, and about whether this kind of decision making could be widespread in FLOSS communities.

Illustration 18: Complexity decision trends over time, Fire project
Illustration 19: Complexity decision trends over time, aMSN project

Some thoughts on FLOSS as a complex, evolving system and ideas for future research

This outlook on the Linux project and on FLOSS development teams through complexity theories is,
like the previous studies on communicational structures, made up of early works and personal research
findings: all these studies suffer from a lack of broad data upon which findings could acquire “true”
analytical value.
Obviously the comparison between biological and non-biological systems is neither easy nor widely
accepted, but it is plain that, unlike proprietary software development with its top-down, strongly
hierarchical organization and management, FLOSS development shows the decentralized, self-
organizing and “chaotic” features familiar to all those who study complex systems.
Comparisons between natural and social sciences in the field of complexity theories are just at the
beginning: little effort has been spent so far in developing a “theory of complex social systems”48.
Again, applying complexity theories may lead someone to the misleading belief that they could easily be
turned into a “management tool”: far from it, we are talking about something that is more like a
conceptual framework, a way of thinking.
Certainly, given the increasing possibilities that ICT tools offer us, the more interconnected our society
and economy become, the more likely we are to witness a growing number of complex behaviours and
patterns.
Some of these “complex behaviours” are recognized, almost “public domain”; yet we are far from a
complete understanding of complex organizations, especially complex human systems, since most of
the theory and of the tools used to investigate this field still lack empirical grounding.

Final considerations

From the overview of the case studies I collected (and of the many others I reviewed), one fact becomes
clear: we are facing something “new” and complex, something “out of nothing”, that can possibly lead
(and in some cases has already led) to deep changes in many fields of our economic, political and social
systems.
Free/Libre and Open Source Software development is a fresh field of study: its features and its growth
are interesting for an increasing number of research fields.
Software development matters were the first to be analysed, especially with regard to quality and
development speed: as some authors suggest, proprietary software firms should adopt those FLOSS
practices that can increase software development performance.

48 Eve Mitleton-Kelly, introduction of “Ten Principles of Complexity and Enabling Infrastructures”, see References.
After the first studies on development it became clear that “it wasn't all about code”: the heterogeneity
of FLOSS projects and their dramatically increasing numbers and success stories led to some
considerations on the importance of the “movement”, represented by the communities of software users
and developers.
Moreover, the EU is spending considerable time and money on FLOSS research: the term itself, as I
said in the opening of this paper, has been adopted since its first use by Rishab Ghosh49, who is
currently completing one of the first long-term studies on the economic and social impact of FLOSS.
From the economic and political standpoints FLOSS is a valuable resource: developing countries'
governments are great supporters of Open Source solutions, since these can help them bridge the gap
known as the “digital divide” using cheap but fast, stable, secure and efficient software platforms.
Again, the communication structure of FLOSS communities has proven efficacy and efficiency in
knowledge creation, management and diffusion: since there is a lot of political interest in policy making
for the “Information Society”, the importance of FLOSS goes far beyond “mere” coding and hacking.
Studies on FLOSS could distil its main features and help organizations re-invent their collaboration
systems and knowledge management practices.

A deeper issue concerns Open Source licensing and the future of copyright and patent systems:
alongside the growing number of developers and users, more and more people are interested in matters
of copyright and patents.
Since we live in a digital age with a growing number of digital goods, “ancient”, protective,
monopolistic and restrictive laws like copyright seem outdated or simply not applicable: their
“extension periods” make no sense considering the speed of development of certain digital goods and
the amount of shared knowledge that simply cannot be enclosed.
The notion and use of “copyleft” licences is a “derivative” of the FLOSS movement that can have a
strong impact outside the software world as well: as Lawrence Lessig50 argues, software code plays the
same role in cyberspace as law does in the real world; in fact, he simply states that “code is law”.
Many studies confirm that restrictive copyright laws and patents are, in the digital era, an obstacle to
innovation and creativity: the diffusion of reciprocal licensing51 (such as the GNU GPL) permits
knowledge sharing and avoids any kind of “enclosure”.
The use of alternative licences also for media (photos, texts, videos) other than software may be the
greatest impact that the FLOSS movement could have on our society: it could drive a series of ad hoc
policies and structural re-organizations towards the creation of a real “Information Society”.

49 Rishab Ghosh conducts evidence-based research on the socio-economic, legal and technical aspects of
Free/Libre/Open Source Software (FLOSS) worldwide. He is Founding International and Managing Editor of First
Monday, the most widely read peer-reviewed on-line journal of the Internet. He is Programme Leader at UNU-
MERIT at the University of Maastricht. He also coordinated the European Union-funded FLOSS project. (personal
webpage, http://dxm.org/osi/ )

50 Lessig is an academic and lawyer who became popular for his opposition to copyright extensions and his
support for free culture, free software and knowledge sharing.

51 Reciprocal licensing ensures that development remains collaborative and cannot be exclusively appropriated.
References

J. Coffin
“Analysis of open source principles in diverse collaborative communities“
First Monday, volume 11, number 6 (June 2006),
URL: http://firstmonday.org/issues/issue11_6/coffin/index.html

K. Crowston, K. Wei, Q. Li, and J. Howison
“Core and periphery in Free/Libre and Open Source software team communications”
URL: http://floss.syr.edu/publications/hicss2006.pdf

K. Crowston, J. Howison, and H. Annabi
“Information systems success in free and open source software development: Theory and measures”
URL: http://floss.syr.edu/publications/crowston2006flossSuccessSPIPpre-print.pdf

K. Crowston, J. Howison
“Hierarchy and centralization in Free and Open Source Software team communications”
URL: http://floss.syr.edu/publications/ktp2005.pdf

K. Crowston, J. Howison
“The social structure of Free and Open Source software development”
URL: http://floss.syr.edu/publications/sna-FirstMonday.pdf

R. Heckman, K. Crowston, Q. Li, E. Allen, U. Eseryel, J. Howison and K. Wei
“Emergent decision-making practices in technology-supported self-organizing distributed teams”
URL: http://floss.syr.edu/publications/Heckman2006Emergent_Decision-making_Practices_in_Technology-supported_self-organizing_distributed_teams.pdf

F. Iannacci, E. Mitleton–Kelly
“Beyond markets and firms: The emergence of Open Source networks”
First Monday, volume 10, number 5 (May 2005),
URL: http://firstmonday.org/issues/issue10_5/iannacci/index.html

K. Kuwabara
“Linux: A Bazaar at the Edge of Chaos”
First Monday, volume 5, number 3 (March 2000),
URL: http://firstmonday.org/issues/issue5_3/kuwabara/index.html

S. Krishnamurthy
“Cave or Community? An empirical examination of 100 mature open source projects”
URL: http://pascal.case.unibz.it/retrieve/3242/krishnamurthy.pdf

D. A. Lane
“The structure of agent space: some definitions and problems to resolve”

E. Mitleton-Kelly
“Ten Principles of Complexity and Enabling Infrastructures”
URL: http://www.psych.lse.ac.uk/complexity/ICoSS/Papers/Ch2final.pdf

A. Mockus, J.Herbsleb
“Why not improve coordination in distributed software development by stealing good ideas from Open
Source”
URL: http://mockus.us/papers/whynot.pdf

E. S. Raymond
“The Cathedral and the Bazaar”
URL: http://catb.org/esr/writings/cathedral-bazaar/cathedral-bazaar/

E. S. Raymond
“Homesteading the Noosphere”
URL: http://catb.org/esr/writings/cathedral-bazaar/homesteading/

S. Sowe, I. Stamelos and L. Angelis
“Identifying knowledge brokers that yield software engineering knowledge in OSS projects”
URL: http://opensource.mit.edu/papers/IST-Vol-48-11-2006.pdf

I. Tuomi
“What did we learn from open source?”
First Monday, Special Issue #2: Open Source (October 2005),
URL: http://firstmonday.org/issues/special10_10/tuomi/index.html

I. Tuomi
“Internet, Innovation, and Open Source: Actors in the Network”
First Monday, volume 6, number 1 (January 2001),
URL: http://firstmonday.org/issues/issue6_1/tuomi/index.html

D. A. Wheeler
“Why Open Source Software / Free Software (OSS/FS, FLOSS, or FOSS)? Look at the Numbers!”
URL: http://www.dwheeler.com/oss_fs_why.html
Presentation URL: http://www.dwheeler.com/numbers/oss_fs_why_presentation.pdf

K. Wong and P. Sayo
“Free/Open Source Software, a general introduction”
UNDP-APDIP, 2004
(donated by the International Open Source Network and UNDP Pacific Development Information
Programme to Wikipedia's community)
URL: http://en.wikibooks.org/wiki/FOSS_A_General_Introduction
