
Introduction

CVSAnalY: retrieving data from SCMs


GNU R: computing statistics

Analysis of FLOSS projects


Libre Software course

Felipe Ortega

jfelipe@libresoft.es
GSyC/Libresoft

3 March 2010

Felipe Ortega Analysis of FLOSS projects


(cc) 2010 Felipe Ortega. Some rights reserved. This document is distributed under the Creative Commons Attribution-ShareAlike 3.0 licence, available at http://creativecommons.org/licenses/by-sa/3.0/

Index

1 Introduction

2 CVSAnalY: retrieving data from SCMs

3 GNU R: computing statistics

Why do we need metrics about FLOSS projects?

Describe and account for how FLOSS projects are organized
Characterize evolution patterns in FLOSS projects
Find common structures across FLOSS projects

Overview of the analysis

Identify the target project(s)
Retrieve information from the SCM platform (CVS, SVN, Git...)
Compute metrics and graphs locally
Arrange the results and compose the report

What do we need?

Automated tools
Libre software
Support for data management and statistics
MySQL/SQLite + CVSAnalY + GNU R

CVSAnalY: the tool

The CVSAnalY tool extracts information from source code repository logs and stores it in a database.
Supported: CVS, SVN, Git.
Planned support: Mercurial, Bazaar.
Outcome: a database of the extracted information.

CVSAnalY database schema

Data for every commit (scmlog)
Data for every file recorded (files)
Data for every contributor (people)
Basic metrics: sloc, loc, ncomments, nfunctions...
Schema documentation: http://melquiades.flossmetrics.org/wiki/doku.php?id=scm
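To make the relationship between the scmlog and people tables concrete, here is a minimal Python sketch that counts commits per contributor. It runs against a toy in-memory database: the column names used here (committer_id, date, message, name) are assumptions made for illustration, so check them against the schema documentation before querying a real CVSAnalY database.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a simplified, ASSUMED schema.

    Only the table names (scmlog, people) come from the slides; the
    columns are illustrative guesses, not the authoritative schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE scmlog (
            id INTEGER PRIMARY KEY,
            committer_id INTEGER REFERENCES people(id),
            date TEXT,
            message TEXT
        );
        INSERT INTO people VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO scmlog VALUES
            (1, 1, '2009-01-10 12:00:00', 'initial import'),
            (2, 1, '2009-01-15 09:30:00', 'fix build'),
            (3, 2, '2009-02-01 18:45:00', 'add docs');
    """)
    return conn


def commits_per_person(conn):
    """Return (name, number_of_commits) rows, most active contributor first."""
    query = """
        SELECT p.name, COUNT(s.id) AS commits
        FROM scmlog AS s
        JOIN people AS p ON p.id = s.committer_id
        GROUP BY p.id, p.name
        ORDER BY commits DESC, p.name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for name, commits in commits_per_person(build_toy_db()):
        print(name, commits)
```

The same join should carry over to a real database once the actual column names are substituted.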

Installing CVSAnalY

Dependencies:
mysql-server
python (sqlite3, python-pysqlite2)
python-mysqldb
sloccount
automake
autoconf
git-core
pkg-config

Installing CVSAnalY

$ git clone git://git.libresoft.es/git/cvsanaly
$ git clone git://git.libresoft.es/git/repositoryhandler
Install Repository Handler: $ ./autogen.sh && make && sudo make install (or copy the repositoryhandler directory into /cvsanaly)
Install CVSAnalY2: $ sudo python ./setup.py install
Check that everything works: $ cvsanaly2 --help
Retrieve information from a remote URI, using SQLite as the database backend:
$ cvsanaly2 --db-driver sqlite -d /db/nautilus.db http://svn.gnome.org/svn/nautilus
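Once a run such as the one above has finished, a quick sanity check is to list the tables in the resulting SQLite file and count their rows. The following Python sketch does that with the standard sqlite3 module only; the helper name table_row_counts is hypothetical, and nothing here is part of CVSAnalY itself.

```python
import sqlite3
import sys


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for name in names:
            # Names are read from sqlite_master, so quoting them is sufficient here.
            counts[name] = conn.execute(
                'SELECT COUNT(*) FROM "%s"' % name).fetchone()[0]
        return counts
    finally:
        conn.close()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python inspect_db.py /db/nautilus.db  (script name is hypothetical)
    for table, count in sorted(table_row_counts(sys.argv[1]).items()):
        print(table, count)
```

An empty result, or tables with zero rows, would suggest that the extraction did not complete. Note that sqlite3.connect creates an empty file if the path does not exist, so double-check the path first.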

Why use GNU R?

One of the most widespread tools for statistical analysis.
More than 2,000 packages already available.
Good integration with databases (MySQL, SQLite).
Easy automation (scripts).

Installing GNU R

# apt-get install r-base r-recommended r-cran-rmysql
You can also install additional packages (from CRAN).
Download the package, then: $ sudo R CMD INSTALL package.tar.gz

GNU R: recipes

Scripts that compute statistics (numbers, graphs):
Total figures, averages...
Graphs for aggregated statistics
Example of metric evolution (monthly number of commits)
Advanced techniques made easy (seasonal-trend decomposition)
Usage: $ Rscript <recipe-name.R> <dbname.db>
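The recipes themselves are R scripts and are not reproduced in these slides. To illustrate the kind of aggregation behind the "monthly number of commits" example, here is a minimal Python sketch over a toy scmlog table. It assumes that commit timestamps live in a text column named date in ISO format (YYYY-MM-DD HH:MM:SS); both the column name and the storage format are assumptions to verify against a real database.

```python
import sqlite3


def toy_scmlog():
    """Build an in-memory scmlog table with an ASSUMED 'date' text column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scmlog (id INTEGER PRIMARY KEY, date TEXT)")
    conn.executemany("INSERT INTO scmlog (date) VALUES (?)", [
        ("2009-01-10 12:00:00",),
        ("2009-01-15 09:30:00",),
        ("2009-02-01 18:45:00",),
    ])
    return conn


def monthly_commits(conn):
    """Return [(YYYY-MM, number_of_commits), ...] ordered by month."""
    query = """
        SELECT strftime('%Y-%m', date) AS month, COUNT(*) AS commits
        FROM scmlog
        GROUP BY month
        ORDER BY month
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for month, commits in monthly_commits(toy_scmlog()):
        print(month, commits)
```

A monthly series like this is the natural input for a time-series plot or a seasonal-trend decomposition. Months with no commits do not appear in the result, so they would need to be filled with zeros before such an analysis.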
