I would now like to release the source code of the software described in an earlier post. The package is released under the GNU General Public License; any other software that includes the full source code or parts of it must also be released under the GPL.
The program includes three main data structures: the list of all publications from the BibTeX file, a list of authors, and a list of relations. The latter two are implemented as Python classes. As a first step, the full BibTeX file is parsed into a list of dictionaries with the package BibtexParser. Second, the author list is created by iterating over all publications and authors in the BibTeX database. Third, the relationship array is created with entries for all combinations of two authors A and B by iterating over the publications list and checking whether both A and B are authors of a specific publication. Then the graph is constructed by adding a node for each author and an edge for each relation. As a last step, the graph is reduced according to constraints given as command-line arguments. Two short examples follow to illustrate these constraints. The BibTeX file used in the examples is included in the published package.
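The counting steps above can be sketched in a few lines. This is only a minimal illustration, not the released code: it uses plain dictionaries instead of the author and relation classes, it assumes each parsed entry is a dictionary with an "author" field whose names are joined by " and " (the usual BibTeX convention), and it pairs up the authors of each publication directly rather than testing every author combination against every publication. The sample entries and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stand-in for the parsed BibTeX entries (a list of dictionaries).
publications = [
    {"title": "Paper 1", "year": "2012", "author": "A. Author and B. Author"},
    {"title": "Paper 2", "year": "2013", "author": "A. Author and B. Author and C. Author"},
    {"title": "Paper 3", "year": "2014", "author": "C. Author"},
]


def split_authors(entry):
    """Split the BibTeX author field on the ' and ' separator."""
    names = entry.get("author", "").split(" and ")
    return [name.strip() for name in names if name.strip()]


def count_publications(entries):
    """Map each author name to the number of publications it appears in."""
    counts = Counter()
    for entry in entries:
        counts.update(set(split_authors(entry)))
    return counts


def count_relations(entries):
    """Map each unordered author pair to the number of shared publications."""
    relations = Counter()
    for entry in entries:
        authors = sorted(set(split_authors(entry)))
        for a, b in combinations(authors, 2):
            relations[(a, b)] += 1
    return relations


author_counts = count_publications(publications)
relations = count_relations(publications)
```

Each key in relations is an alphabetically sorted pair of names, so a pair (A, B) and its mirror (B, A) are always counted as the same edge.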
Starting the program without any arguments prints the license information and the help text:
A Graph Display Software for bibtex databases
Copyright (C) 2015 Benjamin Laemmle, jdmorise a t yahoo.com

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.

usage: "Examples:

optional arguments:
  -h, --help
        show this help message and exit
  -if INPUT_FILENAME, --input_filename INPUT_FILENAME
        Filename of publication database in bibtex format
  -gf GRAPH_FILENAME, --graph_filename GRAPH_FILENAME
        Filename of graph output stored as png
  -ma MAIN_AUTHOR_NAME, --main_author_name MAIN_AUTHOR_NAME
  -ert EDGE_RELATION_THRES, --edge_relation_thres EDGE_RELATION_THRES
        Only add edges with ERT or more number of relations
  -art AUTHOR_RELATION_THRES, --author_relation_thres AUTHOR_RELATION_THRES
        Only add authors with ART or more number of relations
  -apt AUTHOR_PUBLICATION_THRES, --author_publication_thres AUTHOR_PUBLICATION_THRES
        Only add authors with APT number of publications
  -lvl LEVEL, --level LEVEL
  -b BEFORE, --before BEFORE
        Only use Publications before YEAR for the graph
  -a AFTER, --after AFTER
        Only use Publications after YEAR for the graph
  -gp GRAPH_PROGRAMM, --graph_programm GRAPH_PROGRAMM
        Graph Programm for rendering the graph. one of the
        following: fdp,dot,sfdp,circo,twopi.
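For readers who want to build a similar command line, the options listed in the help text could be declared with Python's argparse module roughly as follows. This is a sketch reconstructed from the help output only, not the released code: the integer types of the thresholds, level and years are assumptions, and fdp is used as the default layout program.

```python
import argparse


def build_parser():
    """Declare the options listed in the help text (types are assumed)."""
    parser = argparse.ArgumentParser(
        description="A Graph Display Software for bibtex databases")
    parser.add_argument("-if", "--input_filename",
                        help="Filename of publication database in bibtex format")
    parser.add_argument("-gf", "--graph_filename",
                        help="Filename of graph output stored as png")
    parser.add_argument("-ma", "--main_author_name")
    parser.add_argument("-ert", "--edge_relation_thres", type=int,
                        help="Only add edges with ERT or more number of relations")
    parser.add_argument("-art", "--author_relation_thres", type=int,
                        help="Only add authors with ART or more number of relations")
    parser.add_argument("-apt", "--author_publication_thres", type=int,
                        help="Only add authors with APT number of publications")
    parser.add_argument("-lvl", "--level", type=int)
    parser.add_argument("-b", "--before", type=int,
                        help="Only use Publications before YEAR for the graph")
    parser.add_argument("-a", "--after", type=int,
                        help="Only use Publications after YEAR for the graph")
    parser.add_argument("-gp", "--graph_programm", default="fdp",
                        choices=["fdp", "dot", "sfdp", "circo", "twopi"],
                        help="Graph Programm for rendering the graph")
    return parser


# Parse an explicit argument list instead of sys.argv, for illustration.
args = build_parser().parse_args(
    ["-if", "Darabi.bib", "-gf", "graph_red.png",
     "-ma", "A. Mirzaei", "-lvl", "1", "-apt", "3", "-ert", "3"])
```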
The first example was created without any filter, just specifying the input database and the output picture:
python graph_coaut_bibtex.py -if Darabi.bib -gf graph_plain.png
The second example explains the filters in more detail. First, we would like to see only the direct relations between one author and everybody else, that is, only direct collaborations. This is achieved by specifying the “main author” (-ma “A. Mirzaei”) and the collaboration “level” (-lvl 1). Then, we would like to remove all authors with fewer than three publications (e.g. P. Suri) by adding an “author publication threshold” (-apt 3). To make the graph more readable, we also remove all edges with fewer than three relations by specifying an “edge relation threshold” (-ert 3), which removes a couple of authors such as K. Juan.
python graph_coaut_bibtex.py -if Darabi.bib -gf graph_red.png -ma "A. Mirzaei" -lvl 1 -apt 3 -ert 3
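How these three filters interact can be sketched as follows. This is an illustration under assumptions, not the released implementation: the function name, the sample data, and the order in which the filters are applied are all hypothetical, level 1 is interpreted as "keep only edges that touch the main author", and authors left without any edge are dropped from the graph.

```python
def reduce_graph(author_counts, relations, main_author=None, apt=0, ert=0):
    """Return (nodes, edges) that survive the thresholds.

    author_counts: dict mapping author name -> number of publications
    relations:     dict mapping (author_a, author_b) -> shared publications
    """
    # Author publication threshold: keep authors with at least apt papers.
    kept = {name for name, count in author_counts.items() if count >= apt}

    edges = {}
    for (a, b), weight in relations.items():
        if a not in kept or b not in kept:
            continue  # one endpoint was removed by the publication threshold
        if weight < ert:
            continue  # edge relation threshold
        if main_author is not None and main_author not in (a, b):
            continue  # level 1: direct collaborations of the main author only
        edges[(a, b)] = weight

    # Authors without any remaining edge disappear from the graph.
    nodes = {name for pair in edges for name in pair}
    return nodes, edges


# Hypothetical sample data, not taken from the example database.
sample_counts = {"A. Main": 5, "B. Often": 4, "C. Rare": 1, "D. Loose": 3}
sample_relations = {
    ("A. Main", "B. Often"): 4,
    ("A. Main", "C. Rare"): 1,
    ("A. Main", "D. Loose"): 2,
    ("B. Often", "D. Loose"): 3,
}

nodes, edges = reduce_graph(sample_counts, sample_relations,
                            main_author="A. Main", apt=3, ert=3)
```

In this sample, "C. Rare" falls below the publication threshold, the edge to "D. Loose" falls below the edge threshold, and the edge between "B. Often" and "D. Loose" does not involve the main author, so only the edge between "A. Main" and "B. Often" remains.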
The reduced graph is shown below, with only a limited number of authors remaining.
Additional arguments include an “author relation threshold” (-art), which removes all authors with only a few relations, and two date filters (-b and -a), which restrict the graph to publications before or after a given year. Finally, the graph layout program can be selected with -gp; fdp is the default. The GraphViz documentation gives more details about the available layout programs.
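To show where the layout program comes in, here is a minimal sketch that writes a co-author graph in GraphViz's DOT language using plain string formatting. It is not necessarily how the released script renders its output; the function name and sample data are hypothetical, and author names are assumed to contain no double quotes.

```python
def to_dot(nodes, edges):
    """Build an undirected DOT graph; edge weight becomes label and pen width."""
    lines = ["graph coauthors {"]
    for name in sorted(nodes):
        lines.append('    "{}";'.format(name))
    for (a, b), weight in sorted(edges.items()):
        lines.append('    "{}" -- "{}" [label="{}", penwidth={}];'.format(
            a, b, weight, weight))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical sample data.
dot_text = to_dot({"A. Main", "B. Often"}, {("A. Main", "B. Often"): 4})

# Saving dot_text as graph.dot allows rendering with any layout program, e.g.
#   fdp -Tpng graph.dot -o graph.png
```

Swapping fdp for dot, sfdp, circo or twopi in the command changes only the layout, not the graph itself.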
The Python code, together with the example BibTeX file and the license file, can be found in my Dropbox. Have fun!