A collection of PHP classes to manage bibliographic formatting for OS bibliography software using the OSBib standard. Taken from and originally developed in WIKINDX (http://wikindx.sourceforge.net).
Released through http://bibliophile.sourceforge.net under the GPL licence.
If you make improvements, please consider contacting the administrators at bibliophile.sourceforge.net so that your improvements can be added to the release package.
October 2005
Mark Grimshaw (WIKINDX)
Andrea Rossato (Uniwakka)
Guillaume Gardey (BibOrb)
Christian Boulanger (Bibliograph)
INTRODUCTION
BIBSTYLE
CITESTYLE
TESTOSBIB
PARSEXML
LOADSTYLE
PARSESTYLE
STYLEMAP
UTF8
BIBFORMAT
BIBFORMAT USAGE
CITEFORMAT
CITEFORMAT USAGE
INTRODUCTION
OSBib is an Open Source bibliographic formatting engine written in PHP that uses XML style files to store formatting data for in-text or endnote-style (including footnote) citations and bibliographic lists. Released through Bibliophile, OSBib is designed to work with bibliographic data stored in any format via mapping arrays as defined in the class STYLEMAP. For those bibliographic systems whose data are stored in or that can be accessed as bibtex-type arrays, STYLEMAPBIBTEX is a set of pre-defined mapping arrays designed to get you up and running within a matter of minutes. Data stored in other formats require that STYLEMAP be edited.
OSBib provides support for printing the formatted output to web browsers or for exporting to Rich Text Format (for insertion into OpenOffice and similar word processors), exporting to OpenOffice's native sxw format or to plain text with no font formatting.
Style files are stored in XML format and are available for download from the Bibliophile site at:
http://bibliophile.sourceforge.net
The naming of the style files to be downloaded is (for example):
OSBib-americanPsychologicalAssociation_1.0_1.1
where the first number (in this case '1.0') is the version number of the OSBib classes the style is at least compatible with and the second number is the version number of the style file itself. For an explanation of the structure of the XML file, see bibliography_xml and citation_xml.
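The naming convention above can be taken apart mechanically. The following is a minimal sketch only: parseStyleFileName() is a hypothetical helper, not part of OSBib, and it assumes that both version numbers consist of digits separated by dots.

```php
<?php
// Hypothetical helper (not part of OSBib): split a style download name of the
// form "OSBib-<styleName>_<osbibVersion>_<styleVersion>" into its parts.
function parseStyleFileName(string $name): ?array
{
    $pattern = '/^OSBib-(.+)_(\d+(?:\.\d+)*)_(\d+(?:\.\d+)*)$/';
    if (!preg_match($pattern, $name, $m)) {
        return null; // The name does not follow the convention.
    }
    return [
        'style'        => $m[1],
        'osbibVersion' => $m[2], // minimum compatible version of the OSBib classes
        'styleVersion' => $m[3], // version of the style file itself
    ];
}

$info = parseStyleFileName('OSBib-americanPsychologicalAssociation_1.0_1.1');
print_r($info);
```

A bibliographic system could use a check of this kind to warn users when a downloaded style expects a newer version of the OSBib classes than the one installed.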
The OSBib package has two sections which share some common PHP files. Files in the directory format/ will format the bibliography output as described above. Files in the directory create/ will create or edit the XML style files. As supplied in the OSBib package, the create interface is stand-alone and runs via index.php. Users wishing to integrate the creation/editing interface within their bibliographic management system will need to modify or extract various portions of index.php for use in their own PHP code.
BIBSTYLE
This is not part of the distribution package but is here as an example of how WIKINDX uses OSBib-Format. BIBSTYLE::process() is the loop that parses each bibliographic entry one by one. You are likely to need a similar process loop. Further comments are found in CITESTYLE.php.
CITESTYLE
This is not part of the distribution package but is here as an example of how WIKINDX uses OSBib-Format. CITESTYLE::start() is the method that parses citations within a block of text. You will need a similar method. Further comments and help are found in CITESTYLE.php. Many of the methods used in CITESTYLE are similar to those used in BIBSTYLE so are not described separately here.
TESTOSBIB
This is not part of the distribution package but is here as a very simple example of how to set up bibliography formatting without many of the extra options found in BIBSTYLE. It can be run directly from a web browser to display how raw input is transformed into a formatted bibliography.
PARSEXML
Parse the XML style file into usable arrays. Used within BIBFORMAT::loadStyle() and CITEFORMAT.
LOADSTYLE
include_once($pathToOsbibClasses . "LOADSTYLE.php");
ARRAY LOADSTYLE::loadDir($pathToStyleFileDirectory);
This scans the style file directory and returns an alphabetically sorted (on the key) array of available bibliographic styles e.g.
$styles = LOADSTYLE::loadDir("styles/bibliography");
print_r($styles);
An example output from this would be:
Array
(
    [APA] => American Psychological Association (APA)
    [BRITISHMEDICALJOURNAL] => British Medical Journal (BMJ)
    [CHICAGO] => Chicago
    [HARVARD] => Harvard
    [IEEE] => Institute of Electrical and Electronics Engineers (IEEE)
    [MLA] => Modern Language Association (MLA)
    [TEST] => test
    [TURABIAN] => Turabian
    [WIKINDX] => WIKINDX -- Show All
)
Use this to provide your users with an HTML FORM selectbox to choose their preferred style, where the key from the array above is used in BIBFORMAT::loadStyle().
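One possible way to build such a selectbox is sketched below. The function name styleSelectBox() and the form field name "style" are illustrative assumptions and not part of OSBib. The input array is a shortened copy of the LOADSTYLE::loadDir() output shown above, so the sketch runs without the OSBib classes.

```php
<?php
// Build an HTML <select> from an array shaped like the LOADSTYLE::loadDir()
// output. styleSelectBox() and the field name "style" are assumptions made
// for this sketch; they are not part of OSBib.
function styleSelectBox(array $styles, string $selected = ''): string
{
    $html = '<select name="style">' . "\n";
    foreach ($styles as $key => $label) {
        $attr = ((string) $key === $selected) ? ' selected' : '';
        $html .= sprintf(
            "  <option value=\"%s\"%s>%s</option>\n",
            htmlspecialchars((string) $key, ENT_QUOTES, 'UTF-8'),
            $attr,
            htmlspecialchars($label, ENT_QUOTES, 'UTF-8')
        );
    }
    return $html . '</select>';
}

// Shortened copy of the example output above.
$styles = [
    'APA'     => 'American Psychological Association (APA)',
    'CHICAGO' => 'Chicago',
    'HARVARD' => 'Harvard',
];
$selectBox = styleSelectBox($styles, 'CHICAGO');
echo $selectBox, "\n";
```

The value submitted by the form is the array key (for example CHICAGO), which is the value the text above says to use in BIBFORMAT::loadStyle().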
PARSESTYLE
This is used internally in BIBFORMAT and CITEFORMAT and parses a single style definition string for a particular resource type (book, web article etc.) from a style XML file into an array to be used by OSBib.
STYLEMAP
(If your database stores or accesses its records in a BibTeX-style format, you should use STYLEMAPBIBTEX instead as this has been specially devised to offer an out-of-the-box solution for such systems and is a version of STYLEMAP that should not require editing. See also USAGE below.)
This contains all the mapping between your particular database/bibliographic management system and OSBib. There are plenty of comments in that file so read them carefully.
1/ You should edit the $this->types array.
2/ You should edit each resource type's array changing only the key of each element. However, do not edit any key (or its value) that is 'creator1', 'creator2', 'creator3', 'creator4' or 'creator5'. For resource types in $this->types that you set to FALSE, you do not need to do anything to the specific resource array as these arrays will then be ignored.
A SQL query in WIKINDX to display each resource in a format suitable for OSBib processing may return the following associative array for one resource:
Array
(
    [resourceId] => 1
    [type] => journal_article
    [title] => {X} Window System, Version 11
    [subtitle] =>
    [noSort] => The
    [url] =>
    [isbn] =>
    [field1] => 20
    [field2] => S2
    [field3] =>
    [field4] =>
    [field5] =>
    [field6] =>
    [field7] =>
    [field8] =>
    [field9] =>
    [file] =>
    [collection] => 1
    [publisher] =>
    [miscField1] =>
    [miscField2] =>
    [miscField3] =>
    [miscField4] =>
    [tag] =>
    [addUserIdResource] => 1
    [editUserIdResource] =>
    [year1] => 1990
    [year2] =>
    [year3] =>
    [pageStart] =>
    [pageEnd] =>
    [creator1] => 1,2,3
    [creator2] =>
    [creator3] =>
    [creator4] =>
    [creator5] =>
    [quotes] =>
    [paraphrases] =>
    [musings] =>
    [publisherName] =>
    [publisherLocation] =>
    [publisherType] =>
    [collectionTitle] => Software Practice and Experience
    [collectionTitleShort] =>
    [collectionType] => journal
    [timestamp] => 2005-04-24 10:48:15
)
What is important here is that the key names of the above array match the key names of the resource type arrays in STYLEMAP. This is how the data from your particular database is mapped to a format that OSBib understands and this is why you must edit the key names of the resource type array in STYLEMAP. The one exception to this is the handling of creator elements (author, editor, composer, inventor etc.) which OSBib expects to be listed as 'creator1', 'creator2', 'creator3', 'creator4' and 'creator5' where 'creator1' is always the primary creator (usually the author). Do not edit these key names.
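The key-matching idea can be illustrated with a simplified stand-in. This is a sketch under stated assumptions: the array below is not copied from STYLEMAP.php, the database field names are invented for the example, and mapRow() is a hypothetical helper. It assumes that each key is a field name from your own database and each value is the name used on the OSBib side.

```php
<?php
// Illustrative only: a simplified stand-in for one resource-type array.
// Keys are field names from *your* database (invented here); values are the
// names assumed to be used on the formatter side.
$journalArticleMap = [
    'article_title' => 'title',
    'journal_name'  => 'collectionTitle',
    'vol'           => 'volume',
    'creator1'      => 'creator1', // creator keys are never renamed
];

// Hypothetical helper: keep only the mapped, non-empty fields of a database
// row and re-key them with the names taken from the map.
function mapRow(array $row, array $map): array
{
    $item = [];
    foreach ($map as $dbKey => $osbibKey) {
        if (isset($row[$dbKey]) && $row[$dbKey] !== '') {
            $item[$osbibKey] = $row[$dbKey];
        }
    }
    return $item;
}

// A made-up database row; 'internal_id' is not in the map and is dropped.
$row = [
    'article_title' => 'An Example Title',
    'journal_name'  => 'An Example Journal',
    'vol'           => '',
    'creator1'      => '1,2,3',
    'internal_id'   => 7,
];
$item = mapRow($row, $journalArticleMap);
print_r($item);
```

If a key in the map does not match a key in the row, that field is silently skipped in this sketch, which is why the key names must match.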
UTF8
include_once($pathToOsbibClasses . "BIBFORMAT.php");
$utf8 = new UTF8();
BIBFORMAT expects its data to be in UTF-8 format and will return its formatted data in UTF-8 format. If you need to encode or decode your data prior to or after using OSBib, do not use PHP's utf8_encode() and utf8_decode() functions. Use the OSBib functions UTF8::encodeUtf8() and UTF8::decodeUtf8() instead. Additionally, if you need to manipulate UTF-8-encoded strings with functions such as strtolower(), strlen() etc., you should strongly consider using the appropriate methods in the OSBib UTF8 class.
METHODS
UTF8::encodeUtf8() -- Properly encode a string into multi-byte UTF-8.
UTF8::decodeUtf8() -- Properly decode a multi-byte UTF-8 string.
Convert a UTF-8 string to lowercase. Where PHP has been compiled with the mbstring extension, mb_strtolower() will be used.
Convert a UTF-8 string to uppercase. Where PHP has been compiled with the mbstring extension, mb_strtoupper() will be used.
Return a portion of a UTF-8 string. Where PHP has been compiled with the mbstring extension, mb_substr() will be used.
Ensure that the first letter of a UTF-8 string is uppercase.
Return the length of a UTF-8 string. Where PHP has been compiled with the mbstring extension, mb_strlen() will be used.
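The "use mbstring where available" approach described above can be sketched as follows. utf8Length() and utf8Lower() are hypothetical helpers written for this example and are not the methods of the OSBib UTF8 class. The fallbacks are assumptions about what a simple non-mbstring path might look like.

```php
<?php
// Hypothetical helpers (not the OSBib UTF8 class methods) showing why
// byte-oriented functions are unsuitable for UTF-8 strings.
function utf8Length(string $s): int
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($s, 'UTF-8');
    }
    // Fallback: count code points with a Unicode-aware regular expression.
    return (int) preg_match_all('/./us', $s);
}

function utf8Lower(string $s): string
{
    if (function_exists('mb_strtolower')) {
        return mb_strtolower($s, 'UTF-8');
    }
    // Fallback lowercases ASCII letters only; multi-byte characters are left
    // untouched because their bytes are all outside the ASCII range.
    return strtr($s, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz');
}

$word = "\u{00C9}COLE"; // E with acute accent followed by "COLE"
echo strlen($word), "\n";     // byte count, not character count
echo utf8Length($word), "\n"; // character count
echo utf8Lower($word), "\n";
```

strlen() counts bytes, so it reports one more than the number of characters here, which is the kind of discrepancy the UTF8 class is intended to avoid.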
BIBFORMAT
This is the main OSBib engine for formatting bibliographic entries.
include_once($pathToOsbibClasses . "BIBFORMAT.php");
$bibformat = new BIBFORMAT([STRING: $pathToOsbibClasses = FALSE, BOOLEAN: $useBibtex = FALSE]);
By default, $pathToOsbibClasses will be the same directory as BIBFORMAT is in.
NB - BIBFORMAT expects its data to be in UTF-8 format and will return its formatted data in UTF-8 format. If you need to encode or decode your data prior to or after using OSBib, do not use PHP's utf8_encode() and utf8_decode() functions. Use the OSBib functions UTF8::encodeUtf8() and UTF8::decodeUtf8() instead. Additionally, if you need to manipulate UTF-8-encoded strings with functions such as strtolower(), strlen() etc., you should strongly consider using the appropriate methods in the OSBib UTF8 class.
PROPERTIES (to be set after instantiating the BIBFORMAT class)
$bibformat->output -- By default this property is 'html' but you can change it to 'rtf' for exporting to RTF files, 'sxw' for OpenOffice or 'plain' for plain text. It is used to format bold, underline, italics etc. for the appropriate output medium.
$bibformat->patterns -- A preg pattern (e.g. "/matchThis|matchThat/i") that, in conjunction with $bibformat->patternHighlight, is used to highlight words or phrases when displaying the results to a browser. This is useful when the bibliography to be displayed is the result of a SQL search. Default is FALSE and its value will be ignored if $bibformat->output is anything other than 'html'.
$bibformat->patternHighlight -- A CSS class defining the highlighting for above. Default is FALSE.
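To show how a preg pattern and a CSS class can work together for highlighting, here is a minimal sketch. highlightMatches() is a hypothetical helper and is not how BIBFORMAT implements the feature internally. The CSS class name "highlight" is likewise an assumption.

```php
<?php
// Hypothetical helper: wrap every match of a preg pattern in a <span> that
// carries the given CSS class. Not OSBib's internal implementation.
function highlightMatches(string $text, $pattern, $cssClass): string
{
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    if ($pattern === false || $cssClass === false) {
        return $safe; // Highlighting is off, mirroring the FALSE defaults.
    }
    $class = htmlspecialchars($cssClass, ENT_QUOTES, 'UTF-8');
    $replacement = '<span class="' . $class . '">$0</span>';
    $result = preg_replace($pattern, $replacement, $safe);
    return $result === null ? $safe : $result; // null signals a pattern error
}

$out = highlightMatches('The X Window System', '/window|system/i', 'highlight');
echo $out, "\n";
```

In a search results page, the pattern would typically be assembled from the user's search terms, each passed through preg_quote() before being joined with "|".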
$bibformat->bibtexParsePath -- If you wish to use STYLEMAPBIBTEX because your database stores or accesses its data in a form similar to BibTeX, you should set the constructor parameter $useBibtex to TRUE and set this property to the path where PARSECREATORS, PARSEMONTH and PARSEPAGE can be found. These classes are not part of OSBib but are part of the bibtexParse package that can be downloaded from http://bibliophile.sourceforge.net. By default, this path will be to a bibtexParse/ directory in the same directory as BIBFORMAT is in.
$bibformat->cleanEntry -- If TRUE, convert BibTeX (and LaTeX) special characters to UTF-8. Default is FALSE.
METHODS
BIBFORMAT::loadStyle() -- Parses the XML style file into raw arrays (to be further processed in ...). These last two are used in ...
Transform the raw XML arrays from ...
The following should be called for each database row you wish to process.
Among other things, ...
Internally within BIBFORMAT, data from the SQL query $row is formatted and stored in an $item associative array. The following methods accomplish this:
This method should be called for each type of creator the resource has. (See ...)
Bibliographic styles may require the book edition number to be a cardinal or an ordinal number. If your edition number is stored in the database as a cardinal number, then it will be formatted as an ordinal number if required by the bibliographic style. If your edition number is stored as anything other than a cardinal number it will be used unchanged. The conversion is English - i.e. '3' => '3rd'. This works all the way up to infinity-1 ;-)
BIBFORMAT::formatPages()
BIBFORMAT::formatDate()
Running time for films, broadcasts etc.
Add an item to the internal $item array.
Add all remaining items to the internal $item array.
BIBFORMAT::map() -- After you have added resource elements to the $item array using the methods above, calling map() will produce a formatted string suitable for printing to the output medium.
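The English cardinal-to-ordinal conversion of edition numbers described above can be sketched as follows. editionToOrdinal() is a hypothetical helper written for illustration and is not the BIBFORMAT method itself. It assumes that a "cardinal number" means a string of digits only.

```php
<?php
// Hypothetical helper: convert a cardinal edition number to an English
// ordinal. Anything that is not a plain string of digits is used unchanged.
function editionToOrdinal($edition): string
{
    $edition = (string) $edition;
    if (!preg_match('/^\d+$/', $edition)) {
        return $edition; // e.g. "Revised" or "2nd": leave as stored
    }
    $lastTwo = (int) substr($edition, -2);
    $last    = (int) substr($edition, -1);
    if ($lastTwo >= 11 && $lastTwo <= 13) {
        $suffix = 'th'; // 11th, 12th, 13th, 111th ...
    } elseif ($last === 1) {
        $suffix = 'st';
    } elseif ($last === 2) {
        $suffix = 'nd';
    } elseif ($last === 3) {
        $suffix = 'rd';
    } else {
        $suffix = 'th';
    }
    return $edition . $suffix;
}

foreach (['1', '3', '11', '22', '112', 'Revised'] as $edition) {
    echo $edition, ' => ', editionToOrdinal($edition), "\n";
}
```

Because the function works on the trailing digits of the string rather than on an integer, the length of the number does not matter.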
BIBFORMAT USAGE
The formatting in BIBFORMAT works on one resource at a time so you will want to call it via a loop as you cycle through your data.
If you do not intend to use ...
// Instantiate the ...
After loading ...
$bibformat->longMonth = array(
    1 => 'January', 2 => 'February', 3 => 'March', 4 => 'April',
    5 => 'May', 6 => 'June', 7 => 'July', 8 => 'August',
    9 => 'September', 10 => 'October', 11 => 'November', 12 => 'December',
);
$bibformat->shortMonth = array(
    1 => 'Jan', 2 => 'Feb', 3 => 'Mar', 4 => 'Apr',
    5 => 'May', 6 => 'Jun', 7 => 'Jul', 8 => 'Aug',
    9 => 'Sep', 10 => 'Oct', 11 => 'Nov', 12 => 'Dec',
);
The title/subtitle separator can be set as:
$bibformat->titleSubtitleSeparator = ": ";
// process loop starts here: ...
If you are using ...
// Instantiate the ...
// process loop starts here: ...
CITEFORMAT
This is the main OSBib engine for formatting in-text and endnote-style citations within a block of text.
include_once($pathToOsbibClasses . "CITEFORMAT.php");
$citeformat = new CITEFORMAT(CLASSOBJECT: &$bibstyleClass, CLASSMETHOD: $process [, STRING: $pathToOsbibClasses = FALSE]);
CITEFORMAT uses BIBFORMAT to format its appended bibliographies. You must set up a class similar to BIBSTYLE and a method similar to BIBSTYLE::process() (see above) prior to implementing CITEFORMAT and passing both the class and the method to CITEFORMAT.
By default, $pathToOsbibClasses will be the same directory as CITEFORMAT is in.
NB - CITEFORMAT expects its data to be in UTF-8 format and will return its formatted data in UTF-8 format. If you need to encode or decode your data prior to or after using OSBib, do not use PHP's utf8_encode() and utf8_decode() functions. Use the OSBib functions UTF8::encodeUtf8() and UTF8::decodeUtf8() instead. Additionally, if you need to manipulate UTF-8-encoded strings with functions such as strtolower(), strlen() etc., you should strongly consider using the appropriate methods in the OSBib UTF8 class.
PROPERTIES (to be set after instantiating the CITEFORMAT class)
$citeformat->output -- By default this property is 'html' but you can change it to 'rtf' for exporting to RTF files or 'plain' for plain text. It is used to format bold, underline, italics etc. for the appropriate output medium.
$citeformat->hyperlinkBase -- By default this property is FALSE but, if displaying the parsed block of text back to a web browser, you can turn on hyperlinking of citations by specifying the URL instead. CITEFORMAT will append the unique ID number as extracted for each bibliographic entry from the database (see usage below). WIKINDX uses "index.php?action=resourceView&id=".
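The effect of appending the resource ID to a base URL can be sketched like this. citationLink() is a hypothetical helper, not a CITEFORMAT method, and the citation text "(Author 1990)" is a made-up placeholder. The sketch assumes the citation string has already been formatted for HTML output.

```php
<?php
// Hypothetical helper: wrap an already formatted citation in a link built
// from a base URL plus the resource's unique ID. Not a CITEFORMAT method.
function citationLink(string $citation, int $resourceId, $hyperlinkBase): string
{
    if ($hyperlinkBase === false) {
        return $citation; // Hyperlinking is off, mirroring the FALSE default.
    }
    $url = $hyperlinkBase . $resourceId;
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">'
        . $citation . '</a>';
}

// The base URL is the one the text above says WIKINDX uses.
$link = citationLink('(Author 1990)', 1, 'index.php?action=resourceView&id=');
echo $link, "\n";
```

The ampersand in the base URL is escaped as &amp;amp; inside the href attribute, which browsers decode back to the original URL when the link is followed.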
CITEFORMAT USAGE
CITEFORMAT is a little more complex to use than BIBFORMAT, mainly due to disambiguation requirements, decisions as to whether to use in-text, endnote or footnote citations etc., so read the instructions carefully.
The following is a rough order of events you will need to set up and is a general outline of what happens in ...
// Instantiate the ...
After loading ...
$citeformat->longMonth = array(
    1 => 'January', 2 => 'February', 3 => 'March', 4 => 'April',
    5 => 'May', 6 => 'June', 7 => 'July', 8 => 'August',
    9 => 'September', 10 => 'October', 11 => 'November', 12 => 'December',
);
$citeformat->shortMonth = array(
    1 => 'Jan', 2 => 'Feb', 3 => 'Mar', 4 => 'Apr',
    5 => 'May', 6 => 'Jun', 7 => 'Jul', 8 => 'Aug',
    9 => 'Sep', 10 => 'Oct', 11 => 'Nov', 12 => 'Dec',
);
Two forms of possessive (for creator names) and 'et al.' equivalent can be set as:
$citeformat->possessive1 = "'s"; // Set to FALSE if not used
$citeformat->possessive2 = "s"; // Set to FALSE if not used
$citeformat->textEtAl = "et al.";
// start() is the method called externally that starts the whole process: ...