BioPHP: PHP for Biocomputing

Last updated: April 22, 2003

BioPHP Installation/Quickstart

Prepared by Serge Gregorio

For purposes of this document, the terms BioPHP and GenePHP mean the same thing -- the body of code produced by this project. They may be used interchangeably. The following are required to develop bioinformatics applications using BioPHP:

  1. A text editor

    Any will do (Notepad for Windows, vi for Unix), but if you want an integrated development environment complete with debugging tools, check out PHPEd from NuSphere Corporation. If you intend to develop complex web-based bioinformatics applications, an HTML editor like Microsoft FrontPage or Macromedia Dreamweaver would be useful.

  2. The PHP interpreter - you can download one from the official PHP website.

  3. A web server - Microsoft IIS and Apache are two popular web servers today.

Teaching you how to install the above software or how to program in the PHP scripting language is beyond the scope of this tutorial. The links listed above have more than enough information on that subject.

To use BioPHP/GenePHP, take the following steps:

  1. Download the source code in either TAR or ZIP format from the BioPHP website, then decompress the package with one of the following commands:

       If you downloaded the TAR package:

       Unix prompt> tar -xvf genephp_1.0.tar

       If you downloaded the ZIP package:

       DOS prompt> pkunzip genephp_1.0.zip
    
  2. Place the extracted files in the directory that serves as your webserver's document root (e.g. c:\inetpub\ in Windows or /usr/local/www/httpd/ in Unix) or in a folder you use for PHP library or include files.

  3. Inside your PHP script, put any or all of the following lines at the top of your code, as this example shows:

     <?php
       require_once("seq.php");
       require_once("seqdb.php");
       require_once("resten.php");
       require_once("etc.php");
       // The rest of your PHP code here.
     ?>

Which file(s) to include depends on the BioPHP classes and methods you intend to use in your PHP script. If in doubt, include all of them. For more information, consult the BioPHP Technical Documentation. Later, I might release a package where all the code is in just one file for convenience.
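
If you placed the BioPHP files in a library folder rather than next to your script, PHP needs to know where to look for them. The snippet below is a minimal sketch of one way to do that, not part of BioPHP itself: add_library_dir() is a hypothetical helper name, and /usr/local/lib/biophp is an assumed location that you should replace with your own. It relies on set_include_path(), which requires PHP 4.3.0 or later.

```php
<?php
// Append a folder to PHP's include path so require_once() can find files in it.
// Returns the include path now in effect. (Hypothetical helper, not part of BioPHP.)
function add_library_dir($dir)
{
    $paths = explode(PATH_SEPARATOR, get_include_path());
    if (!in_array($dir, $paths)) {
        $paths[] = $dir;
        set_include_path(implode(PATH_SEPARATOR, $paths));
    }
    return get_include_path();
}

// Assumed location; use the folder where you unpacked BioPHP.
$biophp_dir = "/usr/local/lib/biophp";
add_library_dir($biophp_dir);

// From here on, require_once("seq.php") also searches $biophp_dir.
```

Calling the helper twice with the same folder leaves the include path unchanged, so it is safe to put in a file shared by several scripts.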

Quickstart

I. Working with Sequence Databases

A. Create, open, and navigate a database

Before you can begin writing serious bioinformatics applications, you must have access to a sequence database. There are two ways of accessing sequence data: online (through the Internet) or offline (the files are stored locally on your computer). BioPHP version 1.0 supports only the latter method. It also supports only two formats: the GenBank format from the National Center for Biotechnology Information (NCBI) and the Swissprot format from the EMBL.

You can download sequence data from these sites.

B. Read a sequence from a database

Once you've downloaded one or more *.SEQ files, place them in your webserver's document root directory together with your BioPHP files.

Let's assume our *.SEQ file is named gbuna.seq.

Before BioPHP can do anything with our *.SEQ file, we must first create indexes for it. The script snippet below illustrates this.

// Syntax: $seqdb = new seqdb($dbname, $dbformat, $file1, $file2, ...);

<?php
require_once("seqdb.php");

$seqdb_obj = new seqdb("myfirstdb", "", "gbuna.seq");
?>

The second argument (database format) is optional. Passing a blank string sets it to its default value of "GENBANK".

The third, fourth, and succeeding arguments (virtually unlimited) are the *.SEQ files we wish to include in our database. To keep things simple, we've only provided one *.SEQ file in our example.

Save and run the script in your debugger or web browser. You may not see any visual feedback, but you can check whether the script worked by looking for two files in your document root: myfirstdb.idx and myfirstdb.dir. If they're there, then your script worked just fine. If not, then Houston, we have a problem! We will learn more about trapping these kinds of errors (failures to open or create a database) later.
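
The manual check for the two index files can also be done from PHP. The snippet below is a small sketch that uses only PHP's built-in file_exists(); missing_index_files() is a hypothetical helper, not a BioPHP function, and the .idx and .dir extensions are taken from the file names mentioned above.

```php
<?php
// Return the expected index files that are missing for a database name.
// $dir is the folder to look in (the current folder by default).
// (Hypothetical helper, not part of BioPHP.)
function missing_index_files($dbname, $dir = ".")
{
    $missing = array();
    foreach (array("idx", "dir") as $ext) {
        $path = $dir . "/" . $dbname . "." . $ext;
        if (!file_exists($path)) {
            $missing[] = $path;
        }
    }
    return $missing;
}

$missing = missing_index_files("myfirstdb");
if (count($missing) == 0) {
    print "Index files found.<BR>";
} else {
    print "Missing: " . implode(", ", $missing) . "<BR>";
}
```

Run it from the same folder as the script that created the database, or pass the document root as the second argument.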

II. Sequence Analysis

There are quite a number of tools included in BioPHP 1.0 for analyzing sequences. Here is just one example.

To find all mirrors inside a sequence string, we use the find_mirror() function. This function returns a three-dimensional array of the form:

([len1] => ((mirror1, pos1), (mirror2, pos2)), [len2] => (…), …)

The code below lists all mirrors within the sequence "AGGGAATTAAGTAAATGGTAGTGG" that have even lengths between 6 and 8 letters.

CODE: 
  
  <?php
  require("seq.php");

  $seq_obj = new seq();
  $seq_obj->sequence = "AGGGAATTAAGTAAATGGTAGTGG";
  $mirrors_found = find_mirror($seq_obj->sequence, 6, 8, "E");
  print "List of Mirrors Found<BR>";
  foreach($mirrors_found as $mirror_length => $mirrors_samelength)
     {
     print "Of Length $mirror_length<BR>";
     foreach($mirrors_samelength as $mirror)
        {
        print "&nbsp;&nbsp;";
        print "Mirror String: " . $mirror[0] . " Position Index: " 
              . $mirror[1];
        print "<BR>";
        }
     }
  ?>

OUTPUT:

  List of Mirrors Found
  Of Length 6
    Mirror String: AATTAA Position Index: 4
    Mirror String: ATGGTA Position Index: 14
  Of Length 8
    Mirror String: GAATTAAG Position Index: 3


Copyright © 2003 by Sergio Gregorio, Jr.
All rights reserved.