BioPHP: PHP for Biocomputing

Last updated: October 20, 2003

All the scripts were written by Serge Gregorio. Check out the Writer module for Kegg Reaction records (and its source code) and the SQL/XML Converter module in the GenBank DNA parser. The Remote Data Retrieval via DBGet module for Swissprot protein records might not work here because SourceForge has disabled the cURL library for PHP. This section is (always) under development.

To view the source code for the smaller, simpler file formats, click here.


Input/Output Scripts

Amino Acid databases

AAIndex File (20% complete)

Extracts individual data elements from an AAIndex file (amino acids). At the moment, it can only parse three data fields: Accession Number, Data Description, and Literature References.
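As a rough illustration of what parsing those three fields involves, here is a minimal sketch in Python (not the BioPHP PHP code). It assumes the usual AAindex flat-file layout: a one-letter field code in column 1 (H for accession, D for description, R/A/T/J for the reference), indented continuation lines, and a "//" terminator. The function name and the sample record are hypothetical and made up for this example.

```python
def parse_aaindex_entry(text):
    """Collect accession (H), description (D), and reference lines (R, A, T, J)."""
    fields = {}
    current = None
    for line in text.splitlines():
        if line.startswith("//"):       # end-of-entry marker
            break
        if not line.strip():
            continue
        if line[0] != " ":              # new field: one-letter code in column 1
            current = line[0]
            fields[current] = line[2:].strip()
        elif current is not None:       # indented continuation of the last field
            fields[current] = (fields[current] + " " + line.strip()).strip()
    return {
        "accession": fields.get("H", ""),
        "description": fields.get("D", ""),
        "reference": {
            "id": fields.get("R", ""),
            "authors": fields.get("A", ""),
            "title": fields.get("T", ""),
            "journal": fields.get("J", ""),
        },
    }


# Made-up record, for illustration only (not a real AAindex entry).
AAINDEX_SAMPLE = """H EXAM000001
D Example scale (made-up entry)
R PMID:0000000
A Doe, J. and Roe, R.
T A title that wraps
  onto a second line
J J. Example 1, 1-2 (2000)
//
"""

entry = parse_aaindex_entry(AAINDEX_SAMPLE)
print(entry["accession"], "|", entry["reference"]["title"])
```

The sketch ignores the numeric index data (the I field) and keeps only the last occurrence of a repeated field code.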

DNA databases

EMBL (10% complete)

Extracts individual data elements from an EMBL DNA file.

GenBank (has a SQL/XML Converter module!)

Extracts individual data elements from a GenBank file (nucleic and amino acids).

RefSeq (0% complete)

Extracts individual data elements from a RefSeq record (in FASTA format).
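Since this parser is not yet written, the following is only a sketch of how FASTA-formatted records could be split into header and sequence. It is in Python rather than PHP, and the function name and sample input are hypothetical; it is not taken from BioPHP.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):        # header line starts a new record
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:        # sequence line belonging to current record
            chunks.append(line)
    if header is not None:              # flush the final record
        records.append((header, "".join(chunks)))
    return records


# Made-up input, for illustration only.
FASTA_SAMPLE = ">example_1 made-up description\nACGT\nTTGA\n>example_2\nGG\n"

for name, seq in parse_fasta(FASTA_SAMPLE):
    print(name, len(seq))
```

Splitting the header further (for example, into identifier and description) depends on the header conventions of the source database and is left out here.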

Protein and related databases

BLOCKS (40% complete)

Extracts individual data elements from a BLOCKS protein family file.

PDB (80% complete)

Extracts individual data elements from a PDB protein file. At the moment, it cannot yet parse the following sections: REMARK1 to REMARK4, ANISOU, SIGUIJ, TER, HETATM, and ENDMDL.

[ View source code for the PDB file parser ]
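For readers unfamiliar with the format, PDB coordinate records are fixed-column text, so a parser slices each line by position rather than splitting on whitespace. The sketch below, in Python rather than PHP, shows this for ATOM/HETATM lines using the standard column layout. The function name is hypothetical, the sample line is generated for illustration, and this is not the BioPHP implementation.

```python
def parse_atom_line(line):
    """Slice one ATOM/HETATM record by its fixed columns; return None otherwise."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        return None
    line = line.ljust(80)               # pad short lines so slices are safe
    return {
        "record": record,
        "serial": int(line[6:11]),      # columns 7-11
        "name": line[12:16].strip(),    # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain_id": line[21].strip(),   # column 22
        "res_seq": int(line[22:26]),    # columns 23-26
        "x": float(line[30:38]),        # columns 31-38
        "y": float(line[38:46]),        # columns 39-46
        "z": float(line[46:54]),        # columns 47-54
    }


# Build an illustrative line with exact column widths (made-up coordinates).
PDB_SAMPLE = "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:>8.3f}{:>8.3f}{:>8.3f}".format(
    "ATOM", 1, " N", "", "ALA", "A", 1, "", 11.104, 6.134, -6.504)

atom = parse_atom_line(PDB_SAMPLE)
print(atom["name"], atom["res_name"], atom["x"], atom["y"], atom["z"])
```

Occupancy, temperature factor, and element columns are omitted to keep the example short.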

PDBSTR (10% complete)

Extracts individual data elements from a PDBSTR protein file.

PIR (Codata) (10% complete)

Extracts individual data elements from a PIR (Codata) protein file.

PMD (40% complete)

Extracts individual data elements from a Protein Mutant Database (PMD) file.

PRF (80% complete)

Extracts individual data elements from a PRF file.

PRINTS Motif File (10% complete)

Extracts individual data elements from a PRINTS Protein Motif file.

PRODOM (20% complete)

Extracts individual data elements from a PRODOM protein family file.

Prosite Motif File (90% complete)

Extracts individual data elements from a Prosite Motif file. Still under development: it cannot yet display all entries in the '/M:' qualifier in the 'MA' (matrix/profile) section of your input Prosite file.

Swissprot (85% complete) (has Remote Data Retrieval via DBGet module!)

Extracts individual data elements from a Swissprot protein file. Still under development: at the moment, it can display everything except the FEATURES section. The DBGet module uses the cURL library by Daniel Stenberg. If it doesn't appear to work, it only means that SourceForge (the host of this site) hasn't added support for it yet. I've filed a request, though.

Other databases

EPD Promoter File (10% complete)

Extracts individual data elements from an EPD file. EPD stands for Eukaryotic Promoter Database.

Entrez Genome (80% complete)

Extracts individual data elements from an Entrez Genome file or record, except for the FEATURES section.

[ View source code of Entrez Genome parser ]

ENZYME (Expasy)

Extracts individual data elements from an ENZYME (Expasy) file or record.

HGBase Haploid-type Mutation File (80% complete)

Extracts individual data elements from an HGBase Haploid-type Mutation file.

Kegg Compound File

Extracts individual data elements from a Kegg Compound file.

Kegg Enzyme File (90% complete)

Extracts individual data elements from a Kegg Enzyme file. At the moment, all sections except REFERENCE can be parsed.

Kegg Genome File (15% complete)

Extracts individual data elements from a Kegg Genome file. At the moment, it can only parse the ENTRY, NAME, DEFINITION, and TAXONOMY sections.
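To make the section-based parsing concrete, here is a hedged sketch in Python (not the BioPHP PHP code). It assumes the common Kegg flat-file layout: a keyword in the first 12 columns, continuation and sub-keyword lines indented beneath it, and a "///" terminator. The function name and the sample record are hypothetical; the record is made up and not copied from Kegg.

```python
def parse_kegg_record(text, wanted=("ENTRY", "NAME", "DEFINITION", "TAXONOMY")):
    """Group lines by top-level keyword and return only the wanted sections."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("///"):      # end-of-record marker
            break
        keyword, value = line[:12].strip(), line[12:].strip()
        if line[:1] not in ("", " "):   # keyword in column 1 opens a new section
            current = keyword
            sections.setdefault(current, [])
        if current is not None and value:
            sections[current].append(value)   # indented lines extend the section
    return {k: " ".join(v) for k, v in sections.items() if k in wanted}


# Made-up record, for illustration only.
KEGG_SAMPLE = "\n".join([
    "ENTRY       T99999            Complete  Genome",
    "NAME        xyz, EXAMPLE",
    "DEFINITION  Example organism strain 1 (made-up record",
    "            for illustration)",
    "TAXONOMY    TAX:0",
    "  LINEAGE   Example; Lineage",
    "COMMENT     Not parsed by this sketch",
    "///",
])

record = parse_kegg_record(KEGG_SAMPLE)
print(record["NAME"])
print(record["DEFINITION"])
```

Indented sub-keywords such as LINEAGE are simply folded into their parent section here; a fuller parser would keep them as separate fields.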

Kegg Orthology Groups File

Extracts individual data elements from a Kegg Orthology (KO) file.

Kegg Reaction File (has a Writer module!)

Extracts individual data elements from a Kegg Reaction file. From the results page, you can then edit the data fields and write them back as a Kegg Reaction record.

[ View source code for all Kegg file parsers ]

Transfac Binding Sites (25% complete)

Extracts individual data elements from a Transfac Regulatory Protein Binding Sites file. At the moment, it can only parse the Accession, Identifier, Date, Sequence Type, Description, and Gene Region data fields.

Transfac Cell File

Extracts individual data elements from a Transfac Cell record.

Transfac Factor Table (25% complete)

Extracts individual data elements from a Transfac Transcription Factor Table record.

Transfac Gene Files (70% complete)

Extracts individual data elements from a Transfac Gene file.

Transfac Matrix Files (40% complete)

Extracts individual data elements from a Transfac Matrix file.

Transfac T-Factor Class (50% complete)

Extracts individual data elements from a Transfac Transcription Factor Class record.

UNIGENE (15% complete)

Extracts individual data elements from a Unigene record.

Copyright © 2003 by Sergio Gregorio, Jr.
All rights reserved.