Monday, June 25, 2007

Week 5 Project Update

Below is the weekly update email I sent out this morning.

PROJECT: Command Line Topological Query Application for BioSQL

Last Week I:
  • Tried to work with Bio::Phylo, but decided to stick with Bio::Tree
  • Finished PhyExport - a command line program to export trees from the PhyloDB
This week:
  • Start PhyOpt - a phylogeny optimizer program to calculate nested sets
  • Extend PhyExport to export subtrees based on nested set values calculated by Ph
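
For reference, nested set values can be calculated with a single depth-first walk that numbers each node once on the way in and once on the way out. The following is only a minimal sketch of that idea, not the PhyOpt code: the %children hash is a hypothetical stand-in for the edge table, and the names number_node and in_subtree are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical tree: parent node id => list of child node ids.
my %children = (
    1 => [ 2, 3 ],
    3 => [ 4, 5 ],
);

my %left;           # node id => left nested set value
my %right;          # node id => right nested set value
my $counter = 0;

# Depth-first walk: number a node on the way in and again on the way out.
sub number_node {
    my ($node) = @_;
    $left{$node} = ++$counter;
    number_node($_) for @{ $children{$node} || [] };
    $right{$node} = ++$counter;
    return;
}

# A node belongs to the subtree of $anc when its interval nests inside $anc's.
sub in_subtree {
    my ( $node, $anc ) = @_;
    return $left{$node} >= $left{$anc} && $right{$node} <= $right{$anc};
}

number_node(1);
for my $node ( sort { $left{$a} <=> $left{$b} } keys %left ) {
    print "$node\t$left{$node}\t$right{$node}\n";
}
```

With values like these stored for each node, a subtree export reduces to a range comparison on the left and right values (the in_subtree test above) instead of a recursive fetch.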

Friday, June 22, 2007

PhyExport: Working copy posted

I now have a working version of PhyExport (as phyexport.pl) posted on the project source code repository. This version uses the Bio::Tree object. The biggest problem I had was figuring out how to add data to the tree object from inside a recursive subfunction. The recursive subfunction fetches all of the child nodes, starting from the root.

I ended up giving the program the package name PhyloDB. I then used 'our $tree' to make the tree object a package-level variable. This allowed me to add nodes to the tree as $PhyloDB::tree.
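
The package variable approach can be sketched as follows. This is an illustration only, not the phyexport.pl code: a plain hash stands in for the Bio::Tree object, the %edges hash stands in for the database edge table, and load_children is a hypothetical name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package PhyloDB;

# Package-level variable: reachable as $PhyloDB::tree from any scope.
our $tree = { nodes => [] };

# Hypothetical stand-in for the edge table: parent id => child ids.
my %edges = ( 1 => [ 2, 3 ], 3 => [ 4, 5 ] );

# Recursively visit the children, recording each node in the shared tree.
sub load_children {
    my ($parent) = @_;
    for my $child ( @{ $edges{$parent} || [] } ) {
        push @{ $PhyloDB::tree->{nodes} }, { id => $child, parent => $parent };
        load_children($child);
    }
    return;
}

push @{ $PhyloDB::tree->{nodes} }, { id => 1, parent => undef };
load_children(1);
print scalar( @{ $PhyloDB::tree->{nodes} } ), " nodes loaded\n";
```

Passing the tree into the subfunction as an argument would avoid the shared variable, at the cost of one more parameter in every recursive call.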

PhyExport can now export node names, edge lengths, and bootstrap values in any export format that Bio::Tree can use.

I have a lot of clean up work to do with this, but at least I have something that works now.

Wednesday, June 20, 2007

PhyImport Bug: Child and parent switched

I just realized that the child and parent ids were getting switched in the edge table by the phyimport program. I did not notice this until I tried to extract trees back out of the database. Silly mistake.
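
One way to catch this kind of mistake early is to check that the imported edges yield exactly one root, meaning one node that never appears as a child. The sketch below assumes the edge rows are available as [child_id, parent_id] pairs; find_roots is a hypothetical helper and not part of phyimport.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the ids of all nodes that never appear as a child.
# Edge rows are assumed to be [child_id, parent_id] pairs.
sub find_roots {
    my (@edges) = @_;
    my ( %is_child, %seen );
    for my $edge (@edges) {
        my ( $child, $parent ) = @{$edge};
        $is_child{$child} = 1;
        $seen{$_} = 1 for $child, $parent;
    }
    return sort { $a <=> $b } grep { !$is_child{$_} } keys %seen;
}

my @good    = ( [ 2, 1 ], [ 3, 1 ], [ 4, 3 ], [ 5, 3 ] );
my @swapped = map { [ reverse @{$_} ] } @good;

print "correct: roots = @{[ find_roots(@good) ]}\n";
print "swapped: roots = @{[ find_roots(@swapped) ]}\n";
```

With the columns swapped, every leaf looks like a root, so a tree with more than one leaf fails the single-root check.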

Tuesday, June 19, 2007

Google Changing Midterm Evaluation Criteria

Although I plan on meeting my midterm deadlines for submitting code to my project's SVN repository on Google, I just saw that Google is changing the midterm evaluation requirements. The signal-to-noise ratio on the GSoC student and announce mailing lists is poor, so I am posting the message here FYI.

So, we're actually changing the code submission requirement for the mid-term to a requirement that you fill out a survey instead - we haven't had the resources to implement the infrastructure we wanted to have in place before you submitted your code to us.

I'll also make a post about this to the announcement list, but:

1) No student mid-term code submission.
2) Students need to take a survey instead.
3) Instructions for completing the survey will be sent to the program
announcement list.
4) You will be able to complete the survey between July 9 - July 16th.

I'll also update the FAQ in the next few days.

Cheers,
LH

Bio::Phylo -- giving up for now

The ioctl problem in Bio::Phylo is not easy to fix. I ran the script with strace:

strace perl -le phylotest.pl

and I get the following output that I really can't figure out:

execve("/usr/local/bin/perl", ["perl", "-le", "phylotest.pl"], [/* 41 vars */]) = 0
uname({sys="Linux", node="JamieHidThis", ...}) = 0
brk(0) = 0x9643000
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/tls/i686/mmx/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/tls/i686/mmx", 0xbfffa1a0) = -1 ENOENT (No such file or directory)
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/tls/i686/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/tls/i686", 0xbfffa1a0) = -1 ENOENT (No such file or directory)
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/tls/mmx/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/tls/mmx", 0xbfffa1a0) = -1 ENOENT (No such file or directory)
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/tls/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/tls", 0xbfffa1a0) = -1 ENOENT (No such file or directory)
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/i686/mmx/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/i686/mmx", 0xbfffa1a0) = -1 ENOENT (No such file or directory)
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/i686/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/i686", 0xbfffa1a0) = -1 ENOENT (No such file or directory)
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/mmx/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/mmx", 0xbfffa1a0) = -1 ENOENT (No such file or directory)
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/libperl.so", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@\10\2\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0555, st_size=1194580, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75f8000
old_mmap(NULL, 1205760, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x944000
old_mmap(0xa5e000, 45056, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x119000) = 0xa5e000
old_mmap(0xa69000, 5632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xa69000
close(3) = 0
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=80436, ...}) = 0
old_mmap(NULL, 80436, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb75e4000
close(3) = 0
open("/lib/libnsl.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0 <\0\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=87563, ...}) = 0
old_mmap(NULL, 80480, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xc5b000
old_mmap(0xc6c000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x11000) = 0xc6c000
old_mmap(0xc6d000, 6752, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xc6d000
close(3) = 0
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/libdl.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/libdl.so.2", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\260\32"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=13601, ...}) = 0
old_mmap(NULL, 12244, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x907000
old_mmap(0x909000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x1000) = 0x909000
close(3) = 0
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/libm.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/tls/libm.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\3604\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=185942, ...}) = 0
old_mmap(NULL, 135616, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xee2000
old_mmap(0xf03000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x21000) = 0xf03000
close(3) = 0
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/libpthread.so.0", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/tls/libpthread.so.0", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\0G\0\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=86486, ...}) = 0
old_mmap(NULL, 65140, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x33d000
old_mmap(0x34a000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xc000) = 0x34a000
old_mmap(0x34b000, 7796, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x34b000
close(3) = 0
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/tls/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200X\1"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1516255, ...}) = 0
old_mmap(NULL, 1279980, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x15c000
old_mmap(0x28f000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x132000) = 0x28f000
old_mmap(0x292000, 10220, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x292000
close(3) = 0
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/libcrypt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/libcrypt.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220\t\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=22242, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75e3000
old_mmap(NULL, 181308, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x111000
old_mmap(0x116000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x4000) = 0x116000
old_mmap(0x117000, 156732, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x117000
close(3) = 0
open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/libutil.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/libutil.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\16\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=11375, ...}) = 0
old_mmap(NULL, 11012, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xabf000
old_mmap(0xac1000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x1000) = 0xac1000
close(3) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75e2000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb75e2080, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
munmap(0xb75e4000, 80436) = 0
set_tid_address(0xb75e20c8) = 20752
rt_sigaction(SIGRTMIN, {0x341660, [], SA_RESTORER|SA_SIGINFO, 0x347f80}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=10240*1024, rlim_max=RLIM_INFINITY}) = 0
rt_sigaction(SIGFPE, {SIG_IGN}, {SIG_DFL}, 8) = 0
brk(0) = 0x9643000
brk(0x9664000) = 0x9664000
brk(0) = 0x9664000
getuid32() = 507
geteuid32() = 507
getgid32() = 507
getegid32() = 507
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=32148976, ...}) = 0
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb73e2000
close(3) = 0
mmap2(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb73c1000
time([1182268075]) = 1182268075
stat64("/home/jestill/src/bioperl/bioperl-live/5.8.0/i386-linux-thread-multi", 0xbfffa800) = -1 ENOENT (No such file or directory)
stat64("/home/jestill/src/bioperl/bioperl-live/5.8.0", 0xbfffa800) = -1 ENOENT (No such file or directory)
stat64("/home/jestill/src/bioperl/bioperl-live/i386-linux-thread-multi", 0xbfffa800) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/5.8.0/5.8.0/i386-linux-thread-multi", 0xbfffa800) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/5.8.0/5.8.0", 0xbfffa800) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/5.8.0/i386-linux-thread-multi", {st_mode=S_IFDIR|0755, st_size=8192, ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
_llseek(0, 0, 0xbfffa5f0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
_llseek(1, 0, 0xbfffa5f0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
_llseek(2, 0, 0xbfffa5f0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
open("/dev/null", O_RDONLY|O_LARGEFILE) = 3
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfffa688) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(3, 0, [0], SEEK_CUR) = 0
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
fstat64(3, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
readlink("/proc/self/exe", "/usr/bin/perl", 4095) = 13
getpid() = 20752
getppid() = 20751
close(3) = 0
exit_group(0) = ?
Process 20752 detached


It looks like the problem is toward the bottom, involving SNDCTL_TMR_TIMEBASE, but I really have no clue what that is. I am giving up on using Bio::Phylo, at least for now.

Monday, June 18, 2007

Bio::Phylo

I have installed Bio::Phylo from CPAN. I will first try to get phyimport.pl up and running using the Bio::Phylo object model for nodes. If this works without too much trouble, I will use Bio::Phylo for PhyExport as well. This should allow more of the information related to the tree to be added to the database and exported to output files.

Bio::Phylo documentation.

Update:
I am getting the following error when trying to parse a NEXUS file
"Inappropriate ioctl for device"

I don't know what is going on with that. If I don't get this working quickly I will abandon Bio::Phylo.

Week 4 Project Update

I spent most of last week out of town for a meeting so I have some catching up to do this week.

Last week:
  • Tried to get a more stable dsn parser to work by using DBI subfunction parse_dsn
  • Finished up PhyImport
  • Updated code to better fit with existing bioperl coding standards
  • Fixed my installation of bioperl (This was to get NEXUS file import working)
    • I am now working from bioperl-live
  • Committed biosql-phylodb-mysql.sql to biosql-schema CVS
    • This is my first commit to a group project :)
    • I hope bioperl converts to SVN soon
This week:
  • Finish PhyExport to export trees from the database to text files
  • Start PhyOpt by getting precomputed nested sets working
  • Try to figure out if I really want to continue to use Bio::Tree or switch to Bio::Phylo
    • The Bio::Phylo object model seems richer, but it is not a bioperl module

PhyImport: Added root node info to tree table

I have added the root node to the tree table. This was not too hard to do using the Tree object in bioperl.

I tried to get the parse_dsn function from DBI to work, but it is not parsing the dsn correctly, and I need to move forward on other aspects of the project. For now, only a specific dsn string format will be properly parsed by PhyImport.
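
For illustration, a DSN of the common form dbi:Driver:key=value;key=value can be split by hand with a few split calls. This is a sketch under that assumption and not the PhyImport code: the DSN layout shown is a guess at a typical MySQL DSN, and split_dsn is a hypothetical helper that is not part of DBI.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a DSN such as "dbi:mysql:database=biosql;host=localhost" into parts.
# Returns a hash reference, or nothing if the string does not look like a DSN.
sub split_dsn {
    my ($dsn) = @_;
    my ( $scheme, $driver, $rest ) = split /:/, $dsn, 3;
    return unless defined $rest && lc($scheme) eq 'dbi';

    my %part = ( driver => $driver );
    for my $pair ( split /;/, $rest ) {
        my ( $key, $value ) = split /=/, $pair, 2;

        # A bare value with no "key=" is treated as the database name.
        if ( defined $value ) { $part{$key}     = $value }
        else                  { $part{database} = $key }
    }
    return \%part;
}

my $parts = split_dsn('dbi:mysql:database=biosql;host=localhost');
print join( ', ', map { $_ . '=' . $parts->{$_} } sort keys %{$parts} ), "\n";
```

Drivers differ in which keys they accept (database, dbname, host, port and so on), so a hand-rolled parser like this only covers the formats it was written for.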

I have changed the name of programs to lowercase.

I am considering switching to using R. Vos's Bio::Phylo. It seems like a richer object model for phylogenies.

I am setting PhyImport aside for now to work on PhyExport but I will come back to it later.

Friday, June 15, 2007

PhyImport: Trying to fix import of NEXUS file

It seems like the trouble I am having with NEXUS file parsing has something to do with the installation of bioperl I was using. Running PhyImport.pl with bioperl-live has fixed the problem.

Note to self:
To see what version of bioperl is being used, I need to do the following from the command line:
$ perl -MBio::Perl -le 'print Bio::Perl->VERSION;'


Thursday, June 14, 2007

Attempted to Post to CVS

I just tried to commit something to the CVS server for the first time (biosql-phylodb-mysql.sql), and I am not sure if it worked. It seems like SVN is a bit easier to use.

PERL Coding Practices

I got a really good email from my mentor regarding coding suggestions and coding practices in PERL/bioperl. There have also been some recent discussions of coding practices on the bioperl mailing list. This has all had me looking up info on coding practices, which I am linking here just so I know where to go for the links:
I am very self taught when it comes to coding, and PERL is a very vulgar programming language, so it is good for me to see how other folks are implementing standards. This seems very important with group open source projects.

Tuesday, June 12, 2007

Week 3 Update Email

I am currently out of town for some wheat related work, below is the email update I sent out.

Week 3 project update: Command Line Topological Query Application for BioSQL

Last week I:
  • Made changes to MySQL phylo tables in the biosql-phylodb-mysql.sql (http://phylosoc2007jestill.googlecode.com/svn/trunk/sql/biosql-phylodb-mysql.sql) to get foreign keys working and to include recent changes in the biosql schema
  • Updated PhyInit.pl to include these schema changes
  • Completed a version of PhyImport that uses the TreeIO module of bioperl to import tree nodes and edges
  • Generated random trees to serve as test import trees for PhyImport
This week I will:
  • Figure out problems I am having importing NEXUS trees with PhyImport (something to do with my bioperl installation)
  • Add node and edge attribute information to PhyImport
  • Add tree root information to PhyImport
  • Begin PhyExport for whole tree export

Friday, June 8, 2007

PhyImport: Bio::TreeIO and Nexus problems

I was able to get nodes and edges loaded into the database for the example newick file using Bio::TreeIO. This is working for newick and New Hampshire extended files. However, I can't find a nexus file that Bio::TreeIO can seem to handle. :(. Perhaps this is due to the general chaos surrounding the NEXUS "standard", but I would like to get the import working for nearly all NEXUS files.

Maybe I should switch to the Bio::Phylo object, but I wanted to use the object model that was most tightly integrated with bioperl. I am trying to see if I can generate at least one "nexus" file that I can parse.

Since I can not even convert from a newick file to nexus, I seem to be having trouble with my installation of bioperl similar to a recent discussion: (http://portal.open-bio.org/pipermail/bioperl-l/2007-February/024829.html).

Thursday, June 7, 2007

PhyImport: Test Newick Format Tree



I added a randomly generated tree to the code repository. This is a simple newick format tree with 26 leaf nodes. The image links to the *.tre file.
The file is randtree_26.tre.

This will serve as a test file for the development of PhyImport.

The tree was generated using RandTree.pl. I have been having trouble with the NEXUS file I originally wanted to use.

I have stopped trying to get parseTreesPG.pl to work with the MySQL schema. I have moved on to using the Bio::TreeI object as I originally proposed. Since the tree above was generated with Bio::Tree::RandomFactory, it works with Bio::TreeI.

PhyImport: Can't defer foreign keys in MySQL

The use of InnoDB with foreign keys now causes the following error in the parseTrees program:

DBD::mysql::st execute failed: Cannot add or update a child row: a foreign key constraint fails at ./parseTreesPG.pl line 710.

This is not a problem in PG because foreign key checks are deferrable. Since foreign keys are not deferrable in MySQL, I am temporarily turning off FK checks in the PERL code:
$dbh->do("SET FOREIGN_KEY_CHECKS=0");
#UPDATE tree TABLE HERE
$dbh->do("SET FOREIGN_KEY_CHECKS=1");
to deal with this in MySQL.
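
One risk with this pattern is that a failure between the two statements leaves the checks turned off for the rest of the session. A minimal sketch of a wrapper that always restores them is given below. The name with_fk_checks_off is hypothetical, and the RecordingHandle class is only a stand-in that records statements so the sketch runs without a database; in real use $dbh would be a DBI database handle.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run $code with MySQL foreign key checks off, restoring them even on failure.
# $dbh is expected to be a DBI database handle (anything with a do() method).
sub with_fk_checks_off {
    my ( $dbh, $code ) = @_;
    $dbh->do('SET FOREIGN_KEY_CHECKS=0');
    my $ok  = eval { $code->(); 1 };
    my $err = $@;
    $dbh->do('SET FOREIGN_KEY_CHECKS=1');    # restore even if $code died
    die $err unless $ok;
    return;
}

# Stand-in handle that only records the statements it is given.
package RecordingHandle;
sub new { return bless { log => [] }, shift }
sub do  { my ( $self, $sql ) = @_; push @{ $self->{log} }, $sql; return 1 }

package main;

my $dbh = RecordingHandle->new;
with_fk_checks_off( $dbh,
    sub { $dbh->do('UPDATE tree SET node_id = 1 WHERE tree_id = 1') } );
print "$_\n" for @{ $dbh->{log} };
```

The UPDATE statement here is a placeholder for whatever tree table update needs the checks disabled.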


This solves this problem, but now I am still getting problems with commit:
commit ineffective with AutoCommit enabled at ./parseTreesPG.pl line 316.
Commmit ineffective while AutoCommit is on at ./parseTreesPG.pl line 316.
DBD::mysql::db commit failed: Commmit ineffective while AutoCommit is on at ./parseTreesPG.pl line 316.


I therefore added a check to see if AutoCommit was on before attempting $dbh->commit:

unless ($dbh->{AutoCommit}) {
    $dbh->commit;
}
I am now trying to see if this will fix the problem without introducing new errors.

MySQL Schema Changes, Blog comments enabled

I've made changes to the MySQL schema to fit the changes made by H. Lapp to the Postgres version of the PhyloDB extensions. Since MySQL does not support booleans I used ENUM:
is_rooted ENUM ('FALSE', 'TRUE') DEFAULT 'TRUE'
in the tree table.

Comments are now enabled in the blog. I did not know that they were turned off.

Wednesday, June 6, 2007

PhyInit: INT(10) != INTEGER

I fixed the Foreign Key problems for some of the tables that make references to other PhyloDB tables. However, linking to the other BioSQL tables seems to be a problem because the INTEGER columns in the Phylo tables are created as INT(11) while the INTEGER columns in the other BioSQL tables are INT(10) UNSIGNED.

I am going to make all of the integer columns in the PhyloDB tables INT(10) UNSIGNED so that the foreign keys will work. This will also make the tables consistent with the rest of BioSQL.

PhyInit: Change to InnoDB tables causes ALTER TABLE errors

Changing the table types to InnoDB now causes problems with using ALTER TABLE to create foreign keys.
For example:
ALTER TABLE tree ADD CONSTRAINT FKNode
FOREIGN KEY (node_id) REFERENCES node (node_id);
is giving the error:

DBD::mysql::db do failed: Can't create table './biosql/#sql-cc7_bba.frm' (errno: 150) at ./PhyInit.pl line 363, <> line 1.

Typing the SQL code directly in the MySQL Command line gives:

ERROR 1005: Can't create table './biosql/#sql-cc7_ba6.frm' (errno: 150)

It looks like there may be some help in an online discussion of this issue. It is odd that this is flagged as a "Can't create table" error when it is really an ALTER TABLE problem.

Info on Foreign Key constraints is also in the MySQL manual. The conditions for foreign key definitions that are listed in the MySQL manual are:
  • Both tables must be InnoDB tables and they must not be TEMPORARY tables.
    All of my tables are now InnoDB tables so this is not the problem.

  • Corresponding columns in the foreign key and the referenced key must have similar internal data types inside InnoDB so that they can be compared without a type conversion. The size and sign of integer types must be the same. The length of string types need not be the same. For non-binary (character) string columns, the character set and collation must be the same.
    Both columns in the broken SQL are INT(11) so this is probably not the problem.

  • In the referencing table, there must be an index where the foreign key columns are listed as the first columns in the same order. Such an index is created on the referencing table automatically if it does not exist.
    This is it: adding indexes to the tables fixed the problem.

  • In the referenced table, there must be an index where the referenced columns are listed as the first columns in the same order.
    This is it: adding indexes to the tables fixed the problem.

  • Index prefixes on foreign key columns are not supported. One consequence of this is that BLOB and TEXT columns cannot be included in a foreign key, because indexes on those columns must always include a prefix length.
    This is not the problem since the columns are INT(11).

  • If the CONSTRAINT symbol clause is given, the symbol value must be unique in the database. If the clause is not given, InnoDB creates the name automatically.
    This is not the problem since FKNode is a unique value in the database. As a test, I ran the SQL without specifying the symbol, and I still got the error.

I am crossing these off the list as I can.
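
Since the fix was to add the indexes before adding the constraint, the two statements can be generated together so that one is never issued without the other. This is a rough sketch only: fk_statements is a hypothetical helper, the idx_<column> index name is an assumption, and the referenced column is assumed to be indexed already (for example as a primary key).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the statements for one foreign key: index first, then the constraint.
# The index name "idx_<column>" is an assumption made for this sketch.
sub fk_statements {
    my ( $table, $column, $ref_table, $ref_column, $symbol ) = @_;
    return (
        "CREATE INDEX idx_$column ON $table ($column)",
        "ALTER TABLE $table ADD CONSTRAINT $symbol "
          . "FOREIGN KEY ($column) REFERENCES $ref_table ($ref_column)",
    );
}

print "$_;\n" for fk_statements( 'tree', 'node_id', 'node', 'node_id', 'FKNode' );
```

Each returned string could then be passed to $dbh->do in order.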

Transaction Support in MySQL

I am working with the parseTreesPG.pl script to make it work with MySQL and I am having trouble with transaction support. The use of
$dbh->commit();
is currently causing fatal errors with the message
commit ineffective with AutoCommit enabled at ./parseTreesPG.pl line 736
According to the DBI documentation, this message is issued when commit is called while AutoCommit is on. Drivers for databases that do not support transactions always run with AutoCommit on.

It looks like transaction support for MySQL has been around for a few years, but I have never worked with transactions before so this is new for me.

I am working through the Requirements for Transaction Support in MySQL to see where the trouble is.
  • The version of MySQL I am using (4.0.18-standard) should support transactions
  • The version of DBD::mysql I am using supports transactions
  • ISAM and MyISAM tables in MySQL do NOT support transactions.
  • The table types that do support transactions are: BDB, InnoDB and Gemini.
The code that I am using to create tables currently does not specify the table type, so MyISAM tables are being created. So my guess is that the MyISAM tables are the problem.

It looks like I will need to make sure that MySQL is creating InnoDB tables by modifying the CREATE TABLE syntax in the PhyInit.pl script to specify the table type as INNODB. This would be something like:
CREATE TABLE tree (
    tree_id INTEGER NOT NULL auto_increment,
    name VARCHAR(32) NOT NULL,
    identifier VARCHAR(16),
    node_id INTEGER NOT NULL,
    PRIMARY KEY (tree_id),
    UNIQUE (name)
) TYPE=INNODB;

Monday, June 4, 2007

Week 2 Project update

Below is the weekly progress report email I sent to the Wg-phyloinformatics listserv.

Hi All --

Week 2 Update for: Command Line Topological Query Application for BioSQL

Last week:
* Updated project web page:
- to reflect changes in command line options
- linked to SVN source
- linked to existing code that is relevant to what I am working on
* Modified my original command line options to fit the standards used in the existing BioSQL scripts
* Wrote PhyInit.pl to initialize the phylogenetic data tables for BioSQL
- This currently assumes an existing BioSQL database
- DB handle info can be sent as:
(1) dsn string as ENV variable,
(2) dsn string at command line,
(3) separate command line vars (--host,--driver,--dbname) that are used to create dsn string
- A new DB will be created if a DB with the name in the dsn string does not exist
() This uses the --dbname or a series of split commands to get info from command line --dsn
() SQL create table code is hard coded in MySQL format
- Password can be entered in a 'secure' fashion if not passed at the command line.
* Given an existing DB, only the new tables will be created, existing tables will be deleted
- User is warned before deleting any existing data. This step does a record count to tell the user how many records would be deleted from any existing tables.
* Started PhyImport.pl to import phylogenetic data from NEXUS,Newick files
- Mainly just set up command line options and POD documentation
* Posted changes I made to the schema to get this to work in MySQL on the project code repository
* All new PERL code was placed in the project working repository listed below

* The -h or --help command line switch can be used to read POD documentation

This week:
* PhyImport.pl
- Add NEXUS file support
- Add Newick file support
* PhyInit.pl
- Check over POD documentation
- Add ability to use sql in the sqldir, this will create the full BioSQL schema if needed

Project Web: https://www.nescent.org/wg_phyloinformatics/PhyloSoC:Command_Line_Topological_Query_Application_for_BioSQL
Project blog: http://phylosoc2007jestill.blogspot.com
Working repository: http://code.google.com/p/phylosoc2007jestill

Friday, June 1, 2007

PhyInit: Create tables

I added the code to create the phylo tables if they do not exist. The other BioSQL tables are not currently created, and the current version uses hard coded SQL instead of running external SQL code. The current implementation will only work with MySQL.

Download of source available: PhyInit.pl