For some research work I’m doing, I would like to do some Bayesian modeling for text categorization. Since I’m not interested in re-inventing wheels when there are plenty of very well constructed wheels available for the taking, I thought I would install Andrew McCallum’s libbow and rainbow packages on my Mac. Of course, I had a little bit of trouble, and thought it would be a good idea to document how I went about installing since a quick google search didn’t turn much up. (Not quite true: I turned up one or two references to the packages being available via fink, but I couldn’t find them in my setup.)
This Apple Technote on porting command line tools has some interesting information on header files and other gotchas when working on OSX.
The first problem was
make
gcc -c -Ibow -I. -I./argp -DHAVE_STRERROR=1 -DHAVE_GETTIMEOFDAY=1 -DHAVE_RANDOM=1 -DHAVE_SRANDOM=1 -DHAVE_SETENV=1 -DHAVE_STRCHR=1 -DHAVE_STRRCHR=1 -DHAVE_LOG2F=1 -DHAVE_SQRTF=1 -DHAVE_ALLOCA_H=1 -g -O -Wall -Wimplicit -o array.o array.c
In file included from array.c:22:
./bow/libbow.h:40:53: error: malloc.h: No such file or directory
This isn’t really a big shock, since Apple keeps its malloc.h header off in /usr/include/sys/, where GCC won’t find it by default. Since OSX, like other POSIX implementations, declares the malloc functions in stdlib.h, I just removed all references to malloc.h from the code.
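For anyone wondering why dropping the header is safe, here is a minimal sketch. The helper name make_counts is made up for illustration and is not part of libbow; the point is only that the allocation functions compile fine with stdlib.h alone.

```c
#include <assert.h>
#include <stdlib.h>   /* standard C declares malloc, calloc, realloc and free here */

/* Hypothetical helper (not from libbow): allocate n zeroed counters
   without ever including <malloc.h>. Call it with n > 0. */
int *make_counts(size_t n)
{
    int *counts = calloc(n, sizeof *counts);
    assert(counts != NULL);   /* this sketch simply aborts if allocation fails */
    return counts;
}
```

The caller releases the memory with free(), which also comes from stdlib.h.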
That went well, and got me up to
gcc -c -Ibow -I. -I./argp -DHAVE_STRERROR=1 -DHAVE_GETTIMEOFDAY=1 -DHAVE_RANDOM=1 -DHAVE_SRANDOM=1 -DHAVE_SETENV=1 -DHAVE_STRCHR=1 -DHAVE_STRRCHR=1 -DHAVE_LOG2F=1 -DHAVE_SQRTF=1 -DHAVE_ALLOCA_H=1 -g -O -Wall -Wimplicit -o array.o array.c
In file included from array.c:22:
./bow/libbow.h:1631: warning: type qualifiers ignored on function return type
./bow/libbow.h:2128: error: array type has incomplete element type
The warning isn’t a big deal, but the error is a bit more troubling. It pops up because GCC has become stricter about following the C standard on incomplete types (a question on a similar problem; see the two follow-ups as well). I could try to track things down and make sure that the argp_child struct is defined before the extern struct argp_child bow_argp_children[]; line, but I thought it was easier to change the version of gcc to 3.3. Apple ships with a few gcc compilers installed; you can use gcc_select to list (-l) and change the default system compiler. I changed to 3.3. That got me up to:
barrel.c:22:20: values.h: No such file or directory
barrel.c: In function `barrel_iterator_count_for_doc':
barrel.c:1182: warning: division by zero
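Going back to the incomplete-type error for a moment: the fix I skipped would look roughly like the sketch below. The struct child_entry and its fields are stand-ins I made up, not the real argp_child layout. All it shows is that the struct has to be fully defined before an array of it is declared.

```c
#include <stddef.h>

/* Stand-in for the real struct; the name and fields are illustrative only. */
struct child_entry {
    const char *header;
    int group;
};

/* Accepted by the stricter compiler: the element type is complete by now.
   Declaring this array before the struct definition above is what triggers
   "array type has incomplete element type". */
extern struct child_entry child_table[];

struct child_entry child_table[] = {
    { "first", 0 },
    { "second", 1 },
    { NULL, 0 }   /* terminator entry */
};

/* Count the entries that come before the NULL-header terminator. */
size_t child_table_length(void)
{
    size_t n = 0;
    while (child_table[n].header != NULL)
        n++;
    return n;
}
```

In libbow's case that would presumably mean making sure the header that defines the struct is included before line 2128 of libbow.h, which I did not try.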
A quick search of the web turned up the nugget that values.h has been replaced by float.h, so over in the barrel.c file and bow/svm.h I replaced #include <values.h> with #include <float.h> and things went fine, until I ran across a new error:
svm_smo.c:194: error: `MAXDOUBLE' undeclared (first use in this function)
svm_smo.c:194: error: (Each undeclared identifier is reported only once
svm_smo.c:194: error: for each function it appears in.)
This was an artifact of swapping in float.h for values.h: MAXDOUBLE is now called DBL_MAX, so making that change in the broken file fixed that problem. That actually compiled all of the object files, so now we just have to deal with linking. Here’s a new error:
ld: can't locate file for: -lcrypt
It looks like the problem here is that libcrypt does not exist for OSX. Instead, you can try using libcrypto. What are the differences? I have no idea. So in Makefile.in, I added an “o” to the ALL_LIBS line that specified libcrypt, re-configured, re-made, ran make install, and everything installed wonderfully. Time to do some model building.
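One footnote on the values.h step: instead of renaming MAXDOUBLE by hand in each file that breaks, a small compatibility shim might do the job. This is a sketch I have not tried against the libbow sources, and the function smallest is a made-up example of how such a constant typically gets used.

```c
#include <assert.h>
#include <float.h>    /* DBL_MAX is defined here in standard C */
#include <stddef.h>

/* Compatibility shim: keep the old values.h name working for legacy code. */
#ifndef MAXDOUBLE
#define MAXDOUBLE DBL_MAX
#endif

/* Hypothetical example: seed a minimum search with the largest finite double. */
double smallest(const double *xs, size_t n)
{
    double best = MAXDOUBLE;
    size_t i;
    assert(xs != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (xs[i] < best)
            best = xs[i];
    return best;   /* stays MAXDOUBLE when n == 0 */
}
```

The #ifndef guard means the shim does nothing on a system where values.h still exists and already defines the macro.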