Thursday, November 1, 2007

AANRO data in Muradora

Last week Namchi Nguyen (Chi) got in contact with us to show that the Muradora team had added an import utility so that Muradora can index items sitting in an existing Fedora repository. (Fez can do this, and after some prompting from the ARROW and RUBRIC communities VTLS added this feature to Vital as well.)

The Muradora team have put up a demo server containing AANRO data and some other stuff. Remember this is someone else's server and the data may not be in there for long.

I had a conversation with Chi this week about this new version. It looks promising and it seems that it would meet most of the AANRO requirements, although with its focus very much on access control it may not be a perfect match for AANRO's open access data.

There are a couple of issues that need to be resolved (I've already mentioned this stuff to Chi):

  1. At present when you do a search the default behaviour is to report the total number of matches, and to filter out the results you are not meant to see only when they are displayed. From what I know of institutional repositories this is not acceptable, as even knowing that someone is working on a particular topic may be a problem for some intellectual property.

    I'm not sure that this would be a huge problem for AANRO, but in the current version it certainly compromises usability in the general case.

    Chi tells me they will fix this soon by having a 'guest' mode that only searches open access objects.

  2. The current interface is not very hypertextual: once you get to a metadata page there are no links to see other things about the same subject or by the same author. I'm sure this would be simple to add.

  3. There is nothing in the demo to show a subject hierarchy or ontology at work, unless you count collections (we don't have that on our Solr demo either, come to think of it).

  4. There are a number of bugs and little tidy-ups that are required, and anyone taking on this software would have to be prepared to work with something where they would be amongst the first to use it.

  5. The current demo does not have its metadata editing configured to work with author affiliation in MODS. I've confirmed with Chi that this is just a matter of writing some more XForms code, which can be tricky but there are others working on the same thing (also without affiliation, though).
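
To make the search issue in item 1 concrete, here is a minimal Python sketch of the two behaviours. It is not Muradora code: the `Record` class, the `open_access` flag, the sample titles and both function names are made up for illustration, and a real repository would make this decision from its access control policies rather than a boolean.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical stand-in for a repository object; not a Muradora type.
    title: str
    open_access: bool


def search_then_filter(records, term):
    """Count every match, then hide restricted records only at display time.

    The reported count includes records the user is not allowed to see.
    """
    matches = [r for r in records if term.lower() in r.title.lower()]
    visible = [r for r in matches if r.open_access]
    return len(matches), visible


def guest_search(records, term):
    """Search only open access records, so the count matches what is shown."""
    candidates = [r for r in records if r.open_access]
    matches = [r for r in candidates if term.lower() in r.title.lower()]
    return len(matches), matches


# Invented sample data, purely for illustration.
RECORDS = [
    Record("Soil salinity survey", True),
    Record("Soil carbon trial", False),
    Record("Wheat rust resistance", True),
]

reported, shown = search_then_filter(RECORDS, "soil")
guest_reported, guest_shown = guest_search(RECORDS, "soil")
```

With this sample data the first function reports two matches but can only display one, which tells a guest that a restricted record on the topic exists. The second reports and displays one, which is roughly what a 'guest' mode restricted to open access objects would do.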

Unfortunately we've run out of time on this evaluation; the draft final report went off this week, but I'll keep an eye on Muradora.

(Regarding performance, from Chi's email I gather that the index time for 140,000 records, including adding all records to an AANRO collection, was 16-18 hours on a machine with 4GB RAM, 2x 1.8GHz Opteron CPUs (each with 2 cores) and a 320GB SATA hard disk. This time could be improved by doing a bit more work on the data, and re-indexing will not take that long. Chi, if you want to comment here, please do so.)
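
For a rough sense of scale, the figures above can be turned into a throughput number. This is only back-of-the-envelope arithmetic on the reported record count and elapsed time; the function name is made up, and it says nothing about where the time actually goes.

```python
RECORDS_INDEXED = 140_000      # record count reported in Chi's email
HOURS_LOW, HOURS_HIGH = 16, 18  # reported range for the full index run


def records_per_second(records, hours):
    """Average throughput over the whole run."""
    return records / (hours * 3600)


slowest = records_per_second(RECORDS_INDEXED, HOURS_HIGH)
fastest = records_per_second(RECORDS_INDEXED, HOURS_LOW)
print(f"{slowest:.2f} to {fastest:.2f} records per second")
# prints "2.16 to 2.43 records per second"
```

So the demo load averaged a little over two records per second, including the step of adding every record to the AANRO collection.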

Copyright 2007 The University of Southern Queensland

Content license: Creative Commons Attribution-ShareAlike 2.5 Australia.