Open Repository Blog

Wednesday Jul 30, 2008

More hertz, but less pain... and a wavering signal

Yes, I know it's been a while since I've posted any updates. We haven't really had anything major to announce for a while - lots of tweaks to existing repositories, and one or two new pilots. There are, though, a couple of larger projects happening in the background, which should become visible shortly.

As I write this, I am sat on a train to Edinburgh - where for the rest of the week I shall be attending the repository world's latest conference, the Repository Fringe. Any suggestions that the timing and venue of this meetup have been chosen for the various social activities available are emphatically refuted!

I will take this opportunity to mention a significant upgrade that was made a few weeks ago - the introduction of an enhanced server infrastructure. I'll spare the gory details, but it means that we can better distribute the requests to our hosted repositories across the servers we are running. So whilst we haven't introduced any new hardware, we have effectively doubled the peak capacity that the servers can handle.

What's more, because user login information is replicated across the servers, when a server fails or the software it is running is upgraded, any users that are logged in will no longer find themselves mysteriously logged out.

All of which is good news for the managers and users of our hosted repositories, and quite noteworthy to the wider DSpace community as very few institutes run a fully clustered repository (I had to fix one or two minor issues in the main DSpace code to even make this possible - these fixes will be part of the upcoming 1.5.1 release).

 

 

Friday May 16, 2008

What You Get is more than What You See

Sorry, I've been neglecting you, haven't I? Time for another update then, and this one is more WYSI than most (err, ok, I'll stop the puns).

The first part to mention is a new cut-down 'content management system' - when I say cut-down, I'm not just being modest. It doesn't currently allow uploading of images or other objects, and doesn't post to a blog or rolling news page (although now I'm starting to get ideas!). But what it does let you do is create additional pages of html that become available under a 'pages/FILE.html' url. So you can create a page acknowledging publishers that have let you archive the full text of your material, for example.

If you want to access this, log in to your repository as an administrator, and then from the 'admin' screen, select 'Additional Pages' from the left hand navigation. It's all pretty straightforward to use, but if you would like some assistance, please get in touch.

Now, editing all this html by hand is a bit tedious, right? So, related to the above is our other feature announcement - if you are using a modern Javascript-friendly browser (IE, Firefox, Opera, Safari), the 'edit html' form in the content management system above replaces the html entry box with a 'what you see is what you get' editor. If you've ever used software like Microsoft Word (and who hasn't), the operation should be fairly obvious - and much easier than tedious coding of 'pointy br slash pointy' and the like.

But why stop there? What about all that nasty html code that you see on the homepage news editor? Or the community / collection edit pages? Well, we've replaced the small html entry box on the news editor with a larger WYSIWYG editor - click here for an example. For the community / collection edit screens, we've left the small box of html code there by default, but added an 'Add/remove editor' link next to it that will convert it into a cut-down version of the WYSIWYG editor - this time, click here for an example.

A couple of quick tips about the WYSIWYG editor - firstly, there is a button labelled 'HTML' on the second row of buttons. If you want to see or edit the html code, clicking that button will open a pop-up that displays the html code. Secondly, by default the editor creates new paragraphs when you hit the 'enter' key - and new paragraphs always have a blank line between them. If you just want to create a new line without the additional spacing, hold down the 'Shift' key whilst you press 'Enter'.

One final note, a big congratulations this week to Medecins Sans Frontieres who have gone public with their repository. It's great to see the work that's gone into it come to fruition.

 

 

Wednesday Apr 30, 2008

Another day, another month, another feature

 

What better way to celebrate the last day of April than with another release of new features? (Yes, I can think of a few too, but you've got to be careful what you say on these blogs - you never know who might be reading).

Today brings two new enhancements to Open Repository. You can see an example of the first in the screenshot below - tabbed browse panels for communities and collections.

Now, when you go to the homepage of a community or collection, the 'browse by' box will be presented with two tabs: 'Community' (or 'Collection') and 'All'. The browse links shown on the 'Community' or 'Collection' tab take you to a browse list of the content within the community or collection that you are viewing. Clicking on 'All' will update the box with links for browsing the content of the entire repository - including the 'communities and collections' link.

The really nice part about this enhancement is that these navigation options stay with you when you are browsing within a community or collection - so if you go to the 'Title' list for a specific community, the browse box will still be presented with the tabs, and you can still choose to go to the 'Author' or 'Date issued' browse for that community.

[Screenshot: tabbed 'browse by' panel]

 

The second feature is only available to repository administrators. One of the most popular features of Open Repository has been the document conversion facility, that allows submitters to create PDFs of their Word files (amongst other options). However, this was restricted to being part of the submission process - once an item was accepted into the repository, although you could add and remove bitstreams, you no longer had access to the document conversion facilities.

Today's update changes this - as an administrator, if you edit an item, an additional button appears for the 'Document Convertor' (it's near the top of the page, just under 'Delete (expunge)' and 'Move Item'). On entering the document convertor, you will be presented with the existing bitstreams for that item, and any conversions that are available. It works in the same way as the conversion facility in the submission process, and also allows you to remove any bitstreams (just in case you create any by accident!). To edit any other metadata associated with a bitstream (name, description), or to add any files, you can use the form on the 'edit item' page as normal.
 

 

 

Monday Apr 21, 2008

How many, how fast?

No, I know I'm clearly not referring to the number of blog posts recently (unless the answer is very few, very slowly). So, what's been happening in Open Repository land?

Well, after a very busy couple of days with the DSpace User Group sessions (the presentations from which are now online), we've been hard at work tinkering under the hood of Open Repository.

All of our repositories are now registered within Google's Webmaster tools service, which means we can keep track of the indexing of the sites, and see if any problems are being encountered. Thanks to this, I've been able to track down a couple of small bugs hiding in the RSS feed code and the XHTML headers that caused problems when viewing a tiny minority of items. All the problems that this has shown up have now been rectified.

There is also a new tool available that allows repository admins to match parts of the metadata when it is filled from a PubMed ID or DOI, and insert additional metadata into the item automatically.

But, back to the original question, which refers to some changes to the file-type analysis tool. As you know, this was enabled for all our clients just before the Open Repositories conference. However, it was a bit slow. Actually, quite a lot slow. For a repository with 2000 items in it, the initial page took over 3 minutes to display. In fact, with 2000 items, every page in the analysis tool would take at least 3 minutes, and in the worst cases would even take over 10 minutes to display.

This has now been improved slightly. For the same repository, the initial page will typically display in under 5 seconds. And every page of the analysis tool will start to display in under 5 seconds - even for the worst case scenario of listing a breakdown of 2000 items (which should complete downloading in about 20 seconds).

I say typically, because sometimes the pages do take a bit longer to load - for example, if the tool hasn't been used in a while, then the database may need to do a bit more work to load the necessary data. But even in those cases, it is at least usable now.

So, if you haven't looked at the file-type analysis tool - or refrained from using it due to the performance - now is the time to give it a chance.





 

 

 

Tuesday Apr 01, 2008

What happens when you try to get 400 people into 2 rooms?

Having survived Microsoft's hospitality yesterday (and they were extremely hospitable, hence the reason most of those that were there are struggling), I am back at the campus of Southampton University (this time with Peter and Dominic in tow) for the formal start of Open Repositories 2008.

Today's sessions started with a very interesting keynote presentation by Peter Murray-Rust of Cambridge University, exploring the issues of using repositories for scientific data (with a particular focus on chemistry). This touched on the difficulties of getting users to interact with repositories - how they only want to use the tools and processes that they are familiar with, and that repository ingest needs to fit in with this, either by direct integration with the tools (in the kind of way that Microsoft was demonstrating yesterday), or by having alternate ingest procedures (discovering the content from other sources, harvesting and mining that data with little or no interaction from the user).

This presentation turned out to be as entertaining as it was informative, as the demonstrations that Peter had prepared caused havoc with the arrangements that Southampton University had made to relay the projections to a second room due to the numbers that turned up. So the answer to the question of what happens when you try to get 400 people into 2 rooms is that all your best laid plans fall through, and you end up squeezing 400 people into just one!

The following session consisted of three presentations relating to web 2.0 technologies (although as we certainly know in the DSpace community, you should never refer to something as 2.0!). The presentations focused on Connotea (and OpenID integration), scholarly practice and the impact of social networking, and a very impressive (or at least pretty) demonstration of cross repository browsing using RichTags.

It was particularly good to see what Connotea are up to, and the possibilities to build on and integrate the services of Connotea into a repository (such as the recommendation system). Usefully, this also provided me with an introduction to Ian Mulvany (Nature Publishing Group), who I've had a very productive meeting with to talk about how we can enhance the links that we provide to submit articles from Open Repository services into Connotea libraries - the good news is that they've already made some enhancements recently that allow Connotea to read the embedded meta links that we are placing in the html, in many cases significantly improving the quality and quantity of data that is transferred. There are still some kinks to work out, and I look forward to having further discussions with Ian once we both get back to our respective offices next week.

 

 

Monday Mar 31, 2008

Open Repository @ Open Repositories 2008

After saying a tearful farewell to Mark on Friday, the Open Repository team has picked itself up, dusted itself down - and run off to hide in Southampton for a week. Peter and Dominic are heading down later today, however I've been sent down here early to scout the area.

Before I go on, I'll (re-)introduce myself - yes, I'm that Graham... technical architect, developer, DSpace committer, and now roving reporter. 

So, Southampton - should be a very interesting week. The main conference starts tomorrow, with the DSpace user group (that I have the dubious honour of organising!) on Thursday, and finally the latest revision (0.3) to the fledgling OAI-ORE specification being unveiled on Friday. I'll be keeping you updated, as it happens (providing the network holds out - there are about 475 delegates registered for this week).

But for today, I'm attending a small meeting of repository folk at the invite of Microsoft, who are announcing new initiatives in the repository space this week. Firstly, there is the Research Output Repository platform, which provides the building blocks for creating a repository solution on a Microsoft software stack. Also mentioned are a number of tools and plugins for the Office software that allow these applications to be used seamlessly within the workflow for publishing and archiving documents within a repository (based on standards like SWORD and OpenSearch).

This is also serving as an open forum for all the major stakeholders in the DSpace, EPrints, Fedora and now Microsoft communities to have a good discussion on the issues of interoperability - with each other, and the tools that users require to engage with their repositories.

If I manage to survive the onslaught of food that is accompanying this meeting, I hope to be back tomorrow with more information from the main conference. Wish me luck!

 

 

Monday Mar 03, 2008

February update

With winter making one last-ditch attempt to keep us all in our thermals, it's time to take a look back at what we've achieved this February. The unseasonably high temperatures have obviously helped, as we seem to have made good progress in this shortest of months.

We've moved to new, improved web servers, significantly improved the speed and efficiency of the browse and search features, added date restrictions to the advanced search feature, added links to various social bookmarking sites (Digg, StumbleUpon, Facebook, Connotea, Del.icio.us, CiteULike) and deployed the ability to download to EndNote and Reference Manager. Phew!

Our final projects for the month are currently in various stages of testing and will hopefully be deployed at the end of this week. These will be: item embargos, EThOS standardized submission forms, and a tool to define content types within the repository.

Item embargos allow a user to enter an embargo date when uploading a file during submission. The bitstream in the item view will be labelled as under embargo and inaccessible until the date set. The embargos will need to be set up by us, but it's a simple change to a config file. Embargos can be set up across the whole repository, for all collections within a specific community, or for specified collections. All admins will be able to see embargoed content, and special groups can be set up to view embargoed content. Embargo dates can be changed within the item edit page, and we're working on allowing embargos to be added to bitstreams added after submission.
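To make the embargo rules above a little more concrete, here is a minimal sketch in Python of how the access check could be expressed. It is not the Open Repository code: the `Bitstream` class, the `can_view` function and the group names are all hypothetical, and it assumes a file becomes accessible on the embargo date itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Bitstream:
    """Hypothetical stand-in for an uploaded file and its embargo date."""
    name: str
    embargo_until: Optional[date] = None  # None means no embargo was set


def can_view(bitstream, today, is_admin=False,
             user_groups=frozenset(), embargo_groups=frozenset()):
    """Return True if the file may be viewed on the given day."""
    # No embargo, or the embargo date has been reached (assumption:
    # the file opens up on the date itself, not the day after).
    if bitstream.embargo_until is None or today >= bitstream.embargo_until:
        return True
    # All admins can see embargoed content.
    if is_admin:
        return True
    # Members of a specially configured group can also see it.
    return bool(set(user_groups) & set(embargo_groups))
```

So a thesis embargoed until 1 January 2009 would be hidden from an ordinary visitor in March 2008, but visible to an admin or to a member of a group listed in `embargo_groups`.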

Further to this, we're also creating a thesis-specific submission form that can be applied to a collection or collections of your choice, and that also conforms to the UK's EThOS project requirements. This will be released hand in hand with the item embargo tool.

The content type tool will show you how many items in your repository contain which file types, and how many do not. For example, you can look at how many items contain PDFs, and then at all those that don't. For those that don't, you'll be able to see a breakdown of which file types they do contain, and link to the specific items in each case. There will also be a list of all items that are metadata only.
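The breakdown described above can be sketched roughly as follows. This is only an illustration: the function name and the input shape (a dict of item id to file names) are made up for the example, and it guesses the file type from the file extension, which is an assumption rather than a description of how the real tool identifies formats. Metadata-only items are kept in their own list rather than counted among the items without PDFs.

```python
from collections import defaultdict


def file_type_report(items, wanted="pdf"):
    """Group items by whether they contain a file of the wanted type.

    items: dict mapping an item id to the list of its file names.
    """
    with_type, without_type, metadata_only = [], [], []
    other_types = defaultdict(list)  # extension -> items lacking the wanted type

    for item_id, files in items.items():
        # Assumption: the file type is taken from the extension.
        extensions = {name.rsplit(".", 1)[-1].lower()
                      for name in files if "." in name}
        if not files:
            metadata_only.append(item_id)
        elif wanted in extensions:
            with_type.append(item_id)
        else:
            without_type.append(item_id)
            for ext in sorted(extensions):
                other_types[ext].append(item_id)

    return {
        "with": with_type,
        "without": without_type,
        "metadata_only": metadata_only,
        "breakdown": dict(other_types),
    }
```

Each list holds item ids, so a page built from this could link straight through to the specific items in each case.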

We've also been working on various small customisation requests for individual repositories, and we're looking forward to seeing HeRA ready for public launch at the end of the month. Our March task list has been set, and I'll add a further update on that later this week. In the meantime we've sent Graham off to Baltimore, not to catch up on season 4 of The Wire, but to attend the OAI-ORE launch. There will certainly be MUCH more of this to come, as ORE looks to become a central tool in repository usage moving forward.

 

 

Monday Feb 25, 2008

Can you Digg it?

So after another terrible pun, it's time to reveal the latest features to be added to the Open Repository service.

After having worked through various user test cases, the download to EndNote and Reference Manager feature has been released to live for all production and pilot sites. This feature is available on both the individual item page, and for a page of search results.

The feature will also work for the Firefox extension, Zotero.

Additionally, we have added links on each item page to the social bookmarking sites Digg, Del.icio.us, Connotea, CiteULike, StumbleUpon and Facebook. However, we are unable to set these up for pilot sites.

Click here for an example of both features.

 

 

 

Friday Feb 22, 2008

Gratitude and Platitude

It's very common for companies to gather together the positive feedback they've received and send it round internally, and I felt that it was high time we did the same for us. Although I've removed any mention of those involved, to avoid potential embarrassment at such gushing praise, here are a few of our favourites for your public perusal:

  • We think you have been absolutely fantastic and we really do appreciate all of the hard work you have put in to deal with our queries and requests.
  • I'm incredibly impressed by your ability to turn round these sorts of problems within hours (even minutes) rather than days or weeks!
  • Like a good Santa you gave us almost everything we wanted, but held enough back for next year and because we weren't perfect this year! 
  • ...much better than socks and a poorly-knitted cardigan! That's fantastic thank you
  • Just let you know that the launch was very successful and people showed great interests in the repository. Many thanks for your support
  • We are very satisfied with your service so far! I am happy you are developing helpful tools of great importance!
  • Big thanks for all the work
  • You are really rocking, today!
  • Are you trying to test my coronaries, or just good at timing?
  • Thanks very much for this - it has indeed gone! I think I need to lie in a darkened room now...
  • I am grateful to you for going beyond normal limits to help get this off the ground
  • What would I do without your willingness to shed light on these mysteries for me!
  • Thanks for this.  You've been remarkably quick.
  • We really appreciate the attention you've given us on all these picky details.

And finally...

  • Thanks for this reassurance - I actually thought I had lost the plot...  No - you're completely correct, I've clearly lost the plot. Now slightly concerned that I've lost a day though! Wonder what else I've missed...


In return, we'd like to say thank you too, to everyone we work with for making our jobs not only interesting, but also highly pleasurable.

And for our own and finally, we've realised that during their involvement with Open Repository, at least two administrators have got married, and two have fallen pregnant. Now, we're not saying that... well, make of it what you will!

Tune in next week for some more news you might Digg (groan!)


 

 

Friday Jan 18, 2008

A beginner's guide to metadata

For those who may not yet have seen the email, the Repositories Support Project is holding an event at the University of Wolverhampton,  focused on the ever thorny issue of metadata within institutional repositories. 

Wolverhampton's repository, WIRE, is an Open Repository service, so this will be an excellent opportunity to ask the Wolverhampton team all about how wonderful we are.

The text from Jackie Knowles' email is copied in below.

Metadata & Institutional Repositories : A beginners’ guide

Tuesday 11th March 2008, 10.00 - 4.00

Learning Centre Seminar Room, University of Wolverhampton

Metadata is a challenging area for repository administrators and this event will take beginners through some of the basics using a series of presentations and practical exercises. This event forms part of the RSP series of focussed events which take single themes identified as key training issues and offer focussed training and specific support to the delegates in attendance. The day will also include the usual opportunities to network and meet with colleagues working in the repository field. As with all earlier RSP events the day will be offered at no-cost to attendees.

Draft Programme

09.30     Registration and refreshments

10.00     Welcome & housekeeping

10.05     Talk 1 – An Introduction to metadata (Ann Chapman, UKOLN)

10.30     Talk 2 – Metadata and repositories (Jackie Knowles, RSP)

11.20     Refreshment break

11.40     Talk 3 – Metadata and harvesting (Stuart Lewis, RSP)

12.10     Speed networking

12.50     Panel Q&A

13.00     Lunch

14.00     Practical exercise (including refreshments mid afternoon)

15.30     Report back and questions

16.00     Close

Who should attend?

Focussed events are aimed specifically at repository managers, technical staff, administrators, information workers and library staff engaged in the day-to-day operation and development of repositories. There are no limits on the number of delegates who are welcome to attend from any one institution. To book a place please visit http://www.rsp.ac.uk/events/FocusProgramme-Metadata.php

Please contact support@rsp.ac.uk for further details of any of our events.

 

 

Monday Jan 14, 2008

Really Simple Slip-up

It has been drawn to our attention that there have been problems with the RSS feeds on OR. Specifically that they're working backwards. Or in other words, the earliest submitted items are appearing first. 

But no more, it's been fixed. A simple change has been made to the configuration file to draw the correct ordering. As intended, RSS feeds will, from here on, display items as they are added.
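For anyone wondering what 'the correct ordering' amounts to, here is a tiny sketch in Python. The real fix was a change to the configuration file, not new code, so this is purely an illustration; the `accessioned` field name and the `newest_first` function are invented for the example.

```python
from datetime import datetime


def newest_first(items, limit=10):
    """Return the most recently added items first, as a feed reader expects."""
    # Sorting in reverse date order puts the latest additions at the top;
    # the bug was the equivalent of leaving out reverse=True.
    return sorted(items, key=lambda item: item["accessioned"], reverse=True)[:limit]


feed = newest_first([
    {"title": "First ever item", "accessioned": datetime(2007, 1, 5)},
    {"title": "Added this week", "accessioned": datetime(2008, 1, 10)},
    {"title": "Added last month", "accessioned": datetime(2007, 12, 3)},
])
```

Here `feed` starts with 'Added this week' and ends with 'First ever item'.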

And now for the good news.

In the same build we've made a couple of significant changes to the service. The first change is the addition of the full download to EndNote or Reference Manager options. At the top of each item page you'll see the button to 'Export to' and a drop down list to choose the citation manager required. It'll even work if you have that really nifty Firefox* extension Zotero. Actually, it works really well on Zotero.

For the time being, this option will remain hidden to all but administrators while we're fishing for feedback. Once we know that people are happy and the metadata is appearing accurately, we'll push the feature live to all. We do also have the option to switch this feature on or off.

With this in place we can begin to investigate the reverse process, adding content from EndNote or Reference Manager files.

The second change relates to the search. For the time being you should notice nothing, other than it's running up to about 25% faster than before. But underneath lurks a pupa of code, waiting for an interface to arrive so that it can reveal itself to the world. This code will allow us to add new sort options to search results, as we now have with the browse, including limiting search by date, which is one of our most requested features. I'll add a further update when this butterfly of search takes wing (which will be as soon as we've finished the interface).

 
*It's not that we're biased or anything, but we do prefer Firefox as a browser.  


 

 

Foxy

The nice people at FoxLand recently built us a rather lovely little Flash introduction to OR.

 

 

 

Friday Jan 11, 2008

Welcome the newbies

Some time ago, when we asked our customers what they wanted to know more about regarding the Open Repository service, one of the responses was to get more information on new customers and pilots.

So, firstly, welcome to the Environmental Cancer Risk, Nutrition and Individual Susceptibility Network (based in Poland), Leeds Metropolitan University, the Health Service Executive of Ireland and the Royal College of Nursing (UK), who have all recently joined us (or are just about to) with pilot repositories.

We're also pleased to say that the Helsebibliotekets Research Archive have decided to upgrade their pilot to a full production repository with us, after seeing the improvements brought by the 1.4.9 build. Additionally, Northumbria University signed up just before Christmas taking us up to 14 production repositories being hosted by the service now. We look forward to a long and fruitful collaboration with you all.

Meanwhile, somewhere in the background, some men from the Ministry have been looking at our network security and Graham's been doing some things with searching that should bring some post Christmas cheer to our users. But more of that next week.

 

 

Thursday Jan 03, 2008

It's only a plateau, not a summit, but it's an immense success

In the wider world, it's worth noting that on Boxing Day, or the day after Christmas, as the Americans call it, research funded by the US National Institutes of Health (NIH) was mandated by law to be made open access within 12 months of publication. The full story is up on the SPARC newsletter, along with a fairly detailed history of open access movements during 2007.

The length and ferocity of the battle over NIH funded research makes this a major victory for open access, although far from the end of the war. However, as a statement on the acceptance of open access principles, and therefore tacitly of the usage of institutional repositories, there can hardly be a much bigger one than one from the White House.
 

 

 

New year, more improvements

So we're only a few days into 2008 and already stuck in to the next batch of improvements for Open Repository.

Firstly, we've upgraded the browse module and after some time trials, we estimate the browse feature is now running about three times faster than before, especially when pulling large lists, such as viewing 100 results. That's live as of this morning.

Furthermore, admins will now see an 'Export to' button, followed by a choice of either EndNote or RefMan, on each item view. This feature is still a work in progress, hence only being available to admins for the time being. Choosing this export option will show the item's metadata in EndNote or Reference Manager format in a new tab. The next step is to have the metadata download into the particular citation management software, but the main changes are up, which is the important part. If anyone gets the chance to have a play with this feature and check the metadata exports are all ok, we'd be most grateful.
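If you're curious what a Reference Manager style export looks like, here is a rough sketch in Python that writes one record in the RIS tagged format (a two-letter tag, two spaces, a hyphen, a space, then the value). It is an illustration only: the `to_ris` function and the item dict are invented, only a handful of tags are used, and the field mapping in our actual export may well differ.

```python
def to_ris(item):
    """Render one item as a minimal RIS record."""
    # Every RIS record opens with a type tag (GEN is the generic type).
    lines = ["TY  - " + item.get("type", "GEN")]
    # One AU line per author.
    for author in item.get("authors", []):
        lines.append("AU  - " + author)
    if "title" in item:
        lines.append("TI  - " + item["title"])
    if "year" in item:
        lines.append("PY  - " + str(item["year"]))
    # Every RIS record closes with an end-of-record tag.
    lines.append("ER  - ")
    return "\n".join(lines) + "\n"


record = to_ris({
    "type": "JOUR",
    "authors": ["Smith, J.", "Jones, A."],
    "title": "An example article",
    "year": 2008,
})
```

When checking an export, the things to look for are that every author has their own line and that the record starts with the type tag and finishes with the end-of-record tag.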

We've also expanded the in-submission document conversion tool for OpenOffice documents. This means that Microsoft Word, Excel and PowerPoint documents can now also be converted into their respective Open Document formats (ODF), as well as into PDF. Additionally OpenOffice documents (including StarOffice documents up to 5.0) will be recognised when uploaded by the submission form and can also be converted into PDF. The system will accept all OpenOffice 2.x documents and older.

This improvement comes at an auspicious time with the news that the Norwegian government has mandated the use of Open Document Formats (ODF) for files published for use by the Norwegian public. Part of the press release reads: The government has decided that all information on governmental websites should be available in the open formats HTML, PDF or ODF. With this decision the times when public documents were only available in Microsoft's Word-format is coming to an end.