Open Repository Blog

What You Get is more than What You See
Sorry, I've been neglecting you, haven't I? Time for another update then, and this one is more WYSI than most (err, ok, I'll stop the puns).
The first part to mention is a new cut-down 'content management system' - when I say cut-down, I'm not just being modest. It doesn't currently allow uploading of images or other objects, and doesn't post to a blog or rolling news page (although now I'm starting to get ideas!). But what it does let you do is create additional pages of html that become available under a 'pages/FILE.html' url. So you can create a page acknowledging publishers that have let you archive full text of your material, for example.
If you want to access this, log in to your repository as an administrator, and then from the 'admin' screen, select 'Additional Pages' from the left hand navigation. It's all pretty straightforward to use, but if you would like some assistance, please get in touch.
Now, editing all this html by hand is a bit tedious, right? So, related to the above is our other feature announcement - if you are using a modern Javascript-friendly browser (IE, Firefox, Opera, Safari), the 'edit html' form in the content management system above replaces the html entry box with a 'what you see is what you get' editor. If you've ever used software like Microsoft Word (and who hasn't), the operation should be fairly obvious - and much easier than tedious coding of 'pointy br slash pointy' and the like.
But why stop there? What about all that nasty html code that you see on the homepage news editor? Or the community / collection edit pages? Well, we've replaced the small html entry box on the news editor with a larger WYSIWYG editor - click here for an example. For the community / collection edit screens, we've left the small box of html code there by default, but added an 'Add/remove editor' link next to it that will convert it into a cut-down version of the WYSIWYG editor - this time, click here for an example.
A couple of quick tips about the WYSIWYG editor - firstly, there is a button labelled 'HTML' on the second row of buttons. If you want to see or edit the html code, clicking that button will open a pop-up that displays the html code. Secondly, by default the editor creates new paragraphs when you hit the 'enter' key - and new paragraphs always have a blank line between them. If you just want to create a new line without the additional spacing, hold down the 'Shift' key whilst you press 'Enter'.
One final note, a big congratulations this week to Medecins Sans Frontieres who have gone public with their repository. It's great to see the work that's gone into it come to fruition.
Posted by Graham Triggs at 15:18 Comments (0)
Another day, another month, another feature
What better way to celebrate the last day of April than with another release of new features? (Yes, I can think of a few too, but you've got to be careful what you say on these blogs - you never know who might be reading).
Today brings two new enhancements to Open Repository. You can see an example of the first in the screenshot below - tabbed browse panels for communities and collections.
Now, when you go to the homepage of a community or collection, the 'browse by' box will be presented with two tabs, 'Community' (or 'Collection') and 'All'. The browse links shown under the 'Community' or 'Collection' tab take you to browse lists of the content within the community or collection that you are viewing. Clicking on 'All' will update the box with links for browsing the content of the entire repository - including the 'communities and collections' link.
The really nice part about this enhancement is that these navigation options stay with you when you are browsing within a community or collection - so if you go to the 'Title' list for a specific community, the browse box will still be presented with the tabs, and you can still choose to go to the 'Author' or 'Date issued' browse for that community.

The second feature is only available to repository administrators. One of the most popular features of Open Repository has been the document conversion facility, that allows submitters to create PDFs of their Word files (amongst other options). However, this was restricted to being part of the submission process - once an item was accepted into the repository, although you could add and remove bitstreams, you no longer had access to the document conversion facilities.
Today's update changes this - as an administrator, if you edit an item, an additional button appears for the 'Document Convertor' (it's near the top of the page, just under 'Delete (expunge)' and 'Move Item'). On entering the document convertor, you will be presented with the existing bitstreams for that item, and any conversions that are available. It works in the same way as the conversion facility in the submission process, and also allows you to remove any bitstreams (just in case you create any by accident!). For editing any other metadata associated with the bitstream (name, description), or to add any files, you can use the form on the 'edit item' page as normal.
Posted by Graham Triggs at 16:02 Comments (0)
More PubMed integration in Open Repository
One of Open Repository's most useful additions has been the ability to automatically retrieve metadata from PubMed given a PubMed ID. In the last couple of days, this integration has been taken a little further.
Now, when submitting an item, if you have used a PubMed identifier, on reaching the 'Upload a file' screen the system will contact PubMed and attempt to see if there is anywhere that hosts the full text for that article. If PubMed does have links for the full text, then these will be displayed on the page, allowing you to easily get to other sources of the full text so that you can upload it to your repository.
Note, that this only displays during the submission process, and it is up to the user and/or admin to ensure that the repository has the rights to host any files obtained.
Click here for an example of this functionality.
Also, on the preview sites is an additional feature that will display related articles in PubMed for any item that has been assigned a PubMed identifier - click here for an example. As ever, please give us any feedback you may have.
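For the technically curious, here is a rough sketch of how a full-text lookup like the one described above could be requested. The post doesn't say how Open Repository actually contacts PubMed, so this is an assumption: it uses NCBI's public E-utilities ELink service with the 'llinks' command, which returns LinkOut URLs for a record. The function name and the PubMed ID in the usage note are purely illustrative.

```python
from urllib.parse import urlencode

# NCBI E-utilities ELink endpoint (public service; its use here is an assumption,
# not a description of how Open Repository is implemented).
ELINK_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def full_text_links_url(pubmed_id):
    """Build an ELink request URL asking for LinkOut (full text) links for one PubMed ID."""
    params = {
        "dbfrom": "pubmed",     # the ID comes from the PubMed database
        "id": str(pubmed_id),   # the PubMed identifier entered during submission
        "cmd": "llinks",        # ask for LinkOut URLs rather than related records
    }
    return ELINK_BASE + "?" + urlencode(params)
```

Calling full_text_links_url(12345678) (a made-up ID) gives a URL whose XML response would then need to be fetched and parsed for the link addresses; that step is left out here.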
Posted by Graham Triggs at 17:20 Comments (0)
No, I know I'm clearly not referring to the number of blog posts recently (unless the answer is very few, very slowly). So, what's been happening in Open Repository land?
Well, after a very busy couple of days with the DSpace User Group sessions (the presentations from which are now online), we've been hard at work tinkering under the hood of Open Repository.
All of our repositories are now registered within Google's Webmaster tools service, which means we can keep track of the indexing of the sites, and of any problems being encountered. Thanks to this, I've been able to track down a couple of small bugs hiding in the RSS feed code, and the XHTML headers that caused problems when viewing a tiny minority of items. All problems that this has shown up have now been rectified.
There is also a new tool available that allows repository admins to match parts of the metadata when it is filled from a PubMed ID or DOI, and insert additional metadata into the item automatically.
But, back to the original question, which refers to some changes to the file-type analysis tool. As you know, this was enabled for all our clients just before the Open Repositories conference. However, it was a bit slow. Actually, quite a lot slow. For a repository with 2000 items in it, the initial page took over 3 minutes to display. In fact, with 2000 items, every page in the analysis tool would take at least 3 minutes, and in the worst cases would even take over 10 minutes to display.
This has now been improved slightly. For the same repository, the initial page will typically display in under 5 seconds. And every page of the analysis tool will start to display in under 5 seconds - even for the worst case scenario of listing a breakdown of 2000 items (which should complete downloading in about 20 seconds).
I say typically, because sometimes the pages do take a bit longer to load - for example, if it hasn't been used in a while, then the database may need to do a bit more work to load the necessary data. But even in those cases, it is at least usable now.
So, if you haven't looked at the file-type analysis tool - or refrained from using it due to the performance - now is the time to give it a chance.
Posted by Graham Triggs at 17:11 Comments (0)
What happens when you try to get 400 people into 2 rooms?
Having survived Microsoft's hospitality yesterday (and they were extremely hospitable, hence the reason most of those that were there are struggling), I am back at the campus of Southampton University (this time with Peter and Dominic in tow) for the formal start of Open Repositories 2008.
Today's sessions started with a very interesting keynote presentation by Peter Murray-Rust of Cambridge University, exploring the issues of using repositories for scientific data (with a particular focus on chemistry). This touched on the difficulties of getting users to interact with repositories - how they only want to use the tools and processes that they are familiar with, and that repository ingest needs to fit in with this, either by direct integration with the tools (in the kind of way that Microsoft was demonstrating yesterday), or by having alternate ingesting procedures (discovering the content from other sources, harvesting and mining that data with little or no interaction from the user).
This presentation turned out to be as entertaining as it was informative, as the demonstrations that Peter had prepared caused havoc with the arrangements that Southampton University had made to relay the projections to a second room due to the numbers that turned up. So the answer to the question of what happens when you try to get 400 people into 2 rooms is that all your best laid plans fall through, and you end up squeezing 400 people into just one!
The following session consisted of three presentations relating to web 2.0 technologies (although as we certainly know in the DSpace community, you should never refer to something as 2.0!). The presentations focused on Connotea (and OpenID integration), scholarly practice and the impact of social networking, and a very impressive (or at least pretty) demonstration of cross repository browsing using RichTags.
It was particularly good to see what Connotea are up to, and the possibilities to build on and integrate the services of Connotea into a repository (such as the recommendation system). Usefully, this also provided me with an introduction to Ian Mulvany (Nature Publishing Group), with whom I've had a very productive meeting to talk about how we can enhance the links that we provide to submit articles from Open Repository services into Connotea libraries - the good news is that they've already made some enhancements recently that allow Connotea to read the embedded meta links that we are placing in the html, in many cases significantly improving the quality and quantity of data that is transferred. There are still some kinks to work out, and I look forward to having further discussions with Ian once we both get back to our respective offices next week.
Posted by Graham Triggs at 14:44 Comments (0)
Open Repository @ Open Repositories 2008
Posted by Graham Triggs at 16:23 Comments (0)
Yesterday, another new feature was added to the Open Repository service. The file-type analyzer is linked to from the admin menu, and gives administrators a breakdown of the content within their repository: the number of items and the file types they contain.
The main page lists the various file types within the repository, the number of items containing each file type, and how many of each file type in total (e.g. 10 items with Adobe PDF and 12 PDF files in total).
The details can be drilled down into, so you can view each of the 10 items and see exactly how many of each file type an item contains.
The display also shows the reverse, i.e. which items don't contain a certain file type, and which file types those items do contain.
The total number of metadata only items is also displayed, which again can be expanded to view each of these items.
The total number of items with files + the total number of metadata only items = the total number of items in the repository.
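To make the counting above concrete, here is a minimal sketch in Python. It is not the analyzer's actual code: the sample data, the item names and both function names are made up for illustration, and it assumes that the 'reverse' view only covers items that hold at least one file.

```python
from collections import Counter

# Hypothetical sample data: each item maps to the file types of its files.
# An empty list stands for a metadata only item.
items = {
    "item-1": ["Adobe PDF"],
    "item-2": ["Adobe PDF", "Adobe PDF", "Microsoft Word"],
    "item-3": [],
    "item-4": ["Microsoft Word"],
}


def file_type_breakdown(items):
    """Return (items per file type, files per file type, metadata only item ids)."""
    items_per_type = Counter()
    files_per_type = Counter()
    metadata_only = []
    for item_id, file_types in items.items():
        if not file_types:
            metadata_only.append(item_id)
            continue
        files_per_type.update(file_types)        # counts every file
        items_per_type.update(set(file_types))   # counts each item once per type
    return items_per_type, files_per_type, metadata_only


def items_without(items, file_type):
    """Reverse view: items that have files, but none of the given type."""
    return {
        item_id: sorted(set(file_types))
        for item_id, file_types in items.items()
        if file_types and file_type not in file_types
    }


items_per_type, files_per_type, metadata_only = file_type_breakdown(items)
items_with_files = len(items) - len(metadata_only)
```

With this sample data, two items contain Adobe PDF while there are three PDF files in total, which is the same kind of distinction as the '10 items, 12 files' example above.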
This tool is available for all production and pilot repositories.
We've also added RefWorks as one of the citation management download options, alongside EndNote and ReferenceManager.
Posted by Mark Merifield at 11:13 Comments (0)
Another useful workshop from the RSP, this time focusing on some of the additional services available for Repository Managers:
Repository Services Day
Wednesday 23rd April, 10.00am - 4.00pm
University of Nottingham
If you are a:
• Repository manager
• Information professional
• Researcher
• Lecturer
• or are involved in administering research grants and supporting researchers,
then this day is for you!
RSP is pleased to facilitate this event which will showcase key repository
and search services available to the UK repository and research community.
It is assumed that participants will have a good understanding of Open
Access. For those not already familiar with OA an overview is available from
the SHERPA website. Although presentations will focus on the advanced use of
the services a prior knowledge of these services is not necessary.
Research funder policies vary in their requirements as do the policies of
publishers with respect to what and when articles can be deposited. The
first part of the day will focus on the services available to explain and
simplify those policies and will highlight recent developments in these
services.
After coffee the focus will move to the current availability of repositories
in the UK and will include the advanced features of repository directories
and presentations on several unique UK repositories.
The final part of the day will look at a range of search services available
for finding OA materials held in repositories both in the UK and worldwide.
The day will end with a short workshop where participants will be able to
discuss current services, formulate suggestions for improvements and
identify future services. Recommendations resulting from this workshop will
be forwarded to JISC and hence participants have an opportunity to directly
influence future service development.
As with all RSP events the day will be offered at no-cost to attendees.
There are no limits on the number of delegates who can attend from any one
institution, although the total number of delegates is limited, so book early to
ensure a place.
Draft Programme
• 09.30 Registration and refreshments
• 10.00 Welcome & Overview
• 10.10 JULIET (Bill Hubbard)
• 10.35 RoMEO (Jane Smith)
• 11.00 Refreshment break
• 11.25 OpenDOAR and ROAR (Peter Millington)
• 11.50 The Depot (Theo Andrew)
• 12.10 Jorum (tbc)
• 12.30 Lunch
• 13.30 DART-Europe (Chris Pressler)
• 13.50 OAIster and BASE (Mary Robinson)
• 14.10 Intute Repository Search (Vic Lyte)
• 14.30 Workshop: Services of the Future?
• 15.30 Discussion
• 16.00 Close
Booking
If you are interested in booking a place, please do so via the online form
at http://www.rsp.ac.uk/events/FocusBooking-Services.php
Full details of the event can be found here -
http://www.rsp.ac.uk/events/FocusProgramme-Services.php
Please contact support@rsp.ac.uk for further details of any of our events.
Posted by Mark Merifield at 11:01 Comments (0)
With winter making one last ditch attempt to keep us all in our thermals, it's time to take a look back at what we've achieved this February. The unseasonally high temperatures have obviously helped, as we seem to have made good progress in this slightly shorter month.
We've moved to new, improved web servers, significantly improved the speed and efficiency of the browse and search features, added date restrictions to the advanced search feature, added links to various social bookmarking sites (Digg, Stumble upon, Facebook, Connotea, Del.icio.us, Citeulike) and deployed the ability to download to EndNote and Reference Manager. Phew!
Our final projects for the month are currently in various stages of testing and will hopefully be deployed at the end of this week. These will be: item embargos, EThOS standardized submission forms, and a tool to define content types within the repository.
Item embargos allow a user to enter an embargo date when uploading a file during submission. The bitstream in the item view will be labelled as under embargo and will be inaccessible until the date set. The embargos will need to be set up by us, but it's a simple change to a config file. Embargos can be set up across the whole repository, for all collections within a specific community, or for specified collections. All admins will be able to see embargoed content, and special groups can be set up to view embargoed content. Embargo dates can be changed within the item edit page, and we're working on allowing embargos to be added to bitstreams added after submission.
Further to this, we're also creating a thesis-specific submission form that can be applied to a collection or collections of your choice, and that also conforms to the UK's EThOS project requirements. This will be released hand in hand with the item embargo tool.
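As a rough illustration of the access rule described above, here is a small Python sketch. It is not the code running in Open Repository: the function and parameter names are hypothetical, and it assumes that a file becomes accessible on the embargo date itself.

```python
from datetime import date


def can_access_bitstream(embargo_until, today, is_admin=False, in_embargo_group=False):
    """Return True if a file may be downloaded on the given day.

    embargo_until is None for a file with no embargo (hypothetical convention).
    """
    if embargo_until is None:
        return True                  # no embargo set at submission
    if is_admin or in_embargo_group:
        return True                  # admins and special groups can see embargoed content
    return today >= embargo_until    # assumption: access opens on the embargo date itself
```

For example, a file embargoed until 1 September 2008 would be refused to an ordinary visitor on 1 March 2008, but allowed for an administrator on the same day.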
The content type tool will show you how many items in your repository contain which file types, and how many do not. For example, you can look at how many items contain PDFs, and then at all those that don't. For those that don't, you'll be able to see a breakdown of which file types they do contain, and link to the specific items in each case. There will also be a list of all items that are metadata only.
We've also been working on various small customisation requests for individual repositories, and we're looking forward to seeing HeRA ready for public launch at the end of the month. Our March task list has been set, and I'll add a further update on that later this week. In the meantime we've sent Graham off to Baltimore, not to catch up on season 4 of The Wire, but to attend the OAI-ORE launch. There will certainly be MUCH more of this to come, as ORE looks to become a central tool in repository usage moving forward.
Posted by Mark Merifield at 11:21 Comments (0)
So after another terrible pun, it's time to reveal the latest features to be added to the Open Repository service.
After having worked through various user test cases, the download to EndNote and Reference Manager feature has been released to live for all production and pilot sites. This feature is available on both the individual item page, and for a page of search results.
The feature will also work for the Firefox extension, Zotero.
Additionally we have also added links to each item page for the social bookmarking sites: Digg, Del.icio.us, Connotea, Citeulike, Stumbleupon and Facebook. However, we are unable to set these up for pilot sites.
Click here for an example of both features.
Posted by Mark Merifield at 16:31 Comments (0)
It's very common for companies to gather together the positive feedback they've received and send it round internally, and I felt that it was high time we did the same for us. Although I've removed any mention of those involved, to avoid potential embarrassment at such gushing praise, here are a few of our favourites for your public perusal:
- We think you have been absolutely fantastic and we really do appreciate all of the hard work you have put in to deal with our queries and requests.
- I'm incredibly impressed by your ability to turn round these sorts of problems within hours (even minutes) rather than days or weeks!
- Like a good Santa you gave us almost everything we wanted, but held enough back for next year and because we weren't perfect this year!
- ...much better than socks and a poorly-knitted cardigan! That's fantastic thank you
- Just let you know that the launch was very successful and people showed great interests in the repository. Many thanks for your support
- We are very satisfied with your service so far! I am happy you are developing helpful tools of great importance!
- Big thanks for all the work
- You are really rocking, today!
- Are you trying to test my coronaries, or just good at timing?
- Thanks very much for this - it has indeed gone! I think I need to lie in a darkened room now...
- I am grateful to you for going beyond normal limits to help get this off the ground
- What would I do without your willingness to shed light on these mysteries for me!
- Thanks for this. You've been remarkably quick.
- We really appreciate the attention you've given us on all these picky details.
And finally...
- Thanks for this reassurance - I actually thought I had lost the plot... No - you're completely correct, I've clearly lost the plot. Now slightly concerned that I've lost a day though! Wonder what else I've missed...
In return, we'd like to say thank you too, to everyone we work with for making our jobs not only interesting, but also highly pleasurable.
And for our own and finally, we've realised that during their involvement with Open Repository, at least two administrators have got married, and two have fallen pregnant. Now, we're not saying that... well, make of it what you will!
Tune in next week for some more news you might Digg (groan!)
Posted by Mark Merifield at 17:09 Comments (0)
New item move tool released for Open Repository
It's been a little quiet on the blog this month, if not in the office. We're proud to announce that Medecins Sans Frontieres, Northumbria University and Helsebibliotekets are all moving towards their full production releases, and also to welcome the Museum of London as our newest customer.
As always we've been fixing and tweaking where we can, improving speed and efficiency as we go, especially in relation to the browse pages. Meanwhile DSpace 1.5, the majority of which we implemented last year, has gone into Beta testing for release in March.
We've also just released a new tool, a late addition to DSpace 1.5, that will help administrators with requests to move content around the repository. Whilst a tool to drag and drop communities and collections is still some way off, items can now be moved from one collection into another. Furthermore mapped items can also be moved to a new mapping location.
Therefore, if you want to move the contents of one collection into another, or into a couple of collections, this can now be done, albeit one item at a time. With a little careful planning (creating new collections, moving items across and deleting the old collections), what we essentially have is a tool to enable a moveable hierarchy!
The item move is carried out through the Edit item page, and full instructions have been sent to senior administrators and will be added to the next version of the admin manual, due at the end of this month.
Posted by Mark Merifield at 17:02 Comments (0)
Hosted Repository Software: A seminar for repository managers
The Repositories Support Project are pleased to announce a
focussed event taking place specifically for repository managers using hosted or
commercial software solutions.
Thursday 27th March 2008, 10.00 - 1.00
SPLASH, University of Surrey, Guildford
Do you use one of the following products?
- Digital Commons from BePress
- Digitool from Ex Libris
- E-Prints Services
- Fedora/VITAL
- IntraLibrary from Intrallect
- Open Repository from BioMed Central
Do you want to meet other repository managers who are using
the same software as you? If the answer is yes then this event is for
you!
The RSP is facilitating a half day event aimed specifically
at repository managers who work with hosted or commercial software solutions.
The programme will contain a case study from the University of Surrey exploring
some of the unique issues that face repository managers in this situation, along
with plenty of opportunity for networking and discussion as part of both the
wider group and in software based clusters.
As with all RSP events the day will be offered at no-cost
to attendees. There are no limits on the number of delegates who are welcome to
attend from any one institution although we recommend that to take the best
advantage of the event participants should be actively using, or about to take
on, one of the software solutions listed above.
To book a place please visit http://www.rsp.ac.uk/events/FocusProgramme-Hosted.php
Draft Programme
09.30 Registration and refreshments
10.00 Welcome & housekeeping
10.10 'Using a hosted solution to develop an institutional repository: a case study report' Dr Christine Daoutis, Project Officer, Surrey Scholarship Online
11.00 Refreshment break
11.20 Networking and group discussion
11.50 Software specific discussion groups
13.00 Lunch and close
For further information about this event, or details of any
other RSP events, please contact support@rsp.ac.uk.
Posted by Mark Merifield at 17:06 Comments (0)
With the arctic winds blowing the threat of snow down to London, it's time to catch up with what's happened to OR during January before we're all potentially stranded by a few flakes bringing the capital's transport system to a shivering halt.
It's been a busy month; we hit the ground running and haven't stopped, with improvements to both the front and back end of the service.
Hopefully, you'll have noticed a marked improvement in speed when using your repositories. We've focused on further improving many of the processes that drive the service, cutting down on memory usage, and we estimate that this work has boosted performance by about 25%. Yesterday, we moved to new web servers with increased memory and processing power, which will not only improve efficiency even further, but also ensure that, in the long term, as your repositories grow and usage increases, there will be no loss in performance.
At the front end, we've mainly focused on three areas: the in-submission document conversion, EndNote and RefMan imports, and search, with the following enhancements released:
- The submission form now recognises OpenOffice document formats (ODF) when added as bitstreams to a submission.
- The document conversion tool will now convert Microsoft Office applications (Word, Excel, PowerPoint) to their OpenOffice counterparts, as well as to PDF.
- The search results can now be ordered in the same manner as the browse results, offering:
  - ordering by relevance, title, issue date or submit date
  - setting the number of results displayed per page
  - ascending or descending order
  - limiting the number of authors displayed with each result
- The advanced search now has a date restriction field, which will display only a range of dates that exist within the repository.
- Full text items in non Latin alphabets are now searchable. This means that if you have a document written in Arabic, if you search in Arabic and the document matches that search, the item will be returned in the search results.
- The tool to download item metadata to either EndNote or RefMan remains available to admins only for the moment, as does the additional search enhancement to download a page of search results at a time to EN/RM. It'll be released as soon as we've completed further testing.
- We've also been working on the reverse, the importing of EndNote and RefMan library files, also still in testing.
Posted by Mark Merifield at 17:53 Comments (0)
A beginner's guide to metadata
For those who may not yet have seen the email, the Repositories Support Project is holding an event at the University of Wolverhampton, focused on the ever thorny issue of metadata within institutional repositories.
Wolverhampton's repository, WIRE, is an Open Repository service, so this will be an excellent opportunity to ask the Wolverhampton team all about how wonderful we are.
The text from Jackie Knowles' email is copied in below.
Metadata & Institutional Repositories : A beginners’ guide
Tuesday 11th March 2008, 10.00 - 4.00
Learning Centre Seminar Room, University of Wolverhampton
Metadata is a challenging area for repository administrators
and this event will take beginners through some of the basics using a series of
presentations and practical exercises. This event forms part of the RSP series
of focussed events which take single themes identified as key training issues
and offer focussed training and specific support to the delegates in attendance.
The day will also include the usual opportunities to network and meet with
colleagues working in the repository field. As with all earlier RSP events the
day will be offered at no-cost to attendees.
Draft Programme
09.30 Registration and refreshments
10.00 Welcome & housekeeping
10.05 Talk 1 – An Introduction to metadata (Ann Chapman, UKOLN)
10.30 Talk 2 – Metadata and repositories (Jackie Knowles, RSP)
11.20 Refreshment break
11.40 Talk 3 – Metadata and harvesting (Stuart Lewis, RSP)
12.10 Speed networking
12.50 Panel Q&A
13.00 Lunch
14.00 Practical exercise (including refreshments mid afternoon)
15.30 Report back and questions
16.00 Close
Who should attend?
Focussed events are aimed specifically at repository managers, technical staff, administrators, information workers and library staff engaged in the day-to-day operation and development of repositories. There are no limits on the number of delegates who are welcome to attend from any one institution. To book a place please visit http://www.rsp.ac.uk/events/FocusProgramme-Metadata.php
Please contact support@rsp.ac.uk for further details of any of our events.
Posted by Mark Merifield at 11:41 Comments (0)