Primo CMS Solr Websearch Adaptor

Last Updated: Feb 04, 2010 13:16


  • Short description: This Primo deep search adaptor allows Primo users to connect to any Solr server and thereby integrate web page content directly into Primo. Solr is an open source full-text search engine that has connectors to most content management systems.
  • Author: Kai Jauslin
  • Institution: ETH-Bibliothek, ETH Zürich
  • Year: 2010
  • License: Apache License, Version 2.0 (Some bundled components may be released with other, Apache compatible open-source licenses. Please refer to the license of the individual components for more information).
  • Short description: Use, modification and distribution of the code are permitted provided the copyright notice, list of conditions and disclaimer appear in all related material.
  • Link to terms: Detailed license terms
  • Skill required for using this code:
    Basic - for running binary distribution, Advanced - for further development

Description

The problem

The ETH library has an extensive website with many pages of textual information describing processes, projects, facilities, resources, organization and people: everything that helps the users of our library find their way around. This information is managed in a Content Management System (CMS). However, it is not directly searchable in Primo and is therefore not available to Primo users.

The solution

Content can be made available to Primo in two ways: pipes and DeepSearch nodes (formerly Third Nodes). Since our content data is very different in nature from what we have in the Primo index, we developed a flexible and simple-to-use deep search adaptor that can be used to search the CMS.

We developed the adaptor in such a way that it can easily be configured for use with many CMSs and content systems: eZPublish, Drupal, Typo3, Wordpress, Fedora Repository. In fact, it works with all systems that can be used with the Solr search service. Since we use eZPublish with the eZFind extension, this document focuses on that integration, but configuration for any other Solr system is just as simple.

Other uses

The adaptor can be used to search data in any Solr system – not limited to CMSs. In the following situations, this adapter can be of use:

  • Data has a simple structure (or XSL know-how is available)
  • No FRBRization or de-duplication of data is needed
  • Data is not accessible as an extract or the costs for implementation are too high
  • Data is not related to any resource and should not go into the Primo index (e.g. news messages)
  • Structure of the data is completely different from everything else in the Primo index
  • Search strategies are needed that Primo does not support (e.g. geographical range searches)
  • Realtime search is needed (e.g. field-related news tickers)

In all other situations, it is advisable to use regular Primo pipes.

Demonstration

We are working on providing a publicly available prototype and will post more information here as soon as it's available. http://www.library.ethz.ch

Screen captures

The screenshots show the ETH customized frontend with the websearch adaptor in a separate tab.

  • Picture 1: Websearch adaptor configured in a separate Primo tab, showing the results for the search "opening hours".
  • Picture 2: Primo full detail view including text extract
  • Picture 3: web page for result "Adresses, Opening hours"
  • Picture 4: faceted result set

Installation instructions

Before using this adaptor, you need a working Solr instance. Please refer to the installation manual of your CMS (example: eZFind, http://ez.no/doc/extensions/ez_find/2_1).

To install the adaptor in Primo, follow these steps:
1. Download the binary distribution (see below) to your Primo frontend server.
2. Log on to your Primo FE server as the primo user.
3. Stop the Primo frontend server (fe_stop).
4. Extract the distribution into a separate directory (e.g. /tmp), then execute:

/tmp$ tar xvfz primo-websearch-dist-1.0.0.tar.gz
/tmp$ dist/install-websearch.sh

This script copies the classes and Java libraries to the correct places for your Primo frontend. (Note: if you have changed the application context of your Primo installation, which is unlikely, modify the script and set the correct APP_DIR.)
5. Enter "fe_conf" to switch to the frontend configuration directory.
6. Merge the contents of "thirdnode-config.websearch.xml" into "thirdnode-config.xml".
7. Adjust the path to the adaptor configuration XML in thirdnode-config.xml.
8. Follow the steps described in the Primo manual to make the new third node known to Primo (see also http://www.exlibrisgroup.org/display/PrimoOI/Deep+Search+Adapter, steps 6-9). See below for some example values.
9. Start the frontend server again (fe_start).
10. Configure the adaptor as necessary (by editing primo-websearch.xml).
11. Test.

You can change the detailed adaptor configuration at runtime without restarting the Primo frontend. The configuration, including all XSL stylesheets, reloads itself when necessary.
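The source does not say how the adaptor decides that a reload is "necessary". One common approach is to compare the file's modification time on each access. The following Python sketch illustrates that idea only; the class name `ReloadingFile` is hypothetical and this is not the adaptor's actual (Java) implementation.

```python
import os


class ReloadingFile:
    """Cache a file's text and re-read it only when its modification time changes.

    Hypothetical illustration of reload-on-change; the real adaptor may use a
    different mechanism.
    """

    def __init__(self, path):
        self.path = path
        self._mtime = None  # modification time seen at the last read
        self._text = None   # cached file content

    def get(self):
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            # File is new to us or was modified since the last read: reload it.
            with open(self.path, encoding="utf-8") as fh:
                self._text = fh.read()
            self._mtime = mtime
        return self._text
```

Calling `ReloadingFile("primo-websearch.xml").get()` before each search would then pick up an edited configuration without a restart, at the cost of one `stat` call per request.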

Configuration

Go to "fe_conf" and open the file "primo-websearch.xml". Enter your Solr base URL in the "solr-base" tag as in the example. The file also contains further examples; many options can be adjusted (e.g. the facets to be used).
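To make the role of the Solr base URL concrete, here is a minimal Python sketch of how a base URL and a search term can be combined into a request against Solr's standard select handler. The parameters `q`, `start`, `rows`, `wt`, `facet` and `facet.field` are standard Solr query parameters; the host, port and the facet field name `content_type` are assumptions for illustration, and the adaptor itself builds its queries through its own configuration and XSL stylesheets, which may differ.

```python
from urllib.parse import urlencode


def build_select_url(solr_base, query, facet_fields=(), rows=10, start=0):
    """Build a Solr /select URL for a full-text query with optional field facets."""
    params = [
        ("q", query),
        ("start", start),
        ("rows", rows),
        ("wt", "xml"),  # ask Solr for an XML response
    ]
    if facet_fields:
        params.append(("facet", "true"))
        # One facet.field parameter per field to facet on.
        params.extend(("facet.field", name) for name in facet_fields)
    return solr_base.rstrip("/") + "/select?" + urlencode(params)


# Assumed example values: a local Solr instance and a hypothetical facet field.
url = build_select_url("http://localhost:8983/solr", "opening hours", ["content_type"])
print(url)
```

Opening such a URL in a browser is a quick way to check that the Solr base URL is reachable before entering it in the adaptor configuration.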

If you are not using eZFind, you also need to copy and edit the XSL stylesheet. Please upload your stylesheet here when you configure the adaptor for a new system.

Development

Only go down this path if you know exactly what you are doing. If not, you may destroy the configuration of your Primo frontend server.

To customize the adaptor for non-Solr XML sources, you need to write a new Java strategy. You can download the development source distribution below. Working this way requires basic knowledge of the Java language and of the mechanisms used in web applications. The Ant script builds your final distribution zip and optionally copies the files to your server (don't develop on your production server!). You can also test the adaptor on the command line with the given shell script.

Software requirements

Primo Version 2.1.x

Technology

This adaptor extends the deep search mechanism with XSL Transformation (XSLT) stylesheets that are evaluated at runtime against the results from Solr. It is fully XSLT 2.0 compatible, which includes XSLT regular expression support. No reloads are necessary when changing the XSL stylesheets or the adaptor configuration, which makes development fast and easy. The Primo logging interface is also supported, and all logs are written to the standard primo_library.log.
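The kind of mapping such a stylesheet performs can be illustrated without XSLT. The Python sketch below parses a Solr XML response and maps Solr field names to target field names. It is a stand-in for the idea only: the sample document, the Solr field names and the target names `title` and `link` are all assumptions, and the real adaptor does this mapping in XSL, not Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical Solr XML response with a single hit (made-up values).
SAMPLE = """<response>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">42</str>
      <str name="title">Opening hours</str>
      <str name="url">http://example.org/opening-hours</str>
    </doc>
  </result>
</response>"""


def solr_docs_to_records(xml_text, field_map):
    """Map each Solr <doc> to a dict, renaming fields as {target: solr_field}."""
    root = ET.fromstring(xml_text)
    records = []
    for doc in root.findall("./result[@name='response']/doc"):
        # Only single-valued fields are read; multi-valued <arr> fields are skipped.
        fields = {el.get("name"): (el.text or "") for el in doc if len(el) == 0}
        records.append({target: fields.get(source, "")
                        for target, source in field_map.items()})
    return records


print(solr_docs_to_records(SAMPLE, {"title": "title", "link": "url"}))
```

Because the mapping is data (here a dict, in the adaptor a stylesheet), supporting a different Solr schema means changing the mapping rather than the code.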

Download

Using the following Ex Libris open interfaces

  • [Primo Deep Search Adaptor Interface]

Release notes

  • Version 1.0.2 (January 29th, 2010) - support for multi-facet restrictions
  • Version 1.0.1 (January 29th, 2010) - full result view now also works with faceted results (improved query facet mapping)
  • Version 1.0.0 (January 11th, 2010) - initial release with eZFind customizations; basic support for Solr faceting
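As background to the multi-facet support in 1.0.2: in Solr, several facet restrictions are usually expressed as several `fq` (filter query) parameters, which Solr intersects. The Python sketch below shows that general Solr convention under assumed field names (`content_type`, `language`); how the adaptor's solr-query-default.xsl actually builds its restrictions is not described in the source.

```python
from urllib.parse import urlencode


def facet_filter_params(restrictions):
    """Turn (field, value) facet restrictions into Solr fq parameters.

    Each restriction becomes its own fq parameter; Solr applies all of them,
    so the result set is narrowed by every selected facet value.
    """
    params = []
    for field, value in restrictions:
        # Quote the value as a phrase and escape backslashes and double quotes.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", f'{field}:"{escaped}"'))
    return params


# Hypothetical facet fields and values.
filters = facet_filter_params([("content_type", "article"), ("language", "en")])
print(urlencode(filters))
```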

Known issues

  • Date range faceting not supported
  • Blended result sets not tested

Page Attachments

File Name | Comment | Size | Number of Downloads
primo-websearch-dist-1.0.0.tar.gz | Initial release of the Primo Websearch Adaptor | 3.25 MB | 18
primo-websearch-src-1.0.0.zip | Initial release of the Primo Websearch Adaptor (Java source code) | 13.39 MB | 16
solr-query-default.xsl | v1.0.2 multi-facet support | 2 kB | 15

Added by Kai Jauslin on Jan 11, 2010 21:14, last edited by Kai Jauslin on Feb 04, 2010 13:16
