Basic URL Export for SFX 4
Tags: basic, export
Last Updated: Aug 02, 2011 14:36
- Description
Basic URL Export for SFX 4 with additional proxy flag information (only active targets, services and portfolios).
- The main motivation behind this module/program is that the SFX 4 URL Export is broken out of the box.
- The export takes less than 2 minutes on average, compared to 10 minutes (4.1.1) or 20 minutes (pre 4.1.1) for the Ex Libris export. The exact time depends on your hardware.
- Supports many exceptions, where the hostnames must be built from jkeys.
This is mostly based on the SFX_SCRIPTS of EZproxy Wondertool without the EZproxy part (since that should be part of a separate software to which you can feed URLs from many different sources).
- Author: Teemu Nuutinen
- Additional author(s):
- Institution: Helsinki University Library
- Year: 2011
- License: Perl License
- Short description: This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
- Link to terms: Detailed license terms
- Skill required for using this code: basic
State
Stable
Programming language
Perl (5.10, maybe older)
Software requirements
SFX 4
Download
ExportURLs.pm v0.1.1
Changes
v0.1.1 2011.08.01
- Added case-insensitive flag to the $RE_match_url regex to catch If(...), if() etc. parse params too.
- Added a very simple check to prevent running as a non-SFX user. Some Ex Libris module runs out of memory if certain environment variables are not set. Thanks to A. Headley for the report.
- Due to the previous bug a new method get_dbh() was added for overriding so you need not rely on env variables and can use DBI instead. See example [perl-module] on how to use it.
Installation instructions
- Find out the owner of your SFX instance. For example, view a local target, click Additional details and check the Owner field. It should look something like r.consortium-name.instance-name.
- Copy the ExportURLs.pm to a directory of your choice on the SFX 4 server.
- Run the command (substitute the owner)
perl ExportURLs.pm <owner> > export_urls.txt
Example output in file:
#TARGET-SERVICE URL PROXY
ACM_DIGITAL_LIBRARY-getFullTxt http://portal.acm.org/browse_dl.cfm?linked=1&part=transaction&idx=J782&coll=portal&dl=ACM 1
ACM_DIGITAL_LIBRARY-getFullTxt http://portal.acm.org/toc.cfm?id=SERIES402 1
ACM_DIGITAL_LIBRARY-getFullTxt http://portal.acm.org/toc.cfm?id=SERIES12160 1
ACM_DIGITAL_LIBRARY-getFullTxt http://portal.acm.org/toc.cfm?id=SERIES492 1
ACM_DIGITAL_LIBRARY-getFullTxt http://portal.acm.org/browse_dl.cfm?linked=1&part=series&idx=SERIES955&coll=portal&dl=ACM 1
ACM_DIGITAL_LIBRARY-getFullTxt http://portal.acm.org/browse_dl.cfm?linked=1&part=series&idx=SERIES11155&coll=portal&dl=ACM 1
ACM_DIGITAL_LIBRARY-getFullTxt http://portal.acm.org/browse_dl.cfm?linked=1&part=series&idx=SERIES550&coll=portal 1
ACM_DIGITAL_LIBRARY-getFullTxt http://portal.acm.org/browse_dl.cfm?linked=1&part=series&idx=SERIES108&coll=portal&dl=ACM 1
ACM_DIGITAL_LIBRARY-getFullTxt http://portal.acm.org/browse_dl.cfm?linked=1&part=series&idx=SERIES632&coll=portal&dl=ACM 1
...
Known issues
- No support for SFX institutes (Default is used).
- The exceptions will have to be updated manually if the linking syntax changes.
- Must be run on the same machine as SFX 4, as an SFX user with the SFX environment variables set. It is, however, trivial (for a programmer) to use DBI instead of Manager::Connection for remote access.
-- There is now an example [perl-module] on how to do the export from a remote machine (tested using perl 5.14.1 with DBD::mysql driver v4.019).
- The owner must be found out manually.
Comments
The same hostname can appear on different rows of the output with the proxy flag set to 0 (false) on some rows and 1 (true) on others (e.g. mixed free and licensed content). So be careful if you use this to generate EZproxy configuration; always pool the hostnames and only set proxying if the flag is true (1).
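The pooling described above can be sketched as a small shell function around awk. This is only an illustration under stated assumptions, not part of ExportURLs.pm: the function name pool_hosts is hypothetical, the output is assumed to be tab-separated (as the cut -f1,2 usage suggests), and a hostname is assumed to need proxying as soon as at least one of its rows carries the flag 1.

```shell
# pool_hosts (hypothetical helper, not part of ExportURLs.pm):
# reads export rows "TARGET-SERVICE<TAB>URL<TAB>PROXY" on stdin and prints
# each hostname once as "hostname<TAB>flag".
# Assumption: a host is proxied if ANY of its rows has the proxy flag 1.
pool_hosts() {
  awk -F'\t' '
    /^#/ || NF < 3 { next }                          # skip header and malformed rows
    {
      host = $2
      sub("^[a-zA-Z][a-zA-Z0-9+.-]*://", "", host)   # strip the scheme
      sub("[/?#].*$", "", host)                      # strip path, query, fragment
      sub("^.*@", "", host)                          # strip user info, if any
      sub(":[0-9]+$", "", host)                      # strip the port, if any
      host = tolower(host)
      if (host == "") next
      if (!(host in flag)) flag[host] = 0            # first time we see this host
      if ($3 + 0 == 1) flag[host] = 1                # one true flag is enough
    }
    END { for (h in flag) print h "\t" flag[h] }
  ' | LC_ALL=C sort
}
```

It could then be used in the same way as the other pipelines on this page, for example: perl ExportURLs.pm <owner> | pool_hosts > pooled_hosts.txt. Whether a pooled flag of 1 should lead to proxying in your EZproxy configuration remains a local policy decision.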
There is no command line switch to remove the proxy information, but to remove it you can simply pipe the output to cut, e.g.
perl ExportURLs.pm <owner> | cut -f1,2 > export_urls.txt
For more advanced usage, see perldoc ExportURLs.
You can freely use all parts of the code for your own projects.
About SQL-queries in this script and in EZproxy Wondertool
I received an email (which I somehow must have accidentally deleted - sorry!) regarding the compatibility of the queries. The SQL closely follows that of Wondertool's SFX 3 script (sfx-generate-ezproxy.pl), but there are key differences:
- the projection is slightly different
-- Wondertool: displayName targetName targetService 'Active Target' URLString DefaultAvailability
-- ExportURLs: display_name target service parse_param proxyflag
- ExportURLs uses SQL-92 syntax instead of the old SQL-89
- ExportURLs uses Perl code for the exception URLs instead of SQL for maximum flexibility
Adapting ExportURLs' queries to Wondertool shouldn't be too hard, but I would recommend a different approach which is why I didn't use Wondertool in the first place: modify Wondertool or some other EZproxy script to read in URLs from files so you can use many different sources instead of just SFX.
Acknowledgements
This module is inspired by the SFX_SCRIPTS of EZproxy Wondertool developed by Rod McFarland and Paul Joseph, UBC; the original SFX script came from Ken Mitchell at Duke.

