CGI-based “Proxy” (URL-rewriter) Used at TCU. Developed in March 1996, initially for access to First Search, then other vendors. Love child of Naïveté and Desperation. (Getting people into commercial databases was easy with Telnet and C-Kermit; isn’t the web supposed to make things easier?) Desperation: needed a solution now. Naïveté: figured we could cobble together a solution to use for about 18 months, then IETF/vendors would have a better solution available.
Goals... Support the following vendor authorization scenarios with one program:
–User enters id/password in a form to gain access (Cambridge Scientific Abstracts, Ovid on TexShare server, OCLC)
–Vendor uses standard HTTP Username/Password validation (Hoovers Company Data, CenStats)
–Vendor validates based on IP address (many vendors)
Make completely transparent to user (no reconfiguration of Web browser). Use tools we already had (e.g., not maintain a Unix box just to run WebScript).
How it works: (IP validation scenario) User clicks on a link to a URL that looks like this: http://lib.tcu.edu/htbin/validate.pp?Britannica
Validate.pp looks at the WWW_REMOTE_ADDR field to determine whether the user is already coming from a TCU IP address:
$ If F$Loc("138.237", WWW_REMOTE_ADDR) .lt. F$Len(WWW_REMOTE_ADDR)
$ then
$   WS "Location: http://www.eb.com:180/"
$   WS ""
$ else
…. (IP validation scenario) If the user is off-campus, send them back a screen explaining that they’re about to be asked for a username and password, with a link that calls the CGI script for the remote resource. The URL for the link looks like this: http://lib.tcu.edu/htbin/Britannica.pp?http://www.eb.com:180/ If their username and password are registered, Britannica.pp uses Lynx to fetch the page specified as the parameter and dumps it in a temp file.
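The on-campus check and the off-campus link described above can be sketched roughly as follows. This is an illustrative Python sketch, not the DCL that TCU ran: the function name `route_request` and the returned dictionary shape are hypothetical, and it uses a prefix match on the address, whereas the DCL shown on the slide looks for "138.237" anywhere in WWW_REMOTE_ADDR.

```python
CAMPUS_PREFIX = "138.237."                 # TCU address range, per the slides
PROXY_BASE = "http://lib.tcu.edu/htbin/"   # CGI directory, per the slides


def route_request(remote_addr: str, vendor_url: str, script_name: str) -> dict:
    """Decide how to answer a click on a validate.pp-style link.

    On-campus users are redirected straight to the vendor, which will
    accept their IP address. Off-campus users get a link that routes
    them through the proxying CGI script instead.
    """
    if remote_addr.startswith(CAMPUS_PREFIX):
        # Vendor sees a campus address, so no proxying is needed.
        return {"action": "redirect", "location": vendor_url}
    # The vendor URL is passed to the proxy script after the "?".
    return {"action": "login_notice",
            "link": f"{PROXY_BASE}{script_name}?{vendor_url}"}
```

For example, `route_request("138.237.10.5", "http://www.eb.com:180/", "Britannica.pp")` yields a redirect, while any other address yields the Britannica.pp link shown on the slide.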
…. (IP validation scenario) A Pascal program opens the temp file and looks for all HREF=, SRC= and other attributes that specify URLs. If a URL is relative, it must first be converted to absolute form. Then, assuming the URL points to something that also requires IP validation, the URL of the CGI script is prepended to it.
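The rewriting step might look something like the following Python sketch (the original was a Pascal program, which is not shown in the slides). The function name `rewrite_links` and the regular expression are hypothetical, only double-quoted HREF and SRC values are handled, and "requires IP validation" is approximated here by the assumption that the link points at the same host and port as the page being rewritten.

```python
import re
from urllib.parse import urljoin, urlparse

PROXY_SCRIPT = "http://lib.tcu.edu/htbin/Britannica.pp"  # from the slides

# Matches HREF="..." or SRC="..." (double-quoted values only, for brevity).
ATTR_RE = re.compile(r'\b(href|src)\s*=\s*"([^"]*)"', re.IGNORECASE)


def rewrite_links(html: str, page_url: str,
                  proxy_script: str = PROXY_SCRIPT) -> str:
    """Make every link absolute, then route vendor links via the proxy."""
    vendor_host = urlparse(page_url).netloc

    def repl(match: re.Match) -> str:
        attr, url = match.group(1), match.group(2)
        # Step 1: convert a relative URL to absolute form.
        absolute = urljoin(page_url, url)
        # Step 2 (assumption): only same-host links need IP validation.
        if urlparse(absolute).netloc != vendor_host:
            return f'{attr}="{absolute}"'
        # Step 3: prepend the CGI script's URL.
        return f'{attr}="{proxy_script}?{absolute}"'

    return ATTR_RE.sub(repl, html)
```

With a page URL of `http://www.eb.com:180/bol/`, a link written as `HREF="search/index.html"` becomes `HREF="http://lib.tcu.edu/htbin/Britannica.pp?http://www.eb.com:180/bol/search/index.html"`, while links to other hosts are left pointing directly at their targets.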
…. (IP validation scenario) So, when the user clicks on the link (or when the browser fetches data automatically, as with IMG tags), access continues to be routed through the CGI script and the vendor’s server continues to see our IP address. Lynx will gladly fetch .JPG, .GIF, .PDF and other non-HTML data and dump it in a temp file, so the CGI script just does a redirect when it sees that it’s fetched non-HTML data:
$ If (F$Loc(".GIF", TSTR) .lt. F$Len(URL)) .OR. (F$Loc(".JPG", TSTR)...
$ then
$   WS "Location: http://library.tcu.edu/www$scratch/''HTML_InFile'"
$   WS ""
$ else
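The non-HTML shortcut can be illustrated with a small Python sketch. The function name `cgi_response_for` is hypothetical, the extension list is taken from the file types named on the slide, and the check here looks at the end of the URL path, whereas the DCL above searches for the extension anywhere in the string.

```python
from typing import Optional
from urllib.parse import urlparse

NON_HTML_EXTENSIONS = (".GIF", ".JPG", ".PDF")            # types named on the slide
SCRATCH_BASE = "http://library.tcu.edu/www$scratch/"      # from the slide


def cgi_response_for(url: str, temp_file_name: str) -> Optional[str]:
    """Return a CGI redirect for non-HTML data, or None for HTML.

    Non-HTML files need no link rewriting, so the script can simply
    point the browser at the temp file that Lynx already wrote.
    """
    path = urlparse(url).path.upper()   # ignore any query string
    if path.endswith(NON_HTML_EXTENSIONS):
        # A Location header followed by a blank line ends the CGI headers.
        return f"Location: {SCRATCH_BASE}{temp_file_name}\n\n"
    return None   # caller goes on to rewrite the HTML instead
```

A `None` result corresponds to the `else` branch on the slide, where the fetched page is HTML and still has to be rewritten before it is returned.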
Surprisingly... it’s worked for three years. Currently in use for 18 different sources. Most use IP validation, three use HTTP Username/Password, and five use logon forms with session ids. Between 250 and 1,400 files a day were fetched through the scripts in February. Although the technique is clunky (it creates temporary files for everything fetched), Internet response time is generally so slow that the extra time required by the scripts is negligible in comparison. Likewise, the load it adds to our Alpha server is small relative to other applications.
But… Lack of cookie support is becoming an issue for more and more databases. Creating scripts requires a fair amount of proficiency in DCL. Limitations in DCL and Lynx create the need for kludgy workarounds in some of the DCL scripts; they can be complex to diagnose when something stops working. How much longer will we be running a VMS server?
So… working on a new version. Written as a Java Servlet so as to be portable to different web servers and operating systems. (The Servlet interface replaces CGI.) Instead of scripts, it uses a configuration file, which should be easier to set up and maintain. Java has built-in support for cookies and for maintaining state in both directions (to the vendor, and to the user’s browser), so cookie support becomes doable.
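To illustrate what maintaining cookie state "in both directions" could involve, here is a minimal Python sketch; the planned version is a Java Servlet, and nothing below reflects its actual design. The class `VendorCookieStore` and its methods are hypothetical. The idea is that the proxy remembers the vendor's cookies on behalf of each user session and replays them on later requests to the vendor.

```python
from http.cookies import SimpleCookie
from typing import Dict, List


class VendorCookieStore:
    """Remembers vendor cookies per user session (in memory only)."""

    def __init__(self) -> None:
        # session id -> {cookie name: cookie value}
        self._jars: Dict[str, Dict[str, str]] = {}

    def remember(self, session_id: str, set_cookie_headers: List[str]) -> None:
        """Record the Set-Cookie headers from a vendor response."""
        jar = self._jars.setdefault(session_id, {})
        for header in set_cookie_headers:
            parsed = SimpleCookie()
            parsed.load(header)   # attributes such as Path are not kept
            for name, morsel in parsed.items():
                jar[name] = morsel.value

    def cookie_header(self, session_id: str) -> str:
        """Build the Cookie header value for the next vendor request."""
        jar = self._jars.get(session_id, {})
        return "; ".join(f"{name}={value}" for name, value in jar.items())
```

This sketch ignores cookie expiry, paths, and domains; a real proxy would need to honour those, and would also need its own cookie or session id on the browser side to tell users apart.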
Finally: if vendors start supporting a better solution, like the Referrer URL technique, there will be no need for it. All together now, let’s start holding our breath...