Google2RSS, a Second Time Around, or Introducing invsoap

Currently missing from the Web Service scene is a set of generic, easy-to-use command-line utilities for collecting and processing data available through Web Services. The set of tools I'm describing would be analogous to the grep, awk, and sed milieu. The idea behind these tools is somewhat different from the current (mostly theoretical) use cases for Web Services. Such tools would be used as passive data processors, hooked together using traditional shell-scripting techniques to create complex tools. These tools would make Web Service-based data available to the shell-scripting world.

I've created a prototype of an important tool in that toolchest: invsoap. (Get it here. Requires Ant and J2SE 1.4 installed.) Invsoap is a command-line utility for requesting SOAP-HTTP Web Service-based data. ("Invsoap" is a contraction of "invoke SOAP".) The utility follows the well-known command-line data processing pattern: stdin/stdout for input and output by default, command-line arguments to express configuration parameters, and process exit codes to indicate success or failure of the Web Service request (command-line instructions). Thus the utility fits in well with the shell script world, and provides Web Service-based data to that world without requiring custom application design and coding.

Invsoap is a prototype of a utility that's missing from the major software utility producers' offerings (Apache, GNU, SourceForge or any of the major commercial software producers such as Microsoft). The prototype implementation is not particularly efficient. Its raison d'etre is to define the minimal set of command-line parameters, and to demonstrate the usefulness of such a utility.

My Google2RSS re-implementation demonstrates invsoap's usefulness. Google2RSS is a simple utility program, the public gift of Peter Drayton -- read his original story.
His utility is a specific example of integrating data originating from Web Services into an automated system built using basic, well-known XML processing techniques.

Peter Drayton's application is an OO-based utility. Its central processing uses .NET's internal Schema-OO Class mapping and translation facilities, specifically bound to the Google Search Web Service and RSS (Rich Site Summary) XML document types. Consequently his original Google2RSS implementation requires development of a custom utility for translating Google search results into RSS data -- a task just as easily accomplished using any generic XSLT processor and an XSLT script.

In my implementation I used a 3-line shell script I wrote in Notepad, which could probably be contracted down to 1 line if xsltproc supported standard input/output streams. Here's the whole script:
And this generates an RSS feed containing the top 10 Google search results for "Surfer Potato". The resulting feed is hosted at http://www.blumenfeld-maso.com/weblog/RSSSurferPotatoFeed.xml.

In fact, Web Service data processing can often be developed more quickly using shell script-based (i.e., stream-based) utilities that are XML-aware (XSLT processors, invsoap, XQuery processors, etc.) than by custom OO-based applications that use XML-OO translation to get XML data into and out of the application. (Let it be known that I am taking a rather cheap shot at Mr. Drayton's Google2RSS implementation. His utility is meant to demonstrate generation of an RSS data feed, and not a specific coding practice or design pattern. My apologies to him for inappropriately using his utility as a "negative example".)

The invsoap command-line utility requires a reference to a WSDL resource describing a SOAP-HTTP Web Service. The utility reads one or more SOAP-HTTP requests from files named on the command line, or from stdin if no files are specified. Each request is sent in an HTTP POST to the address that the WSDL file specifies for the SOAP operation the request invokes (using the SOAPAction value and other important-but-should-be-hidden data available in the WSDL file). The responses, in the form of SOAP <Envelope> elements, are streamed to stdout serially, or are stored in an output file if one is named as a command-line parameter.

Anyone who can write shell scripts, and has decent knowledge of popular XML processing techniques such as XSL processing, should be able to build powerful, custom Web Service-based data processing applications without using heavy-weight programming languages and frameworks such as .NET or Java. The current reality is that a set of generic command-line utilities for distributed processing using XML data has not entered the popular software engineering vernacular.
So, like Peter Drayton, developers must resort to low-level customization.
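
To make the request flow described above a little more concrete, here is a rough Python sketch of the same pattern: SOAP envelope in on stdin, HTTP POST with a SOAPAction header, response envelope out on stdout, exit code reporting success or failure. This is not invsoap's actual code or command line. The function names, the exit code values, and the idea of passing the endpoint and SOAPAction directly as arguments (rather than looking them up in a WSDL file, as invsoap does) are all simplifying assumptions made for illustration.

```python
import sys
import urllib.error
import urllib.request


def build_soap_request(endpoint, soap_action, envelope):
    """Wrap a SOAP envelope (bytes) in an HTTP POST request object."""
    return urllib.request.Request(
        endpoint,
        data=envelope,
        method="POST",
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            # SOAP 1.1 over HTTP carries the action as a quoted string.
            "SOAPAction": '"%s"' % soap_action,
        },
    )


def invoke(endpoint, soap_action, source, sink):
    """Send one envelope read from source; write the response to sink.

    Returns a process exit code (the values are this sketch's own choice):
    0 = response received, 1 = HTTP error (for example a SOAP fault),
    2 = the endpoint could not be reached.
    """
    request = build_soap_request(endpoint, soap_action, source.read())
    try:
        with urllib.request.urlopen(request) as response:
            sink.write(response.read())
        return 0
    except urllib.error.HTTPError as err:
        # Faults usually arrive with an error status; pass the body
        # through so downstream tools can still inspect it.
        sink.write(err.read())
        return 1
    except urllib.error.URLError as err:
        print("request failed: %s" % err.reason, file=sys.stderr)
        return 2


def main(argv):
    """Hypothetical entry point: argv = [program, endpoint, soap_action]."""
    if len(argv) != 3:
        print("usage: %s ENDPOINT SOAPACTION < request.xml" % argv[0],
              file=sys.stderr)
        return 64
    return invoke(argv[1], argv[2], sys.stdin.buffer, sys.stdout.buffer)
```

Calling sys.exit(main(sys.argv)) at the bottom of a script would turn this into a filter whose output could be piped to an XSLT processor. The part the sketch leaves out is what makes invsoap more convenient: reading the WSDL so the caller never has to know the endpoint address or the SOAPAction value.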