2 Today
  Web Services – Introduction
  “Remote Procedure Call” in WS
    Binding, Marshalling…
    Using TCP as the transport for RPCs
  Connectivity Issues: NAT, Firewall
3 What are Web Services?
  Today, we normally use Web browsers to talk to Web sites
    Browser names document via URL (lots of fun and games can happen here)
    Request and reply encoded in HTML, using HTTP to issue request to the site
  Web Services generalize this model so that computers can talk to computers
4 What are Web Services?
  [Diagram: Client System, SOAP Router, Backend Processes, Web Service]
5 What are Web Services?
  “Web Services are software components described via WSDL which are capable of being accessed via standard network protocols such as SOAP over HTTP.”
6 What are Web Services?
  Today, SOAP is the primary standard. SOAP provides rules for encoding the request and its arguments.
7 What are Web Services?
  Similarly, the architecture doesn’t assume that all access will employ HTTP over TCP.
  In fact, .NET uses Web Services “internally” even on a single machine. But in that case, communication is over COM.
8 What are Web Services?
  WSDL documents are used to drive object assembly, code generation, and other development tools.
  [Diagram: a WSDL document accompanies the Web Service]
9 Web Services are often Front Ends
  [Diagram labels]
  Client Platform: Web Service invoker, COM App, CORBA App, C# App
  SOAP messaging
  Server Platform: Web Server (e.g., IBM WebSphere, BEA WebLogic), Web App Server, WSDL-described Web Service, DB2 server, SAP
10 The Web Services “stack”
  Business Processes: BPEL4WS (IBM only, for now)
  Quality of Service: Reliable Messaging, Security, Transactions, Coordination
  Description: WSDL, UDDI, Inspection
  Messaging: SOAP, XML, Encoding, Other Protocols
  Transport: TCP/IP or other network transport protocols
11 What are Web Services?
  Amazon would hand out “serverlets” for 3rd party developers to use
  This connects their applications directly to Amazon’s system
12 Advantages of web services?*
  Web services provide interoperability between various software applications running on various platforms.
    “vendor, platform, and language agnostic”
  Web services leverage open standards and protocols. Protocols and data formats are text based where possible.
    Easy for developers to understand what is going on.
  By piggybacking on HTTP, web services can work through many common firewall security measures without requiring changes to their filtering rules.
    Can also be a disadvantage: security issues.
  *: From Wikipedia
13 How Web Services work
  First the client discovers the service.
    More in next lecture!
  Typically, client then binds to the server.
    By setting up TCP connection to the discovered address.
    But binding not always needed.
  OK to say client/server here?
14 How it works…
  Next build the SOAP request (marshalling):
    Fill in what service is needed, and the arguments. Send it to server side.
    XML is the standard for encoding the data (but is very verbose and results in HUGE overheads)
  SOAP router routes the request to the appropriate server (assuming more than one available server)
    Can do load balancing here.
  Are these all request-response? Or is it possible to say something like setVal(X, 100) also?
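A minimal sketch of the marshalling step, using only Python's standard library. The service namespace `urn:example:value-service` and the argument names `name` and `value` are assumptions made for illustration; the envelope namespace is the SOAP 1.1 one. The operation is the slide's own setVal(X, 100).

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:value-service"               # hypothetical service namespace


def build_soap_request(operation, args):
    """Marshal an operation name and its arguments into a SOAP envelope string."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The operation becomes an element in the body; each argument a child element.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for arg_name, arg_value in args.items():
        ET.SubElement(call, arg_name).text = str(arg_value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_soap_request("setVal", {"name": "X", "value": 100})
print(request_xml)
print(len(request_xml), "bytes of XML to carry the call setVal(X, 100)")
```

Comparing the length of the envelope with the 14 characters of `setVal(X, 100)` gives a feel for the overhead the slide complains about.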
15 How it works…
  Server unpacks the request (demarshalling), handles it, computes result.
  Result sent back in the reverse direction: from the server to the SOAP router back to the client.
16 Marshalling Issues
  Data exchanged between client and server needs to be in a platform-independent format.
    “Endian”ness differs between machines.
    Data alignment issue (16/32/64 bits)
    Multiple floating point representations.
    Pointers
    (Have to support legacy systems too)
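The byte-order and alignment issues are easy to see with Python's `struct` module. This is only an illustration of the problem for binary encodings; the values are made up, and SOAP itself sidesteps the issue by sending text.

```python
import struct

value = 0x12345678
temperature = 21.5

# The same 32-bit integer laid out in the two byte orders.
big = struct.pack(">I", value)      # most significant byte first
little = struct.pack("<I", value)   # least significant byte first
print(big.hex(), little.hex())

# A receiver that assumes the wrong byte order silently reads a different number.
misread = struct.unpack("<I", big)[0]
print(hex(misread))

# "!" selects network byte order with no padding, so the layout is the same on
# every machine. Native mode ("@") may insert alignment padding that differs
# from platform to platform.
wire = struct.pack("!Id", value, temperature)
print(len(wire), "bytes on the wire;", struct.calcsize("@Id"), "bytes in native layout")

decoded = struct.unpack("!Id", wire)
```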
17 Discovery
  This is the problem of finding the “right” service
    In our example, we saw one way to do it – with a URL
    Web Services community favors what they call a URN: Uniform Resource Name
  But the more general approach is to use an intermediary: a discovery service
19 Repository summary
  A database listing servers
  Each is described using the UDDI language, which is defined over XML
    Hence can be searched with XML queries
  An extensible standard
    Defines some required information about interfaces available and argument types, etc.
    But services can provide extra information too.
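To make “searched with XML queries” concrete, here is a toy repository queried with the limited XPath support in Python's standard library. The element names (`registry`, `service`, `category`, `accessPoint`, `region`) and the URLs are invented for this sketch; they are not the real UDDI schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository contents: one "row" per registered service.
REGISTRY_XML = """
<registry>
  <service>
    <name>TemperatureService</name>
    <category>temperature</category>
    <accessPoint>http://example.com/temperature</accessPoint>
    <region>Ithaca</region>
  </service>
  <service>
    <name>MapService</name>
    <category>maps</category>
    <accessPoint>http://example.com/maps</accessPoint>
  </service>
  <service>
    <name>BackupTemperatureService</name>
    <category>temperature</category>
    <accessPoint>http://example.org/temperature</accessPoint>
  </service>
</registry>
""".strip()


def find_access_points(registry_xml, category):
    """Return the access points of every service registered under a category."""
    root = ET.fromstring(registry_xml)
    matches = root.findall(f"service[category='{category}']")
    return [service.findtext("accessPoint") for service in matches]


temperature_services = find_access_points(REGISTRY_XML, "temperature")
print(temperature_services)
```

The optional `region` element on the first entry stands in for the “extra information” a service may add beyond the required fields.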
20 Roles?
  UDDI is used to write down the information that becomes a “row” in the repository (“I have a temperature service…”)
  WSDL documents the interfaces and data types used by the service
  But this isn’t the whole story…
21 Discovery and naming
  The topic raises some tough questions
    Many settings, like the big data centers run by large corporations, have rather standard structure. Can we automate discovery?
    How to debug if applications might sometimes bind to the wrong service?
    Delegation and migration are very tricky
    Should a system automatically launch services on demand?
22 Example: Why discovery is tricky
  Client has opinions
    “I want current map data for Disneyland showing line-lengths for the rides right now”
  Service has opinions
    Amazon.com would like requests from Ithaca to go to the NJ-3 datacenter, and if possible, to the same server instance within each clustered service
  DNS has opinions
    Many systems play with name -> IP bindings
  Internet has opinions (routing)
23 So, what’s tricky?
  Web Services doesn’t standardize these four steps, it just assumes that people will hack solutions
  Hence some are hard to implement, we lack standards, and in some cases, solutions are poor ones
  UDDI and WSDL are just a corner of the overall picture!
24 Network address translation…
  Another issue: often, the internal address is not addressable from outside!
    A tiny bit of security.
  But if RPC server is behind a NAT, trouble!
    NAT needs the host behind it to start the connection process.
    Need to configure NAT to let specified traffic through.
    Generally, HTTP (WS traffic) is let through.
  Tough to have a connection between two hosts behind NATs.
    There are some tricks to bypass this though.
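A toy model of the translation table shows why a server behind a NAT is unreachable until the NAT is configured. The class, its method names, the starting port 40000, and all addresses are assumptions for this sketch; real NATs track more state (protocol, remote endpoint, timeouts).

```python
class Nat:
    """Toy NAT: maps private (ip, port) pairs to ports on one public address."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map = {}  # (private_ip, private_port) -> public port
        self.in_map = {}   # public port -> (private_ip, private_port)

    def translate_outbound(self, private_addr):
        # A mapping is created only when an inside host starts the conversation.
        if private_addr not in self.out_map:
            port = self.next_port
            self.next_port += 1
            self.out_map[private_addr] = port
            self.in_map[port] = private_addr
        return (self.public_ip, self.out_map[private_addr])

    def translate_inbound(self, public_port):
        # Returns None when nothing maps to this port: the packet is dropped.
        return self.in_map.get(public_port)

    def forward_port(self, public_port, private_addr):
        # Manual configuration that lets specified traffic through to a server.
        self.in_map[public_port] = private_addr


nat = Nat("203.0.113.1")
print(nat.translate_inbound(8080))                     # no mapping yet
print(nat.translate_outbound(("192.168.0.10", 5555)))  # inside host connects out
nat.forward_port(8080, ("192.168.0.20", 80))           # expose the inside server
print(nat.translate_inbound(8080))
```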
25 Firewalls
  These allow/disallow traffic, depending on source, destination, protocol used, etc.
  Often only allow connection from the inside to the outside!
  Stateful: remember active flows, and disallow unexpected packets (NAT)
  Again, need to configure to ensure server traffic gets through. (General RPC)
  Again, HTTP (WS) does not face as much of a restriction.
  Get traffic statistics.
  Spam/virus checking, etc.
  NAT and firewall typically in the same box.
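The “stateful” behaviour can be sketched in a few lines: remember flows that were opened from the inside, and admit inbound packets only if they belong to such a flow or target an explicitly opened port. The class and its rule set are hypothetical and far simpler than a real firewall.

```python
class StatefulFirewall:
    """Toy stateful firewall; addresses are (ip, port) tuples."""

    def __init__(self, open_inbound_ports=()):
        self.open_inbound_ports = set(open_inbound_ports)  # configured exceptions
        self.active_flows = set()                          # (inside, outside) pairs

    def outbound(self, inside, outside):
        # Connections from the inside are allowed and remembered.
        self.active_flows.add((inside, outside))
        return True

    def inbound(self, outside, inside):
        # Replies on a remembered flow are expected, so they pass.
        if (inside, outside) in self.active_flows:
            return True
        # Anything else passes only if its destination port was opened.
        return inside[1] in self.open_inbound_ports


fw = StatefulFirewall(open_inbound_ports={80})  # only HTTP is opened
client = ("10.0.0.5", 51000)
remote = ("203.0.113.7", 443)

print(fw.inbound(remote, client))   # unexpected packet: refused
fw.outbound(client, remote)
print(fw.inbound(remote, client))   # reply on an active flow: allowed
print(fw.inbound(("198.51.100.9", 40000), ("10.0.0.8", 80)))    # HTTP: allowed
print(fw.inbound(("198.51.100.9", 40000), ("10.0.0.8", 5000)))  # other RPC port: refused
```

The last two calls mirror the slide's point: HTTP-based Web Service traffic gets through with the usual configuration, while a general RPC server on another port needs an explicit rule.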
26 Demilitarized Zone (DMZ)
  DMZ: used to host publicly accessible services like company webpages, FTP, DNS.
    Good place to host the Web Service!
  DMZ situated outside the private network.
    No outgoing connections from DMZ.
    If DMZ attacked, damage limited to DMZ.
27 Client talks to eStuff.com
  Moving on… let’s oversimplify and just assume the client manages to find the data center
  We think of remote method invocation and Web Services as a simple chain:
  [Diagram: Client system, SOAP RPC, SOAP router, Web Services]
28 So… suppose we get in
  Assuming we can connect to the data center (to its Web Services router), then what?
  If you just use Visual Studio out of the box, you end up with a single-machine Web Server
  But massive datacenters are common!
29 A glimpse inside eStuff.com
  [Diagram: “front-end applications”; six LB/service pairs; pub-sub combined with point-to-point communication technologies like TCP]
30 Clusters and load balancing
  Idea here is that some form of load balancer spreads work over a cluster
  And cluster replicates data for availability and load management
  How it does this is a topic we need to discuss in more detail (not today)
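One simple form a load balancer could take is round-robin: hand each incoming request to the next cluster member in turn. The slides do not say which policy eStuff.com uses, so this is only an illustrative sketch with invented server names; it ignores load, affinity, and failures.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Spread requests evenly over a fixed list of cluster members."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def route(self, request):
        # The request content is ignored: every member is assumed able to
        # serve any request, which is why the cluster replicates its data.
        return next(self._cycle)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
assignments = [balancer.route(f"request-{i}") for i in range(6)]
print(assignments)
print(Counter(assignments))
```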
31 What about “legacy” applications?
  Some of these Web services are really just front-ends to older legacy applications
  So to talk to an old IBM database, we might
    Run the database on some sort of machine, or virtual machine
    Build one of these translator front-ends
    And then register it with the Web Services router
  This may sound expensive (it is) but it works!
  Obviously, our fancy clustering and load-balancing won’t apply to a legacy application, so those fancy tricks are only for “new” code
32 Discovery in eStuff.com
  Data centers are increasingly common
  And they raise hard questions!
    How can a data center in California control decisions a client is making in Ithaca?
    Services are clustered. How should a client request be “routed” to the right member?
    Once you start talking to a server it may cache data for you. How can you be sure to get the right one next time?
33 These are modern challenges
  Web Services can be seen as evolving from prior work
  Most often cited: CORBA, which also was used in many big data centers
    But CORBA didn’t assume that clients came in over the public Internet
    More often, CORBA was used between a hand-built client and the service it talks to
34 CORBA approach
  CORBA had ways to export specialized client stubs
    The client stub could include server-provided decision logic, like “which data center to connect with”
    Gives data center a form of remote control
  Factory services: manufacture certain kinds of objects as needed
    Effect was that “discovery” can also be a “service creation” activity
35 CORBA is object oriented
  Seems obvious… and it is. CORBA is centered around the notion of an object
  Objects can be passive (data)… active (programs)… persistent (data that gets saved)… volatile (state only while running)
  In CORBA the application that manages the object is inseparable from the object
    And the stub on the client side is part of the application
    The request per se is an action by the object on itself and could even exploit various special protocols
  We can’t do this in Web Services
36 Web Services are document-centric
  That is, communication is by sending documents (like pages) from client to server and back
  And most guarantees or properties are associated with the document itself, not the service
    For example, WS_RELIABILITY isn’t about making services reliable, it defines rules for writing reliability requests down and attaching them to documents
    In contrast, CORBA fault-tolerance standard tells how to make a CORBA service into a highly available clustered service
37 Will Web Services “help” with naming and discovery?
  Web Services tells us how one client can…
    … find one server and…
    … bind to that server and…
    … send a request that will make sense…
    … and make sense of the response
  So sure, WS will help
38 But Web Services won’t…
  Allow the data center to control decisions the client makes
  Assist us in implementing naming and discovery in scalable cluster-style services
    How to load balance? How to replicate data? What precisely happens if a node crashes or one is launched while the service is up?
  Help with dynamics. For example, best server for a given client can be a function of load but also affinity, recent tasks, etc.
39 How we do it now
  Client queries directory to find the service
  Server has several options:
    Web pages with dynamically created URLs
      Server can point to different places, by changing host names
      Content hosting companies remap URLs on the fly. E.g. … (reroutes requests for … to Akamai)
    Server can control mapping from host to IP addr.
      Must use short-lived DNS records; overheads are very high!
      Can also intercept incoming requests and redirect on the fly
40 Why this isn’t good enough
  The mechanisms aren’t standard and are hard to implement
    Akamai, for example, does content hosting using all sorts of proprietary tricks
  And they are costly
    The DNS control mechanisms force DNS cache misses and hence many requests do RPC to the data center
  We lack a standard, well supported, solution!
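The cost of short-lived DNS records can be estimated with a small simulation: a caching resolver that is asked for the same name once per second, with a long and then a short time-to-live. The resolver class, the host name, the address, and the TTL values of 300 and 5 seconds are assumptions chosen for illustration.

```python
class CachingResolver:
    """Toy DNS cache: asks the authoritative server only when an entry expired."""

    def __init__(self, authoritative, ttl_seconds):
        self.authoritative = authoritative  # function: name -> IP address
        self.ttl = ttl_seconds
        self.cache = {}                     # name -> (ip, expiry time)
        self.upstream_queries = 0

    def resolve(self, name, now):
        entry = self.cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]                 # cache hit: no traffic to the data center
        self.upstream_queries += 1          # cache miss: a round trip is needed
        ip = self.authoritative(name)
        self.cache[name] = (ip, now + self.ttl)
        return ip


def count_upstream_queries(ttl_seconds, duration_seconds=600):
    """One lookup per second for the given duration; return the miss count."""
    resolver = CachingResolver(lambda name: "192.0.2.10", ttl_seconds)
    for second in range(duration_seconds):
        resolver.resolve("www.example.com", now=second)
    return resolver.upstream_queries


long_ttl_misses = count_upstream_queries(ttl_seconds=300)
short_ttl_misses = count_upstream_queries(ttl_seconds=5)
print(long_ttl_misses, short_ttl_misses)
```

Under these assumptions the short TTL gives the data center fine-grained control over where clients go, at the price of 60 times as many lookups reaching it.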
41 Coming up?
  How content is managed in even larger systems, that have multiple data centers
  The main example is Akamai…