Monday, November 15, 2004

Today I read an essay available from Sun Microsystems entitled "A Note on Distributed Computing" (SMLI TR-94-29). This is an older document, written by Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall ten years ago, bearing a publication date of November 1994. In this essay, the authors note the difference between local and remote resources, and what bearing this has on developing distributed applications.

Specifically, there are three areas of difference: latency, memory access, and concurrency. The authors argue that one cannot treat local versus remote resources in a generic manner, and that the above areas of difference are the thorns in the side of unified resource models. Moreover, the issue isn't something that can be papered over through language and implementation, but lies far deeper within the architecture, and, as I would add, in the very nature of language itself. Before you think that I've fallen off my rocker, let me introduce, briefly, Ferdinand de Saussure, and, of particular note, how he is relevant to semiotics. After all, isn't the juncture of language and computer science the nature of the symbol?

In an extremely abbreviated synopsis/reduction of Saussure, let's just say that he viewed language as bifurcated into meaning and symbol. Meaning is the mutually understood concept represented by a word or symbol. In other words, in order to understand a stanza, or even mote, of a dialog, a common protocol must be implemented. Both parties must be able to recognize the objects passed, and be able to parse them in more-or-less similar fashion.

The authors of the essay in question see communications and language as split. Likewise, one could assert that language and communications among people are two separate subjects, but such an argument is reliant, especially, upon post-Chomsky views of language. However, Saussure's initial bifurcation of language is useful in propping up an analogy between natural language and distributed computing. Within the limits of natural language, then, we see that when the authors write, "Just making the communications paradigm the same as the language paradigm is insufficient to make programming distributed programs easier, because communicating between the parts of a distributed application is not the difficult part of that application" (5), we are looking at something more intrinsic than transport - i.e., the communicative dialog - as being the trouble-spot.

One thing of interest when comparing computer science to natural language is the inversion of order. Instead of the OSI seven-layer model, where language would reside at the top, in human communication language sits near the bottom, at least when linguists such as Chomsky and Bickerton are taken into account. The idea of a proto-language underlying our own communications, if viewed as a part of language itself (à la Bickerton) rather than simply part of the underlying infrastructure, would seem to either a) produce somewhat of an inversion to the OSI model, or b) mandate further divisions within the layers. Of course, there is a quite valid apples-and-oranges criticism that could be leveled here.

Returning to the essay, a fundamental premise of the authors' point is that local and remote resources differ with regard to memory address space and latency. Thus, they write, "Ignoring the difference between the performance of local and remote invocations can lead to designs whose implementations are virtually assured of having performance problems because the design requires a large amount of communication between components that are in different address spaces and on different machines" (5). Yet, it seems that a simpler solution is possible through our linguistic model: to perceive the local and remote as the same through a language-based equivalency. Therefore, when they suggest, "Whether or not it will ever become possible to mask the efficiency difference between a local object invocation and a distributed object invocation is not answerable a priori" (6), it would seem that the very question is not particularly useful.
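To make the performance concern in the quoted passage concrete, here is a minimal sketch in Python. It is my own illustration, not anything from the paper: the class `SimulatedRemoteStore` and its methods `get` and `get_many` are hypothetical names, and a simple counter stands in for real network round trips. It only shows how an interface designed as if every call were local can multiply the number of crossings between address spaces.

```python
class SimulatedRemoteStore:
    """Hypothetical remote key-value store. No network is involved;
    a counter stands in for the cost of each round trip."""

    def __init__(self, data):
        self._data = dict(data)
        self.round_trips = 0

    def get(self, key):
        # One request, one reply: a single round trip per key.
        self.round_trips += 1
        return self._data[key]

    def get_many(self, keys):
        # One request carrying all the keys: a single round trip in total.
        self.round_trips += 1
        return [self._data[k] for k in keys]


keys = ["a", "b", "c", "d"]
data = {k: k.upper() for k in keys}

# "Chatty" design: written as if each lookup were a cheap local call.
chatty = SimulatedRemoteStore(data)
chatty_result = [chatty.get(k) for k in keys]

# "Batched" design: written with the remote boundary in mind.
batched = SimulatedRemoteStore(data)
batched_result = batched.get_many(keys)
```

Both designs return the same values, but the chatty one pays one simulated round trip per key while the batched one pays a single round trip for the whole list.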

Oddly, the authors seem to realize the wrong-headedness of a model based upon unified treatment of local versus remote resources. Regarding local versus remote calls they write, "it would be unwise to construct a programming paradigm that treated the two calls as essentially similar" (6). There are really two sides to my criticism here: 1) the model of local versus remote as related to a model of thought versus communicative language, and 2) messages across entities as analogous to human communication itself. The former reinforces the Saussurian analogy in that language existing as an exchange of symbols has certain constraints that thought lacks: meaning between entities must be agreed upon.

Also along the lines of the first side to my criticism is the authors' statement regarding memory access. They write, "A more fundamental (but still obvious) difference between local and remote computing concerns the access to memory in the two cases - specifically in the use of pointers" (6). Pointers would, of course, exist beneath a linguistic model rooted in Saussurian theory, and have more to do with access to information. In fact, pointers may be more relevant, paradoxically, in a higher-level framework such as Blending Theory (aka Conceptual Integration).

One of the main issues with pointers, though, is that, like meaning, the internal representation cannot be shared, only agreed-upon translations of the symbols. In similar fashion, the authors write, "In requiring that programmers learn such a language [one which does not use address-space-relative pointers], moreover, one gives up the complete transparency between local and distributed computing" (6), again a rather neo-Saussurian assertion.
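The point about pointers can be sketched in a few lines of Python. This is an illustration under my own assumptions, not the authors' example: `send_across_boundary` is a hypothetical helper, and a `pickle` round trip inside one process stands in for a hop between address spaces. What crosses the boundary is an agreed-upon byte representation, never the reference itself.

```python
import pickle


def send_across_boundary(obj):
    """Hypothetical stand-in for a remote hop: serialize the object,
    then rebuild it as the 'other side' would."""
    return pickle.loads(pickle.dumps(obj))


local = {"name": "sign", "values": [1, 2, 3]}

# A local reference: a second name for the very same object in memory.
alias = local

# A "remote" copy: rebuilt from bytes, sharing nothing with the original.
remote = send_across_boundary(local)

# A change made through the local reference is visible locally...
alias["values"].append(4)

# ...but the copy on the other side never sees it.
local_values = local["values"]
remote_values = remote["values"]
```

After the append, `local_values` holds four items while `remote_values` still holds the original three: the two sides agreed on a translation of the symbol, not on a shared internal representation.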

Further, "Even if one were to provide a language that did not allow obtaining address-space-relative pointers to objects (or returned an object reference whenever such a pointer was requested), one would need to provide an equivalent way of making cross-address space reference to entities other than objects" (6-7). Here we can perhaps map objects to symbols and address-space-relative pointers to meaning. Therefore, the issue begs Saussure’s entire model! That cross-address space reference would be agreed-upon meaning, or some intermediate framework for parsing the language structures.

The authors still grapple with the bifurcation, however, adding, "The danger lies in promoting the myth that ‘remote access and local access are exactly the same’ and not enforcing the myth" (7). However, I would argue that this situation would violate Saussure's definition of language, and that, since meaning is not agreed upon, the two systems would be engaged in a dialog of gibberish. That is to say, I believe the authors are noting the symptom, not the problem.

Turning to the possibility of an intermediate framework for parsing, the authors note that "[t]he alternative [to making local and remote access the same] is to explain the difference between local and remote using an interface definition language" (7). This approach sidesteps the issue, though. The authors continue, "By not masking the difference, the programmer is able to learn when to use one method of access and when to use the other," something that essentially boils down to speaking two tongues when a pidgin or creole would satisfy all involved parties.

However, a creole seems to be outside the realm of possibility for the authors. From their standpoint, "[w]hen we turn to problems introduced to distributed computing by partial failure and concurrency, however, it is not clear that such a unification [between local and remote memory access on one hand, and latency issues on the other] is even conceptually possible" (7). As a possible example to refute this supposition, I would propose Plan 9 by Bell Labs. There is an operating system that DOES treat local and remote resources identically, since it was built, from the ground up, to work in a distributed environment. Granted, this paper was published in 1994, the same year that Plan 9 was opened to the public (as I recall), so the authors may not have been privy to its possibilities.

Returning to the authors' assessment of the situation, it is clear that they do not consider the type of distributed model used by Plan 9. "Not only is the failure of the distributed components independent, but there is no common agent that is able to determine what component has failed and inform the other components of that failure, no global status that can be examined that allows determination of exactly what error has occurred" (7). Perhaps the Bell Labs approach is decidedly outside a linguistic model such as Saussure’s. Yet, perhaps it is merely a question of where one draws the boundaries between self and group. The above statement by the authors does seem to allow the commonality between software and language to be upheld. The difference is that software, in this model, wants/needs to know what broke down in communications, but in spoken discourse, such a requirement would be seen as obnoxious. If I cannot listen further to the conversation, or do not understand a stanza of the dialog, then I need to go no further than propriety demands - and the rules of propriety can be less rigid and formalized. Yet, the difference is not on the "client" side, but on the part of the "speaker." Again we see the boundaries between self and group/other as a major criterion for the model.
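The quoted passage on partial failure can also be sketched in Python. Again this is only my illustration under stated assumptions: `RemoteCounter`, its `fail_before` and `fail_after` switches, and the helper `call` are all hypothetical, and the "remote" service is simulated in-process. The sketch shows why a caller without any global status cannot tell a lost request from a lost reply.

```python
class RemoteCounter:
    """Hypothetical remote service, simulated in-process."""

    def __init__(self):
        self.value = 0

    def increment(self, fail_before=False, fail_after=False):
        if fail_before:
            # The request was lost: the work never happened.
            raise TimeoutError("no reply")
        self.value += 1
        if fail_after:
            # The reply was lost: the work DID happen.
            raise TimeoutError("no reply")
        return self.value


def call(counter, **failure):
    """What the caller can observe: a result, or silence."""
    try:
        return ("ok", counter.increment(**failure))
    except TimeoutError as exc:
        return ("unknown", str(exc))


counter = RemoteCounter()
lost_request = call(counter, fail_before=True)
lost_reply = call(counter, fail_after=True)
```

From the caller's side, `lost_request` and `lost_reply` are identical, yet the counter was incremented in only one of the two cases. In conversational terms, the listener went silent, and the speaker cannot know how much was heard.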

Concluding the essay, the authors note that "One of two paths must be chosen if one is going to have a unified model. ... The first path is to treat all objects as if they were local and design all interfaces as if the objects calling them, and being called by them, were local. ... The other path is to design all interfaces as if they were remote" (8). Further, "Accepting the fundamental difference between local and remote objects does not mean that either sort of object will require its interface to be defined differently" (11). This seems to be a bit ambiguous, perhaps a function of a seeming contradiction. Yet, applying the linguistic model again, we can perhaps see the local/remote paradox as similar to our own unconscious shift from thought to language.

Overall, the essay suffers from a bit of redundancy and repetitiveness. Further, there are paradoxes not drawn out and dealt with adequately. Yet, the basic problems are well identified, and the argument that memory, latency, and concurrency are the major cruxes of the issue seems convincing. For an essay a decade old it is surprising just how relevant the issues are today. For my part, I think that the comparison to Saussurian linguistics provides an interesting and useful way to approach the issues identified by these authors.

