Project-Related Problems
Distributed computing projects may generate data that is proprietary to
private industry, even though the process of generating that data involves the
resources of volunteers. This may result in controversy as private industry
profits from data generated with the aid of volunteers. In addition, some
distributed computing projects, such as biology projects that aim to develop
thousands or millions of "candidate molecules" for solving various medical
problems, may create vast amounts of raw data. This raw data may be useless by
itself without refinement or without testing of candidate results in
real-world experiments. Such refinement and experimentation may be so
expensive and time-consuming that sifting through the data can take decades.
Until the data is refined, no benefits can be obtained from the computing work.
Other projects suffer from a lack of planning on the part of their
well-meaning originators. These poorly planned projects may not generate
tangible results, or may not produce data that ultimately results in finished,
innovative scientific papers. Sensing that a project is not generating useful
data, its managers may decide to terminate it abruptly without definitive
results, wasting the electricity and computing resources consumed by the
project. Volunteers may feel disappointed and abused by such outcomes. There
is an obvious opportunity cost in devoting time and energy to a project that
ultimately proves useless, when that computing power could have been devoted
to a better-planned distributed computing project generating useful, concrete
results.
Another problem with distributed computing projects is that they may devote
resources to problems that may not ultimately be soluble, or to problems best
deferred until desktop computing power becomes fast enough to make pursuing
such solutions practical. Some projects also attempt to find solutions by
number-crunching mathematical or physical models. With such projects there is
the risk that the model is not designed well enough to efficiently generate
concrete solutions. The effectiveness of a distributed computing project is
therefore determined largely by the sophistication of its creators.
Architecture
Various hardware and software architectures are used for distributed
computing. At a lower level, it is necessary to interconnect multiple CPUs with
some sort of network, regardless of whether that network is printed onto a
circuit board or made up of loosely coupled devices and cables. At a higher
level, it is necessary to interconnect processes running on those CPUs with some
sort of communication system.
Distributed programming typically falls into one of several basic
architectures or categories: client-server, 3-tier, N-tier, tightly coupled
(clustered), peer-to-peer, or space-based.
- Client-server: Smart client code contacts the server for data, then
formats and displays it to the user. Input at the client is committed back
to the server when it represents a permanent change (see the sketch
following this list).
- 3-tier architecture: Three-tier systems move the client intelligence to
a middle tier so that stateless clients can be used. This simplifies
application deployment. Most web applications are 3-tier.
- N-tier architecture: N-tier typically refers to web applications that
further forward their requests to other enterprise services. This type of
application is the one most responsible for the success of application
servers.
- Tightly coupled (clustered): typically refers to a set of highly
integrated machines that run the same process in parallel, subdividing the
task into parts that are worked on individually by each machine and then
combined to produce the final result.
- Peer-to-peer: an architecture with no special machine or machines that
provide a service or manage the network resources. Instead, all
responsibilities are uniformly divided among all machines, known as peers.
Peers can serve both as clients and as servers.
- Space-based: refers to an infrastructure that creates the illusion
(virtualization) of a single address space. Data are transparently
replicated according to application needs, achieving decoupling in time,
space, and reference.
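As an illustration of the client-server category above, here is a minimal
sketch in Python using only the standard library. The address, the port, the
single "GET_TIME" query, and running the server in a background thread of the
same process are illustrative assumptions for the demo, not features of any
particular project.

    import socket
    import threading
    import time

    HOST, PORT = "127.0.0.1", 9090     # assumed address and port for this demo

    def server():
        # Accept one connection, answer one query, then exit.
        with socket.create_server((HOST, PORT)) as srv:
            conn, _addr = srv.accept()
            with conn:
                request = conn.recv(1024).decode()
                if request == "GET_TIME":              # hypothetical query name
                    conn.sendall(time.ctime().encode())

    def client():
        # Smart client: contacts the server for data, then formats and
        # displays it to the user.
        with socket.create_connection((HOST, PORT)) as sock:
            sock.sendall(b"GET_TIME")
            reply = sock.recv(1024).decode()
            print("Server time:", reply)

    threading.Thread(target=server, daemon=True).start()
    time.sleep(0.2)                    # give the listener a moment to start
    client()

In a real deployment the client and server would run on separate machines and
the protocol would carry application data rather than a timestamp, but the
division of labor is the same.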
Another basic aspect of distributed computing architecture is the method of
communicating and coordinating work among concurrent processes. Through
various message-passing protocols, processes may communicate directly with
one another, typically in a master/slave relationship (a sketch follows
below). Alternatively, a "database-centric" architecture can enable
distributed computing to be done without any form of direct inter-process
communication, by utilizing a shared database.
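To make the master/slave message-passing idea concrete, the following sketch
uses Python's multiprocessing queues as the communication system. The task
(summing chunks of a list of numbers), the chunk size, and the worker count
are illustrative assumptions.

    import multiprocessing as mp

    def worker(tasks, results):
        # Slave process: receive subtasks, compute, send partial results back.
        while True:
            chunk = tasks.get()
            if chunk is None:          # sentinel from the master: stop working
                break
            results.put(sum(chunk))

    if __name__ == "__main__":
        tasks, results = mp.Queue(), mp.Queue()
        workers = [mp.Process(target=worker, args=(tasks, results))
                   for _ in range(4)]
        for w in workers:
            w.start()

        # Master: subdivide the task and distribute the parts.
        data = list(range(1000))
        chunks = [data[i:i + 250] for i in range(0, len(data), 250)]
        for chunk in chunks:
            tasks.put(chunk)
        for _ in workers:              # one stop sentinel per worker
            tasks.put(None)

        # Master: put the partial results back together.
        total = sum(results.get() for _ in chunks)
        for w in workers:
            w.join()
        print(total)                   # 499500

The same queue-based pattern scales from processes on one machine to message
brokers across a network; a database-centric design would instead have workers
coordinate through a shared database rather than through direct messages.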