docs: minor style and typo fixes.
Prashanth Mundkur committed Aug 31, 2010
1 parent 335114a commit e36c78c
Showing 4 changed files with 11 additions and 11 deletions.
6 changes: 3 additions & 3 deletions README.rst
@@ -3,13 +3,13 @@ Disco - Massive data, Minimal code
 ==================================
 
 Disco is an implementation of the `Map-Reduce framework
-<http://en.wikipedia.org/wiki/MapReduce>`_ for distributed computing. As
+<http://en.wikipedia.org/wiki/MapReduce>`_ for distributed computing. Like
 the original framework, which was publicized by Google, Disco supports
-parallel computations over large data sets on unreliable cluster of
+parallel computations over large data sets on an unreliable cluster of
 computers. This makes it a perfect tool for analyzing and processing large
 datasets without having to bother about difficult technical questions
 related to distributed computing, such as communication protocols, load
-balancing, locking, job scheduling or fault tolerance, which are taken
+balancing, locking, job scheduling or fault tolerance, all of which are taken
 care by Disco.
 
 See `discoproject.org <http://discoproject.org>`_ for more information.
6 changes: 3 additions & 3 deletions doc/start/ddfs.rst
@@ -31,7 +31,7 @@ open-source projects such as `Hadoop Distributed Filesystem (HDFS)
 DDFS is a low-level component in the Disco stack, taking care of data
 *distribution*, *replication*, *persistence*, *addressing* and *access*.
 It does not provide a sophisticated query facility in itself but it is
-**tightly integrated** with Disco jobs and Discodex indexing component,
+**tightly integrated** with Disco jobs and the Discodex indexing component,
 which can be used to build application-specific query interfaces. Disco
 can store results of Map/Reduce jobs to DDFS, providing persistence and
 easy access for processed data.
@@ -158,7 +158,7 @@ atomicity of metadata operations.
 Each storage node contains a number of disks or volumes (`vol0..volN`),
 assigned to DDFS by mounting them under ``$DDFS_ROOT/vol0`` ...
 ``$DDFS_ROOT/volN``. On each volume, DDFS creates two directories,
-``tag`` and ``blob``, for storing tags anb blobs, respectively. DDFS
+``tag`` and ``blob``, for storing tags and blobs, respectively. DDFS
 monitors available disk space on each volume on regular intervals for
 load balancing. New blobs are stored to the least loaded volumes.

@@ -316,7 +316,7 @@ comments in the source code. This discussion is mainly interesting to
 developers and advanced users of DDFS and Disco.
 
 As one might gather from the sections above, metadata (tag) operations
-are the hard core of DDFS, mainly due to their transactional nature.
+are the central core of DDFS, mainly due to their transactional nature.
 Another non-trivial part of DDFS is re-replication and garbage
 collection of tags and blobs. These issues are discussed in more detail
 below.
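The volume layout fixed in the `doc/start/ddfs.rst` hunk above (a ``tag`` and a ``blob`` directory created under each of `vol0..volN`) can be sketched as follows. This is only an illustration of the directory structure, not DDFS code; the temporary directory stands in for ``$DDFS_ROOT`` and the two-volume count is an arbitrary assumption:

```python
import os
import tempfile

# Stand-in for $DDFS_ROOT; a real node mounts its disks here.
ddfs_root = tempfile.mkdtemp()

# On each volume vol0..volN, DDFS keeps a ``tag`` and a ``blob`` directory.
for vol in ("vol0", "vol1"):
    for subdir in ("tag", "blob"):
        os.makedirs(os.path.join(ddfs_root, vol, subdir))

print(sorted(os.listdir(os.path.join(ddfs_root, "vol0"))))  # ['blob', 'tag']
```

New blobs would then land in the ``blob`` directory of whichever volume is least loaded, per the monitoring described above.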
6 changes: 3 additions & 3 deletions doc/start/install.rst
@@ -27,7 +27,7 @@ story short, Disco works as follows:
 * Disco users start Disco jobs in Python scripts.
 * Jobs requests are sent over HTTP to the master.
 * Master is an Erlang process that receives requests over HTTP.
-* Master launches another Erlang process, worker supervisor, on each node over
+* Master launches another Erlang process, the worker supervisor, on each node over
   SSH.
 * Worker supervisors run Disco jobs as Python processes.
 
@@ -121,7 +121,7 @@ On the master node, start the Disco master by executing ``disco start``.
 
 You can easily integrate ``disco`` into your system's startup sequence.
 For instance, you can see how ``debian/disco-master.init`` and
-``debian/disco-node.init`` are implemented in the Disco's ``debian``
+``debian/disco-node.init`` are implemented in Disco's ``debian``
 branch.
 
 If Disco has started up properly, you should see ``beam.smp`` running on your
@@ -240,7 +240,7 @@ itself.
 If the machine where you run the script can access the master node but
 not other nodes in the cluster, you need to set the environment variable
 ``DISCO_PROXY=http://master:8989``. The proxy address should be the
-same as the master's above. This makes Disco to fetch results through
+same as the master's above. This makes Disco fetch results through
 the master node, instead of connecting to the nodes directly.
 
 If the script produces some results, congratulations, you have a
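The ``DISCO_PROXY`` setting corrected in the `doc/start/install.rst` hunk above can also be made from inside the Python script itself, before the job is submitted. A minimal sketch; the address is the example value from the text, not a real master:

```python
import os

# Route result fetches through the master instead of contacting
# the cluster nodes directly (useful when only the master is reachable).
os.environ["DISCO_PROXY"] = "http://master:8989"

print(os.environ["DISCO_PROXY"])  # http://master:8989
```

Setting it in the process environment this way is equivalent to exporting the variable in the shell before running the script.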
4 changes: 2 additions & 2 deletions lib/disco/func.py
@@ -23,7 +23,7 @@
 The task uses stderr to signal events to the master.
 You can raise a :class:`disco.error.DataError`,
 to abort the task on this node and try again on another node.
-It is usually a best to let the task fail if any exceptions occur:
+It is usually best to let the task fail if any exceptions occur:
 do not catch any exceptions from which you can't recover.
 When exceptions occur, the disco worker will catch them and
 signal an appropriate event to the master.
@@ -136,7 +136,7 @@ def reduce(input_stream, output_stream, params):
 :param input_stream: :class:`disco.func.InputStream` object that is used
 to iterate through input entries.
-:param output_stream: :class:`disco.func.InputStream` object that is used
+:param output_stream: :class:`disco.func.OutputStream` object that is used
 to output results.
 :param params: the :class:`disco.core.Params` object specified
 by the *params* parameter in :class:`disco.core.JobDict`.
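The docstring fixed in the `lib/disco/func.py` hunk above documents the signature ``reduce(input_stream, output_stream, params)``. A word-count-style sketch of such a function is below; the tiny ``_Out`` collector is defined here only so the example runs standalone, and is not the real :class:`disco.func.OutputStream` (the ``add(key, value)`` method is an assumption modeled on classic Disco examples):

```python
def reduce(input_stream, output_stream, params):
    # Sum the counts per word from (word, count) input entries,
    # then emit one (word, total) result per word.
    totals = {}
    for word, count in input_stream:
        totals[word] = totals.get(word, 0) + int(count)
    for word, total in sorted(totals.items()):
        output_stream.add(word, total)

# Hypothetical stand-in output stream: collects (key, value) pairs.
class _Out:
    def __init__(self):
        self.results = []
    def add(self, key, value):
        self.results.append((key, value))

out = _Out()
reduce([("a", 1), ("b", 2), ("a", 3)], out, None)
print(out.results)  # [('a', 4), ('b', 2)]
```

In a real job, Disco supplies the input and output stream objects and the :class:`disco.core.Params` instance; only the function body is user code.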
