docs: minor style and typo fixes.
Prashanth Mundkur committed Aug 31, 2010
1 parent 335114a commit e36c78c
Showing 4 changed files with 11 additions and 11 deletions.
6 changes: 3 additions & 3 deletions README.rst
@@ -3,13 +3,13 @@ Disco - Massive data, Minimal code
 ==================================
 
 Disco is an implementation of the `Map-Reduce framework
-<http://en.wikipedia.org/wiki/MapReduce>`_ for distributed computing. As
+<http://en.wikipedia.org/wiki/MapReduce>`_ for distributed computing. Like
 the original framework, which was publicized by Google, Disco supports
-parallel computations over large data sets on unreliable cluster of
+parallel computations over large data sets on an unreliable cluster of
 computers. This makes it a perfect tool for analyzing and processing large
 datasets without having to bother about difficult technical questions
 related to distributed computing, such as communication protocols, load
-balancing, locking, job scheduling or fault tolerance, which are taken
+balancing, locking, job scheduling or fault tolerance, all of which are taken
 care by Disco.
 
 See `discoproject.org <http://discoproject.org>`_ for more information.
6 changes: 3 additions & 3 deletions doc/start/ddfs.rst
@@ -31,7 +31,7 @@ open-source projects such as `Hadoop Distributed Filesystem (HDFS)
 DDFS is a low-level component in the Disco stack, taking care of data
 *distribution*, *replication*, *persistence*, *addressing* and *access*.
 It does not provide a sophisticated query facility in itself but it is
-**tightly integrated** with Disco jobs and Discodex indexing component,
+**tightly integrated** with Disco jobs and the Discodex indexing component,
 which can be used to build application-specific query interfaces. Disco
 can store results of Map/Reduce jobs to DDFS, providing persistence and
 easy access for processed data.
@@ -158,7 +158,7 @@ atomicity of metadata operations.
 Each storage node contains a number of disks or volumes (`vol0..volN`),
 assigned to DDFS by mounting them under ``$DDFS_ROOT/vol0`` ...
 ``$DDFS_ROOT/volN``. On each volume, DDFS creates two directories,
-``tag`` and ``blob``, for storing tags anb blobs, respectively. DDFS
+``tag`` and ``blob``, for storing tags and blobs, respectively. DDFS
 monitors available disk space on each volume on regular intervals for
 load balancing. New blobs are stored to the least loaded volumes.

@@ -316,7 +316,7 @@ comments in the source code. This discussion is mainly interesting to
 developers and advanced users of DDFS and Disco.
 
 As one might gather from the sections above, metadata (tag) operations
-are the hard core of DDFS, mainly due to their transactional nature.
+are the central core of DDFS, mainly due to their transactional nature.
 Another non-trivial part of DDFS is re-replication and garbage
 collection of tags and blobs. These issues are discussed in more detail
 below.
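The volume layout fixed in the `doc/start/ddfs.rst` hunk above (a ``tag`` and a ``blob`` directory created under each of `vol0..volN`) can be sketched as follows. This is only an illustration of the directory structure, not DDFS code; the temporary directory stands in for ``$DDFS_ROOT`` and the two-volume count is an arbitrary assumption:

```python
import os
import tempfile

# Stand-in for $DDFS_ROOT; a real node mounts its disks here.
ddfs_root = tempfile.mkdtemp()

# On each volume vol0..volN, DDFS keeps a ``tag`` and a ``blob`` directory.
for vol in ("vol0", "vol1"):
    for subdir in ("tag", "blob"):
        os.makedirs(os.path.join(ddfs_root, vol, subdir))

print(sorted(os.listdir(os.path.join(ddfs_root, "vol0"))))  # ['blob', 'tag']
```

New blobs would then land in the ``blob`` directory of whichever volume is least loaded, per the monitoring described above.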
6 changes: 3 additions & 3 deletions doc/start/install.rst
@@ -27,7 +27,7 @@ story short, Disco works as follows:
 * Disco users start Disco jobs in Python scripts.
 * Jobs requests are sent over HTTP to the master.
 * Master is an Erlang process that receives requests over HTTP.
-* Master launches another Erlang process, worker supervisor, on each node over
+* Master launches another Erlang process, the worker supervisor, on each node over
   SSH.
 * Worker supervisors run Disco jobs as Python processes.
 
@@ -121,7 +121,7 @@ On the master node, start the Disco master by executing ``disco start``.
 
 You can easily integrate ``disco`` into your system's startup sequence.
 For instance, you can see how ``debian/disco-master.init`` and
-``debian/disco-node.init`` are implemented in the Disco's ``debian``
+``debian/disco-node.init`` are implemented in Disco's ``debian``
 branch.
 
 If Disco has started up properly, you should see ``beam.smp`` running on your
@@ -240,7 +240,7 @@ itself.
 If the machine where you run the script can access the master node but
 not other nodes in the cluster, you need to set the environment variable
 ``DISCO_PROXY=http://master:8989``. The proxy address should be the
-same as the master's above. This makes Disco to fetch results through
+same as the master's above. This makes Disco fetch results through
 the master node, instead of connecting to the nodes directly.
 
 If the script produces some results, congratulations, you have a
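The ``DISCO_PROXY`` setting corrected in the `doc/start/install.rst` hunk above can also be made from inside the Python script itself, before the job is submitted. A minimal sketch; the address is the example value from the text, not a real master:

```python
import os

# Route result fetches through the master instead of contacting
# the cluster nodes directly (useful when only the master is reachable).
os.environ["DISCO_PROXY"] = "http://master:8989"

print(os.environ["DISCO_PROXY"])  # http://master:8989
```

Setting it in the process environment this way is equivalent to exporting the variable in the shell before running the script.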
4 changes: 2 additions & 2 deletions lib/disco/func.py
@@ -23,7 +23,7 @@
 The task uses stderr to signal events to the master.
 You can raise a :class:`disco.error.DataError`,
 to abort the task on this node and try again on another node.
-It is usually a best to let the task fail if any exceptions occur:
+It is usually best to let the task fail if any exceptions occur:
 do not catch any exceptions from which you can't recover.
 When exceptions occur, the disco worker will catch them and
 signal an appropriate event to the master.
@@ -136,7 +136,7 @@ def reduce(input_stream, output_stream, params):
 :param input_stream: :class:`disco.func.InputStream` object that is used
 to iterate through input entries.
-:param output_stream: :class:`disco.func.InputStream` object that is used
+:param output_stream: :class:`disco.func.OutputStream` object that is used
 to output results.
 :param params: the :class:`disco.core.Params` object specified
 by the *params* parameter in :class:`disco.core.JobDict`.
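The docstring fixed in the `lib/disco/func.py` hunk above documents the signature ``reduce(input_stream, output_stream, params)``. A word-count-style sketch of such a function is below; the tiny ``_Out`` collector is defined here only so the example runs standalone, and is not the real :class:`disco.func.OutputStream` (the ``add(key, value)`` method is an assumption modeled on classic Disco examples):

```python
def reduce(input_stream, output_stream, params):
    # Sum the counts per word from (word, count) input entries,
    # then emit one (word, total) result per word.
    totals = {}
    for word, count in input_stream:
        totals[word] = totals.get(word, 0) + int(count)
    for word, total in sorted(totals.items()):
        output_stream.add(word, total)

# Hypothetical stand-in output stream: collects (key, value) pairs.
class _Out:
    def __init__(self):
        self.results = []
    def add(self, key, value):
        self.results.append((key, value))

out = _Out()
reduce([("a", 1), ("b", 2), ("a", 3)], out, None)
print(out.results)  # [('a', 4), ('b', 2)]
```

In a real job, Disco supplies the input and output stream objects and the :class:`disco.core.Params` instance; only the function body is user code.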
