From da0193414c30dc8df83c1340c89e77196914dc4e Mon Sep 17 00:00:00 2001
From: joelschar
Date: Thu, 21 Jun 2018 16:51:48 +0200
Subject: [PATCH 1/2] tiny mistake

---
 lectures/01-Lecture1-JavaIOs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lectures/01-Lecture1-JavaIOs.md b/lectures/01-Lecture1-JavaIOs.md
index 98d7a3d..5312168 100644
--- a/lectures/01-Lecture1-JavaIOs.md
+++ b/lectures/01-Lecture1-JavaIOs.md
@@ -150,7 +150,7 @@ Let us look at some elements of the code:
 * Once the streams have been opened, the logic is very simple. We use a loop to consume all bytes, one by one, from the input stream. Each time that we read a byte, we write it immediately to the output stream. We can see that the `read()` method returns an int. This value is -1 if the end of the stream has been reached. Otherwise, it has a value between 0 and 255 (we are reading a single byte).
-* Note that **this code is not very efficient and that copying large files would be painfully slow**. We will see later that is is much better to read/write blocks of bytes in a single read operation, or to use buffered streams.
+* Note that **this code is not very efficient and that copying large files would be painfully slow**. We will see later that it is much better to read/write blocks of bytes in a single read operation, or to use buffered streams.

From 27ef27c10aa34357db8154f12f8afd3723a3d986 Mon Sep 17 00:00:00 2001
From: joelschar
Date: Thu, 21 Jun 2018 22:14:17 +0200
Subject: [PATCH 2/2] Update 01-Lecture1-JavaIOs.md

---
 lectures/01-Lecture1-JavaIOs.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lectures/01-Lecture1-JavaIOs.md b/lectures/01-Lecture1-JavaIOs.md
index 5312168..46108e9 100644
--- a/lectures/01-Lecture1-JavaIOs.md
+++ b/lectures/01-Lecture1-JavaIOs.md
@@ -223,7 +223,7 @@ Binary value Decimal value Character
 **ASCII** worked well for many years, but there are many **languages with alphabets much larger than the latin alphabet**.
 For these languages, having only 7 bits (128 values) to represent characters is simply not enough. This is why several other character encoding systems have been developed over time. This has introduced quite a bit of complexity, especially when conversion from one encoding system to another is required.
-In order to deal with internationalization, Java decided to use the **Unicode** standard to handle characters. When a Java program manipulates a character in memory, it uses **two bytes**. These two bytes are used to store what Unicode calls a **code point** (1'114'112 code points are defined in the range 0 to 10FFFF). A code point is a numeric value, which is often represented as `U+xxxxxx`, where `xxxxxx` is an hexadecimal value. What is useful (but also a bit confusing), is that the code points used to identify ASCII characters are the values defined in the ASCII encoding system. Huh? Take the character 'B' for instance. In ASCII, it is encoded with the decimal value 66. In Unicode, it has been decided that the code point `U+0042` (yes, 42 is the hexademical value of 66) would be used to identify the character 'B'.
+In order to deal with internationalization, Java decided to use the **Unicode** standard to handle characters. When a Java program manipulates a character in memory, it uses **two bytes**. These two bytes are used to store what Unicode calls a **code point** (1'114'112 code points are defined in the range 0 to 10FFFF). A code point is a numeric value, which is often represented as `U+xxxxxx`, where `xxxxxx` is an hexadecimal value. What is useful (but also a bit confusing), is that the code points used to identify ASCII characters are the values defined in the ASCII encoding system. Huh? Take the character 'B' for instance. In ASCII, it is encoded with the decimal value 66. In Unicode, it has been decided that the code point `U+0042` (yes, 42 is the hexadecimal value of 66) would be used to identify the character 'B'.
 Unicode is actually not a character encoding system. When you have a code point, you still need to decide how you are going to encode it as a series of bits. Sure, you could use 16 bits (4 bytes for each of the 4 hexadecimal values making up the code point) for each encoded character. But in general, that would be a waste. Think of a text written in english, with only latin characters. Since the code points of all characters are below 255,
@@ -276,7 +276,7 @@ That is pretty much what the `BufferedXXX` classes are doing. They manage an int
 ```
 public void processInputStream(InputStream is) {
-    // I don't know to which source "is" is connected. It is also possible that is is already
+    // I don't know to which source "is" is connected. It is also possible that it is already
     // a chain of several filters wrapping each other. I don't really care, what I want is to
     // make sure that I read bytes in an efficient way.