<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Old Young Boys Club</title>
<subtitle>Soccer, Joke and Java</subtitle>
<link href="/atom.xml" rel="self"/>
<link href="http://www.oldyoungboys.club/"/>
<updated>2019-09-03T20:29:56.243Z</updated>
<id>http://www.oldyoungboys.club/</id>
<author>
<name>Nathanael Yang</name>
</author>
<generator uri="http://hexo.io/">Hexo</generator>
<entry>
<title>“Show Me the Code! Be a Java Warrior”: A Conscientiously Made Udemy Course, Now Free!</title>
<link href="http://www.oldyoungboys.club/Udemy-Free-Course-Show-Me-the-Code-Java-Warrior-Part-I-Released/"/>
<id>http://www.oldyoungboys.club/Udemy-Free-Course-Show-Me-the-Code-Java-Warrior-Part-I-Released/</id>
<published>2019-06-10T16:19:36.000Z</published>
<updated>2019-09-03T20:29:56.243Z</updated>
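<content type="html"><![CDATA[<p>Yesterday I made my course <a href="https://www.udemy.com/java-warrior-part1/" target="_blank" rel="noopener">“Show Me the Code! Be a Java Warrior”</a> permanently free on Udemy and also uploaded it to the matching YouTube channel, <a href="https://www.youtube.com/channel/UClFNjZxKXLa8LpBj2m4PCDg?view_as=subscriber" target="_blank" rel="noopener">Java Never Sleep</a>. You are welcome to visit and subscribe! This is a conscientiously made course whose depth and breadth beat many paid courses. If that sounds like bragging, let me explain in detail.</p><p><a href="https://www.udemy.com/java-warrior-part1/" target="_blank" rel="noopener"><img src="https://www.dropbox.com/s/qlfnts6w47dz19e/java-warrior-part1.jpg?dl=1" alt=""></a><a id="more"></a></p><p>What sets this course apart is that it does not try to list every topic exhaustively. Following the 80/20 rule, it concentrates on the skills used most often at work and goes deep on selected topics. Intensive programming exercises cover the other knowledge you need.</p><p>For example, when I explain object-oriented concepts, I do not introduce complex ideas right away; I start with a small game:</p><div class="video-container"><iframe src="//www.youtube.com/embed/0UH27zLzSJA" frameborder="0" allowfullscreen></iframe></div><p>Likewise, when I cover common data structures, I start from a concrete problem, counting word frequencies in “A Tale of Two Cities”, introduce the different collections one by one, and explain when to use each. At the end I walk through the whole collections library, using class diagrams generated on the fly to show its structure at a glance.</p><p>Take the prologue as an example. I am probably one of the very few who teach Hello World directly in JShell. In fact, although JShell has been available since JDK 9, many people still have little idea of the convenience it brings. So the “First Hello World” lesson is arranged like this:</p><blockquote><p>At last, we install the JDK (12.0.1) and write Hello World. On JDK 12 we run Hello World in JShell first, and of course we also cover the classic three steps of writing, compiling, and running Hello World.</p></blockquote><p>Then, to illustrate the idea of a Java ecosystem centered on the JVM, I return to Hello World again, including even a Kotlin-based Hello World.</p><blockquote><p>We continue with Hello World: the details to watch when compiling and running by hand, running a single source file directly (introduced in JDK 11), running Hello World on other operating systems, writing Hello World in Kotlin, one of the JVM languages, and building Hello World with the Eclipse compiler. There is a world in a grain of sand, and we will come back to Hello World again and again, because it is far less simple than it looks or than you might imagine.</p></blockquote><p>All right, talk is cheap: go and watch, and you will see for yourself. If you have any questions or feedback about the course, feel free to pester me through any of the contact channels on this site. If you like the course, remember to leave me a good review😅</p><p>In addition, I have published a project-driven course on Udemy, <a href="https://www.udemy.com/java-tank-war/?couponCode=JAVANEVERSLEEP" target="_blank" rel="noopener">“Show Me the Code! Java Tank War, New Edition”</a>. With the coupon code <strong>JAVANEVERSLEEP</strong> you get a limited-time 80% discount, so do not miss it😁</p><p><a href="https://www.udemy.com/java-tank-war/?couponCode=JAVANEVERSLEEP" target="_blank" rel="noopener"><img src="https://www.dropbox.com/s/c40u990645zw8lx/Show_Me_The_Code.jpg?dl=1" 
alt=""></a></p>]]></content>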
<summary type="html">
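<p>Yesterday I made my course <a href="https://www.udemy.com/java-warrior-part1/" target="_blank" rel="noopener">“Show Me the Code! Be a Java Warrior”</a> permanently free on Udemy and also uploaded it to the matching YouTube channel, <a href="https://www.youtube.com/channel/UClFNjZxKXLa8LpBj2m4PCDg?view_as=subscriber" target="_blank" rel="noopener">Java Never Sleep</a>. You are welcome to visit and subscribe! This is a conscientiously made course whose depth and breadth beat many paid courses. If that sounds like bragging, let me explain in detail.</p>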
<p><a href="https://www.udemy.com/java-warrior-part1/" target="_blank" rel="noopener"><img src="https://www.dropbox.com/s/qlfnts6w47dz19e/java-warrior-part1.jpg?dl=1" alt=""></a>
</summary>
<category term="Udemy教程" scheme="http://www.oldyoungboys.club/categories/Udemy%E6%95%99%E7%A8%8B/"/>
<category term="Udemy" scheme="http://www.oldyoungboys.club/tags/Udemy/"/>
</entry>
<entry>
<title>Java Never Sleep Launched!</title>
<link href="http://www.oldyoungboys.club/Java-Never-Sleep-Launched/"/>
<id>http://www.oldyoungboys.club/Java-Never-Sleep-Launched/</id>
<published>2019-02-03T20:52:37.000Z</published>
<updated>2019-09-03T20:29:56.237Z</updated>
<content type="html"><![CDATA[<p>It has been 2 months since I launched my personal blog “Old Young Boys Club”, where I write about Soccer, Joke, and Java. I also shared the posts on Facebook, Twitter, LinkedIn, and Reddit. Some of them attracted more readers than I expected; others, though I wrote them with much effort and dedication, just stayed silent:) Now I have decided to focus mainly on Java and will post regularly on another blog: <a href="https://www.javaneversleep.com" target="_blank" rel="noopener">Java Never Sleep!</a> It’s about Java and only about Java. Welcome, my friend!</p><p><img src="https://www.dropbox.com/s/w0e5hnmg1j7aqyl/java-coffee.jpg?dl=1" alt=""><a id="more"></a></p><p>At the beginning of 2019, I set a goal to build the best Java tutorial on the web for beginners. It sounds crazy, but this goal has driven me a lot, and I have learned many things I didn’t know before. Since I’m teaching 12 students at Olivet Institute of Technology, I hope to summarize what I’ve learned in the past and finalize all the materials well.</p><p>Thirty days passed very quickly. Looking back on them, I can honestly say that I didn’t waste too much time, though sometimes I really felt powerless to continue. But I firmly believe in sticking to a goal until the end, so I will try my best to achieve it.</p><p><a href="https://www.javaneversleep.com" target="_blank" rel="noopener">Java Never Sleep</a>. Not me: I need sleep, and I actually do sleep. However, my goal won’t sleep, my passion won’t, and my prayer won’t.</p><p><img src="https://www.dropbox.com/s/3oqbw4zokidbfm1/goalkeeper.jpg?dl=1" alt=""></p>]]></content>
<summary type="html">
<p>It has been 2 months since I launched my personal blog “Old Young Boys Club”, where I write about Soccer, Joke, and Java. I also shared the posts on Facebook, Twitter, LinkedIn, and Reddit. Some of them attracted more readers than I expected; others, though I wrote them with much effort and dedication, just stayed silent:) Now I have decided to focus mainly on Java and will post regularly on another blog: <a href="https://www.javaneversleep.com" target="_blank" rel="noopener">Java Never Sleep!</a> It’s about Java and only about Java. Welcome, my friend!</p>
<p><img src="https://www.dropbox.com/s/w0e5hnmg1j7aqyl/java-coffee.jpg?dl=1" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="Blogger" scheme="http://www.oldyoungboys.club/tags/Blogger/"/>
</entry>
<entry>
<title>Yet Another Top 10 Books For Java Learners - Part I</title>
<link href="http://www.oldyoungboys.club/Yet-Another-Top-10-Books-For-Java-Learners/"/>
<id>http://www.oldyoungboys.club/Yet-Another-Top-10-Books-For-Java-Learners/</id>
<published>2019-02-01T15:39:27.000Z</published>
<updated>2019-09-03T20:29:56.244Z</updated>
<content type="html"><![CDATA[<p>Honestly speaking, there is no need for yet another article on the topic “Top 10 Books For Java Learners”, which is possibly older than you are. I am writing this for the 12 students I have recently been teaching the Java programming language; I’d like to put together a list of 10 books for their reference. I don’t want to recommend books that are said to be great but that I have never read, so some of the books listed here may not be that famous or popular, but at least they were good for me and helped me a lot in the past.</p><p><img src="https://www.dropbox.com/s/6cir6bgtlzzl94j/books.jpg?dl=1" alt=""><a id="more"></a></p><h2 id="No-1-MOOC-By-University-of-Helsinki"><a href="#No-1-MOOC-By-University-of-Helsinki" class="headerlink" title="No.1 MOOC By University of Helsinki"></a>No.1 MOOC By University of Helsinki</h2><p>Yes, I’m joking and I’m not joking. From my own experience, I don’t think reading a book systematically is the most efficient approach when you are a pure beginner - especially if you have never written code before. 
What you need to do is to understand the basic concepts quickly first, then <strong>JUST DO IT!</strong> Keep practicing, keep practicing and keep practicing.</p><p><a target="_blank" href="https://www.amazon.com/NIKE-Sportswear-Swoosh-Varsity-Medium/dp/B00TFAEPRA/ref=as_li_ss_il?ie=UTF8&qid=1549036398&sr=8-1&keywords=Just+do+it&linkCode=li3&tag=javaneversleep-20&linkId=a45c1168c1454889779b81a6c9304ada&language=en_US"><img border="0" src="//ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=B00TFAEPRA&Format=_SL250_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=javaneversleep-20&language=en_US"></a><img src="https://ir-na.amazon-adsystem.com/e/ir?t=javaneversleep-20&language=en_US&l=li3&o=1&a=B00TFAEPRA" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;"></p><p>Object-Oriented Programming with Java <a href="https://materiaalit.github.io/2013-oo-programming/part1/week-1/" target="_blank" rel="noopener">Part I</a> and <a href="https://materiaalit.github.io/2013-oo-programming/part2/week-7/" target="_blank" rel="noopener">Part II</a>. Register an account at <a href="https://tmc.mooc.fi/user/new" target="_blank" rel="noopener">Test My Code</a>. Finish all the exercises there. Post a screenshot to prove you really did it. That’s all.</p><p><img src="https://www.dropbox.com/s/fg5ceqyklz7x6wf/mooc-part1-109.jpg?dl=1" alt=""></p><p>Most of the students give positive feedback regarding the exercises. There is one Easter Egg in one of the test cases waiting for you to find there. Trust me, stick to the end and you won’t miss it.</p><p>Wait…but this is obviously not a book, right?</p><p>Yeah, right, but so what? For beginners, this would be the best resource for a quick start and get your hands dirty. 
Talk is cheap, show me the code!</p><p><a target="_blank" href="https://www.amazon.com/gp/product/B01FOTSHLA/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=B01FOTSHLA&linkCode=as2&tag=javaneversleep-20&linkId=314168ef8b7dd05181532e41f8c561dc"><img border="0" src="//ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&MarketPlace=US&ASIN=B01FOTSHLA&ServiceVersion=20070822&ID=AsinImage&WS=1&Format=_SL250_&tag=javaneversleep-20"></a><img src="//ir-na.amazon-adsystem.com/e/ir?t=javaneversleep-20&l=am2&o=1&a=B01FOTSHLA" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;"></p><h2 id="No-2-Refactoring-Improving-the-Design-of-Existing-Code"><a href="#No-2-Refactoring-Improving-the-Design-of-Existing-Code" class="headerlink" title="No.2 Refactoring: Improving the Design of Existing Code"></a>No.2 Refactoring: Improving the Design of Existing Code</h2><p>After you get your hands dirty for a long time and write enough code like shit, this book written by Martin Fowler is worthy to read thoroughly and practice the methods introduced one by one. 
<a href="https://amzn.to/2GeaV5Q" target="_blank" rel="noopener">Refactoring: Improving the Design of Existing Code</a> has two editions. Although the <a href="https://amzn.to/2DOMhaE" target="_blank" rel="noopener">2nd Edition</a> is available with many updates, I would personally suggest you read the <a href="https://amzn.to/2MKnHKN" target="_blank" rel="noopener">1st Edition</a>, since you are a Java learner and its examples are written in Java.</p><p><a target="_blank" href="https://www.amazon.com/gp/product/0134757599/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0134757599&linkCode=as2&tag=javaneversleep-20&linkId=89250f6bdc5dbc4ec626099edabefb8d"><img border="0" src="//ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&MarketPlace=US&ASIN=0134757599&ServiceVersion=20070822&ID=AsinImage&WS=1&Format=_SL250_&tag=javaneversleep-20"></a><img src="//ir-na.amazon-adsystem.com/e/ir?t=javaneversleep-20&l=am2&o=1&a=0134757599" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;"></p><p>I always recommend this book to other programmers, as it had a great impact on my career. I was promoted to Team Leader at my first company because I finished refactoring a module nobody wanted to touch - too painful and too much of a headache. I found this book by chance on the bookshelf one afternoon and got hooked; I finished it very quickly and applied several of its methods. From then on I began to write unit tests with full coverage for my code, which made my life and my colleagues’ lives much easier.</p><h2 id="No-3-Effective-Java"><a href="#No-3-Effective-Java" class="headerlink" title="No.3 Effective Java"></a>No.3 Effective Java</h2><p>Joshua Bloch, the guy who designed and implemented the Java Collections Framework (not by himself alone, of course), released the <a href="https://amzn.to/2Ggl5D4" target="_blank" rel="noopener">3rd Edition</a> of this book. 
I actually read the Chinese translation of the <a href="https://amzn.to/2S3qqoj" target="_blank" rel="noopener">2nd Edition</a>, which impressed me a lot. The book reads more like a master sharing his experience, insights, and best practices with you than like a textbook. Like its title, this book is really “Effective”.</p><p><a target="_blank" href="https://www.amazon.com/Effective-Java-Joshua-Bloch/dp/0134685997/ref=as_li_ss_il?ie=UTF8&qid=1549037913&sr=8-2&keywords=effective+java&linkCode=li3&tag=javaneversleep-20&linkId=fa9eefc833d196486daa8c9f8e92c294&language=en_US"><img border="0" src="//ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=0134685997&Format=_SL250_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=javaneversleep-20&language=en_US"></a><img src="https://ir-na.amazon-adsystem.com/e/ir?t=javaneversleep-20&language=en_US&l=li3&o=1&a=0134685997" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;"></p><h2 id="No-4-Head-First-Design-Patterns"><a href="#No-4-Head-First-Design-Patterns" class="headerlink" title="No.4 Head First Design Patterns"></a>No.4 Head First Design Patterns</h2><p><a href="https://amzn.to/2WA1wvm" target="_blank" rel="noopener">“Design Patterns: Elements of Reusable Object-Oriented Software”</a> by the GoF is said to be “Classic, Awesome…” and so on. However, I haven’t read that legendary book, because my taste is too lowbrow and I enjoyed a funnier one instead: <a href="https://amzn.to/2MILzON" target="_blank" rel="noopener">Head First Design Patterns</a>. 
All the source code of this book can be accessed at <a href="https://github.com/bethrobson/Head-First-Design-Patterns" target="_blank" rel="noopener">GitHub</a>.</p><p><a target="_blank" href="https://www.amazon.com/Head-First-Design-Patterns-Brain-Friendly/dp/0596007124/ref=as_li_ss_il?ie=UTF8&qid=1549038335&sr=8-1&keywords=head+first+design+patterns+2014&linkCode=li3&tag=javaneversleep-20&linkId=bce55d8f6d7171449418836de9a8d2d6&language=en_US"><img border="0" src="//ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=0596007124&Format=_SL250_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=javaneversleep-20&language=en_US"></a><img src="https://ir-na.amazon-adsystem.com/e/ir?t=javaneversleep-20&language=en_US&l=li3&o=1&a=0596007124" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;"></p><p>Life is short and I don’t want to waste too much time arguing which book is better or not. What I know is if you just want to read one book about Design Patterns in Java, then choose this one and you won’t regret.</p><h2 id="No-5-Clean-Code-A-Handbook-of-Agile-Software-Craftsmanship"><a href="#No-5-Clean-Code-A-Handbook-of-Agile-Software-Craftsmanship" class="headerlink" title="No.5 Clean Code: A Handbook of Agile Software Craftsmanship"></a>No.5 Clean Code: A Handbook of Agile Software Craftsmanship</h2><p><a href="https://amzn.to/2WCDZKn" target="_blank" rel="noopener">Clean Code</a> by Uncle Bob, IMHO, is not that practical and useful to me compared to the books listed above. But it’s a good summary and provides more information which can enhance you still, besides that, this book is very interesting and it’s a joyful read. When Uncle Bob said that he “turned on his old TV and tried to watch some scaring movies, but there is not even one ghost there. Sh.t!”, I just laughed to the ground. 
Anyway, for those who really care about quality, cleanness of your code, this book is a must-read.</p><p><a target="_blank" href="https://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882/ref=as_li_ss_il?ie=UTF8&qid=1549038518&sr=8-1&keywords=clean+code&linkCode=li3&tag=javaneversleep-20&linkId=586746e3093661bce05cf7b585646c25&language=en_US"><img border="0" src="//ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=0132350882&Format=_SL250_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=javaneversleep-20&language=en_US"></a><img src="https://ir-na.amazon-adsystem.com/e/ir?t=javaneversleep-20&language=en_US&l=li3&o=1&a=0132350882" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;"></p><p><strong>-To Be Continued-</strong></p><script type="text/javascript">amzn_assoc_placement = "adunit0";amzn_assoc_search_bar = "true";amzn_assoc_tracking_id = "javaneversleep-20";amzn_assoc_ad_mode = "manual";amzn_assoc_ad_type = "smart";amzn_assoc_marketplace = "amazon";amzn_assoc_region = "US";amzn_assoc_title = "";amzn_assoc_linkid = "b2d7aa2b3289e841ea7dae7b785b8767";amzn_assoc_asins = "0201485672,0134685997,0596007124,0132350882";</script><script src="//z-na.amazon-adsystem.com/widgets/onejs?MarketPlace=US"></script><iframe src="//rcm-na.amazon-adsystem.com/e/cm?o=1&p=48&l=ur1&category=books&banner=0HX1M2P8DDZ20D689R82&f=ifr&linkID=61f529dfb8b35107c92efc29a2c3c8dc&t=javaneversleep-20&tracking_id=javaneversleep-20" width="728" height="90" scrolling="no" border="0" marginwidth="0" style="border:none;" frameborder="0"></iframe>]]></content>
<summary type="html">
<p>Honestly speaking, there is no need for yet another article on the topic “Top 10 Books For Java Learners”, which is possibly older than you are. I am writing this for the 12 students I have recently been teaching the Java programming language; I’d like to put together a list of 10 books for their reference. I don’t want to recommend books that are said to be great but that I have never read, so some of the books listed here may not be that famous or popular, but at least they were good for me and helped me a lot in the past.</p>
<p><img src="https://www.dropbox.com/s/6cir6bgtlzzl94j/books.jpg?dl=1" alt="">
</summary>
<category term="Great Books" scheme="http://www.oldyoungboys.club/categories/Great-Books/"/>
<category term="Awesome Books" scheme="http://www.oldyoungboys.club/tags/Awesome-Books/"/>
</entry>
<entry>
<title>A Simple Tank War Game Exercise for Java Beginners</title>
<link href="http://www.oldyoungboys.club/A-Simple-Tank-War-Game-Exercise-for-Java-Beginners/"/>
<id>http://www.oldyoungboys.club/A-Simple-Tank-War-Game-Exercise-for-Java-Beginners/</id>
<published>2019-01-25T21:38:38.000Z</published>
<updated>2019-09-03T20:29:56.226Z</updated>
<content type="html"><![CDATA[<p>I’ve just published a mid-term project for a small group of students. If you are a Java beginner looking for a challenge to conquer, I believe this <a href="https://github.com/nateyoung427/tankwar" target="_blank" rel="noopener">Tank War</a> game is a good fit for you. Want to give it a try?😄 A reference solution will be released in 3 weeks.</p><p><img src="https://www.dropbox.com/s/h7lb40jzzkcskun/tank-war.jpg?dl=1" alt=""><a id="more"></a></p><p>I wrote my first line of Java code, the awesome “Hello World”, at 26. It was 2009 and life was really tough for me; all I hoped for was to finish the coding camp program as soon as possible and start working as a programmer to earn a living. This little game was the final project of the Java SE course, and I just could not figure it out by myself; only by following the instructor’s patient explanations could I keep up with what he was doing. I don’t remember how many hours I spent on this little, stupid, and boring game, but I will never forget that painful experience.</p><p>This time, when I told the students that they needed to finish it in 3 weeks, half of them fell silent. I know that feeling very well - they had no idea what to do, just like me 10 years ago. Maybe they don’t know that fewer than 450 lines of code are sufficient for the implementation. Unreasonable fear just makes you ignore many obvious and clear facts.</p><p>One easily chooses “Game Over” and rarely presses F2 to restart.</p><p><img src="https://www.dropbox.com/s/z1gars1msdvmsug/game-over.jpg?dl=1" alt=""></p><p>At that time, I was upset, depressed and anxious, but with patience and sweat, I finally finished the 3-month program and fortunately found a job to begin my career as a programmer. 
10 years later, it has been very clear to me that I don’t have much talent in this area, but it doesn’t mean that I cannot leverage my skills to make a difference and help others - which I’ve actually done these years.</p><p>Yes, not everyone can be a great programmer, but everyone can be a programmer and make a difference - if they really want and devote themselves to it.</p><p>Set up a challenging goal, and achieve it by all means - I firmly believe this is the most efficient way to improve your programming skills. So do not fear, do not hesitate.</p><p>And, <strong>JUST DO IT!</strong></p><script type="text/javascript">amzn_assoc_placement = "adunit0";amzn_assoc_search_bar = "false";amzn_assoc_tracking_id = "oldyoungboy-20";amzn_assoc_ad_mode = "manual";amzn_assoc_ad_type = "smart";amzn_assoc_marketplace = "amazon";amzn_assoc_region = "US";amzn_assoc_title = "";amzn_assoc_linkid = "de429ade981a7c8fe5027e941b980ae1";amzn_assoc_asins = "B00TFAET2G,0134685997,0134757599,020161622X";</script><script src="//z-na.amazon-adsystem.com/widgets/onejs?MarketPlace=US"></script><iframe src="//rcm-na.amazon-adsystem.com/e/cm?o=1&p=48&l=ur1&category=books&banner=0HX1M2P8DDZ20D689R82&f=ifr&linkID=61f529dfb8b35107c92efc29a2c3c8dc&t=javaneversleep-20&tracking_id=javaneversleep-20" width="728" height="90" scrolling="no" border="0" marginwidth="0" style="border:none;" frameborder="0"></iframe>]]></content>
<summary type="html">
<p>I’ve just published a mid-term project for a small group of students. If you are a Java beginner looking for a challenge to conquer, I believe this <a href="https://github.com/nateyoung427/tankwar" target="_blank" rel="noopener">Tank War</a> game is a good fit for you. Want to give it a try?😄 A reference solution will be released in 3 weeks.</p>
<p><img src="https://www.dropbox.com/s/h7lb40jzzkcskun/tank-war.jpg?dl=1" alt="">
</summary>
<category term="Projects" scheme="http://www.oldyoungboys.club/categories/Projects/"/>
<category term="Tank War" scheme="http://www.oldyoungboys.club/tags/Tank-War/"/>
</entry>
<entry>
<title>7 Tips of Searching GitHub Repositories You Should Know</title>
<link href="http://www.oldyoungboys.club/7-Tips-of-Searching-Github-Repositories-You-Should-Know/"/>
<id>http://www.oldyoungboys.club/7-Tips-of-Searching-Github-Repositories-You-Should-Know/</id>
<published>2019-01-18T16:33:04.000Z</published>
<updated>2019-09-03T20:29:56.225Z</updated>
<content type="html"><![CDATA[<p>You search on Google every day, and as a programmer, you probably search on GitHub every day too. Are you sure you know how to search GitHub repositories effectively? Let’s check out these 7 tips you probably didn’t know.</p><p><img src="https://www.dropbox.com/s/kfrd8o0veubqzmz/github-logo.png?dl=1" alt=""><a id="more"></a></p><h3 id="1-In-Name-In-Description-and-In-README"><a href="#1-In-Name-In-Description-and-In-README" class="headerlink" title="1. In Name, In Description, and In README"></a>1. In Name, In Description, and In README</h3><p>GitHub supports advanced search restricted to a particular field, such as the repository name, description, or README.</p><p>For example, if you want to find a cool repository for learning Spring Cloud, you can search like this:</p><p><code>in:name spring cloud</code></p><p><img src="https://www.dropbox.com/s/9va34ndlhe8lqf7/search-in-name.jpg?dl=1" alt=""></p><p>In the same way, you can also search in the description or README only:</p><p><code>in:description spring cloud</code><br><code>in:readme spring cloud</code></p><h3 id="2-More-Stars-More-Forks"><a href="#2-More-Stars-More-Forks" class="headerlink" title="2. More Stars, More Forks"></a>2. More Stars, More Forks</h3><p>The number of stars shows how popular a repository is, which is an important metric to consider, so you can search like this:</p><p><code>stars:>=3000 spring cloud</code></p><p>You can also define a range like this:</p><p><code>forks:10..20 spring cloud</code></p><h3 id="3-Small-Repositories-Please"><a href="#3-Small-Repositories-Please" class="headerlink" title="3. Small Repositories Please"></a>3. 
Small Repositories Please</h3><p>If dinosaur repositories are not what you want and you only love simple, small, and smart ones, you can add this search term:</p><p><code>size:<=5000 spring cloud</code></p><p>The unit of size here is KB, so 5000 means 5MB.</p><h3 id="4-Actively-Maintained-Repositories-Please"><a href="#4-Actively-Maintained-Repositories-Please" class="headerlink" title="4. Actively Maintained Repositories Please"></a>4. Actively Maintained Repositories Please</h3><p>Most of the time you don’t want to rely on a project that hasn’t been updated for 7 years; actively maintained projects give you more confidence, so include the last push time in your search term. For example, to find projects updated in the last two weeks:</p><p><code>pushed:>2019-01-04 spring cloud</code></p><p>You may also search for repositories created before or after a certain time using <code>created</code> rather than <code>pushed</code>.</p><h3 id="5-Apache-License-Please"><a href="#5-Apache-License-Please" class="headerlink" title="5. Apache License Please"></a>5. Apache License Please</h3><p>The license of an open source project might bring you much trouble; you remember Facebook and React, right? If you want to find projects with a friendly license, for example the well-known Apache License 2.0, you can search like this:</p><p><code>license:apache-2.0 spring cloud</code></p><p>Of course, you can hunt for other licenses too; just look through the <a href="https://help.github.com/articles/licensing-a-repository/" target="_blank" rel="noopener">complete list</a> provided by GitHub and choose the one you love.</p><h3 id="6-Java-Only"><a href="#6-Java-Only" class="headerlink" title="6. Java Only"></a>6. Java Only</h3><p><code>language:java</code> will filter out repositories not written in Java, yeah! 
If you hate Java, replace it with the language you love, for example, the best programming language, PHP.</p><h3 id="7-Rock-Star-Only"><a href="#7-Rock-Star-Only" class="headerlink" title="7. Rock Star Only"></a>7. Rock Star Only</h3><p>You may just want to search repositories by a Rock Star or a well-known organization because they are more likely to be awesome; just include <code>user</code> or <code>org</code> in your search term like this:</p><p><code>user:joshlong spring cloud</code><br><code>org:spring-cloud spring cloud</code></p><p>Obviously, you can combine the 7 tips for a more complex search, for example:</p><p><code>user:joshlong language:java pushed:>2018-03-04 stars:>200 in:description spring boot</code></p><p><img src="https://www.dropbox.com/s/3a2iny55gtz86xe/combination-search.jpg?dl=1" alt=""></p><h3 id="More-Options-to-Explore"><a href="#More-Options-to-Explore" class="headerlink" title="More Options to Explore"></a>More Options to Explore</h3><p>Want to explore all possible search terms? Just play around with <a href="https://github.com/search/advanced" target="_blank" rel="noopener">“Advanced Search”</a> or read the <a href="https://help.github.com/articles/about-searching-on-github/" target="_blank" rel="noopener">“search help”</a> by GitHub; the time you spend there will definitely be worth it.</p><p><img src="https://www.dropbox.com/s/zi76j669tzldjrg/advanced-search.jpg?dl=1" alt=""></p><p><strong>Happy Hunting!</strong></p><p><em>PS:</em> If you read Chinese, you may prefer the original version by Shucheng Hou <a href="https://mp.weixin.qq.com/s/__MXKPICzAL4mLetycfc9A" target="_blank" rel="noopener">here</a>. 
My post is a simplified and slightly modified English version actually.</p><script type="text/javascript">amzn_assoc_placement = "adunit0";amzn_assoc_search_bar = "true";amzn_assoc_tracking_id = "oldyoungboy-20";amzn_assoc_search_bar_position = "bottom";amzn_assoc_ad_mode = "search";amzn_assoc_ad_type = "smart";amzn_assoc_marketplace = "amazon";amzn_assoc_region = "US";amzn_assoc_title = "";amzn_assoc_default_search_phrase = "GitHub";amzn_assoc_default_category = "All";amzn_assoc_linkid = "3e67b5f3d8c58758abe55ca413b6ada4";</script><script src="//z-na.amazon-adsystem.com/widgets/onejs?MarketPlace=US"></script>]]></content>
<summary type="html">
<p>You search on Google every day, and as a programmer, you would probably search on GitHub every day. Are you sure you know how to search GitHub repositories effectively? Let’s check out these 7 tips you probably didn’t know.</p>
<p><img src="https://www.dropbox.com/s/kfrd8o0veubqzmz/github-logo.png?dl=1" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="GitHub" scheme="http://www.oldyoungboys.club/tags/GitHub/"/>
</entry>
<entry>
<title>Farewell: Brothers and Sisters</title>
<link href="http://www.oldyoungboys.club/Farewell-Brothers-and-Sisters/"/>
<id>http://www.oldyoungboys.club/Farewell-Brothers-and-Sisters/</id>
<published>2019-01-14T16:48:47.000Z</published>
<updated>2019-09-03T20:29:56.233Z</updated>
<content type="html"><![CDATA[<p>11 brothers and sisters departed for China in the past two weeks, and today I just found my heart was emptied and filled with sadness. After living and working together for about 4 years, remembering the happy and hard days we experienced together, it’s just overwhelming and I had to pray a lot both for them and for myself.</p><p><img src="https://www.dropbox.com/s/ivj3utojsbe5sqd/goodbye.jpg?dl=1" alt=""><a id="more"></a></p><p>They went back to China to fulfill their commission, and it should not be that sad, as our Father God will be with them. But we are only human and always emotional. At this time, I pray that their sacrifice and commitment will be pleasing to God, and may our shepherd guide their way in the future as always.</p><p>Months ago, I heard that an old pastor said something like this: “Don’t bring them to me again, because after they had gone back, my heart was emptied and I could not bear it. They are like daughters to me, and the time with them is so happy that I cannot bear the moment when I have to say goodbye to them”.</p><p>And today I have similar feelings, even though I am not a man who focuses much on relationships.</p><p>It’s a community of love in God’s 3 persons, the Father, the Son, and the Spirit. If love is something connected so deeply, then how sad it would be, when the Father had to abandon the Son, when the Son had to be abandoned by the Father as an atonement? What kind of pain and bitterness would it be for Heavenly Father, when Lord Jesus was crucified?</p><p><img src="https://www.dropbox.com/s/kivwdriuq66sedl/prayer.jpg?dl=1" alt=""></p><p>Farewell, my brothers and sisters, “your labour is not in vain in the Lord”.</p>]]></content>
<summary type="html">
<p>11 brothers and sisters departed for China in the past two weeks, and today I just found my heart was emptied and filled with sadness. After living and working together for about 4 years, remembering the happy and hard days we experienced together, it’s just overwhelming and I had to pray a lot both for them and for myself.</p>
<p><img src="https://www.dropbox.com/s/ivj3utojsbe5sqd/goodbye.jpg?dl=1" alt="">
</summary>
<category term="Life" scheme="http://www.oldyoungboys.club/categories/Life/"/>
<category term="KBers" scheme="http://www.oldyoungboys.club/tags/KBers/"/>
</entry>
<entry>
<title>Fibonacci and BigInteger: Secret Under the Hood</title>
<link href="http://www.oldyoungboys.club/Fibonacci-and-BigInteger/"/>
<id>http://www.oldyoungboys.club/Fibonacci-and-BigInteger/</id>
<published>2019-01-11T06:53:17.000Z</published>
<updated>2019-09-03T20:29:56.234Z</updated>
<content type="html"><![CDATA[<p>No matter what programming language you are learning, calculating Fibonacci numbers is just too classic to skip. It seems very easy to implement, but it can also go surprisingly deep, especially when the number is super big. Have you ever thought about that?</p><p><img src="https://www.dropbox.com/s/ye2355oyx0ixy3w/rabbit.jpg?dl=1" alt=""><a id="more"></a></p><p>It can be super easy to implement using recursion in just one line of code, even in Java.😅</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">fib</span><span class="params">(<span class="keyword">int</span> n)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> n <= <span class="number">2</span> ? <span class="number">1</span> : (fib(n - <span class="number">1</span>) + fib(n - <span class="number">2</span>));</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>To help you understand the basic optimization idea, a memoization mechanism is usually introduced via an integer array or a HashMap, but the idea is the same: calculate each value once and only once, so that those poor rabbits would find life much easier.</p><p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/7/7a/FibonacciRabbit.svg/640px-FibonacciRabbit.svg.png" alt=""></p><p>Add just one more line of code, and thanks to the lambdas available since JDK 8, life is much easier nowadays.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">private</span> <span
class="keyword">static</span> Map<Integer, Integer> CACHE = <span class="keyword">new</span> HashMap<>();</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">fib</span><span class="params">(<span class="keyword">int</span> n)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> CACHE.computeIfAbsent(n, k -> (k <= <span class="number">2</span> ? <span class="number">1</span> : (fib(k - <span class="number">1</span>) + fib(k - <span class="number">2</span>))));</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>There is always someone scared of the famous StackOverflowError, they would tell you not to use recursion but do an iterative way, like this:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">fib</span><span class="params">(<span class="keyword">int</span> n)</span> </span>{</span><br><span class="line"> <span class="keyword">if</span> (n <= <span class="number">2</span>) <span class="keyword">return</span> <span class="number">1</span>;</span><br><span class="line"></span><br><span class="line"> <span class="keyword">int</span>[] cache = <span class="keyword">new</span> <span class="keyword">int</span>[n];</span><br><span class="line"> cache[<span class="number">0</span>] = cache[<span class="number">1</span>] = <span class="number">1</span>;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">2</span>; i 
< n; i++) {</span><br><span class="line"> cache[i] = cache[i - <span class="number">1</span>] + cache[i - <span class="number">2</span>];</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> cache[n - <span class="number">1</span>];</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Obviously, the integer array can be avoided: you can just keep three values and update them until the end, but that’s not the problem we want to discuss here. You have probably noticed that the result can easily grow far beyond the upper limit of int, long, or even double. For example, if n is 2048, what will the exact result be?</p><p>454153044374378942504557144629068920270090826129364442895118239027897145250928343568434971 803477173043320774207501029966396250064078380187973638077418159157949680694899576625922604 895968605634843621876639428348247300097930657521757592440815188064651826480022197557589955 655164820646173515138267042115173436029259905997102292769397103720814141099147144935820441 85153918055170241694035610145547104337536614028338983073680262684101</p><p>That’s about 4.54 * 10^427.
As for Java, for this kind of problem, you have no other choice but <a href="https://docs.oracle.com/javase/8/docs/api/java/math/BigInteger.html" target="_blank" rel="noopener">BigInteger</a>, and it’s very easy to<br>use.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">private</span> <span class="keyword">static</span> Map<Integer, BigInteger> CACHE = <span class="keyword">new</span> HashMap<>();</span><br><span class="line"></span><br><span class="line"><span class="function">BigInteger <span class="title">fib</span><span class="params">(<span class="keyword">int</span> n)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> CACHE.computeIfAbsent(n, k -> (k <= <span class="number">2</span> ? BigInteger.ONE : (fib(k - <span class="number">1</span>).add(fib(k - <span class="number">2</span>)))));</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Congratulations! You will get a BOMB exploding like this:<br><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">Exception in thread <span class="string">"main"</span> java.lang.StackOverflowError</span><br><span class="line"> at java.util.HashMap.hash(HashMap.java:<span class="number">339</span>)</span><br><span class="line"> at java.util.HashMap.computeIfAbsent(HashMap.java:<span class="number">1099</span>)</span><br></pre></td></tr></table></figure></p><p>You may use <code>-Xss4m</code> to shut it up, or resolve it in an iterative way a little bit optimized, wherein we won’t waste space to build an array. 
Yes, it’s right, recursion is easy to think of and implement, but it’s just too close to StackOverflowError.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">BigInteger <span class="title">fib</span><span class="params">(<span class="keyword">int</span> n)</span> </span>{</span><br><span class="line"> <span class="keyword">if</span> (n <= <span class="number">2</span>) <span class="keyword">return</span> BigInteger.ONE;</span><br><span class="line"></span><br><span class="line"> BigInteger prev = BigInteger.ONE, prevOfPrev = BigInteger.ONE;</span><br><span class="line"> BigInteger curr = <span class="keyword">null</span>;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">2</span>; i < n; i++) {</span><br><span class="line"> curr = prev.add(prevOfPrev);</span><br><span class="line"> prevOfPrev = prev;</span><br><span class="line"> prev = curr;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> curr;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>BigInteger, what’s the secret behind it that it can represent a number beyond the upper limit of all primitive data types of Java?</p><p>The idea is still simple, but the implementation can be complex, especially when highly optimized is a must. 
<a href="https://github.com/tbuktu/bigint" target="_blank" rel="noopener">bigint</a> on GitHub will give you more information on how Timothy Buktu approaches and optimizes it.</p><p>There is a famous quote by Sam in The Lord of the Rings: “I can’t carry it for you, but I can carry you”. For BigInteger, it seems it should be “I can’t carry it for you, and I can’t carry you, but I can carry a segment of you.”😅</p><p>Yes, the whole number is too big, but if we split it into many parts, each part fits within the upper limit of a primitive type. If we store the parts in an array, perform the calculation at the binary level, and handle the carries properly, then a container that stores a big number becomes possible.</p><p>Pretty cool and smart, right? But what exactly is under the hood? Let’s check it out in the next post.</p><p><em><strong>TO BE CONTINUED</strong></em></p><script type="text/javascript">amzn_assoc_placement = "adunit0";amzn_assoc_search_bar = "false";amzn_assoc_tracking_id = "oldyoungboy-20";amzn_assoc_ad_mode = "manual";amzn_assoc_ad_type = "smart";amzn_assoc_marketplace = "amazon";amzn_assoc_region = "US";amzn_assoc_title = "";amzn_assoc_linkid = "21ce171baf5d871f0872d552bc2cbace";amzn_assoc_asins = "0805063056,B0015DWM2K,0767908163,1590787528";</script><script src="//z-na.amazon-adsystem.com/widgets/onejs?MarketPlace=US"></script>]]></content>
<summary type="html">
<p>No matter what programming language you are learning, calculating Fibonacci numbers is just too classic to skip. It seems very easy to implement, but it can also go surprisingly deep, especially when the number is super big. Have you ever thought about that?</p>
<p><img src="https://www.dropbox.com/s/ye2355oyx0ixy3w/rabbit.jpg?dl=1" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="Java Never Sleep" scheme="http://www.oldyoungboys.club/tags/Java-Never-Sleep/"/>
</entry>
<entry>
<title>Sqlite, Monastery and St. Benedict: Code Of Ethics</title>
<link href="http://www.oldyoungboys.club/Sqlite-And-Monastery-Code-Of-Ethics/"/>
<id>http://www.oldyoungboys.club/Sqlite-And-Monastery-Code-Of-Ethics/</id>
<published>2019-01-02T18:58:24.000Z</published>
<updated>2019-09-03T20:29:56.242Z</updated>
<content type="html"><![CDATA[<p>We will begin from an interesting comment in the source code of SQLite, and end with Monastery, Saint Benedict, and Code of Ethics. I am not joking, seriously.</p><p><img src="https://upload.wikimedia.org/wikipedia/commons/e/ef/Totila_e_San_Benedetto.jpg" alt=""><a id="more"></a></p><p>I started using SQLite for my Java desktop application in 2014 and found it really good: its query performance and full-text search support made my development experience smooth and joyful. Importing 800M entries from thousands of files into a SQLite database in minutes and then querying them in milliseconds is really impressive. Beyond my personal experience, almost every smartphone uses it every day.</p><p>Since it’s so widely used, its programmers soon get into trouble, even when the trouble has nothing to do with them. Read this comment and you will probably laugh until you cry:😅</p><blockquote><p>2006-10-31: The default prefix used to be “sqlite_”. But then Mcafee started using SQLite in their anti-virus product and it started putting files with the “sqlite” name in the c:/temp folder. This annoyed many windows users. Those users would then do a Google search for “sqlite”, find the telephone numbers of the developers and call to wake them up at night and complain. For this reason, the default name prefix is changed to be “sqlite” spelled backwards.
So the temp files are still identified, but anybody smart enough to figure out the code is also likely smart enough to know that calling the developer will not help get rid of the file.</p></blockquote><p>So the prefix is now <code>etilqs_</code>, and the programmers can finally sleep peacefully.</p><p>If you scroll up to the top of the file <a href="https://github.com/endlesssoftware/sqlite3/blob/master/os.h" target="_blank" rel="noopener"><code>os.h</code></a>, you will find several lines of comments that will definitely surprise you - at least I was really surprised.</p><blockquote><p>2001 September 16</p><p>The author disclaims copyright to this source code. In place of a legal notice, here is a blessing:</p><pre><code>May you do good and not evil.
May you find forgiveness for yourself and forgive others.
May you share freely, never taking more than you give.</code></pre></blockquote><p>The date has a special meaning, as it is just days after the September 11 attacks. In an <a href="https://www.red-gate.com/simple-talk/opinion/geek-of-the-week/dr-richard-hipp-geek-of-the-week/" target="_blank" rel="noopener">interview</a>, <a href="https://en.wikipedia.org/wiki/D._Richard_Hipp" target="_blank" rel="noopener">Richard Hipp</a>, the original creator of SQLite, talked about the inspiration behind the blessing:</p><blockquote><p>Interviewer: […] Who or what inspired you to write that?</p><p>Richard Hipp: People customarily put a copyright notice at the top of each source file. But SQLite version 2.0.0 had no copyright so I had to think of something else to go in that space.</p><p>The second sentence, “May you find forgiveness for yourself and forgive others”, is a loose interpretation of Matthew 6:12, part of what is commonly called ‘The Lord’s Prayer’ and more recognizable as ‘Forgive us our debts as we forgive our debtors’.</p><p>The third sentence tries to capture the concept of paying debts forward.
The ‘never take more than you give’ part is a paraphrase of one of the lyrics from The Lion King. The first (hokey) sentence is there because it seemed like a good benediction needed three sentences.</p></blockquote><p>It sounds like a pastor preaching to you, or a monk in a monastery reading scriptures to you, right? Yes, you are right. There is actually a connection between SQLite, the monastery, and <a href="https://en.wikipedia.org/wiki/Rule_of_Saint_Benedict" target="_blank" rel="noopener">The Rule of St. Benedict</a>. Read the “<a href="https://sqlite.org/codeofethics.html" target="_blank" rel="noopener">Code Of Ethics</a>” and you will understand the spiritual inspiration.</p><p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/c/c9/MS._Hatton_48_fol._6v-7r.jpg/640px-MS._Hatton_48_fol._6v-7r.jpg" alt=""></p><blockquote><p>The founder of SQLite, and all of the current developers at the time when this document was composed, have pledged to govern their interactions with each other, with their clients, and with the larger SQLite user community in accordance with the “instruments of good works” from chapter 4 of The Rule of St. Benedict (hereafter: “The Rule”). This code of ethics has proven its mettle in thousands of diverse communities for over 1,500 years and has served as a baseline for many civil law codes since the time of Charlemagne.</p></blockquote><p>In a post-modern world that so openly rejects absolute truth and satirizes religion, especially Christianity, how would programmers with a different worldview respond to this? And what should the scope of its application be? The document also gives a moderate statement, so that code stays separate from religion, as it should be.</p><blockquote><p>No one is required to follow The Rule, to know The Rule, or even to think that The Rule is a good idea.
The Founder of SQLite believes that anyone who follows The Rule will live a happier and more productive life, but individuals are free to dispute or ignore that advice if they wish.</p><p>The founder of SQLite and all current developers have pledged to follow spirit of The Rule to the best of their ability. They view The Rule as their promise to all SQLite users of how the developers are expected to behave in community. This is a one-way promise, or covenant. In other words, the developers are saying: “We will treat you this way regardless of how you treat us.”</p></blockquote><p>While code itself should be separated from religious belief, the man who wrote the code need not be, and probably cannot be.</p><p><img src="https://upload.wikimedia.org/wikipedia/commons/a/ab/Soli_deo_gloria.jpg" alt=""></p><p>Just as <code>Soli deo gloria</code> has been found at the end of manuscripts by masters like G. F. Handel and J. S. Bach, it is surprising but also joyful for me to see the invisible impact of the Gospel in the IT world. It is not only about code, which can be cold, but also about a code of life, the rules of love that are warm and truthful.</p><script type="text/javascript">amzn_assoc_placement = "adunit0";amzn_assoc_search_bar = "false";amzn_assoc_tracking_id = "oldyoungboy-20";amzn_assoc_ad_mode = "manual";amzn_assoc_ad_type = "smart";amzn_assoc_marketplace = "amazon";amzn_assoc_region = "US";amzn_assoc_title = "";amzn_assoc_linkid = "e0b9b0fe194fce20f484b857c041ae77";amzn_assoc_asins = "0596009763,1980293074,162164149X,149225178X";</script><script src="//z-na.amazon-adsystem.com/widgets/onejs?MarketPlace=US"></script>]]></content>
<summary type="html">
<p>We will begin from an interesting comment in the source code of SQLite, and end with Monastery, Saint Benedict, and Code of Ethics. I am not joking, seriously.</p>
<p><img src="https://upload.wikimedia.org/wikipedia/commons/e/ef/Totila_e_San_Benedetto.jpg" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="SQLite" scheme="http://www.oldyoungboys.club/tags/SQLite/"/>
</entry>
<entry>
<title>NetBeans 10 Released: The Best Swing GUI Builder You Should Try</title>
<link href="http://www.oldyoungboys.club/NetBeans-10-Released-The-Best-Swing-GUI-Builder-You-Should-Try/"/>
<id>http://www.oldyoungboys.club/NetBeans-10-Released-The-Best-Swing-GUI-Builder-You-Should-Try/</id>
<published>2018-12-29T17:14:57.000Z</published>
<updated>2019-09-03T20:29:56.240Z</updated>
<content type="html"><![CDATA[<p>As a Java guy now I use <a href="https://www.jetbrains.com/idea/" target="_blank" rel="noopener">IDEA</a> most of the time, I’ve been using <a href="https://www.eclipse.org/" target="_blank" rel="noopener">eclipse</a> for about 6 years in the past but now I rarely open it. However, I still use <a href="https://netbeans.org/" target="_blank" rel="noopener">NetBeans</a> when I need to do some Swing stuff, as it provides the best GUI builder user experience in my perspective. From Sun to Oracle, now moving to Apache, after a long time of silence, <a href="https://netbeans.apache.org/download/nb100/nb100.html" target="_blank" rel="noopener">Apache Netbeans 10.0</a> was released on Dec 27th, 2018. Wanna get a try, man?</p><div class="video-container"><iframe src="//www.youtube.com/embed/O8cwpEY1OAQ" frameborder="0" allowfullscreen></iframe></div><a id="more"></a><p>Besides NetBeans there is another distribution called <a href="http://coolbeans.xyz/" target="_blank" rel="noopener">CoolBeans</a>. I first learned about it from <a href="https://news.ycombinator.com/show" target="_blank" rel="noopener">Hacker News Show</a>, which is a good place to discover the latest work from programmers who want to present and promote it. I highly recommend visiting it regularly.</p><p>Last night I tried both NetBeans and CoolBeans, and the experience is similar to that of NetBeans 8.2, which I used for Swing GUI stuff. There is <a href="https://www.eclipse.org/windowbuilder/" target="_blank" rel="noopener">Window Builder</a> for Eclipse, and IDEA ships its own GUI Designer. I tried them both in 2014 and 2016, and I still find NetBeans the best in this area.</p><p><img src="https://www.eclipse.org/windowbuilder/images/wb_summary_shot.gif" alt=""></p><p>Nowadays the hot topics are cloud, apps, web, and machine learning; desktop applications are something you will probably never touch.
As for me, in my first 3 years as a Java programmer I focused on JavaEE stuff. Only when I started working for a start-up company in 2014 did I have to build a desktop application to be deployed to thousands of clients. I had never done that before, but I managed to release the first usable version in 3 weeks, which would have been impossible without NetBeans.</p><p><img src="https://www.dropbox.com/s/da3tt8mswn86zf5/gui-builder.jpg?dl=1" alt=""></p><p>In 2016 I started using JavaFX to build certain GUI applications, but I still prefer Swing - sounds strange, right? I will explain why in the future. Meanwhile, PyQt and wxPython in the Python community and <a href="https://electronjs.org/" target="_blank" rel="noopener">Electron</a> in the JavaScript community are also worth trying. I once played around with Electron, and it can save you a lot of time because it reuses your web development skills.</p><p><img src="https://www.dropbox.com/s/c41q0xpjhb0kz08/scene-builder-in-action.jpg?dl=1" alt=""></p><p>I plan to summarize and share some of my Swing work with you guys on GitHub soon. Hope you guys enjoy!</p><script type="text/javascript">amzn_assoc_placement = "adunit0";amzn_assoc_search_bar = "false";amzn_assoc_tracking_id = "oldyoungboy-20";amzn_assoc_ad_mode = "manual";amzn_assoc_ad_type = "smart";amzn_assoc_marketplace = "amazon";amzn_assoc_region = "US";amzn_assoc_title = "";amzn_assoc_asins = "B007Y6KIHI,1118385349,1617292842,0134393333";amzn_assoc_linkid = "5e73b1f765df95a34faf76f5c40b63d0";</script><script src="//z-na.amazon-adsystem.com/widgets/onejs?MarketPlace=US"></script>]]></content>
<summary type="html">
<p>As a Java guy now I use <a href="https://www.jetbrains.com/idea/" target="_blank" rel="noopener">IDEA</a> most of the time, I’ve been using <a href="https://www.eclipse.org/" target="_blank" rel="noopener">eclipse</a> for about 6 years in the past but now I rarely open it. However, I still use <a href="https://netbeans.org/" target="_blank" rel="noopener">NetBeans</a> when I need to do some Swing stuff, as it provides the best GUI builder user experience in my perspective. From Sun to Oracle, now moving to Apache, after a long time of silence, <a href="https://netbeans.apache.org/download/nb100/nb100.html" target="_blank" rel="noopener">Apache Netbeans 10.0</a> was released on Dec 27th, 2018. Wanna get a try, man?</p>
<div class="video-container"><iframe src="//www.youtube.com/embed/O8cwpEY1OAQ" frameborder="0" allowfullscreen></iframe></div>
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="NetBeans" scheme="http://www.oldyoungboys.club/tags/NetBeans/"/>
</entry>
<entry>
<title>The Best Java Tutorial You Shall Not Miss at Easter 2019</title>
<link href="http://www.oldyoungboys.club/The-Best-Java-Tutorial-You-Shall-Not-Miss/"/>
<id>http://www.oldyoungboys.club/The-Best-Java-Tutorial-You-Shall-Not-Miss/</id>
<published>2018-12-28T23:53:11.000Z</published>
<updated>2019-09-03T20:29:56.243Z</updated>
<content type="html"><![CDATA[<p>The end of 2018 is close and it’s time to set up goals of 2019. As for me, it’s super clear - I want to make the best Java tutorial for beginners. Texts, videos, exercises will be prepared carefully, precisely covering the most important topics of Java programming based on market research. I will leverage my 10 years experience in software engineering to make your learning experience friendly, funny, and challenging. Remember, it will be released at Easter, 2019.</p><p><img src="https://www.dropbox.com/s/0x28upnkslfjcjo/goal.jpg?dl=1" alt=""><a id="more"></a></p><p>I dare to share this goal publicly and shamelessly for these reasons:</p><ol><li><p>Shout out your goal as loud as you can, so that your friends will know about it and ask you about your progress, which will force you to push forward if you still know what shame is;</p></li><li><p>Teaching always helps you learn faster and better. There is never an end to learning programming, even after working for about 10 years. I plan to summarize Java programming completely and systematically this time, and also learn new stuff that I didn’t have time to catch up on last year;</p></li><li><p>I’ve taught the SE500 course at OIT twice, but I didn’t build it into a product that students can reuse in the future. It’s a pity that my experience could not benefit them more, and I want to improve that this time;</p></li><li><p>Compared to best-selling Java courses on popular platforms such as Udemy and Lynda, my tutorial will be more user-friendly and funny while still challenging. You might be surprised that a tutorial aimed at beginners is described as “challenging”, but I firmly believe that reasonable challenges make learners apply what they have learned in the shortest time;</p></li><li><p>Being a teacher means you have a lot of opportunities to reach many different students, which can also inspire and improve you.
Besides that, you can also make many new friends in the future, which is pretty exciting.</p></li></ol><p>So here I am. Will I keep my word and promise, or am I just joking? Let’s check it out at Easter 2019. It’s not shameful to fail, but it’s shameful to fear.</p><p><em>PS:</em> Got tired? Read these books and you will definitely be inspired and motivated. Not everyone can be a great programmer, but everyone can be a programmer, and make a difference.</p><script type="text/javascript">amzn_assoc_placement = "adunit0";amzn_assoc_search_bar = "false";amzn_assoc_tracking_id = "oldyoungboy-20";amzn_assoc_ad_mode = "manual";amzn_assoc_ad_type = "smart";amzn_assoc_marketplace = "amazon";amzn_assoc_region = "US";amzn_assoc_title = "";amzn_assoc_linkid = "234ebf5a711b24c888ee32618d2535d3";amzn_assoc_asins = "1501124021,B00J5X5E9U,0812972155,B004W2UBYW";</script><script src="//z-na.amazon-adsystem.com/widgets/onejs?MarketPlace=US"></script>]]></content>
<summary type="html">
<p>The end of 2018 is close and it’s time to set up goals of 2019. As for me, it’s super clear - I want to make the best Java tutorial for beginners. Texts, videos, exercises will be prepared carefully, precisely covering the most important topics of Java programming based on market research. I will leverage my 10 years experience in software engineering to make your learning experience friendly, funny, and challenging. Remember, it will be released at Easter, 2019.</p>
<p><img src="https://www.dropbox.com/s/0x28upnkslfjcjo/goal.jpg?dl=1" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="Tutorial" scheme="http://www.oldyoungboys.club/tags/Tutorial/"/>
</entry>
<entry>
<title>Merry Christmas: For to Us a Son Is Given</title>
<link href="http://www.oldyoungboys.club/Merry-Christmas/"/>
<id>http://www.oldyoungboys.club/Merry-Christmas/</id>
<published>2018-12-26T05:17:56.000Z</published>
<updated>2019-09-03T20:29:56.239Z</updated>
<content type="html"><![CDATA[<p>Just came home from <a href="http://gnit.org/" target="_blank" rel="noopener">GNIT</a> SF monthly gathering and I was deeply grateful for the community life. This Christmas has some special meaning to me and very likely I will always remember it.</p><p><img src="https://www.dropbox.com/s/ojaph66lp565014/christmas.jpg?dl=1" alt=""><a id="more"></a></p><p><a href="https://twitter.com/KAKA" target="_blank" rel="noopener">Kaka</a>, possibly the most godly soccer player I’ve ever known, published a tweet last night and I saw it this morning.</p><p><blockquote class="twitter-tweet tw-align-center" data-lang="en"><p lang="pt" dir="ltr">"Porque um menino nos nasceu, um filho nos foi dado, e o governo está sobre os seus ombros. E ele será chamado Maravilhoso, Conselheiro, Deus Poderoso, Pai Eterno, Príncipe da Paz" Isaías 9:6 <a href="https://twitter.com/hashtag/FelizNatal?src=hash&ref_src=twsrc%5Etfw" target="_blank" rel="noopener">#FelizNatal</a> <a href="https://t.co/YzMHrIyZaO" target="_blank" rel="noopener">https://t.co/YzMHrIyZaO</a></p>&mdash; Kaka (@KAKA) <a href="https://twitter.com/KAKA/status/1077388219992428545?ref_src=twsrc%5Etfw" target="_blank" rel="noopener">December 25, 2018</a></blockquote></p><script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script><p>After translating it from Portuguese to English, I chose the scriptures that I would use for the intercessory prayer at the Christmas service in our church.</p><blockquote><p>“For to us a child is born, to us a son is given,<br>and the government will be on his shoulders.<br>And he will be called Wonderful Counselor, Mighty God,<br>Everlasting Father, Prince of Peace.<br>Of the greatness of his government and peace there will be no end.<br>He will reign on David’s throne and over his kingdom,<br>establishing and upholding it with justice and righteousness<br>from that time on and forever.<br>The zeal of the Lord Almighty will accomplish this.”
</p><p><em>Book of Isaiah, 9:6-7</em></p></blockquote><p>God kept His word and promise to His people: He sent His only begotten Son into this world on that very first Christmas. Today, will you keep your word and promise to God, if you have made one?</p>]]></content>
<summary type="html">
<p>Just came home from <a href="http://gnit.org/" target="_blank" rel="noopener">GNIT</a> SF monthly gathering and I was deeply grateful for the community life. This Christmas has some special meaning to me and very likely I will always remember it.</p>
<p><img src="https://www.dropbox.com/s/ojaph66lp565014/christmas.jpg?dl=1" alt="">
</summary>
<category term="Life" scheme="http://www.oldyoungboys.club/categories/Life/"/>
<category term="Christmas" scheme="http://www.oldyoungboys.club/tags/Christmas/"/>
</entry>
<entry>
<title>Percolation, Predestination, and Freewill</title>
<link href="http://www.oldyoungboys.club/Percolation-Predestination-Freewill/"/>
<id>http://www.oldyoungboys.club/Percolation-Predestination-Freewill/</id>
<published>2018-12-24T21:18:29.000Z</published>
<updated>2019-09-03T20:29:56.241Z</updated>
<content type="html"><![CDATA[<p>Yeah, I just finished the week 1 assignment of <a href="https://www.coursera.org/learn/algorithms-part1" target="_blank" rel="noopener">Algorithms, Part I</a> about <a href="http://coursera.cs.princeton.edu/algs4/assignments/percolation.html" target="_blank" rel="noopener">percolation</a>, but why the heck is this post trying to talk about Predestination and Freewill, a religious topic that would be super boring? OK. Calm down, calm down and watch this video first, my friend. It’s a silent film that lasts only 20 seconds.</p><div class="video-container"><iframe src="//www.youtube.com/embed/52v9z0EfRK8" frameborder="0" allowfullscreen></iframe></div><a id="more"></a><p>At the Christmas retreat held last Saturday, Pastor Peter Tzeng from <a href="http://www.sfgratia.org/" target="_blank" rel="noopener">SF Gratia Presbyterian Church</a> preached a sermon about “Predestination and Election”, which suddenly reminded me of the programming assignment I had just finished. My heart was filled with peace, gratitude, and grace at that time. I am really grateful to be part of a church, as you will never walk alone.</p><p>Let’s go back to the test case, for which I wrote a special test program, <a href="https://github.com/ny83427/algs4-assignments/blob/master/src/test/java/ShuffledPercolationVisualizer.java" target="_blank" rel="noopener"><code>ShuffledPercolationVisualizer</code></a>, to simulate the process. The video you have just watched was also recorded at that time.</p><p>This test case uses a 60 * 60 matrix whose 3600 sites all start blocked (marked black). You can then select some of them and mark them white, which is called “opening a site”.
Meanwhile, if an open site is connected to any open site in the top row, all the sites involved in the connection will be marked blue.</p><p>There are 2408 sites to open, and when all of them are open, the matrix looks like Robert Sedgewick, professor at Princeton University and instructor of the course.</p><p>You can shuffle the order in which these 2408 sites are opened, but the result will always be the same. The connections are certain to draw the same picture; the destination never changes.</p><p>Then how many possible sequences are there for opening these 2408 sites? Clearly the answer is the factorial of 2408, but do you have any idea how big that is?</p><p>I ran into some trouble calculating the factorial of 2408 in Java, and found Python much easier this time.</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">import</span> math</span><br><span class="line">print(math.factorial(<span class="number">2408</span>))</span><br></pre></td></tr></table></figure><p>And the <a href="https://allthefactorials.tumblr.com/post/162652567073/2408" target="_blank" rel="noopener">result</a> is…ready? 
Go!</p><p>2175859499995222525622725550995877052553499376117029749265771876748608824453903631197454810536697873030489087468799893184398277647169530998249671609192947759645701976130089317374153437714505574288860026430597240778965336042071565090801980480900333393735379723885387923906946809525347858590090156987766508044600552522408950841376891416682820234845371498188858226565986667223481732054983539326326960067894317308320878615088724121883095153415102319821780492654314448851987922211716428153372513684997251954275791346897530427827220231339097873710864743008025068857142035247328815896233096871797302621554716457465582952259759592030807630026253361250545249172969480270423506113756252059393994161121167624657328512921329774047507219749117888147909432405047771671100718676089080588528960081839536912090408985888136525368290498676505141160027931711454467087759866060806838725108967269962443281869143175362983459526636977423020653502150094776835064143633800822487189109694198981056405983548073402360851698634294079928260150990975591434570200239897952734163533558993207992184455029929676578876461411504718988005378108935912460707630747710200078107627209768693303895506846356413816540755649789124272390341812124864084325130184866140626795577520667111663313184833004145781558930398573197351365976126636145198763184656772224746201971578256549572678971859043030013270970292344448533964981873836745955394475866500067712157723989819539248919037278146961651056556723566205834744704103608227955500235938477724104635663014048178840666233026833115743259169198371293038718278742733131523651469794106797715533791515052566970046745903949712526006708146097572461001279607545357751212952461059862940541594788228756264659684779850027798994034976444127874849634481058766192548381685324791176780226494109343855714310755734335422336527515858626597578423224438877849042491533501485663235911002202156885478067776562335858847281562823613873106972373191803627961830858739139689381236581093486879069598244567777351151209616477
82478547996964158017770474267694745127765728075154991639284780506357676172418141531610690233498078729269892763215851298326697031963083156388870951507258733273829783684142918537142066424795653406230816456352598993251527230771820878258833205024487599877064147085557976951564377110052554537730883884216920487128087570321997804713974993445267107420250273662412777095424067692591630503197110099079377979880052894875862892852391917098747838760262645461876310239840055171564644533785242576329540799041182025562596318719414942065544256753910400681373332898429617639523151693619436595570988879081484724547535344411808058767268908294064284102939120614473959308854620399836394549991568930783067219061575256786655872883110225354131650188660387745545192374586311723592955745210801474700018076461326115481737998091808721765548831324079757546337813959027260732633583918646542268386112412803827918103477913132612281869226482819445126652442132214707083041812658273955302949592678331815125161060167088868658025850823682665893082822779733104179797713629615732893065185441728758480506276002154623345654533170649596371522711375085533497498640764773752907993401266918926268050212431133412400949881183023952819877446949388397056751248998052053118407354828673089219280329031268376765967036770745463540703164224006591688328779105301020491886389219875605197949114084133766901390237506831707642694232451884789747826858332316477552828090232756005795620680223061259656038947140967091690143591738857250224104823079835828485440294303085047100509136074279471469602657822781828518956910195542447718296041382668801366131796279020329281873160749724139841729318154144595329110179323631214740241202079320425896336249802024929158081084726829576108627712686581941354675820246691481302740106396265732672270845009630515864125733115296887795462748075789800795892365745900322837087805768417352202570560894888890545034413478293403429878399988503158179414304641195393220301000363915854844290735441944372486200410049470944943068131249472332218094
64184492977405338266088004753342006034454256802717860213487102304602111082069729179183912892310926940010895293383091638154273639132326948510614314636246553636478230586408593481841258855076019524321468305027698853616858544467250133242472615996699171731672015980715690856827203376654427469688828493437318017191647963907353096155949414777105832802809111797630502385412343654294767130489796867215540128434416046595314739194824591845330992960181683418445356776559743473171773332982169278213091022970109475662032901706168542366477219418250959082099419221794734241867163997537294903638475623203613762563916868986220108618349798999297270144629976714666404789938566426084294730264011993843464803887031369997182138090586858439263165728582852107148235168845179981951308720985000631523794448909717418597240983588271119017952446552908771616828729694498105134321362587752084498298540631630047275976674069416103878183339417398381640250832037298996904027435379281480080779997366294091624818615213365710341769880327641623742788293755856604362177161860165904684510408376136291933499279091071147396522966312847540590794851028270031075275070110877864018101022208278340073515317163453715671305253878011789709400417146562136811038036869947227020189400688889289145368187906102145887240147199649624741501825733634493584569288672808008071700916133597015265730531592515259177198882367872332958036736813927476039975698636525851141792384458124270507717681351913851341602999123763657321159787790779636932576524606488017978717474806161069078372492020161577278433064090872570321300847401308302034646131087623196009250892633693952221225135875156933075853403175847677904474782437577896496547224344704997964123613109921984911971617174297374719817432121315231348044318866252156117239158307344697451683424693705097269162351898260575905280042177114272875010079571280939316425425734562133587147771503016863766670267188101589212539344187429237337097886813790907620488059495062409895961794810564953442060219347321914248124224100979617970052
618332117897515242897039379709127894075251637810159733054718222062365062421320028544993426422351376478518352472783084637480300654371523661962663091208492103830074944716764484446634939320765685200444210417771706531035127562476041407515658152099601907167854337328045882514475289024549979687331418902963389094528577036145080901894491929405642630745909979143286735986058234759547883452690878465663700666796458122673957366323079880085451052670365099161588137223768621426118555148885228241036087524278431003970843443200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000</p><p>And we would easily find that two facts exist without any conflict:</p><ul><li>You can select and open 2408 sites freely in nearly infinite possible ways</li><li>The destination pciture will never change when all the open sites connected</li></ul><p>Every day you make decisions that will have an impact on your life, sometimes you make them according to your own will, sometimes you just have to and have no choice. When Christians are taught that God has a plan for their life and predestined a good ending - we don’t discuss what kind of good it would be here, it seems very confusing and contrary to our daily life, and very difficult to believe that.</p><p>How can this be? 
How can a man act according to his free will if there is predestination from God already?</p><p>This question is either very easy to answer, or very difficult and even impossible to answer.</p><p>The easy approach is to question the question itself, as it is based on a false assumption. Why can’t God’s predestination and men’s free will co-exist? Why do we think it’s so obvious that you can only have one of them, either predestination or free will?</p><p>Because of this false assumption, and the preciousness of free will, the Christian belief in Predestination has become chains on the souls who are struggling with how to make decisions - though it doesn’t need to be like this.</p><p>We simply overrate our ability and underrate God’s providence.</p><p>If those selected and opened sites are the steps of my life, then I know that no matter how I walk through these steps, my destination won’t change: I will draw a picture of myself rebuilt by God’s grace. Many times I cannot understand why certain steps are connected in such a confusing way, but I can still testify that at the right time God led me to the right place to do the right things - even though I usually don’t know what the right things really are.😅</p><p>If those selected and opened sites are brothers and sisters in the church, called by God to build His Kingdom, then I know that no matter how and when they are elected, the destination won’t change: we will draw a picture of the Kingdom of God that God will surely restore.</p><p>Appreciating and recognizing God’s work in brothers and sisters is a joyful task that I’d like to practice always. You never know how or when God will use a man, but the result will always help you to know more about God and more about yourself.</p><p>In the beginning, God created the heavens and the earth. And in the end, “a new heaven and a new earth”. 
As a result, the bottom row is sure to connect to the top row, and <code>percolates</code> will always return <code>true</code>. If this is the test case, you can skip the complex implementation and simply <code>return true</code>.</p><p><img src="../images/heaven-and-earth.jpg" alt=""></p><p>Percolation, predestination, and freewill. Complicated, simple, and mysterious.</p><p>Hard to believe, right? Maybe the lives and works of Martin Luther and John Calvin will give you some insight into this never-ending discussion.</p><script type="text/javascript">amzn_assoc_placement = "adunit0";amzn_assoc_search_bar = "false";amzn_assoc_tracking_id = "oldyoungboy-20";amzn_assoc_ad_mode = "manual";amzn_assoc_ad_type = "smart";amzn_assoc_marketplace = "amazon";amzn_assoc_region = "US";amzn_assoc_title = "";amzn_assoc_linkid = "1eea2a1e0fa470806f2efa9531b2818d";amzn_assoc_asins = "1426754434,030017084X,1296609529,0875521126";</script><script src="//z-na.amazon-adsystem.com/widgets/onejs?MarketPlace=US"></script>]]></content>
<summary type="html">
<p>Yeah, I just finished the week 1 assignment of <a href="https://www.coursera.org/learn/algorithms-part1" target="_blank" rel="noopener">Algorithms, Part I</a> about <a href="http://coursera.cs.princeton.edu/algs4/assignments/percolation.html" target="_blank" rel="noopener">percolation</a>, but why the heck is this post trying to talk about Predestination and Freewill, a religious topic that would be super boring? OK. Calm down, calm down and watch this video first, my friend. It’s a silent film that lasts only 20 seconds.</p>
<div class="video-container"><iframe src="//www.youtube.com/embed/52v9z0EfRK8" frameborder="0" allowfullscreen></iframe></div>
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="Predestination" scheme="http://www.oldyoungboys.club/tags/Predestination/"/>
</entry>
<entry>
<title>5 Things I Won't Tell You About Algorithms, Part I</title>
<link href="http://www.oldyoungboys.club/5-Things-I-don-t-want-to-tell-you-about-Algorithms-Part-I/"/>
<id>http://www.oldyoungboys.club/5-Things-I-don-t-want-to-tell-you-about-Algorithms-Part-I/</id>
<published>2018-12-24T17:32:50.000Z</published>
<updated>2019-09-03T20:29:56.225Z</updated>
<content type="html"><![CDATA[<p><a href="https://www.coursera.org/learn/algorithms-part1" target="_blank" rel="noopener">Algorithms, Part I</a> at Coursera by Princeton University is the best online algorithms course I’ve ever enrolled in. I highly recommend you take a look, though there are 5 things I don’t want to - or at least hesitate to - tell you about it.</p><p><img src="https://www.dropbox.com/s/14rg1ow1rpnod2d/sedgewick60.png?dl=1" alt=""><a id="more"></a></p><h3 id="I-Gave-Up-Twice"><a href="#I-Gave-Up-Twice" class="headerlink" title="I Gave Up Twice"></a>I Gave Up Twice</h3><p>I’m ashamed to tell you that I’ve enrolled in this course more than once. I first enrolled in 2014, then picked it up again probably in 2016. But I never finished it, and usually I got no further than Week 1, Course 1, Video 1.</p><p>It’s Dec 2018 already and I find I am still struggling with coding contests like the LeetCode weekly contest, so I think it’s time to review and strengthen my skills again.</p><p>The instructors of this course suggest 6 weeks of study at 6-10 hours per week, which is clearly affordable in terms of time and workload. Then why did I give up twice before? I lack patience and endurance. This is a course you need to devote yourself to, and you won’t see any visible impact in a short time, but in the long term the reward is high - at least I believe so.</p><p>So this time I set a punishment like this in case I give up a third time.</p><p><img src="https://www.dropbox.com/s/z0pfde4bps6atuo/Wakisashi-sepukku.jpg?dl=1" alt=""></p><h3 id="Don’t-Copy-My-Code"><a href="#Don’t-Copy-My-Code" class="headerlink" title="Don’t Copy My Code"></a>Don’t Copy My Code</h3><p>To force myself to stick to the goal, I created a repository on GitHub to commit the <a href="https://github.com/ny83427/algs4-assignments" target="_blank" rel="noopener">assignments</a> I finish and to track progress each week. 
Fortunately, I’ve finished two weeks’ tasks ahead of schedule and I am still alive.</p><p>I would say that the assignments are very good practice. You apply what you have learned in the course, and they are also challenging. Each assignment usually says it will take about 5 hours to finish, but I found that getting a good score takes longer than that.</p><p>I literally spent 8 hours on the first assignment, <code>Percolation</code>, and submitted 6 times to get a good score. I also enhanced the test methodology myself, as you can see in issue <a href="https://github.com/kevin-wayne/algs4/issues/64" target="_blank" rel="noopener">#64</a>, which I reported.</p><p>Try to solve the problems yourself, but within a reasonable time (I am not a student anymore and I don’t have that much time). If you are stuck for a long time, you can search for tips and insights, or go to the forum to see what the mentors have replied there. Don’t copy my code, man, but you can review it to get some ideas if you really need to.</p><p>The online grading system is very impressive and I am thinking of introducing it into the OIT courses I would teach in the future.</p><h3 id="I-Get-Stuck-Frequently"><a href="#I-Get-Stuck-Frequently" class="headerlink" title="I Get Stuck Frequently"></a>I Get Stuck Frequently</h3><p>Yeah, don’t be upset. If you get stuck along the way, remember that there is a fool like me who also gets upset, depressed, angry, anxious, and suspicious regularly. 
It’s very clear I don’t have much talent in this area, but hard work still improves me somewhat.</p><table><thead><tr><th>Percolation Submission Date</th><th>Score</th><th>Passed?</th></tr></thead><tbody><tr><td>December 21, 09:31 AM PST</td><td>100/100</td><td>Yes</td></tr><tr><td>December 20, 10:48 PM PST</td><td>93/100</td><td>Yes</td></tr><tr><td>December 20, 10:41 PM PST</td><td>81/100</td><td>Yes</td></tr><tr><td>December 20, 09:17 PM PST</td><td>79/100</td><td>No</td></tr><tr><td>December 20, 03:31 PM PST</td><td>79/100</td><td>No</td></tr><tr><td>December 20, 03:06 PM PST</td><td>52/100</td><td>No</td></tr></tbody></table><p>The 5th submission passed and its score seemed good enough, but there was still a bug I could not ignore. After fixing it I got 100, but the memory usage is still not well optimized, which I plan to improve in the future.</p><h3 id="I-Got-The-Textbook"><a href="#I-Got-The-Textbook" class="headerlink" title="I Got The Textbook"></a>I Got The Textbook</h3><p>The lecture slides are good enough for reading, but thorough and systematic study still requires a textbook. Algorithms, 4th Edition by Robert Sedgewick and Kevin Wayne provides</p><blockquote><p>“essential information that<br>every serious programmer<br>needs to know about<br>algorithms and data structures”</p></blockquote><p>So go and get it, <strong>DO NOT HESITATE!</strong></p><p><a href="https://amzn.to/2RivTXe" target="_blank" rel="noopener"><img src="https://algs4.cs.princeton.edu/cover.png" width="189" height="236" border="0" alt="Algorithms, 4th Edition by Robert Sedgewick and Kevin Wayne" style="display: block;margin-left: auto;margin-right: auto;"></a></p><h3 id="My-Unit-TestCase-Rocks"><a href="#My-Unit-TestCase-Rocks" class="headerlink" title="My Unit TestCase Rocks"></a>My Unit TestCase Rocks</h3><p>Honestly speaking, the unit tests mentioned in the course are not really unit tests at all. 
Simply running the <code>main</code> method and comparing the output with the expected result by hand just sucks, so I would shamelessly recommend using JUnit or TestNG for this. Feel free to use my test cases; they will save you a lot of time. For example, I’ve collected all the tests provided by the instructor and integrated them into unit tests. You may like to check out <a href="https://github.com/ny83427/algs4-assignments/blob/master/src/test/java/PercolationTest.java" target="_blank" rel="noopener">PercolationTest.java</a> to see whether I am a good, old, and young boy.</p><figure class="highlight java"><table><tr><td class="code"><pre><span class="line"><span 
class="meta">@Test</span></span><br><span class="line"><span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">all</span><span class="params">()</span> </span>{</span><br><span class="line">    In in = <span class="keyword">new</span> In(<span class="keyword">this</span>.getClass().getResource(<span class="string">"/percolation-test-cases.txt"</span>));</span><br><span class="line">    <span class="keyword">int</span> failed = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">while</span> (!in.isEmpty()) {</span><br><span class="line">        String line = in.readLine();</span><br><span class="line">        String[] array = line.split(<span class="string">","</span>);</span><br><span class="line">        Percolation perc = <span class="keyword">this</span>.from(array[<span class="number">0</span>]);</span><br><span class="line"></span><br><span class="line">        <span class="keyword">try</span> {</span><br><span class="line">            assertEquals(perc.numberOfOpenSites(), Integer.parseInt(array[<span class="number">1</span>]));</span><br><span class="line">            assertEquals(perc.percolates(), Boolean.parseBoolean(array[<span class="number">2</span>]));</span><br><span class="line">        } <span class="keyword">catch</span> (AssertionError e) {</span><br><span class="line">            <span class="comment">// I want to know how many cases fail and the exact corresponding files</span></span><br><span class="line">            <span class="comment">// You can simplify this if you just want to know the first failed case</span></span><br><span class="line">            System.err.println(line + <span class="string">" -> "</span> + perc.numberOfOpenSites() + <span class="string">","</span> + perc.percolates());</span><br><span class="line">            failed++;</span><br><span class="line">        }</span><br><span class="line">    }</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (failed > <span class="number">0</span>) {</span><br><span 
class="line">        <span class="keyword">throw</span> <span class="keyword">new</span> AssertionError(failed + <span class="string">" test cases failed!"</span>);</span><br><span class="line">    }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="meta">@Test</span></span><br><span class="line"><span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">isFull</span><span class="params">()</span> </span>{</span><br><span class="line">    Percolation perc = <span class="keyword">new</span> Percolation(<span class="number">3</span>);</span><br><span class="line">    perc.open(<span class="number">1</span>, <span class="number">1</span>);</span><br><span class="line">    perc.open(<span class="number">3</span>, <span class="number">1</span>);</span><br><span class="line">    assertFalse(perc.isFull(<span class="number">3</span>, <span class="number">1</span>));</span><br><span class="line"></span><br><span class="line">    perc.open(<span class="number">1</span>, <span class="number">3</span>);</span><br><span class="line">    perc.open(<span class="number">3</span>, <span class="number">3</span>);</span><br><span class="line">    assertFalse(perc.isFull(<span class="number">3</span>, <span class="number">3</span>));</span><br><span class="line"></span><br><span class="line">    perc.open(<span class="number">2</span>, <span class="number">3</span>);</span><br><span class="line">    assertTrue(perc.isFull(<span class="number">3</span>, <span class="number">3</span>));</span><br><span class="line">    assertFalse(perc.isFull(<span class="number">3</span>, <span class="number">1</span>));</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Finally, if you have to purchase one and only one algorithm book and are still hesitating, then get <a href="https://amzn.to/2RivTXe" target="_blank" rel="noopener"><strong>Algorithms, 4th Edition</strong></a> first. 
Trust me, you will never regret it.</p><script type="text/javascript">amzn_assoc_tracking_id = "oldyoungboy-20";amzn_assoc_ad_mode = "manual";amzn_assoc_ad_type = "smart";amzn_assoc_marketplace = "amazon";amzn_assoc_region = "US";amzn_assoc_design = "enhanced_links";amzn_assoc_asins = "032157351X";amzn_assoc_placement = "adunit";amzn_assoc_linkid = "a5b1635ea3b24455b5030a41c9f10ef9";</script><script src="//z-na.amazon-adsystem.com/widgets/onejs?MarketPlace=US"></script>]]></content>
<summary type="html">
<p><a href="https://www.coursera.org/learn/algorithms-part1" target="_blank" rel="noopener">Algorithms, Part I</a> at Coursera by Princeton University is the best online algorithms course I’ve ever enrolled in. I highly recommend you take a look, though there are 5 things I don’t want to - or at least hesitate to - tell you about it.</p>
<p><img src="https://www.dropbox.com/s/14rg1ow1rpnod2d/sedgewick60.png?dl=1" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="Algorithms" scheme="http://www.oldyoungboys.club/tags/Algorithms/"/>
</entry>
<entry>
<title>Behind-the-Scenes Secrets of Jsoup V: Tips & Tricks of Optimization</title>
<link href="http://www.oldyoungboys.club/Behind-the-Scenes-Secrets-of-Jsoup-Tips-And-Tricks/"/>
<id>http://www.oldyoungboys.club/Behind-the-Scenes-Secrets-of-Jsoup-Tips-And-Tricks/</id>
<published>2018-12-17T17:02:51.000Z</published>
<updated>2019-09-03T20:29:56.230Z</updated>
<content type="html"><![CDATA[<p>We have done things right, now it’s time to do things faster. We will keep <a href="https://en.wikipedia.org/wiki/Donald_Knuth" target="_blank" rel="noopener">Donald Knuth</a>’s warning in mind: “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil”.</p><p><img src="https://www.dropbox.com/s/kk13zhbeg3nsgmc/jsoup-performance.jpg?dl=1" alt=""><a id="more"></a></p><p>Jonathan Hedley says he uses <a href="https://www.yourkit.com/java/profiler/" target="_blank" rel="noopener">YourKit Java Profiler</a> to measure memory usage and find the performance hot spots. Relying on the measurements of such tools is crucial for successful optimization: it keeps you from spending time on guesswork and useless tuning, which doesn’t improve performance and only makes your code unnecessarily complex and hard to maintain. Jonathan also talks about this in the <a href="https://jsoup.org/colophon" target="_blank" rel="noopener">“Colophon”</a>.</p><p>We will list some tips and tricks used in Jsoup. They are in no particular order for now and will be reorganized in the future.</p><h3 id="1-Padding-for-Indent"><a href="#1-Padding-for-Indent" class="headerlink" title="1. Padding for Indent"></a>1. 
Padding for Indent</h3><figure class="highlight java"><table><tr><td class="code"><pre><span class="line"><span class="comment">// memoised padding up to 21, from "", " ", " " to " "</span></span><br><span class="line"><span class="keyword">static</span> <span class="keyword">final</span> String[] padding = {......};</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">public</span> <span class="keyword">static</span> String <span class="title">padding</span><span class="params">(<span class="keyword">int</span> width)</span> </span>{</span><br><span class="line">    <span class="keyword">if</span> (width < <span class="number">0</span>)</span><br><span class="line">        <span class="keyword">throw</span> <span class="keyword">new</span> IllegalArgumentException(<span class="string">"width must be > 0"</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (width < padding.length)</span><br><span class="line">        <span class="keyword">return</span> padding[width];</span><br><span class="line">    <span class="keyword">char</span>[] out = <span class="keyword">new</span> <span class="keyword">char</span>[width];</span><br><span class="line">    <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < width; i++)</span><br><span class="line">        out[i] 
= <span class="string">' '</span>;</span><br><span class="line">    <span class="keyword">return</span> String.valueOf(out);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">protected</span> <span class="keyword">void</span> <span class="title">indent</span><span class="params">(Appendable accum, <span class="keyword">int</span> depth, Document.OutputSettings out)</span> <span class="keyword">throws</span> IOException </span>{</span><br><span class="line">    accum.append(<span class="string">'\n'</span>).append(StringUtil.padding(depth * out.indentAmount()));</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Pretty smart, right? It maintains a cache of paddings of different lengths, which would cover 80% of the cases - I assume this is based on the author’s experience and statistics.</p><h3 id="2-Has-Class-Or-Not"><a href="#2-Has-Class-Or-Not" class="headerlink" title="2. Has Class Or Not?"></a>2. Has Class Or Not?</h3><p><a href="https://github.com/jhy/jsoup/blob/master/src/main/java/org/jsoup/nodes/Element.java#L1270" target="_blank" rel="noopener"><code>Element#hasClass</code></a> is marked as <strong>performance sensitive</strong>. For example, suppose we want to check whether <code><div class="logged-in env-production intent-mouse"></code> has the class <code>production</code>. Splitting the class attribute by whitespace into an array and then looping over it would work, but in a deep traversal this would be inefficient. Jsoup first introduces an <strong>Early Exit</strong> here: it compares the attribute length with the length of the target class name to avoid an unnecessary scan and search. 
Then it uses a pointer to detect whitespace and performs regionMatches - honestly speaking, this is the first time I’ve come across the method <code>String#regionMatches</code>🙈😅</p><figure class="highlight java"><table><tr><td class="code"><pre><span class="line"><span class="function"><span class="keyword">public</span> <span class="keyword">boolean</span> <span class="title">hasClass</span><span class="params">(String className)</span> </span>{</span><br><span class="line">    <span class="keyword">final</span> String classAttr = attributes().getIgnoreCase(<span class="string">"class"</span>);</span><br><span class="line">    <span 
class="keyword">final</span> <span class="keyword">int</span> len = classAttr.length();</span><br><span class="line"> <span class="keyword">final</span> <span class="keyword">int</span> wantLen = className.length();</span><br><span class="line"></span><br><span class="line"> <span class="keyword">if</span> (len == <span class="number">0</span> || len < wantLen) {</span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">false</span>;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// if both lengths are equal, only need compare the className with the attribute</span></span><br><span class="line"> <span class="keyword">if</span> (len == wantLen) {</span><br><span class="line"> <span class="keyword">return</span> className.equalsIgnoreCase(classAttr);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// otherwise, scan for whitespace and compare regions (with no string or arraylist allocations)</span></span><br><span class="line"> <span class="keyword">boolean</span> inClass = <span class="keyword">false</span>;</span><br><span class="line"> <span class="keyword">int</span> start = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < len; i++) {</span><br><span class="line"> <span class="keyword">if</span> (Character.isWhitespace(classAttr.charAt(i))) {</span><br><span class="line"> <span class="keyword">if</span> (inClass) {</span><br><span class="line"> <span class="comment">// white space ends a class name, compare it with the requested one, ignore case</span></span><br><span class="line"> <span class="keyword">if</span> (i - start == wantLen && classAttr.regionMatches(<span class="keyword">true</span>, start, className, <span class="number">0</span>, wantLen)) {</span><br><span class="line"> 
<span class="keyword">return</span> <span class="keyword">true</span>;</span><br><span class="line"> }</span><br><span class="line"> inClass = <span class="keyword">false</span>;</span><br><span class="line"> }</span><br><span class="line"> } <span class="keyword">else</span> {</span><br><span class="line"> <span class="keyword">if</span> (!inClass) {</span><br><span class="line"> <span class="comment">// we're in a class name : keep the start of the substring</span></span><br><span class="line"> inClass = <span class="keyword">true</span>;</span><br><span class="line"> start = i;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// check the last entry</span></span><br><span class="line"> <span class="keyword">if</span> (inClass && len - start == wantLen) {</span><br><span class="line"> <span class="keyword">return</span> classAttr.regionMatches(<span class="keyword">true</span>, start, className, <span class="number">0</span>, wantLen);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">false</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="3-Tag-Name-In-or-Not"><a href="#3-Tag-Name-In-or-Not" class="headerlink" title="3. Tag Name In or Not?"></a>3. Tag Name In or Not?</h3><p>As we saw in previous articles, <code>HtmlTreeBuilderState</code> validates nesting correctness by checking whether a tag name is in a certain collection. 
We can compare the implementation before and after <em>1.7.3</em> to have a check.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 1.7.2</span></span><br><span class="line">} <span class="keyword">else</span> <span class="keyword">if</span> (StringUtil.in(name, <span class="string">"base"</span>, <span class="string">"basefont"</span>, <span class="string">"bgsound"</span>, <span class="string">"command"</span>, <span class="string">"link"</span>, <span class="string">"meta"</span>, <span class="string">"noframes"</span>, <span class="string">"script"</span>, <span class="string">"style"</span>, <span class="string">"title"</span>)) {</span><br><span class="line"> <span class="keyword">return</span> tb.process(t, InHead);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// 1.7.3</span></span><br><span class="line"><span class="keyword">static</span> <span class="keyword">final</span> String[] InBodyStartToHead = <span class="keyword">new</span> String[]{<span class="string">"base"</span>, <span class="string">"basefont"</span>, <span class="string">"bgsound"</span>, <span class="string">"command"</span>, <span class="string">"link"</span>, <span class="string">"meta"</span>, <span class="string">"noframes"</span>, <span class="string">"script"</span>, <span class="string">"style"</span>, <span class="string">"title"</span>};</span><br><span class="line">...</span><br><span class="line">} <span class="keyword">else</span> <span class="keyword">if</span> 
(StringUtil.inSorted(name, Constants.InBodyStartToHead)) {</span><br><span class="line"> <span class="keyword">return</span> tb.process(t, InHead);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>According to the author’s comment, “A little harder to read here, but causes less GC than dynamic varargs. Was contributing around 10% of parse GC load. Must make sure these are sorted, as used in findSorted”. Moving the names into <code>static final</code> constant arrays avoids allocating a varargs array on every call, and keeping the arrays sorted lets the lookup use binary search, improving it from O(n) to O(log(n)). The cost–performance ratio is pretty good here.</p><p>However, “MUST update HtmlTreebuilderStateTest if more arrays added” is not a good way to keep things in sync IMHO. Rather than copy and paste, I would use reflection to retrieve those constants. You may find my proposal in Pull Request <a href="https://github.com/jhy/jsoup/pull/1157" target="_blank" rel="noopener">#1157: “Simplify state sorting status unit test - avoid duplicated code in HtmlTreeBuilderStateTest.java”</a>.</p><h3 id="4-Flyweight-Pattern"><a href="#4-Flyweight-Pattern" class="headerlink" title="4. Flyweight Pattern"></a>4. Flyweight Pattern</h3><p>Do you know the trick of <code>Integer.valueOf(i)</code>? It maintains an <code>IntegerCache</code> covering -128 to 127, or a higher upper bound if configured (<code>java.lang.Integer.IntegerCache.high</code>). As a result, <code>==</code> and <code>equals</code> can give different results once values fall outside the cached range (a classic Java interview question?). This is actually an example of the <a href="https://en.wikipedia.org/wiki/Flyweight_pattern" target="_blank" rel="noopener">Flyweight Pattern</a>. 
As for Jsoup, applying this pattern also reduces the number of objects created, which benefits performance.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Caches short strings, as a flywheel pattern, to reduce GC load. 
Just for this doc, to prevent leaks.</span></span><br><span class="line"><span class="comment"> * <p /></span></span><br><span class="line"><span class="comment"> * Simplistic, and on hash collisions just falls back to creating a new string, vs a full HashMap with Entry list.</span></span><br><span class="line"><span class="comment"> * That saves both having to create objects as hash keys, and running through the entry list, at the expense of</span></span><br><span class="line"><span class="comment"> * some more duplicates.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="keyword">private</span> <span class="keyword">static</span> String <span class="title">cacheString</span><span class="params">(<span class="keyword">final</span> <span class="keyword">char</span>[] charBuf, <span class="keyword">final</span> String[] stringCache, <span class="keyword">final</span> <span class="keyword">int</span> start, <span class="keyword">final</span> <span class="keyword">int</span> count)</span> </span>{</span><br><span class="line"> <span class="comment">// limit (no cache):</span></span><br><span class="line"> <span class="keyword">if</span> (count > maxStringCacheLen)</span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">new</span> String(charBuf, start, count);</span><br><span class="line"> <span class="keyword">if</span> (count < <span class="number">1</span>)</span><br><span class="line"> <span class="keyword">return</span> <span class="string">""</span>;</span><br><span class="line"></span><br><span class="line"> <span class="comment">// calculate hash:</span></span><br><span class="line"> <span class="keyword">int</span> hash = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">int</span> offset = start;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span 
class="number">0</span>; i < count; i++) {</span><br><span class="line"> hash = <span class="number">31</span> * hash + charBuf[offset++];</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// get from cache</span></span><br><span class="line"> <span class="keyword">final</span> <span class="keyword">int</span> index = hash & stringCache.length - <span class="number">1</span>;</span><br><span class="line"> String cached = stringCache[index];</span><br><span class="line"></span><br><span class="line"> <span class="keyword">if</span> (cached == <span class="keyword">null</span>) { <span class="comment">// miss, add</span></span><br><span class="line"> cached = <span class="keyword">new</span> String(charBuf, start, count);</span><br><span class="line"> stringCache[index] = cached;</span><br><span class="line"> } <span class="keyword">else</span> { <span class="comment">// hashcode hit, check equality</span></span><br><span class="line"> <span class="keyword">if</span> (rangeEquals(charBuf, start, count, cached)) { <span class="comment">// hit</span></span><br><span class="line"> <span class="keyword">return</span> cached;</span><br><span class="line"> } <span class="keyword">else</span> { <span class="comment">// hashcode conflict</span></span><br><span class="line"> cached = <span class="keyword">new</span> String(charBuf, start, count);</span><br><span class="line"> stringCache[index] = cached; <span class="comment">// update the cache, as recently used strings are more likely to show up again</span></span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> cached;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>There is also another scenario to minimize new StringBuilder GCs using the same idea.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">private</span> <span class="keyword">static</span> <span class="keyword">final</span> Stack<StringBuilder> builders = <span class="keyword">new</span> Stack<>();</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Maintains cached StringBuilders in a flyweight pattern, to minimize new StringBuilder GCs. The StringBuilder is</span></span><br><span class="line"><span class="comment"> * prevented from growing too large.</span></span><br><span class="line"><span class="comment"> * <p></span></span><br><span class="line"><span class="comment"> * Care must be taken to release the builder once its work has been completed, with {<span class="doctag">@see</span> #releaseBuilder}</span></span><br><span class="line"><span class="comment">*/</span></span><br><span class="line"><span class="function"><span class="keyword">public</span> <span class="keyword">static</span> StringBuilder <span class="title">borrowBuilder</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">synchronized</span> (builders) {</span><br><span class="line"> <span class="keyword">return</span> builders.empty() ?</span><br><span class="line"> <span class="keyword">new</span> StringBuilder(MaxCachedBuilderSize) :</span><br><span class="line"> builders.pop();</span><br><span class="line"> }</span><br><span 
class="line">}</span><br></pre></td></tr></table></figure><p>Actually, <a href="https://github.com/jhy/jsoup/blob/master/src/main/java/org/jsoup/parser/CharacterReader.java" target="_blank" rel="noopener"><code>CharacterReader</code></a> and <a href="https://github.com/jhy/jsoup/blob/master/src/main/java/org/jsoup/internal/StringUtil.java" target="_blank" rel="noopener"><code>StringUtil</code></a> are well worth studying closely, as they contain many useful tips and tricks that will inspire you.</p><h3 id="5-Other-Improvement-Methods"><a href="#5-Other-Improvement-Methods" class="headerlink" title="5. Other Improvement Methods"></a>5. Other Improvement Methods</h3><ul><li>Use RandomAccessFile to read files, which improved file read time by 2x. Check <a href="https://github.com/jhy/jsoup/issues/248" target="_blank" rel="noopener">#248</a> for more details</li><li>Node hierarchy refactoring. Check <a href="https://github.com/jhy/jsoup/issues/911" target="_blank" rel="noopener">#911</a> for more details</li><li>“Improvements largely from re-ordering the HtmlTreeBuilder methods based on analysis of various websites” - I list this one here because it’s very practical. A deeper understanding and observation of how the code actually runs will also give you some insights</li><li>Call <code>list.toArray(new T[0])</code> rather than <code>list.toArray(new T[list.size()])</code> - this has been adopted in certain open source projects such as <a href="https://github.com/h2database/h2database/issues/311" target="_blank" rel="noopener">h2database</a>, so I also proposed it in another Pull Request <a href="https://github.com/jhy/jsoup/pull/1158" target="_blank" rel="noopener">#1158</a></li></ul><h3 id="6-The-Unknowns"><a href="#6-The-Unknowns" class="headerlink" title="6. The Unknowns"></a>6. The Unknowns</h3><p>Optimization never ends. There are still many tips and tricks I haven’t discovered yet. I would appreciate it if you could share them with me if you find more inspiring ideas in Jsoup. 
You may find my contact information in the left sidebar of this website, or simply email <code>ny83427 at gmail.com</code>.</p><p><strong>-To Be Continued-</strong></p>]]></content>
<summary type="html">
<p>We have done things right; now it’s time to do things faster. We will keep <a href="https://en.wikipedia.org/wiki/Donald_Knuth" target="_blank" rel="noopener">Donald Knuth</a>’s warning in mind: “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil”.</p>
<p><img src="https://www.dropbox.com/s/kk13zhbeg3nsgmc/jsoup-performance.jpg?dl=1" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="Jsoup" scheme="http://www.oldyoungboys.club/tags/Jsoup/"/>
<category term="Code Review" scheme="http://www.oldyoungboys.club/tags/Code-Review/"/>
</entry>
<entry>
<title>Behind-the-Scenes Secrets of Jsoup IV: CSS Selector</title>
<link href="http://www.oldyoungboys.club/Behind-the-Scenes-Secrets-of-Jsoup-CSS-Selector/"/>
<id>http://www.oldyoungboys.club/Behind-the-Scenes-Secrets-of-Jsoup-CSS-Selector/</id>
<published>2018-12-17T04:40:04.000Z</published>
<updated>2019-09-03T20:29:56.228Z</updated>
<content type="html"><![CDATA[<p>Most of the time, you just consume the <code>Document</code> Jsoup built for you, e.g. <code>document.select(${selector})</code>. It is helpful to go through the <a href="https://www.w3.org/TR/CSS2/selector.html" target="_blank" rel="noopener">W3C CSS Selector Specification</a> to understand Jsoup’s roadmap. We have talked about node traversal before; what <code>Selector</code> actually does is filter and collect elements while traversing, so the key points here are parsing and evaluating the given queries.</p><p><img src="https://www.dropbox.com/s/usdc8a4am6gq6jp/make-coffee.jpg?dl=1" alt=""><a id="more"></a></p><h3 id="Overview-of-Package"><a href="#Overview-of-Package" class="headerlink" title="Overview of Package"></a>Overview of Package</h3><p>Package <code>org.jsoup.select</code> hasn’t changed much in the latest version <em>1.12.1-SNAPSHOT</em>, so I will reuse the UML diagram made by <a href="https://github.com/code4craft/" target="_blank" rel="noopener">Yihua Huang</a> directly here:</p><p><img src="https://www.dropbox.com/s/z2e274uwgupl6oa/select-uml-Yihua-Huang.jpeg?dl=1" alt=""></p><p>Besides the <code>NodeVisitor</code> interface, there is another similar interface, <code>NodeFilter</code>; their methods are nearly the same:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span 
class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">interface</span> <span class="title">NodeFilter</span> </span>{</span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Filter decision.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="keyword">enum</span> FilterResult {</span><br><span class="line"> <span class="comment">/** Continue processing the tree */</span></span><br><span class="line"> CONTINUE,</span><br><span class="line"> <span class="comment">/** Skip the child nodes, but do call {<span class="doctag">@link</span> NodeFilter#tail(Node, int)} next. */</span></span><br><span class="line"> SKIP_CHILDREN,</span><br><span class="line"> <span class="comment">/** Skip the subtree, and do not call {<span class="doctag">@link</span> NodeFilter#tail(Node, int)}. 
*/</span></span><br><span class="line"> SKIP_ENTIRELY,</span><br><span class="line"> <span class="comment">/** Remove the node and its children */</span></span><br><span class="line"> REMOVE,</span><br><span class="line"> <span class="comment">/** Stop processing */</span></span><br><span class="line"> STOP</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Callback for when a node is first visited.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="function">FilterResult <span class="title">head</span><span class="params">(Node node, <span class="keyword">int</span> depth)</span></span>;</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Callback for when a node is last visited, after all of its descendants have been visited.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="function">FilterResult <span class="title">tail</span><span class="params">(Node node, <span class="keyword">int</span> depth)</span></span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>There is only one implementation in production code, <code>FirstFinder</code>; the 5 other implementations all live in test code and are probably reserved for the future. As for <code>FirstFinder</code>, it’s easy to guess that it was created for the purpose of optimization. 
You don’t need to collect all the matching elements just to return the first one. Instead of invoking <code>select(${selector}).get(0)</code>, you can use <code>selectFirst(${selector})</code>, which Jsoup introduced for the case where you only care about the first match - a frequent scenario actually.</p><h3 id="NodeFilter-Another-Traverse"><a href="#NodeFilter-Another-Traverse" class="headerlink" title="NodeFilter: Another Traverse"></a>NodeFilter: Another Traverse</h3><p>I don’t want to repeat node traversal here, but it is helpful to review it from another angle - the implementation of filtering. The idea and workflow are exactly the same, though the code is a little longer. You may compare it with <a href="https://github.com/jhy/jsoup/blob/master/src/main/java/org/jsoup/select/NodeTraversor.java#L40" target="_blank" rel="noopener"><code>NodeTraversor#traverse</code></a> for a better understanding.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span 
class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">public</span> <span class="keyword">static</span> FilterResult <span class="title">filter</span><span class="params">(NodeFilter filter, Node root)</span> </span>{</span><br><span class="line"> Node node = root;</span><br><span class="line"> <span class="keyword">int</span> depth = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line"> <span class="keyword">while</span> (node != <span class="keyword">null</span>) {</span><br><span class="line"> FilterResult result = filter.head(node, depth);</span><br><span class="line"> <span class="keyword">if</span> (result == FilterResult.STOP)</span><br><span class="line"> <span class="keyword">return</span> result;</span><br><span class="line"> <span class="comment">// Descend into child nodes:</span></span><br><span class="line"> <span class="keyword">if</span> (result == FilterResult.CONTINUE && node.childNodeSize() > <span class="number">0</span>) {</span><br><span class="line"> node = node.childNode(<span class="number">0</span>);</span><br><span class="line"> ++depth;</span><br><span class="line"> <span class="keyword">continue</span>;</span><br><span class="line"> }</span><br><span class="line"> <span class="comment">// No siblings, move upwards:</span></span><br><span class="line"> <span class="keyword">while</span> (node.nextSibling() == <span class="keyword">null</span> && depth > <span class="number">0</span>) {</span><br><span class="line"> <span class="comment">// 'tail' current 
node:</span></span><br><span class="line"> <span class="keyword">if</span> (result == FilterResult.CONTINUE || result == FilterResult.SKIP_CHILDREN) {</span><br><span class="line"> result = filter.tail(node, depth);</span><br><span class="line"> <span class="keyword">if</span> (result == FilterResult.STOP)</span><br><span class="line"> <span class="keyword">return</span> result;</span><br><span class="line"> }</span><br><span class="line"> Node prev = node; <span class="comment">// In case we need to remove it below.</span></span><br><span class="line"> node = node.parentNode();</span><br><span class="line"> depth--;</span><br><span class="line"> <span class="keyword">if</span> (result == FilterResult.REMOVE)</span><br><span class="line"> prev.remove(); <span class="comment">// Remove AFTER finding parent.</span></span><br><span class="line"> result = FilterResult.CONTINUE; <span class="comment">// Parent was not pruned.</span></span><br><span class="line"> }</span><br><span class="line"> <span class="comment">// 'tail' current node, then proceed with siblings:</span></span><br><span class="line"> <span class="keyword">if</span> (result == FilterResult.CONTINUE || result == FilterResult.SKIP_CHILDREN) {</span><br><span class="line"> result = filter.tail(node, depth);</span><br><span class="line"> <span class="keyword">if</span> (result == FilterResult.STOP)</span><br><span class="line"> <span class="keyword">return</span> result;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span> (node == root)</span><br><span class="line"> <span class="keyword">return</span> result;</span><br><span class="line"> Node prev = node; <span class="comment">// In case we need to remove it below.</span></span><br><span class="line"> node = node.nextSibling();</span><br><span class="line"> <span class="keyword">if</span> (result == FilterResult.REMOVE)</span><br><span class="line"> prev.remove(); <span class="comment">// Remove AFTER finding 
sibling.</span></span><br><span class="line"> }</span><br><span class="line"> <span class="comment">// root == null?</span></span><br><span class="line"> <span class="keyword">return</span> FilterResult.CONTINUE;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Taking <code>FirstFinder</code> as an example, the idea is straightforward: it keeps the evaluator and the matching result, and returns a <code>FilterResult</code> of <code>STOP</code> as soon as the first matching element is found.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">private</span> <span class="keyword">static</span> <span class="class"><span class="keyword">class</span> <span class="title">FirstFinder</span> <span class="keyword">implements</span> <span class="title">NodeFilter</span> </span>{</span><br><span class="line"> <span class="keyword">private</span> <span class="keyword">final</span> Element root;</span><br><span class="line"> <span class="keyword">private</span> Element match = <span class="keyword">null</span>;</span><br><span class="line"> 
<span class="keyword">private</span> <span class="keyword">final</span> Evaluator eval;</span><br><span class="line"></span><br><span class="line"> FirstFinder(Element root, Evaluator eval) {</span><br><span class="line"> <span class="keyword">this</span>.root = root;</span><br><span class="line"> <span class="keyword">this</span>.eval = eval;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="meta">@Override</span></span><br><span class="line"> <span class="function"><span class="keyword">public</span> FilterResult <span class="title">head</span><span class="params">(Node node, <span class="keyword">int</span> depth)</span> </span>{</span><br><span class="line"> <span class="keyword">if</span> (node <span class="keyword">instanceof</span> Element) {</span><br><span class="line"> Element el = (Element) node;</span><br><span class="line"> <span class="keyword">if</span> (eval.matches(root, el)) {</span><br><span class="line"> match = el;</span><br><span class="line"> <span class="keyword">return</span> STOP;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> CONTINUE;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="meta">@Override</span></span><br><span class="line"> <span class="function"><span class="keyword">public</span> FilterResult <span class="title">tail</span><span class="params">(Node node, <span class="keyword">int</span> depth)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> CONTINUE;</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="The-Evaluators"><a href="#The-Evaluators" class="headerlink" title="The Evaluators"></a>The Evaluators</h3><p>OK. We have gone through traversal again by reviewing the filtering function. 
Now it’s time to understand how Jsoup parses a given query. We will look at <code>Evaluator</code>, <code>CombiningEvaluator</code>, and <code>StructuralEvaluator</code> first.</p><ul><li><p><code>Evaluator</code> is an abstract class with 39 subclasses, which cover querying by id, name, tag name (equals or ends with), class, attribute name, attribute value, sibling index, text, regex match, and so on. Since these properties were already set while building the DOM tree, the abstract method is easy to implement:</p> <figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment">* Test if the element meets the evaluator's requirements.</span></span><br><span class="line"><span class="comment">*/</span></span><br><span class="line"><span class="function"><span class="keyword">public</span> <span class="keyword">abstract</span> <span class="keyword">boolean</span> <span class="title">matches</span><span class="params">(Element root, Element element)</span></span>;</span><br></pre></td></tr></table></figure></li><li><p><code>CombiningEvaluator</code> combines multiple evaluators with the logical operators <code>And</code> or <code>Or</code>, which lets it handle combinators like <code>",", ">", "+", "~", " "</code></p></li><li><p><code>StructuralEvaluator</code> handles structural queries like <code>div > p > span</code> or <code>div p span</code>. <code>ImmediateParent</code> handles <code>div > p > span</code>, while <code>Parent</code> handles <code>div p span</code>. 
We will take a look at <code>Parent</code> implementation:</p> <figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">static</span> <span class="class"><span class="keyword">class</span> <span class="title">Parent</span> <span class="keyword">extends</span> <span class="title">StructuralEvaluator</span> </span>{</span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="title">Parent</span><span class="params">(Evaluator evaluator)</span> </span>{</span><br><span class="line"> <span class="keyword">this</span>.evaluator = evaluator;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">boolean</span> <span class="title">matches</span><span class="params">(Element root, Element element)</span> </span>{</span><br><span class="line"> <span class="keyword">if</span> (root == element)</span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">false</span>;</span><br><span class="line"></span><br><span class="line"> Element 
parent = element.parent();</span><br><span class="line"> <span class="keyword">while</span> (<span class="keyword">true</span>) {</span><br><span class="line"> <span class="keyword">if</span> (evaluator.matches(root, parent))</span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">true</span>;</span><br><span class="line"> <span class="keyword">if</span> (parent == root)</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> parent = parent.parent();</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">false</span>;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="meta">@Override</span></span><br><span class="line"> <span class="function"><span class="keyword">public</span> String <span class="title">toString</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> String.format(<span class="string">":parent%s"</span>, evaluator);</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p> And <a href="https://github.com/jhy/jsoup/blob/master/src/main/java/org/jsoup/select/QueryParser.java" target="_blank" rel="noopener"><code>QueryParser</code></a> that will parse a CSS selector into an Evaluator tree, handle above cases like this:</p> <figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// for most combinators: change the current eval into an AND of the current eval and the new eval</span></span><br><span class="line"><span class="keyword">if</span> (combinator == <span 
class="string">'>'</span>)</span><br><span class="line"> currentEval = <span class="keyword">new</span> CombiningEvaluator.And(newEval, <span class="keyword">new</span> StructuralEvaluator.ImmediateParent(currentEval));</span><br><span class="line"><span class="keyword">else</span> <span class="keyword">if</span> (combinator == <span class="string">' '</span>)</span><br><span class="line"> currentEval = <span class="keyword">new</span> CombiningEvaluator.And(newEval, <span class="keyword">new</span> StructuralEvaluator.Parent(currentEval));</span><br><span class="line">...</span><br></pre></td></tr></table></figure></li></ul><h3 id="QueryParser-Finally"><a href="#QueryParser-Finally" class="headerlink" title="QueryParser Finally"></a>QueryParser Finally</h3><p>After we explored different kinds of <code>Evaluator</code>, it will be easy to understand the parsing process in the below:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span 
class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">private</span> <span class="title">QueryParser</span><span class="params">(String query)</span> </span>{</span><br><span class="line"> <span class="keyword">this</span>.query = query;</span><br><span class="line"> <span class="keyword">this</span>.tq = <span class="keyword">new</span> TokenQueue(query);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function">Evaluator <span class="title">parse</span><span class="params">()</span> </span>{</span><br><span class="line"> tq.consumeWhitespace();</span><br><span class="line"></span><br><span class="line"> <span class="keyword">if</span> (tq.matchesAny(combinators)) { <span class="comment">// if starts with a combinator, use root as elements</span></span><br><span class="line"> evals.add(<span class="keyword">new</span> StructuralEvaluator.Root());</span><br><span class="line"> combinator(tq.consume());</span><br><span class="line"> } <span class="keyword">else</span> {</span><br><span class="line"> findElements();</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="keyword">while</span> (!tq.isEmpty()) {</span><br><span class="line"> <span class="comment">// hierarchy and extras</span></span><br><span class="line"> <span class="keyword">boolean</span> seenWhite = tq.consumeWhitespace();</span><br><span class="line"></span><br><span class="line"> <span class="keyword">if</span> (tq.matchesAny(combinators)) {</span><br><span class="line"> combinator(tq.consume());</span><br><span class="line"> } <span class="keyword">else</span> <span class="keyword">if</span> (seenWhite) {</span><br><span class="line"> combinator(<span class="string">' '</span>);</span><br><span class="line"> } <span class="keyword">else</span> { <span class="comment">// E.class, E#id, E[attr] etc. 
AND</span></span><br><span class="line"> findElements(); <span class="comment">// take next el, #. etc off queue</span></span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="keyword">if</span> (evals.size() == <span class="number">1</span>)</span><br><span class="line"> <span class="keyword">return</span> evals.get(<span class="number">0</span>);</span><br><span class="line"></span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">new</span> CombiningEvaluator.And(evals);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Easier than expected, right? The implementation here is clean, elegant, and easy to understand. The Javadoc of <code>Selector</code> is impressive as well, and you can experiment with the <a href="https://jsoup.org/apidocs/org/jsoup/select/Selector.html" target="_blank" rel="noopener">Selector syntax</a> on <a href="https://try.jsoup.org/" target="_blank" rel="noopener">Try Jsoup</a>.</p><p><strong>-To Be Continued-</strong></p>]]></content>
<summary type="html">
<p>Most of the time, you are just consuming the <code>Document</code> Jsoup built for you, like <code>document.select(${selector})</code>. It helps to go through the <a href="https://www.w3.org/TR/CSS2/selector.html" target="_blank" rel="noopener">W3C CSS Selector Specification</a> to understand Jsoup’s roadmap. We have talked about node traversal before; what <code>Selector</code> actually does is filter and collect nodes while traversing, so the key points here are parsing and evaluating the given queries.</p>
<p><img src="https://www.dropbox.com/s/usdc8a4am6gq6jp/make-coffee.jpg?dl=1" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="Jsoup" scheme="http://www.oldyoungboys.club/tags/Jsoup/"/>
<category term="Code Review" scheme="http://www.oldyoungboys.club/tags/Code-Review/"/>
</entry>
<entry>
<title>Behind-the-Scenes Secrets of Jsoup III: The Tree and The State Machine</title>
<link href="http://www.oldyoungboys.club/Behind-the-Scenes-Secrets-of-Jsoup-Build-The-Tree/"/>
<id>http://www.oldyoungboys.club/Behind-the-Scenes-Secrets-of-Jsoup-Build-The-Tree/</id>
<published>2018-12-15T20:08:14.000Z</published>
<updated>2019-09-03T20:29:56.227Z</updated>
<content type="html"><![CDATA[<p>Ready? The challenging task has finally come. We will build a tree this time so that you don’t have to traverse it before it is built.😅 <code>TreeBuilder</code> has two implementations: <code>XmlTreeBuilder</code> is relatively easy, while <code>HtmlTreeBuilder</code> is much more complex. We will go through both of them today.</p><p><img src="https://www.dropbox.com/s/kceinw1qzr7obp9/jsoup-build-tree.jpg?dl=1" alt=""><a id="more"></a></p><h3 id="Overview-of-Package"><a href="#Overview-of-Package" class="headerlink" title="Overview of Package"></a>Overview of Package</h3><p>I will skip the general introduction to compilers, lexers, and parsers, as plenty of articles already cover them. We will go to the Jsoup source code directly. Let’s have a quick look at several classes under the package <code>org.jsoup.parser</code>.</p><ul><li><p><code>Parser</code>: Facade of Jsoup parsing.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">private</span> TreeBuilder treeBuilder; <span class="comment">// this is what actually does the work</span></span><br><span class="line"><span class="keyword">private</span> ParseErrorList errors; <span class="comment">// by default Jsoup does not collect HTML parse errors, but you can enable this</span></span><br><span class="line"><span class="keyword">private</span> ParseSettings settings; <span class="comment">// switches for preserveTagCase and preserveAttributeCase</span></span><br></pre></td></tr></table></figure></li><li><p><code>Token</code>: Parse tokens for the Tokeniser. It’s an abstract class with 6 subclasses: <code>Doctype</code>, <code>StartTag</code>, <code>EndTag</code>, <code>Comment</code>, <code>Character</code>, <code>EOF</code></p></li><li><p><code>Tokeniser</code>: Reads the input stream into tokens. 
It keeps the current <code>state</code> and <code>emitPending</code>, the token that is about to be emitted on the next read. It also keeps <code>tagPending</code>, <code>startPending</code>, <code>endPending</code>, <code>charPending</code>, <code>doctypePending</code>, and <code>commentPending</code> to hold tokens until they are completely filled in</p></li><li><p><code>CharacterReader</code>: consumes tokens off a string. It may have been inspired by <code>ByteBuffer</code> of NIO, as it has similar methods like <code>consume()</code>, <code>unconsume()</code>, <code>mark()</code>, <code>rewindToMark()</code>, and <code>consumeTo()</code></p></li><li><p><code>TokeniserState</code> and <code>HtmlTreeBuilderState</code>: the lexing and parsing state machines. Jsoup implements them with the State Pattern using enumerations, which is also a classic example of design pattern usage. A plain transition table would make moving between states easy, but it would make it hard to perform extra work during a transition</p></li><li><p><code>HtmlTreeBuilder</code> and <code>XmlTreeBuilder</code>: I list them last because they basically act as managers, coordinating the <code>Tokeniser</code> with the lexing and parsing state machines to finish the work</p></li></ul><p>The State Pattern implementation of lexing and parsing is the most complex part of Jsoup. You can tell just from the lines-of-code statistics. 
Top 7 source files here!</p><table><thead><tr><th>File</th><th>blank</th><th>comment</th><th>code</th></tr></thead><tbody><tr><td><strong>org.jsoup.parser.TokeniserState.java</strong></td><td>27</td><td>46</td><td>1671</td></tr><tr><td><strong>org.jsoup.parser.HtmlTreeBuilderState.java</strong></td><td>43</td><td>56</td><td>1429</td></tr><tr><td>org.jsoup.helper.HttpConnection.java</td><td>162</td><td>51</td><td>979</td></tr><tr><td>org.jsoup.nodes.Element.java</td><td>156</td><td>604</td><td>727</td></tr><tr><td>org.jsoup.parser.HtmlTreeBuilder.java</td><td>104</td><td>38</td><td>591</td></tr><tr><td>org.jsoup.select.Evaluator.java</td><td>139</td><td>107</td><td>532</td></tr><tr><td>org.jsoup.parser.CharacterReader.java</td><td>60</td><td>61</td><td>385</td></tr></tbody></table><h3 id="State-Machine-and-State-Pattern"><a href="#State-Machine-and-State-Pattern" class="headerlink" title="State Machine and State Pattern"></a>State Machine and State Pattern</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">⬇⬇ ⬇ ⬇ </span><br><span class="line"><div>test</div></span><br></pre></td></tr></table></figure><p><img src="https://www.dropbox.com/s/19h91tkb1lqb0u7/lexing-process.png?dl=1" alt=""></p><p>If we use a very short code snippet as an example, each arrow would point to a state like <code>TagOpen</code>, <code>TagName</code>, <code>Data</code>, <code>EndTagOpen</code>. 
I just list two of them in the below:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span 
class="line"><span class="comment"> * States and transition activations for the Tokeniser.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">enum</span> TokeniserState {</span><br><span class="line"> Data {</span><br><span class="line"> <span class="comment">// in data state, gather characters until a character reference or tag is found</span></span><br><span class="line"> <span class="function"><span class="keyword">void</span> <span class="title">read</span><span class="params">(Tokeniser t, CharacterReader r)</span> </span>{</span><br><span class="line"> <span class="keyword">switch</span> (r.current()) {</span><br><span class="line"> <span class="keyword">case</span> <span class="string">'&'</span>:</span><br><span class="line"> t.advanceTransition(CharacterReferenceInData);</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">case</span> <span class="string">'<'</span>:</span><br><span class="line"> t.advanceTransition(TagOpen);</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">case</span> nullChar:</span><br><span class="line"> t.error(<span class="keyword">this</span>); <span class="comment">// NOT replacement character (oddly?)</span></span><br><span class="line"> t.emit(r.consume());</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">case</span> eof:</span><br><span class="line"> t.emit(<span class="keyword">new</span> Token.EOF());</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">default</span>:</span><br><span class="line"> String data = r.consumeData();</span><br><span class="line"> t.emit(data);</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> }</span><br><span 
class="line"> }</span><br><span class="line"> },</span><br><span class="line"> TagOpen {</span><br><span class="line"> <span class="comment">// from < in data</span></span><br><span class="line"> <span class="function"><span class="keyword">void</span> <span class="title">read</span><span class="params">(Tokeniser t, CharacterReader r)</span> </span>{</span><br><span class="line"> <span class="keyword">switch</span> (r.current()) {</span><br><span class="line"> <span class="keyword">case</span> <span class="string">'!'</span>:</span><br><span class="line"> t.advanceTransition(MarkupDeclarationOpen);</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">case</span> <span class="string">'/'</span>:</span><br><span class="line"> t.advanceTransition(EndTagOpen);</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">case</span> <span class="string">'?'</span>:</span><br><span class="line"> t.advanceTransition(BogusComment);</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">default</span>:</span><br><span class="line"> <span class="keyword">if</span> (r.matchesLetter()) {</span><br><span class="line"> t.createTagPending(<span class="keyword">true</span>);</span><br><span class="line"> t.transition(TagName);</span><br><span class="line"> } <span class="keyword">else</span> {</span><br><span class="line"> t.error(<span class="keyword">this</span>);</span><br><span class="line"> t.emit(<span class="string">'<'</span>); <span class="comment">// char that got us here</span></span><br><span class="line"> t.transition(Data);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> },</span><br><span class="line"> ...</span><br><span 
class="line">}</span><br></pre></td></tr></table></figure><p>Now that we understand how a <code>Token</code> is produced, we can move on to the <code>TreeBuilder</code> process. We will look at the core loop first, then at <code>XmlTreeBuilder</code>, which is easier, and finally at <code>HtmlTreeBuilder</code>.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">protected</span> <span class="keyword">void</span> <span class="title">runParser</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">while</span> (<span class="keyword">true</span>) {</span><br><span class="line"> Token token = tokeniser.read();</span><br><span class="line"> <span class="comment">// protected abstract boolean process(Token token);</span></span><br><span class="line"> process(token);</span><br><span class="line"> token.reset();</span><br><span class="line"></span><br><span class="line"> <span class="keyword">if</span> (token.type == Token.TokenType.EOF)</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>As for <code>Tokeniser#read()</code>:<br><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Token <span class="title">read</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">while</span> (!isEmitPending)</span><br><span class="line"> state.read(<span class="keyword">this</span>, reader);</span><br><span class="line"></span><br><span class="line"> <span class="comment">// if emit is pending, a non-character token was found: return any chars in buffer, and leave token for next read:</span></span><br><span class="line"> <span class="keyword">if</span> (charsBuilder.length() > <span class="number">0</span>) {</span><br><span class="line"> String str = charsBuilder.toString();</span><br><span class="line"> charsBuilder.delete(<span class="number">0</span>, charsBuilder.length());</span><br><span class="line"> charsString = <span class="keyword">null</span>;</span><br><span class="line"> <span class="keyword">return</span> charPending.data(str);</span><br><span class="line"> } <span class="keyword">else</span> <span class="keyword">if</span> (charsString != <span class="keyword">null</span>) {</span><br><span class="line"> Token token = charPending.data(charsString);</span><br><span class="line"> charsString = <span class="keyword">null</span>;</span><br><span class="line"> <span class="keyword">return</span> token;</span><br><span class="line"> } <span class="keyword">else</span> {</span><br><span class="line"> isEmitPending = <span class="keyword">false</span>;</span><br><span class="line"> <span class="keyword">return</span> emitPending;</span><br><span class="line"> }</span><br><span 
class="line">}</span><br></pre></td></tr></table></figure></p><h3 id="XmlTreeBuilder-First"><a href="#XmlTreeBuilder-First" class="headerlink" title="XmlTreeBuilder First"></a>XmlTreeBuilder First</h3><p>Having explored the lexing part, we can now move on to parsing. We will go through the easier <code>XmlTreeBuilder</code> first, focusing on its implementation of the <code>process(token)</code> method declared in the abstract class <code>TreeBuilder</code>.</p><p><code>XmlTreeBuilder</code> basically maintains a stack and inserts nodes according to the token type:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@Override</span></span><br><span class="line"><span class="function"><span class="keyword">protected</span> <span class="keyword">boolean</span> <span class="title">process</span><span class="params">(Token token)</span> </span>{</span><br><span class="line"> <span class="comment">// start tag, end tag, doctype, comment, character, eof</span></span><br><span class="line"> <span class="keyword">switch</span> (token.type) {</span><br><span class="line"> <span 
class="keyword">case</span> StartTag:</span><br><span class="line"> insert(token.asStartTag());</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">case</span> EndTag:</span><br><span class="line"> popStackToClose(token.asEndTag());</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">case</span> Comment:</span><br><span class="line"> insert(token.asComment());</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">case</span> Character:</span><br><span class="line"> insert(token.asCharacter());</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">case</span> Doctype:</span><br><span class="line"> insert(token.asDoctype());</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">case</span> EOF: <span class="comment">// could put some normalisation here if desired</span></span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> <span class="keyword">default</span>:</span><br><span class="line"> Validate.fail(<span class="string">"Unexpected token type: "</span> + token.type);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">true</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>As for <code>insert</code>, we will just use <code>StartTag</code> as an example in the below. 
The other <code>insert</code> methods accept different types of <code>Token</code> but the idea is the same.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Element <span class="title">insert</span><span class="params">(Token.StartTag startTag)</span> </span>{</span><br><span class="line"> Tag tag = Tag.valueOf(startTag.name(), settings);</span><br><span class="line"> Element el = <span class="keyword">new</span> Element(tag, baseUri, settings.normalizeAttributes(startTag.attributes));</span><br><span class="line"> insertNode(el);</span><br><span class="line"> <span class="keyword">if</span> (startTag.isSelfClosing()) {</span><br><span class="line"> <span class="keyword">if</span> (!tag.isKnownTag()) <span class="comment">// unknown tag, remember this is self closing for output. see above.</span></span><br><span class="line"> tag.setSelfClosing();</span><br><span class="line"> } <span class="keyword">else</span> {</span><br><span class="line"> stack.add(el);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> el;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="HtmlTreeBuilder-Last"><a href="#HtmlTreeBuilder-Last" class="headerlink" title="HtmlTreeBuilder Last"></a>HtmlTreeBuilder Last</h3><p>Compared to <code>XmlTreeBuilder</code>, <code>HtmlTreeBuilder</code> is much more complex. It introduces <code>HtmlTreeBuilderState</code> to process transitions. 
As HTML is relatively loose in syntax and tolerant of errors, Jsoup won’t simply quit when it detects certain kinds of errors; it records them and continues. It will also try to close tags that were not closed properly. Some nested tags have ordering restrictions: a <code><td></code> or <code><th></code> must be under a <code><tr></code>, while a <code><tr></code> must be under <code><tbody></code> or <code><table></code>.</p><p>There are currently 23 <code>HtmlTreeBuilderState</code> values, and we will use a typical HTML snippet, which came from Yihua Huang’s work, as an example. Comments like <code><!-- State: --></code> indicate which <code>HtmlTreeBuilderState</code> we are in.</p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span 
class="line">39</span><br><span class="line">40</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"><!-- State: Initial --></span></span><br><span class="line"><span class="meta"><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"></span></span><br><span class="line"><span class="comment"><!-- State: BeforeHtml --></span></span><br><span class="line"><span class="tag"><<span class="name">html</span> <span class="attr">lang</span>=<span class="string">'zh-CN'</span> <span class="attr">xml:lang</span>=<span class="string">'zh-CN'</span> <span class="attr">xmlns</span>=<span class="string">'http://www.w3.org/1999/xhtml'</span>></span></span><br><span class="line"><span class="comment"><!-- State: BeforeHead --></span></span><br><span class="line"><span class="tag"><<span class="name">head</span>></span></span><br><span class="line"> <span class="comment"><!-- State: InHead --></span></span><br><span class="line"> <span class="tag"><<span class="name">script</span> <span class="attr">type</span>=<span class="string">"text/javascript"</span>></span><span class="undefined"></span></span><br><span class="line"><span class="xml"> //<span class="comment"><!-- State: Text --></span></span></span><br><span class="line"><span class="undefined"> function xx(){</span></span><br><span class="line"><span class="undefined"> }</span></span><br><span class="line"><span class="undefined"> </span><span class="tag"></<span class="name">script</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">noscript</span>></span></span><br><span class="line"> <span class="comment"><!-- State: InHeadNoscript --></span></span><br><span class="line"> Your browser does not support JavaScript!</span><br><span class="line"> <span class="tag"></<span class="name">noscript</span>></span></span><br><span class="line"><span class="tag"></<span 
class="name">head</span>></span></span><br><span class="line"><span class="comment"><!-- State: AfterHead --></span></span><br><span class="line"><span class="tag"><<span class="name">body</span>></span></span><br><span class="line"><span class="comment"><!-- State: InBody --></span></span><br><span class="line"><span class="tag"><<span class="name">textarea</span>></span></span><br><span class="line"> <span class="comment"><!-- State: Text --></span></span><br><span class="line"> xxx</span><br><span class="line"><span class="tag"></<span class="name">textarea</span>></span></span><br><span class="line"><span class="tag"><<span class="name">table</span>></span></span><br><span class="line"> <span class="comment"><!-- State: InTable --></span></span><br><span class="line"> <span class="comment"><!-- State: InTableText --></span></span><br><span class="line"> xxx</span><br><span class="line"> <span class="tag"><<span class="name">tbody</span>></span></span><br><span class="line"> <span class="comment"><!-- State: InTableBody --></span></span><br><span class="line"> <span class="tag"></<span class="name">tbody</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">tr</span>></span></span><br><span class="line"> <span class="comment"><!-- State: InRow --></span></span><br><span class="line"> <span class="tag"><<span class="name">td</span>></span></span><br><span class="line"> <span class="comment"><!-- State: InCell --></span></span><br><span class="line"> <span class="tag"></<span class="name">td</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">tr</span>></span> </span><br><span class="line"><span class="tag"></<span class="name">table</span>></span></span><br><span class="line"><span class="tag"></<span class="name">body</span>></span></span><br><span class="line"><span class="tag"></<span class="name">html</span>></span></span><br></pre></td></tr></table></figure><p>As we said before, nested tags such as 
<code>tr</code> cannot be under <code>body</code> directly, they need to be organized under <code>table</code> or <code>tbody</code>. And DOCTYPE declaration also should not be seen in <code>body</code>. In method <code>boolean process(Token t, HtmlTreeBuilder tb)</code> of <code>InBody</code>, there is a case to handle this.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">case</span> Doctype: {</span><br><span class="line"> tb.error(<span class="keyword">this</span>);</span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">false</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>If <code>head</code> was not closed before <code>body</code>, Jsoup will auto-close it.<br><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">InHead {</span><br><span class="line"> <span class="function"><span class="keyword">boolean</span> <span class="title">process</span><span class="params">(Token t, HtmlTreeBuilder tb)</span> </span>{...}</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">private</span> <span class="keyword">boolean</span> <span class="title">anythingElse</span><span class="params">(Token t, TreeBuilder tb)</span> </span>{</span><br><span class="line"> tb.processEndTag(<span class="string">"head"</span>);</span><br><span class="line"> <span class="keyword">return</span> tb.process(t);</span><br><span class="line"> 
}</span><br><span class="line">}</span><br></pre></td></tr></table></figure></p><p>Because the State Pattern implementation has to handle so many different cases, it is easy to get lost while reading the source code. Understanding the concepts and general ideas first helps when you debug the code to see what Jsoup is doing, but it is still not easy and may well hurt your brain😭</p><p><strong>-To Be Continued-</strong></p>]]></content>
<summary type="html">
<p>Ready? The challenging task has finally come. We will build a tree this time so that you don’t have to traverse it before it is built.😅 <code>TreeBuilder</code> has two implementations: <code>XmlTreeBuilder</code> is relatively easy, while <code>HtmlTreeBuilder</code> is much more complex. We will go through both of them today.</p>
<p><img src="https://www.dropbox.com/s/kceinw1qzr7obp9/jsoup-build-tree.jpg?dl=1" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="Jsoup" scheme="http://www.oldyoungboys.club/tags/Jsoup/"/>
<category term="Code Review" scheme="http://www.oldyoungboys.club/tags/Code-Review/"/>
</entry>
<entry>
<title>Behind-the-Scenes Secrets of Jsoup II: Traverse A Tree Before It Was Built</title>
<link href="http://www.oldyoungboys.club/Behind-the-Scenes-Secrets-of-Jsoup-Node-Traverse/"/>
<id>http://www.oldyoungboys.club/Behind-the-Scenes-Secrets-of-Jsoup-Node-Traverse/</id>
<published>2018-12-14T16:39:26.000Z</published>
<updated>2019-09-03T20:29:56.229Z</updated>
<content type="html"><![CDATA[<p>Here we go! We’ve got an HTML or XML snippet; it may be clean or dirty, small or large, safe or dangerous, but our task won’t change. We need to extract and manipulate data using the best of DOM, CSS, and jQuery-like methods. The basic idea is crystal clear: you need to build a DOM tree and traverse it, with or without filtering. That’s it. Sounds easy, right?😅 Then Just Do It…and you would probably…cry.</p><p><img src="https://www.dropbox.com/s/e9mbalbcqyslp4b/tree.jpg?dl=1" alt=""><a id="more"></a></p><p>A tree consists of a root node and its children, from top to bottom. So it’s straightforward to get the root node first, retrieve its attributes, then traverse to its child nodes while maintaining the relationships between them. The class diagram of <code>org.jsoup.nodes</code> generated by IntelliJ IDEA looks like this:</p><p><img src="https://www.dropbox.com/s/k92qrrrtq5976wx/jsoup-nodes-uml.png?dl=1" alt=""></p><p>PS: LeafNode was introduced in <em>1.11.1</em> to reduce memory usage, “by refactoring the node hierarchy to not track childnodes or attributes by default for leaf nodes. For the average document, that’s about a 30% memory reduction.” You can find more details in <a href="https://github.com/jhy/jsoup/issues/911" target="_blank" rel="noopener">#911</a>, which is a good example of a performance improvement.</p><p>Other classes will be introduced later. <code>Element</code> and <code>Document</code> are the most frequently used, but <code>Node</code> is the root of the inheritance tree. The fields and methods worth noticing are listed below:</p><ul><li><code>attributes</code>: a key-value collection, with keys such as ‘type’, ‘id’, ‘name’ and ‘value’. Jsoup also maintains a list of boolean attribute keys such as ‘checked’ (for example: <code><input type="checkbox" checked value="Registered" /></code>). 
Correspondingly, Jsoup models these with two classes, <code>Attributes</code> and <code>Attribute</code>. A node can read and write its attributes, which makes it possible to clean and correct the given HTML</li><li><code>parentNode</code>, <code>childNodes</code> and <code>siblingNodes</code>: used to access the nodes related to the current one</li><li><code>baseUri()</code>: used to convert a relative URL to an absolute one</li></ul><p>I will talk more about node traversal in the “CSS Selector” part and only touch on it briefly here. <a href="https://github.com/jhy" target="_blank" rel="noopener">Jonathan</a> has refactored this code since <a href="https://github.com/code4craft" target="_blank" rel="noopener">Yihua Huang</a>’s interpretation was written, but the concept is still the same. These lines of code in class <a href="https://github.com/jhy/jsoup/blob/master/src/main/java/org/jsoup/select/NodeTraversor.java#L40" target="_blank" rel="noopener"><code>NodeTraversor</code></a> play an important role in the whole picture:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> <span 
class="title">traverse</span><span class="params">(NodeVisitor visitor, Node root)</span> </span>{</span><br><span class="line"> Node node = root;</span><br><span class="line"> <span class="keyword">int</span> depth = <span class="number">0</span>;</span><br><span class="line"> </span><br><span class="line"> <span class="keyword">while</span> (node != <span class="keyword">null</span>) {</span><br><span class="line"> visitor.head(node, depth);</span><br><span class="line"> <span class="keyword">if</span> (node.childNodeSize() > <span class="number">0</span>) {</span><br><span class="line"> node = node.childNode(<span class="number">0</span>);</span><br><span class="line"> depth++;</span><br><span class="line"> } <span class="keyword">else</span> {</span><br><span class="line"> <span class="keyword">while</span> (node.nextSibling() == <span class="keyword">null</span> && depth > <span class="number">0</span>) {</span><br><span class="line"> visitor.tail(node, depth);</span><br><span class="line"> node = node.parentNode();</span><br><span class="line"> depth--;</span><br><span class="line"> }</span><br><span class="line"> visitor.tail(node, depth);</span><br><span class="line"> <span class="keyword">if</span> (node == root)</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> node = node.nextSibling();</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Using recursion here would be much easier, but dangerous: a <code>StackOverflowError</code> might occur if the DOM tree is too deep. Iterating with an explicit loop is a must here. Don’t tell me you want to use <code>-Xss256M</code> to silence the world.😅</p><p>I want to emphasize that <code>NodeVisitor</code> is an interface. 
It defines what you would do when a node is first visited and last visited.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">interface</span> <span class="title">NodeVisitor</span> </span>{</span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Callback for when a node is first visited.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="function"><span class="keyword">void</span> <span class="title">head</span><span class="params">(Node node, <span class="keyword">int</span> depth)</span></span>;</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Callback for when a node is last visited, after all of its descendants have been visited.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="function"><span class="keyword">void</span> <span class="title">tail</span><span class="params">(Node node, <span class="keyword">int</span> depth)</span></span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Implementations such as <code>Accumulator</code> would be used to collect nodes that pass evaluations parsed from the query, that’s actually the key point of <code>document.select("${selector}")</code>. 
Go to class <code>Collector</code> and you will find it:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Build a list of elements, by visiting root and every descendant of root, and testing it against the evaluator.</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> eval Evaluator to test elements against</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> root root of tree to descend</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> list of matches; empty if none</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="keyword">public</span> <span class="keyword">static</span> Elements <span class="title">collect</span> <span class="params">(Evaluator eval, Element root)</span> </span>{</span><br><span class="line"> Elements elements = <span class="keyword">new</span> Elements();</span><br><span class="line"> NodeTraversor.traverse(<span class="keyword">new</span> Accumulator(root, elements, eval), root);</span><br><span class="line"> <span class="keyword">return</span> elements;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><code>Accumulator</code> kept a reference to the node where traverse begins (which means it doesn’t necessarily need to be root), and update <code>elements</code> which you may think of as an 
<code>ArrayList</code> for simplification.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">private</span> <span class="keyword">static</span> <span class="class"><span class="keyword">class</span> <span class="title">Accumulator</span> <span class="keyword">implements</span> <span class="title">NodeVisitor</span> </span>{</span><br><span class="line"> <span class="keyword">private</span> <span class="keyword">final</span> Element root;</span><br><span class="line"> <span class="keyword">private</span> <span class="keyword">final</span> Elements elements;</span><br><span class="line"> <span class="keyword">private</span> <span class="keyword">final</span> Evaluator eval;</span><br><span class="line"></span><br><span class="line"> Accumulator(Element root, Elements elements, Evaluator eval) {</span><br><span class="line"> <span class="keyword">this</span>.root = root;</span><br><span class="line"> <span class="keyword">this</span>.elements = elements;</span><br><span class="line"> <span class="keyword">this</span>.eval = eval;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span 
class="keyword">public</span> <span class="keyword">void</span> <span class="title">head</span><span class="params">(Node node, <span class="keyword">int</span> depth)</span> </span>{</span><br><span class="line"> <span class="keyword">if</span> (node <span class="keyword">instanceof</span> Element) {</span><br><span class="line"> Element el = (Element) node;</span><br><span class="line"> <span class="keyword">if</span> (eval.matches(root, el))</span><br><span class="line"> elements.add(el);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">tail</span><span class="params">(Node node, <span class="keyword">int</span> depth)</span> </span>{</span><br><span class="line"> <span class="comment">// void</span></span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Other <code>NodeVisitor</code> implementations, such as <code>FormattingVisitor</code>, <code>CleaningVisitor</code>, <code>OuterHtmlVisitor</code> and <code>W3CBuilder</code>, are used in other scenarios. <code>NodeTraversor</code> lets them all reuse the same depth-first traversal while plugging in their own behavior, which keeps the code clean and consistent.</p><p>However, I guess you may be quite confused by now. Why the heck did we begin with node traversal? We don’t even have a tree built yet!</p><p>Yes, exactly. But remember, you have to begin with the end in mind. You always want to find DOM elements matching given search criteria, so the first thing you think of is actually traversal, and that also affects how you build the tree - perhaps I should say “parse the tree” here.</p><p>That touches on the topic of compilers. Fortunately, we don’t need to dig deep into this mysterious area. Lexing and parsing are enough to achieve the goal. 
And we will talk about State Machine and State Pattern in the next article.</p><p><strong>-To Be Continued-</strong></p>]]></content>
<summary type="html">
<p>Here we go! We’ve got an HTML or XML snippet; it may be clean or dirty, small or large, safe or dangerous, but our task won’t change. We need to extract and manipulate data using the best of DOM, CSS, and jQuery-like methods. The basic idea is crystal clear: you need to build a DOM tree and traverse it, with or without filtering. That’s it. Sounds easy, right?😅 Then Just Do It…and you would probably…cry.</p>
<p><img src="https://www.dropbox.com/s/e9mbalbcqyslp4b/tree.jpg?dl=1" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="Jsoup" scheme="http://www.oldyoungboys.club/tags/Jsoup/"/>
<category term="Code Review" scheme="http://www.oldyoungboys.club/tags/Code-Review/"/>
</entry>
<entry>
<title>Behind-the-Scenes Secrets of Jsoup I: Introduction</title>
<link href="http://www.oldyoungboys.club/Behind-the-Scenes-Secrets-of-Jsoup-Introduction/"/>
<id>http://www.oldyoungboys.club/Behind-the-Scenes-Secrets-of-Jsoup-Introduction/</id>
<published>2018-12-14T16:36:22.000Z</published>
<updated>2019-09-03T20:29:56.228Z</updated>
<content type="html"><![CDATA[<p><a href="https://github.com/jhy/jsoup/" target="_blank" rel="noopener">Jsoup</a> is probably the most popular “working with real-world HTML” library in the Java community. I’ve been using it for web crawling since <em>1.7.3</em> (the latest release is <em>1.11.3</em>), and I was a little surprised to find so little introduction or analysis of its source code and implementation.</p><p><img src="https://discoversdkcdn.azureedge.net/postscontent/Jsoup.jpg" alt=""><a id="more"></a></p><p>Since I will use Jsoup as an example in the <a href="http://oit.olivetuniversity.edu/academics/macourses.htm" target="_blank" rel="noopener">OOP course (SE500)</a> I will teach at <a href="http://oit.olivetuniversity.edu/" target="_blank" rel="noopener">Olivet Institute of Technology</a> from Jan 2019, I have tried to summarize the most important ideas behind the scenes so that at least I know what I will talk about.😅 This series was inspired by <a href="https://github.com/code4craft/jsoup-learning" target="_blank" rel="noopener">jsoup-learning</a> and reuses some diagrams from it. Many thanks to <a href="https://github.com/code4craft" target="_blank" rel="noopener">Yihua Huang</a> for digging into this beautiful library.</p><p>This is the first part of the series. I will give a brief introduction to the features of Jsoup and its general code structure. In the following articles, I will analyze how the DOM parser and the CSS selector are implemented, along with some interesting tips and tricks.</p><p>Jsoup is developed by <a href="https://jhy.io/" target="_blank" rel="noopener">Jonathan Hedley</a>, a Senior Manager of Software Development at Amazon. According to the <a href="https://github.com/jhy/jsoup/blob/master/CHANGES" target="_blank" rel="noopener">change logs</a>, the initial beta was released on Jan 31, 2010, so it has been about <strong>9 years</strong> now! 
He still maintains the code base regularly, though not as actively as before. That might be because, as he said, “jsoup is in general, stable release”.</p><p>I use the latest version, <em>1.12.1-SNAPSHOT</em>, for this series. It’s a <a href="https://maven.apache.org/" target="_blank" rel="noopener">Maven</a> project without any external runtime dependencies, though it pulls in junit, gson, and jetty for unit and integration tests. According to <a href="https://github.com/AlDanial/cloc" target="_blank" rel="noopener">cloc</a>, there are <strong>68</strong> Java source files under <code>src\main\java</code>, with <strong>12015</strong> lines of code, <strong>4177</strong> lines of comments and <strong>1991</strong> blank lines. As a library with good test coverage, it has <strong>46</strong> Java source files under <code>src\test\java</code>, with <strong>7911</strong> lines of code, <strong>350</strong> lines of comments and <strong>1672</strong> blank lines.</p><p>The ratio of test code to production code is <strong>65.8%</strong> (7911 / 12015), which is pretty good. 
As for test coverage, the IntelliJ IDEA code coverage runner gives the report below, which is also impressive.</p><table><thead><tr><th>Package</th><th>Class, %</th><th>Method, %</th><th>Line, %</th></tr></thead><tbody><tr><td>org.jsoup</td><td>98% (229/233)</td><td>87% (1269/1457)</td><td>83% (6135/7317)</td></tr><tr><td>org.jsoup.helper</td><td>100% (11/11)</td><td>79% (138/174)</td><td>83% (692/824)</td></tr><tr><td>org.jsoup.internal</td><td>100% (3/3)</td><td>100% (27/27)</td><td>95% (141/147)</td></tr><tr><td>org.jsoup.nodes</td><td>96% (31/32)</td><td>87% (356/407)</td><td>88% (1286/1455)</td></tr><tr><td>org.jsoup.parser</td><td>100% (114/114)</td><td>91% (490/535)</td><td>79% (2924/3699)</td></tr><tr><td>org.jsoup.safety</td><td>100% (9/9)</td><td>100% (44/44)</td><td>95% (286/300)</td></tr><tr><td>org.jsoup.select</td><td>100% (58/58)</td><td>81% (193/237)</td><td>92% (772/836)</td></tr></tbody></table><p>You may notice that package <code>org.jsoup.examples</code> is missing here. Since it only serves as a showcase, it’s reasonable not to write tests against it, so I excluded it. It might be better to move the examples out of the production code into a separate project, with more examples covering the most frequent scenarios - just my opinion.</p><p>For a widely used library aiming at “working with real-world HTML”, the higher the test coverage, the better. I’ve heard that Evan You, author of <a href="https://github.com/vuejs" target="_blank" rel="noopener">Vue.js</a>, even achieved 100% unit test coverage! 669 test cases help keep Jsoup in good shape - but they still cannot prevent new issues from happening. Anyway, the real world is always a crazy world; go through the test cases and you will believe what I said. For example, <code><p =a>One<a <p>Something</p>Else</code> and <code><div id=1<p id='2'</code>: what the heck is this? What should the correct parsing result be? 
Can you figure them out in 5 seconds?😭</p><p>Even though Jsoup is well covered by unit tests and maintains a high quality of implementation, you still need to be very careful when you upgrade to a new version. This should be self-evident: you just need more and more test cases of your own to make your life easier. I remember very clearly that in Dec 2016, some of my unit tests suddenly failed after I upgraded from <em>1.8.3</em> to <em>1.10.1</em> without changing anything else. Since my software was used by thousands of clients, I immediately reported an <a href="https://github.com/jhy/jsoup/issues/803" target="_blank" rel="noopener">issue</a> on GitHub and rolled back to 1.8.3 for a while. I can’t imagine what would have happened if I had simply upgraded; some bugs only appear under certain circumstances and are not easy to reproduce in normal functional tests.</p><p>The test coverage result also gives you an overall picture of the code structure. While Jsoup provides convenient methods to submit HTTP requests and get responses, the most important parts are under the packages <code>org.jsoup.parser</code> and <code>org.jsoup.select</code>. I will introduce them in the 2nd and 3rd articles of this series. 
The 5 lines of code below cover the most frequently used scenarios and are clean and simple, which shows that Jsoup did quite a good job on the user experience of its API.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">Document doc = Jsoup.connect(<span class="string">"https://en.wikipedia.org"</span>).get();</span><br><span class="line">Elements newsHeadlines = doc.select(<span class="string">"#mp-itn b a"</span>);</span><br><span class="line"><span class="keyword">for</span> (Element headline : newsHeadlines) {</span><br><span class="line"> log(<span class="string">"%s\n\t%s"</span>, headline.attr(<span class="string">"title"</span>), headline.absUrl(<span class="string">"href"</span>));</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Jsoup also provides a website where you can play around with its <a href="https://jsoup.org/apidocs/org/jsoup/select/Selector.html" target="_blank" rel="noopener">selector</a>. <a href="https://try.jsoup.org/" target="_blank" rel="noopener">Try Jsoup</a> is the place to explore the features of Jsoup without writing a single line of code. It is also meant for issue reports: you can save the input, parameters, and output so that those who want to help you can get to the point much faster. For example, here is a session that retrieves all the post titles of Old Young Boys Club: <a href="http://try.jsoup.org/~iZFlhTQQAZnkXoPSCKrG4OisFcg" target="_blank" rel="noopener">http://try.jsoup.org/~iZFlhTQQAZnkXoPSCKrG4OisFcg</a>. 
The upgrade issue I mentioned above also had a corresponding session link, <a href="http://try.jsoup.org/~NOfOU7vXHAaHWhDnHv5qBIPtE1M" target="_blank" rel="noopener">http://try.jsoup.org/~NOfOU7vXHAaHWhDnHv5qBIPtE1M</a>, which is still available today.👍</p><p>After you get familiar with the features of Jsoup, it’s time to go to the source code to understand the mechanism. I would suggest you run the class <code>org.jsoup.examples.Wikipedia</code> and debug the 5 lines of code above to see what actually happens step by step. Beware, it’s a long journey. If you get stuck, you may also go over the test cases to understand what kinds of problems the code resolves, and how you would resolve them yourself. It’s also helpful to fork the repository and try to submit some pull requests; you may either pick an issue and try to fix it or make some minor enhancements. Actually, I just submitted Pull Requests <a href="https://github.com/jhy/jsoup/pull/1157" target="_blank" rel="noopener">#1157</a> and <a href="https://github.com/jhy/jsoup/pull/1158" target="_blank" rel="noopener">#1158</a> yesterday plus two issues: <a href="https://github.com/jhy/jsoup/issues/1156" target="_blank" rel="noopener">#1156</a> and <a href="https://github.com/jhy/jsoup/issues/1159" target="_blank" rel="noopener">#1159</a>. I hope Jonathan will accept my PRs and consider fixing these issues.😅</p><p>PS: Once there was a Japanese Samurai who submitted Pull Request <a href="https://github.com/jhy/jsoup/pull/564" target="_blank" rel="noopener">#564</a> on Apr 27, 2015, and got it approved and merged on Nov 19, 2017. 
He recorded this unforgettable experience on Twitter.</p><p><blockquote class="twitter-tweet tw-align-center" data-lang="en"><p lang="en" dir="ltr">Thanks – better late than never :)</p>— Jonathan Hedley (@jhy) <a href="https://twitter.com/jhy/status/932505536662183936?ref_src=twsrc%5Etfw" target="_blank" rel="noopener">November 20, 2017</a></blockquote></p><script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script><p><strong>-To Be Continued-</strong></p>]]></content>
<summary type="html">
<p><a href="https://github.com/jhy/jsoup/" target="_blank" rel="noopener">Jsoup</a> is probably the most popular “working with real-world HTML” library in the Java community. I’ve been using it for web crawling since <em>1.7.3</em> (the latest release is <em>1.11.3</em>), but I was a little surprised to see how few introductions or analyses of its source code and implementation there are.</p>
<p><img src="https://discoversdkcdn.azureedge.net/postscontent/Jsoup.jpg" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="Jsoup" scheme="http://www.oldyoungboys.club/tags/Jsoup/"/>
<category term="Code Review" scheme="http://www.oldyoungboys.club/tags/Code-Review/"/>
</entry>
<entry>
<title>Chocolatey and Boxstarter: Setup Development Environment on Windows Like Never Before</title>
<link href="http://www.oldyoungboys.club/Chocolatey-and-Boxstarter/"/>
<id>http://www.oldyoungboys.club/Chocolatey-and-Boxstarter/</id>
<published>2018-12-13T05:58:41.000Z</published>
<updated>2019-09-03T20:29:56.231Z</updated>
<content type="html"><![CDATA[<p>How many times have you googled “jdk download windows”, then downloaded, clicked, clicked, and clicked? It doesn’t have to be like this, even if you are using Windows. <a href="https://chocolatey.org" target="_blank" rel="noopener">Chocolatey</a> and <a href="https://boxstarter.org/" target="_blank" rel="noopener">Boxstarter</a> are gospel for Windows users. Below is just an example: One Command To Install Them All!</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">cinst jdk8 jdk11 git intellijidea-community vscode nodejs everything f.lux -y</span><br></pre></td></tr></table></figure><p><img src="https://www.dropbox.com/s/m7q76o2hbf7q66a/ronaldo.jpg?dl=1" alt=""><a id="more"></a></p><p>Installing Chocolatey is super easy: simply follow the <a href="https://chocolatey.org/install" target="_blank" rel="noopener">instructions</a>. I usually choose the first option, “install with cmd.exe(Run as Administrator)”. You just need to execute the command below, and you will get the Windows version of <strong>apt-get</strong>:<br><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">@"%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" -NoProfile -InputFormat None -ExecutionPolicy Bypass -Command "iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"</span><br></pre></td></tr></table></figure></p><p>Wanna upgrade? <code>choco upgrade chocolatey</code></p><p>You may explore <em><strong>6161</strong></em> community-maintained <a href="https://chocolatey.org/packages" target="_blank" rel="noopener">packages</a> and build your own toolkit scripts. 
I’ve found it super useful when I need to set up development environments on a bunch of different computers in a short time without any mistakes, since sometimes I just cannot reproduce a tricky bug on my own laptop and have to investigate on the end user’s computer.</p><p>You can be even happier using <a href="https://boxstarter.org/" target="_blank" rel="noopener">Boxstarter</a>; it’s also very straightforward and easy to use.</p><div class="video-container"><iframe src="//www.youtube.com/embed/sgzVTG-zIPE" frameborder="0" allowfullscreen></iframe></div><p>Honestly speaking, I didn’t dive into these two tools deeply, as I found the most frequently used commands solved my problems perfectly. If you use Windows as your daily development environment, I highly recommend automating your setup process with them.</p><p>Yeah! Happy Coding! You may also access the <a href="https://gist.github.com/ny83427/4ca8801fb340bb0555e63155a7050ee9" target="_blank" rel="noopener">gist</a> created for this post.</p>]]></content>
<summary type="html">
<p>How many times have you googled “jdk download windows”, then downloaded, clicked, clicked, and clicked? It doesn’t have to be like this, even if you are using Windows. <a href="https://chocolatey.org" target="_blank" rel="noopener">Chocolatey</a> and <a href="https://boxstarter.org/" target="_blank" rel="noopener">Boxstarter</a> are gospel for Windows users. Below is just an example: One Command To Install Them All!</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">cinst jdk8 jdk11 git intellijidea-community vscode nodejs everything f.lux -y</span><br></pre></td></tr></table></figure>
<p><img src="https://www.dropbox.com/s/m7q76o2hbf7q66a/ronaldo.jpg?dl=1" alt="">
</summary>
<category term="Code" scheme="http://www.oldyoungboys.club/categories/Code/"/>
<category term="Software Management" scheme="http://www.oldyoungboys.club/tags/Software-Management/"/>
<category term="Environment Installation" scheme="http://www.oldyoungboys.club/tags/Environment-Installation/"/>
<category term="Automation" scheme="http://www.oldyoungboys.club/tags/Automation/"/>
</entry>
<entry>
<title>Beware! Old Young Boys Club Might Be An Adult Content Website According To Bing Ads</title>
<link href="http://www.oldyoungboys.club/Bing-Ads-Adult-Content/"/>
<id>http://www.oldyoungboys.club/Bing-Ads-Adult-Content/</id>
<published>2018-12-11T00:19:22.000Z</published>
<updated>2019-09-03T20:29:56.230Z</updated>
<content type="html"><![CDATA[<p><a href="https://bingads.microsoft.com/" target="_blank" rel="noopener">Bing Ads</a> is just AWESOME! It will give you a coupon code for $100 in search advertising once you register. So why not use it? I created my first campaign for this little website and got a joyful surprise immediately. It was disapproved. The reason is…Bing Ads said my blog violates its Adult Content Policy. Yes, <strong>ADULT CONTENT POLICY</strong>.</p><p><img src="https://www.dropbox.com/s/wv3aadp0lybze8c/shocked.jpg?dl=1" alt=""><a id="more"></a></p><p>As an Old Young Boy, I have always appreciated those who bring us joy. I’m pretty grateful for the algorithm invented by the Microsoft Bing Ads team. I submitted an exception case immediately. But that’s obviously not enough, so I opened an online chat with its customer service representative. Honestly speaking, Bing Ads provided better service than I expected: I just went away for 3 minutes and didn’t reply in time, and then a phone call reached me directly. But it took me at least five minutes to understand that the caller was the representative I had been chatting with, and that she was calling me from the Philippines! I thought it was a spam call at first!</p><p><img src="https://www.dropbox.com/s/0yqvflpram7vx73/bing-ads-disapproved.jpg?dl=1" alt=""></p><p>She helped me expedite the review process, and hopefully this campaign will be approved next week. However, I am very interested in this issue with the Bing Ads algorithm and will keep following up to find out exactly what happened. I once wondered whether the funny tool <em><strong>“thef*ck”</strong></em> I mentioned in <a href="http://www.oldyoungboys.club/Funny-Github-Repositories/">5 Github Repositories Will Make Your Day</a> confused Bing Ads. 
I would report an issue in the GitHub repository if it’s really the reason for my wonderful experience with Bing Ads.</p><p>Maybe it’s time to read the inspiring story of how the Microsoft NT team fought numerous bugs and shipped the historic product. Meanwhile, it would also be beneficial to know more about bugs. Happy Reading!</p><script type="text/javascript">amzn_assoc_placement = "adunit0";amzn_assoc_search_bar = "false";amzn_assoc_tracking_id = "oldyoungboy-20";amzn_assoc_ad_mode = "manual";amzn_assoc_ad_type = "smart";amzn_assoc_marketplace = "amazon";amzn_assoc_region = "US";amzn_assoc_title = "";amzn_assoc_linkid = "9653009959f57ceb819ce2d9117a7440";amzn_assoc_asins = "0759285780,0763667625,1426313764,1426317239";</script><script src="//z-na.amazon-adsystem.com/widgets/onejs?MarketPlace=US"></script>]]></content>
<summary type="html">
<p><a href="https://bingads.microsoft.com/" target="_blank" rel="noopener">Bing Ads</a> is just AWESOME! It will give you a coupon code for $100 in search advertising once you register. So why not use it? I created my first campaign for this little website and got a joyful surprise immediately. It was disapproved. The reason is…Bing Ads said my blog violates its Adult Content Policy. Yes, <strong>ADULT CONTENT POLICY</strong>.</p>
<p><img src="https://www.dropbox.com/s/wv3aadp0lybze8c/shocked.jpg?dl=1" alt="">
</summary>
<category term="Joke" scheme="http://www.oldyoungboys.club/categories/Joke/"/>
<category term="Bing Ads" scheme="http://www.oldyoungboys.club/tags/Bing-Ads/"/>
</entry>
</feed>