<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title><![CDATA[Hamann Distributed]]></title>
<link href="http://distributed.hamann.se/atom.xml" rel="self"/>
<link href="http://distributed.hamann.se/"/>
<updated>2014-11-14T17:44:11+01:00</updated>
<id>http://distributed.hamann.se/</id>
<author>
<name><![CDATA[Dominik Hamann]]></name>
</author>
<generator uri="http://octopress.org/">Octopress</generator>
<entry>
<title type="html"><![CDATA[Dear Lucas]]></title>
<link href="http://distributed.hamann.se/blog/2014/11/14/dear-lucas/"/>
<updated>2014-11-14T16:31:00+01:00</updated>
<id>http://distributed.hamann.se/blog/2014/11/14/dear-lucas</id>
<content type="html"><![CDATA[<p>Yesterday, I found an E-Mail in my Inbox.</p>
<blockquote><p>Very impressive profile, Dominik! Any advice for a relatively young programmer who would like to be someone with similar skills in the next 10 years?</p><footer><strong>Lucas W.</strong></footer></blockquote>
<p>Now, I admit I was a little flattered. I really don’t think I’m that good, but looking back at the last six years and realizing again how I went from a broke, bipolar and triple-failed university student to Lead Architect and eventually CTO and Co-Founder within this short timeframe, I can’t help but think that there must be <em>something</em> I did right that I could share with others. My impostor syndrome would beg to differ, but in fact there wasn’t much magic involved – just lots of hard work, meeting the right people (among many more wrong ones) and taking opportunities that popped up along the way, even if they had some risk involved. I could totally do it and so can you.</p>
<p>My first intention was answering him personally back on Linked.in – but from my experience here’s one tip for starters already: If someone asks you a question that many more people might have, do yourself a favor and write it down for posterity. Put the answer into your code, wiki or blog for everyone to read for reference and you might make more than just one person’s day.</p>
<p>So here’s my advice to you, Lucas. I’m sorry, there is no TL;DR. There are no shortcuts. This is the real deal.</p>
<h2>Ambition and Culture</h2>
<p>Lucas, you asked me for a 10 year career path. I think you <a href="http://qr.ae/mFTW7">should be bold and shoot for what you can achieve in one year instead</a>. Before I go into any of the numerous technical aspects of your question, I think there is one important lesson for you to take so far already – which is: Don’t stand still. The fastest way to learn ANYTHING is to learn it from people who already know what you want to know – and are great at both teaching and learning. You need to surround yourself with those people if you really mean it. These are always people who are “better than you”, and you should try to get them to mentor you.</p>
<p>How to do it? Simple. Approach them even if you’re not working directly with them. Be nice, interested and try to solve some of their problems on your own. Ask them good questions and come back when you’ve done your homework to ask some more. Don’t ever ask them stupid questions, which you could answer yourself by Googling. Don’t ever ask a question twice. You know you’ve found a good mentor when you get continuously harder tasks and candid feedback from them. Within your company, try to get into the fastest and most respected team. Whenever you think your learning curve has peaked and is on its way down, look around for new opportunities.</p>
<p>In some great companies, it’s possible to evolve into ever more challenging jobs with increasing responsibility, but unfortunately often it’s not. When you feel like you’re hitting the glass ceiling, move on and look for something else to continuously broaden your horizon and stretch outside your comfort zone.</p>
<h2>Education</h2>
<p>High degrees and certificates will only take you so far in programming and often aren’t worth the hassle. I once did the Zend Certified PHP Engineer to prove I wasn’t a total failure in programming, but I skipped the expensive course and just sat down with a book and some test questions to prepare. More importantly, after years on the other side of the hiring table I’ve learned that people with the most certificates and the highest degrees are more often than not the least productive ones, because somehow, a lot of them seem to rest on their past successes and stop learning.</p>
<p>Remember, you’ll be hired for your papers, but you’ll be promoted for your output – so don’t spend too much time on the former.</p>
<h2>Read Hacker News</h2>
<p>Why <a href="https://news.ycombinator.com/">Hacker News</a>? Quite simply, it’s where some of the best people meet and share their knowledge about the whole technology ecosystem. There are other aggregators, maybe even in your native language. However, you should really start embracing English as your work language if you haven’t already, and the best way to become proficient is reading and writing a lot.</p>
<p>Some people say Hacker News has become boring and repetitive. These are usually the people who have become experts themselves by reading there for years – I still find lots of new, interesting and important information about technology and startups there.</p>
<p>Read about half an hour every day to stay in touch and learn something new, then move on to work on other things. <a href="http://bennesvig.com/youve-probably-read-enough/">Reading more won’t make you better</a> after a certain point, only practice will.</p>
<h2>How to learn</h2>
<p>I’ll give you a truckload of things to learn below, and learning is the most important thing you will ever do as a developer. But remember that learning works best (some say: only) <a href="http://health.tki.org.nz/Key-collections/Curriculum-in-action/Making-Meaning/Teaching-and-learning-approaches/Experiential-learning-cycle">as a spiral</a>. You’ll learn something, not quite understanding everything and move on to the next topic. Maybe you’re a little frustrated. This goes on and on until you come back to the first topic. You’ll look at the topic again and see that a lot of things explain themselves through the new connections, abstractions and transfers you made on your detours. This spiral will continue forever. If you’re not embarrassed by work you did years ago, you probably haven’t evolved or developed much.</p>
<p>Don’t try to learn any of the topics below “fully” before moving on. It’s simply not possible. Try to get one focus area every day and rotate.</p>
<p>Get used to the feeling of not understanding certain things. It’s normal. You will eventually.</p>
<p>Get used to the feeling of not knowing the exact paths to the goal you’re after. It’s normal. You’ll find out on the way.</p>
<h2>The Mix</h2>
<p>Then there’s “the learning mix”. It’s the optimum mix between reading, working, toying and communication.</p>
<p>Try to get the balance right, because combining all four is part of becoming a great developer.</p>
<p>You need to read, because there is a lot of information that you need in order to make good choices in every possible field.</p>
<p>You need to work because only applying your knowledge to a concrete codebase will complete the learning cycle and give you actual experience.</p>
<p>You need to toy and play. No, I’m not kidding. This is also work but it’s disguised in a much more undirected format, often with no clear goal at all. This is all about looking behind the curtains, trying unusual solutions, long shots which most probably wouldn’t ever work (and finding out why). Finding shortcuts or doing things with frameworks, languages or technologies that are not used at (or not even recommended for) “daily” work. Lots of people – especially women – are conditioned quite early by society not to needlessly toy around but to be “serious” – yet toying around is essential to becoming a great developer.</p>
<p>It’s really about expanding your horizon and it’s even appropriate to do some of these things at work: Small tools and one-off tasks are great use cases for more experimental technology. Some of the more time-consuming things are better done in your free time.</p>
<p>Last but not least, you need to communicate. A lot. You need to ask, tell, understand and be understood. Also, get the mix right. You should hang out about one third within your core technology group (other devs, mentors, bosses), one third within extended technology (other teams, admins, BI guys, meetups) and one third outside of technology to understand other business functions and their needs better.</p>
<p><a href="http://keithferrazzi.com/products/never-eat-alone">Lunch is usually your friend.</a></p>
<h2>Learn 10 Programming languages over the years</h2>
<p>Why ten? Well, I just pulled a number out of my hat that feels about just right to start to get an intuitive grasp on code. You’ll learn that all programming languages share some features that are common but all have distinctive advantages and disadvantages in different situations. It also helps you get over stupid language wars and see the other sides of the table. It’s really about developing a technology “feeling” that will help you make good decisions later. Make sure you get some diversity again.</p>
<p>Choose something low level like C or assembler, something high level with a framework (Rails, Django), something object-oriented (Java, Objective-C), a functional language (Haskell, Clojure), something mixed-paradigm (Go, Scala), something arcane (COBOL, BASIC) and something brand-new or experimental (Rust, Nimrod). Choose a language targeted towards certain niches (Julia, Ada, Autohotkey <– don’t laugh, it was my venture into programming), and maybe something exotic like Brainfuck or LOLCODE. Be sure to learn JavaScript and why it’s both ingenious and awful at the same time. Learn at least the basics of SQL, HTML and CSS – I don’t count them as real languages, but you’ll come across them pretty much everywhere, even if you don’t see yourself as a “web developer”.</p>
<p>You don’t have to learn them all to full proficiency, but you should try to get something specific done in every one of them and move along. Stay around some more time with languages you like, learning them at a deep level – and leave others behind, mentally noting <em>why</em> you didn’t like them. This is really subjective. I for example can’t stand Lisp dialects because I find the parentheses ugly and the syntax unreadable. You will find other programmers totally excited about its expressiveness.</p>
<p>I won’t tell you PHP is a bad choice, <a href="http://whydoesitsuck.com/why-does-php-suck/">lots of others will happily do that for me</a>. Facebook, Wikipedia and Wordpress are examples showing you that the language works. Yet not everything which is widespread is great as well. Don’t believe others, find out yourself. PHP used to make me happy, fast and productive in my early days, but it doesn’t anymore.</p>
<p>Your preferences will change as you get more experienced. Often, beginners start with dynamic languages and hate types at first (mostly because they don’t fully understand them and they’re a hassle to work around), yet start to appreciate them as they gain experience and work in larger teams. YMMV.</p>
<h2>Learn about data structures and databases</h2>
<p>Data is everything you will ever create and move around in IT. If you are a self-made programmer like me, understanding data structures will give you the most headaches but also lots of leverage. This is the boring stuff of every CS class which you cram into your head and then forget – it only gets really interesting when you come across concrete problems where you’re running into the limits of your data structures. You should definitely know what sets, maps, arrays, trees, tries and lists are, how they look in computer memory and what their complexity cost in Big-O notation is. I found it much easier though to learn them on the job when I actually used them.</p>
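<p>To get a feeling for what Big-O means in practice, here is a minimal sketch in Python (chosen here purely for illustration; the function names are hypothetical). It counts how many elements a lookup has to inspect in a plain list versus a sorted list that can be searched by halving:</p>

```python
def linear_search_steps(items, target):
    """Scan a list front to back; return how many elements were inspected."""
    for steps, value in enumerate(items, start=1):
        if value == target:
            return steps
    return len(items)


def binary_search_steps(sorted_items, target):
    """Halve a sorted list repeatedly; return how many elements were inspected."""
    lo, hi, steps = 0, len(sorted_items) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_items[mid] == target:
            return steps
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


numbers = list(range(1_000_000))
target = 999_999  # worst case for the linear scan

print(linear_search_steps(numbers, target))  # O(n): inspects all 1000000 elements
print(binary_search_steps(numbers, target))  # O(log n): at most 20 elements
```

<p>A hash-based set or map goes one step further and answers the same question in roughly constant time on average, at the price of extra memory and no ordering.</p>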
<p>Databases are important as well. If you don’t know anything about databases yet, <a href="https://alpha.app.net/tonymillion/post/18431717">don’t be a hipster and just use Postgres</a>. You’ll be sure to learn something solid, performant and generic which will stick around. But if you’re already planning beyond Gigabytes of data, hundreds of concurrent connections, below a hundred milliseconds of latency or all of the above, you need to understand more scalable database technologies and their limitations. Read about MapReduce, DynamoDB, BigTable, sharding, elasticity, indexes, immutability, ACID and BASE, and these are only the very beginnings.</p>
<p>To really make good technology decisions, you need to talk to your business about the eventual requirements of the system, because you WILL have to make some tradeoffs eventually and you should be aware of them. Think of good questions before you try to find answers, e.g. “How often will data get updated?”, “How much downtime is acceptable in a day / year / minute?”, “How fresh does the data need to be?”. I can give you 20 of these questions drilling down into your real needs right now, and all of them will be reflected in the architectural choice I make.</p>
<p>I was once asked by a CEO why we couldn’t just “put everything into a really fast database and then get results from arbitrary queries in realtime” (note that he was talking about 500 GB a day). I told him that he could sell our company for several billions if we had that technology.</p>
<h2>Learn about your craft</h2>
<p>In the middle of <a href="https://www.codefellows.org/blog/this-is-why-learning-rails-is-hard">this page</a>, you will find a mind map about everything you should master as a Rails developer. Forget the “Rails” part for now, 80% of this map describes the necessary skillset for every language and framework you can think of. You should spend at least a few hours on every single leaf of this tree to understand the concepts behind them. A developer who can program perfectly but can’t use VCS, Continuous Integration and Testing in an Agile Environment is pretty much useless in 2014.</p>
<p>Really, good craftsmanship cannot be overestimated. You should choose to (deeply) learn a powerful editor like Vim/Emacs/Sublime/IDEA and learn ten-finger typing. I bought <a href="http://www.getdigital.de/Blank-Keyboard.html">this</a> keyboard and learned it the hard way – within 2 weeks and at the venerable age of 26 years. Both measures roughly tripled my coding & writing performance and I never looked back.</p>
<h2>Learn about people</h2>
<p>Dealing with people is the single most undervalued skill in IT with the most leverage. This is both the shortest and the longest chapter of this text, because everything that needs to be said about this topic was <a href="http://en.wikipedia.org/wiki/How_to_Win_Friends_and_Influence_People">written 78 years ago already</a> – and the only thing you should really do is get a copy and read it. Don’t be put off by the title, remember this book is as old as your grandparents, yet people haven’t changed a single bit…</p>
<h2>Learn about your business</h2>
<p>Someone who can take any spec and write it down into working, readable and performant code is a great programmer.</p>
<p>Someone who can take any business problem and write it down into a well-modelled spec before coding it down is a great engineer.</p>
<p>You’re free to stop at being a good programmer, but good engineers are paid double the price. Really, the top three skills you can learn if you want to progress your career are</p>
<ul>
<li>Taking a business problem and turning it into a working software model</li>
</ul>
<p>There are lots of good ideas out there, yet execution of all of them depends on your technical transfer skills. Nothing is quite as important as learning to read the language of business people, form it into a domain model and explain and reevaluate the model together with the business people before even writing the first line of code. A great intro to this topic is the first few chapters of Eric Evans’ <a href="http://www.amazon.de/Domain-Driven-Design-Tackling-Complexity-Software/dp/0321125215">“Domain Driven Design”</a>. This also means that you should get familiar with the business itself that you’re currently in. Learning which KPIs are important and have the most impact on your business will help you find much better technical solutions.</p>
<ul>
<li>Explaining technical dependencies and challenges to non-technical people</li>
</ul>
<p>It’s easy to underestimate <a href="http://xkcd.org/1425/">how hard this really is</a>. Yet, the ability to explain everything to a 5-year-old and to negotiate features in order to keep down complexity is incredibly important for you to learn. A great example I had: Product Management specified that they wanted a user who had seen a banner not to see it more than five times within the following 24 hours. I asked if it would be okay to only show the banner to them five times per 24-hour interval. They said “but that’s almost the same!”.</p>
<p>Yes. From a product side this was almost equal, but of course technically, the latter was much less complex, saving lots of programming hours and computing complexity by simply rethinking the specs and providing an alternative. Learn how to teach the concepts of technical debt, technical investment and software rot and incorporate them into your estimates. Speaking of which…</p>
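<p>To make the banner example concrete, here is a minimal sketch of both variants in Python. It assumes one reading of the two specs: the original as a sliding 24-hour window, the alternative as fixed 24-hour intervals. The class names, the <code>allow</code> method and the use of timestamps in seconds are hypothetical and not taken from the real system.</p>

```python
from collections import defaultdict, deque

DAY = 24 * 60 * 60  # seconds per 24 hours
CAP = 5             # maximum banner views


class SlidingWindowCap:
    """At most CAP views within any 24 hours: stores one timestamp per recent view."""

    def __init__(self):
        self.views = defaultdict(deque)  # user -> timestamps of recent views

    def allow(self, user, now):
        recent = self.views[user]
        # Forget views that are 24 hours old or older.
        while recent and now - recent[0] >= DAY:
            recent.popleft()
        if len(recent) >= CAP:
            return False
        recent.append(now)
        return True


class FixedWindowCap:
    """At most CAP views per fixed 24-hour interval: stores one counter per user."""

    def __init__(self):
        self.counts = {}  # user -> (interval index, views in that interval)

    def allow(self, user, now):
        interval = now // DAY
        last_interval, count = self.counts.get(user, (interval, 0))
        if last_interval != interval:
            count = 0  # a new interval started, so the counter resets
        if count >= CAP:
            return False
        self.counts[user] = (interval, count + 1)
        return True
```

<p>The sliding variant has to remember a timestamp for every recent view of every user, while the fixed variant gets away with a single counter per user. The price is that a user can see up to ten banners within a short time around an interval boundary, which is the “almost the same” part.</p>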
<ul>
<li>Making good estimations</li>
</ul>
<p>Estimating is usually one of the things even great programmers learn last. Yet it is crucially important for the business to know how long things will take. Learn to express yourself in terms of complexity and uncertainty. It’s helpful to continuously practice the estimate-observe loop with your own tasks and also learn about how large and complex projects are fundamentally different from a one-man show. A great rule of thumb for larger projects from <a href="http://en.wikipedia.org/wiki/The_Mythical_Man-Month">“The Mythical Man-Month”</a> which I often use is: 1/3 of the time is spent on design, 1/4 on early testing and debugging, 1/4 on late integration and acceptance testing and only 1/6 (!) on actual coding.</p>
<p>Take this into account and multiply the resulting time by 1.2 for every team member before dividing by the team size to account for communication losses. If you have time, read the whole book for some great insights on planning software projects. Same thing here: You won’t believe how accurate most parts of a 40-year-old book on software development still are today…</p>
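<p>As a worked example, here is a minimal sketch of that arithmetic in Python. It assumes one possible reading of the rule: take the coding estimate, scale it up to the whole project via the 1/6 share, multiply by 1.2 once per team member and divide by the team size. The function names and the numbers in the example are made up.</p>

```python
from fractions import Fraction

# Phase shares as quoted from "The Mythical Man-Month" above; they sum to 1.
SHARES = {
    "design": Fraction(1, 3),
    "early testing and debugging": Fraction(1, 4),
    "late integration and acceptance testing": Fraction(1, 4),
    "coding": Fraction(1, 6),
}


def total_effort(coding_weeks):
    """Scale a pure coding estimate up to the whole project (coding is 1/6)."""
    return Fraction(coding_weeks) / SHARES["coding"]


def calendar_weeks(coding_weeks, team_size, overhead=Fraction(6, 5)):
    """Assumed reading: multiply by 1.2 per team member, then divide by the team size."""
    effort = total_effort(coding_weeks) * overhead ** team_size
    return float(effort / team_size)


print(total_effort(4))       # 24 person-weeks for 4 weeks of pure coding
print(calendar_weeks(4, 1))  # 28.8 weeks for a single developer
print(calendar_weeks(4, 3))  # 13.824 weeks for a team of three
```

<p>Under this reading, tripling the team roughly halves the calendar time instead of cutting it to a third, which is the communication loss the factor is meant to capture.</p>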
<h2>Learn to glue</h2>
<p>30 years ago, if you wanted to write an app, you wrote everything yourself, from the device driver routines to the graphics buffer. Fortunately, thanks to the GitHub revolution, nowadays you can find an open source package for just about anything. And systems are so much more complex now that you’d be stupid not to. Not just that – most of the time, these libraries are written by people smarter or more experienced than you are.</p>
<p>On the other hand, with great power comes great responsibility. These packages often will not exactly do what you want or what you need and you’ll have to make important decisions. Do you commit back to the project and hope for a pull request? Do you integrate parts of it as an API? Sometimes it’s a good choice to go for a full stack framework. Other times it’s better to hand-pick and compose some smaller packages to achieve success.</p>
<p>On top of that, you will have to manage the dependencies as well. They will get out of date, glitches and security bugs will pop up and often they might have changed their interface in between so you have to change all your code in order to upgrade. Some code will rot and get abandoned. If it’s a small library, that might be okay, but if it’s your CMS or the fancy bleeding-edge framework you built your website on back then, you might be screwed.</p>
<p>If on the other hand you choose to build everything yourself and end up with a huge stack that’s <a href="http://en.wikipedia.org/wiki/Not_invented_here">reinventing the wheel</a> instead of using standard components, it’s getting continuously harder to onboard people, up to the point where I’ve seen this literally killing companies.</p>
<p>All these problems require a completely new skill, which is searching, composing and gluing software components to your own stack the right way. There’s even a <a href="http://gluecon.com/">conference solely dedicated to this topic</a>, that’s how hard this is in itself. If you have several repositories doing what you want, you need to be able to estimate their traction by evaluating the number of stars and forks, frequency of pull requests, number of open issues, commit frequency, date of last commit and whether they are backed by a company (yet not so heavily that it cripples them), just to mention a few.</p>
<p>Get good at this stuff, and if you really decide to write a component yourself, you should already know the three repositories closest to solving your problem and why they are not quite cutting it for your use case.</p>
<h2>Learn DevOps</h2>
<p>In the days of Infrastructure as a Service, it’s become so easy to start up your own server / cluster / database, that there is simply no excuse not to try it yourself. The entry barrier is so low through services like AWS or Heroku, that you need to be proficient in provisioning a prototyping environment yourself. Some people “admire” my Hadoop skills and complain they never “had the opportunity to work with it” before. I call BS. For just a few dollars, you can run your own map-reduce jobs on Amazon EMR and experiment with huge freely available open-data test sets if you were really interested in it. Get an account NOW and understand which problems the dozens of AWS services solve before you try to roll your own.</p>
<p>Learning how to properly operate, monitor, deploy, secure and maintain real server systems – even superficially – will give you a much deeper understanding of how your code will actually run and perform in the wild.</p>
<p>Also, you’ll get some humility for Admins who do this stuff full time.</p>
<p>Learn about technologies like Vagrant, Docker and Fig and their underlying concepts of <a href="http://blog.codeship.com/immutable-infrastructure/">immutable infrastructure</a> to get into the modern mindset of <a href="http://12factor.net/">12factor</a> apps.</p>
<p>Yes, all of this didn’t even exist in 2011.</p>
<p>Yes, it’s hard to continuously follow these important trends and keep up with the pace. You signed up for the ride.</p>
<h2>Go on from here</h2>
<p>Lots of what I wanted to say has already been said – for some more good stuff that I didn’t cover, go over here:</p>
<p><a href="http://peternixey.com/post/83510597580/how-to-be-a-great-software-developer">http://peternixey.com/post/83510597580/how-to-be-a-great-software-developer</a></p>
<p>I couldn’t have said these things any better myself, thanks Peter.</p>
<p>Thanks for reading and I’m looking forward to seeing what your career has to offer.</p>
<p>Cheers,</p>
<p>Dom</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[The day my cable provider killed me]]></title>
<link href="http://distributed.hamann.se/blog/2014/11/02/the-day-my-cable-provider-killed-me/"/>
<updated>2014-11-02T21:36:00+01:00</updated>
<id>http://distributed.hamann.se/blog/2014/11/02/the-day-my-cable-provider-killed-me</id>
<content type="html"><![CDATA[<h2>or: Eight lessons in avoiding customer service disasters</h2>
<p>Last Sunday, my internet suddenly stopped working. I didn’t think too much of it, but when I reset my router as usual and it still wasn’t working, I was getting curious. Turned out, my TV and my phone didn’t work either. Now, I was definitely angry. I called up the tech support of Kabel Deutschland, my cable provider. After 10 minutes on hold, I gave up and went to bed.</p>
<h4>Lesson 1)</h4>
<h4>Try to stay accessible for your customers. Of course, this isn’t always possible or economical – but it might be helpful to separate urgent technical issues from contract support and prioritize the former. Especially if you are in a position to have significant impact on your customers’ daily life.</h4>
<p>Monday morning, I called tech support again. I asked why none of my devices were working anymore and the agent looked into my case. She mumbled something about a “horrible mistake”, said she was sorry and that everything should be working again within 24 hours. I sighed and left for work, leaving my wife at home alone with our daughter – and no internet or phone whatsoever.</p>
<h4>Lesson 2)</h4>
<h4>If you let your customer service agents promise something to the client, make sure to keep it. If you can’t keep it, don’t let them promise anything specific.</h4>
<p>24 hours later, there was still no internet. When I got to work, my wife sent me a message. Kabel Deutschland had sent a letter, offering their condolences on my death (!) and cancelling all my contracts. Now it was finally becoming obvious what the “horrible mistake” was: I had sent a marriage certificate to them because I had changed my family name – and someone obviously mistook it for a death certificate.</p>
<h4>Lesson 3)</h4>
<h4>If your customer support is able to make high impact decisions, make sure there are basic protections against accidental or malicious misuse. It doesn’t even have to be a four-eye principle, sometimes a checkbox “are you sure” can do the trick. When you’re taking potentially harmful actions against your customers, be transparent and try to inform them early on, preferably on real-time channels.</h4>
<p>I was a little irritated and slightly bemused, but I wasn’t all that mad about the mistake yet. I started my career in customer support and know how little people are paid in this industry and how much they have to do. I was mad, though, about how they dealt with it.</p>
<h4>Lesson 4)</h4>
<h4>If you are dealing with customers and make mistakes, listen to the late Dale Carnegie: Admit mistakes on the spot when you make them and apologize – it will make you much more human and believable. Don’t try to sweep things under the rug.</h4>
<p>My amusement passed when my wife had to send important documents from home, rushing to neighbors with a sick kid just to plug her computer in. She called tech support from her phone, this time escalating to second level support. The agent couldn’t see any of the other calls made the days before, but was “shocked”, apologized and promised again that everything would be working soon, but it would take some time. By now we were writing down names just in case.</p>
<h4>Lesson 5)</h4>
<h4>Get a CRM. If you have one, use it. If you’re using it in first level support, share it with second level support and tech support. Get these basic things straight, before you start to get into the business of dealing with end customers.</h4>
<p>One more day passed. At that time, worried messages started to come in from all over our family because our phone line was still dead. My wife called tech support again, escalating again. They kept asking for more “patience”, because they would have to restart the contract and reprovision all our devices anew. We kept waiting. Both our monthly mobile internet limits were exhausted.</p>
<h4>Lesson 6)</h4>
<h4>As engineers running high-availability platforms, we always have a rollback plan when we deploy new software. So, when things go wrong (and they do go wrong), we can have the old state of the platform back within minutes. My cable provider was obviously completely unprepared. Switching everything off was done within minutes, but reinstating the contract made them struggle.</h4>
<p>The next day was Friday. I had some work to finish on the weekend as well and was hellbent to get the company to get its act together. I called tech support, directly escalated to second level and told the lady: “Look, you guys pronounced me dead on Monday and cancelled my contract. My wife is at home with a sick kid, completely cut off from the outer world since Monday. If you don’t reprovision my modem within the very next 2 hours, I’m going to clear the situation with your head of PR.” – “Yeah, well, you can do that, but first, give me your contract number.”</p>
<p>In fact, I had entered the number before starting the call. (see lesson 5 for helpful remedies)</p>
<p>I gave her the number and she kept me on the line for 5 minutes. After that, she told me that “they are sorry and they are doing everything they can”. I should have some more patience, because she could not “accelerate anything”. I told her my patience had completely run out and that she had exactly one hour and fifty-five minutes to reinstate my contract – even if she had to do it in person. I hung up.</p>
<h4>Lesson 7)</h4>
<h4>Have a priority line in order to deal with urgent requests in your technical support queue. Some requests might be too important to wait in line together with all the others.</h4>
<p>In contrast to Kabel Deutschland, I keep my promises. Three and a half hours later, after confirming our wi-fi at home was still dead, I wrote a quick personal message to their Head of PR, telling him that I would go public with this story if he didn’t answer that very day.</p>
<p>He didn’t.</p>
<h4>Lesson 8)</h4>
<h4>Take your customers seriously. Enough said.</h4>
<p>I still have no internet. Let’s see what Monday brings.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Will Darkmail be secure? Probably, but it doesn't matter.]]></title>
<link href="http://distributed.hamann.se/blog/2013/10/31/is-tox-secure-probably/"/>
<updated>2013-10-31T09:27:00+01:00</updated>
<id>http://distributed.hamann.se/blog/2013/10/31/is-tox-secure-probably</id>
<content type="html"><![CDATA[<p>So there’s currently a lot of buzz about the <a href="http://silentcircle.wordpress.com/2013/10/30/announcing-the-dark-mail-alliance-founded-by-silent-circle-lavabit/">Dark Mail Alliance</a>, which is re-inventing electronic mail in an easy-to-use, fully decentralized and cryptographically secure fashion. I think there are some brilliant minds on the project and I personally wish them the very best.</p>
<p>A few months ago, the first true spiritual successor of the original Skype, called <a href="http://tox.im/">Tox</a>, was in the making, and I can’t even pretend all of this isn’t great news. Finally the FOSS community wraps up DHTs, strong crypto, hole punching and even video – just like the original Skype – and puts it into a nice zero-configuration package. The project is getting well-deserved publicity, crypto expertise and UX love, and is predestined to take off soon in times of PRISM-raised consumer awareness.</p>
<p>So why am I still not excited?</p>
<p>Because we’ve progressed too far. Laws worldwide have been hollowed out by steady lobbying since 9/11 – and it helps to look into the past to predict what will happen next.</p>
<p>Tox and Darkmail are already on the shortlist of every TLA (three letter agency) worldwide. What happens if you use them? Guess what – <a href="http://www.forbes.com/sites/andygreenberg/2013/06/20/leaked-nsa-doc-says-it-can-collect-and-keep-your-encrypted-data-as-long-as-it-takes-to-crack-it/">you’re officially making yourself a preferred target</a>.</p>
<p>A great read on this topic is PHK’s <a href="http://queue.acm.org/detail.cfm?id=2508864">“More encryption is not the solution”</a> that goes much more in depth than this article and I admit I couldn’t have written it any better.</p>
<p>But to phrase my concern in my very own words: You <em>can</em> still trust the math behind cryptography – but it doesn’t matter at all. What you <em>can’t</em> trust in 2013 is your <a href="http://www.infoworld.com/d/mobile-technology/smartphone-makers-break-android-security-pre-installed-apps-180780">device manufacturer</a>, <a href="https://www.eff.org/nsa-spying">your carrier</a>, <a href="http://www.techdirt.com/articles/20130614/02110223467/microsoft-said-to-give-zero-day-exploits-to-us-government-before-it-patches-them.shtml">your operating system</a> and <a href="http://en.wikipedia.org/wiki/PRISM_%28surveillance_program%29">your e-mail provider</a>.</p>
<p>And especially if you’re a second-class citizen of the world (or you <em>are</em> from the US, <a href="http://www.washingtonpost.com/world/national-security/nsa-infiltrates-links-to-yahoo-google-data-centers-worldwide-snowden-documents-say/2013/10/30/e51d661e-4166-11e3-8b74-d89d714ca4dd_story.html?Post+generic=%3Ftid%3Dsm_twitter_washingtonpost">and your traffic somehow gets routed outside US jurisdiction</a>), the NSA won’t give a shit about breaking all international laws to get to the messages you are sending and receiving. Not only did they <a href="http://usgovinfo.about.com/library/weekly/aa050602b.htm">exempt themselves from international jurisdiction</a> like just a few other states such as Cuba or North Korea – they are not even accountable for any of their actions <a href="http://www.legitgov.org/Guantanamo-detainees-cant-mention-torture">to their own people</a>.</p>
<p>So who will use the new strong cryptography tools? Here’s my prediction: 20% curious minds and technologists, 40% who have something to hide for good reasons, 40% who have something to hide for bad reasons. The big masses won’t care. And as long as you’re not using a more exotic device, carrier <em>and</em> operating system – preferably not even from your own IP, so as not to blow your cover by the mere fact that you are using encrypted traffic – you can bet your ass they will go after you to have you <a href="http://www.theguardian.com/technology/2013/oct/09/silk-road-founder-new-york-charges">lynched in public</a>.</p>
<p>So what’s coming up next? <a href="http://arstechnica.com/gadgets/2013/07/motorolas-8-core-x8-chip-gives-us-a-lesson-in-marketing-speak/">Voice processors</a>. After Facebook helped index anything that you expose to the public and Dropbox went along with implicit “free photo sharing” to the NSA, their next target will be everything you didn’t even want to expose. Uploading hours and hours of voice communication is still too obvious when you look at your phone bills, but 24h of your verbal communication in an encrypted zipped text file after your device’s built-in speech recognition (of course together with GPS location info) will just disappear in the data noise of always-connected “factory” and carrier daemons and is easier to process for real-time analysis anyway.</p>
<p>Make no mistake – the on-demand <a href="http://www.slate.com/blogs/future_tense/2013/06/07/nsa_surveillance_iphones_make_snooping_easy_for_spies_and_law_enforcement.html">“roving bug”</a> has been a reality since 2006 – this will only take it to the next level. Even switching your phone off <a href="http://www.stopcellphonetracking.com/nsa-can-still-track-your-cell-phone-even-if-its-turned-off/">will not protect you</a>, but rather make you a hard target again.</p>
<p>Privacy in the near future will be at places where you don’t have a smartphone – and nobody around you does either. It will be as hard to find as diamonds and impossible to stop if we don’t start acting on a political level soon. And to quote my favourite political satirist, Volker Pispers: “Politicians in Germany always say: Why not collect all the fingerprints and save them in a database? The Spaniards do it as well! – Do you actually know why? It was done by a dictatorship (Franco) and fell into the hands of a democracy. No problem – except if it happens the other way around…”</p>
<p>If you’re not scared yet, do yourself a favor and watch the ingenious <a href="http://www.imdb.com/title/tt0405094/">“The Lives of Others”</a> to get a grasp of what a real dictatorship already managed to do with 1980s technology. Put more than thirty years of technology evolution on top and you’ll be scared shitless. If you want to know why the Germans are always at the forefront of privacy protest, that’s why.</p>
<p>What’s going to stop all of this? Not Tox. Not Darkmail. Simply people standing up all over the world to stop the madness.</p>
<p>And honestly – I don’t see that coming.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[The horrifying state of eGovernment in Germany]]></title>
<link href="http://distributed.hamann.se/blog/2013/08/31/the-horrifying-state-of-egovernment-in-germany/"/>
<updated>2013-08-31T20:47:00+02:00</updated>
<id>http://distributed.hamann.se/blog/2013/08/31/the-horrifying-state-of-egovernment-in-germany</id>
<content type="html"><![CDATA[<p>Working and living in Berlin, one of the current major hubs for startups and innovation in Europe, I’m used to teams of a handful of people building globally scaling platforms in just a handful of weeks with a user experience so great that I’m almost taking it for granted.</p>
<p>A few weeks ago, I took a forced detour from the bubble.</p>
<p>Backstory: I used to freelance for smaller IT and media design gigs with my own company, but since I started working full time in a startup, there hasn’t been much time left to do so. Because I knew German bureaucracy is quite strict on freelancers not generating profit for years in a row, I just wanted to officially dismantle my company.</p>
<p>I already knew I didn’t want to show up in person anymore, as the respective administrative office is far away and the opening hours are plain hostile to any working person. So I googled my way around and saw that there’s a standard form you can download, fill out and send by snail mail, together with some copies of your documents. But hey – they were also offering the <a href="http://www.berlin.de/ordnungsamt/dienstleistungen/index.php/dienstleistung/122107/">“online treatment of the service”</a>!</p>
<p>Being a little too curious and naïve, I clicked on the link and followed it down the rabbit hole…</p>
<h4>2013-05-26, 23:10 CET:</h4>
<p>After finding the well-hidden registration link after five paragraphs of legalese and stilted introduction, I’m able to register my first eGovernment account at EU-DLR. Yay. At least they’re confirming my mail address, so I’m continuing by clicking the link in the mail.</p>
<p>I’m greeted by a notification that tells me I’m “accessing the site out of business hours”.</p>
<p>Wait. What!?</p>
<p>They’re serious. Basically everything outside standard business hours is defined by them as a “(possible) maintenance window”. If the server crashes, that basically means waiting until someone fixes it the next day. (How often it actually crashed, I found out over the following days and weeks.)</p>
<p>I’m giving up for today.</p>
<h4>2013-05-27, 19:30 CET:</h4>
<p>Trying to log in again. The system immediately logs me out. Did I make a mistake? I’m trying again, with the same result. Switching from Firefox to Chrome. Same thing. Maybe I should really “try Internet Explorer” as suggested? If only it wasn’t for the fact that it’s 2013, I’m on OS X and Microsoft made the world a better place in 2005 already by <a href="http://en.wikipedia.org/wiki/Internet_Explorer_for_Mac">discontinuing IE for Mac</a>.</p>
<p>Before resorting to my girlfriend’s Windows computer, I’m trying Safari. It seems to work now.</p>
<p>I’m “opening my first case”.</p>
<p>Think of everything you know about usability and user flow – and then try to imagine a site that has been specifically crafted to ignore every single best practice in the field.</p>
<p>The workflow is a seemingly never-ending series of small forms that each have a “next” button. Every click on “next” makes me wait almost a minute before continuing. Somewhere in the middle of the 5th form, after 2 minutes, I’m getting back a raw, unfiltered, unstyled “500 Internal Server Error”. In English, that is. Fortunately I’m only half as confused as the average German business owner using this “service”.</p>
<p>I can’t reenter the site. Giving up again for today.</p>
<h4>2013-05-28:</h4>
<p>I’m discovering that there is no way to access “my cases” by logging into the site normally. After the standard login, there’s no way for me to do anything else but read the FAQ. Only by clicking the link in the activation mail or typing in the correct URL am I able to enter “my case”.</p>
<p>Insert coin and try again.</p>
<p>This time, it’s only taking 10 seconds after each “next”. In one huge form, I need to enter all my data: address, phone number and so on. Finally, the last step in the workflow appears:</p>
<p>I’m able to download the very form I should have sent by snail mail from the beginning – and of course, it’s <em>NOT</em> even prefilled with all the information that I just entered.</p>
<p>Exhausted, I’m following the instruction, printing it out, signing it, scanning it into a PDF with the standard OS X Preview app and uploading it to the “my case” page.</p>
<p>It’s a constantly recurring pattern: Instead of taking the opportunity to simplify online access for citizens and business owners, the process has been designed to resemble <em>the Government’s</em> paper workflow as closely as possible, with all the bells and whistles on top that they think they need for “secure” interaction. Hell, they’re even still writing all about mailings, forms and documents.</p>
<h4>2013-06-01:</h4>
<p>I’m getting a ticket number manually assigned. Whew – something’s moving.</p>
<h4>2013-06-04:</h4>
<p>I’m getting a message from the system: “There’s a problem with your case”. After only 10 tries, I’m successfully logged back in again and see that apparently they can’t read my document. Weird. I’m reuploading the PDF.</p>
<h4>2013-06-08:</h4>
<p>Same thing: They can’t read my form. Again. Apparently ISO-standardized and validated PDF files are way too non-standard to be read by government agencies. To top it off, when I’m clicking the link to my uploaded and signed form in the interface, I’m actually <em>seeing the correct PDF document in my browser</em>. How can they not see or access it? This is just frustrating. Dumb actually. I’m done here.</p>
<h4>2013-07-24:</h4>
<p>“Your case was closed by not answering our request”. Yeah, whatever. Total time clocked so far: way more than 3 business hours. I wish this was billable.</p>
<h4>2013-07-29:</h4>
<p>I’ve had enough. I’m doing what I should have done all along: printing out the business dismantling form from the very first page and a copy of my ID, signing and stamping it and sending it by snail mail. Total time clocked: 10 minutes, 55 cents postage.</p>
<p>Sidenote: This could have been free with signed e-mail as well. I even have the new German “electronic” ID with signature chip and a reader (introduced in 2010), uniquely identifying me as a German citizen with full cryptographic proof.</p>
<p>However, there’s nobody to send it to: The government agency still can’t receive signed mails “for technical reasons”.</p>
<p>Never would have thought I’d ever think of sending snail mail as easier than handling my business online.</p>
<h4>2013-08-27:</h4>
<p>Having long forgotten about this major disappointment, that day I’m coming home and there’s registered mail in my postbox: An “Androhungsbescheid” (literally: “Threatening Notification”), telling me that if I don’t re-upload the PDF, I’m subject to a penalty payment of 500€.</p>
<p>Seriously?</p>
<p><em>You</em> can’t read the most standardized document format on the planet, you provide me an interface whose only purpose is to steal my time, I’m sending you everything by mail, you can’t even find it and put the files on my case – and now you’re even <em>threatening</em> me?</p>
<p>I’ve lost it. Okay, so what to do now? The two times that companies actually made me lose my composure by extreme amounts of insolence, I immediately dealt with their press department by issuing a credible threat of massive public shaming, leveraging my former media agency. Up to now, I’ve always won – and more.</p>
<p>This time it’s hopeless. After researching half the night, I’m realizing that there’s simply nobody to complain to. There’s a single method of complaint against people or actions of the German government, called “Dienstaufsichtsbeschwerde” (disciplinary complaint), that’s commonly taught to law practitioners as “formlos, fristlos, fruchtlos” (no form, no deadline, no result). Anyone can write one and exactly nothing will happen. There’s just a giant moloch that’s refusing to move.</p>
<p>This time I’ll just play along, reopening my case and reuploading the PDF in every image format that has ever been invented from BMP to DICOM in the hope of being accepted – and only leave this blog post as an exhausted note of protest to <em>someone</em>.</p>
<p>I’m an IT guy with some pride and attitude left. Whoever designed this system’s UX, technical base and business processes would immediately have been fired from any of the companies I’ve ever worked for due to total incompetence. Who does things like this to humanity? How much do they get paid? Where’s the accountability and common sense gone in projects like this?</p>
<p>I’m at a loss for words. There’s only this much left for me to say:</p>
<p>“Dear German government. The best way to grow and keep the economy is to not utterly frustrate and stand in the way of the people who actually grow and keep the economy.”</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[You probably have too much code.]]></title>
<link href="http://distributed.hamann.se/blog/2013/06/01/you-probably-have-too-much-code/"/>
<updated>2013-06-01T17:06:00+02:00</updated>
<id>http://distributed.hamann.se/blog/2013/06/01/you-probably-have-too-much-code</id>
<content type="html"><![CDATA[<p>There’s a recent trend in the software industry that I’m watching with a little concern. Countless new tools for static code analysis and “automated software review” are popping up on the market. Being in a technology leadership position at a successful startup myself, I’m now getting pitched this stuff on a regular basis. Even though I’m absolutely dedicated to code quality, I’m very skeptical of these tools – because, in my humble opinion, they are trying to solve the wrong problem. There’s an uncomfortable truth that I’d love to tell all companies seriously seeking to relieve their code quality pains by using these methodologies:</p>
<blockquote><p>You probably have too much code.</p></blockquote>
<p>The problem at the root of this issue is a decades-old industry focus on, and incentivization of, <em>producing</em> code, not taking it away. If you give a problem to a developer, there’s a usual and deeply embedded Pavlovian reflex of solving it by <em>writing</em> or <em>adding</em> code. It’s what a software developer is paid for anyway, right?</p>
<h4>Less is more</h4>
<p>All these are signs of a dangerous culture that encourages software bloat, steadily leading to slowed software development up to a state of total unmaintainability. If you don’t believe me, go <a href="http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/80ba792f-5e2a-2f10-0594-89315609e177?QuickLink=index&overridelayout=true&55959128966099">take a look at the 3.5 gigabyte (!) binary of SAP BusinessObjects Explorer</a>. I’ve seen so many companies grinding to a halt by throwing software layers on top of each other, that I’ve coined a name for it: <em>Death by Abstraction</em>.</p>
<h3>Why you need less code</h3>
<h4>Less code –> fewer errors. Period.</h4>
<p>There are studies indicating that the density of software errors is <a href="http://www.lsmod.de/~bernhard/cvs/text/dipl/papers/p42-basili.pdf">most strongly correlated with the number of lines in the module, <em>regardless of the language used</em></a>. Having less code makes projects more stable, leads to better test coverage and simplifies operations. It also helps to use a language that is as terse as possible (we love Coffeescript).</p>
<h4>Less I/O</h4>
<p>Code doesn’t just have to be written – it has to be read as well. Reading and understanding code takes time, lots of it. (Paradoxically, large, complex and fragmented code bases tend to have a lot of large, complex and fragmented documentation to read on top.)</p>
<p>Time, however, is usually one of the most valuable and scarce resources in the software industry.</p>
<p>Having less code helps you onboard new developers on a project and get them productive way faster, and in the same way also helps with turnover. Most importantly, it prevents complexity by “code fear”: If people can’t understand other people’s code, they tend to abstract it away. This is the exact reason monstrosities like <a href="http://en.wikipedia.org/wiki/Common_Object_Request_Broker_Architecture">CORBA</a> were born in the first place, feeding the code-adding cycle once again. I’ve been in companies where projects had to be abandoned because people were scared to touch a recently “disappeared” employee’s source code. Writing less code helps prevent such disasters in the first place.</p>
<h3>How to get and write less code</h3>
<h4>Incentivise and encourage a “taking away” mentality</h4>
<p>I can’t overstate how important it is to continuously fight against the code-adding mentality itself that still resides in most developers’ heads. Put into your core values, acceptance criteria and processes that taking code away is just as important as (if not slightly more important than) writing it. Every net code line and file added is a maintenance liability.</p>
<h4>Simplify requirements</h4>
<p>If you are designing a software component to solve all of your current and future problems at once, <a href="http://en.wikipedia.org/wiki/Second-system_effect">you’re setting yourself up for failure</a>. Solve the problem at hand and nothing more – if in doubt, <a href="http://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it">you ain’t gonna need it</a>.</p>
<p>Clearly separating your problem domain and constraints enables powerful shortcuts. If a part of your internal tools does not need to be in the customer-facing dashboard, you can save yourself half the development time by getting away without pixel-perfect design mockups and OAuth authentication, and with less QA, less testing… you get the idea.</p>
<p>Generalization is usually the enemy of lean software. One of the best strategies is to force yourself to generalize “on demand”. Start with all assumptions and shortcuts necessary to make your life easier <em>right now</em> and follow the old adage <a href="http://c2.com/cgi/wiki?ThreeStrikesAndYouRefactor">“One, Two, Three, Refactor”</a>. Don’t worry too much about “later” – more often than you think, you’ll come up with an even more elegant solution anyway.</p>
<h4>Challenge frameworks, libraries and abstraction layers</h4>
<p>Separation of Concerns is incredibly important – but so is keeping a code base maintainable. Coming from a Java shop with nightmares of SOA done wrong, we had a grand vision of a one-page dashboard in Backbone and Chaplin.JS served by a fancy Collection+JSON API as a Broadway.JS application on top of a Sequelize ORM.</p>
<p>In the end we saw we had so many dependencies and so much “magic” happening behind the scenes that displaying simple things from the database in the dashboard took ages in development – we realized we had made a 2013 version of the same overengineering mistakes we had committed to not repeating. We threw out nearly everything and now use pure Backbone.JS on a lean, hand-built JSON API, tripling development efficiency.</p>
<p>Don’t believe in hypes. Believe in simplicity.</p>
<h4>Take a step back</h4>
<p>If you regularly look at the big picture and question the compartment and component borders inside of peoples’ heads, you’ll often find that modules do duplicated work, can be merged or could be thrown away altogether. Be bold and go for it when there’s an opportunity.</p>
<h4>Fight complexity all the way</h4>
<p>The sooner you realize that code is as much a part of the problem as a part of the solution, the sooner you can start coding less. If the solution to your problem can fit into a plain old class, don’t encapsulate it with the newest MVC framework or instantiate an <a href="http://static.springsource.org/spring/docs/2.5.x/api/org/springframework/aop/framework/AbstractSingletonProxyFactoryBean.html">AbstractSingletonProxyFactoryBean</a> around it “just because”. Seriously, keep it as simple as possible – but not simpler.</p>
<p>I may be a heretic here, but if you look objectively at the <a href="http://en.wikipedia.org/wiki/Design_Patterns">Gang of Four patterns</a>, you’ll realize that most of them were designed to lessen the pain of adding more code to a project involving many developers by <em>adding more code</em>. And that’s okay; remember they come from an era without Git, TDD and Continuous Integration. They can be powerful at times, but they often serve as fuel for the vicious and self-serving cycle of software bloat. Use them wisely, but not just for their own sake or to look professional.</p>
<h4>Continuous Rewrite</h4>
<p>I’m a big believer in Continuous Rewriting and it also <a href="http://www.citconf.com/wiki/index.php?title=Continuous_rewriting">seems to be picking up adoption lately</a>. If instead of going all the way, you start out with a prototype that’s <em>meant to be thrown away</em>, you achieve a lot of important goals:</p>
<ul>
<li>You free your mind from having to deliver the final solution and iterate <em>fast</em>, casually adding tests as you go.</li>
<li>You protect yourself from abstracting too much and prevent adding “fixtures” for possible future use.</li>
<li>You will have solved (a part of) the problem with working code and have validated the technical feasibility of all aspects.</li>
<li>You learn a <em>lot</em> about the stuff that works and scales and the stuff that doesn’t.</li>
</ul>
<p>Now if you have separated your problem domain properly and have tests for business rules, it’s easy to take all the knowledge you’ve gained to not just refactor and extend the component but to <em>completely rewrite it</em>, simplifying the internal design to the maximum. This way, we have rewritten our core matching component three times already – from 3000 lines of Java to 600 lines of Python to 200 lines of Coffeescript, while still adding functionality and scalability by a factor of 200.</p>
<h4>Pair programming and code reviews</h4>
<p>Pair programming and code reviews are the best tools to prevent people from reinventing the wheel, to keep each other updated on best practices, to simplify code and to reduce clutter. Both techniques directly as well as indirectly lead to less code – they guarantee that someone else will have read and understood the code and reduce “code fear” by an order of magnitude. If you find you often don’t have the time to do pair programming and code reviews, you probably should do more of them.</p>
<h3>More tips?</h3>
<p>I’m curious, so be sure to let me know in the comments!</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Cache invalidation made easy]]></title>
<link href="http://distributed.hamann.se/blog/2013/05/31/cache-invalidation-made-easy/"/>
<updated>2013-05-31T08:13:00+02:00</updated>
<id>http://distributed.hamann.se/blog/2013/05/31/cache-invalidation-made-easy</id>
<content type="html"><![CDATA[<blockquote><p>“There are only two hard problems in Computer Science: cache invalidation and naming things.”</p><footer><strong>Phil Karlton</strong></footer></blockquote>
<p>Full disclosure: No, I didn’t find the perfect solution either (guess it’s an NP-hard problem). For a lot of use cases, one of the generically applicable patterns I like best is explained very well by <a href="http://37signals.com/svn/posts/3113-how-key-based-cache-expiration-works">DHH</a> – my problem with it was that it does not cover the case when entity content changes under the very same ID. I will show you another nice and generic pattern for just this purpose, so you have another trick up your sleeve.</p>
<p>For us, the key problem was distributed tracking applications caching metadata about the campaigns or images we’re delivering. For example, if a campaign manager needs to change the URL, the tracking application needs to redirect to another target. As the tracking is high volume, caching was a no-brainer. For cache invalidation, we first settled on a pull approach of one-minute refreshes from the database, which could unfortunately serve stale data and obviously wouldn’t scale with growing numbers of entries and servers.</p>
<p>Now instead of custom-building something ourselves, we thought of a more generic approach – and for us, it boiled down to a dead simple convention.</p>
<h4>How it works</h4>
<p>First step, set up a messaging service. If you’re on Amazon like us, <a href="http://aws.amazon.com/de/sns/">SNS</a> (+<a href="http://aws.amazon.com/de/sqs/">SQS</a> maybe) fits the bill perfectly and is set up in minutes, otherwise you might consider <a href="http://www.rabbitmq.com/">RabbitMQ</a> or any AMQP provider.</p>
<p>Second step, create one topic per entity type, named after the entity.</p>
<p>Third step, follow an easy convention:</p>
<ul>
<li>Everyone who is mutating this entity (with us, it’s just the API, making things even easier) publishes the entity ID to the topic after writing to the database.</li>
<li>Everyone who is caching this entity subscribes to the topic and re-pulls the instance when a “dirty” ID comes in.</li>
</ul>
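<p>The convention above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: <code>TopicBus</code> is a hypothetical in-process stand-in for SNS or RabbitMQ, the “database” is a plain dict, and the names <code>CachingReader</code> and <code>update_campaign</code> are made up for this example.</p>

```python
from collections import defaultdict


class TopicBus:
    """Hypothetical in-process stand-in for SNS/RabbitMQ: one topic per entity type."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, entity_id):
        # Only the ID travels through the bus, never the entity data.
        for callback in self._subscribers[topic]:
            callback(entity_id)


class CachingReader:
    """A consumer that caches entities and re-pulls them when an ID is marked dirty."""

    def __init__(self, bus, topic, database):
        self._database = database
        self._cache = {}
        bus.subscribe(topic, self._on_dirty)

    def get(self, entity_id):
        if entity_id not in self._cache:
            # Copy, so the cache is a real snapshot and not an alias of the database.
            self._cache[entity_id] = dict(self._database[entity_id])
        return self._cache[entity_id]

    def _on_dirty(self, entity_id):
        # Re-pull only entities we actually hold; anything else cannot be stale.
        if entity_id in self._cache:
            self._cache[entity_id] = dict(self._database[entity_id])


def update_campaign(bus, database, campaign_id, **changes):
    """The single writer (the API): write to the database first, then publish the ID."""
    database[campaign_id].update(changes)
    bus.publish("campaign", campaign_id)


database = {42: {"url": "http://example.com/old"}}
bus = TopicBus()
tracker = CachingReader(bus, "campaign", database)

before = tracker.get(42)["url"]
update_campaign(bus, database, 42, url="http://example.com/new")
after = tracker.get(42)["url"]
```

<p>With a real messaging service, delivery is asynchronous, so there is a short window between the write and the re-pull in which the cache is still stale.</p>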
<h5>This approach is so beautiful for a lot of reasons:</h5>
<ul>
<li>Loose coupling. There are no hard dependencies between any apps following this pattern, but they all work together magically with no stale caches.</li>
<li>Very easy to convert from a pull-based system</li>
<li>No to low security hassle. Data never goes through the queue, only IDs</li>
<li>Added plus for SNS users: You can safely leave the dirty work of monitoring and reliably keeping up a messaging system to Amazon.</li>
</ul>
<p>Caveat lector: Obviously this approach will probably only be viable for medium- to low-volume core and master data. For caching and invalidation at several orders of magnitude higher, you’d probably be looking at specialized solutions and optimized modelling (Cassandra, S4 and the like). Also, be careful, as this eventually consistent solution still has some low but unknown amount of time during which the cache will be stale. So if you’re <em>absolutely dependent</em> on consistency for a problem domain, there is little alternative to disabling caching altogether.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Careful with Cassandra upserts]]></title>
<link href="http://distributed.hamann.se/blog/2013/05/22/careful-with-cassandra-upserts/"/>
<updated>2013-05-22T08:13:00+02:00</updated>
<id>http://distributed.hamann.se/blog/2013/05/22/careful-with-cassandra-upserts</id>
<content type="html"><![CDATA[<p>A nice thing about Cassandra is the easily understandable data model: There are just upserts – an insert will automatically update / overwrite old rows. This does NOT hold true in every case, however, when using <em>dynamic</em> columns, as Cassandra does not have the same concept of a “row” as a traditional database.</p>
<p>Essentially, a Cassandra “row” is just a double hashmap. The first layer is keyed by the row key and says exactly <em>on which server</em> the row is; the second is keyed by the column key and says <em>where on the server</em> the column is. This very flexible concept can lead to a problem later on, though, when two writes to the same key don’t carry the same set of columns.</p>
<p>Here’s an entry in the “Employees” ColumnFamily:</p>
<pre><code>employee_id: 599 (KEY)
name: "Larry Page"
age: 46
</code></pre>
<p>Now for various reasons, we have to update employee 599 with another denormalized person:</p>
<pre><code>employee_id: 599 (KEY)
name: "Sylvie Stone"
devices: ["MacBook Pro"]
</code></pre>
<p>Sylvie didn’t tell us her age (she’s a lady after all!) and for new employees, we’re also tracking the devices we handed them. When we’re upserting employee 599, a lot of people with SQL or a document-oriented database background are expecting to have the second entry in the database. That’s not true at all unfortunately – what we will find now is this:</p>
<pre><code>employee_id: 599 (KEY)
name: "Sylvie Stone"
devices: ["MacBook Pro"]
age: 46
</code></pre>
<p>Welcome to the world of column-oriented databases – and before you think “WTF”, think about it for a moment. This is expected behaviour and part of Cassandra’s “independent columns” paradigm. Even if it looks like it in CQL, you never actually overwrite rows – you overwrite the columns behind it.</p>
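<p>To make the “independent columns” behaviour concrete, here is a tiny Python simulation. It is a toy model with plain dicts, not Cassandra’s actual storage code; the <code>upsert</code> helper and the assumption that the newest timestamp wins per column are simplifications for illustration.</p>

```python
def upsert(row, columns, timestamp):
    """Write each column independently; columns not mentioned are left untouched."""
    for name, value in columns.items():
        existing = row.get(name)
        # Per column, the write with the newest timestamp wins.
        if existing is None or timestamp >= existing["timestamp"]:
            row[name] = {"value": value, "timestamp": timestamp}
    return row


employee_599 = {}
upsert(employee_599, {"name": "Larry Page", "age": 46}, timestamp=1340133763990010)
upsert(
    employee_599,
    {"name": "Sylvie Stone", "devices": ["MacBook Pro"]},
    timestamp=1340385863990010,
)

# What a reader sees: the old "age" column survives the second write.
visible = {name: column["value"] for name, column in employee_599.items()}
```

<p>The second write never mentions <code>age</code>, so nothing overwrites or deletes it.</p>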
<p>So how to avoid this? You just need to model your data properly or navigate around it. As Cassandra columns are way smarter than columns in other databases, there exists a way to correct for this effect in case it’s needed. How? Look under the hood. What Cassandra really stores is this:</p>
<pre><code>"599": [
{name:employee_id, value:599, timestamp: 1340385863990010, ttl: 0},
{name:name, value:"Sylvie Stone", timestamp: 1340385863990010, ttl: 0},
{name:devices, value:["MacBook Pro"], timestamp: 1340385863990010, ttl: 0},
{name:age, value:46, timestamp: 1340133763990010, ttl: 0}
]
</code></pre>
<p>As you’ll have imagined, Sylvie is relieved she’s not really 46… the entry is simply older than the rest of them, but was neither deleted nor overwritten!</p>
<p>Every decent driver for Cassandra can expose the timestamps and TTLs as well – and there’s your solution to clean up the mess in the “eventually consistent” paradigm that the database follows: If a column doesn’t have the same timestamp as the key, simply discard it (you’re free to delete it as well).</p>
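<p>A minimal sketch of that cleanup rule in Python, assuming the driver hands you the columns as a list of dicts shaped like the listing above. The function name <code>current_columns</code> and the dict layout are assumptions for this example, not any real driver’s API.</p>

```python
def current_columns(columns, key_name="employee_id"):
    """Keep only the columns that were written together with the row key."""
    key_timestamp = next(c["timestamp"] for c in columns if c["name"] == key_name)
    return {
        c["name"]: c["value"]
        for c in columns
        if c["timestamp"] == key_timestamp
    }


# Hypothetical driver output for row "599", mirroring the listing above.
stored = [
    {"name": "employee_id", "value": 599, "timestamp": 1340385863990010, "ttl": 0},
    {"name": "name", "value": "Sylvie Stone", "timestamp": 1340385863990010, "ttl": 0},
    {"name": "devices", "value": ["MacBook Pro"], "timestamp": 1340385863990010, "ttl": 0},
    {"name": "age", "value": 46, "timestamp": 1340133763990010, "ttl": 0},
]

clean = current_columns(stored)
```

<p>This only works if every write to the row also rewrites the key column with the same timestamp, as in the example; otherwise you would need a different marker for “the latest write”.</p>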
<p>And don’t worry, this added hassle in handling something that would be considered a no-brainer with more conventional databases is more than worth the flexibility gained with Cassandra’s independent columns. More on advanced data modelling leveraging this power will follow!</p>
]]></content>
</entry>
</feed>