---
title: The Institute for Ethical AI & Machine Learning
description: The Institute for Ethical AI & Machine Learning is a Europe-based research centre that brings together technologists, academics and policy-makers to develop industry frameworks that support the responsible development, design and operation of machine learning systems.
---
<html>
<head>
{% include header.html %}
</head>
<body>
<div id="page-wrapper">
{% include navbar.html %}
<section id="banner" style="font-size: 11pt">
<div class="content" style="text-align: center">
<h2 style="font-size: 4em; color: #01C3A7; font-weight: bold; text-align: center; line-height: 1.3em; margin-bottom: 20px">The Responsible Machine Learning Principles</h2>
<img class="logo-image" src="images/logos/eml-logo-white.png" alt="" style="max-width: 440px; width: 80%; margin-top: 30px" />
<header>
<h2 style="font-weight: bold; margin-top:0px">A practical framework to develop AI responsibly</h2>
<p style="font-weight: bold; max-width: 850px">
The 8 principles of responsible ML development provide a practical framework to support technologists when designing, developing or maintaining systems that learn from data.
<br>
<br>
If these principles resonate with you, we invite you to join the <a href="index.html#contact">Ethical ML Network (BETA)</a>, and be part of a global network of leaders driving forward positive change in this area.
</p>
</header>
<br>
</div>
<a href="#one" class="goto-next scrolly">Next</a>
</section>
<section id="one" class="spotlight style1 bottom">
<span class="image fit main"><img src="images/dots-vision.jpg" alt="" /></span>
<div class="content">
<header class="major">
<h2>
The Responsible Machine Learning Principles
</h2>
<h3>
The Responsible Machine Learning Principles are a practical framework put together by domain experts.
<br>Their purpose is to provide guidance for technologists to develop machine learning systems responsibly.
<br>
<br>
</h3>
</header>
</div>
</section>
<section id="four" class="wrapper style1 special fade-up">
<div class="container">
<div class="container">
<div class="row uniform">
<section class="3u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-cog"></span>
<h3><a href="#commitment-1">1. Human augmentation</a></h3>
<p>I commit to assess the impact of incorrect predictions and, when reasonable, design systems with human-in-the-loop review processes.</p>
</section>
<section class="3u 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-pie-chart"></span>
<h3><a href="#commitment-2">2. Bias evaluation</a></h3>
<p>I commit to continuously develop processes that allow me to understand, document and monitor bias in development and production.</p>
</section>
<section class="3u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-info-circle"></span>
<h3><a href="#commitment-3">3. Explainability by justification</a></h3>
<p>I commit to develop tools and processes to continuously improve transparency and explainability of machine learning systems where reasonable.</p>
</section>
<section class="3u$ 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-flask"></span>
<h3><a href="#commitment-4">4. Reproducible operations</a></h3>
<p>I commit to develop the infrastructure required to enable a reasonable level of reproducibility across the operations of ML systems.</p>
</section>
<section class="3u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-group"></span>
<h3><a href="#commitment-5">5. Displacement strategy</a></h3>
<p>I commit to identify and document relevant information so that business change processes can be developed to mitigate the impact towards workers being automated.</p>
</section>
<section class="3u 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-bar-chart"></span>
<h3><a href="#commitment-6">6. Practical accuracy</a></h3>
<p>I commit to develop processes to ensure my accuracy and cost metric functions are aligned to the domain-specific applications.</p>
</section>
<section class="3u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-commenting-o"></span>
<h3><a href="#commitment-7">7. Trust by privacy</a></h3>
<p>I commit to build and communicate processes that protect and handle data with stakeholders that may interact with the system directly and/or indirectly.</p>
</section>
<section class="3u 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-exclamation-triangle"></span>
<h3><a href="#commitment-8">8. Data risk awareness</a></h3>
<p>I commit to develop and improve reasonable processes and infrastructure to ensure data and model security are being taken into consideration during the development of machine learning systems.</p>
</section>
</div>
</div>
<br>
<br>
<br>
<br>
<h2>Continue reading for more detail on each principle</h2>
</div>
</section>
<section id="two" class="spotlight style2 right stylelong">
<span class="image fit main"><img src="images/robothand.png" alt="" /></span>
<div class="content">
<header>
<h2 id="commitment-1">1. Human augmentation</h2>
<p>I commit to assess the impact of incorrect predictions and, when reasonable, design systems with human-in-the-loop review processes.</p>
</header>
<p>When introducing automation through machine learning systems, it's easy to forget the impact that wrong predictions can have in full end-to-end automation.</p>
<p>Technologists should understand the consequences of incorrect predictions, especially when automating critical processes that can have a significant impact on human lives (e.g. justice, health, transport, etc.).</p>
<p>However, this isn't limited to obviously critical use-cases: enabling subject-domain experts as human-in-the-loop reviewers at the end of ML systems can have significant benefits.</p>
<ul class="actions">
<li><a href="index.html#contact" class="button">Join the network</a></li>
</ul>
</div>
<a href="#three" class="goto-next scrolly">Next</a>
</section>
<section id="four" class="wrapper style1 special fade-up">
<div class="container">
<header class="major">
<h2>1. Human augmentation</h2>
<p>What are some examples where I should look towards adding human-in-the-loop review processes?</p>
</header>
<div class="box alt">
<div class="row uniform">
<section class="4u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-flask"></span>
<h3>Automatic prison sentence scrutiny</h3>
<p><a href="https://www.wired.com/2017/04/courts-using-ai-sentence-criminals-must-stop-now/">A fully end-to-end machine learning system that predicts prison sentences automatically</a> is a classic example of a system that should be deployed carefully, ideally with a human-in-the-loop review. Especially given that in this example the inner workings of the model cannot be explained, which is addressed in <a href="#commitment-3">Commitment #3</a>.</p>
</section>
<section class="4u 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-comment"></span>
<h3>Fraud detection evaluation</h3>
<p>Fraud detection is a prime example where a human-in-the-loop review process may be necessary. Instead of removing humans from the process completely, a domain expert can be asked to verify a sample of the model's results to ensure performance is aligned with the objectives.</p>
<p>Often partial automation (e.g. having 3 people instead of 50 performing a specific process) still provides significant value, together with an extra layer of safety.</p>
</section>
<section class="4u$ 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-area-chart"></span>
<h3>Temporary manual review process</h3>
<p>When rolling out automation systems, the ultimate objective may be to fully automate a process end-to-end. However, when reasonable, it may be sensible to first deploy the system with a human-in-the-loop review in place. The system's precision and recall can then be evaluated during a production period, and full automation enabled once performance is deemed acceptable.</p>
</section>
</div>
</div>
</div>
</section>
<section id="three" class="spotlight style3 left stylelong">
<span class="image fit main bottom"><img src="images/bias.jpg" alt="" /></span>
<div class="content">
<header>
<h2 id="commitment-2">2. Bias evaluation</h2>
<p>I commit to continuously develop processes that allow me to understand, document and monitor bias in development and production.</p>
</header>
<p>When building systems that make non-trivial decisions, we will always face the computational and societal bias inherent in data. This bias is impossible to avoid, but it is possible to document and/or mitigate it.</p>
<p>However, we should take a step back from only trying to embed ethics directly into the algorithms themselves. Instead, technologists should focus on building processes and methods to identify and document the inherent bias in the data, features and inference results, and subsequently the implications of this bias.</p>
<p>Given that the implications of the bias identified are specific to the domain and use-case of the technology, technologists should be able to identify and explain the bias in the data and features, so the right processes can be put in place to mitigate potential risks.</p>
<ul class="actions">
<li><a href="index.html#contact" class="button">Join the network</a></li>
</ul>
</div>
<a href="#four" class="goto-next scrolly">Next</a>
</section>
<section id="four" class="wrapper style1 special fade-up">
<div class="container">
<header class="major">
<h2>2. Bias evaluation</h2>
<p>What are some examples where I should look towards having effective bias evaluation?</p>
</header>
<div class="box alt">
<div class="row uniform">
<section class="4u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-flask"></span>
<h3>Pragmatic evaluation of bias</h3>
<p>As a technologist it is important to obtain an understanding of how potential biases might arise. Once the different sub-categories of bias are identified, it's possible to break the results down by precision, recall and accuracy for each of the potential inference groups, as in the sketch below.</p>
<p><a href="https://pair-code.github.io/what-if-tool/">Google's What-If Tool on income classification</a> provides an interactive way to visualise and assess model and data bias - it's possible to see that "race" and "sex" are two of the strongest features.</p>
</section>
<section class="4u 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-comment"></span>
<h3>Having "the right" datasets</h3>
<p>Whether it comes from manual labelling, collection from a data-source or generation through simulations, it is important to appreciate that getting access to representative and balanced datasets is a non-trivial task.</p>
<p><a href="https://explosion.ai/blog/supervised-learning-data-collection">"Don't expect good data by boring the hell out of underpaid people"</a> - wise words from the core spaCy team, and a solid wake-up call for technologists to make explicit efforts when obtaining or generating training and evaluation datasets.</p>
</section>
<section class="4u$ 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-area-chart"></span>
<h3>Equity, equality and beyond</h3>
<p>The deployment of a biased system can have the effect of reinforcing pre-existing societal bias (<a href="https://www.oii.ox.ac.uk/videos/does-ai-have-gender/">Professor Gina Neff provides insight in her talk "Does AI Have Gender?"</a>). It is certainly possible to configure a system in such a way that it works towards reducing that bias.</p>
<p>However, this is an extremely sensitive and complex issue. For example, do we want to configure the system for equality, or for equity? These decisions should not be taken lightly, and in most (if not all) cases they should be made beyond the technologists themselves.</p>
<p>For reasons like this, this commitment encourages technologists to focus on identifying and documenting the biases present, together with their potential impact. Ethical decisions should be made together with the relevant industry stakeholders (ethics boards, regulatory bodies, etc.).</p>
</section>
</div>
</div>
</div>
</section>
<section id="two" class="spotlight style2 right stylelong">
<span class="image fit main"><img src="images/iceberg.png" alt="" /></span>
<div class="content">
<header>
<h2 id="commitment-3">3. Explainability by justification</h2>
<p>I commit to develop tools and processes to continuously improve transparency and explainability of machine learning systems where reasonable.</p>
</header>
<p>With the deep learning hype, technologists often throw large amounts of data into complex ML pipelines hoping something will work, without understanding how the pipelines work internally. Technologists should instead invest reasonable efforts, where necessary, to continuously improve the tools and processes that allow them to explain results based on the features and models chosen.</p>
<p>It is possible to use different tools and approaches to make ML systems more explainable, such as adding domain knowledge through the features themselves instead of just allowing deep/complex models to infer them.</p>
<p>Even though in certain situations accuracy may decrease, the transparency and explainability gains may be significant.</p>
<ul class="actions">
<li><a href="index.html#contact" class="button">Join the network</a></li>
</ul>
</div>
<a href="#three" class="goto-next scrolly">Next</a>
</section>
<section id="four" class="wrapper style1 special fade-up">
<div class="container">
<header class="major">
<h2>3. Explainability by justification</h2>
<p>What are some examples where I could get a better understanding of explainability in practice?</p>
</header>
<div class="box alt">
<div class="row uniform">
<section class="2u 12u(medium)">
</section>
<section class="4u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-area-chart"></span>
<h3>Explainability through feature importance</h3>
<p>Often the challenge of explainability can be simplified by reducing the scope of what needs to be explainable. On some occasions, it is possible to increase the explainability of the model by analysing the features and inference results.</p>
<p>Getting a better understanding of the importance of each feature on each result enables technologists to explain the model itself. Several tools can help with this, including <a href="https://ai.googleblog.com/2018/09/the-what-if-tool-code-free-probing-of.html">TensorBoard's What-If Tool</a>, as well as <a href="https://github.com/slundberg/shap">SHAP (SHapley Additive exPlanations)</a>, which allows the effect of each feature to be understood, as in the sketch below.</p>
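<p>As a minimal hedged sketch (the toy dataset and tree model are assumptions made for the example, not a recommendation), SHAP attributions can be produced like this:</p>
<pre><code>
# Illustrative sketch: SHAP feature attributions for a tree-based model
# on a toy dataset (both are assumptions made for the example).
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Summarise how strongly each feature pushes predictions up or down.
shap.summary_plot(shap_values, X)
</code></pre>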
</section>
<section class="4u$ 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-comment"></span>
<h3>Domain knowledge to increase explainability</h3>
<p><a href="https://www.youtube.com/watch?v=Um7grgYdBQQ">Bons.ai has a great insight on explainability</a> that shows how it is possible to introduce explainability even in very complex models by introducing domain knowledge.</p>
<p>Deep learning models are able to identify and abstract complex patterns that humans may not be able to see in the data. However, there are many situations where, by introducing a-priori expert domain knowledge into the features, or by abstracting key patterns identified by the deep learning models as explicit features, it is possible to break the model down into smaller, more explainable pieces. </p>
</section>
</div>
</div>
</div>
</section>
<section id="three" class="spotlight style3 left stylelong">
<span class="image fit main bottom"><img src="images/blueprint.jpg" alt="" /></span>
<div class="content">
<header>
<h2 id="commitment-4">4. Reproducible operations</h2>
<p>I commit to develop the infrastructure required to enable a reasonable level of reproducibility across the operations of ML systems.</p>
</header>
<p>Production machine learning systems often lack the capabilities to diagnose or respond effectively when something goes wrong with a model, let alone reproduce the same results.</p>
<p>In production systems it is important to be able to perform standard procedures, such as reverting a model to a previous version or reproducing an input to debug a specific piece of functionality, all of which introduces complexity into the infrastructure.</p>
<p>There are tools and best practices for machine learning operations. These aid the reproducibility of machine learning systems by providing ways to abstract computational graphs and archive data at each step of transformation pipelines. They should be adopted to provide a reasonable level of reproducibility of operations.</p>
<ul class="actions">
<li><a href="index.html#contact" class="button">Join the network</a></li>
</ul>
</div>
<a href="#four" class="goto-next scrolly">Next</a>
</section>
<section id="four" class="wrapper style1 special fade-up">
<div class="container">
<header class="major">
<h2>4. Reproducible operations</h2>
<p>What are some examples of infrastructure that enables reproducibility?</p>
</header>
<div class="box alt">
<div class="row uniform">
<section class="2u 12u$(medium)">
</section>
<section class="4u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-flask"></span>
<h3>Abstracting each computational step</h3>
<p>In order to make a machine learning model reproducible, it is necessary to abstract its constituent components: namely 1) the data, 2) the configuration/environment, and 3) the computational graph. If all three are abstracted, there is a basis for model reproducibility, as in the sketch below.</p>
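<p>As a minimal sketch of the idea (an illustration under assumptions, not a prescribed implementation - the file layout and use of joblib are choices made for the example), a run could be snapshotted as follows:</p>
<pre><code>
# Illustrative sketch: snapshot the three components of a reproducible run -
# 1) the data, 2) the configuration/environment, 3) the computational graph.
# The file layout and use of joblib are assumptions for illustration.
import hashlib
import json
import platform

import joblib

def snapshot_run(model, data_path, params, out_dir):
    with open(data_path, "rb") as f:
        data_digest = hashlib.sha256(f.read()).hexdigest()
    manifest = {
        "data_sha256": data_digest,                   # 1) exact training data
        "params": params,                             # 2) configuration
        "python_version": platform.python_version(),  # 2) environment
    }
    with open(f"{out_dir}/manifest.json", "w") as f:
        json.dump(manifest, f, indent=2)
    joblib.dump(model, f"{out_dir}/model.joblib")     # 3) computational graph
</code></pre>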
<p><a href="https://www.youtube.com/watch?v=eOzl-LFqYFM">Pachyderm has an excellent breakdown of how to abstract each computational step</a> together with its components. Similarly, <a href="https://www.seldon.io/">Seldon Core</a> provides a flexible way to orchestrate the operations and serving of models in production.</p>
</section>
<section class="4u 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-comment"></span>
<h3>Adopting Open Standards</h3>
<p>It is important to decide what the level of abstraction will be, as it is easy to end up building very complex layers that abstract multiple machine learning libraries, each with specific data input/output formats.</p>
<p>There are multiple formats for trained machine learning models - the most popular include: <a href="https://onnx.ai/"> Open Neural Network Exchange Format</a>, <a href="https://www.khronos.org/nnef">Neural Network Exchange Format</a>, and <a href="https://dmg.org/">Predictive Model Markup Language</a>.</p>
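<p>As a brief sketch (the toy iris model is an assumption for the example), exporting a trained scikit-learn model to the ONNX open standard with the skl2onnx converter could look like this:</p>
<pre><code>
# Illustrative sketch: exporting a trained scikit-learn model to the ONNX
# open standard (the toy iris model is an assumption for the example).
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

onnx_model = convert_sklearn(
    model, initial_types=[("input", FloatTensorType([None, X.shape[1]]))]
)
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
</code></pre>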
</section>
</div>
</div>
</div>
</section>
<section id="two" class="spotlight style2 right stylelong">
<span class="image fit main"><img src="images/robotarm.jpg" alt="" /></span>
<div class="content">
<header>
<h2 id="commitment-3">5. Displacement strategy</h2>
<p>I commit to identify and document relevant information so that business change processes can be developed to mitigate the impact towards workers being automated.</p>
</header>
<p>When rolling out systems that automate medium to large-scale processes, there is almost always an impact at an organisational or industry level that affects multiple individuals.</p>
<p>As technologists we should look beyond the technology itself, and take the initiative to support the relevant stakeholders so they can develop a change-management strategy when rolling out the technology.</p>
<p>Although technologists themselves may often not be leading the operational transformation, it is still important to make sure the right processes are in place where relevant, irrespective of the type of work being automated (skilled or otherwise).</p>
<ul class="actions">
<li><a href="index.html#contact" class="button">Join the network</a></li>
</ul>
</div>
<a href="#three" class="goto-next scrolly">Next</a>
</section>
<section id="four" class="wrapper style1 special fade-up">
<div class="container">
<header class="major">
<h2>5. Displacement strategy</h2>
<p>What are some examples where I should look towards developing displacement strategies?</p>
</header>
<div class="box alt">
<div class="row uniform">
<section class="4u 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-area-chart"></span>
<h3>Processes to reduce impact</h3>
<p>There are currently a lot of articles covering the jobs being automated by AI (e.g. assembly line workers, field technicians, call centre workers, etc.), as well as technical articles providing insights on how to deploy machine learning models across production systems.</p>
<p>However, the impact on the individuals who are part of the processes being automated is often forgotten. Fortunately business change management has existed for a long time, and startups have been partnering with delivery partners such as <a href="https://en.wikipedia.org/wiki/Big_Three_%28management_consultancies%29">the Big Three management consultancy</a> firms. It is important for technologists to understand their potential impact, and subsequently the actions that can be taken to mitigate it.</p>
</section>
<section class="4u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-flask"></span>
<h3>Jevons' paradox</h3>
<p>A very interesting concept relevant to the current state of AI is Jevons' paradox. During the industrial revolution, innovations allowed machines to produce the same output with less coal.</p>
<p>Intuitively, it was thought that this would mean the total coal required to power industry would decrease. What happened instead is that, as performing the same action became cheaper and commoditised, demand rose and the total coal consumption to power industry actually increased. The rise of Excel is analogous, and in some areas, so is the rise of AI.</p>
</section>
<section class="4u$ 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-comment"></span>
<h3>AI business change strategies</h3>
<p>When planning the rollout of a new technology to automate a process, there are a number of people whose roles, or at least responsibilities, will be automated. If this is not taken into consideration, these people will not have a transition plan, and it won't be possible to fully benefit from the time and resources freed by the automation.</p>
<p>Technologists should make sure they are able to raise the relevant concerns when business change or operational transformation plans are being set up, as this can make a significant positive impact on the rollout of the technology.</p>
</section>
</div>
</div>
</div>
</section>
<section id="three" class="spotlight style3 left stylelong">
<span class="image fit main bottom"><img src="images/physicalchart.jpg" alt="" /></span>
<div class="content">
<header>
<h2 id="commitment-4">6. Practical accuracy</h2>
<p>I commit to develop processes to ensure my accuracy and cost metric functions are aligned to the domain-specific applications.</p>
</header>
<p>When building systems that learn from data, it is important to obtain a thorough understanding of the underlying means of assessing accuracy.</p>
<p>It is often not enough to use plain accuracy or default/basic cost metrics, as what may be "correct" for a computer may be "wrong" for a human (and vice-versa).</p>
<p>Ensuring the right challenge is being addressed in the right way can be achieved by breaking down the implications of F1-score metrics from a domain-specific perspective, as well as exploring alternative cost functions based on domain knowledge.</p>
<ul class="actions">
<li><a href="index.html#contact" class="button">Join the network</a></li>
</ul>
</div>
<a href="#four" class="goto-next scrolly">Next</a>
</section>
<section id="four" class="wrapper style1 special fade-up">
<div class="container">
<header class="major">
<h2>6. Practical accuracy</h2>
<p>What are some examples that could help me understand practical accuracy?</p>
</header>
<div class="box alt">
<div class="row uniform">
<section class="2u 12u(medium)">
</section>
<section class="4u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-flask"></span>
<h3>Beyond accuracy</h3>
<p>It is not uncommon for teams to get stuck on default accuracy targets, naively doing everything possible to increase the percentages. It is important to go beyond accuracy and understand the performance of the model.</p>
<p>There is <a href="https://towardsdatascience.com/beyond-accuracy-precision-and-recall-3da06bea9f6c">a large toolbox of different approaches</a> that can aid us in finding the most suitable accuracy metrics to use. This includes core fundamentals such as precision, recall, F1-score, learning curves, error bars, confusion matrices and beyond. Technologists should make sure they understand and apply the fundamentals at all times, as in the sketch below.</p>
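<p>As a small sketch (the labels below are made up for the example), the scikit-learn fundamentals give a much fuller picture than a single accuracy number:</p>
<pre><code>
# Illustrative sketch: go beyond a single accuracy number with scikit-learn
# fundamentals (the labels below are made up for the example).
from sklearn.metrics import classification_report, confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))       # where the errors actually land
print(classification_report(y_true, y_pred))  # precision, recall, F1 per class
</code></pre>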
</section>
<section class="4u 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-comment"></span>
<h3>Domain specific metrics</h3>
<p>When tackling an industry or application-specific problem, technologists should make sure they question what the implications of different types of errors are, as well as what the right way of evaluating those errors should be.</p>
<p>In system-critical situations, there may be constraints under which some types of errors are less costly than others. Similarly, there is often a lot of domain knowledge that can be abstracted into the cost functions, capturing which answers are intuitively correct to humans and how to represent them as mathematical functions.</p>
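<p>As an illustrative sketch (the 10x weighting is an assumed domain judgement, e.g. a missed fraud case costing far more than a false alarm), a domain-specific cost function could look like this:</p>
<pre><code>
# Illustrative sketch of a domain-weighted cost function; the 10x weighting
# is an assumed domain judgement, not a recommendation.
import numpy as np

FALSE_NEGATIVE_COST = 10.0  # e.g. a missed fraud case (assumption)
FALSE_POSITIVE_COST = 1.0   # e.g. a false alarm

def domain_cost(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    fn = np.sum(np.logical_and(y_true == 1, y_pred == 0))
    fp = np.sum(np.logical_and(y_true == 0, y_pred == 1))
    return FALSE_NEGATIVE_COST * fn + FALSE_POSITIVE_COST * fp

print(domain_cost([1, 1, 0, 0], [1, 0, 0, 1]))  # 10*1 + 1*1 = 11.0
</code></pre>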
</section>
</div>
</div>
</div>
</section>
<section id="two" class="spotlight style2 right stylelong">
<span class="image fit main"><img src="images/trustrobot.png" alt="" /></span>
<div class="content">
<header>
<h2 id="commitment-7">7. Trust by privacy</h2>
<p>I commit to build and communicate processes that protect and handle data with stakeholders that may interact with the system directly and/or indirectly.</p>
</header>
<p>When developing large-scale systems that learn from data, there is often a large number of stakeholders who may be affected directly or indirectly.</p>
<p>Building trust with relevant stakeholders is not only done by informing them what data is being held, but also through the processes around the data, and an understanding of why protecting that data is important.</p>
<p>Technologists should enforce privacy by design across systems, as well as continuous processes to build trust not only with users, but also with relevant stakeholders such as procurement frameworks, operational users, and beyond.</p>
<ul class="actions">
<li><a href="index.html#contact" class="button">Join the network</a></li>
</ul>
</div>
<a href="#three" class="goto-next scrolly">Next</a>
</section>
<section id="four" class="wrapper style1 special fade-up">
<div class="container">
<header class="major">
<h2>7. Trust by privacy</h2>
<p>What are some examples around building trust with stakeholders that interact with my models and systems?</p>
</header>
<div class="box alt">
<div class="row uniform">
<section class="2u 12u(medium)">
</section>
<section class="4u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-flask"></span>
<h3>Privacy at the right levels</h3>
<p>One key way to establish trust with users and relevant stakeholders is by showing that the right processes and technologies are in place to protect personal data.</p>
<p><a href="https://medium.com/uber-security-privacy/differential-privacy-open-source-7892c82c42b6">Uber's use of differential privacy</a> is a prime example: they introduced a system that adds noise to query results, with the noise scaled to the level of granularity required by the query, ensuring that analysts still get useful access to the relevant datasets whilst avoiding exposure of personal information.</p>
</section>
<section class="4u 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-comment"></span>
<h3>Personal data via metadata</h3>
<p>Technologists should make an explicit effort to understand the potential implications of the metadata involved, and whether that metadata could expose unexpected personal information about relevant users or stakeholders.</p>
<p>The <a href="https://www.theguardian.com/uk-news/2018/mar/22/cambridge-analytica-scandal-the-biggest-revelations-so-far">Cambridge Analytica scandal</a> is the most prominent example, and it generalises well to similar situations. Direct and indirect users who interact with a system may give access to their data without realising, until it's too late, how much personal information can be extracted from the metadata.</p>
</section>
</div>
</div>
</div>
</section>
<section id="three" class="spotlight style3 left stylelong">
<span class="image fit main bottom"><img src="images/pairprogramming.jpg" alt="" /></span>
<div class="content">
<header>
<h2 id="commitment-8">8. Data risk awareness</h2>
<p>I commit to develop and improve reasonable processes and infrastructure to ensure data and model security are being taken into consideration during the development of machine learning systems.</p>
</header>
<p>Autonomous decision-making systems open the doors to new potential security breaches.</p>
<p>More importantly, it is critical to be aware that a large percentage of security breaches occur due to human error rather than actual hacks (e.g. someone accidentally sending a dataset as an email attachment, or losing their laptop/phone).</p>
<p>Technologists should commit to preparing for both types of security risks through explicit efforts, such as educating relevant personnel, establishing processes around data, and assessing the implications of ML backdoors (such as adversarial attacks).</p>
<ul class="actions">
<li><a href="index.html#contact" class="button">Join the network</a></li>
</ul>
</div>
<a href="#four" class="goto-next scrolly">Next</a>
</section>
<section id="four" class="wrapper style1 special fade-up">
<div class="container">
<header class="major">
<h2>8. Data risk awareness</h2>
<p>What are some examples where I should focus to become aware of potential risks in my data and models?</p>
</header>
<div class="box alt">
<div class="row uniform">
<section class="2u 12u(medium)">
</section>
<section class="4u 6u(medium) 12u$(xsmall)">
<span class="icon alt major fa-flask"></span>
<h3>Adversarial patch tricking models</h3>
<p>It is worth remembering that machine learning systems are ultimately functions: given the right inputs, a desired output can be obtained. Adversarial patches can be used to trick machine learning models into misclassifying examples by adding only small perturbations to the input. The <a href="https://www.youtube.com/watch?v=c_5EH3CBtD0">AI Journal has a great video</a> showing how this could trick self-driving cars.</p>
<p><a href="https://securityintelligence.com/how-can-companies-defend-against-adversarial-machine-learning-attacks-in-the-age-of-ai/">Security Intelligence has a great write-up on this</a>, as well as some suggestions on how to protect ourselves. As always with cybersecurity, it is impossible to fully protect against attackers, but it's certainly possible to introduce processes that mitigate the basic loopholes.</p>
</section>
<section class="4u 6u$(medium) 12u$(xsmall)">
<span class="icon alt major fa-comment"></span>
<h3>Email sent to the wrong person</h3>
<p>A very large percentage of data breaches are caused by simple human errors, such as sending data to the wrong email address. <a href="https://www.mimecast.com/blog/2018/09/most-healthcare-data-breaches-now-caused-by-email/">Mimecast has an interesting article</a> which points out that this is the case even with very sensitive data in healthcare.</p>
<p>It is important that technologists take into consideration the whole lifecycle of a machine learning system: the processes and infrastructure to store the training data, accuracy metrics, documentation, trained models, model orchestration, inference results and beyond.</p>
</section>
</div>
</div>
</div>
</section>
<section id="one" class="spotlight style1 bottom">
<span class="image fit main"><img src="images/dots-vision.jpg" alt="" /></span>
<div class="content">
<div class="container">
<header class="major">
<h2>If these principles resonate with you, we invite you to join the <a href="index.html#contact">Ethical ML Network (BETA)</a></h2>
</header>
</div>
</div>
</section>
{% include footer.html %}
</body>
</html>