-
Notifications
You must be signed in to change notification settings - Fork 10
/
IMPORTANT_NOTES
762 lines (683 loc) · 40 KB
/
IMPORTANT_NOTES
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
# version 0.213
(you must first upgrade to 0.203 if you haven't done so already, see notes
below)
See notes for 0.212 if you didn't upgrade to that version, though the known
issue has been resolved in this version.
New to this version is that redis-server processes won't be accidentally
spawned on farm nodes in some edge cases. Prior to running this version, now
might be a good time to clean up. Eg. for LSF, you could do something like:
$ bhosts > bhosts
$ cat bhosts | perl -ne '($host) = split; system(qq[ssh -o StrictHostKeyChecking=no $host "pkill -9 redis-server"]);'
(all users who have used your install should do this, or it should be run as
root)
# version 0.212
(you must first upgrade to 0.203 if you haven't done so already, see notes
below)
A known issue with this version is that the testing (and production?) server
no longer responds to the stop command, so you'll have to kill the associated
process(es) manually before starting the server again (before running another
test).
This version renames/deletes a number of pipelines (see HISTORY). Check that
the deleted ones are no longer installed as .pm files in your installation
location, and if they are, manually delete them. Then also remove them from
your production database (if they hadn't yet been used by a setup), eg. for
each pipeline name:
mysql> select ps.id from pipelinesetup ps join pipeline p on p.id = ps.pipeline
where p.name = "pipeline_name";
(if anything returned for the above, do not proceed with below)
mysql> delete sa, sb from stepadaptor sa join stepbehaviour sb on sa.pipeline =
sb.pipeline where sa.pipeline in (select id from pipeline where name =
"pipeline_name");
mysql> delete from pipeline where name = "pipeline_name";
# version 0.203
(you must first upgrade to 0.199 if you haven't done so already, see notes
below)
NB: if you have made a new irods datasource since v 0.199, the desired_qc_files
option may be stored incorrectly in the database, with extraneous double quotes
around the value. You will have to change this for it to work correctly.
# version 0.199
(you must first upgrade to 0.198 if you haven't done so already, see notes
below)
Neo4J is now required. See README.md for a brief installation guide. Re-run
'perl Build.PL' and say 'yes' to go through the installation questions and
supply answers to new questions, including your neo4j server access details.
This will also let you auto-install new required modules from CPAN:
Authen::Simple::PAM, Bytes::Random::Secure, Data::UUID, File::Share,
MIME::Base64, Mojo::UserAgent, Sort::Naturally and Twiggy::TLS.
'./Build install' is now required prior to running the production server. Do
not attempt to run it from the build directory. See README.md for non-root
installation.
The database schema has changed, so run vrpipe-db_upgrade. Make sure no VRPipe-
related processes are running first (stop the server, kill any jobs running in
your job scheduler). You may also take this opportunity to clear out your log
table in your MySQL production database:
mysql> truncate table pipelinesetuplog;
The new web-frontend is served with https; during your run of 'perl Build.PL'
above you get asked where your private TLS key and certificate files are stored.
If you don't have any these will be created for you, but that is considered
bad practice, since your users will have to accept an untrusted certificate.
Either way, afterwards ensure that these files are only readable by the user who
will run vrpipe-server. Because you will be accessing https over a non-standard
port, you may find that you can no longer access the website over a generic
ssh tunnelled connection. You'll have to explicitly forward the port.
To use steps that expect or create bam/cram/vcf/bcf files, you will need to set
the environment variable HTSLIB to point to the directory where you have built
htslib (http://www.htslib.org); this directory should have bin/htsfile inside
it.
Newer bam/cram/vcf/bcf-related steps and pipelines probably only work with
samtools v1.3+. Upgrade your samtools installation if you work with these kinds
of files.
The new vrtrack_qc sub-site on the web interface will make the first user to
log in the administrator of the qc sub-site. After upgrading, the desired admin
should log in with their unix username and password.
When starting vrpipe-server, you can probably now not use the --max_submissions
option, since you're now much less likely to run out of database connections.
Your MySQL database server should still allow around a third as many connections
as you have processors in your farm, to be safe.
# version 0.198
(you must first upgrade to 0.190 if you haven't done so already, see notes
below)
The database schema has changed, so run vrpipe-db_upgrade. After doing this,
also log in to your production database and do:
mysql> update stepstate set dispatched = 1;
# version 0.190
(you must first upgrade to 0.187 if you haven't done so already, see notes
below)
The following included pipelines have been changed by removing steps from them,
and VRPipe can't automatically cope with this:
sequenom_import_from_irods_and_covert_to_vcf
vcf_merge_and_compare_genotypes
vcf_merge_different_samples_to_indexed_bcf
If you've used these pipelines in any setups you will have to:
a) turn the server off
b) reset the setups (using vrpipe-setup --reset, which also deactivates them)
c) upgrade to 0.190 as normal
d) log in to your MySQL vrpipe production database and say for each pipeline:
mysql> select id from pipeline where name = "<pipeline name>";
mysql> delete from stepmember where pipeline = <pipeline id>;
mysql> delete from stepadaptor where pipeline = <pipeline id>;
mysql> delete from stepbehaviour where pipeline = <pipeline id>;
e) start the server
f) reactivate the setups
Additionally, any setup using a sequenom_import_from_irods_and_covert_to_vcf
setup as its source must change it's source from <setup_id>[2] to
<setup_id>[2:vcf_files].
# version 0.187
(you must first upgrade to 0.183 if you haven't done so already, see notes
below)
Number::Bytes::Human was added as a new CPAN dependency. Install it by rerunning
'perl Build.PL'.
# version 0.183
(you must first upgrade to 0.168 if you haven't done so already, see notes
below)
The database schema has changed, so run vrpipe-db_upgrade.
# version 0.168
(you must first upgrade to 0.166 or later, see notes below)
There is a change to how some information is stored in the database in this
version, so please follow these upgrade instructions carefully:
1) Stop the VRPipe server ('vrpipe-server stop'), ensure that it left no
processes behind (check with 'ps aux'), kill off any VRPipe-related jobs
and confirm nothing is accessing your production database (eg. log into
MySQL and do 'show processlist;').
2) Install this latest software in your usual way
3) Run the normal automated db_upgrade script:
$ vrpipe-db_upgrade
4) Run this Perl one-liner; it could take anywhere from seconds to hours to run
depending on this size of your database:
$ perl -Imodules -MVRPipe -Mstrict -we 'my $pager = VRPipe::FileList->search_paged({}); while (my $fls = $pager->next) { foreach my $fl (@$fls) { my $str = join(",", sort { $a <=> $b } VRPipe::FileListMember->get_column_values("file", { filelist => $fl->id })); my $dmd5 = Digest::MD5->new(); $dmd5->add($str); $fl->lookup($dmd5->hexdigest); $fl->update; } }'
5) Now you can start using VRPipe again as normal.
# version 0.166
(you must first upgrade to 0.165, see notes below)
Prior versions had a bug where vrpipe-handler could get stuck running for ever
even after completing their work, filling up the log with endless error
messages. Before upgrading, stop the server and kill all existing handler jobs
still running in your scheduler and delete your log file, which may have grown
to massive size.
# version 0.165
(you must first upgrade to 0.164, see notes below)
The database schema has changed, so run vrpipe-db_upgrade.
# version 0.164
(you must first upgrade to 0.163, see notes below)
The database schema has changed, so run vrpipe-db_upgrade.
Significant changes were made to the API used in the core of VRPipe, but these
are unlikely to affect you:
lock_row() in Persistent, use of transactions for locking and Living has been
removed in place of InMemory->lock related methods.
Job no longer has a heartbeat, so you can't confirm a job is currently running
just by looking at the time of the last heartbeat in the MySQL database. Instead
you can run a perl one-liner and see if $job->locked().
# version 0.163
(you must first upgrade to 0.159, see notes below)
The database schema has changed, so run vrpipe-db_upgrade.
Note that the vrpipe datasource filter option now uses the comma symbol to
separate multiple filters, which means that your filter keys and regexes must
not contain hashes or commas. If you have existing setups that had commas in
their filters, those setups will break. You may have to manually fix them in the
database before using 0.163.
If you use the sge_ec2 scheduler, VRPipe's automated SGE configuration has
changed so you will need to delete your .sge_confs directories (eg.
`rm -fr /shared/software/VRPipe/*/.sge_confs`) and then start up a new SGE
master instance. There are also new SiteConfig questions to answer, so rerun
`perl Build.PL` and answer 'y' to go through setup again.
# version 0.159
You must upgrade to 0.158 first, following the upgrade instructions below for
that version. Note that the instructions reference some scripts that have been
deleted in this version, so first 'git checkout 0.158', complete the 0.158
upgrade, then come back here with 'git checkout master'.
This version fixes some issues with row locking that you may have encountered if
using MySQL 5.5. If you were using MySQL 5.1 you may also have encountered a
different issue where certain inserts to the database would fail, causing
Submissions to fail. This remains an unresolved issue, so we recommend you
upgrade to 5.5. We also recommend the following settings in my.cnf:
max_connections = 1000
binlog_cache_size=32M
innodb_log_file_size=512M
innodb_log_files_in_group = 2
innodb_file_per_table = 1
innodb_fast_shutdown=0
innodb_buffer_pool_size = 1G
innodb_change_buffering=all
innodb_flush_log_at_trx_commit=0
innodb_log_buffer_size = 32M
innodb_thread_concurrency=16
innodb_concurrency_tickets=5000
innodb_commit_concurrency=50
innodb_autoinc_lock_mode=2
binlog_format=mixed
transaction_isolation = REPEATABLE-READ
innodb_locks_unsafe_for_binlog = 0
Basically the idea is to maximise transaction concurrency and performance, while
being aware that VRPipe relies on the isolation level being REPEATABLE-READ by
default (for gap locking, hence also ensuring innodb_locks_unsafe_for_binlog is
off).
The log file and file_per_table changes (or upgrading from MySQL 5.1 to 5.5)
necessitate a complete refresh of your database files:
1. log into your database and do 'truncate table pipelinesetuplog;' to make
subsequent steps faster and reduce the size of your database significantly
2. take a mysqldump of your database
3. drop the database
4. shut down mysqld cleanly
5. delete the corresponding ibdata* and ib_logfile* files
6. alter my.cnf as above
7. bring mysqld back up
8. create a new database with the same name as the old one
9. load the dump you took in step 2. into the new database
# version 0.158
This is a major, critical upgrade. You should upgrade as soon as possible,
before you are affected by a significant bug (in versions prior to this one,
if the version of your installed Storable module changes, all dataelements could
be withdrawn and new ones created, effectively resetting all your setups).
If you have previously used VRPipe in production, you must make major changes
to your production database. Depending on the size of your database, this could
take anywhere from minutes to over a week, during which vrpipe-server must be
stopped and no other changes to the database should be made. So plan for
extended down-time.
Follow the upgrade instructions carefully and in order. Most of the upgrade
steps must be carried out PRIOR to actually installing 0.158.
1) Stop the VRPipe server ('vrpipe-server stop'), ensure that it left no
processes behind (check with 'ps aux'), kill off any VRPipe-related jobs
and confirm nothing is accessing your production database (eg. log into
MySQL and do 'show processlist;').
2) Ensure that VRPipe in PERL5LIB is >= 0.153 && <= 0.157, schema version 30:
$ perldoc -l VRPipe::Persistent::Schema | xargs grep VERSION
3) Ensure that you're using the correct database details and schema:
$ perl -MVRPipe -MDBI -e 'my $dsn = VRPipe::Persistent::SchemaBase->get_dsn; print $dsn, "\n"; my $dbh = DBI->connect($dsn, VRPipe::Persistent::SchemaBase->get_user, VRPipe::Persistent::SchemaBase->get_password, {}); my ($version) = $dbh->selectrow_array(q[select version from dbix_class_deploymenthandler_versions order by id desc limit 1]); print "Schema version: $version\n";'
4) Run the following script. You may need up to 2GB free memory, and it could
take up to 24hrs to finish running. Check its output: it should be clear if
it completed successfully.
$ perl 0158_db_optimise.pl
5) Run the following script. It has low memory requirements and could take up
to 6hrs to finish running. Check its output: it should be clear if it
completed successfully.
$ perl 0158_pre_db_upgrade.pl
6) Install 0.158 using your normal method, and ensure that worked (the schema
version should come up as 31):
$ perldoc -l VRPipe::Persistent::Schema | xargs grep VERSION
7) Run the normal automated db_upgrade script:
$ vrpipe-db_upgrade
8) Unfortunately the automated upgrade here does not do things correctly, and it
will fail. Still, do step 7 and let it fail (so that the necessary files are
added to your schema directory for future use), then manually alter your
database; the following commands may take up to an hour to run:
mysql> BEGIN; SET foreign_key_checks=0; ALTER TABLE dataelement DROP COLUMN result, ADD COLUMN filelist integer(9) NOT NULL, ADD COLUMN keyvallist integer(9) NOT NULL, ADD INDEX dataelement_idx_filelist (filelist), ADD INDEX dataelement_idx_keyvallist (keyvallist), ADD CONSTRAINT dataelement_fk_filelist FOREIGN KEY (filelist) REFERENCES filelist (id), ADD CONSTRAINT dataelement_fk_keyvallist FOREIGN KEY (keyvallist) REFERENCES keyvallist (id); ALTER TABLE file DROP COLUMN metadata, ADD COLUMN keyvallist integer(9) NULL, ADD INDEX file_idx_keyvallist (keyvallist), ADD CONSTRAINT file_fk_keyvallist FOREIGN KEY (keyvallist) REFERENCES keyvallist (id); CREATE INDEX keyvallist_idx_lookup on keyvallist (lookup); CREATE INDEX filelist_idx_lookup on filelist (lookup); CREATE INDEX keyvallistmember_idx_keyval_key on keyvallistmember (keyval_key); CREATE INDEX keyvallistmember_idx_val on keyvallistmember (val (255)); DROP TABLE persistentarraymember; DROP TABLE persistentarray; SET foreign_key_checks=1; COMMIT;
mysql> insert into dbix_class_deploymenthandler_versions set version = 31, upgrade_sql = '';
9) Ensure the db upgrade worked:
$ perl -MVRPipe -MDBI -e 'my $dsn = VRPipe::Persistent::SchemaBase->get_dsn; print $dsn, "\n"; my $dbh = DBI->connect($dsn, VRPipe::Persistent::SchemaBase->get_user, VRPipe::Persistent::SchemaBase->get_password, {}); my ($version) = $dbh->selectrow_array(q[select version from dbix_class_deploymenthandler_versions order by id desc limit 1]); print "Schema version: $version\n";'
10) Run the following script. It has low memory requirements and could take up
to 2 days to run. Check its output: it should be clear if it completed
successfully.
$ perl 0158_post_db_upgrade.pl
11) The upgrade is complete and it should be safe to start using VRPipe again
normally.
Run 'perl Build.PL' to install a new CPAN dependency (JSON::XS). There is also
a new SiteConfig question if you use the ec2 scheduler (ec2_max_instances).
API changes:
PersistentArray and PersistentArrayMember classes (and tables) were removed,
replaced by KeyValList, KeyValListMember, FileList and FileListMember.
The ArrayRefOfPersistent type was renamed PersistentArrayRef, and there is no
more PersistentArray type.
DataElements no longer have a 'result', but instead a FileList and a KeyValList.
A result() method provides backwards-compatibility.
File no longer has a 'metadata', but instead a KeyValList. A metadata() method
provides permanent backwards-compatibility.
Scheduler classes can now have an initialize() method to cache things in class
variables for example.
# version 0.155:
It probably won't have any effect on you, but the bam_to_fastq step briefly
had a bam2fastq_logs output file that was removed in this version.
# version 0.154:
If you have written your own VRPipe::Schedulers::* class, note the following API
changes:
submit_command() and submit_args() were combined into submit_command(), which
takes the inputs of the later and returns their combined output as a single
string.
start_scheduler() and stop_scheduler() were removed.
# version 0.153:
There is a new site config question to answer, so rerun 'perl Build.PL'.
The schema version changed, so run vrpipe-db_upgrade. Unfortunately the
automated upgrade here might not delete the necessary tables in the correct
order, and it will fail. Run the script and let it fail (so that the necessary
files are added to your schema directory for future use), then manually alter
your database:
mysql> BEGIN; ALTER TABLE sidtosub CHANGE COLUMN sid sid varchar(20) NOT NULL;
DROP TABLE localschedulerjobstate; DROP TABLE localschedulerjob;
CREATE INDEX sidtosub_idx_sid on sidtosub (sid); COMMIT;
mysql> insert into dbix_class_deploymenthandler_versions set version = 30,
upgrade_sql = 'ALTER TABLE sidtosub CHANGE COLUMN sid sid varchar(20) NOT
NULL; DROP TABLE localschedulerjobstate; DROP TABLE localschedulerjob;';
If you have written your own VRPipe::Schedulers::* class, note that
SchedulerMethodsRole no longer has a required get_sid() method. Instead,
ensure_running() in VRPipe::Scheduler just does a system call. There have been
many other changes to the schedulers recently - check the POD for details, and
see the updated code in the local and LSF schedulers for examples of how to
implement schedulers now.
Some scripts and modules have been removed from the distribution; if you have
ever done a './Build install' you will have to manually remove files from your
installation location: vrpipe-local_scheduler, VRPipe/LocalScheduler.pm,
VRPipe/LocalSchedulerJob.pm, VRPipe/LocalSchedulerJobState.pm.
# version 0.152:
This is a critical upgrade. In future it will be assumed you upgraded to this
version. It is VERY strongly recommended you upgrade to 0.152 as soon as
possible. If you've been skipping the past few versions, read and follow their
notes first. Unlike 0.151, database connection exhaustion has been fixed, so you
do not need to use --max_submissions anymore (or you can set it to equal the
number of CPUs in your cluster).
There is a new required piece of 3rd party software you must install called
Redis. See the README for installation instructions. A number of new CPAN
modules must also now be installed: rerun 'perl Build.PL' and then follow the
instructions if any of the requirements are missing.
There are new configuration questions to answer, so when 'perl Build.PL' asks
you if "you wish to go through setup again", answer yes.
This version features a database schema change, so be sure to run
'vrpipe-db_upgrade' after installation.
The README now has new sections that briefly explain the web front-end and what
to do when things go wrong; you might like to read these.
If you've used a recent version in production, note that they produced excess
debugging information in the logs, so you should clear those out by deleting
the log file and truncating the pipelinesetuplog table in your database:
mysql> truncate table pipelinesetuplog;
# version 0.151:
You should update to this version as it includes many important fixes. Pipelines
should now run correctly and reliably. Problems remain, however, with efficiency
and running out of database connections. When starting the server, use
--max_submissions and set it to slightly less than the maximum number of
connections allowed to your database.
If you've been skipping previous versions, follow the notes below.
# version 0.146:
See notes for 0.145 and 0.143 if you skipped those versions. This version should
be safe to upgrade to, though there is an excess of debugging enabled. You may
wish to wait for the next version.
# version 0.145:
See notes for 0.143 if you skipped that version. You may also wish to skip this
version as another significant change was made, untested in production.
If you are using MySQL, your user must be able to change the transaction
isolation level. If you have database replication set up, you may find that you
have to reconfigure the server to binlog-mode=MIXED and restart/redo your
replication server.
# version 0.144:
See notes for 0.143; this just corrects the schema version number allowing
vrpipe-db_upgrade to work.
# version 0.143:
This is a substantial overhaul of how things work under-the-hood, and also
features an experimental fix for database inconsistencies that has not been
tested in a production environment. You may wish to hold off upgrading until the
next version.
This version features a database schema change: run vrpipe-db_upgrade after
installation.
Note that if you use a pipeline that uses the bam_to_fastq step, you must make
sure you have the bam2fastq exe now, and possibly reconfigure your setups.
# version 0.140:
A number of steps that previously assumed their bam input files were indexed now
explicitly require a 'bai_inputs' input. If you have written your own pipelines
that use these steps, you may have to add a bam_index step, or an adaptor if you
already had a bam_index step.
The archive_files step no longer advertises an output file, so you can no longer
use the output of an archive_files pipeline as the source of a vrpipe
datasource.
# version 0.130:
Alterations to the API of datasources were made in this version. If you have
created your own datasource module, note that now _update_changed_marker()
method is no longer used, and the existing _has_changed() method must now
always set the _changed_marker in addition to returning boolean.
# version 0.124:
This version features a database schema change: run vrpipe-db_upgrade after
installation.
# version 0.121:
This re-enables the feature discussed in the previous note: this version is safe
to upgrade to.
# version 0.120:
Normally when VRPipe detects missing input or output files it will automatically
restart the relevant step. This feature has been disabled in this version. You
may wish to skip this version as a result; the feature will likely return in a
future version.
# version 0.112:
This is a bug-fixed version of 0.111. It should be safe to upgrade to if you use
LSF. The local scheduler does function and should be fine on multi-processor
systems, but on some systems may be excruciatingly slow. See the notes below for
0.111 if upgrading from a version earlier than that..
# version 0.111:
If you are using a version earlier than 0.106, upgrade to that first,
following the guidance below. * the sensibly cautious may wish to avoid
upgrading to this version and wait for the next version instead *
This version features substantial changes to core class APIs and the database.
Run vrpipe-db_upgrade to upgrade your database. SiteConfig options have changed,
so also be sure to rerun Build.PL and answer the new questions. vrpipe-server
now takes over the role of vrpipe-trigger_pipelines and
vrpipe-dispatch_pipelines so be sure to keep it running in the new --farm mode.
If you have developed your own pipeline modules, these will need to be altered
to match the new API:
VRPipe::Pipeline no longer has _num_steps column, and PipelineRole gains a new
construction implementation involving new methods that can be implemented in
VRPipe::Pipelines::* modules: step_names(), adaptor_definitions() and
behaviour_definitions(). See some supplied Pipeline modules for how to define
these; you are defining the exact same things as before in pretty much the same
way, but now it just looks nicer and takes fewer characters.
VRPipe::Steps::* haven't changed in this version, but you should ensure that
you supply {output_files => [$ofile]} to dispatch calls, if more than 1 dispatch
occurs in the body_sub (and an output file is actually made). Also,
VRPipe::Requirements->time is now in seconds instead of hours, with backwards
compatibility for values under 60.
Other changes you might need to be aware of:
VRPipe::Pipeline no longer has a steps() method; use step_members() instead.
Pipelines are now constructed by calling create() (instead of steps()), and all
pipelines and steps can be created with
VRPipe::Interface::BackEnd->install_pipelines_and_steps().
Manager no longer handles triggering or dispatch of setups. PipelineSetup itself
gains a trigger() method instead.
New VRPipe::FarmServer class to track running servers. Manager lets
vrpipe-server register_farm_server(). PipelineSetup gains a desired_farm and
controlling_farm.
PipelineSetup gains currently_complete() method.
VRPipe::Scheduler: removed submit(), build_command_line(), run_on_node(),
wait_for_sid() and the scheduler_*_file() methods. It now has an
ensure_running() method instead of submit(). We no longer store or even look at
scheduler output. A VRPipe::Submission is no longer something you submit to the
farm, but something that vrpipe-handler (running on a farm node) will pick up to
run.
# version 0.106:
Further subtle issues have been discovered that may affect your VRPipe database
following the upgrade to 0.103 and/or 0.104. If you have ever used version
0.102 or earlier, it is STRONGLY recommended you do the following:
1) Install 0.106 as normal.
2) Make sure you have no VRPipe code running.
3) Log into your production database and execute the following command:
mysql> update datasource set _changed_marker = 'foo';
4) Run this Perl multi-liner (copy and paste it all in one go to your
terminal):
perl -MVRPipe::Persistent::Schema -Mstrict -we 'foreach my $setup (VRPipe::PipelineSetup->search({})) { \
print STDERR ". "; \
eval { $setup->datasource->incomplete_element_states($setup); }; \
} warn "\nAll done!\n";'
5) Run this Perl multi-liner (copy and paste it all in one go to your
terminal):
perl -MVRPipe::Persistent::Schema -Mstrict -we 'my $pager = VRPipe::DataElement->search_paged({withdrawn => 0}); \
while (my $des = $pager->next) { \
foreach my $de (@$des) { \
print STDERR ". "; \
my $r = $de->result; \
my $ps = $r->{paths} || next; \
next if ref($ps); \
my %correct_result; \
while (my ($key, $val) = each %$r) { \
$correct_result{$key} = $val; \
} \
$r->{paths} = 0; \
$r->{paths} = $ps; \
my ($orig) = VRPipe::DataElement->search({withdrawn => 1, datasource => $de->datasource->id, result => $r}); \
if ($orig && $orig->id < $de->id) { \
eval { VRPipe::StepState->search_rs({dataelement => $de->id})->delete; \
VRPipe::DataElementState->search_rs({dataelement => $de->id})->delete; \
$de->delete; }; \
if ($@) { \
$de->result($r); \
$de->withdrawn(1); \
$de->update; \
} \
$de = $orig; \
$de->withdrawn(0); \
$de->result(\%correct_result); \
} \
else { \
my ($result_str) = VRPipe::DataElement->get_column_values("result", { id => $de->id }, {disable_inflation => 1}); \
if ($result_str =~ /lane/) { \
if ($result_str =~ /paths.+lane/s) { \
$de->result($r); \
} \
} \
else { \
$de->result(\%correct_result); \
} \
} \
$de->update; \
} \
print STDERR "\n"; \
}'
6) Start running VRPipe again normally
# version 0.104:
This is identical to 0.103, released only to correct the upgrade instructions
below given for 0.103. Follow the advice given for 0.103, using this command
line when you get to step 3) (NB: it might take hours to run):
perl -Imodules -MVRPipe::Persistent::Schema -Mstrict -we 'my $pager = VRPipe::DataElement->search_paged({}); \
while (my $des = $pager->next) { \
foreach my $de (@$des) { \
my $r = $de->result; \
my $ps = $r->{paths} || next; \
next unless ref($ps); \
next unless @$ps > 0; \
eval { $de->_deflate_paths($r); }; next if $@; \
my %new_result; \
while (my ($key, $val) = each %$r) { \
$new_result{$key} = $val; \
} \
$de->result(\%new_result); \
$de->update; \
} \
}'
If you had already completed your install of 0.103 and started running VRPipe
already, you can correct errors that may be in your database by again following
steps 1-5, but this time using the following 2 commands during step 3):
perl -Imodules -MVRPipe::Persistent::Schema -Mstrict -we 'my $pager = VRPipe::DataElement->search_paged({withdrawn => 0}); \
while (my $des = $pager->next) { \
foreach my $de (@$des) { \
print STDERR ". "; \
my $r = $de->result; \
my $ps = $r->{paths} || next; \
next if ref($ps); \
my %correct_result; \
while (my ($key, $val) = each %$r) { \
$correct_result{$key} = $val; \
} \
$r->{paths} = 0; \
$r->{paths} = $ps; \
my ($orig) = VRPipe::DataElement->search({withdrawn => 1, datasource => $de->datasource->id, result => $r}); \
if ($orig && $orig->id < $de->id) { \
eval { VRPipe::StepState->search_rs({dataelement => $de->id})->delete; \
VRPipe::DataElementState->search_rs({dataelement => $de->id})->delete; \
$de->delete; }; \
if ($@) { \
$de->result($r); \
$de->withdrawn(1); \
$de->update; \
} \
$de = $orig; \
$de->withdrawn(0); \
$de->result(\%correct_result); \
} \
else { \
my ($result_str) = VRPipe::DataElement->get_column_values("result", { id => $de->id }, {disable_inflation => 1}); \
if ($result_str =~ /lane/) { \
if ($result_str =~ /paths.+lane/s) { \
$de->result($r); \
} \
} \
else { \
$de->result(\%correct_result); \
} \
} \
$de->update; \
} \
print STDERR "\n"; \
}'
perl -Imodules -MVRPipe::Persistent::Schema -Mstrict -we 'my $pager = VRPipe::DataElement->search_paged({withdrawn => 1}); \
while (my $des = $pager->next) { \
foreach my $de (@$des) { \
my $r = $de->result; \
my $ps = $r->{paths} || next; \
next if ref($ps); \
my ($corrected) = VRPipe::DataElement->search({withdrawn => 0, datasource => $de->datasource->id, result => $r}); \
unless ($corrected) { \
$de->result($r); \
$de->update; \
} \
} \
}'
# version 0.103:
New CPAN module dependencies were added, and a new SiteConfig option should be
answered, so be sure to rerun 'perl Build.PL' and answer 'y' to the first
question, also running './Build installdeps' if indicated.
This version makes a change to how some data is stored in the database, so if
upgrading and you have a production database in use, it is VERY IMPORTANT that
you do the following PRIOR to installation of 0.103:
1) make sure you have no VRPipe code running
2) cd to the root of your vr-pipe git clone directory, updated to the latest
code (ie. the directory containing this file)
3) run the following (copy and paste all the lines in one go to your
terminal):
[redacted - see notes for version 0.104]
4) install this latest version of VRPipe in your normal way
5) start running VRPipe again
# version 0.101:
Now that we have more than 100 versions, all previous tags in the git repository
have been renamed. If you have an existing clone, however, the old tags will
still be there. If it bothers you, you can delete them, eg:
perl -e 'for (1..99) { $old = "0.".sprintf("%02d", $_); system("git tag -d $old"); }'
If you have your own fork, also delete from your origin:
perl -e 'for (1..99) { $old = "0.".sprintf("%02d", $_); system("git push origin :refs/tags/$old"); }'
# version 0.100:
If you have used previous versions it is possible you have large Job stdout/err
files hanging around that are just wasting disc space. You can delete these by
doing something like:
perl -MVRPipe::Persistent::Schema -Mstrict -we 'foreach my $file (VRPipe::File->search({ s => { ">=" => 536870912 }, e => 1, path => { "LIKE" => q[%job_std%] } })) { $file->unlink }'
(which deletes all job_std* files over 512MB)
# version 0.99:
This version features a schema change, so be sure to run vrpipe-db_upgrade if
upgrading from an earlier version.
(MooseX::AbstractFactory is also no longer required)
# version 0.96:
This version features a schema change, so be sure to run vrpipe-db_upgrade if
upgrading from an earlier version.
# version 0.95:
This version removes, renames and alters a number of pipline and step modules.
Normally this is not something we will do, but we feel it is important in this
case. There are 2 consequences if you are upgrading:
1) Your production database will still contain all the removed pipelines and
steps, cluttering up the output of vrpipe-setup (when it lists available
pipelines) and causing confusion (a user may pick one of the defunct
pipelines).
2) If you are partway through running one of the altered pipelines, or if
you later need to rerun a setup that used an affected pipeline that
previously completed, you will be left with a broken mess with undefined
behaviour.
It is STRONGLY recommended that you resolve this by deleting affected pipelines
and steps from your production database:
0) Complete installation of 0.95 in the usual way
1) Log into your production database
2) Run the following query to find affected PipelineSetups you've created in
the past:
mysql> select ps.id, ps.name, ps.user, p.name from pipelinesetup as ps left join pipeline as p on p.id = ps.pipeline where p.name in ('gatk_genotype', 'gatk_variant_calling_and_filter_vcf', 'mpileup_with_leftaln', 'snp_calling_chunked_mpileup_bcf', 'snp_calling_chunked_mpileup_vcf', 'snp_calling_gatk_vcf', 'snp_calling_mpileup_vcf', 'snp_calling_mpileup_bcf', 'vcf_chunked_vep_annotate', 'vcf_filter_merge_and_vep_annotate');
You cannot delete affected pipelines or steps if there are any
PipelineSetups that use them. The easiest thing to do for each one is:
3) $ vrpipe-setup --setup [affected setup id] --delete
Note that this will remove all trace that you ever created or ran that
setup (including deletion of the output files), so do manual backups of
anything you want to keep first.
4) Run the following queries to delete the pipelines:
mysql> delete sa.* from stepadaptor as sa left join pipeline as p on p.id = sa.pipeline where p.name in ('gatk_genotype', 'gatk_variant_calling_and_filter_vcf', 'mpileup_with_leftaln', 'snp_calling_chunked_mpileup_bcf', 'snp_calling_chunked_mpileup_vcf', 'snp_calling_gatk_vcf', 'snp_calling_mpileup_vcf', 'snp_calling_mpileup_bcf', 'vcf_chunked_vep_annotate', 'vcf_filter_merge_and_vep_annotate');
mysql> delete sa.* from stepbehaviour as sa left join pipeline as p on p.id = sa.pipeline where p.name in ('gatk_genotype', 'gatk_variant_calling_and_filter_vcf', 'mpileup_with_leftaln', 'snp_calling_chunked_mpileup_bcf', 'snp_calling_chunked_mpileup_vcf', 'snp_calling_gatk_vcf', 'snp_calling_mpileup_vcf', 'snp_calling_mpileup_bcf', 'vcf_chunked_vep_annotate', 'vcf_filter_merge_and_vep_annotate');
mysql> delete from pipeline where name in ('gatk_genotype', 'gatk_variant_calling_and_filter_vcf', 'mpileup_with_leftaln', 'snp_calling_chunked_mpileup_bcf', 'snp_calling_chunked_mpileup_vcf', 'snp_calling_gatk_vcf', 'snp_calling_mpileup_vcf', 'snp_calling_mpileup_bcf', 'vcf_chunked_vep_annotate', 'vcf_filter_merge_and_vep_annotate');
5) Run the following query to delete steps no longer used by any pipeline:
mysql> delete s.* from step as s left join stepmember as sm on sm.step = s.id where sm.step is NULL;
6) Deactivate all PipelineSetups that used the vcf_vep_annotate pipeline,
since this pipeline gained a step and you probably don't want those
setups springing back to life and trying to run the new final step:
myslq> update pipelinesetup as ps left join pipeline as p on p.id = ps.pipeline set active = 0 where p.name = 'vcf_vep_annotate';
# version 0.93:
This version features a schema change, so be sure to run vrpipe-db_upgrade.
There are also new SiteConfig options, so be sure to go through and answer all
the questions of 'perl Build.PL'.
This version introduces vrpipe-server, which needs a port to bind to. It is safe
for multiple different people with their own VRPipe installs and databases to
run the server on the same machine, but you will encounter errors if you attempt
to use a port that someone else is using: pick a port number (during
'perl Build.PL') unique to your own install.
# version 0.81:
This version increments the schema version, so be sure to run vrpipe-db_upgrade
if you used a previous version of VRPipe.
There is also improved handling of duplicate database rows. Older versions of
VRPipe may have left you with many duplicate rows, most likely in the
dataelementstate and stepstate tables. You may like to manually remove these:
mysql> delete des from dataelementstate as des inner join (select min(id) minid, pipelinesetup, dataelement from dataelementstate group by pipelinesetup, dataelement having count(*) > 1) as dups on (dups.pipelinesetup = des.pipelinesetup and dups.dataelement = des.dataelement and dups.minid <> des.id);
mysql> delete t from stepstate as t inner join (select min(id) minid, stepmember, dataelement, pipelinesetup from stepstate group by stepmember, dataelement, pipelinesetup having count(*) > 1) as dups on (dups.stepmember = t.stepmember and dups.dataelement = t.dataelement and dups.pipelinesetup = t.pipelinesetup and dups.minid <> t.id);
Note that this may have strange effects on what the system thinks has completed,
but shouldn't cause any harm and is recommended.
# version 0.76:
Like 0.75, this version improves indexes. See the notes for 0.75 if upgrading.
# version 0.75:
This version increments the schema version, so be sure to run vrpipe-db_upgrade
if you used a previous version of VRPipe.
No actual changes to the schema itself were made, however the indexing of
columns has improved and vrpipe-db_upgrade will add new additional indexes to
necessary columns. It does not, however, remove the old defunct indexes; you are
encouraged to remove these yourself. The new indexes which should be kept are
all named [table_name]_idx_[column_name]. The old indexes which should be
dropped are named psuedo_idx and txt_idx. If you have used VRPipe for a very
long time there may be other indexes which you should delete (except for
PRIMARY).
# version 0.74:
Minor changes to the schema (size of some int columns); be sure to run
vrpipe-db_upgrade if you have used a previous version of VRPipe.
# version 0.73:
This version adds support for sqlite, though it currently locks up the database
whilst running pipelines; it is only really suited for parsing use.
# version 0.31:
This version introduces proper database independence, and also automatic
indexing of appropriate columns. The only converter written so far, however,
is for MySQL.
# version 0.27:
The schema has changed in this version. Be sure to run vrpipe-db_upgrade if you
have used a previous version of VRPipe.
# versions 0.01-0.30:
a) Only MySQL is fully supported so far, though it may work with other dbs.
b) There is currently an issue with indexing certain columns that are too large
to be specified as varchars. After running vrpipe-db_deploy you will have
to manually connect to your production database and issue the following SQL:
create index path_index on file path(255);
create index output_root_index on scheduler (output_root(255));
create index cmd_dir_index on job (cmd(255), dir(255));
create index requirements_index on requirements (custom(255));
create index result_index on dataelement (result(255));
create index source_options_index on datasource (source(255), options(255));
create index outputroot_options_index on pipelinesetup (output_root(255), options(255));
create index allowed_values_index on stepoption (allowed_values(255));
create index metadata_index on stepiodefinition (metadata(255));
create index summary_index on stepcmdsummary (summary(255));