listDBKeyfile.sh - problem with Tassel v5 format #23

Open
afmcc opened this issue Oct 2, 2019 · 2 comments

afmcc commented Oct 2, 2019

There’s a slight bug in the listDBKeyfile.sh script when generating T5 key files. T5 is very strict about column headers, and the program will not run if they are wrong. Could you please amend the script so the header has the following capitalised fields?

Flowcell
Lane
Barcode
FullSampleName
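
A quick way to check a generated key file against the four fields listed above can be sketched as follows. This is only a sketch: the function name `check_t5_header` is hypothetical, and it assumes the key file is tab-delimited with the header on the first line. It tests only that each field is present with exact capitalisation, not column order.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of listDBKeyfile.sh): report any of the four
# capitalised header fields that are missing from a key file.
# Assumption: the key file is tab-delimited and the header is line 1.
check_t5_header() {
    local keyfile="$1" field header missing=0
    header="$(head -n 1 "$keyfile")"
    for field in Flowcell Lane Barcode FullSampleName; do
        # Put each header field on its own line, then look for an exact,
        # case-sensitive, whole-line match.
        if ! printf '%s\n' "$header" | tr '\t' '\n' | grep -qxF -- "$field"; then
            echo "missing header field: $field" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_t5_header SQ2701_new.txt && echo "header OK"` would print `header OK` only if all four fields are present.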


afmcc commented Oct 6, 2019

Hi, sorry about the delay – I’ve done this – see before and after below. If you could confirm this fixes it, that would be great – thanks

AMCC

before:

./listDBKeyfile.sh -s SQ2701 -v 5 > SQ2701_old.txt

patch:

iramohio-01$ git diff
diff --git a/list_keyfile.sh b/list_keyfile.sh
index f130691..02a3c22 100755
--- a/list_keyfile.sh
+++ b/list_keyfile.sh
@@ -167,10 +167,10 @@ function build_extract_script() {
    if [ $CLIENT_VERSION == "5" ]; then
       code="
 select
-   Flowcell,
-   Lane,
-   Barcode,
-   $sample_phrase2 as sample,
+   Flowcell as "Flowcell",
+   Lane as "Lane",
+   Barcode as "Barcode",
+   $sample_phrase2 as "FullSampleName",
    PlateName,
    PlateRow as Row,
    PlateColumn as Column,
iramohio-01$

after:

./listDBKeyfile.sh -s SQ2701 -v 5 > SQ2701_new.txt

diff of before and after :

iramohio-01$ diff SQ2701_old.txt SQ2701_new.txt
1c1
< flowcell lane barcode sample platename row column libraryprepid counter comment enzyme species numberofbarcodes bifo control fastq_link fullsamplename
---
> Flowcell Lane Barcode FullSampleName platename row column libraryprepid counter comment enzyme species numberofbarcodes bifo control fastq_link fullsamplename


afmcc commented Oct 7, 2019

Just realised that some columns in the keyfile have been changed. Is this to comply with the Tassel requirements?

I have tried a few keyfiles. To reproduce what I report here, I used the example SQ2708 from the database:
julia> keyInfo[1:4]
4-element Array{AbstractString,1}:
"Flowcell"
"Lane"
"Barcode"
"FullSampleName"

julia> keyInfo[11:17]
7-element Array{AbstractString,1}:
"enzyme"
"species"
"numberofbarcodes"
"bifo"
"control"
"fastq_link"
"fullsamplename"

I am OK to adapt to the new changes. But in its current form, columns 4 and 17 have the same name apart from capitalisation ("FullSampleName" vs "fullsamplename"). I would suggest using the standard header given in the official documents (please see the link), with the optional customised information you currently include. “DNASample” should be the sample names as submitted to the sequencing lab. “FullSampleName” is required, but in many cases it may be up to the users (like me) to do the due diligence for their specific purpose.

Link: https://bytebucket.org/tasseladmin/tassel-5-source/wiki/Tassel5GBSv2Pipeline/Pipeline_Testing_key.txt?rev=7a156ed52cccd28e450d5abbee27bd4b8ca3bb51
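
The clash between columns 4 and 17 could be caught automatically with something like the sketch below. The function name `find_case_collisions` is hypothetical, and the sketch assumes a tab-delimited key file with the header on the first line. It lowercases every header field and prints any name that then occurs more than once.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print header names that collide when case is ignored.
# Assumption: the key file is tab-delimited and the header is line 1.
find_case_collisions() {
    head -n 1 "$1" \
        | tr '\t' '\n' \
        | tr '[:upper:]' '[:lower:]' \
        | sort \
        | uniq -d   # keep only names that appear more than once
}
```

For a header containing both "FullSampleName" and "fullsamplename", the function prints `fullsamplename`. It prints nothing when all names are distinct.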

I am happy to discuss.
