5' and 3' untranslated regions for eukaryotic parts #169

Open
fxbuson opened this issue Nov 3, 2023 · 5 comments

@fxbuson

fxbuson commented Nov 3, 2023

I'm transferring the discussion from the SBOL Slack channel to here.

I'm collaborating with groups that work on plant synthetic biology, and they have developed libraries of parts where 5' and 3' UTRs are key elements of their gene composition. There is currently no glyph assigned to these elements (SO:0000204 and SO:0000205), and I think diagrams for all eukaryotic systems could benefit from the addition of a distinct glyph for such regions.

iGEM's Mammalian Genetic Design page uses a trapezoid and the ribosome binding site glyph to represent the UTRs:

image

@jakebeal suggested two options to handle this change:

  1. Let trapezoid mean both 5'UTR and 3'UTR, or
  2. Add the 3'UTR meaning to ribosome entry site.

Since 5' and 3' UTRs are not interchangeable (they have different biological effects), I would avoid option 1, especially considering I'm trying to represent a modular parts library. Option 2 could be done, although I think it would decrease the specificity that the RBS glyph currently holds, and hurt the readability of past diagrams.

In case we do want a new glyph for the 5' UTR, @Gonza10V suggested we could split the trapezoid leaving the left half for 5' UTR and the right half for the 3' UTR (see below). I think this solution (or a similar one, in case we want different glyphs) is appropriate, since it keeps these elements as generic untranslated regions and implies some connection between both elements. My only worry is that the 3' half-trapezoid would then be too similar to the CDS glyph.

image

@bbartley

bbartley commented Nov 3, 2023 via email

@Gonza10V Gonza10V self-assigned this Nov 6, 2023
@Gonza10V Gonza10V added the Draft label Nov 6, 2023
@Gonza10V Gonza10V added this to the SBOL Visual 3.1 milestone Nov 6, 2023
@jakebeal
Contributor

jakebeal commented Nov 6, 2023

@bbartley In my own work with plant engineers, the 5'UTR and 3'UTR sequences tend to modulate gene expression levels up or down in a manner that is important for engineering but not yet well understood mechanistically.

One question that I have about the need to differentiate 5'UTR vs. 3'UTR is what happens when there is a polycistronic gene. Is a sequence in between the two genes a 5'UTR sequence, a 3'UTR sequence, or something else entirely?

@Gonza10V
Contributor

Gonza10V commented Nov 7, 2023

Well, given the variety of UTRs, maybe we should go one step up in the ontology and use the trapezoid to represent UTR (SO:0000203). The tradeoff is between covering more uses and being specific, and for now I would aim to cover more use cases. This would also remove the directional intuition provided by the half trapezoid.
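To make the "one step up in the ontology" idea concrete, here is a minimal sketch of how a tool could resolve a glyph by walking up from a specific term to its parent. The `SO_PARENT` and `GLYPHS` tables and the `glyph_for` helper are hypothetical names for illustration only; they are not part of SBOL Visual or any library, and the glyph assignment is just the proposal above, not a decision.

```python
# Hypothetical parent links for the SO terms mentioned in this thread.
SO_PARENT = {
    "SO:0000204": "SO:0000203",  # five_prime_UTR is_a UTR
    "SO:0000205": "SO:0000203",  # three_prime_UTR is_a UTR
}

# Hypothetical glyph table: only the generic UTR term has a glyph,
# as in the proposal above (an assumption, not the standard).
GLYPHS = {"SO:0000203": "trapezoid"}


def glyph_for(term):
    """Walk up the parent links until a term with a glyph is found."""
    while term is not None:
        if term in GLYPHS:
            return GLYPHS[term]
        term = SO_PARENT.get(term)  # None when no parent is recorded
    return None


print(glyph_for("SO:0000204"))
print(glyph_for("SO:0000205"))
```

Under this scheme both the 5' and 3' UTR terms resolve to the same glyph, which is exactly the loss of specificity being traded for coverage.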

@fxbuson
Author

fxbuson commented Nov 7, 2023

Jake is correct. The way I understand it, 5' and 3' UTR are generic terms for untranslated regions in mature mRNA, with diverse and not completely understood mechanisms.

@jakebeal normally eukaryotic genes won't be polycistronic, and that's why UTRs are separated into the two categories, although there could be an engineered polycistronic gene through the use of IRES sequences. For natural prokaryotic genes, there's not normally that much space between CDSs, but again engineered genes could have more.

Even though I'd prefer having two different glyphs for my particular use case, I get that having a single glyph would make more sense for the standard. In that case, wouldn't there be a conflict with Non-Coding RNA Gene (SO:0001263 and SO:0000834)? This might be another issue, but now I'm not sure about when that glyph should/shouldn't be used.

@Gonza10V
Contributor

Gonza10V commented Nov 7, 2023

@fxbuson that's a good question, because UTRs are basically non-coding RNA (ncRNA). The distinction might be in the "gene" part: it means that what is expressed is an RNA and not a protein, so ncRNAs like rRNA, tRNA, and miRNA go in this category. UTRs that are part of an mRNA do not, because they are part of a gene that expresses a protein. Also, an ncRNA gene gives a gene product, while a UTR is a regulatory element.
