Representation of introns in non-CDS regions #168

Open
fxbuson opened this issue Nov 2, 2023 · 10 comments

@fxbuson

fxbuson commented Nov 2, 2023

Currently, the specification represents introns by showing a "torn out" CDS, with whatever lies between those edges being the intron composition (1A). While this works for introns within CDSs, introns in untranslated regions can't have the torn-out edges and have no specific visual representation assigned to them. A standard-compliant solution would be to use the more generic non-coding RNA (1B) or engineered region glyphs. If we don't want to change or add any glyphs, I would suggest changing the specification to cover both cases and to not directly associate Intron SO:0000188 with the "torn edges" glyph.

image

If we do want to have a new way of representing introns, I would like to start the discussion with the iGEM Mammalian Genetic Design page, where an intron is represented as a "spike" (2). This alternative would be sufficient to represent introns outside of CDSs, and we could keep the torn edges for when CDSs are interrupted (3):

image

image

Also, if an intron encompasses a region that has functional elements (promoters, RNA elements, etc.), we have to decide whether that should be represented simply as a "composite" intron (4A) or whether there is a single-diagram solution, such as having the intron make its own 'intron strand' (4B, C).

image
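
To make the proposed split concrete, here is a minimal sketch of how a renderer might pick the intron glyph. Everything in it is an assumption for illustration: the `Feature` class, the `intron_glyph` function, and the glyph names `"cds-torn-edges"` and `"intron-spike"` are hypothetical and are not part of the SBOL Visual specification or any existing tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Feature:
    role: str   # Sequence Ontology term name, e.g. "intron", "CDS", "five_prime_UTR"
    start: int  # 0-based start, inclusive
    end: int    # 0-based end, exclusive


def intron_glyph(intron: Feature, features: List[Feature]) -> str:
    """Pick a (hypothetical) glyph name for an intron.

    An intron fully contained in a CDS keeps the existing "torn edges"
    depiction; any other intron (e.g. inside a 5'UTR) gets the spike.
    """
    for f in features:
        if f.role == "CDS" and f.start <= intron.start and intron.end <= f.end:
            return "cds-torn-edges"
    return "intron-spike"


# Toy design: a 5'UTR followed by a CDS.
utr = Feature("five_prime_UTR", 0, 100)
cds = Feature("CDS", 100, 400)

print(intron_glyph(Feature("intron", 20, 60), [utr, cds]))    # intron in the UTR
print(intron_glyph(Feature("intron", 150, 200), [utr, cds]))  # intron in the CDS
```

The point of the sketch is only that the glyph depends on the intron's context, not on SO:0000188 alone.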

Gonza10V self-assigned this Nov 4, 2023
Gonza10V added the Draft label Nov 4, 2023
Gonza10V added this to the SBOL Visual 3.1 milestone Nov 4, 2023
@Gonza10V
Contributor

Gonza10V commented Nov 6, 2023

I see the need for a new glyph to represent introns outside the CDS. I like the spike alternative, as you show in 4A. To show more detail, I would use 4A with an inset of the composite in preference to 4B or 4C.

@jakebeal
Contributor

jakebeal commented Nov 6, 2023

I guess I am confused here. I thought that an intron had to be between exons. Can you give an example of an intron that isn't between exon sequences in a CDS?

@fxbuson
Author

fxbuson commented Nov 7, 2023

@jakebeal an intron does need to be between two exons, but those don't need to be coding regions. Some of the parts in the collections I'm working with are 5'UTRs with introns in them, before any translation start site.

@Gonza10V
Contributor

Gonza10V commented Nov 7, 2023

Hmm, I looked at the definition in the Sequence Ontology: an intron is defined as a sequence between two exons, and as @fxbuson mentioned, these don't need to be in the CDS.
I have no experience in synbio with eukaryotes, but the iGEM mammalian example uses an intron outside of the CDS to depict RNA maturation, and I found one paper where introns are described in UTR regions.

@jakebeal
Contributor

jakebeal commented Nov 7, 2023

My biology may be weak here... I thought that an exon was by definition part of the CDS?

@fxbuson
Author

fxbuson commented Nov 8, 2023

Here is another example where an intron is not in the CDS. This is the plasmid for a part in the OpenPlant toolkit. The part only has a promoter and a 5'UTR, so there is no coding region. Still, the UTR has an intron (highlighted).

image

Exons by definition are regions that get to be part of the mature mRNA, but are not necessarily coding regions.
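
As a toy illustration of that definition, the sketch below splices a made-up transcript whose only intron sits in the 5'UTR, upstream of the start codon. The sequence, the coordinates, and the `splice` and `overlaps_cds` helpers are all invented for this example and do not come from the OpenPlant part.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # 0-based start (inclusive), end (exclusive)


def splice(pre_mrna: str, exons: List[Span]) -> str:
    """Join the exon slices in order; everything between them (introns) is dropped."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


def overlaps_cds(exon: Span, cds: Span) -> bool:
    """True if the exon shares at least one base with the CDS span."""
    return exon[0] < cds[1] and cds[0] < exon[1]


# Made-up pre-mRNA: UTR exon, intron, then the rest of the UTR plus the CDS.
pre = "GGAC" + "GTAAGTTTTCAG" + "CC" + "ATGGCCTAA"
exons = [(0, 4), (16, 27)]   # both end up in the mature mRNA
cds = (18, 27)               # ATG...TAA, starts after the intron

mature = splice(pre, exons)
print(mature)
for exon in exons:
    print(exon, "coding" if overlaps_cds(exon, cds) else "non-coding")
```

Here the first exon is part of the mature mRNA but contains no coding sequence, so the intron at positions 4 to 16 lies between two exons without interrupting the CDS.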

@jakebeal
Contributor

jakebeal commented Nov 8, 2023

Thank you for the example. That also prompted me to look it up in the Sequence Ontology and find that SO:exon is indeed the more general notion, while SO:coding_exon is what I was thinking about.

@Gonza10V
Contributor

Gonza10V commented Nov 8, 2023

So, right now the implementation only allows the use of coding exons and has no way to represent the general exon.
I liked the solution from Asimov for this problem.
Then, to represent composites, I would go with a picture-in-picture solution.
The representation of composites is a shared issue, also mentioned for proteins in #167, and maybe we should come up with a standardized way to represent two hierarchical levels in a picture to solve both problems, instead of creating a new representation for each composite part.

@fxbuson
Author

fxbuson commented Nov 10, 2023

I'd agree to restrict the scope of this issue to the spike glyph solution. What is the process to turn this into an SEP?

@jakebeal
Contributor

@fxbuson : the process to create an SEP is that you do four things:

  1. Create a branch/fork with the proposed specification change. If the SEP is approved, this will then become a pull request and merged into the spec. Example of this for SEP V022
  2. Add an SEP into the SEPs directory. I find it works most smoothly when this explicitly includes the language from the specification diff, which is why I recommend making the SEP based on the change rather than the other way around. Example of this for SEP V022
  3. Create an SEP discussion issue and link to the SEP from there. Example of this for SEP V022
  4. Announce the SEP and discussion issue to the mailing list and the Slack. Example of this for SEP V022

From there, we see if the SEP can reach a state of consensus, and then proceed to a vote!
