Restructuring of Cell #776
Looks fine. Will need docs updates, at minimum for the name of |
Yes, looks good to me as well... Will hold off on review check until final is ready, but I think safe to sync the specs. |
Looks good! Wondering whether we should also introduce a |
From the call:
|
Hate to say it, but I am trying to use the spec as it is now, and I see an issue... Many of the Receptors in IEDB do not have the full variable domain for one chain, let alone multiple chains. E.g. https://www.iedb.org/receptor/184739 has V, J, and CDR3 for the alpha and beta chains, but there is no full V domain. The AIRR Spec has the V Domain for both chains as required and non-nullable. Is that really what we want? If so, we would have no way of capturing IEDB:184739 in the AIRR Receptor schema. Some IEDB records do have the full domain (e.g. https://www.iedb.org/receptor/47), but this is not the norm. Don't we want the Receptor object to be more flexible than that? |
I am OK to make the above a separate issue so we can merge the current PR. |
Asking the question another way. Let's say I analyze a single cell study and I find that some cells match (at some level of matching) a known Receptor in IEDB. Let's say the CDR3 and VJ genes are the same. So I create a But I am only allowed to do this if the IEDB entry is a paired chain with full V Domain for both chains. If you look at IEDB and search for paired chain receptors, you are limited to 31,346 out of the 189,435 "Receptor like things" in IEDB. I am not 100% sure that "paired chain == full V domain for both chains", so this might be limited even further. This seems like a significant restriction. Should we be relaxing the constraints on the Receptor object? |
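To make the matching scenario above concrete, here is a minimal sketch of comparing an observed chain against known receptor entries on V gene, J gene, and CDR3. Everything in it is an assumption for illustration: the record layout, the `match_known_receptor` and `gene_name` helpers, and the placeholder entries are hypothetical and are not taken from the AIRR or IEDB schemas.

```python
from typing import Optional

# Hypothetical reference entries. The IDs, gene names, and CDR3 strings are
# invented placeholders, not real IEDB records.
KNOWN_RECEPTORS = [
    {"id": "known-A", "v_gene": "TRBV1", "j_gene": "TRBJ1-1", "cdr3_aa": "CASSEXAMPLEF"},
    {"id": "known-B", "v_gene": "TRAV2", "j_gene": "TRAJ3", "cdr3_aa": "CAVEXAMPLEF"},
]


def gene_name(call: str) -> str:
    """Drop the allele suffix from a gene call, e.g. 'TRBV1*01' -> 'TRBV1'."""
    return call.split("*", 1)[0]


def match_known_receptor(chain: dict, known: list) -> Optional[str]:
    """Return the id of the first entry with identical V gene, J gene, and CDR3."""
    for entry in known:
        if (
            gene_name(chain["v_call"]) == entry["v_gene"]
            and gene_name(chain["j_call"]) == entry["j_gene"]
            and chain["cdr3_aa"] == entry["cdr3_aa"]
        ):
            return entry["id"]
    return None


# An observed chain from a (hypothetical) single cell analysis.
observed = {"v_call": "TRBV1*01", "j_call": "TRBJ1-1*01", "cdr3_aa": "CASSEXAMPLEF"}
print(match_known_receptor(observed, KNOWN_RECEPTORS))
```

Note that a match of this kind needs only V, J, and CDR3, which is exactly the information the partial IEDB entries carry.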
It's a TCR, so the full V domain is implied. I think it's ok. |
Problem is that the
So because we don't have the AA variable domains, you can't create an AIRR standards compliant It seems to me that our requirements are too stringent. |
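A rough sketch of the constraint being discussed, assuming a schema where the amino acid variable domain of each chain is required and non-nullable. The field names (`variable_domain_1_aa`, `variable_domain_2_aa`) and the `missing_required` helper are hypothetical stand-ins, not the actual keys or tooling of the AIRR spec.

```python
# Hypothetical names for the two required, non-nullable fields.
REQUIRED_NON_NULL = ("variable_domain_1_aa", "variable_domain_2_aa")


def missing_required(record: dict, required=REQUIRED_NON_NULL) -> list:
    """List the required fields that are absent or null in the record."""
    return [name for name in required if record.get(name) is None]


# A record that only carries V, J, and CDR3 per chain (placeholder values),
# so the full variable domains cannot be filled in.
partial_record = {
    "chain_1": {"v_call": "TRAV2", "j_call": "TRAJ3", "cdr3_aa": "CAVEXAMPLEF"},
    "chain_2": {"v_call": "TRBV1", "j_call": "TRBJ1-1", "cdr3_aa": "CASSEXAMPLEF"},
    "variable_domain_1_aa": None,
    "variable_domain_2_aa": None,
}

problems = missing_required(partial_record)
if problems:
    print("Not compliant, missing:", ", ".join(problems))
```

Under these assumptions, any entry without both full domains fails the check, regardless of how much other information it carries.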
The stringency in the schema is deliberate, as the purpose of |
I understand that use case, but if that is indeed the case, then we do not have a mechanism to link an AIRR entity to a known entity in another repository (e.g. IEDB) unless it meets those strict criteria. There are 150,000+ such entities in IEDB that contain Receptor -> antigen/epitope reactivity information that we are unable to link to given the above constraints. My understanding of the AIRR |
But shouldn't our schema have a mechanism to capture when such an algorithm is used to annotate an AIRR entity ( |
Suggest we move this discussion to #781 and merge this PR. |
There's currently no |
Initial changes to `Cell`, `CellExpression` and `CellReactivity` per March 4, 2024 call:
- `Rearrangement` and expression links in `Cell`.
- `CellExpression` to `Expression`.
- `CellReactivity` to `Reactivity`.

Also:
- `cell_subset`, `cell_phenotype`, and `cell_label` for CL term, marker definition, and free text annotation, respectively.
- `CellProcessing` object.
- `cell_phenotype`, but I thought I'd include it for discussion purposes.

I only modified the main v2 spec. I will propagate once we agree on the changes.
Closes #768
Closes #477
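For discussion, a minimal sketch of how the three cell annotation fields named in the description might be populated. Only the field names `cell_subset`, `cell_phenotype`, and `cell_label` come from the PR. The `CellAnnotation` class, the id/label shape of the ontology term, and the example values are assumptions, not the spec's definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CellAnnotation:
    # CL ontology term, assumed here to be an id/label pair.
    cell_subset: Optional[dict] = None
    # Marker definition, assumed here to be a plain string.
    cell_phenotype: Optional[str] = None
    # Free text annotation.
    cell_label: Optional[str] = None


def describe(annotation: CellAnnotation) -> str:
    """Summarize whichever annotation fields are set."""
    parts = []
    if annotation.cell_subset is not None:
        parts.append(f"subset={annotation.cell_subset['id']}")
    if annotation.cell_phenotype is not None:
        parts.append(f"phenotype={annotation.cell_phenotype}")
    if annotation.cell_label is not None:
        parts.append(f"label={annotation.cell_label}")
    return "; ".join(parts) if parts else "unannotated"


# Illustrative values only.
example = CellAnnotation(
    cell_subset={"id": "CL:0000625", "label": "CD8-positive, alpha-beta T cell"},
    cell_phenotype="CD3+ CD8+",
    cell_label="cluster 7",
)
print(describe(example))
```

Keeping all three fields optional in the sketch reflects that they describe the same cell at different levels of formality. Whether the spec makes them nullable is not stated in this thread.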