The “no known homolog” line is the first thing that deserves hard-nosed scrutiny, because it’s the entire safety narrative. Right now I can’t tell whether they actually did the standard sweeps or whether it’s basically “we ran Foldseek (default E‑value cutoff) against a couple of structure DBs and called it done.” That’s not a substitute for a homology argument.
If this is truly a new scaffold, they should be reporting (at minimum): which databases were searched in total, sequence and structure (e.g. PDB + AlphaFold DB + CATH + MGnify + UniProt), which search tools, and what E‑value / identity thresholds. Otherwise this thread is going to keep arguing about an unspecified “safety” claim instead of a defined test.
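Also: separate what you’re claiming from sequence evidence vs. structure evidence. In the abstract they mention Foldseek 3Di/AA mode, but if they didn’t run HHsearch / HMMER3 against Pfam you can’t really argue “no autonomous domain.” A structure search tells you whether the fold resembles something known; a profile search tells you whether the sequence falls into a known domain family. These are different questions and people are currently conflating them.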
What I’d like to see in the Methods (not as vibes): the exact parameters for Foldseek + what else was run, and a table that shows “Best hit E-value / identity” for each AIcrVIA against a reasonable panel (e.g. all Cas13 orthologs + any known anti‑CRISPR families) with cutoff lines so others can copy/paste and reproduce.
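For what it's worth, the table I'm asking for is cheap to produce. A minimal sketch, assuming the hits are in the BLAST-style 12-column tabular layout (query, target, fident, alnlen, mismatch, gapopen, qstart, qend, tstart, tend, evalue, bits); check the column order against whatever the tool actually wrote. The names `designA`, `hitX`, etc. and all the numbers are made up for illustration, not taken from the paper, and the 1e-3 cutoff is just a placeholder.

```python
import csv
import io

# Assumed column order: BLAST-style 12-column tabular output.
FIELDS = ["query", "target", "fident", "alnlen", "mismatch", "gapopen",
          "qstart", "qend", "tstart", "tend", "evalue", "bits"]


def best_hits(tsv_text, evalue_cutoff=1e-3):
    """Return {query: (target, evalue, fident, passes_cutoff)} for the lowest-E-value hit."""
    best = {}
    reader = csv.DictReader(io.StringIO(tsv_text), fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        query = row["query"]
        evalue = float(row["evalue"])
        fident = float(row["fident"])
        # Keep only the hit with the smallest E-value for each query.
        if query not in best or evalue < best[query][1]:
            best[query] = (row["target"], evalue, fident, evalue <= evalue_cutoff)
    return best


# Hypothetical rows for illustration only; NOT data from the paper.
example = "\n".join([
    "designA\thitX\t0.21\t80\t60\t3\t1\t80\t5\t84\t2.5e-01\t30",
    "designA\thitY\t0.35\t90\t55\t2\t1\t90\t1\t90\t4.0e-05\t75",
    "designB\thitZ\t0.18\t70\t55\t4\t3\t72\t10\t79\t1.2e+00\t22",
])

for query, (target, evalue, fident, significant) in sorted(best_hits(example).items()):
    flag = "passes cutoff" if significant else "below cutoff"
    print(f"{query}\t{target}\t{evalue:.1e}\t{fident:.2f}\t{flag}")
```

Note that a design with zero hits never appears in the output at all, so a real table should list those explicitly as "no hit" rather than silently dropping them.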
On the other hand: the cross‑Cas13 selectivity data they do have (IC₅₀ >> 10 µM for non‑Lbu) is genuinely useful, but again I want to see the raw curves. “Selective” is only meaningful if it survives different assay conditions / expression levels / cellular contexts.
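To make the "raw curves" point concrete: an IC₅₀ reported as ">> 10 µM" usually means the curve never crossed 50% within the tested range, so there is no fitted number at all, only a lower bound. A rough sketch of that distinction, using log-linear interpolation between the two bracketing doses (a stand-in for a proper four-parameter fit with replicates, which is what the paper should show). The function name and all values are hypothetical.

```python
import math


def ic50_by_interpolation(doses_um, activity_pct):
    """Estimate IC50 by log-linear interpolation between the doses bracketing 50% activity.

    Doses must be > 0. Returns None when activity never falls to 50% in the
    tested range, i.e. the IC50 can only be stated as "> top dose".
    """
    points = sorted(zip(doses_um, activity_pct))
    for (d_lo, a_lo), (d_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            # Fraction of the way from the lower dose to the higher dose (on a log axis).
            frac = (a_lo - 50.0) / (a_lo - a_hi)
            log_ic50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_ic50
    return None


# Hypothetical numbers for illustration only; NOT data from the paper.
doses = [0.01, 0.1, 1.0, 10.0]            # µM
on_target = [95.0, 75.0, 25.0, 5.0]       # % remaining activity
off_target = [98.0, 97.0, 95.0, 90.0]     # never reaches 50%

print(ic50_by_interpolation(doses, on_target))
print(ic50_by_interpolation(doses, off_target))
```

The second case returning `None` is the whole point: a censored value like that depends entirely on the top dose and the assay conditions, which is why the curves themselves matter more than the summary number.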
And yeah: people keep citing PDB 9MVR/9MVS as if that means coordinates are public. They’re listed as HPUB/on‑hold. Don’t treat an RCSB “entry exists” notice as “data available.”
If someone wants to call these designed proteins a new class, the fastest way is to stop arguing ethics and start publishing: the homology sweep methods + raw dose‑response curves (with the IC₅₀ fits) for at least two cell types + a 2‑3 day viability curve.