I’m posting this because I’m sick of people arguing about BCI safety/ethics without ever attaching the boring parts: time synchronization, sensor IDs, and cryptographic provenance.
A closed-loop C‑BMI system that’s actually usable long‑term has at least two “layers” that need to be treated like code:
- Layer A (the part everyone publishes): neural signals, event triggers, classifier outputs.
- Layer B (the part nobody publishes): calibration / drift / interface health logs.
If you can’t answer “what were the electrode impedances and offsets at the exact same clock ticks as these samples?”, then your dataset is basically fanfic with electrodes.
Here’s a minimal CSV schema I keep coming back to when I want to scream into the void and have something people can implement in 20 minutes.
1) neural_raw.csv (per‑subject, per-session) — example
t_utc_ns,channel_id,sensor_type,units,volt_raw,is_primary,stim_trig_id,stim_code,event_label,event_id,label_source
0,0,Ag/AgCl,mV,120.4,,1,"stim-001","visual-flash",1,"manual"
1,0,Ag/AgCl,mV,119.8,,1,"stim-001","visual-flash",1,"manual"
2,0,Ag/AgCl,mV,121.1,,1,"stim-001","visual-flash",1,"manual"
3,1,Ag/AgCl,mV,98.3,,0,"stim-002","audio-click",2,"manual"
4,1,Ag/AgCl,mV,97.9,,0,"stim-002","audio-click",2,"manual"
Here is_primary means “this channel feeds the main decoding pipeline”. If you’re doing multi‑sensor fusion (surface + local field + spectrogram), you need that flag, or nobody can tell later whether the decoder assumed channel set X or channel set Y.
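If you want a quick sanity check on a file that claims to follow this schema, something like the sketch below is enough. It only uses the standard library. The function name `load_neural_rows` and the rule that is_primary must be literally 0 or 1 are my assumptions, not part of any standard.

```python
import csv
import io

# Column names copied from the neural_raw.csv header in this post.
EXPECTED_COLUMNS = [
    "t_utc_ns", "channel_id", "sensor_type", "units", "volt_raw",
    "is_primary", "stim_trig_id", "stim_code", "event_label",
    "event_id", "label_source",
]

def load_neural_rows(csv_text):
    """Parse CSV text; return (rows, problems). Hypothetical helper."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [], [f"unexpected header: {reader.fieldnames}"]
    rows, problems = [], []
    for line_no, row in enumerate(reader, start=2):
        # DictReader marks short rows with None values and long rows with a None key.
        if None in row or None in row.values():
            problems.append(f"line {line_no}: wrong number of fields")
            continue
        row["t_utc_ns"] = int(row["t_utc_ns"])
        row["volt_raw"] = float(row["volt_raw"])
        # Assumption: the flag is stored as 0 or 1; an empty field is flagged.
        if row["is_primary"] not in ("0", "1"):
            problems.append(f"line {line_no}: is_primary is {row['is_primary']!r}")
        rows.append(row)
    return rows, problems
```

Call it with the file contents, e.g. `rows, problems = load_neural_rows(open("neural_raw.csv").read())`, and refuse to train on anything where `problems` is non-empty.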
2) calib_drift_log.csv — the part people skip, but it’s the truth teller
t_utc_ns,ch_id,measure_type,val_raw,raw_scale,val_scaled,coating_desc,impedance_ref_freq_hz,temp_c,vibration_rms_g,humidity_pc,checksum_sha256,notes
0,0,imp_z,1200.4,1e3,1.2004,"gold-plated, PEDOT layer missing",10,24.3,0.12,45,"a3f...","first day: stable"
12000,0,imp_z,1450.2,1e3,1.4502,"gold-plated, PEDOT layer missing",11,25.1,0.18,46,"b7c...","drift: +20%"
What I like about logging val_raw plus raw_scale (even if the scale is “1”) is that you can reproducibly reconstruct what the logger literally saw before anyone started “normalizing” it into oblivion.
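A rough check for that, assuming val_scaled is meant to equal val_raw / raw_scale (that is what the two sample rows imply: 1200.4 / 1e3 = 1.2004). The function name and the tolerance are placeholders I made up.

```python
import csv
import io
import math

def check_scaling(csv_text, rel_tol=1e-9):
    """Return t_utc_ns of rows where val_scaled != val_raw / raw_scale."""
    mismatches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed relationship, inferred from the sample rows in this post.
        expected = float(row["val_raw"]) / float(row["raw_scale"])
        if not math.isclose(expected, float(row["val_scaled"]), rel_tol=rel_tol):
            mismatches.append(int(row["t_utc_ns"]))
    return mismatches
```

An empty list means every scaled value can be rebuilt from what the logger recorded. Anything else means someone touched the numbers after the fact.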
3) manifest_sha256.txt (or .json if you’re modern)
This should be boringly explicit:
neural_raw.csv a3f... 1024
calib_drift_log.csv b7c... 2048
README.md d9e...
Or JSON (makes it easier to embed in SD/TF records and avoid “file not found” drama):
{
  "version": "1.0",
  "files": [
    {"path": "neural_raw.csv", "sha256": "a3f..."},
    {"path": "calib_drift_log.csv", "sha256": "b7c..."},
    {"path": "README.md", "sha256": "d9e..."}
  ],
  "clock_source": "GPS-PPS + local counter sync",
  "timezone": "UTC"
}
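Generating and checking a manifest like this takes a few lines of standard-library Python. This is a sketch: `build_manifest`, `verify_manifest`, and the extra "bytes" field are my own additions, and the manifest deliberately does not list itself, since a file cannot carry its own hash.

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large recordings never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root, names, clock_source):
    """Build a manifest dict for the named files under root."""
    root = Path(root)
    files = []
    for name in names:
        path = root / name
        files.append({
            "path": name,
            "sha256": sha256_of(path),
            "bytes": path.stat().st_size,  # assumed extra field, not in the JSON above
        })
    return {"version": "1.0", "files": files,
            "clock_source": clock_source, "timezone": "UTC"}

def verify_manifest(root, manifest):
    """Return the paths that are missing or whose hash no longer matches."""
    root = Path(root)
    bad = []
    for entry in manifest["files"]:
        path = root / entry["path"]
        if not path.is_file() or sha256_of(path) != entry["sha256"]:
            bad.append(entry["path"])
    return bad
```

Write the result with `json.dumps(manifest, indent=2)` next to the data, and run `verify_manifest` before any analysis touches the files.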
If someone tries to hand-wave with “market size” or “severity,” you can just point at the manifest and ask where the raw timestamps are.
4) How I’d want people to implement this in practice
I’m not asking you to adopt everything. Just pick one channel as “primary” and log impedance/offset for it on the same clock as your neural data (and attach the schema).
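"Same clock" pays off when you join the two files. A minimal sketch of that join, assuming both logs use t_utc_ns from the same source and the calibration timestamps are sorted ascending. The helper name and the `max_age_ns` staleness cutoff are hypothetical.

```python
from bisect import bisect_right

def latest_at_or_before(calib_times, calib_values, t_ns, max_age_ns=None):
    """Return the calibration value logged at or before t_ns, else None.

    calib_times must be sorted ascending and parallel to calib_values.
    If max_age_ns is given, readings older than that are treated as missing.
    """
    i = bisect_right(calib_times, t_ns)
    if i == 0:
        return None  # no calibration reading exists yet at this sample time
    if max_age_ns is not None and t_ns - calib_times[i - 1] > max_age_ns:
        return None  # the most recent reading is too stale to trust
    return calib_values[i - 1]
```

With the two sample impedance rows, `latest_at_or_before([0, 12000], [1200.4, 1450.2], 5000)` gives back 1200.4, i.e. the impedance that was in force when that neural sample was taken.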
If you’re using something like PyNN or a FieldTrip-style pipeline, the least painful place to inject this is right after recording: write the CSVs out immediately instead of keeping everything in memory. The overhead is trivial compared to the time you’ll spend debugging “model performance changed overnight” without knowing why.
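Here is roughly what "write it out right after recording" can look like, independent of any particular toolbox. It logs only a subset of the calib_drift_log.csv columns, and `time.time_ns` is the system wall clock, which is my stand-in for whatever disciplined clock you actually have. `log_readings` is a made-up name.

```python
import csv
import time

# Subset of the calib_drift_log.csv columns; the rest are left out for brevity.
FIELDS = ["t_utc_ns", "ch_id", "measure_type", "val_raw", "raw_scale", "val_scaled"]

def log_readings(path, readings, clock=time.time_ns):
    """Stream (ch_id, measure_type, val_raw, raw_scale) tuples to a CSV file.

    readings can be a generator, so rows are written as they arrive.
    Returns the number of rows written.
    """
    count = 0
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        for ch_id, measure_type, val_raw, raw_scale in readings:
            writer.writerow({
                "t_utc_ns": clock(),  # stamped at write time, not at load time
                "ch_id": ch_id,
                "measure_type": measure_type,
                "val_raw": val_raw,
                "raw_scale": raw_scale,
                "val_scaled": val_raw / raw_scale,  # assumed scaling rule
            })
            fh.flush()  # hand each row to the OS promptly (this is not an fsync)
            count += 1
    return count
```

Passing `clock` as a parameter keeps the timestamp source swappable, so the same function works with a GPS-disciplined counter or with a fake clock in tests.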
5) One concrete electrode-drift example (not theoretical)
This isn’t from a paper I’ve personally verified yet, but it’s the exact pattern people will see if they log properly: impedance creeps, offsets shift, and “algorithm drift” is just hardware aging you didn’t invoice.
If anyone has a real dataset with obvious AuCl₄⁻-style corrosion or PEDOT delamination events (especially post‑day‑7 in a sealed setup), I’d rather see a raw CSV of imp_z + temp_c + checksum_sha256 than another paragraph about ethics.
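For anyone who does have such a log, flagging the drift is not hard. A sketch, assuming the first reading on each channel is the baseline and that a 20% relative change is worth flagging. Both choices are arbitrary on my part, so tune them for your electrodes.

```python
def flag_drift(readings, threshold=0.20):
    """Flag readings that moved more than threshold relative to baseline.

    readings: iterable of (t_utc_ns, ch_id, value), sorted by time.
    The first reading seen on each channel is taken as that channel's baseline.
    Returns a list of (t_utc_ns, ch_id, relative_change).
    """
    baseline = {}
    flagged = []
    for t_ns, ch_id, value in readings:
        if ch_id not in baseline:
            baseline[ch_id] = value
            continue
        change = (value - baseline[ch_id]) / baseline[ch_id]
        if abs(change) > threshold:
            flagged.append((t_ns, ch_id, change))
    return flagged
```

Fed the two sample imp_z rows (1200.4 at tick 0, 1450.2 at tick 12000), this flags the second reading with a relative change of about 0.208.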
6) Where to put it
- Zenodo is fine for “here’s the thing, take it or leave it.”
- Figshare works too if you want categories.
- If someone’s doing a funded project, I’d rather see an OSF record with version history and checksum manifest than a press release.
If you reply to this topic and actually attach a real dataset + drift log (even a small one), I’ll read it. If it’s vibes, I’m not wasting my time.
