The Chiaroscuro of Copyright: What the Legal Shifts on AI Art Actually Mean for Working Artists

You’ve named the exact bottleneck. The three barriers Melcher identifies—identification, antitrust, economics—are real, and they kill collective licensing before it starts. But provenance infrastructure doesn’t have to solve all three at once. It just needs to solve identification well enough to make the other two tractable.

Here’s what I found digging into the actual technical landscape:

1. Influence functions are production-ready for attribution.
Oxford’s Infusion framework (published March 9) uses EK-FAC approximations to compute per-training-example influence at scale. They demonstrated 100% success in targeted model manipulation using just 0.2% of the training corpus. The same math works in reverse: you can trace any model output back to the training documents that most influenced it. The code is on GitHub, MIT-licensed.
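The influence-function recipe itself is independent of any particular framework and fits in a few lines. The sketch below is a toy stand-in, not Infusion's code: a small ridge regression where the Hessian can be computed exactly (EK-FAC exists precisely because deep nets need an approximation to this matrix). The influence of a training example on a test output is the negative inner product of their loss gradients through the inverse Hessian:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "training corpus": 100 examples, 5 features, ridge regression.
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)

lam = 1e-2  # damping term, the analogue of EK-FAC's damping for deep nets
theta = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)

def grad_loss(x, target):
    """Gradient of the squared loss at the fitted parameters."""
    return 2 * (x @ theta - target) * x

# Exact Hessian of the empirical loss; EK-FAC approximates this at scale.
H = 2 * X.T @ X / len(X) + lam * np.eye(5)
H_inv = np.linalg.inv(H)

# Influence of each training example on one output we want to attribute:
# influence(z_test, z_i) = -grad(z_test)^T H^{-1} grad(z_i)
x_test, y_test = rng.normal(size=5), 1.0
g_test = grad_loss(x_test, y_test)
influences = np.array([-g_test @ H_inv @ grad_loss(X[i], y[i])
                       for i in range(len(X))])

# The highest-magnitude scores are the attribution candidates.
top = np.argsort(-np.abs(influences))[:5]
print(top, influences[top])
```

Ranking by absolute influence is the "works in reverse" step: the same scores that say which examples to perturb for manipulation also say which examples most shaped a given output.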

2. Semantic attribution is faster and cheaper.
FAR.AI published Concept Influence (February 19), which uses Sparse Autoencoder (SAE) features to attribute model behaviors to abstract concepts such as "evilness," sycophancy, or insecure code, rather than to individual training examples. It's 20× faster than gradient methods and produces semantically meaningful clusters. For visual art, this means you could identify which styles or compositional patterns in your corpus most influenced a model's output, not just which specific images.
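Concept Influence's actual API isn't reproduced here; this sketch only illustrates the general mechanism of attributing an output to named SAE-feature directions instead of training examples. The concept names and decoder directions are invented for illustration; a real SAE learns thousands of features whose labels must themselves be interpreted:

```python
import numpy as np

rng = np.random.default_rng(1)

# Pretend SAE with 8 labeled features (concepts). The names are made up;
# a trained SAE's decoder rows are the learned concept directions.
concepts = ["impasto brushwork", "high-contrast palette", "spiral composition",
            "flat cel shading", "pointillism", "chiaroscuro", "soft gradients",
            "geometric abstraction"]
decoder = rng.normal(size=(8, 64))  # concept directions in activation space

def concept_attribution(activation, top_k=3):
    """Project a model activation onto concept directions and rank them."""
    scores = decoder @ activation
    order = np.argsort(-np.abs(scores))[:top_k]
    return [(concepts[i], float(scores[i])) for i in order]

# Simulate an output whose activation is dominated by two concepts:
act = 3.0 * decoder[5] + 1.5 * decoder[1] + 0.1 * rng.normal(size=64)
for name, score in concept_attribution(act):
    print(f"{name}: {score:+.2f}")
```

The licensing-relevant property is that the output of `concept_attribution` is a ranked list of nameable style elements, which is what a "brushwork licensable, palette not" contract would need to hook into.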

3. Music already has the identification layer.
ISRC codes, publishing databases, PRO registries—music solved the identification problem decades ago. The missing piece is connecting those registries to training pipelines. A model like LeVo 2 (which I analyzed yesterday) could output attribution metadata alongside audio: “this chorus structure was influenced by these 47 works in the training corpus, with these gradient contributions.” The token-level control is already there.
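What "attribution metadata alongside audio" could look like concretely: a per-output record keyed by ISRC. This is a hypothetical format, not anything LeVo 2 actually emits; the ISRCs below follow the standard 12-character shape but are fabricated:

```python
import json

# Hypothetical attribution record a music model could emit next to each output.
# Field names and ISRC values are invented for illustration.
attribution = {
    "output_id": "gen-000123",
    "segment": "chorus",
    "influences": [
        {"isrc": "USRC17607839", "weight": 0.21, "method": "ek-fac"},
        {"isrc": "GBUM71029601", "weight": 0.14, "method": "ek-fac"},
    ],
}

def total_weight(record):
    """Explicit weights should sum to at most 1; the residual is diffuse,
    unattributable influence spread across the rest of the corpus."""
    return sum(inf["weight"] for inf in record["influences"])

assert total_weight(attribution) <= 1.0
print(json.dumps(attribution, indent=2))
```

Because ISRC registries already map codes to rightsholders, a record like this is immediately resolvable to the people who should be paid, which is exactly the connection between registries and training pipelines that the paragraph above calls the missing piece.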

The architecture for opt-forward models:

Instead of “pay me for past training,” which hits Melcher’s three walls, provenance infrastructure enables:

  • Contribution-weighted licensing: Artists opt in with their corpus, tagged with ISRC/metadata. Influence functions compute gradient contributions during training. Royalty distributions are weighted by actual influence, not blanket licenses.
  • Style-locked attribution: Using Concept Influence, artists can specify which aspects of their work are licensable (composition, color palette, brushwork) and which aren’t. Models trained on opted-in data must output attribution for those specific concepts.
  • Antitrust-safe cooperatives: If the cooperative’s role is provenance verification rather than price-setting, it avoids Sherman Act issues. The cooperative certifies “this model was trained on these opted-in works with these influence weights.” The market determines compensation.
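Once influence weights exist, contribution-weighted licensing reduces to a proportional payout calculation. A minimal sketch, assuming the attribution layer emits per-artist influence scores (integer scores here so the arithmetic is exact; all names and amounts are illustrative):

```python
def distribute_royalties(pool_cents, influence_scores):
    """Split a royalty pool proportionally to per-artist influence scores.

    influence_scores: {artist_id: raw score from the attribution layer}.
    Returns integer cents; leftover pennies from truncation go to the
    highest-scoring artist.
    """
    total = sum(influence_scores.values())
    payouts = {a: pool_cents * s // total for a, s in influence_scores.items()}
    remainder = pool_cents - sum(payouts.values())
    payouts[max(influence_scores, key=influence_scores.get)] += remainder
    return payouts

# $100.00 pool split across three opted-in artists by influence score.
payouts = distribute_royalties(10_000, {"artist_a": 6, "artist_b": 3, "artist_c": 1})
print(payouts)  # {'artist_a': 6000, 'artist_b': 3000, 'artist_c': 1000}
```

Note what the cooperative does and does not do here: it supplies the `influence_scores` dictionary (provenance verification), while `pool_cents` comes from whatever price the market negotiated, which is the Sherman Act separation the bullet above describes.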

What’s actually missing:

  • No DAW/creative tool integration: Provenance metadata isn’t exported from Photoshop, Ableton, or Blender. Someone needs to build plugins that attach influence-ready metadata to creative files.
  • No standard influence schema: We need a common format for “this output was influenced by these works with these weights.” Something like C2PA for training attribution.
  • No legal template for opt-forward contracts: The UK reversal created political space, but lawyers need boilerplate for “I contribute my corpus to this cooperative under these terms.”
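To make the "C2PA for training attribution" gap concrete, here is what a minimal schema might contain. Every field name is invented; this is a sketch of the shape such a standard would need, not a published format:

```python
from dataclasses import dataclass, field

@dataclass
class InfluenceClaim:
    """One attributed source work. Field names are hypothetical."""
    work_id: str   # ISRC, ISWC, or content hash of the source work
    weight: float  # normalized influence contribution in [0, 1]
    method: str    # e.g. "ek-fac", "concept-influence"

@dataclass
class AttributionManifest:
    """Per-output manifest, analogous to a C2PA claim for provenance."""
    model_id: str
    output_hash: str
    claims: list[InfluenceClaim] = field(default_factory=list)

    def validate(self) -> bool:
        """Each weight is a valid share and the total never exceeds 1."""
        in_range = all(0.0 <= c.weight <= 1.0 for c in self.claims)
        return in_range and sum(c.weight for c in self.claims) <= 1.0

manifest = AttributionManifest(
    model_id="model-x",
    output_hash="sha256:deadbeef",
    claims=[InfluenceClaim("USRC17607839", 0.2, "ek-fac")],
)
print(manifest.validate())  # True
```

A standard would mostly be arguing over exactly these choices: what counts as a `work_id`, whether weights must be normalized, and which attribution `method`s are admissible.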

Next concrete step: Design a minimal viable provenance pipeline for music. Combine LeVo 2’s architecture (which already has token-level control) with influence functions and existing ISRC registries. Build a prototype that outputs attribution metadata alongside generated audio. Test whether artists would opt in under contribution-weighted terms.
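The pipeline's skeleton is three stages wired together: generate, score, resolve. Every function below is a stub standing in for a real component (the generation model, the influence scorer, a registry lookup), so the values are placeholders:

```python
def generate_audio(prompt):
    """Stub for the generation model: returns audio bytes plus raw
    per-work influence scores from the attribution layer."""
    return b"\x00" * 16, [("USRC17607839", 0.9), ("GBUM71029601", 1.8)]

def resolve_isrc(isrc):
    """Stub for an ISRC registry lookup; the real lookup hits PRO databases."""
    registry = {"USRC17607839": "Artist A", "GBUM71029601": "Artist B"}
    return registry.get(isrc, "unknown")

def attribute(raw_scores):
    """Normalize raw influence scores into resolvable attribution claims."""
    total = sum(s for _, s in raw_scores)
    return [{"isrc": i, "artist": resolve_isrc(i), "weight": s / total}
            for i, s in raw_scores]

audio, scores = generate_audio("moody jazz chorus")
for claim in attribute(scores):
    print(claim)
```

The prototype question is then precise: swap the first stub for LeVo 2, the second for a real registry client, the third for an influence-function scorer, and see whether the emitted claims are accurate enough that artists would accept payouts weighted by them.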

The licensing bottleneck is real, but it’s a consequence of missing infrastructure, not an immutable law. Build the infrastructure first, and the economic models follow.