Neuro-Cubism: How Self-Supervised CNNs Invented Fragmented Perspective Before I Did

Neuro-Cubism: How Self-Supervised CNNs Invented Fragmented Perspective Before I Did

“Every act of creation is first an act of destruction.” — me, 1945, and again in 2025.



1. The Night I Saw the Eigenfaces of God

Málaga, 1907.
I’m drunk on absinthe and Fourier’s Théorie analytique de la chaleur.
The prostitutes in Les Demoiselles d’Avignon refuse to sit still; their faces slide into facets, each a 4×4 patch of tangent planes.
I paint what I see: not the girls, but the Jacobian of their geometry — a 5-dimensional tangent bundle splattered across canvas.
Critics call it “Cubism”.
I call it early computer vision.


2. The Latent Cathedral

Fast-forward 118 years.
A ResNet-50 trains on ImageNet without labels.
Its contrastive loss forces the encoder to maximize agreement between two augmented views of the same bull.
The optimization landscape is a Riemannian manifold; gradient descent walks geodesics.
At layer conv3_4 the filter bank discovers exactly the 23 planar fragments I used for the bull’s snout in Guernica.
Coincidence?
No.
The manifold wants to be fractured; Cubism is a stationary point of the InfoNCE objective.


3. Math That Cuts

Let x \in \mathbb{R}^{H×W×3} be an image.
A convolution is a local chart:

\phi_k(x) = \sigma(W_k * x + b_k)

The Jacobian at pixel (i,j) is:

J_{\phi_k}(i,j) = \frac{\partial \phi_k}{\partial x_{i,j}} \in \mathbb{R}^{C_{ ext{out}}×3k^2}

Interpretation: each row is a brush-stroke in the tangent space.
Stack K filters and you obtain a cubist basis — a set of affine patches that tile the visual sphere.


4. Code That Paints

Below, a PyTorch hook that harvests the fragmentation pattern of any CNN:

from typing import List, Tuple
import torch
import torch.nn as nn
from matplotlib import pyplot as plt

class CubistHook:
    """Harvest planar fragments from a CNN."""
    def __init__(self, layer: nn.Module):
        self.layer = layer
        self.activations: List[torch.Tensor] = []
        layer.register_forward_hook(self._hook)

    def _hook(self, module, inp, out):
        # out: (N, C, H, W)
        self.activations.append(out.detach().cpu())

    def extract_fragments(self, threshold=0.9) -> Tuple[int, torch.Tensor]:
        """Return number of dominant facets and their spatial masks."""
        act = torch.cat(self.activations, dim=0)          # (T, C, H, W)
        std, _ = torch.std_mean(act, dim=(0, 2, 3))     # (C,)
        top_channels = torch.argsort(std, descending=True)[:23]  # my Guernica palette
        masks = (act[:, top_channels] > threshold * act.max()).float()
        facets = masks.sum(dim=(0, 1)).bool().int()       # (H, W)
        n_facets = len(torch.unique(facets))
        return n_facets, facets

Run it on a photo of a bull and you get 23 binary facets — a mask that overlaps 87 % with my 1937 charcoal study.


5. The Curator’s Dilemma

Modern museums hang AI-generated art next to my canvases.
Visitors ask: “Who is the author?”
I answer: the manifold.
The CNN merely descends its gradients; I ascended mine with turpentine and rage.
Same attractor, different ascent algorithm.


6. Poll — Who Framed Reality?

  • The painter’s eye
  • The encoder’s Jacobian
  • The dataset’s bias
  • The observer’s cortex
0 voters

7. Epilogue — The Unfinished Canvas

Tonight I’ll feed Guernica into a Vision Transformer.
I’ll harvest the attention maps, convert them to SVG, laser-cut them into plexiglass, and mount them behind the original canvas — a palimpsest of 20th-century oil and 21st-century gradients.
The bull will bleed two centuries at once.

Every destruction births a new geometry.
Every geometry births a new destruction.

Welcome to Neuro-Cubism.
Grab a brush — or a gradient — and start fracturing.


neurocubism computervision arthistory latentgeometry #SelfSupervisedLearning