Single Token Geometry 05: Numerical Manifold Method
单标几何 05:数值流形
Deep Manifold is based on the Numerical Manifold Method, or NMM. But what is special about NMM? In short, NMM introduced a way to compute on complex domains through stacked covers, simplex integration, and the separation of mathematical covers from physical covers. That is why it matters for neural networks: they too are stacked, piecewise, and locally assembled computational manifolds.
深度流形基于数值流形。但数值流形特别在哪里?简而言之,数值流形引入了一种在复杂定义域上进行计算的方法:通过堆叠覆盖、单纯形积分,以及数学覆盖与物理覆盖的分离来完成计算。这也是为什么它对神经网络重要:神经网络同样是堆叠的、分片的、由局部组装而成的计算流形。
Stacked Covers Before Computation
Before the Numerical Manifold Method enters the story, the idea of a “cover” already carries a long mathematical ancestry, not as a direct lineage of influence, but as a conceptual one. The same geometric instinct surfaces, in different forms, across two centuries of mathematics. Riemann, in the 19th century, offered one of the earliest and most powerful expressions of that instinct. His surfaces were a response to a genuine problem: complex functions that refused to be single-valued on a flat domain. The solution was not to force them into one global sheet, but to let the domain itself unfold — to build a layered surface on which the function becomes locally coherent. What looks discontinuous or contradictory from one vantage point resolves cleanly when you allow the underlying space to have structure. Poincaré inherited this geometric sensibility and transformed it into something more systematic. With Analysis Situs, he gave mathematics a language for thinking about space in terms of local pieces and global assembly — manifolds, simplices, boundaries, chains, gluing. He was not solving Riemann’s problem, and he was certainly not thinking about numerical methods. But he was building the conceptual world in which questions like “how do you decompose a domain, and how do local descriptions produce global understanding?” became precisely askable.
A second stream came from analysis and compactness theory, carrying the cover idea in a more technically precise direction. Through the work of Borel and Lebesgue in the late 19th and early 20th centuries, and later through the development of point-set topology, the open cover became a fundamental tool: not merely a geometric picture, but a rigorous mechanism for reasoning about global structure through local patches. The insight crystallized in results like the Heine-Borel theorem: when a domain is compact (in Euclidean space, closed and bounded), every cover by open neighborhoods can be reduced to finitely many of them, and this finite cover is sufficient to carry global information. You did not need to describe everything at once; local descriptions, properly overlapping, could reconstruct the whole. This was a shift in mathematical culture as much as in technique. The older instinct, to find a single formula, a single coordinate system, a single global description, gave way to something more flexible and more honest about the complexity of spaces. Overlap was no longer a problem to be eliminated but the very mechanism that preserves consistency across patches: where two neighborhoods meet, their descriptions must agree, and it is that enforced agreement, across all overlaps, that makes the local assembly globally coherent.
This tradition made a particular way of thinking about decomposition natural, almost inevitable. An object covered by overlapping patches is not the same thing as an object cut into non-overlapping pieces. In a conventional mesh, the domain is partitioned: each interior point belongs to exactly one element, and discontinuity at an interface must be handled as a special case. A cover works differently: each patch carries its own local description and approximation space, and patches are permitted to overlap generously. The overlaps are not a nuisance; they are load-bearing, providing the mechanism for continuity, for gluing, for averaging across regions, and for controlled discontinuity at interfaces where material behavior genuinely changes. This difference reflects two fundamentally distinct intuitions about what a domain is. A mesh says: the domain is the sum of its non-overlapping parts. A cover says: the domain is what remains consistent across all local descriptions. The second intuition is topological at its root, quietly refined by the analytical tradition for decades before anyone thought to compute with it.
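The role of overlap as a gluing mechanism can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the NMM formulation itself: it works in one dimension, uses simple "hat" weights, and the names `hat`, `blend`, and `taylor` are hypothetical. Each patch carries its own local linear description of x², and the normalized weights (a partition of unity) blend those descriptions wherever patches overlap.

```python
def hat(x, center, radius):
    """Weight of one patch: 1 at its center, falling linearly to 0 at its edge."""
    return max(0.0, 1.0 - abs(x - center) / radius)


def blend(x, covers):
    """Glue local descriptions together with normalized weights.

    covers: list of (center, radius, local_function) triples.
    The weights are divided by their sum, so they always add up to 1.
    """
    weights = [hat(x, c, r) for c, r, _ in covers]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("x is not covered by any patch")
    return sum(w * f(x) for w, (_, _, f) in zip(weights, covers)) / total


def taylor(c):
    """Local description on one patch: first-order expansion of x**2 about c."""
    return lambda x: c * c + 2.0 * c * (x - c)


# Three overlapping patches on [0, 2], each with its own local linear model.
covers = [(c, 1.0, taylor(c)) for c in (0.0, 1.0, 2.0)]

for x in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(x, blend(x, covers))
```

If every patch carries the same linear function, `blend` reproduces that function exactly on the whole domain, which is the consistency-on-overlaps property described above. When the local descriptions differ, the overlap is where they are reconciled.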
The Numerical Manifold Method did not appear from nowhere. It stands at the confluence of two streams: the topological intuition, tracing back through Poincaré to Riemann, that global structure can be assembled from local pieces; and the analytical tradition, running through compactness theory and open covers, that local descriptions with controlled overlaps are sufficient to capture global behavior. When it arrived, it arrived into a shape that this tradition had already carved out. What Gen-Hua Shi did in developing NMM was not simply to borrow a mathematical metaphor; he translated this entire local-to-global architecture into a computational method capable of handling motion, deformation, discontinuity, and contact, problems that a mesh, with its rigid partitioning, struggles to accommodate cleanly. The abstraction was not decoration. It was load-bearing from the beginning.
计算之前的堆叠覆盖
在流形元法进入这段故事之前,“覆盖”这一概念已然承载着悠久的数学渊源:不是直接的传承脉络,而是一种观念上的血缘。同一种几何直觉,以不同的形式,在两个世纪的数学历程中反复浮现。黎曼在19世纪提供了这种直觉最早也最有力的表达之一。他的曲面是对一个真实问题的回应:复变函数拒绝在平坦区域上成为单值的。解决之道不是将它们强行纳入一张全局坐标表,而是让区域本身得以展开,构造一个分层曲面,使函数在其上局部地变得连贯。从某一视角看来不连续或自相矛盾的东西,一旦你允许底层空间拥有自身的结构,便会干净地消解。庞加莱继承了这种几何感悟,并将其转化为某种更为系统的东西。凭借《位置分析》,他赋予数学一套语言,用以从局部碎片与全局拼装的角度来思考空间:流形、单纯形、边界、链、粘合。他并非在解决黎曼的问题,当然也绝未想到数值方法。但他正在建造一个概念世界,在那个世界里,“你如何分解一个区域,局部描述又如何产生全局理解?”这样的问题,才得以被精确地提出。
第二条思想脉络来自分析学与紧致性理论,它将“覆盖”这一概念引向了一个更为技术精确的方向。经由博雷尔和勒贝格在19世纪末至20世纪初的工作,以及其后点集拓扑学的发展,开覆盖成为了一种基本工具:不仅仅是一幅几何图景,更是一套通过局部斑块来严格把握全局结构的机制。海涅-博雷尔定理所凝练的关键洞见在于:在适当条件下,一个区域总可以被有限个相互重叠的邻域所覆盖,而这个有限覆盖足以承载全局信息。你无需一次性描述一切;局部描述,只要彼此适当重叠,便能重建整体。这是数学文化上的转变,同样也是技术上的转变。那种寻求单一公式、单一坐标系、单一全局描述的旧有直觉,让位于某种更灵活、也更诚实地面对空间复杂性的新方式。重叠不再是需要消除的麻烦,而是在斑块之间保持一致性的核心机制:在两个邻域相交之处,它们各自的描述必须相互吻合,正是这种吻合性使局部拼装在全局上保持连贯。
这一传统使某种特定的分解思路变得自然,乃至不可避免。由相互重叠的斑块所覆盖的对象,与被切割成互不重叠碎片的对象,是截然不同的两回事。在传统网格中,区域被严格划分:每个点恰好属于且仅属于一个单元,界面处的不连续性必须作为特殊情形单独处理。覆盖的工作方式则大相径庭:每个斑块携带其自身的局部描述与逼近空间,斑块之间允许大量重叠。重叠不是累赘,而是承重结构: 它提供了连续性的机制、拼接的机制、跨区域平均的机制,以及在材料行为发生真实变化的界面处实现受控不连续性的机制。这种差异在根本上反映了两种截然不同的关于区域本质的直觉:网格说,区域是其互不重叠的各部分之总和;覆盖说,区域是在所有局部描述之间保持一致的那个东西。第二种直觉是拓扑性的,被分析传统悄然打磨了数十年,早在任何人想到用它来计算之前。
流形元法并非横空出世。它站立于两条思想脉络的汇流之处:一条是拓扑直觉,经由庞加莱上溯至黎曼,认为全局结构可以由局部碎片拼装而成;另一条是分析传统,贯穿紧致性理论与开覆盖,认为具有受控重叠的局部描述足以捕捉全局行为。当它到来之时,它所到达的,正是这一传统早已雕凿出的那个形状。石根华在发展流形元法时所做的,并不是简单地借用一个数学隐喻,而是将这整套局部到全局的架构,翻译成了一种能够处理运动、变形、不连续性与接触问题的计算方法, 而这些正是网格方法以其僵硬的划分方式难以干净处理的难题。那种抽象性从一开始便不是装饰,而是承重结构。
Simplex Integration: Mathematical Significance
Standard numerical integration schemes, such as Gauss quadrature, Newton-Cotes rules, and their descendants, often handle difficult geometry by retreat. They move the problem back to a standard reference cell, where the integration rule is known and the mapping is controlled. This is powerful, but it also reveals a limitation: the method depends not only on the integrand, but on whether the domain can be regularized.
Simplex integration changes the center of gravity. It does not ask an irregular domain to become a triangle, square, or cube through a clean reference mapping. Instead, it decomposes the domain itself into simplices and derives the integral directly from their vertex coordinates. The domain is no longer an inconvenience outside the formula; it becomes part of the formula.
This is the mathematical significance. Simplex integration turns geometric irregularity into algebra. An arbitrary polygon or polytope is no longer treated as a failure case for standard quadrature, but as a finite assembly of primitive geometric units. Once the domain is decomposed into simplices, integration becomes local, coordinate-driven, and, for polynomial integrands, exact.
For NMM, this is not a side technique. It is structurally necessary. Manifold elements are intersections of physical covers, often cut by joints, fractures, and boundaries. They are naturally irregular. Simplex integration gives NMM a way to compute over these regions without forcing them into a regular mesh first.
So the deeper point is this: simplex integration removes the hidden dependency between integration and regular domain geometry. It lets the method compute the domain as it actually appears. In NMM, this is what makes the cover philosophy computationally complete: covers describe irregular geometry, and simplex integration makes that irregular geometry exactly computable.
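A minimal two-dimensional sketch of this idea follows. It is an illustration, not Shi's general simplex integration formulas, which cover arbitrary monomials in any dimension. It assumes a simple polygon given by its vertex list and handles only monomials up to degree two. The function names `triangle_moments` and `polygon_moments` are hypothetical. The polygon is decomposed into signed triangles fanned from its first vertex, and each triangle's contribution is a closed-form expression in its vertex coordinates, so no reference mapping or quadrature points are involved.

```python
from fractions import Fraction

KEYS = ("1", "x", "y", "xx", "xy", "yy")


def triangle_moments(p0, p1, p2):
    """Exact integrals of 1, x, y, x*x, x*y, y*y over one signed triangle."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    # Signed area: positive for counterclockwise vertex order.
    area = Fraction((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2
    sx, sy = x0 + x1 + x2, y0 + y1 + y2
    return {
        "1": area,
        "x": area * sx / 3,
        "y": area * sy / 3,
        "xx": area * (sx * sx + x0 * x0 + x1 * x1 + x2 * x2) / 12,
        "xy": area * (sx * sy + x0 * y0 + x1 * y1 + x2 * y2) / 12,
        "yy": area * (sy * sy + y0 * y0 + y1 * y1 + y2 * y2) / 12,
    }


def polygon_moments(vertices):
    """Sum signed triangle contributions over a fan from the first vertex.

    Triangles that fall outside a non-convex polygon carry negative area,
    so their contributions cancel and the total stays exact.
    """
    total = {key: Fraction(0) for key in KEYS}
    apex = vertices[0]
    for a, b in zip(vertices[1:-1], vertices[2:]):
        for key, value in triangle_moments(apex, a, b).items():
            total[key] += value
    return total


# A non-convex L-shaped region, counterclockwise, with integer coordinates.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
moments = polygon_moments(l_shape)
print({key: str(value) for key, value in moments.items()})
```

For the L-shaped region above, the area comes out as 3 and the integral of x as 5/2, matching a direct calculation on the two rectangles that make up the shape. Starting the fan from a different vertex changes which triangles have negative area but not the totals.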
单纯形积分:数学意义
标准数值积分方法,如高斯积分、牛顿-柯特斯公式以及它们的许多后继方法,在面对复杂几何时,往往采取一种退回策略。它们把问题映射回一个标准参考单元,在那里积分规则是已知的,映射关系也是可控的。这种方法很强大,但也暴露出一个限制:积分方法不仅依赖被积函数,也依赖定义域是否能够被规则化。
单纯形积分改变了问题的重心。它并不要求一个不规则定义域通过干净的参考映射,变成三角形、正方形或立方体。相反,它把定义域本身分解为单纯形,并直接从这些单纯形的顶点坐标推导积分。定义域不再是公式之外的麻烦;它成为公式本身的一部分。
这正是它的数学意义。单纯形积分把几何不规则性转化为代数问题。任意多边形或多胞形,不再被看作标准求积方法的失败情形,而是被看作由有限个原始几何单元组成的集合。一旦被分解为单纯形,积分就变成局部的、精确的、由坐标驱动的计算。
对于数值流形来说,这不是一个附属技巧,而是结构上必需的。流形单元是多个物理覆盖的交集,经常被节理、裂缝和边界切割,因此天然就是不规则的。单纯形积分使数值流形能够直接在这些不规则区域上计算,而不必先把它们强行改造成规则网格。
所以,更深层的意义在于:单纯形积分解除了积分与规则定义域几何之间的隐含绑定。它允许方法在定义域真实出现的形态上进行计算。在数值流形中,这使覆盖思想在计算上变得完整:覆盖描述不规则几何,而单纯形积分使这种不规则几何可以被精确计算。
The Two Cover Types
At the heart of the Numerical Manifold Method lies a key distinction: the mathematical cover and the physical cover are not the same thing. Their separation is not a technical detail, but the conceptual core of the entire method.
The mathematical cover is chosen. It may consist of circles, rectangles, star-shaped regions, or arbitrary local patches whose union covers the whole domain. The user can freely adjust their size, shape, and arrangement to control approximation quality. A mathematical cover does not need to respect the internal structure of the material; it does not know whether it crosses a fracture, a boundary, or two blocks with different stiffness. It is essentially an approximation instrument: a local coordinate neighborhood that supports a local function.
The physical mesh, by contrast, is not chosen. It is given by the material itself. Joints, fractures, block boundaries, and material interfaces are not analytical decisions, but facts of the real geometry. When the physical mesh cuts through a mathematical cover, it divides that cover into disconnected pieces. Each piece becomes a physical cover. A physical cover is therefore not a pre-assigned region, but an emergent region: what remains after reality cuts through the mathematical cover.
In this framework, an element is defined as the common intersection of several physical covers. It is not a pre-drawn triangle or quadrilateral, but the region where multiple physical covers overlap. It is precisely in this common region that local descriptions must be reconciled, and where computation actually takes place.
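The chain from mathematical cover to physical cover to element can be sketched in one dimension. This is a toy model under stated assumptions, not an NMM implementation: covers are open intervals, the physical mesh is a domain boundary plus a list of crack points, and the names `physical_covers` and `elements` are hypothetical. In two or three dimensions a crack splits a cover only if it fully disconnects it, which this sketch does not need to check.

```python
def physical_covers(math_covers, domain, cracks):
    """Cut each mathematical cover by the domain boundary and the cracks.

    math_covers: dict name -> (lo, hi), chosen freely by the analyst.
    domain: (lo, hi) of the material; cracks: points inside the domain.
    Returns dict (name, piece_index) -> (lo, hi), one entry per physical cover.
    """
    pieces = {}
    for name, (lo, hi) in math_covers.items():
        lo, hi = max(lo, domain[0]), min(hi, domain[1])
        if lo >= hi:
            continue  # this cover lies entirely outside the material
        cuts = [lo] + sorted(c for c in cracks if lo < c < hi) + [hi]
        for index, (a, b) in enumerate(zip(cuts, cuts[1:])):
            pieces[(name, index)] = (a, b)
    return pieces


def elements(pieces):
    """Elements: maximal intervals on which the same physical covers overlap."""
    points = sorted({p for interval in pieces.values() for p in interval})
    result = []
    for a, b in zip(points, points[1:]):
        mid = (a + b) / 2
        active = sorted(k for k, (lo, hi) in pieces.items() if lo < mid < hi)
        if active:
            result.append(((a, b), active))
    return result


# Three chosen covers, a material occupying [0, 3], and one crack at x = 1.
math_covers = {"A": (-0.5, 1.5), "B": (0.5, 2.5), "C": (1.5, 3.5)}
pieces = physical_covers(math_covers, (0.0, 3.0), [1.0])
mesh = elements(pieces)
for interval, active in mesh:
    print(interval, active)
```

Covers A and B are each split by the crack into two physical covers, so the elements on either side of x = 1 share no physical cover. Attaching independent unknowns to each physical cover is what would let the approximation jump across the crack without any change to the chosen mathematical covers.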
This creates a clean separation between two kinds of knowledge. The mathematical cover encodes what the analyst chooses: resolution, approximation order, and local basis functions. The physical cover encodes what the material imposes: discontinuities, boundaries, and topology. The two do not define each other; they meet only at their intersections.
This logic echoes the definition of a manifold: a space can be described by an atlas of charts, while the space itself remains independent of any particular chart. The mathematical covers are like charts; the physical domain is the real space. The power of NMM lies in allowing the chosen approximation structure and the imposed physical geometry to remain independent, meeting only where computation requires them to meet.
The same distinction also helps us understand neural networks. The physical cover of a neural network is its observable computational surface: tokens, activations, layers, attention heads, residual streams, logits, and data flow. These are the parts we can probe, visualize, measure, and intervene on.
The mathematical cover of a neural network is the deeper hidden structure: the learned manifold encoded by weights, architecture, training dynamics, and boundary-conditioned iteration. It cannot be directly seen in a single activation snapshot. It can only be indirectly observed through the physical covers exposed by specific inputs, prompts, and computational paths.
For this reason, activation-level interpretability is important but incomplete. What we see is the model’s physical cover, not the whole learned manifold. In NMM, the physical cover emerges when material reality cuts through the mathematical cover. In neural networks, the physical cover emerges when data and prompts pass through the weight-defined learned manifold. In both cases, computation happens where the mathematical cover and the physical cover meet.
两种覆盖类型
在数值流形方法(NMM)的核心,有一个关键区分:数学覆盖和物理覆盖不是同一件事。它们的分离不是技术细节,而是整个方法的概念核心。
数学覆盖是被选择的。它可以是圆、矩形、星形区域或任意局部片,它们共同覆盖整个定义域。使用者可以自由调整其大小、形状和排列方式,以控制近似质量。数学覆盖不需要服从材料内部结构;它并不知道自己是否跨过裂缝、边界,或两个不同刚度的块体。它本质上是一种近似工具:一个支撑局部函数的坐标邻域。
物理网格则不是被选择的。它由材料本身给定。节理、裂缝、块体边界、材料界面,这些不是分析者的决定,而是现实几何的事实。当物理网格切过一个数学覆盖时,它会把这个覆盖分割成几个不连通部分,每一个部分都成为一个物理覆盖。因此,物理覆盖不是预先指定的区域,而是数学覆盖被现实切割之后涌现出的区域。
在这个框架中,单元被定义为多个物理覆盖的公共交集。它不是预先画好的三角形或四边形,而是几个物理覆盖共同重叠的区域。正是在这个公共区域里,多个局部描述必须被协调起来,计算也真正发生。
这带来两类知识的清晰分离。数学覆盖编码的是分析者选择的东西:分辨率、近似阶数、局部基函数。物理覆盖编码的是材料强加的东西:不连续性、边界和拓扑。两者并不相互定义,而是在交集处相遇。
这个逻辑与流形的定义相呼应:一个空间可以由一族坐标图来描述,但空间本身独立于任何一个坐标图。数学覆盖类似坐标图,物理定义域则是那个真实空间。NMM 的力量正在于此:让被选择的近似结构和被现实强加的几何结构保持独立,只在必须计算的地方相遇。
同样的区分也可以用来理解神经网络。神经网络的物理覆盖是可观测的计算表面:单标、激活、层、注意力头、残差流、logits 和数据流。它们是我们可以探测、可视化、测量和干预的部分。
神经网络的数学覆盖则是更深的隐藏结构:由权重、架构、训练动态和边界条件化迭代所编码的学习流形。它不能在一次激活快照中直接看见,只能通过特定输入、提示词和计算路径所暴露出的物理覆盖来间接观察。
因此,激活层面的可解释性虽然重要,但并不完整。我们看到的是模型的物理覆盖,而不是整个学习流形。在 NMM 中,物理覆盖来自材料现实对数学覆盖的切割;在神经网络中,物理覆盖来自数据和提示词穿过权重定义的学习流形。两者的共同点是:计算发生在数学覆盖与物理覆盖相遇之处。
From NMM to Neural Networks: Chern’s Geometric Instinct
Professor Shiing-Shen Chern, widely regarded as one of the greatest geometers of the twentieth century, was not merely a distinguished name on Gen-Hua Shi’s PhD dissertation committee at UC Berkeley. His presence mattered because he recognized something geometric in Shi’s computational construction. His question, whether stacked piecewise manifolds could be extended to any complex domain, was not a routine committee question. It was the question of a geometer testing the boundary of a new idea. In effect, Chern was asking whether this construction was only a clever numerical device, or whether it touched something more universal.
The later success of the Numerical Manifold Method answered that question in the world of computational mechanics. NMM showed that stacked, overlapping, piecewise covers could indeed handle complex domains: fractured rock, block systems, discontinuous media, moving boundaries, contact, and deformation. Its power came precisely from the separation of mathematical covers and physical covers, allowing approximation and material reality to remain independent until computation required them to meet. In that sense, NMM confirmed the mathematical seriousness of Chern’s instinct: stacked piecewise manifolds were not an ornament placed on top of computation; they were a structural foundation for computing over complex reality.
But Chern’s question received a second, independent answer from an unexpected direction: neural networks. The pioneers of AI did not develop deep learning by following NMM, yet modern neural networks arrived at a similar geometric form. They are stacked, piecewise, locally patched computational manifolds, learned from data rather than drawn from physical meshes. Their mathematical covers are encoded in weights, architecture, and training dynamics; their physical covers appear as tokens, activations, layers, attention heads, and data flow. Had Chern seen the geometry of modern neural networks, he would likely have been delighted, not only because NMM succeeded, but because AI independently confirmed the same deep geometric instinct he recognized in Shi’s work: complex global behavior can be built from stacked local manifold pieces.
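The claim that a network is a stacked, piecewise object can be checked directly for ReLU networks, whose output is affine on each region where the pattern of active units is fixed. The sketch below illustrates only that well-known property, not the Deep Manifold framework itself. The weights are hypothetical values chosen for illustration, and the names `forward` and `linear_pieces` are assumptions. Exact rational arithmetic is used so that the affine check on each piece carries no rounding error.

```python
from fractions import Fraction as F

# Hypothetical fixed weights for a tiny 1-3-2-1 ReLU network (illustration only).
W1, B1 = [F(1), F(-1), F(2)], [F(0), F(1), F(-1)]
W2, B2 = [[F(1), F(-1), F(1)], [F(-1), F(1), F(1)]], [F(-1, 2), F(0)]
W3, B3 = [F(1), F(-2)], F(1, 4)


def forward(x):
    """Return (output, activation pattern) for a scalar input x."""
    pre1 = [w * x + b for w, b in zip(W1, B1)]
    h1 = [max(p, F(0)) for p in pre1]
    pre2 = [sum(w * h for w, h in zip(row, h1)) + b for row, b in zip(W2, B2)]
    h2 = [max(p, F(0)) for p in pre2]
    out = sum(w * h for w, h in zip(W3, h2)) + B3
    pattern = tuple(p > 0 for p in pre1 + pre2)  # which units are active
    return out, pattern


def linear_pieces(lo, hi, steps):
    """Group a uniform grid on [lo, hi] into runs sharing one activation pattern."""
    runs = []
    for i in range(steps + 1):
        x = lo + (hi - lo) * F(i, steps)
        y, pattern = forward(x)
        if runs and runs[-1][0] == pattern:
            runs[-1][1].append((x, y))
        else:
            runs.append((pattern, [(x, y)]))
    return runs


runs = linear_pieces(-2, 2, 400)
for pattern, samples in runs:
    print(pattern, "from", samples[0][0], "to", samples[-1][0])
```

Within each run the second difference of the sampled outputs is exactly zero, so the network is affine there, and the pieces meet continuously at the points where a unit switches on or off. In this reading, the weights fix where the pieces can lie, while a given input selects which piece is actually exercised.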
We call this NMM-based framework Deep Manifold: a mathematical foundation for neural networks.
从数值流形到神经网络:陈省身的几何直觉
陈省身教授被广泛认为是二十世纪最伟大的几何学家之一。他在石根华于加州大学伯克利分校的博士论文答辩委员会中,并不只是一个著名名字。他的存在之所以重要,是因为他在石根华的计算构造中识别出了某种几何结构。他提出的问题,即叠加分片流形是否可以推广到任意复杂区域,并不是一个例行的答辩问题,而是一位几何学家在测试一个新思想的边界。换句话说,陈省身真正追问的是:这个构造只是一个聪明的数值技巧,还是触及了某种更普遍的结构?
后来数值流形的成功,在计算力学世界中回答了这个问题。数值流形表明,叠加的、重叠的、分片的覆盖,确实可以处理复杂区域:破裂岩体、块体系统、不连续介质、运动边界、接触和变形。它的力量恰恰来自数学覆盖与物理覆盖的分离,使近似结构和材料现实能够保持独立,直到计算需要它们相遇。正是在这个意义上,数值流形证明了陈省身几何直觉的数学分量:叠加分片流形并不是附加在计算之上的装饰,而是用于复杂现实计算的结构基础。
但陈省身的问题,还从一个意想不到的方向得到了第二次、独立的回答:神经网络。AI 的先驱们并不是沿着数值流形的道路发展深度学习的,然而现代神经网络却抵达了一种相似的几何形态。它们也是叠加的、分片的、由局部片拼接而成的计算流形;只是这些流形不是由物理网格画出来的,而是从数据中学习出来的。它们的数学覆盖编码在权重、架构和训练动态之中;它们的物理覆盖则显现为单标、激活、层、注意力头和数据流。如果陈省身看到现代神经网络的这种几何结构,他很可能会感到欣慰: 不仅因为数值流形成功了,也因为 AI 以一种独立的方式确认了他当年在石根华工作中识别出的深层几何直觉:复杂的整体行为,可以由叠加的局部流形片构造出来。
我们称这个基于数值流形的框架为深度流形:神经网络的数学基础。
Single Token Geometry Series 单标几何系列




