When I was learning Calculus on my own, I decided to skip limits. Sounds stupid, I know, but at the time I was engrossed in an alternate system that explained the central ideas of derivatives and integrals seamlessly, without the minefield of epsilons and deltas that scares away half the students opening a Calc textbook for the first time.
Plus, I could rest assured that my methods were backed by the fathers of Calculus themselves.
The Idea of Infinitesimals
What do I mean by this? Like all concepts we treat as mathematical law today, Calculus was once as fluid with its rules as the continuous change it described. Rather than relying on the rigor of axioms and limit postulates to develop a fully consistent system from the ground up, Newton and Leibniz— the pioneers of modern Calculus— built their new language around a murky idea that came to be known as “the infinitesimal.” That may sound mysterious, but I’ll bet the symbol below is familiar to even the most limit-rooted Calculus students.
\( dx \)
Rather than simply writing it for the sake of convention within a limit-based derivative or integral, though, this symbol came to represent a quantity in itself. An infinitesimal describes a tiny nudge to the value of x, far smaller than any degree of precision used in conventional calculations… infinitesimally small, so to speak. The question Newton and Leibniz were asking throughout their development of “infinitesimal calculus” was how a tiny nudge dx affects the value of a function \( y=f(x) \): in other words, what tiny nudge dy it produces!
Leibniz in particular blurred the distinction between infinitesimals and real numbers, treating them as one and the same. In fact, it’s his exact notation we generally use today for the derivative \( \frac{dy}{dx} \), literally meant as an arithmetical ratio of “the resulting infinitesimal nudge of y” to “the infinitesimal nudge of x.”
From here, anything goes. Instead of stopping at using infinitesimals as just a symbol, the proofs of early Calculus would add and subtract them, plug them into trig functions, use them as lengths or angles in geometric proofs, and so on. I’ll give a quick example here with the derivative of \( x^2 \).
Since we’re multiplying two quantities together, a common trick in geometric proofs is to treat this as an area calculation: we can use a square to represent the numerical length x and add a small dx to each side to showcase the resulting change in area dy. But what is that area? Breaking down the shape added to the square, we can see it’s composed of two rectangles and a small square in the corner. Adding together the areas of these composite shapes gives us the expression below (two rectangles with sides x and dx, and a square with side length dx):
\( dy=2xdx+dx^2 \)
Here’s where things get interesting. We’ve already established that dx is infinitesimally small… so what would \( dx^2 \) be? The square of a number close to 0 is even closer to 0, so \( dx^2 \) must be unfathomably small. This added degree of “smallness” makes it inconsequential compared to the other term, so Newton and Leibniz decided to drop it and leave the final expression as \( dy=2xdx \).
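If you want to see this “drop the \( dx^2 \)” move in action numerically, here’s a minimal Python sketch. The small-but-finite dx is my own stand-in for a true infinitesimal (no floating-point number actually is one):

```python
# A small finite dx stands in for an "infinitesimal" nudge (an approximation,
# not a true infinitesimal).
x = 3.0
for dx in [1e-2, 1e-4, 1e-6]:
    dy = (x + dx) ** 2 - x ** 2          # exact change in area: 2*x*dx + dx**2
    print(f"dx = {dx:.0e}: dy/dx = {dy / dx:.8f}, leftover dx^2/dx = {dx:.0e}")
# dy/dx closes in on 2*x = 6 as dx shrinks: the dx^2 term's share (equal to
# dx itself after dividing) vanishes, just as Newton and Leibniz assumed.
```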
If this idea of dropping a “more infinitesimal” quantity feels wrong, don’t worry: the math checks out, and it sounds right. As Calculus became widespread and mainstream in math and physics, its practitioners matched infinitely small quantities against infinitely small quantities over and over again and derived consistently useful results. The intuition for the derivative as the slope of a tangent to a curve comes from this very idea of infinitesimal change, and we still use it today down to the letters, one of the many remnants of infinitesimal thought that made it such a powerful system. If it ain’t broke, don’t fix it, right?
Things Break
The trouble was that math with infinitesimals was a lot like skimming the SparkNotes of a book for an essay (I can speak from experience there) instead of blocking out two hours and reading it cover to cover: you end up with a vague roadmap of the plot, characters, and themes, but when asked about the nitty-gritty of dialogue or character relationships, you draw a blank. The fact that the symbol ∞ was so commonly used as shorthand for all kinds of nonsensical operations— such as division by 0— speaks to how deeply the habit of juggling infinities and infinitesimals, all without a concrete set of rules, was etched into Calculus.
\( \frac{1}{dx}=\frac{1}{0}=\infty \)… the kind of expression that dominated logic with infinitesimals.
Why was that a problem? Well, remember that one of the strengths of infinitesimal calculus was in analyzing the behavior of functions over an interval and establishing a relationship in the form of the derivative. But certain functions, often to a grotesque degree, seemed to demand an explanation that infinitesimal logic simply couldn’t articulate.
The Problem of Continuity
Take the floor function, denoted \( \left\lfloor x \right\rfloor \). The idea is simple: given a real number, say 3.1415926535, round down to the nearest integer, in this case 3. Over a given interval, we expect it to stay constant until x crosses over to a new integer, at which point it jumps up to a new, higher floor. But discontinuities don’t really have a concrete explanation in infinitesimal language: tangent-line intuition doesn’t seem to cut it when the function changes its behavior so abruptly.
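To see the breakdown concretely, here’s a quick numerical sketch in Python (the shrinking h is, as before, just a finite stand-in for an infinitesimal):

```python
import math

# Slopes of secant lines through the point (3, floor(3)) from each side.
for h in [0.1, 0.01, 0.001]:
    right = (math.floor(3 + h) - math.floor(3)) / h       # stays at 0
    left = (math.floor(3 - h) - math.floor(3)) / (-h)     # (2 - 3)/(-h) = 1/h
    print(f"h = {h}: right slope = {right}, left slope = {left}")
# The left-side slope grows without bound: no single tangent line fits here.
```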
Okay, that’s not so bad. We can define an intuitive framework for the idea of continuity pretty easily: if a function doesn’t abruptly change its value from point to point, then the idea of modeling change over an infinitesimal adds up; otherwise, we have to carve it up and analyze each segment one at a time. With this logic in place, the floor function wasn’t such a big deal. Other functions didn’t lend themselves to such intuition…
The Weierstrass Function
This spiky function is called the Weierstrass function, after its discoverer Karl Weierstrass. There are definitely less esoteric graphs out there that leave the infinitesimal picture of the derivative floundering (the absolute value function’s sharp corner is the traditional example), but this function has historically been the pathological counterexample marking the limits of infinitesimal theory, and it’s too cool not to talk about anyway. Here are some facts about this strange function.
- It is continuous over its entire domain, meaning no jumps or holes.
- It is defined by a single formula rather than a piecewise collection of fragments: \( f(x)=\sum _{n=0}^{\infty }a^{n}\cos(b^{n}\pi x), \) where \( 0< a< 1, b > 1, ab > 1+ \frac{3}{2}\pi \). The use of infinite series as functions is completely legitimate: Leibniz and especially Newton used them frequently in their respective developments of Calculus. (A truncated version of this one is easy to experiment with; see the sketch after this list.)
- It was one of the first rigorously studied fractal curves: at any given point you can zoom further in and the curve will maintain its structure rather than smoothing out into a line, just like the red bubble shows when zooming in on a point.
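Here’s that experiment as a minimal Python sketch. The parameters \( a=0.5, b=13 \) are my own choice (they satisfy the constraints, since \( ab=6.5>1+\frac{3}{2}\pi\approx 5.71 \)), and the series is truncated at 30 terms, plenty for double precision:

```python
import math

def W(x, a=0.5, b=13, terms=30):
    """Truncated Weierstrass series: sum of a^n * cos(b^n * pi * x)."""
    return sum(a ** n * math.cos(b ** n * math.pi * x) for n in range(terms))

# Secant slopes at x = 0.1: instead of settling toward a tangent slope,
# they swing wildly as the interval shrinks.
for h in [1e-2, 1e-3, 1e-4, 1e-5]:
    slope = (W(0.1 + h) - W(0.1)) / h
    print(f"h = {h:.0e}: secant slope = {slope:+.3f}")
```

Run it and the slopes bounce around and grow instead of converging: the numerical signature of jaggedness surviving every zoom level.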
The Problem of Smoothness
Unlike the floor function, this function’s complications seem much less obvious. Its behavior is unusual, sure, but surely we can accurately describe its local behavior, right? Not exactly. The idea of infinitesimals relies on local linearity: that, over a small enough interval, a function behaves like a straight line and can be modeled as such. But if a function just sprouts more and more jagged edges the further in you zoom, how can it be defined that way? What does it even mean for a function to “act like a line”? And who’s to say it doesn’t behave that way at some level of infinitesimalness: after all, we just dropped infinitesimals whenever it felt right anyway!
The problem isn’t necessarily that infinitesimals can’t seem to describe Weierstrass’s function, but rather that they can’t seem to describe why they can’t describe it. And that’s the telltale mark of an incomplete system.
Real Analysis
There are plenty of other places where infinitesimals don’t measure up as a rigorous system: in multivariable calculus, the use of partial derivatives (how a multivariable function like \( f(x,y) \) changes from a tiny nudge dx with no change in y) causes the kind of infinitesimal arithmetic we take for granted to crumble altogether. But the real reason infinitesimals were phased out wasn’t any blatant contradiction; it was part of a larger push for rigor in mathematical theory. As we saw with Weierstrass’s function, with a little bit of creativity we can create some truly indescribable functions.
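To give a taste of that crumbling (this is a standard identity from multivariable calculus, not something derived in this post): if three variables are tied together by a constraint \( f(x,y,z)=0 \), their partial derivatives obey the cyclic relation

\( \left(\frac{\partial x}{\partial y}\right)_{z}\left(\frac{\partial y}{\partial z}\right)_{x}\left(\frac{\partial z}{\partial x}\right)_{y}=-1 \)

If the \( \partial \)’s cancelled like honest fractions, the product would be \( +1 \); the subscripts tracking which variable is held fixed are exactly the bookkeeping that naive infinitesimal arithmetic throws away.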
But over the course of the 1800s, mathematicians decided that indescribability wasn’t gonna fly if they wanted a perfectly logical language. The field of analysis was developed by mathematicians such as Cauchy, Riemann, and Weierstrass— no, he wasn’t just creating crazy infinite series to mess with mathematicians— to formalize the key ideas of continuous change, and with its inception, the infinitesimal system was given the boot. But what would replace such a cornerstone of Calculus?
Infinitesimals Hit Their Limit
Introducing the limits we know and love! But to make sure we don’t end up hand-waving any infinities like last time, let’s unpack the formal definition from the ground up:
If, for every real number \( \epsilon >0 \) there exists a \( \delta >0 \) such that for all \( x \)
\( 0<|x-x_0|< \delta \Rightarrow |f(x)-L|<\epsilon \)
then we may say the limit of \( f(x) \) as \( x \) approaches \( x_0 \) is L:
\( \lim_{x\to x_0} f(x)=L \).
If you’ve taken Calculus, then you’ve probably at least seen this definition, even if you haven’t worked through the sheer pain that is proofs with epsilon and delta. What I want to focus on here is the subtle difference between the logic of limits and that of infinitesimals. Looking at the two inequalities, for \( x \) and for \( f(x) \), we can see that each places an upper bound on the distance between the input or output and a fixed value. That in itself isn’t saying much, but the real weight of the statement is that the input bound exists for any epsilon chosen: in other words, no matter how tightly we demand that \( f(x) \) cluster around L, there is always a finite range of inputs guaranteed to land inside that tolerance.
For example, let’s say we’re given an output tolerance of \( \epsilon = 0.5 \). That means that every \( f(x) \) we’re allowed to use must be within 0.5 of L, and what’s more, the only x-values that yield this result are within some \( \delta \) of \( x_0 \).
As we make the output tolerance arbitrarily small by choosing smaller and smaller epsilons, functions that can’t confine their outputs within that tolerance over any finite range of inputs get excluded from the definition at the offending points. Take the floor function: at any integer, choose \( \epsilon = \frac{1}{2} \). Every input range around the integer, no matter how small, contains outputs on two plateaus a full unit apart, and two values a distance 1 apart can’t both sit within \( \frac{1}{2} \) of a single limit L. Try it yourself with some functions below!
Credit to https://www.desmos.com/calculator/iejhw8zhqd for original interactive
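Here’s a rough numerical stand-in for that interactive: a Python sketch that brute-force samples points inside a candidate delta-window and checks whether every output lands within epsilon of L. (The helper delta_works is my own invention, and since it samples finitely many points it’s a sanity check, not a proof.)

```python
import math

def delta_works(f, x0, L, eps, delta, samples=10_000):
    """Check |f(x) - L| < eps at sampled x with 0 < |x - x0| < delta."""
    return all(
        abs(f(x0 + side * delta * i / (samples + 1)) - L) < eps
        for i in range(1, samples + 1)
        for side in (+1, -1)
    )

# f(x) = x^2 at x0 = 2 with L = 4: for eps = 0.5, delta = 0.1 already works.
print(delta_works(lambda x: x * x, 2.0, 4.0, eps=0.5, delta=0.1))   # True

# floor(x) at x0 = 3 with candidate L = 3 and eps = 0.5: even a microscopic
# delta fails, since points just left of 3 output 2, a full unit below L.
print(delta_works(math.floor, 3.0, 3.0, eps=0.5, delta=1e-9))       # False
```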
Notice that this definition only works with finite quantities: we're never working with an "infinitesimal" smaller than everything else in its vicinity, and this firmly rids us of "dx" and "dy" being treated as actual numbers (they're rigorously defined as differentials instead, which have several formal definitions that I may cover in a future post once I can wrap my head around them). Today, notation such as Leibniz's \( \frac{dy}{dx} \) no longer carries any formal arithmetical meaning: it's an artifact of an earlier period of math.
The Limit Derivative
To finish off, let's revisit the infinitesimal derivative and see if we can formalize it with our new theory of limits. As expected, this semantic change makes a pretty negligible— some may even say "infinitesimal"— difference to our intuitive understanding. Starting from our initial idea of finding a tiny change in y given a change in x, we replace the infinitesimals dx and dy with well-defined finite differences of inputs and outputs respectively, taken between a variable point x and the point of interest h:
\( \frac{\Delta y}{\Delta x}=\frac{f(x)-f(h)}{x-h} \).
Then, by taking the limit of this expression using our established definition, we can finally write our formal definition of the derivative, no strings attached:
Let h be a real number such that, for every \( \epsilon >0 \), there exists a real number x where \( 0<|x-h|<\epsilon \).
If \( \lim_{x \to h}\frac{f(x)-f(h)}{x-h} \) exists, then f is differentiable at h, and the limit is denoted \( f'(h) \).
There we have it. This definition, built on the difference quotient, gives us a testable way, using the rigorously defined algebra of limits, to determine whether a given function is differentiable at any point. All the shortcuts, from the product rule to the power rule to even the quintessentially Leibnizian chain rule, have formal proofs through this limit, which in turn rests on the epsilon-delta definition.
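As a quick sanity check of the definition, here's a Python sketch watching the difference quotient settle down for \( f(x)=x^3 \) at \( h=2 \), where the power rule predicts \( f'(2)=3\cdot 2^2=12 \) (a numerical illustration, not a proof):

```python
def difference_quotient(f, h, x):
    """The quotient (f(x) - f(h)) / (x - h) from the limit definition."""
    return (f(x) - f(h)) / (x - h)

f = lambda t: t ** 3
h = 2.0
for gap in [1e-1, 1e-3, 1e-5]:
    q = difference_quotient(f, h, h + gap)
    print(f"x - h = {gap:.0e}: quotient = {q:.6f}")
# The quotients approach 12 as x closes in on h, matching the power rule.
```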
The Conditions of Continuity
At this point though, you may still be unconvinced. Sure, we've given our infinitesimals a fancy Greek-letter makeover, but how do these definitions succeed where our old system failed? Let's take another look at our problematic functions from before and see if we can come to any firmer conclusions about them.
The main problems with the floor function seemed to take place at the awkward jumps between points that made it confusing to define an "instantaneous rate of change." And as it turns out, we can actually do one better than just debunking \( \left\lfloor x \right\rfloor \): we can show that a function has to be continuous in order for it to be differentiable, effectively resolving the floor function and any other mapping with such jumps.
Defining continuity with limits is pretty simple: we don't want any breaks in our function. A break always comes down to one of two cases: the function not existing at a point, or behaving unlike the other points in its neighborhood. The former happens when the function itself is undefined there, and the latter occurs when the limit doesn't exist or doesn't match the actual value of the function at the point. Both cases are neatly ruled out by the equation below:
A function f is continuous at h if \( \lim_{x\to h}f(x)=f(h) \).
This statement firmly establishes three points: the limit exists at h, the function exists at h, and the two are equal.
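Those three points suggest an (admittedly crude) numerical test: probe the function just left and right of h and see whether both values land near \( f(h) \). A sketch, with the step and tolerance picked arbitrarily by me:

```python
import math

def looks_continuous_at(f, h, step=1e-8, tol=1e-6):
    """Heuristic check that values near h approach f(h) from both sides."""
    return abs(f(h - step) - f(h)) < tol and abs(f(h + step) - f(h)) < tol

print(looks_continuous_at(math.sin, 0.0))    # True: sin has no breaks
print(looks_continuous_at(math.floor, 3.0))  # False: left of 3 the floor is 2
```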
The Proof
Now that we have a definition established, let's see how we can pull it out of the limit derivative! The proof basically comes down to a clever bit of algebra, so I'll just write out the steps below:
\( \lim_{x\to h} \left(f(x)-f(h)\right)=\left(\lim_{x\to h}(x-h)\right)\left(\lim_{x\to h} \frac{f(x)-f(h)}{x-h}\right) \)
We expand our limit as a product of two other limits: one of the factor \( x-h \), and one of our familiar difference quotient. Splitting the limit of a product into a product of limits is allowed whenever both of those limits exist, a fact known as the product rule for limits.
\( \left(\lim_{x\to h}(x-h)\right)\left(\lim_{x\to h} \frac{f(x)-f(h)}{x-h}\right)=0\cdot f'(h)=0 \).
Each limit is simplified: the first reduces to \( h-h=0 \) as x approaches h, and the second to \( f'(h) \), which exists and is finite because we assume f to be differentiable at h. Keep in mind that every step here rests on that assumption of differentiability.
\( \lim_{x\to h} \left(f(x)-f(h)\right)=0 \).
Since we know our limit product is equal to both our original limit and 0, the transitive property applies.
\( \lim_{x\to h}f(x)-\lim_{x\to h}f(h)=0 \).
Here we break the limit of a difference into a difference of limits: another rule that holds whenever both resulting limits exist (and here the second one, a limit of a constant, certainly does).
\( \lim_{x\to h}f(x)=\lim_{x\to h}f(h) \).
This step is just algebraic rearrangement, moving the limit of \( f(h) \) to the other side.
\( \lim_{x\to h}f(x)=f(h) \).
Since the right-hand-side limit asks how the constant \( f(h) \) changes as x varies (it doesn't), it's just equal to that constant, and out pops our definition of continuity! In other words, we arrive at the fundamental result that differentiability implies continuity.
What about Weierstrass?
Finally, let's take a look at the spiky mess that is the Weierstrass function. Broadly speaking, the reason sharp corners cause non-differentiability is that the function behaves differently on the left and right sides, and changes its behavior abruptly rather than through the gradual change that the difference quotient demands. Mathematically, this translates to the difference quotient approaching different values from the left and right sides.
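The absolute value function at 0 is the cleanest place to watch this happen; here's a short Python sketch of the two one-sided quotients:

```python
f = abs      # sharp corner at x = 0
h = 0.0
for gap in [1e-1, 1e-3, 1e-5]:
    right = (f(h + gap) - f(h)) / gap       # slope from the right: +1
    left = (f(h - gap) - f(h)) / (-gap)     # slope from the left: -1
    print(f"gap = {gap:.0e}: right -> {right}, left -> {left}")
# The two sides disagree forever, so the two-sided limit (and with it f'(0))
# does not exist.
```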
So what does this say about Weierstrass's infinitely jagged construction? Because it's a fractal, it never flattens out into something a tangent line could approximate, no matter how far we zoom in: it's an example of a function that is continuous everywhere but differentiable nowhere, probably the most brutal reminder that the converse of a proven result isn't necessarily true (differentiability implies continuity, but continuity doesn't imply differentiability). The proof is too long to go into here, but I highly recommend taking a look here if you're comfortable with the idea of limits and want a demonstration of their power as a rigorous system: it looks daunting at first, but actually turns out to be surprisingly simple and beautiful, and the end result demonstrates exactly what we discussed here about the quotients behaving differently on each side of any given point. Unlike infinitesimals, the theory of limits is limitless.
Abandoning Infinitesimals?
By now, hopefully, you're convinced that limits were a necessary step toward creating a truly precise system of Calculus. But maybe you're also a little disappointed. Because let's be honest, infinitesimals are great. They're simple, they have infinite potential for visualizing the core concepts of Calculus, and the math you do with them just feels right.
Our familiarity with numbers makes the system of infinitesimals feel much more natural to us, but as it stood in the time of Newton and Leibniz, their logic didn't hold up to the scrutiny of formal analysis: that \( x^2 \) derivation yields a correct result, but because we can't rigorously define infinitesimals, it works only because it happens to coincide with the longwinded proof with epsilons and deltas. Still, there are many cases where thinking in tiny nudges of infinitesimal magnitude is incredibly useful and leads to correct conclusions, and it's for that reason that even today, more than a century after the formalization of Calculus through analysis, the use of infinitesimals continues. But how can we know when to trust them and when to tread with caution?
A Guide to Sidestepping Limits
Well, for starters, take these generalizations with a grain of salt, coming as they do from a fellow student still working through these classes. That said, as someone who went all the way through Calculus 1 and 2 without learning about limits, I can report that infinitesimals are generally safe when the functions involved are well-behaved.
Specifically, uniformly continuous, smooth (everywhere-differentiable) functions— including all the elementary functions used in single-variable calculus and general physics classes— will never cause any hold-ups when treated with infinitesimal logic. In fact, infinitesimals are used all the time in physics, where functions like Weierstrass's are few and far between, especially in solving differential equations through separation of variables and similar methods. Their use can be rigorously confirmed to yield correct solutions, and they're a whole lot quicker and easier to work with. And finally, no matter how much Calculus books stress the importance of limits, when it comes down to geometry, whether describing a new coordinate system or representing a unit of integration on a graph, they go right back to the tiny nudges that Calculus came from. The results they're showcasing have been formally verified, so they drop the formalities in favor of a clear explanation. In general, tread lightly and use them with discretion, and infinitesimals can still lend a hand throughout our journey with Calculus (I sure hope so, because I'm going to be using them a lot in future posts).
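To show the kind of manipulation I mean, here's the textbook separation-of-variables solution to exponential growth, \( \frac{dy}{dx}=ky \), treating dy and dx exactly the way Leibniz would:

\( \frac{dy}{dx}=ky \Rightarrow \frac{dy}{y}=k\,dx \Rightarrow \int \frac{dy}{y}=\int k\,dx \Rightarrow \ln|y|=kx+C \Rightarrow y=Ae^{kx} \)

(with \( A=\pm e^{C} \) absorbing the constant). Splitting \( \frac{dy}{dx} \) across the equals sign has no formal meaning under limit theory, yet the answer is provably correct: differentiate \( Ae^{kx} \) with the chain rule and you get \( ky \) back. The infinitesimal shorthand just gets there faster.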
A New Infinitesimal Calculus?
Finally, if you're still unsure about the use of infinitesimals as a formal system, look no further than non-standard analysis. Developed by Abraham Robinson in the 1960s, the system codifies the meaning of infinitesimals as a concept and rederives many results of real analysis, some in more efficient and intuitive forms. It's used most commonly in probability theory, economics, and— you guessed it— differential equations and mathematical physics to develop simpler models and make rigorous many leaps in logic. A detailed explanation of the necessary groundwork would span a whole series of posts and dive deep into the weeds of axiomatic logic and set theory, but a couple of basic distinctions it makes confirm the differences we stressed between infinitesimals and limits.
A number x is infinitesimal if and only if \( |x|<\frac{1}{n} \) for all positive integers n.
First off, this definition keeps our beloved intuition of infinitesimals as quantities that can be algebraically manipulated. However, through this definition and many other clauses, they are firmly separated from the real numbers and can't be used interchangeably with them.
Among other sticking points, all real numbers must satisfy the Archimedean property, which states (as a corollary) that for any two real numbers x and y with \( x>0 \), there must be a natural number n such that \( nx>y \). If we look at the definition of an infinitesimal, though, we can see that \( nx<1 \) for all positive integers n, so infinitesimals are relegated to a new class of numbers known as the hyperreals, which describe infinities and infinitesimals alike. As you can imagine, this makes the foundations of the theory considerably more abstract, but once the convolutions of logic are established, it yields the intuitive framework we know and love!
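For a hands-on taste of infinitesimal-style arithmetic made airtight, here's a Python sketch of dual numbers: pairs \( a+b\varepsilon \) where \( \varepsilon^2=0 \) by definition. To be clear, dual numbers are not Robinson's hyperreals (there's no transfer principle, and \( \varepsilon \) squares to exactly 0 rather than merely being smaller than every \( \frac{1}{n} \)), but they turn the "drop the \( dx^2 \) term" move into a literal law of algebra, and they power forward-mode automatic differentiation in modern software:

```python
class Dual:
    """Numbers a + b*eps with the defining rule eps * eps = 0."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b            # real part, infinitesimal coefficient

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a1 + b1*eps)(a2 + b2*eps) = a1*a2 + (a1*b2 + b1*a2)*eps + b1*b2*eps^2,
        # and the eps^2 term is discarded by definition, not by hand-waving.
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__

x = Dual(3.0, 1.0)     # x = 3 plus one unit of infinitesimal nudge
y = x * x              # y = x^2
print(y.a, y.b)        # 9.0 6.0 -> the eps coefficient is dy/dx = 2x
```

Notice how the derivative of \( x^2 \) at 3 falls out of pure arithmetic, no limits in sight: the same answer our square-area argument gave at the start.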
The point is, infinitesimals are still alive and well today, and whether you subscribe to the limit or infinitesimal mode of thinking about analysis, both are ultimately valuable for understanding the ways we use Calculus today. To demonstrate that and extend an olive branch to both camps, as we build the intricate web of Calculus in future posts, we'll make use of both models— not just to do math, but to understand it. See you then!