Functions

What is a function? It would be tempting to answer this question by writing a function. It probably involves a variable, a couple of exponents, some additions and products; maybe, to your heart’s content, top this off with a bevy of trigonometric functions, logarithms, and other exponentials. What do we get? Well, some kind of monster, surely. But is that a function? Not quite, as there are a bunch of ingredients missing. But let us pretend for a second, because this chimera would surely and rightfully be called a function by most. Nonetheless, do we get an archetypal function, something that is a good representation of what functions really are?

Undoubtedly, no. For what is a function? It is, in general, a mechanism (or a machine, a black box, a magic wand) which transforms something into something. You input some stuff into your function, and it will output some other stuff (which could be the same). Usual functions are given by a formula, like the monster described above:

\displaystyle f(x) = \frac{x^2 \ln (\sin(3x))}{e^{x^2} - 5 \tan x}.

In a typical calculus class, there would often be a follow-up question: “what is the domain of this function?” In other words, what is the set of x for which this formula is well-defined? Here, you want what is inside the logarithm to be strictly positive, and the denominator to be non-zero. Good luck getting an explicit description for that! In any case, the set of points where this formula makes sense is the following well-defined subset of \mathbb{R}:

\displaystyle D = \{ x \in \mathbb{R}, \; \sin(3x) > 0, \; e^{x^2} \neq 5 \tan x \}.

Now, we indeed have a true function: it takes any element of D, applies the given formula, and returns some real number.
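To make the role of D concrete, here is a minimal Python sketch of this function. The names `in_domain` and `f` are chosen here for illustration, and floating-point arithmetic only approximates the real conditions: in particular, the test `!=` cannot reliably detect the points where the denominator vanishes.

```python
import math

def in_domain(x: float) -> bool:
    """Approximate membership test for D (floating-point, so only a sketch)."""
    return math.sin(3 * x) > 0 and math.exp(x * x) != 5 * math.tan(x)

def f(x: float) -> float:
    """Apply the formula, but only to elements of D."""
    if not in_domain(x):
        raise ValueError(f"{x} is not in the domain D")
    numerator = x**2 * math.log(math.sin(3 * x))
    denominator = math.exp(x**2) - 5 * math.tan(x)
    return numerator / denominator
```

The point of the guard in `f` is that the formula alone is not the function: the function is the formula together with the set of inputs it accepts.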

Definition, kinda

In general, to define a function, we need three ingredients.

  1. What objects it can take as an input.
  2. What type of objects it outputs.
  3. What happens to the input objects.

The point is that all of these can be pretty much anything. In calculus, functions often input some real numbers, and output real numbers. In multivariable calculus, functions input a vector of real numbers, and output a real number or a vector of real numbers. In linear algebra, functions take vectors to vectors. In more advanced analysis, functions input functions and output functions. Computer functions take as input some keyboard strokes, and can output a whole bunch of things: text, images, sound, etc.

A common class of functions in computer science is “codes”: you enter a text, and the code turns it into some gibberish. This would define a function from the set of all texts to the set of all texts, by some well-defined procedure. How exactly does this procedure work? Hopefully, this is extremely complicated! Otherwise, this means that the code is easy to crack.

We are now ready to make a general, abstract, almost precise definition.

A function f is the data of

  1. a set X, called the domain;
  2. a set Y, called the codomain;
  3. a rule which assigns, to each element x of X, one unique element of Y, called the image of x and denoted by f(x).

We then say that f is a function from X to Y, and write f : X \to Y.

The only thing which is not quite precise is the meaning of “rule”. A perfectly rigorous definition is given at the end of this post, but sadly, it cannot really be called enlightening. In any case, think of such a rule as a procedure that unambiguously tells us how to transform something in X to something in Y. “Unambiguously” does not mean “explicitly”: f(x) does not need to be given by some formula (and more often than not, it is not).

Consider the function f : \{1,2,\dots \} \to \{1,2,\dots\}, which inputs a natural number n \geq 1 and outputs the n-th prime number. It is perfectly well-defined: for instance, f(1) = 2, f(2) = 3, f(3) = 5, \dots However, there is no known simple formula for f(n) that is practical to compute. If there were, pretty much all online security would be doomed!
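Even without a formula, the rule is an unambiguous procedure. Here is a minimal Python sketch; the name `nth_prime` and the trial-division method are choices made here for illustration, and this is by no means an efficient algorithm.

```python
import math

def nth_prime(n: int) -> int:
    """Return the n-th prime (n >= 1) by trial division: a procedure, not a formula."""
    if n < 1:
        raise ValueError("n must be a natural number with n >= 1")
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        # candidate is prime if no d with 2 <= d <= sqrt(candidate) divides it
        if all(candidate % d != 0 for d in range(2, math.isqrt(candidate) + 1)):
            count += 1
    return candidate
```

Every input n \geq 1 gets exactly one output, which is all that the definition asks for.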

Consider the function f : [0,1] \to \mathbb{R}, defined for x \in [0,1] by

\displaystyle f(x) = x^5 + x + 3.

Check that for every y \in [3,5], the equation f(x) = y has a unique solution x \in [0,1]. This defines a function g : [3,5] \to [0,1], by g(y) = “the unique solution x \in [0,1] to the equation f(x) = y”. It turns out that there is no explicit formula for this solution.
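The lack of a formula does not prevent us from computing g(y) to any accuracy we like. The sketch below uses bisection, which works because f is increasing on [0,1]; the tolerance `tol` and the function names are assumptions made for this illustration.

```python
def f(x: float) -> float:
    return x**5 + x + 3

def g(y: float, tol: float = 1e-12) -> float:
    """Approximate the unique x in [0, 1] such that f(x) = y, for y in [3, 5]."""
    if not 3 <= y <= 5:
        raise ValueError("y must lie in [3, 5]")
    lo, hi = 0.0, 1.0
    # f is increasing on [0, 1], so the solution always stays between lo and hi.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For example, `g(4)` returns a number x in [0,1] for which `f(x)` agrees with 4 up to rounding.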

There is a philosophical question hidden here: what does “explicit” even mean? The sentence “the only positive solution to x^2 = y” does not seem very explicit. But “x = \sqrt{y}” seems quite explicit. Is it “explicit” to write \ln 2, even though its decimal expansion is a mess? What about \pi?

Similarly, it is perfectly acceptable to describe a function using plain English, as long as it is clear and everyone understands without ambiguity. Using a formula often makes things much less clear, even for simple functions. For instance, a common type of continuous function is called “piecewise linear”: it is a function whose graph is a broken line, as below.

A continuous piecewise linear function.

Write a precise definition of a piecewise linear function defined on an interval [a,b].

Isn’t it more clear to just write a sentence in English and draw a figure?

Let us consider the Caesar cypher. It is a very simple code: take a word, and change each A into B, B into C, …, Y into Z, Z into A. For instance, “MATH” becomes “NBUI”. How would we describe this as a function? We first need a domain. It makes sense to choose the set of all possible words. But what is a word? The set of actual words seems quite complicated indeed: should we include “IRREGARDLESS”? “BAE”? “ANATIDAEPHOBIA”? “FGDSARWCOBFQPLM”? Well, why not? The cypher is just as easy to define for these “words”. So we might as well consider all words, and get a domain which is much easier to describe. But then what do we call a word? It could be any finite sequence of letters where, to simplify, a letter is an element of the alphabet \mathcal{A} = \{A,B,\dots,Z\}. A sequence of two letters (an ordered pair) is then an element of \mathcal{A}^2, according to standard notation, and a sequence of n letters is an element of \mathcal{A}^n. Finally, we want to take the domain to be

\displaystyle X = \bigcup_{n \in \mathbb{N}} \mathcal{A}^n.

After coding, we also get another word, so the codomain should be Y = X. What about the transformation rule? Well, we explained it at the beginning! Probably everyone can understand how this works, so this is good enough! Any formula that we could write would just be terrible and incomprehensible.
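A formula would be unwieldy, but the plain-English rule translates directly into a short procedure. Here is a minimal Python sketch, where `ALPHABET` and `caesar` are names chosen for this illustration.

```python
import string

ALPHABET = string.ascii_uppercase  # the alphabet {A, B, ..., Z}

def caesar(word: str) -> str:
    """Shift every letter by one position, wrapping Z around to A."""
    if any(letter not in ALPHABET for letter in word):
        raise ValueError("the input must be a finite sequence of letters A-Z")
    return "".join(ALPHABET[(ALPHABET.index(letter) + 1) % 26] for letter in word)

coded = caesar("MATH")  # "NBUI", as in the example above
```

Note that the check at the top plays the role of the domain: inputs that are not words over the alphabet are rejected rather than transformed.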

The BMP format stores an image in a computer as a table of pixels with r rows and c columns. The color of each pixel is given by three integers from 0 to 255, giving the intensity of red, green, and blue. If a function takes such an image as input, what would its domain be?

Many words of caution

The domain

The domain of a function is part of its data. It should be given along with all the other ingredients, i.e. the codomain and the “rule”. It is the set of values that we want to consider, not necessarily all those that we could consider. If our variable is time, we may want time to only move forward, and consider a domain [0,+ \infty). If it is an angle, we may want only angles in [0,\pi) or [0,2\pi). The choice of domain is extremely important, and can drastically change the properties of a function.

Define f : X \to \mathbb{R} with f(x) = \cos x for x \in X. For each n \in \mathbb{N} or n = + \infty, find a domain X for which the equation f(x) = 0 has exactly n solutions.

The other important property is that f(x) needs to be defined for all x \in X. In other words, the rule that assigns f(x) to x has to make sense for any x that we choose in X. And this is true however we define this “rule”: by a formula, implicitly, in plain English, etc.

Which of the following define a bona fide function? The codomain is implied.

  1. f(x) = \ln x for x \in X = [1,+ \infty).
  2. f(x) = \ln x for x \in X = [0,+ \infty).
  3. f(x) = e^x/(\ln x - 1) for x \in X = (0,+ \infty).
  4. f(x) = \ln (\sin x) for x \in X = [0.5,1].
  5. f(n) is the number of digits of n \in X = \mathbb{N}.
  6. f(p) is the temperature on December 31st, 1999, at some point p \in X, where X is the surface of the earth.
  7. f(n) is the n-th digit after the decimal point of \pi, for n \in \{1,2,\dots\}.
  8. f(x) is the birth date of x \in X, for X being all the people in the world.
  9. f(x) is the name of the first child of x \in X, for X being all the people in the world.

In calculus, it is customary to define the “domain” of a “function” as the set of points where a formula makes sense, as in our first example. It is more proper to call it the natural domain. What is actually given in such a question is not a function, but an expression or a formula. Once we have determined the natural domain X, we get a true function with domain X, codomain (usually) \mathbb{R}, and where f(x) is indeed well-defined for all x \in X by the formula given. It is worth noting that, more often than not, the natural domain is too large for all intents and purposes, and choosing a smaller domain is more suitable.

The codomain

The codomain is the type of object f(x) that each x \in X becomes after going through the machine f. It is also part of the data of the problem, but it is usually an obvious one. All the usual functions given by some formula output a real number, so the codomain is often chosen to be \mathbb{R}. In Example I.1 (the n-th prime function), we would naturally choose the codomain to be \mathbb{N}. But we could just as well choose it to be the set of prime numbers, or \mathbb{Z}, or \mathbb{R}. In any case, there is often a natural choice: the answer to the (often very simple) question “what type of object do we get after applying the function?”. This should not be confused with the much more difficult question “what are all the objects that we get after applying the function?”.

In Exercise II.2 (the first list of candidate functions above), natural choices of codomain would be as follows.

  • \mathbb{R} for #1, 4.
  • \mathbb{N} for #5.
  • \mathbb{R} for #6 if we measure in Celsius or Fahrenheit, or [0,+\infty) if we measure in kelvins. In Celsius, [-100,100] would also probably work.
  • \{0,1,\dots,9\} for #7; \mathbb{N} would work too.
  • The set of all calendar dates for #8.

In any case, only one thing needs to be true: any f(x) for x \in X must belong to the codomain Y. Therefore, we can enlarge the codomain as much as we want, but we cannot shrink it too much.

A word on vocabulary: we say that a function is from its domain to its codomain. Sometimes to is replaced by into. However, it does not mean that the function takes all the values in the codomain. Consider for instance f(x) = \sin x for x \in X = \mathbb{R}.

  • If we say f : X \to \mathbb{R}, we say that the sine outputs real values. We know this.
  • If we say f : X \to [-1,1], we say that the sine outputs real values between -1 and 1. This is more precise, but this is absolutely not a way to say that it takes all values between -1 and 1.

The rule

We already mentioned that the rule that assigns f(x) to x should be defined for all elements x \in X, and provide an element f(x) \in Y. However we define it, it should make sense for whatever x that we choose in X, and the result should be something that belongs to Y.

Now, what is important is that f(x) is defined unambiguously. In other words, there cannot be different choices. It is unique. If we let X = \mathbb{R}, and define f(x) to be “y \in \mathbb{R} such that y^2 = x”, then we do not have a function, for several reasons.

  • If x<0, then there is no such y. This is an issue with the definition of the domain.
  • If x>0, then there is such a y. But there is an ambiguity, as there are two different choices. This is an issue with the rule.

To solve the first issue, we instead define the domain to be X = [0,+\infty). To solve the second issue, we remove the ambiguity by demanding, arbitrarily but naturally, that the y in question should be nonnegative. This being done, we have indeed, for all x \in X, a number y which is defined in a unique way. We have defined the function “square root”. It has domain [0,+\infty), and we can choose the codomain to be [0,+\infty) or \mathbb{R} or anything in between.

Which of the following define a bona fide function? If it is not, explain precisely the issue. The codomain is implied.

  1. f(x) = y \in \mathbb{R} such that \ln y = x, for x \in X = \mathbb{R}.
  2. f(x) = y \in \mathbb{R} such that e^y = x, for x \in X = \mathbb{R}.
  3. f(x) = y \in \mathbb{R} such that \cos y = x, for x \in X = [-1,1].
  4. f(d) is the temperature at day d \in X, where X is the set of all days from 1 AD to 2000 AD.
  5. f(x) = 1 if x \in \mathbb{Q}, f(x) = 0 if x \notin \mathbb{Q}, for x \in X = \mathbb{R}.
  6. f(n) is the number of digits of n \in X = \mathbb{N}.
  7. f(x) is the name of x’s grandmother, where x \in X and X is the set of all the people in the world.

A matter of notation

Function and image

Take a function f from a domain X to a codomain Y. The following additional vocabulary is standard.

If x \in X and y = f(x) \in Y, then

  1. x is called the argument or input of the function;
  2. y is called the value or output of the function, or the image of x by f.

It is important to distinguish between the function itself and the image of an element. The first one is denoted by f: it has a domain, a codomain, and provides a procedure to turn some x \in X into some y \in Y. The second one is written f(x), and it is an element of Y. Therefore, one should say “the function f” 🙂 and not “the function f(x)” 😦 The latter is at best a poor choice of words, at worst an open door for mistakes.

Define X to be the set of all differentiable functions defined on [0,1], and Y the set of all functions on [0,1]. For a function f \in X, we can define \phi(f) = f' \in Y. This defines a function \phi from X to Y: it inputs a differentiable function, and outputs its derivative.

Take the function f defined by f(x) = 3x^2 + e^x for x \in [0,1]. Then \phi(f) = g, where g is the function defined by g(x) = 6x + e^x for all x \in [0,1]. It would be correct (but clumsy) to write that

\displaystyle \phi(f)(x) = 6x+e^x, \quad x \in [0,1].
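The distinction between \phi, \phi(f) and \phi(f)(x) is easy to see in code. In the Python sketch below, `phi` takes a function and returns a function; since exact differentiation is not available for arbitrary Python functions, the derivative is replaced by a central difference quotient with an illustrative step `h`, which is an assumption of this example and not part of the definition of \phi. Note also that near the endpoints the quotient evaluates f slightly outside [0,1].

```python
import math
from typing import Callable

RealFunction = Callable[[float], float]

def phi(f: RealFunction, h: float = 1e-6) -> RealFunction:
    """Input a function f, output (a numerical approximation of) the function f'."""
    def f_prime(x: float) -> float:
        # Central difference quotient; h is an illustrative step size.
        return (f(x + h) - f(x - h)) / (2 * h)
    return f_prime

def f(x: float) -> float:
    return 3 * x**2 + math.exp(x)

g = phi(f)      # g is a function: an element of Y
value = g(0.5)  # phi(f)(0.5) is a number, close to 6 * 0.5 + e^0.5
```

Here `phi` is the function, `phi(f)` is its value at f (itself a function), and `phi(f)(0.5)` is a real number.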

Usual functions are the trigonometric functions \sin, \cos, \tan, the exponential \exp, and the logarithm \ln. To say that the derivative of the sine is the cosine, the most compact way is to write \sin' = \cos. Other tempting options are the following.

  • \sin' x = \cos x for all x \in \mathbb{R} 🙂
  • \sin' x = \cos x 😐
  • (\sin x)' = \cos x 😦

The first choice is correct but cumbersome. The second one is imprecise. The third one is an aberration, and doom awaits the blasphemers.

In calculus, when there is no issue with the domain, it is often acceptable to define a function by just giving a formula, such as “The function f(x) = x^5+x^3 is strictly increasing”. Nonetheless, afterwards, we should still say “the function f”.

Sequences

A sequence is merely a function with domain \mathbb{N}. It is customary to use the following notation.

  • Instead of f, we write (u_n) or (u_n)_{n \in \mathbb{N}} (or sometimes merely u).
  • Instead of f(n) or u(n), we write u_n.

Therefore, we should say “the sequence (u_n)”. The notation u_n denotes the n-th term of the sequence, where n \in \mathbb{N} should be given.

As above, it is often acceptable to define a sequence by just giving a formula, such as “The sequence u_n = n(n+1)/2”. Nonetheless, afterwards, we should still say “the sequence (u_n)” or “the sequence u”.
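In code, the same distinction appears between the function object and its value at an index. This is a small Python sketch, with the name `u` and the list `first_terms` chosen for illustration.

```python
def u(n: int) -> int:
    """The n-th term u_n = n(n+1)/2 of the sequence (u_n)."""
    return n * (n + 1) // 2

first_terms = [u(n) for n in range(6)]  # u_0, u_1, ..., u_5
```

Here `u` stands for the whole sequence, while `u(3)` is a single term.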

Arrow and arrow

When we write that a function f goes from a domain X to a codomain Y, we often write f : X \to Y. In \LaTeX, this arrow is given by “\to” or by “\rightarrow”.

To specify how an element of X is transformed into an element of Y via f, we often write “f(x) = ... (some formula) for all x \in X”. It is necessary to add “for all x \in X”: otherwise, we are talking about some x that is not introduced. It is similar to saying “John aced his calculus exam” to someone who does not know who John is.

Now, this is quite unwieldy. To circumvent this, we can instead write “f : x \mapsto ... (some formula)”. Careful of the different arrow! In \LaTeX, this arrow is given by “\mapsto”. If you have used Python, it is the same idea as writing a lambda function. Therefore, we always have

\displaystyle f : x \mapsto f(x).
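For readers who know Python, here is what the comparison with a lambda function looks like, using the formula x^5 + x^3 from earlier as an example. Keep in mind that a lambda only records the rule: unlike the notation f : X \to Y, it says nothing about the domain or the codomain.

```python
f = lambda x: x**5 + x**3  # the rule x ↦ x^5 + x^3, bound to the name f
y = f(2)                   # the image of 2 under f
```

As in mathematics, `f` is the function and `f(2)` is a value.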

In general, the most economical way to fully define a function f from X to Y, where f(x) is given by some formula, is to write the following.

\begin{aligned} f : \; & X \to Y \\ & x \mapsto f(x).\end{aligned}

The function of Exercise 3 (the derivative map \phi) can be written

\begin{aligned} \phi : \; & X \to Y \\ & f \mapsto f'. \end{aligned}

Bonus: formal definition

In order to define a function f : X \to Y perfectly rigorously, we need to associate one and only one y \in Y to each x \in X. This can be done by giving the graph of the function, i.e. the set of points

\displaystyle G = \{ (x,f(x)), \; x \in X \}.

Formally, we can proceed as follows.

A function f is the data of

  1. a set X, called the domain;
  2. a set Y, called the codomain;
  3. a set G \subset X \times Y, such that, for any x \in X, there exists a unique y \in Y with (x,y) \in G. We then write y = f(x).

This encapsulates both important points discussed before.

  • each x \in X has an image y \in Y;
  • this image is unique, i.e. defined unambiguously.

Formally, a function is the domain + the codomain + the graph. In practice, naturally, we should think as we usually do, while bearing in mind the aforementioned points.
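For finite sets, the formal definition can be checked mechanically. The Python sketch below (the name `is_function_graph` and the sample sets are chosen for illustration) tests the two points above: every x has an image, and this image is unique.

```python
def is_function_graph(X: set, Y: set, G: set) -> bool:
    """Check that G, a finite set of pairs, is the graph of a function from X to Y."""
    if not all(x in X and y in Y for (x, y) in G):
        return False  # G must be a subset of X x Y
    # Every x in X must appear as first coordinate in exactly one pair of G.
    return all(sum(1 for (a, _) in G if a == x) == 1 for x in X)

X = {0, 1, 4}
Y = {-2, -1, 0, 1, 2}
ambiguous = {(0, 0), (1, 1), (1, -1), (4, 2), (4, -2)}  # "y such that y^2 = x"
square_root = {(0, 0), (1, 1), (4, 2)}                  # nonnegative choice only
```

The set `ambiguous` fails the test because 1 and 4 each have two candidate images, while `square_root` passes: this is the square root discussion from earlier, restricted to three points.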
