Road to PhD: 2009

Friday, December 11, 2009

On financial advisors

I have become interested in finance relatively recently (say less than two years). At first, it was obviously because of the potential big pay, big bonuses, cash-cow financial industry. I don't think it's a shameful thing to say. So many people around me are clearly driven by the potential monetary returns of their respective field of study. However, the more I think about it, the more the whole financial sector seems like a huge corrupt monopoly with almost unsurmountable barriers of entry. This sorta makes things even more interesting to me.

The first doubt I had was back in summer 2008. I worked for a financial advisor in hopes to learn about stock picking, portfolio building, analyzing companies, and the such. I believed back then that one could, if smart enough, consistently outperform the market just by doing research, or whatever people did. To my surprise, almost none of the financial advisors at the firm had an exclusive finance education. My boss was a former engineer, and his colleague, a former criminologist. Sometimes, I would see the criminologist play poker on PokerStar. I wasn't long before I connected the dots.

Being a financial advisor has nothing to do with building a good stock portfolio. Most of their revenue is generated from attracting new clients to the company (that is where I came in, cold calling random people), and a bit of fees from their existing bank of clients. They have absolutely no incentive to do any research to pick good stocks (if they could actually do that). Briefly put, a financial advisor does not necessarily have more knowledge in how to outperform the market. They are mostly salesmen, with a four-month education of finance based on the Canadian Securities Course.

Now I have to say, to the financial advisor's defense, that they are not completely useless. In fact, most people have no clue about how things work in the financial industry (foreshadowing a future topic). Also, people's intuition on probabilities and exponentials is usually very weak. They do not know much about balancing risk and returns. I have seen people accepting to have a near zero rate of interest on their savings during my time at the firm, due the the sole fact that their investment was guaranteed. I have also seen people take huge bets on options, without seeing the incredibly leveraged potential losses.

Hence, a good financial advisor's job is to truly find his client's goals and risk tolerances, and find an appropriate balance of investment vehicles with his knowledge of risk management. They should also educate the client; but that's a bit much to ask. Obviously this is not your everyday financial advisor. They get paid in commissions when they sell products from the firm, and collect fees on transactions. When one say they want to put all their money in this, this and that stock; for the advisor, it's kind of hard to say no. Clients are still clients. And clients will go elsewhere if they are not satisfied. It is very difficult for a good advisor to show that a portfolio appreciation does not automatically validates the actions one took to get there. In an uneducated client's mind, or even anybody's mind, it is unnatural to frown on excessive risk-taking when given spectacular returns. These are the many problems an advisor must face.

[DM]

Sunday, December 6, 2009

Discrete Fourier Transform, intro to

I have been reluctant on writing a post about the discrete Fourier transform (DFT) for a while because I don't feel comfortable with all the details. In fact, a few weeks ago was the first time I saw the DFT in an academic setting. But even then, the introduction given in CS371 was a high-level overview of the theory. So here it goes. Reader discretion is advised.

It can be shown that a "nice" function $f(x)$ (the true meaning of "nice" is beyond the scope of my current knowledge) defined on the interval $x \in [a,b]$ can be approximated by a Fourier series:

$$g(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty}\bigg[a_k\cos(k\frac{2\pi x}{b-a})+b_k\sin(k\frac{2\pi x}{b-a})\bigg]$$

with

$$a_k = \frac{2}{b-a}\int_a^bf(x)\cos(k\frac{2\pi x}{b-a})dx$$

$$b_k = \frac{2}{b-a}\int_a^bf(x)\sin(k\frac{2\pi x}{b-a})dx$$

Just like the Taylor series, this result is quite miraculous if not more. The way I see it is that the set of all continuous functions $u: D \subseteq \mathbb{R} \mapsto \mathbb{R}$ form an infinite dimensional vector space. Then, we define the distance between two functions $u(x)$ and $v(x)$ to be $\sqrt{\int_a^b (u(x)-v(x))^2dx}$. Now that we have a notion of distance, we can take look at the set $B = \{1, \sin(2\pi x), \cos(2\pi x), \sin(4\pi x), \cos{4\pi x}, \cdots\}$ and grind it through Gram-Schmidt to get an orthonormal set of vectors. It's easy to see that the elements were linearly independent, and we could have started with an arbitrary finite number of them. Now, we just proceed to project our initial function $f(x)$ onto the subspace generated by $B$. The result is $g(x)$ the Fourier series of $f(x)$.

We can now interpret the coefficients of $g(x)$ as "how much of certain frequency of oscillation does the initial function contains". Now to finally define the Fourier transform, a FT is an operator $\mathcal{F}$ which takes in a "nice" function $f(x)$ and spits out two functions $a_n$ and $b_n$ defined on the non-negative integers representing the coefficients of the sines and cos.

In practice, one main problem. We never have continuous data in the real world. To deal with that, we simply discretize the integral into a $N$ part Riemann sum, where $N$ is the number of "equally distanced" data points we have, depending on the context. This is where the "discrete" part of the DFT comes in. Usually, when we perform a DFT, we don't care much about the higher frequency terms because they are usually small.

Now one will never see the Fourier transform under this form because it's so ugly to write out. We bring in Euler's Identity to simplify a few things with complex numbers. We can write:

$$g(x) = \sum_{k=-\infty}^{\infty}c_ke^{2\pi ikx}$$

$$c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(x)e^{-2\pi ikx}dx$$

It is easy, but tedious to work out how the coefficients $c_k$ are related to $a_k$, $b_k$. One thing is for sure, the coefficients with the same $k$ correspond to the same frequencies.

Finally, we discretize and cut off high frequencies. Assume we get a time-series $f_k$ for $k = 1\cdots N$, we apply the DFT:

$$F_k = \frac{1}{N}\sum_{n=N/2+1}^{N/2}f_ne^{\frac{-2\pi ikn}{N}}$$

We go a bit further to make things look nice and substitute $\omega = e^{\frac{2\pi i}{N}}$. So,

$$F_k = \frac{1}{N}\sum_{n=N/2+1}^{N/2}f_n\omega^{-kn}$$

Before I end the post, I'd like to give one application of the DFT. The reason we want to perform the DFT is to distinguish between various frequencies by looking at the corresponding coefficients.

One application of this is filtering out noise from a signal. Usually, noise is of much higher frequency than the signal we wish to process. A simple procedure would be to take the DFT of this signal, multiply the coefficients corresponding to "high" frequencies by zero (the meaning of high depends on the context), and take the inverse DFT (I haven't talked about the inverse DFT, but this post is getting long). This will yield a nice signal devoid of most of the unwanted noise.

Other applications of the DFT will be subject of future posts (and maybe a description of how to perform the DFT efficiently, an algorithm called the Fast Fourier Transform).

Friday, November 6, 2009

Generating Functions and Regular Languages (i)

Combinatorics is, relatively speaking, a new field of mathematics. And as with any new field of study, it still remains very unorganized. How unorganized? There are many many topics, subtopics, side-topics, stand-alone problems and ungeneralized results laying everywhere, just like my room. One can witness this very early in combinatorics. Many of the math competition problems are about counting, and every problem seems to require a clever trickery in order to solve. For this reason many mathematicians tend to frown on combinatorics as simply some random collection of brainteasers, and not a "deep" field to study. Nonetheless, I think combinatorics is a promising field. Timothy Gowers, Fields medalist, is an example of a supporter of combinatorics. In fact, his award-winning research made surprising and insightful links between problems in combinatorics and other fields of mathematics.

The topic of generating functions of sets is an example of combinatorics boiling down many counting problems. Before talking about generating functions, let's look at a motivation from computer science. There are many more motivations, but this one I think fits best my knowledge at the moment.

In computer science, one pillar of research is dedicated to formal languages (automata theory). It is clearly an important topic since it studies the representation of information and of machines that computes with it as input. This field answers questions such as computability, and gives a solid theoretical framework for actually creating good computer languages along with their compilers and parsers.

Among the many classes of languages, one is called a "regular language".

For example, $\Sigma = \{0, 1\}$ is an alphabet. $101101$ and $\epsilon$ are words over that alphabet, and $L = \{101101, \epsilon\}$ can be a language. We allow languages with infinitely many words such as $L = \{\epsilon, 0, 1, 00, 01, 10, 11, 100, 101, \cdots\}$, which is basically every single possible word from the alphabet. Notice if we took out the words with leadings zeros except the first zero, we get the language of binary numbers.

Definition: A regular language $L$ is a set of words $w$ which are finite strings of symbols over a finite alphabet $\Sigma$. A word can be the empty word, which we denote $\epsilon$. A language can contain uniquely the empty word $L = \{\epsilon\}$, or can contain nothing, the empty language $L = \emptyset$.

A language $L$ can be formed from other languages say $L_a$ and $L_b$ from three rules.

Union: $L = L_a \cup L_b$, where $L = \{a | a \in L_a$or$a \in L_b\}$.

Concatenation: $L = L_aL_b$, where $L = \{ab | a \in L_a, b \in L_b\}$

Repetition: $L = \{L_a\}^{\star}$, where $L = \{\epsilon \cup L_a \cup L_aL_a \cup \cdots\}$

The base cases are sets containing one element in the alphabet.

For example, $\{0, 1\}^{\star}$ is the set of all binary strings (0's and 1's, whereas $\{0, 11\}^{\star}$ is the set of all binary strings with even length blocks of 1's.

[to be cont'd...]

Saturday, October 31, 2009

Karatsuba Multiplication

When we were kids, teachers taught us how to multiply. My generation probably was taught the standard multiplication algorithm. That is, take the first digit, multiply out the second number's digits one by one with carry, then repeat for every digit of the first number, and remembering to shift correctly.

On a side note, the "newer" younger math generation are sometimes taught weird things. The logic behind it is giving a chance for the kids to develop a sense of "why" of how the multiplication algorithm works, and make it easier for more kids to understand. I think it's retarded. Kids don't give a shit why this works, except young Gauss and other great ones. Anyways...

We are so familiar with multiplication, having done it for years and years before having access to the calculator at school, that we don't really think about: is there a faster way? What if we had two numbers with a 1024 digits each?

Multiplying it out with the standard algorithm, we would have to carry $1024^2$ multiplications, and then another $1024^2$ digit additions.

So can you think of a better way to multiply? Well, Komolgorov, a prominent mathematician, a pioneer of computational complexity, analysis and probability theory, conjectured (yes conjectured, because he was so convinced) in 1952 that there is no way of doing considerably better than that simple algorithm (meaning better than $\mathcal{O}(n^2)$ digit operations). In 1960, he announced his conjecture in a seminar on computational complexity, and a few days later, a student by the name of Karatsuba came up with an algorithm that was strictly better than standard multiplication. Needless to say, it was quite an embarrassing event for Komolgorov.

Curiously enough, this algorithm is extremely simple. The idea goes as follows:

Given a multiplication $a\cdot b$, we can rewrite $a = a_110^m + a_0$ and $b = b_110^m + b_0$ where we choose an $m$ that will cut the digits of $a$ and $b$ in the middle more or less. Now we notice that $a\cdot b = a_1b_110^{2m} + (a_1b_0 + a_0b_1)10^m + a_0b_0$. Now we still haven't done anything new. This is still what standard multiplication is doing. Karatsuba noticed that we can find $(a_1b_0 + a_0b_1)$ in not 2, but 1 multiplication step. The trick, $(a_1b_0 + a_0b_1) = (a_1+a_0)(b_1+b_0) - a_1b_1 - a_0b_0$. Since we already computed $a_1b_1$ and $a_0b_0$, we only have to multiply out the first term.

Ok, how is this so much better than standard multiplication? Informally, if we recursively apply this trick to compute $(a_1+a_0)(b_1+b_0)$, $a_1b_1$ and $a_0b_0$, we get a "multiplication tree" of depth $\mathcal{O}(\log(n))$ with each node branching into 3 subnodes. Standard multiplication has an extra multiplication step, so each node branches into 4 subnodes. This extra subnode obviously compounds rapidly at every layer.

Returning to our first example, multiplying two numbers with 1024 digits with this method takes $3^{10} = 59049$ digit multiplications instead of $1024^2 = 4^{10} = 1048576$.

What about the extra addition steps? Well, you can convince yourself that the addition steps from having more multiplication steps will blow up anyways.

Conclusion: Karatsuba Multiplication takes order $\mathcal{O}(n^{\log_23})$ digit multiplications, and should be used to multiply numbers with more than around 200 digits.

For more recent developments, algorithms have been found to do even better than Karatsuba's, but they involve much heavier mathematical artillery such as Fast Fourier Transforms.

Happy Halloween!

[DM]

Wednesday, October 28, 2009

First math post: Interpolation

This post really is to test out LaTeX formatting, but I found interpolation somewhat slightly interesting.

So, the motivation for interpolation goes: How do you fit a n-degree polynomial to a set of n+1 data points?... Ok, it's not very motivating... I actually can't think of a reason to fit a n-degree polynomial to anything... except the Taylor fitting, and the Least-Square fit, but those are different. Anyways.

Proposition: Given a set of $n+1$ data points$(x_i, f(x_i))$, such that$x_i \neq x_j$ for $i \neq j$there exists a unique polynomial $p(x) = \sum_{i=0}^n a_ix^i$ of degree $n$ such that $p(x_i) = f(x_i)$ for $i = 0\ldots n$.

Proof: Exercise for the reader.

So how do you find that polynomial? There is the simple straightforward way of doing it by writing out a set of n+1 equations, and solving it by some variation of Gaussian Elimination. In matrix form...

$$\begin{pmatrix}1&x_0&x_0^2&\cdots&x_0^n\\1&x_1&x_1^2&\cdots&x_1^n\\1&x_2&x_2^2&\cdots&x_2^n\\\vdots&\vdots&\vdots&\ddots&\vdots\\1&x_n&x_n^2&\cdots&x_n^n\end{pmatrix}\begin{pmatrix}a_0\\a_1\\a_2\\\vdots\\a_n\end{pmatrix}=\begin{pmatrix}f(x_0)\\f(x_1)\\f(x_2)\\\vdots\\f(x_n)\end{pmatrix}$$

But solving such system with Gaussian Elimination is quite expensive. In fact, it takes order $\mathcal{O}(n^3)$ number of operations.

Fortunately, there is a clever way around solving such a big system to get our polynomial.

The trick is something called the Lagrange polynomial. Consider the i-th Lagrange basis polynomial:

$$l_i(x) = \prod_{k=0, k\neq i}^n \frac{(x-x_k)}{(x_i - x_k)}$$

Clearly, $l_i(x_j) = \delta_{ij}$. So now, all is left is to correctly sum the basis polynomials.

$$L(x) = \sum_{i=0}^n f(x_i)l_i(x) = p(x)$$

The sum of a n-th degree polynomial is also of degree n, and there is a unique polynomial of degree n that interpolates the data points. So since this Lagrange polynomial satisfies $L(x_i) = f(x_i)$ for $i = 0\ldots n$, it must be the only one.

At first, it seems that computing the Lagrange polynomial also requires $\mathcal{O}(n^3)$ operations, but there are simple tricks one may employ to reduce the number of needed operations to $\mathcal{O}(n^2)$. I leave it for the reader to see that this is the case, and see how much cheaper it is to find the interpolant this way instead of naive Gaussian Elimination.

[DM]

Monday, October 26, 2009

First Step

This blog will hopefully follow my academic endeavors for the years to come. I heard that it is good practice to write down things that one might learn along a journey, since reading back might bring further advances. So, now, there is nothing else but to lay out my brief current status, no background, just a snapshot, like a corporate balance sheet.

I am currently enrolled in my 4th undergraduate semester (out of 8 study semesters) at the University of Waterloo, studying Mathematics. My current interests lie in some intersection of mathematics, finance, and computational mathematics. Of course this will probably evolve over time, as it radically did over the past three years. But that is another story.

My current major plan is Mathematical Finance with a Computational Mathematics minor. I chose this because it is a great alternative to Applied Math, which focuses more on the mathematics of physics. I am also enrolled in Waterloo's co-operative work program, but that will be an issue dealt with later...

The plan is to graduate in Winter 2012, but there's a slim chance I decide to cram the final study term earlier, ditch co-op, and graduate in Spring 2011. I might ditch co-op anyways... and find stuff on my own.

Current Courses (Fall '09)

MATH 247 Calculus 3 (adv)

MATH 239 Intro to Combinatorics

CS 371 Intro to Computational Mathematics

CS 241 Foundations to Sequential Programs

STAT 231 Statistics

Spring '09

Worked full-time as Junior Economist at Industry Canada

ECON 201 Microeconomic Theory (U. of Guelph)

SOA-FM Financial Mathematics (exam)

Winter '09

MATH 235 Linear Algebra 2

CS 136 Elementary Algorithm/Data design

ACTSC 231 Mathematics of Finance

ECON 140 Intro to Macroeconomics (Wilfrid-Laurier U.)

BUS 121 Functional Areas of Organization (Wilfrid-Laurier U.)

SOA-P Probability (exam)

Fall '09

MATH 145 Algebra (adv)

STAT 230 Probability

CS 135 Designing Functional Programs

ECON 120 Intro to Microeconomics (Wilfrid-Laurier U.)

BUS 111 Intro to Business Organization (Wilfrid-Laurier U.)

Transfered Credits (from Quebec CEGEP)

MATH 136 Linear Algebra 1

MATH 137 Calculus 1

MATH 138 Calculus 2

SCI XXX Science credit

I don't feel posting my marks would be of much benefit to this blog. If some readers would like to know them for a reason (such as thinking about graduate school), I might provide them.

Following posts will be more interesting. Promise.

[DM]

Road to PhD

Friday, December 11, 2009

On financial advisors

Sunday, December 6, 2009

Discrete Fourier Transform, intro to

Friday, November 6, 2009

Generating Functions and Regular Languages (i)

Saturday, October 31, 2009

Karatsuba Multiplication

Wednesday, October 28, 2009

First math post: Interpolation

Monday, October 26, 2009

First Step

Blog Archive