oneminusp.com Computational Finance, Markets, Programming & co

11Mar/100

Empirical probabilities Entropy function

Just added this simple but often useful entropy function which can be run on any data set with multiple occurrences of symbols. Say your data is the vector

c(1,1,1,2,2,2,3,3,3)

then

entropy.count

calculates the Shannon entropy using empirical/maximum likelihood probabilities for all unique symbols 1,2,3.

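# Shannon entropy (in bits) of the empirical distribution of the symbols in entry.
# Uses the entropy() function from the post "Calculating Entropy the Functional Way".
entropy.count <- function(entry) {
	# number of occurrences of each unique symbol
	counts <- unlist(lapply(split(entry, as.factor(entry)), length))
	# empirical (maximum likelihood) probabilities
	ps <- counts / sum(counts)
	entropy(ps)
}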
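
As a quick check, the example vector from above can be run through a self-contained variant of the function. This is only a minimal sketch: it swaps the split/lapply counting for base R's table() and inlines a plug-in entropy in bits, so it is not necessarily identical to the downloadable code.

```r
# Plug-in Shannon entropy in bits; zero probabilities contribute nothing.
entropy <- function(ps) {
	-sum(ifelse(ps > 0, ps * log2(ps), 0))
}

# Empirical (maximum likelihood) probabilities via table(), then entropy.
entropy.count <- function(entry) {
	counts <- table(entry)
	ps <- as.vector(counts) / sum(counts)
	entropy(ps)
}

h.uniform <- entropy.count(c(1, 1, 1, 2, 2, 2, 3, 3, 3))  # three equally frequent symbols
h.constant <- entropy.count(c(7, 7, 7, 7))                 # a single symbol
```

Three equally frequent symbols give log2(3), about 1.585 bits, while a constant sequence gives 0 bits.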

The code is now also available through the "Papers / Code" tab above.

18Feb/100

Open Source Information theory frameworks

I found two frameworks (for languages I'm interested in) which provide a range of entropy estimators and other information theoretical measures:

For python there is pyentropy

For R there is the entropy package by Hausser and Strimmer.

If you know of others, please let me know.

Also, I will soon add a page to the blog where you can download all the source code presented in the posts.

13Feb/100

Entropy estimators and predictability

In previous posts I discussed the local uncertainty h_n^{(1)} and the block entropy H_n. We also saw a rapid decrease in the estimated uncertainty as n grows, which is due to sampling errors. With larger n, our empirical probability estimate n_i / n gets worse because more samples are needed to "fill up the histogram", i.e. some ngrams are missing entirely and those that were seen have poor probability estimates.

There is a vast number of papers and techniques on reducing the bias and variance of entropy estimates, and I decided to write a few posts about this, with the aim of finding the best entropy estimators for our (local) uncertainty measure. With a suitable entropy estimator we will be able to analyse local predictabilities conditioned on a larger number of previous symbols with higher significance.

The estimator we used so far is called "plug-in" or maximum likelihood estimator and is defined as

\hat{H}(X) = - \sum_{x \in X} \hat{P}(x) \log \hat{P}(x)

where \hat{P}(x) = n_x / n, i.e. the number of occurrences n_x of the word x divided by the total number n of observed words. It is well known that the maximum likelihood estimator is negatively biased. What does that mean?
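
One way to get a feel for the negative bias is a small simulation. The sketch below is not taken from the posts: it assumes a uniform source over 8 symbols (true entropy 3 bits), draws many short samples of 20 symbols each, and averages the plug-in estimates. The name plugin.entropy and all parameter values are chosen purely for illustration.

```r
# Plug-in (maximum likelihood) entropy estimate in bits.
# table() only lists observed symbols, so all probabilities are positive.
plugin.entropy <- function(x) {
	ps <- as.vector(table(x)) / length(x)
	-sum(ps * log2(ps))
}

set.seed(1)
true.H <- log2(8)  # uniform distribution over 8 symbols

# 2000 independent samples of only 20 symbols each (assumed values)
estimates <- replicate(2000, plugin.entropy(sample(1:8, 20, replace = TRUE)))

# Average deviation of the estimator from the true entropy
bias <- mean(estimates) - true.H
```

No sample can contain more than 8 distinct symbols, and 20 draws cannot be spread perfectly evenly over 8 symbols, so every single estimate lies below 3 bits and the average deviation is negative. In other words, the estimator systematically reports less uncertainty than is really there.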

29Jan/100

Local order and predictability: Significance testing

The two previous posts described an implementation of a paper about finding local order (return patterns with higher than average predictability of the next symbol) in financial time series.

One important unanswered question so far is about the significance of the local uncertainties h_n(A_1 \dots A_n). Does a deviation from almost no order ( > 0.99) really mean something, or is it due to imprecision/undersampling of the empirical probabilities? As the original paper notes, the larger the value we choose for n, i.e. the more previous trading days we consider to predict the next one, the more ngrams are possible and therefore the more samples we need to approximate the probabilities p^{(n)} more or less accurately.

There are two ways to go:

  • As in the original paper, use empirical probabilities and the basic plug-in entropy estimator and restrict n to at most 5, as their significance level K dictates (more on that below)
  • Experiment with larger n, including more sophisticated probability and entropy estimators

We will do both. But for now I'll concentrate on the significance level K as introduced in the paper. A so-called surrogate sequence of length n is generated from the partitioned time series. These surrogates have the same mean and standard deviation as the original sequence; you could see them as a random shuffling of the sequence with some further rules. The local uncertainties of the surrogates are called h_n^S(A_1 \dots A_n). The significance level K is then calculated as:

K_n(A_1 \dots A_n) = \vert \frac{h_n(A_1 \dots A_n) - \langle h_n^S(A_1 \dots A_n) \rangle}{\sigma_{h_n^S}}\vert
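
Given the local uncertainty of a word and the corresponding values from a set of surrogates, the formula is a one-liner in R. The following is only a sketch: the function name significance.K and the numbers are hypothetical, and it assumes that the sample standard deviation sd() is an acceptable stand-in for \sigma_{h_n^S}. It does not show how the surrogates themselves are generated.

```r
# K = | h - <h^S> | / sd(h^S) for one word A_1 ... A_n
significance.K <- function(h, h.surrogates) {
	abs(h - mean(h.surrogates)) / sd(h.surrogates)
}

# Hypothetical numbers, for illustration only
h.observed <- 0.90
h.surrogates <- c(0.98, 0.99, 1.00)

K <- significance.K(h.observed, h.surrogates)
```

With these made-up values the surrogate mean is 0.99 and the standard deviation 0.01, so the observed uncertainty lies 9 standard deviations away from what the surrogates produce.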

26Jan/100

Local order and predictability – Implementation

Part 1 discussed a paper on local order and predictability of time series. I will now describe the implementation of the described functions in R.

First we assume that we already have our real returns data partitioned into symbols A_t \in \{0,1,2\}, so the number of symbols \lambda is 3. Thus our time series is just a vector of the values 0, 1 and 2.

Next, all our functions will consider trajectories A_1 \dots A_n of that original vector. I will implement this as a sliding window of length n. So if our sequence is 012020120 and n is 3, the function slide will create the array 012, 120, 202, 020, 201, 012, 120 out of it.

slide <- function(seq, windowsize) {
	# number of windows of length windowsize that fit into seq
	steps <- length(seq) - windowsize + 1
	accu <- array(0, dim = c(steps, windowsize))
	for (i in seq_len(steps)) {
		# row i holds the window starting at position i
		accu[i, ] <- seq[i:(i + windowsize - 1)]
	}
	return(accu)
}
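
The same windows can also be produced without an explicit loop. The sketch below is an alternative using base R's embed(), not the implementation used in the rest of the series; it reproduces the example sequence from the text and counts how often each word occurs.

```r
x <- c(0, 1, 2, 0, 2, 0, 1, 2, 0)

# embed() returns every window of length 3 with its columns in reverse
# time order, so the columns are flipped back with [, 3:1]
windows <- embed(x, 3)[, 3:1]

# Collapse each row into a word such as "012" and count the words
words <- apply(windows, 1, paste, collapse = "")
counts <- table(words)
```

The 9 symbols yield 7 windows, with the words 012 and 120 occurring twice each.
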
26Jan/100

Local order and predictability of financial time series

In this series of posts I will discuss an implementation and tests of the paper Local order, entropy and predictability of financial time series by L. Molgedey and W. Ebeling. (pdf)

The paper presents an excellent application of information theory to time series analysis. The idea is simple: is it possible to find sub-trajectories in financial time series (here the daily returns of some indices or stocks) where a "local order" exists with higher than average predictability?

I won't explain the paper in full, so please have a look at the pdf above for notation and details. However, I will describe the most important concepts below. We consider one-dimensional, discretely partitioned time series. The authors use the Shannon entropy H as the basic tool to measure the uncertainty or predictability of the probability distribution described by the time series. For a certain trajectory of length n the uncertainty of predicting the next state is the difference in Shannon entropies for trajectories of length n+1 and n:

h_n = H_{n+1} - H_n
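
To make the definition concrete, here is a minimal sketch that estimates H_n with empirical word frequencies and then takes the difference. The function names block.entropy and cond.entropy and the two test sequences are assumptions made for illustration; they do not come from the paper.

```r
# Block entropy H_n in bits: plug-in entropy of the empirical distribution
# of all overlapping words of length n in the symbol sequence.
block.entropy <- function(symbols, n) {
	starts <- 1:(length(symbols) - n + 1)
	words <- sapply(starts, function(i) paste(symbols[i:(i + n - 1)], collapse = ""))
	ps <- as.vector(table(words)) / length(words)
	-sum(ps * log2(ps))
}

# Uncertainty of the next symbol: h_n = H_{n+1} - H_n
cond.entropy <- function(symbols, n) {
	block.entropy(symbols, n + 1) - block.entropy(symbols, n)
}

# A strictly periodic sequence: the next symbol is determined by the current one
periodic <- rep(c(0, 1, 2), 100)
h.periodic <- cond.entropy(periodic, 1)

# Independent uniform symbols: the past does not help at all
set.seed(42)
noise <- sample(0:2, 3000, replace = TRUE)
h.noise <- cond.entropy(noise, 1)
```

For the periodic sequence h_1 comes out very close to 0, while for the random sequence it stays near the maximum of log2(3), about 1.585 bits.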

22Jan/100

Calculating Entropy the Functional Way

Previously, I wrote a short article on how to implement fold left in R. It was fairly obvious that there must be a builtin function for it in R. At the time, I just assumed it would be "reduce" or would not exist; however, the proper function name is "Reduce" with a capital R. As a side note, I do not really understand the naming scheme of functions in the R base library.

So here's the fairly obvious way to calculate Shannon's entropy in R using Reduce:

> fentropy <- function(x,y) { x + (-y * log2(y)) }
> Reduce(fentropy, c(0.5,0.5), 0)
[1] 1
> Reduce(fentropy, c(0.25,0.25,0.25,0.25), 0)
[1] 2

First for the binary case with answer 1, and then for four values uniformly distributed.

Last but not least, we could also write an entropy function the "R way", using R's vectorized functions:

entropy <- function(ps) {
	# terms with zero probability contribute 0 to the sum
	H <- -sum(ifelse(ps > 0, ps * log2(ps), 0))
	return(H)
}
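
The ifelse guard matters as soon as a probability is exactly zero. The short sketch below compares the guarded function with a hypothetical unguarded variant, here called naive, which is not part of the post.

```r
# Guarded version: zero probabilities contribute 0
entropy <- function(ps) {
	-sum(ifelse(ps > 0, ps * log2(ps), 0))
}

# Unguarded version, for comparison only
naive <- function(ps) {
	-sum(ps * log2(ps))
}

entropy(c(0.5, 0.5, 0))  # 1 bit
naive(c(0.5, 0.5, 0))    # NaN, because 0 * log2(0) is 0 * -Inf
```
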
17Jan/100

Information Theory and Financial Markets

I would like to discuss and implement ideas from papers applying information theoretical (IT) notions to trading in financial markets. I will provide links to all papers I read on this topic and describe certain concepts in more detail.

The current list is:

Uncertainty analysis in financial markets: can entropy be a solution?, Andreia Dionísio, Rui Menezes and Diana A. Mendes (pdf)

Forecasting Foreign Exchange Market Movements via Entropy Coding, Arman Glodjo, Campbell R. Harvey (pdf)

Local order, entropy and predictability of financial time series, L. Molgedey and W. Ebeling (pdf)

These three papers all use Shannon entropy in place of more traditional statistical measures. What is interesting, however, is that all of them apply entropy in different ways.
The first paper by Dionísio compares entropy as a measure of uncertainty with variance/standard deviation in portfolio management.
The second paper by Glodjo applies techniques from coding theory (the original and most successful application of IT) to forecasting high frequency time series. It also provides good arguments for using IT in finance.
The last paper by Molgedey uses conditional entropy directly on returns time series to quantify "local order" in highly stochastic time series. A local order would be a point in time where the next step is more predictable than average.

I will have a look at some of those techniques in more detail and might implement some of them to see if I can replicate the authors' results.

10Jan/101

Understanding Biotech Companies

Biotechnology companies that develop new medicines are not quite like other companies. Understanding how their product development process works is crucial for investing profitably.

Taking a medical product from zero to market takes a long time and is a risky undertaking. For a new discovery, both safety and effectiveness have to be demonstrated over time. Only then may the regulatory authorities grant permission to sell the medicine on the market. This research process, with its multiple phases, is called a pipeline.

10Jan/100

fold left in R

Left and right folds are frequently used higher-order functions in functional programming.

A left fold [foldleft f accu l] applies the function f to the accumulator variable accu and the head of the list l. The result is the new accumulator, which is used in the next recursive call together with the tail of the list.

A left fold is easily implemented in R as follows (a right fold works analogously):

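foldleft <- function(f, accu, l) {
	if (length(l) == 0) {
		# nothing left to fold: the accumulator is the result
		accu
	} else {
		head <- l[1]
		tail <- l[-1]
		# combine accumulator and head, then recurse on the tail
		foldleft(f, f(accu, head), tail)
	}
}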
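
To see the accumulator evolve step by step, base R's builtin Reduce can print the intermediate values. This sketch uses Reduce rather than the foldleft function above; the subtraction example is only there to make the order of evaluation visible.

```r
# Running accumulator of a left fold with + over 1..4, starting from 0
steps <- Reduce(`+`, 1:4, 0, accumulate = TRUE)

# A non-commutative function shows the left-to-right order: ((0 - 1) - 2) - 3
left <- Reduce(`-`, 1:3, 0)

# The right fold instead computes 1 - (2 - (3 - 0))
right <- Reduce(`-`, 1:3, 0, right = TRUE)
```

Here steps is 0 1 3 6 10, left is -6 and right is 2.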

To see how it works, we could apply it to calculate the variance of a fair die. Remember the variance is just Var(X) = E[(X-\mu)^2] where \mu is the mean, which is implemented in the following function f:

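# mean of a fair die (named mu to avoid masking the builtin mean function)
mu <- sum(1:6) / 6

# add the contribution of outcome i, which has probability 1/6
f <- function(accu, i) {
	accu + (1/6 * (i - mu)^2)
}

foldleft(f, 0, c(1, 2, 3, 4, 5, 6))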

where our last call to foldleft evaluates to 35/12, or approximately 2.9167.
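
As a cross-check, the same variance can be computed without any fold, using R's vectorized arithmetic. This is a small sketch with illustrative variable names, not part of the fold implementation.

```r
x <- 1:6            # outcomes of a fair die
p <- rep(1/6, 6)    # probability of each outcome

mu <- sum(p * x)                  # E[X] = 3.5
variance <- sum(p * (x - mu)^2)   # E[(X - mu)^2] = 35/12
```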