Optimum Design of Experiments

Author Archive

I have added several helper functions to work with ini-files.
In particular, I wanted to describe piecewise continous control functions by providing a time grid

[0,1,2,3,4,5]

and a list of values of the control function at these points

[4,3,5,3,1,5]

Doing this by hand is quite time consuming, since VPLAN
describes a control function by an offset and a slope.
So if one wants to have a continuous function, one has to calculate the slope such that the pieces fit together.

Being a programmer, I did that only once, then got bored and decided to write a helper function that does this work for me.

import vplan.ini
s = vplan.ini.generate_u_description("MyControlFunction",
                                      1, 3, -10, 10,
                                     [0,1,2,3,4,5],
                                     [4,3,5,3,1,5],
                                     [0]*6, [10]*6,
                                     [-2]*6, [2]*6)
print s

This code generates the code

[Steuerfunktionen]
u1=MyControlFunction 3 -10 10
u1tAnzahl=5
u1t0=t0
u1t1q=4 0 10 -1 -2 2
u1t1=1
u1t2q=3 0 10 2 -2 2
u1t2=2
u1t3q=5 0 10 -2 -2 2
u1t3=3
u1t4q=3 0 10 -2 -2 2
u1t4=4
u1t5q=1 0 10 4 -2 2
u1t5=tend

By typing

vplan.ini.generate_u_description?

in an ipython shell, one obtains a full description

    s = generate_u_description(name, number, mode, u_lo, u_up,                                                                                                                                                                                                                 
                               grid,q, q_lo, q_up, s_lo, s_up                                                                                                                                                                                                                  
                           )                                                                                                                                                                                                                                                   
                                                                                                                                                                                                                                                                               
    Create ini-file description of a control function of the form                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                                                               
    Parameters                                                                                                                                                                                                                                                                 
    ----------                                                                                                                                                                                                                                                                 
                                                                                                                                                                                                                                                                               
    name:   string                                                                                                                                                                                                                                                             
            name of the control function                                                                                                                                                                                                                                       
                                                                                                                                                                                                                                                                               
    number: int                                                                                                                                                                                                                                                                
            number of the control function (counting from 1)                                                                                                                                                                                                                   
                                                                                                                                                                                                                                                                               
    mode:   int
            type of control function
            0 piecewise constant
            1 piecewise constant (continuous)
            2 piecewise linear
            3 piecewise linear (continuous)

    grid:   at which points of time does the control function switch?
            the grid should contain t0 and tend

    q:      array_like
            control variables describing the values of the control function

            if mode is 0 or 1:
                len(q) == len(grid)-1

            elif mode is 2 or 3
                len(q) == 2*(len(grid)-1)
                In contrast to the implementation in the ini-file,
                the two q values are the _value_ of u, _not_ the value
                and slope

    u_lo:   float
            lower bound for u

    u_up:   float
            upper bound for u

    q_lo:   array_like or float
            lower bounds for q
            len(q_lo) == len(grid)-1

            if a float is provided, q_lo is the bound for all entries of q

    q_up:  array_like or float
            lower bounds for q
            len(q_lo) == len(grid)-1

            if a float is provided, q_up is the bound for all entries of q

    s_lo:   array_like
            lower bounds for the slopes

    s_up:   array_like
            upper bounds for the slopes

    Returns
    -------
    s:      string of the form
            u1=u 3 20 100
            u1tAnzahl=3
            u1t0=t0
            u1t1q=80 20 100 0 0 0
            u1t1=2
            u1t2q=70 20 100 0.0 -1e+10 1e+10
            u1t2=8
            u1t3q=70 20 100 0 0 0
            u1t3=tend



    Description
    -----------

    There three types of constraints for mode 2 and 3 function u on [t0,t1]

    1:  q_lo and q_up at t0
    2:  q_lo and q_up at t1
    3:  s_lo and s_up on [t0,t1]

    Howerver, VPLAN does not support this. It is only possible to specify

    1:  q_lo and q_up at t0
    3:  s_lo and s_up on [t0,t1]

    the constraint

    2:  q_lo and q_up at t1

    has to be modelled otherwise, e.g. by setting u_lo and u_up

So I wondered in which programming language I should write my next project.There
are two cool sites that compare programming languages:

The source code for different tasks: http://rosettacode.org/wiki/Generic_swap
The speed/memory consumption/compile time: http://shootout.alioth.debian.org/gp4/benchmark.php?test=allāŒ©=all

For small projects I’d always use Python and extend it with C++ using Boost:Python (which works exceptionally well once you get to know it, but takes ages to figure out). However, I think it is better to program the whole core functionality in a statically typed language such that it can easily wrapped and used for example in Python.

The D programming language looks really nice and it also shows up quite high on the benchmark (on linux on second place). So I wondered how it performs speedwise to C++. So I copied some example from the web, the bubble sort problem and implemented it four times:

  1. c++
  2. generic c++
  3. D
  4. generic D

The code is below.

Turns out that the C++ code is approximately twice as fast as the D code which
disqualifies D for me at the moment. Also, there is practically no speed gain
for the templated implementation in D.

So for the comparison of C++ and templated C++ code I obtained.
Its hard to read, but on the y-axis the relative speedgain of the templated
version is plotted. So if your array is longer than 16, the templated version
is even slower than the normal. I actually expected some kind of monotonic
decay. Couldnt figure out why there is a max around array length = 8.

/*****************************************
A comparison between different implementations
of the Bubble sort algorithms. The question is: How much faster is the algorithm with template programming.
Examples inspired by http://ubiety.uwaterloo.ca/~tveldhui/papers/Template-Metaprograms/meta-art.html written by Todd Veldhuizen
*****************************************/

/*****************************************
OUTPUT:
> ./bubblesort
0.027000,       0.030000,       0.128000,       0.016000,       0.016000
*****************************************


#include
#include
#include <SDL.h>
#include <sys/time.h>

using namespace std;

/* isn't actually used, we use the SDL timer instead,
it gives the same results */

int mtime(void)
{
    timeval tv;
    gettimeofday(&tv,NULL);
    return (int)(tv.tv_sec*1000 + (tv.tv_usec / 1000));
}

void print_vec(vector &v, string description)
{
    printf("%s\t",description.c_str());
    printf("[ ");
    for(int i = 0; i< v.size(); ++i)
    {
        printf("%d ",v[i]);
    }
        printf("]\n");
}

inline void swap(int &a, int &b)
{
    int tmp = a; a = b; b = tmp;
}


/* straight forward implementation of Bubblesort */
void bubbleSort1(int *data, int N)
{
    for(int i = N-1; i > 0; --i)
    {
        for(int j = 0; j<i; ++j)
        {
            if(data[j] > data[j+1])
            {
                swap(data[j], data[j+1]);
            }
        }
    }
}

/* recursive Bubble sort, one for loop replaced by recursion */
void bubbleSort2(int *data, int N)
{
    for(int j = 0; j < N-1; ++j)
    {
        if(data[j] > data[j+1])
            swap(data[j], data[j+1]);
    }
    if (N>2)
        bubbleSort2(data, N-1);
}

/* recursive recursive Bubble sort, both loops replaced by recursion */
void bubbleSortLoop3(int *data, int N)
{
    if (N>1)
        bubbleSortLoop3(data, N-1);
    if(data[N-1] > data[N])
        swap(data[N-1],data[N]);
}

void bubbleSort3(int *data, int N)
{
    bubbleSortLoop3(data, N-1);
    if (N>2)
        bubbleSort3(data, N-1);
}

/* template metaprogramming Bubble sort */
template class tswap
{
public:
    static inline void compareAndSwap(int* data)
    {
        if (data[I] > data[J])
            swap(data[I], data[J]);
    }
};

template class tbubbleSortLoop
{
public:
    static inline void loop(int *data)
    {
        tbubbleSortLoop::loop(data);
        tswap<N-1,N>::compareAndSwap(data);
    }
};

template<>
class tbubbleSortLoop
{
public:
    static inline void loop(int *data)
    {
        tswap<0,1>::compareAndSwap(data);
    }
};

template
class tbubbleSort
{
public:
    static inline void sort(int *data)
    {
        tbubbleSortLoop::loop(data);
        tbubbleSort::sort(data);
    }
};

template<>
class tbubbleSort
{
public:
    static inline void sort(int *data)
    {
    tbubbleSortLoop::loop(data);
    }
};

/* Todd Veldhuizens reference implementation */

template
class IntSwap
{
public:
    static inline void compareAndSwap(int* data)
    {
        if (data[I] > data[J])
            swap(data[I], data[J]);
    }
};

template
class IntBubbleSortLoop
{
private:
enum { go = (J <= I-2) };

public:
    static inline void loop(int* data)
    {
        IntSwap<J,J+1>::compareAndSwap(data);
        IntBubbleSortLoop::loop(data);
    }
};

template<>
class IntBubbleSortLoop<0,0>
{
public:
    static inline void loop(int*)
    { }
};

template
class IntBubbleSort
{
public:
    static inline void sort(int* data)
    {
        IntBubbleSortLoop<N-1,0>::loop(data);
        IntBubbleSort::sort(data);
    }
};

template<>
class IntBubbleSort
{
public:
    static inline void sort(int* data)
    { }
};

int main(int argc, char* argv[])
{
    /*generate array */
    const int N = 4; /* array length */
    const int NUMBER_OF_SORTS = 2000000;
    Uint32 start,end; /* for timing */

    vector original_data(N*NUMBER_OF_SORTS);
    for(int j = 0; j < NUMBER_OF_SORTS; ++j)
    {
        for(int i = 0; i < N ; ++i){
            original_data[i+N*j] = N-i;
        }
    }
    vector data(original_data);

    /* bubbleSort1 */
    data = original_data;
    start = SDL_GetTicks();
    for(int i = 0; i < NUMBER_OF_SORTS; ++i)
    {
        bubbleSort1(&data[i*N],N);
    }
    end = SDL_GetTicks();
    printf("%f,\t",(end-start)/1000.);

    /* bubbleSort2 */
    data = original_data;
    start = SDL_GetTicks();
    for(int i = 0; i < NUMBER_OF_SORTS; ++i)
    {
        bubbleSort2(&data[i*N],N);
    }
    end = SDL_GetTicks();
    printf("%f,\t",(end-start)/1000.);

    /* bubbleSort3 */
    data = original_data;
    start = SDL_GetTicks();
    for(int i = 0; i < NUMBER_OF_SORTS; ++i)
    {
        bubbleSort3(&data[i*N],N);
    }
    end = SDL_GetTicks();

    printf("%f,\t",(end-start)/1000.);

    /* templated bubbleSort */
    data = original_data;
    start = SDL_GetTicks();
    for(int i = 0; i < NUMBER_OF_SORTS; ++i)
    {
        tbubbleSort::sort(&data[i*N]);
    }
    end = SDL_GetTicks();
    printf("%f,\t",(end-start)/1000.);

    /* templated bubbleSort, Todd Veldhuizens reference implementation */
    data = original_data;
    start = SDL_GetTicks();
    for(int i = 0; i < NUMBER_OF_SORTS; ++i)
    {
        IntBubbleSort::sort(&data[i*N]);
    }
    end = SDL_GetTicks();
    printf("%f\n",(end-start)/1000.);

    return 0;
}

/**************************************
OUTPUT:
> ./D_bubblesort
straight forward implementation
0.058
recursive implementation
0.071
templated implementation
0.062
testing static array filling
[ 0 1 2 3 4 ]

**************************************/

//compile with dmd -O -inline -release D_bubblesort.d
version(Win32){
    /*
    * GetTickCount() wraps after 47 days or so, but this does
    * not matter, as we use unsigned for creating the difference.
    */
    private import std.c.windows.windows;
    static int
    diffmtime(inout DWORD old){

        int   Result;
        DWORD now = GetTickCount();
        Result = now - old;
        old = now;
        return Result;
    }
}

version(linux){
    /*
    * On linux, we could run with higher resulution,
    * but we don't
    */
    private import std.c.linux.linux;
    static int
    diffmtime(inout timeval old){

        int Result;
        timeval now;
        gettimeofday(&now,null);
        /* Convert to msec */
        Result = (now.tv_sec-old.tv_sec)*1000
        +(now.tv_usec-old.tv_usec)/1000 ;
        old = now;
        return Result;
    }
}

import std.stdio;

void print_vec(in int v[], string description="")
{
    writef("[ ");
    for(int i = 0; i< v.length; ++i)
    {
        writef("%d ",v[i]);
    }
    writef("]\n");
}


/*    NORMAL IMPLEMENTATION    */
void swap(int *a, int *b)
{
    int tmp = *a; *a = *b; *b = tmp;
}

/* straight forward implementation of Bubblesort */
void bubbleSort1(int *v, int N)
{
    for(int i = N; i > 0; --i)
    {
    for(int j = 0; j<i; ++j)
    {
    if(v[j] > v[j+1])
        swap(&v[j], &v[j+1]);
    }
    }
}

/* recursive implementation of Bubblesort */
void bubbleSortLoop2(int *data, int N)
{
    if (N>1) bubbleSortLoop2(data,N-1);
    if(data[N-1] > data[N])
        swap(&data[N-1], &data[N]);
}

void bubbleSort2(int *data, int N)
{
    bubbleSortLoop2(data, N-1);
    if (N>2)
        bubbleSort2(data,N-1);
}

/*    TEMPLATED IMPLEMENTATION OF ARRAY INITIALIZATION    */
template a(int N)
{
    static void fill(int *data){
        static if(N>=0)
        {
            a!(N-1).fill(data);
            data[N] = N;
        }
    }
}

/*     TEMPLATED IMPLEMENTATION OF BUBBLESORT    */

template tswap(int I, int J)
{
    static void perform(int *data){
        if(data[I] > data[J])
        {
            swap(&data[I],&data[J]);
        }
    }
}

template tbubbleSortLoop(int N)
{
    static void perform(int *data)
    {
        static if(N>1) tbubbleSortLoop!(N-1).perform(data);
        tswap!(N-1,N).perform(data);
    }
}

template tbubbleSort(int N)
{
    static void perform(int *data)
    {
        tbubbleSortLoop!(N-1).perform(data);
        static if(N>2) tbubbleSort!(N-1).perform(data);
    }
}

class Foo
{
    template TBar(T)
    {
        static T yy;                // Ok
        static int func(T t, int y)
        {
            return 0;  // Ok
        }    
    }
}

template Factorial(ulong n)
{
    static if( n <= 1 )
        const Factorial = 1;
    else
        const Factorial = n * Factorial!(n-1);
}

int main()
{
    const int N = 4;

    /* array length */
    const int NUMBER_OF_SORTS = 2000000;
    timeval t;

    /* fill some array */
    int[] original_data;
    original_data.length = N*NUMBER_OF_SORTS;
    for(int j = 0; j < NUMBER_OF_SORTS; ++j)
    {
        for(int i = 0; i < N ; ++i)
        {
            original_data[i+N*j] = N-i;
        }
    }

    writefln("straight forward implementation");
    int[] data = original_data.dup;
    gettimeofday(&t,null);
    for(int i = 0; i < NUMBER_OF_SORTS; ++i)
    {
        bubbleSort1(&data[i*N],N);
    }
    writefln(diffmtime(t)/1000.);

    writefln("recursive implementation");
    data = original_data.dup;
    gettimeofday(&t,null);
    for(int i = 0; i < NUMBER_OF_SORTS; ++i)
    {
        bubbleSort2(&data[i*N],N);
    }
    writefln(diffmtime(t)/1000.);

    writefln("templated implementation");
    data = original_data.dup;
    gettimeofday(&t,null);
    for(int i = 0; i < NUMBER_OF_SORTS; ++i)
    {
        tbubbleSort!(N).perform(&data[i*N]);
    }
    writefln(diffmtime(t)/1000.);

    writefln("testing static array filling");
    int[] lala;
    lala.length=5;
    a!(5).fill(&lala[0]);
    print_vec(lala);


    return 0;
}

What is Optimal Design of Experiments?

In a nutshell: Optimal Design of Experiments (ODOE) is a variance reduction technique to reduce the variance of estimators.
The variance of the standard error { \rm SE}_{ \bar x} of the mean \bar x of a random variable x with standard error {\rm SE}_x scales as
{\rm SE}_{\bar x} = \frac{ {\rm SE}_x}{\sqrt{N_m}}
where N_m is the number of repeated measurements.

History

It is often believed that Galileo Galilei’s merit was his insistance on the heliocentric world view. However, this is a persistent misbelief. Rather, Galilei was the first scientist who verified his mathematical description of reality by experiment. Not without reason Galilei is called the father of modern science.

According to a well-known legend he dropped objects of varying weight from the leaning tower of pisa to verify that the gravitational acceleration is independent of the weight. But why did Galilei supposedly climb the leaning tower of pisa and not his kitchen table? It is quite likely that he has tried that. However, it is not so easy to find out which object hits the floor first: the measurement has an error. If Galilei had known about error analysis he could have repeated this experiment many times. The theory predicts that to reduce the error of the estimated value by a factor 10 one has to do 100 times more measurements. One hundred, that seems reasonable, but to reduce the error by a factor 100 means 10000 measurements!!

Thus, Galilei had find a possibility to reduce the influence of his measurement error. His idea was to increase the falling time. It is verified historically that Galilei used a sloping plane. He measured the distance that cannon balls of varying weight pass in a certain time. This is basically the same idea as to climb the leaning tower. The important point is: Galilei optimized his experiment to be able to estimate the gravitational acceleration with as few experiments as possible even when the measurement devices at hand were very inaccurate (Galilei used his pulse to measure time).

Related Fields:

Active Learning (part of Machine Learning):
http://hunch.net/?p=49

In psychology:
http://www.phil.uni-sb.de/~jakobs/seminar/vpl/prolog.htm

In Chemistry:
http://staff.chemeng.lth.se/~BerntN



  • None
  • Isaac Gouy: Note the benchmarks game also shows full source code and build & run logs for all the programs, for example http://shootout.alioth.debian.org/g