## Life-Saving Unix Text-Processing Tools

This is the second post that I started writing about two years ago and never finished; I'm posting the incomplete version now.

If you are going to parse a log file or do some text processing, there are many ways to accomplish these simple tasks. But I think the following commands belong on any should-know list:

## jot and seq

jot is a tool for generating sequential or random numbers or characters, and it can help you with string formatting as well. seq can be used for similar purposes, but I usually prefer jot, which is much more powerful. Here are some example commands:

    jot 3

This will output the numbers from 1 to 3, one per line:

    1
    2
    3

    jot 21 -1 1.00

This command prints 21 evenly spaced numbers increasing from -1 to 1:

    -1.00
    -0.90
    -0.80
    -0.70
    -0.60
    -0.50
    -0.40
    -0.30
    -0.20
    -0.10
    -0.00
    0.10
    0.20
    0.30
    0.40
    0.50
    0.60
    0.70
    0.80
    0.90
    1.00

    jot -s '' -b '-' 50

will print 50 hyphens:

    --------------------------------------------------

The -s 'string' option sets the separator string; by default it is the newline '\n'.
The -b 'word' option prints 'word' repeatedly.

    jot -b yes 0

will print the word 'yes' indefinitely (like the yes command).

    jot -w tmp%d 10 1

prints the following strings:

    tmp1
    tmp2
    tmp3
    tmp4
    tmp5
    tmp6
    tmp7
    tmp8
    tmp9
    tmp10

The -w option prints word with the generated data appended to it. Octal, hexadecimal, exponential, ASCII, zero-padded, and right-adjusted representations are possible by using the appropriate printf(3) conversion specification inside word, in which case the data are inserted rather than appended.



## Free and Bound Variables

According to SICP, the definitions of bound and free variables are:

A bound variable:

A variable, V, is “bound in an expression”, E, if the meaning of E is unchanged by the uniform replacement of a variable, W, not occurring in E, for every occurrence of V in E.

And a free variable is,

A variable, V, is "free in an expression", E, if the meaning of E is changed by the uniform replacement of a variable, W, not occurring in E, for every occurrence of V in E.

In first order logic, a variable is said to occur free in a formula X if and only if it is not within the “scope” of a quantifier (existential or universal).

Basically, a free variable is a notation that specifies places in an expression where substitution may take place.
A variable-binding operator (such as a lambda or a quantifier) binds its variable; hence x becomes a bound variable.

Free and Bound variables, Wikipedia: http://en.wikipedia.org/wiki/Free_variables_and_bound_variables

Free and Bound Variables, Saarland: http://www.coli.uni-saarland.de/projects/milca/esslli/html/node8.html

First-Order Logic: bound variables, free variables http://cnx.org/content/m12081/latest/


## Philosophy and Computer Science

I've just found an interesting MIT course on Philosophy and Theoretical Computer Science, numbered 6.893 and taught by Scott Aaronson. The most attractive part of this course for me was the reading list on its web page. There are great articles in that list:

Scott Aaronson, NP-complete Problems and Physical Reality
Scott Aaronson, Is P Versus NP Formally Independent?
Scott Aaronson and John Watrous, Closed Timelike Curves Make Quantum and Classical Computing Equivalent
Ethan Bernstein and Umesh Vazirani, Quantum Complexity Theory
Nick Bostrom, Are You Living In A Computer Simulation?
David Chalmers, Does A Rock Implement Every Finite-State Automaton?
David Deutsch, Quantum Mechanics Near Closed Timelike Lines
David Deutsch, Quantum theory, the Church-Turing Principle and the universal quantum computer
Jack Edmonds, Paths, Trees, and Flowers (Section 2)
Hector Levesque, Is It Enough To Get The Behaviour Right?
Christos Papadimitriou, Algorithms, Games, and the Internet
Ian Parberry, Knowledge, Understanding, and Computational Complexity
Juergen Schmidhuber, A Computer Scientist’s View of Life, the Universe, and Everything
Stuart Shieber, The Turing Test As Interactive Proof
Michael Sipser, The History and Status of the P versus NP Question
Robert Stalnaker, The Problem of Logical Omniscience I
Alan Turing, On Computable Numbers, With An Application to the Entscheidungsproblem
Alan Turing, Computing Machinery and Intelligence
Leslie Valiant, A Theory of the Learnable
Leslie Valiant, Evolvability
Avi Wigderson, P, NP, and Mathematics – A Computational Complexity Perspective
Avi Wigderson, Knowledge, Creativity, and P versus NP
Ronald de Wolf, Philosophical Applications of Computational Learning Theory

## Bitwise Operators and Binary Tricks for System Programming

Bitwise operators are common to most programming languages and are inherited from C (e.g.: &, |, ^, >>, …). These operations are critical in low-level programming, for instance when writing drivers.

At the beginning of this post I'll briefly go over them and then present some tricks using them. As programming languages, I'll give examples in C and Python.

See the resources for more complete and detailed overview.

### Two most Fundamental Bitwise operators

Most programming languages have two kinds of these operators: logical ones and bitwise ones. For example, "bitwise AND" is written "&" while "logical AND" is written "&&". Similarly, "bitwise OR" is "|" and "logical OR" is "||".

#### AND operator

Bitwise AND (&) is very similar to logical AND, but it works on individual bits. The bitwise AND compares each bit of the first operand to the corresponding bit of the second operand. It yields 1 if and only if both bits are 1; otherwise it yields 0.

For example, 00 & 11 (in binary) is 00.

Let's fire up a Python shell and type 5 & 7; the result is 5. But how?

5 in binary is 0101 and 7 is 0111; ANDing each pair of bits gives 0101, which is 5.

We can use this notion for detecting if a bit is 0 or 1.

For example, given the bit pattern 0011 (decimal 3), we can determine whether the second bit is 1 by ANDing it with a bit pattern containing a 1 only in that position:

        0011 (decimal 3)
      & 0010 (decimal 2)
      ------------------
      = 0010 (decimal 2)

Because the result 0010 is non-zero, we know the second bit in the original pattern was 1. This process is usually called bit masking.
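Since this post promises Python examples, here is a minimal bit-masking helper (the function name `bit_is_set` is my own, not a standard API):

```python
# Check whether bit k (0-indexed from the least significant bit) is set,
# by ANDing with a mask that has a 1 only in that position.
def bit_is_set(value, k):
    return (value & (1 << k)) != 0

assert bit_is_set(0b0011, 1)       # the second-lowest bit of 3 is 1
assert not bit_is_set(0b0011, 2)   # the third-lowest bit of 3 is 0
```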

#### OR operator

A bitwise OR (|) takes two bit patterns of equal length and performs the logical inclusive OR operation on each pair of corresponding bits. For each pair, if either bit is 1, the corresponding result bit is set to 1. Otherwise, the corresponding result bit is set to 0.

For example, if you calculate 5 | 7 in Python you'll get 7, because 5 in binary is 0101 and 7 is 0111; ORing each pair of bits gives 0111, which is 7.

The bitwise OR may be used to set selected bits. One example is to set a specific bit (or flag) in a register where each bit represents an individual Boolean state. For example: 0010 (decimal 2) can be considered a set of four flags. The first, second, and fourth flags are clear (0), while the third flag is set (1). The first flag may be set by performing a bitwise OR between this value and a bit pattern in which the first flag is set:
        0010 (decimal 2)
      | 1000 (decimal 8)
      ------------------
      = 1010 (decimal 10)

This technique is an efficient way to deal with a number of Boolean values using the minimum of memory.
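A small Python sketch of the flag idiom described above (the flag names are hypothetical):

```python
# Set and test flags packed into one integer using OR and AND masks.
READ, WRITE, EXEC = 0b001, 0b010, 0b100

perms = 0
perms |= READ            # set the READ flag
perms |= EXEC            # set the EXEC flag
assert perms == 0b101
assert perms & READ      # READ is set
assert not perms & WRITE # WRITE is clear
```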

### Other important bitwise operators

#### NOT

The bitwise NOT (~), or complement, is a unary operation that performs logical negation on each bit, forming the ones’ complement of the given binary value. Digits which were 0 become 1, and vice versa. For example:

In a 4-bit representation, ~7 is 8: 7 in binary is 0111, and inverting each bit gives 1000, which is 8 when read as unsigned.

For signed numbers, the result of bitwise NOT is interpreted in two's complement.
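You can check this in Python, keeping in mind that Python integers are signed and unbounded, so the result of ~ is negative:

```python
# In Python, ~x is always -(x + 1) (two's complement semantics).
assert ~7 == -8
# To recover the 4-bit ones' complement shown above, mask to the desired width:
assert (~7) & 0b1111 == 0b1000   # == 8
```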

#### XOR

The term bitwise XOR (^) stands for "exclusive OR" and means "one or the other, but not both." In other words, XOR compares each pair of bits and returns 1 if and only if exactly one of them is 1. If both bits are 0, or both are 1, XOR returns 0. For example:
        0101 (decimal 5)
      ^ 0011 (decimal 3)
      ------------------
      = 0110 (decimal 6)
Assembly language programmers sometimes use the XOR operation as a short-cut to set the value of a register to zero. Performing XOR on a value against itself always yields zero, and on many architectures this operation requires fewer CPU clock cycles and/or fewer bytes (less precious cache-space) than loading a zero value and saving it to the register.

The bitwise XOR may be used to invert selected bits in a register (also called toggle or flip). Given the bit pattern: 0010 (decimal 2) the first and third bits may be toggled simultaneously by a bitwise XOR with a bit pattern containing 1 in the first and third positions:
        0010 (decimal 2)
      ^ 1010 (decimal 10)
      -------------------
      = 1000 (decimal 8)
This technique also may be used to manipulate bit patterns representing sets of Boolean states.
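The toggle behavior is easy to verify in Python; note that XOR-ing twice with the same mask restores the original value:

```python
# Toggle (flip) selected bits with XOR.
x = 0b0010
mask = 0b1010   # flip the bits in the first and third positions
x ^= mask
assert x == 0b1000
x ^= mask       # toggling again restores the original value
assert x == 0b0010
```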

### Bitwise shifts

There are two bitwise shift operators and they are shift left and shift right. In C, these are represented by the << and >> operators, respectively. These operations are very simple, and do exactly what they say: shift bits to the left or to the right. The syntax for a shift operation is as follows:
    [integer] [operator] [number of places]
A statement of this form shifts the bits in [integer] by the number of places indicated, in the direction specified by the operator.

#### Left Shift

Left shift moves the bits of a binary number n places to the left. A left arithmetic shift by n is equivalent to multiplying by 2^n (provided the value does not overflow). For example:

Fire up your python shell and run the following in the shell:
    3 << 2
The result of this operation is 12, because decimal 3 is 0011 in binary, and when you shift 0011 two bits to the left you get 1100, which is 12.

Another example is:
    x = 0000 1010 1000 0001;
    x = x << 4;
After shifting 4 bits, the new x will be:
    1010 1000 0001 0000
Each single shift moves every bit one place to the left. When you do this, the MSB (most significant bit, remember?) of x is lost, because there isn't another place to shift it into. Similarly, after a left shift, the LSB of x will always be 0: there is no position to the right of the LSB, so there's nothing to shift into it, and it's assigned a value of 0.

#### Right Shift

Similar to the left shift, the right shift moves the bits of a binary number n places to the right; for non-negative values this is equivalent to dividing by 2^n and discarding the remainder. An example:

    12 >> 2

The result is 3, because 12 in binary is 1100, and shifting it two bits to the right gives 0011, which is 3.

Another example is,
    x = 0110 1111 1001 0001;
    x = x >> 4;
and the x in the end will be 0000 0110 1111 1001. Here, the bits are being shifted right by four places.

### Some Hacks

#### Computing the modulus:

If the divisor is a power of 2, the modulo (%) operation can be done with:

    modulus = numerator & (divisor - 1);
    x = 131 % 4; // equals: x = 131 & (4 - 1);
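The same trick in Python, wrapped in a small helper of my own (Python's % already handles the general case; this only works for non-negative numerators and power-of-two divisors):

```python
# Modulo by a power of two via AND with (divisor - 1).
def mod_pow2(n, d):
    assert d > 0 and d & (d - 1) == 0, "divisor must be a power of two"
    return n & (d - 1)

assert mod_pow2(131, 4) == 131 % 4   # both are 3
```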

#### Absolute Value

Forget Math.abs() for time critical code. Version 1 is 2500% faster than Math.abs(), and the funky bitwise version 2 is again 20% faster than version 1.
    // version 1
    i = x < 0 ? -x : x;

    // version 2
    i = (x ^ (x >> 31)) - (x >> 31);

#### Swap integers without a temporary variable using XOR

That's a pretty neat trick and this is about 20% faster.
    // with a temporary variable:
    int a, b, t;
    t = a;
    a = b;
    b = t;

    // equals:
    a ^= b;
    b ^= a;
    a ^= b;
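In Python the idiomatic swap is `a, b = b, a`, but the XOR trick works on Python integers too and is easy to verify:

```python
# XOR swap without a temporary variable.
a, b = 5, 9
a ^= b
b ^= a
a ^= b
assert (a, b) == (9, 5)
```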

#### Multiplication

Here is a bitwise multiplication algorithm only using bitwise shifts and addition.

    #include <stdio.h>

    int multiply(int a, int b) {
        int c = 0;
        while (b != 0) {
            if ((b & 1) != 0)
                c = c + a;
            a = a << 1;
            b = b >> 1;
        }
        return c;
    }

    int main() {
        int a = 3, b = 8, c, d;
        c = multiply(a, b);
        d = a * b;
        if (c == d) {
            printf("Result is %d\n", c);
        } else {
            printf("Incorrect: %d\n", c);
        }
        return 0;
    }
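The same shift-and-add algorithm translates directly to Python (for non-negative b):

```python
# Shift-and-add multiplication: for each set bit of b,
# add the correspondingly shifted a into the accumulator.
def multiply(a, b):
    c = 0
    while b != 0:
        if b & 1:    # lowest bit of b is set
            c += a
        a <<= 1
        b >>= 1
    return c

assert multiply(3, 8) == 24
```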

#### Detect if two integers have opposite signs

    int x, y;               // input values to compare signs
    bool f = ((x ^ y) < 0); // true iff x and y have opposite signs
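The same test in Python; the XOR of two integers is negative exactly when their sign bits differ:

```python
# True iff x and y have opposite signs.
def opposite_signs(x, y):
    return (x ^ y) < 0

assert opposite_signs(-3, 7)
assert not opposite_signs(3, 7)
```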

#### Check if an integer is even/uneven using bitwise AND

    isEven = (x & 1) ^ ((-1 & 1) | ((x < 0) ? 0 : 1));

This example is taken from this SO post.
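The lowest bit of an integer is 0 exactly when it is even, so a much simpler check works in Python (and also for negatives, thanks to two's complement semantics):

```python
# Parity check with AND: the lowest bit is 0 for even numbers.
def is_even(x):
    return (x & 1) == 0

assert is_even(-4) and is_even(0)
assert not is_even(7)
```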

#### Compute the minimum (min) or maximum (max) of two integers without branching

This one is from Bit Twiddling Hacks page:

If you know that INT_MIN <= x - y <= INT_MAX, then you can use the following, which are faster because (x - y) only needs to be evaluated once.
    r = y + ((x - y) & ((x - y) >> (sizeof(int) * CHAR_BIT - 1))); // min(x, y)
    r = x - ((x - y) & ((x - y) >> (sizeof(int) * CHAR_BIT - 1))); // max(x, y)
Note that the 1989 ANSI C specification doesn't specify the result of signed right-shift, so these aren't portable. If exceptions are thrown on overflows, then the values of x and y should be unsigned or cast to unsigned for the subtractions to avoid unnecessarily throwing an exception, however the right-shift needs a signed operand to produce all one bits when negative, so cast to signed there.
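The same expressions happen to work in Python as well, because Python's right shift is arithmetic and -1 behaves as an all-ones mask; this sketch assumes |x - y| < 2^31:

```python
# Branchless min/max: (x - y) >> 31 is 0 when x >= y and -1 (all ones)
# when x < y, so the AND keeps either 0 or (x - y).
def bmin(x, y):
    return y + ((x - y) & ((x - y) >> 31))

def bmax(x, y):
    return x - ((x - y) & ((x - y) >> 31))

assert bmin(3, 7) == 3 and bmax(3, 7) == 7
assert bmin(-5, 2) == -5 and bmax(-5, 2) == 2
```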

#### Sign Flipping

Sign flipping using NOT or XOR:
    i = -i;
    // equals:
    i = ~i + 1;
    // or:
    i = (i ^ -1) + 1;

### Useful Resources

Bitwise Tricks

Bitwise Operations in C

The art of Programming Fasc 1

Bitwise gems – fast integer math

Bit Twiddling Hacks

Bitwise Operations

Reciprocal Multiplication, a tutorial

## PHP IPv6 address to Decimal Format Conversion

Storing IP addresses in a database as decimals has several advantages over storing them as strings (space, efficiency, processing, etc.). My goal was to be able to easily query whether an IP address is inside a block or between a given IP range. The following two functions come in handy for converting a string IPv6 address to and from a long decimal format. They require the PHP GMP extension to handle such long decimals.
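As an aside, the same round-trip conversion can be done with Python's stdlib ipaddress module, which treats an IPv6 address as a 128-bit integer:

```python
# IPv6 address <-> 128-bit integer round trip with the stdlib.
import ipaddress

addr = ipaddress.IPv6Address("fe80:0:0:0:202:b3ff:fe1e:8329")
n = int(addr)                              # decimal form, suitable for range queries
assert ipaddress.IPv6Address(n) == addr    # converting back recovers the address
```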

    <?php
    function ip2long6($ipv6) {
        $ip_n = inet_pton($ipv6);
        $bits = 15; // 16 octets
        $ipv6long = "";
        while ($bits >= 0) {
            $bin = sprintf("%08b", (ord($ip_n[$bits])));
            if ($ipv6long) {
                $ipv6long = $bin . $ipv6long;
            } else {
                $ipv6long = $bin;
            }
            $bits--;
        }
        return gmp_strval(gmp_init($ipv6long, 2), 10);
    }

    function long2ip6($ipv6long) {
        $bin = gmp_strval(gmp_init($ipv6long, 10), 2);
        if (strlen($bin) < 128) {
            $pad = 128 - strlen($bin);
            for ($i = 1; $i <= $pad; $i++) {
                $bin = "0" . $bin;
            }
        }
        $bits = 0;
        $ipv6 = "";
        while ($bits <= 7) {
            $bin_part = substr($bin, ($bits * 16), 16);
            if ($ipv6) {
                $ipv6 .= dechex(bindec($bin_part)) . ":";
            } else {
                $ipv6 = dechex(bindec($bin_part)) . ":";
            }
            $bits++;
        }
        return inet_ntop(inet_pton(substr($ipv6, 0, -1)));
    }

    $ipv6 = "fe80:0:0:0:202:b3ff:fe1e:8329";
    print $ipv6long = ip2long6($ipv6) . "\n";
    print $ipv6 = long2ip6($ipv6long) . "\n";
    ?>

## Traverse Files In the Source Folder for C++ Lint

Google has an awesome C++ coding style guideline, and they have done a very good job of creating a very handy C++ lint tool based on that guideline. I downloaded it, renamed the script to cpplint, chmod'ed it to executable, and moved it to /usr/bin to use it. (I also changed the shebang to /usr/bin/env python, because the default shebang uses python2.4.) But this script by itself cannot traverse the files in a folder recursively and output the errors to a file, so I had to write a basic bash script myself:

    #!/bin/bash
    if [ -n "$2" ]; then
        exec 2>$2
        if [ -d "$1" ]; then
            find $1 -type f -iname \*.cc -or -iname \*.h -exec sh -c "cpplint --filter=+build,+readability,+runtime,-whitespace --output=vs7 '{}'" \;
        fi
    else
        echo "Please enter a log file name to output the errors to and a path for the source folder to traverse for errors: ./check_cpplint [search folder] [log file]"
    fi

The first parameter of this script is the path of the folder your C++ source code is located in, and the second one is the file that the errors will be written to.

BTW, there is another great C++ lint tool called cppcheck.

## Learn you a haskell for Math! – Part 1

It has been some time since I last used Haskell (in fact, I don't use Haskell for my personal or work-related projects, because I still feel like such a novice at it), so I thought that implementing some basic mathematical functions might be a good exercise. I'm planning to write a three-part series: in the first part I'll implement some basic calculus operations such as the integral and derivative of a function. In the second part I'll implement a few numerical analysis algorithms like Newton-Raphson, Taylor series approximation, and the sieve of Eratosthenes. In the third and last part I'll present implementations of MCMC and Gaussian processes.

In Haskell there are often many ways to accomplish a task. As a layman Haskeller, my approaches might be naive.

To find the integral or derivative of a function, you can use either of the following two approaches:

1. You can approach the problem using an approximation technique.
2. You can solve it symbolically. Implementing a symbolic solver is harder.

### Integral of a Function

We'll try to find the definite integral of a function between lower and upper bounds; it is basically the area under the curve.

#### Riemann Sum Approach

The first approach I'll use is the Riemann sum, a method for approximating the total area underneath a curve on a graph, otherwise known as the integral of a function.

Formal definition (from Wolfram MathWorld):

Let a closed interval $$[a, b]$$ be partitioned by points $$a < x_1 < x_2 < \ldots < x_{n-1} < b$$, where the lengths of the resulting subintervals are denoted $$\Delta x_1, \ldots, \Delta x_n$$. Let $$x_k^*$$ be an arbitrary point in the $$k$$th subinterval. Then the quantity

$$\sum_{k=1}^{n} f(x_k^*) \, \Delta x_k$$

is called a Riemann sum for a given function $$f(x)$$ and partition, and the value $$\max \Delta x_k$$ is called the mesh size of the partition.

If the limit of the Riemann sums exists as $$\max \Delta x_k \to 0$$, this limit is known as the Riemann integral of $$f(x)$$ over the interval $$[a, b]$$.


The Haskell code for the Riemann sum is shown below. I have used tail recursion; to learn about tail recursion in Haskell, see this.

    epsilon_dx = 0.000001

    -- integrate function f from a to b
    integrate f a b = riemannSum f a b epsilon_dx 0

    -- using tail recursion
    -- integrate a function f from x to b
    riemannSum f x b dx sum =
        if x > b
            then sum
            else riemannSum f (x + dx) b dx (sum + (f x) * dx)

    f1 x = 1
    f2 x = x**2
    f3 x = x * exp(x**2)

The type of the integrate function is:

    *Main> :t integrate
    integrate :: (Double -> Double) -> Double -> Double -> Double

A basic test via ghci:

    *Main> integrate f2 0 1
    0.33333283333399866

function f2 is $$f(x) = x^2$$

and the integral we are trying to calculate is:
$$\int^b_a f_2(x) dx = \int^1_0 x^2 dx= 0.33333…$$

so the real and approximated answers are very close.
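For readers who don't speak Haskell, here is the same left Riemann sum as a rough Python sketch (the names and the default step are my own choices):

```python
# Left Riemann sum of f over [a, b] with step dx.
def integrate(f, a, b, dx=1e-6):
    total, x = 0.0, a
    while x <= b:
        total += f(x) * dx
        x += dx
    return total

result = integrate(lambda x: x**2, 0.0, 1.0, dx=1e-4)
assert abs(result - 1/3) < 1e-3   # close to the exact integral 1/3
```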

#### Trapezoidal Rule

The trapezoidal rule (also known as the trapezoid rule or trapezium rule) is an approximate technique for calculating the definite integral. It works by approximating the region under the graph of the function f(x) as a trapezoid and calculating its area. It follows that

$$\int^b_a f(x) \, dx \approx (b - a) \, \frac{f(a) + f(b)}{2}$$

Graphically, the function f(x) (in blue) is approximated by a linear function (in red).

The haskell code for the trapezoidal rule using tail recursion (again):

    -- number of steps/resolution taken to find an approximate solution
    step_size = 100000

    -- integrate function f from a to b
    integrate f a b = trapezoidalSum f a b 0 ((b - a) / step_size)

    -- using tail recursion
    -- integrate a function f from x to b
    trapezoidalSum f x b sum delta
        | x > b = sum
        | otherwise = trapezoidalSum f (x + delta) b (sum + delta * ((f x) + (f (x + delta))) / 2) delta

    f1 x = 1
    f2 x = x**2
    f3 x = x * exp(x**2)

Again, the type of the integrate function is the same as in the Riemann sum version:

    *Main> :t integrate
    integrate :: (Double -> Double) -> Double -> Double -> Double

An example test:

    *Main> integrate f2 0 1
    0.33334333344937134

### Derivative of a Function

The first derivative of a function at a point is the slope of the tangent to the function at that point. Therefore we can write the difference quotient

$$\frac{f(a + h) - f(a)}{h}$$

This expression is Newton's difference quotient. The derivative is the limit of the difference quotient as the secant lines approach the tangent line. Formally, the derivative of the function f at a is the limit

$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$$

By using currying and higher-order functions, implementing the derivative is straightforward:

    diff :: (Fractional a) => a -> (a -> a) -> (a -> a)
    diff h f x = (f (x + h) - f x) / h

    f1 x = 1
    f2 x = x**2
    f3 x = x * exp(x**2)

The test result at ghci is:

    *Main> diff 0.00001 f2 10
    20.00000999942131

We have chosen h as 0.00001 and differentiated the function $$f_2(x)$$ at the point 10. The derivative of $$f_2(x)$$ is $$f'_2(x) = 2x$$, so the exact value at 10 is $$f'_2(10) = 20$$. The result we found with the above Haskell code is very close to it.
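The same forward difference quotient as a rough Python sketch, returning a closure in place of Haskell's curried function:

```python
# Forward difference quotient: approximates f'(x) as (f(x+h) - f(x)) / h.
def diff(h, f):
    return lambda x: (f(x + h) - f(x)) / h

d = diff(1e-5, lambda x: x**2)
assert abs(d(10.0) - 20.0) < 1e-3   # f'(10) = 20 for f(x) = x^2
```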

Note: I mistakenly posted the wrong, incomplete revision of this post and only realized it a few days later.

## Large Scale Bayesian Inference for Network Tomography

### 1 Introduction

The major goal of network tomography is to infer the internal characteristics of a network using only data from its end nodes. Each node can be a computer, a router, or a subnetwork. Broadly speaking, large-scale network inference involves estimating network parameters (performance or otherwise) based on traffic measurements at a limited subset of nodes [2]. Network tomography assumes that connections are directional, with data traveling from origin to destination; connections can be uni-directional or bidirectional depending on the level of abstraction of the model. During the collection of measurements, some of the data can be missing; possible causes include lossy networks, device load, or the connectionless UDP protocol.
According to the data-collection mechanism, there are two types of network tomography:
• Active Tomography: Active network tomography aims to infer the network's internal characteristics by sending probe packets (or using traceroute) from the ends.
• Passive Tomography: Passive network tomography infers network information from passive observations of regular network traffic.

Some of the problems in the network tomography (NT) literature that we are trying to solve are:

• Link-level measurement inference: The characterization of latent variables, like delay, from the observed measurements at the links of the network.
• Network topology inference: The task of inferring the network's topology using only end-to-end measurements.
• Sender-Receiver traffic intensity: This is also known as traffic matrix estimation. The task is to estimate the intensity of traffic between a sender and a receiver.
• Network Anomography: The task of inferring anomalies in the network (which might be caused by a malfunctioning component, malware, or attacks like DDoS) using only data from the ends.

The mathematical characterization of the network tomography problem can be stated as [1]:

Given observations $$Y(t) := [Y_1(t) \ldots Y_l(t)]$$ over times $$t = 1 \ldots T$$ and a matrix $$A \, (l \times \kappa)$$ such that $$Y(t) = AX(t), \; \forall t$$, we want to estimate the latent variables $$X(t) := [X_1(t) \ldots X_\kappa(t)]$$ over times $$t = 1 \ldots T$$. We always have $$\kappa \geq l$$, and in general $$\kappa = O(l^2)$$.

Many network tomography problems can be approximated by the linear (not necessarily Gaussian) model $$Y(t) = AX(t) + e$$, where $$X$$ is the $$\kappa$$-dimensional vector of network dynamic parameters (e.g., delay, loss, etc.), $$Y$$ is an $$l$$-dimensional measurement vector, and hence $$A$$ is the $$l \times \kappa$$ routing matrix. It is difficult to measure traffic matrices directly, so we try to estimate them using statistical techniques like maximum entropy [12].
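A toy instance of the linear model above, sketched in Python with a hypothetical routing matrix: path-level observations are aggregations of the latent link-level quantities selected by A.

```python
# Y = A X: each row of the routing matrix A selects the links a path traverses,
# so each observed path delay is the sum of the latent per-link delays on it.
A = [[1, 1, 0],       # path 1 uses links 1 and 2 (hypothetical routing)
     [0, 1, 1]]       # path 2 uses links 2 and 3
x = [1.0, 2.0, 3.0]   # latent per-link delays (unknown in practice)

y = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
assert y == [3.0, 5.0]   # observed end-to-end delays
```

Tomography runs this map in reverse: recover x from y and A, which is ill-posed since there are more links than paths.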

### 2 Related Works

The term network tomography was first used by Vardi (1996) [11] to capture the similarities with origin-destination (OD) traffic matrix estimation through link counts: in network inference, it is common that one does not observe the quantities of interest but their aggregations instead, and this goes beyond OD estimation. Vardi also devised the linear Poisson tomography model for OD traffic estimation, and the linear form [11] was later shown to approximate other network tomography problems [2]. Network tomography tries to infer latent variables, and the observed data is usually very limited; in that sense, network tomography is an ill-posed inverse problem. Current approaches in network tomography use a limited number of measurements to infer the network performance parameters, by using:

• Maximum Likelihood Estimator.
• Bayesian Inference.

and assuming a prior model.

Padmanabhan et al. (2002) [7] used Bayesian inference with Gibbs sampling, relying only on the numbers of lost and sent packets throughout the network. Although their model could give information about bottlenecks in the network, its accuracy suffered. Nowak et al. [2] [3] used MCMC to estimate the MAP and EM to estimate the MLE of the delays. In [6], Liang et al. developed a pseudo-EM algorithm to maximize the pseudo log-likelihood function; to illustrate the pseudo-likelihood proposal, they inferred the internal link delay distribution and estimated the OD matrix from link measurements. M. Rabbat et al. in [8] inferred the network structure from co-occurrence data: network elements that co-occur in many observations are probably closely connected. They modeled the co-occurrence observations as independent realizations of a random walk on the graph, subjected to a random permutation which accounts for the lack of order information. Treating the permutations as missing data, Rabbat et al. employed an MCEM algorithm with importance sampling to estimate the random walk parameters. As a result, they were able to get an accurate model of the network.

### 3 Possible Approaches to Attack the Problem

Queue management is an important aspect of routers. Routers have several interfaces (inif for ingress, outif for egress) to which each link is connected, and the data coming from these interfaces is processed by an FPGA or specialized multi-core CPUs. Packets from different queues are processed in an FCFS, round-robin way. Hence it is natural to use queueing networks to model routers' behavior. The Internet is built on distributed networks scattered around the world, and measurements at the end-points can naturally be modeled with queueing networks and graphical models, which also allow us to apply Markov chain Monte Carlo to the missing data. C. Sutton et al. developed a model for Bayesian inference in queueing networks [10], [9] and obtained satisfactory results in their tests on web servers. In a similar fashion, using queueing networks for network tomography might yield a more realistic model, and thus I expect better accuracy for the estimation of throughput and RTT (round-trip time). Because of the large amount of data processing, we will need to parallelize and adopt scalable approaches for the machine learning algorithms used.

Sensor networks are becoming more ubiquitous, and the Internet is moving from a client-server architecture to a P2P architecture. These networks are of specific importance to network tomography, because there is no other efficient way to obtain information about such a network's internal characteristics. Network tomography for these networks brings challenges like massive data load and the lack of clear end-points: every node can be an end, so there is no specific end. It might be more efficient to get measurements from supernodes.

Network coding [4] is a new technique for modeling networks, and it has many applications in network tomography. This model is very suitable for Bayesian causality and belief propagation; the possibilities should be investigated.
Apart from the network’s internal characteristics, it is possible to approximate the router’s internal characteristics like cpu load and temperature with a sampler like MCMC.

With the recent revision of the SNMP protocol it is possible to get different types of measurements from MIBs, such as the number of in and out connections of a host (this data can be obtained from SNMPv2 by implementing a simple SNMP agent). It is possible to predict a device's connection behavior in an evolving network, similar to social network analysis and protein-protein interaction prediction. This problem is highly structured, and a device's behavior is highly correlated with its numbers of incoming and outgoing connections. Therefore I believe it is possible to use CRFs to analyze the behavior of a network's hosts [5]. Exporting flows from routers is an expensive procedure, so routers use sampling (e.g., trajectory sampling), and the missing data problem in that case can be solved with MCMC by sampling from the posterior distribution.

### References

[1] E.M. Airoldi and C.N. Faloutsos. Advances in network tomography. Citeseer, 2003.
[2] A. Coates, AO Hero Iii, R. Nowak, and B. Yu. Internet tomography. Signal Processing Magazine, IEEE, 19(3):47–65, 2002.
[3] M.J. Coates and R.D. Nowak. Network tomography for internal delay estimation. In Acoustics, Speech, and Signal Processing, 2001. Proceedings (ICASSP'01). 2001 IEEE International Conference on, volume 6, pages 3409–3412. IEEE, 2002.
[4] C. Fragouli, J. Le Boudec, and J. Widmer. Network coding: an instant primer. Computer Communication Review, 36(1):63, 2006.
[5] T. Karagiannis, K. Papagiannaki, and M. Faloutsos. BLINC: multilevel traffic classification in the dark. In Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications, pages 229–240. ACM, 2005.
[6] G. Liang and B. Yu. Maximum pseudo likelihood estimation in network tomography. Signal Processing, IEEE Transactions on, 51(8):2043–2053, 2003.
[7] V.N. Padmanabhan, L. Qiu, and H.J. Wang. Passive network tomography using bayesian inference. In Proceedings of the 2nd ACM SIGCOMM Workshop on Internet measurement, pages 93–94. ACM, 2002.
[8] M.G. Rabbat, M.A.T. Figueiredo, and R.D. Nowak. Network inference from co-occurrences. Information Theory, IEEE Transactions on, 54(9):4053–4068, 2008.
[9] C. Sutton and M.I. Jordan. Inference and Learning in Networks of Queues.
[10] C. Sutton and M.I. Jordan. Bayesian Inference for Queueing Networks and Modeling of Internet Services. Arxiv preprint arXiv:1001.3355, 2010.
[11] Y. Vardi. Network Tomography: Estimating Source-Destination Traffic Intensities from Link Data. Journal of the American Statistical Association, 91(433), 1996.
[12] Y. Zhang, M. Roughan, C. Lund, and D. Donoho. An information-theoretic approach to traffic matrix estimation. In Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, pages 301–312. ACM, 2003.