Thursday, April 19, 2012

Problem solving techniques (Example problems) (Cont.)

In this post i will show you when to use graphs to solve specific type of problems. To know about graphs and other problem solving techniques you may need to read this.

Problem 14 (Shortest path)

Given a set of cities' paths represented as a two dimensional array grid where grid[i][j] = 1 iff there exists a path between city i to city j. Given two cities src and dst, find the shortest path to go from src to dst. If there is no such path return -1.

Solution // In this problem i will show you how to use BFS to get shortest path

#define N 10 // For the example

int prev[N];
int vis[N];

int find_shortest_path(int paths[][N], int src, int dst) {
    memset(vis, 0, sizeof vis);
    memset(prev, -1, sizeof prev);
   
    queue<int> q;
    q.push(src); // First city
   
    while(!q.empty()) {
        int city = q.front();
        q.pop();

        if(city == dst) break; // We arrived
       
        for(int i = 0; i < N; i++) {
            if(paths[city][i] && !vis[i]) {
                prev[i] = city;
                q.push(i);
                vis[i] = 1;
            }
        }
    }
   
    if(prev[dst] == -1) return -1;
   
    int city = dst, res = 0;
    while(prev[city] != -1) { // Exercise: print the path
        city = prev[city];
        res ++;
    }
 
    return res;
}


Problem 15 (GrafixMask)

SRM 211 DIV-1 Medium

In one mode of the grafix software package, the user blocks off portions of a masking layer using opaque rectangles. The bitmap used as the masking layer is 400 pixels tall and 600 pixels wide. Once the rectangles have been blocked off, the user can perform painting actions through the remaining areas of the masking layer, known as holes. To be precise, each hole is a maximal collection of contiguous pixels that are not covered by any of the opaque rectangles. Two pixels are contiguous if they share an edge, and contiguity is transitive.

You are given a vector<string> named rectangles, the elements of which specify the rectangles that have been blocked off in the masking layer. Each String in rectangles consists of four integers separated by single spaces, with no additional spaces in the string. The first two integers are the window coordinates of the top left pixel in the given rectangle, and the last two integers are the window coordinates of its bottom right pixel. The window coordinates of a pixel are a pair of integers specifying the row number and column number of the pixel, in that order. Rows are numbered from top to bottom, starting with 0 and ending with 399. Columns are numbered from left to right, starting with 0 and ending with 599. Every pixel within and along the border of the rectangle defined by these opposing corners is blocked off.

Return a vector<int> containing the area, in pixels, of every hole in the resulting masking area, sorted from smallest area to greatest.

Solution

#define tall 400
#define width 600

bool Grid[tall][width]; // Represent the graph as a matrix
int nx[] = {0,0,1,-1};  // Sentinels
int ny[] = {-1,1,0,0}; // Sentinels

struct node {            // Graph node
    int x, y;
    node(int x,int y){
        this->x = x;
        this->y = y;
    }
};

int Fill(int x,int y)
{
    stack<node> s; // use DFS
    s.push(node(x,y));

    int result = 0;
    while(!s.empty())
    {
        node top = s.top();
        s.pop();

        if(Grid[top.x][top.y]) continue;

        Grid[top.x][top.y] = true;

        result++;

        for(int f= 0;f<4;f++)
        {
            int newX = top.x+nx[f];
            int newY = top.y+ny[f];
          
            if(!(newX<0 || newX >= tall || newY < 0 || newY >= width || Grid[newX][newY]))  
                     s.push(node(newX, newY));
        }
    }

    return result;
}

class grafixMask
{
public:
    vector <int> sortedAreas(vector <string> rect)
    {
        vector<int> answer;
        stringstream ss;
        int x,y,z,w;
           
        int i;
        for(i = 0;i<rect.size();i++)
        {
            ss << rect[i];
            ss >> x >> y >> z >> w;
           
            for(int j = x;j<=z;j++)
            for(int s= y; s<=w; s++)
                Grid[j][s] = true;
        }
       
        for(i = 0; i<tall; i++)
        for(int j = 0; j<width; j++)
            if(!Grid[i][j])
                answer.push_back(Fill(i,j));

        sort(answer.begin(),answer.end());

        return answer;
    }
};

Problem 16 (Shortest path)

Same as problem 14 but path[i][j] = c, that means path from city i to city j equals c KM. find the shortest path from city src, to city dst. In this problem we won't be able to use BFS. Why?

Dijkstra Solution

#define N 5

int graph[N][N];
bool vis[N];

struct node {
    int indx;
    int cst;
};

bool operator < (const node& n1, const node& n2) {
    return n1.cst > n2.cst;
}

int dijkstra(int src, int dst) {
    priority_queue<node> pq;
    pq.push(node(src, 0));
   
    while(!pq.empty()) {
        node n = pq.top();
        pq.pop();
       
        if(vis[n.indx]) continue;
        else if(n.indx == dst)          
                   return n.cst;

        vis[n.indx] = true;
       
        for(int i = 0; i < N; i++)
            if(graph[n.indx][i] != INT_MAX)
                pq.push(node(i, n.cst+graph[n.indx][i]));
     }
   
    return -1;
}

Floyd Warshall  Solution

int floyd_warshall()
{
    int i, j, k;
   
    for(k = 0; k < N; k++)
    for(i = 0; i < N; i++)
    for(j = 0; j < N; j++)
    if(graph[i][k] != INT_MAX && graph[k][j] != INT_MAX)
    graph[i][j] = min(graph[i][j], graph[i][k]+graph[k][j]);

    cout << graph[0][3] << endl;
}

In the next post i will show you examples to Maximum flow and Minimum Cut problems.

Wednesday, April 18, 2012

Problem solving techniques (Example problems) (Cont.)

Time for backtracking is now. In this post i will show you how to use backtracking solving a specific type of problems. To know about backtracking please revise previous posts.

Problem 10 (8 Queens problem)

 The eight queens puzzle is the problem of placing eight chess queens on an 8×8 chessboard so that no two queens attack each other. Thus, a solution requires that no two queens share the same row, column, or diagonal.

Solution
  
/**
 * rows: Each cell represents the col number that will contain a queen in this row
 * curRow: The current row to be filled
**/
int cnt = 0;
void place_8_queens(int rows[], int curRow) {
    if(curRow >= 8) {
        for(int i = 0; i < 8; i++) cout << rows[i] << " ";
        cout << endl;
        cnt++;
        return;
    }
  
    for(int i = 0; i < 8; i++) {
        bool valid = true;
        for(int j = 0; j < curRow; j++)
            if(rows[j] == i || (i == rows[j]+(curRow-j)) || (i == rows[j]-(curRow-j)))
                valid = false;
       
        if(valid) {
            rows[curRow] = i;
            place_8_queens(rows, curRow+1);
        }
    }
}

int main() {
    int rows[8];
  
    place_8_queens(rows, 0);
    cout << "There are " cnt << " valid placement." << endl;
}

Problem 11 (Tree Find)

In this problem given a tree whith the following definition:

struct node {
    int val;                    // Current node value
    vector<int> childs; // Contains the indices of the cilds nodes
};

vector<node> tree;   // Tree[0] is the tree root

and an element, find this element.

Solution

This solution is a DFS (Depth first search) where at each time visiting a node, check if it holds the element return true, else find it in its childs.

bool search_tree(node cur, int elem) {
    if(cur.val == elem) return true;
   
    for(int i = 0; i < cur.childs.size(); i++)  // Search all childs of current node
        if(search_tree(tree[cur.childs[i]]), elem) return true; // Found it, return true
       
    return false; // not found neither in this node nor in its child, then backtrack
}

int main() {
    search_tree(tree[0], -21);
}

Problem 12 (Sudoku)

Sudoku puzzles consists of a partially completed row-column grid of cells partitioned into N regions or zones each of size N cells, to be filled in using a prescribed set of N distinct symbols (typically the numbers {1, ..., N}), so that each row, column and region contains exactly one of each element of the set. The puzzle can be solved using a variety of algorithms. Here is the backtracking way to solve it:

#define N    3 // For learning purposes

bool check_row(int grid[][N], int row, int col, int a) {
    for(int i = 0; i < col; i++)
        if(grid[row][i] == a) return false;
      
    return true;
}

bool check_col(int grid[][N], int row, int col, int a) {
    for(int i = 0; i < row; i++)
        if(grid[i][col] == a) return false;
      
    return true;
}

void small_sudoku(int grid[][N], int i, int j) {
    vector<int> allowed_nums;
    int a;
   
    for(a = 1; a <= N; a++) {
        if(check_row(grid, i, j, a) && check_col(grid, i, j, a)) {
            grid[i][j] = a;
              
            if(j < N-1) small_sodoku(grid, i, j+1);
            else if(j == N-1 && i < N-1) small_sodoku(grid, i+1, 0);
            else {
                cout << "Found placement" << endl;
              
                for(int r = 0; r < N; r++) {
                    for(int c = 0; c < N; c++)
                        cout << grid[r][c] << " ";
                    cout << endl;
                }
            }
        }
    }
}

In the next post i will show you how to solve graph related problems.

Saturday, April 14, 2012

Problem solving techniques (Example problems) (Cont.)

In this post i will show you how to use the greedy approach, solving specific set of problems that may match a specific property. You may need to visit this post in order to understand the whole idea.

In the last post we discussed how to use DP technique to solve problems, Greedy and DP techniques are to solve optimization problems "Those which asks to find the optimal solution" (e.g., Finding the shortest path, min cost, or max profit).

Problem 8 (Activity Selection Problem)
In this problem we are given set of competing activities that require exclusive use of a common resource, with a goal of selecting a maximum size set of mutually compatible activities.

Suppose we have a set S = {a1, a2, ...., an} of n proposed activities that wish to use a resource, such as a lecture hall, which can be used by only one activity at a time. Given the start, and finish time of each task, Select the maximum size subset of manually compatible activities.

Solution
Sort the tasks in monotonically increasing finish time. By default include the first activity in our result subset, Then loop all other activities comparing it with the last included activity, if its not overlapped include it.

struct Activity {
    int stime, ftime;
};

bool operator < (const Activity& a1, const Activity& a2) {
    return a1.ftime < a2.ftime;
}

vector<Activity> greedy_activity_selector(vector<Activity> activities) {
   
    vector<Activity> res(1);
    int i, j = 0;
   
    sort(activities.begin(), activities.end());
    res[0] = activities[0];
   
    for(i = 1; i < activities.size(); i++) {
        if(activities[i].stime >= activities[j].ftime) {
            res.push_back(activities[i]);
            j = i;
        }
    }
   
    return res;
}

Problem 9 (Huffman Codes)

Huffman Codes are a widely used and very effective technique for compressing data: savings of 20% to 90% are typical, depending on the characteristic of the data being compressed. Data considered data to be sequence of characters. In order to understand the whole idea read this.

Huffman's greedy algorithm uses a table of the frequencies of occurrence of the characters to build up an optimal way of representing each character as a binary string.

Example


Frequency in thousands
a
b
c
d
e
f
Variable length code
45
13
12
16
9
5

Tracing Algorithm

Step 1:

Step 2:

Step 3:

Step 4:


Step 5:

 Step 6:

Solution

Hufman(C) {
    n = size(C)
    Q = C // Priority queue
   
    for i = 1 to n-1
    do allocate a new node z
        left[z] = x = extract_min(Q)
        right[z] = y = extract_min(Q)
        f[z] = f[x]+f[y]
           
        insert(Q, z)
    done 
    return extract_min(Q)
}

In the next post i will show you how to use backtracking solving specific set of problems.

Wednesday, April 11, 2012

Problem solving techniques (Example problems) (Cont.)

In this post i will show how to use Dynamic Programming technique to solve and optimize your algorithms, you may need to read my previous posts about problem solving techniques.

Problem 6 (Fibonacci Sequence)

To know about the fibonacci sequence you may need to read this. To generate this sequence by definition the first two numbers of it is 0, and 1. To get the nth element from fib sequence you just sum the last two elements F(N) = F(N-1)+F(N-2).

Recursive solution

int fib(int n) {
    if(n == 0) return 0;
    else if(n == 1) return 1;
   
    return fib(n-1)+fib(n-2);
}

If you draw the call graph for fib(6), it will look like the following:






As you may check each call makes 2 calls for fib(N-1) and fib(N-2). As per the drawing all black nodes are all repeated calls and no need to be called again.

Here we can use dynamic programming to save these repeated calls.

Memoization Solution using DP

int memo[10]; // in your main memo[0] = 0, memo[1] = 1

int fib(int n) {
    if(memo[n] != -1) return memo[n];
   
    return (memo[n]=fib(n-1)+fib(n-2));
}

Now the call graph for this version is:


As you can see the tree is much smaller than before, and all the dashed nodes are a previously visited state. This code is linear O(N).

Problem 7 (Changing Sounds)

Topcoder SRM 366 DIV 1 Level 1

You are a guitar player and you are going to play a concert. You don't like to play all the songs at the same volume, so you decide to change the volume level of your guitar before each new song. Before the concert begins, you make a list of the number of intervals you will be changing your volume level by before each song. For each volume change, you will decide whether to add that number of intervals to the volume, or substract it.

You are given a int[] changeIntervals, the i-th element of which is the number of intervals you will change your volume by before the i-th song. You are also given an int beginLevel, the initial volume of your guitar, and an int maxLevel, the highest volume setting of your guitar. You cannot change the volume of your guitar to a level above maxLevel or below 0 (but exactly 0 is no problem). Return the maximum volume you can use to play the last song. If there is no way to go through the list without exceeding maxLevel or going below 0, return -1.

At each song we can generate our current state from the last song satates, checking if each state is a valid state or not. At the last song we can check the max level we can reach returning it, or -1 in case we can reach a level between 0 and max.

 Dynamic programming solution

int maxFinal(vector <int> intervals, int begin, int max)
    {
        bool canHave[intervals.size()+1][max+1];
        memset(canHave, false, sizeof(canHave));
       
        canHave[0][begin] = true;
       
        int i;
        for(i = 0; i < intervals.size(); i++)
        {
            for(int j = 0; j <= max; j++)
            {
                if(canHave[i][j])
                {
                    if(j+intervals[i] <= max)
                        canHave[i+1][j+intervals[i]] = true;
                    if(j-intervals[i] >= 0)
                        canHave[i+1][j-intervals[i]] = true;
                }
            }
        }
       
        int res = -1;
        for(i = 0; i <= max; i++)
            if(canHave[intervals.size()][i] && i > res) res = i;
           
        return res;
    }


In the next post i will show sample problems solved using greedy technique.

Monday, April 9, 2012

Linux Security Modules Framework (LSM)

In computer security the term access control is mainly used to refer to the process of controlling the access of a subject or initiator (Process) or generally perform some sort of operation on an object or target (Data).

The Linux Security Modules (LSM) project has developed a lightweight, general purpose, access control framework for the mainstream Linux kernel that enables many different access control models to be implemented as loadable kernel modules. A number of existing enhanced access control implementations, including Linux capabilities, Security-Enhanced Linux (SELinux), and Domain and Type Enforcement (DTE), have already been adapted to use the LSM framework.

By checking the kernel source code i found that there are two systems that have already adopted to use the LSM framework, Tomoyo and Smack.

Agenda
  • What is LSM Framework?
  • LSM Design
  • LSM Implementation
  • LSM Hooks
  • References
What is LSM Framework?

Linux Security Modules (LSM) is a framework that allows the Linux kernel to support a variety of computer security models while avoiding favoritism toward any single security implementation. The framework is licensed under the terms of the GNU General Public License and is standard part of the Linux kernel since Linux 2.6. Apparmor, SELinux, Smack and TOMOYO Linux are the currently accepted modules in the official kernel.

LSM Framework Design 

The basic abstraction of the LSM interface is to mediate access to internal kernel objects. LSM seeks to allow modules to answer the question ``May a subject S perform a kernel operation OP on an internal kernel object OBJ?''

                                               Figure1: LSM Hook Architecture                                                   

LSM allows modules to mediate access to kernel objects by placing hooks in the kernel code just ahead of the access, as shown in Figure 1. Just before the kernel would have accessed an internal object, a hook makes a call to a function that the LSM module must provide. The module can either let the access occur, or deny access, forcing an error code return.

Many security models require binding security attributes to kernel objects. To facilitate this, LSM provides for opaque security fields that are attached to various internal kernel objects. However, the module is completely responsible for managing these fields, including allocation, de-allocation, and concurrency control.

LSM Implementation

The LSM kernel patch modifies the kernel in five primary ways. First, it adds opaque security fields to certain kernel data structures. Second, the patch inserts calls to security hook functions at various points within the kernel code. Third, the patch adds a generic security system call. Fourth, the patch provides functions to allow kernel modules to register and unregister themselves as security modules. Finally, the patch moves most of the capabilities logic into an optional security module.

Opaque Security Fields


The opaque security fields are void* pointers, which enable security modules to associate security information with kernel objects. Table 1 shows the kernel data structures that are modified by the LSM kernel patch and the corresponding abstract object. 

The setting of these security fields and the management of the associated security data is handled by the security modules. LSM merely provides the fields and a set of calls to security hooks that can be implemented by the module to manage the security fields as desired. For most kinds of objects, an alloc_security hook and a free_security hook are defined that permit the security module to allocate and free security data when the corresponding kernel data structure is allocated and freed. Other hooks are provided to permit the security module to update the security data as necessary, e.g. a post_lookup hook that can be used to set security data for an inode after a successful lookup operation. It is important to note that LSM does not provide any locking for the security fields; such locking must be performed by the security module. 

Since some objects will exist prior to the initialization of a security module, even if the module is built into the kernel, a security module must handle pre-existing objects. Several approaches are possible. The simplest approach is to ignore such objects, treating them as being outside of the control of the module. These objects would then only be controlled by the base Linux access control logic. A second approach is to traverse the kernel data structures during module initialization, setting the security fields for all pre-existing objects at this time. This approach would require great care to ensure that all objects are updated (e.g. an open file might be on a UNIX domain socket awaiting receipt by a process) and to ensure that appropriate locking is performed. A third approach is to test for pre-existing objects on each use and to then set the security field for pre-existing objects when needed. 

Calls to Security Hook Functions
Figure: The vfs_mkdir kernel function with one security hook call 
to mediate access  and one security hook call to manage the security field. 
The security hooks are marked by<->.
As discussed in the previous subsection, LSM provides a set of calls to security hooks to manage the security fields of kernel objects. It also provides a set of calls to security hooks to mediate access to these objects. Both sets of hook functions are called via function pointers in a global security_ops table. This structure consists of a collection of substructures that group related hooks based on kernel object or subsystem, as well as some top-level hooks for system operations. Each hook is defined in terms of kernel objects and parameters, and care has been taken to avoid userspace pointers. 

Figure 2 shows the vfs_mkdir kernel function after the LSM kernel patch has been applied. This kernel function is used to create new directories. Two calls to security hook functions have been inserted into this function. The first hook call, security_ops->inode_ops->mkdir, can be used to control the ability to create new directories. If the hook returns an error status, then the new directory will not be created and the error status will be propagated to the caller. The second hook call, security_ops->inode_ops->post_mkdir, can be used to set the security field for the new directory's inode structure. This hook can only update the security module's state; it cannot affect the return status.

Registering Security Modules

The LSM framework is initialized during the kernel's boot sequence with a set of dummy hook functions that enforce traditional UNIX superuser semantics. When a security module is loaded, it must register itself with the LSM framework by calling the register_security function. This function sets the global security_ops table to refer to the module's hook function pointers, causing the kernel to call into the security module for access control decisions. The register_security function will not overwrite a previously loaded module. Once a security module is loaded, it becomes a policy decision whether it will allow itself to be unloaded.

If a security module is unloaded, it must unregister with the framework using unregister_security. This simply replaces the hook functions with the defaults so the system will still have some basic means for security. The default hook functions do not use the opaque security fields, so the system's security should not be compromised if the module does a poor job of resetting the opaque fields.

LSM Hooks

 Task Hooks

LSM provides a set of task hooks that enable security modules to manage process security information and to control process operations. Modules can maintain process security information using the security field of the task_struct structure. Task hooks provide control over inter-process operations, such as kill, as well as control over privileged operations on the current process, such as setuid. The task hooks also provide fine-grained control over resource management operations such as setrlimit and nice.

Program Loading Hooks

Many security modules, including Linux capabilities, DTE, SELinux, and SubDomain require the ability to perform changes in privilege when a new program is executed. Consequently, LSM provides a set of program-loading hooks that are called at critical points during the processing of an execve operation. The security field of the linux_binprm structure permits modules to maintain security information during program loading. One hook is provided to permit security modules to initialize this security information and to perform access control prior to loading the program, and a second hook is provided to permit modules to update the task security information after the new program has been successfully loaded. These hooks can also be used to control inheritance of state across program executions, for example, revalidating open file descriptors.

IPC Hooks

Security modules can manage security information and perform access control for System V IPC using the LSM IPC hooks.

File System Hooks

For file operations, three sets of hooks were defined: filesystem hooks, inode hooks, and file hooks. LSM adds a security field to each of the associated kernel data structures: super_block, inode, and file. The filesystem hooks enable security modules to control operations such as mounting and statfs. LSM leverages the existing permission function by inserting an inode hook into it, but LSM also defines a number of other inode hooks to provide finer-grained control over individual inode operations. Some of the file hooks allow security modules to perform additional checking on file operations such as read and write, for example, to revalidate permissions on use to support privilege bracketing or dynamic policy changes. A hook is also provided to allow security modules to control receipt of open file descriptors via socket IPC. Other file hooks provide finer-grained control over operations such as fcntl and ioctl.

Network Hooks

Application layer access to networking is mediated using a set of socket hooks. These hooks, which include the interposition of all socket system calls, provide coarse mediation coverage of all socket-based protocols. Since active user sockets have an associated inode structure, a separate security field was not added to the socket structure or to the lower-level sock structure. As the socket hooks allow general mediation of network traffic in relation to processes, LSM significantly expands the kernel's network access control framework. For example, the sock_rcv_skb hook allows an inbound packet to be mediated in terms of its destination application, prior to being queued at the associated userspace socket.

Other Hooks

LSM provides two additional sets of hooks: module hooks and a set of top-level system hooks. Module hooks can be used to control the kernel operations that create, initialize, and delete kernel modules. System hooks can be used to control system operations, such as setting the system hostname, accessing I/O ports, and configuring process accounting. The existing Linux kernel provides some control over many of these operations using the capability checks, but those checks only provide coarse-grained distinctions among different operations and do not provide any argument information.

References