Saturday, January 27, 2007

Stack Size of Threads

Several months ago, my boss assigned me to tune the performance of several applications. One of them was a server application that distributed data to clients. It used a lot of threads, two threads per client -.-" It could support up to about 80 clients. Beyond that, the application crashed with a core dump because it ran out of memory. Using top, I found that the virtual memory size increased by 4x MB for each newly connected client. This confused me. How could one client eat 4x MB?
Finally, I found the answer: the stack size of a thread. On the Linux system I was using, the default stack size of a thread (with the pthread library) was set to 20MB! With two threads per client, that accounts for the 4x MB per client.
To change the stack size of a thread, do one of the following:
  1. Call pthread_attr_setstacksize(&attr, stacksize) on the attribute object that is passed to pthread_create. For more details, you can refer to the following link: http://www.llnl.gov/computing/tutorials/pthreads/#Stack
  2. Run the ulimit -s nnnn command before starting the program to change the default stack size for pthread threads, where nnnn is the size in KBytes. (http://kbase.redhat.com/faq/FAQ_43_8710.shtm)

Saturday, January 20, 2007

Find files by last modification time

"find" is a powerful Unix/Linux utility for searching for files. With the -mtime (or -mmin) option, "find" can search for files that were modified within a given number of days (or minutes). However, "find" has no option to search for files modified after a given point in time. To do this, you need to write a shell script:
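#!/bin/sh
# $1 is the reference time in touch -t format: [[CC]YY]MMDDhhmm[.ss]
# Create the reference file; stop if the time cannot be parsed.
touch -t "$1" /tmp/$$ || exit 1
find . -newer /tmp/$$
# Remove the temp file even if find reported errors.
rm -f /tmp/$$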
This script creates a temp file in the /tmp directory with the given last modification time, then uses that file as the reference point for "find". touch -t "$1" /tmp/$$ creates the temp file and sets its last modification time through the -t option, which takes the time in the format [[CC]YY]MMDDhhmm[.ss]. The -newer option of "find" then matches files that are newer than the specified file.

Sunday, January 14, 2007

Speed up DNS Lookup in Linux

When I first used Linux for web surfing, I found it was slower than MS Windows. Recently, I found the reason on the Internet. The difference is due to the DNS lookup speed. Windows caches DNS lookup results, but Linux does not do so by default. To improve the lookup speed in Linux, we can install dnsmasq, which is a lightweight caching DNS server for Linux.

Monday, January 01, 2007

Buffering Behaviour of stdout (Standard Out)

Recently, I encountered a problem when capturing the stdout of a program. The program is expected to run for a long time. When its output goes to the console, everything works fine. However, when the output is redirected to a pipe or a file, it is buffered for a long time, so it does not appear immediately. This behaviour is not desired. To illustrate the problem, I wrote the following simple program:
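#include <stdio.h>   /* printf */
#include <unistd.h>  /* sleep */

int main(int argc, char** argv)
{
    /* Print one line per second, forever. */
    while (1)
    {
        sleep(1);
        printf("hello!\n");
    }
}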
The program prints "hello!" once every second. However, the output will be buffered if it is redirected to a pipe, like:
a.out | tee tmp.log
After some investigation, I found that this behaviour is documented in the setbuf(3) man page:
The three types of buffering available are unbuffered, block buffered, and line buffered. When an output stream is unbuffered, information appears on the destination file or terminal as soon as written; when it is block buffered many characters are saved up and written as a block; when it is line buffered characters are saved up until a newline is output or input is read from any stream attached to a terminal device (typically stdin).... Normally all files are block buffered. When the first I/O operation occurs on a file, malloc(3) is called, and a buffer is obtained. If a stream refers to a terminal (as stdout normally does) it is line buffered. The standard error stream stderr is always unbuffered by default....

There are two ways to solve the problem. First, call fflush(stdout) after each printf.
Second, call setlinebuf(stdout) at program startup, before any output is written.