April 22, 2014 - Implementing "cat" without libc

As usual, each new day of programming yields its own triumphs and tribulations. My assembly language glue has evolved into something much nicer than the crude thing it was a week or so ago, and now it does less work in most cases. I commented the whole thing so it is easier to read and understand, and I also caught a bug that caused programs to crash instead of exiting normally, so I am much happier with it. It also correctly passes command line arguments to main() now. In the process of implementing that, I gained some valuable insight into the way things are supposed to work in Linux and assembly. At least I think it's valuable.
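To make the argument-passing part concrete, here is a rough sketch, in C rather than assembly, of the layout Linux leaves on the stack at process entry on x86-64: the argument count first, then the argv pointers, a NULL, and then the environment pointers. The struct and function names are hypothetical and are not part of the glue shown in this post; the sketch only assumes a 64-bit target where a long and a pointer have the same size.

```c
#include <stddef.h>

/* Hypothetical view of what the kernel leaves on the stack at _start. */
struct entry_args {
    long   argc;
    char **argv;
    char **envp;
};

/* sp is the value of rsp at process entry, before anything is pushed or popped. */
static struct entry_args parse_entry_stack(long *sp)
{
    struct entry_args a;

    a.argc = sp[0];                 /* first word: argument count */
    a.argv = (char **)(sp + 1);     /* next: argc pointers, then a NULL terminator */
    a.envp = a.argv + a.argc + 1;   /* environment pointers start after that NULL */
    return a;
}
```

This is the same idea as popping the count into rdi and copying rsp into rsi before calling main: after the pop, rsp points straight at argv[0].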
I also decided to implement about the simplest Unix command line tool there is, in C without any libc functionality: cat. I would implement only what I consider the core functionality, leaving out extras like printing line numbers.
Should be an easy task, right? Well, it turns out the answer is both yes and no.

Basically, the target functionality boiled down to 4 use cases:
- cat (reading stdin)
- cat text.txt (reading a file and printing the contents)
- cat 1.txt 2.txt (reading multiple files and printing the contents)
- echo "test" | cat (printing a text stream piped into the program)
This should be trivial, and in C with the standard library, it would be. Without the C library, however, I am limited to system calls. The syscalls for file reading are simple: SYS_open, SYS_read, SYS_close.
Writing is just as easy: SYS_write.
So I figured that I was pretty much done. I wrote a program that checks argc, then reads stdin (descriptor 1 in my code) if there are no args, and opens argv[1] if there are args. Then it continues to loop through the remaining args.

I fought with it for a little while, but finally ironed out my errors. The resultant code looks nice and runs great. Then it came time to run through my test cases:
test 1: ./cat - SUCCESS
test 2: ./cat cat.c - SUCCESS
test 3: ./cat cat.c asm/amd64_syscalls.s - SUCCESS
test 4: echo "TESTING" | ./cat - FAILURE

Needless to say, I was stunned. The program is invoked with no arguments, so it should just read from stdin (1 in my code) and write to stdout (0 in my code), and Linux should take care of connecting the stdout of echo to the stdin of ./cat. That is NOT what happens. I know one of my assumptions was wrong, but I haven't yet been able to figure out what is really happening. The behavior is very strange: when I pipe a text stream into the program, there is no output, even if I type something on the command line where stdin would normally come from. So somehow the piped text stream is partially overriding the text from the command line, but something is broken in there somewhere. Another mystery....
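One possible explanation, offered as a sketch rather than a confirmed diagnosis: POSIX numbers stdin as descriptor 0 and stdout as descriptor 1, the reverse of the cat(1,0) call in the code. In an interactive run all three standard descriptors usually refer to the same terminal, opened for both reading and writing, so the swap would go unnoticed. In a pipeline the shell makes descriptor 0 the read end of a pipe while descriptor 1 stays on the terminal, so the program would wait on the keyboard and then try to write into the read end of the pipe. The small check below shows what the kernel does with such a write. It uses ordinary libc calls purely as a test harness, and the function name is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Returns the errno produced by writing to the read end of a pipe,
 * 0 if the write unexpectedly succeeds, or -1 if no pipe could be made. */
static int errno_of_write_to_pipe_read_end(void)
{
    int p[2];
    int err = 0;

    if (pipe(p) != 0)
        return -1;

    /* p[0] is the read end, the same kind of descriptor a shell
     * installs as descriptor 0 of the right-hand side of a pipeline. */
    if (write(p[0], "x", 1) < 0)
        err = errno;        /* EBADF: descriptor is not open for writing */

    close(p[0]);
    close(p[1]);
    return err;
}
```

A write that fails this way produces no output and no visible error unless the return value is checked, which would match the silent behavior described above.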

Anyways, here is my code at the moment:
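#include <fcntl.h>
#include <sys/syscall.h>

/* Implemented in the ASM glue; declared here so the compiler knows the types */
long syscall1(long number, long arg1);
long syscall3(long number, long arg1, long arg2, long arg3);

/* Copy from descriptor "in" to descriptor "out", one byte at a time */
void cat(int in, int out)
{
        long ret;
        char byte;

        do
        {
                ret = syscall3(SYS_read,in,(long)&byte,1);
                syscall3(SYS_write,out,(long)&byte,1);
        } while (ret!=0);

        return;
}

int main(int argc, char **argv)
{
        int i;

        if (argc==1)
        {
                /* No arguments: use the standard streams */
                cat(1,0);
        }
        else
        {
                /* Otherwise print each named file in order */
                for (i=1;i<argc;i++)
                {
                        int fd = syscall3(SYS_open,(long)argv[i],O_RDONLY,0);
                        cat(fd,0);
                        syscall1(SYS_close,fd);
                }
        }

//        syscall0(SYS_exit);
        return 0;
}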
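For comparison, here is a hedged sketch of how the copy loop could be tightened up. It is not the fix this post arrives at, just one way to address three things in the listing above: the loop writes a byte before checking whether the read returned anything, it never stops if a syscall fails (a failed read returns a negative value, which is not 0), and it moves one byte per syscall. The name cat_fd is hypothetical, and libc's syscall() wrapper stands in for the syscall3 glue so that the snippet compiles on its own. Under that stand-in, errors show up as -1 plus errno; with the raw glue they would come back as a negative errno value directly.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Copy everything from descriptor `in` to descriptor `out`.
 * Returns 0 once end of file is reached, -1 on any other error. */
static int cat_fd(int in, int out)
{
    char buf[4096];

    for (;;) {
        long n = syscall(SYS_read, in, buf, sizeof buf);
        if (n == 0)
            return 0;                   /* end of file: nothing left to write */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return -1;
        }

        /* write() may accept fewer bytes than requested, so loop until done */
        long off = 0;
        while (off < n) {
            long w = syscall(SYS_write, out, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
    }
}
```

With this shape, the no-argument case would be cat_fd(0, 1), reading descriptor 0 and writing descriptor 1, and a failed open could be reported instead of being passed along as a descriptor.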


Also, the ASM glue:
section .text
        global _start, __syscall, syscall0, syscall1, syscall2, syscall3, syscall4, syscall5, syscall_list
        extern main

        syscall_list:
                mov        rax,rdi                ;syscall number
                mov        rdi,[rsi]        ;arg1
                mov        rdx,[rsi+16]        ;arg3
                mov        r10,[rsi+24]        ;arg4
                mov        r8,[rsi+32]        ;arg5
                mov        r9,[rsi+40]        ;arg6
                mov        rsi,[rsi+8]        ;arg2
                syscall                        ;
                ret                        ;

        syscall0:
                mov        rax,rdi                ;syscall number
                syscall                        ;
                ret                        ;

        syscall1:
                mov        rax,rdi                ;syscall number
                mov        rdi,rsi                ;arg1
                syscall                        ;
                ret                        ;

        syscall2:
                mov        rax,rdi                ;syscall number
                mov        rdi,rsi                ;arg1
                mov        rsi,rdx                ;arg2
                syscall                        ;
                ret                        ;

        syscall3:
                mov        rax,rdi                ;syscall number
                mov        rdi,rsi                ;arg1
                mov        rsi,rdx                ;arg2
                mov        rdx,rcx                ;arg3
                syscall                        ;
                ret                        ;

        syscall4:
                mov        rax,rdi                ;syscall number
                mov        rdi,rsi                ;arg1
                mov        rsi,rdx                ;arg2
                mov        rdx,rcx                ;arg3
                mov        r10,r8                ;arg4
                syscall                        ;
                ret                        ;

        syscall5:
        __syscall:
                mov        rax,rdi                ;syscall number
                mov        rdi,rsi                ;arg1
                mov        rsi,rdx                ;arg2
                mov        rdx,rcx                ;arg3
                mov        r10,r8                ;arg4
                mov        r8,r9                ;arg5
                syscall                        ;
                ret                        ;

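        _start:
                xor        rbp,rbp                ;AMD64 ABI Requirement
                pop        rdi                ;Get argument count
                mov        rsi,rsp                ;Get argument array
                and        rsp,-16                ;AMD64 ABI: 16-byte align stack before call
                call        main                ;Execute C code

                                        ;End Program
                mov        rdi,rax                ;Exit status = return value of main
                mov         rax,60                ;SYS_exit
                syscall                        ;

                ret                        ;Never Reached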
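The register shuffling in the glue exists because the two conventions differ: a C function call on x86-64 Linux passes its arguments in rdi, rsi, rdx, rcx, r8 and r9, while the kernel expects the syscall number in rax and the arguments in rdi, rsi, rdx, r10, r8 and r9. As an illustration only, the same mapping for the three-argument case can be written with GCC-style inline assembly. The name raw_syscall3 is hypothetical, and the sketch assumes x86-64 Linux with GCC or Clang.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical inline-asm counterpart of the syscall3 glue (x86-64 Linux only). */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
    long ret;

    __asm__ volatile (
        "syscall"
        : "=a"(ret)                             /* rax: result, or -errno on failure */
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3)    /* rax, rdi, rsi, rdx */
        : "rcx", "r11", "memory"                /* syscall overwrites rcx and r11 */
    );
    return ret;
}
```

Two details are easy to miss: the syscall instruction clobbers rcx and r11, and a raw syscall reports failure as a negative errno value in rax rather than through an errno variable. For example, raw_syscall3(SYS_close, -1, 0, 0) returns -EBADF.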


On a side note, I did some more work on drawing to the framebuffer. I figured out that I can use SYS_ioctl to actually change some settings in the framebuffer. I cleaned up the code a bit, hopefully without hurting performance, and now I can draw colors to the screen. Eventually I am going to make it do something useful, though I'm not sure what yet.
Anyways, that's all for now.