April 22, 2014 - Implementing "cat" without libc
As usual, each new day of programming yields its own triumphs and
tribulations. My assembly language glue has evolved into something much nicer
than the crude thing it was a week or so ago, and it now does less work in most
cases. I commented the whole thing so it is easier to read and understand, and
I also caught a bug that caused programs to crash instead of exiting
normally. I am much happier with it. It also correctly passes
command line arguments to main() now. While implementing that, I
gained some valuable insight into the way things are supposed to work in Linux
and assembly. At least I think it's valuable.
I also decided to implement about the simplest Unix command line tool using C
without any libc functionality: cat. I decided to implement only what I
consider the core functionality of cat, leaving out extras like printing with
line numbers and whatnot.
Should be an easy task, right? Well, it turns out the answer is both yes and no.
Basically, the target functionality boiled down to four use cases:
- cat (reading stdin)
- cat text.txt (reading a file and printing the contents)
- cat 1.txt 2.txt (reading multiple files and printing the contents)
- echo "test" | cat (printing a text stream piped into the program)
This should be trivial, and with the C library it would be. Without it,
I am limited to raw system calls. The syscalls for file reading are simple:
SYS_open, SYS_read, SYS_close
Writing is just as easy: SYS_write
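For readers who have not seen one, here is a rough sketch of what a raw system call looks like on amd64 Linux when written as GCC inline assembly instead of a separate assembly file. This is not the glue used in this post; raw_syscall3 is a hypothetical name, and the libc fallback exists only so the sketch still compiles on other targets.

```c
#define _GNU_SOURCE       /* makes glibc declare syscall() for the fallback */
#include <sys/syscall.h>  /* SYS_getpid, SYS_read, ... */
#include <unistd.h>       /* only needed by the non-amd64 fallback */

/* Hypothetical helper (not the post's glue): issue a 3-argument system call. */
static long raw_syscall3(long number, long a1, long a2, long a3)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* amd64 Linux: syscall number in rax, arguments in rdi, rsi, rdx.
       The syscall instruction itself clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(number), "D"(a1), "S"(a2), "d"(a3)
                      : "rcx", "r11", "memory");
    return ret;                          /* a negative value is -errno */
#else
    return syscall(number, a1, a2, a3);  /* assumption: libc is available here */
#endif
}
```

A call such as raw_syscall3(SYS_read, fd, (long)buf, len) would then stand in for read(fd, buf, len).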
So I figured that I was pretty much done. I wrote a program that checks argc,
then reads from stdin if there are no arguments, or opens argv[1] if there are.
It then loops through the remaining arguments.
I fought with it for a little while, but finally ironed out my errors. The
resultant code looks nice and runs great. Then it came time to run through my
test cases:
test 1: ./cat - SUCCESS
test 2: ./cat cat.c - SUCCESS
test 3: ./cat cat.c asm/amd64_syscalls.s - SUCCESS
test 4: echo "TESTING" | ./cat - FAILURE
Needless to say, I was stunned. The program is invoked with no arguments, so it
should just read from stdin and write to stdout, and Linux should take
care of connecting the stdout of echo to the stdin of ./cat. That is NOT what
happens, so at least one of my assumptions is wrong, but I haven't yet been
able to figure out what is really going on. The behavior is very strange: when
I pipe a text stream into the program, there is no output, even if I type
something on the command line where stdin would normally come from. So somehow
the piped text stream is partially overriding the text from the command line,
but something is broken in there somewhere. Another mystery....
Anyways, here is my code at the moment:
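#include <fcntl.h>        /* O_RDONLY */
#include <sys/syscall.h>  /* SYS_read, SYS_write, SYS_open, SYS_close */

/* Implemented in the ASM glue. */
long syscall1(long number, long arg1);
long syscall3(long number, long arg1, long arg2, long arg3);

/* Copy bytes from descriptor `in` to descriptor `out`, one at a time. */
void cat(int in, int out)
{
    long ret;
    char byte;

    /* SYS_read returns 0 at end of file and a negative value on error,
       so only write when a byte was actually read. */
    while ((ret = syscall3(SYS_read, in, (long)&byte, 1)) > 0)
    {
        syscall3(SYS_write, out, (long)&byte, 1);
    }
}

int main(int argc, char **argv)
{
    int i;

    if (argc == 1)
    {
        cat(1, 0);
    }
    else
    {
        for (i = 1; i < argc; i++)
        {
            int fd = syscall3(SYS_open, (long)argv[i], O_RDONLY, 0);
            if (fd < 0)
                continue;   /* could not open this file, skip it */
            cat(fd, 0);
            syscall1(SYS_close, fd);
        }
    }
    // syscall0(SYS_exit);
    return 0;
}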
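A possible explanation for the failing pipe test, offered as a guess and not something verified against this build: on Linux, descriptor 0 is stdin and descriptor 1 is stdout, while the listing above calls cat(1,0) and cat(fd,0). In a terminal, descriptors 0, 1 and 2 usually all refer to the same tty opened for both reading and writing, so the swap goes unnoticed. In a pipeline, descriptor 0 is the read end of the pipe, so writes to it fail, and reads from descriptor 1 still come from the terminal. The sketch below uses libc's syscall() as a stand-in for syscall3 so that it is self-contained; cat_fd, FD_STDIN and FD_STDOUT are made-up names, and the 4096-byte buffer is just a convenient size.

```c
#define _GNU_SOURCE       /* makes glibc declare syscall() */
#include <sys/syscall.h>  /* SYS_read, SYS_write */
#include <unistd.h>       /* syscall(): stand-in for the syscall3 glue (assumption) */

enum { FD_STDIN = 0, FD_STDOUT = 1 };  /* Linux standard descriptor numbers */

/* Copy everything from `in` to `out`. Returns 0 on success, -1 on error. */
int cat_fd(int in, int out)
{
    char buf[4096];
    long n;

    while ((n = syscall(SYS_read, in, buf, sizeof buf)) > 0) {
        long off = 0;
        while (off < n) {            /* a write may accept fewer bytes than asked */
            long w = syscall(SYS_write, out, buf + off, n - off);
            if (w <= 0)
                return -1;           /* e.g. `out` is not open for writing */
            off += w;
        }
    }
    return n < 0 ? -1 : 0;           /* n == 0 means end of input */
}
```

With this shape, the no-argument case would be cat_fd(FD_STDIN, FD_STDOUT), and each file would be copied with cat_fd(fd, FD_STDOUT).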
Also, the ASM glue:
section .text
global _start, __syscall, syscall0, syscall1, syscall2, syscall3, syscall4, syscall5, syscall_list
extern main

;rdi = syscall number, rsi = pointer to an array of six arguments
syscall_list:
    mov rax,rdi         ;syscall number
    mov rdi,[rsi]       ;arg1
    mov rdx,[rsi+16]    ;arg3
    mov r10,[rsi+24]    ;arg4
    mov r8,[rsi+32]     ;arg5
    mov r9,[rsi+40]     ;arg6
    mov rsi,[rsi+8]     ;arg2 (last, because rsi holds the array pointer)
    syscall
    ret

syscall0:
    mov rax,rdi         ;syscall number
    syscall
    ret

syscall1:
    mov rax,rdi         ;syscall number
    mov rdi,rsi         ;arg1
    syscall
    ret

syscall2:
    mov rax,rdi         ;syscall number
    mov rdi,rsi         ;arg1
    mov rsi,rdx         ;arg2
    syscall
    ret

syscall3:
    mov rax,rdi         ;syscall number
    mov rdi,rsi         ;arg1
    mov rsi,rdx         ;arg2
    mov rdx,rcx         ;arg3
    syscall
    ret

syscall4:
    mov rax,rdi         ;syscall number
    mov rdi,rsi         ;arg1
    mov rsi,rdx         ;arg2
    mov rdx,rcx         ;arg3
    mov r10,r8          ;arg4
    syscall
    ret

syscall5:
__syscall:
    mov rax,rdi         ;syscall number
    mov rdi,rsi         ;arg1
    mov rsi,rdx         ;arg2
    mov rdx,rcx         ;arg3
    mov r10,r8          ;arg4
    mov r8,r9           ;arg5
    syscall
    ret

_start:
    xor rbp,rbp         ;AMD64 ABI Requirement
    pop rdi             ;Get argument count
    mov rsi,rsp         ;Get argument array
    and rsp,-16         ;Keep the stack 16-byte aligned for the call
    call main           ;Execute C code
    ;End Program
    mov rdi,rax         ;Exit status = return value of main
    mov rax,60          ;SYS_exit
    syscall
    ret                 ;Never Reached
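The two instructions in _start that fetch argc and argv rely on the initial process stack described by the System V AMD64 ABI: argc sits at the stack pointer, followed by the argv pointers, a NULL, the environment pointers, and another NULL. As a rough illustration in C, with struct start_args and parse_initial_stack being hypothetical names and pointers assumed to be the same size as long:

```c
#include <stddef.h>

/* Hypothetical view of what _start can recover from the initial stack. */
struct start_args {
    long   argc;
    char **argv;
    char **envp;
};

/* `sp` is the stack pointer value at process entry (assumption: LP64). */
struct start_args parse_initial_stack(long *sp)
{
    struct start_args s;
    s.argc = sp[0];                   /* what "pop rdi" reads */
    s.argv = (char **)(sp + 1);       /* what rsp points at after the pop */
    s.envp = s.argv + s.argc + 1;     /* skip argv[0..argc-1] and its NULL */
    return s;
}
```

The glue only passes argc and argv to main; envp is shown here because it comes for free from the same layout.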
On a side note, I did some more work on drawing to the framebuffer. I figured
out that I can use SYS_ioctl to actually change some settings in the framebuffer.
I cleaned up the code a bit, hopefully without hurting performance, and
now I can draw colors to the screen. Eventually I am going to make it do
something useful, though I am not sure what yet.
Anyways, that's all for now.