Relationship between a kernel and a user thread

Accepted answer
Score: 42

When they say map, they mean that each kernel thread is assigned to a certain number of user mode threads.

Kernel threads are used to provide privileged services to applications (such as system calls). They are also used by the kernel to keep track of everything that is running on the system, how much of each resource is allocated to which process, and to schedule them.

If your applications make heavy use of system calls, then the more user threads per kernel thread, the slower your applications will run. This is because the kernel thread will become a bottleneck, since all system calls will pass through it.

On the flip side though, if your programs rarely use system calls (or other kernel services), you can assign a large number of user threads to a kernel thread without much performance penalty, other than overhead.

You can increase the number of kernel threads, but this adds overhead to the kernel in general, so while individual threads will be more responsive with respect to system calls, the system as a whole will become slower.

That is why it is important to find a good balance between the number of kernel threads and the number of user threads per kernel thread.

Score: 19

User threads are managed in userspace - that means scheduling, switching, etc. are not done by the kernel.

Since, ultimately, the OS kernel is responsible for context switching between "execution units", your user threads must be associated with (i.e., "mapped" to) a kernel-schedulable object - a kernel thread†1.

So, given N user threads - you could use N kernel threads (a 1:1 map). That allows you to take advantage of the kernel's hardware multi-processing (running on multiple CPUs) and be a pretty simplistic library - basically just deferring most of the work to the kernel. It does, however, make your app portable between OS's as you're not directly calling the kernel thread functions. I believe that POSIX Threads (PThreads) is the preferred *nix implementation, and that it follows the 1:1 map (making it virtually equivalent to a kernel thread). That, however, is not guaranteed as it'd be implementation dependent (a main reason for using PThreads would be portability between kernels).

Or, you could use only 1 kernel thread. That'd allow you to run on non-multitasking OS's, or be completely in charge of scheduling. Windows' User Mode Scheduling is an example of this N:1 map.

Or, you could map to an arbitrary number of kernel threads - an N:M map. Windows has Fibers, which would allow you to map N fibers to M kernel threads and cooperatively schedule them. A threadpool could also be an example of this - N workitems for M threads.

†1: A process has at least 1 kernel thread, which is the actual execution unit. Also, a kernel thread must be contained in a process. OS's must schedule the thread to run - not the process.

Score: 19


Implementing Threads in User Space

There are two main ways to implement a threads package: in user space and in the kernel. The choice is moderately controversial, and a hybrid implementation is also possible. We will now describe these methods, along with their advantages and disadvantages.

The first method is to put the threads package entirely in user space. The kernel knows nothing about them. As far as the kernel is concerned, it is managing ordinary, single-threaded processes. The first, and most obvious, advantage is that a user-level threads package can be implemented on an operating system that does not support threads. All operating systems used to fall into this category, and even now some still do.

All of these implementations have the same general structure, which is illustrated in Fig. 2-8(a). The threads run on top of a run-time system, which is a collection of procedures that manage threads. We have seen four of these already: thread_create, thread_exit, thread_wait, and thread_yield, but usually there are more.

When threads are managed in user space, each process needs its own private thread table to keep track of the threads in that process. This table is analogous to the kernel's process table, except that it keeps track only of the per-thread properties such as each thread's program counter, stack pointer, registers, state, etc. The thread table is managed by the run-time system. When a thread is moved to ready state or blocked state, the information needed to restart it is stored in the thread table, exactly the same way as the kernel stores information about processes in the process table.

When a thread does something that may cause it to become blocked locally, for example, waiting for another thread in its process to complete some work, it calls a run-time system procedure. This procedure checks to see if the thread must be put into blocked state. If so, it stores the thread's registers (i.e., its own) in the thread table, looks in the table for a ready thread to run, and reloads the machine registers with the new thread's saved values. As soon as the stack pointer and program counter have been switched, the new thread comes to life again automatically. If the machine has an instruction to store all the registers and another one to load them all, the entire thread switch can be done in a handful of instructions. Doing thread switching like this is at least an order of magnitude faster than trapping to the kernel and is a strong argument in favor of user-level threads packages.

However, there is one key difference with processes. When a thread is finished running for the moment, for example, when it calls thread_yield, the code of thread_yield can save the thread's information in the thread table itself. Furthermore, it can then call the thread scheduler to pick another thread to run. The procedure that saves the thread's state and the scheduler are just local procedures, so invoking them is much more efficient than making a kernel call. Among other issues, no trap is needed, no context switch is needed, the memory cache need not be flushed, and so on. This makes thread scheduling very fast.

User-level threads also have other advantages. They allow each process to have its own customized scheduling algorithm. For some applications, for example, those with a garbage collector thread, not having to worry about a thread being stopped at an inconvenient moment is a plus. They also scale better, since kernel threads invariably require some table space and stack space in the kernel, which can be a problem if there are a very large number of threads.

Despite their better performance, user-level threads packages have some major problems. First among these is the problem of how blocking system calls are implemented. Suppose that a thread reads from the keyboard before any keys have been hit. Letting the thread actually make the system call is unacceptable, since this will stop all the threads. One of the main goals of having threads in the first place was to allow each one to use blocking calls, but to prevent one blocked thread from affecting the others. With blocking system calls, it is hard to see how this goal can be achieved readily.

The system calls could all be changed to be nonblocking (e.g., a read on the keyboard would just return 0 bytes if no characters were already buffered), but requiring changes to the operating system is unattractive. Besides, one of the arguments for user-level threads was precisely that they could run with existing operating systems. In addition, changing the semantics of read will require changes to many user programs.

Another alternative is possible in the event that it is possible to tell in advance if a call will block. In some versions of UNIX, a system call, select, exists, which allows the caller to tell whether a prospective read will block. When this call is present, the library procedure read can be replaced with a new one that first does a select call and then only does the read call if it is safe (i.e., will not block). If the read call will block, the call is not made. Instead, another thread is run. The next time the run-time system gets control, it can check again to see if the read is now safe. This approach requires rewriting parts of the system call library, is inefficient and inelegant, but there is little choice. The code placed around the system call to do the checking is called a jacket or wrapper.

Somewhat analogous to the problem of blocking system calls is the problem of page faults. We will study these in Chap. 4. For the moment, it is sufficient to say that computers can be set up in such a way that not all of the program is in main memory at once. If the program calls or jumps to an instruction that is not in memory, a page fault occurs and the operating system will go and get the missing instruction (and its neighbors) from disk. The process is blocked while the necessary instruction is being located and read in. If a thread causes a page fault, the kernel, not even knowing about the existence of threads, naturally blocks the entire process until the disk I/O is complete, even though other threads might be runnable.

Another problem with user-level thread packages is that if a thread starts running, no other thread in that process will ever run unless the first thread voluntarily gives up the CPU. Within a single process, there are no clock interrupts, making it impossible to schedule threads in round-robin fashion (taking turns). Unless a thread enters the run-time system of its own free will, the scheduler will never get a chance.

One possible solution to the problem of threads running forever is to have the run-time system request a clock signal (interrupt) once a second to give it control, but this, too, is crude and messy to program. Periodic clock interrupts at a higher frequency are not always possible, and even if they are, the total overhead may be substantial. Furthermore, a thread might also need a clock interrupt, interfering with the run-time system's use of the clock.

Another, and probably the most devastating argument against user-level threads, is that programmers generally want threads precisely in applications where the threads block often, as, for example, in a multithreaded Web server. These threads are constantly making system calls. Once a trap has occurred to the kernel to carry out the system call, it is hardly any more work for the kernel to switch threads if the old one has blocked, and having the kernel do this eliminates the need for constantly making select system calls that check to see if read system calls are safe. For applications that are essentially entirely CPU bound and rarely block, what is the point of having threads at all? No one would seriously propose computing the first n prime numbers or playing chess using threads because there is nothing to be gained by doing it that way.

Score: 0
  • This is a question about thread library implementation.
  • In Linux, a thread (or task) can be in user space or in kernel space. A process enters kernel space when it asks the kernel to do something via a system call (read, write, or ioctl).
  • There are also so-called kernel threads that always run in kernel space and do not represent any user process.


Score: 0

According to Wikipedia and Oracle, user-level threads are actually a layer mounted on top of kernel threads; it is not that kernel threads execute alongside user-level threads but that, generally speaking, the only entities actually executed by the processor/OS are kernel threads.

For example, assume that we have a program with 2 user-level threads, both mapped to (i.e. assigned to) the same kernel thread. Sometimes the kernel thread runs the first user-level thread (and it is said that currently this kernel thread is mapped to the first user-level thread) and at other times it runs the second user-level thread. So we say that we have two user-level threads mapped to the same kernel thread.

As a clarification:

The core of an OS is called its kernel, so the threads at the kernel level (i.e. the threads that the kernel knows of and manages) are called kernel threads, the calls to the OS core for services can be called kernel calls, and so on. The only definite relation between these "kernel" things is that they are all strongly related to the OS core, nothing more.
