Operating System Essays
A collection of short essays on various OS-related concepts.
This is a collection of short essays on concepts associated with operating systems. They're primarily introductory in nature, presuming only a reasonable familiarity with some computer terminology. I wrote them at the conclusion of a course at NMSU in... 1998, if I remember correctly. They're dry, but potentially of value to someone who's in need of a basic discussion on some core OS topics, or to someone who just likes reading about semaphores.
Topics:
- Memory management. »
- Deadlock. »
- CPU scheduling. »
- Some hardware elements that aid the implementation of an OS. »
- Two areas where a cache might be found. »
- The Banker's Algorithm. »
- Concurrency. »
- Processes and state changes. »
- File Systems. »
- Process synchronization. »
- Threads and thread management. »
« Memory management
One of the major resources in a computer system is the memory, and like any resource, it must be managed carefully if the system is to operate efficiently. Management involves the allocation, logical division, utilization, etc. of physical memory. Physical memory consists of several different types, each of which exists at some level of the memory hierarchy. This hierarchy usually consists of (but is not limited to) cache, RAM, and nonvolatile fixed disk. Inherent in the treatment of this physical memory, however, is usually the concept of virtual memory. Memory, abstracted, is just a blank slate. It is empty space. Three things determine how the memory, or empty space, is managed: the system hardware, the operating system, and the applications that run on the system. Of these three, it is the role of the applications that has shrunk significantly over time - almost disappearing entirely. On modern systems, applications can (and are forced to be) overtly ignorant with respect to how they handle memory. The major impetus behind the waning involvement of applications is the idea of virtual memory.
A Physical address references a specific place in the memory that is actually present on the system. There's only so much of it. Each address usually references a byte of memory. It is the physical address that must ultimately be used to read/write from memory. However, most programs have no way of knowing exactly where in physical memory they will reside at runtime, so correctly referencing things becomes a problem. In a mono-programming system, this is easily solved using a base register. The evolution of fully multiprogramming systems has severely complicated the issue. The solution is virtual memory, which was initially devised for mono-programming systems when programs got too darn big. Virtual memory essentially relieves the process from most of the memory management. A virtual address is used by the process. Every process believes that it lives in memory beginning at 0x00000000 and has access to everything up to 0xFFFFFFFF (for a 32-bit address system). Of course, this is impossible, so it is up to the operating system to make it appear to be so. The MMU of the system (a logical hardware unit) handles the overhead for converting between virtual addresses and physical addresses, making it possible for processes to be completely unaware that they are contending for physical memory.
To facilitate the use of virtual memory, most systems have incorporated the idea of paging. The virtual address space is logically divided up into pages - contiguous chunks of equal size. The physical memory is logically divided up into page frames, each equal in size to a virtual page. A high order portion of any virtual address is used as a virtual page number, the rest for an offset. A page table is used to translate page numbers into frame numbers. Processes use some number of pages, any number of which may be in or out of physical memory at any given time during the life of the process. The page table must contain an entry for every virtual page, and so quite obviously the table can become extremely large (with pages being only kilobytes in size). To avoid using the majority of physical memory simply to hold the page table, many systems employ multiple level paging schemes. This involves simply adding indirection so that the first page table entry references another page table and so on until the page frame is finally referenced. There is, of course, a trade-off between performance and the number of levels of paging. Some hardware was also devised that alleviated the performance decrease from large page tables; this was the TLB, Translation Look-aside Buffer. Essentially, it was a small cache that held frequently used page table entries so that a page frame number could be obtained in a single memory access.
Another memory management scheme is called segmentation. The primary feature of segmentation is that it uses multiple virtual address spaces, each dynamically sized and defined as a logical entity. This idea of multiple, independent, resize-able address spaces has a number of benefits, not least of which is the fact that it reduces the amount of wasted memory. Because pages are fixed in size, and must be allocated as a unit, there is an average of 1/2 page wasted for every process. This adds up. Segmentation allows for memory to be allocated with greater granularity. Typically, each segment holds a single data structure, which allows for something else that paging didn't: different types of protection for different data structures. Also data structures (segments) can very easily be shared between users or processes, because each segment has its own address space. This makes things like shared libraries possible, greatly improving performance and memory utilization. However, in order for segmentation to work, the programmer must be aware of its presence, and make appropriate use of it in the program. Paging is transparent to the programmer. Also, after a system has been running for awhile, the memory can contain a large number of holes, many of which might be too small to use. Thus, a process of compaction should occasionally be used to push the segments together and consolidating the free memory. This compaction can be very slow, however. Typically, programs have a greater need for large segments (<1Meg) than they do for more segments, and so it becomes desirable to have the ability to keep only part of a segment in memory at a time. Paging is an obvious solution to this, and so some systems utilize a combined management scheme that uses both segmentation and paging. The advantages and disadvantages of the two schemes then overlap.
There is no perfect solution to the problem of optimally managing memory space. This is because no two programs behave precisely the same. Designs and schemes that benefit one type of program might hinder another, although usually compromises can be made to take advantage of the behavioral similarities that do occur. Also, a major consideration in memory management is the accommodation of multiprogramming, multi-user systems. Protection and relocation issues might perhaps prevent the use of more efficient methods. At any rate, as in most system concepts, there is a constantly evolving relationship that must exist between the hardware, the operating system, and the programs that must survive with both.
« Deadlock
Deadlock occurs when each process in a set is waiting for an event that can only be caused by some other process in that set. Basically, deadlock is an operating system issue that reared its head with the advent of multiprogramming environments. First and foremost, a deadlock cannot occur unless there are at least two processes trying to run concurrently. There are actually four distinct circumstances that must be met for deadlock to occur, which will be discussed shortly, but it is worth noting that it is fundamentally a multiprocessing problem. Mono-processing systems do not have to worry about deadlock. The reason is that deadlock involves resource allocation, and if there is only one process, it has uncontested access to all resources. Only certain types of resources are associated with deadlock, and they are of the exclusive-use, non-preemptible type. That is to say, only one process can use the resource at any given time, and once allocated the resource cannot be unallocated by the operating system, but rather the process has control over the resource until it completes its task. Excellent examples of such resources are printers, plotters, tape drives, etc.. Resources that do not fit the criteria are memory and the CPU. While it is convenient to discuss deadlock in terms of hardware resources, there are software resources that are equally good candidates for deadlock, such as records in a data base system, slots in a process table, or spooling space. Hardware or software, all that matters is that the resources are non-preemptible and serially reusable.
The four conditions that must exist for deadlock are as follows: The first two are described above - mutually exclusive use of resources by the processes and non-preemption (resources cannot be removed from the processes). The third condition is called the 'hold and wait' or 'wait for' condition. This means that processes can maintain possession of a resource and simultaneously request or wait for other resources. The last is the circular wait condition, which just states that there must be a chain of processes, each of which is waiting for a resource that is held by the next process in the chain. Since four separate, independent conditions must be met, one might be inclined to simply feel that deadlock is just to unlikely to worry about. Consider, however, that the 'wait for' condition is virtually always met, since almost no system forces a process to wait for every resource it will ever need before starting. And consider also, that every time the right type of resource is used, two more conditions are immediately met (mutual exclusion and non-preemption). Essentially then, the likelihood of deadlock is equal to the likelihood of a circular wait. Is this an acceptable risk? There are four practical strategies for deadlock (well, mostly practical).
The first strategy is to just ignore the possibility of deadlock. This might seem absurd, until it becomes apparent there can be a major performance or functionality decrease associated with a more rigorous solution. Thus, if it can be shown that the likelihood of a deadlock as well as its cost of recovery are not sufficient to outweigh the disadvantages of solving the problem, then it does, in fact, become an acceptable risk. Consider, for example, if a deadlock will only occur on average once every twenty years, and it will not cause system damage or permanent data loss; while the system may crash for many other reasons on average once very other month. Is it worthwhile to hinder the performance or convenience of the system to solve the deadlock problem? Probably not.
Another approach is to attempt to prevent deadlocks from occurring, which is to say that the system is pre-conditioned to remove any possibility of deadlock. The theory is that it is worthwhile to pay the price to ensure that deadlock will never occur. And if it can't occur, you don't have to worry about it. But how can it be guaranteed that deadlock will not occur? Simple, just prevent any of the four conditions from occurring. Preventing the mutual exclusion condition is generally not a good idea. To be sure, many resources could be effectively spooled, which would eliminated mutual exclusion, but spooling space itself then becomes another potential deadlock resource, not to mention the fact that certain resources simply can't be spooled. Prevention of the hold and wait condition might lead to greater success. This can be accomplished by forcing processes to request and be granted every needed resource before starting. Unfortunately, many processes don't know which resources will be needed until they need them. Also, quite obviously, having a process hold all necessary resources until it completes leads to very inefficient resource utilization. Alternatively, the OS could require that a process relinquish all resources in its possession before requesting another. If the resource is available, the process is granted it in addition to the ones it just gave up. Preventing the Non-preemption condition is also possible, although probably not recommended. Consider taking the printer away from a process that is only partially done using it. The last condition that can be avoided is that of circular wait. This can be avoided by numerically ordering the system resources and forcing processes to request resources in numerical order. This can work fairly well, except for two drawbacks. First, with many resources and many processes, it may be impossible to find an order that will allow all processes to finish. Also, if a new resource ever becomes available, the system may have restart to include it in the list.
A different approach to deadlocks is to allow them to happen, and then to detect and recover from them. This strategy avoids all the drawbacks of the preventative measures described above, but introduces others. One way to detect deadlocks is to have the OS maintain a resource graph that keeps track of requests and releases. If the OS detects a cycle, it simply kills one of the processes in the cycle. Or, even simpler, the OS can simply check every now and then to see if a process has waited an inordinate amount of time for a resource. If so, the process is killed. Despite the fact that this type of strategy is often used in large mainframes and batch systems, it has some major problems. Simply killing processes, while fun, is not usually desirable, especially if the process is yours. Also, to maintain system integrity, when a process is killed, all files modified by that process must be restored to their original state. In order to do this there is a tremendous amount of overhead and caching created.
The last strategy is not to physically prevent deadlocks, or allow them to happen, but to do one's best to avoid them through carefully monitoring resource requests. Monitoring means employing some type of tested algorithm that can predict deadlock based on the current state of things. Presumably, the algorithm then avoids deadlock by only granting resource requests when it is prudent to do so. One well known algorithm for deadlock avoidance is the Banker's algorithm. No algorithm for deadlock avoidance is perfect, however, so considerations about their usage still exist.
« CPU scheduling
The responsibility of scheduling processes is a major function of the operating system. It seems as though in early computers, the operating system's major function was to simply get the machine up and running, at which point it could essentially step aside and let a process take over. That was quite awhile ago, when operating systems were small and processes shouldered most of the responsibility for keeping themselves running. As computers became more and more complex and the idea of multitasking was realized, the idea of allowing processes to decide for themselves how to do things was quickly abandoned and it became necessary for the operating system to mature. Maturing meant taking more control, not least of which was the control of scheduling. CPU cycles have always been a precious commodity, and it fell upon the OS to determine who gets them, how many they get, and when they get them.
Despite the fact that early mono-programming systems only ran one process at a time, they still employed a scheduling method. The scheduling of the early batch systems is called 'run to completion' or non-preemptive scheduling. A process is initiated, and once it has control of the CPU, it retains control until it is finished, completely. It is not interrupted, or switched out, or blocked or postponed or any such thing. This scheduling method can be still be used on systems that have more than one process in existence. Basically, though, all but one of those processes do nothing but wait... and wait and wait until it is their turn. This is actually called FIFO scheduling - First In First Out. Processes are introduced to the system in a certain order, wait in that order, and eventually execute and finish in that same order, no matter what. The inadequacies of this method should be obvious. To consider how in adequate, though, it is necessary to review the goals of a good scheduler. It should be fair, ensuring that each process gets a fair share of CPU time. It should keep the CPU and other resources busy as much as possible. It should minimize response time and turnaround time, maximize throughput, be predictable and avoid indefinite postponement. With these things in mind, it can be seen that FIFO fails on several counts. It can be very inefficient, not just with the CPU, but with all resources. Consider a process that must wait frequently for I/O transactions. Under a FIFO scheme, that process will remain the only active process until it completes, and so the CPU goes unused while the process waits for I/O. And so while it might seem fair to allow processes to run based on when they arrive; it isn't. It is important to note, however, that FIFO is the method that will truly maximize throughput. A process will never run any faster than when it has uncontested access to all the resources, as it does under a FIFO scheduler. Thus, the combined time for some group of processes will be the smallest when they're executed serially. Doing so, as stated above, can greatly under utilize system resources, which introduces a trade-off.
The next logical evolution is preemptive scheduling. FIFO can quickly become preemptive by adopting a quantum. The quantum is a specified amount of time that each process is allowed to have the CPU. Once the quantum is up, the process is switched out for another. If the scheduler simply switches in the next process in line, while putting the current process at the end of the line, it is using round robin scheduling. The only interesting thing about round robin scheduling is the size of the quantum. If it is too small, thrashing will occur. If it is too big, the method begins to approach FIFO, with all its disadvantages.
Scheduling really becomes interesting when priorities are assigned to processes. When the quantum is used up (or a process blocks itself for I/O or something), the scheduler employs some type of decision making strategy based on the priorities of the processes waiting in the queue to pick the next active process. The first thing to consider in priority scheduling is how to assign priorities. There are all sorts of theories. Some factors include the I/O and CPU boundedness of a process, whether the process is interactive, frequency of the process's page faults, cumulative CPU time used by the process, etc.. All of these factors are used in conjunction with the scheduling algorithm in an attempt to optimize system performance and meet the goals of a good scheduler described previously.
A nice marriage between round robin and priority scheduling exists in multilevel feedback queues. Several levels of round robin queues exist, ranked in priority. A new process is placed in the highest priority queue. If it uses its entire quantum, it is preempted and placed at the back of the next level (lower priority) queue. If it doesn't use the quantum, but rather is blocked or gives up the CPU, it is placed at the end of the highest (entry) queue. Thus, if a process continues to use its quantum, it falls down the levels until it hits the bottom level, where it will stay until finally completing. A process in any given level cannot run unless all higher priority queues are empty. This method is nice because it rewards well behaved processes to an extent. A well behaved process is one that is I/O bound. If I/O bound processes are given higher priorities, the I/O utilization of the system will be very good. Well behaved processes are also those that adhere to their own estimates of runtime (if the scheduler is using that sort of information).
Since all processes are unique and unpredictable, there is no perfect scheduler. Any scheduler that favors some kind of processes will have to be biased against others. For this reason there are many scheduling algorithms. There's priority, shortest job first, shortest remaining time, lottery, etc..
« Some hardware elements that aid the implementation of an OS
For modern computers, one of the most important hardware features that aid in the efficient implementation of operating systems is the interrupt. Without interrupts, a multitasking operating system would have an extremely difficult time doing its job. The reason is that one of the OS's fundamental functions is to manage the switching of processes into and out of the active state. The operating system is responsible for the magic of making pseudo-concurrency feel like concurrency to all the processes. It does this by performing context switching on a regular basis, so that each process gets its fair share of the CPU. The problem with this, however, is that when a process is active, the operating system must literally relinquish control of the system (although not entirely). Once a process has control of the CPU, what is to prevent it from being autocratic and ill-mannered about the situation and never relinquishing control? The answer is interrupts. Interrupts allow the operating system to get back in the saddle and ensure that no one monopolizes the system (except maybe the OS). A hardware interrupt (often the clock interrupt) is designed to occur at regular intervals. The interrupt handler is a portion of the operating system which politely shoves the current process off the CPU and decides what to shove in its place. Thus, it doesn't matter how ill-behaved a process tries to be, the OS is guaranteed to get control again after a short period. Very convenient. Without interrupts, the operating system would have to rely on processes blocking themselves and calling some routine to transfer the CPU back to the OS. While theoretically this would work just as well as the interrupt method, it presumes that all processes, and hence all their programmers, are both smart enough and nice enough to do the right thing. History has proven that if people are predictable of anything, it sure isn't being nice (or even smart). Ergo, the software alternative to interrupts has remained, thankfully, a purely academic concept.
The Memory Management Unit is another extremely important hardware feature to modern operating systems. The reason is that people decided virtual memory was a pretty darned good idea. The problem with virtual memory is that it's virtual. Sometimes it isn't there. For that reason, the computer must function using physical memory, with all its horrible disadvantages. Of course, if everyone adopted Seymour Cray's attitude ("If you want a gig of memory, buy a gig of memory"), we wouldn't need an MMU. Well, that's not true, actually, we'd still need an MMU, because operating systems would still attempt to keep several processes resident in memory at the same time. Basically, and MMU does nothing more than translate virtual addresses into physical addresses. And it does it fast. This allows processes to be completely ignorant of where they exist physically and just skip along believing they're the center of the universe (not unlike Europe up until the seventeenth century). The existence of the MMU means that the operating system doesn't have to bother itself with this mundane task of address conversion. It only has to intervene in the case of a page fault. If the MMU didn't exist, and we still wanted things to function in relatively the same way, the operating system would have to intercept every single memory reference, spend a few instructions mucking around in the page table, and then send the newly created physical address onto the memory bus. Needless to say, this would be numbingly slow, probably employing the use of a trap interrupt or something.
Speaking of paging, the Translation Look-aside Buffer is a third very valuable hardware element. Most systems use some type of paging scheme. This invariably involves using some type of page table, as well. A problem arises, unfortunately, when the virtual memory space becomes large (which it usually does - who wants an 8MB virtual memory). Pages tend to be a few kilobytes in size (~4KB), any larger and there's just too much wasted space. Thus, a huge virtual memory space results in a huge number of pages. There has to be an entry in the page table for every page (unless its inverted), and so the page table becomes huge also. To correct this, the concept of multiple level paging was employed. Unfortunately, while this alleviates the need for huge page tables to be resident in memory, it introduces the problem of several memory accesses per memory access, if that makes any sense. For every level of indirection in the paging scheme, the system must fetch a page table entry in order to eventually arrive at the physical frame number for the virtual reference. The associated performance hit is unacceptable. To solve the problem, a small hardware cache was introduced which would hold a useful number of page/frame number pairs. This is the TLB. Basically, when an address translation is needed, the system goes ahead and starts it the hard way. Immediately, however, it also 'looks aside' and checks the contents of the TLB to see if the translation was performed some time in the near past and is resident in the buffer. If it is, then the paging search is aborted and the frame number is grabbed post-haste from the TLB. The TLB is a performance life saver.
Finally, another hardware helper for operating systems also deals with paging. If a system is designed so that the Least Frequently Used method will be used for page replacement, then certain hardware support can be included. This support can take the form of either a large (64 bit) hardware counter which is incremented after every instruction. Each PTE includes a copy of the counter, which is updated on any reference to the page. When a page fault occurs, the OS looks for the PTE with the lowest counter value, presumes it references the least recently used page, and kicks that page out of memory. This is a relatively cheap and effective way to support LRU in hardware. Alternatively, a system could use an n x n matrix (for n pages). Initially filled with zeros, a page reference would cause the hardware to set all the bits in the appropriate row and clear all the bits in the appropriate column. Thus, the row whose value is the lowest represents the LRU page at any given time. Without any support of these types, the OS would have to simulate LRU itself. This can be done using a Not Frequently Used algorithm that involves a software counter for every page (initially zero). At each clock interrupt, every page currently in memory is scanned and its reference bit is added to the leftmost bit of the page's counter (after the counter is right- shifted one bit). This is called aging. One a page fault, the page with the lowest counter is replaced. This scheme works reasonably well, but it is slow compared to the hardware alternatives. This is typically the case with hardware support. The reason it's there is because it's faster. If it wasn't any faster, the software solution would probably always be used because software is flexible. Once designed, hardware is hardware. For example, a system could probably provide hardware support for whatever page replacement algorithm was chosen (as with the case of LRU above). However, that would force the operating system to use that algorithm. That's the trade-off: hardware is fast, software is flexible.
« Two areas where a cache might be found
The problem with computers is that when it comes to data storage, large means slow. When it comes to computers, slow means bad. To get around this problem, designers created the cache. Basically, any cache's function is to keep important or frequently used information as easily (quickly) accessible as possible. Typically this is done via a small amount of very fast storage. Very fast is, of course, a relative term, as a cache need only be significantly faster than the origin if its data in order to be effective. Actually, the entire memory hierarchy can be viewed as simply linked caches, starting at the registers in the CPU and moving outward to the static storage devices. The idea is to quarantine data accesses within the lowest level of the hierarchy as possible (i.e. closest to the CPU). Close means fast. Caches are immensely effective, but really only because of certain aspects of predictability in the profile of data accessing. If random data access was the norm across levels of the hierarchy, a cache might still be used, but its effectiveness would be severely limited. Caches work because designers can use access patterns to be intelligent about how data is stored and removed in the cache.
Naturally, one of the more well known areas of caching is between the CPU and main memory. Most machines now employ more than one level of cache in this area, but for the discussion, we'll assume just one. The size of this cache is usually around 512KB, but as processors get faster and faster, the penalty for a main memory access gets higher and higher in terms of CPU cycles, and so the cache size continues to increase. This cache works simply by mirroring data that is stored in main memory. Dissertations and books have been extrapolated from the previous sentence alone, however, so it really isn't so simple. It is probably prudent to discuss the concept of locality before going further. As stated above, a cache is effective because of patterns in data access. The two fundamental patterns are temporal locality and spatial locality. Respectively, these mean that if data is accessed once, it will likely be accessed again (temporal) and that data located near to the accessed data will likely be accessed in the near future (spatial). Because of these types of localities, a well designed cache typically does two things - stores data in blocks, and tries to keep recently accessed data around as long as possible. Caches can be direct mapped, set associative, or fully associative. Essentially, the difference between the three types is whether a block of data can be placed in the cache at only one location, a few locations, or any location. The disadvantages and advantages as well as the implementation details of the three types of cache will not be discussed here. Let's get back to the traditional cache located between CPU and main memory. On each memory access, the cache controller checks to see if the data for the address in question is currently in the cache. If it is, and it's valid, then there's a hit, and the cache puts the data on the data bus. For the purpose of facilitating replacement, the data in the cache is marked as 'referenced' via a bit in the tag. If the data is not present, that's a miss, and the data must be fetched from the next level of the hierarchy (main memory in this case). Since the system has to go to main memory anyway, it takes the opportunity to initiated the cache replacement process, in the ongoing effort to keep data close. Typically, the replacement process involves deciding what to evict from the cache (keeping temporal locality in mind) and then bringing in the appropriate block of data as part of the main memory access. If the data access is a write, the cache behaves differently, depending on whether it is designed as write-back or write-through. A write-back cache will take the modified data from the CPU and keep it all to itself, marking the cache line as dirty. Then if that cache line needs to be replaced at some point, the modified data is copied back to main memory instead of just being overwritten. A write-through cache stores the modified data, but also sends it along its way to main memory as well, effectively passing it through the cache. The miss penalty for a write-back cache is higher than for write-through, but write-back can allow for higher performance if a high hit rate is maintained. The different write policies bring up the topic of synchronization. Obviously, it is very important to maintain integrity for any data that is cached. Caches are usually transparent to software, and so it is necessary for a process to not have to worry about precisely where the correct copy of its data resides. It is the responsibility of the cache controller (and the memory system) to make sure that data is always protected appropriately. This is done using elements like a valid bit, to tell whether a cache line has the correct data for a given address. There's also a dirty bit, to tell whether a cache line happens to be the only place where the correct data exists. In this way, the system can have data/address pairs stored in several places in the hierarchy without worrying about data integrity.
There are ways to improve cache performance. This involves one of two things; either decrease its average access time or increase its hit percentage. Usually, the easiest way is to make it bigger. If a cache can maintain its access times and increase capacity, it will decrease the miss percentage and perform better. Unfortunately, this also tends to carry an increased price tag for the cache. Another way to improve performance is to have a better replacement algorithm. The better job a cache does of anticipating accesses, the lower its miss count will be. Unfortunately, smarter caches usually require more real estate, and the miss penalty tends to increase because there's more logic involved in the replacement process.
The Translation Look-aside Buffer is another cache, and a very interesting one. Most systems use some type of paging scheme. This invariably involves using some type of page table, as well. A problem arises, unfortunately, when the virtual memory space becomes large (which it usually does - who wants an 8MB virtual memory). Also, pages tend to be a few kilobytes in size (~4KB), any larger and there's just too much wasted space. Thus, a huge virtual memory space results in a huge number of pages. There has to be an entry in the page table for every page (unless its inverted), and so the page table becomes huge also. To correct this, the concept of multiple level paging was employed. Unfortunately, while this alleviates the need for huge page tables to be resident in memory, it introduces the problem of several memory accesses per memory access, if that makes any sense. For every level of indirection in the paging scheme, the system must fetch a page table entry in order to eventually arrive at the physical frame number for the virtual reference. The associated performance hit is unacceptable. To solve the problem, a small hardware cache was introduced which would hold a useful number of page/frame number pairs. This is the TLB. Basically, when an address translation is needed, which is on every memory access, the system goes ahead and starts it the hard way. Immediately, however, it also 'looks aside' and checks the contents of the TLB to see if the translation was performed some time in the near past and is cached. If it is, then the paging search is aborted and the frame number is grabbed immediately from the TLB. The TLB is a performance life saver. Instead of caching regular data (as do Level 1 and Level 2 caches described above), the TLB caches address translations. As stated above, address translations are needed almost constantly, making them a very hot commodity. Since most TLB's maintain a hit percentage of 99-99.99% (which is extremely high), it means that the system almost never has to wait for the agonizingly slow process of multiple page table accesses. Without a TLB, most virtual memory systems could not operate at useful speeds. The TLB must be kept synchronized with the page tables. This can be done with either hardware or software. The only time that synchronization is an issue is on a full page fault, when a virtual page currently in memory must be swapped out for another, which means that the page to frame translation is no longer valid. Usually a page fault is handled by the OS, which takes the new page/frame pair and either puts it in the TLB itself or passes it off to a controller. If a miss in the TLB isn't part of a page fault, but is just a capacity miss, the address translation is performed the slow way (still tremendously faster than a page fault) and is then stored in the TLB for future hits. Some TLB's are fully associative, since a small size means a fully parallel search doesn't require much extra time or hardware. Other TLB's are large, and are either direct mapped or have a small set associativity. The performance of a TLB can be improved in similar ways to a regular cache. Its capacity could be increased. If it is too large, however, it is no longer practical for it to be fully associative, so either another scheme is adopted or it is not further increased in size. To improve performance, many designers opt for a random replacement scheme for the TLB, since implementing LRU in hardware is expensive, and TLB misses occur much too frequent to use the software algorithms that are used in page replacement.
« The Banker's Algorithm
The Banker's Algorithm is used for resource management and its solution to deadlock is to try to avoid it. It does this in a fairly simple manner by trying to predict the circular wait condition. The algorithm was proposed by Dijkstra and is named for the fact that it resembles how a banker manages multiple loan requests using insufficient capital. For explanatory purposes, we'll deal with a single type of resource, although the algorithm can be generalized to deal with multiple types. In order to function using the algorithm, the system must request a few things from potential processes. First, each process must state prior to execution the maximum number of the resource in question that it will need. For lack of originality, let's deal with tape drives. Thus, each process tells the system how many tape drives it will need simultaneously, at some time during its life, in order to complete its task. Second, each process promises that if its maximum need is met, it will use the resources and return them in a finite time. In return for this good behavior, the system promises that no process will have to wait indefinitely for resources. Another important note is that a process doesn't have to have possession of its maximum number in order to start - it can obtain and release tape drives on an individual basis. Obviously, though, the system cannot start a process whose maximum need is greater than the number of tape drives on the system.
Having stated the ground rules, let's review how it works. On any given request for a tape drive, the system analyzes the state of things to see if granting the request would put the current set of processes in a safe or unsafe state. It is important to understand the definition of safe and unsafe. The system is in a safe state if there exists a way for the system to grant available tape drives such that each process will eventually get its maximum and finish. For example, consider two processes and five tape drives. Each process has informed the system that their maximum need is four tape drives. So far so good. Initially, process A requests and is granted two tape drives. Next, process B requests and is granted one tape drive. Now three tape drives are allocated, two are available, and the system is still in a safe state. To determine that things are still safe, the system checks the process that is closest to its stated maximum (process A). There are sufficient available drives to satisfy the maximum need for that process, so the system essentially pretends that those drives are granted, process A finishes, and the four drives are released by process A. The system checks the next process closest to its maximum (now process B) and sees that the four drives that would be available are enough to satisfy process B's maximum need, so it too could finish. If there were more processes, the system would continue until either it encountered a situation where there would not be enough virtually available drives to satisfy the next process, or the last process could finish.
Consider, though, if when process A had two drives and process B had one, that process B made the next request for a tape drive. Should this request be granted? The system would analyze the situation and recognize that granting the request would put the set of processes in an unsafe state. It is unsafe because both processes have two drives and only one is available, so a tape drive must be released before either process can finish. There is no guarantee that a process will release a tape drive before requesting another one. Deadlock can result from an unsafe state. It is not a forgone conclusion, however, as it is certainly possible that a process will in fact release some of its resources before another request is made. This would put the system back into a safe state. The Banker's algorithm states that a request will always be granted if doing so would maintain a safe state. If granting the request will put the system in an unsafe state, the request may or may not be granted. The system prefers to postpone such requests in the hopes that in the near future, resources will become available and the postponed request no longer threatens an unsafe state. However, the system has promised a finite wait time for all process requests. As a result, it may be forced to grant the request and make the system unsafe. Again, it is important to note that deadlock may not necessarily occur. It is common to make transitions from both safe to unsafe states and vice versa.
There are some distinct advantages and disadvantages to the Banker's algorithm. It avoids the functionality restrictions associated with systems that are designed to prevent deadlock. Processes can request resources in any order, on an individual basis, and are never forced to relinquish them. Thus the algorithm is very lenient with regard to process execution. Also, as long as the system stays in a safe state, all processes are guaranteed to finish. Unfortunately, the disadvantages are quite severe. For one, the algorithm requires that processes know, and never attempt to exceed, their maximum resource needs. This is unrealistic. Another harsh drawback is that the algorithm needs to work with both a fixed number of resources and a fixed number of processes. If a resource, such as a tape drive, is removed from the system, the current set of processes may not run to completion. Even if a resource is added, most likely the system will have to be reinitialized in some way, which is likely to be very costly. If processes are added, which obviously happens on a regular basis on multi-user systems, they cannot be easily added to the set of currently running processes without greatly increasing the chance of deadlock. It is for these reasons that the Banker's algorithm is essentially unusable for actual operating systems. Most operating systems cannot afford to require all processes to know their maximum resource needs prior to execution. Most operating systems also cannot afford to have system integrity be so heavily dependent upon the number of physical resources present on the machine. The Banker's algorithm has survived as an academic tool for exploring and introducing deadlock avoidance, but it has proved to be too impractical for actual use.
« Concurrency
A concurrent environment is one in which the tasks being performed by the computer system are, or appear to be, happening simultaneously. To the user, the systems seems to be doing many things at once, i.e. concurrently. Establishing a safe, stable concurrent environment is no easy task, and efforts to do so have formed the backbone of operating system research. Fundamental to the approach is the sequential process model, which makes parallelism (and pseudo-parallelism) easier to deal with. True concurrency is only recently being realized, with the development of SMP systems and the operating systems that run them. True concurrency occurs when a system is actually performing multiple tasks at the same time, as opposed to just appearing to do so. Consider a multi-user system with a single CPU. Obviously, the CPU can only execute instructions sequentially (so far). Thus, the operating system on such a system switches control of the CPU between processes at a rate that must be fast enough to give the illusion of concurrency. This rapid switching of processes is called pseudo-parallelism, and it is not true concurrency. In fact, since the CPU can only work on a single process at any given time, the operating system itself must be included in the list of processes that are being switched. The concept of interrupts is critical to providing concurrency on single CPU systems, as they allow the operating system to get in and perform necessary tasks, not least of which is that of scheduling and switching processes.
The responsibility of providing concurrency has dramatically shaped the behavior and development of operating systems. As stated above, one of the first, and most fundamental, developments was of the sequential process model. The OS could be designed to treat everything (including itself) that needed to execute as a process. Processes are treated as state machines, with essentially three states: running, ready, and blocked. The OS scheduler can make decisions about how best to switch processes based on what state each of the processes is in. Also, the OS and to some extent the processes themselves can alter their state. A running process is one that currently has control of the CPU and is executing its code normally. There can only be one running process on a single CPU system, and this state is the coveted one. A blocked process is one that cannot continue execution or make use of the CPU (whether it has it or not) because it is waiting for some event to occur. Finally, a ready process is one that is not blocked and could proceed with its execution if it was granted control of the CPU. All ready processes are simply waiting to become the running process. Another immediate result of the process model was the idea of process context. Operating systems had to find a way to efficiently and safely switch processes, as the procedure could easily bottleneck the system or corrupt processes if not done carefully. A process context is basically all the information that is required to define and maintain the state of the process. This normally includes the program counter, the CPU registers, the stack pointer, and a lot more. The context consists of anything that must remain unaltered for the process to resume normal execution if it is switched out and eventually back into the running state. The process context is crucial to pseudo-parallelism, and it must be managed by the operating system. Occasionally, a system will include hardware support for the procedure of saving and switching contexts.
Furthermore, the operating system must maintain some type of process table, which contains an entry for every process that currently exists on the system. This table it the primary tool for the OS to manage concurrency, and it can take many forms. It usually contains either the context of each process, or a pointer to it, so that the OS can quickly affect context switches. There is also usually many other types of information contained in the table which helps the OS make state transitions for the process.
As soon as machines started trying to run processes concurrently, the problem of interprocess communication became paramount. There are basically three major IPC issues. The first is obvious - how does one process pass information to another. The second involves ensuring that processes do not interfere with each other when it comes to common resources, primarily shared data space. Lastly, there is the issue of synchronizing processes between whom there exist dependencies. The solutions to these three topics will not be described in detail, as they are too complicated for the purpose of this discussion, which is to briefly introduce the constructs that are available to operating systems for implementing concurrency.
Race conditions among processes are the result of the fact that processes can be, and frequently are, interrupted in mid-execution. Problems arise, then, when one process is attempting to access a shared resource and is interrupted. A succeeding process might then also access that resource and either undo the effects of the first process, or affect the resource in some way that causes incorrect execution in the first process when it resumes. Or the first process might destroy the changes made by the second process, being unaware that an access was made while it was blocked. This problem gave rise to the idea of critical sections, and to mutual exclusion in general. A critical section is that part or parts of any process that accesses shared resources. If an operating system can prevent any two processes from being in their critical sections concurrently, then the problem is solved. There are a number of ways for the system to do this, both in hardware and software. The hardware solutions are simple. The first is to allow a process to turn off interrupts when it is in the critical section. This is very undesirable, however, because an OS would have to trust the process to turn interrupts back on. Another hardware solution is the TSL instruction, which uses the idea of a lock on critical data. The lock cannot exist entirely in software, because access to the lock must also be controlled, which just relocates the mutual exclusion problem. Thus, an new instruction is added to the architecture which ensures that access to the lock is atomic. Software solutions include Dekker's algorithm, and Peterson's algorithm.
Synchronization is accomplished in software and can be done using semaphores or monitors. A semaphore is a unique type of variable invented by Dijkstra, and is used when two or more processes are dependent upon the actions of the other and they must alternate between waking and sleeping in order to function correctly. Semaphore access is atomic, and processes typically can perform one of two operations on the semaphore: DOWN or UP. A monitor is a language level construct that allows a programmer to declare procedures, data structures and variables as part of a special package. Compilers can then identify the packages and take the necessary steps to ensure mutual exclusion and synchronization during execution.
Finally, message passing is used to deliver information between processes. This is usually done using system calls such as SEND and RECEIVE. A process simply specifies the destination or source and a location for data and performs the appropriate system call.
As usual, each of these IPC primitives have their own advantages and disadvantages which must be evaluated during the development of an operating system. Suffice to say that any OS designed to provide concurrency will encounter a multitude of obstacles and approaches to handling them.
« Processes and state changes
In order to properly manage a multiprogramming environment, operating systems have adopted a sequential process model. A significant part of this model is the use of a process state machine. The state machine can vary slightly from OS to OS, but a general example is as follows. During its virtual lifespan, a process will exist in one of five states: running, ready, blocked, suspended-ready and suspended-blocked. The diagram for the states and their transitions is shown below. A state is just a particular set of circumstances under which the process exists. The operating system monitors different types of information related to each process in order to determine its current state. Most of this information is contained either directly or indirectly in the process table. It is important to note that a state is not a physical characteristic of the process. It might not even be so much a logical characteristic as simply a means of labeling the different stages of the process's niche in the system.
The first state is the running state. A process is running when it is the process that currently controls the CPU. Only one process may be in the running state at any given time (unless the system has multiple processors). The only way a process can enter the running state is when it is selected by the OS scheduler to be the active process. This is represented by the Dispatch signal on the diagram. A process can leave the running state for a number of reasons. First, it may block itself by requesting service from some I/O device or via a dependency upon another process or system event. This puts the process in the Blocked state. A process can be forced to leave the running state by the OS scheduler either because it used all of its quantum, or because an interrupt occurred which represented a higher priority function. The former is demonstrated by the timeout signal which puts the process in the ready state, and the latter is shown by the suspend signal which puts the process in the suspended-ready state. Those are the possible transitions to and from the running state. A process is blocked when, as described above, it is waiting for something. The process cannot continue its flow of execution until something occurs. If the OS suspends the process while it is blocked, it enters the suspended-blocked state. A suspended process is essentially removed from the process queue until the system reinstates it using the resume signal. When the event, I/O service, or system service for which the process was blocked finally occurs, the process enters the ready state if it wasn't suspended, or the suspended-ready state if it was. A process is ready if it could make use of the CPU for normal execution were it selected by the scheduler. All ready processes are simply waiting for their turn to become the running process. The only difference between the ready/blocked states and their suspended counterparts is that the process is waiting for the system to resume its status. This can be seen in the diagram from the transitions caused by suspend and resume. Thus any process will go from state to state depending on its own execution and on the execution of both the OS and the other processes. Assuming no errant behavior, a process will ultimately spend enough time in the running state to complete its job, at which point it terminates and is no longer part of the system.
As stated above, the above state diagram is only an example. Some systems take a simpler approach and remove the suspended states. The resulting model contains only three states: running, ready, and blocked. This model involves less overhead on behalf of the system, as the OS need only keep track of whether a process is or is not waiting for something, and then perform process scheduling. However, the addition of the suspended states might allow for more efficient resource utilization. machines, with essentially three states: running, ready, and blocked. The OS scheduler can make decisions about how best to switch processes based on what state each of the processes is in. Also, the OS and to some extent the processes themselves can alter their state. A running process is one that currently has control of the CPU and is executing its code normally. There can only be one running process on a single CPU system, and this state is the coveted one. A blocked process is one that cannot continue execution or make use of the CPU (whether it has it or not) because it is waiting for some event to occur. Finally, a ready process is one that is not blocked and could proceed with its execution if it was granted control of the CPU. All ready processes are simply waiting to become the running process
« File Systems
Next to the graphical interface, the file system is probably that part of any OS which has the greatest impression upon the user. Users are concerned, naturally, with the usability of the file system. This involves things such as how files are named and protected, and what types of operations are allowed on them, etc.. The designer of an operating system must busy themselves with the dirty implementation details of the file system, such as how free space is tracked, the definition of directory structures, etc..
First, the actual hardware (i.e. disks) used for secondary storage is fairly simple. The average hard disk consists of some number of circular platters, stacked vertically and separated by a small space. Between this space moves the read/write heads, which are attached to the access arms. Data can be stored on each side of the platters, and there is an access head for each side. All of the arms move in unison, in a linear fashion, toward or away from the center of the platters. The platter is divided into tracks, which are circular and increase in diameter proportionally to their distance from the center of the platter. Virtual lines extending radially out from the center divide the tracks into sectors. Sectors on identical tracks on all the platters form cylinders. Because of this concept of sectors tracks and cylinders, secondary storage is considered to be multi-dimensional. This is in contrast to primary memory, which is considered flat because it is a one dimensional address space. To read data from the disk, the access assembly moves the arms to position the heads over the appropriate track. The time to do so is called the seek time. Next, the platters are rotated to position the heads over the appropriate sector. The time to do so is called rotation time. Finally, the platters are rotated while the heads read or write the appropriate number of bits on the disk.
The most efficient use of the available hardware space is to store files simply as a string of sequential bytes. Unfortunately, this scheme is not practical because the OS must know the size of the file before creating it in order to reserve the correct amount of space. Also, such an organization prevents files from being dynamic in size once created. For this reason, file systems typically store files as a collection of equal-sized blocks, any of which can theoretically be stored anywhere on the disk. Such a file system is considered non-contiguous. A contiguous file system stores files as described above, in an uninterrupted chunk. The size of the blocks is the major issue for this scheme. Large blocks provide great access times, because each access to the disk retrieves more of a file to put into memory, while small blocks would require many more accesses to retrieve the same amount of information. However, the larger a block is, the more space is wasted in the last block of any file. Thus, small blocks are usually favored for their efficient use of space, at the cost of increased data transfer times. A common block size is one kilobytes. Once a block scheme is adopted, another consideration is how to link the blocks that constitute any given file. A contiguous file system would need only keep track of the location of the first byte of any file. As seen, though, for the average system, storing files contiguously has insufferable drawbacks. Thus, the problem remains, how to link blocks. Some systems, such as MS-DOS/WIN95 use a file allocation table or FAT. The FAT contains an entry for every physical block on the disk. The entry is nothing more than the next block of the file. Thus, the OS simply looks in the entry for the first block of any file and follows the links to find any given place in the file. Once the desired block is found, the physical block is accessed on disk and information read or written. In order to provide reasonable performance, the FAT must be kept resident in memory, preventing innumerable disk accesses just to find a given block. The problem with this, of course, is that as the size of disks increases, so does the FAT, to the point that it becomes impractical to keep it in memory.
Another way to link the physical blocks that constitute a file is that which is used by UNIX. UNIX uses a structure called an i-node. The i-node contains a number of bytes, each defined by the OS to represent some aspect of the file needed to facilitate management. Within the i-node, though, is also a few pointers to disk blocks. If the file is no bigger than a this number of blocks, the entire file can be reference using the single i-node. What's more, this file can be randomly accessed using only a single disk access. If the file is too large, one of the block references is instead used to point to a single indirect block. The values stored here are pointers to more blocks of the file. If this first level of indirection doesn't provide enough blocks for the file, then one of the entries in the i-node is used for a double indirect block, which functions as you'd expect. This method can be branched to the extent that files of virtually any size can be managed efficiently and quickly.
Another issue related to block file systems is the problem of tracking free blocks. There are two common methods. The first uses linked lists to catalog free blocks, where each block contains as many references to free blocks as possible, and a few bytes which point to the next block of the list. The other method is to use a bit map, where there exists one bit in the map for each block on the disk. The bit is set if the block is free and cleared if it is not (or vice versa). These two schemes both work fine, and can be optimal for a system under different circumstances.
Lastly, it is important to note that the OS might use some parts of the disk differently than others. This is because certain capabilities of the file system are not required for some elements of the system. For example, the OS might maintain its swap space (used for paging) in an entirely different way than that part of the disk reserved for user storage space. The swap space does not need to accommodate all the file operations that must be provided to the users, and so some of the storage overhead can be eliminated and better use made of the disk space.
« Process synchronization
Process synchronization is necessary to avoid busy waiting among processes which share access to some element. Mutual exclusion is essential to implement synchronization, and is assumed to exist in some form within synchronization methods. The concept of synchronization evolved as a consequence of various applications of mutual exclusion. Mutual exclusion in itself did not solve certain operating system problems, primarily those that developed when processes needed to access something not only exclusively, but in a particular order. As stated above, synchronization was used to help eliminate busy waiting. The idea was to block, or actually suspend, a process which needed access to shared data, and couldn't access it until some other process or processes had performed some operation on the data. Thus, the traditional description of synchronization primitives are SLEEP and WAKEUP. If a process finds that it must wait for another process to operate on the data, it sends the system a SLEEP signal, indicating that it should be suspended. Once the other process operates on the data, it will notice that the state of the data indicates that another process is waiting on it, and it will issue a WAKEUP signal to the system. The sleeping process will no longer be suspended, and can access the shared element as intended.
An implementation obstacle to this scheme is the potential loss of WAKEUP signals. This was solved by Dijkstra via the semaphore. The semaphore is simply a shared variable which basically acts as a counter for pending WAKEUP signals. The key to the success of a semaphore is that its use must be atomic. That is to say that mutual exclusion must be maintained for any access to the semaphore. The SLEEP and WAKEUP signals are typically transformed into DOWN and UP actions on the semaphore. The procedure is as follows. A process checks the state of the shared data, to determine whether or not it can perform the desired operation. If so, then the process simply does it, in a mutually exclusive fashion. If not, however, then the process prepares to sleep. Before sleeping, though, it checks the semaphore. If the semaphore is zero, the process goes to sleep, waiting to complete its DOWN operation. If the semaphore is nonzero, then another process has previously issued a wakeup and the process should not sleep but should go ahead and operate on the data. The seeming contradiction is a result of a race condition that resulted if the first process was interrupted between checking the data and checking the semaphore. During that interruption, the other process changed the state of the shared data, noticed that it was in a state which would put the first process to sleep, and issued a wakeup in the form of incrementing (UP) the semaphore. So, instead of sleeping, the first process decrements (DOWN) the semaphore, essentially using the buffered wakeup, and operates on the shared data. When a process issues and UP on a semaphore, and one or more processes are in fact sleeping on the semaphore, the system selects one of the sleeping processes at random and allows it to complete its DOWN. Again, it is essential that any operation on the semaphore itself must be atomic.
The classical use of semaphores is on the Producer/Consumer problem, which represents anything in the OS which uses a bounded buffer. The system can provide some hardware support for synchronization, usually support for mutual exclusion which is necessary for synchronization primitives to function. The perfect example is the TSL instruction, which provides a hardware guarantee for exclusive access to a shared variable.
« Threads and thread management
The concept of threads was conceived as a means to gain the advantages of both parallelism and blocking system calls. A non-threaded process is really just a process with one thread. The difference is that a multi-threaded process can achieve parallelism while the single threaded process cannot. Thus while most systems today benefit from the use of blocking system calls, they do not reap the benefits from parallelism. The reason is that when a process must block for some reason, it makes a call to the OS, which typically then switches the process out of the running state and replaces it with another process. While the process is blocked, it makes no progress whatsoever. One of the major benefits that threading can provide is reduced context switching. Switching processes into and out of the running state involves a full context switch, which is very time consuming. Context switching is pure overhead, which is to say that it is transparent to processes and is entirely the responsibility of the OS. The point is that context switching is costly and no one can do anything productive with the computer while they occur. Obviously, then reducing the rate at which these occur is very desirable, and can lead to tremendous performance boosts. Threading can help. The best way to understand a thread is to realize that threads are to processes what processes are to the operating system. A threaded process has several points of execution control, as opposed to just one. Probably the most important thing to remember about threads is that they share the same address space. All of the threads for a given process share the address space for that process. They can also share just about everything else unique to the process, such as data structures, open files, etc.. This can obviously lead to serious problems, and because the threads operate in a single address space, it is the responsibility of the programmer to implement security measures among the threads.
Thread management can be either the responsibility of the user (i.e. the process) or the operating system. There are dramatic differences and consequences of using one or the other. Consider first the idea of user thread control. Essentially the operating system is ignorant that threads even exist within the process. As a result, the OS treats the process as it would any other process - any single-threaded process. Thus, when an event occurs within any of the threads in the process that causes a system call, the OS intervenes and blocks the entire process. This is not acceptable, and basically defeats the entire purpose of threads. To get around this, a threaded process can utilize some sort of run-time system as an interface between the threads and the OS. The run-time system intercepts any system requests issued by the threads. If the threads needs to be suspended, the run-time system performs a thread context switch, activating a non-blocked thread. Thread switching is much faster than process switching. The reason is that a thread context includes much less than a process context, typically only register values, the PC, and a stack pointer. For this reason, threads are also referred to as lightweight processes, because they can perform like process, and follow the sequential process model, but without all of the overhead. Not only is a thread switch faster than a process switch, but thread switching managed by the user is much faster than thread switching managed by the OS, because the process does not have to trap out to the OS for every thread switch. At any rate, if a run-time system is used, the process can make very efficient use of its quantum by exhausting as many unblocked threads as possible. Unfortunately, the run-time system does not eliminate the problem of blocking calls which cause the OS to block the entire process. A possible solution to this is to have a testing call to the system which would determine if a block would occur. The run-time system uses the test when a thread issues a blocking call. If a block would occur, the thread is switched out, otherwise, the call is allowed. Other nice advantages of user level thread management include the use of customized schedulers. Each process knows how its threads behave and what their purpose is, so it can optimize a scheduler for them. The OS remains only concerned with process scheduling. Also, threads managed by the user scale much better than threads managed by the OS, because they place no demands on OS resources such as table space. Unfortunately, there are also severe obstacles, such as signal handling. If the OS is unaware of the threads, it cannot ensure that a signal is sent to the appropriate thread. Also, since the process does not have control over key resources such as timing interrupts, there exists the problem of runaway threads which can't be stopped unless they block for I/O.
Having the OS manage threads can eliminate many of the headaches of thread management in general, but also stifles the performance gain possible with threads. For the OS to manage threads, it would need a thread table in addition to the process table. This introduces significant demand on resources as well as requiring more time for OS bookkeeping. When a thread blocks, the OS thread scheduler would choose another thread, which may or may not be part of the current process. One advantage of OS thread management involves things like resource allocation, and signal handling. When the OS is aware of threads, it can recognize individual stack overflow, for example, or correctly route signals. A major problem, however, is the fact that many critical library functions were not designed with threads in mind, and so are not reentrant. The OS must somehow prevent reentrant calls to these functions (not likely) or be overhauled to accommodate threading.