Mar 21, 2019
---------------------------------
CS 143A
---------------------------------

What is an operating system?
The OS is the software that acts as an intermediary between applications and the computer hardware


Operating System Roles:

- Referee:
Allocation of resources among applications, users
Isolation of different applications, users from each other
Communication between users, applications

- Illusionist:
Each application appears to have the entire machine to itself
Infinite number of processes, (near) infinite amount of memory, reliable storage, reliable network transport

- Glue:
Libraries, user interface widgets
Reduces cost of developing software

OS Challenges:

- Reliability
Does the system do what it was designed to do?

- Availability
What fraction of time does the system spend working? Measured with metrics like
Mean Time To Failure (MTTF) and Mean Time To Repair (MTTR)

- Security
Can the system be compromised by an attacker?

- Privacy
Data is only accessible to authorized users

- Performance
-- Latency/response time : How long does an operation take to complete?
-- Throughput : How many operations can be completed per unit time?
-- Overhead : How much extra work is done by the OS?
-- Fairness : How equal is the performance received by different users?
-- Predictability : How consistent is the performance over time?

- Portability
For programs : Application Programming Interface (API)
For the kernel : Hardware Abstraction Layer (HAL)

---------------- Consider reading up on the history of operating systems from the slides as well ----------------


How is I/O sped up through DMA?
DMA (Direct Memory Access) allows data to be moved directly between I/O devices and memory, freeing the CPU to work on other processes


I/O Completion in Batch Systems : How do we know when I/O is complete?

- Polling
Device sets a flag while it is busy
Program tests the status of the flag in a loop, waiting for completion of I/O

- Interrupts
Device interrupts the CPU on completion of I/O; the CPU jumps to an interrupt service routine that handles the interrupt
CPU jumps back to the code it was executing after the interrupt has been processed

Multiprogramming:

- Use interrupts to run multiple programs simultaneously : when a program performs I/O, instead of polling, let the CPU execute another program until an interrupt is received
- Requires secure memory and I/O for each program
- Requires intervention if a program runs into an infinite loop
- Requires CPU scheduling to determine the next job to run

CPU Execution Cycle:

- Fetch instruction from memory
- Decode the instruction
- Execute the instruction
- Write the result to memory
- Increment the program counter and find the next instruction
- Repeat

I/O Devices and Buffers:

- I/O devices and the CPU execute concurrently
- Each controller is responsible for one type of I/O device, and each controller has a local buffer. I/O happens between the device and the controller's local buffer
- The CPU is responsible for moving data between main memory and the local buffers

Interrupts:

- An interrupt transfers control to the interrupt service routine ; Interrupt Service Routine : segment of code that dictates the actions to be taken for an interrupt

- Determining the type of interrupt:
Polling: the same interrupt handler is called for all interrupts; all devices are then polled to see where the interrupt came from
Interrupt vector table: different interrupt handlers are called for different interrupts

- During interrupt handling, the OS preserves the state of the CPU : stores registers and the program counter

What happens to a new interrupt when the CPU is handling one interrupt?
- Incoming interrupts can be disabled (masked) while one interrupt is being processed; the new interrupt may either get lost, or be buffered and delivered later
- Incoming interrupts can also be delivered on the spot, creating nested interrupts


- In Direct Memory Access, device controllers transfer blocks of data directly between local buffers and main memory without CPU intervention
- The CPU is interrupted only when the memory transfer is complete


What is a process?
A process is an instance of a program running with limited rights
- Thread: a sequence of instructions within a process
- Processes also have rights associated with them : 1) memory that the process can access 2) other permissions, such as which system calls it can make, which files it can access, etc.


Dual Mode Operation:

- Provide hardware support to differentiate between at least two modes of operation
1. User mode : execution is done on behalf of the user
2. Kernel mode : execution is done on behalf of the operating system
- "Privileged" instructions are only executable in kernel mode
- Executing privileged instructions in user mode "traps" into kernel mode

- A mode bit is added to the computer hardware to indicate the current mode: user mode or kernel mode
- When an interrupt occurs, the hardware switches to kernel mode


The Timer:

- Sometimes a process must be prevented from running indefinitely; a timer interrupts the process after a certain period of time and returns control to the operating system
- The timer is also used to help implement timesharing
- The timer is also used to compute the current time
- Programming the timer can only be done in the kernel, since it requires privileged instructions


Address Translation and Memory Protection Cycle:

- The processor takes a virtual address and attempts to translate it
- If the translation is invalid, an error is raised
- If the translation is valid, the result is a physical address that is used to locate the data in memory
- Data from the physical address is sent back to the processor

Memory Protection:

- When a process is running, only memory within that process's address space should be accessible
- When executing in kernel mode, the kernel has unrestricted access to all memory


Memory Protection : Base and Bounds:

- To provide memory protection, add two registers that determine the range of legal memory addresses:
Base register: holds the smallest legal memory address
Limit register: contains the size of the range
- Memory outside the range is protected

Implementation of Base and Bounds:

- The virtual address in the processor is added to the base address
- If the virtual address falls outside the bounds, an exception is raised
- Otherwise, the sum of base and virtual address forms the physical address used to access physical memory

Loading the base and limit registers is a privileged instruction
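
The check described above can be sketched in a few lines. This is an illustrative model only (the function and parameter names are our own, not real MMU code), following the notes' convention that the limit register holds the size of the legal range:

```python
# Illustrative sketch of base-and-bounds address translation (not real MMU code).
# Convention from the notes: the limit register holds the SIZE of the range.
def translate(virtual_addr, base, limit):
    """Return the physical address, or raise if the access is out of bounds."""
    if not (0 <= virtual_addr < limit):   # address outside the process's range
        raise MemoryError("address out of bounds")
    return base + virtual_addr            # legal: physical = base + virtual
```

So with base 1000 and limit 500, virtual addresses 0..499 map to physical 1000..1499, and anything else traps.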


I/O Protection : all I/O instructions are privileged instructions
Given that these instructions are privileged, how do users perform I/O?
- Via system calls - the method used by a process to request action from the operating system

System Calls : user code issues a syscall, which traps into kernel mode; the kernel handles the syscall

User process executes --> user process calls system call --trap--> kernel executes system call --return--> control returns to user process


Storage Device Hierarchy (fastest to slowest):
- Registers
- Cache
- Main memory
- Solid-state disk
- Hard disk

Monolithic vs Microkernel OS:

- Monolithic OSes have large kernels with a lot of components
- Microkernels move as much as possible from the kernel into "user" space


---------------- Worth trying to understand monolithic and microkernel OS ----------------

OS Task : Process Management

- The OS is responsible for the following activities:
Process creation and deletion
Process suspension and resumption
Process synchronization and interprocess communication
Process interactions - deadlock detection, avoidance and correction


OS Task : Memory Management

- The OS is responsible for:
Allocation and deallocation of memory to processes
Managing multiple processes within memory - keeping track of which parts of memory are used by which processes, and managing the sharing of memory between processes
Determining which processes to load when memory becomes available


Another OS task : Secondary Storage and I/O Management

OS Task : File Management

- The OS is responsible for:
File creation and deletion
Directory creation and deletion
Supporting primitives for file/directory manipulation
Mapping files to disks (secondary storage)

Another OS task : Protection and Security



----------------------------------------------------------------


We will assume that each process has only one thread of execution, so process execution proceeds sequentially

Process = Program?

A process is one instance of a program in execution
A program can invoke more than one process


Process States:

- New : the process is being created
- Ready : the process is waiting to be assigned to a processor
- Running : the instructions of the process are being executed
- Waiting : the process is waiting for an event to occur
- Terminated : the process has finished execution

Process State Transitions:

New --> Ready : process admitted
Ready --> Running : scheduler dispatch
Running --> Ready : interrupt
Running --> Waiting : I/O or event wait
Waiting --> Ready : I/O or event completion
Running --> Terminated : exit
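
The transitions above can be written down as a small lookup table. A minimal sketch (the dictionary and event names are our own labels for the transitions listed in the notes):

```python
# Five-state process model as a transition table: (state, event) -> next state.
# Event names are our own shorthand for the transitions in the notes above.
TRANSITIONS = {
    ("new", "admitted"):     "ready",
    ("ready", "dispatch"):   "running",
    ("running", "interrupt"): "ready",
    ("running", "io_wait"):  "waiting",
    ("waiting", "io_done"):  "ready",
    ("running", "exit"):     "terminated",
}

def step(state, event):
    """Advance a process one step; raises KeyError on an illegal transition."""
    return TRANSITIONS[(state, event)]
```

A process's life is then a chain of `step` calls, e.g. new -> ready -> running -> waiting -> ready -> running -> terminated.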


Process Control Block (PCB):

Contains the information associated with each process
- Process state (e.g. running, waiting, etc.)
- Program counter - location of the next instruction to execute
- CPU registers
- CPU scheduling information
- Memory management information
- Accounting information
- List of all open files


Enabling Concurrency and Protection in the OS

Concurrency:
Give out CPU time to different processes (scheduling) - only one process running at a time; give more CPU time to important processes

Protection:
Give pieces of resources to different processes - controlled access to non-CPU resources

What is a context switch?
The operation that switches the CPU from one process to another

To enable concurrency, on a context switch:
The CPU saves the state of the current process to its PCB and loads the state of the new process from its PCB

The time it takes to context switch is overhead:
- The system does no useful work while context switching
- Overhead sets the minimum practical switching time and can become a bottleneck



Process Scheduler
- Selects among available processes for next execution on the CPU
- Maintains scheduling queues of processes
-- Job queue: set of all processes in the system
-- Ready queue: set of all processes residing in main memory, ready and waiting to execute
-- Device queue: set of processes waiting for an I/O device
- Processes migrate among the various queues


Schedulers

- Short-term scheduler : determines which process is executed next and allocates the CPU to it
Sometimes the only scheduler in a system
Invoked very frequently, so it must be fast (milliseconds)

- Long-term scheduler : determines which processes are added to the ready queue
Invoked infrequently and is relatively slow (seconds, minutes)
Determines the degree of multiprogramming

Processes can be described as either : 1) I/O-bound processes - which spend more time doing I/O than computation (many short CPU bursts), or 2) CPU-bound processes - which spend more time doing computation (few long CPU bursts)
The long-term scheduler strives for a good process mix



Process Creation:

- Processes are created by other processes
- A process that creates another process is called the parent process; the created process is called a child process
- The result is a tree of processes


What does it take to create a new process?
- Must construct a new PCB : not very expensive
- Must set up the address space (e.g. set up new page tables) : more expensive
- Copy data from the parent process : expensive
- Copy I/O state : medium expense
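
A hedged sketch of process creation as seen from user code, assuming a POSIX system (Linux/macOS): `fork()` duplicates the parent, and the parent waits for the child to finish. The function name and exit code are our own choices for illustration.

```python
import os

# Sketch: create a child process with fork() on a POSIX system.
# The child gets a copy of the parent's address space and open files.
def spawn_child():
    pid = os.fork()
    if pid == 0:                     # child branch: pid is 0 in the child
        os._exit(7)                  # exit immediately with a known status
    _, status = os.waitpid(pid, 0)   # parent branch: wait for the child
    return os.WEXITSTATUS(status)    # extract the child's exit code
```

Calling `spawn_child()` should return 7, the status the child exited with.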



What is multithreading?
A single program made up of a number of different concurrent activities

Single- and Multi-threaded Processes:

- Code, data and files are shared among threads
- Registers and stacks are private to each thread

- Threads encapsulate execution and concurrency
- Processes encapsulate protection

- In a multi-threaded process, while one thread is blocked and waiting, a second thread in the same task can run
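
A small sketch of that last point, using Python's `threading` module with a `sleep` standing in for a blocking I/O wait (the thread and list names are ours):

```python
import threading
import time

# Sketch: while one thread blocks (sleep stands in for an I/O wait),
# another thread in the same process keeps making progress.
results = []

def blocker():
    time.sleep(0.2)                       # simulates a blocking I/O wait
    results.append("blocker done")

def worker():
    results.append("worker ran while blocker waited")

t1 = threading.Thread(target=blocker)
t2 = threading.Thread(target=worker)
t1.start(); t2.start()
t1.join(); t2.join()
```

The worker's entry lands in the list first, because it runs while the blocker is still asleep.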


Thread State

- Some state is shared by all threads in the process
- Some state is private to each thread
Kept in the TCB (Thread Control Block)
Threads also have states of execution (new, ready, running, etc.)

- Only one thread can run on a CPU at a time, but switching between threads requires no memory-management work, only a register-set switch


Types of Threads:

- Kernel-supported threads
- User-level threads
There exist hybrid approaches that combine user-level and kernel-supported threads


Kernel Threads

Supported by the kernel:
- Threads are created and managed directly by the kernel
- Threads can run or block independently
- One process may have several threads waiting on different things

Downside: it is expensive for a thread to keep crossing into kernel mode for scheduling


User Threads:

Supported above the kernel, via a set of library calls at the user level : there may be several user-level threads per kernel thread
- One advantage of user threads is that they are cheap and fast, because they do not need to cross into the kernel for scheduling
- A disadvantage, however, is that only one thread can run at a time per kernel thread, so these threads will not run in parallel

Multithreading Models:

- One-to-one : each user-level thread is mapped to a kernel thread
- Many-to-one : multiple user-level threads are mapped to one kernel thread
- Many-to-many : many user-level threads are mapped to many kernel threads

Two-Level Model:

Similar to many-to-many, except that it also allows a user-level thread to be BOUND to a kernel thread


What is a signal?
Signals are used in UNIX systems to notify a process that a particular event has occurred


A signal handler is used to process signals:
- Signal is generated by a particular event
- Signal is delivered to a process
- Signal is handled by either a default handler or a user-defined handler


Ways of delivering signals to a multithreaded process:
- Deliver the signal to every thread in the process
- Deliver the signal only to the thread to which the signal applies
- Deliver the signal to a specific thread designated to receive signals
- Deliver the signal to certain threads in the process


Thread Pools:

Create a number of threads in a pool awaiting work
Advantages:
- Usually faster to service a request with an existing thread than to create a new one
- Allows the number of threads in the application(s) to be bounded by the size of the pool
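
A sketch of a thread pool using Python's standard `ThreadPoolExecutor`: four worker threads service eight tasks, so no more than four threads ever exist (the `handle_request` function is our stand-in for real work):

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch of a thread pool: a fixed set of worker threads services many
# tasks, avoiding the cost of creating one thread per request.
def handle_request(n):
    return n * n          # stand-in for servicing one request

with ThreadPoolExecutor(max_workers=4) as pool:
    squares = list(pool.map(handle_request, range(8)))
```

Eight requests are serviced by at most four threads, and `pool.map` preserves input order in its results.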


Thread-Local Storage:

TLS (Thread-Local Storage) allows each thread to have its own copy of data
- Different from local variables in that TLS is visible across all function invocations and is not local to one single function
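
A sketch with Python's `threading.local()`: each thread writes to the same attribute name but sees only its own copy (the `seen` dictionary and thread names are ours, used to record what each thread observed):

```python
import threading

# Sketch of thread-local storage: each thread gets its own copy of
# `local.value`, visible across function calls within that thread only.
local = threading.local()
seen = {}

def set_and_read(name):
    local.value = name           # private to this thread
    seen[name] = local.value     # reads back this thread's own copy

threads = [threading.Thread(target=set_and_read, args=(f"t{i}",))
           for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Despite all three threads assigning to the same `local.value`, each reads back exactly what it wrote.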


Multi- Definitions:

Multiprocessing: multiple processors/CPUs
Multiprogramming: multiple jobs/processes
Multithreading: multiple threads per process


Interprocess Communication:

- Processes within a system may be independent or cooperating
Processes may cooperate with each other for many reasons:
- - Sharing information
- - Speeding up computation
- - Convenience
- - Modularity

Cooperating processes thus need to communicate and share data. For this purpose, they use Interprocess Communication (IPC)

Two models of IPC:
- Shared memory
- Message passing


Shared Memory:

- An area of memory is shared among the processes that wish to communicate
- The communication is under the control of the processes, not the operating system
- Raises issues of synchronizing the actions of user processes when they access shared memory


Producer-Consumer Problem:

Paradigm for cooperating processes: producer processes produce information that is consumed by consumer processes
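
A sketch of the producer-consumer pattern with threads, where Python's `queue.Queue` stands in for the synchronized shared buffer (the buffer size, sentinel, and names are our choices for illustration):

```python
import queue
import threading

# Producer-consumer sketch: queue.Queue provides the synchronization that a
# raw shared-memory buffer would force us to implement ourselves.
buf = queue.Queue(maxsize=4)      # bounded buffer of 4 items
consumed = []

def producer():
    for i in range(10):
        buf.put(i)                # blocks when the buffer is full
    buf.put(None)                 # sentinel: no more items

def consumer():
    while True:
        item = buf.get()          # blocks when the buffer is empty
        if item is None:
            break
        consumed.append(item)

p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start(); c.start()
p.join(); c.join()
```

The consumer receives every item, in order, even though the buffer only ever holds four at a time.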


Message Passing

Mechanism for processes to communicate with each other without having to access shared data
The IPC facility provides two operations:
- send(message)
- receive(message)

The message size is either fixed or variable


If processes P and Q wish to communicate, they need to
- Establish a communication link between them
- Exchange messages via send/receive


Direct Communication

Processes must name each other explicitly:
- send(P, message) - send a message to process P
- receive(Q, message) - receive a message from process Q

Properties :
(1) Links are established automatically
(2) Each link is associated with a pair of processes
(3) Between each pair there exists exactly one link
(4) Links may be either unidirectional or bidirectional


Indirect Communication

Messages are sent to and received from mailboxes (also called ports)
- Processes must share a common mailbox for a link to be established
- Each mailbox has a unique id

Properties:
(1) Links are established only if processes share a common mailbox
(2) Each link may be associated with multiple processes
(3) There may be more than one link between a pair of processes
(4) Links may be either unidirectional or bidirectional


Operations:
- Create a mailbox
- Send or receive a message to/from a mailbox
- Destroy a mailbox


Synchronization:

Message passing may be either blocking or non-blocking
Blocking is considered synchronous
- Blocking send: the sender is blocked until the message is received
- Blocking receive: the receiver is blocked until a message is available
Non-blocking is considered asynchronous
- Non-blocking send: the sender sends the message and continues
- Non-blocking receive: the receiver receives either 1) a valid message or 2) a null message

If both send and receive are blocking, we have a rendezvous


Buffering:

The queue of messages attached to the link is implemented in one of three ways:
- Zero capacity: no messages are queued on the link; the sender must wait for the receiver
- Bounded capacity: finite length of n messages; the sender must wait if the link is full
- Infinite capacity: infinite length; the sender never waits


Sockets

- A socket is defined as an endpoint for communication
- It is the concatenation of an IP address and a port (a number included at the start of message packets to differentiate network services on a host)
- Communication takes place between a pair of sockets
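
A minimal sketch of two socket endpoints exchanging a message. `socket.socketpair()` creates two already-connected sockets, so no real network address is needed (the message content is arbitrary):

```python
import socket

# Sketch: a pair of connected sockets exchanging one message.
# socketpair() avoids needing a real IP address and port for the demo.
a, b = socket.socketpair()
a.sendall(b"ping")       # write on one endpoint...
msg = b.recv(16)         # ...read on the other
a.close()
b.close()
```

With a real network service, the endpoints would instead be created with `socket.socket()` and identified by (IP address, port) pairs, as described above.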


Remote Procedure Calls

- RPC abstracts procedure calls between processes on networked systems
Stubs: client-side proxies for the actual procedure on the server
- The client-side stub locates the server and marshals the parameters
- The server-side stub receives this message, unpacks the marshalled parameters and performs the procedure on the server

Pipes

- A pipe acts as a conduit for communication between two processes
Ordinary pipe: cannot be accessed from outside the process that created it; usually created by a parent process for communication with the child process it created
Named pipe: can be accessed without a parent-child relationship


Ordinary Pipes:

- Ordinary pipes allow communication in the typical producer-consumer style
- The producer writes to one end of the pipe (the write end)
- The consumer reads from the other end (the read end)
- Ordinary pipes are thus unidirectional
- Require a parent-child relationship between the communicating processes
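
A hedged sketch of an ordinary pipe between a parent and its child, assuming a POSIX system: the parent writes, the child reads, and each side closes the end it does not use (the message and variable names are our own):

```python
import os

# Sketch of an ordinary (anonymous) pipe, POSIX-only: parent writes,
# the forked child reads. Unidirectional: one read end, one write end.
r, w = os.pipe()
pid = os.fork()                  # child inherits both ends of the pipe
if pid == 0:
    os.close(w)                  # child: close the unused write end
    data = os.read(r, 64)
    os.close(r)
    os._exit(0 if data == b"hello child" else 1)
else:
    os.close(r)                  # parent: close the unused read end
    os.write(w, b"hello child")
    os.close(w)
    _, status = os.waitpid(pid, 0)
    ok = (os.WEXITSTATUS(status) == 0)
```

Closing the unused ends matters: the reader only sees end-of-file once every copy of the write end is closed.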


Named Pipes:

- Named pipes are much more powerful
- Named pipes are bidirectional
- No parent-child relationship is necessary between the communicating processes
- Several processes can use the named pipe for communication
- Provided on both UNIX and Windows systems



Basic Concepts of CPU Scheduling

- Maximum CPU utilization is obtained with multiprogramming
- CPU-I/O burst cycle : process execution consists of a cycle of CPU execution and I/O wait

Issues to consider in CPU scheduling:

- I/O boundedness
Short bursts of CPU before blocking for I/O
- CPU boundedness
Extensive use of the CPU before blocking for I/O
- Frequency of preemption
- Urgency and priorities
- Process execution time
- Time sharing : amount of execution time the process has already received

Levels of Scheduling

- High-level scheduling, or job scheduling
Selects jobs allowed to compete for the CPU and other system resources
- Low-level (CPU) scheduling
Selects the ready process that will be assigned the CPU
The ready queue contains the PCBs of processes that are ready to be dispatched


CPU Scheduler :

Selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them


Non-preemptive scheduling:
- Once the CPU has been allocated to a process, the process keeps the CPU until:
The process exits, or
The process switches to the waiting state

Preemptive scheduling:
- The process can be interrupted and must release the CPU


CPU Scheduling Decisions:

- The CPU scheduler may make scheduling decisions when a process:
- - switches from running to waiting state (non-preemptive)
- - switches from running to ready state (preemptive)
- - switches from waiting to ready state (preemptive)
- - terminates (non-preemptive)


Dispatcher:

The dispatcher module transfers control of the CPU to the process selected by the short-term scheduler
This involves:
- switching contexts
- switching into user mode
- jumping to the proper location in the user program to restart that program
Dispatch latency:
- the time it takes for the dispatcher to stop one process and start another running
- The dispatcher needs to be fast


Scheduling Criteria:

- CPU utilization
Keep the CPU as busy as possible

- Throughput
The number of processes that complete their execution per unit time

- Turnaround time
The amount of time to execute a particular process, measured from its entry time

- Waiting time
The amount of time a process has spent waiting in the ready queue

- Response time (in timesharing systems)
The amount of time from when a request was submitted until the first response (not the final output) is produced


Optimization Criteria:

- Max CPU utilization
- Max throughput
- Min turnaround time
- Min waiting time
- Min response time



First-Come, First-Served (FCFS) Scheduling:

- The process that requests the CPU FIRST is allocated the CPU first
Non-preemptive algorithm
- Implemented using a FIFO queue:
New processes are added to the tail of the queue
The CPU selects the process to execute from the head of the queue
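
Under FCFS, each process waits exactly as long as the total burst time of the processes ahead of it. A small sketch (the function name is ours; processes are given in arrival order, all arriving at time 0):

```python
# FCFS waiting times: each process waits for the sum of the CPU bursts
# ahead of it in the queue. Bursts are listed in arrival order.
def fcfs_waiting_times(bursts):
    waits, elapsed = [], 0
    for b in bursts:
        waits.append(elapsed)   # time spent waiting before this burst starts
        elapsed += b
    return waits
```

With bursts [24, 3, 3] the waits are [0, 24, 27], for an average of 17 - a classic illustration of the "convoy effect", where short jobs queue behind one long job.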



Shortest Job First (SJF) Scheduling :

Scheduling algorithm that picks processes based on their CPU burst times: it picks the process with the least burst time next
Two schemes:
Non-preemptive:
- once the process has been allocated the CPU, it runs until its CPU burst completes and cannot be preempted
Preemptive:
- when a new process arrives with a CPU burst time lower than the remaining time of the currently executing process, the current process is preempted and the new one is executed instead. Also called Shortest Remaining Time First

SJF is optimal - it gives the minimum average waiting time for a given set of processes.
The only difficulty is knowing the length of the next CPU request
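
When all processes arrive at time 0, non-preemptive SJF is simply "sort by burst length, then run FCFS". A sketch under that simplifying assumption (the function name is ours):

```python
# Non-preemptive SJF, assuming all processes arrive at time 0:
# sort by burst length, then accumulate waits as in FCFS.
def sjf_avg_wait(bursts):
    waits, elapsed = [], 0
    for b in sorted(bursts):    # shortest burst first
        waits.append(elapsed)
        elapsed += b
    return sum(waits) / len(waits)
```

For bursts [6, 8, 7, 3] the execution order is 3, 6, 7, 8 and the waits are 0, 3, 9, 16 - an average of 7, which no other ordering of these four bursts can beat.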


Determining the Length of the Next CPU Burst:

- One can only estimate the length of the next CPU burst
- Use the lengths of previous CPU bursts and perform exponential averaging
t_n = actual length of the nth CPU burst
T_{n+1} = predicted value of the next CPU burst
alpha, with 0 <= alpha <= 1
Define
T_{n+1} = alpha * t_n + (1 - alpha) * T_n

At alpha = 0, recent history does not count
At alpha = 1, only the most recent actual CPU burst counts
Commonly, alpha is set to 0.5
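
The recurrence above is easy to run on a sequence of observed bursts. A sketch (function name and initial guess are ours):

```python
# Exponential averaging for CPU-burst prediction:
#   T_{n+1} = alpha * t_n + (1 - alpha) * T_n
def predict_next(bursts, tau0, alpha=0.5):
    tau = tau0                            # initial guess T_0
    for t in bursts:                      # fold in each observed burst
        tau = alpha * t + (1 - alpha) * tau
    return tau
```

With an initial guess of 10 and alpha = 0.5, observing a burst of 6 gives a prediction of 8; a further burst of 4 brings the prediction down to 6 - old history decays geometrically.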


Priority Scheduling :

- A priority number, usually an integer, is associated with each process. This number may be based on:
Cost to the user
Importance to the user
Age
% of CPU time used in the last X hours
- The CPU is allocated to the process with the highest priority
Can be preemptive or non-preemptive


Shortest Job First is a type of priority scheduling where the priority value is the next CPU burst time
Problem - Starvation : low-priority processes may never execute
Solution - Aging : as a process spends more time waiting, its priority increases


Round Robin (RR):

Each process is allocated a fixed amount of CPU time; the process is then preempted and added to the end of the ready queue
- The amount of CPU time allocated is a time quantum, usually between 10 and 100 milliseconds
For n processes and a time quantum Q, each process gets roughly 1/n of the CPU time in chunks of at most Q
- if Q is too large, RR degenerates into FIFO behaviour
- if Q is too small, there is high context-switching overhead
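
A sketch of the round-robin mechanism itself, ignoring context-switch cost and assuming all processes are in the ready queue at the start (function and variable names are ours):

```python
from collections import deque

# Round-robin sketch: each process runs for at most one quantum q, then
# goes to the back of the ready queue. Returns the completion order (pids).
def round_robin(bursts, q):
    ready = deque(enumerate(bursts))       # (pid, remaining burst time)
    order = []
    while ready:
        pid, rem = ready.popleft()
        if rem > q:
            ready.append((pid, rem - q))   # preempted: requeue with less work
        else:
            order.append(pid)              # finished within this quantum
    return order
```

With bursts [5, 3, 1] and q = 2, the shortest job finishes first and the longest last, giving completion order [2, 1, 0].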


Multilevel Queue:

- The ready queue is partitioned into separate queues
- Different scheduling algorithms are used for each queue
- Processes are assigned to one queue permanently
- Scheduling must also be done between the queues
Fixed priority: serve all from foreground, then from background - possibility of starvation
Time slice: each queue gets some CPU time that it schedules among its processes


Multilevel queue scheduling priority order:

- system processes <-- highest priority
- interactive processes
- interactive editing processes
- batch processes
- student processes <-- lowest priority


Multilevel Feedback Queue:
A process can move between the queues
- aging can be implemented this way
Parameters for a multilevel feedback queue scheduler:
- The number of queues
- The scheduling algorithm used for each queue and between queues
- When to promote a process to a higher-priority queue
- When to demote a process to a lower-priority queue
- The method for determining which queue a process will enter when it needs service


Thread Scheduling:

- When threads are supported, threads are scheduled rather than processes
- The CPU scheduler schedules kernel threads
- In the many-to-one and many-to-many models, the thread library schedules user-level threads to run on a kernel thread


Multiple-Processor Scheduling:

- Scheduling of processes becomes more complex when multiple processors are involved:

Symmetric Multiprocessing (SMP):
- each processor dispatches a job from the ready queue
- there is either one common ready queue of processes for all processors, or each processor has its own private ready queue
- homogeneous processors within a multiprocessor
- currently the most common multiprocessor setup
- Difficulties : access to the shared data structures, and making sure that all processes are scheduled and that no process is scheduled on two processors at once

Asymmetric Multiprocessing:
- master-slave setup where one processor schedules the other processors
- Only one processor accesses the system data structures, alleviating the need for data sharing


Multicore Processors:

- there has been a recent trend of placing multiple cores on the same chip
- faster and consumes less power


Hyperthreading:

- multiple hardware threads per core
- Takes advantage of memory stalls to make progress on another thread while a memory retrieval happens
- With hyperthreading, one CPU core looks like two cores to the operating system


Methods of Algorithm Evaluation:

- Deterministic evaluation:
Uses a pre-determined workload and defines the performance of each algorithm for that workload. Too specific; requires exact knowledge to be useful

- Queueing models and queueing theory:
Make use of CPU and I/O burst times and known arrival and service rates - can compute utilization, average queue length, average wait time, etc.

Other techniques: simulation, implementation


Producer-Consumer Problem:

Paradigm for cooperating processes
- Producer processes produce information and consumer processes receive it
We need a buffer of items that can be filled by the producer and emptied by the consumer
- Unbounded buffer: places no practical limit on the number of items it can hold. Consumers may wait; producers never wait
- Bounded buffer: has a limit on the number of items it can hold. Consumers wait when the buffer is empty; producers wait when the buffer is full
Producers and consumers must synchronize

Problem with the shared-memory solution to the Bounded-Buffer problem

- The problem with this solution is that at most n-1 items are allowed in the buffer at the same time

The solution to the problem above (the need to hold N items in the buffer) is to introduce a variable called counter that keeps track of the number of items in the buffer
- incremented when the producer adds an item
- decremented when the consumer removes an item

Problems with this counter-based solution:

- Access to shared memory by concurrent processes can result in data inconsistency
- To protect data consistency, there must be mechanisms in place that ensure the orderly execution of cooperating processes

Atomic Operations:

Shared-data statements that can cause inconsistency problems, like:
counter++
counter--
must be executed atomically

This means the operation must run to completion or not at all.
(At the machine level, counter++ loads the value into a register, increments the register, and stores it back to counter - three separate steps that can interleave)
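
One standard way to make the increment effectively atomic is to guard it with a lock, so the load-increment-store sequence cannot interleave between threads. A sketch using Python threads (the counts and thread setup are our own):

```python
import threading

# Sketch: guarding counter++ with a lock so the read-modify-write
# cannot interleave between threads and lose updates.
counter = 0
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        with lock:           # only one thread inside at a time
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock, four threads doing 10,000 increments each always leave the counter at exactly 40,000; without it, interleaved read-modify-write sequences could lose updates.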


Race Condition:

If threads are working on separate data, we don't need to worry about race conditions
However, there can be race conditions when we have shared data: a variable's value becomes non-deterministic, because it can be one of several values depending on the order in which the threads' statements execute


826Critical Section Problem:
827
828Consider a set of n processes competing to access shared data
829Each process has a critical section segment of code ; where in this section it updates variables, changes tables, writes to files, etc.
830At any given time, only process should be allowed to be in its critical section
831This problem is to design protocol to achieve/solve this
832Each process must ask for permission to enter the critical section in its entry section, and then may follow critical section with exit section, and then remainder section
833
834
835Requirements of the critical section problem:
836
837Mutual Exclusion : if one process is in its critical section, no other process must be allowed to be in its critical section
838Progress : If no process is currently in its critical section and there are processes requesting to access their critical section, then the selection of the processes that will enter its critical section cannot be postponed indefinitely
839Bounded Waiting : a bound must exist on the number of times processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted
840
841Assumptions:
842- Assume that processes execute their critical section at a nonzero speed, that is every process finishes execution of critical section once it enters
843- There is no relative speed between processes
844- A process may be stuck in an infinite loop in its remainder section (eg) non terminating while loop


Synchronization Hardware:

Many software-based solutions like Peterson's solution and the Bakery algorithm may not work on modern computers because of how they perform basic machine-level instructions (e.g. out-of-order execution)
Many systems provide hardware support for implementing the critical section
Solutions are based on the idea of locking:

do {
    acquire lock;
        critical section;
    release lock;
        remainder section
} while (TRUE);


We need atomic execution
- atomic = non-interruptible
To achieve this on uniprocessors we can disable interrupts
- currently running code would execute without preemption
- too inefficient on multiprocessor systems - operating systems might use this, but it is not broadly scalable
Modern machines provide special atomic hardware instructions
- either test a memory word and set its value
- or compare and swap the contents of two memory words


test_and_set instruction

boolean test_and_set(boolean *target)
{
    boolean rv = *target;
    *target = TRUE;
    return rv;
}
- executed atomically
- returns the original value of the passed parameter
- sets the new value of the parameter to TRUE


compare_and_swap instruction

int compare_and_swap(int *value, int expected_value, int new_value)
{
    int temp = *value;

    if (*value == expected_value) *value = new_value;
    return temp;
}

- executed atomically
- returns the original value of the passed parameter
- sets the variable value to new_value only if *value == expected_value. The swap takes place only under this condition


Mutex locks:

The previous solutions discussed are too complicated for application programmers
OS designers usually provide software tools to help with the critical section problem
The simplest of these are called mutex locks
To protect a critical section, a process must acquire() a lock and then release() the lock when it leaves the critical section
- the lock's state is held in a boolean variable, available
Calls to acquire() and release() must be atomic
- usually implemented using atomic hardware instructions
- - but this solution requires busy waiting
- - this lock is therefore called a spinlock

acquire()
{
    while (!available); // busy wait
    available = false;
}

release()
{
    available = true;
}


Semaphores:

Semaphore S - integer variable (non-negative)
- Semaphores are used to represent the number of abstract resources available
Can only be accessed via two atomic operations:
- wait(S):
    while (S <= 0);
    S--;

- signal(S): S++;

P or wait() is used to acquire a resource; it waits for the semaphore to become positive and then decrements it by 1
V or signal() is used to release a resource; it increments the semaphore by 1 and wakes up a waiting process, if any
If P is performed on a count of <= 0, the process must wait for a V, i.e. for the release of a resource



Semaphores as a general synchronization tool

- Execute B in Pj only after A has executed in Pi
- initialize semaphore S to 0

Pi           Pj
--           --
A            wait(S)
signal(S)    B


Problem...

Synchronization involves waiting
- Busy waiting uses CPU time that other threads could use. This type of lock is called a spinlock
- - The waiting thread may take cycles away from the thread holding the lock
- - OK for short waits, since it avoids a context switch
- - Priority inversion: when the waiting thread has higher priority than the thread holding the lock --> problem
- a thread should sleep if it will be waiting for a longer time
For longer wait times, we need to modify P and V so processes can block and resume

Semaphore Implementation with no busy waiting:

Instead of the semaphore being just an integer value, make it a struct that also holds a list of waiting processes
In the wait function, start off by decrementing no matter what. If the value has gone negative, add the calling process to S->list, then block()
In the signal function, increment; if the value is still <= 0, remove a process P from the list, then wakeup(P)


Deadlock and Starvation:

Deadlock: two or more processes are waiting indefinitely for an event that can be caused by only one of the waiting processes

Starvation: indefinite blocking. A process may never be removed from the semaphore queue in which it is suspended


In the Producer-Consumer Bounded Buffer problem:

The producer: P(empty), V(full)
The consumer: P(full), V(empty)


Problems With Semaphores:

Incorrect use of semaphore operations can result in deadlock and/or starvation:
- signal(mutex) ... wait(mutex)
- wait(mutex) ... wait(mutex)
- omitting wait(mutex) or signal(mutex) or both


Monitors:

A high-level abstraction that provides a convenient and effective mechanism for process synchronization
An abstract data type whose internal variables are accessible only by code within its procedures
Only one process may be active within the monitor at a time
But not powerful enough to model some synchronization schemes

Condition variables

condition x, y;
Two operations are allowed on a condition variable:
- x.wait() - the process that invokes the operation is suspended until x.signal()
- x.signal() - resumes one of the processes (if any) that invoked x.wait()
- - if no process has invoked x.wait() on the variable, it has no effect

What happens in the event that P is in the monitor and invokes x.signal() while process Q is suspended on x.wait()?
P and Q cannot both execute in the monitor at the same time; if Q is resumed, then P must wait
Options:
- Signal and wait: P waits until Q either leaves the monitor or waits on another condition
- Signal and continue: Q waits until P either leaves the monitor or waits on another condition



The Deadlock Problem:

- What is a deadlock?
A set of blocked processes, each holding a resource and waiting to acquire a resource held by another process in the set


- Definition?
A process is deadlocked if it is waiting for an event that will never occur
- - typically, more than one process will be involved in a deadlock

Conditions for deadlock?

Deadlock can arise if four conditions hold simultaneously:

- Mutual Exclusion: only one process at a time can use a resource

- Hold and Wait: a process holding at least one resource is waiting to acquire additional resources held by other processes

- No preemption: a resource can be released only voluntarily by the process holding it, once that process finishes its task

- Circular wait: there exists a set {P0, P1, ..., Pn} of waiting processes such that P0 is waiting for a resource held by P1, P1 is waiting for a resource held by P2, ..., and Pn is waiting for a resource held by P0


Resource Allocation Graph:

A resource allocation graph is a graph with a set of vertices and edges
- Vertices represent the set of processes and the set of resources
- A directed edge from a process to a resource is a request. A directed edge from a resource to a process is an assignment

If there is no cycle in the graph, there is no deadlock.
If there is a cycle:
- if there is only one instance of every resource - deadlock
- if there are several instances per resource type - possibility of deadlock


Methods for handling deadlocks:
- Ensure that the system will never enter a deadlocked state:
Deadlock prevention
Deadlock avoidance

- Allow the system to potentially enter a deadlocked state, then detect it and recover:
Deadlock detection
Deadlock recovery


Deadlock prevention:

Restrain the ways requests can be made

- Mutual exclusion: not required for shareable resources; must hold for non-shareable resources

- Hold and wait: must guarantee that whenever a process requests a resource, it does not hold any other resources
Requires a process to request and be allocated all its resources before executing, or to request resources only when it has none allocated to it
Low resource utilization; can cause starvation

- No preemption:
If a process that is holding some resources requests another resource that cannot be immediately allocated to it, then all resources it currently holds are released
Preempted resources are added to the list of resources the process is waiting for
The process is restarted only when it can regain all its old resources, as well as the new ones it is requesting

- Circular wait:
Impose a total ordering of all resource types and require that each process requests resources in an increasing order of enumeration


Deadlock avoidance:

- Each process tells the system how many resources it may need
- The system allocates resources to a process only if the system will still be in a safe state after the allocation

What is a safe state?
A system is said to be in a safe state if there is a sequence <P1, P2, ..., Pn> of all processes in the system such that the resources requested by each process Pi can be satisfied with the currently available resources plus the resources held by all Pj with j < i
That is:
- If Pi needs a resource that is not immediately available, Pi can wait until all Pj (j < i) have finished
- When all Pj (j < i) have finished, the requested resources can be allocated to Pi, which can execute, terminate, and release its resources
- When Pi has terminated, Pi+1 can obtain its needed resources, and so on
If there is no such sequence, the system is in an unsafe state

If a system is in a safe state --> no deadlocks
If a system is in an unsafe state --> possibility of deadlock
Avoidance ==> ensure the system never enters an unsafe state


Key idea behind avoidance algorithms:
- if a resource is requested, assume it is allocated
- then check the state of the system after allocation
- is the system still safe? If yes, allocate the resource
- if not, reject the allocation request

Single instance of each resource type -> use a resource allocation graph
Multiple instances of a resource type -> use the banker's algorithm


Resource allocation graph scheme:

- A dashed line between Pi and Rj represents a claim edge, which means that Pi may request Rj
- When Pi actually requests Rj, the claim edge becomes a request edge from Pi to Rj
- When the request is granted, the edge is inverted and becomes an assignment edge from Rj to Pi
- When the resource is released by the process, the assignment edge reconverts to a claim edge


Suppose that process Pi requests resource Rj
- The request can be granted only if converting the request edge to an assignment edge doesn't create a cycle in the graph


Banker's algorithm - multiple instances of resource types

Data structures of the banker's algorithm:

- Available: the number of available instances of each resource type (vector)
- Allocation: the number of instances of each resource type allocated to each process (matrix)
- Max: the maximum number of instances of each resource type that each process may request (matrix)
- Need: how many more instances of each resource type each process may need beyond its current allocation, i.e. Need = Max - Allocation (matrix)


Safety algorithm:

1) Work = Available
   Finish[i] = false for i = 1, 2, ..., n

2) Find an i such that Finish[i] == false and Need[i] <= Work
   If no such i exists, go to step 4

3) Work = Work + Allocation[i]
   Finish[i] = true
   Go back to step 2

4) If Finish[i] == true for all i, the system is in a safe state



Deadlock detection:

Maintain a wait-for graph
- processes are nodes
- an edge from Pi to Pj means Pi is waiting for Pj

Periodically run an algorithm to check whether there is a cycle; if there is, there is a deadlock

- For multiple resource instances, the detection algorithm works like the banker's safety algorithm, except that when initializing the array of Finish values, Finish[i] starts as false only if Allocation[i] != 0 (and as true otherwise)

Recovery from deadlock:
Process termination:

- Abort all deadlocked processes
- Abort one process at a time until the deadlock cycle is eliminated
How do we determine the order in which to abort processes?
1) Priority of the process
2) How long the process has been executing, and how much longer it needs to complete
3) How many resources the process holds
4) How many more resources the process needs
5) How many processes will need to be terminated
6) Is the process batch or interactive?

Recovery from deadlock:
Resource preemption:

Successively preempt some resources from processes and give these resources to other processes until the deadlock cycle is broken

Considerations:
- Selecting a victim: how do we decide which process to preempt resources from?
- Rollback: how do we preempt resources from a process? Return the process to some safe state and restart it from there
- Starvation: the same process may always be picked as the victim


------------------------------------------------------------------------
Main memory
------------------------------------------------------------------------

Virtualizing Resources:

The physical reality is that all processes and threads share the same hardware
For this reason there needs to be a way to both
- multiplex the CPU (scheduling)
- multiplex memory

Why worry about memory multiplexing?
- the complete working state of a process and/or the kernel is determined by its data in memory (and registers)
- consequently, we can't just let different processes use the same memory
- we don't want processes to access each other's memory either


Important aspects of memory multiplexing:

- Controlled overlap:
Processes should not collide in physical memory
Conversely, we would like them to share memory when desired (for communication)

- Protection:
Prevent access to the private memory of other programs
Kernel data is protected from user programs

- Translation:
There needs to be a means of translating accesses from one address space (logical/virtual) to the other (physical)
When translation exists, processes use logical addresses and physical memory uses physical addresses


Different Addresses and Binding

- Symbolic addresses: variable names in the program, function names, etc.
- Logical addresses: an offset within the logical address space
- Physical addresses: the address seen by main memory

Addresses can be bound to final values anywhere along this path:
- Compile Time
- Load Time
- Execution Time


Dynamic Linking:

Linking is postponed until execution time
A small piece of code, the stub, is used to locate the appropriate memory-resident library routine
The stub replaces itself with the address of the routine and executes the routine
The operating system is needed to check whether the routine is already in memory


Dynamic Loading:

- A routine is only loaded when it is called
- Good memory-space utilization, since unused routines are never loaded
- Useful when large amounts of code are needed to handle infrequently occurring cases
- No special support from the operating system is required; implemented through program design


Binding Time Tradeoffs:

Address binding of instructions and data to memory addresses can happen at three different stages

Compile Time (early binding):
- if the memory location is known a priori, absolute code can be generated; code must be recompiled if the starting location changes
- no logical addresses in this case; the compiler generates physical addresses directly
- produces efficient code
- checking can be done early
- allows estimates of running time and space
- not portable
- not flexible for memory allocation

Load Time (delayed binding):
- linker, loader
- the compiler must generate relocatable code (addresses that can easily be rewritten as absolute physical addresses at load time)
- produces efficient code, allows separate compilation
- portability and sharing of object code
- not flexible for memory allocation

Execution Time (late binding):
- logical addresses and physical addresses
- the compiler generates logical addresses
- code is less efficient; checks are done at runtime
- needs hardware support to translate logical addresses to physical addresses at execution time
- flexible for memory allocation
- portable


Logical vs Physical Address Space:

- The concept of a logical address space that is bound to a separate physical address space is central to proper memory management:
Logical address: or virtual address - generated by the CPU
Physical address: the address seen by the memory unit
- Logical addresses and physical addresses are the same under compile-time and load-time binding
- Logical and physical addresses differ under execution-time address binding


Contiguous Allocation:

Memory is allocated in contiguous partitions for processes
- relocation and limit registers are used to determine the partition at runtime
- the relocation register contains the value of the smallest physical address; the limit register contains the range of logical addresses - each logical address must be less than the limit register
- the operating system has its own partition

Relocation register diagram:
The logical address from the CPU is added to the relocation register to form the physical address in memory

Relocation and limit register diagram:
The logical address is first checked against the limit register; if it is less, it is added to the relocation register and the resulting physical address is used; if it isn't, there is a trap: addressing error


Contiguous allocation (continued):

Multiple partition allocation:
Holes - blocks of available memory; holes of variable sizes are scattered throughout memory
When a process arrives, it is allocated memory from a hole large enough to accommodate it
The operating system maintains information about:
- allocated partitions
- free partitions (holes)

Dynamic storage allocation problem:

How to satisfy a request of size n from a list of free holes:
- First fit: allocate the first hole that is big enough
- Best fit: allocate the smallest hole that is big enough. May have to search the entire list unless the list is sorted by size. Produces the smallest leftover hole
- Worst fit: allocate the largest hole. Produces the largest leftover hole

First-fit and best-fit are better than worst-fit in terms of storage utilization
First-fit is the best in terms of execution speed


Fragmentation:

External fragmentation - enough total memory space exists to satisfy a request, but it is not contiguous
(if all these small free blocks of memory were together, we could divide the space up and use it at our discretion for a process of any size; scattered as they are, the small ones may be wasted)
Internal fragmentation - allocated memory may be slightly larger than the requested memory; the difference is memory internal to a partition that is not being used
(tracking many tiny leftover holes costs more than the holes are worth, so available memory is divided into fixed-size blocks and as many blocks as needed are allocated. More memory may end up allocated than requested; the unused remainder inside the last block is internal fragmentation)


Contiguous allocation suffers mainly from external fragmentation (as well as internal fragmentation)
We can reduce external fragmentation by compaction:
- shuffle memory contents to place all free memory together in one large block
- compaction is possible only if relocation is dynamic and done at execution time
- compaction can cause problems for I/O due to in-flight DMA
Solutions:
(1) Pin memory used by I/O devices
(2) Do I/O only into OS buffers

What is segmentation?
A memory management scheme that supports the user's view of memory
- a program is a collection of segments
- a segment is a logical unit in a program
- protect each entity independently
- allow each segment to grow independently
- share each segment independently

What are the STBR and STLR?
Segment Table Base Register (STBR): points to the segment table's location in memory
Segment Table Length Register (STLR): indicates the number of segments used by a program


- Dynamic allocation of segments via best fit/first fit comes with problems of external fragmentation
- Protection of segments can be achieved through protection bits associated with segments (such as read/write and execute bits)
An example of this protection: when an array is in a separate segment, the hardware can check for illegal array indexes

Paging:

Paging solves the external fragmentation problem by dividing physical memory into equal-sized frames and logical memory into pages of the same size
- to run a program of size n pages, find n free frames and load the program

The page table is stored in main memory rather than in registers because of its typically large size
- the page-table base register points to the page table in memory
- the page-table length register indicates the size of the page table

The problem with this implementation is that it requires two memory accesses: one to access the page table itself, and another to access the data at the offset
- this can be mitigated using a TLB (Translation Look-aside Buffer) that functions as a cache for frequently accessed page-table entries

The TLB keeps the page number and associated frame number for quick translation. On a TLB miss, the page table in memory must be consulted for the correct frame

Memory protection:

Memory protection in paging is implemented by associating protection bits with each frame
A valid/invalid bit, read/write bit, and execute bit may be attached to each entry in the page table to detail what is allowed and what isn't


Since some page tables may be too big to fit contiguously in memory, a page table may be divided into two levels, where the outer page table indexes the inner page tables, which in turn locate the appropriate frame


Inverted page table:

Instead of having one page table for each process, we have one single table that keeps track of which page is mapped into each frame of physical memory
- one entry for each real page (frame) of memory
- reduces the memory required to store the table
- increases memory access time, since the table must be searched
- a hash table can be used to limit the search to one, or at most a few, entries

The CPU generates a 3-part address with process ID, page number, and offset
- the table is searched for the matching (process ID, page number) entry
- the index of the matching entry gives the frame number, which is combined with the offset
- physical memory is accessed

Shared Pages

- Re-entrant or pure code that is non-self-modifying and common to multiple processes can be stored in shared pages
- processes keep their own private data and code in separate pages of their logical address spaces
- reduces the amount of memory needed by multiple processes running the same program

Segmented Paged Memory:

- A segment entry refers not to the base address of the segment in memory, but to the base address of the page table for the segment
- overcomes the external fragmentation problem of segmented memory
- retains the benefits of segments, e.g. easier control of permissions over a memory region


Need for virtual memory:

- Separation of user logical memory from physical memory
- only part of the program needs to be present in physical memory for execution, not the whole of it

What is demand paging?
Bring a page into memory only when it is needed by the process's execution
- you can support more users and more multiprogramming with this approach
- less I/O and memory needed
- faster response


Steps to Handle a Page Fault:

- When a page fault occurs, execution traps into the OS
- The virtual memory address is checked for legitimacy. If it is illegitimate, suspend execution and kill the process; if it is legitimate, proceed to find the page on the backing store
- Find a free frame in physical memory and fetch the disk contents into that page frame
- Restart the instruction as if the page had always been present

What if there is no free frame?
Use a page replacement algorithm to determine which page to swap out, and swap in the required page in its place
Use an algorithm that will result in the minimum number of page faults and page replacements

Replacing unmodified pages is cheaper than replacing modified pages, because unmodified pages don't need to be written back to disk



Possible ways of implementing the LRU algorithm:

Counter implementation:
- every time a page is referenced, copy the current clock value into its counter
- when a page needs to be evicted, evict the frame holding the page with the smallest counter value
- slow, in that it has to search through the page entries

Stack implementation:
- when a page is referenced, move it to the top of the stack
- no search is required for replacement

Counting Algorithms:

- Keep a count of the number of references that have been made to each page
From here, you can use the LFU (Least Frequently Used) algorithm or the MFU (Most Frequently Used) algorithm


Page Buffering Algorithm:

Keep a pool of free frames
- when a page fault occurs, choose a victim frame as usual
- the desired page is read into a frame from the free pool before the victim is written out
- this allows the process to restart sooner; the victim is written out later and its frame is added to the free pool
Expansion: maintain a list of modified pages. When the paging device is idle, write modified pages to disk and clear their modify bits


Thrashing:

Occurs when a process spends more time paging than executing, because it is busy swapping pages in and out


Locality: a set of pages that are actively used together, on account of being in a similar part of a program
- localities may overlap
- processes migrate from one locality to another

Thrashing occurs when the combined size of the localities is greater than the total memory size

Working Set:
The working set is an approximation of the size of a process's current locality
If the total number of frames demanded by the working sets of all processes is greater than the number of available frames, then suspend one of the processes

What is an inode?
An inode (index node) is an important control structure the OS needs to store the information required to access a particular file. Several file names may be associated with an inode, but each file is controlled by exactly ONE inode