· 7 years ago · Dec 11, 2018, 06:24 AM
1Smallest number that can be represented in 2s complement:
2
3-2^n : 1000000
4
5Largest number that can be represented in 2s complement:
6
72^n - 1 : 0111111
8
9Sign Extension:
10
11- to add two numbers of different lengths, well need to pad them on the left for addition.
12- you cant simply pad the remaining bits on the left with 0s because it will violate the number if it is 2s complement
13- only the MSB can be used to pad the number on the left
14
15How do we know if we have overflow?
16
17- sign of both operands is same and sign of sum is different
18
19Computer Operation:
20
21- Computers only understand machine level instructions that are simple and low level
22- this languages simplifies the process of hardware design
23- however, its still hard to write complex programs this way in machine language, so we prefer to use higher-level languages instead
24
25
26How can we execute these high level languages (L1) on standard hardware?
27- We can make use of compilers and interpreters to either
28translate: replace whole blocks of code or entire programs with their machine language (L0) equivalent
29interpret: taking each instruction from the high level language and execute it one at a time
30
31- Translation is slower, but easier to debug. Interpreting is faster and more efficient
32- We can thus think of compilers or interpreters as virtual machines for L1, because as far as us users are concerned the L1 is executed
33
34Hierarchy of VMs:
35
36- The L1 and L0 cant be too different, but we can have a hierarchy of virtual machines that do the job of breaking down the language into its relevant complexities at its level, thereby hiding the details of the level below it
37- Virtualization is thus achieved, and gives us a layered or structured way of looking at computers
38
39Layered or Structured View:
40
41- Layers move from the top down, in decreasing order of abstraction. Each layer groups together similar elements and abstracts away details of whats below. This way, the understanding of the system is simplified and complexity of the system is hidden.
42- In our theoretical model, the bottom layer belongs to hardware, transistors, etc. - but in practice and reality there are layers below it as well.
43
44
45Hardware has influence in the definition of many basic issues:
46
47- digital hardware has 2 states to incorporate the functionality of binary arithmetic
48- The speed of computer memory is dependent on its physical type and size
49These details are abstracted away from a programmer making use of the level meant for the high level language, but hardware is an important component to computer functionality
50
51
52The Layered or Structured View Model follows:
53
54- The Application Program Layer:
55High level languages (L1) are written in this language to solve a problem and incorporate heavy abstraction in terms of the way physical features of the computer are interacted with as simple variable names, or control structures, objects, etc. The high level languages written in this layer are translated to the language of a lower layer to be directly executed by the hardware. They also make use of function libraries supplied by the compilers or OS.
56
57- Assembly Language Layer:
58Abstractions are to a lower extent in this level, but there is more direct interaction with the hardware and more details of the hardware are exposed for this level to make use of. The language written in this layer views the computer as an Instructions Set Architecture, and also has access to OS-level abstractions
59
60- OS Layer
61This layer has the ability to implement multiprogramming. It also gives users privilege in their execution of hardware details and resources. It also provides protection to the system from the user and to the user from other users. This layer also provides several abstractions to the user in terms of the way its features can be used and how they are actually handled in the background (e.g) virtual memory.
62access to the functions in the OS is through the system call interface
63
64- Instructions Set Architecture Layer:
65ISA is the assembly programmer's view of a computer. The abstractions it provides are in terms of its data types, instructions, memory addressing, addressing modes and registers.
66
67- Microarchitecture Layer:
68This layer primary deals with the organization of a computer, in that it describes major units of a system(ALU, etc.), how they are interconnected and protocols for their communication. This layer then abstracts the digital logic and hardware layer
69
70- Digital Logic Layer:
71This layer details the gate level functioning of a computer and shows the gate-level implementation. Gates are basic building blocks at this level, but it still abstracts away further details about transistors like their analogous nature, which we ignore and only treat them as digital switches that are either ON or OFF
72
73- Hardware Layer:
74Describes a computer at a transistor level. These transistors are abundant in modern processors, and are integral to the design of the system. Still, even they need their own abstraction to make the design of processors possible through the grouping of the construction of semiconductors and transistors, and gates and memory cells.
75
76
77System Organization Terminology:
78
79CPU: Central Processing Unit. Same as processor. Consists of data path and control unit.
80Data path: ALU plus register file
81Memory: DRAMs plus controller. Always byte-addressable
82Bus: A collection of wires and protocol for communication.
83
84- I/O devices are viewed as memory. You read/write data to/from them
85
86
87System and CPU Organization:
88
89Within the CPU is the ALU, Registers, and Control unit. This CPU box is connected to a bus that externally runs along and branches into connections with main memory, the disk and a printer
90
91
92CPU Instruction Execution Cycle:
93
94- Fetch instruction from memory and load into Instruction Register (IR)
95- Increment Program Counter (PC)
96- Decode Instruction in IR
97- Fetch any additional operands specified by fields in IR
98- Execute the instruction specified by the IR
99- Store result (if necessary) as specified by a field in IR
100
101The above loop is done forever until a HALT instruction stops the processor. This is the main algorithm for the control unit
102
103
104Data Path Operation:
105
106The data bus reads operands from the data file, specifies the ALU operation, waits for the ALU ( in 1 clock cycle ), writes result from result bus to intended register, and sets the N, C and Z flags
107(eg) what x86 does when executing the instruction ADD R4, R6
108
109
110There can also be other data path organizations with a number of operands different from that of two:
111- 3 operand specifiers: when the destination register and the source register are different
112- 1 operand specifier: when the destination register is already predefined
113- 0 operand specifier: when the source and destination registers are already predefined and is dealing with more structural functions like stack[top], etc.
114
115
116Memory View:
117
118- Memory is organized as a linear array of words, where each word is byte addressable
119
120Register Transfer Language (RTL):
121
122- Data path operations between registers can be controlled using RTL
123- Can handle simultaneous operations
124- Useful shortcut for describing operands that involve mainly registers
125
126The same works for control logic
127
128
129
130Digital Hardware Level:
131
132- Computers are built using gates and memory
133- Gates and memory are constructed using transistors. These transistors can be thought of as electronic switches that give rise to a binary system of input
134- Information values are in the form of voltages
135- Digital circuits can be in only one of these states
136
137
138Transistors:
139
140These transistors, or electronic switches, are of two types:
141
142N-MOS Transistors: When the input voltage is LOW, the switch is OPEN. When the input voltage is HIGH, the switch is CLOSED
143P-MOS Transistors: When the input voltage is HIGH, the switch is OPEN. When the input voltage is LOW, the switch is CLOSED
144
145
146Construction of an AND gate:
147
148- A and B N-MOS transistors connected in series to the high end, and the A and B P-MOS transistors connected in parallel to the low ends and ultimately to the output
149
150
151One bit adder:
152
153- Two boolean inputs A and B are added, their sums are an XOR gate, and their carry value is an AND gate
154
155
156Boolean Operator Precedence:
157
158- NOT - highest
159- AND
160- OR - lowest
161
162Function Description:
163
164- In boolean functions, the functions can be described using a truth table unlike regular functions because the input space is a small set of boolean inputs.
165- Using the truth table, we can obtain a canonical form of the function. This is a sum of all the products of inputs that result in an output of 1
166
167Boolean algebra can be used to transform a function to a different but equivalent form
168
169Composite Gates: composite gates are gates that are built using the basic logic gates like AND, NOT or OR. (e.g) NAND, NOR. In some technologies however, it is worth knowing that NAND can also be considered a basic gate, depending on the hardware implementation
170
171Multiplexor:
172
173Multiplexors are boolean functions with multiple inputs, control inputs, but one single output. Multiplexors usually select one of the many data inputs as its output
174They are usually specified as n-to-1 multiplexors that have n inputs and 1 output. n usually takes the value of 2, 4, or 8. The number of control inputs in a multiplexor follows: log2(n)
175In the truth table for a multiplexor, the value for the output will have different data input variables instead of a 0 or 1, which means its output value is dependent on its input
176
177Was that a truth table? Can it be simplified? How to implement this function?
178
179Encoders:
180
181Encoders are boolean functions with 2^n inputs and n outputs. The function table for an encoder might follow, for instance, 8 inputs (=2^3) and thus 3 different Y values as 3 different outputs
182
183
184-----------------------------------------------------------------------------
185
186What are the advantages of programming in high level languages?
187
188- The code is portable. It will work on any system
189- Easier to code in due to the constructs of the language: faster development, fewer lines of code, easier code maintenance
190- Assembly code is not portable
191- Assembly code is difficult to learn
192- Assembly code is also error prone
193
194
195However, you might still choose to program in assembly. Why?
196- Assembly code gives you direct access over the hardware functionality
197- Compilers are not perfect at generating optimized machine code
198- Embedded systems programming
199- In some cases, performance of the system is very important, in which case assembly code is the way to go
200
201
202Memory:
203
204- The programmer views memory as a contiguous array of bytes.
205- The index of element of this array is called an address for the memory
206
207Registers:
208
209Sometimes, MIPS needs immediate, temporary locations to the store the results or data from intermediate computation results or frequently used data
210For this purpose, MIPS has 32 32-bit general purpose registers for temporary data storage
211
212
213Data and Text directive:
214
215.data
216Defines the data segment of the program containing data. The program's variables should be defined under this directive
217
218.text
219Defines the code segment of a program containing functions and assembly instructions
220
221main
222special label that defines the entry point of the program
223
224
225MIPS provides a syscall instruction to obtain services from the operating system. Many of these services are provided in the MIPS simulators
226
227
228Bit Masking:
229
230Mask is a special binary number that is used in bitwise binary operations to: query bits, clear bits, set bit values to 1
231
232Binary multiplication:
233
234- Shifting a binary number n bits to the left is equal to multiplying that number by 2^n
235- Similarly, shifting a binary number n bits to the left is equal to dividing that number by 2^n
236When you have to multiply by binary numbers in assembly code, just shift the bits. For numbers that are not powers of 2, decompose the number into a sum of powers of 2 and do shifting for each power and then add all the final results
237
238Assembly Control Flow:
239
240Assembly control is maintained by the program counter that keeps track of the next instruction that is to be executed
241
242---------------------------------------------------------------------------------------------
243
244Decoders:
245
246- n inputs and 2^n outputs
247
248for an 8-bit decoder, the output of 1s follow a diagonal line starting from the first column of outputs in the first output to the last column in the last output
249
250Comparators:
251
252- Boolean functions that take 2 inputs and give one output based on a comparison of the inputs
253(e.g) XOR gate
2541-bit comparator = NOT( XOR(A, B) )
2552-bit comparator = NOT ( OR (XOR(A1, B1), XOR(A0, B0)) )
256
257Shifters:
258
259- Shift By One Circuits: n bit inputs are shifted 1 bit to the left or right as specified by 1 bit control input
260- Implemented using other gates
261AND gate is used to disable inputs (required a 2nd control input), and the OR is used to combine different possibilities
262
263
264Full Adder:
265
266- Full adder has an additional input for carry in
267
268Adders (in general):
269
270- can be a half adder with 2 bit inputs and 2 bit outputs
271- or a full adder with 3 bit inputs and 2 bit outputs
272
273Addition of multiple bit numbers:
274
275- multiple bit numbers can be added together using a ripple carry adder design, where the carry out of one adder becomes the carry in of the next adder, and the carry in of the first adder is a 0
276- How do you implement subtraction in this circuit? Add a 2s complement of the subtrahend by - taking a complement of the subtrahend and adding a 1 using the least significant carry in
277- How do you detect overflow in this circuit? By keeping a tab on the carry in and carry out of the Most Significant Bit (MSB)
278
279We can have an XOR gate connected as input to each gate along with one of the inputs (A) so that a 0 can specify addition and 1 can specify subtraction
280
281Hardware peculiarities ( how is the software of either an AND or OR executed )?
282
283- The trick is to know that hardware is always active, so the output of an AND and OR gate will always be present
284- To 'select' the desired output, we just AND the corresponding input with a 1 using a control input (dashed lines)
285- The outputs not selected is made 0
286- Control input is typically generated by the control unit
287
288
289ALU
290
291- A functional unit that performs:
292Arithmetic Operations
293Logical Operations
294on given data types: 8, 16, 32 or 64 bits
295
296
297Packaging Gates:
298
299- Gates can be packaged into integrated circuits on different scaled:
300small-scale integrated (SSI) circuits, medium-scale integrated (MSI) circuits, large-scale integrated (LSI) circuits, very large-scale integrated (VLSI) circuits
301
302
303So far, we've only been dealing with memoryless functions ( combinational logic)
304Now, we'll also need to store program data and intermediate values ( sequential logic )
305
306Storage Elements:
307
308- Are made from sequential circuits. (But require feedback )
309- Require timing to be taken into account. (Will assume non-zero gate delay)
310- New state is a boolean function which takes into account the inputs from deltaT earlier, and the old state from deltaT earlier
311- A combinational circuit with latency deltaT: output changes deltaT seconds after an input change, but the change is continuous as long as inputs change
312
313SR Latch:
314
315- SR is a Set-Reset Latch. Q and Q' are outputs that are continuously fed back as inputs making it sequential
316- An SR latch is a bi-stable circuit that stores binary values. Q=1 is a "1" state, Q=0 is a "0" state
317- Can only be in one of these states. Output change requires that S and R both be applied. If S=R=0, then Q remains unchanged
318
319S = R = 1 leads to an unstable state, hence the output is undefined
320
321Problems with SR Latch:
322
323- Combinational functions take time to settle. The output varies for a while before going to a correct value
324- If any of S or R change, then Q will change
325
326How do we make sure changes occur
327- only when desired?
328- once per clock cycle?
329
330The answer to both these questions is by modifying the latch and incorporating a clock as one of the inputs
331
332Clock
333
334- Any state change in the system is referenced to clock
335- A clock is also the basic unit of time for executing instructions ( although an instruction may take several clocks to execute )
336- We need to take into account the time taken for signals to propagate through gates
337
338Clocked S-R Latch:
339
340- Inside a computer, we only want the outputs of gates to change only at specific times
341- we can add some circuitry to make sure that outputs of the gates only change when the clock is active (clock is 1)
342
343Thus:
344- Q only changes when the Clock is a 1
345- when the Clock is a 0, neither S or R reach the NOR gates
346
347
348There are still problems that arise from the use of an SR Latch:
349
350- we still have an undefined state that exists for S=R=1. A solution to this is the D-Latch
351- There is still a latency of inputs/clock. One possible solution is to play with the duty cycle and limit when high, or use a flip flop
352
353
354D Latch:
355
356- Now we only have one input D
357- When the input D is high when the clock is high, the circuit will remember the value 1
358- When the input D is low when the clock is high, the circuit will remember the value 0
359
360- If the clock is active, Q will be set to whatever value D currently is. If the clock is low, Q will continue to remain in the state it was previously in
361
362
363Flip-Flops
364
365Flips flops arise out of the need to have the clock cycles become narrower, but theres only so much more that you can push its narrowness due to the nature of frequency, the solution to thus omitting chance of changes to D affecting the output is to only allow the input to change on the rising edge of the clock, and there we have our narrow time window
366
367- D-latches still possess a problem with input changes during a time clock of 1
368- Flip-Flops are a better way of solving the problem of inherent input change arising from the SR Latch
369Flips Flops may be of two types: Master-slave flip flop, or edge triggered flip flop
370- changes can only occur on a clock "edge" or during a transition of the clock from 0 to 1 or 1 to 0
371
372D flip flops Q will take whatever the value of D is when the clock is transitioning from 0 to 1, at any other value of the clock, Q will stay in whatever state it is in
373
3748-bit memory:
375
376- we can make use of 8 D-latches to create an 8-bit memory
377- this consists of 8 inputs that we want to store, all that are written in the same time. ( all 8 latches use the same clock )
378
379Registers:
380
381- Notationally, 8-bit memory is referred to as one single unit of storage called a register instead of addressing each storage element individually
382
383
384Master-Slave D FF:
385
386- Gated D latch is level sensitive
387- For master latch: when Clk = 1, Qm = D
388- For slave latch, when Clk' = 1, Qs = Qm
389
390- When the clock is high, Qm changes to D
391- And since clk = 0 = clk' = 1, when the clock changes to low, Qs = Qm
392- since the output value change is triggered when the clock goes from high to low, this D FF is negative edge triggered
393
394Random Access Memory (RAM):
395
396- This is a type of semiconductor memory (the other type being ROM)
397- Consists of many bits on one IC
398- Each bit is a latch
399- RAM can come in two types: SRAM (Static RAM) and DRAM (Dynamic RAM)
400
401- RAMs have a constant access time from address to data output, independent of the address in memory. This is unlike a tape or a disc
402
403Memory Cell Organization:
404
405- In a SRAM, the inputs are the Data_In (D), the Clock (C) and the Write (En), to one single output Data_Out(Q)
406- In a DRAM, there are controls that access a bit value and a bit cell that stores charge. When it needs to write, charge flows in and when it needs to read it lets charge flow out
407
408--------------------------------------------------------------------------------------------------------------
409
410
411Arrays:
412
413- A collection of objects of the same type stored contiguously in memory under one name
414- The index of each element of the array (byte) is called ADDRESS
415
416
417MIPS Addressing Modes:
418
419- Register Addressing: Data is stored in registers
420
421- Immediate Addressing: Data is encoded in instructions (max of 16 bits). Can only set half of the register.
422The MIPS equivalent of the C '&' operator is the la instruction that loads variable address in a register
423The MIPS equivalent of the C '*' operator is the ($reg) command that treats $reg value as memory address
424
425- Base Addressing
426Base address is stored in a register. Index is a relative offset to the base address and is an immediate value (16 bit number)
427
428Endian-ness
429
430- Endian-ness specifies how multi-byte data is stored in byte-addressable memory
431- Little Endian (x86) - the least significant byte is stored in the lowest address
432- Big Endian - The least significant byte is stored in the highest address
433
434How to declare an array in MIPS:
435
436.data
437
438label: data_type comma seperated list
439
440
441Disk Memory:
442
443- Disk is a magnetic memory
444- sequential access
445- reads and writes are performed on blocks of data (much more efficient this way)
446- much slower than main memory
447- also much larger than main memory
448
449
450Platter breakdown:
451
452- on one platter, data is written to and read on concentric circles called tracks
453- each track is further divided into a number of sectors
454- mechanical arm is responsible for moving the read/write head to the desired track
455
456
457Parameters:
458
459- Rotation Speed: determines how soon the desired data appears under the head
460- Seek speed: speed in moving from track to track
461- The average time to find data is thus: 1/2*rotational delay + seek time
462
463Transfer rate is usually around 100 MB/sec
464
465
466Magnetic Disk Organization (Terminology):-
467
468Track: Sequence of bits having same radius on disk
469Cylinder: Set of tracks having same radius on many platters
470Sector: Fixed length within a track
471Seek Time: Time it takes to position the head over a track
472Rotational latency: How much time it takes for a disk to rotate to a desired sector
473Head transfer rate: How fast data can be transferred once the head is in the right position
474
475Operation Sequence:
476
477- Address is supplied as {platter, track, sector}
478- Read/Write Head rotates to track
479- Platter rotates to sector
480- Now read/write sector
481
482Can also read a cylinder: read the track from all platters at once
483
484
485Optimization:
486
487- When required, read an entire sector or track or cylinder
488- store the read data in the memory in the controller. (the controller satisfies future requests from RAM)
489- Disk controller and the OS may optimize by selecting disk I/O request order to minimize seek time
490
491
492Data Path Operation:
493
494- Read operands from register file
495 Place on buses A and B to ALU
496
497- Specify ALU operation
498
499- Wait for ALU
500
501- Set flags N, C, Z
502
503- Write result bus to selected register
504
505
506Data Path timing:
507
508- Control signals change after next rising edge
509- Registers read after the control signals become valid
510- ALU operation takes some time
511- Result written to register file on the next rising edge
512
513
514What is memory:
515
516- linear array of words (each word typically 4 or 8 Bytes)
517- memory is byte addressable
518
519
520Steps for memory access:
521- generate memory address
522- access memory address
523- read data from memory into MDR
524- move data to the register file or ALU
525- put data out to memory