Cse560 Heym

Hints on Working on a Team
Until this point in your academic career, you have worked primarily independently, and on projects of very limited scope. Once you are employed as a programmer, you will rarely work independently on a project again. A skilled programmer can only turn out an average of 10 lines of well-designed, documented, and debugged code per day. Most systems programs and larger applications require many thousands to hundreds of thousands of lines of code, so they are clearly beyond the scope of a single hot-shot programmer; the time to market would be too great. Hence working with a team to produce a major software system is an essential part of being a computer scientist. The stereotype of a computer programmer as a loner who communes with his/her machine to avoid people could not be further from the truth. Professional programmers spend more time in design meetings, in code walk-throughs, and communicating with other programmers, users, system maintainers, marketers, etc. than in front of a monitor. That should be your experience in this course as well.
One problem with working as part of a team, and working on very large software systems, is that the program is too large for any one programmer to understand the whole system. A software system such as an operating system has more components than a Boeing 747, and it is clear to us that no one person understands each and every component of a 747, much less their interactions. Hence adherence to good software engineering practices is essential if the results of a large group programming effort are ever going to work together, be suited for debugging, be maintainable, be modifiable, or meet the original requirements.
Working with a team can be anywhere from fun to awful, depending on your attitude and the attitudes of your teammates. Since you may not know the work habits or attitudes of your new team members, how can you ensure a successful project and fairness in grading?
Everyone needs to be involved in each aspect of the project (design, documentation, test planning, coding, and testing). In order to work on a team you will have to be considerate of your teammates. They are all high school graduates and college juniors or above. In spite of your first impression, they are capable, intelligent people, and deserve respect. Most team problems occur because a member of the team currently has too many commitments in life. This overcommitment may be due to school class load, work issues, family issues, etc. It does not necessarily mean that they are lazy or stupid. However, if you feel that a team member is not attempting to contribute to the team, just let me know. We can have some friendly discussions and many times resolve the problem.
You should all agree on a common language and a common hardware platform. Unless you are all extremely skilled (or masochistic), you should not use multiple platforms.
Planning is the most important thing you can do. One hour spent in a preliminary team meeting saves many individual hours of redundant and possibly incompatible coding. Many students seem to think time spent planning and designing away from the keyboard is wasted effort. Such effort will not be wasted in this class.
Based on handout prepared by Al Stutz
I feel that my teammates are not doing their fair share.
Contact the graders and/or the instructor. We will have a group meeting or individual meeting to determine exactly what the problem is. The earlier we discover and correct problems, the more flexibility we have in making adjustments.
My teammate is a coding whiz-kid and has decided to simply do it all by herself or himself.
Contact the graders and/or the instructor. We will have a group meeting or individual meeting to determine exactly what the problem is. The earlier we discover and correct problems, the more flexibility we have in making adjustments. The whiz-kid who prevents others from working on the lab will have his or her lab grade reduced.
Two of my teammates are long-time buddies and they do everything together (software-wise) and leave me out.
Contact the graders and/or the instructor. We will have a group meeting or individual meeting to determine exactly what the problem is. The earlier we discover and correct problems, the more flexibility we have in making an adjustment.
What is differential grading? Why should we avoid it?
If the graders and I determine that a fair share of the work was not done by all team members, then different grades will be assigned to each team member. If one team member does it all, he/she may get a lower grade than the rest of the team. But no one will be happy with the grade. The team members and the instructor will have a meeting to discuss the problem and hopefully correct it. However, a differential grade may still be assigned.
Can I still pass the class without doing anything on the labs?
Absolutely not.
General
Problems are hardly ever fully defined. Welcome to the real world! You get a set of end user requirements [1], then you need to study, examine, and sketch out issues and concerns. You will need to ask leading questions. While I do not intentionally leave information out, end user requirements hardly ever match the level of detail needed by the programmer.
You and your teammates must agree on some set of standard coding practices:
Will variables be passed as parameters or will you use global variables?
A standard format for variable names.
Variable names that represent a meaning. Names such as a, x, z, and n are not as clear as number of cases, location counter, etc.
You should agree to a maximum module size. If a module needs to be larger than that, break it up. My rule is two screens' worth, including comments.
You will need to agree how to share files and how to know when a module should be added or a new update added to the lab. You might designate one team member as having sole responsibility to update the program files. If everyone makes changes then you will have a real mess! Another alternative is to make use of the Unix cvs utility.
[1] The somewhat misleading term "specifications" is often used here.
You should learn to use the make utility. You should consider writing several Unix scripts that will change the permissions of files for easy compiling. You might also want to write a script to facilitate compilation via the Unix make utility. Use common modules to avoid duplication.
What if we have a question? Order of operations for maximum success!
1. email the grader
2. visit the graders during office hours
3. call the grader
4. email the instructor
5. use the instructor's office hours
6. call the instructor
All these options are acceptable and encouraged.
General Things (before you write any code):

1. Establish a regular meeting at a time and place where everyone can meet.
2. Keep minutes (i.e., the official record of the proceedings) of meetings and/or discussions.
3. Publish clear assignments and due dates.
4. Think about testing as you design the system. What are the syntax rules? How will you discover very subtle defects?
Design:
1. Lay out a top-down design.
2. Look for routine modules you will need repetitively, such as: binary to hex, binary to decimal, decimal to hex, decimal to binary (could these really be one routine? should they be?), building a table (what's the relationship, if any, between a table and a partial map?), searching a table, . . . .
3. Write a dummy module for each routine needed. The following sample dummy module implementation is written in pseudo-code.
Check_for_overflow: begin
  begin comment
    Procedure Name: Check_for_overflow
    Description: This routine determines whether the results of the
      operation would have resulted in an overflow. In this system the
      data length is 24 bits so the results range from -8,388,608 to
      +8,388,607
    Calling Sequence: (temp_result: Integer, overflow: Boolean)
    Input parameters: temp_result
    Output parameters: overflow
    Error Conditions Tested: overflow
    Error Messages Generated: message ###
    Original Author: Al Stutz
    Procedure Creation Date: February 22, 1995
    Modification Log:
      Who    when     why
      Al     3/11/93  Forgot to send message to screen
      Wayne  1/3/02   Corrected mismatch between data length and results
                      range in description; introduced quote marks (")
                      into pseudo-code; changed indentations in the
                      comment; inserted colon (:) into pseudo-code;
                      introduced types into the calling sequence; changed
                      parameter name from flag to overflow; changed
                      ambiguous "Initialize overflow" to more precise
                      "Set overflow to false" in the pseudocode;
  end comment
  Set overflow to false
  Write "In routine: Check_for_overflow"
  Write temp_result, overflow
end;
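As a rough illustration of how the dummy module above might later be expanded, here is a minimal Python sketch. The function names and the choice to return overflow as a value (rather than through an output parameter, as in the pseudo-code) are assumptions made for this example; only the 24-bit range comes from the description above.

```python
# Hypothetical Python rendering of the Check_for_overflow dummy module.
WORD_BITS = 24                           # data length stated in the description
MIN_VALUE = -(1 << (WORD_BITS - 1))      # -8,388,608
MAX_VALUE = (1 << (WORD_BITS - 1)) - 1   # +8,388,607


def check_for_overflow_stub(temp_result: int) -> bool:
    """Dummy form: announce the call, echo the values, report no overflow."""
    overflow = False
    print("In routine: Check_for_overflow")
    print(temp_result, overflow)
    return overflow


def check_for_overflow(temp_result: int) -> bool:
    """Expanded form: True when temp_result does not fit in a 24-bit signed word."""
    return not (MIN_VALUE <= temp_result <= MAX_VALUE)
```

Because both forms share a calling sequence, the rest of the program can be wired up and step-tested against the stub before the real check is written.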
Documentation:
1. Draft the user guide before any code is written. It is easier to make modifications in this document than in the code.
2. Write down clear assignments to team members.
Designing a Test Plan:

Test plans should be logical. For example, you should test all categories of executions of arithmetic instructions in one test. In another you should test all the shift operations, in another all the branches (jumps), . . . .
1. Write down an overall test plan similar to the above statement.
2. Describe the behaviors each test is trying to be a counterexample for.
3. For each item being tested, you should determine the expected correct outcome by hand before making a run to see if the program's result is incorrect.
4. Never hesitate to use extra output (write) statements in your code. They will help you debug and help us in grading.
5. The grader may provide you with a grading sheet that shows the level of detail that we will check for. Be sure to review this and use it as a guide for your planning. (Hint: We expect everything listed on the grading sheet to be tested, and as many more things as you can think of.)
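One way to keep hand-computed expectations next to the tests is a small table of cases, sketched below in Python. The routine under test, add_24bit, and its wrap-on-overflow behavior are hypothetical stand-ins chosen for this example; your own machine description determines the real behavior.

```python
# Hypothetical routine under test: 24-bit two's-complement addition.
def add_24bit(a, b):
    """Return (result, overflow); the result wraps around on overflow."""
    total = a + b
    overflow = not (-(1 << 23) <= total <= (1 << 23) - 1)
    wrapped = (total + (1 << 23)) % (1 << 24) - (1 << 23)
    return wrapped, overflow


# Each case: (description, a, b, expected result, expected overflow).
# The expected values were worked out by hand before any run.
ADD_CASES = [
    ("small positives", 2, 3, 5, False),
    ("mixed signs", -7, 4, -3, False),
    ("largest value, no overflow", 8388606, 1, 8388607, False),
    ("positive overflow wraps", 8388607, 1, -8388608, True),
    ("negative overflow wraps", -8388608, -1, 8388607, True),
]


def run_cases(cases):
    """Run every case and return a list of (description, actual) failures."""
    failures = []
    for description, a, b, want_result, want_overflow in cases:
        got = add_24bit(a, b)
        if got != (want_result, want_overflow):
            failures.append((description, got))
    return failures
```

Keeping all arithmetic cases in one table mirrors the advice to group related tests, and an empty failure list is easy to re-check after every change.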
Writing Code:
Even if you ignore all other advice, you should not do any coding before you complete the above steps. You must know what needs to be done and the limits of your own testing. You will, of course, want to create code for which your own tests can find no counterexamples; furthermore, you will want to strive to create code for which no possible test could reveal a defect. That is to say, you're striving to create code without defects. This is why you hope that your test plan will reveal as many defects as possible.
1. As routines are written, they should be tested, even in their dummy form. Once you have all the routines identified (you will miss some), then start expanding them and step testing as you go. Test changes as you make them rather than all at once!
2. Have someone other than the module author also test it.
Testing:
All the final testing must be done on the same version of the code. If in one of the tests an error shows up that you subsequently fix, then you must re-run all previous tests. Your one-line change could very well impact the results of an earlier run. (We have many examples where a small change has been ruinous.) It is very embarrassing for some very basic and simple feature of your program that you know was working yesterday to fail during grading because of some little, last-minute change you made to fix some advanced feature. Re-run those tests!
1. Use realistic test data.
2. Test extreme cases.
3. Test each function.
4. Test each error message.
Hint: The grader will often assume that you have the basics working but try to catch you on a fine point. Your program had better not crash if the grader's test script references a memory address out of range, or tries to feed a gif file to your assembler. Your program should gracefully catch the error and print an informative error message.
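The hint about failing gracefully can be sketched as follows. This is only an illustration in Python: SimulatorError, read_memory, load_text_file, and safe_run are hypothetical names, and treating any non-ASCII byte as "not a text file" is an assumption about the input format.

```python
class SimulatorError(Exception):
    """Raised for problems the user can fix; reported without a crash."""


def read_memory(memory, address):
    """Return memory[address], rejecting addresses outside the memory."""
    if not 0 <= address < len(memory):
        raise SimulatorError(
            f"memory address {address} is out of range 0..{len(memory) - 1}")
    return memory[address]


def load_text_file(path):
    """Read a text input file, turning low-level failures into SimulatorError."""
    try:
        with open(path, encoding="ascii") as handle:
            return handle.read().splitlines()
    except OSError as error:
        raise SimulatorError(f"cannot read {path}: {error.strerror}") from error
    except UnicodeDecodeError as error:
        # A binary file (for example an image) ends up here.
        raise SimulatorError(f"{path} does not look like a text file") from error


def safe_run(action, *args):
    """Call action(*args); on SimulatorError print a message and return None."""
    try:
        return action(*args)
    except SimulatorError as error:
        print(f"error: {error}")
        return None
```

Routing every user-facing failure through one exception type makes it easy to test each error message, as the list above recommends.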
THE OHIO STATE UNIVERSITY DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
CSE 560 Software Design
One of the major objectives of CSE 560 is the understanding and practice of techniques that enhance the development of quality software. The process of software development often is thought of as being composed of stages. In the first stage the problem that needs to be solved is defined and the requirements that the software must meet are identified. The next stage consists of designing proposed solutions to the problem, evaluating alternatives, selecting one of the alternatives, and detailing the (modular) structure of the chosen solution. Next, the program is constructed in accordance with the design specification (this is the stage, coding, with which we all are familiar). The software also must be validated (e.g., through testing) to measure conformance with the specifications laid out in the early stages, and installed so that the customer can use it.
Throughout the development process documentation is the chief means of communication and management control. In formal development systems, there are specific documents that are required to be produced during these various stages of development. Each stage itself can be further decomposed into tasks, and each task can result in the production of some task document. Even for relatively small projects such as we have in this course, there are several reasons to follow a more formal approach:
1. To prepare you for more complex and more formal development environments in the real world.
2. To become more conscious about the various tasks that one goes through in developing a program. With this kind of awareness, one can more specifically address sources of error in the development process.
3. To facilitate the later use of CASE tools to assist you with your project work. [1]
4. To allow more direct supervision by the instructor.
Each of the labs in this course goes through only a subset of the stages identified above.
Requirements are provided by the instructor and installation is, for the most part, skipped because we aren't in a production environment. You each have had experience with construction and testing of your software. Remember that you must prepare your own test data even though your software also may be tested by the grading assistant. This leaves the design stage, which we want to emphasize in this course. In order to assist you with this stage, we have identified a number of subtasks that you should perform, and a suggested order in which to perform them. Each task has some output, which you are required to produce and (with the exception of task 1.1) turn in as part of the writeup. These will constitute the programmer's guide and part of the user's guide portions of the writeup. To get you started, we have provided suggested output for a few of the earlier tasks. Feel free to augment our suggestions with your own. A preliminary version of your design is due the day before your design review meeting, so do not delay in getting started. The output for the tasks (particularly those in categories 2 and 4) can be produced using CASE tools that you know or using plain old pencil and paper. The diagrams and data descriptions should be shared among members of your group so that consistency is achieved as each of you works on your respective parts of the project.
[1] CASE stands for Computer Aided Software Engineering. Though no CASE tool is stipulated in this course, you are welcome to use one with which you are familiar. Even if you do not use any such tool, the formal approach suggested in this document will prepare you for later use of CASE tools.
Design Task List

1.0 Define Design Framework
    1.1 Review requirements
    1.2 Identify development standards and utilities
    1.3 Identify top level system structure
    1.4 Prepare descriptive narrative
2.0 Define Data Structures
    2.1 Finalize input layout
    2.2 Finalize output layout
    2.3 Finalize major shared data elements
3.0 Identify Major Design Conventions
4.0 Define Modular Structure
    4.1 Identify modules and their interrelationships
    4.2 Prepare detailed module descriptions
[Figure 1: A Possible Task Flow Diagram, connecting tasks 1.1, 1.2, 1.3, 1.4, 2.1, 2.2, 2.3, 3.0, 4.1, and 4.2]
Note: These tasks are not likely to flow smoothly from one to the other, as the diagram above might suggest. Rather, you probably will find it necessary to iterate on certain tasks, especially when performing tasks 4.1 and 4.2.
Task 1.1 Review of Requirements

Objective: To develop a comprehensive understanding of the requirements for the system.
Inputs:
1. Problem statement handout
2. Machine description handout
3. Instruction set description handout
Task Summary:
1. Thoroughly review the handouts to become fully familiarized with the system requirements.
2. Identify key components of the system requirements.
Outputs: Identification of descriptions in handouts of:
1. team configuration and responsibilities
2. quality assurance requirements
3. functional requirements
4. input requirements
5. output requirements
Solution: Make sure you have identified each of the requirements and constraints for the problem. Typically the problem description is not organized so that all of the requirements in each category are together, but it is important that you know exactly what you're expected to do in each of these categories. In the programmer's guide, explicitly identify the responsibilities given to each team member. This is the only part of the solution to task 1.1 that you need to formally document.
Task 1.2 Identify Development Standards and Utilities

Objective: To identify the development standards, utilities and other support software to be used in the design of the system.
Inputs:
1. Output of task 1.1
2. Knowledge of current course standards for internal design of system
3. Documentation on available utilities and support software
Task Summary:
1. Based upon handouts and announcements in class, identify any standards that must be observed and applied in the development of the design specification.
2. Identify the available utilities and support software to be used.
Outputs:
1. A tabulation of standards to be observed
2. A tabulation of utilities and support hardware and software to be used
Possible Partial Solution:
Standards
1. Complete each design task identified in the task list table
2. Include output of each task in writeup, organized according to the task list table
3. Documentation is required of all team members
4. Use structure chart notation to show module relationships
5. Use Jackson notation for data structure diagrams
Utilities and Support Hardware and Software
Sun Workstation
Unix operating system and X-windows
Modula-2 Compiler
Emacs
Task 1.3 Identify Top Level System Structure

Objective: To develop a high level system structure that will effectively implement the given requirements.
Inputs:
1. Output of task 1.1
2. System flow diagramming expertise
Task Summary:
1. Allocate system functions to probable high-level programs.
2. Allocate system files to probable high-level programs.
3. Produce a system flow diagram that depicts the high-level structure of the system.
Outputs: System flow diagram showing all the major planned programs, files, and outputs.
Possible Solution:
[System flow diagram not reproduced. Labels: User; object file; Load Object File; initial m/c state; Interpret Instruc.; initial state; final state; trace]
Task 1.4 Prepare Descriptive Narrative

Objective: To enhance the understanding of the system flow diagram with a narrative description that explains and clarifies the intent of its processes.
Input: System flow diagram from task 1.3
Task Summary: In narrative form explain the system flow diagram, emphasizing data flow, and giving the intent behind the function being performed.
Output: Narrative explanation of the system flow diagram.
Possible Solution:
Load module: The user will create a file containing records that will be used to initialize the state of the 560 machine. The load module will read this file, providing initial values to various memory locations and various registers of the 560 machine. A display of the machine configuration is generated at the end of the load process.
Interpreter module: Starting with the initial state provided by the load module, the instruction at the address given by the program counter is fetched and decoded, and the operation indicated by the instruction is performed. Each instruction appropriately resets the program counter. This entire cycle is repeated until the final state is reached, when a HALT instruction is encountered or a fatal exception condition is reached. A trace is generated as each instruction is performed. A display of the machine configuration upon normal or abnormal termination of the simulation is generated.
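The fetch, decode, and execute cycle in the interpreter narrative might be sketched as below. This is not the 560 machine: the word encoding (opcode in the high bits, operand in the low 8 bits), the five opcodes, the single accumulator, and the step limit are all hypothetical choices made only to show the shape of the loop.

```python
# Hypothetical opcodes; the real instruction set comes from the course handouts.
HALT, LOAD, ADD, STORE, JUMP = range(5)


def encode(opcode, operand=0):
    """Pack an instruction into one word (assumed layout: opcode << 8 | operand)."""
    return (opcode << 8) | operand


def interpret(memory, program_counter, max_steps=10_000):
    """Run until HALT or a fatal condition; return (final_state, trace)."""
    accumulator = 0
    trace = []
    status = "step limit reached"
    for _ in range(max_steps):
        if not 0 <= program_counter < len(memory):
            status = f"fatal: program counter {program_counter} out of range"
            break
        word = memory[program_counter]                 # fetch
        opcode, operand = word >> 8, word & 0xFF       # decode
        trace.append((program_counter, opcode, operand))
        if opcode == HALT:
            status = "halted"
            break
        if operand >= len(memory):
            status = f"fatal: address {operand} out of range"
            break
        if opcode == LOAD:                             # execute
            accumulator = memory[operand]
            program_counter += 1
        elif opcode == ADD:
            accumulator += memory[operand]
            program_counter += 1
        elif opcode == STORE:
            memory[operand] = accumulator
            program_counter += 1
        elif opcode == JUMP:
            program_counter = operand
        else:
            status = f"fatal: unknown opcode {opcode}"
            break
    state = {"pc": program_counter, "accumulator": accumulator, "status": status}
    return state, trace
```

Note that every instruction sets the program counter itself, matching the narrative, and that both normal and abnormal termination return a final state that can be displayed.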
Task 2.1 Finalize Input Layout

Objective: To describe the data element layout for all input files.
Inputs:
1. Task 1.1 output item 5
2. Task 1.3 and 1.4 outputs
Task Summary: For each input file shown in the system flow diagram, prepare a detailed description of the file characteristics and record layouts.
Outputs:
1. Detailed description of the input file characteristics.
2. Detailed layouts for input records.
Possible Solution: File characteristics: record input; each record is of type 1, or 2. There are 13 characters per record for type 1 records, 7 for type 2 records. Record layout: probably some kind of data structure diagram, perhaps using the Jackson notation. A sample is shown below.
input file
    header record
        H-code
        start exec.
        seg name
        seg length
        IPLA
    text part
        text record* (repeated)
            T-code
            mem. addr.
            init. contents
Each of the primitive subelements start exec, seg name, etc. should have a description as well. The descriptions should be in terms of a simple data type and its possible set of values (e.g., start exec, and mem. addr. might be of type cardinal with range 0..255; init contents might be declared as an integer, or array of char).
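A description "in terms of a simple data type and its possible set of values" can also be written down in executable form. The sketch below is a hypothetical Python rendering of the text record only; the field names follow the sample diagram, the 0..255 range is the example range suggested above, and treating init. contents as an integer is one of the two options mentioned.

```python
from dataclasses import dataclass

ADDRESS_RANGE = range(0, 256)  # "cardinal with range 0..255" from the example


def check_range(name, value, allowed):
    """Raise ValueError when value is not a member of the allowed range."""
    if value not in allowed:
        raise ValueError(
            f"{name}={value!r} is outside {allowed.start}..{allowed.stop - 1}")


@dataclass(frozen=True)
class TextRecord:
    """One text record: a memory address and the contents to place there."""
    mem_addr: int        # cardinal, 0..255 (assumed range)
    init_contents: int   # integer (assumed; could instead be an array of char)

    def __post_init__(self):
        check_range("mem_addr", self.mem_addr, ADDRESS_RANGE)
```

Stating the range once, in one place, keeps the loader and the interpreter from drifting apart on what counts as a legal address.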
Task 2.2 Finalize Output Layouts

Objective: To describe the data element layout for all output files.
Inputs:
1. Task 1.1 output item 6
2. Task 1.3 and 1.4 outputs
Task Summary: For each output file shown in the system flow diagram, prepare a detailed description of the file characteristics and record layouts.
Outputs:
1. Detailed description of the file characteristics
2. Detailed layouts for the program output
Solution: Complete for each of the following, in a manner similar to that used for the outputs in task 2.1.
1. Initial state
2. Trace of execution
3. Final state
Also note the way in which errors are reported. This isn't shown separately in the system flow diagram, but you may wish to create a separate error file. If so, modify the system flow diagram and complete the output description for the error information.
Task 2.3 Finalize Major Shared Data Elements

Objective: To describe the layout and structure of all major shared elements, including structured elements such as arrays and any global data structures.
Input: Task 1.3 and 1.4 output
Task Summary: Complete all shared data element definitions.
Output: Detailed specification for all major shared data elements.
Solution: Include element name, purpose, and attributes (diagramming any substructure as you did with the input and output descriptions). Note: If a Modula-2 module, a C++ class, or a RESOLVE/C++ component will be encapsulating a type that is a major shared element, it is not necessary to describe that element here because it will be described in the detailed description for that module, class, or component.
Task 3.0 Identify Major Design Conventions

Objective: To record some special design conventions, and other observations that are important to the solution to the problem but may not otherwise be immediately apparent.
Inputs: Task 1.1 and 1.4 output
Task Summary:
1. Review specific quality assurance requirements and identify their likely effect on the design.
2. Identify special problem constraints and their effect on the design.
Output: Itemized special conventions agreed to as part of the design, and other observations that should be carefully considered during design and implementation.
Possible Partial Solution:
1. Not every word has initial contents read in from the input file.
2. Text records need not be in numerical order by word address.
3. All memory cells will be initialized to (fill in the value your group has chosen).
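The three conventions above translate directly into how a loader might build memory. The Python sketch below is illustrative only: build_memory is a hypothetical name, the memory size of 256 is an assumption, and FILL_VALUE is a placeholder for whatever value your group chooses.

```python
MEMORY_SIZE = 256   # assumed size for this sketch
FILL_VALUE = 0      # placeholder: substitute the value your group has chosen


def build_memory(text_records, size=MEMORY_SIZE, fill=FILL_VALUE):
    """Return memory with every cell set to fill, then overlaid by the records.

    text_records is an iterable of (address, contents) pairs in any order;
    cells that no record mentions keep the fill value.
    """
    memory = [fill] * size
    for address, contents in text_records:
        if not 0 <= address < size:
            raise ValueError(f"text record address {address} is out of range")
        memory[address] = contents
    return memory
```

Filling first and overlaying second means the loader never needs the records sorted and never leaves a cell undefined.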
Task 4.1 Identify Modules and their Interrelationships

Objective: To develop the shared modular structure of the system.
Inputs:
1. Outputs from tasks 1.0, 2.1, 2.2 and 3.0
2. Knowledge of functional decomposition principles
Task Summary:
1. For each system function, identify major procedural and/or data abstractions that appear to be needed.
2. For each such abstraction, decompose into more primitive components, continuing this step until the lowest level components are elementary enough that further decomposition would be of questionable value.
3. Where alternative decompositions suggest themselves, evaluate the alternatives and select the one that appears to best satisfy the requirements.
4. Diagram the chosen decomposition, showing the modular structure, names of the modules, and their inputs and outputs.
Output: Structure diagram (structure chart) showing modules and their interrelationships.
Possible form of the solution: A graphical view of the program structure

[Structure chart not reproduced. Surviving labels: modules A, B, C; data items 1, 2, 3]
Meaning (for procedural abstractions): The program A is thought of as invoking (calling) 2 modules, B and C, which presumably are invoked in that order. Module A provides B with inputs 1 and 2 and B returns 3. Module C is itself composed of D, E and F, and E has a submodule G. (Inputs and outputs for the other interfaces are not shown in this example, but all interfaces should appear in your solution.) The names given to modules should be simple commands such as Interpret Instructions, Compute Target Address, etc. This kind of diagram is called a structure chart.
Meaning (for data abstractions): A data abstraction module Data Template provides type D and operations A, B, and C. A parameter d is both an input and output parameter to operation A.
[Data abstraction diagram not reproduced. Surviving labels: Data Template; Type D; operations A, B; parameter d]
(Again, other parameters to the operations are not shown in this diagram, but should be when developing the completed module structure.) The boxes with curved sides represent the data type component of the module, while the boxes with the triangles on top represent the operations provided by the module. The triangle on top of a box denotes that this operation is lexically included as part of its parent, rather than being called from its parent. To show that this data abstraction module is invoked by (imported into) another module, simply show a line connecting the Data Template box to the other module.
Task 4.2 Prepare Detailed Module Descriptions

Objective: To describe each module in the modular structure in sufficient detail that it can be coded in a straightforward manner.
Input: Outputs from tasks 2.0, 3.0 and 4.1
Task Summary: For each module in the structure diagram of 4.1, provide: a statement of the purpose of the module; a detailed description of each input and output, whether parameterized or global; and an overview of the algorithm (using pseudo code) if the module is a procedural abstraction, or an overview of the algorithms for each operation if the module is a data abstraction.
Output: Detailed design for each module
Elements of solution: (for each module)
Module name: For a procedural module, this should be a declarative command like interpret instructions. For a data module, it should be a descriptive name of the data abstraction, like stack of integers.
Formal Parameters: Name and type of each formal parameter in order of calling sequence. For data abstractions, there may not be any parameters to the top module box because not all components are templates. But there nearly always will be parameters to the other operations provided by the data module.
Global Elements required, if any (descriptions should be included here unless they already are included in task 2.3). For data modules, any internal state information that is local to this module should be described here.
Statement of purpose of module: A brief, one sentence description will do. If it's too hard to write a concise statement of purpose, this may be a clue that the module isn't well thought out.
Pseudo code: If this is a data abstraction, give the entire definition module, and pseudo code for each operation. It also would be nice to see pre- and post-conditions for each operation.
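To show what pre- and post-conditions for each operation might look like for the "stack of integers" example named above, here is a small Python sketch. The class name, the operation set, and the use of assert statements to check the conditions are assumptions made for illustration; a Modula-2 definition module or a C++ class would state the same conditions in its own notation.

```python
class IntegerStack:
    """Data abstraction: a stack of integers (last in, first out)."""

    def __init__(self):
        # Internal state local to this module: the items, bottom first.
        self._items = []

    def size(self):
        """Pre: none. Post: returns the number of items; stack unchanged."""
        return len(self._items)

    def push(self, value):
        """Pre: value is an int. Post: value is on top; size grew by one."""
        assert isinstance(value, int), "precondition violated: value must be int"
        old_size = len(self._items)
        self._items.append(value)
        assert len(self._items) == old_size + 1 and self._items[-1] == value

    def pop(self):
        """Pre: stack is not empty. Post: returns old top; size shrank by one."""
        assert self._items, "precondition violated: pop on empty stack"
        old_size = len(self._items)
        value = self._items.pop()
        assert len(self._items) == old_size - 1
        return value
```

Writing the conditions down first often exposes design questions early, such as what pop should do on an empty stack.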
Principles of Good Technical Writing

Roshan Rao and Wayne Heym
Writing is a process of communication between you and your audience. Generally, it involves reading and synthesizing material from different sources. The writer collates all the different threads of information together and presents them to the reader(s). Writing is not a passive process. Rather, it is creative. The writer should not just string the material together. Instead, he/she should integrate and interpret it in a manner suitable for the audience.
Some guidelines: Before starting out, determine the needs and uses of your document. In the case of a software design document, a person who wants to redesign or implement a computer system would form your audience. When you write, adopt a tone appropriate to your audience. This means that you: don't belabor trivial points; stress the most important material; present the material as simply and as clearly as possible; and strive to be as professional as possible.
When you begin your report, you will find that it's difficult, if not impossible, to think about everything all at once. Experienced writers take a paper through stages, starting from a rough draft and finally ending with a polished document. Initially, they concentrate more on content than on grammar, style or punctuation. The focus, organization, paragraphing and overall tone are considered first. Only later come grammar, sentence structure, word choice and other such matters.
The writing process is a development process. So, its stages have many parallels with the design process. Design includes the following stages:
1. Sketching an overall architectural structure of the solution.
2. Analyzing the proposed solution to see if it meets the specifications.
3. Examining alternative solutions for correctness and relative quality.
4. Detailing the chosen solution, i.e., repeating steps 1-3 at finer levels of detail.
The stages listed above are typically not sequential and you may need to iterate, especially when flaws are discovered during the analysis phase, such as in the testing process.
Likewise, writing can consist of the following stages:
1. Generation of ideas and outlines. Think! Write down ideas. Ask yourself questions and generate a list of specifications. Rearrange them, put them in groups. You may need to iterate here till you are able to sketch out a high-level structure and identify key components.
2. Prepare an initial draft. Flesh out details of the parts of the structure created above so that it can be read and understood by your audience. Basically, at this point you should have a prototype of your report.
3. Make big revisions. Evaluate basic ideas and make sure you are conforming to the requirements stated in (1). Look at the structure and the paragraphing. The organization of the report should represent a sensible coupling of ideas. Revise the components as needed.
4. Revise your paragraphs. Paragraphs reflect the organization of your report. A paragraph should be cohesive, i.e., unified around an important point.
(a) State the purpose. You should tell the reader the main topic of the paragraph as early as possible before the reader gets lost in it.
(b) Be pertinent. Reject matter that is unrelated to the main theme of the paragraph. Develop the main point and enlarge on it.
(c) Proper coupling. Link paragraphs to paragraphs. This improves the overall organization and coherence of the paper and ensures a smooth flow of information.
5. Revise your sentences. Now, you should look at individual sentences and assess them for style and clarity. Basically, at this stage, you are evaluating operations within components.
(a) Highlight major ideas. Decide what ideas are worth emphasizing and put them in subjects, verbs or objects. Don't have too many short sentences, but don't move to the other extreme either and put too many ideas in one sentence.
(b) Add necessary words. Put in words that are needed for logical completeness of the structure. Add words needed to complete compound structures.
(c) Use good grammar. This means that you: resolve mixed constructions; fix misplaced and dangling modifiers; check if quantifiers are properly bound; provide consistency for verbs, etc.
6. Choose the right language. Your choice of words should suit your audience and your topic. Avoid jargon, slang and the like. Use proper math constructs and check expressions for succinctness. Avoid too much negation. Check logical connectives for succinctness and understandability. Choose an appropriate tone.
2
7. Edit your punctuation Check if your punctuation is appropriate to the context. Note: The latter steps represent the implementation stage of the writing process. Its an iterative process and you may have to move back and forth through each stage as you discover aws in your eort (testing and implementation are interspersed throughout the process). 8. Finally, you will be ready to show your report to the world. This represents the end of development. In the software lifecycle, this may correspond to the installation phase.
Guidelines For Writing A Software Report

Roshan Rao and Wayne Heym
A technical report is generally more intricate than the average essay. It contains complex materials, which need to be arranged in a suitable way to help readers read and understand the report quickly. It should be as brief as possible, yet as precise as possible. Accuracy is important, particularly in design documents. A complete design report consists of the following components: 1. The front matter. 2. The body of the report. 3. The references. 4. The appendices. These components are elucidated below for a software report.
The Front Matter
This helps readers use your report efficiently. It includes the following: 1. The Title Page. 2. The Table of Contents. 3. The Introduction.
1.1 The Title Page
The title page is right at the head of the report and is the first thing readers will look at. It should comprise the following: The title. This should be well-chosen and should clearly reflect the content of the report. Names. The names of the people responsible for the report. The date. When the report was submitted.
1.2 The Table of Contents
This is a map of your report. Your readers will use it to find their way through the report. It should be fairly comprehensive and should list all the sections and the subsections of the report in the order in which they appear and the page numbers on which each of them begins. It should be well-designed and should distinguish between sections and subsections by using upper/lower case letters and indentations. Figures and tables should be listed separately after the contents.
1.3 The Introduction
This gives a general overview of the project. It should provide the concepts on which the project is based and how it works. It should also lay the foundation for the other sections.
The Body of the Report
This is the main part of the report. In CIS 560, it might include the following sections: User's Guide. Programmer's Guide. Source Code. Testing Documentation. Alternatively, the User's Guide, Programmer's Guide, etc. can each be considered reports of their own, containing their own individual front matter, body, etc. In that case, the front matter, etc. would be more specific and relate to the particular report.
2.1 User's Guide
This should cover the basics of using your system. It should explain the capabilities of your program to the user and show him/her how to use it. It should not give the inner details of how and why you have written the program. Basically, it should cover the following:
Learning to use the system. Getting started. Starting and exiting from the program. Other basic topics like expected input and output etc.
Introduction to different commands. This section should cover the instruction set and can typically be subdivided as follows: Understanding the command syntax. Advanced commands.
Error messages. A descriptive list of error messages. How to recover from errors.
An Index. If your User's Guide were a standalone document for a large system, then you could have an index containing all the significant terms you have used. However, for this course, an index is not required. If one is produced, it may be better that it be a global index covering all documentation for the project rather than being separate for the User's Guide.
2.2 Programmer's Guide
Almost invariably, someone (perhaps the original authors) will need to modify the program. The Programmer's Guide is meant for a knowledgeable user who wants to know how it works, i.e., it should portray the design details of the program. Each design detail is the conclusion of some design decision. It should include: A Description of Data structures. The Purpose and Specifications of the Different Modules. Their Inter-relationships. Error-handling. Parameter lists. It should describe your program concisely so that when the user looks at the program, he/she knows where to look for a particular structure/function.
2.3 Source Code
This is an important part of the overall system documentation, and may be considered part of the Programmer's Guide. It is identified separately because it contains the implementation of your modules and data structures, rather than just their description and specification. Your program should include the following features: Modular code with appropriate indentation, so that it is easily readable. Good choice of variable names. Comments. These should neither be too sketchy nor too verbose.
2.4 Testing Documentation
This should contain: A Test Plan. This describes the different tests that are to be carried out, what they test and their input and expected output. Actual Test Runs. This portrays the actual results generated by the program for specified inputs and forms a collection of examples for running the program. Testing can be carried out separately for each module of your project and the documentation should reflect this.
References
This section details the books, journals etc., to which you have referred for the project and also points the reader in the right direction, should he/she desire to learn more about the technical principles behind the project.
Appendices
Here, you can include information that, while it may be valuable to certain readers, can be omitted while still understanding the gist of the overall report. Sometimes the appendix includes extensive descriptions of matter that is more concisely used in the report body. Some candidates for the appendix of your report might be: The Instruction Set of the machine. A Glossary of terms used in the report. A list of errors discovered in the program and how to fix them. A common term for this list is Errata. Possible enhancements to the system. An Index.
CSE 560
Required Lab Documentation
The first thing to consider in doing a CSE 560 writeup, or any writing assignment whatsoever, is the audience for whom you are writing. The actual audience for your 560 writeups, of course, is the grader (and/or instructor) for the course, yourself, and your lab partners. We would like for you to imagine, however, that you are writing to fairly typical computer users: experienced programmers who would like to find pre-packaged software to fill their needs rather than write their own. You should imagine that your finished documentation will be available on the world-wide web for potential users to browse through or study. We can reasonably assume that if one of our hypothetical users cannot find a software package that fits the bill exactly, he or she will be willing to try to modify one that is close.
The first consequence of writing for this imaginary audience is that your documentation should have several distinct parts that will be used for distinct purposes. These will be described below under the headings User's Guide, Programmer's Guide, Test Plan, and Meeting Minutes. Presumably you already have had some experience coding and testing programs, but perhaps little experience designing systems. For this reason, this document has a sequel, CSE 560 Software Design, which goes into more detail on the design of systems.
Another consequence of our audience is that the organization and style of your writeup are almost as important as its content. If a prospective user cannot find necessary information about your program, he or she is likely to give up on your program and look for another. Above all else, you should be concise. Try to avoid redundancy as well as ambiguity and omissions. To convey relationships among elements of your report, use tables and pictures rather than prose whenever possible. The CSE 560 Software Design document provides details and a suggested format for useful tables and pictures. A CASE tool may be employed in preparing this information.
Do not reiterate a program from your text. Algorithms described at the same level of detail as the program itself are useless to our hypothetical user. He or she needs the big picture, not bit-twiddling details, most of the time.
Nearly all sections/levels of your documentation should be hypertext. (Nearly all exceptions to this rule will be diagrams or pictures.) The top level should either play the role of a table of contents, or there should be a link from the top level to a table of contents page. Each part can be reached from this table of contents through a link, making it easy to open to any particular one. Each part should begin with its own cover page, stating document information (e.g., title, date written, primary author) and group information (e.g., names of members). You should generously supply cross references to other parts of your documentation; use hyperlinks to implement these cross references.
User's Guide
The user's guide should explain what your program does, and how a user can get the program to do it. Our hypothetical user does not need to know why you wrote the program. He or she simply wants to do X and wants to find out if your program can do it. Based on this part of your writeup, a user must be able to install your program, make it run, and be able to understand its output, error messages and all. Write the user's guide as if the grader knows nothing about the specifics of the lab; the user's guide should be communicating those specific details. It should explain every aspect of running and using the software, including troubleshooting.
When describing what your program does, it is not necessary to copy the original requirements verbatim into your documentation. It is perfectly reasonable simply to paraphrase appropriate sections of it, or attach in their original form whatever parts are important. Remember, however, that in CSE 560 (and in virtually any system you will encounter) some parts of the problem are left unspecified. This means that you will always have something original to say about what your program does. Note that when you are working in a group (as you are in this course), it is important to have this part of your documentation done very early so that everyone in the group is working from the same requirements.
Descriptions of the inputs to and outputs from your program, including their formats, are essential in a user's guide. By reading this document, the user should be able to visualize the reports produced by the program. Error messages and their descriptions are also essential, as are any instructions and conventions needed to access the program. CSE 560 Software Design has some further information about these issues because some of this information is needed for both the user's guide and the programmer's guide.
Programmer's Guide
The programmer's guide should tell the prospective user how your program works. This is necessary in case he or she finds that the program needs to be changed in some way. The user needs to find out fairly quickly how much work the change will take. Having to turn immediately to a long program listing will discourage our user, and probably will result in your program being set aside in favor of another (and someone else getting credit for writing a versatile program), or will result in an unnecessary new system which will be costly and wasteful of resources. Instead, the programmer's guide captures the design details of the program, the blueprint by which the final program was written. It is in this part of the writeup that you should describe your data structures, the algorithms you have chosen, the module structure, the way errors are handled, etc. This is not an appendix to your program. The user has not looked at your program yet, but rather is trying to find out whether or not to look at it, and on what parts to concentrate. You should not force the user to turn to the program to make sense of the writeup, but there certainly will be details in the code that are not in the programmer's guide.
Note that, in addition to the user's guide, the programmer's guide should be a working document for your group. To this end, it should separately document: (1) data structures, (2) relationships among modules, (3) module interfaces, (4) modules themselves. In documenting data structures, it is important to describe the role the structure plays in the execution of the program (e.g., an object called pc may represent the program counter of the virtual machine), as well as its implementation (e.g., pc may be a record having two fields, one called length and the other called value), and any invariants (e.g., pc has a value in the range 0 to 65,536). In documenting modules, it is important to show which modules invoke which others, as well as how individual modules work and, lest we forget, what they do. Modules that encapsulate a data abstraction should separate the specification (i.e., definition) and algorithmic (i.e., implementation) details. Parameter lists are an essential part, but not the whole, of module documentation. Module interface descriptions must include what a module assumes about its calling environment (requires) and what it, in turn, guarantees to perform (ensures).
A programmer's guide should contain a very thorough description of the flow of control of the software. By reading only this guide, a programmer can learn everything that the software does, and how it is accomplished, without having to look at any code. The programmer's guide should provide sufficient detail about the design of the software and how everything works together. The CSE 560 Software Design handout provides you with templates to help you express this information in an organized manner. A CASE tool may be employed to assist you in producing the programmer's guide, and in sharing its contents with other members of your group.
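As an illustration of documenting a data structure by role, implementation, and invariant, and of stating a module interface with requires and ensures clauses, here is a minimal Python sketch. The class name, the method `advance`, and the exact bound derived from `length` are assumptions made for this example; they are not taken from the course handouts.

```python
class ProgramCounter:
    """Role: the program counter of the virtual machine.

    Implementation: a record with two fields, `length` (width in bits,
    a hypothetical interpretation) and `value` (the current address).
    Invariant: 0 <= value < 2 ** length.
    """

    def __init__(self, length: int, value: int = 0) -> None:
        self.length = length
        self.value = value
        self._check_invariant()

    def _check_invariant(self) -> None:
        # The invariant is checked explicitly so that a violation is caught early.
        if not 0 <= self.value < 2 ** self.length:
            raise ValueError("invariant violated: value out of range")

    def advance(self, displacement: int = 1) -> None:
        """Move the counter by `displacement` cells.

        requires: 0 <= value + displacement < 2 ** length
        ensures:  value == (old value) + displacement
        """
        new_value = self.value + displacement
        if not 0 <= new_value < 2 ** self.length:
            raise ValueError("requires clause violated")
        self.value = new_value


pc = ProgramCounter(length=8)
pc.advance(3)
```

A reader of such a description can tell what a caller must guarantee and what the module promises in return without reading the method body.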
Data Element Dictionary

This section is used to describe each shared variable used in the program. The following format can be used.
Variable Name | Local/Global | Type | Declaring Module | Purpose
Code
An essential part of the documentation of any program is the source code itself. Despite the foregoing, all that you have ever learned about comments, choice of variable names, blocking structure, etc., still applies. Do not forget that someone modifying your program needs to be able to read it. The code should be organized so that individual modules and data structures can be found easily and read quickly. It is a good idea to adopt a precise coding standard, such as can be found under the class web site for C++ and for C.
Test Plan
Your test documentation is important to the hypothetical user for two reasons. First, it provides some indication that, at least sometimes, the program actually does what you say it does. Second, it provides a source of examples for running the program. It probably is obvious that the test documentation should normally include a collection of actual runs of the program in which both the input and output are clear. Perhaps not so obvious is that the test documentation also should contain a test plan that describes the testing that is proposed to be done, rationalizes why these tests were chosen, and indicates the expected outcomes of each test case. In a sense, creation of the test plan is part of the design process. As such, it should be done early in the development process. If a mistake is found in your implementation, it should be possible to quickly find where in the test plan this feature was (or was not) exercised.
Meeting Minutes
In this course, perhaps for the first time in your major program, you will be working on a technical project as part of a group. Group projects offer the advantage of not requiring that each individual be responsible for every part of every assignment, but, at the same time, offer the disadvantage of having to depend on others to do part of the assignment correctly. Welcome to the real world.
One of the most important elements in a successful group project is effective communication among the group members. You should meet with each other, and communicate via e-mail, frequently. At those meetings, there normally will be a set of topics covered, (possibly tentative) decisions made by the group, and perhaps assignments made to individual members of the group. A record of the meeting should be kept, including the date and time of the meeting, the topics covered at the meeting, the major ideas and rationales coming from the discussion, the conclusions reached (and their justification), and the assignments (often called action items) made to individuals as a result of the meeting. For each meeting, one of the group members should be assigned the responsibility of taking these minutes. The minute taker should type in the minutes and send them out to the members of the group via e-mail as soon as possible after the meeting. In addition, an archived copy of all minutes should be kept to be referred to by the group and by graders or the instructor. Each meeting (and the corresponding set of minutes) should begin with a review of the open action items from previous meetings, so that problems may be caught early. Action items should have milestones that are fine-grained enough to permit the group to determine whether a task is behind schedule early enough to be able to do something about it.
Members of the group will have different documentation responsibilities for the different labs. It is required that each member of the group have primary responsibility for a user's guide or a programmer's guide by the end of the quarter.
Lab Submission Instructions

You should place your documentation files (including source code) into a directory structure. Name your directory <groupname>_<labname>, for example, c560aa01_lab1. I suggest you organize your directory with subdirectories using a logical format, that is, subdirectory names such as doc, src, tests, etc. for the various parts of your lab. Do not submit executables. Your User Guide should contain instructions on building and setting up the emulator. Your install procedure should be as user friendly as possible. Your directory should contain a file named README or README.txt containing a list of the files submitted and any other immediately useful information (such as a suggestion to read the User Manual).
To retain the directory structure you created and to allow easy transmission to others (including, and especially, your graders), you should gather your entire directory structure into one file. Archiving utilities are available for doing this, including, in Windows, sending your top-level directory to a compressed (zipped) folder (right-click on the folder and choose Send To, then Compressed (zipped) Folder). Another alternative, on stdsun (which uses the Solaris flavor of Unix), is to bundle your files into a zipped "tarball" with the gtar command
    gtar zcf <groupname>_<labname>.tar.gz <groupname>_<labname>
Look for error messages coming out of this tar command. Check especially for permission problems because different people have probably created the files. You should test that it worked using the table of contents option
    gtar ztvf <groupname>_<labname>.tar.gz
or, to be even safer, by extracting the files and checking them using the command
    gtar zxvf <groupname>_<labname>.tar.gz
You should submit the file using the appropriate Carmen Dropbox. Only one person in a group should submit the lab. If there are multiple submissions, only the last one will be considered. Any submission dated after the due date and time is late. I would strongly suggest that groups test for problems in the process by submitting a test file early and seeing if they have any errors. Please email me if your group can't submit the test file. As a last resort, you can email a gzipped tar file to me. Because this last resort will cost me some time, I strongly discourage this.
Lecture #1
System Software Design, Development, and Documentation
Introduction & Administration
Course Objectives
System Software
Software Engineering
(A Little) Requirements Gathering
Design
Team Work
Writing (Documentation)
Choices
Hardware platform
Software platform
Editor(s)
Compiler(s)
Compilation management (make)
Configuration management (cvs)
Off-the-shelf components
Documentation
Remark
Now would be a particularly bad time to have to learn the main programming language that your project team will be using. Learning new off-the-shelf components also takes time.
Evaluation
We have to be able to verify independently that your source code produces an executable that has the desired behaviors. Therefore, if your team desires to use other than a CSE-provided platform, you'll have to negotiate this matter with a grader, and come up with a short, written contract describing the agreement.
Graders
Our graders are: Sean O'Connor (oconnor.173@buckeyemail.osu.edu) Kai Li (li.966@osu.edu)
Lecture #2
Introducing the Machine
Overview of Labs 2-4

Assembly language, e.g., LOAD r1,VALUE
  -> Assembler ->
Machine code, e.g., ...0110101110...
  -> Linking Loader ->
Linked machine code, e.g., ...0110101110...
  -> Simulator ->
Executing program
The A11-560 Machine

An "abstract" machine
Nearly the world's simplest architecture!
[Diagram: Memory, CPU, Input Device, Output Device; PC and Register Bank]
Instruction Processing Cycle

Repetition of 2 steps:
Fetch
Read word in memory location indicated by PC Increment PC
Execute
Perform action specified by the contents of that memory location (may involve reading/modifying other memory locations or registers)
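The two-step cycle above can be sketched as a short Python loop. The toy instruction set used here (`LOADI`, `ADDI`, `JMP`, `HALT`, stored as name/operand pairs) is hypothetical and is not the A11-560 instruction set; it only makes the fetch and execute steps visible.

```python
def run(memory, pc=0, max_steps=100):
    """Repeat fetch and execute until HALT or until max_steps is reached."""
    acc = 0
    trace = []
    for _ in range(max_steps):
        # Fetch: read the word at the location indicated by PC, then increment PC.
        op, operand = memory[pc]
        pc += 1
        trace.append(op)
        # Execute: perform the action specified by the fetched word.
        if op == "HALT":
            break
        elif op == "LOADI":
            acc = operand
        elif op == "ADDI":
            acc += operand
        elif op == "JMP":
            # Assumption: the displacement is relative to the already incremented PC.
            pc += operand
        else:
            raise ValueError(f"unknown operation: {op}")
    return acc, pc, trace


program = [("LOADI", 6), ("JMP", 1), ("ADDI", 100), ("ADDI", 1), ("HALT", 0)]
acc, pc, trace = run(program)
```

Because PC is incremented during the fetch step, any instruction that uses PC during execution sees the incremented value; this is why the jump in the sketch skips exactly one cell.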
Memory
Organized in "cells"
i.e., smallest addressable unit
N cells, addressed 0..N-1
each cell consists of k bits
A cell is usually:
a byte (8 bits), or a word
More exotic architectures exist
[Figure: a column of N cells, addressed 0, 1, 2, 3, ..., N-2, N-1, each k bits wide]
Questions
How many different values can be represented in such a cell?
How many bits are required to represent an address?
[Figure: a column of N cells, addressed 0, 1, 2, 3, ..., N-2, N-1, each k bits wide]
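One way to check answers to these two questions is the small Python sketch below. The function names are made up for this example; the arithmetic is the standard counting argument (k bits give 2^k patterns, and addressing N cells needs the smallest number of bits whose patterns cover 0..N-1).

```python
def distinct_values(k: int) -> int:
    """Number of different bit patterns that fit in a cell of k bits."""
    return 2 ** k


def address_bits(n: int) -> int:
    """Smallest number of bits able to address n cells numbered 0..n-1 (n >= 1)."""
    # (n - 1).bit_length() is the width of the largest address, n - 1.
    return (n - 1).bit_length()


values_in_byte = distinct_values(8)
bits_for_256_cells = address_bits(256)
```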
Number Representation
A number is a "concept", for which there are many concrete representations/notations
e.g., the number eleven can be represented as 11 (decimal); XI (roman); 13 (octal); B (hex); 1011 (binary); k (alphabetic); etc.
Conversely, a single concrete representation may have several interpretations

e.g., the representation 10 could be interpreted as ten, eight, sixteen, two, etc.
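The distinction between a number and its notations can be tried out with Python's built-in `int(text, base)`, which interprets a digit string in a given base. This is only a small illustration of the slide's examples; the variable names are arbitrary.

```python
# One number, several notations: each string below denotes eleven.
eleven_notations = {
    "decimal": int("11", 10),
    "octal": int("13", 8),
    "hex": int("B", 16),
    "binary": int("1011", 2),
}

# One notation, several numbers: the string "10" read in different bases.
ten_interpretations = {base: int("10", base) for base in (10, 8, 16, 2)}
```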
We have only bits available for our machine

simple binary numbers used in memory for concrete representation
the corresponding data, however, can be interpreted in different ways
e.g., numeric data, instruction, ascii string, ...
Problem: representing negative numbers in binary.
Signed Magnitude
Use first bit to represent positive/negative
[ S (1 bit) | Magnitude (k-1 bits) ]
i.e., 1111, 1110, 1101, ..., 0111
Q. what is the smallest number?
Q. what is the largest number?
Q. how many numbers total?
Notice: adding a negative to a positive number looks more like subtraction!
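A minimal Python sketch of signed magnitude, useful for exploring the questions above with small k. The function names `sm_encode` and `sm_decode` are hypothetical, and bit strings are used instead of machine words purely for readability.

```python
def sm_encode(x: int, k: int) -> str:
    """Encode x in k-bit signed magnitude: a sign bit followed by k-1 magnitude bits."""
    limit = 2 ** (k - 1) - 1
    if not -limit <= x <= limit:
        raise ValueError("not representable in k-bit signed magnitude")
    sign = 1 if x < 0 else 0
    return format((sign << (k - 1)) | abs(x), f"0{k}b")


def sm_decode(bits: str) -> int:
    """Decode a signed-magnitude bit string (length at least 2)."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude


minus_five = sm_encode(-5, 4)
# Decode every 4-bit pattern to count how many distinct numbers there are.
distinct_4bit = {sm_decode(format(i, "04b")) for i in range(16)}
```

Decoding all sixteen 4-bit patterns shows that two of them (0000 and 1000) denote zero.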
One's Complement
To negate a number, flip the bits
e.g., -5 would be:
First bit is still the sign!

i.e., 1000, 1001, 1010, ..., 0110, 0111
Now addition always looks like addition (with an end-around carry)
But the range is still $2^k - 1$. Why?
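The following Python sketch illustrates one's complement negation and addition with an end-around carry. The helper names are hypothetical, and bit strings stand in for machine words.

```python
def oc_negate(bits: str) -> str:
    """Negate by flipping every bit."""
    return "".join("1" if b == "0" else "0" for b in bits)


def oc_encode(x: int, k: int) -> str:
    """Encode x in k-bit one's complement."""
    if abs(x) > 2 ** (k - 1) - 1:
        raise ValueError("not representable in k-bit one's complement")
    positive = format(abs(x), f"0{k}b")
    return oc_negate(positive) if x < 0 else positive


def oc_decode(bits: str) -> int:
    """Decode a one's complement bit string; the first bit is the sign."""
    if bits[0] == "1":
        return -int(oc_negate(bits), 2)
    return int(bits, 2)


def oc_add(a: str, b: str) -> str:
    """Add two k-bit patterns; a carry out of the top bit is added back in."""
    k = len(a)
    total = int(a, 2) + int(b, 2)
    if total >= 2 ** k:
        total = total - 2 ** k + 1  # end-around carry
    return format(total, f"0{k}b")


minus_five = oc_encode(-5, 4)
two = oc_add(oc_encode(-5, 4), oc_encode(7, 4))
```

In this sketch both 0000 and 1111 decode to zero, which is one way to approach the "Why?" on the slide.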
Two's Complement
To negate a number:
i) flip the bits ii) add 1
e.g., -5 would be:
Uses the full $2^k$ range!

consider negating 0
The first bit gives the sign (like signed mag)
Imagine a circle:
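[Figure: the 4-bit patterns arranged around a circle, labeled with their values: 0000 = 0, 0001 = 1, ..., 0111 = 7, 1000 = -8, ..., 1110 = -2, 1111 = -1]
addition: clockwise
subtraction: counterclockwise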
Let k be the number of bits in the representation. Number x is representable iff $-2^{k-1} \le x \le 2^{k-1} - 1$. Assume so in the following. Let bin(y) be the simple binary representation of y. If $0 \le x$, then x is represented by bin(x). Otherwise, x is represented by bin($2^k + x$).
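The definition above translates almost directly into code. The Python sketch below follows it (bin(x) for non-negative x, bin(2^k + x) otherwise) and also implements "flip the bits, add 1" so the two views can be compared; the function names are made up for this example.

```python
def tc_encode(x: int, k: int) -> str:
    """Encode x in k-bit two's complement, following the definition."""
    if not -(2 ** (k - 1)) <= x <= 2 ** (k - 1) - 1:
        raise ValueError("not representable in k-bit two's complement")
    y = x if x >= 0 else 2 ** k + x
    return format(y, f"0{k}b")


def tc_decode(bits: str) -> int:
    """Decode a two's complement bit string; the first bit gives the sign."""
    k = len(bits)
    y = int(bits, 2)
    return y - 2 ** k if bits[0] == "1" else y


def tc_negate(bits: str) -> str:
    """Negate by flipping the bits and adding 1 (dropping any carry out)."""
    k = len(bits)
    flipped = int(bits, 2) ^ (2 ** k - 1)
    return format((flipped + 1) % 2 ** k, f"0{k}b")


minus_five = tc_encode(-5, 4)
all_4bit_values = sorted(tc_decode(format(i, "04b")) for i in range(16))
```

Negating 0000 yields 0000 again, so there is only one zero and all sixteen 4-bit patterns denote different numbers.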
Lecture #3
Instructions
Basic format of instructions in memory:
[ OP CODE | OPERANDS ]
op code encodes function
operands encode arguments
(A11-560 instructions are all the same size)
Operands can be interpreted in different ways
Addressing Modes
Immediate
argument is the operand
LOAD #6    effect: ACC <- 6
Register Direct
operand gives the register where argument is
LOAD r1    effect: ACC <- r1
Addressing Modes (II)

Relative (PC or base)
operand gives displacement, relative to a special register (such as PC)
JMP 3    effect: branch forward 3 cells, i.e., PC <- PC_old + 3 (important to know what the PC's old value was!)
Addressing Modes (III)

Memory Direct
operand is the address of the argument
LOAD 1B00    effect is to copy contents of cell 1B00 into ACC, i.e., ACC <- M[1B00]
[Figure: memory cells 1AFF, 1B00, 1B01; cell 1B00 contains 4]
Addressing Modes (IV)

Memory Indirect
operand is the address of the address of the argument
LOAD @1B00    effect is to move contents of cell 4 into ACC, i.e., ACC <- M[ M[1B00] ]
[Figure: memory cell 0004 contains 17; cells 1B00 and 1B01 are also shown]
Indexed Addressing Modes

Indexing can be used with direct & indirect
address of argument given by: operand + value of a register
LOAD +1B00
i.e., ACC <- M[1B00 + IND]
[Figure: register IND contains 2; memory cells 1B00, 1B01, 1B02 contain 4, 25, 18]
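The addressing modes on these slides can be compared side by side with a small Python sketch. The function `resolve`, its mode names, and the dictionary-based memory are assumptions made for illustration; the cell contents (1B00 holding 4, 0004 holding 17, 1B01 holding 25, 1B02 holding 18, index value 2) are the ones shown in the slide figures.

```python
def resolve(mode, operand, memory, registers, index=0):
    """Return the argument that a LOAD would place in ACC under each mode."""
    if mode == "immediate":      # argument is the operand itself
        return operand
    if mode == "register":       # operand names the register holding the argument
        return registers[operand]
    if mode == "direct":         # operand is the address of the argument
        return memory[operand]
    if mode == "indirect":       # operand is the address of the address
        return memory[memory[operand]]
    if mode == "indexed":        # address is operand + value of an index register
        return memory[operand + index]
    raise ValueError(f"unknown addressing mode: {mode}")


memory = {0x0004: 17, 0x1B00: 4, 0x1B01: 25, 0x1B02: 18}
registers = {1: 99}  # hypothetical register contents

loaded = {
    "immediate": resolve("immediate", 6, memory, registers),
    "register": resolve("register", 1, memory, registers),
    "direct": resolve("direct", 0x1B00, memory, registers),
    "indirect": resolve("indirect", 0x1B00, memory, registers),
    "indexed": resolve("indexed", 0x1B00, memory, registers, index=2),
}
```

The same operand, 1B00, yields three different arguments depending only on how it is interpreted.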
A11-560 Instructions
Instruction format (field widths in bits): OP (4) | U2 (2) | R (2) | X (2) | U1 (2) | S (8)
addressing modes:
immediate, register direct, memory direct, PC relative (with or without indexing), opcode extension, and ignore
general syntax:
OP R,S(X)
R - a register (integer in range 0-3)
S(X) - S, if X = 0
       S + rX, if X = 1,2,3
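A minimal Python sketch of splitting a 20-bit word into the fields listed above and computing S(X). It assumes the fields are packed in the listed order with OP in the most significant bits, and it applies no wraparound to S + rX; both are assumptions, and the function names are hypothetical.

```python
def decode(word: int) -> dict:
    """Split a 20-bit instruction word into its six fields."""
    if not 0 <= word < 2 ** 20:
        raise ValueError("instruction words are 20 bits wide")
    return {
        "OP": (word >> 16) & 0xF,  # 4 bits
        "U2": (word >> 14) & 0x3,  # 2 bits
        "R": (word >> 12) & 0x3,   # 2 bits
        "X": (word >> 10) & 0x3,   # 2 bits
        "U1": (word >> 8) & 0x3,   # 2 bits
        "S": word & 0xFF,          # 8 bits
    }


def effective_operand(fields: dict, registers: list) -> int:
    """S if X = 0, otherwise S plus the contents of register X."""
    s, x = fields["S"], fields["X"]
    return s if x == 0 else s + registers[x]


fields = decode(0xB3800)              # an arbitrary example word
registers = [0, 0, 5, 0]              # hypothetical register contents
operand = effective_operand(fields, registers)
```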
Categories of Instructions
Branch
unconditional and conditional
Load / Store
copies data between registers and memory
Arithmetic / Logical
addition, subtraction, shifting, ...
IO
read/write numeric and ascii data
Advice for Groups

Meetings
set up frequent and convenient times and places
keep minutes (designate someone)
have an agenda (agree on or set before)
conclude with action items
assign responsibility for each action item!
start with progress report on open action items
Advice for Groups (II)

Tasks
exchange email / phone numbers
read all handouts carefully
e.g., Documentation Requirements, Software Design
agree on standards / conventions
e.g., ELLEMTEL document on course web page
Anticipate and deal with problems
Machine Code Examples

1. ST 0,EE(0)
2. IO 3,0(2)
3. OR 1,2B(3)
Address   Memory   Instruction/Data
. . .
B0        700A0
B1        020BA
B2        B3800
B3        92008
B4        B38B4
B5        000B0
B6        B30B6
B7        90008
B8        B3000
B9        C00FF
BA        546F5
. . .
Lecture #4
Labs 1 and 2: Milestones

Place lab1 in its Carmen Dropbox before class October 4.
Mandatory Design Review: October 7, 10, & 11
Beginning Sep. 30, sign up for a 25-minute slot outside DL 481. Everyone in group must be present. Turn in printed version of preliminary documentation for lab2 October 5, in class (Programmer's Guide in particular). Place preliminary documentation in the Carmen lab2-design-review Dropbox before class October 5.
Completed Documentation: October 18, before class, in the Carmen lab2 Dropbox.
Lab 2 Requirements
Input: two text files
executable file: initial header record, followed by a sequence of text records
process input file: consulted at process run time by IO instr.
Output: two text files
process trace:
1. memory configuration & registers after loading
2. trace of each instruction executed (i.e., memory & registers affected)
3. memory after termination
process output file: appended at run time by IO & BR instrs.
Robustness (so test it thoroughly)
CSE 560 Software Design

Task 1.1 Review of Requirements
How can we get a good handle on the functional requirements (Output #3)?
Prepare a robust test plan.
Look for boundary conditions.
There are more than a few little gotchas (Got you!) in these project assignments. It is important for a software development team to find these.
When you discover a gotcha

Judge whether the customer should be involved in making the decision. When in doubt, ask the customer.
Make a decision.
Document your decision.
Example gotchas
What is a segment good for? How is it used? How might a segment name be used?
Ask the customer.
Should accesses outside of the segment be considered errors and/or warnings?

Not if those accesses are to legal addresses (any address between 0 and 255 is legal).
Lecture #5
Software Engineering
The Software Crisis

We're in the midst of a s/w crisis
and we've been there for 40+ years!
Complexity continuously increasing:

machine
software
Tools and techniques to manage complexity

tools: CASE, analysis, testing, ...
techniques: languages, methodologies, ...
However, system complexity frequently pushes the envelope
Net effect:
The support for building complex systems always seems to lag behind the systems we build (or want to build)!!
Characteristics of Well-Designed Code:

Easy to use
user-friendly
robust
efficient
flexible (and easy to configure)
Characteristics of Well-Designed Code (II):

Easy to maintain
easy to understand and reason about
documentation and simplicity of design
easy to read
coding conventions and style
easy to modify
easy to extend
[Note: George Polya's How to Solve It]
Waterfall Model of Development

Simple model of software development
Occurs in stages:
Programmer's Guide -> 1. requirements analysis
2. system specification
3. design
4. implementation
5. testing
6. maintenance / support -> User's Guide
[Notes: I spend too much time doing #3 during #4. On testing: do this more often! This is the tried and true method for "checking".]
Problems with this Model

There is no "barrier" between steps
e.g., begin testing before implementation done
[Note: testing can happen at many different points. Writing tests even before designing can help mold the design.]
Water flows uphill

e.g., working on design reveals gaps in requirements analysis
more like an Escher print than a real waterfall!
Fundamentals
Many alternatives to pure waterfall exist
spirals, matrices, ...
Concept of distinct stages is useful

helps structure the effort, like a battle plan
Some basic stages:

1. requirements / specification
2. design
Basic Stages in Software Design

Requirements Analysis
answers: What should the system do?
understand the problem
deliverables:
1. requirements document
2. specification document
from the user's point of view: defines and limits the scope of the system
from the developer's point of view: basis for design / implementation / testing
[Note: This could be good to write even if the client is us. It makes it straightforward to write a design once we figure out how the system should work.]
Not Done in Requirements Analysis
ON TESTS
Requirements Analysis does not aim to answer any of the following questions:
What environment will the system operate in? [Note: This is deferred 'til later.]
What will be the performance characteristics of the system? [Note: we haven't thought about HOW to implement this yet.]
What will be the cost of delivering the system to the client? [Note: same as above.]
Basic Stages (II)

High Level Design
identify and evaluate possible solutions
refine the design
[Notes: Brainstorming! Possible solutions (no evaluations). Now evaluate; factors to evaluate on: simplicity, effort, cost. Move on to lower levels...]
Basic Stages (III)

Design
answers: How does the system do what it does?
high-level description of components, interfaces, and interactions
given in terms of data structures, procedures, algorithms, ...
abstraction is critical
[Notes: How do we answer the requirements analysis? (System specs) The data structures we provide, that is. The operations we provide. The algorithms are NOT provided to the client. Instead, this is internal information. What details do we ignore? Which do we emphasize?]
Basic Stages (IV)

Module Specification
identify the abstractions
describe the abstractions
see the specification skeleton in the syllabus
describe the interactions (interfaces, operations)
Two common and broad classes of design:

1. procedural
2. object-oriented
These are both useful, and we can use either for different projects.
Procedural programming deals with data. There is less abstraction. It deals with transformations that happen to data.
Procedural Design
Focus on the functionality
Create a data-flow view of computation

Input File -> Loader -> Memory Rep.
Use this data-flow to decompose a large system into smaller modules
Procedural Design (II)

Look for modules that:
are small enough to be understood
are large enough to result in reasonable overall complexity
are generic (and flexible) enough to be reused
Key activity: DEFINE INTERFACES

this allows work to proceed in parallel on the sub-parts
Defining interfaces is important for both procedural AND OO programming.
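As a rough sketch of what "define interfaces first" can look like for the Input File / Loader / Memory Rep. data flow, the fragment below fixes one function signature that two teammates could then work on either side of. The names Memory and load_image, the 256-cell size, and the one-integer-per-line input format are all assumptions made for illustration, not the course's actual lab format.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical memory representation: 256 integer cells (size is an assumption).
constexpr std::size_t kMemorySize = 256;
using Memory = std::array<int, kMemorySize>;

// The agreed interface. One teammate implements it; another can already
// write the rest of the system against this signature.
// Reads one integer per line into consecutive cells, starting at cell 0.
// Returns the number of cells loaded; stops at end of input, at the first
// malformed line, or when memory is full.
std::size_t load_image(std::istream& input, Memory& memory) {
    std::size_t count = 0;
    std::string line;
    while (count < kMemorySize && std::getline(input, line)) {
        std::istringstream fields(line);
        int value = 0;
        if (!(fields >> value)) {
            break;  // malformed line: stop loading
        }
        memory[count++] = value;
    }
    assert(count <= kMemorySize);
    return count;
}
```

Because the loader takes any std::istream, it can be exercised from a string in a test long before real input files exist.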
ON TESTS
OO focuses on the data itself, not the functionality. Of course, looking at the name, this makes sense.
Object-Oriented Design (OOD)

Focus on the data
Program = collection of interacting objects
Sketch out the types needed and the interactions between these types
[Diagram labels: Table, Machine, Memory, State, Current State; relationships "is a" and "has a"; "class" vs. "object"]
We think of different types of data with different objects. Objects have functionality by passing messages to each other, i.e., by calling each other's methods.
OOD: Finding Classes

Invariants inside of a class are those things which are always true inside of the class. (Correspondence) Outside, we have *constraints*. There are client-level invariants and implementation invariants.
Design is often based on reality
Talk to field experts to understand the system being modeled in software.
Write down scenarios
use case analysis. Very important. (JankCMS)
Draw lots of pictures and refine model
Decide on class invariants
OOD: Specify Relationships

Typical class relationships include:
inheritance (e.g., a car is a vehicle)
containment (e.g., a car has a steering wheel)
use (e.g., a car uses a highway)
encapsulation (e.g., details of steering mechanisms are hidden behind the interface, i.e., the steering wheel)
Determine responsibility of each class

delegate where appropriate (not too much!)
Must strike a balance (small vs. large) in the size and functionality of classes
cuts across both procedural and OO
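The four relationships from the car example can be sketched in C++ roughly as follows. The class and member names (Vehicle, Car, SteeringWheel, Highway, steer, cruise_speed_on) are invented for illustration; only the relationships themselves come from the slide.

```cpp
#include <cassert>
#include <string>

// "use": a Car uses a Highway but does not own it.
struct Highway {
    int speed_limit_kmh;
};

// "containment": a Car has a SteeringWheel.
class SteeringWheel {
public:
    void turn(int degrees) { angle_ += degrees; }
    int angle() const { return angle_; }
private:
    int angle_ = 0;  // "encapsulation": hidden behind turn() and angle()
};

// Base class for the "inheritance" relationship.
class Vehicle {
public:
    virtual ~Vehicle() = default;
    virtual std::string describe() const { return "vehicle"; }
};

// "inheritance": a Car is a Vehicle.
class Car : public Vehicle {
public:
    std::string describe() const override { return "car"; }

    // Delegates steering to the contained wheel.
    void steer(int degrees) { wheel_.turn(degrees); }
    int heading() const { return wheel_.angle(); }

    // Uses a Highway passed in by the caller; returns the speed to drive at.
    int cruise_speed_on(const Highway& road, int desired_kmh) const {
        assert(desired_kmh >= 0);
        return desired_kmh < road.speed_limit_kmh ? desired_kmh
                                                  : road.speed_limit_kmh;
    }

private:
    SteeringWheel wheel_;
};
```

Note that Car delegates steering to its wheel rather than tracking the angle itself, which is one small instance of "delegate where appropriate".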
OOD: Specify Operations

Important categories of operations:
construct, initialize, copy, assign, xfer, destroy
access, update, iterate
Set should be small and independent

do not implement every possible use / extension
Focus on behavior, not implementation

confirm invariants
Key activity: DEFINE INTERFACES
Lecture #6
Road Map
Project - Lab 2
Form groups
Requirements analysis
Design review
Submit complete project (i.e., implementation, documentation, tests)
Lectures
Admin stuff
Abstract machine
Software Engineering <detour!>
Testing
Technical Writing
System Software - Overview

What is system software anyway?
programs that support the operation of a computer
not necessarily something the user interacts with
often closely related to the architecture
compiled. Java is a weird choice.
allow us to focus on application without knowing details of machine
Java makes more sense for this kind of programming.
Overview (II)
Examples:
operating systems
software used to create other software!
compiler
assembler
linker / loader
debugger
editor (could use Word as system software, e.g., for writing C++ code)
Could be on exam: Compilers and assemblers are both translators (from one language to another). Lab 2 is a simulator (interpreter). Usually a debugger is an interpreter (allowing us to see code execute).
Driving force: people more expensive than machines
A program is static; a process is dynamic. A process uses a program's instructions to carry out actions.
Program vs. Process

Program: a collection of action descriptions
Process: a program in execution
contains state:
i) values of variables, ii) location in program, iii) pending I/O, etc.
Processes contain state that changes / evolves in time.
state changes over time
ALL THIS STUFF IS ON THE TEST.
Program vs. Process (Example)

Program
S1: A <-- 3
S2: B <-- A+1
S3: Branch S1
State
A = 3, B = 17, PC = S2
(time passes: S2 executes)
A = 3, B = 4, PC = S3
Executing an instruction changes the state

That's the purpose of executing an instruction: changing the state.
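The three-statement example can be replayed in code. In this sketch the program is a fixed vector of instructions (static), while ProcessState is the part that changes as instructions execute (dynamic). The type and function names (Op, ProcessState, step, slide_program) are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The static program: a fixed list of instructions (S1, S2, S3 on the slide).
enum class Op { SetA3, SetBToAPlus1, BranchToS1 };

// The dynamic state of one process running that program.
struct ProcessState {
    int a;
    int b;
    std::size_t pc;  // index of the next instruction: 0 = S1, 1 = S2, 2 = S3
};

// Executes exactly one instruction and returns the new state.
ProcessState step(const std::vector<Op>& program, ProcessState s) {
    assert(s.pc < program.size());
    switch (program[s.pc]) {
        case Op::SetA3:        s.a = 3;       s.pc += 1; break;
        case Op::SetBToAPlus1: s.b = s.a + 1; s.pc += 1; break;
        case Op::BranchToS1:   s.pc = 0;                 break;
    }
    return s;
}

// The program from the slide: S1: A <-- 3, S2: B <-- A+1, S3: Branch S1.
std::vector<Op> slide_program() {
    return {Op::SetA3, Op::SetBToAPlus1, Op::BranchToS1};
}
```

Starting from A = 3, B = 17, PC = S2 and calling step once yields A = 3, B = 4, PC = S3, matching the two snapshots on the slide. The program itself never changes.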
Layers of Abstraction Architecture

Computer can be viewed at different levels of abstraction
Each layer is a virtual machine (VM)
This helps bridge the human / machine gap
Each VM corresponds to a language
Keystrokes / mouse actions in Word constitute a language
Layers (top to bottom):
Application < The stuff we're writing.
Tools < JUnit
LINE OF SYSTEM SOFTWARE---------------------------
High-level
Assembly
OS
Machine
High-level: Java translated to assembly language, translated to machine language, run on machine. Somewhere within these 3, we have a virtual machine. (Kind of like a C++ virtual machine or something.)
MACHINE: State determined by the values stored in each memory location, registers, and program counter. Machine has no notion of separately running processes. It simply stupidly executes instructions. The OS keeps track of where the processes go / what they do.
An interpreter is a program. When it becomes a process (when we run it), it advances another process through its states. That other process comes from a program that we give the interpreter. Simply: running an interpreter creates a process that advances another process through its states. BASIC is interpreted. Java involves both translation and interpretation: Java bytecode is usually interpreted.
A translator is a program, but when we run it, we give it another program, and it produces a third program. The process a translator governs is merely a translation process. C++, for example, is translated to machine code, possibly with an intermediate translation to assembly language. Java also involves translation: source is translated to bytecode (.class files).
Advantages of translating: faster, because it's translated to machine code. Disadvantages of translating: symbolic debugging is difficult.
Advantages of interpreting: quicker debugging and prototyping. Disadvantages: slower.
Two Important Kinds of Program

Translator
a program that, given a program at one VM level, produces a program at another (lower) one
C++ is translated to machine code
Java source is translated to byte-code
faster in execution
Interpreter
a program that advances another process through its states
BASIC is interpreted
Java byte-code is (usually) interpreted
quicker debugging and prototyping
THIS IS ON TEST.
Why are we concerned more with the OS than with the architecture when compiling? In C++, we have input/output streams. I/O, for example, is carried out by the language using operating system routines.
Translating vs. Interpreting

Consider Java:
program: MyProgram.java
translated to
byte code: MyProgram.class
interpreted by
Java VM
Translating vs. Interpreting

Translation / interpretation can occur at all levels
High-level: translate (compiler) to Assembly, or interpret
Assembly: translate (assembler) to Machine Language, or interpret
Machine Language: interpreted (usually by a CPU)
Any language can be translated or interpreted. The CPU always interprets its instructions.
Lecture #7
The operating system defines when a process really "begins". Though the CPU is the primary resource, the OS controls it and other pieces like memory in order to allow a process to execute.
Operating Systems - Introduction

When does a process begin?
when it is assigned certain system resources
e.g., processor, memory, I/O, registers, ...
At any instant, there are many processes

multiple, concurrent users
Even if there is only 1 user, there are usually many processes running
mix of batch and interactive jobs
interactive: emacs
batch: g++
OS tasks (e.g., buffering printer output)
But only a fixed number of resources...

This is why we must meter them out.
OS - Introduction (II)
Resources must be managed
This is the job of the operating system (OS)
This is one of the biggest tasks of the OS.
Concurrency is fundamental to the use of an operating system. Concurrency is hard because of scheduling: priority, length of time waiting (starvation), and many other factors come into play.
Challenges in OS
Concurrency is fundamental
Concurrency is hard
Example: sharing a bridge
Long tunnel that only fits 1 lane of traffic
What policy do you use to control traffic?
Example:
Process A is using X and needs Y,
Process B is using Y and needs X <-- Deadlock
OS must avoid and/or detect and resolve deadlock and starvation
starvation: waiting too long to get resources...
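One common way to detect a deadlock like the A/B example is to record who is waiting on whom and look for a cycle. The sketch below does that with a depth-first search over a "wait-for" graph. It illustrates the idea only and makes no claim about how any real operating system implements detection; WaitForGraph and has_deadlock are names invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// waits_for[p] lists the processes that process p is waiting on.
using WaitForGraph = std::vector<std::vector<std::size_t>>;

namespace {
enum class Mark { Unvisited, InProgress, Done };

// Depth-first search; reaching a process that is still InProgress means
// we followed the waits back to where we started.
bool on_cycle(const WaitForGraph& g, std::size_t p, std::vector<Mark>& mark) {
    mark[p] = Mark::InProgress;
    for (std::size_t q : g[p]) {
        assert(q < g.size());
        if (mark[q] == Mark::InProgress) return true;
        if (mark[q] == Mark::Unvisited && on_cycle(g, q, mark)) return true;
    }
    mark[p] = Mark::Done;
    return false;
}
}  // namespace

// Returns true if some group of processes waits on itself in a circle.
bool has_deadlock(const WaitForGraph& g) {
    std::vector<Mark> mark(g.size(), Mark::Unvisited);
    for (std::size_t p = 0; p < g.size(); ++p) {
        if (mark[p] == Mark::Unvisited && on_cycle(g, p, mark)) return true;
    }
    return false;
}
```

With A as process 0 and B as process 1, the slide's situation is the graph {{1}, {0}}: A waits on B (which holds Y) and B waits on A (which holds X).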
Responsibilities of OS
Handles interrupts
Interrupt-driven IO is supposed to increase efficiency of simultaneous requests.
may be generated by I/O or by programs
Manages real memory

loading of segments
Manages virtual memory "virtual" = "abstract"

virtual: appears to user to have different characteristics than it has in actuality, i.e., in implementation
virtual memory: a large block of contiguous memory space
The OS makes us think we have large blocks of memory, but we really don't.
Responsibilities of OS (II)
virtual memory may be larger than real memory!
[Diagram labels: User, Virtual Memory, Real Memory, Disk]
page fault: trying to access outside of Real Memory
Responsibilities of OS (III)
File management
keeps file handles and position marks
These let us access bytes in a file without keeping track of our place.
CPU
schedules processes (i.e., running / ready / waiting)
running: process is using CPU
ready: process is capable of making progress by using CPU
waiting: process cannot make progress without something else (user interaction, etc.)
Security
prevents one user from damaging another's data
prevents user from damaging operating system
A Little About Documentation

See "common errors" ("Top 15") link on Carmen
Resource: OSU Center for the Study and Teaching of Writing (CSTW)
see link on web page (also under Resources on Carmen)
Clarification of some elements of the Programmer's Guide:

Data Structures / Types
Data Element Dictionary
Especially important: shared elements
Documenting Data Structures

Consider a structure representing a cell:
struct MemoryCell { char Bit[CellSize]; };
Give name, declaration, description, invariant, purpose, ...

invariant: $(\forall i : 0 \le i < \text{CellSize} : \text{Bit}[i] = 0 \lor \text{Bit}[i] = 1)$
Also use pictures and English to help in the description
Data Structures (II)

These are the shared data structures
used in the declaration of variables (even local)
variable active_cell: MemoryCell;
used in the declaration of other types

type MemoryType = array [256] MemoryCell;
used as parameter types in operation signatures

function decode (MemoryCell m): integer
For more OO designs, you have types

Documenting Data Types

Type: MemoryCell
contents: array of 20 characters
description: used to represent a ...
invariant: every character is a 0 or a 1
operations:
set_bit (int)
read_bit (int) : character
initialize (string)
Then need to specify each operation
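A possible C++ rendering of this type documentation is sketched below. The slide gives only the operation names and parameter types, so the behaviors are assumptions: here set_bit sets the addressed bit to '1', initialize rejects strings that would break the invariant, and positions use std::size_t rather than int. invariant_holds is an extra helper added for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Sketch of the MemoryCell type: 20 characters, each '0' or '1'.
class MemoryCell {
public:
    static constexpr std::size_t kSize = 20;

    MemoryCell() { bits_.fill('0'); }

    // initialize (string): replaces the contents; rejects anything that
    // would break the invariant.
    void initialize(const std::string& bits) {
        if (bits.size() != kSize) {
            throw std::invalid_argument("need exactly 20 characters");
        }
        for (char c : bits) {
            if (c != '0' && c != '1') {
                throw std::invalid_argument("only '0' or '1' allowed");
            }
        }
        for (std::size_t i = 0; i < kSize; ++i) bits_[i] = bits[i];
        assert(invariant_holds());
    }

    // set_bit (int): sets the bit at the given position to '1' (assumed meaning).
    void set_bit(std::size_t position) {
        bits_.at(position) = '1';  // at() throws std::out_of_range if too large
        assert(invariant_holds());
    }

    // read_bit (int) : character
    char read_bit(std::size_t position) const { return bits_.at(position); }

    // invariant: every character is a '0' or a '1'
    bool invariant_holds() const {
        for (char c : bits_) {
            if (c != '0' && c != '1') return false;
        }
        return true;
    }

private:
    std::array<char, kSize> bits_;
};
```

Because the array is private, every path that changes it goes through an operation that re-checks the invariant, which is what makes the documented invariant something clients can rely on.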

Documenting Operations
Name: CheckHeaderSyntax()
description: This function checks whether or not a given string conforms to the required syntax for header records (see section 2.3.4)
calling sequence:
input: char *s - header record to be checked
returns: boolean - true iff header syntax ok
requires:
ensures:
Visible State
Basic principle: information hiding
hide implementation details from client
simplifies interface
client shouldn't rely on these details
Same principle applies to specifications

given in terms of visible (abstract) state
For each shared type, then, there are two kinds of specification: internal & external
Data Element Dictionary

Should contain all shared data elements
types, variables, and constants
Documentation of classes should be elsewhere, not in the Data Element Dictionary.
Give pertinent information:

name, type, declaring module, description of use, any invariant, value (for constants)
Lecture #8
Testing
1. Philosophy
2. Example
3. How-tos (including code) and caveats
4. Levels of Testing
Testing: Philosophy
Definition of Testing
What is testing?
A process whereby we increase our confidence in an implementation by observing its behavior
Fundamental point:
testing can detect the presence of mistakes, never their absence!
That's why we should never be confident in our code without testing!
A test case reveals a defect ==> Fix it!
No test case reveals a defect ==> Not enough testing!
Importance of Testing
Despite limitations, testing is the most practical approach for large systems
Knuth quotation:
"Warning: I've only proven this algorithm is correct; I haven't tested it!" Haha
The Right Frame of Mind

Tests should be written to break a program
not to show it works! Be mean!
When a test reveals an error, that's success!
Good approach: have someone else test your code
This is one of the best things about working in a team.
Theory
3 levels of abstraction in functionality
Want: the idea
Have: implementation
Testing requires comparing it against something, but what?
Idea
Specification: capturing this idea into a concrete form.
Implementation
Theory (II)
Ideal: test against our idea
but the idea is usually too fuzzy
So make it concrete by writing specification

defines desired mapping from input to output
If different people, based on a specification, write two different implementations, both should have the same expected output.
Testing: compare expected and actual
Input -> Specification -> Expected Output
Input -> Implementation -> Actual Output
Testing: An Example
Example: Sorting a List

Idea: function sorts a list in ________ order
Spec: void sort (List& x)
requires: |x| <= 100
modifies: x
ensures: For all i in List, ARE_IN_ORDER(List(i), List(i+1)), and elements(SORTED(List)) = elements(List)
Expected Output
Q: do we really need the expected output?

i.e., why not just look at actual output and see if it is sorted?
A: #x is a permutation of x, and for all y in x, ARE_IN_ORDER(y, y+1). (#x is the value of x before the call.) Being sorted alone is not enough: the output must also contain exactly the input's elements.
Specifications often relate final states to initial ones
but not necessarily true, e.g., void f(int & x)
Testing: How-Tos and Caveats
Importance of Independent Testing

See IEEE Computer, Oct 1999 (J. D. Arthur, et al.)
study at NASA Langley had two groups working in parallel
The group with independent testers found:

more faults overall (critical and non-critical)
found these faults earlier in the process
fixed these faults with less effort
Figure 1 from Arthur article
Figure 2 from Arthur article
Lecture #9
How To Choose Test Input

Too many possible inputs to test them all
space of possible inputs defined by requires
On MIDTERM / FINAL MEMORIZE
Important kinds of test input:

extremes (e.g., empty list, |x| = 100)
trivial / degenerate (|x| = 1, x is already sorted)
simple cases that are almost too simple. They make the procedure not do work.
error-generating
at least one (probably more) test case that generates each possible error message
different categories (e.g., pos./neg. numbers)
Also, inputs that cross categories.
typical input (random list)
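For the sort specification with requires |x| <= 100, the kinds of input listed above might be collected into a small suite like the following sketch. TestInput and make_sort_inputs are invented names, and the particular values are arbitrary. No error-generating case is included because the sort spec as given names no error messages.

```cpp
#include <cassert>
#include <random>
#include <string>
#include <vector>

struct TestInput {
    std::string category;   // which kind of input this is
    std::vector<int> list;  // the list handed to sort
};

// Builds one input per category, all satisfying the requires clause.
std::vector<TestInput> make_sort_inputs() {
    std::vector<TestInput> suite;
    suite.push_back({"extreme: empty list", {}});

    std::vector<int> longest(100);
    for (int i = 0; i < 100; ++i) longest[i] = 100 - i;  // 100, 99, ..., 1
    suite.push_back({"extreme: |x| = 100", longest});

    suite.push_back({"trivial: |x| = 1", {42}});
    suite.push_back({"trivial: already sorted", {1, 2, 3, 4}});
    suite.push_back({"category: all negative", {-5, -1, -9}});
    suite.push_back({"category: all positive", {5, 1, 9}});
    suite.push_back({"crossing categories: mixed signs and zero", {3, -2, 0, -7, 8}});

    std::mt19937 generator(560);  // fixed seed so the suite is repeatable
    std::uniform_int_distribution<int> value(-1000, 1000);
    std::vector<int> typical(25);
    for (int& v : typical) v = value(generator);
    suite.push_back({"typical: random list", typical});

    // Every input must satisfy the requires clause of the spec.
    for (const TestInput& t : suite) assert(t.list.size() <= 100);
    return suite;
}
```

Keeping the category name next to each input makes a failing case easier to report and shows at a glance which kinds of input are still missing.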
How To Generate Expected Output

1. By hand
error-prone and tedious
2. With another program
also error-prone
often just redoing the implementation, and making the same mistakes!
3. Work backwards
an inverse may be easier to calculate
e.g., start with a sorted list, and permute it
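Working backwards for sorting can be sketched as follows: build the expected (sorted) output first, then shuffle a copy of it to get the input. SortCase and make_case_backwards are hypothetical names; the values 1..length and the seeded generator are arbitrary choices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

struct SortCase {
    std::vector<int> input;     // what we hand to the sort under test
    std::vector<int> expected;  // what it must produce
};

// Works backwards: the expected output is built first, directly in sorted
// order, and the input is obtained by permuting it. No sorting is done here.
SortCase make_case_backwards(int length, unsigned seed) {
    assert(length >= 0);
    SortCase c;
    c.expected.resize(static_cast<std::size_t>(length));
    std::iota(c.expected.begin(), c.expected.end(), 1);  // 1, 2, ..., length
    c.input = c.expected;
    std::mt19937 generator(seed);
    std::shuffle(c.input.begin(), c.input.end(), generator);
    return c;
}
```

Since no sorting code is involved in producing the expected output, a mistake in the sort under test cannot be silently duplicated in the test data.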

Alternate: Validating Output

1. Keep a copy of the input
2. Run the program
3. Validate the actual output against input
Example: sorting
write two functions:
copy the input
run program and check:
Checking functions may be simpler than the full implementation
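A minimal sketch of the validating approach for sorting, assuming the two checking functions are "is it sorted?" and "is it a permutation of the input?". The names is_sorted_list, is_permutation_of, and passes_validation are invented here; this is not the course's test driver.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Check 1: every adjacent pair is in non-decreasing order.
bool is_sorted_list(const std::vector<int>& x) {
    for (std::size_t i = 1; i < x.size(); ++i) {
        if (x[i - 1] > x[i]) return false;
    }
    return true;
}

// Check 2: b holds exactly the elements of a, counting duplicates.
// Written by counting rather than by sorting, so it does not repeat the
// code under test.
bool is_permutation_of(const std::vector<int>& a, const std::vector<int>& b) {
    if (a.size() != b.size()) return false;
    for (int value : a) {
        if (std::count(a.begin(), a.end(), value) !=
            std::count(b.begin(), b.end(), value)) {
            return false;
        }
    }
    return true;
}

// Validating test driver: keep a copy of the input, run the sort under
// test, then validate the actual output against the saved input.
template <typename SortFunction>
bool passes_validation(SortFunction sort_under_test, std::vector<int> x) {
    assert(x.size() <= 100);           // requires clause of the spec
    const std::vector<int> saved = x;  // 1. keep a copy of the input
    sort_under_test(x);                // 2. run the program
    return is_sorted_list(x) && is_permutation_of(saved, x);  // 3. validate
}
```

Both checks are needed: a "sort" that overwrites the list with zeros passes the first check but fails the second.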
INSERT: Code for Testing Harness
Insert 1: Test Driver for Sort
Insert 2: Test Suite for Sort
Insert 3: Validating Form of Test Driver
Dangers with Testing

Expected output is wrong
#1 -- could be reporting errors that are not errors. #2 -- could be not reporting errors. #2 is worse. That would be a problem.
Testing program is wrong
extra code means more chances to mess up
e.g., is_permutation(A,B) always returns true
With these errors, there are 2 dangers:

1. reporting a non-error
2. not reporting an error
Which is worse?
Not reporting error! (ON TEST)
Dangers with Testing (II)

A third, more subtle, potential error: The specification is wrong
how can this be?
It can be vague on a point. It could say something other than what the author intended. (Conflicts with the original idea)
may not be exposed during testing
to increase the chances of finding these problems, have someone else test your code!
Make sure that the client is checking progress.
Testing: Levels
Levels of Testing
Typical testing path:
1. Unit tests: Testing individual pieces of the software independently.
2. Integration tests: Testing how the individual parts communicate.
3. System tests: Testing entire program behavior.
Unit Tests
Individual modules tested in isolation
Two flavors:
1. Black box: testing based only on specification (tester doesn't even look at code)
2. White box: testing based on code structure (e.g., tester makes sure every branch of a switch statement is followed)
Integration Tests
Modules tested in combination in order to check the interfaces
Best done incrementally
[Figure: module hierarchy (Main, Initialize, Load, Simulate, FileIO, Header, Reserve, SetMemory) with the interfaces between modules marked "test here"]
Bottom-up vs. Top-down Testing

Bottom-up
start with most basic modules
easy to exercise all the features
write a driver in place of higher-level modules
Top-down

start at top (main)
test interfaces early
write stubs in place of lower-level modules
"stub" on test.
Often these two occur simultaneously, in tandem
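The difference between a stub and a driver can be sketched with two hypothetical modules: a higher-level load that calls a lower-level "set memory" operation. All names here (SetMemoryFn, load, SetMemoryStub, Memory, drive_set_memory) and the 256-cell size are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Interface of the lower-level module: store value at address, report success.
using SetMemoryFn = std::function<bool(std::size_t address, int value)>;

// Higher-level module under test: stores the values at consecutive
// addresses and returns how many were stored.
std::size_t load(const std::vector<int>& values, const SetMemoryFn& set_memory) {
    assert(static_cast<bool>(set_memory));
    std::size_t stored = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (!set_memory(i, values[i])) break;
        ++stored;
    }
    return stored;
}

// Top-down: a stub stands in for the lower-level module. It does no real
// work; it only records how it was called.
struct SetMemoryStub {
    std::vector<std::pair<std::size_t, int>> calls;
    bool operator()(std::size_t address, int value) {
        calls.emplace_back(address, value);
        return true;
    }
};

// Bottom-up: the real lower-level module ...
struct Memory {
    std::vector<int> cells = std::vector<int>(256, 0);
    bool set(std::size_t address, int value) {
        if (address >= cells.size()) return false;
        cells[address] = value;
        return true;
    }
};

// ... and a driver that stands in for the higher-level modules by calling
// it directly, including at the edges of the address range.
bool drive_set_memory() {
    Memory m;
    bool ok = m.set(0, 5) && m.set(255, 9) && !m.set(256, 1);
    return ok && m.cells[0] == 5 && m.cells[255] == 9;
}
```

To test load top-down, pass the stub by reference, e.g. load(values, std::ref(stub)), and then inspect stub.calls. Once the real Memory exists, the same load can be wired to it for an integration test.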

System Tests
Verify that system as a whole meets the requirements and specifications
Three flavors:
1. alpha: by developers, before release
2. beta: by friendly customers, before general release
3. acceptance: by end customer, to decide whether or not to hire you next time!
Lecture #10
Technical Writing
What Is Technical Writing?

Writing we do as part of our jobs
Possible purposes:
Inform
Instruct
Persuade
Call to action
Missing from this list:

Entertain
Occasionally, entertainment is alright.
Four Characteristics of Effective Technical Writing

1. Engages a specific audience
2. Uses plain and objective language
3. Stresses presentation (obvious structure, understandable at a glance)
4. Employs visual aids
Why Bother?
Communication is fundamental in society
Politics, law, science, personal lives, health,
Fundamental in personal success:

Good idea
Ability to communicate that idea
Highly valued by employers
Writing in Computer Science

Taking exams
Documentation (for users and developers)
Reports and memos
Papers (journals, conferences, magazines)
Proposals
Reviews of others' work
Books
Good and Bad News

The bad news
Most of us are not very good at it
We enjoy technical challenges much more
The good news

Writing is actually not too different from computer science!
Parallels Between Writing and Computer Science

Programming                  Writing
Must identify/determine:     Must identify/determine:
User                         Audience
Purpose of program           Purpose of document
Program features             Depth of document
User interface               Style and tone
Preprogramming               Prewriting
Parallels Between Writing and Computer Science II

Software Development             Technical Writing
Requirements and design          Prewriting
Implementation (i.e., coding)    Composition
Testing                          Reviewing
Debugging                        Revising
Analyzing the Audience

Few people read this stuff for fun
Must correctly identify a customer and what that customer's needs are
Consider writing an audience profile
Novice, technician, expert, manager, VC, ...
Reading level
Motivations, biases, expectations, ...
Analyzing the Audience II

Make this analysis concrete by stating assumptions made on background, motivation, needs, etc.
"The reader is expected to be familiar with the predicate calculus."
"This manual is designed for application programmers who write S47G applications for the insurance industry."
It's a good idea to state somewhere what is required of the audience.
Technical Audience
Function-oriented organization
E.g., alphabetical listing of all functions
Want a complete (exhaustive) resource

All information they might want is there somewhere
Willing to spend a great deal of time

Will read the document carefully
Technical audiences: function-oriented. Customer audiences: task-oriented. => More of a tutorial. (They don't care about "why")
Customer Audience
Prefer task-oriented organization
E.g., enumerated steps for each possible task
Want just the necessary information

Only the information critical to their jobs is there
Will spend as little time as possible

Must be concise and easy to read
Identify the Purpose

Rule of business letters and memos:
Begin with clear statement of what you want!
Larger documents are no different

What information are you trying to impart?
What are you trying to teach?
What view do you want the reader to adopt?
What action do you want done?
The Depth of Writing

Bloom's taxonomy of cognition: (categorization) things people do when they think.
1. Knowledge: Facts
2. Comprehension: Understanding of a fact's implication
3. Application: How we apply facts to the current situation.
4. Analysis: Creating new ideas from those we already have.
5. Synthesis: Joining what we know together, creating connections to other topics and ideas.
6. Evaluation: Judging ideas based on their importance and validity.
Verbs Used in Statement of Purpose

Knowledge
count, define, draw, identify, indicate, list, name, quote, recall, recite, recognize, record, state, tabulate, trace, write.
Comprehension
associate, compare, compute, contrast, describe, differentiate, discuss, distinguish, estimate, extrapolate, interpolate, predict, translate.
Verbs Used in Statement of Purpose II

Application
apply, calculate, classify, complete, construct, demonstrate, employ, examine, illustrate, practice, relate, solve, use.
Analysis
analyze, detect, explain, group, infer, order, relate, separate, summarize, transform.
Verbs Used in Statement of Purpose III

Synthesis
arrange, combine, construct, create, design, develop, formulate, generalize, integrate, organize, plan, prepare, prescribe, produce, specify, research.
Evaluation
appraise, assess, critique, determine, evaluate, grade, judge, measure, rank, select, test.
Prewriting: Getting Started

Read the question / problem statement carefully.
Make a list of the required cognitive tasks.
Assess what you know.
Compare this knowledge with the level of the required cognitive task.
Example
"Compare the performance of two cache replacement algorithms"
Cognitive tasks
Compare
Contrast
Maybe analyze and recommend?
Recall various issues in cache algorithms
Prewriting Tasks
Quick list
Specific points for each cognitive task
Brainstorm
List everything you know about the topic
Do not judge or weed anything out
Objective: quantity
Review list
Assess where research is needed
Prewriting Tasks II
Choose a single point that will be in the final product and outline a section to develop that point.
Involve other points as appropriate.
Do the research.
Plan the format.
This is not a linear process!
Prewriting Tasks III

Outlining
Get the planned structure down
Avoid forgetting a key point
Check for the logical flow of arguments and information
This is an easy step to skip, but good work here will pay dividends in the future!
Lecture #11
Writing the Document

The better the preparation in the prewriting phase, the smoother this goes.
Regardless, it's still work!
Requires tools, skills, practice, experience, and motivation.
Technical Writing: Document Component Engineering

Component = section of the document.
Often has its own heading.
Large components consist of smaller ones.
Each (large enough) element from the outline becomes a component
Advantages of Components
The whole document is too intimidating
Obvious milestones
Reduces panic (you know where you stand)
Permits time budgeting
Reduces writing to a step-by-step process
Instant gratification
Easy cure for writer's block: work on a different section
Writing a Component
Know the purpose
Have all the information
Different strategies:
Write a draft using sentences
Jot down points in any form, then flesh out into sentences
Combination (sentences, phrases, points)
Overcoming Writer's Block

Start anywhere
If the problem is lack of information, go back and do more research
Explain it to someone else (verbally)
Work on a different section
Take a walk (in a snow storm?)
Imagine life when you are done
Overcoming Writer's Block II

Force yourself to sit at your desk until done
Revise your outline or organization
Revise some section you've already written
Change your environment
Diagram the structure of the component
Set an impossible schedule and panic
Take a break
Rhetorical Patterns
Every culture has well-established patterns of exposition
Ready-made structures into which specific information may be dropped
The reader is already familiar with these patterns
The technical writer does not have the time (or skill) to invent new ones
General-to-specific Pattern
Often used for introductory section
Start with the most general statement
More computing resources are devoted to the management of data than to any other task
Gradually get more specific

This program simplifies the manipulation of numeric data on a personal computer
Finally specific statements
Classification Pattern
Organize information by dividing it into categories
E.g., a section on each of initialization, data entry, selection, access control, etc
Within each category, present parallel information

E.g., purpose, prerequisites, results, error messages, alternatives, references, etc
Comparison-contrast Pattern (Point-by-point)

Consider one aspect at a time
In football, the ball may be thrown forward from behind the line of scrimmage
In rugby, only lateral passes are allowed
In football, play ends when the ball-carrier is tackled
In rugby, play continues after a tackle, but the tackled player must release the ball
Comparison-contrast Pattern (Whole-to-whole)

Two ways to create a new object (zack)
1. From the command line
On the command line, type edit zack
Fill in attributes of the presented template
Select Save from the File menu
2. From within the application

Select New from the File menu
Fill in attributes of the presented template
Select Save As from File menu, and type zack
Definition Pattern
Typically short and simple
Example
Undo is a function that restores an object to its state immediately prior to the last operation
Places Undo in the class of functions, then distinguishes it from other functions
Chronological Pattern
Typical for task-oriented instructions
Given in the order in which they must be performed
Effect-and-cause Pattern
Often used for error messages
Give a list of error messages
ordered alphabetically, by error number,
For each message, list the possible causes

ordered most to least likely
After each cause, give the action(s) the user should take to recover
Putting Components Together

Add headings (part, chapter, section, ...)
Add transitions where needed
Important to prompt reader for what to expect, or to reinforce that some change is coming
Possible Transitions
Moving to the next point in a sequence
Firstly, secondly,
Contrasting item or viewpoint

However, on the other hand, otherwise,
A result or conclusion
Therefore, in consequence,
Relating things in time

Now, then, soon, immediately,
Possible Transitions
Introducing an example
For example, that is,
Further strengthening a point

Moreover, similarly, further,
Concluding
In conclusion, in summary, finally,
Preliminary Draft
Starting is always difficult
Helps to remember it's just a draft!
Don't worry about spelling, grammar, form
Spend effort on sound communication of major points
Fill in your outline
Middle Draft
Build on the base of the preliminary draft
Refine the organization and fill in points
Ensure each point belongs in that paragraph
Cut and paste
Play with the text
Font, layout, spacing, page count
Final Draft
Spelling and grammar
Run the spell checker, but that's not enough!
A "which" hunt
Word choice
Transitions
Typos
Pagination
Revising
Where bad writing becomes good writing
First draft is always bad
Tempting to become attached to text we've written
Write the first draft anticipating that it will change in the future
Revising Tasks
Add flow and smooth transitions
Careful, accurate movement from one point to the next, one section to the next
Make decisions that have been put off
Reduce wordiness
Clarify subordination relationships between points
Red Flag - Inconsistent View

Changing from 2nd to 3rd person
Inconsistent: Limit your disk storage to 100 Mb. The user can submit a request for more storage space to the system administrator.
Consistent: Limit your disk storage to 100 Mb. For more storage space, you can submit a request to the system administrator.
Red Flag - Passive Voice

The verb expresses what is done to the object (by someone or something)
Occasional use is OK (and even unavoidable in many technical documents)
But excessive use weakens your writing
Passive: This error is used to indicate ...
Active: The parser issues this error to indicate ...
Red Flag - Wordiness

Wordy: In the final analysis, the end result of a wordy document is increased cost in terms of pages of paper, bytes on a disk, and inefficient use of the reader's time
Concise: Words cost money. It is cheaper to print a short book than a long one.
Red Flag - Faulty Parallelism

The use of different grammatical constructs in a parallel structure
Consider the list:
Preparing for installation
How to configure
Do you want the advanced options?
Red Flag - Dangling Modifier

The use of a verbal phrase that does not connect with (or modify) anything else in the sentence
Dangling: After typing enter, the system will continue with the second pass over the program
Fixed: After you type enter, the system will continue with the second pass over the program
Red Flag - Ending a Sentence With a Preposition

Example:
Before using the software, you must set it up
You must set up the software before you can use it
Winston Churchill:
"That is criticism up with which I will not put."
Red Flag - Provincial and Sexist Language

Unless you are sure your readership is homogeneous, be sensitive to and inclusive of many cultures and both genders
Example: A list of names:
Bill White, Ken Williams, and Bob Smith
Chris Amini, Lea Sanchez, and Rei Chi Lee
Red Flag - Which vs. That

Simple rule: if "that" sounds OK, use it!
Use "which" with nonrestrictive clauses
Use "that" with restrictive clauses
Example:
Ed's country house, which is located on five acres, had bats in the attic.
The house that sat on the top of the hill had bats in the attic.
Red Flag - Utilize

Why would anyone utilize the word utilize when the word use would work just as well?
Things to Check Appropriate Style

Verbose vs. terse
Formal vs. informal
Use of contractions, informal language, slang
Tone
Distant, warm, intrusive
Consistency is very important
Bottom Line
Technical writing requires work, practice, skill, technique, time; not talent.
The first draft is always bad writing.
Allow time (and energy) for revisions.
There is no substitute for having something to say. You can't bluff it.
Lecture #12
168
CVS
169
85
CVS: Concurrent Versions System

Widely used, especially in the open-source community, to track all changes to a project and allow multiple access
Can work across networks
Key Idea: Repository

The place where the originals and all the modifications to them are kept.
Each person checks out their own, private copy
Changes are committed by each person
Everyone else's changes are updated into your own copy.
CVS: Examples
The following examples show an existing project being put under CVS
How to start using the repository
Then two different people making changes:
Putting modified file into repository
Getting each other's changes.
Finding out how things have changed.
You can read more:

man cvs, or emacs:
^U^Hi/usr/local/info/cvs.info
Pointer to manual in Carmen:Content:Resources
The Repository
Two ways to set the root of the repository
Environment variable
setenv CVSROOT /project/c560ab05/CVSREP
Command line flag (-d)

-d /project/c560ab05/CVSREP
Repository may contain several modules
172
Creating Repository
Once per project, by one person (with umask 7)
Command:
cvs init
Creates repository root, administrative files
Check that group and other permissions have been properly set. (Use ls -alF.)
Creating Repository (Example)
% cd /project/c560ab05/
% ls
Lab1/ Lab2/
% cvs -d /project/c560ab05/CVSREP init
% ls
CVSREP/ Lab1/ Lab2/
Adding Existing Project

Once per project, by one person
Command:
cvs import <module> <vendor> <release>
Copies current directory contents to module
Afterwards, original source can be removed
But be careful!!
Adding an Existing Project (Example)

% cd Lab2
% ls
loader.c loader.h simulator.c simulator.h
% cvs -d /project/c560ab05/CVSREP import sim A start
[editor starts; save log entry; exit editor to cont.]
N sim/loader.c
N sim/loader.h
N sim/simulator.c
N sim/simulator.h

No conflicts created by this import
% (cd to parent folder & carefully remove Lab2 folder)
Checking Out
Once per person
Command:
cvs checkout <module>
Copies repository files to local, working directory

Local directory contains CVS subdirectory for administrative book-keeping
Checking Out (Example)

% cd ~person1/mycode/
% cvs -d /project/c560ab05/CVSREP checkout sim
cvs checkout: updating sim
U sim/loader.c
U sim/loader.h
U sim/simulator.c
U sim/simulator.h
% ls
sim/
% ls sim/
CVS/ loader.c loader.h simulator.c simulator.h
Committing
Commit (copy) changes made on local (working) files to the repository
Command:
cvs commit
New files created in local working directory must be explicitly added (before commit)
cvs add <new-file>
Committing (Example)
% cd ~person1/mycode/sim
% (modify loader.c and create memory.h)
% cvs add memory.h
cvs add: scheduling file memory.h for addition
cvs add: use cvs commit to add this file permanently
% cvs commit
cvs commit: Examining .
[editor starts; type & save log entry; exit editor to cont.]
Checking in memory.h;
/project/c560ab05/CVSREP/sim/memory.h,v <-- memory.h
Initial revision: 1.1
done
Checking in loader.c;
/project/c560ab05/CVSREP/sim/loader.c,v <-- loader.c
New revision: 1.2; previous revision: 1.1
done
Updating
Each person, with appropriate frequency
Command:
cvs update
Brings your local working directory up-to-date with repository (merging differences if possible)
U: local file was updated
A/R: local file added/removed
M: local file is a modification of repository
C: conflict detected between local file and repository
Updating (Example)
% cd ~person2/mycode/sim
% ls
CVS/ loader.h loader.c simulator.h simulator.c
% cvs update
cvs update: Updating .
U loader.c
A memory.h
% ls
CVS/ loader.c simulator.h loader.h memory.h simulator.c
Working on Project
Multiple people can simultaneously check out the same module
Person1 and Person2 are both working away on their local copies
If working on different files, no problem
Not quite true!
If they are working on the same file, there could be a conflict
Conflict Resolution
Person1 checks out code
modifies loader.c
Person2 also checks out code

also modifies loader.c
Person1 commits: no problem
Person2 commits:

% cvs commit
cvs commit: Examining .
cvs commit: Up-to-date check failed for loader.c
cvs [commit aborted]: correct above errors first!
Resolving Conflicts
CVS tries to merge changes
Sometimes changes clash
% cd ~person2/mycode/sim
% cvs update
cvs update: Updating .
RCS file: /project/c560aa/CVSREP/sim/loader.c,v
Retrieving revision 1.5
Retrieving revision 1.6
Merging differences between 1.5 and 1.6 into loader.c
rcsmerge: warning: conflicts during merge
cvs update: conflicts found in loader.c
C loader.c
Resolving Conflicts: Human

Note how all non-overlapping modifications are incorporated in your working copy, and that the overlapping section is clearly marked with `<<<<<<<', `=======' and `>>>>>>>'.
int main(int argc, char **argv)
{
    init_scanner();
    parse();
    if (argc != 1) {
        fprintf(stderr, "tc: No args expected.\n");
        exit(1);
    }
    if (nerr == 0)
        gencode();
    else
        fprintf(stderr, "No code generated.\n");
<<<<<<< loader.c
    exit(nerr == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
=======
    exit(!!nerr);
>>>>>>> 1.6
}
When to Update/Commit
When confident things can be used by others
Don't wait until perfection
Your commits should at least compile though!
One should update before committing

Integrates everyone else's changes
Update when you are ready for someone else's work
The more files, the better
Subversion: a more modern alternative

Subversion is also available on stdsun, as svn. (Must subscribe to SVN.)
Documentation is available at http://svnbook.red-bean.com/
Distributed Version Control

No central repository
Example (free) tools:
Mercurial
Git
Lecture #13
190
Introduction to (and Review of) Assembly Language
191
96
Definition
Recall: translation (vs. ______________)
source program translated into target program
(virtual) execution of the target on its VM should represent (have the same behavior as) (virtual) execution of the source on its VM
source is not directly executed
target (object file) is executed or translated later
Definition II
When the source is a symbolic representation of machine language:
source language = __________________
translator = __________________
(When the source is higher-level, the translator is usually called a ____________)
Advantages of Assembly Language

Over machine code (lower level)
easier to remember mnemonic operations than actual opcodes
e.g., ADD, SUB, MUL, DIV, ... vs. 04, 2C, F6, F6, ...
similarly for addresses in program

e.g., BR LOOP1 vs. BR 46554
194
Advantages of Assembly Language II

Over higher-level languages
access to full capabilities of the machine
e.g., testing overflow flag, test-and-set instruction; how would you do that in PASCAL or Modula?
performance?
The Best of Both Worlds

Systems programming is often done in a language like C
syntax of a higher-level (problem-oriented) language
but gives access to low-level machine, like assembly language
An Old Notion, Now Mistaken

If a program will be used a lot, it should (for efficiency) be written in assembly language.
No longer true!
Good compilers
Fast machines
Hard to write

10 lines of code / day, independent of language
Hard to read
high cost of maintenance
can be 2/3 of total
15% (annual) programmer turnover
Modern Approach
Write in high-level language
Analyze to find where time is spent
Invariably, it's a small part of the code
Tune that tiny part for high performance
perhaps by writing in assembly language
Modern Approach II
Higher level can be a performance win too!
problem-oriented language gives problem-level insights
huge performance gains are in algorithmic insights
e.g., $O(n^3)$ vs. $O(n \lg n)$
assembly language programmer tends to be immersed in bit-twiddling (saves small amounts all over, but misses big picture)
Modern Approach III

Conclusion:
assembly language use is often a holdover from when machines were expensive, and people were cheap
201
101
So Why Do We Learn This Stuff?

You may still need to write that tiny, critical part in assembly language
Concepts/techniques similar for compilers
Good vehicle for understanding architecture
Legacy code with large parts in assembly language
This world still needs assemblers!
many compilers translate to assembly language
Assembly Language Instructions

We've seen the basic structure in memory:
OP CODE OPERANDS
Four parts to an instruction in assembly language:

1. Label
2. Operation
3. Operands
4. Comments
Example Instruction
Test BRZ 1,Loop ;if R1=0 goto Loop
label
operation
operands
comment
204
Label Field
Symbolic name for an instruction's or a datum's address
(often, but not always) Clarifies branching to a particular instruction
e.g., BR Loop1
Also allows symbolic access to data
e.g., IO 2,depth
Often severely limited in length
Operation Field
Mnemonic for an instruction
e.g., ADD, SUB, BRZ
Mnemonic for a pseudo-instruction

e.g., NMD
we'll see what these mean later...
Operand Field
Addresses and registers used by instruction
recall: arguments to the function
What to add, where to branch, where to store, ...
Operands for pseudo-instructions
used to give information to the assembler
e.g., program name, how much space to save, ...
Comment Field
No effect on translation
no semantic impact on program
But huge impact on legibility!

clarify the program
strictly for human consumption
Lecture #14
209
105
Example Program
If we want:
N := I + J + K;
we might write (in SPARC assembly language) something like
210
Example Program (Continued)

set I_s, %r2        ! %r2 = I_s
ld [%r2], %r2       ! %r2 = [I_s] = I
set J_s, %r3
ld [%r3], %r3       ! %r3 = J
set K_s, %r4
ld [%r4], %r4       ! %r4 = K
add %r2, %r3, %r2   ! %r2 = I + J
add %r2, %r4, %r2   ! %r2 = I+J+K
set N_s, %r3
st %r2, [%r3]       ! N = I + J + K
Pseudo-Operations
Recall: operation field can be either:
instruction (BR, SHL, ...)
pseudo-op
Unlike operations, do not have a machine instruction (opcode) equivalent
Give information to the assembler itself
assembler directives
SPARC Pseudo-Operations
I_s: .word 0
J_s: .word 0
K_s: .word 5
N_s: .word 0
A_s: .skip 400
Pseudo-Ops: Uses
Four principal uses:
segment definition
symbol definition
memory initialization
storage allocation
Segment Definition
Recall information in header record:
initial execution address
segment name
length
load address
All this information comes from pseudo-ops

(all except ______________ )
Segment Definition II
Two important pseudo-ops:
ORI (origin)
END (end)
MainP ORI 133
      ST 0,136
      . . .
      END 137
What is the header record of the object file? (footprint?)
(133) 85x: ST 0,136
      86x
      89x
Header record: H89MainP_85??
Symbol Definition
A label creates a symbol
Symbol is often implicitly defined to be the address of that instruction and/or data
Hello ORI 133
      ST 0,136
Test  BRZ 1,147
      . . .
What is the value of Test?
Symbol Definition II
(133) 85x ST 0,136
      86x BRZ 1,147
So Test has value: _________
Explicit Symbol Definition

Symbols can also be defined explicitly
Pseudo-op:
EQU (equate)
Example:
ACC EQU 0 ;set ACC to 0
Symbols are used as program constants
Use of Symbols
Example 1: ADD ACC,106
translates as: i.e.:
Example 2:
NOut EQU 2
     IO NOut,Count

translates as: i.e.:
221
111
. . .
Address | Memory | Instruction/Data
B0 | 700A0 |
B1 | 020BA |
B2 | B3800 |
B3 | 92008 |
B4 | B38B4 |
B5 | 000B0 |
B6 | B30B6 |
B7 | 90008 |
B8 | B3000 |
B9 | C00FF |
BA | 546F5 |
. . .
Memory Initialization
Recall Top\n example
Sometimes want to load data into memory
might be able to use corresponding instruction (because machine doesn't care!)
but that is inconvenient (and there may not always be a corresponding instruction)
Two pseudo-ops (for 2 kinds of data):

NMD (numeric data)
CCD (character data)
NMD and CCD

Format for these pseudo-ops:
numeric data is decimal integer
character data is two characters
Example:
Count NMD 10007
Text  CCD He
      CCD ll
      CCD o!
NMD and CCD (II)

These pseudo-ops often have labels
(but not required)
(106) 6Ax: 02717 (i.e., $10007_{10}$)
[Figure: the following three words hold the character pairs "He", "ll", and "o!"]
Count = ________
Text = ________
Lecture #15
226
Storage Allocation
Set aside a block of memory
not initialized (i.e., don't care)
Pseudo-op:
RES (reserve storage)
Example:
X      NMD 0
Buffer RES 100
Y      NMD 0
Yields:
Q: How would we typically address locations in this buffer?
228
Using Blocks of Storage

A:
Example:
NIn EQU 0
    . . .
    IO NIn,Buffer
Note: some pseudo-ops affect the location counter (storage), others don't
do:
don't:
Symbols - Shortcomings
We've seen a lot of utility for symbols
mnemonics for data constants & memory addresses
But they are sometimes inconvenient:

consider initializing a register to 582
i.e., R1 <-- 582
must explicitly name and allocate a constant!

C582  NMD 582
start LD 1,C582
(why not just LDI 1,582 ??)
Symbols - Shortcomings II
Problems with this approach:
___________________ ___________________ ___________________ ___________________
231
116
Alternative: Literals
Implicit allocation & initialization of memory
Allows us to put the value itself right in the assembly language instruction
Preface with =
Example:
LD 1,=582
Literals
This means:
allocate storage at end of program
initialize this storage with the value 582
use this address in the instruction
So it is almost equivalent to:

     LD 1,C582
     . . .
C582 NMD 582
Literals - Restrictions
Must be in the range $-2^{19}$ to $2^{19}-1$
can be represented by 1 word
Can only replace the S field
Cannot be indexed
Cannot use with:
loading an immediate value, branch, store, shift, IO for reading, IO for writing a character
Each restriction has a motivation
Lecture #16
235
118
Big Picture: Labs 2-4

Assembly File (e.g., EG1 ORI 35 / ST 1,Dt)
Assembler
Object File (e.g., H23EG1 2307 / T2321026)
Loader
Executable File
Emulator
Big Picture: Lab 3

Assembly Language Program
EG1  ORI 35
     ...
Buff RES 50
     ...
[Figure: memory footprint, addresses 0 to 255, with the program placed at 35 and a 50-word block reserved for Buff]
Lab 3
Assembler does not need to keep this (potentially huge) array/footprint
Instead, use tables (symbol, literal, ...) and location counter
Generate object file only (much smaller)
Assembler Tasks
1. Parse assembly language instructions
check for syntax
tokenize the character string
2. Assign addresses to instructions and data
maintain location counter (LC)
LC = eventual location in memory of this instruction or data
Assembler Tasks II
3. Generate machine code
evaluate mnemonic instruction
replace with opcode
recognize & translate synthetic instructions (RET, etc.)
evaluate operand subfields
replace symbols & literals with value
concatenate to form instruction
4. Process pseudo-ops
generate header record
evaluate NMD, CCD, etc.
Assembler Tasks III

5. Write object output file
header & text records
6. Write listing output file
Nothing here seems all that hard...
Example
1 Prog  ORI 20
2 Acc   EQU 0
3 Begin LD  Acc,N    ;R0 <- 13
4       ADD Acc,=1   ;R0 <- R0+1
5       ST  Acc,Ans  ;M[Ans]<-R0
6       BR  0,0
7 N     NMD 13
8 Ans   RES 1
9       END Begin
First Attempt
Read each input line and generate machine code Line 1
information for header
but not enough for full header record
we do know:
20
243
122
First Attempt II
Line 2:
information for assembler
symbol Acc set to 0
Line 3:
yeah! An instruction to translate! (LD Acc,N)
yields:
First Attempt - Difficulties

Problem:
Lines 4 & 5:
same problem
do not yet know address for =1 or Ans
Solution:
Now let's see some basic data structures for assemblers...
Machine Op Table
Mnemonic Name | Opcode | Instruction Size | Instruction Format
Static (doesn't change during computation)
For the (simple) abstract machine:
all opcodes are 4 bits
all instructions are same size (i.e., 1 word)
all formats are the same (i.e., O|U2|R|X|U1|S)
Machine Op Table II
But this need not be the case in general
different opcode lengths
common instructions have short opcodes (e.g., 0110)
less common ones are longer (e.g., 1110110)
variable instruction length

e.g., branch relative, where PC <- PC + operand
could be near, operand is 8 bits
could be far, operand is 28 bits (instruction uses 2 words)
Machine Op Table III

varying formats
e.g., long subroutine call operation
syntax might be: CAL 19-bit-address-offset
the fixed format of our abstract machine's instructions makes the assembler's parsing easier
Pseudo Op Table
Mnemonic | Length | Format
Also a static table
Some lengths are 0, others are 1, or variable
Location Counter
Eventual address of this instruction or data
Initialized with __________
Increase with each instruction
see _________________
Increase with (some) pseudo-ops

see __________________
Symbol Table
Name | Value | Other Stuff
Pass #1
each symbol is identified
every time a new symbol is seen (i.e., a label), insert it into the Symbol Table
if it's already there?
Symbol Table II
Pass #1 (continued)
each symbol given a value
explicit assignment (e.g., Acc EQU 0)
easy! just put the value of operand into the table
implicit assignment (e.g., X NMD 13)

must know the address of this instruction
so, keep track of addresses as program is scanned
use location counter (LC)
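A minimal sketch of the Pass #1 symbol handling described above, applied to the nine-line example program (Prog, Acc, Begin, N, Ans). The statement sizes are assumptions for illustration: every instruction and NMD takes 1 word, RES n takes n words, and ORI, EQU, and END take none. The function name `pass_one` and the choice to reject a duplicate label are hypothetical, not the course's specification.

```python
# Pass 1 sketch: build the symbol table with a location counter (LC).
# Assumed sizes (hypothetical): instructions and NMD take 1 word,
# RES n takes n words, ORI/EQU/END take none.

SOURCE = [
    ("Prog",  "ORI", "20"),
    ("Acc",   "EQU", "0"),
    ("Begin", "LD",  "Acc,N"),
    (None,    "ADD", "Acc,=1"),
    (None,    "ST",  "Acc,Ans"),
    (None,    "BR",  "0,0"),
    ("N",     "NMD", "13"),
    ("Ans",   "RES", "1"),
    (None,    "END", "Begin"),
]


def pass_one(lines):
    """Return {symbol: value} for a list of (label, op, operand) tuples."""
    symbols = {}
    lc = 0
    for label, op, operand in lines:
        if op == "ORI":
            # Set the origin; the segment name is not stored as a symbol here.
            lc = int(operand) if operand else 0
            continue
        if op == "EQU":
            value = int(operand)   # explicit assignment: value of the operand
        else:
            value = lc             # implicit assignment: address of this line
        if label is not None:
            if label in symbols:
                raise ValueError(f"duplicate symbol: {label}")
            symbols[label] = value
        if op == "RES":
            lc += int(operand)     # reserve a block of words
        elif op not in ("EQU", "END"):
            lc += 1                # one word per instruction or NMD
    return symbols


table = pass_one(SOURCE)
print(table)
```

Literals such as =1 are ignored in this sketch; they would be collected separately and placed after the last program word.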
253
127
Symbol Table III

Pass #2
symbols in operands replaced with their value
look up symbol in Symbol Table
if there, replace with value
if not there, (Q: how could the symbol possibly not be there?)
Lecture #17
255
128
Literal Table
Name | Address | Value | Size | Other Stuff
Pass #1
literals are identified and placed in the table
name, value, and size fields updated
duplicates can be eliminated
Literal Table II
After Pass #1:
literals are added to the end of the program
address field can now be calculated
Pass #2:
literals in instructions are replaced with the __________ field from the Literal Table
what if the literal is not in the table?
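The literal bookkeeping above can be sketched as follows. This is an illustration under assumptions: each literal occupies one word, operands are comma-separated strings, and `build_literal_table` and `program_end` are hypothetical names (26 is the first free address after the earlier nine-line example, assuming one word per instruction).

```python
def build_literal_table(operands, program_end):
    """Collect '=value' literals, drop duplicates, then assign addresses
    starting at program_end (the first address after the program)."""
    table = {}
    # Pass 1: identify literals and record name and value once each.
    for operand in operands:
        for field in operand.split(","):
            if field.startswith("=") and field not in table:
                table[field] = {"value": int(field[1:]), "address": None}
    # After pass 1: literals go at the end of the program, one word each.
    for offset, entry in enumerate(table.values()):
        entry["address"] = program_end + offset
    return table


# The second "=1" is a duplicate and shares the same table entry.
literals = build_literal_table(
    ["Acc,N", "Acc,=1", "Acc,Ans", "0,0", "1,=1"], program_end=26
)
print(literals)
```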
257
129
Information Flow (Passes 1/2)

Source File -> Pass #1 -> Intermediate File -> Pass #2 -> Object File, Listing File
Tables used by the passes: Symbol Table, Location Counter, Literal Table, Machine Op Table, Pseudo Op Table
Two Pass Assembler: Limitations

Q: Does our 2-pass approach solve all forward-reference problems?
A: no! Something is still broken
Hint:
what is the key invariant (w.r.t. symbols) during pass #1?
Pass #1 Invariant
Top ORI 34 ... Loop --- --... ... S EQU --...
Key invariant:
current position
So what could go wrong?
How could this happen?

260
Forward Reference Restriction

Consider
Y EQU 0
X EQU Y
versus
X EQU Y
Y EQU 0
To avoid this trouble, impose a restriction:
1. Pseudo code for 2-pass assembler
2. In-class exercise: hand assembly
calculate symbol and literal tables
calculate loaded image in memory
calculate object file
(Use list of op codes)
Lecture #18
263
132
Relocation
264
Absolute Programs
Programmer decides a priori where program will reside
e.g., Prog ORI 176
But memory is shared

concurrent users, concurrent jobs/processes
several programs loaded simultaneously
Absolute Programs: Limitation

At any given instant:
Memory occupied
Picture is dynamic
jobs are scheduled jobs complete
We cannot predict what this picture will look like!!

266
Absolute Programs: Limitation II

Would like the loading to be flexible
decide at load time where it goes! (not at ____________ time)
this decision is made by __________________
What the programmer and user want:

find a free slot in memory that is big enough to fit this program
267
134
Motivating Relocation: Example

Prog  ORI 0
X     NMD 32
Reg   EQU 2
Start LD  Reg,X
      ST  Reg,Y
      BR  0,0
Y     RES 1
      END Start
In memory, this program appears as:
One Slight Change

Prog  ORI 58
X     NMD 32
Reg   EQU 2
Start LD  Reg,X
      ST  Reg,Y
      BR  0,0
Y     RES 1
      END Start
Appearance of new program in memory:
Compare the Old and the New
272
What Changed?
Load Address = ________
i.e., ____________ + _____ i.e., ____________ + _____
273
137
Another Slight Change

Prog  ORI 176
X     NMD 32
Reg   EQU 2
Start LD  Reg,X
      ST  Reg,Y
      BR  0,0
Y     RES 1
      END Start
In memory, this program appears as:
Relocation
The loader must update some (parts of) text records, but not all
after load address has been determined
The assembler does 2 things:

assemble with a load address of 0
tell the loader which parts will need to be updated
Modification Records
One approach: define a new record type
Tag Location
For our example:

H01Prog 0005
T0000020
T0102000
T0222004
T03C0000
276
Modification Records II
We could add the following records to the object file:
M01
M02
For some architectures, longer modification records would be required

for our machine, it is always the same part of the instruction that needs to be relocated
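A rough sketch of how a loader might apply the modification records to the example object file. Several details are assumptions made for illustration: a text record is read as `T` + 2 hex digits of address + 5 hex digits of word, the relocated part is taken to be the low 8 bits of the word (the S field), and the function name `load` is hypothetical.

```python
def load(records, load_address):
    """Place text records in memory at load_address and relocate the
    words named by modification records. Returns {address: word}."""
    memory = {}
    to_relocate = []
    for rec in records:
        if rec.startswith("T"):
            addr = int(rec[1:3], 16)   # address, assembled relative to 0
            word = int(rec[3:8], 16)   # 20-bit word as 5 hex digits
            memory[load_address + addr] = word
        elif rec.startswith("M"):
            to_relocate.append(int(rec[1:3], 16))
        # Header records ("H...") are ignored in this sketch.
    for addr in to_relocate:
        word = memory[load_address + addr]
        # Assumed: the address (S) field is the low 8 bits of the word.
        s_field = ((word & 0xFF) + load_address) & 0xFF
        memory[load_address + addr] = (word & ~0xFF) | s_field
    return memory


OBJECT = ["H01Prog 0005", "T0000020", "T0102000", "T0222004", "T03C0000",
          "M01", "M02"]
image = load(OBJECT, load_address=58)
for addr in sorted(image):
    print(f"{addr:02X}: {image[addr]:05X}")
```

Only the LD and ST words change; the NMD data word and the BR 0,0 word are absolute and are copied as assembled.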
277
139
Modification Records III

One disadvantage of this approach:
278
Alternative: Bit Masks

Use 1 bit / memory cell
bit value is 0 means no relocation necessary
bit value is 1 means relocation necessary
Size of relocation data independent of number of records needing modification
Hard to read (debug, grade, ...)
Compromise (For Our Machine)

Change the syntax of a text record
Flag modification with an M at the end of the record
Example
H------------M
T00-----
T01-----M
T02-----M
T03-----
Lecture #19
281
141
Kinds of Data
Our machine has two flavors of data:
relative (to the load address) absolute
The first must be modified, the second not
Let's look at how these kinds arise
Example 2
EG2   ORI
TS    EQU 27        ! TS = 27
V     NMD 27        ! [V] = 27
Start LDI 1,V       ! R1 = V = ?(relative)
      LD  2,0(1)    ! R2 = [0+V] = 27
      LDI 3,TS      ! R3 = TS = 27
      LD  0,0(1)    ! R0 = [0+V] = 27
      LD  1,0(1)    ! R1 = [0+V] = 27
      SUB 1,=27     ! R1 = 27 - 27 = 0
      BRZ 1,Stop    ! if (R1 is 0) then halt
      BR  3,Start   ! goto Start
Stop  BR  0,V(3)    ! halt; dump all
      END
Symbols
Some are relative:
e.g.,
Some are absolute:

e.g.,
Symbol Table
Name Value Relative?
284
Symbols: Rules
A symbol is absolute if and only if it is defined in an EQU by:
_____________, or _____________________
(we'll see another way later)
Literals
Our machine does not have relative literals
Other machines allow a special literal, =*, to mean current location counter
e.g., LD 1,=*
such a literal is relative, others are absolute
Name | Value | Address | Relative?
Star1
=6
Literals
With literals, "relative" refers to the value
the addresses are always relative!
Relocation Information in the Symbol Table: Example

Prog  ORI
X     NMD 32
Reg   EQU 2
Start LD  Reg,X
      ST  Reg,Y
      BR  0,0
Y     RES 1
      END Start
After pass #1, the symbol table is:
Name | Value | Relative?
Convention
To denote a relocatable program, omit the operand of ORI
Prog1 ORI 96 (absolute)
Prog2 ORI (relocatable)
Tables: Storing & Searching
290
Overview
Given: a collection of <tag,value> pairs
e.g., symbol table
Searching =
given a tag, return corresponding value
Q: is this spec ok?
Intentionality of Specification
What do we do if key not in table?
a) return an arbitrary value
b) crash, halt, explode
c) return a special value (NULL, error, ...)
Traditionally, we want (c)
But what if client knows key is in table? Pay for this extra checking with each call to search?
The defensiveness dilemma; maybe options a) and b) look better for production if checking components are available for development.
The point: intentionally decide what your specification is, and document your decision.
Relevance for Assemblers

Every line (almost) has an instruction or a pseudo op
search tables
Can have lots of symbols and literals

need to create and search tables
Assemblers can spend 50% of the time searching tables!!

so, it's important this be efficient
Linear Search
Algorithm:
compare target with 1st key
if match, then done (return value)
else, compare target with 2nd key
if match, then done (return value)
Advantages:
294
Linear Search - Complexity

How long to insert a new <tag,value> pair?
How long to find a target?

best case: worst case: average case:
Average case assumes a distribution

$T_{avg} = \sum_i p(i)\, t(i)$
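As a quick numeric illustration of the formula above, the sketch below takes $t(i) = i$, that is, it assumes finding the key in position i of a linear table costs i comparisons. The function name `average_comparisons` and the two example distributions are made up for illustration.

```python
def average_comparisons(probabilities):
    """T_avg = sum over i of p(i) * t(i), with t(i) = i comparisons
    needed to reach the key stored in position i (1-based)."""
    return sum(p * i for i, p in enumerate(probabilities, start=1))


n = 100
uniform = [1 / n] * n                        # every key equally likely
skewed = [0.5] + [0.5 / (n - 1)] * (n - 1)   # first key sought half the time

print(average_comparisons(uniform))
print(average_comparisons(skewed))
```

The two results differ only because of the distribution, which is why the average case cannot be stated without assuming one.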
295
148
Linear Search - Complexity II

E.g., for a distribution where the probability of seeking a key not present in the table is zero and is equal for all other keys (a uniform distribution),
$T_{avg} =$
One way to lower average case complexity:

____________________________________
296
Linear Search - Complexity III

Overall search complexity is ( )
i.e., double table size
Time
Table Size
297
149
Lecture #20
298
Binary Search
Algorithm to search among 2 or more items
compare target with middle key
if target ≤ middle key, then search first half
if target > middle key, then search second half
This algorithm requires:

a) b) c)
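For reference, one common iterative form of binary search over a table of (key, value) pairs kept sorted by key. This is a sketch, not necessarily the variant used in lecture: it tests for equality at each step and returns None for a missing key (option c from the specification discussion); the name `binary_search` and the sample table are illustrative.

```python
def binary_search(table, target):
    """Return the value stored under target in a list of (key, value)
    pairs sorted by key, or None if the key is absent."""
    low, high = 0, len(table) - 1
    while low <= high:
        mid = (low + high) // 2
        key, value = table[mid]
        if key == target:
            return value
        if target < key:
            high = mid - 1     # continue in the first half
        else:
            low = mid + 1      # continue in the second half
    return None


# Keys must already be in sorted order for the halving to be valid.
symbols = [("Acc", 0), ("Ans", 25), ("Begin", 20), ("N", 24)]
print(binary_search(symbols, "Begin"))
print(binary_search(symbols, "Zed"))
```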
299
150
Binary Search - Complexity

Each iteration divides the problem in half
T(n) =
Time
Table Size
300
Binary Search - Complexity II

Example:
table with 1,000,000 entries: _________
table with 2,000,000 entries: _________
Drawback: simple insertion is linear

[Figure: a new entry inserted between two sorted runs of the table]
time to build =
301
151
Building Sorted Tables

Solution #1: use a heap
insertion is faster, $O(\lg n)$
but more complicated algorithm
Solution #2: build, then sort

in assembler, pass #1 does mostly insertions, then pass #2 does mostly searches
create as a linear table ( )
sort ( )
use binary search on sorted table
302
Estimated Search
How do we search for things?
e.g., finding Brutus in phone book
binary search? No!
Make a guess
For table search, must know ____________

[Figure: keys from AAAA to zzzz mapped onto table indices 0 to N-1]
303
152
Hash Tables - Overview

Combines:
strength of binary search (fast _________)
strength of linear search (fast _________)
Hash function: converts keys to integers

$h: K \to Z_{N-1}$
where: $Z_{N-1} = \{0, 1, 2, \ldots, N-1\}$
N =
K =
Hash Functions
Used to insert and to search
insert(key) --> into h(key)
search(key) --> look in h(key)
Ideal: generate a unique integer for each key

but this ideal does not (usually) exist
because:
So what we really look for in h:

returns a number in [0..N-1] with uniform distribution
305
153
Designing Hash Functions

[Figure: probability distributions over keys (AAAA to zzzz) and over h(Key) (0 to N-1)]
Want: small differences in key result in big differences in h(key)

like a deterministic random number generator
306
Example Hash Function

$h(\text{key}) = \left(\sum \text{letter values}\right) \bmod 50$
For example:
$h(\text{magenta}) = (13 + 1 + 7 + 5 + 14 + 20 + 1) \bmod 50$
But two different keys could be mapped to same integer

$h(\text{rub}) = (18 + 21 + 2) \bmod 50 =$
$h(\text{madre}) = (13 + 1 + 4 + 18 + 5) \bmod 50 =$
This is called a collision
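The example hash function can be tried directly. A small sketch, assuming keys contain only the letters a to z (with a = 1, ..., z = 26); the function name `letter_hash` is illustrative.

```python
def letter_hash(key, table_size=50):
    """Sum the letter values (a=1, ..., z=26) and reduce mod table_size.
    Assumes the key contains only letters a-z (any case)."""
    return sum(ord(c) - ord("a") + 1 for c in key.lower()) % table_size


for word in ("magenta", "rub", "madre"):
    print(word, letter_hash(word))
```

Because the sum ignores letter order, any two keys whose letters add up to the same total land in the same slot.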
307
154
Dealing with Collisions

a) Keep a linked list: chaining
[Figure: colliding keys linked in a list off one cell]
b) Go to next open cell: open addressing

[Figure: result of an insert under open addressing]
Dealing with Collisions II

this approach can lead to clustering
big blocks of occupied slots
c) Rehash with another function: open addressing with quadratic probing or double hashing (to approx. uniform hashing)
cost: ___________________
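A small sketch of approach b), open addressing where a collision moves on to the next open cell (often called linear probing). The table is a fixed-size list of slots; `insert` reports how many attempts it needed so the effect of a collision is visible. All names here are illustrative, and deletion is not handled.

```python
def insert(table, key, value, h):
    """Insert into an open-addressing table; return the attempts used."""
    n = len(table)
    start = h(key) % n
    for attempt in range(n):
        slot = (start + attempt) % n    # next cell, wrapping around
        if table[slot] is None or table[slot][0] == key:
            table[slot] = (key, value)
            return attempt + 1
    raise RuntimeError("table is full")


def search(table, key, h):
    """Follow the same probe sequence; an empty cell means 'not present'."""
    n = len(table)
    start = h(key) % n
    for attempt in range(n):
        slot = (start + attempt) % n
        if table[slot] is None:
            return None
        if table[slot][0] == key:
            return table[slot][1]
    return None


def letter_sum(key):
    """Letter-value sum (a=1, ..., z=26), as in the example hash function."""
    return sum(ord(c) - ord("a") + 1 for c in key.lower())


table = [None] * 50
first = insert(table, "rub", 1, letter_sum)
second = insert(table, "madre", 2, letter_sum)   # collides with "rub"
print(first, second, search(table, "madre", letter_sum))
```

Keys that collide end up in neighboring cells, which is how the big blocks of occupied slots form.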
155
Complexity - Insertion
For approach c) (uniform hashing), how long does an insertion take?
#attempts depends on _________________
e.g., first entry never collides (1 attempt)
Let r be the fraction of the rep. array that is full

probability of a collision =
probability of success =
310
Insertion II
Let p(t) = prob. insertion requires t attempts
The expected number of attempts is given by:
$\sum_{i=1}^{\infty} p(i)\, i$
Example: if a rep. array is half full, how many attempts does the next insertion take?
311
156
Complexity - Building a Table

Example:
array size = 1000, wish to insert 900 elements
how long does this take?
first item: ______ (so a lower bound: _______)
901st item: _____ (so an upper bound: ______)
Problem: each insertion takes more time
312
Building a Table II
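$A = \int_0^X \frac{1}{1-r}\,dr = \ldots =$ _________
[Figure: # Attempts plotted against r, from r = 0 to r = X; A is the area under the curve]
In our example: X = .9; array size = 1000
E(# of insertion attempts) $\approx 2.303 \times 1000$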
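The estimate can be checked numerically. The sketch below assumes uniform hashing, so that an insertion into an array with fill fraction r succeeds on each attempt with probability 1 - r and therefore needs 1/(1 - r) attempts on average; it compares the exact sum over 900 insertions with the integral form. Both function names are hypothetical.

```python
import math


def exact_expected_attempts(n_items, array_size):
    """Sum 1 / (1 - r) over the insertions, where r = k / array_size is
    the fill fraction just before the (k+1)-th insertion."""
    return sum(1.0 / (1.0 - k / array_size) for k in range(n_items))


def integral_estimate(n_items, array_size):
    """array_size * ln(1 / (1 - X)), with X the final fill fraction."""
    x = n_items / array_size
    return array_size * math.log(1.0 / (1.0 - x))


print(exact_expected_attempts(900, 1000))
print(integral_estimate(900, 1000))
```

The integral treats r as continuous, so the two values agree closely but not exactly.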
313
157
Searching a (Hash) Table

On average, how many attempts needed (for both insertion and search(!))?
$V \cdot X = A = \ln\frac{1}{1-X}$
$V = \frac{1}{X}\ln\frac{1}{1-X}$
[Figure: # Attempts plotted against r from 0 to X; V is the average height of the curve]
Lecture #21
315
158
Make
316
Prerequisite DAG
Large project: mix of files generated by people and by tools
Contents of one file often depend on contents of some others (acyclic structure)
[Figure: prerequisite DAG leading from human-written files through machine-generated files to the final product(s); the edges are actions]
Advantages of Make Tool

Automates the whole creation process
Describe DAG and actions once, then build entire structure with one command
Permits distribution of partial DAG to client
Automates the partial creation process
Identify subDAGs that are out of date and need to be rebuilt, and invoke corresponding actions
318
Example Applications
Latex documents
tex, bib, bbl, dvi, ps
Report generation and filtering
Compiled and linked code, object files, and executables
cpp, h, o, a, exe
(note: there can be more than 1 final product)

319
160
Example Prerequisite DAG
320
Makefile
DAG is represented in a makefile
Rule:
target: prerequisites
	command
	command
Notes:
command lines begin with tab
line continuation with \
Comment lines begin with #
321
161
Processing Makefiles
Default: first target is the "final product"
Rule: if any prerequisite is newer than target (or target does not exist), then execute associated commands
But first (and in any case): ensure all prerequisites are up to date!
recurse to rule that has prerequisite as a target
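The processing rule above can be modeled in a few lines. This is a toy sketch, not how make itself is implemented: file timestamps are plain integers in a dictionary, commands are only logged rather than executed, and phony targets are not handled. The names `build`, `rules`, `mtime`, and `clock` are hypothetical.

```python
def build(target, rules, mtime, clock, log):
    """Bring target up to date.

    rules maps a target to (prerequisites, command); mtime maps an
    existing file to its timestamp; clock is a one-element list holding
    the current time; log collects the commands that would be run.
    """
    if target not in rules:
        return                                  # a source file: nothing to do
    prereqs, command = rules[target]
    for p in prereqs:
        build(p, rules, mtime, clock, log)      # first: update prerequisites
    stale = target not in mtime or any(mtime[p] > mtime[target] for p in prereqs)
    if stale:
        log.append(command)                     # "execute" the command
        clock[0] += 1
        mtime[target] = clock[0]                # target is now the newest file


rules = {
    "prog":   (["prog.o"], "cc -o prog prog.o"),
    "prog.o": (["prog.c", "defs.h"], "cc -c prog.c"),
}
mtime = {"prog.c": 5, "defs.h": 1, "prog.o": 3, "prog": 4}  # prog.c was edited
clock = [10]

first_run = []
build("prog", rules, mtime, clock, first_run)
second_run = []
build("prog", rules, mtime, clock, second_run)
print(first_run, second_run)
```

The second call logs nothing because every target is then newer than its prerequisites.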
322
Special Rules Phony Targets

Targets without prerequisites
Always out of date (commands executed)
clean:
	rm -f *.o edit
Rules with no commands

Forces recursion to update prerequisites
all: gui edit doc.ps
323
162
Variables
Frequently used strings can be replaced by variables
Defined with = and referenced with $
Example
CC = g++
CFLAGS = -g
prog: prog.c defs.h
	$(CC) -o prog prog.c $(CFLAGS)
324
Implicit Rules
Describe when and how to remake files based on their name (extension)
E.g., <file>.o depends on <file>.c
The associated command is cc -c <file>.c
Omit the rule entirely

Prerequisite and command implicitly provided
Omit only the command

Command implicitly provided
325
163
Explicit Pattern Rules

Contains one % character in target
Matches any non-empty string
E.g., %.dvi : %.tex
Q: How can we write the command associated with such a rule?
A: Automatic variables
%.dvi : %.tex
	latex $<
	latex $<
326
Automatic Variables
Not standard between make tools
Gnu (gmake):
$@ - target filename
$< - name of first prerequisite
$? - names of all prereqs newer than target
$^ - names of all prerequisites
Not restricted to pattern rules

327
164
To Read More About Make

make: man make
gmake: In xemacs:
^U^Hi/usr/local/info/make.info
328
Lecture #22
Review for Midterm
329
165
Lecture #23
Midterm
330
Lecture #24
331
166
Expressions
332
Introduction
Most assemblers permit use of expressions
Used as instruction operands
in machine ops and pseudo ops
Typically simple mathematical forms only
Two parts:

operators (+, -, *, /)
individual terms
333
167
Introduction II
Individual terms may be:
constants (e.g., 4, A, 0x3F)
user-defined symbols (e.g., X, Buff)
special terms (e.g., * for LC)
parenthesized expressions (e.g., (X-Z) in (X-Z)/2)
Examples
Buff RES 4
     ST 2,Buff
     ST 2,Buff+1

Buff RES 4
BEnd EQU *
Len  EQU BEnd-Buff
334
Relocation
Expressions are evaluated at ____________
  (not entirely true, as we'll see later)
Expressions can be relative or absolute or illegal
Intuition:
  the value of an absolute expression _________ __________ with program relocation
What are the rules for well-formed expressions?
Absolute Expressions
An expression is absolute iff:
  1. it contains only absolute terms, OR
  2. it contains relative terms provided:
    i) they occur in summation pairs, AND
    ii) terms in each pair have opposite sign, AND
    iii) relative terms do not enter in * or /
Examples of 1:
  3+4+0x2
  2*X where:
Examples of 2:
  Buff2-Buff

Relative Expressions
An expression is relative iff:
  1. all relative terms can be paired as above, except one, AND
  2. that remaining unpaired term is positive, AND
  3. relative terms do not enter into * or /
Examples:
  Buff+6
Motivation
These restrictions are not arbitrary
They ensure the expression is meaningful after relocation
If the restrictions are not met, the expression is erroneous

Examples
  Name  Value  R/A
  X     16     R
  Z     6      R
  Y     4      A

  X+1-Z =
  2+X/Y =
  Z+X =
  Z-Y =
  Y-Z =
  (X-Z)/2 =
  ((X-Y)-(Z+Y))*Y =
  (X/2) - (Z/2) = (X-Z)/2 =
Generalization
A relative value has the form:
  LL + OFFSET
  (LL is the Load Location; OFFSET is indep't of Load Location, i.e., absolute)
An absolute value has the form:
  A
  (indep't of Load Location)

Generalization - Examples
  R1 - R2 = (LL + OFF1) - (LL + OFF2) =
  R1 - A = (LL + OFF1) - A =
  A - R1 = A - (LL + OFF1) =
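One way to experiment with these rules is to carry the multiple of LL along with each value. The sketch below is an illustration only: the class name `Term`, the function `classify`, and the choice to treat a parenthesized subexpression whose relative terms cancel as absolute (so that it may then enter * or /) are assumptions, the last being just one reading of the rule that relative terms do not enter into * or /.

```python
class Term:
    """A value of the form rel*LL + off, where LL is the unknown load location."""

    def __init__(self, off, rel=0):
        self.off = off      # the part that is independent of LL
        self.rel = rel      # how many times LL is added in

    def __add__(self, other):
        return Term(self.off + other.off, self.rel + other.rel)

    def __sub__(self, other):
        return Term(self.off - other.off, self.rel - other.rel)

    def __mul__(self, other):
        if self.rel or other.rel:
            raise ValueError("relative term enters into *")
        return Term(self.off * other.off)

    def __floordiv__(self, other):
        if self.rel or other.rel:
            raise ValueError("relative term enters into /")
        return Term(self.off // other.off)


def classify(t):
    """0 copies of LL: absolute; exactly +1 copy: relative; anything else: illegal."""
    if t.rel == 0:
        return "absolute"
    if t.rel == 1:
        return "relative"
    return "illegal"


# Symbols from the Examples slide: X and Z are relative, Y is absolute.
X, Z, Y = Term(16, rel=1), Term(6, rel=1), Term(4)
print(classify(X + Term(1) - Z))   # (LL+16) + 1 - (LL+6): the LLs cancel
print(classify(Z - Y))             # one unpaired positive relative term
print(classify(Z + X))             # 2*LL + 22
print(classify(Y - Z))             # -LL - 2
```

In this sketch `//` stands in for the assembler's `/`, and an expression such as `Term(2) + X // Y` raises `ValueError` because a relative term enters a division.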
Lecture #25
Loaders
What, Again?
So you ask:
  Haven't we done this already? I built one in Lab #2!
Our view then was simplistic:
  Source Program -> Compiler -> Object File -> Loader -> Memory
But there are problems with this view

Problems
Programmer is responsible for putting absolute addresses in code
  error-prone
Programmer decides where object code goes in memory
  better to give OS control
  allows multiple concurrent jobs & users (dynamic scheduling)
Problems II
Program must be self-contained
  would prefer to allow separate assembly
    make one change, do not have to recompile the whole thing
  would prefer to use libraries
    for frequently used code
    functions used by many applications (e.g., strcpy(), sqrt(), sin(), ...)

Problems III
  would prefer to have the flexibility to write different parts in different source languages
    some languages are better suited to certain tasks than others
    library functions could be written in a single language (rather than rewriting for every possible source language)
More General View of Compilation and Loading
Source Files -> Translator -> Object Files
  C Program (.c) -> C Compiler -> .o
  Fortran Program (.f) -> Fortran Compiler -> .o
  Assembly Program (.s) -> Assembler -> .o
Object Files (.o) + Library Files (.a) -> Loader -> Memory

General Loaders
This requires standardizing the format of the object file
Each source language translator then follows this standard

Summary of Advantages
Needn't worry about address arithmetic
More than 1 program in memory at once
Assemble code once
Separate assembly
Libraries
Multiple languages
Steps Before Execution

1. Translation
produce object file from source
2. Allocation
select area in memory for program
3. Relocation
adjust address references in object file
4. Linking
combine multiple object modules
5. Loading
Types of Loaders
Different loaders differ with respect to how these tasks are accomplished
Types include
  compile-and-go
  absolute
  relocating
  linking
  dynamic loading
  dynamic linking

Compile-and-Go
Observation: an executing translator is, itself, a process governed by a program residing in memory!
Reserve memory at the end of its block
As source is compiled, object code is placed directly into this reserved memory
  loader is really just part of the compiler
  example: WATFOR Fortran

Picture
Figure: the Translator reads the Source Program and places the object code directly in Memory

Advantages / Disadvantages
Advantages:
  speed: need not produce intermediate file (I/O is always slow)
  batch environments: compiler remains resident in memory (so very low start-up cost)
Disadvantages:
  must recompile every time you run
    no object file produced
  libraries likely to be source-based
  requires more memory (program + compiler)
Absolute Loaders
Familiar loader from Lab #2
Consider responsibility for each task:
  allocation:
    calc. length of all modules _______
    calc. actual load location ________
  relocation: ______________
  linking: ________________
  loading: ________________

Absolute Loaders II
Advantages:
  simple, fast, small, programmer-controlled
Disadvantages:
  program must be self-contained: programmer must edit library subroutines into one assembly language source file
  to run at a different memory location, reassembly is required

Special Case
Observation: any loader is a program
  i.e., resides in memory; executes as a process
Q: What loads the loader?
  the OS? then what loads the OS?
From an idle machine, we need a way to start things up
Solution: a bootstrap loader

Bootstrap Loader
Store a special program in ROM
This program is automatically executed at power-up
This program is an absolute loader
  reads records from an input device
  puts them in a predetermined (absolute) location
Control is then transferred to loaded program (which can load other things, etc.)
Lecture #26
Relocating Loader
We need the assembler to do 2 things:
  flag relative values (e.g., with modifn records)
  produce size-of-segment information
  machine code
Loader works with OS to determine load location (dynamic)

Relocating Loader II
Loader performs relocation
  adds LL to all relative values
  can be done in a single pass
Advantage:
  more efficient packing of memory
Disadvantage:
  no external subroutines or libraries
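The single relocation pass can be sketched as follows. This is a simplified illustration rather than the lab's loader: it assumes each word is just an address-sized integer (ignoring opcode and register fields), that the assembler's flags arrive as a list of booleans, and that memory is a Python dictionary; the names `relocate` and `load` are made up for the example.

```python
def relocate(words, is_relative, load_location):
    """Single pass: add the load location to every word flagged as relative."""
    return [w + load_location if rel else w
            for w, rel in zip(words, is_relative)]


def load(memory, words, is_relative, load_location):
    """Relocate the segment and copy it into memory starting at load_location."""
    for offset, word in enumerate(relocate(words, is_relative, load_location)):
        memory[load_location + offset] = word


# A three-word segment: two relative address values and one absolute constant.
memory = {}
load(memory, [0x05, 0x02, 7], [True, True, False], load_location=0x60)
print({hex(addr): hex(word) for addr, word in memory.items()})
```

Because every flagged word only needs LL added to it, no symbol information is required, which is also why this kind of loader cannot resolve external subroutines.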
Time for a slight detour, as a motivational aside for remaining loader types
Subroutine Linkage
Motivation
Example: want to calculate a square root
  first write our program (in assembly)
  now how do we use this code?
Idea #1: embed right in the code of main
  no code reuse
  program gets big and hard to read
  error-prone, tedious, ...

Motivation II
Idea #2: have a separate section
        ORI
        ...
  Sqrt  LD   0,=0
        SHR  . . .
        etc.
From main we need to jump to this code

Branching
Want to branch:
  to Sqrt, and
  (after we're done) return to caller
So, 2 simple branches don't work
  first branch is no problem (BR Sqrt)
  but what does the return look like???
Solution: for our machine, we can use
Branch-to-Subroutine
I.e., BRS R,S(X)
Example: 3Fx : BRS 1,Sqrt
  loads the PC into register R
  branches to location Sqrt
Q: what is the new value in register 1?
A:

Returning from a Subroutine
How do we return to the caller?
Want: PC <-- R
In our machine, we can use: __________
At the end of Sqrt, we have
  BR 3,
(Aside: our synthetic instruction, RET, is a more direct way of expressing this)

Using the Sqrt Subroutine
  Prog   ORI
  Soln   RES  1
         ...
         LD   1,=625
         BRS  3,Sqrt
         ST   1,Soln
         BR   0,1
  Sqrt   ST   1,Value
         SHR  ...
         BR   3,
  Value  RES  1
         END
Calling Conventions
Program and subroutine must agree on:
  where to branch for function
  where to return when done
  where to put/get argument(s)
  where to put/get result(s)
In previous example:
  return address in register 3
  argument in register 1
  result in register 1

Calling Conventions II
So conventions are required
  e.g., caller always places return address in register 3
    if function uses r3, must save value first
  caller pushes return address onto stack
  caller stores return address in first word of subr.
These conventions are generally not checked at assembly time
  contrast this with higher-level languages

All this talk of subroutines is great
But we're still missing...

Separate Compilation
Would like to have in our program the line:
  BRS 3,Sqrt
where Sqrt is a label in a different program!
(Aside: what does your current assembler do with such a thing?)
We extend our language and provide a (typical) mechanism for resolving this...
Lecture #27
New Pseudo Op: EXT
1. EXT symbol_name
  external: indicates that symbol_name is defined in a different program
  legal to use, but assembler can't fill it in
  Prog  ORI
        EXT  Sqrt
        BRS  3,Sqrt
        END

New Pseudo Op: ENT
2. ENT symbol_name
  symbol_name is defined in this program
  it is a global symbol
    i.e., may be referenced in other programs
These pseudo ops change the scope of a symbol
  local (default) -----> global (with ENT)
Q: why not make global the default, or make them all global?

Example: Two Programs
  Main  ORI
        EXT   Sqrt
        ...
        CALL  Sqrt
        ...
        END

  Subr  ORI
        ENT   Sqrt
        ...
  Sqrt  SHR   1,1
        ...
        RETURN
        END

These two programs can now be:
  Independently written
  Independently assembled into 2 object files
How does the linkage between these object files get resolved?
So now back to loaders
Binary Symbolic Subroutine Loaders (BSS)
One of the first relocating loaders
  1956: IBM, GE, UNIVAC
Allows multiple program segments (control sections in your textbook)
  different languages
  different times
Separate compilation!
Let's examine each of the tasks in turn

Tasks for BSS Loader
Allocation
  assembler calculates each segment length
  loader adds them all up
  load location obtained from OS
Relocation
  assembler flags words for relocation (bit masks)
  loader makes modifications

Tasks for BSS Loader II
Linking
  by loader (with help from assembler)
  a restricted form
  uses a transfer vector
Loading
  by loader
Transfer Vector
Contains 1 entry per external symbol used by this program segment
Assembler sets aside room at the beginning of the object file for TV
Assembler places symbolic representation of referenced external symbols in TV
Figure: object file layout, with the Transfer Vector (holding the entry sqrt) ahead of the Program

Transfer Vector II
Assembler replaces all calls to external symbols with calls to appropriate locations in TV
Loader replaces the entries in TV with calls to the appropriate location
Figure (three stages):
  Source: ORI / EXT Sqrt / ... / CALL Sqrt
  After Assembler (addresses 0, 6, 7): CALL 6 (relative); TV entry holds Sqrt
  After BSS Loader (addresses 60, 66, 67): CALL 66; TV entry holds CALL 32; the subroutine ends with RETURN

Disadvantages of Transfer Vector
Overhead
  time (extra call instruction)
  space (for transfer vector in object file)
Works for subroutine calls, but what about for sharing data?
  e.g., LD 1,XValue
  this cannot be replaced with a call to TV, or even a load (with memory direct addressing) from the TV
Data Sharing with BSS Loaders
Permit 1 common (shared) data segment
All external data is in this one DS
Figure: a data segment (DS) containing X, and a code segment (CS) containing LD 1,X
Assembler replaces X with its (relative) address in the data segment

Impact on Relocation
What does this mean for relocating the program?
There are 2 different kinds of relative
Assembler must distinguish them
  extend relocation information
  e.g., use 2 bits per word
    00 - absolute
    01 - relative (to CS load location)
    10 - relative (to DS load location)
Lecture #28
Direct Linking Loaders
Introduction
General linking/loading strategy
Very common in modern systems
And used in Lab #4!
Advantages:
  separate assembly
  multiple control and data segments
  lower time overhead (in program execution)
  lower space overhead (in run-time footprint)

Assembler Responsibilities
1. Header information
  length of segment
  execution start address
2. List of entry symbols
  those defined in this segment
  gives their (relative) value
3. Mark each reference to an external symbol
  used in this segment
  defined outside of segment

Assembler Responsibilities II
4. Relocation information
  modification records
5. Machine code
  text records
There are some new things here, which suggests defining some new record types
Entry Record
List and define all the entry symbols
Possible format:
  <Flag> <symbol_name> <value>
Examples:
  ESqrt 0E (possible because symbols have fewer than 7 characters)
  ESqrt=0E
Idea: provide information to loader

For Lab #4
We'll adopt the following conventions:
  A program's name is always (implicitly) an entry symbol
  Entry symbols must be relative
    (If you wish to handle absolute, too, that's up to you)
External Record
Can be combined with text and modification records if you wish Examples:
LD 1,9 LD 1,Num LD 1,Enum 1E T1F01009 T1F01002M T1F01000XEnum T1F01000XE
Format:
T <addr> <machine_code> X <symbol_name>
External Records II
Such a record tells the loader to:
  find seg. that defines that (external) symbol
  find the value of that symbol within that seg. (i.e., look at the corresponding entry record!)
  add this value to the one in the text record
  add the load location of the seg. that defines the symbol to the text record
    this last step is just like the usual relocation operation of relative symbols, but using the LL of the segment that defines the symbol
Lecture #29
Direct Linking Loaders: Algorithm and Data Structures
Algorithm & Data Structures
Problem is similar to assembler:
  resolve symbols (with forward references)
  use these values
Solution is similar too!
  use _______________
Pass #1: find definition of all external symbols
Pass #2: aggregate, relocate, link, load

Pass #1
Q. What does the assembler tell the loader about each ENT symbol?
A.
So, to determine the actual symbol value, loader must calculate:
  ______________ + ______________
For lab #4, we can load the segments into one contiguous block of memory

Loaded Memory
Figure: Seg. 1, Seg. 2, and Seg. 3 loaded contiguously, starting at the PLA (program load address)
Pass #1: Pseudocode
calculate total size (all segments)
get PLA (from OS)
SLA <- PLA
for each segment do:
    if (it's Main Seg.) {IPC = SLA + Header's IPC}
    add entry symbols with their absolute values (stated value + SLA) to external symbol table (EST)
        (if symbol already present, flag an error)
    calculate next SLA (SLA += Seg. Len.)
rof
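The pass 1 pseudocode translates fairly directly into Python. What follows is a sketch under stated assumptions, not the Lab #4 solution: the segment dictionaries (with hypothetical keys `name`, `length`, `entries`, `main`, `start`) stand in for the header and entry records, each instruction is assumed to occupy one word, and the PLA of 0x40 is chosen arbitrarily.

```python
def pass1(segments, pla):
    """Build the external symbol table (EST) and compute the initial PC."""
    est, ipc, sla = {}, None, pla
    for seg in segments:
        if seg.get("main"):
            ipc = sla + seg["start"]          # header's start address, relocated
        # Lab convention: a segment's name is implicitly an entry symbol (value 0).
        entries = {seg["name"]: 0, **seg["entries"]}
        for symbol, relative_value in entries.items():
            if symbol in est:
                raise ValueError(f"entry symbol {symbol} defined twice")
            est[symbol] = sla + relative_value  # stated value + SLA
        sla += seg["length"]                    # next segment load address
    return est, ipc


# Hypothetical header/entry information for two small segments.
segments = [
    {"name": "Main", "length": 3, "entries": {"Num": 2}, "main": True, "start": 0},
    {"name": "Lib", "length": 2, "entries": {"Pnum": 0}},
]
est, ipc = pass1(segments, pla=0x40)
print({name: hex(value) for name, value in est.items()}, hex(ipc))
```

Using a dictionary for the EST makes the "already present" check a simple membership test; any table with lookup by name would serve equally well.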
Example
  Main  ORI
        EXT  Pnum
        ENT  Num
        BRS  3,Pnum
        BR   0,0(0)
  Num   NMD  7
        END

  Lib   ORI
        EXT  Num
        ENT  Pnum
  Pnum  IO   2,Num
        BR   3,0(3)
        END

External Symbol Table
For our example:
  Name    Value

(assume: ______________________ )
Note: for lab #4, can restrict external symbols to be relative only
Pass #2: Pseudocode
SLA <- PLA
for each Seg. in same order as pass 1
    for each text record in Seg.
        calculate memory location
        relocate record:
            absolute
            relative
            external
        load the word
    rof
    SLA += Seg. size
rof
transfer control to IPC, start of Main segment
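Pass 2 can be sketched in the same style. Again this is only an illustration under assumptions: each text record is modeled as a tuple `(address, word, kind, symbol)` where `kind` is `"A"` (absolute), `"M"` (relative) or `"X"` (external), the word is treated as a bare address field, memory is a dictionary, and the EST is taken as given (for example from a pass 1 run with a PLA of 0x40).

```python
def pass2(segments, est, pla, memory):
    """Relocate, link, and load every text record, segment by segment."""
    sla = pla
    for seg in segments:                       # same order as pass 1
        for address, word, kind, symbol in seg["text"]:
            if kind == "M":                    # relative: add this segment's LL
                word += sla
            elif kind == "X":                  # external: add the symbol's absolute value
                if symbol not in est:
                    raise KeyError(f"undefined external symbol {symbol}")
                word += est[symbol]
            memory[sla + address] = word       # load the word
        sla += seg["length"]
    return memory


# Hypothetical text records for two small segments.
segments = [
    {"length": 3, "text": [(0, 0, "X", "Pnum"), (1, 0, "A", None), (2, 7, "A", None)]},
    {"length": 2, "text": [(0, 0, "X", "Num"), (1, 0, "A", None)]},
]
est = {"Main": 0x40, "Num": 0x42, "Lib": 0x43, "Pnum": 0x43}
memory = pass2(segments, est, pla=0x40, memory={})
print({hex(a): hex(w) for a, w in sorted(memory.items())})
```

Checking `symbol not in est` while processing each record reports an undefined external exactly when it is used, at no extra cost when there are no errors.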
Recommended exercise: assemble, link, and load our example (assuming PLA of ________ )

Lecture #30
Checking External Symbols
Q. When do we check whether an external symbol is actually defined?
A. In addition to entry records at the top of the object file, there might be special external records, too, for EXT symbols declared.
We may be tempted to rely on the "fact" that the external symbols used are a subset of the EXT symbols declared; we could try to gain speed with a conservative approach that complains when no definition is found for a declared EXT symbol, regardless of whether it's used. (Look only at entry and special external records for this check.) Could be done after pass 1 or in pass 2.

Checking External Symbols II
However, such a strategy is not robust; the "fact" is only surely true in files produced by the assembler.
Good news: robustness and liberal accuracy can both be achieved here without (in the case of no errors to report) sacrificing speed.
During pass 2, when each text record is processed, report an error if the symbol is not in the EST.
Unifying X and M
Don't really need 2 separate mechanisms!
Recall meanings of
  T_______M
  T_______XSym
    Sym is in EST
    add this value of Sym to address field
Recall that segment names always in EST
This suggests that X can be seen as a more general form of M!

Replacing M with X
  Prog  ORI
        ...
  Loop  - -
        ...
        BR   3,Loop
        ...
  T 05 C3002 M
  or
  T 05 C3002 XProg
Linking with Libraries
Common funcs often defined in libraries
Library linking can be made implicit:
  after pass 1, may still have unresolved external symbols: symbols from EXT declarations that are not (yet) in EST
  if so, search libraries for matching definitions and load them (after pass 1?)
  still some unresolved externals? Then error
Typically, user-specified libraries searched first, then standard ones (automatically)

Lecture #31

Loader Refinements and Optimizations

Problem: Space
Consider a program that calls sqrt, rnd, and substr
Each defined in its own (large) library
So, linked and loaded program is huge
Solutions (for saving memory):
  virtual memory and paging
  dynamic loading
  dynamic linking
Dynamic Loading
Observe: program does 1 thing at a time
  don't need all segments present simultaneously
Example
Figure: tree of segments and their sizes
  A (200)
    B (500)
      D (300)
    C (300)
      E (200)
      F (400)
Total Size = 1.9 Mb
Dynamic Loading - Overlays
B/D never together with C/E/F
Define an overlay structure for how segments can be swapped in and out
3 scenarios:
  A (200) + B (500) + D (300): Total Size = 1000
  A (200) + C (300) + E (200): Total Size = 700
  A (200) + C (300) + F (400): Total Size = 900
Only 1Mb needed (length of longest path)
Trade-off: memory space & time
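The memory requirement of an overlay structure is the length of the longest root-to-leaf path, which is easy to check mechanically. The sketch below uses the segment sizes from the overlay example; the dictionary layout and the function names are assumptions made for illustration.

```python
# Overlay tree: segment -> (size, segments that can be loaded beneath it).
TREE = {
    "A": (200, ["B", "C"]),
    "B": (500, ["D"]),
    "C": (300, ["E", "F"]),
    "D": (300, []),
    "E": (200, []),
    "F": (400, []),
}


def total_size(tree):
    """Memory needed if every segment is resident at once."""
    return sum(size for size, _ in tree.values())


def longest_path(tree, node):
    """Memory needed with overlays: this segment plus its largest subtree path."""
    size, children = tree[node]
    return size + max((longest_path(tree, child) for child in children), default=0)


print(total_size(TREE), longest_path(TREE, "A"))
```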
Dynamic Linking
Instead of branching directly to an external symbol, program issues a call request to OS
  subroutine name is parameter for request
OS responsibilities
  keep table of loaded libraries
    loads new library if needed
    manages swapping of libraries as appropriate
  transfer control to appropriate subroutine
  return control to original program

Dynamic Linking II
Binding: the association of an actual address (5E) with a symbolic name (Sqrt)
Dynamic linking delays binding from load time to execution time (late binding)
Advantages:
  many programs can share 1 loaded library
  library can be recompiled on-the-fly
  library only loaded if actually used

Problem: Time
Every time we want to execute a program, must re-link, relocate and re-load
  costly if object code hasn't changed
Idea: separate these two operations
  Object File -> Linkage Editor -> Linked Program -> Relocating Loader -> Memory
linkage editor does the binding
loader does allocn/relocn/loading
  small, simple, fast
Linking and Loading in Practice
Real object files have multiple sections
  Code (C): fetch for execution only (instructions)
  Data (D): no fetches for execution (storage)
Why distinguish between C and D sections?
  Multiple processes executing same program
  Load one copy of C (text) segment(s), and share it
  Each process gets its own copy of D (read/write) segments
When linking multiple object files:
  group C (D) sections together as segments

Object File Format: UNIX a.out
Object file: header, text, data, text relocn, data relocn, symbol table, string table
Process memory: text, data, bss, heap, stack

Unix a.out Header Structure
  int a_magic;   //magic number
  int a_text;    //text seg size
  int a_data;    //data seg size
  int a_bss;     //uninit data size
  int a_syms;    //symbol table size
  int a_entry;   //entry point
  int a_trsize;  //text relocn size
  int a_drsize;  //data relocn size
Unix a.out Relocation Entry
One entry (8 bytes) for each location to be patched
  Handles both relocation and external symbols
  Fields: address, index, flags
Address: location (offset within segment) to patch
Extern flag (1 bit):
  Off: plain relocation (index gives segment)
  On: ext symbol (index is symbol no. from sym table)
Length: (2 bits) patch item is 1, 2, 4, or 8 bytes

Unix a.out Symbol Table
Each entry (12 bytes) describes 1 symbol
  Fields: name offset, type, spare, debug info, value
Name offset: pointer into string table
  Allows arbitrarily long symbol names (null terminated)
Type byte: low bit is external flag
  Text/data/bss: relative symbol (to that segment)
  Abs: absolute value (may or may not be external)
  Undefined: external bit must be on

Lecture #32

Macro Processors
Introduction
Macro: a notational convenience for programmers
  short-hand for commonly used blocks of code
  not restricted to assembly languages
Macro Processor: tool that replaces short-hand with corresponding block of code
  performs string substitution (expansion)
  no analysis of instructions
  no semantics of programming language

Example
To clear all registers, we write:
  LDI  0,0
  LDI  1,0
  LDI  2,0
  LDI  3,0
If needed often, this can be tedious
Solutions:
  define a subroutine and call it when needed
  use a macro...

Example II
  CLEAR  MAC         ;begin defn
         LDI  0,0
         LDI  1,0
         LDI  2,0
         LDI  3,0
         MND
(CLEAR is the macro name; the LDI lines are the macro body)

Example III
In body of program:
  ...
  CLEAR
  ...
  CLEAR
  ...
After being fed to macro processor:
  ...
  LDI  0,0
  LDI  1,0
  LDI  2,0
  LDI  3,0
  ...
  LDI  0,0
  LDI  1,0
  LDI  2,0
  LDI  3,0
  ...

Picture
Notice that result is a __________ program
  source -> Macro Processor -> source
The languages of the two programs differ only by what can be achieved with textual substitution
  i.e., approximately the same level of abstract machine

Outline
Features
  arguments
  labels
  variables
  conditional expansion
Algorithm for macro processor
Macros in C and C++
Reference: Beck chp. 4
Macro Arguments
Arguments make macros more flexible
Involves textual substitution
  SWAP  MAC   (&A,&B)
        LD    1,&A
        LD    2,&B
        ST    1,&B
        ST    2,&A
        MND
  Prog  ORI
  X     NMD   10
  Y     NMD   0
        SWAP  (X,Y)
        BR    0,0(3)
        END
Result of Macro Processing
  Prog  ORI
  X     NMD   10
  Y     NMD   0
        LD    1,X
        LD    2,Y
        ST    1,Y
        ST    2,X
        BR    0,0(3)
        END
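Argument substitution is purely textual, which a few lines of Python can demonstrate. This is a minimal sketch, not a full macro processor: the function name `expand` and the list-of-strings representation of the macro body are assumptions, and substituting longer formal names first is a precaution added here so that a formal such as &A cannot clobber part of a longer one such as &AB.

```python
def expand(body, formals, actuals):
    """Replace each formal parameter by its actual argument, textually."""
    # Longest formals first, so "&A" never rewrites the start of "&AB".
    pairs = sorted(zip(formals, actuals), key=lambda p: len(p[0]), reverse=True)
    expanded = []
    for line in body:
        for formal, actual in pairs:
            line = line.replace(formal, actual)
        expanded.append(line)
    return expanded


SWAP_BODY = ["LD 1,&A", "LD 2,&B", "ST 1,&B", "ST 2,&A"]
for line in expand(SWAP_BODY, ["&A", "&B"], ["X", "Y"]):
    print(line)
```

Note that the expander never looks at opcodes or operand syntax; it would substitute just as happily inside a comment.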
Labels and Macros
Labels inside macro bodies can be useful
  e.g., a macro that swaps values of 2 registers:
  SWAPR  MAC  (&r1,&r2)

Labels: Problem
Consider a program with multiple invocations of macro SWAPR:
  ...
  SWAPR  (1,2)
  ...
  SWAPR  (1,3)
  ...
Expands to:
Labels defined twice!
  assembler error

Labels: Solution
Macro processor provides a mechanism for generating unique labels
  e.g., preface symbol (definition and use) with $
  SWAPR  MAC  (&r1,&r2)
         BR   3,$Strt
  $Tmp1  RES  1
  $Tmp2  RES  1
  $Strt  ST   &r1,$Tmp1
         ...
Labels: Solution II
First expansion of this macro:
           BR   3,$AAStrt
  $AATmp1  RES  1
  $AATmp2  RES  1
  $AAStrt  ...
Unique prefix for each invocation
  generated symbols must conform to assembler syntax (begins with $, length, etc.)
  programmer follows conventions (not to use $ outside of macros, use short labels, etc.)
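Generating the per-invocation prefix can be sketched as below. The two-letter sequence AA, AB, ... matches the first expansion shown above, but how the sequence continues, the helper names `prefixes` and `expand_labels`, and the use of a regular expression to find $-symbols are all assumptions of this illustration.

```python
import itertools
import re
import string


def prefixes():
    """Yield AA, AB, ..., AZ, BA, ...: one fresh prefix per macro invocation."""
    for first, second in itertools.product(string.ascii_uppercase, repeat=2):
        yield first + second


def expand_labels(body, prefix):
    """Rewrite every $Name (definition or use) as $<prefix>Name."""
    return [re.sub(r"\$(\w+)", lambda m: "$" + prefix + m.group(1), line)
            for line in body]


BODY = ["BR 3,$Strt", "$Tmp1 RES 1", "$Tmp2 RES 1", "$Strt ST 1,$Tmp1"]
fresh = prefixes()
first = expand_labels(BODY, next(fresh))    # first invocation
second = expand_labels(BODY, next(fresh))   # second invocation
print(first)
print(second)
```

Because the same prefix is applied to a label's definition and to all of its uses within one expansion, references stay consistent while different expansions never collide.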
Variables
Evaluated at time of: __________________
  i.e., not at execution time
Example:
  &Test  SET  0
  (&Test is the variable name, SET is a special pseudo-op, 0 is an expression)
&Test can then be used in expressions within the macro body
This feature is often used in conjunction with...

Conditional Expansion
So far, all macros we've seen have been expanded to the same block of code
  (modulo argument replacement)
Useful to generate different blocks of code
  perhaps depending on value of some bool expr
Syntax: IF / ELSE / ENDIF
  IF (expr)
    block1
  ELSE
    block2
  ENDIF
Meaning: if expr is true, expand with block 1; else, expand with block 2

Example: Shifting Left/Right
  SHIFT  MAC  (&Target, &Amount, &Dir)

Example: Swap
Conditional expansion for efficiency:
  SWAP  MAC    (&A,&B)
        IF     (&A NEQ &B)
        LD     1,&A
        LD     2,&B
        ST     1,&B
        ST     2,&A
        ENDIF
        MND
Now SWAP (S,S) is expanded to nothing

Lecture #33
Macros vs. Subroutines: Tradeoffs
Macros are expanded inline
  Advantage: speed
  Disadvantage: program size increases
Subroutines are called (branched to)
  Advantage: program size
  Disadvantage: overhead of parameter passing (more costly than sbrtn. body?)
This is another example of the space / time tradeoff

Algorithm: First Attempt
2-pass approach is tempting
  lets us resolve forward references
1st pass:
  build table of key, domain: ? and attribute, range: ?
2nd pass:
  do the expansion (replace macro calls with bodies)
1st pass Invariant: after each MND, table contains all previous macro names seen in definitions, and their bodies
Problem A: Nested Definitions
Often useful to define macros inside macros
  HPOS   MAC              SolOS  MAC
  READ   MAC              READ   MAC
         ...                     ...
         MND                     MND
  WRITE  MAC              WRITE  MAC
         ...                     ...
         MND                     MND
         MND                     MND
In program:
  begin by invoking OS macro (e.g., HPOS)
  then use READ & WRITE

Nested Definitions
To recompile on different OS, change flag at the top of program only!
Another solution?
  but nested definitions more convenient. Why?
Will this work with our 2-pass approach?
  multiple definitions of READ (notice how invariant is violated)
Problem: definitions depend on previous expansions
Algorithm: Second Attempt
Use a 1-pass approach that alternates, as necessary, between defining and expanding
Data structure: Macro_Def_Table
  Domain: macro names
  Range: macro definitions
Key invariant: after each "outer" MND seen
  1. All previous "outer" macro definitions have been inserted into the table
  2. All previous macro invocations expanded

Algorithm: Intuition
Scan program line-by-line
MAC seen: change into definition mode
  insert body into DefTable
  match up outer MND with initial MAC
Macro call seen: change into expansion mode
  look up macro name in table
  process expansion from DefTable line-by-line
    may include macro definitions!!
    requires changing back into definition mode

Algorithm: Limitation
This two-edged approach is pretty clever...
But are there any limitations it imposes on the definition / use of macros?
A.
  i.e.,
In practice, this is not a big problem
Problem B: Nested Invocations
Convenient to allow macros to call macros
  CYCLE  MAC   (&A,&B,&C)
         SWAP  (&A,&B)
         SWAP  (&A,&C)
         MND
Expansion requires further expansion
  CYCLE (X,Y,Z)  ->  SWAP (X,Y)  ->  LD 1,X
                     SWAP (X,Z)      LD . . .
Or, in particular . . .

Recursive Invocations
Macro invokes itself!
Of course, beware infinite recursion:
  TROUBLE  MAC
           NMD      10
           TROUBLE
           MND
Solution: use _____________________

Recursive Macro: Example
Consider
  TAB  MAC    (&C)
       IF     (NZ &C)
       TAB    (&C-1)
       ENDIF
       NMD    &C
       MND
Exercise: expand TAB (3)
Exercise: what happens with
  Depth  EQU  3
         TAB  (Depth)
Expansion of TAB (3)
TAB (3)
TAB (3-1) NMD 3
TAB (3-1-1) NMD 3-1 NMD 3
TAB (3-1-1-1) NMD 3-1-1 NMD 3-1 NMD 3
NMD 3-1-1-1 NMD 3-1-1 NMD 3-1 NMD 3
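The expansion of TAB (3) can be reproduced with a short recursive function. The sketch is an assumption-laden illustration: it hard-codes the body of TAB rather than interpreting a macro table, it passes the argument as text (so &C-1 becomes "3-1", "3-1-1", ...), and the helper `value` only understands a number followed by subtractions.

```python
def value(expr):
    """Evaluate text such as '3-1-1' at expansion time (subtraction only)."""
    first, *rest = expr.split("-")
    return int(first) - sum(int(r) for r in rest)


def tab(arg, out):
    """Expand TAB (arg): IF (NZ &C) TAB (&C-1) ENDIF, then NMD &C."""
    if value(arg) != 0:              # the condition is tested during expansion
        tab(arg + "-1", out)         # the argument is substituted as text
    out.append("NMD " + arg)
    return out


for line in tab("3", []):
    print(line)
```

The recursion stops only because the condition is evaluated while expanding; an argument whose value cannot be determined at that time would be a different matter.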
Algorithm: Refinement for Nested Invocations

Add a data structure
ArgStack -- arguments in current expansion
As nested expansions encountered:

arguments are pushed onto this stack
Now, expand mode means:

look up macro name in table push arguments onto stack process expansion from DefTable line-by-line
may include macro definitions (change back to definition mode) may include macro invocations (expansions)
After last line from DefTable, pop arguments off stack.

Lecture #34
Macros in C and C++
MP Algorithmic Highlights
No nested definitions; nested invocations supported, BUT no recursion allowed
  Self-references not further expanded
    #define T (x+T) //only one expansion of T
  Circularities handled the same way (stop at first self-reference)
First action: strip comments; don't remove newlines
View results of macro expansion with -E
  gcc -E test.c > test.i
For RESOLVE/C++:
  gcc -E -I/class/sce/rcpp -I/class/sce/rcpp/RESOLVE_Catalog \
      test.cpp > test.ii
Standard file extension for preprocessed C (C++) is .i (.ii), for intermediate file.
Basic Features: Definition
Use #define
  e.g., #define BUFF_SIZE 1000
  (BUFF_SIZE is the macro name, 1000 is the macro body)
Must be 1 line
  longer definitions use line continuation, \
Naming convention: all upper case
Defines a global constant
  example of use: int Buffer[BUFF_SIZE];
  to change this constant, must recompile

Using Arguments
Argument list follows name (no space):
  #define INC(X) X++
  #define SUM(X,Y) X+Y
DANGER: arithmetic grouping
Problem #1: protecting the body
  e.g., #define MAX(X,Y) X > Y ? X : Y
  works fine with a = MAX(b,c);
  but consider a = MAX(b,c) + 1;
  solution: protect the body
    #define MAX(X,Y) (X > Y ? X : Y)

Using Arguments II
Problem #2: protecting the arguments
  now consider using MAX macro in:
    flag = MAX (b>0, c<0);
  i.e.,
  solution: protect the arguments
    #define MAX(X,Y) ((X) > (Y) ? (X) : (Y))
Aside: line continuation
  #define INC(X,Y) { \
      X++; Y++;    \
  }
Conditionals
Common condition: "is this macro defined?"
  #ifndef BUFF_SIZE
  #define BUFF_SIZE 1000
  #endif /*BUFF_SIZE*/
Application: debugging modes
  #define DEBUG_ON 1
  ...
  #ifdef DEBUG_ON
  printf ( . . . );
  #endif /*DEBUG_ON*/
Incurs no space/time overhead when not debugging!

File Inclusion
Syntax: #include "filename"
  text of file called "filename" inserted at that point
  #include "f"
DANGER: recursive inclusion
  File f1
    #include "f2"
  File f2
    #include "f1"

Recursive File Inclusion
Solution: protect every included file
Convention: use #ifndef ... #endif
  File F1.h
    #ifndef F1_H_IFP
    #define F1_H_IFP 1
    ...
    #endif /*F1_H_IFP*/
  (F1_H comes from the filename; IFP is something unique)
Predefined Macros
Some defined by ANSI standard:
  __FILE__ / __LINE__: current file name / line number
  __DATE__ / __TIME__: current date / time
Useful for error reporting
  printf("error in %s, line %d\n", __FILE__, __LINE__);
Others defined by particular compilers (e.g., gcc)
  __VERSION__, __BASE_FILE__, __INCLUDE_LEVEL__
Useful for distinguishing OSs
  #ifdef __VAX__
  ...
  #endif /*__VAX__*/

Arguments in Strings
ANSI C: parameter substitution not performed within quoted strings
  #define DISP(EXP) printf("EXP = %d\n", EXP)
  Invocation: DISP (i*j+1);
  Result:
Solution: "stringizing" operator, #
  #define DISP(EXP) printf(#EXP " = %d\n", EXP)
  Result:
Pitfalls to Avoid
"Text substitution" aspect of macros can make them tricky
General strategy: limited use!
Pitfall #1: side effects
  recall MAX example
  consider: a = MAX(b++, c++)
  Q. if b = 2, c = 5 beforehand, what is result?
  A. a = ______ b = ______ c = _______

Pitfalls to Avoid II
Pitfall #2: swallowing the semicolon
Macro expands to form a compound statement:
  #define INC(X,Y) {X++; Y++;}
We want to include semicolon with call:
  INC(a,b);
But consider:
  if (. . . ) INC(a,b); else . . .
This doesn't compile! why not?
Solution (notice the missing semicolon at the end):
  #define INC(X,Y) do { \
      X++; Y++;        \
  } while(0)

Lecture #35

Compilers
Introduction
Ref. Beck chapter 5
Compiler = a kind of translator
  high level language --> machine (or assembly) code
Translation gap is larger than for assembly language
  sophisticated data structures
    arrays, records, classes, ...
  sophisticated control structures
    if, while, switch, function calls, nested scopes, ...

The High-level Language
Two aspects to language definition:
1. Syntax
  what are legal programs?
  i.e., what is accepted by the compiler
2. Semantics
  what does the program mean?
  i.e., into what machine code it is translated

Modular Decomposition
View input as a stream of characters
  P r o g _ _ _ _ _ O R I _ _ _ \n X _ _
  source -> Compiler -> object file
Compiler must give this stream structure in order to perform the translation

Coarse-Grained Decomposition
(source) stream of characters -> Lexical Analyzer -> stream of tokens -> Parser -> parse tree -> Code Generator -> object file
Lexical Analysis
First step of compilation process
Also called:
  scanner, tokenizer, lexer
Scans program (often stripping comments)
Recognizes:
  keywords, operators, identifiers, ints, floats, ...
All of these called tokens

Tokens
A token is defined by:
  1. Type (e.g., integer)
  2. Value (e.g., 312)
Keywords (e.g., while) often have their own token type (no associated value)
Example:
  MEAN := SUM DIV 100;
Tokens - Example
Result of tokenizing:
  Line  Token Type  Token Value
  13    id          MEAN
        :=
        id          SUM
        DIV
        int         100
        ;
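A tokenizer producing this table can be sketched with Python's `re` module. The token patterns below are assumptions chosen to cover just this one statement (for instance, identifiers are taken to be capital letters and digits, and DIV is the only keyword); a real scanner for the language would need the full token definitions.

```python
import re

# (token type, pattern) pairs; keywords are listed before the general id pattern.
TOKEN_SPEC = [
    ("DIV",  r"DIV\b"),
    ("id",   r"[A-Z][A-Z0-9]*"),
    ("int",  r"0|[1-9][0-9]*"),
    (":=",   r":="),
    (";",    r";"),
    ("skip", r"\s+"),
]
# One named group per token type: g0, g1, ...
MASTER = re.compile("|".join(f"(?P<g{i}>{pattern})"
                             for i, (_, pattern) in enumerate(TOKEN_SPEC)))


def tokenize(text):
    """Return a list of (type, value) pairs; value is None for keywords/operators."""
    tokens, pos = [], 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        kind = TOKEN_SPEC[int(match.lastgroup[1:])][0]
        if kind != "skip":                       # whitespace produces no token
            value = match.group() if kind in ("id", "int") else None
            tokens.append((kind, value))
        pos = match.end()
    return tokens


for token in tokenize("MEAN := SUM DIV 100;"):
    print(token)
```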
Token Definition
How to define what is & isnt a token isn t Some things seem to be simple
e.g., keywords
But language syntax rules add complexity

line continuation characters are spaces meaningful?
Need a general notation for defining all token types

479
240
Regular Expressions
Examples
label :: [A - Z] [A - Z 0 - 9] {0, 5}
a label is a capital letter followed by 0 to 5 characters that may be capital letters or numbers
int :: 0 | [1 - 9] [0 - 9]*
an int is either a 0 or a digit in range 1 to 9, followed 9 follo ed by any number (0 or more) digits
Regular expressions are equivalent to

480
Finite State Automata (FSA)

Definition:
    a finite collection of nodes (states)
    directed arcs (transitions) between nodes
    arcs are labeled
    special nodes:
        1 start
        at least 1 final (or ending or accepting)
An NFSA accepts a string iff it can read the string and end up in a final state
Example: LongLabel
longlabel :: [A-Z][A-Z0-9]*
[Figure: FSA for longlabel; arcs labeled A-Z and A-Z0-9]
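An automaton like this one can be simulated with a transition table. The sketch below assumes two states (0 as the start state, 1 as the only final state) and is deterministic, which is a special case of an NFSA; the state numbering and names are illustrative.

```python
def char_class(ch):
    """Map a character to the arc label it can follow, or None if no arc fits."""
    if "A" <= ch <= "Z":
        return "letter"
    if "0" <= ch <= "9":
        return "digit"
    return None

# (state, arc label) -> next state; a missing entry means the FSA is stuck
TRANSITIONS = {
    (0, "letter"): 1,   # first character must be a capital letter
    (1, "letter"): 1,   # then any number of capital letters ...
    (1, "digit"): 1,    # ... or digits
}
FINAL_STATES = {1}

def accepts(s):
    state = 0
    for ch in s:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False            # no arc to follow: reject
    return state in FINAL_STATES    # accept only if we end in a final state
```

Changing the language recognized means editing the table, not the loop, which is what makes FSA descriptions convenient input for scanner generators.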
Example: Int
[Figure: FSA for int; arcs labeled 0, 1-9, and 0-9]
Exercise
Write an NFSA for labels with underscores
    same rules as for LongLabels (for letters/numbers)
    no _ at start
    no _ at end
    no 2 _s in a row
Should the following be accepted?
    BUFFER1
    T_B_SI9ZE
    BUFF_SIZE
    BUFF_
    BUFF__S
    1SIZE
Lexical Analysis
Could write code to recognize LongLabel directly
    see figure 5.10
    but this is hard to read, modify, maintain, ...
Much easier to read and understand an FSA!
Scanners are often built automatically from FSA descriptions!
Lecture #36
Review (From Last Time)

Compiler overview
Tokenizers
    stream of chars --> stream of tokens
Token definition
    regular expressions
    FSAs
Today: next step in compilation process...
Step #2: Parsing
Grammar
Defines syntax of language
Given as a collection of rules
    transformations
    e.g., (X)* --> (TX)*
    maps string on left into string on the right
One particular (and important) kind of grammar: Context-Free Grammar (CFG)
CFG
Two kinds of symbols:
    terminals
    non-terminals
Each non-terminal has an associated rule
    non-terminal is the only thing on the left
    e.g., p --> ε | (p) | pp    (ε is the empty string)
    e.g., application: p --> pp --> (p)p --> (pp)p --> ((p)p)p --> (()p)p --> ... --> (()())()
BNF (Backus-Naur Form)

A common notation for CFGs
Invented to define the syntax of ALGOL 60
Terminals are tokens!
E.g.,
<entry> ::= ENT <entry-list>
    <entry> is defined to be token ENT followed by <entry-list>
<entry-list> ::= id | <entry-list>, id
    (notice the recursion)
BNF II
One special start symbol
    e.g., <program> ::= id <origin> <body> <end>
Notice division between tokenizer & parser
    tokenizer could return smaller tokens
    then rules in parser become more complicated
    e.g., <read> ::= R E A D . . . vs. <read> ::= READ ( <id-list> ) . . .
Example: see PASCAL BNF p. 228
Parse Trees
Record the application of BNF rules
    root: the start symbol
    internal nodes: non-terminal symbols
    leaves: terminals (i.e., tokens)
Example: using PASCAL BNF, what is the parse tree for MEAN := SUM DIV 100 ?
Parse Trees - Example

<assign>
    id {MEAN}
    :=
    <exp>
        <term>
            <term>
                <factor>
                    id {SUM}
            DIV
            <factor>
                int {100}
Parse Trees - Exercise

Exercise: FOR I := 1 TO 10 DO
              READ (TEMP)
Exercise:
    <exp> ::= <exp> + <exp> | <exp> - <exp> | int
    parse 3 - 6 - 2
    answer?
Grammar that allows more than one parse tree to be formed for the same token sequence: ambiguous
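To see why ambiguity matters, the following sketch enumerates every parse tree that the grammar `<exp> ::= <exp> + <exp> | <exp> - <exp> | int` allows for a token list, then evaluates each tree. The tree representation (nested tuples) and the function names are assumptions made for this illustration.

```python
def parses(tokens):
    """Return all parse trees for tokens, as nested (left, op, right) tuples
    with ints at the leaves."""
    if len(tokens) == 1:
        return [tokens[0]] if isinstance(tokens[0], int) else []
    trees = []
    # Try every operator position as the root of the tree.
    for i in range(1, len(tokens) - 1):
        if tokens[i] in ("+", "-"):
            for left in parses(tokens[:i]):
                for right in parses(tokens[i + 1:]):
                    trees.append((left, tokens[i], right))
    return trees

def evaluate(tree):
    """Compute the value that a given parse tree assigns to the expression."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    l, r = evaluate(left), evaluate(right)
    return l + r if op == "+" else l - r

trees = parses([3, "-", 6, "-", 2])
values = sorted(evaluate(t) for t in trees)
```

Two trees come back, grouping the input as 3 - (6 - 2) and (3 - 6) - 2, so `values` is `[-5, -1]`: the same token sequence gets two different meanings.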
Algorithm
How do we calculate a parse tree?
Two approaches:
    bottom-up (start at leaves)
    top-down (start at root)
Shift-Reduce Parsing
Bottom-up approach
Scan tokens, placing them on a stack (shift)
Group tokens at top of stack (reduce):
    pop them all off
    push corresponding non-terminal
Repeat until done
    should be left with ________________
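A toy version of the shift/reduce loop is sketched below. It assumes a deliberately tiny, unambiguous grammar, `<exp> ::= <exp> - int | int`, chosen for this illustration so that no lookahead table is needed; a real LR parser is driven by generated tables.

```python
def shift_reduce(tokens):
    """Run a shift-reduce parse for the assumed grammar
    <exp> ::= <exp> - int | int.
    Returns (final stack, trace of steps)."""
    stack, trace = [], []
    for tok in tokens:
        # shift: push the next token (digit strings become the token type int)
        stack.append("int" if tok.isdigit() else tok)
        trace.append(("shift", list(stack)))
        # reduce: keep grouping symbols at the top of the stack while a rule matches
        while True:
            if stack[-3:] == ["<exp>", "-", "int"]:
                stack[-3:] = ["<exp>"]          # <exp> ::= <exp> - int
            elif stack == ["int"]:
                stack[:] = ["<exp>"]            # <exp> ::= int (only at the bottom)
            else:
                break
            trace.append(("reduce", list(stack)))
    return stack, trace

stack, trace = shift_reduce("3 - 6 - 2".split())
```

Inspecting `trace` shows five shifts interleaved with three reductions, and every reduction touches only the top of the stack.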
Shift-Reduce Parsing II
Grammar must be LR
    Left-to-right scan of the input, producing a Right-most derivation
    symbols to be reduced always appear at top of stack (never inside it)
Need to look ahead to decide how/when to reduce
    if we only need to look ahead 1 token: LR(1) grammar
Lecture #37
Recursive Descent
Top-down approach
Each non-terminal has associated routine
    scan forward
    try to identify string matching this rule
Routine may have to call other routines (or itself)
    see Figure 5.16, example for <read>:
    find READ; find ( ; find <id-list> (call a routine); find )
Recursive-Descent - Problem
Subtle potential problem: left-recursion
    the left-most (first) symbol in the BNF rule is the same non-terminal (recursive)
    e.g., <id-list> ::= id | <id-list>, id
If we want to expand 2nd alternative, first call ourselves! (i.e., infinite recursion)
One solution: change notation slightly
    <id-list> ::= id [ , <id-list> ]
    routine always consumes a token before recursion
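A recursive-descent parser for the `<read>` rule, using the rewritten `<id-list>` rule, might look like the sketch below. Tokens are assumed to be (type, value) pairs; the class and method names are hypothetical and this is not the code of Figure 5.16.

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, kind):
        """Consume the next token if it has the given type, else fail."""
        tok = self.peek()
        if tok is None or tok[0] != kind:
            raise ParseError(f"expected {kind} at token {self.pos}")
        self.pos += 1
        return tok

    def read(self):
        # <read> ::= READ ( <id-list> )
        self.expect("READ")
        self.expect("(")
        ids = self.id_list()            # call the routine for the non-terminal
        self.expect(")")
        return ("read", ids)

    def id_list(self):
        # <id-list> ::= id [ , <id-list> ]
        ids = [self.expect("id")[1]]    # always consumes a token before recursing
        tok = self.peek()
        if tok is not None and tok[0] == ",":
            self.expect(",")
            ids += self.id_list()
        return ids

tokens = [("READ", None), ("(", None), ("id", "A"), (",", None), ("id", "B"), (")", None)]
tree = Parser(tokens).read()
```

Because `id_list` consumes an `id` before it can call itself, each recursive call sees a strictly shorter input, so the recursion terminates.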
Step #3: Code Generation
Introduction
Use a collection of routines
    1 routine / non-terminal in the grammar
    called semantic or code-generating routines
2 approaches:
    create entire tree
        then walk the tree, generating code
    generate code as we go
        when a grammar rule is recognized, call the corresponding code-generating routine
Example
Consider: <term> ::= <factor> * <factor>
Occurs in parse tree as:
          <term>
<factor>    *    <factor>
Generate code as we come up the tree
    keep track of where (which registers) results of lower nodes are stored
    generate code for * operation
    keep track of where result is placed
Optimization
An optimizing compiler tries to generate the most efficient object code
    time (fast execution times)
    space (small object files)
Requires sophisticated analysis
Often uses an intermediate form of code
    not executed directly
    analyzed for deciding register allocation, instruction ordering, branch shadows, etc.
Lex & Yacc

Unix tools for building compilers
    lex: lexical analyzer
    yacc: yet another compiler compiler
A compiler compiler takes as input:
    lexical analyzer
    grammar
    code-generation rules
And produces as output:
    compiler
Lex Example
Input file:
    Definitions
    %%
    Rule {action}
    . . .
Definitions: convenient short-hands for REs
Rules: recognized regular expressions and corresponding action to perform
    set the token value (use global variable yylval)
    return the token type (return an int)
INSERT: Lex Example Input file for simple Pascal syntax (pascal.lex)
Lex Example II
Run: lex pascal.lex
Result:
    file called lex.yy.c
    a 677-line C program!
    implements the function int yylex()
Yacc Example
Create a file defining the grammar
    %token NUMBER
    %%
    expr: NUMBER          { $$ = $1; }
        | expr '+' expr   { $$ = $1 + $3; }
        | '(' expr ')'    { $$ = $2; }
        ;
An invocation of yylex is used to return the next token (and token value)
Action produces output (object code)
Run yacc on this file to produce a compiler that uses a bottom-up parsing method.
Lecture #38
To Ponder
What is meant by a text file? (vs. binary)
A file of English text occupies 5 Mbytes on disk. A Java program reads the contents of this file into a String (or StringBuilder) object. How much memory does it need?
Java string length vs. number of characters
    String s = . . .
    assert (s.length() == 7)
    How many characters does s contain?
What's so scary about:
    ..%c0%af..
Unicode
A standard for the discrete representation of written text
The Big Picture

glyph    code point    binary encoding (UTF-8)
ф        U+0444        D1 84
m        U+006D        6D
€        U+20AC        E2 82 AC
’        U+2019        E2 80 99
好       U+597D        E5 A5 BD
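The byte sequences on this slide can be checked against their code points by decoding each one as UTF-8. A small sketch using only the Python standard library (the helper name `describe` is made up):

```python
# The five byte sequences from the slide, as hex strings
ENCODINGS = ["D1 84", "6D", "E2 82 AC", "E2 80 99", "E5 A5 BD"]

def describe(hex_bytes):
    """Decode one UTF-8 byte sequence and return (code point as U+hex, character)."""
    ch = bytes.fromhex(hex_bytes).decode("utf-8")
    return f"U+{ord(ch):04X}", ch

table = [describe(h) for h in ENCODINGS]
```

Each sequence decodes to exactly one character. The last one, E5 A5 BD, comes out as U+597D.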
Text: A Sequence of Glyphs

Glyph: A recognizable abstract graphic symbol
    See foyer floor in main library
One character can have many glyphs
    Example: e e e e e e e (the same character drawn in different typefaces)
One glyph can be different characters (capital Latin A and Greek Alpha: A Α)
One glyph can be several characters (ligature of f+i into one symbol: ﬁ)
Security Issue
Visual homograph: Two different characters that look th same h t th t l k the
Would you click here: www.paypl.com ? Oops! The second a is actually CYRILLIC SMALL LETTER A This site successfully registered in 2005 y g
Solution
Heuristics that warn users when languages are mixed and homographs are possible
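One very rough form of such a heuristic is sketched below. It approximates a character's script by the first word of its Unicode character name, which is an assumption of this sketch rather than a proper script lookup; real browsers use considerably more careful rules.

```python
import unicodedata

def scripts_used(text):
    """Approximate the scripts of the letters in text by the first word of each
    character's Unicode name (e.g. LATIN, CYRILLIC)."""
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isalpha()}

def looks_suspicious(hostname):
    """Flag hostnames whose letters come from more than one script."""
    return len(scripts_used(hostname)) > 1

spoofed = "payp\u0430l.com"   # \u0430 is CYRILLIC SMALL LETTER A
genuine = "paypal.com"
```

With these inputs, `looks_suspicious(spoofed)` is true and `looks_suspicious(genuine)` is false, even though the two strings render almost identically.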
Unicode Code Points

Each character is assigned a unique code point
A code point is defined by an integer value, and is also given a name
    Example: LATIN SMALL LETTER M, one hundred and nine
Convention: Write code points as U+hex
    Example: U+006D
As of November 2010:
    Contains 109,000+ code points
    Covers 93 scripts (and counting)
Organization
Code points are grouped into categories
    e.g., Basic Latin, Cyrillic, Arabic, Cherokee, Currency, Mathematical Operators
Standard allows for 17 x 2^16 code points
    i.e., > 1 million
    U+0000 to U+10FFFF
Each group of 2^16 called a plane
    U+nnnnnn: same leading (green) digits ==> same plane
Plane 0 called basic multilingual plane (BMP)
    Has practically everything you could need
    Convention: code points in BMP written U+nnnn, others written with 5 or 6 hex digits
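The arithmetic behind planes is short enough to write out. In this sketch the function names are illustrative; the plane of a code point is simply the bits above the low 16.

```python
PLANE_SIZE = 2 ** 16      # code points per plane
NUM_PLANES = 17

def plane(cp):
    """Return the plane number (0..16) of a code point."""
    if not 0 <= cp < NUM_PLANES * PLANE_SIZE:
        raise ValueError("not a Unicode code point")
    return cp >> 16       # drop the low 16 bits

def format_code_point(cp):
    """Write a code point as U+hex: at least 4 digits, more only when needed."""
    return f"U+{cp:04X}"

total = NUM_PLANES * PLANE_SIZE
```

`total` is 1,114,112, which is one more than 0x10FFFF, matching the U+0000 to U+10FFFF range.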
Basic Multilingual Plane
UTF-8
Encoding of code point (integer) in a sequence of bytes (octets)
    Standard: all caps, with hyphen (UTF-8)
Variable length
    Some code points require 1 octet
    Others require 2, 3, or 4
    Consequence: Cannot infer number of characters from size of file!
No endian-ness: just a sequence of octets
    D0 BF D1 80 D0 B8 D0 B2 D0 B5 D1 82 ...
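Decoding the twelve octets shown above makes the consequence concrete. A quick check in Python:

```python
data = bytes.fromhex("D0 BF D1 80 D0 B8 D0 B2 D0 B5 D1 82")
text = data.decode("utf-8")

byte_count = len(data)   # size as stored on disk
char_count = len(text)   # number of code points after decoding
```

The twelve bytes decode to the six-character Russian word привет, so here the file size is twice the character count. For plain ASCII text the two numbers would be equal.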
UTF-8 Encoding Recipe

1-byte encodings
    First bit is 0
    Example: 0110 1101 (encodes U+006D)
2-byte encodings
    First byte starts with 110
    Second byte starts with 10
    Example: 1101 0000 1011 1111
    Payload: 10000 111111 = 100 0011 1111 = U+043F (i.e., п, Cyrillic small letter pe)
UTF-8 Encoding Recipe

Generalization: An encoding of length k:
    First byte starts with k 1s, then 0
        Example: 1110 0110 ==> first byte of a 3-byte encoding
    Subsequent k-1 bytes each start with 10
    Remaining bits are payload
Example: 11100010 10000010 10101100
    Payload: x20AC (i.e., U+20AC, €)
Consequence: Stream is self-synchronizing
    A dropped byte affects only that character
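The recipe can be followed by hand in code. The sketch below is a teaching version, not a replacement for a library encoder: the length thresholds (0x80, 0x800, 0x10000) are standard UTF-8 facts not stated on the slide, and the function does not reject the reserved surrogate range.

```python
def utf8_encode(cp):
    """Encode one code point as UTF-8 bytes by following the recipe."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if cp < 0x80:
        return bytes([cp])                 # 1-byte encoding: first bit is 0
    if cp < 0x800:
        k = 2
    elif cp < 0x10000:
        k = 3
    else:
        k = 4
    out = []
    # Build the k-1 continuation bytes from the low end: 10 + six payload bits
    for _ in range(k - 1):
        out.append(0b10000000 | (cp & 0b111111))
        cp >>= 6
    # First byte: k ones, then a zero, then the remaining payload bits
    prefix = (0xFF << (8 - k)) & 0xFF
    out.append(prefix | cp)
    return bytes(reversed(out))

euro = utf8_encode(0x20AC)
```

For U+20AC the result is the three bytes E2 82 AC from the example, and for the code points tested it agrees with Python's built-in `str.encode("utf-8")`.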
UTF-8 Encoding Summary
(from Wikipedia)
Security Issue
Not all encodings are permitted
    overlong encodings are illegal
    example: C0 AF = 1100 0000 1010 1111 = U+002F (should be encoded 2F)
Classic security bug (IIS 2001)
    Should reject URL requests with ../..
        Scanned for 2E 2E 2F 2E 2E (in encoding)
    Accepted ..%c0%af.. (doesn't contain x2F)
        After accepting, then decoded
        2E 2E C0 AF 2E 2E gets decoded into ../..
Moral of the story: Work in code point space!
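The difference between a lenient decoder and a strict one shows up in a few lines. In this sketch `naive_decode_2byte` is a deliberately buggy helper written for illustration (it strips the prefixes without checking for overlong forms); it is not the actual IIS code.

```python
def naive_decode_2byte(b1, b2):
    """Strip the 110 and 10 prefixes and join the payload bits.
    No overlong check: this is the flaw being illustrated."""
    return chr(((b1 & 0b00011111) << 6) | (b2 & 0b00111111))

def strict_decode(data):
    """Decode with Python's UTF-8 codec, which rejects overlong forms."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None

request = bytes.fromhex("2E 2E C0 AF 2E 2E")
byte_scan_sees_slash = b"\x2f" in request      # the check done on encoded bytes
smuggled = naive_decode_2byte(0xC0, 0xAF)      # what a lenient decoder produces
```

The byte-level scan finds no 2F, yet the lenient decoder turns C0 AF into "/". A strict decoder refuses the whole request, which is the safe behavior: validate after decoding, on code points.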

Other (Older) Encodings

In the beginning, character sets were small
    ASCII: only 128 characters (i.e., 2^7)
    1 byte/character, leading bit always 0
Globalization means more characters
    But 1 byte/character seemed so fundamental
Solutions:
    Use that leading bit!
        Text data now looks just like binary data
    Use more than 1 encoding!
        Must specify data + encoding used
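Why the encoding must travel with the data: the same two bytes mean different things under different 1-byte encodings. A short sketch using codecs that ship with Python (`latin-1` is ISO-8859-1, `cp1252` is Windows-1252):

```python
raw = bytes([0x80, 0xE9])          # both bytes have the leading bit set

as_cp1252 = raw.decode("cp1252")   # Windows-1252 interpretation
as_latin1 = raw.decode("latin-1")  # ISO-8859-1 interpretation

def is_ascii(data):
    """True if every byte has a leading 0 bit, i.e. the data is plain ASCII."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False
```

Byte E9 is é in both encodings, but byte 80 is the euro sign in Windows-1252 and a control character in ISO-8859-1. Neither byte is valid ASCII.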
ASCII
ISO-8859 family (-1 Latin)
Windows Family (1252 Latin)
Early Unicode and UTF-16

Unicode started as 2^16 code points
    The BMP of modern Unicode
    Matches ISO-8859-1 in bottom 256 points
Encode every code point in 2 bytes (1 word)
    Simple, but leads to bloat of ASCII text
For code points outside of BMP
    A pair of words (surrogate pairs) carry 20-bit payload split, 10 bits in each word
    First: 1101 10xx xxxx xxxx (xD800-DBFF)
    Second: 1101 11yy yyyy yyyy (xDC00-DFFF)
U+D800 to U+DFFF are reserved code points in Unicode
    And now we're stuck with this legacy, even for UTF-8
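The surrogate-pair split can be written out directly. One detail the slide leaves implicit, and which this sketch relies on, is that the 20-bit payload is the code point minus 0x10000. The function name is illustrative.

```python
def to_surrogates(cp):
    """Split a code point outside the BMP into its UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points outside the BMP need surrogates")
    payload = cp - 0x10000                  # 20-bit payload
    high = 0xD800 | (payload >> 10)         # 1101 10xx xxxx xxxx
    low = 0xDC00 | (payload & 0x3FF)        # 1101 11yy yyyy yyyy
    return high, low

high, low = to_surrogates(0x1F600)
```

For U+1F600 this gives the pair D83D DE00, the same two words Python produces with `"\U0001F600".encode("utf-16-be")`.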
Basic Multilingual Plane
UTF-16 and Endianness

Multibyte representation
    Must distinguish between big & little endian
One solution: Specify encoding in name
    UTF-16BE or UTF-16LE
Another solution: require byte order mark (BOM) at the start of the file
    U+FEFF (ZERO WIDTH NO-BREAK SPACE)
    there is no U+FFFE code point
    FE FF ==> BigE, FF FE ==> LittleE
    Not considered part of the text
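Sniffing the byte order from a BOM takes only a prefix check. The sketch below uses the BOM constants from Python's `codecs` module; the function names are made up for this example.

```python
import codecs

def sniff_utf16(data):
    """Return (encoding name, data without BOM), or (None, data) if no BOM."""
    if data.startswith(codecs.BOM_UTF16_BE):    # FE FF
        return "utf-16-be", data[2:]
    if data.startswith(codecs.BOM_UTF16_LE):    # FF FE
        return "utf-16-le", data[2:]
    return None, data

def decode_utf16_with_bom(data):
    """Decode UTF-16 text whose byte order is given by a leading BOM."""
    encoding, body = sniff_utf16(data)
    if encoding is None:
        raise ValueError("no byte order mark found")
    return body.decode(encoding)                # BOM is not part of the text

big = b"\xfe\xff" + b"\x00m"
little = b"\xff\xfe" + b"m\x00"
```

Both `big` and `little` decode to the single character "m"; only the byte order of each 16-bit word differs.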
BOM and UTF-8

Should BOM be added to start of UTF-8?
    U+FEFF is encoded as EF BB BF
Advantages:
    Forms magic-number for UTF-8 encoding
Disadvantages:
    Not backwards-compatible to ASCII
    Existing programs may no longer work
    E.g., in Unix, shebang (#!, i.e., 23 21) at start of file is significant (file is a script)
        #! /bin/bash
To Ponder
What is meant by a text file? (vs. binary)
A text file occupies 5 Mbytes on disk. A Java program reads the contents of this file into a String (or StringBuilder) object. How much memory does it need?
Java string length vs. number of characters
    String s = . . .
    assert (s.length() == 7)
    How many characters does s contain?
What's so scary about:
    ..%c0%af..
Lecture #39
Review

Based on handout prepared by Al Stutz
