CMPSC 311 - Introduction to Systems Programming
Module: Systems Programming
Professor Patrick McDaniel
Fall 2015

WARNING
For those not in the class: there is an unusually large number of people trying to get in (4x more than any other year). I cannot make any promises that everyone will get into the class as others drop.

Software Systems
A platform, application, or other structure that:
‣ is composed of multiple modules
  ‣ the system’s architecture defines the interfaces of and relationships between the modules
‣ usually is complex in terms of its implementation, performance, management
‣ hopefully meets some requirements
  ‣ Performance
  ‣ Security
  ‣ Fault tolerance
  ‣ Data consistency
These requirements are properties of computer systems that people design, optimize, and test for. Some refer to them as “ilities” (pronounced “ill-it-tees”).

100,000 Foot View of Systems
[Figure: layered stack, top to bottom. Applications (C application, C++ application, Java application) sit on their libraries (C standard library (glibc); C++ STL / boost / standard library; JRE). The OS / app interface (system calls) separates them from the operating system. The HW/SW interface (x86 + devices) separates the operating system from the hardware: CPU, memory, storage, network, GPU, clock, audio, radio, peripherals.]

A layered view
[Figure: your system sits between clients above and layers below. It provides service to the layers above, and it understands and relies on the layers below.]

A layered view
[Figure: the same stack of clients, your system, and layers below. Your system offers its clients more useful, portable, reliable abstractions, and it is constrained by the performance, footprint, and behavior of the layers below.]

Example system
Operating system
‣ a software layer that abstracts away the messy details of hardware into a useful, portable, powerful interface
‣ modules: file system, virtual memory system, network stack, protection system, scheduling subsystem, ...
  ‣ each of these is a major system of its own!
‣ design and implementation has many engineering tradeoffs
  ‣ e.g., speed vs. (portability, maintainability, simplicity)

Another example system
Web server framework
‣ a software layer that abstracts away the messy details of OSs, HTTP protocols, database and storage systems to simplify building powerful, scalable Web services
‣ modules: HTTP server, HTML template system, database storage, user authentication system, ...
‣ also has many, many tradeoffs
  ‣ programmer convenience vs. performance
  ‣ simplicity vs. extensibility
Note: we will focus on the OS system this semester.

Systems and Layers
Layers are collections of system functions that support some abstraction to the service/app above
‣ A layer hides the specifics of its own implementation
‣ A layer hides the specifics of the layers below
‣ The abstraction may be provided by software or hardware
‣ Examples from the OS layer
  ‣ processes
  ‣ files
  ‣ virtual memory

A real world abstraction ...
What does this thing do?
What about this?

What makes a good abstraction?
An abstraction should match the “cognitive model” of the users of the system, interface, or resources

“Cognitive science is concerned with understanding the processes that the brain uses to accomplish complex tasks including perceiving, learning, remembering, thinking, predicting, inference, problem solving, decision making, planning, and moving around the environment.”
-- Jerome Busemeyer

How humans think (vastly simplified)
Our brains receive sensor data to perceive and categorize the environment (pattern matching and classification)
‣ Things that are easy to assimilate (learn) are close to things we already know
‣ The simpler and more generic the object, the easier (most of the time) it is to classify
See human factors, physiology, and psychology classes ...

A good abstraction
Why do computers have a desktop with files, folders, trash bins, panels, switches ...
and why not streets with buildings, rooms, alleys, dump-trucks, levers, ...?

In class exercise
In groups of three to four:
‣ Desktops are outlawed by the computer police
‣ You are to come up with alternate abstractions for:
  ‣ Data objects (i.e., replacements for files and directories)
Be ready to explain in 30 seconds your “environment”, what the metaphors are, and why they are appropriate given users’ cognitive models.
Bonus for being innovative and timely

Computer system abstractions
What are the basic abstractions that we use (and don’t even think about) for modern computer systems?

Processes
Processes are independent programs running concurrently within the operating system
‣ The execution abstraction a process is given is that it has sole control of the entire computer (a single stack and execution context)
Tip: if you want to see what processes are running on your UNIX system, use the “ps” command, e.g., “ps -ax”.

Files
A file is an abstraction of a read-only, write-only, or read/write data object.
‣ A data file is a collection of data on some media
  ‣ often on secondary storage (hard disk)
‣ Files can be much more: in UNIX nearly everything is a file
  ‣ Devices like printers, USB buses, disks, etc.
  ‣ System services like sources of randomness (RNG)
  ‣ Terminal (user input/output devices)
Tip: the /dev directory of UNIX contains real and virtual devices, e.g., “ls /dev”.

Virtual Memory
The virtual memory abstraction provides control over an imaginary address space
‣ Each process has a virtual address space that is unique to that process
‣ The OS and hardware work together to map each virtual address onto ...
  ‣ Physical memory addresses
  ‣ Addresses on disk (swap space)
‣ Advantages
  ‣ Avoids interference from other processes
  ‣ Swap allows more memory use than is physically available

Byte-Oriented Memory Organization
Programs Refer to Virtual Addresses
‣ Conceptually very large array of bytes
‣ Actually implemented with hierarchy of different memory types
‣ System provides address space private to particular “process”
  ‣ Program being executed
  ‣ Program can clobber its own data, but not that of others
Compiler + Run-Time System Control Allocation
‣ Where different program objects should be stored
‣ All allocation within single virtual address space

Machine Words
Machine Has “Word Size”
‣ Nominal size of integer-valued data
  ‣ Including addresses
‣ Many traditional machines use 32-bit (4-byte) words
  ‣ Limits addresses to 4 GB
  ‣ Becoming too small for memory-intensive applications
‣ Recent systems use 64-bit (8-byte) words
  ‣ Potential address space ≈ 1.8 × 10^19 bytes
  ‣ x86-64 machines support 48-bit addresses: 256 terabytes
Machines support multiple data formats
‣ Fractions or multiples of word size
‣ Always integral number of bytes

Word-Oriented Memory Organization
Addresses Specify Byte Locations
‣ Address of first byte in word
‣ Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

Bytes (addr)   32-bit words   64-bit words
0000-0003      Addr = 0000    Addr = 0000
0004-0007      Addr = 0004
0008-0011      Addr = 0008    Addr = 0008
0012-0015      Addr = 0012

APIs
An Application Programming Interface (API) is a set of methods (functions) that is used to manipulate an abstraction
‣ This is the “library” of calls to use the abstraction
‣ Some are easy (e.g., printf)
‣ Some are more complex (e.g., network sockets)
‣ Mastering systems programming is the art and science of mastering the APIs, including:
  ‣ How are they used?
  ‣ What are the performance characteristics?
  ‣ What are the resource uses?
  ‣ What are their limitations?

Example: Java Input/Output
Set of abstractions that allow for different kinds of input and output
‣ Streams
‣ Readers
‣ Writers
‣ Tokenizers
‣ ...
Professional Java programmers know when and how to use these to achieve their goals

Systems programming
The programming skills, engineering discipline, and knowledge you need to build a system using these abstractions:
‣ programming: C (the abstraction for ISA)
‣ discipline: testing, debugging, performance analysis
‣ knowledge: long list of interesting topics
  ‣ concurrency, OS interfaces and semantics, techniques for consistent data management, algorithms, distributed systems, ...
  ‣ most important: deep understanding of the “layer below”

Programming languages
Assembly language / machine code
‣ (approximately) directly executed by hardware
‣ tied to a specific machine architecture, not portable
‣ no notion of structure, few programmer conveniences
‣ possible to write really, really fast code
Compilation of a programming language results in executable code to be run by hardware.
‣ gcc (C compiler) produces target machine executable code (ISA)
‣ javac (Java compiler) produces Java Virtual Machine executable code

Programming languages
Structured but low-level languages (C, C++)
‣ hides some architectural details, is kind of portable, has a few useful abstractions, like types, arrays, procedures, objects
‣ permits (forces?) programmer to handle low-level details like memory management, locks, threads
‣ low-level enough to be fast and to give the programmer control over resources
  ‣ double-edged sword: low-level enough to be complex, error-prone
  ‣ shield: engineering discipline

Programming languages
High-level languages (Python, Ruby, JavaScript, ...)
‣ focus on productivity and usability over performance
‣ powerful abstractions shield you from low-level gritty details (bounded arrays, garbage collection, rich libraries, ...)
‣ usually interpreted, translated, or compiled via an intermediate representation
‣ slower (by 1.2x-10x), less control

Discipline
Cultivate good habits, encourage clean code
‣ coding style conventions
‣ unit testing, code coverage testing, regression testing
‣ documentation (code comments!, design docs)
‣ code reviews
Will take you a lifetime to learn
‣ but oh-so-important, especially for systems code
  ‣ avoid write-once, read-never code

Knowledge
Tools
‣ gcc, gdb, g++, objdump, nm, gcov/lcov, valgrind, IDEs, race detectors, model checkers, ...
Lower-level systems
‣ UNIX system call API, relational databases, map/reduce, Django, ...
Systems foundations
‣ transactions, two-phase commit, consensus, handles, virtualization, cache coherence, applied crypto, ...