[Concepts] Paper 3
- CHAPTER 1: DATA REPRESENTATION
- CHAPTER 2: COMMUNICATION AND INTERNET TECHNOLOGIES
- CHAPTER 3: HARDWARE AND VIRTUAL MACHINES
- CHAPTER 4: SYSTEM SOFTWARE
- CHAPTER 5: SECURITY
- CHAPTER 6: ARTIFICIAL INTELLIGENCE
- CHAPTER 7: COMPUTATIONAL THINKING AND PROBLEM-SOLVING
- CHAPTER 8: FURTHER PROGRAMMING
CHAPTER 1: DATA REPRESENTATION
1.1 USER-DEFINED DATA TYPES
1.1.1 Introduction to User-Defined Data Types
Definition: User-defined data types are data types designed by the programmer. When object-oriented programming is not being used, a programmer may choose to utilize user-defined data types for a large program as their use can reduce errors and make the program more understandable.
Why Use User-Defined Data Types:
- Reduce errors in the program
- Make code more understandable
- Less restriction than built-in types
- Allows for inevitable user definition
1.1.2 Non-Composite User-Defined Data Types
Enumerated Data Type: A non-composite user-defined data type that is a list of possible data values. The values defined have an implied order of values to allow comparisons.
Characteristics:
- Values are countable and finite
- Allow comparisons (value2 > value1)
- NOT string values (don't use quotes)
Pointer Data Type: Used to reference a memory location. It may be used to construct dynamically varying data structures. The pointer definition has to relate to the type of the variable being pointed to.
Pointer Notation:
@identifier- returns the address of identifierPointer^- accesses the data stored at the address (dereferencing)
1.1.3 Composite User-Defined Data Types
Record Data Type: A data type that contains a fixed number of components that can be of different types. Allows the programmer to collect values with other data types together when these form a coherent whole.
Set Data Type: Allows a program to create sets and to apply mathematical operations defined in set theory. All elements in the set should be unique.
Operations on Sets:
- Union
- Difference
- Intersection
- Include an element
- Exclude an element
- Check whether an element is in a set
Class (in OOP): In object-oriented programming, a program defines the classes to be used. Then, for each class, the objects must be defined. A Class includes variables and methods (functions or procedures that an object can run).
1.2 FILE ORGANISATION AND ACCESS
1.2.1 Types of Files
Text Files:
- Contains data stored according to a defined character code (ASCII or Unicode)
- Can be created using a text editor
- Data appears as readable characters
Binary Files:
- Designed for storing data to be used by a computer program
- Stores data in its internal representation
- Created using specific programs
- Structure: File → Records → Fields → Values
1.2.2 Methods of File Organisation
Serial Files:
- Records have no defined order
- Stored one after another in the order they were added
- New records added at the end of the file
- No end of record character (must have defined format)
Advantages:
- Simple task
- Low cost
Disadvantages:
- Difficult to access specific records
- Must read all preceding records
- Cannot support modern high-speed requirements
File Access: Sequential access only
Uses:
- Batch processing
- Backing up data on magnetic tape
- Bank transactions
Sequential Files:
- Records ordered using a key field
- Key field values must be unique and ordered
- More efficient than serial files due to data integrity
- New records must be added in correct position
File Access Methods:
- Sequential Access: Read key field values until required value found
- Direct Access: Use index to look up address of record location
Random Files:
- Records stored randomly
- Accessed directly using hashing algorithm
- Uses hashing on record's key field to calculate address
- Well suited for magnetic and optical disks
Advantages:
- Quick retrieval of records
- Records may vary in size
1.2.3 Hashing Algorithms
Definition: Takes the key field as input and outputs a value for the record's position relative to the file's start.
Example:
Handling Collisions: When two different keys produce the same position, use the next available position in the file.
File Access:
- Value in key field submitted to hashing algorithm
- Algorithm provides position in file
- May require short linear search due to collisions
1.3 FLOATING-POINT NUMBERS
1.3.1 Floating-Point Representation
Definition: The approximate representation of a real number using binary digits.
Format:
- Mantissa: The non-zero part of the number
- Exponent: The power to which the base is raised
- Base: 2 (in binary floating-point)
1.3.2 Converting Denary to Floating-Point Binary
Steps:
- Convert whole number part to binary
- Add sign bit (0 for positive, 1 for negative)
- Convert fractional part by multiplying by 2 and recording whole parts
- Combine parts and adjust exponent
- Normalise the number
Example: Converting 8.75 to floating-point (8-bit: 4 for mantissa, 4 for exponent)
1.3.3 Normalisation
Purpose:
- Maximise precision using all available bits in mantissa
- Ensure most significant bits are different (0 1 for positive, 1 0 for negative)
- Avoid multiple representations of the same number
Process:
- For positive numbers: Shift left until first bit after binary point is 1
- For negative numbers: Shift left until first two bits are different (10)
- Each shift left decreases exponent by 1
1.3.4 Problems with Floating-Point Numbers
-
Rounding Errors:
- Binary representation of fractions is often approximate
- Can become significant after multiple calculations
-
Overflow:
- Result too large to represent
- Occurs when exponent exceeds maximum
-
Underflow:
- Result too small to represent
- May be rounded to zero
-
Inability to Store Zero:
- Normalised form requires mantissa to be 0.1 or 1.0
- Zero must be handled as special case
1.3.5 Precision vs Range
Trade-off: Must balance precision and range based on application needs
CHAPTER 2: COMMUNICATION AND INTERNET TECHNOLOGIES
2.1 PROTOCOLS
2.1.1 Introduction to Protocols
Definition: A protocol is a set of rules governing communication between computers. It ensures the computers that communicate understand each other.
Key Terms:
- MAC Address: Unique number assigned to each device's networking hardware worldwide
- IP Address: Unique number assigned to each node/networking device in a network
- Port Number: Software-generated number specifying an application or process communication endpoint
2.1.2 TCP/IP Protocol Suite
Four Layers:
Data Flow - Sender Side:
- Application Layer: Encodes data in appropriate format
- Transport Layer: Creates packets with port numbers
- Network Layer: Adds sender and receiver IP addresses
- Link Layer: Formats into frames, adds error checking
- Physical Layer: Converts to signals for transmission
Data Flow - Receiver Side: Reverse of sender side, stripping headers at each layer
2.1.3 Key Protocols
2.1.4 BitTorrent Protocol
Components:
- Torrent File: Contains details regarding the tracker
- Tracker: Server that keeps track of peers
- Peers: Users downloading and uploading simultaneously
- Swarm: Network of peers sharing the torrent
- Seeding: Uploading file after/complete download
- Leeching: Downloading without uploading
How It Works:
- Peers obtain torrent file (small)
- Torrent file points to tracker
- Tracker lists all peers in swarm
- Peers download chunks from each other
- Peers upload chunks to other peers
2.2 CIRCUIT SWITCHING AND PACKET SWITCHING
2.2.1 Circuit Switching
Definition: A method of data transfer where a dedicated communication channel is established before transmission begins.
Characteristics:
- Dedicated path between sender and receiver
- Path remains open for entire duration
- Like traditional telephone system
Advantages:
- Guaranteed bandwidth
- Consistent quality
- No delay from routing decisions
Disadvantages:
- Inefficient if data is sporadic
- Connection setup takes time
- Line unavailable if circuit busy
Example Use:
- Traditional landline phones
2.2.2 Packet Switching
Definition: A method of data transfer where the message is broken into parts and sent over optimum routes to reach its destination.
Characteristics:
- Data divided into packets
- Each packet can take different route
- Packets reassembled at destination
- Used in the Internet
Advantages:
- Efficient use of network capacity
- If one route fails, packets can reroute
- Better for bursty data
Disadvantages:
- Packets may arrive out of order
- Variable delays
- More complex infrastructure
2.2.3 Router Function
Definition: A device that connects two or more computer networks and directs incoming packets to their receiver according to network traffic.
Functions:
- Examines packet destination IP address
- Determines best route for each packet
- Manages network traffic congestion
- Connects LAN to WAN
2.3 SSL/TLS
2.3.1 SSL and TLS
SSL (Secure Socket Layer):
- Provides secure communication
- Functions between TCP and application layer
- Creates "socket" for secure connection
TLS (Transport Layer Security):
- Improved version of SSL
- Provides encryption, compression, integrity checking
When to Use:
- Online shopping websites
- Online banking
- Any site requiring secure data transmission
Handshake Process:
- Client sends request to server
- Server sends digital certificate (includes public key)
- Client validates certificate
- Client generates session key, encrypts with server's public key
- Server decrypts session key
- Secure session established
CHAPTER 3: HARDWARE AND VIRTUAL MACHINES
3.1 RISC AND CISC PROCESSORS
3.1.1 RISC (Reduced Instruction Set Computers)
Characteristics:
- Fewer instructions
- Simpler instructions
- Small number of instruction formats
- Single-cycle instructions where possible
- Fixed-length instructions
- Only load/store instructions access memory
- Fewer addressing modes
- Multiple register sets
- Hard-wired control unit
- Pipelining easier
3.1.2 CISC (Complex Instruction Set Computers)
Characteristics:
- More instructions
- Complicated instructions
- Many instruction formats
- Multi-cycle instructions
- Variable-length instructions
- Many types of memory addressing instructions
- More addressing modes
- Fewer registers
- Microprogrammed control unit
- Pipelining difficult
3.1.3 Pipelining
Definition: Instruction-level parallelism where the fetch-decode-execute cycle is separated into several stages.
Five Stages:
- Instruction Fetch (IF)
- Instruction Decode (ID)
- Operand Fetch (OF)
- Instruction Execution (IE)
- Result Write Back (WB)
Advantages:
- Multiple instructions processed simultaneously
- Increased throughput
- More efficient use of processor components
Disadvantages:
- Interrupt handling complex
- Pipeline stalls possible
Interrupt Handling in Pipelines:
- When interrupt occurs, must clear pipeline
- Store current state of all instructions in pipeline
- Execute ISR
- Resume interrupted instructions
3.1.4 Parallel Processing Architectures
SISD (Single Instruction Single Data):
- Single processor
- Early computers
- No pipelining
SIMD (Single Instruction Multiple Data):
- Multiple processors
- Array processors
- Same instruction on multiple data points
MISD (Multiple Instruction Single Data):
- Multiple processors
- Same data, different instructions
- Used for sorting
MIMD (Multiple Instruction Multiple Data):
- Modern computers
- Each processor executes different instructions
- Most common parallel architecture
Massively Parallel Computers:
- Vast amounts of processing power
- Bus structure for multiple processors
- Network infrastructure for multiple hosts
- Used for complex mathematical problems
3.2 BOOLEAN ALGEBRA AND LOGIC CIRCUITS
3.2.1 Boolean Algebra Laws
Identity Law:
Null Law:
Idempotent Law:
Inverse Law:
Commutative Law:
Associative Law:
Distributive Law:
De Morgan's Laws:
3.2.2 Karnaugh Maps (K-Maps)
Purpose: Simplify Boolean expressions and reduce number of gates needed.
Benefits:
- Minimises Boolean expressions
- Minimises logic gates
- More efficient circuits
Methodology:
- Fill K-map from truth table
- Group '1's in powers of 2 (1, 2, 4, 8)
- Wrap around edges allowed
- Within each group, only constant values remain
- Write expression from groups
Example K-map (2 variables):
3.2.3 Half Adders and Full Adders
Half Adder:
- Adds two bits
- Outputs sum and carry
Sum = A XOR B Carry = A AND B
Full Adder:
- Adds three bits (two operands + carry in)
- Outputs sum and carry out
3.2.4 Flip-Flops
SR Flip-Flop:
- Set-Reset flip-flop
- Stores 1 bit of data
- Two inputs: S (set) and R (reset)
JK Flip-Flop:
- Improvement over SR
- All input combinations valid
- Uses clock pulse for synchronization
- Toggle action when J=K=1
Purpose:
- Store bits of data
- Create memory from flip-flops
- Used in registers and counters
3.3 VIRTUAL MACHINES
3.3.1 Concept of Virtual Machines
Definition: Software that provides an exact copy of hardware. The process interacts with the software interface provided by the OS, which provides this copy. The OS kernel handles interaction with actual host hardware.
Benefits:
- Multiple OS on single system
- Testing without affecting main system
- Legacy software on newer hardware
- Server consolidation
Drawbacks:
- Performance drop from native OS
- Time and effort for implementation
Examples:
- VMware
- VirtualBox
- Hyper-V
CHAPTER 4: SYSTEM SOFTWARE
4.1 PURPOSES OF AN OPERATING SYSTEM
4.1.1 Resource Optimisation
How OS Maximises Resources:
- CPU scheduling
- Memory management
- I/O optimisation
- File system management
4.1.2 User Interface
Purpose: Hides complexities of hardware from user. Allows users to interact with application programs without knowing hardware details.
Types:
- Command-line interface (CLI)
- Graphical User Interface (GUI)
- Touch interface
4.1.3 Process Management
Process: A program being executed which has an associated Process Control Block (PCB) in memory.
Process States:
- Ready: New process arrived, PCB created
- Running: Has CPU access
- Blocked: Cannot progress until event occurs
PCB Contents:
- Process state
- Program counter
- CPU registers
- Memory management information
- I/O status information
4.1.4 Scheduling Routines
First-Come-First-Served (FCFS):
- Non-preemptive
- FIFO queue
- Simple but may cause long waits
Round Robin:
- Preemptive
- Allocates time slice to each process
- No prioritization
- Fair but may waste CPU on context switching
Priority-Based:
- Most complex
- Priorities re-evaluated on queue change
- Can be preemptive or non-preemptive
Shortest Job First:
- Non-preemptive
- Processes with shortest execution time first
- Optimal for minimizing average waiting time
Shortest Remaining Time:
- Preemptive version of Shortest Job First
4.1.5 Memory Management
Paging:
- Process split into pages
- Memory split into frames
- All pages can be loaded into memory
Virtual Memory:
- No need for all pages in memory
- CPU address space larger than physical
- Address resolved by Memory Management Unit (MMU)
Benefits:
- Large programs can run with less physical memory
- More programs can run simultaneously
Disk Thrashing:
- Perpetual loading/unloading of pages
- Occurs when too many processes compete for memory
- Performance severely degraded
4.1.6 OS Structure
User Mode:
- Available for users and applications
- Limited access to system resources
Kernel/Privileged Mode:
- Sole access to parts of memory
- Access to certain system functions
- Only OS components run in this mode
4.2 TRANSLATION SOFTWARE
4.2.1 Compilation Stages
Lexical Analysis:
- Converts sequence of characters to sequence of tokens
- Identifies keywords, identifiers, operators
Syntax Analysis:
- Checks code for grammar mistakes
- Identifies syntax errors
Code Generation:
- Generates intermediate code after syntax analysis
Optimization:
- Improves efficiency of code
- Removes redundant operations
4.2.2 Interpreters vs Compilers
4.2.3 BNF (Backus-Naur Form)
Purpose: Formal mathematical way of defining syntax unambiguously.
Components:
- Set of terminal symbols
- Set of non-terminal symbols
- Set of production rules
Example:
4.2.4 Reverse Polish Notation (RPN)
Definition: A method of representing expressions without brackets or special punctuation. Uses postfix notation where operator is placed after variables.
Example:
- Infix: A + B
- RPN: A B +
Advantages:
- No need for brackets
- No need for operator precedence
- Simpler for machine to evaluate
- No backtracking needed
Evaluation:
- Use stack
- When operator encountered, pop two operands, apply operator, push result
CHAPTER 5: SECURITY
5.1 ENCRYPTION AND ENCRYPTION PROTOCOLS
5.1.1 Encryption Concepts
Plain Text: Original data before encryption.
Cipher Text: Result of applying encryption algorithm to data.
Encryption: Process of making cipher text from plain text.
Key: Value used by encryption/decryption algorithm.
5.1.2 Symmetric Key Encryption
Definition: Same key used for encryption and decryption.
Process:
- Sender and receiver share secret key
- Sender encrypts plain text with key → cipher text
- Cipher text transmitted
- Receiver decrypts with same key → plain text
Advantages:
- Fast
- Simple
Disadvantages:
- Key distribution problem
- Multiple keys needed for multiple recipients
Examples:
- AES (Advanced Encryption Standard)
- DES (Data Encryption Standard)
5.1.3 Asymmetric Key Encryption
Definition: Different keys for encryption and decryption.
Public Key:
Private Key:
- Kept secret
- Used for decryption
- Used for digital signatures
Sending Private Message:
- Receiver sends public key to sender
- Sender encrypts message with public key
- Only receiver (with private key) can decrypt
Sending Verified Message:
- Sender encrypts with private key
- Anyone can decrypt with public key
- Message verified as from sender
5.1.4 Digital Signatures
Process:
- Calculate hash of message (digest)
- Encrypt digest with sender's private key → digital signature
- Send message + signature
- Receiver decrypts signature with public key → digest
- Receiver calculates hash of message
- If digests match → message authentic
5.1.5 Digital Certificates
Purpose: Verify that public key belongs to claimed entity.
Certificate Contents:
- Entity's public key
- Entity's identity information
- CA's digital signature
Obtaining Certificate:
- Entity contacts Certification Authority (CA)
- CA confirms entity's identity
- CA creates certificate with entity's public key
- CA signs certificate with its private key
- Entity posts certificate on website
5.1.6 SSL/TLS Protocol
Purpose: Provide secure communication between client and server.
Uses:
- Online shopping
- Online banking
- HTTPS websites
Process:
- Client connects to server (port 443)
- Server sends certificate
- Client validates certificate
- Client generates session key
- Client encrypts session key with server's public key
- Server decrypts session key
- Secure session begins
5.2 MALWARE AND RESTRICTION METHODS
5.2.1 Types of Malware
5.2.2 Restriction Methods
CHAPTER 6: ARTIFICIAL INTELLIGENCE
6.1 INTRODUCTION TO AI
6.1.1 Definition
Artificial Intelligence: The ability of computers to perform tasks that usually only humans would be able to do (decision-making, speech recognition, etc.).
Machine Learning (ML): A subfield of AI where computers learn to perform tasks without being explicitly programmed. Uses historical training data to produce a model for predictions.
Deep Learning (DL): A subset of ML where computers learn using neural networks similar to human brain function.
6.1.2 Types of Machine Learning
Supervised Learning:
- Trained on labelled data
- Known inputs and outputs
- Model learns from examples
- Used for classification and regression
Unsupervised Learning:
- Trained on unlabelled data
- Finds hidden patterns
- Groups similar data
- Used for clustering
Reinforcement Learning:
- Reward-based system
- Agent learns from actions
- Not given specific instructions
- Learns from trial and error
6.1.3 Classification, Regression, and Clustering
Classification:
- Split data into predefined groups
- Example: Spam/not spam email
Regression:
- Predict value based on explanatory variable
- Used for forecasting
- Examples: Weather, sales, prices
Clustering:
- Split data into groups based on features
- Groups not predefined
- Used for pattern discovery
6.2 ARTIFICIAL NEURAL NETWORKS
6.2.1 Structure
Layers:
- Input Layer: Accepts inputs in various formats
- Hidden Layers: Perform calculations, find patterns
- Output Layer: Provides results
Nodes:
- Connected to nodes in adjacent layers
- Each connection has weight
- Data passes through network
6.2.2 Back Propagation
Definition: "Backward propagation of errors" - standard method for training neural networks.
Process:
- Input data fed to network
- Network generates output
- Output compared to expected result
- Error calculated
- Error propagated backwards through network
- Weights adjusted
- Process repeated until acceptable error
6.2.3 Graph Theory in AI
Graph Components:
- Nodes: Fundamental units (represent objects)
- Edges: Lines connecting nodes (represent relationships)
Real-World Uses:
In AI:
- Represent ANN structure
- Find paths in problem solving
- Algorithms examine graphs
6.3 GRAPH SEARCH ALGORITHMS
6.3.1 Dijkstra's Algorithm
Purpose: Find shortest path between nodes in weighted graph.
How It Works:
- Start at initial node
- Calculate distances to all connected nodes
- Select node with smallest distance
- Update distances through that node
- Repeat until destination reached
Limitations:
- No heuristics (searches all directions)
- Inefficient on large networks
- Cannot handle negative weights
6.3.2 A* Algorithm
Purpose: Informed search algorithm using heuristics to find shortest path efficiently.
Formula:
- G: Movement cost from start
- H: Heuristic (estimated cost to goal)
- F: Total estimated cost
Advantages over Dijkstra:
- Uses heuristics to guide search
- More efficient
- Finds goal faster
CHAPTER 7: COMPUTATIONAL THINKING AND PROBLEM-SOLVING
7.1 RECURSION
7.1.1 Essential Features
Base Case:
- Stopping condition
- Prevents infinite recursion
Recursive Case:
- Calls itself
- Changes state toward base case
Unwinding:
- When base case reached
- Returns through call stack
- Calculates final result
7.1.2 Advantages and Disadvantages
7.1.3 How Compilers Handle Recursion
Stack Usage:
- Before call: Save current state (registers, local variables) onto stack
- Make recursive call
- During return: Restore state from stack
Unwinding:
- When base case met
- Values popped from stack in reverse order
- Previous call states restored
7.1.4 Examples
Factorial:
Fibonacci:
7.2 ALGORITHM COMPLEXITY
7.2.1 Big O Notation
7.2.2 Comparing Algorithms
Time Complexity:
- How execution time grows with input size
Space Complexity:
- How memory usage grows with input size
CHAPTER 8: FURTHER PROGRAMMING
8.1 PROGRAMMING PARADIGMS
8.1.1 Low-Level Programming
Characteristics:
- Close to machine language
- Direct hardware access
- Instructions converted to machine code without compiler
- Non-portable (specific to processor)
- Small memory footprint
- Examples: Assembly, Machine code
Addressing Modes:
- Immediate: Data is value itself
- Direct: Address contains data
- Indirect: Address at address contains data
- Indexed: Address + index register contents
8.1.2 Imperative (Procedural) Programming
Characteristics:
- Sequence of statements changes state
- Explicit control flow
- Variables modified by assignments
- Based on Von Neumann architecture
Examples:
- C, Pascal, BASIC
- Python (also supports other paradigms)
8.1.3 Object-Oriented Programming (OOP)
Key Terms:
- Object: Instance of a class
- Class: Blueprint for creating objects
- Property/Attribute: Data in object
- Method: Function in object
- Inheritance: Child class inherits from parent
- Polymorphism: Same interface, different behaviour
- Encapsulation: Hiding internal details
- Containment: Object contains other objects
Inheritance Example:
8.1.4 Declarative Programming
Characteristics:
- States what needs to be done, not how
- Non-procedural
- Control flow implicit
- Backtracking supported
Example:
- Prolog
- SQL
8.2 FILE PROCESSING
8.2.1 File Modes
8.2.2 Serial File Operations
Reading:
Writing:
8.2.3 Sequential File Operations
Search:
- Read records sequentially
- Compare key field
- Stop when found or key exceeds target
Add:
- Read from old file
- Write to new file
- Insert new record in correct position
Delete:
- Read from old file
- Write to new file
- Skip record to delete
8.2.4 Random File Operations
Hashing:
Direct Access:
8.3 EXCEPTION HANDLING
8.3.1 Exceptions
Definition: Runtime errors that cause program to terminate if not handled.
Common Exceptions:
- Division by zero
- File not found
- Invalid input
- Memory overflow
8.3.2 Handling Exceptions
Try-Except Structure:
Purpose:
- Prevent program crash
- Provide meaningful error messages
- Allow graceful recovery