CP4P_CompressionBackup_Activity_Instructions.pdf

School

Seneca College

Course

101

Subject

Computer Science

Date

Dec 6, 2023

Type

pdf

Pages

8

File Compression and Backup
Computer Principles for Programmers, Fall 2023

Part A: Compression (40 points)

Compression algorithms substitute a repeating string in the original text with a unique token and keep a record of each substitution in a dictionary. The dictionary is stored with the compressed text for later decompression, but not in our case here. This exercise is an analogy, not the algorithm used by LZW or Huffman encoding.

Example: an old quote from Vangie Beal, managing editor of Webopedia (lower case is used to simplify the example):

data compression is particularly useful in communications because it enables devices to transmit or store the same amount of data in fewer bits. there are a variety of data compression techniques, but only a few have been standardized. the ccitt has defined a standard data compression technique for transmitting and a compression standard for data communications through modems. in addition, there are file compression formats, such as arc and zip.

Total original characters (with spaces), count from Word doc: 449

Each single-character token replaces a character string:

Token  Replaced string  Dictionary size  N (occurrences)  Savings
!      data             6                5                14
@      compression      13               5                42
#      communications   16               2                12
$      transmit         9                2                5
%      there are        12               2                8
&      standard         9                3                12
*      technique        10               2                6

Savings = Length × (N − 1) − 1 − N, where Length is the length of the replaced string (the dictionary size less 1 for the token).

Dictionary size (total): 75
Compressed characters (with spaces), count from Word doc: 275
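The Savings column can be checked with a few lines of Python. This is only a sketch: the function and variable names are hypothetical, and Length is assumed to be the dictionary size less 1 (the token character), which is consistent with every row of the table.

```python
def savings(length: int, n: int) -> int:
    """Characters saved by one dictionary entry: Length x (N - 1) - 1 - N."""
    return length * (n - 1) - 1 - n


# (token, replaced string, dictionary size, N occurrences), copied from the table.
ROWS = [
    ("!", "data", 6, 5),
    ("@", "compression", 13, 5),
    ("#", "communications", 16, 2),
    ("$", "transmit", 9, 2),
    ("%", "there are", 12, 2),
    ("&", "standard", 9, 3),
    ("*", "technique", 10, 2),
]


def summarize(rows, original_chars, compressed_chars):
    """Return per-token savings, total dictionary size, and overall saving."""
    # Assumption: Length = dictionary size - 1 (one character is the token).
    per_row = {token: savings(size - 1, n) for token, _, size, n in rows}
    dictionary_total = sum(size for _, _, size, _ in rows)
    total_saving = original_chars - (dictionary_total + compressed_chars)
    return per_row, dictionary_total, total_saving


per_row, dictionary_total, total_saving = summarize(ROWS, 449, 275)
print(per_row["!"], per_row["@"])      # 14 42
print(dictionary_total, total_saving)  # 75 99
print(sum(per_row.values()))           # 99
```

The per-token savings add up to the same figure as the original size less the dictionary and compressed text, which is a useful cross-check when you build your own table.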
Saving = Original size less (dictionary + compressed text): 449 − (75 + 275) = 99 (the Savings column also totals 99)
Compression as a percent of original: compressed size 78%, saving 22%
Percentages should sum to 100%: 100.00%

Compressed text:

!@is particularly useful in #because it enables devices to $ or store the same amount of !in fewer bits. %a variety of !@*s, but only a few have been &ized. the ccitt has defined a & !@* for $ting and a @& for !#through modems. in addition,%file @formats, such as arc and zip.

How much can you compress the lyrics to a song using the ideas above? You choose the song.

Copy the lyrics of a song to a new MS-Word document (Ctrl+N).

To reduce complexity, make all letters lower case: press Ctrl+A to select all text, then Alt+H, 7, L to make the selection lower case.

In the bottom left of the Word display, click “### words”. The Word Count dialog will pop up showing the number of characters with spaces. (Spaces are characters,itishardtoreadwordswithoutthem.)
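The lower-casing, token substitution, and character counting described above can also be tried outside Word. The following is a minimal sketch, not the method the activity requires: the helper names are hypothetical, the dictionary is a subset chosen for one sentence of the quote, and the count is assumed to ignore line breaks.

```python
def char_count(text: str) -> int:
    """Count characters including spaces, ignoring line breaks (an assumption)."""
    return sum(1 for ch in text if ch not in "\r\n")


def compress(text: str, dictionary: dict[str, str]) -> str:
    """Replace each dictionary string with its single-character token."""
    for token, string in dictionary.items():
        if token in text:
            # A token that already occurs in the text would make decompression ambiguous.
            raise ValueError(f"token {token!r} already occurs in the text")
        text = text.replace(string, token)
    return text


def decompress(text: str, dictionary: dict[str, str]) -> str:
    """Expand every token back into the string it replaced."""
    for token, string in dictionary.items():
        text = text.replace(token, string)
    return text


# Hypothetical dictionary: token -> replaced string (trailing spaces are part of the string).
DICTIONARY = {"!": "data ", "@": "compression ", "&": "standard", "*": "technique"}

sample = ("There are a variety of data compression techniques, "
          "but only a few have been standardized.").lower()

compressed = compress(sample, DICTIONARY)
print(compressed)
# there are a variety of !@*s, but only a few have been &ized.
print(char_count(sample), char_count(compressed))
print(decompress(compressed, DICTIONARY) == sample)  # True
```

Whether a substitution is worth making still depends on the dictionary cost, so the character counts printed here should be read together with the size of the dictionary.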
N.B. Paragraph, new-line (CRLF) and formatting codes are not counted by Word, which is fine: our exercise here is concerned only with the text. (Alt+H, 8 will toggle the display of whitespace characters.)

The following will help with your substitution analysis: separate the words in the text so each is on its own line, then sort the lines to see repeating patterns of individual words.

  • Copy the lyrics to another new document (Ctrl+N) used only for analysis.
  • Find and Replace (Ctrl+H) a space with a space + paragraph marker ^p:
      Find what: (a single space)
      Replace with: (a single space)^p
  • Then sort the lines to see repeating words (Alt, H, S, O).
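The split-and-sort analysis above can be approximated in Python as a cross-check. This is a sketch under assumptions: the function name and sample lyric are illustrative only, and, as in the Word approach, punctuation stays attached to the word it follows.

```python
from collections import Counter


def repeated_words(lyrics: str, min_count: int = 2) -> list[tuple[str, int]]:
    """Return (word, count) pairs for repeating words, sorted alphabetically."""
    # Splitting on whitespace mirrors putting each word on its own line.
    words = lyrics.lower().split()
    counts = Counter(words)
    # Sorting mirrors Word's sort step; only words that repeat are kept.
    return sorted((word, n) for word, n in counts.items() if n >= min_count)


sample_lyrics = (
    "Row row row your boat gently down the stream "
    "merrily merrily merrily merrily life is but a dream"
)

print(repeated_words(sample_lyrics))
# [('merrily', 4), ('row', 3)]
```

Single words are only a starting point: longer repeating strings, such as whole phrases with their spaces, usually save more per token.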