CP4P_CompressionBackup_Activity_Instructions
.pdf
keyboard_arrow_up
School
Seneca College *
*We aren’t endorsed by this school
Course
101
Subject
Computer Science
Date
Dec 6, 2023
Type
Pages
8
Uploaded by ColonelSandpiper3637
File Compression and Backup
Computer Principles for Programmers
Fall 2023
Page 1 of 8
Part A: Compression (40 points)
Compression algorithms substitute a repeating string in the original text with
a unique token and keep a record of each substitution in a dictionary. The
dictionary is stored
with
the compressed text for later decompression
but
not in our case here.
This exercise is an analogy, not the algorithm used by LZW or Huffman
encoding.
old quote from Vangie Beal, managing editor of Webopedia. (lower-case used to simplify example)
data compression is particularly useful in communications because it enables
devices to transmit or store the same amount of data in fewer bits. there
are a variety of data compression techniques, but only a few have been
standardized. the ccitt has defined a standard data compression technique
for transmitting and a compression standard for data communications through
modems. in addition, there are file compression formats, such as arc and
zip.
Total original Characters
(with spaces) count from
Word doc
449
single
char
token
replaces
character string
Dictionary
size
N
occurrences
Savings =
Length × (
N
–
1)
- 1
–
N
!
data
6
5
14
@
compression
13
5
42
#
communications
16
2
12
$
transmit
9
2
5
%
there are
12
2
8
&
standard
9
3
12
*
technique
10
2
6
Dictionary size
75
Compressed Characters
(with spaces) count from
Word doc
275
File Compression and Backup
Computer Principles for Programmers
Fall 2023
Page 2 of 8
Saving = Original size
less (dictionary +
compressed text)
99
99
Compression as a percent
of original
78%
22%
Percentages
should sum to 100%
100.00%
Compressed text:
!@is particularly useful in #because it enables
devices to $ or store the same amount of !in fewer bits.
%a variety of !@*s, but only a few have been
&ized. the ccitt has defined a & !@*
for $ting and a @& for !#through
modems. in addition,%file @formats, such as arc and zip.
➔
How much can you compress the
lyrics to a song using the ideas above?
You choose the song.
→
Copy the lyrics of a song to a new MS-Word document (Ctrl+N).
→
To reduce complexity, make all letters lower case:
Ctrl+A to select all text, Alt+H 7 L to make the selection lower case.
In the bottom left of the Word display, click “
###
words”.
e.g.
The Word Count dialog will pop up showing the number of characters with spaces.
(Spaces
are
characters,itishardtoreadwordswithoutthem.)
File Compression and Backup
Computer Principles for Programmers
Fall 2023
Page 3 of 8
N.B. paragraph / new line / CRLF / formatting codes are not counted by Word which is fine. Our
exercise here is concerned only with the text. (Alt+H,8 will toggle the display of whitespace characters)
The following will help with your substitution analysis: separate the words in the text so each is on its
own line, then sort the lines to see repeating patterns of individual words.
•
copy the lyrics to another new document (Ctrl-N) used only for analysis
•
Find and Replace a
space
with a
space
+ paragraph marker
^p
(Ctrl+H)
Find what:
□
Replace with:
□
^p
•
Then sort the lines to see repeating words. (Alt H S O)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
Related Questions
Describe the Performance of Binary Search.
arrow_forward
Language - Python
arrow_forward
Transcribed Image Text
Brian is also new to algorithms and you are assigned to help him in finding out how many characters comparison will Horspool's Algorithm will require in searching for pattern '000111'in a given binary text file of 1000 zeros.
arrow_forward
Answer it in matlap 2009
arrow_forward
Need code only answer fast
arrow_forward
IN PYTHON
how did read from a file and make a dictionary, assigning the data from the file i imported as keys and values of the dictionary I'm making.
arrow_forward
All of the following can be said about character data EXCEPT
OCharacter data is stored in RAM as a positive integer
OCharacters are identified by treating the value stored in the variable as the file name of the thumbnail representing
the character into the unicode table.
Oordering strings is based on the order of the individual characters.
OThe order of characters is based on the value of their index into the unicode table
arrow_forward
7. Do the insertion sort for the data below:
55, 79, 23, 40, 90, 15
arrow_forward
Both sequential and binary search have advantages and
disadvantages.
arrow_forward
Save the NBA dataframe you extracted in problem 4 as a JSON-formatted text file on your local
machine. Format the JSON so that it is organized as dictionary with three lists: columns lists
the column names, index lists the row names, and data is a list-of-lists of data points, one
list for each row. (Hint: this is possible with one line of code)
arrow_forward
What are the advantages of using a dictionary data structure?
arrow_forward
Q5: Write an M-file program to return the summation to each column of any matrix (M)
without using sum ( ) function?
Q6: Write an M-file program to return the minimum element to each column of
matrix (M) without using min ( ) function?
any
arrow_forward
show using regex, how you wouldextract certain data items and store them in a dictionary. Then create a text file and place the data from the dictionary into that file, adding proper headers to identify each column. Properly document your code to support what the code is trying to do. The data source must be a live website and not a downloaded file that is offline.
using python , pick any website for data please
arrow_forward
Python: if possible answers in one line of code
def name_sort(staff): ''' Question 2 You are given a list of Georgia Tech staff members- each with a first and last name separated by a space, and need to sort them alphabetically by last name for a directory. However, the person who provided you the list did not proofread the records and accidentally misspelled some of the names. You can identify which ones have typos just by looking at the first letter of the first name.
- First, sort the list alphabetically by the last name. - Remove any members whose first name starts with "Q" or "X". - Return the resulting list.
Args: staff (list) Returns: list
>>> name_sort(['Damon Williams', 'Xngel Cabrera', 'George Burdell', 'Brett Key', 'David Joyner', 'Qon Lowe']) ['George Burdell', 'David Joyner', 'Brett Key', 'Damon Williams']
'''
# print(name_sort(['Damon Williams', 'Xngel Cabrera', 'George Burdell', 'Brett…
arrow_forward
Consider the list of characters: [T', 'E, 'L, 'E, 'P', 'A', T, 'H, Y]. Show how this list is sorted using the following algorithms at each successive iteration (of outer loop):
1. Selection sort
2. Insertion sort
Your Answer:
Selection Sort:
L
A
H
Y
Insertion Sort:
Y
E
Tools Table
Edit View Insert Format
arrow_forward
Q9: Write an M-file program to return the maximum element to each row of any matrix
(M) without using max () function?
Qie: Write an M-file program to return the minimum element to main diagonal of any
matrix (M) without using min () function?
QMAWrite an M-file program to retum the maximum element to main diagonal of any
matrix (M) without using max () function?
arrow_forward
solution in python language
arrow_forward
python code. define the run function. Instructions are given within the quotations.
1)def invert_dict(d):
"""
Given a dictionary d, create the invert of the dictionary where
the key is a value v in d, and v is mapped to a set of keys in original
d whose values are equal to v.
Return the inverted dictionary mapping value to set of keys
"""
2)def histogram(words):
"""
Given a list of words, create a histogram dictionary where the key
is the length of a word and the value is the count of how many
words in the list are of that length.
Return the histogram dictionary mapping length to word counts
"""
3)def max_kv(d):
"""
Given a dictionary, return a tuple of key value pair where the key
is the largest in d.
"""
4)def run():
"""
Get words list from the file listed in argv[1];
create histogram of count of each word length;
create invert dictionary that maps count to list of word length;
use invert dictionary to find the list of word length with the
most popular word count (use max_kv)
"""…
arrow_forward
Sort function
false
true
is used in
MATLAB to
arrange
elements
from larger to
smaller
While loop
function is
used in case
of no number
of repetitions
is defined
To delete
data in
workspace,
we can use
clc command
Length
function is
used to know
the size of
matrix in
MATLAB
arrow_forward
Write a M-file mat lab program call it Eql.m file to find the following eqnation
(Multiply
matrix): C (m,n)3DA*B and A and B was matrix.
arrow_forward
PYTHON
Using binary search, how many checks would it take to determine if the number 400 is or is not contained in the list shown below? Make sure provide an explanation of your answer.
[100, 150, 175, 225, 235, 245, 300, 400, 1000, 1300]
arrow_forward
SEE MORE QUESTIONS
Recommended textbooks for you
Systems Architecture
Computer Science
ISBN:9781305080195
Author:Stephen D. Burd
Publisher:Cengage Learning
Related Questions
- Describe the Performance of Binary Search.arrow_forwardLanguage - Pythonarrow_forwardTranscribed Image Text Brian is also new to algorithms and you are assigned to help him in finding out how many characters comparison will Horspool's Algorithm will require in searching for pattern '000111'in a given binary text file of 1000 zeros.arrow_forward
- All of the following can be said about character data EXCEPT OCharacter data is stored in RAM as a positive integer OCharacters are identified by treating the value stored in the variable as the file name of the thumbnail representing the character into the unicode table. Oordering strings is based on the order of the individual characters. OThe order of characters is based on the value of their index into the unicode tablearrow_forward7. Do the insertion sort for the data below: 55, 79, 23, 40, 90, 15arrow_forwardBoth sequential and binary search have advantages and disadvantages.arrow_forward
- Save the NBA dataframe you extracted in problem 4 as a JSON-formatted text file on your local machine. Format the JSON so that it is organized as dictionary with three lists: columns lists the column names, index lists the row names, and data is a list-of-lists of data points, one list for each row. (Hint: this is possible with one line of code)arrow_forwardWhat are the advantages of using a dictionary data structure?arrow_forwardQ5: Write an M-file program to return the summation to each column of any matrix (M) without using sum ( ) function? Q6: Write an M-file program to return the minimum element to each column of matrix (M) without using min ( ) function? anyarrow_forward
arrow_back_ios
SEE MORE QUESTIONS
arrow_forward_ios
Recommended textbooks for you
- Systems ArchitectureComputer ScienceISBN:9781305080195Author:Stephen D. BurdPublisher:Cengage Learning
Systems Architecture
Computer Science
ISBN:9781305080195
Author:Stephen D. Burd
Publisher:Cengage Learning