Prepare Data For Exploration Weekly Challenge 3 (Part 4)
School: Kaplan University
Course: IN224
Subject: Computer Science
Date: Jan 9, 2024
Type: docx
Pages: 2
Uploaded by LieutenantTroutPerson859
16. Think about data as driving a taxi cab. In this metaphor, which of the following are examples of metadata? Select all that apply.
Passengers the taxi picks up
Make and model of the taxi cab (Correct)
License plate number (Correct)
Company that owns the taxi (Correct)
17. What are some key benefits of using external data? Select all that apply.
External data is free to use.
External data is always reliable.
External data can provide industry-level perspectives. (Correct)
External data has broad reach. (Correct)
18. You are working with a database table that contains customer data. The city column lists the city where each customer is located. You want to find out which customers are located in Berlin.
You write the SQL query below. Add a WHERE clause that will return only customers located in Berlin.
SELECT
*
FROM
customer
How many customers are located in Berlin?
9
12
2 (Correct)
7
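As a minimal sketch of the WHERE clause that question 18 asks for, the Python snippet below runs the completed query against an in-memory SQLite database. Only the table name `customer` and the column `city` come from the question. The other columns and every row are hypothetical stand-ins for the course dataset, which is not reproduced here, so the count this sketch produces says nothing about the quiz answer.

```python
import sqlite3

# Hypothetical sample rows; the real course dataset is not shown in this document.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [
        ("Ana", "Berlin"),
        ("Ben", "Paris"),
        ("Cy", "Berlin"),
        ("Di", "London"),
        ("Eli", "Berlin"),
    ],
)

# The completed query: the WHERE clause keeps only rows whose city is Berlin.
berlin_rows = conn.execute(
    "SELECT * FROM customer WHERE city = 'Berlin'"
).fetchall()

# Counting the returned rows answers "how many customers are located in Berlin?"
berlin_count = len(berlin_rows)
print(berlin_count)
```

Against the real table, the same `WHERE city = 'Berlin'` clause would be appended to the query given in the question, and the number of rows returned is the answer.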
19. A data analyst reviews a national database of movie theater showings. They want to find the first movies shown in San Francisco in 2001. How can they organize the data to return the first 10 movies shown at the top of their list? Select all that apply.
Sort by date in descending order
Sort by date in ascending order (Correct)
Filter out showings outside of San Francisco (Correct)
Filter out showings not in 2001 (Correct)
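The three correct choices in question 19 can be sketched together as one query: filter to San Francisco, filter to 2001, then sort by date ascending so the earliest showings come first. The table name `showings`, its columns, and all rows below are hypothetical. Dates are assumed to be stored as ISO `YYYY-MM-DD` text so that text order matches chronological order in SQLite.

```python
import sqlite3

# Hypothetical table and rows, used only to illustrate filter-then-sort.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE showings (title TEXT, city TEXT, show_date TEXT)")
conn.executemany(
    "INSERT INTO showings VALUES (?, ?, ?)",
    [
        ("Movie A", "San Francisco", "2001-03-15"),
        ("Movie B", "San Francisco", "2001-01-05"),
        ("Movie C", "Oakland", "2001-01-02"),        # wrong city, filtered out
        ("Movie D", "San Francisco", "2000-12-31"),  # wrong year, filtered out
        ("Movie E", "San Francisco", "2001-02-10"),
    ],
)

first_showings = conn.execute(
    """
    SELECT title, show_date
    FROM showings
    WHERE city = 'San Francisco'
      AND show_date BETWEEN '2001-01-01' AND '2001-12-31'
    ORDER BY show_date ASC   -- earliest showings first
    LIMIT 10                 -- keep at most the first 10
    """
).fetchall()

for title, show_date in first_showings:
    print(show_date, title)
```

Sorting in descending order instead would put the latest 2001 showings at the top, which is why that option is not marked correct.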
20. A nonprofit maintains a list of how many laptops they provide to each school in the county. In the table, there is a column called number_of_laptops. A data analyst wants to determine which schools were given the fewest laptops. How should they sort the data to return these schools first?
Sort numerically in descending order
Sort alphabetically in ascending order
Sort numerically in ascending order (Correct)
Sort alphabetically in descending order
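A small sketch of the sort in question 20, assuming a hypothetical table `school_laptops` with a `school` column alongside the `number_of_laptops` column named in the question. All rows are invented for illustration. The last lines contrast a numeric sort with an alphabetical sort of the same counts.

```python
import sqlite3

# Hypothetical table and rows; only number_of_laptops comes from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE school_laptops (school TEXT, number_of_laptops INTEGER)"
)
conn.executemany(
    "INSERT INTO school_laptops VALUES (?, ?)",
    [("Northside", 40), ("Eastview", 9), ("Westfield", 120)],
)

# Numeric ascending sort: schools with the fewest laptops come first.
fewest_first = conn.execute(
    "SELECT school, number_of_laptops "
    "FROM school_laptops "
    "ORDER BY number_of_laptops ASC"
).fetchall()

# Contrast: sorting the same counts as text compares character by character,
# so "120" lands before "40" and "9".
as_text = sorted(str(count) for _, count in fewest_first)

print(fewest_first)
print(as_text)
```

The text-sorted list shows why the alphabetical options do not answer the question: a count column has to be sorted numerically for "fewest" to mean anything.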