Question 1 of 55
1 point(s)
A Machine Learning Specialist uses linear models, such as linear regression and logistic regression, to build a prediction model with a large number of features. During exploratory data analysis, the Specialist notices that several features are strongly correlated with one another, which may make the model unstable. What should be done to reduce the impact of having so many features?
Question 2 of 55
1 point(s)
A Machine Learning Specialist is building a full Bayesian network on a dataset that describes New York City's public transport system. Buses run in ten-minute cycles. One of the random variables is discrete and represents the wait time for a bus in New York, which averages three minutes.
Which prior probability distribution should the Specialist use for this variable?
Question 3 of 55
1 point(s)
A large company's data science team uses Amazon SageMaker notebooks to access data stored in Amazon S3 buckets. The IT security team is concerned that internet-enabled notebook instances create a security vulnerability in which malicious code running on them could compromise data privacy. The company requires that all instances stay within a secured VPC with no internet access and that data communication traffic remain within the AWS network. How should the data science team configure the notebook instance placement to meet these requirements?
Question 4 of 55
1 point(s)
A Machine Learning Specialist has developed a deep learning neural network model that performs well on training data but poorly on test data.
Which of the following approaches should the Specialist consider to remedy this? (Select three.)
Question 5 of 55
1 point(s)
A data scientist must develop a serverless ingestion and analytics solution for high-velocity, real-time streaming data. The ingestion process must buffer incoming JSON records and convert them into a query-optimized, columnar format without losing data. The output datastore must be highly available, and analysts must be able to run SQL queries against the data and connect to existing business intelligence dashboards. Which solution should the data scientist build to meet these requirements?
Question 6 of 55
1 point(s)
A major mobile network operator is building a machine learning model to predict which customers are most likely to cancel their subscriptions. Because the cost of churn is far higher than the cost of an incentive, the company plans to offer an incentive to these customers. After evaluation on a test dataset of 100 customers, the model produces the following confusion matrix: Based on the model evaluation results, why is this a viable model for production?
Question 7 of 55
1 point(s)
A Machine Learning Specialist is developing a way to increase sales for a company. The goal is to use the company's extensive data on user behavior and product preferences to predict which products users would like, based on their similarity to other users. What should the Specialist do to accomplish this goal?
Question 8 of 55
1 point(s)
A mobile network operator is building an analytics platform with Amazon Athena and Amazon S3 to evaluate and improve its business operations. The source systems send the data in CSV format in real time. The data engineering team wants to convert the data to the Apache Parquet format before storing it on Amazon S3. Which approach requires the LEAST amount of work to implement?
Question 9 of 55
1 point(s)
To combat the effects of air pollution, a city wants to monitor its air quality. A Machine Learning Specialist needs to forecast the city's air quality, in parts per million of contaminants, for the next two days. Because this is a prototype, only daily data from the previous year is available. Which model in Amazon SageMaker is MOST likely to deliver the best results?
Question 10 of 55
1 point(s)
A data engineer needs to build a model using a dataset that contains customer credit card information. How can the data engineer ensure that the data remains encrypted and the credit card information is secure?
Question 11 of 55
1 point(s)
A Machine Learning Specialist is using an Amazon SageMaker notebook instance in a private subnet of a corporate VPC. The Specialist has important data stored on the notebook instance's Amazon EBS volume and needs to take a snapshot of that volume. However, the Specialist cannot find the notebook instance's EBS volume or Amazon EC2 instance within the VPC. Why is the instance not visible to the Specialist in the VPC?
Question 12 of 55
1 point(s)
A Machine Learning Specialist is using Amazon SageMaker to build a time series forecasting model. After training the model, the Specialist plans to load test the endpoint in order to configure Auto Scaling for the model variant. Which approach will let the Specialist review the latency, memory utilization, and CPU utilization during the load test?
Question 13 of 55
1 point(s)
A manufacturing company stores structured and unstructured data in an Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to query this data. Which approach requires the LEAST effort to query this data?
Question 14 of 55
1 point(s)
A Machine Learning Specialist is building a custom video recommendation model for an application. The dataset used to train this model is very large, with millions of data points, and is stored in an Amazon S3 bucket. The Specialist does not want to load all of this data onto an Amazon SageMaker notebook instance, because doing so would take hours and would exceed the 5 GB limit of the attached Amazon EBS volume. Which approach allows the Specialist to train the model using all the data?
Question 15 of 55
1 point(s)
A Machine Learning Specialist has completed a proof of concept for a company using a small data sample and is now ready to build an end-to-end solution in AWS using Amazon SageMaker. The historical training data is stored in Amazon RDS. Which approach should the Specialist use to train a model with the data?
Question 16 of 55
1 point(s)
A Machine Learning Specialist receives customer data for an online retailer. The data includes demographics, past visits, and location information. The Specialist must develop a machine learning approach to identify customer buying behaviors, preferences, and trends in order to improve the website, provide better customer service, and make useful recommendations. Which solution should the Specialist recommend?
Question 17 of 55
1 point(s)
A Machine Learning Specialist is working with a large company to incorporate machine learning into its products. The company wants to group its customers into categories based on which customers will and will not churn within the next six months. The company has labeled the data available to the Specialist. Which type of machine learning model should the Specialist use for this task?
Question 18 of 55
1 point(s)
The graph shown is from a forecasting model used to evaluate a time series. Which inference should a machine learning specialist draw about the model’s behavior based only on the graph?
Question 19 of 55
1 point(s)
A Machine Learning Specialist at a security-conscious company is preparing a dataset for model training. The dataset is stored in Amazon S3 and contains Personally Identifiable Information (PII). The dataset must be accessible only from a VPC and must not traverse the public internet. How can these requirements be met?
Question 20 of 55
1 point(s)
A data scientist observes that the training accuracy of a neural network oscillates during mini-batch training on a classification problem. What is the MOST likely cause of this problem?
Question 21 of 55
1 point(s)
An employee found a video with audio on a company's social media profile. The language spoken in the video is Spanish. The employee's first language is English, and the employee does not understand Spanish. The employee wants to perform a sentiment analysis. Which combination of services will accomplish the task MOST effectively?
Question 22 of 55
1 point(s)
A Machine Learning Specialist is packaging a custom ResNet model into a Docker container so that Amazon SageMaker can be used for training. The Specialist is training the model on Amazon EC2 P3 instances and must configure the Docker container correctly to use the NVIDIA GPUs. What should the Specialist do?
Question 23 of 55
1 point(s)
A Machine Learning Specialist is developing a logistic regression model to predict whether or not a person will buy pizza. The Specialist wants to build the best model with the best classification threshold. Which model evaluation technique should the Specialist use to understand how different classification thresholds affect the model's performance?
Question 24 of 55
1 point(s)
An interactive online dictionary wants to add a widget that shows words used in similar contexts. A Machine Learning Specialist is asked to provide word features for the downstream nearest-neighbor model that powers the widget. What should the Specialist do to meet these requirements?
Question 25 of 55
1 point(s)
A Machine Learning Specialist is setting up Amazon SageMaker so that several data scientists can access notebooks, train models, and deploy endpoints. To ensure the best operational performance, the Specialist must be able to track how often the data scientists deploy models, the GPU and CPU utilization on the deployed SageMaker endpoints, and all errors generated when an endpoint is invoked. Which services are integrated with Amazon SageMaker to track this information? (Select two.)
Question 26 of 55
1 point(s)
A retail chain uses Amazon Kinesis Data Firehose to ingest purchasing records from its 20,000 outlets into Amazon S3. To support training an improved machine learning model, the training records will need some new, simple transformations, and some attributes will be combined. The model must be retrained every day. Given the large number of outlets and the existing data ingestion, which change will require the LEAST amount of development work?
Question 27 of 55
1 point(s)
A Machine Learning Specialist is developing a convolutional neural network (CNN) that will classify ten animal species. The Specialist has built a series of convolutional and pooling layers that an input image of an animal passes through before reaching a dense, fully connected layer with ten nodes. The Specialist wants the neural network to output a probability distribution that indicates how likely it is that the input image belongs to each of the ten classes. Which function will produce the required output?
Question 28 of 55
1 point(s)
A Machine Learning Specialist has developed a regression model, but the first iteration needs to be optimized. The Specialist needs to know whether the model tends to overestimate or underestimate the target. Which method can the Specialist use to determine whether the target value is being over- or underestimated?
Question 29 of 55
1 point(s)
A Machine Learning Specialist is starting a hyperparameter tuning job for a tree-based ensemble model using Amazon SageMaker, with Area Under the Curve (AUC) as the objective metric. The workflow will eventually be deployed in a pipeline that retrains and tunes hyperparameters each night to model click-through on data that goes stale every 24 hours. The Specialist wants to adjust the input hyperparameter range(s) to reduce the time required to train these models and, ultimately, to lower costs. Which visualization will help accomplish this?
Question 30 of 55
1 point(s)
A Machine Learning Specialist is developing a new natural language processing application that processes a dataset of 1 million sentences. The next step is to run Word2Vec to create sentence embeddings and enable various kinds of predictions. The following is a sample from the dataset: "The speedy BROWN FOX leaps over the lazy dog." What must the Specialist do to sanitize and prepare the data correctly and in a repeatable manner? (Select three.)
Question 31 of 55
1 point(s)
A company uses Amazon Polly to convert plaintext documents to speech for automated business announcements. However, business abbreviations in the current documents are mispronounced. What should a Machine Learning Specialist do to resolve this problem for future documents?
Question 32 of 55
1 point(s)
An insurance company is developing a new car accessory that uses a camera to monitor driver behavior and alert drivers when they appear distracted. The company produced about 10,000 training images in a controlled environment, which a Machine Learning Specialist will use to train and evaluate machine learning models. During model evaluation, the Specialist observes that the training error rate decreases faster as the number of epochs increases, and that the model does not infer accurately on unseen test images. Which of the following should be used to fix this problem? (Select two.)
Question 33 of 55
1 point(s)
Which common parameters must be provided when submitting Amazon SageMaker training jobs that use one of the built-in algorithms?
Question 34 of 55
1 point(s)
A gaming company has launched an online game in which users can register for a free account or pay a fee to access particular features. The company needs to build an automated system to predict whether a new user will upgrade to a premium membership within a year. The company has gathered labeled data from one million users. The training dataset consists of 999,000 negative samples (from users who did not use any premium features) and 1,000 positive samples (from users who paid within a year). Each data sample includes 200 features, among them play behavior, device, location, and user age. Using this dataset, the data science team built a random forest model, and it converged with over 99% accuracy on the training set.
Question 35 of 55
1 point(s)
A data scientist is building a machine learning model to predict future patient outcomes based on data collected about each patient and their treatment plans. The model should output a continuous value as its prediction. The dataset contains labeled outcomes for 4,000 patients. The study was conducted on a group of people over 65 who have a specific condition known to worsen with age. Initial models have performed poorly. While reviewing the underlying data, the data scientist found that 450 of the 4,000 patient observations have the patient age recorded as 0. The other features of these observations appear normal compared with the rest of the sample population.
How can the data scientist fix this problem?
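The scale of the data quality issue in this scenario can be checked with a few lines of Python. This is a minimal sketch, not part of the quiz: the `ages` list is synthetic stand-in data built from the counts in the question (450 zeros among 4,000 records), the value 70 for the remaining records is an arbitrary placeholder, and the helper name `implausible_age_share` is hypothetical.

```python
def implausible_age_share(ages, min_valid_age):
    """Return (count, share) of ages at or below the cohort's lower bound."""
    flagged = [age for age in ages if age <= min_valid_age]
    return len(flagged), len(flagged) / len(ages)


# Synthetic stand-in for the 4,000 patient records described in the question:
# 450 ages entered as 0, the rest set to an arbitrary plausible value (70).
ages = [0] * 450 + [70] * 3550

# The study cohort is "over 65", so any age of 65 or less is suspect.
count, share = implausible_age_share(ages, min_valid_age=65)
print(f"{count} of {len(ages)} records flagged ({share:.2%})")
```

Under these assumptions roughly one record in nine carries an impossible age, which is large enough that the way these rows are handled will materially affect the model.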
Question 36 of 55
1 point(s)
A monitoring service generates 1 TB of scale metrics record data every minute. A research team runs queries on this data using Amazon Athena. Because of the volume of data, the queries take a long time to run, and the team needs better performance.
How should the records be stored in Amazon S3 to improve query performance?
Question 37 of 55
1 point(s)
A gaming business has released an online game where players may sign up for a free account and play for a fee if they want to access specific features. The business must create an automated method to forecast whether a new user will upgrade to a premium account within a year. The firm has collected one million users’ labeled datasets. The training dataset comprises 1,000 positive samples (from users who paid within a year) and 999,000 negative samples (from users who did not utilize any premium features). Each data sample includes two hundred characteristics: user age, device, location, and play behaviors. The Data Science team built a random forest model on this dataset, and it converged on the training set with above 99% accuracy. On a test dataset, however, the prediction results were unsatisfactory. Which method from the list below should the data science team use to address this problem? (Select two.)
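To put the 99% training accuracy in context, the short Python sketch below computes the accuracy of a trivial model that always predicts the majority class. The class counts come from the question text; the helper name `majority_baseline_accuracy` is hypothetical and used only for illustration.

```python
def majority_baseline_accuracy(n_negative: int, n_positive: int) -> float:
    """Accuracy of a model that always predicts the larger class."""
    total = n_negative + n_positive
    return max(n_negative, n_positive) / total


# Class counts stated in the question.
negatives = 999_000
positives = 1_000

baseline = majority_baseline_accuracy(negatives, positives)
positive_share = positives / (negatives + positives)

print(f"always-negative accuracy: {baseline:.1%}")
print(f"positive class share: {positive_share:.1%}")
```

With only 0.1% positive samples, a model can exceed 99% accuracy without identifying a single upgrading user, so accuracy alone says little about performance on the class of interest.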
Question 38 of 55
1 point(s)
A team of data scientists is building a dataset repository to store large amounts of training data that is used frequently in their machine learning models. Because the data scientists may create an arbitrary number of new datasets every day, the solution must scale automatically and be cost-effective. It must also be possible to query the data with SQL.
Which storage solution is BEST suited for this situation?
Question 39 of 55
1 point(s)
A Machine Learning Specialist deployed a model that provides product recommendations on a company's website. The model worked well at first, and customers bought more products on average. Over the past several months, however, the Specialist has observed that the effect of the product recommendations has diminished and that customers are returning to their earlier buying patterns. The model has not changed since it was deployed over a year ago, and the Specialist is unsure what went wrong.
Which approach should the Specialist try in order to improve the model's performance?
Question 40 of 55
1 point(s)
A Machine Learning Specialist at an online clothing retailer wants to build a data ingestion solution for the company's data lake, which is hosted on Amazon S3. The Specialist wants a set of ingestion mechanisms that will enable future capabilities such as real-time analytics, interactive analytics of historical data, clickstream analytics, and product recommendations.
Which services should the Specialist use?
Question 41 of 55
1 point(s)
A company observed low accuracy while training with Amazon SageMaker's default built-in image classification algorithm. The data science team wants to use an Inception neural network architecture instead of a ResNet architecture. Which of the following options will accomplish this? (Select two.)
Question 42 of 55
1 point(s)
A Machine Learning Specialist built a deep learning model for image classification. However, the Specialist ran into an overfitting problem in which the training and testing accuracies were 99% and 75%, respectively.
What is the cause of this issue, and how should the Specialist address it?
Question 43 of 55
1 point(s)
A machine learning team uses Amazon SageMaker to train an Apache MXNet handwritten digit classifier model on a research dataset. The team wants to be notified when the model is overfitting. Auditors want to see the Amazon SageMaker log activity report to ensure that no unauthorized API calls have been made. What should the machine learning team do to meet these requirements in the shortest possible time and with the least amount of code?
Question 44 of 55
1 point(s)
What is the key benefit of Amazon SageMaker in machine learning?
Question 45 of 55
1 point(s)
Which AWS service is specifically designed for building, training, and deploying machine learning models in the cloud?
Question 46 of 55
1 point(s)
A large company's data science team uses Amazon SageMaker notebooks to access data stored in Amazon S3 buckets. The IT security team is concerned that internet-enabled notebook instances create a security vulnerability in which malicious code running on them could compromise data privacy. The company requires that all instances stay within a secured VPC with no internet access and that data communication traffic remain within the AWS network. How should the data science team configure the notebook instance placement to meet these requirements?
Question 47 of 55
1 point(s)
A Machine Learning Specialist has developed a deep learning neural network model that performs well on training data but poorly on test data. Which of the following approaches should the Specialist consider to remedy this? (Select three.)
Question 48 of 55
1 point(s)
A large, multi-column dataset belonging to an online retailer is missing 30% of the data in one of its columns. A Machine Learning Specialist believes that some of the dataset's other columns could be used to reconstruct the missing data. Which reconstruction approach should the Specialist use to preserve the integrity of the dataset?
Question 49 of 55
1 point(s)
A company is setting up an Amazon SageMaker environment. The company's data security policy prohibits communication over the internet. How can the company enable the Amazon SageMaker service without allowing direct internet access to Amazon SageMaker notebook instances?
Question 50 of 55
1 point(s)
A Machine Learning Specialist is training a model to identify the make and model of vehicles in images. The Specialist wants to use transfer learning with an existing model that was trained on images of everyday objects. The Specialist has compiled a large custom dataset of images containing various vehicle makes and models. What should the Specialist do to initialize the model so that it can be retrained with the custom data?
Question 51 of 55
1 point(s)
A manufacturing company has a large collection of labeled historical sales data. The company wants to predict how many units of a specific item should be produced every three months. Which machine learning approach should be used to solve this problem?
Question 52 of 55
1 point(s)
A company's Machine Learning Specialist needs to speed up the training of a time-series forecasting model built with TensorFlow. Training currently runs on a single GPU and takes about 23 hours to complete. The training must be run every day. The model's accuracy is acceptable, but the company expects the size of the training data to keep growing and will need to update the model hourly rather than daily. The company also wants to minimize code and infrastructure changes. What should the Specialist change so that the training solution can scale with future demand?
Question 53 of 55
1 point(s)
Which metrics should a Machine Learning Specialist generally use to compare and evaluate different machine learning classification models?
Question 54 of 55
1 point(s)
A company offers a machine learning prediction service that generates 100 TB of predictions every day. A Machine Learning Specialist must create a visualization of the daily precision-recall curve from the predictions and send a read-only version of it to the business team. Which option requires the LEAST amount of coding?
Question 55 of 55
1 point(s)
A Machine Learning Specialist is preparing data for training on Amazon SageMaker. The Specialist is using one of SageMaker's built-in algorithms for the training. The dataset is converted from its original CSV format into a numpy.array, which appears to be slowing down the training process. What should the Specialist do to prepare the data for SageMaker training?