Question 1 of 55
1 point(s)
You work as a machine learning specialist for a mobile phone operator, where you need to build a machine learning model that predicts when a given customer is about to leave your phone service, or churn. The inference data produced by your model will allow your marketing department to offer customers incentives to get them to stay with your service. Using data generated by customer activity with your service offering, you need to visualize the inference data in a dashboard so your marketing department can quickly decide which customer churn candidates to offer additional incentives. How can you get your machine learning inference data into your dashboard visualization most efficiently and effectively?
Question 2 of 55
1 point(s)
You work as a machine learning specialist for a social media software company. Your company produces social media apps such as interactive games and photo-sharing communities. Your machine learning team has created a machine learning model that produces recommendations via advertising in your apps, such as showing advertising for skiing trips to a user who follows a ski resort in the photo-sharing app. The production variant that your team has deployed experiences wild swings in traffic volume over any given day. Also, since the app is relatively new to the mobile community, it receives no traffic for some periods. You have set up SageMaker’s automatic scaling policy for your production variant instances. However, you have noticed that scaling-in does not happen when your traffic reduces to nothing for some time. Why might this happen?
Question 3 of 55
1 point(s)
You work as a machine learning specialist for a book publishing firm. Your firm is releasing a new publication and would like to use a machine learning model to structure a marketing campaign for the new publication to decide whether to market to each of its registered customers or not. You and your machine learning team have developed a model using the XGBoost SageMaker built-in algorithm. You are now at the hyperparameter optimization stage, trying to find the best version of your model by running several training jobs on your data using your XGBoost algorithm. How do you configure your hyperparameter tuning jobs to get a recommendation for the best values for your hyperparameters?
Set the eta, alpha, and min_child_weight to specific values and the max_depth to a range of values. Minimize the area under the curve (AUC) as your optimization metric.
Set ranges of values for the eta, alpha, min_child_weight, and max_depth hyperparameters. Maximize the area under the curve (AUC) as your optimization metric.
Set ranges of values for the eta, alpha, min_child_weight, and max_depth hyperparameters. Minimize the normalized discounted cumulative gain (ndcg) as your optimization metric.
Set ranges of values for the eta, alpha, min_child_weight, and max_depth hyperparameters. Launch one training job. Maximize the area under the curve (AUC) as your optimization metric.
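For orientation, the configuration the options describe — ranges for eta, alpha, min_child_weight, and max_depth, with AUC as the objective — can be sketched as a plain-dict version of a SageMaker hyperparameter tuning job configuration. The range bounds, strategy, and resource limits below are illustrative assumptions, not values from the question:

```python
# Sketch of a SageMaker hyperparameter tuning configuration for XGBoost.
# The range bounds and resource limits are illustrative placeholders;
# only the overall structure matters here.
tuning_job_config = {
    "ParameterRanges": {
        "ContinuousParameterRanges": [
            {"Name": "eta", "MinValue": "0.1", "MaxValue": "0.5"},
            {"Name": "alpha", "MinValue": "0", "MaxValue": "2"},
            {"Name": "min_child_weight", "MinValue": "1", "MaxValue": "10"},
        ],
        "IntegerParameterRanges": [
            {"Name": "max_depth", "MinValue": "1", "MaxValue": "10"},
        ],
    },
    "Strategy": "Bayesian",
    "ResourceLimits": {"MaxNumberOfTrainingJobs": 20, "MaxParallelTrainingJobs": 3},
    "HyperParameterTuningJobObjective": {
        "Type": "Maximize",            # AUC is better when larger, so maximize
        "MetricName": "validation:auc",
    },
}

# Collect every hyperparameter name that the tuner will search over.
tuned_names = [
    r["Name"]
    for ranges in tuning_job_config["ParameterRanges"].values()
    for r in ranges
]
```

Note the key distinction between the options: tuned hyperparameters get ranges (not fixed values), and AUC is maximized, not minimized.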
Question 4 of 55
1 point(s)
You are a machine learning specialist for a national government agency’s infectious disease testing department. Your machine learning team is responsible for creating a machine learning model that analyzes the daily test datasets for your country and produces daily predictions of disease contraction and death rate trends. National and international news agencies use these projections to report on the daily projections of infectious disease progression. Since your model works on massive datasets daily, which of the following statements accurately describes your inference processing?
Question 5 of 55
1 point(s)
You work as a machine learning specialist for a polling organization using US census data to predict whether a given polling respondent earns more than $75,000. Your company will then sell the polling prediction data to candidates nationwide for various political office positions. You need to clean the polling data on which you wish to train your binary classification model. Specifically, you need to remove duplicate rows with erroneous data, transform the income column into a label column with two values, transform the age column into a categorical feature by binning the column, scale the capital gain and capital loss columns, and finally split the data into train and test datasets. Which options are the most efficient ways to achieve your data sanitizing and feature preparation? (Select TWO)
Question 6 of 55
1 point(s)
You work as a machine learning specialist for a start-up software company that builds a mobile app that subscribers can use to identify various birds from pictures they take with their phone camera. You have a large set of unlabeled images of birds that you want to use as your training data for your image recognition application. Which option is the most efficient approach to creating a labeling job to build the training dataset for your mobile app?
Question 7 of 55
1 point(s)
You work as a machine learning specialist for a home automation company that produces home automation devices such as automated door locks, security cameras, and alarm systems. Your machine learning team is building a new data repository for a device your company will launch as a new product soon. The new device will generate significant streams of IoT data using the MQTT protocol. You need to create a data repository for use in the machine learning models that will produce future device usage predictions. Your management team will use these device usage predictions for marketing campaigns. Which option is the most cost-effective configuration of AWS services to build your data repository? (Select TWO)
Question 8 of 55
1 point(s)
You work as a machine learning specialist for a software company that offers real-time interactive sports viewing apps for mobile phones and tablets. You gather real-time streaming sports statistics and game action data and use the streaming data to produce real-time analytics and active predictions of the likely outcome of the game. You must use several machine learning models with real-time streaming data as their training and inference data sources to produce your prediction. Since the real-time streaming game data is delivered from several different sources, the format and schema of the data need transformation and sanitation. Which option best performs the feature engineering of your real-time streaming data for your training and inference requests?
Question 9 of 55
1 point(s)
You work as a machine learning specialist for a medical imaging company. You and your machine learning team have been assigned to build a model that predicts whether a breast mass image indicates a benign or malignant tumor. Your model will help physicians quickly decide how to treat their patients using a verified diagnosis. Which option gives the appropriate machine learning services and features to train your model for your image diagnosis problem?
Specify the SageMaker role used to give learning and hosting access to your data by using the role = sagemaker.get_role() statement in your jupyter notebook. Load your data into a pandas dataframe. Split your data into 80% training, 10% validation, and 10% testing. Set the predictor_type hyperparameter to binary_classifier. Then, run your training job using the sagemaker.create_training_job statement in your jupyter notebook.
Specify the SageMaker role used to give learning and hosting access to your data by using the role = sagemaker.get_execution_role() statement in your jupyter notebook. Load your data into a pandas dataframe. Split your data into 80% training, 10% validation, and 10% testing. Set the predictor_type hyperparameter to binary_classifier. Then, run your training job using the sagemaker.create_training_job statement in your jupyter notebook.
Specify the SageMaker role used to give learning and hosting access to your data by using the role = sagemaker.get_execution_role() statement in your jupyter notebook. Load your data into a pandas dataframe. Split your data into 80% training, 10% validation, and 10% testing. Set the predictor_type hyperparameter to regressor. Then, run your training job using the sagemaker.create_training_job statement in your jupyter notebook.
Specify the SageMaker role used to give learning and hosting access to your data by using the role = sagemaker.get_execution_role() statement in your jupyter notebook. Load your data into a pandas dataframe. Split your data into 80% training, 10% validation, and 10% testing. Set the predictor_type hyperparameter to multiclass_classifier. Then, run your training job using the sagemaker.create_training_job statement in your jupyter notebook.
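The 80% / 10% / 10% split that every option shares can be sketched in plain Python, with no SageMaker dependency. The row count and seed below are illustrative:

```python
import random

def split_dataset(rows, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle rows, then split into train/validation/test partitions."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, validation, test

# Illustrative: 1000 rows split 800 / 100 / 100.
train, validation, test = split_dataset(list(range(1000)))
```

In practice the same split is often done on a pandas dataframe before uploading each partition to S3 as a training channel.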
Question 10 of 55
1 point(s)
You work as a machine learning specialist for a video game software company. You have been asked to produce a machine learning model that predicts whether a newly released game will eventually become a successful product that profits the company. The data used for your model is product information and product ratings from social media. Your management team would like to use your model results to help them decide if a new game is worth investing in marketing dollars to promote it further. Which model and objective will best match your model requirements?
Question 11 of 55
1 point(s)
You work as a machine learning specialist for a financial services organization. Your machine learning team is responsible for building models that predict index fund tracking errors for the various funds managed by your mutual fund portfolio management department. You must ingest data into your data lake for use in your machine learning models. The required securities pricing data comes from varying sources that deliver the data you need to use in your model inferences in near real time. You need to perform data transformation, such as compression, of the data before writing it to your S3 data lake. Which option is the most efficient for ingesting the data into your data lake?
Question 12 of 55
1 point(s)
You work as a machine learning specialist for a data mining department of a large bank. Your department is responsible for leveraging the bank’s huge data lake to gain insights and make predictions for your marketing and risk departments. Your team’s latest project, an XGBoost prediction model, is ready for production deployment. However, you want to run additional batch predictions using a batch inference job to ensure your model can handle the production prediction workload. In your SageMaker notebook, how do you extend your estimator to read input data in batch from a specified S3 bucket and make predictions?
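For reference, extending a trained estimator to read batch input from S3 and write predictions back maps onto a SageMaker batch transform job. A plain-dict sketch of the request shape follows; the job name, model name, bucket paths, and instance type are all placeholders, not values from the question:

```python
# Shape of a SageMaker batch transform (CreateTransformJob-style) request,
# built as a plain dict. All names, paths, and the instance type below are
# illustrative placeholders.
transform_job = {
    "TransformJobName": "xgboost-batch-predictions",   # placeholder
    "ModelName": "xgboost-risk-model",                 # placeholder
    "TransformInput": {
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://example-input-bucket/batch-data/",
            }
        },
        "ContentType": "text/csv",
    },
    "TransformOutput": {
        "S3OutputPath": "s3://example-output-bucket/predictions/",
    },
    "TransformResources": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
    },
}

input_uri = transform_job["TransformInput"]["DataSource"]["S3DataSource"]["S3Uri"]
output_uri = transform_job["TransformOutput"]["S3OutputPath"]
```

In the SageMaker Python SDK, the equivalent extension point on a fitted estimator is its transformer() method, which produces an object whose transform() call launches this kind of job.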
Question 13 of 55
1 point(s)
You work as a machine learning specialist for the sales department of a large web retailer that needs to gain insight into their sales patterns. They need a way to use visualization to show their sales data in near-real time to recognize higher-than-expected sales of specific products quickly. This will help your product operations quickly meet high demands. Which option is a viable, efficient solution to your problem?
Question 14 of 55
1 point(s)
You work as a machine learning specialist for a security company that uses video feeds to identify criminal activity in a client’s retail environment. You are building a convolutional neural network model for your video classification using TensorFlow. The model is expected to classify video scenes as criminal (such as theft) or benign. You have thousands of hours of video on which to train your model. Therefore, you plan to leverage hyperparameter tuning to run multiple training jobs using different hyperparameter combinations. The goal is to find the model with the best training result. You are writing your hyperparameter tuning job in your SageMaker jupyter notebook. When you create your HyperparameterTuner object in your Python code, which parameters do you pass in the method? (Select TWO)
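As a sketch of the constructor surface: in the SageMaker Python SDK, a HyperparameterTuner is built around an estimator, an objective metric name, and hyperparameter ranges (with metric definitions needed for framework estimators like TensorFlow, whose metrics are scraped from logs). The helper below is pure Python and checks only that bookkeeping; the metric name, regex, and range values are illustrative assumptions:

```python
# The arguments a HyperparameterTuner minimally needs, per the SageMaker
# Python SDK. This is a pure-Python bookkeeping sketch, not the SDK itself.
REQUIRED_TUNER_ARGS = {"estimator", "objective_metric_name", "hyperparameter_ranges"}

def missing_tuner_args(kwargs):
    """Return the required HyperparameterTuner arguments absent from kwargs."""
    return sorted(REQUIRED_TUNER_ARGS - kwargs.keys())

# Illustrative argument set for a TensorFlow estimator; every value here
# is a placeholder standing in for real SDK objects.
candidate = {
    "estimator": "tensorflow_estimator_placeholder",
    "objective_metric_name": "validation:accuracy",      # illustrative
    "hyperparameter_ranges": {"learning_rate": "(1e-5, 1e-1)"},
    "metric_definitions": [
        {"Name": "validation:accuracy", "Regex": "accuracy = ([0-9.]+)"}
    ],
    "max_jobs": 10,
    "max_parallel_jobs": 2,
}

missing = missing_tuner_args(candidate)
```

A call missing the ranges or the objective name would fail this check, which mirrors why those two parameters are the ones the question is probing.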
Question 15 of 55
1 point(s)
You work as a machine learning specialist for a social media software company that produces games for mobile devices. Your company has a new game that they believe will generate a large following very quickly. You need to build a model to predict whether users will purchase additional game features via in-app purchases. You have a large dataset to use for your training, and you need to find the best hyperparameters by using a hyperparameter tuning job. You have configured the training jobs that the hyperparameter tuning job will run by defining an estimator and objective. You want to run your training jobs in a highly parallel manner to complete your hyperparameter tuning quickly. Also, you know that the order of magnitude is more important than the absolute value for your hyperparameter values. For example, a change from 1 to 2 is expected to have a much more significant impact than a change from 100 to 101. Which scaling and search types combination should you use for your hyperparameter tuning job?
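The order-of-magnitude intuition behind logarithmic scaling can be sketched in plain Python: sampling uniformly in log space spreads candidate values evenly across decades instead of crowding them near the top of the range. The range bounds and sample count below are illustrative:

```python
import math
import random

def sample_log_uniform(low, high, n, seed=0):
    """Draw n values uniformly in log space between low and high."""
    rng = random.Random(seed)
    log_low, log_high = math.log(low), math.log(high)
    return [math.exp(rng.uniform(log_low, log_high)) for _ in range(n)]

# The range [1e-4, 1.0] spans four decades; log-uniform sampling puts
# roughly a quarter of the draws in each decade, so about half fall
# below 1e-2 (the midpoint of the log range).
samples = sample_log_uniform(1e-4, 1.0, 1000)
below_one_percent = sum(1 for s in samples if s < 1e-2)
```

A linear-uniform draw over the same range would instead place almost all samples above 1e-2, which is why log scaling suits hyperparameters where magnitude matters more than absolute value.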
Question 16 of 55
1 point(s)
You work as a machine learning specialist for a small startup software company. You are the only machine learning specialist in the company. The company’s founder needs you to quickly build a machine learning model to test one of the team’s minimum viable products with the intent of persevering or pivoting depending on the outcome of your model experiment. You have decided to use SageMaker Autopilot to create your experiment. You are creating your experiment in SageMaker Autopilot. You have selected the S3 bucket, data source, and target feature for which to make predictions. You can now select the machine learning problem type and objective metric. Which are viable combinations for your selections?
Question 17 of 55
1 point(s)
You work as a machine learning specialist for a social media software company. Your company produces social media game apps. Your machine learning team has been asked to produce a machine learning model to predict user purchase of apps similar to apps they have already purchased. You have created a model based on the SageMaker built-in XGBoost algorithm. You are now using hyperparameter tuning to get the best-performing model for your problem. Which evaluation metrics and corresponding optimization direction should you choose for your automatic model tuning (a.k.a. hyperparameter tuning)? (Select TWO)
Question 18 of 55
1 point(s)
You work as a machine learning specialist for an auto manufacturer. You are on a machine learning team responsible for analyzing the efficiency of potential electric car drivetrains. These drivetrains have explicit energy storage requirements (regenerative braking) to help with efficiency when driving in cities. Your team is in the feature engineering phase of your model development. You need to produce visualizations to understand which features are useful and which can be improved using dimensionality reduction. You have several data sources that you would like to visualize in your QuickSight environment. Which of your data sources cannot be directly used as data sources in QuickSight?
Question 19 of 55
1 point(s)
You are a machine learning specialist for a research data streaming service serving subscribers’ research reference content. Your company’s subscriber base is primarily made up of university research staff. However, your company occasionally produces research content with broader appeal, and your service gets big spikes in requests for streaming traffic. Your machine learning team is a critical component in the content delivery process. You have a recommendation engine model variant that processes inference requests for every content streaming request. When your model variant receives these spikes in inference requests, your company’s streaming service suffers poor performance. You have decided to use SageMaker autoscaling to meet the varying demand for your model variant inference requests. Which type of scaling policy should you use in your SageMaker autoscaling implementation?
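For context, autoscaling for a SageMaker production variant is typically expressed through Application Auto Scaling; a plain-dict sketch of a target-tracking policy body follows. The policy name, resource id, and target value are placeholders:

```python
# Shape of an Application Auto Scaling target-tracking policy for a
# SageMaker production variant. Names, resource id, and target value
# below are illustrative placeholders.
scaling_policy = {
    "PolicyName": "variant-invocations-target-tracking",
    "ServiceNamespace": "sagemaker",
    "ResourceId": "endpoint/my-endpoint/variant/my-variant",   # placeholder
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "TargetValue": 1000.0,   # illustrative invocations-per-instance target
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
}

metric = scaling_policy["TargetTrackingScalingPolicyConfiguration"][
    "PredefinedMetricSpecification"]["PredefinedMetricType"]
```

With target tracking, the service adds or removes variant instances to keep the chosen metric (here, invocations per instance) near the target value as request spikes arrive and subside.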
Question 20 of 55
1 point(s)
You work as a machine learning specialist for a real estate company. Your company wishes to have you develop a model that predicts if a given property is in a “high value” neighborhood (properties with a median household value at or above $180,000). Your real estate agents will use this model to prioritize their sales work based on potential commissions for any given property in their list of potential sales leads. Which option is the best approach to solve this problem?
Use SageMaker Linear Learner to optimize for a continuous objective, such as mean square error, cross-entropy loss, or absolute error, to predict the median household value for each district.
Use SageMaker Linear Learner optimizing for a discrete objective suited for classification, such as F1, precision, recall, or accuracy, to predict whether or not a district is "high value".
Use SageMaker Linear Learner to optimize for a continuous objective, such as mean square error, cross-entropy loss, or absolute error, to predict whether or not a district is "high value."
Use SageMaker Linear Learner to optimize for a continuous objective, such as F1, precision, recall, or accuracy, to predict the median household value for each district.
Question 21 of 55
1 point(s)
You work as a machine learning specialist for a financial services firm specializing in risk analysis for other financial services firms. Your machine learning team has been tasked with building a model that categorizes a firm’s foreign exchange risk for each portfolio. You have begun building your model using SageMaker Studio, and you are at the point in your data exploration where you need to know the importance of each feature in your training dataset. Which option gives you the most efficient view of this feature comparison?
Question 22 of 55
1 point(s)
You work as a machine learning specialist for an alternative transportation ride-share company. Your company has scooters, electric longboards, and other electric personal transportation devices in several major cities across the US. Your machine learning team has been asked to produce a model that classifies device preference by trip duration for each available personal transportation device you offer in each city. You have created a model based on the SageMaker built-in K-Means algorithm. You are now using hyperparameter tuning to get the best-performing model for your problem. Which evaluation metrics and corresponding optimization direction should you choose for your automatic model tuning (a.k.a. hyperparameter tuning)? (Select TWO)
Question 23 of 55
1 point(s)
You work as a machine learning specialist for a cruise ship company. Due to new health restrictions, your company must only book its cruise ships at 50% capacity across all their offerings. To maximize profitability, you have been asked to create a model that gathers streaming data from various data sources such as weather services, census data, gross national product for various countries, spending habits across various countries, etc. You will use this data to build a model that uses data clusters to predict cruise allocation. You must perform feature engineering, such as feature transformations, on your streaming data and then load it into your company’s MongoDB database. What is the most efficient solution for your scenario?
Question 24 of 55
1 point(s)
You work as a machine learning specialist for a security firm that requires you to encrypt all your machine learning infrastructure in transit and at rest. Your team is building a fraud detection algorithm using the Random Cut Forest SageMaker built-in algorithm. You and your teammates use SageMaker notebook instances to build your model components. You need to customize the operating system of your notebook instances by installing custom libraries and setting specific operating system-level configurations to meet your firm’s security requirements. Your Chief Financial Officer wants to keep the cost of running your SageMaker instances as low as possible. Therefore, you must manage the runtime of your SageMaker notebook instances, only having them running when they are actively in use. How can you meet your requirements most efficiently?
Question 25 of 55
1 point(s)
You work as a machine learning specialist for an online flight booking service that finds the lowest-cost flights based on user input, such as flight dates, origin, destination, number of layovers, and other factors. Your machine learning team gathers data from many sources, including airline flight databases, credit agencies, etc., to use in your model. You must transform this data for your model training and in real-time for your inference requests. What is the most efficient way to build these transformations into your model workflow?
Question 26 of 55
1 point(s)
You work as a machine learning specialist for a computer hardware component producer. Your company produces individual components, such as processor chips, GPUs, etc., and assembled computer peripherals, such as monitors and external disk drives. Your team has been tasked with building a machine learning model that predicts future product sales to improve supply chain management based on data from your semiconductor, transistor, and other base component suppliers and data from your sales department. After training your model, you need to evaluate it to determine whether its performance and accuracy will allow you to use it to predict future product sales accurately. You have decided to perform an offline model evaluation of your model using your historical data. You have split your validation dataset into ten parts. You then execute ten training runs, which produce ten models. You then aggregate the ten models to get your final evaluation. Which model evaluation method are you using?
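The procedure the question describes — ten equal partitions, ten training runs each holding one partition out, results aggregated — is k-fold cross-validation with k = 10. The fold bookkeeping can be sketched in plain Python (row count is illustrative):

```python
def k_fold_splits(rows, k):
    """Yield (train, validation) pairs; each fold is held out exactly once."""
    fold_size = len(rows) // k
    for i in range(k):
        start, stop = i * fold_size, (i + 1) * fold_size
        validation = rows[start:stop]            # the held-out fold
        train = rows[:start] + rows[stop:]       # everything else
        yield train, validation

# Illustrative: 100 rows, 10 folds of 10 rows each.
data = list(range(100))
splits = list(k_fold_splits(data, k=10))
```

Each of the k runs trains on k-1 folds and validates on the remaining one, so every row is used for validation exactly once before the k scores are aggregated.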
Question 27 of 55
1 point(s)
You work as a machine learning specialist for a research department at a large university. Your team of machine learning specialists is responsible for all aspects of the machine learning lifecycle, including creating the data repositories used by your research scientists for their data science work. Your team has built a SageMaker infrastructure for your data scientists. You stream data from many sources, such as satellite feeds, IoT devices like underwater sensors, etc. You have recently implemented SageMaker Feature Store and are now implementing batch data ingestion from your streaming data sources. Which options are viable approaches to stream data into your SageMaker Feature Store? (Select TWO)
Question 28 of 55
1 point(s)
You work as a machine learning specialist for a sports gambling website. Your machine learning team has been asked to create a football score prediction model that predicts the winner of a match, the score difference, and the shots-on-goal differential. You have collected historical football match data, and you have selected the SageMaker XGBoost built-in algorithm to use for your model. You are now set to train your model using a SageMaker training job. Which of the following are NOT used by your SageMaker training job? (Select TWO)
Question 29 of 55
1 point(s)
You work as a machine learning specialist for a mobile phone carrier network. Your team of machine learning specialists needs to produce a model that clusters network users by payment plan and real-time geographic region. Your management team wants to use the model to consider marketing offerings targeted to a customer’s billing plan and geographic region where they spend the most time. You have created a k-means model and must test model variations on real-time customer data. Which option is best for testing model variations described by your use case?
Question 30 of 55
1 point(s)
You work as a machine learning specialist for a city government agency in their urban housing department. You have been assigned to use a machine learning model to find the best housing location to place new public housing applicants. You have been asked to propose housing sites for new applicants based on the similarity of the applicant (such as the applicant’s work location, number of people in the family group, applicant’s income range, etc.) to the other housing residents in the city. You have decided to use the SageMaker k-nearest neighbors built-in algorithm. You have produced a model variant and deployed it to an HTTPS endpoint. Based on your initial evaluation results, you would like to change the SageMaker endpoint by updating the ML compute instances of the existing variant to make them more powerful and add a new model variant. What is the most productive way to implement these changes?
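Updating a live endpoint — a more powerful instance type for the existing variant plus a newly added variant — is done by creating a new endpoint configuration and pointing UpdateEndpoint at it. A plain-dict sketch follows; every name, instance type, count, and weight is a placeholder:

```python
# Shape of a new SageMaker endpoint configuration carrying both the
# upgraded existing variant and a newly added variant. All names,
# instance types, counts, and weights are illustrative placeholders.
new_endpoint_config = {
    "EndpointConfigName": "knn-config-v2",
    "ProductionVariants": [
        {
            "VariantName": "existing-variant",
            "ModelName": "knn-model-a",
            "InstanceType": "ml.c5.2xlarge",   # more powerful than before
            "InitialInstanceCount": 2,
            "InitialVariantWeight": 0.8,
        },
        {
            "VariantName": "new-variant",
            "ModelName": "knn-model-b",
            "InstanceType": "ml.c5.xlarge",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.2,
        },
    ],
}

# UpdateEndpoint then swaps the running endpoint over to the new config
# without taking the endpoint offline.
update_request = {
    "EndpointName": "knn-endpoint",            # placeholder
    "EndpointConfigName": new_endpoint_config["EndpointConfigName"],
}

variant_names = [v["VariantName"] for v in new_endpoint_config["ProductionVariants"]]
```

The key point the question targets: endpoint configurations are immutable, so changes to instances or variants always go through a new configuration plus an endpoint update, not an in-place edit.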
Question 31 of 55
1 point(s)
You work as a machine learning specialist for a mobile phone operator, where you need to build a machine learning model that predicts when a given customer is about to leave your phone service or churn. The inference data produced by your model will allow your marketing department to offer customers incentives to get them to stay with your service. Using data generated by customer activity with your service offering, you need to visualize the inference data in a dashboard. So, your marketing department can quickly decide which customer churn candidates to offer additional incentives. How can you get your machine learning inference data into your dashboard visualization most efficiently and efficiently?
Question 32 of 55
1 point(s)
You work as a machine learning specialist for a social media software company. Your company produces social media apps such as interactive games and photo-sharing communities. Your machine learning team has created a machine learning model that produces recommendations via advertising in your apps, such as showing advertising for skiing trips to a user who follows a ski resort in the photo-sharing app. The production variant that your team has deployed experiences very wild swings in traffic volume over any given day. Also, since the app is relatively new to the mobile community, it receives no traffic for some periods. You have set up SageMaker’s automatic scaling policy for your production variant instances. However, you have noticed that scaling-in does not happen when your traffic reduces to nothing for some time. Why might this happen?
Question 33 of 55
1 point(s)
You work as a machine learning specialist for a book publishing firm. Your firm is releasing a new publication and would like to use a machine learning model to structure a marketing campaign for the new publication to decide whether to market to each of its registered customers or not. You and your machine learning team have developed a model using the XGBoost SageMaker built-in algorithm. You are now at the hyperparameter optimization stage, trying to find the best version of your model by running several training jobs on your data using your XGBoost algorithm. How do you configure your hyperparameter tuning jobs to get a recommendation for the best values for your hyperparameters?
Set the eta, alpha, and min_child_weight to specific values and the max_depth to a range of values. Minimize the area under the curve (AUC) as your optimization metric.
Set ranges of values for the beta, alpha, min_child_weight, and max_depth hyperparameters. Maximize the area under the curve (AUC) as your optimization metric.
Set ranges of values for the beta, alpha, min_child_weight, and max_depth hyperparameters. Minimize the normalized discounted cumulative gain (ndcg) as your optimization metric.
Set ranges of values for the beta, alpha, min_child_weight, and max_depth hyperparameters. Launch one training job. Maximize the area under the curve (AUC) as your optimization metric.
Question 34 of 55
1 point(s)
You are a machine learning specialist for a national government agency’s infectious disease testing department. Your machine learning team is responsible for creating a machine learning model that analyzes the daily test datasets for your country and produces daily predictions of disease contraction and death rate trends. National and international news agencies use these projections to report on the daily projections of infectious disease progression. Since your model works on massive datasets daily, which of the following statements accurately describes your inference processing?
Question 35 of 55
1 point(s)
You work as a machine learning specialist for a polling organization using US census data to predict whether a given polling respondent earns more significant than $75,000. Your company will then sell the polling prediction data to candidates nationwide for various political office positions. You need to clean the polling data on which you wish to train your binary classification model. Specifically, you need to remove duplicate rows with erroneous data, transform the income column into a label column with two values, transform the age column to a definite feature by binning the column, scale the capital gain and capital losses columns, and finally split the data into train and test datasets. Which options are the most efficient ways to achieve your data sanitizing and feature preparation? (Select TWO)
Question 36 of 55
1 point(s)
You work as a machine learning specialist for a startup software company that builds a mobile app that subscribers can use to identify various birds from pictures they take with their phone camera. You have a large set of unlabeled images of birds that you want to use as your training data for your image recognition application. Which option is the most efficient approach to creating a labeling job to build the training dataset for your mobile app?
Question 37 of 55
1 point(s)
You work as a machine learning specialist for a home automation company that produces home automation devices such as automated door locks, security cameras, and alarm systems. Your machine learning team is building a new data repository for a device your company will launch as a new product soon. The new device will generate significant streams of IoT data using the MQTT protocol. It would help if you created a data repository for use in your machine learning models that will be used to produce future device usage predictions. Your management team will use these device usage predictions for marketing campaigns. Which option is the most cost-effective configuration of AWS services to build your data repository? (Select TWO)
Question 38 of 55
1 point(s)
You work as a machine learning specialist for a software company that offers real-time interactive sports viewing apps for mobile phones and tablets. You gather real-time streaming sports statistics and game action data and use the streaming data to produce real-time analytics and active predictions of the likely outcome of the game. To produce your prediction, you must use several machine learning models that use real-time streaming data as training and inference data sources. Since the real-time streaming game data is delivered from several different sources, the format and schema of the data need transformation and sanitation. Which option best performs the feature engineering of your real-time streaming data for your training and inference requests?
Question 39 of 55
1 point(s)
You work as a machine learning specialist for a medical imaging company. You and your machine learning team have been assigned to build a model that predicts whether a mass breast image indicates a benign or malignant tumor. Your model will help physicians quickly decide how to treat their patients using a verified diagnosis. Which option gives the appropriate machine learning services and features to train your model for your image diagnosis problem?
Specify the SageMaker role used to give training and hosting access to your data with the role = sagemaker.get_role() statement in your Jupyter notebook. Load your data into a pandas dataframe. Split your data into 80% training, 10% validation, and 10% testing. Set the predictor_type hyperparameter to binary_classifier. Then, run your training job using the sagemaker.create_training_job statement in your Jupyter notebook.
Specify the SageMaker role used to give training and hosting access to your data with the role = sagemaker.get_execution_role() statement in your Jupyter notebook. Load your data into a pandas dataframe. Split your data into 80% training, 10% validation, and 10% testing. Set the predictor_type hyperparameter to binary_classifier. Then, run your training job using the sagemaker.create_training_job statement in your Jupyter notebook.
Specify the SageMaker role used to give training and hosting access to your data with the role = sagemaker.get_execution_role() statement in your Jupyter notebook. Load your data into a pandas dataframe. Split your data into 80% training, 10% validation, and 10% testing. Set the predictor_type hyperparameter to the regressor type. Then, run your training job using the sagemaker.create_training_job statement in your Jupyter notebook.
Specify the SageMaker role used to give training and hosting access to your data with the role = sagemaker.get_execution_role() statement in your Jupyter notebook. Load your data into a pandas dataframe. Split your data into 80% training, 10% validation, and 10% testing. Set the predictor_type hyperparameter to multiclass_classifier. Then, run your training job using the sagemaker.create_training_job statement in your Jupyter notebook.
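All four options share the same data-preparation step: load the data into a pandas dataframe and split it 80% training / 10% validation / 10% testing. That step can be sketched independently of SageMaker; the dataset below is synthetic and the column names are purely illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the image-feature dataset (names are illustrative).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((1000, 4)), columns=["f1", "f2", "f3", "f4"])
df["label"] = rng.integers(0, 2, size=1000)  # benign (0) vs malignant (1)

# Shuffle, then split 80% training / 10% validation / 10% testing.
shuffled = df.sample(frac=1, random_state=42).reset_index(drop=True)
n = len(shuffled)
train = shuffled.iloc[: int(0.8 * n)]
validation = shuffled.iloc[int(0.8 * n) : int(0.9 * n)]
test = shuffled.iloc[int(0.9 * n) :]

print(len(train), len(validation), len(test))  # 800 100 100
```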
Question 40 of 55
1 point(s)
You work as a machine learning specialist for a video game software company. You have been asked to produce a machine learning model that predicts whether a newly released game will eventually become a successful product that profits the company. The data used for your model is product information and product ratings from social media. Your management team would like to use your model results to help them decide if a new game is worth investing in marketing dollars to promote it further. Which model and objective will best match your model requirements?
Question 41 of 55
1 point(s)
You work as a machine learning specialist for a financial services organization. Your machine learning team is responsible for building models that predict index fund tracking errors for the various funds managed by your mutual fund portfolio management department. You must ingest data into your data lake for use in your machine learning models. The required securities pricing data comes from varying sources that deliver it in near real-time. You need to perform data transformation, such as compression, on the data before writing it to your S3 data lake. Which option is the most efficient for ingesting the data into your data lake?
Question 42 of 55
1 point(s)
You work as a machine learning specialist for a data mining department of a large bank. Your department is responsible for leveraging the bank’s huge data lake to gain insights and make predictions for your marketing and risk departments. Your team’s latest project, an XGBoost prediction model, is ready for production deployment. However, you want to run additional batch predictions using a batch inference job to ensure your model can handle the production prediction workload. In your SageMaker notebook, how do you extend your estimator to read input data in batch from a specified S3 bucket and make predictions?
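As background for this question, the SageMaker Python SDK exposes batch inference through a Transformer object created from a trained estimator. The pseudocode sketch below follows that shape; the estimator name, bucket names, instance type, and counts are illustrative assumptions, not part of the question:

```
# Pseudocode sketch (SageMaker Python SDK shape); all names below are made up.
transformer = xgb_estimator.transformer(
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/batch-output/",  # hypothetical bucket
)
transformer.transform(
    data="s3://example-bucket/batch-input/",          # hypothetical bucket
    content_type="text/csv",
    split_type="Line",
)
transformer.wait()  # block until the batch job finishes
```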
Question 43 of 55
1 point(s)
You work as a machine learning specialist for the sales department of a large web retailer that needs to gain insight into its sales patterns. The department needs a way to use visualization to show its sales data in near real-time, to quickly recognize higher-than-expected sales of specific products. This will help your product operations teams quickly meet high demand. Which option is a viable, efficient solution to your problem?
Question 44 of 55
1 point(s)
You work as a machine learning specialist for a security company that uses video feeds to identify criminal activity in a client’s retail environment. You are building a convolutional neural network model for your video classification using TensorFlow. The model is expected to classify video scenes as either criminal (such as theft) or benign. You have thousands of hours of video on which to train your model. Therefore, you plan to leverage hyperparameter tuning to run multiple training jobs using different hyperparameter combinations. The goal is to find the model with the best training result. You are writing your hyperparameter tuning job in your SageMaker Jupyter notebook. When you create your HyperparameterTuner object in your Python code, which parameters do you pass to its constructor? (Select TWO)
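For reference, the SageMaker Python SDK’s HyperparameterTuner is typically constructed along the lines of the pseudocode sketch below; the estimator name, metric, ranges, and S3 location are all illustrative assumptions:

```
# Pseudocode sketch of the SageMaker Python SDK shape; all values illustrative.
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter

tuner = HyperparameterTuner(
    estimator=tf_estimator,                      # a previously configured TensorFlow estimator
    objective_metric_name="validation:accuracy", # metric the tuner optimizes
    hyperparameter_ranges={
        "learning_rate": ContinuousParameter(0.0001, 0.1),
    },
    metric_definitions=[{"Name": "validation:accuracy",
                         "Regex": "accuracy = ([0-9\\.]+)"}],
    objective_type="Maximize",
    max_jobs=20,
    max_parallel_jobs=4,
)
tuner.fit({"train": "s3://example-bucket/train/"})  # hypothetical S3 location
```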
Question 45 of 55
1 point(s)
You work as a machine learning specialist for a social media software company that produces games for mobile devices. Your company has a new game that they believe will generate a large following very quickly. You need to build a model to predict whether users will purchase additional game features via in-app purchases. You have a large dataset to use for your training, and you need to find the best hyperparameters by using a hyperparameter tuning job. You have configured the training jobs that the hyperparameter tuning job will run by defining an estimator and objective. You want to run your training jobs in a highly parallel manner to complete your hyperparameter tuning quickly. Also, you know that the order of magnitude matters more than the absolute value for your hyperparameter values. For example, a change from 1 to 2 is expected to have a much more significant impact than a change from 100 to 101. Which combination of scaling and search types should you use for your hyperparameter tuning job?
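The order-of-magnitude intuition in the question can be checked numerically: sampling uniformly on a log scale spends as many draws between 1 and 10 as between 100 and 1,000, whereas linear sampling almost never visits the lowest decade. A small self-contained sketch (the 1–1,000 range is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
lo, hi = 1.0, 1000.0
n = 100_000

# Linear-uniform draws vs draws uniform in log10 space over the same range.
linear = rng.uniform(lo, hi, n)
log_uniform = 10 ** rng.uniform(np.log10(lo), np.log10(hi), n)

# Fraction of samples landing in the lowest decade [1, 10):
frac_linear = np.mean(linear < 10)    # tiny: the low decade is barely explored
frac_log = np.mean(log_uniform < 10)  # about one third of all draws
print(round(frac_linear, 3), round(frac_log, 3))
```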
Question 46 of 55
1 point(s)
You are a machine learning specialist working for an oil refinery company. Your team is working on a machine learning problem to determine the relationship between oil well depth and production. You must understand your data better to select the appropriate machine-learning algorithm to solve the oil well production problem. For example, what is the correlation between your oil well depth data and your production data?
When you examine your data visually using the Python matplotlib library, you find what looks like a non-Gaussian distribution of oil well depth and oil well production.
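One way to probe a non-Gaussian, monotonic relationship is to compare a rank-based coefficient (Spearman) with Pearson’s linear one. The synthetic data below, using an exponential depth-to-production link that is purely illustrative, shows how the two can diverge:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
depth = pd.Series(rng.uniform(1, 10, 500))                     # illustrative well depths
production = pd.Series(np.exp(depth) + rng.normal(0, 1, 500))  # monotonic but non-linear

pearson = depth.corr(production, method="pearson")
spearman = depth.corr(production, method="spearman")
print(round(pearson, 2), round(spearman, 2))  # Spearman near 1.0; Pearson noticeably lower
```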
Question 47 of 55
1 point(s)
You are a machine learning specialist working for a clothing manufacturer. You have been tasked with building a machine learning model to determine the return on investment (ROI) for advertising a specific clothing line based on the labeled data of past social media campaigns for similar clothing lines.
You decide to run a Pearson’s correlation coefficient to understand your data correlation in a better way. When you calculate your Pearson’s correlation coefficient of social media advertising ROI, you get a value of 0.35. What are your thoughts on this outcome?
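For context on reading the number: Pearson’s coefficient measures only linear association, and 0.35 indicates a weak-to-moderate positive linear relationship. The sketch below (the spend and ROI data are synthetic) constructs data whose population coefficient is close to 0.35 to show how such a value arises from a weak signal buried in noise:

```python
import numpy as np

rng = np.random.default_rng(1)
spend = rng.random(5000)                        # synthetic ad-spend feature
roi = 0.4 * spend + rng.normal(0.0, 0.3, 5000)  # weak linear signal plus heavy noise

r = np.corrcoef(spend, roi)[0, 1]
print(round(r, 2))  # weak-to-moderate positive correlation, close to 0.35 by construction
```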
Question 48 of 55
1 point(s)
You are a machine learning specialist working for your company’s social media software development division. The social media features of your web applications allow users to post text messages and pictures about their experiences with your company’s products. You have defined a vocabulary of words deemed inappropriate for your site. You need to be able to block posts that contain inappropriate words quickly.
Which algorithm is best suited to your task?
Question 49 of 55
1 point(s)
You are a machine learning specialist working for a government agency that uses a series of web application forms to gather citizen data for census purposes. You have been tasked with finding novel user entries as your citizens enter them. A novel user entry is defined as an outlier compared to the established citizen entries in your data store.
You have cleaned your citizen data store to remove any existing outliers. You now need to build a model to determine whether new entries on your web application are novel. Which algorithm best fits these requirements?
Question 50 of 55
1 point(s)
You are a machine learning specialist working for a translation service company. Your company offers several mobile applications used for translation on smartphones and tablets. As a new feature of one of your translation apps, your company offers a feature to generate handwritten notes from spoken text.
Which algorithm is the best choice for your new feature?
Question 51 of 55
1 point(s)
Your machine learning team is responsible for processing video clips posted to your company’s Twitter social media account to understand the sentiment of the video clips. Your team labels these video clips with the appropriate sentiment so your marketing department can use them in their advertising campaigns. You are now expanding into Spanish and Portuguese-speaking regions of the world. So, you need to translate video clip audio as a part of your sentiment labeling process.
What AWS services and SageMaker built-in algorithms allow your team to label the foreign language video clips most efficiently?
Question 52 of 55
1 point(s)
You work for a machine learning team at a global retail auto parts chain. Your team ingests purchasing data from its 100,000 global auto parts stores to S3 using Kinesis Data Firehose. You are now ready to start training an improved machine learning model that will be used to predict purchasing patterns by global region. The training data requires additional simple transformations. Also, you will need to combine some data attributes. Finally, your team expects to train the model daily.
Given the large number of stores and the continuously changing ingested data, which options require the least administration and development effort?
Question 53 of 55
1 point(s)
You are a machine learning specialist working for an oil and gas company. Your company’s oil and gas drilling sites worldwide have sensors that stream site equipment status and external conditions like weather. You are responsible for building a machine-learning model that predicts site equipment failures. Before using the data in your model, the streaming data from the sites must be ingested, transformed, and stored in Apache Parquet files for exploration and analysis.
Which of the following options would ingest, transform, and store your data in the parquet format with the least effort on your part?
Question 54 of 55
1 point(s)
You are a machine learning team member at a large online retailer. Your team is responsible for retail competitor analysis. You have a competitive product data streaming source that you need to ingest into your data lake. You need to use that streaming competitor product data to match the corresponding product data in your catalog of products. Your data scientists can use this matching to produce competitive analysis dashboards in a BI tool.
Which options give you the best data ingestion and most efficient product comparison solution?
Question 55 of 55
1 point(s)
You are a machine learning specialist working for the digital banking division of a global banking firm. Your bank is in the process of introducing a conversational user interface for its digital banking service. The service will receive audio from the conversational user interface and converse with the user in real-time. Your machine learning team lead has decided to use the Amazon Transcribe service to convert the streaming audio to streaming text.
To handle issues in the network connection when users are on mobile phones, how can you leverage the features of Amazon Transcribe to keep your solution as cost-effective as possible?