Showing posts with label cloud. Show all posts

Tuesday, 29 October 2019

Cloud - VII

We did a word count using Apache Hive on Cloudera QuickStart VM 5.12 in an earlier post. In this post, we will repeat the same word count using Apache Hive, but on AWS EMR. Amazon Elastic MapReduce (Amazon EMR) is a managed Hadoop framework that can be deployed quickly to process large amounts of data across dynamically scalable Amazon EC2 instances. You can also run open source tools and popular distributed frameworks such as Apache Hive, Apache Spark, Apache HBase, Presto, and Apache Flink in Amazon EMR, and interact with data in other AWS data stores such as Amazon S3, Amazon DynamoDB, and Amazon Redshift.

We will use Amazon S3 to store our Hive queries, the input data, and the output of the Hive queries run on Amazon EMR. In S3, we have created a bucket called emr-example--bucket with three folders: code, input and output. The code folder will house all the Hive queries. The input folder will contain the data file, blue_carbuncle.txt, which holds the text on which we will attempt the word count. Results of the Hive queries will be written to the output folder. The bucket and folders are shown below:













The first paragraph in the data file, blue_carbuncle.txt, is shown below:

 











The data file, blue_carbuncle.txt, is placed in input folder:














The Hive queries are shown below:

create external table text (line string) location '${INPUT}/input';

insert overwrite directory '${OUTPUT}/result1/' select word, count(*) from(select explode(split(line,'\\s')) as word from text) z group by word;
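To make the logic of the second query concrete, here is a minimal Python sketch of roughly the same computation: split every line on a single whitespace character, emit one token per row, then group and count. It is only an illustration of what the query does, not how Hive executes it; the function name and the sample lines are assumptions made for this example.

```python
import re
from collections import Counter


def word_count(lines):
    """Count tokens roughly the way the Hive query does."""
    counts = Counter()
    for line in lines:
        # Rough equivalent of explode(split(line, '\\s')): one token per row.
        # Splitting on a single whitespace character means that consecutive
        # whitespace produces empty tokens, which get counted as well.
        counts.update(re.split(r"\s", line))
    # Equivalent of "group by word" with count(*).
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from blue_carbuncle.txt.
    sample = ["to be or not", "to  be"]
    for word, n in sorted(word_count(sample).items()):
        print(repr(word), n)
```

Because the split pattern is a single `\s` rather than `\s+`, runs of spaces show up as an empty-string "word" in the result.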

These two Hive queries are the same as in the previous post, except for the "location" part in the first query and the "insert overwrite directory" part in the second. They are saved in a single file called hive1.q under the code folder:














The output folder is empty as no Hive queries have been run so far. Now we can go ahead and run the Hive queries on the data file by spinning up an Amazon EMR cluster on the fly. Frankly, I enjoyed the experience, as I never thought setting up a Hadoop cluster would be such a breeze. So, now onto Amazon EMR:











Click on the Create cluster button. On the next screen, add an EC2 key pair only if you already have one; otherwise, keep the defaults. Click on Create cluster to launch a cluster comprising one m5.xlarge master node and two m5.xlarge core nodes:























The cluster will take a few minutes to launch. Once it has launched, click on the Steps tab and then on the Add step button to enter the details for running the Hive query:

 









In the Add step window, add the values as follows:
















After setting the values for the Script, Input and Output locations, click the Add button to kick off the Hive query on the cluster. Once the Hive query has run, the Status shows Completed:
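The Input and Output locations entered in the step reach the script as the Hive variables INPUT and OUTPUT, which Hive substitutes into `${INPUT}` and `${OUTPUT}` before running the queries. The Python sketch below imitates that substitution with `string.Template`, purely to show what the two queries look like once resolved. The two S3 locations used here are assumptions inferred from the bucket layout described earlier, not values read from the Add step window.

```python
from string import Template

# The contents of hive1.q, as listed in the post.
HIVE_SCRIPT = r"""
create external table text (line string) location '${INPUT}/input';
insert overwrite directory '${OUTPUT}/result1/' select word, count(*) from(select explode(split(line,'\\s')) as word from text) z group by word;
"""


def resolve(script, input_location, output_location):
    """Fill in ${INPUT} and ${OUTPUT}, imitating Hive variable substitution."""
    return Template(script).substitute(INPUT=input_location, OUTPUT=output_location)


if __name__ == "__main__":
    # Assumed locations: chosen so that ${INPUT}/input points at the input folder.
    print(resolve(HIVE_SCRIPT,
                  "s3://emr-example--bucket",
                  "s3://emr-example--bucket/output"))
```

With these assumed values, the table would read from `s3://emr-example--bucket/input` and the results would land under `s3://emr-example--bucket/output/result1/`.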











Navigate to S3 to see the result:














Download the file, 000000_0, and view its contents:
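When no row format is specified, "insert overwrite directory" writes plain text with the Ctrl-A character (`\x01`) between columns, so the downloaded file may look like the words and counts are run together. A minimal Python sketch for reading such a file back into a dictionary follows; the function name and the sample rows are hypothetical and do not come from the actual output.

```python
def parse_hive_output(text, delimiter="\x01"):
    """Parse 'word<delimiter>count' rows into a {word: count} dictionary."""
    counts = {}
    for row in text.split("\n"):
        if not row:
            continue  # skip the empty string after the final newline
        # Split on the last delimiter so the count is always the final field.
        word, _, count = row.rpartition(delimiter)
        counts[word] = int(count)
    return counts


if __name__ == "__main__":
    # Hypothetical rows in the default Hive text layout.
    sample = "the\x01120\nof\x0180\n"
    print(parse_hive_output(sample))
```

To parse the real result, the contents of 000000_0 would be read from disk and passed to the function in place of the sample string.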

























The contents are the same as in the last post. This concludes the post on Amazon EMR.

Monday, 28 October 2019

Cloud - VI

In this post, we take a look at AWS Lambda, a good example of serverless compute. As with any serverless application, there is no need to manage servers or related activities like OS installation, patching, etc. Scaling is handled automatically, in that AWS Lambda code is triggered in response to an event. If more events occur, the Lambda code is executed once for each event. Similarly, if fewer events occur, the code is executed correspondingly fewer times. If no events occur, no Lambda code is executed. Billing for Lambda is based on the number of times the code is executed and on the code execution time, rounded up to a multiple of 100 milliseconds. We will see a few simple examples on AWS Lambda below.
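The billing rule described above can be sketched in a few lines of Python. This is only an illustration of the rounding: the two prices are made-up placeholder units passed in by the caller, not actual AWS rates, and the sketch ignores the fact that the real duration price also depends on the memory configured for the function.

```python
import math


def billed_duration_ms(actual_ms, increment_ms=100):
    """Round an execution time up to the next multiple of the billing increment."""
    return math.ceil(actual_ms / increment_ms) * increment_ms


def estimate_cost(durations_ms, price_per_request, price_per_100ms):
    """Charge for the number of executions plus the billed execution time.

    Both prices are hypothetical inputs supplied by the caller.
    """
    requests = len(durations_ms)
    billed_units = sum(billed_duration_ms(d) // 100 for d in durations_ms)
    return requests * price_per_request + billed_units * price_per_100ms


if __name__ == "__main__":
    # Three executions lasting 1 ms, 100 ms and 101 ms.
    durations = [1, 100, 101]
    print([billed_duration_ms(d) for d in durations])
    print(estimate_cost(durations, price_per_request=1, price_per_100ms=2))
```

An execution of 101 ms is billed as 200 ms, which is why keeping a function just under an increment boundary mattered under this scheme.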

After logging into the Management Console, open AWS Lambda and click on Create Function:
















We will not write any code from scratch. Instead, we will borrow code from an existing blueprint. In the next window, after clicking Use a blueprint, using the filter under Blueprints, bring up hello-world-python blueprint. Select it and click Configure:












Under Function name, enter FirstLambda. Select Create a new role with basic Lambda permissions under Execution role. Observe the code and click Create Function at the bottom of the page:





















Click on the Save button. Then, click on the Test button to the left of the Save button. In the Configure test event window, replace value1 with Hello, world!, enter LambdaEvent under Event name and click the Create button at the bottom:
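For readers without the console in front of them, a handler along the lines of the hello-world-python blueprint is sketched below. It is a paraphrase, assuming the test event carries the keys key1, key2 and key3 and that the handler echoes back the first value; the exact blueprint code may differ.

```python
def lambda_handler(event, context):
    # Print each value from the test event; in Lambda, print output
    # ends up in the function's CloudWatch log stream.
    print("value1 = " + event["key1"])
    print("value2 = " + event["key2"])
    print("value3 = " + event["key3"])
    # Echo back the first value as the function result.
    return event["key1"]


if __name__ == "__main__":
    # Local stand-in for the LambdaEvent test event configured above.
    test_event = {"key1": "Hello, world!", "key2": "value2", "key3": "value3"}
    print(lambda_handler(test_event, None))
```

With key1 set to Hello, world! in the test event, the function result would be that same string.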

























Once the test event details are saved, click on the Test button:











The results can be seen below:














Click on Details to see more details:













Note that the output is in line with the code. Click on Monitoring to see CloudWatch metric details:














Then, click on View logs in CloudWatch:










Click on the only record under Log Streams to see more details:












Create a second function using existing role as shown below:














Then, click Create Function at bottom of page. Default code generated is shown below:



Modify the code to include context properties as shown below including handler name:
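A handler that reports context properties could look like the Python sketch below. The attributes used here (function_name, memory_limit_in_mb, aws_request_id, log_group_name and get_remaining_time_in_millis) are part of the context object that Lambda passes to Python handlers, but which of them this post's function actually printed is an assumption. The function name SecondLambda and the fake context used for the local run are hypothetical.

```python
from types import SimpleNamespace


def lambda_handler(event, context):
    # Collect a few properties from the context object passed in by Lambda.
    details = {
        "function_name": context.function_name,
        "memory_limit_in_mb": context.memory_limit_in_mb,
        "aws_request_id": context.aws_request_id,
        "log_group_name": context.log_group_name,
        "remaining_ms": context.get_remaining_time_in_millis(),
    }
    for name, value in details.items():
        print(name, "=", value)
    return details


def fake_context():
    """Hypothetical stand-in for the real context, for running locally."""
    return SimpleNamespace(
        function_name="SecondLambda",
        memory_limit_in_mb=128,
        aws_request_id="local-test",
        log_group_name="/aws/lambda/SecondLambda",
        get_remaining_time_in_millis=lambda: 3000,
    )


if __name__ == "__main__":
    lambda_handler({}, fake_context())
```

In the console, the real context is supplied by Lambda, so only the handler itself would be pasted into the function editor.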









Then, click Test to configure the test event details:

























Then, save and test the function. Results are shown below:












Execution results are shown below:














This concludes the introduction to AWS Lambda.


Saturday, 14 September 2019

Cloud - V

In the fifth and final post on questions for the AWS Cloud Practitioner Exam, we see the last set of questions for preparation towards AWS Cloud Practitioner accreditation. The answers to the last post are given at the start of the post, followed by the last batch of questions and then their answers. These questions may be helpful for preparation of other AWS Certification Exams as well.

Answers to questions in last post are given below:

46. d
47. a
48. b
49. d
50. c
51. d
52. a
53. b
54. c
55. a,c,d
56. a,b,d
57. c
58. a
59. b
60. a

The next set of questions are:

61. __________ is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets

a) Amazon Quantum Ledger Database
b) Amazon Neptune
c) AWS Storage Gateway
d) Amazon Elastic File System

62. An easy way to get an idea of the costs incurred for moving to AWS Cloud Services is:

a) Use Simple Monthly Calculator at https://calculator.s3.amazonaws.com/index.html
b) Call AWS Region Representative and check with him
c) Call up local AWS Data Center and set up a meeting
d) Use AWS Cost Management

63. __________ is an AWS product with which we can add user sign-up, sign-in, and access control to web and mobile apps quickly

a) AWS Device Farm
b) AWS AppSync
c) Amazon Cognito
d) AWS Amplify

64. Using __________ one can create and publish interactive dashboards to deliver insights to users in an organization

a) AWS Device Farm
b) Amazon Pinpoint
c) AWS Batch
d) Amazon QuickSight

65. Amazon Elastic Block Store, Amazon Elastic File System, Amazon FSx for Lustre and Amazon FSx for Windows File Server are associated with

a) design
b) storage
c) compute
d) networking

66. __________ is a free AWS service that simplifies the billing for multiple accounts by setting up a single payment method for all the accounts in an organization through consolidated billing

a) AWS Budgets
b) AWS Cost & Usage Report
c) AWS Cost Explorer
d) AWS Organizations

67. When Amazon EC2 Reserved Instances are used, savings of up to __ percent compared with equivalent capacity purchased in an On-Demand model may be realized

a) 45
b) 60
c) 75
d) 90

68. __________ is a service that monitors applications and automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost

a) AWS Auto Scaling
b) Elastic Load Balancing
c) Amazon Elastic File System
d) Amazon ElastiCache

69. Which of the following load balancers are related to Elastic Load Balancing in AWS: (Choose all that apply)

a) Classic Load Balancer
b) Advanced Load Balancer
c) Network Load Balancer
d) Application Load Balancer

70. __________ gives complete control over virtual networking environment, including selection of IP address range, creation of subnets, and configuration of route tables and network gateways

a) Amazon CloudFront
b) Amazon Virtual Private Cloud
c) AWS PrivateLink
d) AWS Cloud Map

71. When Amazon EC2 Spot Instances are used, savings of up to __ percent compared with equivalent capacity purchased in an On-Demand model may be realized

a) 45
b) 60
c) 75
d) 90

72. Advantages offered by Amazon EC2 Dedicated Host include: (Choose all that apply)

a) instance running on different physical servers
b) saving money on existing server-bound licensing costs
c) rotating license usage between multiple instances
d) meeting compliance and regulatory requirements

73. __________ are recommended for steady-state applications where customers are confident of running EC2 for periods of one or three years:

a) Dedicated Hosts
b) On-Demand Instances
c) Reserved Instances
d) Spot Instances
e) A judicious combination of above

74. Visibility of sockets, cores, and host ID are characteristics of which EC2 type:

a) Dedicated Hosts
b) On-Demand Instances
c) Dedicated Instances
d) Reserved Instances
e) Spot Instances

75. __________ gives 24x7 access to the AWS DDoS Response Team and protection against DDoS related spikes in Amazon Elastic Compute Cloud (EC2), Elastic Load Balancing (ELB), Amazon CloudFront, and Amazon Route 53 charges

a) AWS Shield Standard
b) Amazon GuardDuty
c) AWS Shield Advanced 
d) Amazon Inspector

Answers to the above questions are:

61. b
62. a
63. c
64. d
65. b
66. d
67. c
68. a
69. a,c,d
70. b
71. d
72. b,d
73. c
74. a
75. c

All the best for AWS Cloud Practitioner Exam ...

Friday, 13 September 2019

Cloud - IV

In the fourth post on Cloud topic, we see the next bunch of questions towards AWS Cloud Practitioner accreditation. The answers to the last post are given at the start of the post followed by the next set of questions. These questions may be helpful for preparation of other AWS Certification Exams as well

Answers to questions in last post are given below:

31. c
32. d
33. a
34. b
35. d
36. a
37. c
38. b
39. a
40. d
41. b
42. b
43. d
44. c
45. c

The next set of questions are:

46. __________ is an interactive query serverless service that makes it easy to analyze data in Amazon S3 using standard SQL

a) Amazon Glue
b) Amazon Aurora
c) Amazon Redshift
d) Amazon Athena

47. __________ comprises the most comprehensive set of AWS cost and usage data available, including additional metadata about AWS services, pricing, and reservations

a) AWS Cost & Usage Report
b) AWS Budgets
c) AWS Expense Report
d) AWS Forecast Report

48. __________ allows a user to run code without provisioning or managing servers paying only for compute time

a) AWS Fargate
b) AWS Lambda
c) Amazon Lightsail
d) AWS Outposts

49. __________ can be used to collect, process, and analyze real-time, streaming data to obtain timely insights and also react quickly to new information

a) Amazon Kinesis Data Firehose
b) AWS Lake Formation
c) Amazon CloudSearch
d) Amazon Kinesis

50. __________ can be used to detect anomalous behavior in your environments, set alarms, visualize logs and metrics side by side, take automated actions, and troubleshoot issues

a) AWS Systems Manager
b) AWS Config
c) Amazon CloudWatch
d) AWS Service Catalog

51. __________ is a fast content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to customers globally with low latency, and high transfer speeds

a) Amazon VPC
b) AWS PrivateLink
c) AWS Direct Connect
d) Amazon CloudFront

52. __________ is an online resource to reduce cost, increase performance, and improve security by optimizing AWS environment

a) AWS Trusted Advisor 
b) AWS OpsWorks
c) AWS License Manager
d) AWS Auto Scaling

53. __________ makes it easy to establish a dedicated network connection from customer premises to AWS to reduce network costs, increase bandwidth throughput, and provide a more consistent network experience than regular Internet-based connections

a) AWS Global Accelerator
b) AWS Direct Connect
c) AWS PrivateLink
d) Amazon Route 53

54. Which of the following does not contribute to Amazon RDS cost:

a) Number of database instances
b) Database characteristics
c) Inbound data transfer
d) Clock hours of server time

55. The three modes of accessing data in Amazon S3 Glacier are:

a) Standard
b) Non-Standard
c) Bulk
d) Expedited
e) Immediate

56. Which of the following are valid cost saving measures using reservations? (Choose all that apply)

a) Amazon ElastiCache Reserved Nodes
b) Amazon RDS Reserved Instances
c) Amazon Athena Reserved Nodes
d) Amazon Redshift Reserved Nodes
e) Amazon Simple Queue Service Reserved Instances

57. __________ lets customers view and manage a select set of resources to support incident response while on-the-go

a) AWS Management Console 
b) AWS Command Line Interface
c) AWS Console Mobile Application
d) Amazon CloudWatch

58. Under the Shared Responsibility Model, Amazon is responsible for security of the cloud while the customer is responsible for security in the cloud

a) True
b) False

59. Using which AWS product can we send notifications in the form of SMS and emails?

a) Amazon SWF
b) Amazon SNS
c) Amazon SQS
d) Amazon MQ

60. __________ is the industry leading cloud-native big data platform, allowing teams to process vast amounts of data quickly, and cost-effectively at scale

a) Amazon EMR 
b) Amazon CloudSearch
c) Amazon Kinesis Data Streams
d) AWS Lake Formation

Answers in next post ...

Cloud - III

The third segment of the series on Cloud sees the next batch of questions towards AWS Cloud Practitioner accreditation, which is the first of the Amazon Certification credentials series. The answers to questions in the last article are also mentioned. These questions may be helpful for preparation of other AWS Certification Exams as well.

Answers to questions in last post are given below:

16. c
17. a,c,e
18. b,c,d
19. b
20. a,b,c,d
21. b
22. a,b,d,e
23. d
24. c
25. a
26. d
27. c
28. b
29. a,c,d,e
30. b,d

The next set of questions are below:

31. __________ is an email sending service designed to help digital marketers and application developers send marketing, notification, and transactional emails

a) Amazon SNS
b) Amazon SQS
c) Amazon SES
d) Amazon SMS

32. Which is a fast, scalable data warehouse that makes it simple and cost-effective to analyze all your data across your data warehouse and data lake and provides high performance using massively parallel query execution, and columnar storage on high-performance disk

a) Amazon Glue
b) Amazon Athena
c) Amazon QuickSight
d) Amazon Redshift

33. __________ is a service that provides secure, resizable compute capacity in the cloud and is designed to make web-scale computing easier for developers

a) Amazon EC2
b) Amazon EMR
c) Amazon SWF
d) Amazon SQS

34. __________ is a highly available, durable, secure, fully managed pub/sub messaging service that enables you to decouple microservices, distributed systems, and serverless applications

a) Amazon SQS
b) Amazon SNS
c) Amazon SES
d) Amazon SMS

35. __________ is a fully managed, multiregion, multimaster, key-value and document database with built-in security, backup and restore, and in-memory caching for internet-scale applications

a) Amazon ElastiCache
b) Amazon Neptune
c) Amazon Aurora
d) Amazon DynamoDB

36. __________ is a secure, durable, and extremely low-cost storage service for data archiving and long-term backup and is designed to deliver 99.999999999% durability

a) Amazon S3 Glacier
b) Amazon S3
c) Amazon Elastic File System
d) Amazon Elastic Block Store

37. __________ is a data transfer service at exabyte scale

a) AWS Snowball
b) AWS Snowball Edge
c) AWS Snowmobile
d) AWS Server Migration Service

38. __________ is a MySQL and PostgreSQL compatible relational database engine that combines the speed and availability of high-end commercial databases and is up to five times faster than standard MySQL databases and three times faster than standard PostgreSQL databases

a) Amazon Neptune
b) Amazon Aurora
c) Amazon DynamoDB
d) Amazon ElastiCache

39. __________ is a managed cloud service that lets connected devices easily and securely interact with cloud applications and other devices.

a) AWS IoT Core
b) AWS IoT Button
c) AWS IoT 1-Click
d) AWS IoT Analytics

40. __________ is a cost efficient, easy to set up, operate, and scale a relational database in the cloud and supports different database engines

a) Amazon Neptune
b) Amazon DynamoDB
c) Amazon Timestream
d) Amazon RDS

41. __________ is an object storage service that offers industry-leading scalability, data availability, security, performance and durability of 99.999999999%

a) AWS Snowball
b) Amazon Simple Storage Service
c) AWS Storage Gateway
d) Amazon Elastic File System

42. __________ gives developers and systems administrators an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion.

a) AWS Systems Manager
b) AWS CloudFormation
c) AWS Config
d) AWS OpsWorks

43. __________ is a web service that records AWS API calls for your account and delivers log files containing details of the identity of the API caller, the time of the API call, the source IP address of the API caller, the request parameters, and the response elements returned by the AWS service

a) AWS Trusted Advisor
b) Amazon CloudWatch
c) AWS Config
d) AWS CloudTrail

44. __________ is a fully-managed platform to easily build, train, and deploy machine learning models at any scale

a) Amazon Translate
b) Amazon Polly
c) Amazon SageMaker
d) Amazon Transcribe

45. __________ is a tool that can be used to review the state of workloads and compares them to the latest AWS architectural best practices

a) AWS Trusted Advisor
b) AWS License Manager
c) AWS Well-Architected Tool
d) AWS Systems Manager

Answers in next post ...

Monday, 26 August 2019

Cloud - II

In the second part of the series on Cloud, we look at the next batch of questions. The questions in this article and series of posts are oriented more towards AWS Cloud Practitioner accreditation, which is the first of the Amazon Certification credentials series. The answers to questions in the last article are also mentioned. These questions may be helpful for preparation of other AWS Certification Exams as well.

Answers to previous post are given below:

1. a,c,d
2. a,c,e
3. b,d,e
4. a,b,c,e
5. c
6. c,d
7. b
8. b
9. c,d,e
10. a
11. a,d
12. a,b,c,d,e
13. a
14. a,b,c,d
15. a,b

The next batch of questions are below:

16. Which of the following is not a pillar of AWS Well-Architected Framework? 

a) Performance Efficiency
b) Security
c) Availability
d) Operational Excellence
e) Reliability

17. The fundamental drivers that form the basis for AWS costing are: (Choose all that apply)

a) outbound data transfer
b) inbound data transfer
c) storage
d) design
e) compute

18. Which of the following characteristics are associated with AWS Pricing? (Choose all that apply)

a) Pay only per user
b) Pay-as-you-go
c) Pay less by using more
d) Save when you reserve

19. Using AWS Free Tier, any AWS user can access all of the AWS services and gain free hands-on-experience

a) True
b) False

20. Which of the following are design principles of the AWS Cloud? (Choose all that apply)

a) Loose Coupling
b) Caching
c) Scalability
d) Automation
e) Open Source

21. An important customer is planning to deploy their custom auction application on EC2 for the first time. They believe that it will attract millions of hits but have no data to back their claim. Which of the following EC2 instances would you recommend?

a) Dedicated Hosts
b) On-Demand Instances
c) Reserved Instances
d) Spot Instances
e) A judicious combination of above

22. Non-free Pricing Models for EC2 instances comprise: (Choose all that apply)

a) Dedicated Hosts
b) On-Demand Instances
c) Provisioned Instances
d) Reserved Instances
e) Spot Instances

23. Which of the following offers the most favourable pricing for Amazon EC2 Reserved Instances:

a) Partial Upfront
b) Prorated Upfront
c) No Upfront
d) All Upfront

24. AWS Batch, AWS Fargate, AWS Lightsail and AWS Lambda are associated with

a) design
b) storage
c) compute
d) networking

25. Lift-and-shift is a migration approach that comprises moving on-premise software applications to cloud with minimal changes

a) True
b) False

26. Which of the following offers the most favorable pricing for EC2:

a) Dedicated Hosts
b) On-Demand Instances
c) Reserved Instances
d) Spot Instances

27. _________ is a highly available and scalable cloud Domain Name System (DNS) web service. 

a) AWS PrivateLink
b) AWS Direct Connect
c) Amazon Route 53
d) AWS Transit Gateway

28. Which of the following is a fully managed pub/sub messaging service:

a) Amazon SQS
b) Amazon SNS
c) Amazon SWF
d) Amazon MQ

29. Which of the following database instance types are supported by RDS: (Choose all that apply)

a) MySQL
b) DB2
c) Oracle
d) MariaDB
e) Aurora

30. Spot instances are best suited for applications that: (Choose all that apply)

a) involve serverless compute
b) have no scheduled run times
c) can be load balanced using ELB
d) are very compute intensive to run 

Answers in next post ...

Wednesday, 21 August 2019

Cloud - I

I am truly delighted to be writing on another important topic: Cloud Computing. In the first post on this topic, we start with sample questions for AWS Cloud Practitioner Exam. The questions in this article and following posts are oriented more towards AWS Cloud Practitioner credential that is the first of the AWS Certifications series. These questions may be helpful for preparation of other AWS Certification Exams as well

1. What are the different Cloud Computing models? (Choose all that apply)

a) Infrastructure as a Service
b) Network as a Service
c) Platform as a Service
d) Software as a Service
e) Hardware as a Service

2. Advantages of Cloud Computing are: (Choose all that apply)

a) Focus on organization's business rather than IT infrastructure
b) Better control of on premise IT hardware
c) Benefits from economies of scale by cloud deployment
d) Huge investment of capital expenditure on data centers
e) Increased speed and agility of IT operations

3. Different Cloud Computing Deployment Models are: (Choose all that apply)

a) Social 
b) Private
c) Domain
d) Hybrid
e) Public

4. Amazon Web Services product offerings include areas such as: (Choose all that apply)

a) mobile
b) IOT
c) networking
d) availability zone set up
e) security

5. Each Amazon Region has at least __ Availability Zone(s)

a) 4
b) 3
c) 2
d) 1

6. Content delivery with lower latency is achieved by: (Choose all that apply)

a) Dedicated Amazon Servers
b) Superfast Amazon network
c) Edge Locations
d) Regional Edge Caches

7. The AWS S3 pricing is the same across the globe

a) True
b) False 

8. Long-term contracts are one of the significant benefits of Amazon Cloud Services

a) True
b) False

9. Three ways to access AWS are by using:

a) Logging into AWS servers in nearest Availability Zone
b) Logging into AWS servers in nearest Edge Location
c) AWS Management Console
d) AWS Command Line Interface
e) AWS Software Development Kits

10. AWS Shared Responsibility Model is related to

a) Security and Compliance
b) Pay-as-you-go Pricing
c) Lift-and-Shift
d) AWS Replication Strategy
e) AWS High Availability

11. Benefits of Serverless Model include: (Choose all that apply)

a) Automated High Availability
b) Automated Functional Programming
c) Full control of Cluster Provisioning
d) Flexible Scaling

12. Amazon EC2 Instance type families are: (Choose all that apply)

a) Compute optimized
b) Storage optimized
c) Accelerated computing
d) General purpose
e) Memory optimized

13. AWS Regions are connected to AWS Global Network with a 100 Gbps intercontinental network 

a) True
b) False

14. Different components of AWS Global Infrastructure are: (Choose all that apply)

a) AWS Availability Zones
b) AWS Data Centers
c) AWS Regions
d) Points of Presence

15. The advantages that Cloud offers to IT personnel are: (Choose all that apply)

a) Increased speed and agility
b) Try out innovations
c) Manual configuration of hardware devices onto racks
d) Full visibility into AWS Data Centers

Answers in next post ...