Blog Posts: Latest Trends and Insights in Technologies | Clarion Technologies

Choosing the Right AWS Cloud Storage for Your Data

Written by Dilip Kachot - Technical Architect Delivery | Oct 4, 2019 1:55:03 PM

As businesses become more data-driven, data has become vital to every type of business. It is a key source of competitive advantage, because a company's ability to compete depends on how well it leverages its data. Your business data is the set of gears that keeps the organization moving. Every company, regardless of size, needs a strategy for storing and securing its data. Stored data must also be easy to access in an organized way for the business to succeed.

So, where do you store your data? And how do you secure it? This is where AWS cloud storage comes in.

AWS cloud storage lets businesses keep files, information, and user data in the cloud and spares them the high cost of hosting, maintaining, and monitoring storage on their own. It is not just about storing data, but about storing it securely. AWS cloud storage protects data through encryption, backup, easy recovery, and safeguards against hackers and data loss.

The AWS cloud offers a broad range of storage and cost management options, each of which has its specific use case. The most popular types of Amazon storage options are:

  1. Amazon Simple Storage Service (S3)
  2. Amazon Glacier
  3. Amazon Elastic Block Store (EBS)
  4. AWS Storage Gateway

This array of Amazon cloud storage options can make it hard for organizations to select the right one. Comparing their performance, availability, and pricing helps you decide which AWS storage option to choose.

1. Amazon Simple Storage Service (S3)

Amazon S3, an object storage service, was the first publicly available cloud storage service from Amazon, launched in 2006. It is the best-known and most widely used Amazon storage option. It lets you store a virtually unlimited amount of data that can be accessed programmatically through methods such as the REST API, SOAP, and the web interface. It is an ideal storage option for videos, images, and application data.
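As a rough illustration of the REST-style access mentioned above, the sketch below builds the virtual-hosted-style HTTPS URL that addresses a single object. The bucket name, region, and key are hypothetical, and the helper name is ours. A private object would still need a signed request, so this only shows how objects are addressed, not how they are authorized.

```python
from urllib.parse import quote


def s3_object_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style URL for an S3 object (no request signing)."""
    # Keep "/" unescaped so key prefixes still read like folders.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


# Hypothetical bucket, region, and key, for illustration only.
print(s3_object_url("example-photos", "us-east-1", "2019/10/team photo.jpg"))
```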

Features:

  • Fully managed
  • Store in buckets
  • Versioning
  • Access control lists and bucket policies
  • 256-bit AES (AES-256) encryption at rest
  • Private by default

Best used for:

  • Hosting entire static websites
  • Static web content and media
  • Store data for computation and large-scale analytics, like analyzing financial transactions, clickstream analytics, and media transcoding
  • Disaster recovery solutions for business continuity
  • Secure solution for backup & archival of sensitive data

Performance

  • Access to data stored in Amazon S3 from within Amazon Elastic Compute Cloud (EC2), in the same region, is relatively fast
  • Built to scale storage, requests, and users to support an unlimited number of web-scale applications

Durability and Availability

  • Offers the highest level of data durability and availability on the AWS platform: it is designed for 99.999999999% durability per object and 99.99% availability over a year
  • Includes built-in error corrections
  • No single point of failure
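To get a feel for what these percentages mean, here is a small sketch that turns them into expected object losses and allowed downtime per year. It assumes, as a simplification, that the durability figure can be read as an independent annual survival probability per object. The function names are illustrative.

```python
from fractions import Fraction

# Figures quoted above, kept as exact fractions to avoid rounding noise.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)  # 99.999999999%
AVAILABILITY = Fraction(9_999, 10_000)                   # 99.99%


def expected_objects_lost_per_year(object_count: int) -> Fraction:
    """Average number of objects lost per year at the quoted durability."""
    return object_count * (1 - DURABILITY)


def max_downtime_minutes_per_year() -> float:
    """Minutes of unavailability per 365-day year at the quoted availability."""
    minutes_per_year = 365 * 24 * 60
    return float(minutes_per_year * (1 - AVAILABILITY))


print(float(expected_objects_lost_per_year(10_000_000)))  # expected losses per year
print(max_downtime_minutes_per_year())                    # allowed downtime in minutes
```

Under that assumption, ten million stored objects work out to roughly one lost object every 10,000 years, while 99.99% availability still allows a little under an hour of downtime per year.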

Scalability and Elasticity

  • Offers a high level of scalability & elasticity automatically
  • Supports an infinite number of files in any bucket

Pricing

The total price of Amazon S3 usage depends on various factors including the amount of storage used, the number of GET/PUT/LIST/POST/COPY requests and data transfers.

  • Storage Pricing: Amazon S3 charges $0.03 per GB for the first 1TB/month. The more data you store, the less you pay
  • Request Pricing: Amazon charges $0.005 per 1,000 PUT/COPY/POST and LIST requests, and $0.004 per 1,000 GET requests
  • Data Transfer Pricing: For up to 10TB/month of data transferred OUT from Amazon S3 to the internet, Amazon charges $0.09 per GB. There is no charge for INWARD traffic

A Real-World Use Case

A photo-sharing site relies completely on Amazon S3 to store and display photos uploaded by their end-users. Their current end-user customer base is about 8,000, and on average, 4,000 users upload around 60,000 photos per month. The approximate average size of each photo is around 1MB. Each photo is viewed around three times per month.

Let’s calculate the tentative costs incurred by this photo sharing site:

Total storage per month: 60,000 images x 1MB = 60,000MB = 60GB

Storage cost for 60GB: 60GB x $0.03 = $1.80/month

Total PUT/POST requests per month: 60,000 requests

PUT/POST request cost: 60,000 x ($0.005/1000) = $0.30/month

Total GET requests per month: 60,000 photographs x 3 views = 180,000 requests

Total GET request cost: 180,000 x ($0.004/1000) = $0.72/month

Total data transfer OUT cost from Amazon S3 to the internet: 60GB of data x 3 views x $0.09 per GB = $16.20/month

Total cost = $19.02/month (storage cost + PUT/POST request cost + GET request cost + data transfer cost)
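The same estimate can be written as a short script, which makes it easy to rerun with your own numbers. This is only a sketch: it applies the per-unit rates listed under Pricing, ignores volume discounts, and follows the article's convention of 1GB = 1,000MB. The function and constant names are ours.

```python
# Rates quoted in the pricing list above (USD); treat them as illustrative.
STORAGE_PER_GB = 0.03
PUT_PER_1000 = 0.005
GET_PER_1000 = 0.004
TRANSFER_OUT_PER_GB = 0.09


def s3_monthly_cost(photos: int, photo_mb: float, views_per_photo: int) -> dict:
    """Break the monthly S3 bill into its four components plus a total."""
    storage_gb = photos * photo_mb / 1000  # 1GB = 1,000MB, as in the article
    costs = {
        "storage": storage_gb * STORAGE_PER_GB,
        "put": photos / 1000 * PUT_PER_1000,
        "get": photos * views_per_photo / 1000 * GET_PER_1000,
        "transfer_out": storage_gb * views_per_photo * TRANSFER_OUT_PER_GB,
    }
    costs["total"] = sum(costs.values())
    return costs


# The photo-sharing scenario: 60,000 photos of 1MB, each viewed 3 times.
for item, usd in s3_monthly_cost(60_000, 1, 3).items():
    print(f"{item}: ${usd:.2f}")
```

Data transfer out dominates the bill here, so the number of views per photo matters far more than the number of uploads.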

2. Amazon Glacier

Amazon Glacier is a low-cost storage solution widely used for data archiving and backup. Retrieving data from it takes hours, so Glacier should only be used for data that is accessed very infrequently.

Best used for

  • Off-site enterprise stores
  • Media assets
  • Research and scientific data
  • Digital preservation
  • Magnetic tape replacement

Performance

  • Slow
  • Retrieval jobs typically take 3 to 5 hours to complete

Durability and Availability

  • Offers 99.999999999% (11 nines) average annual durability for an archive

Scalability and Elasticity

  • Scales up and down to meet growing as well as unpredictable storage needs
  • Limits a single archive to 40TB, but allows an unlimited amount of total data storage

Pricing

  • Charges $0.01 per gigabyte per month
  • Charge only for what you use
  • Offers free retrieval up to 5% of average monthly storage
  • Amazon Glacier includes three pricing components (per GB per month):
  1. Storage
  2. Data transfer out
  3. Requests (per thousand UPLOAD & RETRIEVAL requests per month)

A Real-World Use Case

A Big Data company generates many log files (~5TB per month) from its data analysis, and these must be retained for a long period to meet compliance requirements. However, there is no immediate need to access this log data, so the company decided to store it in Amazon Glacier.

The estimated price for storing each month's 5TB (5,000GB) of logs in Glacier is around $50 per month: 5,000GB x $0.01 per GB.
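Because the logs have to be retained, the stored volume keeps growing. The sketch below projects the storage charge month by month, assuming that 5TB is added every month and nothing is deleted (our assumption, not stated in the use case). It also computes the free retrieval allowance from the 5% rule listed above. The names are illustrative, and 1TB is taken as 1,000GB.

```python
GLACIER_PER_GB_MONTH = 0.01  # storage rate quoted above (USD)
FREE_RETRIEVAL_SHARE = 0.05  # 5% of average monthly storage


def glacier_storage_bill(months: int, new_tb_per_month: float = 5) -> list:
    """Storage charge for each month as archives accumulate (no deletions)."""
    bills = []
    for month in range(1, months + 1):
        stored_gb = month * new_tb_per_month * 1000  # 1TB = 1,000GB
        bills.append(stored_gb * GLACIER_PER_GB_MONTH)
    return bills


def free_retrieval_gb(average_stored_gb: float) -> float:
    """Amount that can be retrieved free of charge in a month."""
    return average_stored_gb * FREE_RETRIEVAL_SHARE


print(glacier_storage_bill(3))   # charge for months 1, 2, and 3
print(free_retrieval_gb(5000))   # free retrieval allowance in GB
```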

3. Amazon Elastic Block Store (EBS)

Amazon Elastic Block Store (EBS) is another common storage option offered by Amazon. Unlike Amazon S3, which is object storage, Amazon EBS volumes provide persistent block-level storage designed for Amazon EC2. An EBS volume is like an external hard drive attached to your system. Amazon EBS volumes range in size from 1GB to 16TB and come in three options:

  1. General Purpose Magnetic storage
  2. General Purpose SSD storage
  3. Provisioned IOPS SSD storage

By default, AWS allows you to have 5,000 EBS volumes, 20TB of Magnetic storage, 20TB of SSD storage, 20TB of Provisioned IOPS storage, and 40,000 Provisioned IOPS. If you need more, you can ask AWS to raise these limits.

Best used for

  • Data that changes often and necessitates long-term persistence
  • Being used as primary storage for a file system or a database
  • Applications that need access to raw block-level storage

Performance

  • Offers two volume types: standard volumes and Provisioned IOPS volumes
  1. Standard volumes
    • Provide cost-effective storage for apps with bursty I/O requirements
    • Designed to deliver 100 I/O operations per second (IOPS) on average
    • Ideal for use as boot volumes
  2. Provisioned IOPS volumes
    • Deliver high performance for I/O-intensive workloads like databases
    • Currently support up to 2,000 IOPS per volume
    • Can be striped together to offer thousands of IOPS per Amazon EC2 instance
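As a back-of-the-envelope sketch of the striping point above, the snippet below estimates how many Provisioned IOPS volumes you would need to stripe together to reach a target IOPS figure. It assumes the 2,000 IOPS limit applies per volume and that IOPS add up linearly across striped volumes, which is an idealization. The function name is ours.

```python
import math

MAX_IOPS_PER_VOLUME = 2_000  # limit quoted above, assumed to be per volume


def volumes_needed(target_iops: int) -> int:
    """Smallest number of striped volumes that can reach the target IOPS."""
    return math.ceil(target_iops / MAX_IOPS_PER_VOLUME)


# Hypothetical database workload that needs 7,500 IOPS.
print(volumes_needed(7_500))
```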

Durability and Availability

  • Highly available & reliable
  • No single point of failure
  • To maximize durability and availability of Amazon EBS data, it is vital to create snapshots of Amazon EBS volumes often
  • Snapshots deliver an easy-to-use disk clone for sharing, backup, and disaster recovery

Scalability and Elasticity

  • Scales up and down easily with total storage demands
  • To expand an Amazon EBS volume, take a snapshot and create a larger volume from it

Pricing

  • For Amazon EBS Magnetic volumes, AWS charges $0.05 per GB/month and $0.05 per 1 million I/O requests.
  • For Amazon EBS SSD volumes and Amazon EBS Provisioned IOPS volumes, AWS charges $0.10 per GB/month and $0.125 per GB/month respectively.
  • For Provisioned IOPS, AWS additionally charges $0.065 per provisioned IOPS/month.

A Real-World Use Case

A healthcare company needs to store 7TB of data in the distributed replicated mode, for their shared application and medical records. For this, they use Gluster storage, which allows their application instances to access shared storage. They use two EC2 instances, each with four EBS 1TB SSD volumes. These volumes are configured in distributed replicated mode and the Gluster volume is mounted on other instances that also require access.

Now, let’s calculate the cost of this use case:

Total storage = 2 instances x 4 x 1TB EBS SSD volumes = 8TB of EBS SSD storage

Total price = 8TB (8,000GB) x $0.10 per GB/month = $800 per month
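The calculation generalizes to the other volume types. Here is a minimal sketch that uses the per-GB rates from the Pricing list and adds the per-IOPS charge for Provisioned IOPS volumes. It leaves out the per-request charge for Magnetic volumes, and the function and dictionary names are illustrative.

```python
# Per-GB monthly rates quoted in the pricing list above (USD).
RATE_PER_GB = {"magnetic": 0.05, "ssd": 0.10, "piops": 0.125}
PIOPS_RATE = 0.065  # per provisioned IOPS per month


def ebs_monthly_cost(volume_type, size_gb, volumes=1, provisioned_iops=0):
    """Monthly cost of identical volumes (Magnetic I/O request charges omitted)."""
    cost = RATE_PER_GB[volume_type] * size_gb * volumes
    if volume_type == "piops":
        # provisioned_iops is the IOPS provisioned on each volume.
        cost += PIOPS_RATE * provisioned_iops * volumes
    return cost


# The healthcare scenario: two instances with four 1TB SSD volumes each.
print(ebs_monthly_cost("ssd", 1000, volumes=2 * 4))

# A hypothetical 100GB Provisioned IOPS volume with 1,000 IOPS.
print(ebs_monthly_cost("piops", 100, provisioned_iops=1000))
```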

4. AWS Storage Gateway

AWS Storage Gateway connects on-premises software applications with cloud-based storage. It provides seamless and secure integration between your on-premises IT environment and the Amazon storage infrastructure. It supports three configurations: Gateway-Cached Volumes, Gateway-Virtual Tape Library (VTL), and Gateway-Stored Volumes.

Best used for

  • Corporate file sharing
  • Data mirroring to cloud resources
  • Disaster recovery

Performance

As this service sits between Amazon S3, your application, and underlying on-premises storage, its performance depends on the following factors:

  • Bandwidth between the gateway VM & Amazon S3
  • Speed & configuration of underlying local disks
  • Amount of local storage assigned to the gateway VM
  • Network bandwidth between iSCSI initiator & gateway VM

Durability and Availability

  • Durably stores your on-premises application data by uploading it to Amazon S3
  • Amazon S3 performs regular, systematic data integrity checks and is self-healing by design

Scalability and Elasticity

  • Stores data in Amazon S3, which can offer a high level of scalability & elasticity automatically
  • Store any number of objects
  • Can store an infinite number of bytes
  • Supports an infinite number of files

Pricing

  • Charges only for what you use
  • Includes four pricing components - gateway usage (per gateway per month), volume storage usage (per GB per month), snapshot storage usage (per GB per month) and data transfer out (per GB per month)

A Real-World Use Case

A single Tape Gateway is configured with 10 full 100 GB tapes that store online backups. Suppose you also keep 5 more 100 GB tapes archived in the cloud and plan to retrieve one of them to restore a backup:

The first 100GB is free. In this example, that's one tape stored locally

Online (local) tape storage capacity cost = $0.023/GB/month

Hence, the remaining 900 GB costs $20.70/month

Offline (cloud archive) tape storage capacity cost = $0.004/GB/month

Hence, 500 GB costs $2/month

Archived tape retrieval cost = $0.01/GB, so a full 100 GB tape costs $1 to retrieve

The data transfer cost is $0.09/GB, so 100 GB costs $9 to transfer back to your premises

In this scenario, you can protect 1TB of capacity with an additional 500 GB archive for $22.70 per month. Retrieving a single, complete 100 GB virtual tape from your archive and reading the contents would cost $10 each time. Remember that this scenario assumes zero compression; compressing your data will reduce these costs.
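For readers who want to plug in their own tape counts, the scenario can be sketched as two small functions: one for the recurring monthly charge and one for the one-off cost of restoring a tape. The rates are the per-GB figures quoted in this example, and the names are ours. Treat it as an estimate, not a billing tool.

```python
# Per-GB rates quoted in the example above (USD).
ONLINE_PER_GB = 0.023     # online (local) tape storage, per month
ARCHIVE_PER_GB = 0.004    # offline (cloud archive) tape storage, per month
RETRIEVAL_PER_GB = 0.01   # archived tape retrieval
TRANSFER_PER_GB = 0.09    # data transfer out
FREE_ONLINE_GB = 100      # first 100 GB of online storage is free


def tape_monthly_cost(online_gb: float, archived_gb: float) -> float:
    """Recurring monthly storage charge for online and archived tapes."""
    billable_online = max(online_gb - FREE_ONLINE_GB, 0)
    return billable_online * ONLINE_PER_GB + archived_gb * ARCHIVE_PER_GB


def tape_restore_cost(tape_gb: float) -> float:
    """One-off cost to retrieve an archived tape and transfer it out."""
    return tape_gb * (RETRIEVAL_PER_GB + TRANSFER_PER_GB)


print(round(tape_monthly_cost(1000, 500), 2))  # 10 online tapes, 5 archived
print(round(tape_restore_cost(100), 2))        # restore one 100 GB tape
```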

Overall, comparing these storage options makes it clear that Amazon S3 provides high availability and durability. Moreover, it integrates easily with third-party applications and scales on demand. In terms of performance, however, EBS remains superior to Amazon S3.

For any company or business in need of a storage platform, Amazon is a trusted and tested solution. With so many storage options available on AWS, you can easily find one that fits your business needs.