Onboarding : Azure Data Storages and databases

Use Cases

  • Determine the storage, based on the requiremenst
  • Azure Cosmos DB (for semi-structured / No-SQL)
  • Azuer SQL database (structured data)
  • Azure Storage
    • Azure Blob stoarge (Unstructured data)
    • Azure Files storage
    • Azure Queues storage
    • Azure Table storage
    • Diagnostic settings
  • Azure HBase
  • Azure Redis cache
  • Azure App Service

Related topics

Determine the storage, based on the requiremenst

Correct solution storage help to have

  • better performance
  • cost saving
  • improve manageability
  • How to classify the data
    • Structured -> e.g. sometimes relational data, data with strict schema -> using stored in database table
    • Semi-structured -> e.g. data contains tags, non-relational, NoSQL, -> structure is defined with selialization languages (for independent systems)
      • XML : Extensible Markup Language -> tag & attribute
      • JSON : Javascript Object Notation -> like key/value
      • YAML : Yaml Ain’t Markup Language -> usually used for configuration
    • Unstructured -> e.g. file, Photo, video, media, office files, log files, text files
  • How data will be used and operations on data
    • OLTP : Online Transaction Processing -> with Transaction and ACID guarantees (Atomicity, Consistency, Isolation, Durability)
    • OLAP : Online Analytical Processing
  • How to have the best performance for the application

Azure Analytics Services can be used with Azure SQL Database

Developers can refer to Azure Storage and Best Practices Document.

Azure Cosmos DB (for semi-structured / No-SQL)

  • Semi-structured data because of the need to extend and modify the schema
  • Suitable for high number of read operation
  • Ability to query on many fields
  • High number of write
  • Support SQL for queries
  • Every properties is indexed by default (good performance by quering)
  • ACID compliance / Transaction
  • Data can be replicated anywhere (for low latency)
  • By replication one of the consistency level can be selected
  • Can be scaled up/down
  • Elastic storage and throughput
  • low latency
  • flexible consistency model
  • enterprise-grade security
  • left side are supported data model
  • right side are model data exposing tools
  • Turnkey global distribution
    • quickly replicate data to a new location
    • global availability is a configuration setting / distributed database system
    • Manual and automatic failover
    • support multi-read and multi-write

Azuer SQL database (structured data)

  • Structured data in columns and semi-structured data in Json columns (easily extandable)
  • Doesn’t index every property in json column only the specified properties
  • Not suitable for semi-structured data
  • for complex analytical queries across multiple databases
  • Transactional
  • Suitable tools for business analytics
  • Can be paired with Azure Analytics Service for creating semantic models
  • By default Azure SQL Database backups are stored as RA-GRS storage blobs that are replicated to a paired region for protection against outages impacting backup storage in the primary region. If you plan to keep data only in a single region, please contact sqlbackuppreview@microsoft.com

Available solutions for SQL Databases

  • Azure SQL databases are a Platform-as-a-Service (PaaS) offering (less infrastructure & maintenance)
    • Microsoft Azure manages the hardware, software updates, and OS patches
    • get the performance you need at a predictable cost
    • with a firewall 
    • with first creation, we have to create Azure SQL logical server (it’s a logical server as an administrative container)
    • we can control logins, firewall rules, and security policies through the logical server
    • can also override these policies on each database within the logical server
  • Install SQL Server Instance on a VM

Scenario: you’re a software development lead at a transportation logistics company, Contoso Transport. The transportation industry requires tight coordination among everyone involved: schedulers, dispatchers, drivers, and even customers.
Your current process involves piles of paper forms and hours on the phone to coordinate shipments. You find that paperwork is often missing signatures and dispatchers are frequently unavailable. These holdups leave drivers sitting idle; and as a result, some important shipments arrive late.
Customer satisfaction and repeat business are crucial to your bottom line, so your team decides to move from paper forms and phone calls to digital documents and online communication. Going digital will enable everyone to coordinate and track shipment times through their web browser or mobile app.
You want to quickly prototype something to share with your team. Your prototype will include a database to hold driver, customer, and order information. Your prototype will be the basis for your production app, so the technology choices you make now should carry over to what your team delivers.

DTUs versus vCores
  • DTU and vCore are purchasing model
  • DTU
    • Database Transaction Unit (DTU)
    • is a combined measure of compute + storage + IO resources / preconfigured purchase option
    • Because your logical server can hold more than one database, there’s also the idea of eDTUs, or elastic Database Transaction Units. This option enables you to choose one price, but allow each database in the pool to consume fewer or greater resources depending on current load
    • it provides a good balance of compute, storage, and IO performance and is less expensive to get started
    • DTU model provides fixed combinations of compute, storage, and IO resources
    •  provides a simple, preconfigured purchase option. To increase IO throughput, you would need to move to a higher tier that also increases your storage and compute power, things you don’t need.
  • vCore
    • Virtual cores (vCore)->give you greater control over the compute and storage resources that you create and pay for
    • vCore model enables you to configure resources independently. For example, with the vCore model you can increase storage capacity but keep the existing amount of compute and IO throughput
SQL elastic pools
  • When you create your Azure SQL database, you can create a SQL elastic pool
  • SQL elastic pools relate to eDTUs. They enable you to buy a set of compute and storage resources that are shared among all the databases in the pool. Each database can use the resources they need, within the limits you set, depending on current load
Collation
  • refers to the rules that sort and compare data
  • helps you define sorting rules when case sensitivity, accent marks, and other language characteristics are important.

Let’s take a moment to consider what the default collation, SQL_Latin1_General_CP1_CI_AS, means.

  • Latin1_General refers to the family of Western European languages.
  • CP1 refers to code page 1252, a popular character encoding of the Latin alphabet.
  • CI means that comparisons are case insensitive. For example, “HELLO” compares equally to “hello”.
  • AS means that comparisons are accent sensitive. For example, “résumé” doesn’t compare equally to “resume”.
# Get information about your Azure SQL database

# Configure the default values in shell session 
az configure --defaults group=learn-34caf1bf-72a7-4f3f-8a42-4036206377e2 sql-server=logistics-vm

# ist all databases on your Azure SQL logical server
az sql db list

#jq is a command-line JSON parser. You'll pipe output from az commands to this tool to extract important fields from JSON output.
az sql db list | jq '[.[] | {name: .name}]'
# output
[
  {
    "name": "Logistics"
  },
  {
    "name": "master"
  }
]
# Like SQL Server, master includes server metadata, such as sign-in accounts and system configuration settings.


# get details about the Logistics database
az sql db show --name Logistics
az sql db show --name Logistics | jq '{name: .name, maxSizeBytes: .maxSizeBytes, status: .status}'
# output
{
  "name": "Logistics",
  "maxSizeBytes": 2147483648,
  "status": "Online"
}
# You see that the database is online and can hold around 2 GB of data.

# get connection string
az sql db show-connection-string --client sqlcmd --name Logistics
# output
"sqlcmd -S tcp:Logistics-vm.database.windows.net,1433 -d Logistics -U <username> -P <password> -N -l 30"

# start an interactive session
sqlcmd -S tcp:Logistics-vm.database.windows.net,1433 -d Logistics -U <username> -P <password> -N -l 30

# create a table
CREATE TABLE Drivers (DriverID int, LastName varchar(255), FirstName varchar(255), OriginCity varchar(255));
GO

SELECT name FROM sys.tables;
GO

INSERT INTO Drivers (DriverID, LastName, FirstName, OriginCity) VALUES (123, 'Zirne', 'Laura', 'Springfield');
GO

SELECT DriverID, OriginCity FROM Drivers;
GO

UPDATE Drivers SET OriginCity='Boston' WHERE DriverID=123;
GO

SELECT DriverID, OriginCity FROM Drivers;
GO

DELETE FROM Drivers WHERE DriverID=123;
GO

SELECT COUNT(*) FROM Drivers;
GO


Source

Azure Storage

  • Storage is managed service
  • Durable, Secure, Scalable, Managed
  • Single subscription can host up to 200 storage accounts, and each can hold 500 TB.
  • Azure data services
    • Blob
    • File: Managed files share
    • Queue: Messaging store for reliable messaging between application components.
    • Table: NoSQL store for schemaless storage
  • Azure Storage is accessable over HTTP/HTTPS
  • Azure Storage has REST API

Pricing

  1. General-Purpose V2 (GPv2)
    1. support all services
    2. Lowest price per gigabyte
  2. General-Purpose V1 (GPv1)
    1. doesn’t support Cool and archive storage
    2. pricing is lower for workloads with high churn or high read rate
  3. Blob Storage Account
Azure Blob stoarge (Unstructured data)
  • Massive Unstructured data e.g. files, Photos, Videos, Documents
  • Full static websites
  • Storing files for distributed access
  • Streaming Audio & Video
  • Retrieve data by ID
  • High number of read with low latency but not frequent
  • Works with Azure Content Delivery Network (CDN) for caching most frequently used content and storing it on edge servers (low latency)
  • Move file from hot tier to cold or archive tier
  • Scalable object Store
  • Support for Azure Data Lake Storages

Different type of blobs

  • Block blobs: Text and binaries up to 5 TB each block 100 MB
  • Page blobs: VHDs
  • Append blobs:
Azure Files
  • Enable to set up highly availble network file share
  • Access shared files by using standard Server Message Block Protocol (SMB)
  • Multiple VMs can share the same files with both read and write
  • Can read files by using the REST interface or client interface
  • Can associate a unique URL for any file to allow fine-grained access to a private file for a set period of time.
  • Scenarios of file sharing
    • Storing shared configuration files for VMs, tools, or utilities so that everyone is using the same version
    • Log files such as diagnostics, metrics, and crach dumps.
    • Shared data between on-premises application and Azure VMs to allow migration of apps to the cloud over a period of time.
Azure Queues
  • For storing and retrieving large number of messaages
  • Queue messages can be up to 64 KB in size
  • Queue can contains millions of messages
  • Messages are processed asynchronously
  • Loosely connect different parts of the application together (the services can be developed separately)
  • Decoupling causes a better durability across large workloads, and components can be scaled independantly
  • One or more sender and receiver components
  • It can be used for creating a backlogs of works
  • For distributing loads among different web servers
  • For building resilience against component fauilure

Sources

Azure Table storage

Diagnostic settings

Log data in storage account (v2) cab be streamed to. As long as the destination is not configured the diagnostic setting cannot be saved.

  • Log Analytics
  • Storage Account
  • Event Hub

Azure HBase

Azure Redis cache

Azure App Service

  • data can be stored on app service as well but when the data volume must not be so big
  • It’s not a good storage for global audience
  • app service is mainly
    • a platform as a service (PaaS)
    • each web app must have an app service plan
    • endpoints are load balancved
    • KUDU -> scaled architecture which use compute instances
  • app service diagnostics overview
    • for issues like 500 errors
    • diagnostic is intelligent & interactive
    • App service > diagnostics and solve problems
  • run backend tasks with app service
    • via webjob
    • it doesn’t have additional costs
    • it runs in the same context of the web app (still doesn’t support linux)
    • itsupports different laguages (py|php|ps1|cmd|bat|exe|js|jar)
    • app service > web jobs> +add button
      • type : continous | triggered
        • if triggered : manual | schedules (CRON expression 0 0/20 * * * *)
      • scale : multiscale
    • supports remote debuding
web job (continuous)web job (triggered)
starts immediately after creationstarts when triggered
job can be resrtarted
runs on all instances that web app runsruns on only one instance, which is for load balancing
optionally can be restricted to only one instance
support remote debuggingdoesn’t support remote debugging
  • from VS .NET > Project right click > publish / republish
  • http://<app-name&gt;.azurewebsites.net
  • App Service has these SKUs (Free, Shared, Basic, Standard, Premium)

Lift-and-shift / rehosting -> is a strategy for migrating a workload to cloud without redesigning the application or making code changes

Cloud optimization -> is a strategy for migration to cloud by refactoring an application to take advantage of cloud native features and scalabilities.


Resources


Always move on to either find or build the way to the summit of your life.

Parisa Moosavinezhad


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: