
Cray in Azure for weather forecasting


When we announced our partnership with Cray, it was very exciting news. I received my undergraduate degree in meteorology, so my mind immediately went to how this could be a benefit to weather forecasting.

Weather modeling is an interesting use case. It requires a large number of cores with a low-latency interconnect, and it is very time sensitive. After all, what good is a one-hour weather forecast if it takes 90 minutes to run? And weather is a very local phenomenon. In order to resolve smaller-scale features without shrinking the domain or lengthening runtime, modelers must add more cores. A global weather model with 0.5-degree grid spacing can require as many as 50,000 cores.

At that large of a scale, and with the performance required to be operationally useful, a Cray supercomputer is an excellent fit. But the model by itself doesn’t mean much. The model data needs to be processed to generate products. This is where Azure services come in.

Website images are one obvious product of weather models. Image generation programs are small-scale and can be run in parallel, so they're a great fit for the elasticity of Azure virtual machines. The same can be said for generating model output statistics forecasts, a form of forecast that applies statistical regression to the raw model output to eliminate bias and add fields that are not directly forecast by the model. Artificial intelligence is starting to be used as a forecasting tool as well.

To put this end-to-end workflow together requires more than just the Cray supercomputer. It would use virtual machines, perhaps with Batch to manage tasks. It would use storage: disks attached to virtual machines, blob storage for scalability, and archive storage to hold the raw data for later re-analysis. AI-generated forecasts can use Azure’s broad suite of AI products. And if you’re serving web images, the Azure CDN provides reliable product delivery.
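As a small, hedged illustration of the product-generation side of that workflow, here is a sketch of a worker uploading rendered forecast images to Blob Storage for the CDN to serve. The connection string, container, and file names are placeholders rather than part of any real pipeline.

```python
# Hypothetical sketch: upload rendered forecast images to Azure Blob Storage so
# the Azure CDN can serve them. Connection string, container, and file names are
# placeholders.
from azure.storage.blob import BlobServiceClient

conn_str = "<storage-account-connection-string>"  # placeholder
service = BlobServiceClient.from_connection_string(conn_str)
container = service.get_container_client("forecast-images")

# Each parallel image-generation task uploads its own products when done.
for image_path in ["surface_temp_000.png", "surface_temp_006.png"]:
    with open(image_path, "rb") as data:
        container.upload_blob(name=f"gfs/2018-02-26/{image_path}", data=data, overwrite=True)
```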

As you can see, Cray in Azure is an important piece of a larger computing ecosystem. Let us know how Cray in Azure can help your HPC workloads.


Spring Security Azure AD: Wire up enterprise grade authentication and authorization


We are pleased to announce that Azure Active Directory (Azure AD) is integrated with Spring Security to secure your Java web applications. With only a few lines of configuration, you can wire up enterprise-grade authentication and authorization for your Spring Boot project.

With the Spring Boot Starter for Azure AD, Java developers can now get started quickly building the authentication workflow for a web application that uses Azure AD and OAuth 2.0 to secure its back end. It also enables developers to create a role-based authorization workflow for a Web API secured by Azure AD with the power of Spring Security.

Getting Started

Take the To-do App, which Erich Gamma showed at SpringOne 2017, as an example. The sample is composed of two layers: an AngularJS client and a Spring Boot RESTful web service. It illustrates the flow to log in and retrieve the user's information using the Azure AD Graph API.

Authorization Flow Chart

The authorization flow is composed of three phases (a minimal sketch in code follows the flow chart below):

  1. Login with credentials and get validated through Azure AD.
  2. Retrieve token and membership information from Azure AD Graph API.
  3. Evaluate the membership for role-based authorization.

[Image: Spring Security Azure AD authorization flow chart]
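The Spring Boot starter wires these phases up for you in Java. To make the underlying protocol concrete, here is a minimal sketch of the same three phases in Python using the MSAL library, with group membership read from Microsoft Graph rather than the AAD Graph API used by the sample; every id, secret, URL, and group name below is a placeholder, not taken from the To-do App.

```python
# Hypothetical sketch of the three phases above (not the Spring starter itself).
import msal
import requests

TENANT_ID = "<tenant-id>"            # placeholder
CLIENT_ID = "<app-client-id>"        # placeholder
CLIENT_SECRET = "<app-client-key>"   # placeholder

app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    client_credential=CLIENT_SECRET,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
)

# Phase 1/2: exchange the authorization code from the login redirect for a token.
result = app.acquire_token_by_authorization_code(
    code="<authorization-code-from-login-redirect>",       # placeholder
    scopes=["https://graph.microsoft.com/.default"],        # assumes consented Graph permissions
    redirect_uri="http://localhost:8080/login/callback",    # placeholder
)

# Phase 2/3: retrieve group membership and evaluate it for role-based access.
groups = requests.get(
    "https://graph.microsoft.com/v1.0/me/memberOf",
    headers={"Authorization": f"Bearer {result['access_token']}"},
).json()
is_admin = any(g.get("displayName") == "todo-admins"        # placeholder group name
               for g in groups.get("value", []))
```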

Register a new application in Azure AD

To get started, first register a new application in Azure Active Directory. After the app is ready, generate a client key and grant permissions to the app.

[Image: Registering a new application in Azure AD]

Features of Spring Security Azure AD

Use Spring Initializr to quick-start a new project with the Spring Security and Azure Active Directory dependencies. Specify the Azure AD connection settings and wire up the AAD AuthFilter in your project. Now you can easily set up AAD authentication and role-based authorization with the following features:

  • @PreAuthorize: Implement Spring’s @PreAuthorize annotation to provide method-level security with roles and permissions of logged-in users.
  • isMemberOf(): provide access control with roles and permissions based on a specified Azure user group.

[Image: Spring Security Azure AD features]

Access Control with Azure AD Group

Run and test your app in a web browser. Now you can easily use Azure AD Group for access control by adding or removing group members.

[Image: To-do app secured with Azure AD]

[Image: Access control with an Azure AD group]

Next Steps

For more information about using Spring on Azure, visit the following pages:

Feedback

Please share your feedback and ask questions to help us improve. You can contact us on Gitter.

Migrating to Azure SQL Database with zero downtime for read-only workloads


Special thanks to MSAsset engineering team’s Peter Liu (Senior Software Engineer), Vijay Kannan (Software Engineer), Sathya Muhandiramalage (Senior Software Engineer), Bryan Castillo (Principal Software Engineer) and Shail Batra (Principal Software Engineering Manager) for sharing their migration story with the Azure SQL Database product team.

Microsoft uses an internally written service called MSAsset to manage all Microsoft data center hardware around the world. MSAsset is used for tracking Microsoft’s servers, switches, storage devices, and cables across the company and requires 24/7 availability to accommodate break-fix requirements.

Before migrating to Azure SQL Database last year, MSAsset’s data tier consisted of a 107 GB database with 245 tables on SQL Server. The database was part of a SQL Server Always On Availability Group topology used for high availability and the scaling out of read-activity.

The MSAsset engineering team faced the following issues:

  • Aging hardware was not keeping up with stability and scale requirements.
  • There was an increase in high severity data-tier incidents and no database administrator on staff to help with troubleshooting, mitigation, root cause analysis and ongoing maintenance.
  • MSAsset’s database ran on SQL Server 2012. Developers and internal customers were increasingly requesting access to new SQL Server functionality.

After exploring various options and weighing several factors, the MSAsset engineering team decided that Azure SQL Database was the appropriate data tier for their future investment and would address all of their key pain points. With the move to Azure SQL Database, they gained increased scalability, built-in manageability and access to the latest features.  

With 24/7 availability requirements, the engineering team needed to find a way to migrate from SQL Server to Azure SQL Database without incurring downtime for read-only activity. MSAsset is a read-heavy service, with a much smaller percentage of transactions involving data modifications. Using a phased approach, they were able to move to Azure SQL Database with zero downtime for read-only traffic and less than two hours of downtime for read-write activity. This case study briefly describes how this was accomplished.

The original MSAsset architecture

The original MSAsset application architecture consisted of a web tier with read-write access to the primary database located on a SQL Server 2012 instance. The database was contained within an Always On Availability Group with one synchronous read-only secondary replica and three read-only asynchronous secondary replicas. The application used an availability group listener to direct incoming write traffic to the primary replica. To accommodate the substantial amount of read-only reporting traffic, a proprietary load balancer was used to direct requests across the read-only secondary replicas using a round-robin algorithm.
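For readers less familiar with Always On topologies, the hedged sketch below (Python with pyodbc) shows the general pattern of separating read-write and read-only connections. The listener and database names are placeholders, and MSAsset's proprietary load balancer is not shown; it distributed read-only connections itself rather than relying on the ApplicationIntent hint.

```python
# Hedged sketch: typical connection pattern for an Always On Availability Group.
# Server, database, and driver names are placeholders. MSAsset used a proprietary
# load balancer to spread read-only traffic; ApplicationIntent is shown here only
# to illustrate the read/write split described above.
import pyodbc

# Read-write traffic goes through the availability group listener to the primary.
write_conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=msasset-listener.corp.contoso.com;DATABASE=MSAsset;"
    "Trusted_Connection=yes"
)

# Read-only reporting traffic declares its intent so it can land on a secondary.
read_conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=msasset-listener.corp.contoso.com;DATABASE=MSAsset;"
    "ApplicationIntent=ReadOnly;Trusted_Connection=yes"
)
```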

[Image: Original MSAsset architecture]


When planning the move to Azure SQL Database, the proposed new solution, like the legacy SQL Server solution, needed to accommodate one read-write database and, depending on the final migrated workload volume and associated Azure SQL Database resource consumption, one or more read-only replicas.

Using a phased migration approach

The MSAsset engineering team used a phased incremental approach for moving from SQL Server to Azure SQL Database.  This incremental approach helped reduce the risk of project failure and allowed the team to learn and adapt to the inevitable unexpected variables that arise with complex application migrations.

The migration phases were as follows:

  1. Configure hybrid SQL Server and Azure SQL Database read-only activity, while keeping all read-write activity resident on the legacy SQL Server database.
    • Set up transactional replication from SQL Server to Azure SQL Database, for use in accommodating read-only activity.
    • Monitor the replication topology for stability, performance, and convergence issues. 
    • As needed, create up to four active geo-replication readable secondary databases in the same region to accommodate read-only traffic scale requirements.
    • Once it is confirmed the topology is stable for a sustained period of time, use load-balancing to direct read-only activity to Azure SQL Database, beginning with 25 percent of the read-only traffic. Over a period of weeks, increase to 50 percent, and then 75 percent. For load balancing, the MSAsset engineering team uses a proprietary application-layer library.
    • Along the way, use Query Performance Insight to monitor overall resource consumption and top queries by CPU, duration, and execution count. MSAsset also monitored application metrics, including API latencies and error rates.
    • Adjust the Azure SQL Database service tiers and performance levels as necessary.
    • Move or redirect any high-resource consuming unnecessary legacy traffic to bulk access endpoints.
  2. After stabilizing in the prior phase of 75 percent read-only activity on Azure SQL Database, move 100 percent of the read-only traffic to Azure SQL Database.
    • Again, use Query Performance Insight to monitor overall resource consumption and top queries by CPU, duration, execution count. Adjust the Azure SQL Database service tiers and performance levels as necessary and create up to four active geo-replication readable secondary databases in the same region to accommodate read-only traffic.
  3. Prior to the final cut-over to Azure SQL Database, develop and fully test a complete rollback plan. The MSAsset team used SQL Server Data Tools (SSDT) data comparison functionality to collect the delta between Azure SQL Database and a four-day-old backup, and then applied the delta to the SQL Server database.
  4. Lastly, move all read-write traffic to Azure SQL Database. In MSAsset’s case, in preparation for the final read-write cutover, they reseeded, via transactional replication, a new database in Azure SQL Database for read-write activity moving forward. The steps they followed:
    • After the full reseeding, wait for remaining transactions on SQL Server to drain before removing the transactional replication topology.
    • Change the web front-end configuration to use the Azure SQL Database primary database for all read-write activity. Use read-only replicas for read-only traffic.
    • After a full business cycle of monitoring, decommission the SQL Server environment.

This phased approach allowed the MSAsset team to incur no downtime for read-only activity and also helped them minimize risk, allowing enough time to learn and adapt to any unexpected findings without having to revert to the original environment. 

The final MSAsset architecture uses one read-write Azure SQL Database replica and four active geo-replication readable secondary databases. 

[Image: Final MSAsset architecture on Azure SQL Database]

The remaining sections will talk about key aspects and lessons learned from the migration effort.

Creating a read-only Azure SQL Database using Transactional Replication

The first phase involved setting up transactional replication from SQL Server to Azure SQL Database, ensuring a stable replication topology with no introduced performance or convergence issues. 

The MSAsset engineering team used the following process for setting up transactional replication:

  • They first reviewed the existing SQL Server database against the requirements for replication to Azure SQL Database. These requirements are detailed in the Replication to SQL Database documentation. For example, a small number of the legacy tables for MSAsset did not have a primary key, so a primary key had to be added for those tables to be supported by transactional replication (see the sketch after this list). Some of the tables were no longer being used, so it was also an opportunity to clean up stale objects and associated code.
  • Since the MSAsset publication was hosted on an Always On Availability Group, the MSAsset team followed the steps detailed in Configure Replication for Always On Availability Groups (SQL Server) to configure transactional replication.
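As a concrete illustration of the primary-key requirement called out above, here is a hedged sketch (Python with pyodbc; the connection string, table, and column names are hypothetical) of finding tables without a primary key and adding a surrogate key to one of them.

```python
# Hypothetical sketch: tables without a primary key cannot be published for
# transactional replication to Azure SQL Database, so find them and fix them.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=<server>;DATABASE=<db>;Trusted_Connection=yes"
)
cur = conn.cursor()

cur.execute("""
    SELECT s.name AS schema_name, t.name AS table_name
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    WHERE OBJECTPROPERTY(t.object_id, 'TableHasPrimaryKey') = 0
""")
for schema_name, table_name in cur.fetchall():
    print(f"{schema_name}.{table_name} has no primary key and cannot be replicated as-is")

# Example fix for one such table (table and column names are made up):
cur.execute("""
    ALTER TABLE dbo.LegacyCableInventory
    ADD CableInventoryId INT IDENTITY(1, 1) NOT NULL
        CONSTRAINT PK_LegacyCableInventory PRIMARY KEY CLUSTERED
""")
conn.commit()
```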

For an overview of two primary methods for migrating from SQL Server to Azure SQL Database, see SQL Server database migration to SQL Database in the cloud.

Once transactional replication was configured and fully synchronized, read-only traffic was first directed to both SQL Server and Azure SQL Database with read-write activity continuing to go just against the SQL Server-resident database.

[Image: Hybrid read-only topology using transactional replication]

The read-only traffic against Azure SQL Database was incrementally increased over time to 25 percent, 50 percent, and 75 percent, with careful monitoring along the way to ensure sufficient query performance and DTU availability. The MSAsset team used a proprietary load balancing application library to distribute load across the various read-only databases. Once stabilized at 75 percent, the MSAsset team moved 100 percent of read-only activity to Azure SQL Database and continued with the other phases described earlier.

Cleanup opportunities

The MSAsset team also used this as an opportunity to clean up rogue reporting processes. These included in-house Microsoft reporting tools and applications that, while permitted to access the database, had other data warehouse options more appropriate for ongoing use than MSAsset. When encountering rogue processes, the MSAsset team reached out to the owners and had them re-route to appropriate data stores. Disused code and objects, when encountered, were also removed.

Redesigning around compatibility issues

The MSAsset team discovered two areas that required re-engineering prior to migration to Azure SQL Database:

  • Change Data Capture (CDC) was used for tracking data modifications on SQL Server. This process was replaced with a solution that leverages temporal tables instead (a minimal sketch follows this list).
  • SQL Server Agent Jobs were used for executing custom T-SQL scheduled jobs on SQL Server. All SQL Server Agent Jobs were replaced with Azure worker roles that invoked equivalent stored procedures instead.
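To show what the temporal-table replacement for CDC looks like in practice, here is a hedged sketch; the table, columns, and history table name are hypothetical, not MSAsset's actual schema.

```python
# Hypothetical sketch: a system-versioned (temporal) table keeps a full change
# history automatically, which is the pattern that replaced CDC here.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=<server>;DATABASE=<db>;Trusted_Connection=yes"
)
conn.cursor().execute("""
CREATE TABLE dbo.Asset
(
    AssetId   INT           NOT NULL PRIMARY KEY,
    AssetName NVARCHAR(200) NOT NULL,
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   DATETIME2 GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.AssetHistory))
""")
conn.commit()
```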

The team used Data Migration Assistant to detect compatibility issues and also used the following reference, Resolving Transact-SQL differences during migration to SQL Database.

Microsoft is also introducing a new deployment option, Azure SQL Database Managed Instance, which will bring increased compatibility with on-premises SQL Server. An expanded public preview is coming soon.

Understanding networking and connectivity with Azure SQL Database

With an array of services requiring access to MSAsset’s data tier, the engineering team had to familiarize themselves with Azure SQL Database networking and connectivity requirements as well as fundamentals. Having this background was a critical aspect of the overall effort and should be a core focus area of any migration plan to Azure SQL Database.

To learn about Azure SQL Database connection fundamentals and connection troubleshooting essentials, see Azure SQL Database Connectivity Architecture and Troubleshoot connection issues to Azure SQL Database.

Modernizing the platform and unlocking cloud scalability

The original MSAsset SQL Server hardware was powerful, but old. Before moving to Azure SQL Database, the MSAsset engineering team considered replacing the servers, but they were concerned about the projected cost and the hardware's ability to keep up with MSAsset's projected workload growth over the next five years. The MSAsset engineering team was also concerned about keeping up with the latest SQL Server versions and having access to the latest features.

Moving to Azure SQL Database means that the MSAsset team can scale resources much more easily and no longer have to worry about outgrowing their existing hardware. They can also now access new features as they become available in Azure SQL Database without having to explicitly upgrade. They are also now able to leverage built-in capabilities unique to Azure SQL Database like Threat Detection and Query Performance Insight.

Reducing high severity issues and database management overhead

The MSAsset engineering team has no database administrator on staff; this, combined with aging hardware and standard DBA maintenance requirements, was a major contributor to increasingly frequent high severity incidents.

Moving to Azure SQL Database, the MSAsset team no longer worries about ongoing database server patching, backups, or complex high availability and disaster recovery topology configuration. Since moving to Azure SQL Database, the MSAsset engineering team has seen an 80 percent reduction in high severity issues for their data tier.

Next Steps

Learn more about Azure SQL Database and building scalable, low-maintenance cloud solutions in the What is SQL Database? Introduction to SQL Database documentation.

Want to get started but don’t know where to begin? Create your first SQL Database in Azure with your free Azure account.

Visual Studio Code C/C++ extension Feb 2018 update


The February 2018 update to the Visual Studio Code C/C++ extension is here! In addition to several bug fixes, this update added colorization for inactive code regions, making it easy to read C and C++ code. You can find the full list of changes in the 0.15.0 release notes.

Colorization for inactive regions

Inactive code regions that are controlled by conditional-compilation directives, such as #if and #ifdef, are now greyed out in the editor.

Join the Insiders program

If you have been using, or are interested in using, the C/C++ extension for Visual Studio Code and would like to get early access to the latest features and bug fixes, please use the following link to join the Insiders program:

https://aka.ms/vcvscodeinsiders

By joining the Insiders program, you get:

  • Early access to the Insiders build which includes the latest features and bug fixes.
  • A direct feedback channel to the development team to influence the future of the extension.

Tell us what you think

Download the C/C++ extension for Visual Studio Code, try it out, and let us know what you think. File issues and suggestions on GitHub. If you haven’t already provided us feedback, please take this quick survey to help shape this extension for your needs. You can also find us on Twitter (@VisualC).

 

One Email Rule – Have a separate Inbox and an Inbox CC to reduce email stress. Guaranteed.


Two folders in your email client: one called "Inbox" and another called "Inbox - CC." I've mentioned this tip before, but once more for the folks in the back. This email productivity tip is a game-changer for most information workers.

We all struggle with email.

  • Some of us just declare Email Bankruptcy every few months. Ctrl-A, delete, right? They'll send it again.
  • Some of us make detailed and amazing Rube Goldbergian email rules and deliberately file things away into folders we will never open again.
  • Some of us just decide that if an email scrolls off the screen, well, it's gone.

Don't let the psychic weight of 50,000 unread emails give you headaches. Go ahead, declare email bankruptcy - you're already in debt - then try this one email rule.

One Email Rule

Email in your inbox is only for email where you are on the TO: line.

All other emails (BCC'ed or CC'ed) should go into a folder called "Inbox - CC."

That's it.

I just got back from a week away. Look at my email there. 728 emails. Ugh. But just 8 were sent directly to me. Perhaps that's not a realistic scenario for you, sure. Maybe it'd be more like 300 and 400. Or 100 and 600.

Point is, emails you are CC'ed on are FYI (for your information) emails. They aren't Take Action Now emails. Now if they ARE, then you need to take a moment and train your team. Very simple, just reply and say, "oops, I didn't see this immediately because I was cc'ed. If you need me to see something now, please to: me." It'll just take a moment to "train" your coworkers because this is a fundamentally intuitive way to work. They'll say, "oh, makes sense. Cool."

Try this out and I guarantee it'll change your workflow. Next, do this. Check your Inbox - CC less often than your Inbox. I check CC'ed email a few times a week, while I may check Inbox a few times a day.

If you like this tip, check out my complete list of Productivity Tips!


Sponsor: Unleash a faster Python! Supercharge your application's performance on future-forward Intel® platforms with the Intel® Distribution for Python. Available for Windows, Linux, and macOS. Get the Intel® Distribution for Python* now!



© 2017 Scott Hanselman. All rights reserved.
     


TFS 2018.1 RTM is available


Today we released the final build of Team Foundation Server 2018 Update 1.  The key links are:

This release is primarily bug fixes for important issues and a few select features.  The next “big” feature release will be TFS 2018.2 – due in the May timeframe.  I wrote at some length in the 2018.1 RC post on how to think about this release and our update cadence moving forward.  I won’t try to repeat all of that here.

See the release notes for details on installing this release.

Please let us know if you have any issues with the release.

Brian

 

 

Introducing backup for Azure file shares


Today, we are excited to announce the public preview of backup for Azure file shares. Azure Files is a cloud-first file share solution with support for industry standard SMB protocol. Through this preview, Azure Backup enables a native backup solution for Azure file shares, a key addition to the feature arsenal to enable enterprise adoption of Azure Files. Using Azure Backup, via Recovery Services vault, to protect your file shares is a straightforward way to secure your files and be assured that you can go back in time instantly.

Backup for Azure File Shares

Key features

  • Discover unprotected file shares: Utilize the Recovery Services vault to discover all unprotected storage accounts and file shares within them.
  • Back up multiple file shares at a time: You can back up at scale by selecting multiple file shares in a storage account and applying a common policy to them.
  • Schedule and forget: Apply a Backup policy to automatically schedule backups for your file shares. You can schedule backups at a time of your choice and specify the desired retention period. Azure Backup takes care of pruning these backups once they expire.
  • Instant restore: Since Azure Backup utilizes file share snapshots, you can restore just the files you need instantly even from large file shares.
  • Browse individual files/folders: Azure Backup lets you browse the restore points of your file shares directly in the Azure portal so that you can pick and restore only the necessary files and folders.

Core benefits

  • Zero infrastructure solution: Azure Backup creates and manages the infrastructure required for protecting your file shares. No agents or virtual machines (VMs) need to be deployed to enable the solution.
  • Comprehensive backup solution: Azure Backup helps you manage the backup of Azure Files as well as Azure IaaS VMs, SQL Server running in IaaS VMs (preview), and on-premises servers. Backup and restore jobs across all workloads can be monitored from a single dashboard.
  • Directly recover files from the Azure portal: Apart from providing the ability to restore entire file shares, Azure Backup also lets you browse a recovery point directly in the portal. You can browse all the files and folders in a recovery point and choose to restore necessary items.
  • Cost effective: Backup for Azure Files is free* of charge during the preview and you can start enjoying all its benefits immediately.
  • Coming soon: Azure Backup has lined up an amazing list of features for backing up Azure file shares, and you can expect an update from us as early as next month. Stay tuned!

*Azure file share snapshots will be charged once the snapshot capability is generally available.

Get started

Start protecting your file shares by using the Recovery Services vaults in your region. The new backup goal options in the vault overview will let you choose Azure file shares to back up from storage accounts in your region. You can refer to our documentation for more details.

[Image: Backup goal selection in the Recovery Services vault]

Related links and additional content


Your guide to Azure services for apps built with Xamarin


When talking about app development today, the cloud is almost always part of the conversation. While many developers have an idea of the benefits that cloud can offer them – scalability, ready-to-use functionality, and security, to name a few – it’s sometimes hard to figure out where to start for the specific scenario you have in mind. Luckily, our mobile developer tools docs team has you covered!

Today, we’re happy to announce the availability of the “Mobile apps using Xamarin + Azure” poster. This poster serves as your one-stop guide to the most relevant cloud services that Azure has to offer to you as a mobile developer with Visual Studio and Xamarin.

Download Your Poster Here

We’re excited to hear your feedback on the poster and how we can make it even better for you to get the most out of Azure services for your mobile apps built with Xamarin. Leave us a comment below with what you think and happy coding!

Craig Dunn, Principal Program Manager
@conceptdev

Craig works on the Mobile Developer Tools documentation team, where he enjoys writing cross-platform code for iOS, Android, Mac, and Windows platforms with Visual Studio and Xamarin.

Rajen Kishna, Sr. Product Marketing Manager
@rajen_k

Rajen does product marketing at Microsoft for Visual Studio for Mac, as well as mobile and game developer tools, focusing on .NET.

Sync SQL data in large scale using Azure SQL Data Sync


Azure SQL Data Sync allows users to synchronize data between Azure SQL Databases and SQL Server databases in one direction or in both directions. This feature was first introduced in 2012. At that time, people didn't host many large databases in Azure, so some size limits were applied when we built the data sync service, including up to 30 databases (five on-premises SQL Server databases) in a single sync group, and up to 500 tables in any database in a sync group.

Today, there are more than two million Azure SQL Databases and the maximum database size is 4 TB, but those data sync limitations are still in place. This is mainly because syncing data is a size-of-data operation; without an architectural change, we can’t ensure the service can sustain the heavy load when syncing at large scale. We are working on improvements in this area, and some of these limitations will be raised or removed in the future. In this article, we show you how to use data sync to sync data between a large number of databases and tables, including some best practices and how to temporarily work around the database and table limitations.

Sync data between many databases

Large companies and ISVs use data sync to distribute data from a central master database to many client databases. Some customers have hundreds or even thousands of client databases in the whole topology. Users may hit one of the following issues when trying to sync between many databases:

  1. Hit the 30 databases per sync group limitation.
  2. Hit the five on-premises SQL Server databases per sync group limitation.
  3. Since all member databases will sync with the hub database, there’s a significant performance impact on the workload running in the hub database.

To work around the 30 databases or five on-premises databases per sync group limitation, we suggest you use a multi-level sync architecture. You can create a sync group to sync your master database with several member databases. Those member databases can then become the hub databases of sub sync groups and sync data to other client databases. Depending on your business and cost requirements, you can use the databases in the middle layers as client databases or as dedicated forwarders.

This multi-level sync architecture has benefits even if you don’t hit the 30-databases-per-sync-group limitation:

  • You can group clients based on certain attributes (location, brand…) and use different sync schema and sync frequency.
  • You can easily add more clients when your business is growing.
  • The forwarders (member databases in the middle layers) can share the sync overhead from the master database.

To make this multi-level sync topology work in your system, you will need a good balance between how many client databases are in a single sync group and how many levels are in the overall system. The more databases in a single sync group, the higher the impact on the overall performance of the hub database. The more levels you have in your system, the longer it takes for data changes to be broadcast to all clients.

When you are adding more member databases to the system, you need to closely monitor the resource usage in the hub databases. If you see consistent high resource usage, you may consider upgrading your database to a higher SLO. Since the hub database is an Azure SQL database, you can upgrade it easily without downtime.

Sync data between databases with many tables

Currently, data sync can only sync between databases with fewer than 500 tables. You can work around this limitation by creating multiple sync groups using different database users. For example, suppose you want to sync two databases with 900 tables. First, define two different users in the database you load the sync schema from, where each user can only see 450 (or any number fewer than 500) tables in the database. Sync setup requires ALTER DATABASE permission, which implies CONTROL permission over all tables, so you will need to explicitly DENY the permissions on tables which you don’t want a specific user to see, instead of using GRANT. You can find the exact privileges needed for sync initialization in the best practice guidance. Then you can create two sync groups, one for each user. Each sync group will sync 450 tables between these two databases. Since each user can only see fewer than 500 tables, you will be able to load the schema and create the sync groups! After the sync groups are created and initialized, we recommend you follow the best practice guidance to update the user permissions and make sure they have the minimum privileges for ongoing sync.
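Here is a hedged sketch of the permission split described above, using Python with pyodbc; the user names, passwords, and table lists are placeholders, and the exact privileges you grant or deny should follow the best practice guidance.

```python
# Hypothetical sketch: create two contained database users and DENY each one the
# tables that belong to the other sync group, so each user sees < 500 tables.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=<server>;DATABASE=<db>;UID=<admin>;PWD=<password>"
)
cur = conn.cursor()

cur.execute("CREATE USER sync_user_a WITH PASSWORD = '<strong-password-1>'")
cur.execute("CREATE USER sync_user_b WITH PASSWORD = '<strong-password-2>'")

tables_for_a = ["dbo.Table001", "dbo.Table002"]  # placeholder: first ~450 tables
tables_for_b = ["dbo.Table451", "dbo.Table452"]  # placeholder: remaining tables

for t in tables_for_b:   # hide group B's tables from user A
    cur.execute(f"DENY SELECT, INSERT, UPDATE, DELETE ON {t} TO sync_user_a")
for t in tables_for_a:   # hide group A's tables from user B
    cur.execute(f"DENY SELECT, INSERT, UPDATE, DELETE ON {t} TO sync_user_b")
conn.commit()
```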

Optimize the sync initialization

After the sync group is created, the first time you trigger the sync, it will create all tracking tables and stored procedures and load all data from source to target database. The initial data loading is a size-of-data operation. Initializing sync between large databases could take hours or even days if it is not set up properly. Here are some tips to optimize the initialization performance:

  1. Data sync will initialize the target tables using bulk insert if the target tables are empty. If you have data on both sides, even if data in source and target databases are identical (data sync won’t know that!), data sync will do a row-by-row comparison and insertion. It could be extremely slow for large tables. To gain the best initialization performance, we recommend you consolidate data in one of your databases and keep the others empty before setting up data sync.
  2. Currently, the data sync local agent is a 32-bit application. It can only use up to 4GB of RAM. When you are trying to initialize large databases, especially when trying to initialize multiple sync groups at the same time, it may run out of memory. If you encounter this issue, we recommend you add part of the tables into the sync group first, initialize with those tables, and then add more tables. Repeat this until all tables are added to the sync group.
  3. During initialization, the local agent will load data from the database and store it as temp files in your system temp folder. If you are initializing a sync group between large databases, make sure your temp folder has enough space before you start the sync. You can change your temp folder to another drive by setting the TEMP and TMP environment variables. You will need to restart the sync service after you update the environment variables. You can also add and initialize tables to the sync group in batches. Make sure the temp folder is cleaned up between batches.
  4. If you are initializing data from on-premises SQL Server to Azure SQL DB, you can upgrade your Azure SQL database temporarily before the initialization and downgrade it to the original SLO after the initialization is done (see the sketch after this list). The extra cost will be minimal. If your target database is SQL Server running in a VM, adding more resources to the VM will do the same.
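Tip 4 above can be scripted. Below is a hedged sketch of temporarily scaling the target Azure SQL database around a large initialization; the server, database, and tier names are placeholders, and the scale operation is asynchronous, so wait for it to finish before starting the sync.

```python
# Hypothetical sketch: scale the target Azure SQL database up before a large
# initialization and back down afterwards. ALTER DATABASE runs asynchronously;
# poll sys.dm_operation_status (or check the portal) before starting the sync.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=tcp:<server>.database.windows.net;DATABASE=master;UID=<admin>;PWD=<password>",
    autocommit=True,
)
cur = conn.cursor()

# Before initialization: scale up (placeholder tiers).
cur.execute("ALTER DATABASE [MemberDb] MODIFY (SERVICE_OBJECTIVE = 'S6')")

# ... run the initial sync ...

# After initialization completes: scale back down.
cur.execute("ALTER DATABASE [MemberDb] MODIFY (SERVICE_OBJECTIVE = 'S2')")
```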

Experiment of sync initialization performance

Following is the result of a simple experiment. I created a sync group to sync data from a SQL Server database in an Azure VM to an Azure SQL database. The VM and the SQL database are in the same Azure region, so the impact of network latency can be ignored. The sync group synced one table with 11 columns and about 2.1M rows; the total data size was 49.1GB. I did three runs with different source and target database configurations:

In the first run, the target database was S2 (50 DTU), and the source database was running in a D4S_V3 VM (4 vCPU, 16GB RAM). It took 50 minutes to extract the data to the temp folder and 471 minutes to load the data from the temp folder into the target database.

I upgraded the target database to S6 (400 DTU) and the Azure VM to D8S_V3 (8 vCPU, 32GB RAM) for the second run. This reduced the loading time to 98 minutes! The data extraction surprisingly took longer in this run. I can’t explain the regression since I didn’t capture the local resource usage during the run; it might have been a disk I/O issue. Even so, upgrading the target database to S6 reduced the total initialization time from 521 minutes to 267 minutes.

In the third run, I upgraded the target database to S12 (3000 DTU) and used the local SSD as the temp folder. This reduced the data extraction time to 39 minutes, the data loading time to 56 minutes, and the total initialization time to 95 minutes. It was 5.5 times faster than the first configuration, for the extra cost of a cup of coffee!

Conclusion

  1. Upgrading the target database (Azure SQL DB) to a higher SLO helps improve the initialization time significantly, with manageable extra cost.
  2. Upgrading the source database doesn’t help much, since data extraction is an I/O-bound operation and the 32-bit local agent can only use up to 4GB of RAM.
  3. Using an attached SSD as the temp folder helps with data extraction performance, but the ROI is not as high as upgrading the target database. You also need to consider whether the temp files will fit on the SSD.

Run   Target database SLO (Azure DB)   Source database SLO (VM)   Total initialization time   Data extract time   Data load time
1     S2                               D4S_V3                     521 min                     50 min              471 min
2     S6                               D8S_V3                     267 min                     *169 min            98 min
3     S12                              D8S_V3, attached SSD       95 min                      39 min              56 min

In this article, we provided some best practices for syncing data with the Azure SQL Data Sync service between many databases and between databases with many tables. Please find more information about data sync in the online documentation. More data sync best practices are available in Best Practices for Azure SQL Data Sync.

New Azure GxP guidelines help pharmaceutical and biotech customers build GxP solutions


We recently released a detailed set of GxP qualification guidelines for our Azure customers. These guidelines give life sciences organizations, such as pharmaceutical and biotechnology companies, a comprehensive toolset for building solutions that meet GxP compliance regulations. 

GxP is a general abbreviation for "good practice" quality guidelines and regulations. Technology systems that use GxP processes such as Good Laboratory Practices (GLP), Good Clinical Practices (GCP), and Good Manufacturing Practices (GMP) require validation of adherence to GxP. Solutions are considered qualified when they can demonstrate the ability to fulfill GxP requirements. GxP regulations include pharmaceutical requirements, such as those outlined in the U.S. Food and Drug Administration CFR Title 21 Part 11, and EU GMP Annex 11.  

Life sciences organizations are increasingly moving to the cloud to increase efficiency and reduce costs, but in order to do so they must be able to select a cloud service provider with processes and controls that help to assure the confidentiality, integrity, and availability of data stored in the cloud. Of equal importance are those processes and controls that must be implemented by life sciences customers to ensure that GxP systems are maintained in a secured and validated state.

Life sciences organizations building GxP solutions on Azure can take advantage of the cloud’s efficiencies while at the same time helping protect patient safety, product quality, and data integrity. Customers also benefit from Azure’s multiple layers of security and governance technologies, operational practices, and compliance policies to enforce data privacy and integrity at very specific levels.

The Azure GxP qualification guidelines give customers the tools they need to build on Azure’s security foundation by providing:

  • The shared responsibilities between Microsoft and customers for meeting GxP requirements
  • Documentation of the extensive controls implemented as part of Azure’s internal development of security and quality practices
  • Visibility into crucial areas of internal Azure quality management, IT infrastructure qualification, and software development practices
  • Recommendations for customer GxP compliance readiness
  • Descriptions of GxP-relevant tools and features within Azure

We are partnering with our life sciences customers to make cloud-based GxP systems a safer, more efficient model for driving innovation and maintaining regulatory compliance.

Download the GxP qualification guidelines for Azure.

Unlock Query Performance with SQL Data Warehouse using Graphical Execution Plans


The Graphical Execution Plan feature within SQL Server Management Studio (SSMS) is now supported for SQL Data Warehouse (SQL DW)! With a click of a button, you can create a graphical representation of a distributed query plan for SQL DW.

Before this enhancement, query troubleshooting for SQL DW was often a tedious process that required you to run the EXPLAIN command. SQL DW customers can now seamlessly and visually debug query plans to identify performance bottlenecks directly within the SSMS window. This extends the query troubleshooting experience by displaying costly data movement operations, which are the most common cause of slow distributed query plans. Below is a simple example of troubleshooting a distributed query plan in SQL DW using the Graphical Execution Plan.

The view below displays the estimated execution plan for a query. As we can see, this is an incompatible join which occurs when there is a join between two tables distributed on different columns. An incompatible join will create a ShuffleMove operation, where temp tables will be created on every distribution to satisfy the join locally before streaming the results back to the user. The ShuffleMove has become a performance bottleneck for this query:

[Image: Estimated execution plan showing a ShuffleMove operation]

Taking a closer look at our table definition, we can see that the two tables being joined are distributed on different columns:

[Image: Table definitions distributed on different columns]

By leveraging the estimated Graphical Execution Plan, we have identified that we can improve the performance of this query by redistributing the table on the appropriate column to remove the ShuffleMove operation.
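A hedged sketch of that fix is below: re-create one of the tables hash-distributed on the join column using CTAS, then swap the names. The table and column names are invented for illustration and are not from the screenshots above.

```python
# Hypothetical sketch: make the join distribution-compatible by re-creating one
# table hash-distributed on the join column (CTAS), then swapping names.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<server>.database.windows.net;DATABASE=<datawarehouse>;UID=<user>;PWD=<password>",
    autocommit=True,
)
cur = conn.cursor()

cur.execute("""
CREATE TABLE dbo.FactSales_new
WITH (DISTRIBUTION = HASH(CustomerKey), CLUSTERED COLUMNSTORE INDEX)
AS SELECT * FROM dbo.FactSales
""")

# Swap the tables once the copy completes.
cur.execute("RENAME OBJECT dbo.FactSales TO FactSales_old")
cur.execute("RENAME OBJECT dbo.FactSales_new TO FactSales")
```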

Troubleshooting and tuning distributed SQL queries just got easier. Download the latest SSMS release to start using this feature today! If you need help with a POC, contact us directly. Stay up to date on the latest Azure SQL DW news and features by following us on Twitter @AzureSQLDW.

Machine Learning in R with TensorFlow


Modern machine learning platforms like Tensorflow have to date been used mainly by the computer science crowd, for applications like computer vision and language understanding. But as JJ Allaire pointed out in his keynote at the RStudio conference earlier this month (embedded below), there's a wealth of applications in the data science domain that have yet to be widely explored using these techniques. This includes things like time series forecasting, logistic regression, latent variable models, and censored data analysis (including survival analysis and failure data analysis).

The keras package for R provides a flexible, high-level interface for specifying machine learning models. (RStudio also provides some nice features when using the package, including a dynamically-updated convergence chart to show progress.) Networks defined with keras are flexible enough to specify models for data science applications, which can then be optimized using frameworks like TensorFlow (as opposed to traditional maximum-likelihood techniques), without limitations on data set size and with the ability to apply modern computational hardware.
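The post is about the R keras package, whose API closely mirrors Python Keras. As a minimal, hedged sketch of the idea (fitting a logistic-regression-style model by gradient descent rather than classical maximum likelihood), here is the Python equivalent on synthetic data.

```python
# Minimal sketch on synthetic data: a one-layer sigmoid network is effectively a
# logistic regression, optimized here with Keras/TensorFlow instead of classical
# maximum-likelihood routines.
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5)).astype("float32")
y = (X @ np.array([0.5, -1.0, 2.0, 0.0, 0.3]) > 0).astype("float32")

model = keras.Sequential([
    keras.layers.Dense(1, activation="sigmoid", input_shape=(5,))
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=20, batch_size=32, verbose=0)
print(model.get_weights()[0].ravel())  # learned coefficients
```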

For learning materials, RStudio's Tensorflow Gallery provides a good place to get started with several worked examples using real-world data. The book Deep Learning with R (Chollet and Allaire) provides even more worked examples translated from the original Python. If you want to dive into the mathematical underpinnings, the book Deep Learning (Goodfellow et al) provides the details there.

RStudio blog: TensorFlow for R

 

Monitor network connectivity to applications with NPM’s Service Endpoint Monitor – public preview


As more and more enterprises host their applications in the cloud and rely on SaaS and PaaS applications to provide services, they become increasingly dependent on multiple external applications and the networks in between. Traditional network monitoring tools work in silos and do not provide end-to-end visibility. Therefore, if an application is found to be running slow, it becomes difficult to identify whether the problem lies with your network, the service provider, or the application. Network Performance Monitor (NPM) introduces Service Endpoint Monitor, which integrates the monitoring and visualization of the performance of your internally hosted and cloud applications with end-to-end network performance. You can create HTTP, HTTPS, TCP, and ICMP based tests from key points in your network to your applications, allowing you to quickly identify whether the problem is due to the network or the application. With the network topology map, you can locate the links and interfaces experiencing high loss and latency, helping you identify troublesome external and internal network segments.

Some of the capabilities of NPM’s Service Endpoint Monitor are listed below:

Monitor end-to-end connectivity to applications

Service Endpoint Monitor tracks the total response time, network latency, and packet loss between your resources (branch offices, datacenters, office sites, cloud infrastructure) and the applications you use, such as websites, SaaS, PaaS, Azure services, file servers, SQL, and so on. By installing the NPM agents at vantage points in your corporate perimeter, you get performance visibility from the locations where your users access the application. You can set up alerts to be proactively notified whenever the response time, loss, or latency from any of your branch offices crosses a threshold. In addition to viewing near real-time values and historical trends of the performance data, you can use the network state recorder to go back in time and view a particular network state in order to investigate difficult-to-catch transient issues.

[Image: Service Endpoint Monitor performance charts]

Correlate application delivery with network performance

The capability plots both the response time as well as the network latency trends on the same chart. This helps you easily correlate the application response time with the network latency to determine whether the performance degradation is due to the network or the application.

The following snippet demonstrates one such scenario. The chart shows a spike in the application response time while the network latency remains consistent. This suggests that the network was in a steady state when the performance degradation was observed; therefore, the problem is due to an issue at the application end.

[Image: Response time spike with steady network latency]

The example image below illustrates another scenario where spikes in the application response time are accompanied with corresponding spikes in the network latency. This suggests that the increase in response time is due to an increase in network latency, and therefore, the performance degradation is due to the underlying network.

[Image: Response time spikes accompanied by network latency spikes]

Once you’ve established that the network is the problem area, you can then use the network topology view to identify the troublesome network segment.

Identify troublesome network interfaces and links

NPM’s interactive topology view provides end-to-end network visibility from your nodes to the application. You can not only view all the paths and interfaces between your corporate premises and application endpoint, but also view the latency contributed by each interface to help you identify the troublesome network segment. The below example image illustrates one such scenario where most of the latency is because of the highlighted network interface.

[Image: Topology view highlighting a high-latency network interface]

The below example image illustrates another scenario where you can get the network topology from multiple nodes to www.msn.com in a single pane of view and identify the unhealthy paths in red.

[Image: Topology from multiple nodes to www.msn.com with unhealthy paths shown in red]

When you are using external services such as Office 365, several intermediate hops will be outside of your corporate network. You can simplify the topology map by hiding the intermediate hops using the slider control in filters. You can also choose to view only the unhealthy paths.

[Image: Topology map with intermediate hops hidden]

Built-in tests for Microsoft Office 365 and Microsoft Dynamics 365

NPM provides built-in tests that monitor connectivity to Microsoft’s Office 365 and Dynamics 365 services without any pre-configuration. The built-in tests provide a simple one-click setup experience where you only have to choose the Office 365 and Dynamics 365 services you are interested in monitoring. Since the capability maintains a list of endpoints associated with these services, you do not have to enter the various endpoints associated with each service.

[Image: Built-in tests for Office 365 and Dynamics 365]

Create custom queries and views

All data that is exposed graphically through NPM’s UI is also available natively in Log Analytics search. You can perform interactive analysis of data in the repository, correlate data from different sources, create custom alerts, create custom views, and export the data to Excel, Power BI, or a shareable link.
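As a hedged illustration (using today's azure-monitor-query Python SDK, which post-dates this post), the sketch below pulls raw NPM records out of the Log Analytics workspace; the workspace id is a placeholder and the NetworkMonitoring table name is an assumption you should verify against your own workspace schema.

```python
# Hypothetical sketch: query the Log Analytics workspace that backs NPM.
# Workspace id and table name are assumptions; verify them in your workspace.
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())
response = client.query_workspace(
    workspace_id="<workspace-id>",         # placeholder
    query="NetworkMonitoring | take 10",   # assumed table name
    timespan=timedelta(hours=1),
)
for table in response.tables:
    for row in table.rows:
        print(row)
```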

Get Started

You can find detailed instructions about how to setup Service Endpoint Monitor in NPM and learn more about the other capabilities in NPM.

Please send your feedback

There are a few different routes to give feedback:

  • UserVoice: Post new ideas for Network Performance Monitor on our UserVoice page.
  • Join our cohort: We’re always interested in having new customers join our cohorts to get early access to new features and help us improve NPM going forward. If you are interested in joining our cohorts, simply fill out this quick survey.

New VSTS Messaging Extension for Microsoft Teams

Today we are releasing our new Messaging Extension to add to the integrations between Microsoft Teams and Visual Studio Team Services (VSTS). The messaging extension allows you to search, find, and discuss specific work items in your channel or private chats. It is a great way to have a group conversation about your work, without leaving...

The Squishy Side of Open Source


A few months back, my friend Keeley Hammond and I did a workshop for Women Who Code Portland called The Squishy Side of Open Source. We'd done a number of workshops before on how to use Git and the Command Line, and I've done a documentary film with Rob Conery called Get Involved In Tech: The Social Developer (watch it free!), but Keeley and I wanted to really dive into the interpersonal "soft" or squishy parts. We think that we all need to work to bring kindness back into open source.

Contributing to open source for the first time can be scary and a little overwhelming. In addition to the technical skills required, the social dynamics of contributing to a library and participating in a code review can seem strange.

That means understanding how people talk to each other, what to do when pull requests go south, and what to do when issues heat up due to misunderstandings.

Keeley has published the deck on SpeakerDeck. In this workshop, we talked about the work and details that go into maintaining an open source community, told real stories from our experiences, and went over what to expect when contributing to open source and how to navigate it.

Key Takeaways:

  • Understanding the work that open source maintainers do, and how to show respect for them.
  • Understanding Codes of Conduct and Style Guides for OSS repos and how to abide by them.
  • Tips for communicating clearly, and dealing with uncomfortable or hostile communication.

Good communication is a key part of contributing to open source.

  • Give context.
  • Do your homework beforehand. It’s OK not to know things, but before asking for help, check a project’s README, documentation, issues (open or closed) and search the internet for an answer.
  • Keep requests short and direct. Many projects have more incoming requests than people available to help. Be concise.
  • Keep all communication public.
  • It’s okay to ask questions (but be patient!). Show them the same patience that you’d want them to show to you.
  • Keep it classy. Context gets lost across languages, cultures, geographies, and time zones. Assume good intentions in these conversations.

Where to start?

What are some good resources you've found for understanding the squishy side of open source?


Sponsor: Get the latest JetBrains Rider for debugging third-party .NET code, Smart Step Into, more debugger improvements, C# Interactive, new project wizard, and formatting code in columns.



© 2017 Scott Hanselman. All rights reserved.
     

Get started with Azure Cosmos DB through this technical training series


Are you building a new application which requires low latency at any scale? Or are you in the process of migrating your NoSQL databases to the cloud? Or looking for the right resources to help you get started with Azure Cosmos DB?

Join us for one or all of a seven-week Azure Cosmos DB technical training series, which explores the capabilities and potential of Azure Cosmos DB. Whether you’re brand new to Azure Cosmos DB or an experienced user, you’ll leave this series with a better understanding of database technology and have the practical skills necessary to get started.

Azure Cosmos DB is the world’s first globally distributed, multi-model database service with native NoSQL support. Designed for the cloud, Azure Cosmos DB enables you to build planet-scale applications that bring data to where your users are with SLA-guaranteed low latency, throughput, and 99.99% availability.

In this training series, you’ll learn everything necessary to get your cloud database up and running. In the first session, we covered a technical overview of Azure Cosmos DB, and, in the following weeks, we’ll progress into deeper topics like migrating MongoDB applications to Azure Cosmos DB, building serverless applications, and enabling real-time analytics with Azure Cosmos DB, Azure Functions, and Spark. These technical sessions also cover live demos of Azure Cosmos DB features and service offerings, including the Graph API, Table API, and MongoDB API.

Session One: Technical overview of Azure Cosmos DB
Session Two: Build real-time personalized experiences with AI and serverless technology
Session Three: Using Graph API and Table API with Azure Cosmos DB
Session Four: Build or migrate your MongoDB app to Azure Cosmos DB
Session Five: Understanding Operations of Cosmos DB
Session Six: Build Serverless Apps with Azure Cosmos DB and Azure Functions
Session Seven: Apply real-time analytics with Azure Cosmos DB and Spark

Azure Cosmos DB empowers you to more easily build amazingly powerful, planet-scale apps. Join our technical series to learn more from the Azure Cosmos DB engineering team. Select the sessions you’d like to attend—whether that’s a single session (above), or the complete set (below).

Register for the series

LUIS.AI: Automated Machine Learning for Custom Language Understanding


This blog post was co-authored by Riham Mansour, Principal Program Manager, Fuse Labs.

Conversational systems are rapidly becoming a key component of solutions such as virtual assistants, customer care, and the Internet of Things. When we talk about conversational systems, we refer to a computer’s ability to understand the human voice and take action based on understanding what the user meant. What’s more, these systems won’t be relying on voice and text alone. They’ll be using sight, sound, and feeling to process and understand these interactions, further blurring the lines between the digital sphere and the reality in which we are living. Chatbots are one common example of conversational systems.

Chatbots are a very trendy example of conversational systems that can maintain a conversation with a user in natural language, understand the user’s intent and send responses based on the organization’s business rules and data. These chatbots use Artificial Intelligence to process language, enabling them to understand human speech. They can decipher verbal or written questions and provide responses with appropriate information or direction. Many customers first experienced chatbots through dialogue boxes on company websites. Chatbots also interact verbally with consumers, such as Cortana, Siri and Amazon’s Alexa. Chatbots are now increasingly being used by businesses to augment their customer service.

Language understanding (LU) is a central component of conversational services such as bots, IoT experiences, and analytics. In a spoken dialog system, LU converts the words in a sentence into a machine-readable meaning representation, typically indicating the intent of the sentence and any entities it contains. For example, consider a physical fitness domain, with a dialog system embedded in a wearable device like a watch. This dialog system could recognize intents like StartActivity and StopActivity, and could recognize entities like ActivityType. For the user input “begin a jog”, the goal of LU is to identify the intent as StartActivity and the entity ActivityType = “jog”.
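
As a minimal sketch of that meaning representation (purely illustrative, using the hypothetical intent and entity names from the fitness example above rather than the output of a real LUIS app):

```python
# Illustrative sketch of an LU meaning representation for the fitness example.
# The intent and entity names are the hypothetical ones used above.
from dataclasses import dataclass, field

@dataclass
class LuResult:
    query: str                                      # the raw user utterance
    intent: str                                     # the recognized intent
    entities: dict = field(default_factory=dict)    # entity type -> value

result = LuResult(query="begin a jog",
                  intent="StartActivity",
                  entities={"ActivityType": "jog"})
print(result)
```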

Historically, there have been two options for implementing LU: machine learning (ML) models and handcrafted rules. Handcrafted rules are accessible to general software developers, but they are difficult to scale and do not benefit from data. ML-based models are trained on real usage data, generalize to new situations, and are superior in terms of robustness. However, they require rare and expensive expertise, access to large sets of data, and complex ML tools. ML-based models are therefore generally employed only by organizations with substantial resources.

In an effort to democratize LU, Microsoft’s Language Understanding Intelligent Service (LUIS), shown in Figure 1, aims to enable software developers to create cloud-based, machine-learned LU models specific to their application domains, without ML expertise. It is offered as part of the Microsoft Azure Cognitive Services Language offering. LUIS allows developers to build custom LU models iteratively, with the ability to improve models based on real traffic using advanced ML techniques. LUIS technologies capitalize on Microsoft’s continuous innovation in artificial intelligence and its applications to natural language understanding, with research, science, and engineering efforts dating back more than 20 years. In this blog, we dive deeper into the LUIS capabilities that enable intelligent conversational systems. We also highlight some of our customer stories that show how large enterprises use LUIS as an automated AI solution to build their LU models. This blog aligns with the December 2017 announcement of the general availability of our conversational AI and language understanding tools with customers such as Molson Coors, UPS, and Equadex.

Figure 1: Language Understanding Service

Building Language Understanding Model with LUIS

A LUIS app is a domain-specific language model designed by you and tailored to your needs. LUIS is a cloud-based service that your end users can use from any device. It supports 12 languages and is deployed in 12 regions across the globe, making it an attractive solution for large enterprises with customers in multiple countries.

You can start with a prebuilt domain model, build your own, or blend pieces of a prebuilt domain with your own custom information. Through a simple user experience, developers start by providing a few example utterances and labeling them to bootstrap an initial, reasonably accurate application. The developer then trains and publishes the LUIS app to obtain an HTTP endpoint on Azure that can receive real traffic. Once your LUIS application has endpoint queries, LUIS enables you to improve individual intents and entities that are not performing well on real traffic through active learning. In the active learning process, LUIS examines all the endpoint utterances and selects the ones it is unsure of. If you label these utterances, train, and publish, then LUIS identifies utterances more accurately. It is highly recommended that you build your LUIS application in multiple short, fast iterations, using active learning to improve individual intents and entities until you obtain satisfactory performance. Figure 2 depicts the LUIS application development lifecycle.

Figure 2: LUIS Application Development Lifecycle

After the LUIS app is designed, trained, and published, it is ready to receive and process utterances. The LUIS app receives the utterance as an HTTP request and responds with extracted user intentions. Your client application sends the utterance and receives LUIS's evaluation as a JSON object, as shown in Figure 3. Your client app can then take appropriate action.

Figure 3: LUIS Input Utterances and Output JSON
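
A minimal client sketch of that exchange is shown below. It assumes the LUIS v2 REST endpoint format and response fields (topScoringIntent, entities) that were current when this post was written; the region, app ID, and subscription key are placeholders you would replace with your own values, and the URL and field names should be verified against the current documentation.

```python
# Sketch of querying a published LUIS app over HTTP and reading the JSON result.
# Endpoint format and field names follow the LUIS v2 REST API as described above;
# verify them against the current documentation before relying on this.
import requests

REGION = "westus"                    # placeholder: region the app is published to
APP_ID = "<your-luis-app-id>"        # placeholder
SUBSCRIPTION_KEY = "<your-key>"      # placeholder

def query_luis(utterance: str) -> dict:
    url = f"https://{REGION}.api.cognitive.microsoft.com/luis/v2.0/apps/{APP_ID}"
    resp = requests.get(url, params={"subscription-key": SUBSCRIPTION_KEY,
                                     "q": utterance})
    resp.raise_for_status()
    return resp.json()

result = query_luis("Book a ticket to Paris")
print(result["topScoringIntent"]["intent"])        # e.g. "BookFlight"
for entity in result.get("entities", []):
    print(entity["type"], "->", entity["entity"])  # e.g. "Location" -> "paris"
```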

There are three key concepts in LUIS:

  • Intents: An intent represents actions the user wants to perform. The intent is a purpose or goal expressed in a user's input, such as booking a flight, paying a bill, or finding a news article. You define and name intents that correspond to these actions. A travel app may define an intent named "BookFlight."
  • Utterances: An utterance is text input from the user that your app needs to understand. It may be a sentence, like "Book a ticket to Paris", or a fragment of a sentence, like "Booking" or "Paris flight." Utterances aren't always well-formed, and there can be many utterance variations for a particular intent.
  • Entities: An entity represents detailed information that is relevant in the utterance. For example, in the utterance "Book a ticket to Paris", "Paris" is a location. By recognizing and labeling the entities that are mentioned in the user’s utterance, LUIS helps you choose the specific action to take to answer a user's request.

LUIS supports a powerful set of entity extractors that enable developers to build apps that can understand sophisticated utterances. LUIS offers a set of prebuilt entities covering common types that developers often need in their apps, such as date and time recognizers, money, and numbers. Developers can build custom entities backed by top-notch machine learning algorithms, lexicon-based entities, or a blend of both. Entities created through machine learning can be simple entities like “organization name”, hierarchical, or composite. Additionally, LUIS enables developers to build lexicon-based list entities quickly and easily through recommended entries offered by very large dictionaries mined from the web.

Hierarchical entities span more than one level to model an “is-a” relation between entities. For instance, to analyze an utterance like “I want to book a flight from London to Seattle”, you need to build a model that can differentiate between the origin “London” and the destination “Seattle”, given that both are cities. In that case, you build a hierarchical entity “Location” that has two children, “origin” and “destination”.

Composite entities model a “has-a” relation among entities. For instance, to analyze an utterance like “I want to order two fries and three burgers”, you want to make sure that the utterance analysis binds “two” with “fries” and “three” with “burgers”. In this case, you build a composite entity in LUIS called “food order” that is composed of “number of items” and “food type”.
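
To make the hierarchical case concrete, here is a small, purely illustrative sketch of how a client might separate origin and destination after parsing the endpoint JSON. It assumes hierarchical children are surfaced with a “Parent::Child” entity type (for example “Location::Origin”); treat that naming as an assumption and check your own app’s output.

```python
# Illustrative only: assumes hierarchical children appear with a "Parent::Child"
# type in the parsed entity list; verify against your own LUIS app's JSON.
def extract_trip(entities: list) -> dict:
    trip = {}
    for e in entities:
        if e.get("type") == "Location::Origin":
            trip["origin"] = e["entity"]
        elif e.get("type") == "Location::Destination":
            trip["destination"] = e["entity"]
    return trip

sample_entities = [
    {"entity": "london", "type": "Location::Origin"},
    {"entity": "seattle", "type": "Location::Destination"},
]
print(extract_trip(sample_entities))  # {'origin': 'london', 'destination': 'seattle'}
```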

LUIS provides a set of powerful tools to help developers get started quickly on building custom language understanding applications. These tools are combined with customizable pre-built apps and entity dictionaries, such as calendar, music, and devices, so you can build and deploy a solution more quickly. Dictionaries are mined from the collective knowledge of the web and supply billions of entries, helping your model to correctly identify valuable information from user conversations.

Prebuilt domains, shown in Figure 4, are pre-built sets of intents and entities that work together for common categories of client applications. The prebuilt domains have been pre-trained and are ready for you to add to your LUIS app. The intents and entities in a prebuilt domain are fully customizable once you’ve added them to your app, and you can train them with utterances from your system so they work for your users. You can use an entire prebuilt domain as a starting point for customization, or just borrow a few intents or entities from a prebuilt domain.

Figure 4: LUIS pre-built domains

LUIS provides developers with capabilities to actively learn in production and gives guidance on how to make the improvements. Once the model starts processing input at the endpoint, developers can go to the Improve app performance tab to continually update and improve the model. LUIS examines all the endpoint utterances, selects the ones it is unsure of, and surfaces them to the developer. If you label these utterances, train, and publish, then LUIS processes these utterances more accurately.

LUIS has two ways to build a model: the Authoring APIs and the LUIS.ai web app. Both methods give you control of your LUIS model definition, and you can use either one or a combination of both to build your model. The management capabilities we provide include models, versions, collaborators, external APIs, testing, and training.

Customer Stories

LUIS enables multiple conversational AI scenarios that were much harder to implement in the past. The possibilities are now vast, including productivity bots like meeting assistants and HR bots, digital assistants that provide better service to customers, and IoT applications. Our value proposition is strongly evidenced by our customers, who use LUIS as an automated AI solution to enable their digital transformation.

UPS recently completed a transformative project that improves service levels via a Chatbot called UPS Bot, which runs on the Microsoft Bot Framework and LUIS. Customers can engage UPS Bot in text-based and voice-based conversations to get the information they need about shipments, rates, and UPS locations. According to Katie Duffy, Application Architect, UPS "Conversation as a platform is the future, so it's great that we’re already offering it to our customers using the Bot Framework and LUIS".

Working with Microsoft Services, Dixons Carphone has developed a Chatbot called Cami that is designed to help customers navigate the world of technology. Cami currently accepts text-based input in the form of questions, and she also accepts pictures of products’ in-store shelf labels to check stock status. The bot uses the automated AI capabilities in LUIS for conversational abilities, and the Computer Vision API to process images. Dixons Carphone programmed Cami with information from its online buying guide and store colleague training materials to help guide customers to the right product.

Rockwell Automation has customers in more than 80 countries, 22,000 employees, and reported annual revenue of US $5.9 billion in 2016. To give customers real-time operational insight, the company decided to integrate the Windows 10 IoT Enterprise operating system with existing manufacturing equipment and software, and connect the on-premises infrastructure to the Microsoft Azure IoT Suite. Instead of connecting an automation controller in a piece of equipment to a separate standalone computer, the company designed a hybrid automation controller with the Windows 10 IoT Enterprise operating system embedded next to its industry-leading Logix 5000™ controller engine. The solution eliminates the need for a separate standalone computer and easily connects to the customer’s IT environment, Azure IoT Suite, and Cognitive Services, including LUIS, for advanced analytics.

LUIS is part of a much larger portfolio of capabilities now available on Azure to build AI applications. I invite you to learn more about how AI can augment and empower every developer, as shown in Figure 5. We’ve also launched the AI School to help developers get up to speed with all of the AI technologies shown in Figure 5.

Figure 5: Resources for developers to get started with AI technologies.

Dive in and learn how to infuse conversational AI into your applications today.

Using AI to automatically redact faces in videos

In the last few years, many law enforcement agencies have adopted body worn cameras. In this blog post, I will provide some background on what is driving the growth and will talk about how AI can help law enforcement agencies with the processing of videos captured by body-worn cameras.

Background on body-worn cameras

A body-worn camera is a wearable audio, video, or photographic recording system. Law enforcement agencies are not the only consumers of body-worn cameras; other consumers include journalists, medical professionals, athletes, and so on. Forecast unit shipments of body-worn cameras can be seen on this webpage published by Statista.

The National Institute of Justice (NIJ), the research, development, and evaluation agency of the US Department of Justice, has conducted research on body-worn cameras for law enforcement and a market survey on body-worn cameras for criminal justice. The survey, updated in 2016, aggregates and summarizes information on a number of makes and models of body-worn cameras available today, including the approximate cost of each unit. The full market survey on body-worn camera technologies can be found on NIJ’s website.

Freedom of Information Act (FOIA)

FOIA is defined on foia.gov as a law that gives citizens the right to access information from the federal government. It is often described as the law that keeps citizens in the know about their government. Per the law, federal agencies are required to disclose any information requested under the FOIA unless it falls under one of nine exemptions which protect interests such as personal privacy, national security, and law enforcement.

The law empowers any citizen to submit a FOIA request. Upon receiving a request, the agency will typically search for records and then review them to determine which records, and what parts of them, can be released. The agency will then redact any information protected from disclosure by one of the FOIA’s exemptions. The records in question can include documents, images, and/or videos. Redacting documents and images is a relatively easy task, but redacting videos is challenging.

Video redaction challenge

The amount of video being archived by law enforcement agencies has increased in recent years, and it is expected to grow at an accelerated rate due to the adoption of body-worn cameras and dashcams by police officers, an adoption that has been driven aggressively after several highly publicized cases of videos showing questionable police action. There is also more video being archived from crime scenes via surveillance cameras, cellphones, and the camcorders of bystanders.

Redacting videos requires a good understanding of video editing software. That translates to either training police officers to work with video editing software or having a dedicated staff for redacting videos. Video redaction is also a time-consuming activity. Depending on the complexity of the video, it can take upwards of 10 minutes to redact a single minute of video.

Some police departments had to stop or delay the rollout of body-worn cameras due to these challenges. This happened with a police department in 2014, after an anonymous person asked for all videos from dashboard-mounted cameras and planned to request them from body-worn cameras as well.

Body-worn camera vendors

Body-worn camera vendors have seen their business boom due to the adoption of their cameras. Most of the vendors also offer cloud based archival of videos captured by the cameras. The vendors are aware of the challenges associated with video redaction. Solving the video redaction problem is a great business opportunity for them as they can increase their revenues by providing redaction as a premium service.

Using AI for video redaction

AI technologies have matured in recent years and have also become economically viable. At Microsoft, our research teams have developed an AI-based algorithm for detecting, tracking, and redacting faces in videos, which is available for customers to use as part of Azure Media Analytics. To learn more, see this detailed documentation on how to redact faces. The current approach is the result of our work with various vendors and involves dividing the redaction process into two parts:

  • Face detection and tracking.
  • Redaction.

We arrived at this approach by working with various body-worn camera vendors. We initially started with a 100% automated approach, i.e., taking videos captured by body-worn cameras as input and generating an output video with all faces redacted. While technically this was great, it didn’t actually solve the business problem. Law enforcement agencies did not want all faces to be redacted, and while we have a great algorithm for detecting and tracking faces, some faces can be missed for a variety of reasons, such as faces being partially covered or fast motion. This is when we decided to split the workflow into two parts, as sketched below.
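
Conceptually, the split workflow looks like the rough sketch below: a first pass that only detects and tracks faces and emits their locations, a review step where an operator edits the results, and a second pass that applies the blur. The “analyze” and “redact” preset shapes mirror the Azure Media Redactor modes as documented at the time; treat the exact field names, and the placeholder job-submission helper, as assumptions to verify against the current documentation.

```python
# Rough sketch of the two-pass redaction workflow described above.
# The preset dictionaries follow the Azure Media Redactor "analyze"/"redact"
# modes as documented at the time; verify field names before using them.

analyze_preset = {"version": "1.0", "options": {"mode": "analyze"}}
redact_preset = {"version": "1.0", "options": {"mode": "redact"}}

def submit_media_job(inputs: list, preset: dict) -> str:
    """Placeholder for submitting a processing job (for example through the
    Azure Media Services SDK) and returning the name of its output asset."""
    print(f"Submitting job: inputs={inputs}, preset={preset}")
    return "output-asset"

# Pass 1: detect and track faces. The output contains an annotations file
# (face IDs with bounding boxes over time) plus thumbnails for review.
annotations_asset = submit_media_job(["bodycam_clip.mp4"], analyze_preset)

# Review: an operator edits the annotations, removing IDs for faces that should
# stay visible and keeping the IDs that must be blurred.

# Pass 2: the original video and the edited annotations are submitted together,
# and only the remaining face IDs are blurred in the final output.
redacted_asset = submit_media_job(["bodycam_clip.mp4", "edited_annotations.json"],
                                  redact_preset)
```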

VNet service endpoints for Azure SQL Database now generally available

This blog post was co-authored by Anitha Adusumilli, Principal Program Manager, Azure Networking.

We are excited to announce the general availability of Virtual Network (VNet) service endpoints for Azure SQL Database in all Azure regions. This capability allows you to restrict connectivity to your logical server to a given subnet or set of subnets within your virtual network. Traffic to Azure SQL Database from your VNet always stays within the Azure backbone network, and this direct route is preferred over any specific routes that take Internet traffic through virtual appliances or on-premises.

There is no additional billing for virtual network access through service endpoints. The current pricing model for Azure SQL Database applies as is.

VNet service endpoints for SQL Data Warehouse (DW) continue to be in public preview in all Azure regions.

Firewall rules and VNet Service Endpoints can be used together

Turning on VNet Service Endpoints does not override Firewall rules that you have provisioned on your SQL Server or Database. Both continue to be applicable.

VNet Service Endpoints don’t extend to on-premises networks. To allow access from on-premises, firewall rules can be used to limit connectivity to your public (NAT) IPs only, as in the sketch below.
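
As a rough sketch of that pattern with the Azure SDK for Python (a sketch only: the resource names and NAT IP range are placeholders, and the exact method and parameter shapes should be checked against the azure-mgmt-sql reference, since they have changed across SDK versions):

```python
# Sketch: allow only an on-premises public (NAT) IP range via a server-level
# firewall rule. Names and the NAT range are placeholders; verify the
# azure-mgmt-sql method/parameter shapes against the current SDK reference.
from azure.identity import DefaultAzureCredential
from azure.mgmt.sql import SqlManagementClient

SUBSCRIPTION_ID = "<subscription-id>"            # placeholder
RG, SQL_SERVER = "my-rg", "my-sql-server"        # placeholders

sql_client = SqlManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

sql_client.firewall_rules.create_or_update(
    RG, SQL_SERVER, "allow-onprem-nat",
    {"start_ip_address": "203.0.113.10",         # placeholder NAT range
     "end_ip_address": "203.0.113.20"},
)
```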

To enable VNet protection, first enable service endpoints for SQL in the VNet.

On the SQL server, you can allow access to multiple subnets belonging to one or more VNets. It is also possible to configure firewall rules in conjunction with your VNet rules.

Turning on service endpoints for servers with pre-existing firewall rules

When you connect to your server with service endpoints turned on, the source IP of SQL connections switches to the private IP space of your VNet. If your server or database firewall rules currently allow specific Azure public IPs, connectivity will break until you allow the given VNet/subnet by specifying it in the VNet firewall rules. To ensure connectivity, you can preemptively specify VNet firewall rules before turning on service endpoints by using the IgnoreMissingServiceEndpoint flag, as sketched below.
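
A rough end-to-end sketch of both steps with the Azure SDK for Python is below: add the Microsoft.Sql service endpoint to the subnet, then create a VNet rule on the logical server. Setting ignore_missing_vnet_service_endpoint lets you create the rule before the endpoint is enabled, matching the IgnoreMissingServiceEndpoint behavior described above. Resource names are placeholders, and the method and model names reflect the management SDK as we recall it, so verify them against the SDK reference.

```python
# Rough sketch only: resource names are placeholders, and the azure-mgmt method
# and model names should be verified against the current SDK reference.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import ServiceEndpointPropertiesFormat
from azure.mgmt.sql import SqlManagementClient

SUBSCRIPTION_ID = "<subscription-id>"                     # placeholder
RG, VNET, SUBNET = "my-rg", "my-vnet", "my-subnet"        # placeholders
SQL_SERVER, RULE = "my-sql-server", "allow-my-subnet"     # placeholders

cred = DefaultAzureCredential()
network_client = NetworkManagementClient(cred, SUBSCRIPTION_ID)
sql_client = SqlManagementClient(cred, SUBSCRIPTION_ID)

# Step 1: enable the Microsoft.Sql service endpoint on the subnet.
subnet = network_client.subnets.get(RG, VNET, SUBNET)
subnet.service_endpoints = [ServiceEndpointPropertiesFormat(service="Microsoft.Sql")]
subnet = network_client.subnets.begin_create_or_update(RG, VNET, SUBNET, subnet).result()

# Step 2: allow that subnet on the logical SQL server with a VNet rule.
# ignore_missing_vnet_service_endpoint=True corresponds to the
# IgnoreMissingServiceEndpoint flag described above.
sql_client.virtual_network_rules.begin_create_or_update(
    RG, SQL_SERVER, RULE,
    {"virtual_network_subnet_id": subnet.id,
     "ignore_missing_vnet_service_endpoint": True},
).result()
```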

Support for ASE

As part of GA, we now support service endpoints for App Service Environment (ASE) subnets deployed into your VNets.

Next Steps

To get started, refer to the documentation Virtual Network Service Endpoints and VNet Service Endpoints and rules for Azure SQL Database.

For feature details and scenarios please watch the Microsoft Ignite session, Network security for applications in Azure.
