For most companies, machine learning seems complex and expensive, and appears to require highly skilled specialists. And if you intend to build a new Netflix-style recommendation system, it is. However, the trend of turning everything into a service has reached this complex area too. You can start an ML project from scratch without much investment, and this is the right decision if your company is new to data science and wants to start by solving the simplest problems.
One of the most inspiring stories about ML is that of a Japanese farmer who decided to sort cucumbers automatically to help his parents with this tedious job. Unlike large corporations, he had neither experience in machine learning nor a large budget. However, he managed to master TensorFlow and apply deep learning to recognize different classes of cucumbers.
With machine learning cloud services, you can start building your first working models, drawing valuable insights from predictions even with a small team. Let’s take a look at the best machine learning platforms on the market and talk about the infrastructure decisions that need to be made.
What is Machine Learning as a Service
Machine learning as a service (MLaaS) is an umbrella term covering various cloud platforms that handle most infrastructure tasks, including data preprocessing, model training, and model evaluation, with further prediction. Predictions can be connected to the company’s internal IT infrastructure via REST APIs.
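To make the REST integration mentioned above more concrete, here is a minimal Python sketch of how an internal application might prepare a prediction request. The endpoint URL, the API key, and the payload shape (`instances`) are assumptions invented for illustration; each MLaaS provider defines its own URL scheme, authentication, and request format.

```python
import json
import urllib.request

# Hypothetical endpoint and key: real MLaaS providers each define their own
# URL scheme, authentication, and payload format.
ENDPOINT = "https://ml.example.com/v1/models/churn/predict"
API_KEY = "replace-with-your-key"


def build_prediction_request(features: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one prediction."""
    body = json.dumps({"instances": [features]}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )


request = build_prediction_request({"tenure_months": 14, "plan": "basic"})
# Sending it would look like this (commented out because the URL is fictional):
# with urllib.request.urlopen(request) as response:
#     prediction = json.load(response)
```

The point of the sketch is only that the calling application needs nothing beyond an HTTP client and JSON: the model itself stays on the provider's side.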
Amazon Machine Learning, Azure Machine Learning, Google AI Platform, and IBM Watson Machine Learning are the four leading MLaaS cloud services for rapid model training and deployment. If you are assembling your own data science team from the company’s software developers, then you should start with them.
In this article, we will first provide a brief overview of the major machine-learning-as-a-service platforms from Amazon, Google, Microsoft, and IBM, and then compare the machine learning APIs that these companies support. It is worth noting that this review will not provide comprehensive instructions on how and when to use these platforms, but rather information on what to look for when studying their documentation.
If you want a drag-and-drop interface, try Microsoft ML Studio first. In terms of modeling platforms, all four companies listed above provide similar products.
Amazon Machine Learning and SageMaker
Amazon has two main machine learning products. The first platform is called Amazon Machine Learning, the second, newer, is SageMaker.
Amazon Machine Learning
Amazon Machine Learning for predictive analytics is one of the most automated ML solutions on the market and is best suited for operations where deadlines are critical. The service can load data from multiple sources, including Amazon RDS, Amazon Redshift, CSV files, and so on. All data pre-processing operations are performed automatically: the service determines which of the fields are categorical and which are numeric, and does not ask the user to select methods for further data pre-processing (dimensionality reduction and data whitening).
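The automatic detection of categorical and numeric fields described above can be illustrated with a small Python sketch. This is not Amazon's actual logic, which is not documented here; the function name `infer_column_type` and the rule for treating 0/1 flags as categories are assumptions made for the example.

```python
def infer_column_type(values):
    """Guess whether a column of raw strings is numeric or categorical."""
    non_empty = [v for v in values if v.strip() != ""]
    try:
        numbers = [float(v) for v in non_empty]
    except ValueError:
        # At least one value is not a number, so treat the column as labels.
        return "categorical"
    # Numeric-looking columns with at most two distinct integer values
    # (for example 0/1 flags) are treated as categories in this sketch.
    if len(set(numbers)) <= 2 and all(n.is_integer() for n in numbers):
        return "categorical"
    return "numeric"


rows = [
    {"age": "34", "plan": "basic", "churned": "0"},
    {"age": "51.5", "plan": "pro", "churned": "1"},
    {"age": "27", "plan": "basic", "churned": "0"},
]
schema = {col: infer_column_type([r[col] for r in rows]) for col in rows[0]}
```

Here `schema` maps `age` to numeric and both `plan` and `churned` to categorical. A real service would use more evidence (sample size, cardinality thresholds, date parsing), but the principle of inspecting values instead of asking the user is the same.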
Amazon ML’s predictive capabilities are limited to three options: binary classification, multi-class classification, and regression. Amazon ML does not support unsupervised learning techniques, so the user must select a target variable to label in the training dataset. On the other hand, the user is not required to know machine learning techniques, as Amazon automatically selects them after examining the provided data.
It is worth remembering that after 2021, Amazon no longer updates either the documentation or the Machine Learning platform itself. The service is still running, but is not accepting new users. This is because SageMaker and its related services outperform AML in every way, and essentially provide the same functionality to users.
Predictions can be requested in real time or on demand using two separate APIs. The only thing to consider is that Amazon currently seems to be focusing on its more powerful ML services, such as SageMaker, described below.
This high level of automation is both a strength and a weakness when using Amazon ML. If you need a fully automated but limited solution, AML fits your expectations. Otherwise, choose SageMaker.
SageMaker is a machine learning environment designed to simplify the work of a data scientist. It provides tools for quickly building and deploying models. For example, it offers Jupyter notebooks that make it easier to explore and analyze data without the hassle of managing a server.
In 2021, Amazon launched SageMaker Studio, the first IDE for machine learning. This tool provides a web interface that lets you carry out all the steps of ML model training in a single environment. All development methods and tools, including notebooks, debugging tools, data modeling, and automated model generation, are available in SageMaker Studio.
Amazon also has built-in algorithms optimized for large datasets and distributed computation. These include:
- Linear learner – a supervised learning technique for classification and regression.
- Factorization machines for classification and regression; designed for sparse datasets.
- XGBoost is a supervised tree boosting algorithm that improves the accuracy of predictions in classification, regression, and ranking problems by combining the predictions of simpler algorithms.
- Image classification is based on ResNet which can also be used for transfer learning.
- Seq2seq is a supervised algorithm for predicting sequences (for example, to translate sentences, convert strings of words to shorter summaries, and so on).
- K-means is an unsupervised learning technique for clustering
- Principal component analysis is used for dimensionality reduction.
- Latent Dirichlet allocation is an unsupervised technique used to find categories in documents.
- Neural topic model (NTM) – an unsupervised technique exploring documents; identifies top-ranking words and defines topics (users cannot pre-define topics, but they can specify the expected number of topics).
- DeepAR forecasting is a supervised learning algorithm used to predict time sequences; uses recurrent neural networks (RNN).
- BlazingText is a natural language processing (NLP) algorithm built on top of Word2vec that allows you to match words in large collections of texts with vector descriptions.
- Random Cut Forest is an unsupervised anomaly detection algorithm capable of assigning each sample data an anomaly score.
- Learning to Rank (LTR) is a plug-in for Amazon Elasticsearch that allows you to rank search query results using ML.
- K-nearest neighbor (k-NN) is an index-based algorithm that can be used in combination with the Neural Topic Model to create custom recommendation services. In addition, there is a separate Amazon Personalize engine for real-time recommendations, used by Amazon.com itself.
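To give a feel for what one of the listed algorithms does, here is a minimal sketch of K-means clustering in plain Python. It is a textbook version (Lloyd's algorithm) with a naive initialization, not SageMaker's optimized distributed implementation; the function name and the sample points are invented for the example.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels


points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
centroids, labels = kmeans(points, k=2)
```

With these sample points the two groups are well separated, so the points near (1, 1) end up in one cluster and the points near (9, 9) in the other. No target variable is needed, which is what makes the technique unsupervised.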
SageMaker’s built-in techniques overlap heavily with the ML APIs offered by Amazon, but here data scientists can experiment with them and use their own datasets.
If you don’t want to use them, you can add your own techniques and run models with SageMaker using its deployment features. You can also integrate SageMaker with TensorFlow, Keras, Gluon, Torch, MXNet, and other machine learning libraries.
In general, Amazon’s machine learning services provide enough freedom for both experienced data scientists and those who need to complete tasks without delving into dataset preparation and modeling. These services can be a solid option for companies that already use Amazon cloud services and do not plan to switch to other cloud providers.
The popularity of DevOps in the software development community has given rise to the concept of “MLOps”. DevOps is a software development practice that merges development and operations teams to streamline delivery through short, frequent releases. It relies on a high level of automation of routine tasks. MLOps, in turn, applies the same principles to machine learning, leading to automated data management, model training and deployment, and monitoring.
That is why MLaaS service providers have begun offering tools to companies using MLOps to manage these machine learning pipelines. Amazon has released the MLOps framework for building and managing an MLOps infrastructure. It has a template architecture that contains standard AWS services, which allows you to quickly start building your own system on top of them.
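The MLOps idea of an automated pipeline can be sketched in a few lines of Python. This is a toy illustration, not Amazon's MLOps framework: the stage names, the trivial mean-predicting "model", and the `max_error` quality gate are all assumptions made for the example.

```python
from statistics import mean


def validate(rows):
    """Data management step: drop rows with a missing target."""
    return [r for r in rows if r["y"] is not None]


def train(rows):
    """Training step: a deliberately trivial model that predicts the mean."""
    return {"prediction": mean(r["y"] for r in rows)}


def evaluate(model, rows):
    """Monitoring metric: mean absolute error of the model on the rows."""
    return mean(abs(r["y"] - model["prediction"]) for r in rows)


def run_pipeline(rows, max_error):
    """Run every stage in order; deploy only if the quality gate passes."""
    clean = validate(rows)
    model = train(clean)
    error = evaluate(model, clean)
    return {"model": model, "error": error, "deployed": error <= max_error}


report = run_pipeline(
    [{"y": 10.0}, {"y": 12.0}, {"y": None}, {"y": 14.0}], max_error=2.0
)
```

The design point is that no human decides whether to ship the model: the pipeline runs end to end on every data change, and the deployment decision is an explicit, testable rule.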
Microsoft Azure AI Platform
Azure AI Platform is a single platform for machine learning with APIs and infrastructure services. Below I will list all the major services offered by Azure for machine learning solutions.
Azure Machine Learning
Azure Machine Learning is the primary framework for data management, training, and model deployment.
The platform has Machine Learning Studio, a low-code web environment that allows you to quickly configure machine learning operations and pipelines. In general, Azure Studio has tools for data exploration, preprocessing, method selection, and validation of modeling results. Studio supports approximately one hundred techniques for classification (binary and multiclass), anomaly detection, regression, recommendations, and text analysis. It is also worth noting that the platform has one clustering algorithm (K-means).
Like the Amazon platform, Azure offers Jupyter integration for writing and running code in ML Studio. It also provides ONNX Runtime to accelerate ML models on various operating systems, hardware platforms and frameworks. Runtime can also be used to organize communication between different ML frameworks. Azure supports popular frameworks like TensorFlow, PyTorch, scikit-learn, and more.
There are many features in ML Studio that you need to know about.
- Azure Machine Learning designer is a graphical drag-and-drop UI for ML studio that provides access to and control over platform features. In it, you can modify data, apply ML techniques, and deploy solutions on the server.
- Automated ML is an SDK that provides no-code or low-code model training. In effect, Automated ML complements ML Studio by automating routine tasks to a high degree and supporting data mining, model tuning, and deployment. According to Azure’s developers, Automated ML tools can be applied to classification, regression, and time series tasks.
- The Azure ML Python and R SDKs are fully integrated into ML Studio.
- Support for ML frameworks like PyTorch, TensorFlow and scikit-learn. In addition, Azure provides inter-framework communication using the ONNX Runtime.
- Modular pipelines are built into the platform, allowing the data science team to create specialized data pipelines for their own machine learning project.
- Support for data labeling projects, including data and team management tools, labeling progress tracking, and labeling analysis.
- Customizable compute targets for deploying models support various cloud services, such as Azure Kubernetes Service, container instances, and compute clusters.
- There is an MLOps toolkit for managing models, deploying them and monitoring them within automated pipelines.
Both ML Designer and Automated ML give inexperienced users the means to create ML solutions. In turn, Azure Machine Learning Studio contains many features that expert data scientists can use in enterprise-level solutions. These tools are not kept separate, however, because Azure ML is designed to be used as a single platform with all its capabilities.
Azure-based machine learning takes some effort to master, but in the end it leads to a deeper understanding of the underlying technologies in this area. The Azure ML GUI visualizes each step of the workflow and helps beginners. Perhaps the main advantage of using Azure is the variety of algorithms that you can experiment with.
Azure AI Gallery
Another big part of Azure ML is the Azure AI Gallery. This is a community-provided collection of machine learning solutions; it can be studied and reused by data scientists. Azure is a powerful tool for getting started with machine learning and introducing new employees to its capabilities.
Azure currently has the Azure Percept product available in preview. The main idea of Percept is to provide an SDK for creating models that can be integrated into hardware devices from Microsoft partners. This makes it easy to create and integrate computer vision or speech recognition tools. In addition, a whole range of APIs can be connected to the system, which we will discuss below.
Platform updates for 2019 mainly focus on the Python Machine Learning SDK and the launch of Azure ML Workspaces (in fact, this is the UI for the ML platform). They allow developers to deploy models, visualize data, and work on data preparation all in one place.
Google AI Platform (Unified)
Google AI Platform (Unified) combined ML tools that previously existed separately. The platform consists of the AI Platform (Classic), AutoML, frameworks and APIs that are inside the AI Platform Unified. Let’s look at them separately.
AI Platform (Classic)
It should be understood that the AI Platform Classic is a tool that includes many features for machine learning specialists and data scientists. AI Platform Classic offers the following services for building custom models:
Training Service provides an environment for building models using built-in algorithms (currently in beta) or custom algorithms. Users can upload their own learning methodologies or create custom containers to install the learning application.
Predictive Service allows you to integrate the generated forecasts into business applications or any other service.
Data Labeling Service is a tool that lets you request a team of people to label data. The service supports video, text, and image labeling, which is carried out according to your instructions.
Deep Learning Image provides a virtual machine image for deep learning tasks. The image is pre-configured for ML and data science tasks, with popular frameworks and tools preinstalled.
In AI Platform Notebooks, the user can create and manage virtual machine instances and configure the processor type used for data processing (CPU or GPU). The tools also come pre-integrated with TensorFlow and PyTorch instances, deep learning packages, and Jupyter Notebook.
AI Platform Classic is designed for advanced users.
Google Cloud AutoML
Google Cloud AutoML is a cloud-based ML platform for building data-driven solutions in a no-code fashion. AutoML allows both novice and experienced machine learning engineers to create custom models. However, the platform also allows you to use a set of ready-made models available through the API. We will review them below.
The core concept of the Google platform can be described as building blocks for AI. In essence, these are different tools like AutoML, TensorFlow, and APIs that should be used together when building ML solutions. This means that you can combine your own model and pre-trained models in one product.
In addition, ML solutions can be deployed on your website or in a specialized AI Infrastructure containing various data processing techniques on the GPU or CPU. Of course, AutoML is fully integrated with all Google services and stores data in the cloud. Trained models can be deployed using the REST API.
If we perceive the platform as a single entity, then there are two types of solutions that can be used by different users. AI Platform (Classic) provides advanced options for building your own models and manually managing algorithms and learning processes. It is more suitable for experienced machine learning developers. AutoML involves building models, using data, and integrating forecasts on a no-code basis.
TensorFlow is another Google product: an open source machine learning library offering a variety of data science tools, not an ML-as-a-service offering. It doesn’t have a visual interface, and learning TensorFlow can be quite tricky. However, the library is also intended for software developers planning to move into data science. TensorFlow is quite powerful, but is mainly aimed at deep neural network tasks.
In essence, the combination of TensorFlow and Google Cloud service involves infrastructure-as-a-service and platform-as-a-service solutions in accordance with the three-tier cloud service model. We talked about this concept in our white paper on digital transformation.
Google’s MLOps solution provides AWS-like capabilities for building and managing machine learning pipelines. However, since Azure has a modular system configured for use with ML Studio, it seems to be the most advanced of the three vendors.
IBM Watson Machine Learning Studio
The IBM Machine Learning platform is organized similarly to the platforms of the vendors described above. Strictly speaking, two approaches are possible in the system: automated and manual (for experienced teams).
Watson Studio and AutoAI
Watson Studio includes AutoAI, which provides a fully automated data processing and model building interface. It requires little or no training to start processing data, preparing models, and deploying them to production.
The automated part can solve three main types of problems: binary classification, multiclass classification, and regression. You can choose either a fully automated approach or manually select the ML technique to use. Currently, IBM has ten methods for performing these three groups of tasks:
- Logistic regression
- Decision tree classifier
- Random forest classifier
- Gradient boosted tree classifier
- Naive Bayes
- Linear regression
- Decision tree regressor
- Random forest regressor
- Gradient boosted tree regressor
- Isotonic regression
In addition to AutoAI, there are two more services that can be used to create models.
SPSS Modeler. SPSS is a software package used to transform data into statistical business information. Acquired by IBM in 2009 and integrated as a standalone ML service, this non-GUI product allows you to load an array of data, use SQL statements to manipulate data, and train models to manage business information.
Neural Network and Deep Learning
The Neural Network and Deep Learning service is slightly different from SPSS Modeler. This is a tool for modeling neural networks using a special GUI. The service is integrated into Watson Studio, which allows you to manage data using its built-in data integration tool. The service focuses mainly on deep learning and training on big data. In addition, neural network services are integrated with a set of ML frameworks like Keras, PyTorch, and TensorFlow.
Separately, IBM offers a deep neural network training workflow with a flow editor interface similar to that used in Azure ML Studio.
If you need advanced features, IBM ML has notebooks (such as Jupyter) for manually programming models using popular frameworks like TensorFlow, scikit-learn, PyTorch, and more.
Let’s sum up the review of machine learning as a service (MLaaS) platforms: Azure seems to have the most flexible set of tools in the MLaaS market at the moment. It allows you to solve most of the tasks related to ML, provides two products for creating your own models, and has a decent set of APIs for those who do not want to tackle data science barehanded.
One of the latest updates, introduced in 2019, is the end of further development of the old model builder, which AutoAI replaced. Models trained with the model builder still work in ML Studio, but new models are now trained in AutoAI. Other updates include support for the latest versions of TensorFlow and Python.
Comparison of Amazon, Microsoft, Google, and IBM Machine Learning APIs
Along with full-featured platforms, you can also use high-level APIs. These are services with ready-made trained models: you send them data and receive results. The APIs require no machine learning experience at all. Currently, the APIs of these four vendors can be broadly divided into three large groups:
1) Text recognition and translation, text analysis
2) Image + video recognition, as well as the corresponding analysis
3) Other – this includes specific services that do not fit either of the first two categories.
Microsoft offers the richest list of features. However, the most important of them are provided by all four companies.
Speech and Text API: Amazon
Amazon provides several APIs for common text analytics tasks. In addition, from the point of view of machine learning, they are highly automated and require only the right integration to work.
Amazon Lex. The Lex API is designed to embed chatbots in applications. It has automatic speech recognition (ASR) and natural language processing (NLP) capabilities, both based on deep learning models. The API can recognize written and spoken text, and the Lex interface allows you to connect recognized input to various backend solutions. Of course, Amazon encourages the use of its Lambda cloud environment, so before you buy a subscription to Lex, get to know Lambda as well. Along with standalone apps, Lex currently supports chatbots for Facebook Messenger, Slack, and Twilio.
Amazon Transcribe. While Lex is a comprehensive tool aimed at building chatbots, Transcribe is purely for recognizing spoken text. The tool can recognize multiple speakers and works with low-quality telephone audio. This makes the API convenient for cataloging audio archives, and it can also support further analysis of call center data.
Amazon Polly. The Polly service is in a sense the opposite of Lex. It turns text into speech, which allows chatbots to respond with voice. However, it cannot compose the text itself; it simply makes text sound like human speech. If you’ve used Alexa, you’ll have a rough idea of how that sounds. The tool currently supports female and male voices in more than thirty languages, mainly Western European languages and English dialects. Some languages have multiple male and female voices. Like Lex, Polly is recommended for use with Lambda.
Amazon Comprehend. Comprehend is another set of NLP APIs that, unlike Lex and Transcribe, targets various text analysis tasks. Comprehend currently supports the following features:
- Entity extraction (recognition of names, dates, organizations, and so on)
- Keyword recognition
- Language detection
- Sentiment analysis (positive, neutral, or negative text)
- Topic modeling (determining dominant topics using keyword analysis)
This service helps in the analysis of responses and comments in social networks, as well as other voluminous textual data that cannot be subjected to manual analysis; for example, a combination of Comprehend and Transcribe will help in analyzing the emotional state when communicating with the support service on the phone.
Amazon Translate. As the name implies, the Translate service translates texts. Amazon claims that it uses neural networks that provide improved translation quality “compared to rule-based translation systems.”
Text and Speech API: Microsoft Azure Cognitive Services
Like Amazon, Microsoft offers high-level APIs called Cognitive Services that you can integrate into your infrastructure and perform tasks without having a background in data science.
Speech. The speech set contains four APIs that apply different types of NLP techniques for natural speech recognition and other operations:
- Translator Speech API
- Bing Speech API for text-to-speech and speech-to-text
- Speaker Recognition API for voice verification tasks
- Custom Speech Service to leverage Azure NLP capabilities with your own data and models
Language. The Language API group focuses on text analysis similar to Amazon Comprehend:
- Language Understanding Intelligent Service (LUIS) is an API that parses intentions in text that should be recognized as commands (for example, “launch the YouTube app” or “turn on the light in the living room”)
- Text Analytics API is used for sentiment analysis and topic detection
- Bing Spell Check
- Translator Text API
- Web Language Model API evaluates word combination probabilities and supports automatic word completion
- The Linguistic Analysis API is used to split sentences, mark up parts of speech, and split texts into marked phrases.
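The idea behind an API that evaluates word combination probabilities and completes words can be shown with a tiny bigram model in Python. This is a sketch of the general technique, not of Microsoft's service; the function names and the three-sentence corpus are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows another."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def next_word_probability(following, current, nxt):
    """Estimate P(nxt | current) from the bigram counts."""
    counts = following.get(current, Counter())
    total = sum(counts.values())
    return counts[nxt] / total if total else 0.0


def complete(following, current):
    """Suggest the most frequent follower, or None if the word is unseen."""
    candidates = following.get(current, Counter()).most_common(1)
    return candidates[0][0] if candidates else None


model = train_bigrams(
    ["turn on the light", "turn on the radio", "turn off the light"]
)
probability = next_word_probability(model, "turn", "on")
suggestion = complete(model, "the")
```

In this corpus "on" follows "turn" in two of three cases, and "light" is the most frequent word after "the". Production language models are trained on web-scale text and use far longer contexts, but they answer the same two questions.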
Speech and Text API: Google Cloud ML Services/Cloud AutoML
While this set of APIs mostly overlaps with those offered by Amazon and Microsoft Azure, it has some interesting and unique features. Since the AutoML platform has replaced the Prediction API, it now extends the capabilities of Google Cloud ML services. Therefore, any Google API related to automated machine learning is a viable option for training your own models.
Dialogflow. Various chatbots are now at the peak of popularity, so Google also has its own offer. Dialogflow is based on NLP technologies and is designed to detect intent in a text and interpret what a person wants. The API can be customized to your needs using Java, Node.js, and Python.
Natural Language API. In its basic functionality, it is almost identical to Amazon’s Comprehend and Microsoft’s Language:
- Entity recognition in text
- Sentiment analysis
- Analysis of syntactic structures
- Classification of content into categories (e.g. food, news, electronics, etc.)
Speech-to-Text API. This service recognizes natural speech; probably its main advantage over similar APIs is the large number of languages supported by Google. Currently, it supports more than 125 languages and their variants. It also has additional features:
- Word hints allow you to customize recognition for specific contexts and words that can be spoken (for example, to better understand local or professional jargon)
- Inappropriate content filtering
- Noisy audio processing
Cloud Translation API. In effect, this API lets you use Google Translate in your products. It supports over a hundred languages and automatic language detection.
AutoML Natural Language API. It allows you to load training data through AutoML UI and train your own models. It has the following features:
- Definition of content in English
- Definition of entities in text
- Syntactic structure analysis
AutoML Translation API. The Translation API is currently in beta, and so far only information about its custom modeling capabilities is available. Please note that it will be updated in the future.
Speech and Text API: IBM Watson
IBM also competes for the API market. Let’s examine the list of company interfaces.
Speech to Text. IBM currently offers speech recognition in nine languages. The API can recognize multiple speakers, find keywords, and process lossy audio. It also has an interesting feature: it captures word alternatives and reports them. For example, if the system recognized the word “Boston”, it might suggest “Austin” as an alternative. After analyzing each hypothesis, the API assigns a confidence score to each alternative.
Text to Speech. Curiously, the languages supported by Text to Speech only partially overlap with those of the Speech to Text API. Both products support Western European languages; however, Korean and Chinese are missing from Text to Speech. In English, German, and Spanish you can choose between female and male voices; in other languages, only female voices are available. This is in line with the trend: voice assistants mostly use a female voice.
Language Translator. This API supports 48 languages for translation to and from English. You can also add your own models and expand the list of available languages. At the moment, the Translator API has been rewritten as a separate service with its own pricing model.
Natural Language Classifier. Unlike most of the APIs mentioned, the IBM classifier cannot be used without your own dataset. Essentially, this tool allows you to train models on your own business data and then classify incoming records. Among other things, it can be used for product labeling in online commerce, fraud detection, and categorization of messages, social media feeds, and so on.
Natural Language Understanding. IBM’s language understanding feature set is impressive. Along with standard information extraction, such as keyword and entity extraction with syntactic parsing, the API offers many interesting features not found at other vendors. These include metadata analysis and finding relationships between entities. In addition, IBM offers a separate environment for training your own models for text analytics using Knowledge Studio.
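How an application might consume word alternatives with confidence scores can be sketched as follows. The response structure below is hypothetical and invented for illustration; it is not IBM's actual response format, so consult the service documentation for the real field names.

```python
# Hypothetical response shape, invented for illustration only.
response = {
    "best": "Boston",
    "alternatives": [
        {"word": "Boston", "confidence": 0.62},
        {"word": "Austin", "confidence": 0.31},
        {"word": "Bolton", "confidence": 0.07},
    ],
}


def plausible_words(result, threshold=0.25):
    """Return alternatives at or above the threshold, most confident first."""
    ranked = sorted(
        result["alternatives"], key=lambda a: a["confidence"], reverse=True
    )
    return [a["word"] for a in ranked if a["confidence"] >= threshold]


candidates = plausible_words(response)
```

With the sample data, only "Boston" and "Austin" pass the 0.25 threshold. An application could then use context (for example, a list of cities the customer has orders in) to choose between the remaining candidates.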
Personality Insights. This rather unusual API analyzes texts and extracts cues about how the author interacts with the world. Essentially, this means that the system returns the following data:
- Personality characteristics (such as agreeableness, conscientiousness, extraversion, emotional range, and openness)
- Needs (e.g. curiosity, admiration, desire to overcome difficulties)
- Values (e.g. helping others, achieving success, hedonism)
Based on this data, the API can draw conclusions about consumer preferences (for example, in music, studies, or movies). Most often, this system is used to analyze user-generated content for precisely targeted product marketing.
It is important to note that Personality Insights has been discontinued and support for existing instances ended at the end of 2021.
Tone Analyzer. Tone Analyzer is a standalone API for analyzing sentiment in social media research and various customer retention analytics. Don’t be fooled by its rather ambiguous name: Tone Analyzer only studies written text, not speech.
Along with text and speech processing, Amazon, Microsoft, Google, and IBM provide quite flexible APIs for image and video analysis.
Most flexible image analytics toolkit now available on Google Cloud
Although image analytics overlaps a lot with video APIs, many video analytics tools are still in development or in beta. For example, Google provides extensive support for various image processing tasks, but the company clearly lacks the video analysis features already available from Microsoft and Amazon.
Microsoft seems to be winning, but we still think Amazon has the most efficient video analytics APIs because they support video streaming. This feature significantly expands the range of possible applications. IBM does not provide an API for video analysis.
Image and Video Processing API: Amazon Rekognition
The Rekognition API is used for image recognition tasks and, more recently, video. It has the following features:
- Object recognition and classification (finding and recognizing different objects in images and determining what they are)
- Action recognition in video, from simple actions like “dancing” to complex ones like “putting out a fire”
- Face detection (for detecting faces and matching them) and facial analysis (this feature is quite interesting: it detects smiles, analyzes eyes, and even determines the emotional tone in the video)
- Inappropriate video detection
- Celebrity recognition in images and videos
Image and Video Processing API: Microsoft Azure Cognitive Services
Microsoft’s Vision package combines six APIs dealing with different types of image, video, and text analysis.
- Computer vision recognizes objects, actions (such as walking), handwritten and printed text, and determines the predominant colors in images
- Content moderator recognizes inappropriate content in images, texts and videos
- Face API recognizes faces, groups them, determines age, emotions, genders, poses, smiles and facial hair
- Emotion API is another face recognition tool that describes facial expressions
- Custom Vision Service supports creation of face recognition models based on your own data
- Video indexer is a tool for finding people in videos, determining the emotional tone of a speech, and marking up keywords.
Image and Video Processing API: Google Cloud Services / Cloud AutoML
Cloud Vision AI. This tool is designed for image recognition tasks; it is powerful enough to find specific image attributes:
- Object markup
- Face detection and expression analysis (without recognition and identification of specific faces)
- Finding landmarks and describing the scene (for example, “vacation”, “wedding” and so on)
- Finding texts in images and detecting languages
- Dominant colors
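To make the feature list above more concrete, the following Python sketch builds a JSON body for the Cloud Vision `images:annotate` REST method. The endpoint and the feature type names (`LABEL_DETECTION`, `FACE_DETECTION`, `LANDMARK_DETECTION`, `TEXT_DETECTION`, `IMAGE_PROPERTIES`) come from the public API. The short keys in `FEATURE_TYPES`, the helper name, and the bucket path are assumptions made for this example. The sketch only prepares the request: sending it would additionally require authentication.

```python
import json
from typing import Any, Dict, List

# Public REST endpoint of the Cloud Vision API (shown for reference only).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"

# Hypothetical short names mapped to the API's feature types.
FEATURE_TYPES = {
    "labels": "LABEL_DETECTION",        # object markup
    "faces": "FACE_DETECTION",          # faces and expressions
    "landmarks": "LANDMARK_DETECTION",  # well-known places in the scene
    "text": "TEXT_DETECTION",           # text found in the image
    "colors": "IMAGE_PROPERTIES",       # dominant colors
}


def build_annotate_request(image_uri: str, wanted: List[str],
                           max_results: int = 5) -> Dict[str, Any]:
    """Build the JSON body for one image and the requested features."""
    features = [
        {"type": FEATURE_TYPES[name], "maxResults": max_results}
        for name in wanted
    ]
    return {
        "requests": [
            {"image": {"source": {"imageUri": image_uri}}, "features": features}
        ]
    }


# The bucket and file name below are placeholders.
body = build_annotate_request("gs://example-bucket/wedding.jpg", ["labels", "colors"])
print(json.dumps(body, indent=2))
```

Several features can be requested for the same image in one call, which is usually cheaper in round trips than one request per feature.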
Cloud Video AI. This Google Video Recognition API is in early development and lacks many of the features available in Amazon Rekognition and Microsoft Cognitive Services. The API currently provides the following tools:
- Marking up objects and defining actions
- Identification of content with age restrictions
- Speech transcription
AutoML Vision API. AutoML also comes with many model training products, the first of which was AutoML Vision. Because all AutoML APIs are currently in beta, the product so far provides only the following features:
- Object labeling and a human labeling service
- Registering trained models in AutoML
AutoML Video Intelligence Classification API. This is a pre-release video processing API that is able to classify individual video frames using your own data labels.
While Google AI services may fall short of competitors on the feature list, the power of the Google API lies in the vast arrays of data that Google has access to.
Image Processing API: IBM Visual Recognition
IBM’s Visual Recognition API does not currently support the video analysis that other vendors already have (which is why there is no mention of video in the title). And the image recognition engine offers a basic feature set that is quite limited compared to what other vendors offer:
- Object recognition
- Face recognition (API returns age and gender)
- Food recognition (For some reason, IBM developed a separate model for food)
- Recognition of inappropriate content
- Text recognition (this part of the API is in closed beta, so access to it must be requested separately)
This API has also been discontinued; existing instances will be supported until the end of 2021. It looks like IBM hasn’t come up with an alternative yet. The Visual Recognition API functions have been partially integrated into the Neural Network and Deep Learning service.
Specific APIs and Tools
Here we will talk about specific APIs and tools from Microsoft and Google. We didn’t include Amazon in this section because its API suites just fit into the above categories of text analytics and image+video analytics. However, some of the capabilities of these specific APIs are also present in Amazon products.
Azure Bot Services. Microsoft has put a lot of effort into providing its users with a toolkit for agile bot development. In fact, the service contains a fully functional environment for creating, testing and deploying bots using various programming languages.
Curiously, Bot Service does not necessarily require machine learning. Of the five bot templates Microsoft provides (simple, form, language understanding, proactive, and Q&A), only the language understanding type requires advanced AI techniques.
You can currently use .NET and Node.js technologies to build bots with Azure and deploy them to the following platforms and services:
- Web Chat
- Office 365 email
- GroupMe
- Facebook Messenger
AWS ML hardware. Amazon’s recently introduced physical products come with dedicated APIs for programming hardware with deep/machine learning models. Amazon’s ML-based product line is represented by three devices:
AWS DeepLens is a programmable camera for applying ML on real hardware. You can use Amazon ML services with this camera to recognize visual data and train ML models based on it.
AWS DeepRacer is another piece of hardware from the ML suite, which is essentially a 1/18 scale RC car using reinforcement learning.
AWS Inferentia is a chip designed for deep learning processing that can be used to reduce compute costs. It supports TensorFlow, PyTorch and Apache MXNet.
Bing Search from Microsoft. Microsoft offers seven APIs that connect to Bing’s basic search functionality, including autocomplete, news, image, and video search.
Knowledge from Microsoft. This API group combines text analysis with a wide range of unique tasks:
- Recommendations API allows you to create recommendation systems to personalize purchases
- The Knowledge Exploration Service allows you to enter natural language queries for extracting data from databases, visualizing data, and autocompleting queries
- The Entity Linking Intelligence API is designed to highlight names and phrases denoting real entities (for example, “The Great Geographical Discoveries”) and to resolve ambiguities
- The Academic Knowledge API auto-completes words, finds similarities in documents by words and concepts, and looks for chart patterns in documents
- The QnA Maker API can be used to match question variations with answers to create chatbots and helpdesk applications
- Custom Decision Service is a reinforcement learning tool for personalizing and ranking different types of content (e.g. links, ads, etc.) based on user preferences
Google Cloud Talent Solution. Unlike traditional job search engines that rely on exact keyword matches, Google uses machine learning to find relevant links among wildly different job descriptions and resolve ambiguities. For example, the engine seeks to reduce the number of irrelevant or overly broad jobs returned, such as all jobs with the keyword “assistant” for the query “sales assistant”. The main functions of the API are:
- Correcting spelling errors in job search queries
- Matching the desired level of experience
- Finding relevant jobs containing varying expressions and industry jargon (for example, returning “barista” for “server” instead of “network specialist”; or “engagement specialist” for “biz dev”)
- Handling abbreviations (for example, returning “human resources assistant” for “HR”)
- Matching varying address descriptions
Watson Assistant. The Watson chatbot platform (formerly called Conversation) is quite famous among AI engineers who specialize in conversational interfaces. IBM provides a full-featured infrastructure for building and deploying bots capable of live conversation using entity and user intent analysis in messages.
Engineers can use the built-in support for Facebook Messenger and Slack, or create a client application to run a bot on it.
The four platforms described in this article provide comprehensive documentation for experimenting with machine learning and deploying trained models on enterprise infrastructure. There are also many other ML-as-a-Service solutions offered by various start-ups and well-received by data scientists.
The next step
It is easy to get lost in the variety of solutions available. They differ in terms of algorithms, required skill level and tasks. This situation is quite typical for such a young market, and even the four solutions considered here do not fully compete with each other. In addition, the speed of change is impressive. There’s a good chance you’ll pick one vendor, and then all of a sudden another will come out with something that suits your business needs better.
The right thing to do is to formulate what you want to achieve with machine learning as early as possible. This is not simple. Building a bridge between data science and business value can be difficult if you don’t have a data science background or domain knowledge. Our company often encounters this problem when discussing the application of machine learning with our clients. It can usually be solved by reducing the whole problem to a single attribute: a price or another numerical value to predict, an object class, or a division of objects into several groups. Once you find this attribute, it will be easier for you to choose the vendor and the right product.
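The idea of reducing a problem to a single attribute can be sketched in a few lines of Python. The function below is only an illustration under simple assumptions: a numeric attribute points to regression, a categorical one to classification, and no known attribute at all to clustering. The function name and the rules are invented for this example. Real projects need more care, for instance when a numeric column actually encodes categories.

```python
from numbers import Number
from typing import Optional, Sequence


def suggest_task(target: Optional[Sequence[object]]) -> str:
    """Suggest an ML task type from example values of the target attribute.

    Pass None when there is no known attribute to predict and the goal is
    only to split objects into groups.
    """
    if target is None:
        return "clustering"
    if len(target) == 0:
        raise ValueError("target must contain at least one example value")
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_numeric = all(
        isinstance(value, Number) and not isinstance(value, bool)
        for value in target
    )
    return "regression" if is_numeric else "classification"


print(suggest_task([199.0, 249.5, 180]))       # prices
print(suggest_task(["small", "large", "small"]))  # object classes
print(suggest_task(None))                      # no labels, only grouping
```

Once the task type is clear, the vendor comparison narrows to the products that support that type.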
DCVC co-founder Bradford Cross says ML-as-a-service is not a viable business model. In his opinion, it falls into the gap between data scientists who will use open source products and management who will buy tools to solve problems at a higher level. However, it looks like the industry is now overcoming its growing pains and we will soon see many more companies using ML-as-a-service to avoid hiring expensive employees while getting their hands on versatile data tools.