Monday, June 12, 2017

Oracle Data Visualization Cloud Service - upload files

Within the Oracle Cloud portfolio, Oracle has positioned the Oracle Data Visualization Cloud Service as the tool to explore your data, visualize it and share your information with other people within the enterprise. The Oracle Data Visualization Cloud Service can be used as part of a data democratization strategy within an enterprise to give all users access to data wherever and whenever they need it. The concept of data democratization is discussed in another blogpost on this blog.

Currently the Oracle Data Visualization Cloud Service provides two main ways of getting data into the service. One is by connecting it to an Oracle database source, for example one located in the Oracle Database Cloud Service; the other is by uploading a file. Primarily CSV and XLSX files are supported for uploading data.

In most cases uploading files is not a best practice, as the uploaded data is a static copy and is not connected to a live data source as it would be with a database connection. However, in some cases it can be a good way to get data in. Examples are users who add their own content and do not have the means to connect to an Oracle database, or data that is relatively static.

Example data from DUO
In the example below we add a relatively static piece of data to the Oracle Data Visualization Cloud Service. This is a year-over-year report of people dropping out of schools in the Netherlands. The data is per year, per location and per education type and is freely available as open data from the DUO website. You can locate the file for reference here.

Loading a data file
When a user wants to load data into the Oracle Data Visualization Cloud Service, the easiest way to do so from an end-user perspective is to use the GUI. Loading data takes only a limited number of steps.

1) Within the Data Sources section navigate to "Create" - "Data Source". Here you can select the type "File", which is the default. When it is selected you are presented with the option to select a file on your local file system.


2) The next step, after the file is uploaded, is to verify and, if needed, modify the definition of the uploaded data.


3) After this step is completed you will find the file ready for use in your data sources as shown below.

In effect, those are the only actions needed by a user to add data to the Oracle Data Visualization Cloud Service.
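Before uploading a CSV file it can help to check what the column definitions in step 2 are likely to look like. The sketch below is only an illustration of that idea using the Python standard library: the column names and the two sample rows are made up and are not taken from the DUO file, and the "number"/"text" labels are my own simplification, not the types the service uses.

```python
import csv
import io

def is_number(value: str) -> bool:
    """Return True when the cell can be read as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def infer_types(csv_text: str, delimiter: str = ",") -> dict:
    """Guess a coarse type per column: 'number' unless any cell is text."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    types = {}
    for row in reader:
        for column, value in row.items():
            kind = "number" if is_number(value) else "text"
            # Once a column has been seen as text it stays text.
            if types.get(column) != "text":
                types[column] = kind
    return types

# Made-up sample rows, only to demonstrate the function.
SAMPLE = "year,location,type,dropouts\n2015,Amsterdam,MBO,10\n2016,Utrecht,VO,7\n"
column_types = infer_types(SAMPLE)
print(column_types)
```

A check like this makes it easier to spot a column that you expected to be numeric but that contains stray text before you load the file.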

Tuesday, June 06, 2017

Oracle Cloud - The value of edge computing

A term you currently see coming up is edge computing. Edge computing is the model where you push computations and intelligence to the edge of the network or the edge of the cloud. Or, in the words of techtarget.com: “Edge computing is a distributed information technology (IT) architecture in which client data is processed at the periphery of the network, as close to the originating source as possible. The move toward edge computing is driven by mobile computing, the decreasing cost of computer components and the sheer number of networked devices in the internet of things (IoT).”

Analysts state that edge computing is a technology trend with a medium business impact which we will see surfacing in 2017.


Even though the business impact is generally seen as medium, edge computing provides a large technological benefit, and understanding its concepts can be of vital importance, especially when you are developing IoT solutions or geographically distributed systems that rely on machine-to-machine communication.

The high-level concept
From a high-level point of view, edge computing states that computations should be done as close as possible to the location where the data is generated. This implies that raw data should not be sent to a central location for computations; the basic computations should rather be done at the collection point.

As an example, if you want to use license plate recognition to decide whether a security gate should open for a certain car, you can architect this in a couple of ways. The “traditional” way of doing this is to have a very lightweight system that takes a number of pictures as soon as a car triggers the camera. The pictures are then sent to a central server where a license plate recognition algorithm extracts the information from the picture and compares the result against a database to decide whether or not to open the security gate.

Architecting the same functionality with edge computing involves a number of different steps. In the edge computing model the car triggers the camera and the pictures are taken. A small computer, possibly embedded within the camera, runs the license plate recognition algorithm and only the result is sent to a REST API to check if the gate should be opened or should remain closed.

The benefit of edge computing in this case is that far less data needs to be communicated between the camera and the central server. Instead of sending a number of high-resolution photos or even a video stream, you only have to communicate a JSON object containing the license plate information. By doing so you can limit the amount of computing power needed at the central location and at the same time improve the speed of the end-user experience.
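To give a feeling for the difference in size, the sketch below builds the kind of JSON object an edge device could send. The field names (gate_id, plate, confidence), the example plate and the assumed photo size and count are all hypothetical; they only illustrate the order of magnitude and do not describe a specific product.

```python
import json

def build_gate_request(gate_id: str, plate: str, confidence: float) -> bytes:
    """Serialize the recognition result as a compact JSON payload."""
    payload = {"gate_id": gate_id, "plate": plate, "confidence": confidence}
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")

# Assumption: the centralized model uploads five photos of roughly 3 MB each.
ASSUMED_PHOTO_BYTES = 3 * 1024 * 1024
ASSUMED_PHOTO_COUNT = 5

body = build_gate_request("gate-01", "12-ABC-3", 0.97)
central_bytes = ASSUMED_PHOTO_BYTES * ASSUMED_PHOTO_COUNT

print(f"edge payload: {len(body)} bytes")
print(f"photo upload: {central_bytes} bytes")
```

Under these assumptions the edge payload is a few dozen bytes, while the photo upload is in the order of megabytes per car.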

A more data intensive example
The license plate recognition example is a good illustration of the concept. A larger-scale and more data-intensive example is one that uses smart devices.

Such an example could be industrial (or home use) equipment which relies on the data collected by a set of sensors to make decisions. If we take an industrial example, this could be a smart skid in a factory responsible for ensuring that a number of liquid storage containers are always filled to a certain extent and are always at a certain temperature and mix.


A skid as described above involves a large set of sensors as well as a large set of potential actions on valves, pumps and heating equipment. Traditionally this was controlled by an industrial PLC in a disconnected manner, where it was not possible to centrally monitor and manage the skid.

Certain architecture blueprints state that the sensor data should be collected centrally to enable a more centralized management and monitoring solution. As a result, all data is sent to a central location where computations are done on the received data, and the resulting actions are communicated back again. This means that a lot of data needs to be communicated back and forth, and a loss of communication can result in a preventive shutdown of a skid.

In an edge computing architecture, the skid would be equipped with local computing power which takes care of all the computations that are otherwise done in a central location. All decision making is done locally on the edge of the network, and a subset of the data and a log of the actions taken are sent to a central command and control server where the remote skid can be monitored and human intervention can be triggered.

In this model the loss of connectivity does not result in a preventive shutdown; operations can continue for a much longer time, given the operational parameters that the edge computer holds.
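The idea of deciding locally and reporting later can be sketched in a few lines. The controller below is a hypothetical simplification for a single container: the class name, the thresholds and the log format are my own assumptions and are not taken from any real skid or PLC. It only shows that decisions do not depend on the uplink, while the log of actions is buffered until the uplink is available.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class EdgeController:
    """Hypothetical local controller for one container on the skid."""
    min_level: float  # start refilling below this fill level (percent)
    max_level: float  # stop refilling above this fill level (percent)
    pump_on: bool = False
    backlog: deque = field(default_factory=lambda: deque(maxlen=1000))

    def decide(self, level: float) -> str:
        """Decide locally; no round trip to a central server is needed."""
        if level < self.min_level:
            self.pump_on = True
        elif level > self.max_level:
            self.pump_on = False
        action = "pump_on" if self.pump_on else "pump_off"
        self.backlog.append({"level": level, "action": action})
        return action

    def flush(self, send, connected: bool) -> int:
        """Forward buffered log entries only when the uplink is available."""
        if not connected:
            return 0
        sent = 0
        while self.backlog:
            send(self.backlog.popleft())
            sent += 1
        return sent

controller = EdgeController(min_level=40.0, max_level=80.0)
received = []  # stands in for the central command and control server
for level in (35.0, 60.0, 85.0):
    controller.decide(level)
controller.flush(received.append, connected=False)  # uplink down: nothing is sent
controller.flush(received.append, connected=True)   # uplink back: backlog is sent
```

The bounded backlog is a design choice: during a long outage the oldest log entries are dropped rather than exhausting the memory of the edge device.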

Oracle Cloud and MQTT
A common way to communicate data in an IoT fashion, for example from the remote skid described above, is MQTT. MQTT stands for MQ Telemetry Transport. It is a publish/subscribe, extremely simple and lightweight messaging protocol, designed for constrained devices and low-bandwidth, high-latency or unreliable networks. The design principles are to minimize network bandwidth and device resource requirements whilst also attempting to ensure reliability and some degree of assurance of delivery. These principles also turn out to make the protocol ideal for the emerging “machine-to-machine” (M2M) or “Internet of Things” world of connected devices, and for mobile applications where bandwidth and battery power are at a premium.
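To make the publish/subscribe model a bit more concrete, below is a minimal in-memory sketch. It is not an MQTT client or broker and does not use any MQTT library; the TinyBroker class and the topic names are hypothetical. It only illustrates how subscribers register a topic filter, with "+" matching a single topic level and "#" matching all remaining levels, and how a published message reaches the matching subscribers. Details such as topics starting with "$", QoS levels and retained messages are left out.

```python
from typing import Callable, List, Tuple

def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT-style matching: '+' is one level, '#' is the rest."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            return True  # matches all remaining levels, including none
        if index >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[index]:
            return False
    return len(filter_levels) == len(topic_levels)

class TinyBroker:
    """In-memory stand-in for a broker; not a real MQTT implementation."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[str, Callable[[str, str], None]]] = []

    def subscribe(self, topic_filter: str, callback: Callable[[str, str], None]) -> None:
        self._subscriptions.append((topic_filter, callback))

    def publish(self, topic: str, payload: str) -> int:
        """Deliver the payload to every matching subscriber; return the count."""
        delivered = 0
        for topic_filter, callback in self._subscriptions:
            if topic_matches(topic_filter, topic):
                callback(topic, payload)
                delivered += 1
        return delivered

broker = TinyBroker()
inbox = []
broker.subscribe("factory/skid1/+/temperature", lambda t, p: inbox.append((t, p)))
broker.publish("factory/skid1/tank2/temperature", "71.5")  # delivered
broker.publish("factory/skid1/tank2/level", "64")          # no matching subscriber
```

The publisher never needs to know who is listening, which is what makes the model a good fit for large numbers of loosely coupled devices.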

On this blog we already discussed MQTT in combination with Oracle Linux and the Mosquitto MQTT message broker.

To facilitate the growing number of IoT devices and the principle of edge computing, which relies on MQTT communication, Oracle has included MQTT in the Oracle Cloud. Oracle primarily positions MQTT in combination with the Oracle IoT Cloud Service in the form of an MQTT bridge. The Oracle IoT Cloud Service MQTT Bridge is a software application that must be installed and configured on Oracle Java Cloud Service in order to enable devices or gateways to connect to Oracle IoT Cloud Service over the MQTT communication protocol.


Within the Oracle Cloud the MQTT bridge is the solution to connect remote devices to the Oracle IoT Cloud Service via the MQTT protocol. The MQTT bridge receives the MQTT traffic and "translates" it to HTTPS calls which communicate with the Oracle IoT Cloud Service.

In conclusion
As already outlined in the above examples, processing a large part of the computations at the edge of the network and implementing the principles of edge computing will drastically reduce the amount of computing power and storage capacity you need in the Oracle Public Cloud. In many cases, you can rely on MQTT communication or HTTPS communication where you call a REST API.

By pushing a large part of the computations to the edge your remote systems and devices become more reliable, even in cases where network communication is not always a given, and the resulting services become faster. 

Sunday, June 04, 2017

Oracle Linux - Using Consul for DNS based service discovery

Consul, developed by HashiCorp, is a solution for service discovery and configuration. Consul is completely distributed, highly available, and scales to thousands of nodes and services across multiple datacenters. Some concrete problems Consul solves: finding the services applications need (database, queue, mail server, etc.), configuring services with key/value information such as enabling maintenance mode for a web application, and health checking services so that unhealthy services aren't used. These are just a handful of important problems Consul addresses.

When developing microservices it is important that as soon as a new instance of a microservice comes online it is able to register itself with a central registry. This process is called service registration. As soon as an instance of a microservice is registered at the central registry it can be used in the load balancing mechanism. When a call to a microservice is to be initiated, the service first needs to be discovered via the central registry. This process is called service discovery.

In effect, two ways of doing service discovery are common. One is based upon an API discovery model, where the calling service discovers the service by executing an HTTP REST API call against the service registry and receives an endpoint which can be used, commonly a URL based upon an IP address and a port number.

The other common way is using a DNS-based lookup against a service registry. The effect of doing a DNS-based lookup is that you need to ensure that all the instances of a service are always running on the same port. Enforcing the same port number might be somewhat limiting in cases where the port number of each service instance can vary.
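The difference between the two models can be illustrated with a small in-memory registry. This is a conceptual sketch only: the Registry class, its method names and the 10.0.0.x addresses are hypothetical and have nothing to do with the actual Consul API. It shows why the API model copes with varying ports while a plain address lookup does not.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class Instance:
    service: str
    address: str
    port: int

class Registry:
    """Hypothetical in-memory service registry."""

    def __init__(self) -> None:
        self._instances: Dict[str, List[Instance]] = {}

    def register(self, instance: Instance) -> None:
        """Service registration: an instance announces itself."""
        self._instances.setdefault(instance.service, []).append(instance)

    def discover_api(self, service: str) -> List[str]:
        """API-style discovery: full endpoints, including the port."""
        return [f"http://{i.address}:{i.port}" for i in self._instances.get(service, [])]

    def discover_dns(self, service: str) -> List[str]:
        """DNS-style discovery: addresses only, the port is lost."""
        return [i.address for i in self._instances.get(service, [])]

registry = Registry()
registry.register(Instance("web", "10.0.0.10", 8080))
registry.register(Instance("web", "10.0.0.11", 8081))  # a different port

print(registry.discover_api("web"))
print(registry.discover_dns("web"))
```

With the address-only result the caller has to assume a port, which only works when every instance listens on the same, pre-defined port.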

Using Consul on Oracle Linux
In an earlier blogpost I already outlined how to install Consul on Oracle Linux. In this post I will provide a quick insight into how you can configure it on Oracle Linux. We build upon the installation done in the mentioned post.

We take the example of a service for which we have two instances. The name of the service is web, and in effect it is nothing more than a simple nginx webserver running in a production instance. Every time we call the service we want to discover it by using a DNS lookup. We do however want to have this balanced, meaning we want the DNS server to return a different IP address first on each lookup.

Configure a service in consul
We configure the service "web" manually in this case by creating a JSON file in /etc/consul.d and ensuring that this JSON file holds the information about the two instances. An example of the file is shown below:

{
 "services": [{
   "id": "web0",
   "name": "web",
   "tags": ["production"],
   "Address": "191.168.1.10",
   "port": 80
  },
  {
   "id": "web1",
   "name": "web",
   "tags": ["production"],
   "Address": "191.168.1.11",
   "port": 80
  }
 ]
}

As you can see, we have one name, "web", with two IDs: "web0" and "web1". The IDs are used to identify the different instances of the service web. As you can see, they both have a port noted next to the address. Even though it is good practice to have this in the configuration file, it will not be used in the response to a standard DNS lookup against the internal Consul DNS service, as an A-record response only returns the addresses and not the ports.
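Because DNS-based discovery relies on every instance of a service using the same port, it can be useful to check a service definition before placing it in /etc/consul.d. The sketch below is a hypothetical helper of my own, not part of Consul: it parses the same JSON as shown above, verifies that the IDs are unique and reports which ports are in use per service name.

```python
import json
from collections import defaultdict

# The same service definition as shown above.
SERVICE_FILE = """
{
 "services": [
  {"id": "web0", "name": "web", "tags": ["production"],
   "Address": "191.168.1.10", "port": 80},
  {"id": "web1", "name": "web", "tags": ["production"],
   "Address": "191.168.1.11", "port": 80}
 ]
}
"""

def ports_per_service(raw: str) -> dict:
    """Return the sorted ports per service name; reject duplicate IDs."""
    services = json.loads(raw)["services"]
    ids = [service["id"] for service in services]
    if len(ids) != len(set(ids)):
        raise ValueError("service ids must be unique")
    ports = defaultdict(set)
    for service in services:
        ports[service["name"]].add(service["port"])
    return {name: sorted(found) for name, found in ports.items()}

result = ports_per_service(SERVICE_FILE)
# A service with more than one port is a poor fit for address-only lookups.
mixed = [name for name, found in result.items() if len(found) > 1]
print(result, mixed)
```

An empty list of mixed services means that a caller can safely assume one port per service name.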

Discover the service via Consul DNS
If we want to discover the service, our code can perform a lookup against the DNS server. If we have configured the underlying Oracle Linux instance to have the Consul server in its /etc/resolv.conf file, this will happen almost automatically. It is important to make sure the ordering of your DNS servers is correct to improve resolving speed.

In effect, all services configured in Consul are by default part of the .service.consul domain, which means that to resolve the web service we have to resolve web.service.consul. In the example below I have Consul running on my localhost at port 8600 and I use dig to explicitly query this DNS server. As stated, if you configure it correctly you do not have to call it explicitly and you can do a DNS lookup as you always do.

[root@localhost consul.d]#
[root@localhost consul.d]# dig +noall +answer @127.0.0.1 -p 8600 web.service.consul 
web.service.consul. 0 IN A 191.168.1.10
web.service.consul. 0 IN A 191.168.1.11
[root@localhost consul.d]#

As you can see from the above example I will get the two addresses returned from the Consul DNS server.
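From application code the same lookup can be done with the standard resolver. The sketch below uses socket.getaddrinfo from the Python standard library; the resolve_service helper and its lookup parameter are my own additions so the function can be tried without a running Consul server. It assumes that the operating system resolver forwards queries for the .consul domain to Consul, which is not shown here.

```python
import socket

def resolve_service(name: str, port: int, lookup=socket.getaddrinfo) -> list:
    """Return the unique IPv4 addresses for a service name, in answer order."""
    infos = lookup(name, port, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for family, socktype, proto, canonname, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses

# Stand-in for the resolver, returning the addresses from the dig example.
def fake_lookup(host, port, family, socktype):
    return [
        (family, socktype, 6, "", ("191.168.1.10", port)),
        (family, socktype, 6, "", ("191.168.1.11", port)),
    ]

addresses = resolve_service("web.service.consul", 80, lookup=fake_lookup)
print(addresses)
```

On a correctly configured host you would call resolve_service("web.service.consul", 80) without the lookup argument. Note that the port is supplied by the caller, as the address lookup does not provide it.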

Consul and load-balancing
As microservices are commonly built up out of a number of instances of the same service, we do want to ensure that load-balancing is done. We can already see from the dig example above that there are two instances. However, having them always returned in the same order will not ensure that the load is balanced over the two instances.

Consul will by default do load-balancing by returning the IP addresses in a different order, rotating them in the DNS response. In the example below you can see this happening when we query the DNS server a couple of times.

[root@localhost consul.d]#
[root@localhost consul.d]# dig +noall +answer @127.0.0.1 -p 8600 web.service.consul 
web.service.consul. 0 IN A 191.168.1.10
web.service.consul. 0 IN A 191.168.1.11
[root@localhost consul.d]#
[root@localhost consul.d]# dig +noall +answer @127.0.0.1 -p 8600 web.service.consul 
web.service.consul. 0 IN A 191.168.1.11
web.service.consul. 0 IN A 191.168.1.10
[root@localhost consul.d]#
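The effect of the changing order on a client that simply takes the first address can be simulated as follows. This is a simulation under the assumption of a strict rotation, as seen in the output above; it does not call Consul and it is not meant to describe how Consul orders its answers internally.

```python
from collections import Counter, deque

def rotating_answers(addresses):
    """Simulate DNS answers whose order rotates by one on every query."""
    ring = deque(addresses)
    while True:
        yield list(ring)
        ring.rotate(-1)

answers = rotating_answers(["191.168.1.10", "191.168.1.11"])

# A naive client always connects to the first address in the answer.
hits = Counter(next(answers)[0] for _ in range(10))
print(hits)
```

With ten lookups each of the two instances is selected five times, so even a client without any load-balancing logic of its own spreads its calls over the instances.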

In conclusion
If you are running a microservices-based IT footprint and you are using Oracle Linux, you can read in the referenced article how to install Consul to do service discovery and registration. Consul supports both API-based and DNS-based discovery. If your service instances always run on the same, pre-defined port, DNS is a very good option for your service discovery process.


Oracle Cloud - Data democratization by using REST API’s

The idea of helping everybody to access and understand data is known as data democratization. Data democratization means breaking down silos and providing access to data when and where it is needed at any given moment. Striving for full data democratization within the enterprise means taking the step towards becoming a data-driven company and nurturing data-driven decision making.

The general idea of data democratization, giving everyone in the company access to data and the ability to use it, is a very simple one. Realizing this idea is however very complex in many cases. This is especially true in organically grown companies that have, over time, grown their IT footprint. In general, this includes a large set of legacy applications that do not by nature support integration that well.

However, the fact that an enterprise has a large set of legacy applications should not hold back the ambition to change to a more data-driven enterprise. Moving to a more data-driven enterprise, democratizing data and basing decisions on actual data is a huge benefit for an enterprise. Additionally, it is the starting point for integrating other systems and driving business in new and disruptive ways to keep the advantage over competitors.

Getting started
To get started with data democratization, the first step is to find your data and classify the data sources. The pointers below can be of importance when evaluating the data.
- Data location : where is the data located, how easily can it be accessed
- Data ownership : which department owns the data
- Data confidentiality : how confidential is this data
- Data privacy : is there privacy related data in the set
- Data value : what is the monetary value of the data
- Data alignment : how well aligned is the data with other sources
 
Keeping the above questions in mind when classifying all data will give you a route of action per dataset. It will help you to identify how to handle each data source, how to classify it and how to integrate it. It also helps you to prioritize it.
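The classification pointers above can be captured in a simple record per data source, which then makes prioritization a matter of sorting. The sketch below is purely illustrative: the 1 to 5 scales, the scoring formula and the two example sources are my own assumptions, and any real classification would need weights that fit your organization.

```python
from dataclasses import dataclass

@dataclass
class DataSource:
    name: str
    location: str           # where the data lives and how it is reached
    owner: str              # owning department
    confidentiality: int    # 1 (open) to 5 (highly confidential), assumed scale
    has_privacy_data: bool  # privacy-related data in the set
    value: int              # 1 (low) to 5 (high), assumed scale
    alignment: int          # 1 (poorly aligned) to 5 (well aligned), assumed scale

def priority(source: DataSource) -> int:
    """Hypothetical score: valuable, aligned, low-risk data comes first."""
    score = source.value + source.alignment - source.confidentiality
    if source.has_privacy_data:
        score -= 2
    return score

sources = [
    DataSource("hr_records", "legacy HR system", "HR", 4, True, 5, 2),
    DataSource("product_catalog", "ERP database", "Sales", 1, False, 4, 4),
]

ordered = sorted(sources, key=priority, reverse=True)
for source in ordered:
    print(source.name, priority(source))
```

In this made-up example the product catalog ends up ahead of the HR records, because it is easier and less risky to open up.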

Moving to the cloud
Moving to a data democratization model might be a turning point in how you look at IT, and it might be a good moment to consider the use of cloud. When trying to integrate and store a large set of data you can select, as an example, the Oracle Cloud to house the data you make available to all your users.

This does not necessarily mean that you have to move the actual systems to the Oracle Cloud. One can think of a model where the backend systems remain in your current datacenter or cloud and you move or sync your data and the changes to the Oracle Cloud, where you unlock them to the users using REST APIs and portals in the form of a data shop.

Opening up with a data shop
The concept of a data shop is a way to get started with data democratization. A data shop is a self-service portal where users can gain access to all the data that you have liberated. It provides users the option to get access to REST APIs or to, as an example, the Oracle Data Visualization Cloud Service, which can show data already presented in graphs and other visualizations.

As with a real shop, a large number of “products” are available. Some are for the standard users in the form of pre-defined dashboards and reports, and some users will require the data in a rawer format to make and share their own reports and analyses.

Making it easy
Making it easy for data consumers to use the data is actually twofold. You will have two types of consumers, the tech consumers and the non-tech consumers. The tech consumers will require REST APIs to gain access to the data and undertake all the actions they need and think are valuable. The other type of consumers are the non-tech users. For non-tech users the REST API approach might be too difficult to master and they will need a simpler way to gain access to the liberated data.

After you have moved your data to the cloud, the first step in the process is to ensure that the data is accessible, via REST APIs and also via standard dashboards. Oracle provides a growing number of options in the Oracle Public Cloud to do both. You can offer your users standard visualization and data exploration tooling within the cloud which has a relatively low learning curve and which people can start with right away. An example of this is the Oracle Data Visualization Cloud Service.


Oracle also provides API functionality as standard from within the database and with some of the cloud services. Even so, it might very well be beneficial to consider building your own REST API implementation, leveraging both the Oracle Compute Cloud Service with Oracle Linux instances and the Oracle Container Cloud Service.
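As an indication of how small a first version of your own REST API can be, below is a minimal sketch using only the Python standard library. The paths (/datasets and /datasets/<name>), the DataShopHandler class and the placeholder dataset are hypothetical; a real data shop would add authentication, paging and a proper data store, none of which are shown here.

```python
import json
from http.server import BaseHTTPRequestHandler

# Placeholder content; a real service would read from a data store.
DATASETS = {"example": [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]}

def route(path: str):
    """Map a request path to a (status, body) pair."""
    parts = [part for part in path.split("/") if part]
    if parts == ["datasets"]:
        return 200, {"datasets": sorted(DATASETS)}
    if len(parts) == 2 and parts[0] == "datasets" and parts[1] in DATASETS:
        return 200, {"name": parts[1], "rows": DATASETS[parts[1]]}
    return 404, {"error": "not found"}

class DataShopHandler(BaseHTTPRequestHandler):
    """Serves the routes above as JSON over HTTP GET."""

    def do_GET(self):
        status, body = route(self.path)
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

status, listing = route("/datasets")
print(status, listing)
```

To try it locally you could start it with http.server.HTTPServer(("127.0.0.1", 8080), DataShopHandler).serve_forever() and request http://127.0.0.1:8080/datasets. Keeping the routing in a plain function makes it easy to test without starting a server.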

Putting it all together
With data democratization, you open up your data, break with the siloed way of architecture and provide your users the option to analyze the data and make use of an active and up-to-date collection of data in one single place, the data shop. Moving to the cloud and leveraging the cloud is a technical solution to make this happen. Moving to the cloud is not the goal of data democratization.