Monday, June 26, 2017

Oracle Linux - bash check for HTTP response code

Bash is one of the most used scripting languages on Linux systems, used to undertake numerous checks on the system itself and against other systems. The reason for this is that it is available on virtually all Linux hosts and is very easy to script.

In some cases you might need to be able to know the HTTP status code from a remote HTTP server. A ping command might not do the full trick as it only tells you if the host is responsive to a network ping and getting the whole page might not be a good idea because an error page might be interpreted as a valid response. A better way is to check the HTTP response code that is returned.

In general, HTTP response codes can be classified as follows:
1xx - informational responses
2xx - success responses
3xx - redirect responses
4xx - client errors
5xx - server errors

This means that getting a 2xx response indicates that the server is responding properly. On all other responses you might want to add some additional actions. The main question is, how do you get only the response code using a bash script? The below example will do so:

curl -Lw '%{http_code}' -s -o /dev/null -I "$url"

The above command will print only the response code, for example 200, for the URL stored in $url. In case we had used an invalid domain name we would have received an error code instead. The above example is relatively simple; however, it requires curl to be present on the system. In case you do not want curl on your system you will have to find an alternative method to execute the check.
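Building on the curl one-liner above, a small wrapper can turn the raw status code into an action. This is a minimal sketch: the classification simply treats any 2xx code as healthy, and the function names and the URL you pass in are placeholders of my own, not part of the original post.

```shell
#!/bin/bash
# classify_status: treat any 2xx HTTP status code as healthy,
# everything else as a reason for follow-up actions.
classify_status() {
  if [ "$1" -ge 200 ] && [ "$1" -lt 300 ]; then
    echo "OK"
  else
    echo "WARNING"
  fi
}

# check_url: fetch only the status code with the curl flags shown above
# and classify it; pass the URL of the host you want to monitor.
check_url() {
  local status
  status=$(curl -Lw '%{http_code}' -s -o /dev/null -I "$1")
  echo "$1 returned $status ($(classify_status "$status"))"
}
```

From here you could hook the WARNING branch into whatever alerting mechanism your environment uses.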

Monday, June 12, 2017

Oracle Data Visualization Cloud Service - upload files

Within the Oracle Cloud portfolio Oracle has positioned the Oracle Data Visualization Cloud Service as the tool to explore your data, visualize it and share your information with other people within the enterprise. The Oracle Data Visualization Cloud Service can be used as a part of a data democratization strategy within an enterprise to provide all users access to data where and whenever they need it. The concept of data democratization is discussed in another blogpost on this blog.

Currently the Oracle Data Visualization Cloud Service provides two main ways of getting data into the service. One is by connecting it to an Oracle database source, for example located in the Oracle Database Cloud Service; the other is by uploading a file. Primarily CSV and XLSX files are supported for uploading data.

In most cases it is not a best practice to upload files, as the data is relatively static and not connected to a live data source as you would have with a database connection. However, in some cases it can be a good way to get data in. Examples are users who add their own content and do not have the means to connect to an Oracle database, or relatively static data.

Example data from DUO
In the example below we add a relatively static piece of data to the Oracle Data Visualization Cloud Service. This is a year over year report of people dropping out of schools in the Netherlands. The data is per year, per location and per education type, and is freely available as open data from the DUO website. You can locate the file for reference here.

Loading a data file
When a user wants to load data into the Oracle Data Visualization Cloud Service, the easiest way to do so from an end-user perspective is to use the GUI. Loading data involves a limited number of steps.

1) Within the Data Sources section navigate to "Create" - "Data Source". Here the type "File" is selected by default. When selected you are presented with the option to select a file on your local file system.

2) The next step, after the file is uploaded, is to verify and, if needed, modify the definition of the uploaded data.

3) After this step is completed you will find the file ready for use in your data sources, as shown below.

In effect, those are the only actions needed by a user to add data to the Oracle Data Visualization Cloud Service.

Tuesday, June 06, 2017

Oracle Cloud - The value of edge computing

A term you currently see coming up is edge computing: the model where you push computations and intelligence to the edge of the network or the edge of the cloud. Or, as it is commonly defined: “Edge computing is a distributed information technology (IT) architecture in which client data is processed at the periphery of the network, as close to the originating source as possible. The move toward edge computing is driven by mobile computing, the decreasing cost of computer components and the sheer number of networked devices in the internet of things (IoT).”

Analysts state that edge computing is a technology trend with a medium business impact which we will see surfacing in 2017.

Even though the business impact is generally seen as medium, edge computing provides a large technological benefit, and understanding its concepts can be of vital importance, especially when you are developing IoT solutions or geographically distributed systems that rely on machine-to-machine communication.

The high-level concept
From a high-level point of view edge computing states that computations should be done as close as possible to the location where the data is generated. This implies that raw data should not be sent to a central location for computations; the basic computations should rather be done at the collection point.

As an example, if you would do license plate recognition to decide if a security gate should open for a certain car, you can architect this in a couple of ways. The “traditional” way of doing this is having a very lightweight system which takes a number of pictures as soon as a car triggers the camera. The pictures would then be sent to a central server where a license plate recognition algorithm would extract the information from the picture and compare the result against a database to decide whether or not to open the security gate.

Architecting the same functionality with edge computing would involve a number of different steps. In the edge computing model the car would trigger the camera and the pictures would be taken. A small computer, possibly embedded within the camera, would run the license plate recognition algorithm and only the result would be sent to a REST API to check if the gate should be opened or should remain closed.

The benefit of edge computing in this case is that a lot less data needs to be communicated between the camera and the central server. Instead of sending a number of high resolution photos or even a video stream you only have to communicate a JSON object containing the license plate information. By doing so you can limit the amount of computing power needed at the central location and at the same time improve the speed of the end user experience.
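To make the difference concrete: instead of megabytes of image data, the edge device only has to send a message in the order of a hundred bytes. The sketch below builds such a JSON object; the field names, the helper function and the endpoint in the comment are purely illustrative, not part of any real API.

```shell
#!/bin/bash
# make_plate_event: build the small JSON object the edge device would send
# instead of raw images; the field names are illustrative assumptions.
make_plate_event() {
  printf '{"plate":"%s","camera":"%s"}' "$1" "$2"
}

# The result would then be sent to the central server with something like:
#   curl -X POST "$GATE_API_URL" -H 'Content-Type: application/json' \
#        -d "$(make_plate_event "AB-123-C" "gate-1")"
```

A payload like this is trivially cheap to transmit and to parse centrally, which is exactly the point of pushing the recognition work to the edge.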

A more data intensive example
The example of the license plate recognition is a good illustration of the concept, a bigger scale and more data intensive example could be an example using smart devices.

Such an example could be industrial (or home use) equipment which relies on the data collected by a set of sensors to make decisions. An industrial example could be a smart skid in a factory responsible for ensuring that a number of liquid storage containers are always filled to a certain extent and are always at a certain temperature and mix.

Such a skid involves a large set of sensors as well as a large set of potential actions on valves, pumps and heating equipment. Traditionally this was done with an industrial PLC in a disconnected manner, where it was not possible to centrally monitor and manage the skid.

Certain architecture blueprints state that the sensor data should be collected more centrally to ensure a more centralized management and monitoring solution. The result is that all data is sent to a central location where computations are done on the received data, and the resulting actions are communicated back again. This means that a lot of data needs to be communicated back and forth, and a loss of communication can result in a preventive shutdown of a skid.

In an edge computing architecture the skid would be equipped with local computing power which takes care of all the computations that are otherwise done in a central location. All decision making would be done locally on the edge of the network, and a subset of data and a log of the actions undertaken would be sent to a central command and control server where the remote skid can be monitored and human intervention can be triggered.

In this model the loss of connectivity would not result in a preventive shutdown; operations would continue for a much longer time given the operational parameters that the edge computer holds.

Oracle Cloud and MQTT
As already mentioned in the example of the remote skid, a way to communicate the data in an IoT fashion is using MQTT. MQTT stands for Message Queue Telemetry Transport. It is a publish/subscribe, extremely simple and lightweight messaging protocol, designed for constrained devices and low-bandwidth, high-latency or unreliable networks. The design principles are to minimize network bandwidth and device resource requirements whilst also attempting to ensure reliability and some degree of assurance of delivery. These principles also turn out to make the protocol ideal for the emerging “machine-to-machine” (M2M) or “Internet of Things” world of connected devices, and for mobile applications where bandwidth and battery power are at a premium.

On this blog we already discussed MQTT in combination with Oracle Linux and the Mosquitto MQTT message broker.
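As a sketch of how small such a publish can be, the snippet below composes a topic for a skid sensor and shows how a reading could be published with the Mosquitto client tools mentioned above. The broker host, the factory/&lt;device&gt;/&lt;sensor&gt; topic scheme and the payload are assumptions for illustration, not a prescribed naming convention.

```shell
#!/bin/bash
# build_topic: compose an MQTT topic for a given device and sensor;
# the factory/<device>/<sensor> naming scheme is an illustrative assumption.
build_topic() {
  printf 'factory/%s/%s' "$1" "$2"
}

# A reading could then be published with the Mosquitto client tools, e.g.:
#   mosquitto_pub -h "$BROKER_HOST" -q 1 \
#     -t "$(build_topic skid1 temperature)" -m '{"value": 71.5, "unit": "C"}'
```

The -q 1 flag in the comment asks for at-least-once delivery, which matches MQTT's goal of some degree of assurance of delivery on unreliable links.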

To facilitate the growing number of IoT devices and the principle of edge computing relying on MQTT communication, Oracle has included MQTT in the Oracle Cloud. Oracle primarily positions MQTT in combination with the Oracle IoT Cloud Service in the form of an MQTT bridge. The Oracle IoT Cloud Service MQTT Bridge is a software application that must be installed and configured on Oracle Java Cloud Service in order to enable devices or gateways to connect to Oracle IoT Cloud Service over the MQTT communication protocol.

Within the Oracle Cloud you see the MQTT bridge as a solution to connect remote devices to the Oracle IoT Cloud Service via the MQTT protocol. The MQTT bridge receives the MQTT traffic and "translates" it to HTTPS calls which communicate with the Oracle IoT Cloud Service.

In conclusion
As already outlined in the above examples, processing a large part of the computations at the edge of the network and implementing the principles of edge computing will drastically reduce the amount of computing power and storage capacity you need in the Oracle Public Cloud. In many cases, you can rely on MQTT communication or HTTPS communication where you call a REST API.

By pushing a large part of the computations to the edge your remote systems and devices become more reliable, even in cases where network communication is not always a given, and the resulting services become faster. 

Sunday, June 04, 2017

Oracle Linux - Using Consul for DNS based service discovery

Consul, developed by HashiCorp, is a solution for service discovery and configuration. Consul is completely distributed, highly available, and scales to thousands of nodes and services across multiple datacenters. Some concrete problems Consul solves: finding the services applications need (database, queue, mail server, etc.), configuring services with key/value information such as enabling maintenance mode for a web application, and health checking services so that unhealthy services aren't used. These are just a handful of the important problems Consul addresses.

When developing microservices it is important that as soon as a new instance of a microservice comes online it is able to register itself with a central registry; this process is called service registration. As soon as an instance of a microservice is registered at the central registry it can be used in the load balancing mechanism. If a call is to be initiated to a microservice, the service needs to be discovered via the central registry; this process is called service discovery.

In effect, two ways of service discovery are common. One is based upon an API discovery model where the calling service discovers the service by executing an HTTP REST API call against the service registry and receiving an endpoint which can be used, commonly a URL based upon an IP address and a port number.

The other common way is using a DNS based lookup against a service registry. The effect of doing a DNS based lookup is that you need to ensure that all instances of a service are always running on the same port. Enforcing the same port number might be somewhat limiting in cases where the port number can vary per service instance.

Using Consul on Oracle Linux
In an earlier blogpost I already outlined how to install Consul on Oracle Linux. In this post I will provide a quick insight into how you can configure it on Oracle Linux. We build upon the installation done in the mentioned post.

We take the example of a service for which we have two instances; the name of the service is web and it is in effect nothing more than a simple nginx webserver running in a production instance. Every time we call the service we want to discover it using a DNS lookup; we do however want to have this balanced, meaning we want a different IP being returned by the DNS server.

Configure a service in consul
We configure the service "web" manually in this case by creating a JSON file in /etc/consul.d and ensuring we have the information about the two instances in this JSON file. An example of the file is shown below:

{
  "services": [{
    "id": "web0",
    "name": "web",
    "tags": ["production"],
    "Address": "",
    "port": 80
  }, {
    "id": "web1",
    "name": "web",
    "tags": ["production"],
    "Address": "",
    "port": 80
  }]
}

As you can see, we have one name, "web", with two IDs: "web0" and "web1". The IDs are used to identify the different instances of the service web. As you can see, they both have a port noted next to the address. Even though it is good practice to have this in the configuration file, it will not be used in the response from the internal Consul DNS service, as DNS A records only return addresses and not ports.

Discover the service via Consul DNS
If we want to discover the service we can have our code perform a lookup against the DNS server. If we have configured the underlying Oracle Linux instance to have the Consul server in /etc/resolv.conf this happens almost automatically. It is important to make sure the ordering of your DNS servers is correct to improve resolving speed.

In effect, all services configured in Consul will by default be part of .service.consul, which means that if we want to do a DNS resolve for the web service we will have to resolve web.service.consul. In the below example I have Consul running on my localhost at port 8600 and I use dig to explicitly query that DNS server. As stated, if you configure the system correctly you do not have to explicitly call it and you can resolve names as you always do.

[root@localhost consul.d]#
[root@localhost consul.d]# dig +noall +answer @127.0.0.1 -p 8600 web.service.consul 
web.service.consul. 0 IN A
web.service.consul. 0 IN A
[root@localhost consul.d]#

As you can see from the above example, the two addresses registered for the service are returned by the Consul DNS server.
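From application code you would normally not shell out to dig but simply resolve the name through the system resolver. The helper below is a minimal sketch using getent; it assumes the Consul agent is wired into the host's resolver configuration (e.g. /etc/resolv.conf or a local DNS forwarder) as described above, and the function name is my own.

```shell
#!/bin/bash
# resolve_first: return the first IPv4 address the system resolver knows
# for a DNS name; with Consul behind the resolver you could call e.g.
#   resolve_first web.service.consul
resolve_first() {
  getent ahostsv4 "$1" | awk '{ print $1; exit }'
}
```

Because Consul rotates the order of the A records, repeatedly taking the first answer spreads calls over the registered instances.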

Consul and load-balancing
As microservices are commonly built up out of a number of instances of the same service, we do want to ensure that load balancing is done. We can already see from the dig example above that there are two instances. However, having them always returned in the same order will not ensure that the load is balanced over the two instances.

Consul will by default load balance and return the IPs in a different order by rotating them in the DNS response. In the below example you can see that this happens when we call the DNS server a couple of times.

[root@localhost consul.d]#
[root@localhost consul.d]# dig +noall +answer @127.0.0.1 -p 8600 web.service.consul 
web.service.consul. 0 IN A
web.service.consul. 0 IN A
[root@localhost consul.d]#
[root@localhost consul.d]# dig +noall +answer @127.0.0.1 -p 8600 web.service.consul 
web.service.consul. 0 IN A
web.service.consul. 0 IN A
[root@localhost consul.d]#

In conclusion
If you are running a microservices-based IT footprint on Oracle Linux, you can read in the referenced article how to install Consul for service discovery and registration. Consul supports both API-based and DNS-based discovery. If your service instances always run on the same pre-defined port, DNS is a very good option for your service discovery process.

Oracle Cloud - Data democratization by using REST API’s

The idea of helping everybody to access and understand data is known as data democratization. Data democratization means breaking down silos and providing access to data when and where it is needed at any given moment. Striving for full data democratization within the enterprise is in effect taking the step towards a data-driven company and nurturing data-driven decision making.

The general idea of data democratization, providing everyone in the company access to data, is a very simple one; the realization of this idea is however in many cases very complex. This is especially true in organically grown companies that have, over time, grown their IT footprint. In general, this includes a large set of legacy applications that do not naturally support integration that well.

However, the fact that an enterprise has a large set of legacy applications should not hold back the ambition to change to a more data driven enterprise. Moving to a more data driven enterprise, democratizing data and basing decisions on actual data is a huge benefit for the enterprise. Additionally, it is the starting point for integrating other systems and driving business in new and disruptive ways to keep the advantage over competitors.

Getting started
To get started with data democratization the first step is to start finding your data and classifying the data sources. The below pointers can be of importance when evaluating the data:
- Data location: where is the data located and how easily can it be accessed?
- Data ownership: which department owns the data?
- Data confidentiality: how confidential is this data?
- Data privacy: is there privacy related data in the set?
- Data value: what is the monetary value of the data?
- Data alignment: how well aligned is the data with other sources?
Taking the above questions into account when classifying all data will give you a route of action per dataset. It will help you to identify how to handle each data source, how to classify it and how to integrate it. It also helps you to prioritize.

Moving to the cloud
When moving to a data democratization model, this might be a turning point in how you look at IT and it might be a good moment to consider the use of cloud. When trying to integrate and store a large set of data you can select, as an example, the Oracle Cloud to house the data you make available to all your users.

This does not necessarily mean that you have to move the actual systems to the Oracle Cloud. One can think of a model where the backend systems remain in your current datacenter or cloud and you move/sync your data and its changes to the Oracle Cloud, where you unlock them to the users using REST APIs and portals in the form of a data shop.

Opening up with a data shop
The concept of a data shop is the way to get started with data democratization. A data shop is a self-service portal where users can gain access to all the data that you have liberated. It provides users the option to get access to REST APIs or, as an example, to the Oracle Data Visualization Cloud Service, which can show data already included in graphs and other visualizations.

As with a real shop, a large number of “products” are available. Some are for standard users in the form of pre-defined dashboards and reports, while some users will require the data in a rawer format to make and share their own reports and analysis.

Making it easy
Making it easy for data consumers to use the data is actually twofold. You will have two types of consumers: the tech consumers and the non-tech consumers. The tech consumers will require REST APIs to gain access to the data and undertake all the actions they need and think are valuable. The other type of consumer is the non-tech user. For non-tech users the REST API approach might be too difficult to master and they will need a simpler way to gain access to the liberated data.

After you have moved your data to the cloud, as a first step you will have to ensure that the data is accessible, via REST APIs and also via standard dashboards. Oracle provides a growing number of options in the Oracle Public Cloud to do both. You can offer standard visualization and data exploration tooling to your users within the cloud, which has a relatively low learning curve and which people can start with right away. An example of this is the Oracle Data Visualization Cloud Service.

Oracle also provides API functionality. Even though Oracle provides services as standard from within the database and with some of the cloud services, it might very well be beneficial to consider building your own REST API implementation, leveraging both the Oracle Compute Cloud Service with Oracle Linux instances and the Oracle Container Cloud Service.

Putting it all together
With data democratization you open up your data, break the silo way of architecture and provide your users the option to analyze the data and make use of an active and up-to-date collection of data in one single place, the data shop. Moving to the cloud and leveraging the cloud is a technical solution to make this happen; moving to the cloud is not the goal of data democratization.

Wednesday, May 24, 2017

Oracle Linux - capture context switching in Linux

Before we dive into the subject: context switching is normal, the Linux operating system needs context switching to function, so there is no need to worry. The question is, if it is normal, why would you want to monitor it? Because, as with everything, normal behavior is acceptable, while behavior that gets out of bounds will cause issues. With context switching you can expect a certain number of context switches at any given moment in time; however, when the number of context switches gets out of hand this can result in slow execution of processes for the users of the system.

Definition of context switching
The definition of context switching given by the Linux Information Project is as follows: “A context switch (also sometimes referred to as a process switch or a task switch) is the switching of the CPU (central processing unit) from one process or thread to another. A process (also sometimes referred to as a task) is an executing (i.e., running) instance of a program. In Linux, threads are lightweight processes that can run in parallel and share an address space (i.e., a range of memory locations) and other resources with their parent processes (i.e., the processes that created them).”

A context switch comes with a cost; it takes time and capacity to undertake the context switch. Meaning, if you can prevent a context switch this is good and will help the overall performance of the system. In effect, context switching comes in two different types: voluntary context switches and non-voluntary context switches.

Voluntary context switches
When running, a process can decide to initiate a context switch; if the decision is made by the code itself we talk about a voluntary context switch (voluntary_ctxt_switches). This happens for example when you voluntarily give up your execution time by calling sched_yield, or when you put a process to sleep while waiting for some event to happen.

Additionally, a voluntary context switch will happen when your computation completes before the allocated timeslice expires.

All acceptable when used in the right manner and when you are aware of the costs of a context switch.

Non-voluntary context switches
Next to the voluntary context switches we have the non-voluntary context switches (nonvoluntary_ctxt_switches). A non-voluntary context switch happens when a process becomes unresponsive, but it also happens when the task is not completed within the given timeslice. When the task is not completed in the given timeslice, its state is saved and a non-voluntary context switch happens.

Prevent context switching
When trying to develop high performance computing solutions you should, at the very least, be aware of context switching and take it into account. Even better, try to minimize the number of voluntary context switches and try to find the cause of every non-voluntary context switch.
As context switching comes with a cost you want to minimize it as much as possible. When a non-voluntary context switch happens the state needs to be saved and the task is placed back in the scheduler queue, where it has to wait again for an execution timeslice. This slows down the overall performance of your system and makes the specific code you have written even slower.

Check proc context switches
When working on Linux (we are using Oracle Linux in this example, however this applies to most systems) you can check information on context switches by looking into the status file located at /proc/{PID}/status. In the below example we check the voluntary and non-voluntary context switches of PID 25334.

[root@ce /]#
[root@ce /]# cat /proc/25334/status | grep _ctxt_
voluntary_ctxt_switches: 687
nonvoluntary_ctxt_switches: 208
[root@ce /]#

As you can see, the number of voluntary context switches is (at this moment) 687 and the number of non-voluntary context switches is 208. This is a quick and dirty way of determining the number of context switches a specific PID has had at a specific moment.
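The same /proc counters can also be read from a script. The helper below sums the voluntary and non-voluntary counters for a given PID; sampling it twice and subtracting gives the number of switches in the interval. This is a minimal sketch assuming a Linux /proc filesystem; the function name is my own.

```shell
#!/bin/bash
# ctxt_total: sum the voluntary and non-voluntary context switch counters
# from /proc/<pid>/status for the given PID.
ctxt_total() {
  awk '/_ctxt_switches/ { sum += $2 } END { print sum + 0 }' "/proc/$1/status"
}

# Sample usage: context switches of the current shell over one second.
#   before=$(ctxt_total $$); sleep 1; after=$(ctxt_total $$)
#   echo "delta: $((after - before))"
```

Because the counters only ever increase, the delta between two samples is a simple rate measure you can feed into monitoring.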

Monitor context switches
You can monitor your systems for context switching. Even though you are able to do so, you will need a good case for it. Even though it provides information on your system, in most cases and deployments there is no real need to constantly monitor the number of context switches. Having stated that, there are also a lot of cases where monitoring context switching can be vital for ensuring the health of your server and/or comparing nodes in a wide cluster.

A quick and dirty way of monitoring your context switches is by taking a sample. For example, you could take a sample of the average number of context switches for all processes on your Linux instance that execute a context switch in the sample timeframe.

The below example script takes a short sample of the context switches and outputs only the relevant data. For this we use the pidstat command, which can be installed via the sysstat package, available on the Oracle Linux YUM repository.

pidstat -w 2 1 | grep Average | grep -v pidstat | sort -n -k4 | awk '{ if ($2 != "PID") print "ctxt sample:" $2" - "  $3 " - " $4 " - "  $5}'

The full example in our case looks like the one below:

[root@ce tmp]# pidstat -w 2 1 | grep Average | grep -v pidstat | sort -n -k4 | awk '{ if ($2 != "PID") print "ctxt sample:" $2" - "  $3 " - " $4 " - "  $5}'
ctxt sample:12 - 0.50 - 0.00 - watchdog/0
ctxt sample:13 - 0.50 - 0.00 - watchdog/1
ctxt sample:15 - 0.50 - 0.00 - ksoftirqd/1
ctxt sample:18 - 3.00 - 0.00 - rcuos/1
ctxt sample:2183 - 1.00 - 0.00 - memcached
ctxt sample:2220 - 1.00 - 0.00 - httpd
ctxt sample:52 - 1.00 - 0.00 - kworker/1:1
ctxt sample:56 - 1.50 - 0.00 - kworker/0:2
ctxt sample:7 - 14.00 - 0.00 - rcu_sched
ctxt sample:9 - 11.50 - 0.00 - rcuos/0
[root@ce tmp]#

To understand the output we have to look at how pidstat normally provides it. The below is an example of the standard pidstat output:

[root@ce tmp]# pidstat -w 2 1
Linux 4.1.12-61.1.28.el6uek.x86_64 (  05/23/2017  _x86_64_ (2 CPU)

03:24:37 PM       PID   cswch/s nvcswch/s  Command
03:24:39 PM         3      0.50      0.00  ksoftirqd/0
03:24:39 PM         7     14.43      0.00  rcu_sched
03:24:39 PM         9      9.45      0.00  rcuos/0
03:24:39 PM        18      3.98      0.00  rcuos/1
03:24:39 PM        52      1.00      0.00  kworker/1:1
03:24:39 PM        56      1.49      0.00  kworker/0:2
03:24:39 PM      1557      0.50      0.50  pidstat
03:24:39 PM      2183      1.00      0.00  memcached
03:24:39 PM      2220      1.00      0.00  httpd

Average:          PID   cswch/s nvcswch/s  Command
Average:            3      0.50      0.00  ksoftirqd/0
Average:            7     14.43      0.00  rcu_sched
Average:            9      9.45      0.00  rcuos/0
Average:           18      3.98      0.00  rcuos/1
Average:           52      1.00      0.00  kworker/1:1
Average:           56      1.49      0.00  kworker/0:2
Average:         1557      0.50      0.50  pidstat
Average:         2183      1.00      0.00  memcached
Average:         2220      1.00      0.00  httpd
[root@ce tmp]#

As you can see from the “script” we print $2, $3, $4 and $5 for all average data where $2 is not “PID”. This gives us only the relevant data. In our case the columns we show are the following:

$2 – the PID
$3 – number of voluntary context switches in the given sample time
$4 – number of non-voluntary context switches in the given sample time
$5 – the command name

How to use the monitor data
Collecting sample data via monitoring is great; however, when the data is not used it is worthless and has to justify the cost of running the collector. As collecting the number of context switches has a cost, you need to make sure you really need the data. A couple of ways you can use the data are described below, each with a potential value in your maintenance and support effort.

Case 1 - Node comparison
This can be useful when you want to compare nodes in a wider cluster. Checking the number of context switches will be part of a wider set of checks and taking sample data. The number of context switches can be a good datapoint in the overall comparison of what is happening and what the difference between nodes is.

Case 2 - Version comparison
This can be a good solution in cases where you often deploy new versions (builds/deployments) of code to your systems and want to track subtle changes in how the systems behave.

Case 3 – Outlier detection
Outlier detection to detect subtle changes in the way the system is behaving over time. You can couple this to machine learning to detect changes over time. The number of context switches changing over time can be an indicator of a number of things and can be a pointer for a deeper investigation to tune your code.

Case 4 – (auto) scaling
Detecting the number of context switches, in combination with other datapoints, can be input for scaling the number of nodes up and down. This is in general coupled with CPU usage, transaction timing and others. Adding context switching as an additional datapoint can be very valuable.

The site reliability engineering way
When applying the above you can adopt this in your SRE (site reliability engineering) strategy as one of the inputs to monitor your systems, automatically detect trends, prevent potential issues and give feedback to developers on the behaviour of the code in real production deployments.

Tuesday, May 09, 2017

Oracle Linux - Installing dtrace

When checking the description of dtrace for Oracle Linux on the Oracle website we can read the following: "DTrace is a comprehensive, advanced tracing tool for troubleshooting systematic problems in real time.  Originally developed for Oracle Solaris and later ported to Oracle Linux, it allows administrators, integrators and developers to dynamically and safely observe live systems for performance issues in both applications and the operating system itself.  DTrace allows you to explore your system to understand how it works, track down problems across many layers of software, and locate the cause of any aberrant behavior.  DTrace gives the operational insights that have long been missing in the data center, such as memory consumption, CPU time or what specific function calls are being made."

That sounds great, and to be honest dtrace helps enormously in finding and debugging issues on your Oracle Linux system in cases where you need to go one level deeper than you normally would.

Downloading dtrace
If you want to install dtrace, one way to do it is by downloading the files from the Oracle website; you can find the two RPMs there.

Installing dtrace
When installing dtrace you might run into some dependency issues that are not obvious to resolve. Firstly, the two RPMs depend on each other, which means you have to install them in the right order. You can see this below:

[root@ce vagrant]# rpm -ivh dtrace-utils-0.5.1-3.el6.x86_64.rpm 
error: Failed dependencies:
 cpp is needed by dtrace-utils-0.5.1-3.el6.x86_64
 dtrace-modules-shared-headers is needed by dtrace-utils-0.5.1-3.el6.x86_64
 libdtrace-ctf is needed by dtrace-utils-0.5.1-3.el6.x86_64
[root@ce vagrant]# rpm -ivh dtrace-utils-devel-0.5.1-3.el6.x86_64.rpm 
error: Failed dependencies:
 dtrace-modules-shared-headers is needed by dtrace-utils-devel-0.5.1-3.el6.x86_64
 dtrace-utils(x86-64) = 0.5.1-3.el6 is needed by dtrace-utils-devel-0.5.1-3.el6.x86_64
 libdtrace-ctf-devel > 0.4.0 is needed by dtrace-utils-devel-0.5.1-3.el6.x86_64
[root@ce vagrant]# 

As you can see, there are also a number of other dependencies. The easiest way to resolve this is to simply use yum to install both RPMs from your local machine and let it pull in the remaining dependencies. For this we will use the yum localinstall dtrace-utils-* command.

Now we can quickly check if dtrace is indeed installed by executing the dtrace command without any specific option. You should see the below on your terminal:

[root@ce vagrant]# dtrace
Usage: dtrace [-32|-64] [-aACeFGhHlqSvVwZ] [-b bufsz] [-c cmd] [-D name[=def]]
 [-I path] [-L path] [-o output] [-p pid] [-s script] [-U name]
 [-x opt[=val]] [-X a|c|s|t]

 [-P provider [[ predicate ] action ]]
 [-m [ provider: ] module [[ predicate ] action ]]
 [-f [[ provider: ] module: ] func [[ predicate ] action ]]
 [-n [[[ provider: ] module: ] func: ] name [[ predicate ] action ]]
 [-i probe-id [[ predicate ] action ]] [ args ... ]

 predicate -> '/' D-expression '/'
    action -> '{' D-statements '}'

 -32 generate 32-bit D programs and ELF files
 -64 generate 64-bit D programs and ELF files

 -a  claim anonymous tracing state
 -A  generate driver.conf(4) directives for anonymous tracing
 -b  set trace buffer size
 -c  run specified command and exit upon its completion
 -C  run cpp(1) preprocessor on script files
 -D  define symbol when invoking preprocessor
 -e  exit after compiling request but prior to enabling probes
 -f  enable or list probes matching the specified function name
 -F  coalesce trace output by function
 -G  generate an ELF file containing embedded dtrace program
 -h  generate a header file with definitions for static probes
 -H  print included files when invoking preprocessor
 -i  enable or list probes matching the specified probe id
 -I  add include directory to preprocessor search path
 -l  list probes matching specified criteria
 -L  add library directory to library search path
 -m  enable or list probes matching the specified module name
 -n  enable or list probes matching the specified probe name
 -o  set output file
 -p  grab specified process-ID and cache its symbol tables
 -P  enable or list probes matching the specified provider name
 -q  set quiet mode (only output explicitly traced data)
 -s  enable or list probes according to the specified D script
 -S  print D compiler intermediate code
 -U  undefine symbol when invoking preprocessor
 -v  set verbose mode (report stability attributes, arguments)
 -V  report DTrace API version
 -w  permit destructive actions
 -x  enable or modify compiler and tracing options
 -X  specify ISO C conformance settings for preprocessor
 -Z  permit probe descriptions that match zero probes
[root@ce vagrant]# 

All set to start with dtrace on your Oracle Linux instance. In addition, you will also have to install the kernel modules package matching your specific machine:

yum install dtrace-modules-`uname -r`

Oracle Linux - using pstree to find processes

When checking which processes are running on your Oracle Linux instance you can use the ps command. It is probably the most used command for finding processes, and for good reason, as it is very easy to use. However, when you want more insight and a clearer view of what is related to what, the pstree command can be very useful.

pstree shows running processes as a tree. The tree is rooted at either pid or init if pid is omitted. If a user name is specified, all process trees rooted at processes owned by that user are shown. pstree visually merges identical branches by putting them in square brackets and prefixing them with the repetition count. As an example you can see below the standard output of pstree without any additional options specified:

[root@ce tmp]# pstree

As you can see in the above example, httpd is shown between square brackets with the number 10 in front of it, which indicates that ten identical httpd processes are running. Below is a part of the full tree from pstree -p (the lower part is removed for readability):

[root@ce tmp]# pstree -p
        │                   ├─{VBoxService}(1186)
        │                   ├─{VBoxService}(1187)
        │                   ├─{VBoxService}(1188)
        │                   ├─{VBoxService}(1189)
        │                   ├─{VBoxService}(1190)
        │                   └─{VBoxService}(1191)
        │             ├─httpd(3617)
        │             ├─httpd(3618)
        │             ├─httpd(3619)
        │             ├─httpd(3620)
        │             ├─httpd(3621)
        │             ├─httpd(3622)
        │             ├─httpd(3623)
        │             ├─httpd(5020)
        │             └─httpd(5120)
        │            ├─{java}(3997)
        │            ├─{java}(3998)

This shows the main process (PID 3612) and all processes forked from it. In my opinion pstree is primarily there to support you when investigating a machine; it is not by default the best tool to use when scripting solutions on Oracle Linux. Having stated that, it is a great tool to use.
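When you do need such a count in a script, for example the ten httpd workers from the example above, ps output is easier to parse than the pstree drawing. A sketch (httpd is just the example process name; substitute your own):

```shell
# Count running processes with a given name (httpd, as in the
# pstree example); prints 0 when none are running
count=$(ps -C httpd --no-headers | wc -l)
echo "httpd processes: $count"
```

This works well in scripts because the result is a plain number, with no tree-drawing characters to strip.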

Monday, May 08, 2017

Oracle Linux - get your external IP in bash

Developing code that helps you with automatic deployments can be a big timesaver. Scripting the repeating tasks of installing servers, configuring them and deploying code on them is increasingly adopted by enterprises as part of DevOps and continuous integration / continuous deployment methods. When you use scripting for automatic deployment in your own datacenter, the beauty is that you know fairly well what the infrastructure looks like and how, for example, your machine will be accessible from the outside world. If you deploy a server that has an external IP address on the outside of the network edge, you should be able to determine that address relatively easily, even when it is not the IP of the actual machine.

If, however, you distribute your scripting, you cannot apply the logic you might apply in your own network. You then need some way to find out the external IP address, which, as stated, can be something totally different from the IP the machine has from the local operating system's point of view.

The people behind this service have done some great work by providing a quick way to resolve this problem: a service that returns all the information you need (and more) in a manner that is easily included in bash scripting.

As an example, in case you would need your external IP to use in a configuration in your Oracle Linux deployment you could execute the below command:

[root@ce tmp]# curl
[root@ce tmp]#

(do note, this is not my IP, as I do not use a Google webhost as one of my test machines). As you can see this is relatively easy; to provide an example of how you could include this in a bash script, you can review the below code snippet:


 myIp=`curl -s`

 echo $myIp

And even though this is a very quick and easy solution to a problem you could face when automating a number of steps, the service provides more options. A number of options to get information from the "external" view are available and can all be found on the website. The most important one is the ability to request a JSON-based response with all the information in it, which makes the output parsable. To make it even easier, Oracle has included jq in the YUM repository, which simplifies parsing JSON. An example of the JSON response is shown below (again using a fake Google webhost and not my own private information):

{
 "connection": "",
 "ip_addr": "",
 "lang": "",
 "remote_host": "",
 "user_agent": "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2",
 "charset": "",
 "port": "54944",
 "via": "",
 "forwarded": "",
 "mime": "*/*",
 "keep_alive": "",
 "encoding": ""
}
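To illustrate the parsing, the snippet below pulls one field out of a response of the shape shown above using sed. The JSON and the IP address in it are made-up sample data; jq '.ip_addr' would be the more robust route once jq is installed.

```shell
# Made-up sample response in the shape shown above
json='{ "ip_addr": "203.0.113.10", "port": "54944" }'

# Extract the ip_addr field; jq '.ip_addr' is the sturdier alternative
ip=$(printf '%s' "$json" | sed -n 's/.*"ip_addr": *"\([^"]*\)".*/\1/p')
echo "$ip"    # prints 203.0.113.10
```

The sed approach keeps the dependency footprint at zero, which matters when your script has to run on minimal installations.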

Thursday, May 04, 2017

Oracle Linux - find files after yum installation

Installing software on Oracle Linux is relatively easy using the yum command. The downside of this ease is that a lot of files get "dumped" on the filesystem without a clear track of what is installed where. Keeping your filesystem clean and understanding what ends up in which location is vital for maintaining a well-working Linux instance.

A couple of options are available to keep track of what is installed where and provide you with a list of files which shows where things ended up on the filesystem.

The first option is making use of the repoquery utility. As repoquery is part of the yum-utils package, you will have to ensure that yum-utils is installed. You can check this by checking whether you have the repoquery utility (which is a good hint) or by using the yum command as shown below:

[root@ce ~]# yum list installed yum-utils
Loaded plugins: security
Installed Packages
yum-utils.noarch                                                                                    1.1.30-40.0.1.el6                                                                                    @public_ol6_latest
[root@ce ~]#

If you have the repoquery utility you can use it to find out which files are installed in which location. An example is shown below, where we check what was installed, and where, when we installed yum-utils:

[root@ce ~]# 
[root@ce ~]# repoquery --installed -l yum-utils
[root@ce ~]# 
[root@ce ~]# 

This will help you keep track of what is installed in which location and can help ensure you have a clean system. 

Sunday, April 30, 2017

Oracle Linux - Short Tip 7 - find inode number of file

Everything Linux stores as a file is based on a model that makes use of inodes. An inode is a data structure in a Unix-style file system that describes a filesystem object such as a file or a directory. Each inode stores the attributes and disk block location(s) of the object's data. Filesystem object attributes may include metadata (times of last change, access, modification), as well as owner and permission data. In some cases it can be very convenient to know the inode number of a specific file. You can find it by using the ls command or the stat command, for example.

Below you can see the ls command, where we extend ls -l with i to ensure we get the inode information we need:

[vagrant@ce log]$ ls -li
total 128
1835019 -rw-r--r--  1 root root   1694 Apr 19 12:04 boot.log
1835122 -rw-------  1 root utmp      0 Apr 19 13:10 btmp
1835323 -rw-------. 1 root utmp      0 Mar 28 10:28 btmp-20170419
1835124 -rw-------  1 root root      0 Apr 28 18:21 cron
1835030 -rw-------  1 root root    250 Apr 19 12:04 cron-20170419
1835108 -rw-------  1 root root      0 Apr 19 13:10 cron-20170428
1835015 -rw-r--r--  1 root root  27726 Apr 19 12:04 dmesg
1835022 -rw-r--r--. 1 root root      0 Mar 28 10:28 dmesg.old
1837835 -rw-r--r--. 1 root root      0 Mar 28 10:28 dracut.log
1835316 -rw-r--r--. 1 root root 146292 Apr 30 12:56 lastlog
1970601 drwxr-xr-x. 2 root root   4096 Mar 28 10:28 mail
1835125 -rw-------  1 root root      0 Apr 28 18:21 maillog
1837833 -rw-------. 1 root root    181 Apr 19 12:04 maillog-20170419
1835118 -rw-------  1 root root      0 Apr 19 13:10 maillog-20170428
1835126 -rw-------  1 root root    789 Apr 30 12:54 messages
1837831 -rw-------. 1 root root  38625 Apr 19 13:10 messages-20170419
1835119 -rw-------  1 root root   5362 Apr 28 18:17 messages-20170428
1837825 drwxr-xr-x. 2 ntp  ntp    4096 Feb  6 05:58 ntpstats
1835130 -rw-------  1 root root      0 Apr 28 18:21 secure
1837832 -rw-------. 1 root root   6740 Apr 19 12:20 secure-20170419
1835120 -rw-------  1 root root      0 Apr 19 13:10 secure-20170428
1835131 -rw-------  1 root root      0 Apr 28 18:21 spooler
1835031 -rw-------  1 root root      0 Apr 19 12:04 spooler-20170419
1835121 -rw-------  1 root root      0 Apr 19 13:10 spooler-20170428
1835302 -rw-------. 1 root root      0 Mar 28 10:28 tallylog
1835128 -rw-r--r--. 1 root root      0 Mar 28 10:28 vboxadd-install.log
1835129 -rw-r--r--. 1 root root     73 Apr 19 12:04 vboxadd-install-x11.log
1835057 -rw-r--r--. 1 root root      0 Mar 28 10:28 VBoxGuestAdditions.log
1835321 -rw-rw-r--. 1 root utmp   6912 Apr 30 12:56 wtmp
1835028 -rw-------. 1 root root     64 Apr 19 12:13 yum.log
[vagrant@ce log]$

Another example of how to get the inode number is by using the stat command. The below example shows how we use stat on the boot.log file in Oracle Linux to get the inode number and other information.

[vagrant@ce log]$ stat /var/log/boot.log 
  File: `/var/log/boot.log'
  Size: 1694       Blocks: 8          IO Block: 4096   regular file
Device: fb01h/64257d Inode: 1835019     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-04-19 12:04:01.517000000 +0000
Modify: 2017-04-19 12:04:05.524262651 +0000
Change: 2017-04-19 12:04:05.524262651 +0000
[vagrant@ce log]$
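When you only need the number itself, for example in a script, GNU stat can print just the inode field. A small self-contained sketch using a temporary file (the file name is incidental; any file works):

```shell
# Create a scratch file and get its inode number two ways
tmpfile=$(mktemp)
inode_stat=$(stat -c %i "$tmpfile")              # GNU stat: %i prints only the inode
inode_ls=$(ls -i "$tmpfile" | awk '{print $1}')  # same number via ls -i
rm -f "$tmpfile"
echo "$inode_stat"
```

Both commands report the same inode, so in scripts stat -c %i is usually the cleaner choice as it needs no further parsing.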

Friday, April 14, 2017

Oracle Linux - Install maven with yum

When developing Java in combination with Maven on Oracle Linux you most likely want to install Maven with a single yum command. The issue you will be facing is that Oracle does not provide Maven in the default YUM repository for Oracle Linux. The workaround is to make use of the Fedora EPEL YUM repository, which means you have to add it to your list of YUM repositories.

As soon as you have done so you can make use of a standard YUM command to take care of the installation.

The below steps showcase how you can add the yum repository and after that execute the yum install command:

wget -O /etc/yum.repos.d/epel-apache-maven.repo

yum install -y apache-maven

This should result in the installation of Maven on your Oracle Linux instance and enable you to start developing with Maven. To check whether the installation went correctly you can execute the below command, which shows the version information of Maven:

[root@localhost tmp]#
[root@localhost tmp]# mvn --version
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00)
Maven home: /usr/share/apache-maven
Java version: 1.8.0_121, vendor: Oracle Corporation
Java home: /usr/java/jdk1.8.0_121/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "4.1.12-61.1.28.el6uek.x86_64", arch: "amd64", family: "unix"
[root@localhost tmp]#
[root@localhost tmp]#

Oracle Linux - Download Java JDK with wget

When working with Oracle Linux it might very well be that you do not have a graphical user interface. In case you need to download something you will most commonly use wget or curl. In most cases that works perfectly fine; in some cases however it does not work as you might expect. One of the issues a lot of people complain about is that they want to download the Java JRE or Java JDK from the Oracle website using wget or curl. When executing a standard wget command they run into the issue that the response they get is not the rpm (or package) they want; instead the content of the file is HTML.

The reason for this is that the page that controls the download works with cookies that force you to accept the license agreement prior to downloading. As you most likely do not have a graphical user interface, you cannot use a browser to accept it.

This means we need to convince the Oracle website that we have clicked the button to agree with the license agreement, so that it serves the file instead of HTML code. The command you can use for this is shown below:

wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie"

We instructed wget to send the header "Cookie: oraclelicense=accept-securebackup-cookie". This is what Oracle needs to serve the rpm file itself.

Now you can install the JDK by making use of the rpm -ivh command, which should ensure it is installed on your Oracle Linux system. As a check you can execute the below command, which should tell you exactly what is now installed on the system:

[root@localhost tmp]#
[root@localhost tmp]# java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
[root@localhost tmp]#
[root@localhost tmp]#

Monday, April 10, 2017

Oracle Linux - use Vagrant to run Oracle Linux 6 and 7

Vagrant is an open-source software product built by HashiCorp for building and maintaining portable virtual development environments. The core idea behind its creation is that environment maintenance becomes increasingly difficult in a large project with multiple technical stacks. Vagrant manages all the necessary configuration for developers in order to avoid unnecessary maintenance and setup time, and increases development productivity. Vagrant is written in Ruby, but its ecosystem supports development in almost all major languages.

As part of the Vagrant ecosystem people can create Vagrant boxes and share them with others. We have already seen companies package solutions in virtual images to provide customers with showcases of a working end-to-end solution. Even though that is a great way of sharing standard images of an operating system including packaged products, Vagrant is more intended to provide custom images, boxes, to be used by developers for example.

Oracle Linux is already available within the Vagrant boxes shared by the community. You can search Atlas from HashiCorp for boxes built upon Oracle Linux in combination with Oracle VirtualBox. Even though that is great news, it has now improved with Oracle also providing official Vagrant boxes from the Oracle website.

At this moment you can download official Oracle Linux boxes from the Oracle website for Oracle Linux 7.3, 6.8 and 6.9.

As an example of how to get started the below commands show the process to get a running Oracle Linux 6.8 box on a macbook with vagrant already installed.

Step 1 Download the box and add it to vagrant
vagrant box add --name ol6

Step 2 Initialize the box (in a directory where you want it)
vagrant init

Step 3 start the vagrant box (virtualbox)
vagrant up

Step 4 Login to the virtual machine.
vagrant ssh
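Step 2 writes a Vagrantfile in the current directory. A minimal one for this flow would look roughly like the sketch below, assuming the box was added under the name ol6 as in step 1:

```ruby
# Minimal Vagrantfile sketch; "ol6" is the name used in "vagrant box add"
Vagrant.configure("2") do |config|
  config.vm.box = "ol6"
end
```

From here you can extend the file with forwarded ports, synced folders or provisioning scripts as your prototype grows.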


We have not shown the process of installing Vagrant itself; you can find instructions on how to install it on your local machine on the Vagrant website. Having Vagrant on your local machine helps speed up rapid prototyping of new ideas without the need to install and re-install a virtual machine every time.

Having your own private Vagrant boxes within an enterprise, and providing them to your developers so they can work with virtual machines that match the machines you deploy in your datacenter, speeds up the process for developers and removes the time needed to install and re-install virtual machines. It makes sure developers can focus on what they want to focus on: developing solutions and coding.

Monday, April 03, 2017

Oracle Linux - Install Neo4j

Neo4j is a graph database. A graph database uses graph structures for semantic queries, with nodes, edges and properties to represent and store data. A key concept of the system is the graph (or edge or relationship), which directly relates data items in the store. The relationships allow data in the store to be linked together directly and, in many cases, retrieved with one operation.

This contrasts with conventional relational databases, where links between data are stored in the data, and queries search for this data within the store and use the join concept to collect the related data. Graph databases, by design, allow simple and fast retrieval of complex hierarchical structures that are difficult to model in relational systems. Graph databases are similar to 1970s network model databases in that both represent general graphs, but network-model databases operate at a lower level of abstraction and lack easy traversal over a chain of edges.

When developing a solution that depends heavily on the relationships between data points, a graph database such as Neo4j is a good choice. Examples of such applications are solutions where you need to gain insight into the relationships between historical events, the relationships between people and actions, or the relationships between events in a complex system. The last might be an alternative way of logging in a distributed microservice-based architecture.

Install Neo4j on Oracle Linux
For those who would like to set up Neo4j and get started exploring the options it might give your company, the below short set of instructions shows how to set it up on Oracle Linux. For those who use RedHat, the instructions will most probably also work on RedHat Linux; however, the installation was done and tested on Oracle Linux.

The first thing we need to do is ensure we can use yum for the installation of Neo4j on our system. Other ways of obtaining Neo4j are also available; however, the yum way of doing things is the easiest and provides the quickest result. A word of caution: Neo4j currently states that the yum based installation is experimental; we have however not found any issue while using it.

To ensure we have the gpg key associated with the Neo4j yum repository we have to import it. Shown below is an example of how you can download the key:

[root@oracle-65-x64 tmp]#
[root@oracle-65-x64 tmp]# wget
--2017-04-02 13:14:07--
Connecting to||:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4791 (4.7K) [application/octet-stream]
Saving to: “neotechnology.gpg.key”

100%[===========================================>] 4,791 --.-K/s   in 0.005s 

2017-04-02 13:14:08 (1.01 MB/s) - “neotechnology.gpg.key” saved [4791/4791]
[root@oracle-65-x64 tmp]#

As soon as we have obtained the key by downloading it from the Neo4j download location we can import the key by using the import option from the rpm command as shown below:

[root@oracle-65-x64 tmp]#
[root@oracle-65-x64 tmp]# rpm --import neotechnology.gpg.key
[root@oracle-65-x64 tmp]#

Having the key helps validate the packages we download from the Neo4j repository before they are installed on our Oracle Linux machine. We do however have to ensure yum is able to locate the Neo4j repository. This is done by adding a repo file to the yum repository directory. Below is shown how the repository is added to the yum configuration; this command creates the file neo4j.repo in /etc/yum.repos.d, where yum can locate it and use it to include the repository as a valid repository to search for packages.

cat <<EOF> /etc/yum.repos.d/neo4j.repo
[neo4j]
name=Neo4j Yum Repo
EOF

Having both the key and the repository present on your system will enable you to use yum for the installation of Neo4j. This means you can now use a standard yum command to install Neo4j on Oracle Linux, an example of the command is shown below.

[root@oracle-65-x64 tmp]#
[root@oracle-65-x64 tmp]# yum install neo4j

This should ensure you have Neo4j installed on your Oracle Linux instance.

Configuring Neo4j
As soon as you have completed the installation a number of tasks needs to be executed to ensure you have a proper working Neo4j installation.

By default Neo4j will not allow external connections. This means you can only connect to Neo4j using the localhost address. Even though this might be enough for a development or local test environment, it is not what you want when deploying a server; the Neo4j instance must also be accessible from the outside world. This requires a change to the Neo4j configuration file. The standard location for the configuration file, when Neo4j is deployed on an Oracle Linux machine, is /etc/neo4j. In this location you will notice the neo4j.conf file, which holds all the configuration data for the Neo4j instance.

By default the line below is commented out. Ensure you uncomment it; this should make Neo4j accept non-local connections:


Additionally you want Neo4j to start during boot. For this you will have to ensure Neo4j is registered as a service and activated. You can do so by executing the below command:

[root@oracle-65-x64 tmp]#
[root@oracle-65-x64 tmp]# chkconfig neo4j on

Now Neo4j should be registered as a service that starts automatically when the machine boots. You can verify this by using the below command:

[root@oracle-65-x64 tmp]# chkconfig --list | grep neo4j
neo4j          0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@oracle-65-x64 tmp]#

This however does not mean your Neo4j instance is running; you will have to start it manually the first time after installation. To check the status of Neo4j on Oracle Linux you can use the below command:

[root@oracle-65-x64 ~]# service neo4j status
neo4j is stopped
[root@oracle-65-x64 ~]#

To start it you can use the below command:

[root@oracle-65-x64 ~]# service neo4j start
[root@oracle-65-x64 ~]# service neo4j status
neo4j (pid  5643) is running...
[root@oracle-65-x64 ~]#

Now you should have a running Neo4j installation on your Oracle Linux instance which is ready to be used. You now also should be able to go to the web interface of Neo4j and start using it.

Neo4j in the Oracle Cloud
When running Neo4j in the Oracle Cloud, the main installation steps are the same as described in the section above. A number of additional things need to be kept in mind when deploying it within the Oracle Public Cloud.

When deploying Neo4j in the Oracle cloud you will deploy it using the Oracle Public Cloud Compute Cloud Service. In the Compute Cloud Service you will have the option to provision an Oracle Linux machine and using the above instructions you will have a running machine in the Oracle Cloud within a couple of minutes.

The main key pointers you need to consider are around how to setup your network security within the Oracle Cloud. This also ties into the overall design, who can access Neo4j, which ports should be open and which routes should be allowed.

The way Oracle Cloud works with networks, firewalls and zone configuration is a bit different from how it is represented in a standard environment. However, even though the Oracle Compute Cloud service uses some different terms and different ways of doing things it provides you with exactly the same building blocks as a traditional IT deployment to do proper zone configuration and shield your database and applications from unwanted visitors.

A general advice when deploying virtual machines in the Oracle Public Cloud is to plan ahead and ensure you have your entire network and network security model mapped out and configured prior to deploying your first machine.

For the rest, using the Oracle cloud for a Neo4j installation is exactly the same as you would do in your own datacenter, with the exception that you can make use of the flexibility and speed of the Oracle Cloud.