Categories
Software Technology

How to Build a Development Environment with Docker

Building a reliable and comfortable development environment is not an easy task, and running multiple versions of the same software can be tricky. This post shows how to build an awesome development environment with Docker.

MySQL Docker


Docker is an open-source platform that automates the deployment of applications inside software containers.
What are the advantages of using an external tool to manage software on a development machine?

  • running multiple versions. Usually, only a single version of an application can be found in a package manager, and very often it’s not the latest one. Example: there’s no Oracle MySQL package in the Arch Linux package repository.
  • saving time spent on compiling an application from sources. Managing multiple versions of the same software is time-consuming.
  • avoiding dependency hell. You can use several versions of an application without introducing any dependency compatibility issues.
  • reliability. Official Docker images are more reliable than third-party repositories.

Using Docker to install multiple versions of MySQL

MySQL has been used as an example in this post, but the process is almost the same for any kind of software. You can find the details about the official MySQL Docker images at the Docker Hub page.
You’ll need the following dependencies to achieve this:

  • Docker
  • systemd or equivalent
  • (optional) a build tool (GNU Make would match the requirements perfectly) – this
    example uses simple shell scripts.

The first step is to create a directory where the configuration files will be stored. A home directory is a good choice:
mkdir -p ~/.docker/mysql/{5.6,5.7}. Two MySQL versions will be used as an example: 5.6 and 5.7.
Every container requires a Dockerfile, so creating those is the second step: touch ~/.docker/mysql/{5.6,5.7}/Dockerfile.
You can open Dockerfiles using any text editor: emacs, vim, gedit and so on.
~/.docker/mysql/5.6/Dockerfile

FROM mysql:5.6
ENV MYSQL_ALLOW_EMPTY_PASSWORD true
VOLUME /var/lib/mysql
EXPOSE 3306
CMD ["mysqld"]

~/.docker/mysql/5.7/Dockerfile

FROM mysql:5.7
ENV MYSQL_ALLOW_EMPTY_PASSWORD true
VOLUME /var/lib/mysql
EXPOSE 3306
CMD ["mysqld"]

Some explanation of the Dockerfiles content:

  • FROM specifies the image name and tag in the following format: name:tag
  • ENV sets an environment variable. In this example, an empty root user password is allowed.
  • VOLUME marks the location of the mounted volume, as the database files should be stored on the host machine.
  • EXPOSE declares the port the container listens on, which can be published to the host.
  • CMD sets the default command to run when the container starts.

As we’re running multiple versions of MySQL side by side, the host port and the host data directory both need to be customized when running the containers. This isn’t necessary if you’re using a single instance.
After the Dockerfiles are ready, an automation tool will be required to build and run the containers. This example uses simple shell scripts, but any build tool can be used: touch ~/.docker/mysql/{5.6,5.7}/build.sh.
~/.docker/mysql/5.6/build.sh

#!/bin/bash
docker stop mysql-5.6
docker rm mysql-5.6
docker build -t mysql-5.6 .
docker run -d \
	-p 3306:3306 \
	-v /srv/mysql/5.6:/var/lib/mysql \
	--name mysql-5.6 \
	mysql-5.6
docker start mysql-5.6

~/.docker/mysql/5.7/build.sh

#!/bin/bash
docker stop mysql-5.7
docker rm mysql-5.7
docker build -t mysql-5.7 .
docker run -d \
	-p 3307:3306 \
	-v /srv/mysql/5.7:/var/lib/mysql \
	--name mysql-5.7 \
	mysql-5.7
docker start mysql-5.7

Some details about the build scripts:

  • docker stop stops the running container named mysql-5.6 or mysql-5.7, if one exists (on the very first run there is nothing to stop, so the command simply prints an error and the script carries on)
  • docker rm removes any existing container with the specified name
  • docker build builds a new image with the specified tag – in this example mysql-5.6 or mysql-5.7
  • docker run runs a container using a previously created image
  • docker start starts the newly created container
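Putting it together, building and smoke-testing both containers from the host might look like this (a sketch: it assumes the 5.7 container publishes host port 3307 and that a mysql client is installed on the host):

```shell
# Build and (re)create both containers
cd ~/.docker/mysql/5.6 && bash build.sh
cd ~/.docker/mysql/5.7 && bash build.sh

# Both containers should now be listed as running
docker ps --filter name=mysql-5

# Connect to each instance from the host (empty root password was allowed
# via MYSQL_ALLOW_EMPTY_PASSWORD in the Dockerfiles)
mysql -h 127.0.0.1 -P 3306 -u root -e 'SELECT VERSION();'
mysql -h 127.0.0.1 -P 3307 -u root -e 'SELECT VERSION();'
```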

MySQL Docker – Almost there…

At this stage, the MySQL Docker containers are fully functional and accessible from the host machine, and they store the MySQL data directory on the host. There’s one remaining issue with the containers – they do not start at system boot. To start them, an init system or a cron job might be used. In this post, I’ll use systemd.
/etc/systemd/system/mysql-5.6.service

[Unit]
Description=MySQL 5.6 Docker container
Requires=docker.service
After=docker.service
[Service]
Restart=always
ExecStart=/usr/bin/docker start -a mysql-5.6
ExecStop=/usr/bin/docker stop -t 2 mysql-5.6
[Install]
WantedBy=multi-user.target

/etc/systemd/system/mysql-5.7.service

[Unit]
Description=MySQL 5.7 Docker container
Requires=docker.service
After=docker.service
[Service]
Restart=always
ExecStart=/usr/bin/docker start -a mysql-5.7
ExecStop=/usr/bin/docker stop -t 2 mysql-5.7
[Install]
WantedBy=multi-user.target

These newly created service units depend on the Docker service and will always start after it has been started. To start the containers at system boot, enable both services:

sudo systemctl enable mysql-5.6.service
sudo systemctl enable mysql-5.7.service

Now both MySQL containers should be started after system boot.
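You can also verify the whole setup without rebooting (a hypothetical session):

```shell
# Start the units now and confirm they are active and enabled
sudo systemctl start mysql-5.6.service mysql-5.7.service
systemctl is-active mysql-5.6.service mysql-5.7.service
systemctl is-enabled mysql-5.6.service mysql-5.7.service

# The containers should be up as well
docker ps --filter name=mysql-5
```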
You can use this method as a replacement for the default system packages. Personally, I use it on all my development machines to install software that isn’t available in the system package repositories. I also use it to run multiple versions of the same server (PostgreSQL, MySQL, MongoDB, etc) for all non-dockerized applications I work on.
Why don’t you also check out our Introduction to Docker presentation?


ElasticSearch: How to Solve Advanced Search Problems

A few months ago, my team and I faced a challenge: to provide an advanced search engine with many simple (and more complicated) criteria and the ability to use a full text search mechanism. In addition, we knew that our client demanded high efficiency and scalability. I’d like to focus on full text search and other advantages of the Elasticsearch solution.

Data, data, data – their amount seems to grow, month after month. We didn’t really have that ‘problem’ a few years ago. Formerly, web applications didn’t have to process that much data at once. Developers didn’t have to pay attention to their application’s efficiency. Then, a vast majority of businesses moved their solutions to the web. They were suddenly faced with the problem of processing big amounts of data. We – the developers – needed to take a different approach.
Most of the problems we face during development can be solved with one of the mainstream relational databases – MySQL, PostgreSQL or Oracle – by implementing a cache system, or by making use of a non-relational database like MongoDB.

What if this isn’t enough? And our customer needs something more powerful, a system with a highly efficient data search engine – and scalability?
This is where ElasticSearch comes in. It’s a distributed, real time search engine, based on the Apache Lucene engine.
It gives us:

  • easy full text search
  • advanced search queries
  • many helpful built-in solutions
  • manageable data result scoring
  • shorter development time
  • scalability

Communication with Elasticsearch is based on a REST API. Programming languages like Java or PHP provide libraries to handle Elasticsearch. This makes life easier for developers.
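Because it is plain REST over HTTP, you can talk to a local node with nothing but curl (the `applications`/`multimedia` index and type names and the document below are made up for illustration; Elasticsearch listens on port 9200 by default):

```shell
# Check the cluster is up
curl 'http://localhost:9200/'

# Index a document...
curl -XPUT 'http://localhost:9200/applications/multimedia/1' -d '{
  "name": "CV",
  "content": "php developer looking for a job, test task included"
}'

# ...and search it with a full text query
curl 'http://localhost:9200/applications/multimedia/_search' -d '{
  "query": { "match": { "content": "php" } }
}'
```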

CHALLENGE: Advanced data search engine with full text search field and data scoring functionality

DATA:

  • MySQL version: 5.6.28
  • Table: 169123 records
  • Average length of searchable field: 2550 characters
  • Maximum length of searchable field: 686141 characters

Let’s say a user would like to find all rows containing words like: “job”, “php” and “test”.

MySQL – first attempt (typical simple query on long text field):

SELECT count(ap.id) FROM database.application_multimedia ap WHERE content LIKE '%test%' OR content LIKE '%job%' OR content LIKE '%php%';

Duration time: 9.140 seconds
As we can see, the execution time of even a very simple query is quite high for a search engine. Our client wouldn’t accept that solution.

MySQL – second attempt (query based on full text indexed field):

Step 1:
We need to change an index of a field:

ALTER TABLE `database`.`application_multimedia` ADD FULLTEXT INDEX `FTS` (`name` ASC, `content` ASC);

Step 2:
Execute query:

SELECT COUNT(ap.id) FROM database.application_multimedia ap WHERE MATCH (name, content) AGAINST('+test +job +php' IN NATURAL LANGUAGE MODE)

Duration time: 0.45 seconds
In this case, the query’s execution time is low. If our search engine were based only on full-text search, then MySQL would be enough. However, if we’d like to apply more advanced search conditions (and data sorting based on those conditions), then we should consider Elasticsearch.

Elasticsearch – third attempt:

Running the same query in Elasticsearch gives a duration time of 0.471 seconds.
As we can see, the execution times are similar. So why is Elasticsearch better? As I wrote before, if the customer wants a full-text search only, MySQL or PostgreSQL is enough. But if they want to develop the engine further and make it more complex, we should definitely consider Elasticsearch.
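For reference, the Elasticsearch counterpart of the MySQL MATCH … AGAINST('+test +job +php') query might look roughly like this (the index, type and field names are assumptions based on the MySQL example above):

```shell
curl 'http://localhost:9200/applications/multimedia/_count' -d '{
  "query": {
    "bool": {
      "must": [
        { "match": { "content": "test" } },
        { "match": { "content": "job" } },
        { "match": { "content": "php" } }
      ]
    }
  }
}'
```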
A simple example of an Elasticsearch feature that makes developers’ lives easier and full-text search more useful is highlighting: it returns the context of each matched word along with the hit.
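A sketch of what a highlight request and its response look like (the index, field and document content are made up):

```shell
# Ask Elasticsearch to return matched fragments alongside the hits
curl 'http://localhost:9200/applications/multimedia/_search' -d '{
  "query": { "match": { "content": "php" } },
  "highlight": { "fields": { "content": {} } }
}'

# Each hit then carries a "highlight" section, roughly:
# "highlight": {
#   "content": ["looking for a <em>php</em> developer ..."]
# }
```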
What’s the value of such a feature? You’re practically getting something for free. A developer’s effort is minimal – from a customer’s point of view, that’s a big advantage. Achieving similar functionality on top of a common relational database would demand a much greater workload.
An additional advantage of using Elasticsearch is its scalability. Along with data growth, we can add more Elasticsearch nodes to the system to ensure its stability and efficiency.
To sum up, Elasticsearch is a solution perfectly matched to advanced search in complex data structures, especially if we want precisely ordered results in our application. In the next article, I’d like to outline more features and abilities of Elasticsearch – like influencing scoring and search results. Any questions? Ask away!


Geecon 2016: Java 9, Spring and the Nyan Cat

Our goal for GeeCON 2016 was to broaden our knowledge about topics we encounter on a daily basis at work. We chose talks concerning Java 9 (and 8), microservices, reactive programming and Docker. Here are a few words on some of the most interesting and inspiring ones.

Tomek

I’ve heard that Sven Peters is a great speaker so without hesitation I chose his Rise of the Machines – Automate Your Development. What a talk it was! Passionate delivery and beautiful slides (do check them out) combined with inspiring content gave me lots of ideas to improve our daily development processes at Espeo. The general idea is to automate all mundane and repeatable processes that do not require human interaction and creativity. The key is to look further than the usual CI/CD tools. I thought that at Espeo we were pretty well automated already but now I know we still have room for improvement. Which is great and I can’t wait to implement a bot or two.

We’ve had Java 8 around for some time now, and the trend is moving ever more towards functional programming. But do we really know how to do it and what price we may pay? That’s what Daniel Sawano and Daniel Deogun wanted to explain in their talk “Beyond lambdas – the Aftermath”. It was a code-only presentation and that’s what really matters to us, programmers. They gave us many examples of bad code and showed ways to refactor it to remove code smells and side effects. They outlined where the functional style might introduce hidden complexity to our code (coders love those one-liners), generate unnecessary function calls, etc. Have you ever analyzed the bytecode of a lambda? Well, at the presentation we did. We learned how to avoid stateful lambdas and how they affect runtime performance. All in all, a very detailed, rich-in-content talk.

Iga

This year’s Geecon was a little less interesting than the previous one, but I still found some speeches worth noting. The best one was about Java 9 and its modularity. The runners-up were the charismatic talks of Josh Long about Spring.

During only 50 minutes, Josh built a brand new application with RESTful services, a pretty GUI and well-designed architecture – all using Spring Boot, a convention-over-configuration framework. In another amazing 50 minutes, he talked about Spring Cloud – tools for developers to quickly build some of the most common patterns in distributed systems. Examples: configuration management, service discovery, circuit breakers, intelligent routing, micro-proxy, control bus, one-time tokens, global locks, leadership election, distributed sessions, cluster state. All of this was mentioned in the context of microservices – a concept highly recommended and praised, but one which can lead to architectural complexity. In his amazing talk, he showed how organizations like Ticketmaster, Alibaba and Netflix cope with this complexity using Spring Boot and Spring Cloud.

What was especially interesting was his incredible performance and fluency in writing good code in an extremely short time. Everyone was amazed at how quickly and easily he created a working application using those tools. He is also a very talented speaker. The audience laughed many times at his hilarious jokes and funny tricks, like changing the Spring logo to the Nyan Cat during the build process. If all speakers could talk about software in such a funny and interesting way, showing great tools, great tips and great jokes, Geecon would be the best conference in the entire world.

Michał

“Java 9 Modularity in Action”. The Java world has been trying to tackle modularity issues for a long time through initiatives such as OSGi. Yet they were never widely adopted, because of the effort needed to actually understand and use these tools. Project Jigsaw, the highlight of Java 9, promises to deal with the problem of modularity at its roots. It proposes a revolution: getting rid of the classpath and introducing a new concept of highly encapsulated modules (so now we will have a… modulepath). These modules will (or rather should) expose only interfaces to talk with the outer world, and no implementations. Apart from encouraging good modular design, which should be a goal in itself, this also solves some annoying problems like the clash of different versions of the same class on the classpath. Of course, to provide backward compatibility, some tools will be provided to make the transition to the world of Jigsaw less painful: modules for legacy code will be generated automatically. I was surprised to see a live coding session during this talk, with the modularity concept actually working, even though there is still some time before Java 9 is released.

Nobody expects the Spanish Inquisition, and nobody from the Geecon team expected such interest in the “Java and Docker, a Good Idea” talk by Christopher Batey. The room was packed to the brim with Docker-hungry programmers. Despite the name of the talk, it was not about whether to use Docker with the JVM or not. It focused on the not-so-obvious traps to avoid when running the JVM inside Docker containers, especially under high load. For instance, subjects like operating near memory limits and page swapping limits were covered.
There were many more interesting talks on e.g. event sourcing, reactive programming in general and in detail (RxJava), as well as a bird’s-eye view of microservices and gory implementation details. Plus some interesting concepts like self-healing systems and many performance-related talks. Want to know how storage works? How do traditional HDDs work, and how do they differ from SSDs? What’s coordinated omission in performance testing, and why does it matter? Not to mention Big Data topics (we wrote about those too!). All that and much more at GeeCON. Now let’s put theory into practice – and see you there next year!


Road To Angular 2 – Reactive Programming (RxJS)

Welcome to the first stop on the Road To Angular 2! The new version of Angular isn’t as simple as the previous one. It introduces a lot of new, hot and trendy stuff, so you have to prepare yourself before reaching the final destination!

In this series we’ll introduce you to:

  • Reactive Programming
  • ES6 features
  • TypeScript
  • Main concepts of Angular 2 apps
  • Integrating Angular 2 with Symfony backend

Today I’m going to tell you about Reactive Programming using RxJS. Are you ready? Let’s start!

RxJS – why? It isn’t Angular 2, right?

Angular 1 is famous for its simplicity. It includes a lot of necessary features – we don’t have to use jQuery for XHR communication, for example – so it doesn’t force us to include external libraries. The Angular team has changed their attitude: Microsoft’s TypeScript and RxJS are both included in Angular 2. As you can see, we have to get to know these technologies before learning our favourite framework.
Reactive programming is a subject important not only for JavaScript Developers, but also for every developer who wants to include asynchronous events into their apps. Rx isn’t only for JavaScript. There are versions for .Net, Java, Python and many others. So, I’d like to encourage you to read this article even if you don’t know JavaScript.
Let’s dive in…

Callback vs Promise vs Observable

JavaScript is a language which offers a few approaches that support handling asynchronous events, so let me just remind you what the difference is between those methods.
If you feel comfortable with callbacks and promises you can simply skip this part.
A callback is technically a function which we pass as a parameter into another function. What’s more important, it isn’t executed immediately, but only after the parent operation completes. This gives us the freedom to manage asynchronous events as we want.
https://jsfiddle.net/q288pfb1/7/
As you can see in the example, we get data from the server using Ajax and pass a callback as a parameter. In the callback, we retrieve the response and pass the objects to another function, which inserts some items into the list. Easy, right? Is it pretty? Nope!
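The fiddle boils down to something like this (a simplified sketch – getJSON, renderList and the fake response are hypothetical stand-ins for the Ajax helper used in the example; a real XHR would call back asynchronously):

```javascript
// A stand-in for an Ajax helper: in a real app this would be an
// asynchronous XHR; here it calls back synchronously for brevity.
function getJSON(url, callback) {
  const fakeResponse = '[{"name":"PHP"},{"name":"JavaScript"}]';
  callback(JSON.parse(fakeResponse));
}

// Turn the received objects into list items.
function renderList(items) {
  return items.map(item => '<li>' + item.name + '</li>').join('');
}

getJSON('/api/technologies', function (items) {
  // We only get here once the "server" has answered.
  console.log(renderList(items));
});
```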
Have you ever heard of callback hell? Nested callbacks several levels deep just aren’t readable.
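A contrived sketch of such a pyramid (all the step functions are hypothetical, and each calls back synchronously here to keep the example self-contained):

```javascript
// Each step simulates an operation that reports back via a callback.
function loadUser(id, cb) { cb({ id: id, name: 'Anna' }); }
function loadPosts(user, cb) { cb([{ title: 'Hello' }]); }
function loadComments(post, cb) { cb([{ text: 'Nice!' }]); }

// The pyramid of doom: each result is only visible inside the
// previous callback, so nesting grows with every dependent step.
let summary;
loadUser(1, function (user) {
  loadPosts(user, function (posts) {
    loadComments(posts[0], function (comments) {
      summary = user.name + ': ' + posts[0].title +
        ' (' + comments.length + ' comment)';
    });
  });
});
console.log(summary);
```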
I promise that there are better places to live in than Callback Hell – one of them, for sure, is Promise Heaven!
Of course we can deal with callback hell, but it isn’t the main topic of this article. You can read more here: http://callbackhell.com/
Most readability problems can be solved with promises. ES6 introduces Promise as a native JavaScript object. Before that, we had to rely on external libraries like jQuery, which offered such objects.
A brief reminder. Promise has three states:

  • pending – the initial state – “Hello, I really want to do something, I promise, but it can take some time, just wait”
  • fulfilled – the operation completed successfully – “Hi! I finished my work and everything is alright, mate!”
  • rejected – the operation failed – “Something went wrong, sorry”

The same code, but written using Promise, looks like this:
https://jsfiddle.net/y9czkyoc/4/
To put it simply, a Promise is an object which can be rejected or resolved. After fulfillment, we can process the returned data using the then() method. Under the hood, a Promise is a set of callbacks. If you are interested in how it is implemented, I encourage you to read this article: https://www.promisejs.org/implementing/
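The three states can be sketched in a few lines (doWork and its success flag are made up for illustration):

```javascript
// Wrap an operation in a Promise. The executor runs immediately;
// the promise starts out pending and settles later.
function doWork(shouldSucceed) {
  return new Promise(function (resolve, reject) {
    if (shouldSucceed) {
      resolve('everything is alright, mate!'); // pending -> fulfilled
    } else {
      reject(new Error('something went wrong, sorry')); // pending -> rejected
    }
  });
}

// Consumers chain then()/catch() instead of nesting callbacks.
doWork(true)
  .then(message => console.log('fulfilled:', message))
  .catch(error => console.log('rejected:', error.message));
```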
As we are talking about Angular, promises are ubiquitous there. When we want to get data using Ajax, we use the $http service, which returns… a promise, represented by a $q object! Our Angular services often look like this:
http://plnkr.co/edit/rUmN1rwPA3795Hytw6R3?p=preview
When I sat down to Angular 2 without the documentation, I was surprised that my then() function didn’t work with the object returned from the http service. What is more, the http.get() construction didn’t make a request. Then I realized that the returned object’s type is Observable.
Quoting the documentation: “The return value may surprise us. Many of us would expect a promise <that’s the point>. We’d expect to chain a call to then() […] Instead we’re calling a map() method. Clearly, this is not a promise.”
…and so it began… Angular 2 uses a special and trendy reactive way to solve asynchronous methods, which looks like this:
return this.http.get(this.technologyUrl)
               .map(res => res.json().data)
               .catch(this.handleError);
Before I explain these lines of code, I want to describe what reactive programming means.
Remember React.js !== RxJS

Why does Angular 2 use RxJS? Are promises out of fashion?

A few years ago, web applications weren’t as interactive as they are now – often the only asynchronous action was a form submission. Now the situation is a bit different: we often want to run many asynchronous actions at the same time. The answer to this problem is reactive programming. Here we have to change our thinking: in reactive programming, everything is an asynchronous data stream. What does this mean?
It’s likely all of us have used an asynchronous data stream without even realizing it. Are you familiar with code like this?
object.addEventListener("click", function(){
    // Do something after clicking
});
Yes? You use it often, and an asynchronous data stream works just like such actions, but with additional features. We can say that a stream in reactive programming is like an array which contains values returned by a function over a period of time. Since it behaves like an array, it gives us the possibility to use the advantages of functional programming. Streams are often illustrated with a timeline diagram:

On top there is a timeline (representing the asynchronous data stream) on which values are generated by events. Below it there is a block representing the functions which, for example, map or filter this stream and return a new stream with the desired data.
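The array analogy can be shown in plain JavaScript: a “recorded” stream of click events, filtered and mapped the way RxJS operators would transform a live stream (the event objects are made up):

```javascript
// Imagine these click events arrived over time on a stream.
const clicks = [
  { x: 10, y: 20, target: 'button' },
  { x: 5,  y: 5,  target: 'div' },
  { x: 30, y: 40, target: 'button' },
];

// The same map/filter style RxJS offers, applied to the recorded stream:
const buttonXs = clicks
  .filter(event => event.target === 'button') // keep only button clicks
  .map(event => event.x);                     // project each event to a value

console.log(buttonXs);
```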

There are a few definitions you have to know before analysing the code:
Observer pattern – technically, there are two types of objects: an Observable, which sends a signal/notification, and Observers, which can react to this signal.
In RxJS, subscribe is the method used to listen (observe). The subscribe function takes three callbacks as parameters. The first of them, onNext, is called when a value is emitted from the Observable. The second one, onError, is called when something goes wrong (for example, a 500 status from the server). The last one, onCompleted, is called when the stream finishes its work.
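To make the vocabulary concrete, here’s a toy Observable written from scratch – not the RxJS implementation, just a sketch of the subscribe(onNext, onError, onCompleted) contract:

```javascript
// A toy Observable: 'producer' is a function that receives the observer
// callbacks and decides when to call them.
function createObservable(producer) {
  return {
    subscribe(onNext, onError, onCompleted) {
      producer({ next: onNext, error: onError, complete: onCompleted });
    },
  };
}

// An observable that emits three values and then completes.
const numbers$ = createObservable(observer => {
  [1, 2, 3].forEach(n => observer.next(n));
  observer.complete();
});

const received = [];
numbers$.subscribe(
  value => received.push(value),      // onNext
  err => console.log('error:', err),  // onError
  () => received.push('done')         // onCompleted
);
console.log(received);
```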
A lot of people claim that Observables are “lazy”, but what does that actually mean? As I said at the beginning, I was surprised when I wrote http.get() in Angular 2 and it didn’t make a request to the server. This is the difference between Observables and Promises.
I will illustrate it using a simple analogy:
Observables are like the guys who don’t “talk” when nobody wants to listen. They are ready to talk as long as somebody is interested in listening. What is more, they’re smart – they stop talking when listeners signal that they don’t want to listen anymore.
Promises think differently, they have only one thing to say and they have to hurry because their lives are short, so they start talking right after they’re born… ‘till they die. We cannot stop them.
Some code illustrating this analogy:
https://jsfiddle.net/wwLkvzbj/4/
Analyse this code and try to understand it. As you can see, the promise started working at the point where it was declared, but the observable started working only when the first subscriber/listener appeared. Furthermore, the values of the observable change over time – different listeners receive different values. A promise, once resolved, cannot be changed or cancelled, but an observable can.
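The laziness is easy to demonstrate without any library – a promise’s executor runs the moment the promise is created, while a lazy, observable-like object only does its work when someone subscribes (a toy sketch, not the RxJS implementation):

```javascript
let promiseRuns = 0;
let observableRuns = 0;

// The promise starts working immediately: its executor runs right here.
const promise = new Promise(resolve => {
  promiseRuns += 1;
  resolve(42);
});

// The lazy object does nothing until subscribe() is called.
const lazy = {
  subscribe(onNext) {
    observableRuns += 1;
    onNext(42);
  },
};

console.log(promiseRuns);    // the promise has already done its work
console.log(observableRuns); // nothing has happened yet

lazy.subscribe(value => console.log('received', value));
console.log(observableRuns); // the work happened once per subscriber
```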
If you aren’t convinced about using Observables, you can still use promises in Angular 2 by converting an observable into a promise. Does that make sense? Maybe in some situations. This code shows how you can do it:
return this.http.get(this.technologiesUrl)
                .toPromise() // This line converts the Observable into a promise
                .then(res => res.json().data, this.handleError)
                .then(data => { console.log(data); return data; });

Should we use RxJS in Angular only for server communication?

There are some cases where we can take advantage of reactive programming. It is a good idea when we have a lot of asynchronous actions over which we want full control. Imagine that you’re writing a web IDE and want to add autocompletion and syntax hinting. It isn’t a good idea to make a request to the server every time a keydown event fires – the requests will kill your server and slow down your application. In that situation, people would be punished for their fast typing and your startup would be ruined. What’s more, you plan to give clients useful shortcuts: there will be combinations of mouse and keyboard events (sometimes connected with requests to the server). Doing it the traditional way, you’ll have to:

  • add some variables to control state
  • manually take care of timeouts and clear them
  • spend time structuring the code for the shortcuts properly

Reactive programming can help you solve these problems. Try to think how you would do it with and without reactive programming. It’s a good exercise.
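As a taste of the approach, the keydown scenario above is essentially debouncing – something RxJS’s debounceTime operator gives you for free. A hand-rolled sketch (sendRequest is a hypothetical stand-in for the server call):

```javascript
// Wrap a function so it only fires after 'delay' ms of silence.
function debounce(fn, delay) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), delay);
  };
}

let requests = 0;
const sendRequest = query => { requests += 1; };
const debouncedSearch = debounce(sendRequest, 300);

// Simulate a user typing quickly: three keydowns in a row...
debouncedSearch('p');
debouncedSearch('ph');
debouncedSearch('php');

// ...no request has been sent yet; only after 300 ms of silence
// will a single request for 'php' go out.
console.log(requests);
```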

Conclusion

Reactive programming is difficult and it requires us to change the way we think about solving problems. The topic is so wide that I only managed to scratch the surface of RxJS in basic Angular 2 usage. I think the effort to understand reactive programming will be rewarded in the future.
See you soon at the next stop, ES6 and TypeScript!