Monthly Archives: July 2019

LearnWith: Concept to Implementation – Docker Development Environment

In this installment of the LearnWith series I create a standalone development environment using Docker and invite you along on the journey of working through the process outlined here. I go from having a vague idea of what I need, with very little prior Docker experience, to a working solution. Rather than being a tutorial on Docker, this article focuses on the method and reasoning behind getting to a solution. I do present a solution at the end, containing all the notes and explanations gained from following the process, so this article is not all theory. If you’re here for the solution, I’ve got you covered too. Come LearnWith me.

NOTE: If you are here just for the solution, skip to The Implementation section.

Defining the Problem: The Why

I’ve been working on Project ︎◼︎◼︎◼︎◼︎◼︎’◼︎ ◼︎◼︎◼︎◼︎◼︎ for a while now. So far, it has a Front End which is a little baby skeleton of a website and a browser extension; both of which are supported by a data API. The website is a web server running NodeJS, while the API is made up of Elastic Search and another web server running Java.

For the next part of Project ◼︎◼︎◼︎◼︎◼︎’◼︎ ◼︎◼︎◼︎◼︎◼︎, I’m creating an Editing Suite to manage the source data the Front End needs to work. To support the Editing Suite I’ll need a web server, a queue, queue processors and a second API composed of a web server and a database.

Getting to the problem… Imagine running the full system on your local machine: the Front End and the Editing Suite. A minimum of 8 processes would need to be running on the same machine. That’s a lot of starting and stopping of services and a lot of port management. I currently need half of that and I’ve already started losing track of exactly which processes are needed for specific parts to work.

This is very close to an actual conversation I’ve had when doing an impromptu demo: “… and after the extension is installed you see a list…? Where’s the list?! Ahh, I forgot to start the web server. Refresh and…. hold on, forgot to start elastic search…? Refresh and …Ngggggllff!! I changed the port earlier… Restart and refresh… What’s wrong now…? I’ll show you another time….”.

Fortunately that demo scenario I described was to an understanding fellow dev friend, so not a big deal. So far I’ve not had any significant problems dealing with the complexity. However, doubling the number of processes doubles the complexity of the system, which also doubles the points of failure. As any system grows, there comes a point when you need to spend time on managing the complexity to continue being efficient.

Side note: For Devs that don’t have any love for project managers, managing complexity is a part of what they do. They manage the complexity of coordinating different types of people doing different things that are all part of something larger. It’s not easy by any means. Shout out to my PM friends!

To summarise, here is why we’re fixing this problem and what the proposed solution should address:

  1. Working in the development environment of the app is becoming complex.
  2. Multiple services sometimes use the same port, causing clashes.
  3. Getting a service running is tedious and error-prone.
  4. Managing different versions of software for each service is difficult on the local machine.
  5. Versions of software installed on the local machine may clash with software needed for other projects.

Additionally this is going to be used for development, so ideally the solution should not fundamentally change the development process too much.

Defining the Problem: The What

We just described what we don’t want. Let’s describe it in the positive and say what we do want:

  1. Different parts of the app (the Front End and the Editing Suite) can be run independently.
  2. Simplify the starting and stopping of services.
  3. Every process needed will start up if it’s needed for a service to function correctly.
  4. Each process can have its own software version.
  5. Changing a file locally must still change the file represented in the output.

We’ve described what an outline of the solution should look like. This is where we come up with different technologies and evaluate them against these criteria. In an ideal world you’d go with the one that has the most merit. In the real world, your situation (money, time, politics, available skills) often guides your choice. I’ve done this before using Vagrant. This time, as the title suggests, I use Docker. For the sake of brevity, and because this isn’t an article on the pros and cons of Docker vs Vagrant, let’s assume an analysis has happened and we’ve verified Docker fulfils all 5 of the requirements above.

Service Architecture. (If I had a child they’d have drawn it like this. I don’t have a child, so I had to draw it like this myself.)

Referencing the diagram above, we’re getting to the details of the setup we need. A successful implementation would mean we need all the following running in docker containers:

  1. An instance of the Java Application
  2. An instance of MySQL
  3. An instance of RabbitMQ
  4. An instance of the Python Worker Application

Additionally the following constraints would need to be satisfied.

  1. The Java application must be able to communicate with RabbitMQ.
  2. The Python application must be able to communicate with RabbitMQ.
  3. The Java Application must be able to communicate with MySQL.
  4. Altering the Java code on the local machine must alter the code running in the container.
  5. Altering the Python code on the local machine must alter the code running in the container.

Note on technologies chosen: The choice of technology to build your project in is, in some cases, arbitrary. I chose MySQL because, amongst other things, it’s free, open source, stable, and a relational database that matches the requirements of my data. I’ve worked with it before and it’s more than good enough to get the job done. There may be a better database, but does it really matter? Everyone has a bias. A few of the biases/reasons I’ve heard are “I know how to do it this way”, “It just feels right”, “Google does it this way”, “We’ve always done it this way”, “I don’t trust other people’s code”, “It’s cheaper and easier to find developers”, “I was here first”. Some of those are valid, others not. The process should be: have some options, build consensus around one or more of the options, choose one, implement it.

I chose Java and Vert.x because I wanted to broaden my skillset. I could have used PHP or NodeJS and the result would have been the same. If any juniors are reading this: if someone is telling you your programming language is inferior, they’re likely gatekeeping. If you press those people, you’ll generally find they’ve barely used the technology they’re criticising. Some of the criticisms might be valid, so take what they’re saying with a pinch of salt, get a second opinion and take that with a pinch of salt too.

I chose RabbitMQ quite frivolously. I’ve heard it mentioned and it’s the name I’m most familiar with when it comes to queues and queue processing – even though I’ve never used it. That and it’s also open source and free. I might find it can’t do what I need, I doubt it, but if that happens I’ll just swap out the message transfer part for something else.

As an aside: If you find value in using open source software, make an effort to support those communities. Where would the world be without them?

Python arguably has the best support for Natural Language Processing so I chose that for this specific worker. There may be other workers that work better in other languages. Though it might make sense to keep your stack as small as possible.

Ultimately everyone has a bias. I’m showing mine because this is meant to be an honest peek behind the curtains of software development. As I mentioned, your situation can dictate your solution. If I make a mistake on this project, no one will die. With that in mind, my approach is: give something some thought, do that something, maybe make a mistake, learn from it, improve, update the implementation.

Preparing for Implementation

We know what we need to do, but we know nothing about Docker. For us mortals that mostly deal with common problems, we’re usually not the first person to try and solve a problem. And even if we are, chances are someone has done some work on the subject before us and we can build on what they’ve done. Maybe someone has written about it, maybe there are some tutorials, some online courses. All of these things can help us get to where we need to be.

The first thing with any new technology is to read its introduction. This will give us a brief understanding of what we’re dealing with and the terminology we need to speak about our solution. The terminology is important because it gives us a way of speaking to the wider community, allowing them to understand and help us. Think of learning the terminology of a technology in the same way you’d ask for directions in a country where you don’t speak the language. You will probably get there, but various misunderstandings might mean there will be a few wrong turns along the way.

Docker’s overview page seems the perfect place to get this good initial understanding. Get started with Docker looks like a good next step. And a quick browse of the menu shows a Develop With Docker section. This looks promising! The official website looks like it has all the information we need to get to a solution. I can’t stress enough how happy this makes me: all or most of the knowledge will come straight from the source; the methods used will be officially supported, known ways of doing things; and people already versed in the technology will be more likely to follow the implementation. All good things!

Now to read through the links and run the tutorials posted above. Depending on the time I have available, it’s likely a few weeks will have passed before I get to the next section.

Completed the Readings

I’ve been trying to read during my lunch break and after work. 2 weeks have passed. It would have taken me longer but I’d already gone through 60% of the readings before starting this write up. Most of it was a refresh.

I’ve got lucky with some very good documentation. I’ve read the overview and completed the getting started tutorials. I have enough information to start! I’ve not read the Develop With Docker section, so I’m probably missing a key piece of knowledge needed to solve the problem completely. So… enough information to start but not enough to finish. I prefer starting as soon as I can. Others may prefer to do more reading. In this case, I don’t have to worry about the piece of string issue because I have already defined the problem. As long as I have that goal in mind, the work I do will generally get me closer to a solution. I also know from the readings that anything I’ll need to change won’t need significant effort or have significant repercussions.

As I mentioned previously, this isn’t meant to replace the well written tutorials on the web, but rather to be an exploration of the method of reaching the solution. With that in mind, onto the implementation.

The Implementation

Note: If you want to see a step by step evolution of how I got to the solution, take a look at the source code on GitHub. Look through branches ‘Part 1’ through to ‘Part 7’ to see the incremental steps needed to arrive at this solution.

You’ll need to add two files to your project to get it running in Docker: a Dockerfile, to tell Docker how to package up the code and how to prepare the environment the code needs to run in, and a docker-compose.yml file, to tell Docker what other services are needed for the app to run and how to run them.

The Dockerfile:

FROM openjdk:8u212-jdk
WORKDIR /app
COPY . /app
RUN ["./gradlew", "clean"]
EXPOSE 8888
CMD ["./gradlew", "run"]

Let’s look at this line by line:

FROM openjdk:8u212-jdk
This tells Docker to base my image on the official openjdk image. This means I will have a specific version of the Java OpenJDK installed before I even begin. Docker will try to find this image on the local machine. If it can’t, it will try to download it from Docker Hub. There are loads of official images on Docker Hub you can use in your app.

WORKDIR /app
This tells Docker that the commands following this one will be run in the container from the directory /app.

COPY . /app
This tells Docker to copy everything in the current directory (denoted by the dot) to the folder /app on the container.

RUN ["./gradlew", "clean"]
This tells Docker to run the Java-specific command ‘./gradlew clean’ (from the previously set WORKDIR) while building the image. I run this to make sure the Gradle wrapper is available on the container before packaging it.

EXPOSE 8888
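The Java application runs on port 8888, and this tells Docker that the container listens on this port. EXPOSE on its own does not publish the port to the host machine; the ports entry in docker-compose.yml takes care of that.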

CMD ["./gradlew", "run"]
CMD is the command Docker uses to start up your application. In this case, to start my Java application Docker will run “./gradlew run” from the WORKDIR.

Docker will now know how to build your code into an image and have it be able to run on its own.

Next docker-compose.yml:

version: "3"
networks:
  java_to_db:
  java_to_queue:
services:
  java_app:
    build: .
    volumes:
      - './:/app'
    ports:
      - "4000:8888"
    depends_on:
      - mysql
    networks:
      - java_to_db
      - java_to_queue
  mysql:
    image: mysql
    command: --default-authentication-plugin=mysql_native_password
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: example
    networks:
      - java_to_db

If you’ve not seen YAML before, it’s a text file format where the number of spaces before the text has meaning. If a line has more spaces than the line before it, it belongs to that line. So in the example, java_app and mysql are sections that belong to services. Let’s continue line by line:

version: "3"
Tells Docker which version of the Compose file format we are using. Different versions have different sections and capabilities.

networks:
This section defines the list of networks and lets Docker know how to set them up. A network is used to allow communication between services and the outside world. The lines below networks, ‘java_to_db:’ and ‘java_to_queue:’, tell Docker to set up two networks with those names.

services:
Under this section we put all the processes we need to run. MySQL, Python, Java and RabbitMQ would all be defined under this section.

java_app:
This is a name for the service. The name itself can be anything and has no specific meaning. However you may need to refer to it in your application so make it something meaningful.

build: .
This is indented under the java_app section, meaning it’s a setting for java_app. It tells Docker to build an image by running all the instructions in the Dockerfile in the current directory (the one we defined above).

volumes:
We’ll do volumes and - './:/app' together. volumes is part of java_app, and a volume usually refers to data storage such as files. - './:/app' falls under volumes. So, with these two lines we’re defining the file storage for the Java app. You might remember ./ and /app from the COPY command in the Dockerfile. The COPY command copied all the code at the time the command was run. If any changes have been made since, the code in the image is now out of date. - './:/app' is called a bind mount and tells Docker that we want to have the current directory in place of the /app directory on the container. This means changes made in the local directory are reflected in the Docker container too.

ports:
Also part of java_app. This and the line below, - "4000:8888", map port 4000 on the host to port 8888 on the container. This means that when you visit port 4000 on the (virtual) hardware Docker runs on, Docker makes sure all data arriving on port 4000 is passed on to port 8888 on the container.

depends_on:
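This section is used to determine the order in which to start the services. The line below depends_on tells Docker that the mysql service should be started before java_app. Note that Docker only waits for the mysql container to start, not for MySQL inside it to be ready to accept connections.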
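
If the Java app needs MySQL to be accepting connections before it starts, one possible approach is to give the mysql service a healthcheck and use the long form of depends_on. This is only a sketch and not part of my solution. It assumes a current Docker Compose release that follows the Compose Specification (the version 3 file format as originally documented dropped the condition form, so the version line is left out here), and it assumes the official mysql image ships the mysqladmin tool. The interval, timeout and retries values are arbitrary placeholders.

```yaml
# Sketch only: wait for MySQL to answer a ping before starting java_app.
services:
  java_app:
    build: .
    depends_on:
      mysql:
        condition: service_healthy   # start java_app once the healthcheck below passes
  mysql:
    image: mysql
    environment:
      MYSQL_ROOT_PASSWORD: example
    healthcheck:
      # Assumes mysqladmin is available inside the image.
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
      interval: 5s    # placeholder values, tune to taste
      timeout: 3s
      retries: 10
```

Even with something like this in place, it may still be worth having the application retry its database connection, since a service can restart at any time.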

networks:
Sitting under ‘java_app’, this tells Docker that java_app needs to talk to everything in the java_to_db network and everything in the java_to_queue network. ‘java_app’ won’t be able to communicate with anything else.

mysql:
This is the start of a new MySQL service. The service has the name mysql but it could have been anything. I mentioned earlier that the name is important. In this case, when trying to connect to MySQL from the Java app, we need to use the service name, ‘mysql’, as the hostname.
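
To make the hostname point concrete, here is a minimal sketch of one way the service name could be handed to the Java app. The DB_HOST and DB_PORT variable names are hypothetical; my application does not necessarily read them, and any configuration mechanism would do. The only real requirement is that the hostname matches the service name and that both services share a network.

```yaml
# Sketch only: pass the Compose service name to the app as its database host.
networks:
  java_to_db:
services:
  java_app:
    build: .
    environment:
      DB_HOST: mysql    # hypothetical variable; the value is the service name below
      DB_PORT: "3306"   # hypothetical variable; 3306 is MySQL's default port
    networks:
      - java_to_db
  mysql:                # this name is what java_app uses as the hostname
    image: mysql
    environment:
      MYSQL_ROOT_PASSWORD: example
    networks:
      - java_to_db
```

Note that java_app reaches MySQL on the container’s own port over the shared network, so the mysql service doesn’t need a ports entry for this to work.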

image: mysql
This tells Docker to create a service using the official MySQL image.

Other than ‘networks’, the rest of the settings in the mysql section set up MySQL to run in a specific way. You can find more details on the official MySQL Docker image page.

And that’s it. We’re finished! That’s everything needed to get a development environment running in Docker.

Reflecting on the Goal

Are we finished? Let’s look back at what we set out to do. Let’s look at what we have:

  • An instance of the Java Application
  • An instance of MySQL
  • The Java Application must be able to communicate with MySQL.
  • Altering the Java code on the local machine must alter the code running in the container.

And what we still need:

  • An instance of RabbitMQ
  • An instance of the Python Worker Application
  • The Java application must be able to communicate with RabbitMQ.
  • The Python application must be able to communicate with RabbitMQ.
  • Altering the Python code on the local machine must alter the code running in the container.

That looks like more than half of the original plan is still missing. How in any way is this finished?

The Business Answer: Priorities change constantly when working on a project. Things outside of your control often affect what you’re doing on the project. Perhaps something your boss asked for, or something a competitor has done, now means that something very important yesterday has had to be moved out of the way to make room for something else today. I purposefully put in an extra bit to show how a modular, iterative approach to solving a problem can mean that even though we’ve only done a piece of the work, the solution in its current state still helps us move forward. If we come back to the solution later, we can pick up where we left off.

The Truth: I’m currently the only developer building the proof of concept for Project ◼︎◼︎◼︎◼︎◼︎’◼︎ ◼︎◼︎◼︎◼︎◼︎. It’s taken a long time for me to write this article. I’m doing all this while still having a full time job, so it’s taken maybe a month or more. On top of that, I’ve not been allowing myself to continue with the code until I finish documenting the solution. Having the solution means I am unblocked and I’m REALLY excited to get back to writing code again! Having said that, I feel like I have solved the whole problem conceptually. All the core concepts needed to add Python and RabbitMQ are already included in the solution. Adding them in will be a copy and paste job with a little bit of reading to get the specifics of RabbitMQ. If there was another developer waiting on the full solution, I’d definitely have made sure it was complete.
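
As a rough idea of what that copy and paste job could look like, here is a minimal sketch of the extra entries for docker-compose.yml. None of this is in the finished solution yet, so treat every detail as an assumption: the worker_to_queue network, the python_worker service name and the ./worker directory (which would need its own Dockerfile) are hypothetical, and rabbitmq refers to the official RabbitMQ image on Docker Hub. The existing java_app and mysql entries would stay as they are.

```yaml
# Sketch only: entries to merge into the existing docker-compose.yml.
networks:
  java_to_db:
  java_to_queue:
  worker_to_queue:          # hypothetical network for the Python worker
services:
  rabbitmq:
    image: rabbitmq         # official image; reachable by the hostname "rabbitmq"
    networks:
      - java_to_queue       # java_app is already attached to this network
      - worker_to_queue
  python_worker:
    build: ./worker         # hypothetical directory holding the worker's Dockerfile
    volumes:
      - './worker:/app'     # bind mount so local Python changes show up in the container
    depends_on:
      - rabbitmq
    networks:
      - worker_to_queue
```

Keeping the worker on its own network follows the same idea as java_to_db: each service can only talk to the things it actually needs.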

Finally…

Don’t be discouraged if you get stuck. Take a break and come back to it later. While writing I had two moments where I got stuck and despondent. My internal thoughts were: “The solution isn’t doing what I want! I should quit!”, “Is Project ◼︎◼︎◼︎◼︎◼︎’◼︎ ◼︎◼︎◼︎◼︎◼︎ even worth doing?”, “If I can’t even do something this small, how dare I write an article and give others advice?”. Most people I’ve worked with think I’m a good software engineer. To be honest, I am happy with my skills and I do have my moments, but I’ve never ever felt like “one of the best developers I’ve worked with”. If that person is right and I am a good developer, then I say this to let you know: if you’re feeling discouraged, just remember that we all feel that way sometimes. Just keep at it.

That’s it! Thank you for reading. I hope you learned something with me and had fun doing it.