Clear Docker Buildx Cache

You might have run into a situation where you cannot get the latest Docker image when building with Buildx. If so, this article may give you some insight into the question, “Why do I keep seeing the stale image in the list?”

I was trying to build a Docker image that supports the ARM64 architecture. That requires enabling Buildx, an experimental feature of Docker that allows us to build images for multiple architectures. When I found a problem in the image, I changed the Dockerfile to apply a fix and rebuilt. But the fix was not reflected. As shown in the following output, the CREATED time of the images stays in the past even though I had built them just a few seconds before.

$ docker images
REPOSITORY                                                    TAG                      IMAGE ID            CREATED             SIZE
presto                                                        341-SNAPSHOT             c15822305160        5 hours ago         1.05GB
lewuathe/presto-worker                                        341-SNAPSHOT             eb1d11521b04        5 hours ago         1.38GB
lewuathe/presto-coordinator                                   341-SNAPSHOT             8e0085374165        5 hours ago         1.38GB

It was so annoying that I could not test whether my fix adequately resolved the issue. Here are two options to get out of this stressful situation.

Build without any cache

Like the normal build command, buildx provides the --no-cache option. It lets us build an image from scratch, so a fresh image is always created.

# Build without using the cache; the trailing "." (build context) is assumed to be the current directory
$ docker buildx build \
    --no-cache \
    --platform linux/arm64/v8 \
    -f Dockerfile-arm64v8 \
    -t lewuathe/prestobase:340-SNAPSHOT-arm64v8 \
    .

Clearing the cache completely

Another option is clearing the cache. However, it has a side effect on the build time of other images: since all layer caches are removed, building other images can take longer. But if you do not hold many images, deleting the cache can be a reasonable option.

The builder instance holds the cache. The following command removes all unused build cache, not only the dangling entries.

$ docker builder prune --all

Afterward, you can build the image as usual. We can see that the CREATED time is refreshed as follows.

$ docker images
REPOSITORY                                                    TAG                      IMAGE ID            CREATED             SIZE
presto                                                        341-SNAPSHOT             c15822305160        a second ago        1.05GB
lewuathe/presto-worker                                        341-SNAPSHOT             eb1d11521b04        a second ago        1.38GB
lewuathe/presto-coordinator                                   341-SNAPSHOT             8e0085374165        a second ago        1.38GB

Thanks for reading as usual!

Presto Docker Container with Graviton 2 Processor

I recently tried running Presto on Arm-based systems as part of my work, to evaluate how cost-effectively it can achieve faster performance. Thanks to AWS, we can use server machines with Arm processors such as Graviton 1 and 2. The experiment went without much difficulty. The results of the research were described in the following articles.

But I have found that they do not cover the full detail of how to set up the Docker environment and build an Arm-supporting Docker image. Graviton 2 is now publicly available, so anyone can try the processor for running a distributed system like Presto. Therefore, I’m going to describe the process step by step here.

The topics this article will cover are:

  • How to install Docker Engine on an Arm machine
  • How to build an Arm-supporting Docker image for Presto
  • How to run the Arm-supporting container on a Graviton 2 instance

Setup Graviton 2

I used Ubuntu 18.04 (LTS) built for the Arm64 platform. The AMI ID is ami-0d221091ef7082bcf. As it does not include Docker Engine, we need to install it manually. The instance type I used is m6g.medium. Once the instance is ready, follow the steps below.

Setup the Repository

Install necessary packages first.

$ sudo apt-get update

$ sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

Add docker’s official GPG key.

$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

Verify the key.

$ sudo apt-key fingerprint 0EBFCD88

pub   rsa4096 2017-02-22 [SCEA]
      9DC8 5822 9FC7 DD38 854A  E2D8 8D81 803C 0EBF CD88
uid           [ unknown] Docker Release (CE deb) <docker@docker.com>
sub   rsa4096 2017-02-22 [S]

Finally, add the repository for installing the docker engine for the Arm platform.

$ sudo add-apt-repository \
   "deb [arch=arm64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"

Install Docker Engine

$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io

See the list of available versions.

$ apt-cache madison docker-ce
docker-ce | 5:19.03.12~3-0~ubuntu-bionic | https://download.docker.com/linux/ubuntu bionic/stable arm64 Packages
docker-ce | 5:19.03.11~3-0~ubuntu-bionic | https://download.docker.com/linux/ubuntu bionic/stable arm64 Packages
docker-ce | 5:19.03.10~3-0~ubuntu-bionic | https://download.docker.com/linux/ubuntu bionic/stable arm64 Packages
docker-ce | 5:19.03.9~3-0~ubuntu-bionic | https://download.docker.com/linux/ubuntu bionic/stable arm64 Packages
docker-ce | 5:19.03.8~3-0~ubuntu-bionic | https://download.docker.com/linux/ubuntu bionic/stable arm64 Packages
docker-ce | 5:19.03.7~3-0~ubuntu-bionic | https://download.docker.com/linux/ubuntu bionic/stable arm64 Packages
docker-ce | 5:19.03.6~3-0~ubuntu-bionic | https://download.docker.com/linux/ubuntu bionic/stable arm64 Packages
docker-ce | 5:19.03.5~3-0~ubuntu-bionic | https://download.docker.com/linux/ubuntu bionic/stable arm64 Packages
docker-ce | 5:19.03.4~3-0~ubuntu-bionic | https://download.docker.com/linux/ubuntu bionic/stable arm64 Packages
...

I chose the latest one, 5:19.03.12~3-0~ubuntu-bionic.

Install the package.

$ sudo apt-get install \
    docker-ce=5:19.03.12~3-0~ubuntu-bionic \
    docker-ce-cli=5:19.03.12~3-0~ubuntu-bionic \
    containerd.io

To run the docker command without root permission, add the user into the docker group.

$ sudo usermod -aG docker ubuntu

Log in to the instance again so that the new group membership takes effect.

Build the Docker Image

Let’s build an Arm-supporting image from source. Put the following Dockerfile in any directory you like. I put it under /path/to/presto-arm64v8.

FROM arm64v8/openjdk:11

RUN \
    set -xeu && \
    apt-get -y -q update && \
    apt-get -y -q install less && \
    apt-get -q clean && \
    rm -rf /var/lib/apt/lists/* && \
    rm -rf /tmp/* /var/tmp/* && \
    groupadd presto --gid 1000 && \
    useradd presto --uid 1000 --gid 1000 && \
    mkdir -p /usr/lib/presto /data/presto && \
    chown -R "presto:presto" /usr/lib/presto /data/presto

ARG PRESTO_VERSION
COPY --chown=presto:presto presto-server-${PRESTO_VERSION} /usr/lib/presto

EXPOSE 8080
USER presto:presto
ENV LANG en_US.UTF-8
CMD ["/usr/lib/presto/bin/run-presto"]

We also need a script to launch the process. The following file is placed at /path/to/presto-arm64v8/bin/run-presto.

#!/bin/bash

set -xeuo pipefail

if [[ ! -d /usr/lib/presto/etc ]]; then
    if [[ -d /etc/presto ]]; then
        ln -s /etc/presto /usr/lib/presto/etc
    else
        ln -s /usr/lib/presto/default/etc /usr/lib/presto/etc
    fi
fi

set +e
grep -s -q 'node.id' /usr/lib/presto/etc/node.properties
NODE_ID_EXISTS=$?
set -e

NODE_ID=""
if [[ ${NODE_ID_EXISTS} != 0 ]] ; then
    NODE_ID="-Dnode.id=${HOSTNAME}"
fi

exec /usr/lib/presto/bin/launcher run ${NODE_ID} "$@"

Afterward, we can build the latest presto.

$ cd /path/to/presto
$ ./mvnw -T 1C install -DskipTests

Make sure that the artifact is present under /path/to/presto/presto-server/target. Finally, the following commands produce the Docker image supporting the Arm architecture.

$ export PRESTO_VERSION=340-SNAPSHOT
$ export WORK_DIR=/path/to/presto-arm64v8

# Copy presto server module
$ cp /path/to/presto/presto-server/target/presto-server-${PRESTO_VERSION}.tar.gz ${WORK_DIR}
$ tar -C ${WORK_DIR} -xzf ${WORK_DIR}/presto-server-${PRESTO_VERSION}.tar.gz
$ rm ${WORK_DIR}/presto-server-${PRESTO_VERSION}.tar.gz
$ cp -R /path/to/bin default ${WORK_DIR}/presto-server-${PRESTO_VERSION}

$ docker buildx build ${WORK_DIR} \
    --platform linux/arm64/v8 \
    -f ${WORK_DIR}/Dockerfile \
    --build-arg "PRESTO_VERSION=${PRESTO_VERSION}" \
    -t "presto:${PRESTO_VERSION}-arm64v8" \
    --load

The new image should now appear in the list of images.

$ docker images
REPOSITORY                                                            TAG                    IMAGE ID            CREATED             SIZE
presto                                                                340-SNAPSHOT-arm64v8   cf9c4124516f        3 hours ago         1.25GB

Run the Container

We can transfer the image by using the save and load commands of Docker. The following command serializes the image into tar.gz format so that we can copy it to the Graviton 2 instance over the network. It takes several minutes to complete.

$ docker save presto:340-SNAPSHOT-arm64v8 | gzip > presto-arm64v8.tar.gz

Copy the image to the instance. It will also take several minutes.

# INSTANCE_HOST is a placeholder for the address of the Graviton 2 instance
$ scp -i ~/.ssh/mykey.pem \
    presto-arm64v8.tar.gz ubuntu@INSTANCE_HOST:/home/ubuntu

The load command restores the image on the instance so that it can be run.

$ ssh -i ~/.ssh/mykey.pem ubuntu@INSTANCE_HOST
$ docker load < presto-arm64v8.tar.gz

Finally, run the container.

$ docker run -p 8080:8080 \
    -it presto:340-SNAPSHOT-arm64v8
...
WARNING: Support for the ARM architecture is experimental
...

Note that Arm support is still an experimental feature, as the warning message says. Please let the community know if you find something wrong when using Presto on the Arm platform.

Thanks.

Implicit left-padding of the binary literal in Java

Hey, this is part of the series describing situations where I encountered weird behavior in programming :) Today is about Java. When I wrote code to do bit manipulation in Java, an unexpected outcome showed up. Unfortunately, I could not find the official specification behind this behavior. Thus, this article is written in the hope of getting the answer from someone who reads it.

Masking the most significant bit

What we wanted to do was get the most significant bit in the two’s complement format. For instance, 1111_1111 is -1 in the signed 8-bit format. To get the most significant bit of the number, we can use a mask for the sign bit.

byte value = -1;
long byteSignMask = 0b1000_0000;
assertEquals(0b1000_0000, value & byteSignMask);

Yes, it works properly and extracts only the bit representing the sign of the number. Constructing the mask with a binary literal or with a shift operation gives us an identical result.

long byteSign1 = 1L << 7;
long byteSign2 = 0b1000_0000;

// byteSign1 = 10000000
System.out.println("byteSign1 = " + Long.toBinaryString(byteSign1));
// byteSign2 = 10000000
System.out.println("byteSign2 = " + Long.toBinaryString(byteSign2));

// OK
assertEquals(byteSign1, byteSign2);
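
As a small illustration of the masking idea, the following sketch wraps the sign-bit test in a helper. The class name SignBit and the method isNegative are hypothetical names chosen for this example; they are not part of the original code.

```java
public class SignBit {
    // Mask selecting bit 7, the sign bit of an 8-bit two's complement value.
    static final int BYTE_SIGN_MASK = 0b1000_0000;

    // Returns true when the sign bit of the byte is set.
    // The byte is promoted to int before the &, but bit 7 is preserved.
    static boolean isNegative(byte value) {
        return (value & BYTE_SIGN_MASK) != 0;
    }

    public static void main(String[] args) {
        System.out.println(isNegative((byte) -1));   // true
        System.out.println(isNegative((byte) 127));  // false
        System.out.println(isNegative((byte) -128)); // true
    }
}
```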

But when I do the same thing for a 32-bit integer, it does not work.

Implicit padding

The following code works correctly, just like the previous example.

long value = -1;
long intSignMask = 0b1000_0000_0000_0000_0000_0000_0000_0000;
assertEquals(0b1000_0000_0000_0000_0000_0000_0000_0000, value & intSignMask);

Okay, let me check whether the mask is the same as 1L << 31.

long intSign1 = 1L << 31;
long intSign2 = 0b1000_0000_0000_0000_0000_0000_0000_0000;

// intSign1 = 10000000000000000000000000000000
System.out.println("intSign1 = " + Long.toBinaryString(intSign1));
// intSign2 = 1111111111111111111111111111111110000000000000000000000000000000
System.out.println("intSign2 = " + Long.toBinaryString(intSign2));

// Fail: expected:<2147483648> but was:<-2147483648>
assertEquals(intSign1, intSign2);

It’s interesting. Why does the mask constructed by the shift operation 1L << 31 differ from the binary literal? Why is the binary literal automatically left-padded with 1s? I asked on Stack Overflow, as before, to get the answer. Please let me know if you have an explanation for it.
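
One explanation that is consistent with the output above, offered as a sketch and not as the official answer: an integer literal without the L suffix has type int, so the 32-bit binary literal is the int value Integer.MIN_VALUE, and assigning it to a long sign-extends it to 64 bits. The class and field names below are made up for illustration.

```java
public class BinaryLiteralWidening {
    // No L suffix: the literal has type int, so bit 31 is the int sign bit.
    static final int INT_LITERAL = 0b1000_0000_0000_0000_0000_0000_0000_0000;

    // Assigning the int to a long sign-extends it, filling the upper 32 bits with 1s.
    static final long WIDENED = INT_LITERAL;

    // With the L suffix the literal is a long, and bit 31 is an ordinary bit.
    static final long LONG_LITERAL = 0b1000_0000_0000_0000_0000_0000_0000_0000L;

    public static void main(String[] args) {
        System.out.println(INT_LITERAL == Integer.MIN_VALUE); // true
        System.out.println(Long.toBinaryString(WIDENED));     // 33 ones followed by 31 zeros
        System.out.println(LONG_LITERAL == (1L << 31));       // true
    }
}
```

Under this reading, appending L to the literal makes the mask identical to 1L << 31.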

Thanks.

Does assignment precede logical operator?

I encountered a weird situation while writing a piece of code in Ruby. Here is sample code that extracts the essence of the problem. First, I tried to get the matching results with a line of code as follows.

p1 = /hello/
p2 = /world/

s = "hello, world"

if m1 = s.match(p1) || m2 = s.match(p2)
    puts "m1=#{m1}"
    puts "m2=#{m2}"
end

It shows:

m1=hello
m2=

Oops, I forgot that the logical operator || does short-circuit evaluation, which leaves m2 nil. What I wanted to do was check that both regular expressions match the given string. Here is the corrected one.

if m1 = s.match(p1) && m2 = s.match(p2)
    puts "m1=#{m1}"
    puts "m2=#{m2}"
end

But it shows:

m1=world
m2=world

Hmm, the result was unexpected. Why is m1 assigned the outcome of the p2 pattern? I expected the result of matching p1 to be assigned to m1, and likewise the result of p2 to m2. Is the operator precedence working correctly?

According to the Ruby operator precedence, the logical operator && has higher precedence than the assignment operator =. I expected the previous code to be evaluated in the same way as:

if (m1 = s.match(p1)) && (m2 = s.match(p2))
    puts "m1=#{m1}"
    puts "m2=#{m2}"
end

With the explicit parentheses, the outcome is what I expected.

m1=hello
m2=world

In reality, the evaluation looks like:

if m1 = (s.match(p1) && m2 = s.match(p2))
    puts "m1=#{m1}"
    puts "m2=#{m2}"
end

Here the whole && expression is the right-hand side of the assignment to m1, so m1 receives its value, which is the match assigned to m2.

Since the result seems weird and I’m still not sure about the mechanism behind this behavior, I posted a question on Stack Overflow.

I would very much appreciate it if you could find the answer to this problem. Thanks!

When to use var in Java 10

Java 10 officially introduced the var keyword for declaring local variables without writing the type explicitly. Until then, Java developers had to keep typing the class name explicitly for a long time.

The var keyword allows us to declare a local variable without type information, backed by the type inference of the Java compiler. We can simplify the following code.

String str = "Hello, Java!";

like this

var str = "Hello, Java!";
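
As a minimal sketch of what the inference means in practice (assuming Java 10 or later; the class and method names are hypothetical), the variable still has a fixed static type chosen by the compiler:

```java
public class VarInference {
    // Returns the length of the greeting held in a var-declared local.
    static int greetingLength() {
        var str = "Hello, Java!"; // inferred as String at compile time
        // str = 42;              // would not compile: str is a String, not a dynamic variable
        return str.length();
    }

    public static void main(String[] args) {
        System.out.println(greetingLength()); // prints 12
    }
}
```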

But is it beneficial in practice? Does it make the code better from the viewpoint of readability and maintainability?

I have found an interesting discussion in the Presto community about the usage of var. Let me summarize the points here.

Pros

  • Less typing of types
    • We do not need to write code like the following anymore.
      AnOverlyLengthyClassName instance = new AnOverlyLengthyClassName();
      
  • It is just implicit typing, which we already use in lambda expressions.
  • We can avoid the trouble caused by forgetting the L in a long literal.
  • It can be useful when we want to simplify an expression by extracting a variable.
    abc = doSomething(somethingElse());
    // ↓ Extracting variables
    List<Map<SomeLongNameHere, List<BlahBlahBlah>>> foo = somethingElse();
    abc = doSomething(foo);
    
  • It encourages developers to use more descriptive variable names, as in Kotlin and Scala.

Cons

  • It hurts readability by imposing the burden of inferring the type on the reader.
  • The primary goal of code is readability, not typing speed.
  • Even if we use var, we save only a few characters in most cases.
  • Although we can check the actual type by jumping through the code in an IDE, we cannot do that on GitHub.

Overall, the Presto community does not prefer using var for now. That’s pretty reasonable. In my experience, var did not improve readability and could even worsen it. Thanks to the power of IDEs (e.g. IntelliJ IDEA), typing a lengthy class name in Java is not much trouble anymore.

Therefore, we should be careful about using var declarations in general. If we are sure that it simplifies an expression by eliminating a tiresome generic type declaration (see the 4th item in Pros), there may be room to use it.
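
To make the 4th item in Pros concrete, here is a minimal sketch of extracting an intermediate value with var (assuming Java 10 or later). The methods groupByLength and countGroups are hypothetical stand-ins for somethingElse() and doSomething(); they do not come from the Presto discussion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class VarExtraction {
    // Hypothetical stand-in for somethingElse(): groups words by their length.
    static Map<Integer, List<String>> groupByLength(List<String> words) {
        Map<Integer, List<String>> groups = new TreeMap<>();
        for (String word : words) {
            groups.computeIfAbsent(word.length(), k -> new ArrayList<>()).add(word);
        }
        return groups;
    }

    // Hypothetical stand-in for doSomething(): counts the distinct lengths.
    static int countGroups(Map<Integer, List<String>> groups) {
        return groups.size();
    }

    public static void main(String[] args) {
        // Explicit declaration repeats the full generic type.
        Map<Integer, List<String>> explicit = groupByLength(List.of("a", "bb", "cc"));
        // With var the compiler infers the same static type from the initializer.
        var inferred = groupByLength(List.of("a", "bb", "cc"));
        System.out.println(countGroups(explicit) == countGroups(inferred)); // true
    }
}
```

Whether the shorter declaration reads better depends on how obvious the type is from the initializer, which is the trade-off listed in Cons.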

I am also interested in how other Java communities treat var in their projects.
