A Caveat When Using all? and any? in Ruby

There are always pitfalls in writing code, however simple the task, and a developer can easily lose several hours to one of them. This time I will briefly describe a situation where I got stuck because I overlooked the short-circuit evaluation semantics of all? and any? in Ruby.

What is short-circuit evaluation?

Short-circuit evaluation is a semantic rule that doubles as an optimization: the second operand of a boolean expression is evaluated only if the first operand is not enough to determine the value of the whole expression. For example, let’s say we have the following boolean expression.

a && b

We do not need to evaluate b if a is false, because we already know the overall value is false without b.
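A minimal Ruby sketch makes this visible. The methods a and b here are hypothetical helpers (not part of any library) that record when they are called, so we can see which operands were actually evaluated:

```ruby
# Hypothetical helpers: each one records its own evaluation in CALLS.
CALLS = []

def a
  CALLS << :a
  false
end

def b
  CALLS << :b
  true
end

# a returns false, so && never evaluates b.
RESULT = a && b

puts RESULT.inspect  # false
puts CALLS.inspect   # [:a]
```

Only :a is recorded, which shows that b was skipped entirely.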

But what if a and b both have side effects? We may want those side effects to happen regardless of the result. That is exactly what bit me when I used the all? method in Ruby.

my_models = [...]

if my_models.all?(&:valid?)
  puts "All okay."
else
  puts "Someone is not okay."
end

I wanted to collect the errors of every model in my_models so that users could see all possible errors at once. Calling valid? accumulates validation errors in the model as a side effect. But all? stops calling valid? as soon as it finds the first invalid model, so the remaining models are never validated. So we need to rewrite the code like this.

my_models = [...]
all_validity = my_models.map(&:valid?)

if all_validity.all?
  puts "All okay."
else
  puts "Someone is not okay."
end
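The difference can be reproduced without a real model class. The sketch below uses FakeModel, a hypothetical stand-in whose valid? records an error as a side effect. It is only an assumption about how a validating model behaves, not the actual class from my application:

```ruby
# FakeModel is a hypothetical stand-in for a model with validations.
class FakeModel
  attr_reader :errors

  def initialize(name, ok)
    @name = name
    @ok = ok
    @errors = []
  end

  # Like a real valid?, this has a side effect: it fills in errors.
  def valid?
    @errors = @ok ? [] : ["#{@name} is invalid"]
    @ok
  end
end

def build_models
  [FakeModel.new("first", false), FakeModel.new("second", false)]
end

# all? stops at the first invalid model, so the second is never validated.
short_circuited = build_models
short_circuited.all?(&:valid?)
SHORT_CIRCUIT_ERRORS = short_circuited.flat_map(&:errors)

# map calls valid? on every model before all? inspects the results.
eager = build_models
eager.map(&:valid?).all?
EAGER_ERRORS = eager.flat_map(&:errors)

puts SHORT_CIRCUIT_ERRORS.inspect  # ["first is invalid"]
puts EAGER_ERRORS.inspect          # ["first is invalid", "second is invalid"]
```

any? short-circuits in the same way, except that it stops at the first element for which the block returns a truthy value.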

This is a simple problem, and an experienced Ruby developer may never make such a mistake. I hope this caveat helps someone who accidentally forgets the short-circuit semantics when writing code.

Ruby Build Failure with OpenSSL3

Hardly any developer has never hit an installation error when using Ruby and rbenv. As evidence that Ruby and OpenSSL do not get along well, you can find many questions online about errors like this one.

This time, I tried to upgrade my Ruby environment to Ruby 3 and encountered the following issue.

OpenSSL 3 - symbol not found in flat namespace '_SSL_get1_peer_certificate'

This happens because OpenSSL 3 provides SSL_get1_peer_certificate, whereas OpenSSL 1.1 does not and provides SSL_get_peer_certificate instead. The problem shows up when Ruby is built with OpenSSL 1.1 but puma, running for Rails, uses OpenSSL 3.
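To see which OpenSSL your Ruby itself is using, you can ask Ruby’s standard openssl library. This is only a rough diagnostic sketch: it reports on Ruby’s own extension and does not inspect the native extension that puma builds, so treat it as one piece of evidence rather than a full diagnosis.

```ruby
require "openssl"

# Version of OpenSSL that Ruby's openssl extension was built with.
BUILT_AGAINST = OpenSSL::OPENSSL_VERSION
# Version of the OpenSSL library the extension is running with.
LOADED_AT_RUNTIME = OpenSSL::OPENSSL_LIBRARY_VERSION

puts "built against: #{BUILT_AGAINST}"
puts "loaded now:    #{LOADED_AT_RUNTIME}"
```

If these report OpenSSL 1.1 while Homebrew’s default openssl is version 3, the mismatch described above is a likely suspect.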

This solution targets developers who use macOS and Homebrew to manage their system packages.

Short Answer

The quick answer to this problem is to delete OpenSSL 3 from your environment. If you use macOS and Homebrew, you can check which version is installed as follows.

$ brew --prefix openssl
/usr/local/opt/openssl@3

This indicates that the system, including puma, can pick up OpenSSL 3. It might be better to uninstall this version completely. You might expect that changing the build option to compile puma against OpenSSL 1.1 would work.

$ bundle config build.puma \
    --with-opt-include=/usr/local/opt/openssl@1.1/include

But it did not work in my case. Removing the package from the system entirely was the only solution that worked.

Loop summation with MLIR

MLIR is not a programming language in the usual sense. As the name suggests, it is an intermediate representation for expressing the mid-level structure of a program. The framework is versatile and flexible thanks to its plugin architecture. It is possible (and even natural) to write a program in MLIR by hand, because MLIR is good at representing the high-level structure we have in mind when writing algorithms. I tried to run a simple program that adds up all values from 0 to 10 (inclusive).

Affine Dialect

It is straightforward to implement loops, including nested ones, in MLIR with the Affine dialect. The syntax is very similar to what we see in higher-level programming languages like C/C++ and Java. affine.for is an operation representing a loop, with a region as its body. It takes a lower bound, an upper bound, and a step value.

affine.for %i = 0 to 11 step 1 {
  // Body
}

This code iterates the SSA value %i from 0 to 10; the upper bound is exclusive. The step is optional and defaults to 1. The block in affine.for must end with exactly one terminator operation, affine.yield, which yields zero or more SSA values from the region of an affine op. In this case, we will use it to return the running sum. iter_args declares loop-carried variables, which are in scope inside the body region of affine.for. On each iteration after the first, a loop-carried variable holds the value passed to affine.yield in the previous iteration. We will use %sum_iter to keep the current accumulated value.

In addition to the Affine dialect, we need the Arith dialect, which holds basic integer and floating-point mathematical operations. We use it for the constant, the index cast, and the addition.

As a whole, the program will look as follows.

func.func @main() -> i32 {
  %sum_0 = arith.constant 0 : i32
  %sum = affine.for %i = 0 to 11 step 1 iter_args(%sum_iter = %sum_0) -> (i32) {
    %t = arith.index_cast %i : index to i32
    %sum_next = arith.addi %sum_iter, %t : i32
    affine.yield %sum_next : i32
  }
  return %sum : i32
}
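For readers more used to Ruby than MLIR, the loop above corresponds roughly to a fold. The following is a hand-written analogue, not something generated from the MLIR code: the accumulator of inject plays the role of %sum_iter, and the value returned by the block plays the role of affine.yield.

```ruby
# Ruby analogue of the affine.for loop (hand-written, for comparison only).
# 0...11 excludes 11, matching the exclusive upper bound in MLIR.
SUM = (0...11).step(1).inject(0) { |sum_iter, i| sum_iter + i }

puts SUM  # 55
```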

Lowering to LLVM

To run the MLIR program, we need to lower it to the lowest level, from which an executable form can be produced. In MLIR terms, that means converting one dialect into another. We will first convert the Affine and Arith dialects to the LLVM dialect. mlir-opt is a handy tool for that kind of conversion.

$ mlir-opt \
    --lower-affine \
    --convert-arith-to-llvm \
    --convert-scf-to-cf \
    --convert-func-to-llvm \
    --reconcile-unrealized-casts sum.mlir

module attributes {llvm.data_layout = ""} {
  llvm.func @main() -> i32 {
    %0 = llvm.mlir.constant(0 : i32) : i32
    %1 = llvm.mlir.constant(0 : index) : i64
    %2 = llvm.mlir.constant(11 : index) : i64
    %3 = llvm.mlir.constant(1 : index) : i64
    llvm.br ^bb1(%1, %0 : i64, i32)
  ^bb1(%4: i64, %5: i32):  // 2 preds: ^bb0, ^bb2
    %6 = llvm.icmp "slt" %4, %2 : i64
    llvm.cond_br %6, ^bb2, ^bb3
  ^bb2:  // pred: ^bb1
    %7 = llvm.trunc %4 : i64 to i32
    %8 = llvm.add %5, %7  : i32
    %9 = llvm.add %4, %3  : i64
    llvm.br ^bb1(%9, %8 : i64, i32)
  ^bb3:  // pred: ^bb1
    llvm.return %5 : i32
  }
}

As you can see, the conversion takes several options.

  • --lower-affine : Lower the affine dialect to the scf and arith dialects.
  • --convert-arith-to-llvm : Convert the arith dialect to the LLVM dialect.
  • --convert-scf-to-cf : Convert the structured control flow dialect to the primitive control flow dialect.
  • --convert-func-to-llvm : Convert the func dialect to the LLVM dialect.
  • --reconcile-unrealized-casts : Remove the temporary cast operations left over between conversions.

We will not go into detail about them here, but note that the final MLIR code contains only operations from the LLVM dialect (they all start with the llvm prefix). It is now ready to be translated down to LLVM IR!

Translate MLIR to LLVM IR

mlir-translate is another handy tool that converts an MLIR program into LLVM IR. Pass the --mlir-to-llvmir option as follows.

$ mlir-opt \
    --lower-affine \
    --convert-arith-to-llvm \
    --convert-scf-to-cf \
    --convert-func-to-llvm \
    --reconcile-unrealized-casts sum.mlir | \
    mlir-translate --mlir-to-llvmir

; ModuleID = 'LLVMDialectModule'
source_filename = "LLVMDialectModule"

declare ptr @malloc(i64)

declare void @free(ptr)

define i32 @main() {
  br label %1

1:                                                ; preds = %5, %0
  %2 = phi i64 [ %8, %5 ], [ 0, %0 ]
  %3 = phi i32 [ %7, %5 ], [ 0, %0 ]
  %4 = icmp slt i64 %2, 11
  br i1 %4, label %5, label %9

5:                                                ; preds = %1
  %6 = trunc i64 %2 to i32
  %7 = add i32 %3, %6
  %8 = add i64 %2, 1
  br label %1

9:                                                ; preds = %1
  ret i32 %3
}

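You may find several additional lines, such as the module metadata and the malloc and free declarations, that were not in the MLIR code. But the central part of the program is the same as in the LLVM dialect version. Now it can be executed by piping it into lli.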

$ mlir-opt \
    --lower-affine \
    --convert-arith-to-llvm \
    --convert-scf-to-cf \
    --convert-func-to-llvm \
    --reconcile-unrealized-casts sum.mlir | \
    mlir-translate --mlir-to-llvmir | lli

$ echo $?
55

The program correctly returns the sum as its exit code! If you enjoy writing programs in MLIR, please visit the MLIR website for more details. You may find excellent examples or hints for implementing algorithms directly in MLIR.

How We Migrate Millions of Queries on The Cloud

As a software engineer maintaining a running service in production, you may have had the experience of migrating something owned by users to another platform without breaking any visible functionality. This often happens when you introduce a new version of the software or apply security patches.

But at the scale of the cloud and the web, this task turns into a colossal challenge. The number of resources we must migrate is massive. For better or worse, our users may rely heavily on the resources we manage, so we have to keep them available at all times during the migration. Meeting all of these demands, including checking the compatibility and consistency of the new platform for those resources, takes a considerable amount of time. This situation may sound familiar to you regardless of the type of resources in your service.

At Treasure Data, we deal with millions of queries for data analysis. They are written by marketers, analysts, and engineers at our customers’ companies, so we have to keep them running without any problem every day.

You can find our approach to this challenge in the following paper, published at this year’s DBTest workshop, in which we describe how we tackle this common problem.

Taro L. Saito, et al., “Journey of Migrating Millions of Queries on The Cloud”, 2022

Architecture

I contributed to the initial development of the framework that performs the automatic query simulation. This framework has proved helpful over the past five years and has successfully lowered the hurdle to safely migrating running queries onto a new platform, an updated version of Trino.

We have not published any code for the framework, but through this simulation we contributed to finding several bugs in Trino. Beyond that, the paper contains many lessons that apply generally to the various kinds of migration efforts you may encounter. I hope this paper is informative and insightful for every developer.

Thank you!

Earning a Master's Degree from GA Tech

Continuous learning is critical to leading a fruitful life, as Lynda Gratton says. A continuous effort to acquire knowledge and skills will become even more important for getting by in a world that is constantly changing.

Three years ago, I decided to enroll in the OMSCS program offered by the Georgia Institute of Technology. It is a fully online program for obtaining a master’s degree in computer science. The most notable thing about it is by far the low tuition. It allows us to get a master’s degree from a US university while working in Japan without much cost.

After enjoying three years of studying, I have finally earned a master’s degree in computer science from the Georgia Institute of Technology.

I have learned a lot during these years. However, earning credits alongside my work, housekeeping, and child care was challenging. The fact that I often saw people drop out of classes shows how difficult it is to complete this program. But laying another track alongside my main life should open the door to further possibilities in my career, so overall I concluded that the program was worth the effort. If you are interested, give it a try. Your effort will surely be rewarded in the end.

Thank you!