Dump IR with the Glow Compiler

This time, we will visit another machine learning compiler, Glow. Glow is actively developed by the PyTorch community, while MLIR, which I’ve touched on in previous posts, is mainly built by the LLVM community. Glow is a compiler that lowers a given machine learning model into an executable format for various types of hardware, such as GPUs and accelerators. It supports Caffe2 and ONNX models for now.

Three-Level IR

As illustrated in the figure above, Glow first converts the model into a high-level graph structure. This graph keeps abstract information about each operator and its relationships so that Glow can perform further optimizations by utilizing that information.

Afterward, Glow converts the graph into its own intermediate representation, the Low-Level IR, which looks more similar to what we are familiar with from traditional compilers. Although it lacks the high-level information retained by the graph, it lets us apply the state-of-the-art optimizations usually implemented in standard compilers.

Lastly, the backend generates machine-executable code from the IR. As is often the case with compiler backends, Glow backends are pluggable and can be replaced interchangeably. Glow also supports custom nodes and instructions in the graph; in that case, the backend is responsible for converting the custom node into a format executable on the target hardware. As of today, Glow provides the following backend implementations:

  • CPU (LLVM)
  • Habana
  • Interpreter (Reference Backend)
  • NNPI
  • OpenCL

Seeing is believing. Let’s take a look at how to generate Glow IR from a very tiny ONNX model.

One Linear Layer Model

We create the ONNX model with the following PyTorch code.

import torch
from torch import nn

class MyModel(nn.Module):
  def __init__(self):
    super().__init__()
    self.classifier = nn.Sequential(
      nn.Linear(10, 5),
      nn.Sigmoid(),
    )

  def forward(self, x):
    return self.classifier(x)

  def save(self):
    input_names = ["input_1"]
    output_names = ["output_1"]
    dummy_input = torch.ones(3, 10)
    torch.onnx.export(self, dummy_input, "mymodel.onnx", verbose=True,
                      input_names=input_names, output_names=output_names)

if __name__ == '__main__':
  model = MyModel()
  model.save()

You can visualize the model with Netron as follows.


Glow allows us to compile the model with the model-compiler command-line tool. We can use a Docker image provided by the community to launch the environment for Glow development. Since building the whole Glow repository takes a very long time, I highly recommend building only model-compiler in this case.

$ mkdir build_Debug
$ cd build_Debug
$ cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug ../glow
$ cmake --build . --target model-compiler

Once the compilation of model-compiler is completed, we can compile the simple model we built previously.

$ ./bin/model-compiler \
    -model mymodel.onnx \
    -emit-bundle ./mybundle \
    --backend=CPU \
    -dump-ir

function mymodel.onnx
declare {
  %classifier_0_bias = WeightVar float<5> const // size: 20 // Users: @in 0
  %classifier_0_weight__1 = WeightVar float<10 x 5> const // size: 200 // Users: @in 0
  %input_1 = WeightVar float<3 x 10> mutable // size: 120 // Users: @in 0
  %output_1 = WeightVar float<3 x 5> mutable // size: 60 // Users: @in 1, @out 0, @out 1

  ; size = 400 bytes
}

code {
  0 %Gemm_0__1 = fullyconnected @out %output_1, @in %input_1, @in %classifier_0_weight__1, @in %classifier_0_bias
  1 %Sigmoid_1 = sigmoid @out %output_1, @in %output_1
}

Great, we’ve done it! The -dump-ir option of model-compiler (and any other CLI tool supporting it) enables us to inspect the IR for testing purposes.
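As a quick sanity check on the declare section above, each WeightVar’s byte size is just the product of its shape dimensions times four bytes per float32. A small Python sketch (my own cross-check, not part of Glow):

```python
from functools import reduce
from operator import mul

def weightvar_bytes(shape, elem_bytes=4):
    """Byte size of a float32 WeightVar with the given shape."""
    return reduce(mul, shape, 1) * elem_bytes

# Shapes taken from the IR dump above
sizes = {
    "classifier_0_bias":   weightvar_bytes((5,)),     # 20
    "classifier_0_weight": weightvar_bytes((10, 5)),  # 200
    "input_1":             weightvar_bytes((3, 10)),  # 120
    "output_1":            weightvar_bytes((3, 5)),   # 60
}
print(sum(sizes.values()))  # 400, matching "; size = 400 bytes"
```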

I’ll explain the details of the IR another time.

How to init Memref in MLIR

Reading the tutorial for one piece of software always brings me to a gateway leading to another eternal journey. It’s an exciting experience if you are a technology enthusiast. Walking the path toward software expertise on your own feet can be an irreplaceable event in your career.

But it’s also true that a guidebook written by a pioneer in the field gives you a distinct viewpoint on the path, and it can make your experience more exciting and profound.

I have walked through the Toy tutorial to learn MLIR and found many things to know along the journey. This article aims to clarify the points I struggled with while grasping the concepts and usage of MLIR, based on my experience.

This time, I’m going to focus on creating a Memref in a pass in a custom MLIR dialect.

What is Memref in MLIR?

In the first place, what is a Memref at all? The FAQ part of the official documentation gives us a brief introduction to the Memref type.

You can have a memref (a buffer in memory) containing Vectors, but you can’t have a memref of a tensor type.

Looking at this description, a Memref is a low-level concept directly associated with the underlying hardware. It’s essentially a pointer to the memory location where the tensor (or vector) data is stored. The MemRef dialect provides ways to manipulate the allocation and layout of the buffer pointed to by the memref type. For instance, memref.alloc enables us to allocate a memory space large enough for the given data type. The following code allocates a contiguous memory buffer for 2x3 64-bit floats.

%0 = memref.alloc() : memref<2x3xf64>

As with malloc in C, it is practically vital to deallocate the memory resource explicitly. We can free the space by calling memref.dealloc.

memref.dealloc %0 : memref<2x3xf64>

Of course, we also need to store values into the memref. That can be done with memref.store or affine.store, both of which recognize the memref type.

affine.store %cst_3, %0[%c1, %c1] : memref<2x3xf64>
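Under the hood, a memref<2x3xf64> with the default layout is just a contiguous row-major buffer, so the [%c1, %c1] indices above map to a single linear offset. A tiny Python sketch of that mapping (my own illustration, not MLIR API):

```python
# Row-major index -> offset mapping for a memref<2x3xf64>-like buffer
shape = (2, 3)

def linear_offset(i, j, shape):
    """Linear offset of element (i, j) in a contiguous row-major buffer."""
    return i * shape[1] + j

buffer = [0.0] * (shape[0] * shape[1])    # analogue of memref.alloc()
buffer[linear_offset(1, 1, shape)] = 1.0  # analogue of affine.store %cst, %0[%c1, %c1]
print(buffer)  # [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
```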

How can we create these IRs by using MLIR API?

Initialization Procedure

To make sure allocation and deallocation are both called within the same block, we get the block where the allocation is created. Block.front() and Block.back() give us the correct locations where the allocation/deallocation pair should be placed.

mlir::Location loc = ...
mlir::MemRefType type = ...
mlir::PatternRewriter &rewriter = ...

auto alloc = rewriter.create<mlir::memref::AllocOp>(loc, type);
auto *parentBlock = alloc->getBlock();
// Move the allocation to the beginning of the block.
alloc->moveBefore(&parentBlock->front());
auto dealloc = rewriter.create<mlir::memref::DeallocOp>(loc, alloc);
// Move the deallocation just before the block terminator.
dealloc->moveBefore(&parentBlock->back());

A store operation can be created as follows.

// constantIndices holds index values (e.g. the results of two
// ConstantIndexOp for the indices [0, 0]).
llvm::SmallVector<mlir::Value, 2> constantIndices = ...

rewriter.create<mlir::AffineStoreOp>(
    loc,
    rewriter.create<mlir::ConstantOp>(loc, rewriter.getF64FloatAttr(1.0)),
    alloc,
    llvm::makeArrayRef(constantIndices));

loc is the location where this affine.store is created, and alloc is the target memref value. The last argument specifies the indices within the memref where the value is to be stored.

Finally, we will get the following MLIR code.

module  {
  func @main() {
    %0 = memref.alloc() : memref<2x3xf64>
    %c0 = constant 0 : index
    %cst = constant 1.000000e+00 : f64
    affine.store %cst, %0[%c0, %c0] : memref<2x3xf64>
    memref.dealloc %0 : memref<2x3xf64>
    return
  }
}

We can construct any procedure to initialize a memref type by following this convention. Please visit my toy project, mlir-hello, for more detail.


When to use describe/context/it in RSpec

A well-structured test suite helps you check at a glance that the necessary cases are covered. Developers looking into the code later on can quickly grasp which case they should add or modify. Most well-known unit testing frameworks provide a way to organize test cases in that manner.

RSpec is the de facto standard testing framework used in many Ruby projects. Although I have used RSpec in some projects, I did not fully understand how to use the describe, context, and it keywords correctly. In my case, these keywords were used just to produce a meaningless nested structure, and that does not sound good. Using these keywords properly lets us give an understandable shape to unit tests written in RSpec. This article summarizes what we should think about when writing RSpec test cases in terms of describe, context, and it usage.

describe: Target Object

Let’s assume we have the following FizzBuzz class to be tested.

class FizzBuzz
  def self.run(n)
    if n % 3 == 0 && n % 5 == 0
      'FizzBuzz'
    elsif n % 3 == 0
      'Fizz'
    elsif n % 5 == 0
      'Buzz'
    else
      n
    end
  end
end

We want to ensure that FizzBuzz works as expected with RSpec. The target object here is the FizzBuzz class, so we pass it to describe.

describe FizzBuzz do
  # Test cases
end

context: Precondition

context is a place to state the condition that should be satisfied before running the test. It can be the type of input or a precondition imposed on the target class. Here we put the type of input passed to the run method of FizzBuzz.

describe FizzBuzz do
  context '3-multiple' do
    # Test here
  end

  context '5-multiple' do
    # Test here
  end

  context '15-multiple' do
    # Test here
  end

  context 'other' do
    # Test here
  end
end

it: Expectation

We describe the expected output from the method or object in it (or example).

describe FizzBuzz do
  context '3-multiple' do
    it 'Get Fizz' do
      expect(FizzBuzz.run(3)).to eq('Fizz')
      expect(FizzBuzz.run(6)).to eq('Fizz')
    end
  end

  context '5-multiple' do
    it 'Get Buzz' do
      expect(FizzBuzz.run(5)).to eq('Buzz')
      expect(FizzBuzz.run(10)).to eq('Buzz')
    end
  end

  context '15-multiple' do
    it 'Get FizzBuzz' do
      expect(FizzBuzz.run(15)).to eq('FizzBuzz')
      expect(FizzBuzz.run(30)).to eq('FizzBuzz')
    end
  end

  context 'other' do
    it 'Get original number' do
      expect(FizzBuzz.run(4)).to eq(4)
      expect(FizzBuzz.run(8)).to eq(8)
    end
  end
end
This guideline is very helpful to me for writing well-structured tests in RSpec. The background information behind each case becomes explicit with this structure.

How to add new policy to IAM role by Terraform

Fine-grained security management is a critical component of deploying an enterprise application successfully. Terraform enables us to manage any resource on a cloud service by using its declarative language, HCL. If you are a software engineer providing a service on AWS like me, Terraform gives you excellent capability and surely saves you time. I have found a tiny tip worth sharing about using Terraform to set IAM policies. This article aims to explain the use of aws_iam_role_policy and its limitations from a practical viewpoint.

Limitation of aws_iam_role_policy

We used aws_iam_role_policy to set a specific IAM policy on a role. It’s the most straightforward way to attach a policy to the role you are managing. But there is a caveat to be noted: the resource can only create an inline policy, which is not designed to be shared by multiple roles afterward.

Looking at the following listing, you can notice that the policy attached to my-role is an inline policy that lives only inside that role. Even if the policy is general enough to be used by other roles, we have no way to reuse it with aws_iam_role_policy.

resource "aws_iam_role" "my-role" {
  name = "my-role"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "my-policy" {
  name = "my-policy"
  role = "${aws_iam_role.my-role.id}"

  # This policy is exclusively available to my-role.
  # (The Action/Resource values below are illustrative examples.)
  policy = <<-EOF
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "AccessObject",
        "Effect": "Allow",
        "Action": [
          "s3:GetObject"
        ],
        "Resource": [
          "arn:aws:s3:::my-bucket/*"
        ]
      }
    ]
  }
  EOF
}

Standalone policy with aws_iam_policy

Here come the aws_iam_policy and aws_iam_role_policy_attachment resources. aws_iam_policy is a resource to create a standalone IAM policy. It’s almost the same as what aws_iam_role_policy does, but it does not attach the policy to any IAM entity such as a user, role, or group. The policy is isolated and has no effect unless it is attached to an existing IAM entity. aws_iam_role_policy_attachment does exactly that, as the name implies: it attaches an existing policy to an existing IAM role. That means we can reuse the policy by attaching it to several roles.

resource "aws_iam_policy" "my-policy" {
  name = "my-policy"

  # (The Action/Resource values below are illustrative examples.)
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AccessObject",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket/*"
      ]
    }
  ]
}
EOF
}

resource "aws_iam_role_policy_attachment" "my-policy-attach" {
  role       = "${aws_iam_role.my-role.name}"
  policy_arn = "${aws_iam_policy.my-policy.arn}"
}

If you have another role named my-role-2, you can attach my-policy to it as well with the following code.

resource "aws_iam_role_policy_attachment" "my-policy-attach-2" {
  role       = "${aws_iam_role.my-role-2.name}"
  policy_arn = "${aws_iam_policy.my-policy.arn}"
}

That’s a handy way to reuse an existing policy component, and it is less error-prone because we avoid rewriting the same policy repeatedly.


We have another resource with a very similar name, aws_iam_policy_attachment. But we should be careful with this resource because it attaches the policy exclusively: across the entire AWS account, a single aws_iam_policy_attachment resource becomes the authoritative source for that policy’s attachments, and any IAM entities (i.e., users/roles/groups) not declared in it are detached. That limitation is counterintuitive, so sticking to aws_iam_role_policy_attachment will prevent us from wasting time digging into what’s going on when facing an issue.
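For contrast, here is a sketch of what aws_iam_policy_attachment looks like (the resource name is hypothetical). Note that it takes a list of roles and exclusively manages every attachment of the policy:

```hcl
# CAUTION: this resource is authoritative for my-policy's attachments;
# entities attached outside this list will be detached on apply.
resource "aws_iam_policy_attachment" "my-policy-exclusive" {
  name       = "my-policy-exclusive"
  roles      = ["${aws_iam_role.my-role.name}"]
  policy_arn = "${aws_iam_policy.my-policy.arn}"
}
```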


POST API by Lambda with serverless framework

Serverless has been a kind of buzzword in recent years. It brings a new concept of providing a web service without depending on a (virtually) fixed amount of server machines, enabling us to build a more agile and flexible platform that responds to changes faster.

Serverless Framework is one of the most notable frameworks implementing the concept of “serverless”. It supports many major cloud service providers, such as AWS and Azure. We can quickly launch a new web-based service with minimal code writing.

I have created a web API providing a POST endpoint with Serverless Framework, backed by AWS Lambda and API Gateway. But it took a little investigation to do so, so those who face the requirement to provide a POST API with Lambda may find this useful. Here is the guide I would have wanted before starting to develop such an API.


serverless.yml

serverless.yml is the central place controlling all configuration of the infrastructure managed by the serverless application. It specifies the name of the provider, environment variables, and so on.

service: myservice

plugins:
  # Necessary to purge previous versions
  - serverless-prune-plugin
  # Install all dependencies specified by requirements.txt
  - serverless-python-requirements

provider:
  name: aws
  runtime: python3.7
  stage: ${opt:stage, 'development'}
  region: us-east-1

The custom field provides variables that are likely to change depending on the environment the application runs in.

custom:
  stages:
    - development
    - production
  a_variable:
    development: variable_for_development
    production: variable_for_production
  pythonRequirements:
    dockerizePip: true
  prune:
    # Specify the number of retained previous versions
    automatic: true
    number: 10

Function for POST

The function definition for the POST endpoint is easy to write.

functions:
  post_endpoint:
    handler: handler.post_endpoint
    events:
      - http:
          path: myapp/post_endpoint
          method: post
    environment:
      # Set the stage specific variable
      A_VARIABLE: ${self:custom.a_variable.${self:provider.stage}}
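The A_VARIABLE lookup above is essentially a nested dictionary access keyed by the deployment stage. A rough Python sketch of the resolution (my own illustration, not the framework’s actual resolver):

```python
# Mimic how ${self:custom.a_variable.${self:provider.stage}} resolves
custom = {
    "a_variable": {
        "development": "variable_for_development",
        "production": "variable_for_production",
    }
}

def resolve_a_variable(stage):
    """Pick the stage-specific value, as the framework does at deploy time."""
    return custom["a_variable"][stage]

print(resolve_a_variable("development"))  # variable_for_development
```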

Since the POST endpoint parses the HTTP request body, there is no need to specify the required parameters in the config.

Handler Method

We can find the POST method in the handler code as follows.

import json

def post_endpoint(event, context):
    print("A POST endpoint")
    # Obtain the body in JSON format
    body = json.loads(event["body"])
    # Return a response for API Gateway
    return {"statusCode": 200, "body": json.dumps(body)}

We can extract any parameter from the body like body['key']. Note that validating the parameters is the handler’s responsibility: a parameter required by the app may be missing from the body, so make sure to check for its existence beforehand.

def get_or_none(key, body):
    if key in body:
        return body[key]
    return None

get_or_none('key', body)
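Putting the pieces together, the handler can be exercised locally with a mock API Gateway event (a sketch: the mock event carries only the body field, and the response shape is my own minimal choice):

```python
import json

def post_endpoint(event, context):
    # Parse the JSON body, as in the handler above
    body = json.loads(event["body"])
    return {"statusCode": 200, "body": json.dumps(body)}

def get_or_none(key, body):
    # Return the value if the key exists, otherwise None
    return body[key] if key in body else None

# Mock API Gateway proxy event
event = {"body": json.dumps({"name": "alice"})}
response = post_endpoint(event, None)
body = json.loads(response["body"])
print(get_or_none("name", body))     # alice
print(get_or_none("missing", body))  # None
```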